No. 3208: COMPUTERS AND READING COMPREHENSION
by Andy Boyd
Today, computer comprehension. The University of Houston presents this series about the machines that make our civilization run, and the people whose ingenuity created them.
You may remember them from your days as a student. You read a passage of text, then answer a sequence of questions about it. You may or may not be fond of these reading comprehension tests, but they're a wonderful example of something humans are uniquely suited for. A well-designed test requires the test taker to understand the nuances of words, relate information scattered throughout the passage, understand the goals and intentions of the characters, and much, much more. It's a situation where computers fail miserably.
Woman reading. Photo Credit: Pixabay
Which is why I was skeptical when I learned that in 2018 two different computer programs outperformed a group of humans on a reading comprehension test. How could that be? Should we bemoan the fact that computers are truly starting to infringe on our territory? Or were the claims exaggerated? So I went looking, beyond the first few Google entries with subtitles like "The Robots are Coming, and They Can Read."
Reading robot. Photo Credit: Pixabay
At the core of what I discovered were the questions themselves. If a passage reads "The 1973 oil crisis began in October 1973," it's no surprise that when asked when the oil crisis began a computer program will respond "October 1973." Similarly, given the phrase "The OPEC oil embargo caused an oil crisis, or 'shock'" and faced with the question "What was another term used for the oil crisis?" a program will answer "shock." It turns out that in an effort to quickly generate a huge number of questions and answers, researchers had relied on text passages taken from Wikipedia, which tend to be rather dry and factual - perfect for computers to parse.
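To see why such questions are easy for a machine, here is a minimal Python sketch of a word-matching answerer for "when" questions. It is only an illustration, not the method used by the programs mentioned above. The function names, the stopword list, and the month-and-year pattern are all assumptions made for this example.

```python
import re

# Hypothetical helper pattern: a month name followed by a four-digit year.
DATE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{4}\b"
)

# A tiny, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "did", "was", "what", "when", "in"}


def split_sentences(passage):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def content_words(text):
    """Lowercased words and numbers, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def answer_when(passage, question):
    """Return the date from the sentence sharing the most words with the question."""
    q_words = content_words(question)
    candidates = []
    for sentence in split_sentences(passage):
        match = DATE.search(sentence)
        if match:
            overlap = len(q_words & content_words(sentence))
            candidates.append((overlap, match.group(0)))
    if not candidates:
        return None
    # Pick the dated sentence with the largest word overlap.
    return max(candidates, key=lambda c: c[0])[1]


passage = (
    "The 1973 oil crisis began in October 1973. "
    "The OPEC oil embargo caused an oil crisis, or 'shock'."
)
print(answer_when(passage, "When did the oil crisis begin?"))
```

Nothing here understands the passage. The sketch simply finds a sentence that shares words with the question and returns the date it contains, which is enough when the text states its facts plainly.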
Now consider the following question taken from a college entrance exam. "In the context of the written passage," goes the question, "the author's use of the phrase 'her light step flying to keep time with his long stride' is intended to convey what?" Well, perhaps it means the man and woman are playfully competing. Or perhaps the woman is a bit frustrated that she can't keep up. Or maybe it's a statement about shared enthusiasm. There's no way to tell without understanding the interrelationship of the characters at that moment. Most people reading the entire passage would realize the right answer is shared enthusiasm, since the characters are engaged in an "incomparable" moment of personal connection.
And that's the kind of reasoning that causes computers to stumble badly. As humans, we have the ability to comprehend meaning from a collection of words. Computers are okay when it comes to grammar - to the syntactical rules by which we create proper sentences. But when it comes to meaning they're dead on arrival. That isn't to say these reading comprehension programs aren't useful. They're very handy for answering rudimentary questions from customers over the internet or searching through large quantities of legal documents for valuable facts. However, they don't as yet comprehend anything. Nor do I expect they will any time soon.
I'm Andy Boyd at the University of Houston, where we're interested in the way inventive minds work.
For a related episode, see CONSCIOUSNESS.
The oil crisis questions were taken from the GitHub website: Click here. Accessed May 28, 2019. Many sample questions and answers can be found here.
The full college entrance exam passage referred to here can be found on the College Board website: Click here. Accessed May 28, 2019.
Anthony Cuthbertson. "Robots Can Now Read Better Than Humans, Putting Millions of Jobs at Risk." Newsweek, January 15, 2018. See also: Click here. Accessed May 28, 2019.
Pranav Rajpurkar. The Stanford Question Answering Dataset: Background, Challenges, Progress. April 3, 2017. From the GitHub website: Click here. Accessed May 28, 2019.
This episode was first aired on May 30, 2019.