So IBM is at it again: first chess, now quizzes? In the new year their AI system ‘Watson’ (named after the founder of the company – not the partner of a system called ‘Crick’, nor yet of a vastly cleverer system called ‘Holmes’) is to be pitted against human contestants in the TV game Jeopardy: it has already demonstrated its remarkable ability to produce correct answers frequently enough to win against human opposition. There is certainly something impressive in the sight of a computer buzzing in and enunciating a well-formed, correct answer.
However, if you launch your new technological breakthrough on a TV quiz, rather than describing it in a peer-reviewed paper, or releasing it so that the world at large can kick it around a bit, I think you have to accept that people are going to suspect your discovery is more a matter of marketing than of actual science; and much of the stuff IBM has put out tends to confirm this impression. It’s long on hoopla; here and there it has that patronising air large businesses often seem to adopt for their publicity (“Imagine if a box could talk!”); and it’s rather short on details of how Watson actually works. This video seems to give a reasonable summary: there doesn’t seem to be anything very revolutionary going on, just a canny application of known techniques on a very large, massively parallel machine.
Not a breakthrough, then? But it looks so good! It’s worth remembering that a breakthrough in this area might be of very high importance. One of the things which computers have never been much good at is tasks that call for a true grasp of meaning, or for the capacity to deal with open-ended real environments. This is why the Turing test seems (in principle, anyway) like a good idea – to carry on a conversation reliably you have to be able to work out what the other person means; and in a conversation you can talk about anything in any way. If we could crack these problems, we should be a lot closer to the kind of general intelligence which at present robots only have in science fiction.
Sceptically, there are a number of reasons to think that Watson’s performance is actually less remarkable than it seems. First, there is a problem of fair competition: the game requires contestants to buzz in first in order to answer a question. It’s no surprise that Watson should be able to buzz in much faster than human contestants, which amounts to giving the machine the large advantage of having first pick of whatever questions it likes.
Second, and more fundamental, is Jeopardy really a restricted domain after all? This is crucial because AI systems have always been able to perform relatively well in ‘toy worlds’ where the range of permutations could be kept under control. It’s certainly true that the interactions involved in the game are quite rigidly stylised, eliminating at a stroke many of the difficult problems of pragmatics which crop up in the Turing Test. In a real conversation the words thrown at you might require all sorts of free-form interpretation, and have all kinds of conative, phatic and inferential functions; in the quiz you know they’re all going to be questions which just require answers in a given form. On the other hand, so far as topics go, quiz questions do appear to be unrestricted ones which can address any aspect of the world (I note that Jeopardy questions are grouped under topics, but I’m not quite sure whether Watson will know in advance the likely categories, or the kinds of categories, it will be competing in). It may be interesting in this connection that Watson does not tap into the Internet for its information, but relies on its own large corpus of data. The Internet to some degree reflects the buzzing chaos of reality, so it’s not really surprising or improper that Watson’s creators should prefer something a little more structured, but it does raise a slight question as to whether the vast database involved has been customised for the specifics of Jeopardy-world.
I said the quiz questions were a stylised form of discourse; but we’re asked to note in this connection that Jeopardy questions are peculiarly difficult: they’re not just straight factual questions with a straight answer, but allusive, referential, clever ones that require some intelligence to see through. Isn’t it all the more surprising that Watson should be able to deal with them? Well, no, I don’t think so: it’s no more impressive than a blind man offering to fight you in the dark. Watson has no idea whether the questions are ‘straight’ or not; so long as enough clues are in there somewhere, it doesn’t matter how contorted or even nonsensical they might be; sometimes meanings can be distracting as well as helpful, but Watson has the advantage of not being bothered by that.
Another reason to withhold some of our admiration is that Watson is, in fact, far from infallible. It would be interesting to see more of Watson’s failures. The wrong answers mentioned by IBM tend to be good misses: answers that are incorrect, but make some sort of sense. We’re more used to AIs that fail disastrously, suddenly producing responses that are bizarre or unintelligible. How Watson fails will be important for IBM if they want to sell Watson technology, since buyers are unlikely to want a system that works well most of the time but fails abysmally every now and then.
Does all this matter? If it really is mainly a marketing gimmick, why should we pay attention? IBM make absolutely no claims that Watson is doing human-style thought or has anything approaching consciousness, but they do speak rather loosely of it dealing with meanings. There is a possibility that a famous victory by Watson would lead to AI claiming another tranche of vocabulary as part of its legitimate territory. Look, people might say; there’s no point in saying that Watson and similar machines can’t deal with meaning and intentionality, any more than saying planes can’t fly because they don’t do it the way birds do. If machines can answer questions as well as human beings, it’s pointless to claim they can’t understand the questions: that’s what understanding is. OK, they might say, you can still have your special ineffable meat-world kind of understanding, but you’re going to have to redefine that as a narrower and frankly less important business.