So, was the brouhaha over the Turing Test justified? It was widely reported last week that the test had been passed for the first time by a chatbot named ‘Eugene Goostman’. I think the name itself is a little joke: it sort of means ‘well-made ghost man’.
This particular version of the test was not the regular Loebner which we have discussed before (Hugh Loebner must be grinding his teeth in frustration at the apprent ease with which Warwick garnered international media attention), but a session at the Royal Society apparently organised by Kevin Warwick. The bar was set unusually low in this case: to succeed the chatbot only had to convince 30% of the judges that it was human. This was based on the key sentence in the paper by Turing which started the whole thing:
I believe that in about fifty years’ time it will be possible, to programme computers, with a storage capacity of about 109, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.
Fair enough, perhaps, but less impressive than the 50% demanded by other versions; and if 30% is the benchmark, this wasn’t actually the first ever pass, because other chatbots like Cleverbot have scored higher in the past. Goostman doesn’t seem to be all that special: in an “interview” on BBC radio, with only the softest of questioning, it used one response twice over, word for word.
The softness of the questioning does seem to be a crucial variable in Turing tests. If the judges stick to standard small talk and allow the chatbot to take the initiative, quite a reasonable dialogue may result: if the judges are tough it is easy to force the success rate down to zero by various tricks and traps. Iph u spel orl ur werds rong, fr egsampl, uh bott kanot kope, but a human generally manages fine.
Perhaps that wouldn’t have worked for Goostman, though because he was presented as relatively ignorant young boy whose first language was not English, giving him some excuse for not understanding things. This stratagem attracted some criticism, but really it is of a piece with chatbot strategy in general; faking and gaming is what it’s all about. N0-one remotely supposes that Goostman, or Cleverbot, or any of the others, has actually attained consciousness, or is doing anything that could properly be called thinking. Many years ago I believe there were serious efforts to write programs that to some degree imitated the probable mental processes of a human being: they identified a topic, accessed a database of information about it, retained a set of ‘attitudes’ towards things and tried to construct utterances that made sense in relation to them. It is a weakness of the Turing test that it does not reward that kind of effort; a robot with poor grammar and general knowledge might be readily detectable even though it gave signs of some nascent understanding, while a bot which generates smooth responses without any attempt at understanding has a much better chance of passing.
So perhaps the curtain should be drawn down on the test; not because it has been passed, but because it’s no use.