The Loebner prize

1 October 2005

The Loebner Gold Medal

The 2005 Loebner prize was recently awarded to Jabberwacky - or rather, to George, one of the virtual personalities supported by the Jabberwacky software. The Loebner prize, as you may know, puts chat-bot programs through a version of the Turing test. The judges have a series of on-line conversations, some with real humans, some with the chat-bot programs; the bot which most nearly persuades the judges that it is in fact a human being gets the prize. There is a gold medal waiting for the first bot which actually passes as human, but there seems to be no immediate risk of its being won.

I believe serious AI researchers have, on the whole, tended to stay away from the Loebner (it seems that in 1995 Marvin Minsky offered a "prize" of $100 to anyone who could make Hugh Loebner desist from holding the contest), but it has also had support from serious intellectuals. Ned Block appears to have been one of this year's judges; Daniel Dennett chaired the panel during some of the early years (but eventually withdrew when he could not get agreement to his plans, which would have seen a number of more 'serious' AI challenges introduced as preliminaries to the main event). It's certainly an entertaining event - sometimes the transcripts of the conversations have a demented but irresistible humour about them - but I wonder how Alan Turing would have felt about it. Nowadays the contest rather underlines the failure of Turing's prediction - that we should have conversational computers by the end of the twentieth century. Personally, I think the other two points which come across most strongly from the event are the continuing weakness of the chatbots and the unserviceable qualities of the Turing test itself.

Bitbucket I think that's a bit hard. There's a vicious circle here, isn't there? The AI panjandrums don't participate because they fear the bots will perform badly and disgrace them; the bots continue to perform badly because they don't get the input of the AI panjandrums. What are Minsky and the rest so haughty about? Designing chatterbots is a perfectly legitimate way of researching several interesting problems: parsing (grammatical analysis) most obviously, but also human interaction in general - the rules of topic maintenance or change, emotional tone, and perhaps even the pragmatics of conversation. It's not hard to imagine bots that made a real contribution to these fields even if they didn't seem particularly human.

Blandula I don't think the 'panjandrums' are mainly worried about being disgraced by the poor performance of the bots. The real problem is that there's a kind of dishonesty about the whole enterprise. Nobody is able to produce anything like a real simulation of a human brain in conversation, so the contest implicitly invites people to rig up a program which appears to be conscious even though it unarguably isn't. I don't think that's quite what Turing had in mind (although I don't think he ever actually says a machine would have to be conscious to pass his test). Personally, I also think it would be an extremely dangerous thing to do if it were possible, blurring the distinction between people and things in a way which might have disastrous moral consequences.

Luckily, of course, it isn't possible. It's quite clear if you read some of the transcripts that the bots have two big problems. The first is that they cannot follow the thread of a conversation, or even interpret the meaning of individual sentences correctly much of the time. This is because they only really deal with grammar; to decipher human communications you have to be able to deal with intentionality, with the actual meaning, and none of the designers have found a way of doing that. Second, they just don't know enough. You can stall and evade questions to some extent, but a human interrogator soon discovers that the bots completely lack whole areas of information that even the most ignorant human being would take for granted - is my toe bigger than a 747? I think it's clear that you can't have human-level conversation without human-level consciousness - or something close to it.

Bitbucket Well, you know, one valuable point the Loebner makes is that the Turing test really is a hard one, and maybe it's a bit too hard. I'm sure Turing proposed it because the power and flexibility of language mean that a conversation can probe the memory, intelligence, and other mental qualities of an experimental subject in a far more searching and challenging way than any other experimental set-up. Maybe we need some scaled-down challenges: I believe that in some past years, the Loebner conversations were restricted to particular topics, for example. Myself, I wonder if it would be more productive to have sessions in which the interlocutors were not actively trying to catch the computer out. Conversations in which the judge merely throws in strings of nonsense to see whether they are recognised as such don't seem to achieve very much.

Blandula Yes, but you see, that's the whole problem with the Turing test principle. If you find a group of people who want to believe the computer is talking sensibly, and they make enough allowances for it, you can easily get a positive result. A program which just bats back people's own input in the form of questions, like Joseph Weizenbaum's famous Eliza, is quite capable of fooling some people. On the other hand, if you have a skilled forensic examination, it's always going to be possible to find inconsistencies in the conversation of any human-like entity which hasn't actually lived a genuine human life. So what's the point?
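
For a concrete sense of how little machinery that trick needs, here is a minimal Eliza-style responder in Python - an illustrative sketch of the general reflection technique, not Weizenbaum's actual script:

```python
import random
import re

# Swap first- and second-person words so an echoed fragment reads naturally.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}

# A handful of surface patterns; each captures a fragment of the user's
# input and reflects it back as a question. No understanding is involved.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r"my (.*)", re.I), "Tell me more about your {0}."),
]

FALLBACKS = ["Please go on.", "What does that suggest to you?", "I see."]

def reflect(fragment: str) -> str:
    # Replace pronouns word by word; leave everything else untouched.
    return " ".join(REFLECTIONS.get(word.lower(), word)
                    for word in fragment.split())

def respond(text: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    # When nothing matches, stall with a content-free prompt.
    return random.choice(FALLBACKS)

print(respond("I feel nobody ever listens to me"))
# -> Why do you feel nobody ever listens to you?
```

A couple of dozen lines, no model of the world, no memory of the conversation - and yet exchanges of roughly this kind were enough to take in some of Eliza's early users.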

Bitbucket The point is that the prospect of talking to a computer, or a robot, is what actually fires people's imagination. It's like the space programme; there's tons of interesting science to be done with satellites and probes, but what actually gets people's interest is going to Mars. Like it or not, the programme needs that excitement; and the project is a perfectly legitimate and valuable one, too. It's the same with the Loebner: OK, there are lots of less glamorous projects in AI which in some ways probably deserve attention more than the creation of chatterbots. But if it weren't for the ultimate prospect of a friendly chat with your metal pal, how much interest and support would the artificial modelling of mental processes attract? Heavens - the whole field might be left to neurologists!
