Loebner 2008

[Picture: Elbot]

The annual Loebner Prize has been won by Elbot. As you may know, the Loebner competition implements the Turing Test, inviting contestants to put forward a chat-bot program that can conduct an online conversation indistinguishable from one with a human being. We previously discussed the victory of Rollo Carpenter’s Jabberwacky, a contender again this year.

One of Elbot’s characteristics, which presumably helped tip the balance this year, is a particular assertiveness about steering the conversation into ‘safe’ areas. One of the easiest ways to unmask a chat-bot is to exploit its lack of knowledge about the real world; but if the bot can keep the conversation in domains it is well-informed about, it stands a much better chance of being plausible. Otherwise the only option is often to resort to relatively weak default responses (‘I don’t know’, ‘What do you think?’, ‘Why do you mention [noun extracted from the input sentence]?’).
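
Just to make that tactic concrete, here is a minimal sketch of the kind of fallback generator I mean. The code, the function names and the crude ‘longest word’ heuristic for picking a topic noun are all my own illustration, not anything Elbot actually does; a real bot would at least use a proper part-of-speech tagger.

    import random
    import re

    # A sketch of the ELIZA-style fallback tactic described above.
    DEFAULTS = ["I don't know.", "What do you think?"]

    def extract_topic(sentence):
        """Guess a topic word from the user's input, if any."""
        words = re.findall(r"[A-Za-z]+", sentence)
        candidates = [w for w in words if len(w) > 4]  # crude noun guess
        return candidates[-1].lower() if candidates else None

    def fallback_reply(sentence):
        """Produce a weak default response when the bot has nothing better."""
        topic = extract_topic(sentence)
        if topic:
            return f"Why do you mention {topic}?"
        return random.choice(DEFAULTS)

    print(fallback_reply("I went sailing around the archipelago last summer."))
    # -> Why do you mention summer?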

But aren’t Elbot’s tactics cheating? Don’t these cheap tricks invalidate the whole thing as a serious project? Some would say so: the Loebner does not enjoy universal esteem among academics, and Marvin Minsky famously offered a cash reward to anyone who could stop the contest.

We have to remember, however, that the contestants are not seeking to reproduce the real operation of the human brain. Humans are able to conduct general conversation because they have general-purpose consciousness, but that is far too much to expect of a simple chat-bot. The Turing Test is sometimes interpreted as a test for consciousness, but that isn’t quite how Turing himself described it (he proposed it as a more practical alternative to considering the question ‘Can machines think?’).

OK, so it’s not cheating; but all the same, if it’s just fakery, what’s the value of the exercise? There are several answers. One is the ‘plane wing’ argument: planes don’t fly the way birds do, but they’re still of value. A program that does conversation might well be useful in its own right, perhaps for human/machine interfaces, even if it doesn’t do things the way the human brain does. A second answer is that discoveries made while building chat-bots may eventually shed some light on how parts of the brain put together well-structured and relevant sentences. Third, even if they don’t do that, they may still lead to the discovery of unexpectedly valuable programming techniques: solving difficult but apparently pointless problems just for the hell of it does sometimes prove more fruitful than expected. A fourth point, which I think deserves more attention, is that even if chat-bots tell us nothing about AI, they may still tell us interesting things about human beings: the way trust and belief are evoked, for example, or the implicit rules of conversation and the pragmatics of human communication.

The clincher in my view, however, is that the Loebner is fun, and enlivens the subject in a way which must surely be helpful overall. How many serious scientists were inspired in part by a secret childhood desire to have a robot friend they could talk to?

In a way you could say Elbot is attempting to refine the terms of the test. A successful program actually needs to deploy several different kinds of ability, and one of the most challenging is bringing to bear a fund of implicit background knowledge. No existing program is remotely as good at this as the human brain, and there are some reasons to doubt whether any ever will be. In the meantime, at any rate, there may be an argument for taking this factor out of the equation: Elbot tries to do this by managing the conversation, but in some early Loebner contests the conversations were explicitly limited to a particular topic, and maybe that approach has something to be said for it. I believe Daniel Dennett, who was once a Loebner judge, suggested that the contest should develop towards testing a range of narrower abilities rather than the total conversational package. Perhaps we can imagine tests of parsing, of knowledge management, and so on, along the lines sketched below.
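
Purely by way of illustration, here is a toy version of what one such narrow test might look like: a bot scored automatically on isolated knowledge questions, with no open-ended dialogue involved. The test format and the trivial stand-in bot are my own inventions, not anything the contest actually uses.

    # A toy 'narrow ability' test: factual-knowledge lookup, scored
    # automatically. Entirely hypothetical, not a real Loebner format.

    def knowledge_test(bot, qa_pairs):
        """Score a bot on isolated knowledge questions."""
        correct = sum(1 for question, expected in qa_pairs
                      if expected.lower() in bot(question).lower())
        return correct / len(qa_pairs)

    # A trivial stand-in bot: a lookup table of canned answers.
    CANNED = {"What colour is the sky?": "The sky is blue, usually."}

    def toy_bot(question):
        return CANNED.get(question, "I have no idea.")

    score = knowledge_test(toy_bot, [("What colour is the sky?", "blue"),
                                     ("Who wrote Hamlet?", "Shakespeare")])
    print(f"knowledge score: {score:.0%}")  # -> knowledge score: 50%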

At any rate, the Loebner seems in vigorous health, with a strong group of contenders this year: I hope it continues.

10 thoughts on “Loebner 2008”

  1. Check out this Web 2.0 approach to chatbots: http://chatbotgame.com.

    Just as Deep Blue brute-forced it in chess with speed, the idea behind the Chatbot Game is to brute-force it with a huge number of user-submitted Google-like chat rules.

  2. Good thoughts! I’m doing scripted work for commercial purposes, using recursive phrase and word building blocks, which could very possibly do well, but I kept clear of using it for the Loebner Prize, as it seems too contrary to the Turing Test. Instead the entry was a new variant on Jabberwacky – same learnt data, less wacky, more fuzzy, deeper context: http://www.cleverbot.com

  3. Thanks, amichail.

    And thanks, Rollo – if this were the stock exchange, I should certainly advise people to buy into Carpenter shares.

  4. It sounds as if Elbot has been programmed to talk like Sarah Palin. Or is it the other way round?

  5. Ever since I first read about ELIZA, I have been fascinated by chatbots, so I definitely follow the Loebner Prize with interest. I think a misunderstanding people have is saying “well, chatbots aren’t ‘real’ intelligence”, meaning that the rules-based keyword matching the original ELIZA used, and that her descendants may use in one form or another today, bears little resemblance to how the human brain works. I think the best response to this so-called criticism would be: “Yeah, and?” A submarine bears little resemblance to a shark in terms of going down or up – a submarine takes on water to go down and releases it to go up, while a shark is in equilibrium with the water and so doesn’t need this feature. Yet from a distance a small “tourist” sub shaped roughly like a large shark could look similar to the shark, and for all practical purposes be considered a shark.

    Point being: even if the “how” of a technology is different from its counterpart in nature, if its effect is similar, then that is what matters. So to say ELIZA and her children aren’t “real intelligence” is like saying a submarine is not a “real mariner” because it dives differently than a shark does. That would be ridiculous. The submarine is a real enough mariner to the aquatic denizens which encounter it. Similarly, just because chatbots differ in the “how” from naturally occurring intelligences doesn’t mean we can dismiss them as not “real intelligence”.

    In a word: if it looks like a duck, and quacks like a duck, it’s a duck. If one of ELIZA’s grandchildren passes the Turing Test, then, in my view, such a machine would deserve all the rights afforded to man, irrespective of how it was programmed, even if it were, for example, by “brute force” calculations (I am a QA engineer so I know a bit about how things are programmed, lol!). Deep Blue beat Garry Kasparov. I confess as a teenager at the time I was a bit disappointed, somehow sensing the inevitability of technology outpacing human effort. But I’ve had time to get over that, and now am cheering for chatbots to pass the Turing Test! 🙂

  6. Thanks, Frank. I quite agree with your first paragraph, but I’m not sure I’d go as far as you do in the second. If a submarine showed itself capable of moving through water like a whale, you could reasonably dismiss any complaint that it wasn’t ‘real swimming’. But you wouldn’t start worrying about whether the submarine was comfortable or needed to be returned ‘to the wild’!

  7. This conversation is a waste of good time. What the hell are you talking about?
