Turing Test tactics

2012 was Alan Turing Year, marking the centenary of his birth. The British Government, a little late perhaps, announced recently that it would support a Bill giving Turing a posthumous pardon; Gordon Brown, then the Prime Minister, had already issued an official apology in 2009. As you probably know, Turing, who was gay, was threatened with blackmail by one of his lovers (homosexuality being still illegal at the time) and reported the matter to the authorities; he was then tried and convicted and offered a choice of going to jail or taking hormones, effectively a form of oestrogen. He chose the latter, but subsequently died of cyanide poisoning in what is generally believed to have been suicide, leaving by his bed a partly-eaten apple, thought by many to be a poignant allusion to the story of Snow White. In fact it is not clear that the apple had any significance or that his death was actually suicide.

The pardon was widely but not universally welcomed: some thought it an empty gesture; some asked why Turing alone should be pardoned; and some even saw it as an insult, confirming by implication that Turing’s homosexuality was indeed an offence that needed to be forgiven.

Turing is generally celebrated for wartime work at Bletchley Park, the code-breaking centre, and for his work on the Halting Problem: on the latter he was pipped at the post by Alonzo Church, but his solution included the elegant formalisation of the idea of digital computing embodied in the Turing Machine, recognised as the foundation stone of modern computing. In a famous paper from 1950 he also effectively launched the field of Artificial Intelligence, and it is here that we find what we now call the Turing Test, a much-debated proposal that the ability of machines to think might be tested by having a short conversation with them.

Turing’s optimism about artificial intelligence has not been justified by developments since: he thought the Test would be passed by the end of the twentieth century. For many years the Loebner Prize contest has invited contestants to provide computerised interlocutors to be put through a real Turing Test by a panel of human judges, who attempt to tell which of their conversational partners, communicating remotely by text on a screen, is human and which machine. None of the ‘chat-bots’ has succeeded in passing itself off as human so far – but then so far as I can tell none of the candidates ever pretended to be a genuinely thinking machine – they’re simply designed to scrape through the test by means of various cunning tricks – so according to Turing, none of them should have succeeded.

One lesson which has emerged from the years of trials – often inadvertently hilarious – is that success depends strongly on the judges. If the judge allows the chat-bot to take the lead and steer the conversation, a good impression is likely to be possible; but judges who try to make things difficult for the computer never fail. So how do you go about tripping up a chat-bot?

Well, we could try testing its general knowledge. Human beings have a vast repository of facts, which even the largest computer finds it difficult to match. One problem with this approach is that human beings cannot be relied on to know anything in particular – not knowing the year of the battle of Hastings, for example, does not prove that you’re not human. The second problem is that computers have been getting much better at this. Some clever chat-bots these days are permanently accessible online; they save the inputs made by casual visitors and later discreetly feed them back to another subject, noting the response for future use. Over time they accumulate a large database of what humans say in these circumstances and what other humans say in response. The really clever part of this strategy is that not only does it provide good responses, it means your database is automatically weighted towards the most likely topics and queries. It turns out that human beings are fairly predictable, and so the chat-bot can come back with responses that are sometimes eerily good, embodying human-style jokes, finishing quotations, apparently picking up web-culture references, and so on.
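The save-and-replay strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ParrotBot`, its `learn` and `respond` methods, and the exact-match lookup on normalised text are all hypothetical, and real chat-bots of this kind presumably use much fuzzier matching.

```python
from collections import Counter, defaultdict


def normalise(text):
    """Lower-case and strip punctuation so trivially different inputs match."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


class ParrotBot:
    """Hypothetical sketch: replies only with things humans have said before."""

    def __init__(self):
        # normalised prompt -> counts of the human replies seen for it
        self.replies = defaultdict(Counter)

    def learn(self, prompt, human_reply):
        """Record what a human said in response to a prompt."""
        self.replies[normalise(prompt)][human_reply] += 1

    def respond(self, prompt, fallback="Tell me more."):
        """Return the most frequently seen human reply, or a stock fallback."""
        seen = self.replies.get(normalise(prompt))
        if not seen:
            return fallback
        return seen.most_common(1)[0][0]


bot = ParrotBot()
bot.learn("To be or not to be", "That is the question.")
bot.learn("to be, or not to be?", "That is the question.")
bot.learn("To be or not to be", "Not to be, thanks.")
print(bot.respond("To be, or not to be!"))
```

Because replies are counted per prompt, the stored data leans automatically towards whatever visitors say most often, which is the weighting effect mentioned above.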

If we’re subtle we might try to turn this tactic of saving real human input against the chat-bot, looking for responses that seem more appropriate for someone speaking to a chat-bot than someone engaging in normal conversation, or perhaps referring to earlier phases of the conversation that never happened. But this is a tricky strategy to rely on, generally requiring some luck.

Perhaps rather than trying established facts, it might be better to ask the chat-bot questions which have never been asked before in the entire history of the world, but which any human can easily answer. When was the last time a mole fought an octopus? How many emeralds were in the crown worn by Shakespeare during his visit to Tashkent?

It might be possible to make things a little more difficult for the chat-bot by asking questions that require an answer in a specific format; but it’s hard to do that effectively in a Turing Test because normal usage is generally extremely flexible about what it will accept as an answer; and failing to match the prescribed format might be more human rather than less. Moreover, rephrasing is another field where the computers have come on a lot: we only have to think of the Watson system’s performance at the quiz game Jeopardy, which besides rapid retrieval of facts required just this kind of reformulation.

So it might be better to move away from general stuff and ask the chat-bot about specifics that any human would know but which are unlikely to be in a database – the weather outside, which hotel it is supposedly staying at. Perhaps we should ask it about its mother, as they did in similar circumstances in Blade Runner, though probably not for her maiden name.

On a different tack, we might try to exploit the weakness of many chat-bots when it comes to holding a context: instead of falling into the standard rhythm of one input, one response, we can allude to something we mentioned three inputs ago. Although they have got a little better, most chat-bots still seem to have great difficulty maintaining a topic across several inputs or ensuring consistency of response. Being cruel, we might deliberately introduce oddities that the bot needs to remember: we tell it our cat is called Fish and then a little later ask whether it thinks the Fish we mentioned likes to swim.

Wherever possible we should fall back on Gricean implicature and provide good enough clues without spelling things out. Perhaps we could observe to the chat-bot that poor grammar is very human – which to a human more or less invites an ungrammatical response, although of course we can never rely on a real human’s getting the point. The same thing is true, alas, in the case of some of the simplest and deadliest strategies, which involve changing the rules of discourse. We tell the chat-bot that all our inputs from now on lliw eb delleps tuo sdrawkcab and ask it to reply in the same way, or we js mss t ll th vwls.
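It is worth noticing how trivial the two rule changes are as mechanical operations. A minimal Python sketch, with hypothetical function names `spell_backwards` and `drop_vowels`, might look like this; the difficulty for a chat-bot lies not in performing the transformation but in understanding, from a plain-language request, that it has been asked to.

```python
def spell_backwards(sentence):
    """Reverse the letters of each word, keeping the word order."""
    return " ".join(word[::-1] for word in sentence.split())


def drop_vowels(sentence):
    """Remove a, e, i, o, u from each word; skip words left empty."""
    stripped = (
        "".join(ch for ch in word if ch.lower() not in "aeiou")
        for word in sentence.split()
    )
    return " ".join(word for word in stripped if word)


print(spell_backwards("will be spelled out backwards"))
print(drop_vowels("just miss out all the vowels"))
```

A programmer who knew the rule change was coming could build these in, of course; the strategy works only so long as the judge can keep inventing new rules on the spot.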

Devising these strategies makes us think in potentially useful ways about the special qualities of human thought. If we bring all our insights together, can we devise an Ultra-Turing Test? That would be a single question which no computer ever answers correctly and all reasonably alert and intelligent humans get right. We’d have to make some small allowance for chance, as there is obviously no answer that couldn’t be generated at random in some tiny number of cases. We’d also have to allow for the fact that as soon as any question was known, artful chat-bot programmers would seek to build in an answer; the question would have to be such that they couldn’t do that successfully.

Perhaps the question would allude to some feature of the local environment which would be obvious but not foreseeable (perhaps just the time?) but pick it out in a non-specific allusive way which relied on the ability to generate implications quickly from a vast store of background knowledge. It doesn’t sound impossible…


Ambiguous Turing

It’s just 50 years since Alan Turing’s tragic death. The anniversary was marked in Manchester and elsewhere, but little seems to have appeared on the Internet – perhaps surprisingly, given his importance in the development of the computer.

Turing has a number of tremendous achievements to his credit. His war-time code-breaking may be the most famous; but perhaps the most important was the idea of the Turing machine, the theoretical apparatus which defined computation and computers. It had two distinct consequences: on the one hand, it dealt with the Entscheidungsproblem, one of the key issues of 20th century mathematics; on the other, it gave rise, via Turing’s famous (1950) paper, to the period of intense optimism about artificial intelligence which I referred to earlier as the ‘Turing era’. The curious thing is that these two consequences of the Turing machine point in opposite, almost antithetical directions.

How so? The Entscheidungsproblem, posed by Hilbert, asks whether there is any mechanical procedure for determining whether a mathematical problem is solvable. The universal Turing machine embodies and clarifies the idea of ‘mechanical’ calculation. It is a simple apparatus which prints or erases characters on a paper tape according to the rules it has been given. In spite of this extreme simplicity it can in principle carry out any mechanical computation. In theory, in fact, it can run an appropriate version of any computer program, including the ones being used to display this page. In many respects it appears to be an entirely realistic machine which could easily be put together, but it has certain other qualities which make it an impossible abstraction. For one thing, it has to have an infinite paper tape: for another, it has to be immune to malfunction, no matter how long it runs; and most fundamental of all, it has to operate with discrete states – it must switch from one physical configuration to another without any intervening half-way stages. These characteristics mean that it is actually more like a complex function than a real machine. Nevertheless, all real-world computers owe their computerhood to their resemblance to it.
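To make the apparatus concrete, here is a minimal sketch of a single-tape machine in Python, together with a small rule table that adds one to a binary number. The names `run`, `rules`, `BLANK` and `max_steps` are my own, not Turing's, and two compromises are labelled in the code: a dictionary stands in for the infinite tape, and a step limit stands in for the machine's licence to run forever. Both are reminders of the sense in which the real thing is an abstraction.

```python
from collections import defaultdict

BLANK = "_"


def run(rules, tape, state="start", halt="halt", max_steps=10_000):
    """Run a single-tape machine and return the non-blank part of the tape.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is -1 (one cell left) or +1 (one cell right).
    """
    # Assumption: a dict approximates the unbounded tape; unvisited cells are blank.
    cells = defaultdict(lambda: BLANK, enumerate(tape))
    head = 0
    steps = 0
    while state != halt:
        if steps >= max_steps:  # assumption: a practical limit the ideal machine lacks
            raise RuntimeError("step limit reached without halting")
        write, move, state = rules[(state, cells[head])]
        cells[head] = write
        head += move
        steps += 1
    used = [i for i, symbol in cells.items() if symbol != BLANK]
    if not used:
        return ""
    return "".join(cells[i] for i in range(min(used), max(used) + 1))


# Illustrative rule table: scan right to the end of the number, then carry leftwards.
increment = {
    ("start", "0"): ("0", +1, "start"),
    ("start", "1"): ("1", +1, "start"),
    ("start", BLANK): (BLANK, -1, "carry"),
    ("carry", "1"): ("0", -1, "carry"),
    ("carry", "0"): ("1", -1, "halt"),
    ("carry", BLANK): ("1", -1, "halt"),
}

print(run(increment, "1011"))
```

Each entry in the table says only: in this state, reading this symbol, write that symbol, move one cell, and switch to that state. Nothing more is needed for the machine to turn 1011 (eleven) into 1100 (twelve).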

The clear conception of computation which the Turing machine provided allowed Turing to show that the Entscheidungsproblem had to be answered in the negative – there is no general procedure which can deal with all mathematical problems, even in principle. In fact, Turing was slightly too late to claim full credit for this result, which had already been established by Alonzo Church using a different approach.

The thing is, this result goes naturally with Gödel’s proof of the incompleteness of arithmetic in the sense that both establish limitations of formal algorithmic calculation. Both, therefore, suggest that the kind of computation performed by machines can never fully equal the thought processes of human beings (however those may work), which do not seem to suffer the same limitations. Gödel seems to have interpreted his own work this way. In fact there is some reason to think that Turing initially took a similar view. Andrew Hodges has pointed out that after completing his work on the Entscheidungsproblem, Turing attempted to produce a formal logic based on ordinals. It seems to have been the idea that this new, ordinal-based work would provide the basis for the kind of ‘intuitive’ reasoning which Turing machines couldn’t deliver – the kind human beings used to see the truth of Gödel statements. Only when these efforts failed, it seems, did Turing look for reasons to think that machine-style computation might be good enough to deliver a real mind after all.

Looked at again in this light, the 1950 paper seems more evasive and equivocal. It is a curious paper in many ways, with its playful tone and respectful mentions of ESP and Ada, Countess of Lovelace, but it also skirts the issue. Can machines think? Well, it says, let’s consider instead whether they can pass the Turing test. If they can, well, perhaps the original question is too meaningless to worry about.

But it surely isn’t meaningless: it’s partly because we believe that people really can think that our attitude to death is so different from our attitude to switching off the computer, for example.

It seems possible, anyway, that Turing’s desire to believe that a mechanical mind was possible led him to seek ways around the negative implications of his own work. The logic of ordinals was one possibility: when that failed, the Turing Test was basically another, justifying further work with Turing-machine style computers.

Had he lived, of course, he might eventually have changed his mind about his own Test, or found better ways of dealing with ‘intuition’. We’ll never know quite how much we lost when, punished for his homosexuality with oestrogen injections and expelled from further participation in Government work, he killed himself with a poisoned apple.

But it is a poignant thought that in the natural course of things he could still have been alive today.

Is the Turing Era over?

Picture: Blandula. Can machines think? That was the question with which Alan Turing opened his famous paper of 1950, ‘Computing machinery and intelligence’. The question was not exactly new, but the answer he gave opened up a new era in our thinking about minds. It had been more or less agreed up to that time that consciousness required a special and particularly difficult kind of explanation. If it didn’t require spiritual intervention, or outright magic, it still needed some special power which no mere machine could possibly reproduce. Turing boldly predicted that by the end of the century we should have machines which everyone habitually treated as conscious entities, and his paper inspired a new optimism about our ability to solve the problems. But that was 1950. I’m afraid that fifty years of work since then have effectively shown that the answer is no – machines can’t think.

Picture: Bitbucket. A little premature, I think. You have to remember that until 1950 there was very little discussion of consciousness. Textbooks on psychology never mentioned the subject. Any scientist who tried to discuss it seriously risked being taken for a loony by his colleagues. It was effectively taboo. Turing changed all that, partly by making the notion of a computer a clear and useful mathematical concept, but also through the ingenious suggestion of the Turing Test. It transformed the debate and during the second half of the century it made consciousness the hot topic of the day, the one all the most ambitious scientists wanted to crack: a subject eminent academics would take up after their knighthood or Nobel. The programme got under way, and although we have yet to achieve anything like a full human consciousness, it’s already clear that there is no insurmountable barrier after all. I’d argue, in fact, that some simple forms of artificial consciousness have already been achieved.

Picture: Blandula. But Turing’s deadline, the year 2000, is past. We know now that his prediction, and others made since, were just wrong. Granted, some progress has been made: no-one now would claim that computers can’t play chess. But they haven’t done that well, even against Turing’s own test, which in some ways is quite undemanding. It’s not that computers failed it; they never got good enough even to put up a serious candidate. You say that consciousness used to be a taboo subject, but perhaps it was just that earlier generations of scientists knew how to shut up when they had nothing worth saying…

Picture: Bitbucket. Of course, people got a bit over-optimistic during the last half of the last century. People always quote the story about Marvin Minsky giving one of his graduate students the job of sorting out vision over the course of the summer (I have a feeling that if that ever happened it was a joke in the first place). Of course it’s embarrassing that some of the wilder predictions have not come true. But you’re misrepresenting Turing. The way I read him, he wasn’t saying it would all be over by 2000, he was saying, look, let’s put the philosophy aside until we’ve got a computer that can at least hold some kind of conversation.

But really I’m wasting my breath – you’ve just got a closed mind on the subject. Let’s face it, even if I presented you with a perfectly human robot (even if I suddenly revealed that I myself had been a robot all along), you still wouldn’t accept that it proved anything, would you?

Picture: Blandula. Your version of Turing sounds relatively sensible, but I just don’t think his paper bears that interpretation. As for your ‘perfectly human’ robot, I look forward to seeing it, but no, you’re right, I probably wouldn’t think it proved anything much. Imitating a person, however brilliantly, and being a person are two different things. I’d need to know what was going on inside the robot, and have a convincing theory of why it added up to real consciousness.

Picture: Bitbucket. No theory is going to be convincing if you won’t give it fair consideration. I think you must sometimes have serious doubts about the so-called problem of other minds. Do you actually feel sure that all your fellow human beings are really fully conscious entities?

Picture: Blandula. Well…