Turing Test tactics

2012 was Alan Turing Year, marking the centenary of his birth. The British Government, a little late perhaps, announced recently that it would support a Bill giving Turing a posthumous pardon; Gordon Brown, then the Prime Minister, had already issued an official apology in 2009. As you probably know, Turing, who was gay, was threatened with blackmail by one of his lovers (homosexual acts still being illegal at the time) and reported the matter to the authorities; he was then tried and convicted, and offered a choice between going to jail and taking hormones, effectively a form of oestrogen. He chose the latter, but subsequently died of cyanide poisoning in what is generally believed to have been suicide, leaving by his bed a partly-eaten apple, thought by many to be a poignant allusion to the story of Snow White. In fact it is not clear that the apple had any significance, or that his death was actually suicide.

The pardon was widely but not universally welcomed: some thought it an empty gesture; some asked why Turing alone should be pardoned; and some even saw it as an insult, confirming by implication that Turing’s homosexuality was indeed an offence that needed to be forgiven.

Turing is generally celebrated for his wartime work at Bletchley Park, the code-breaking centre, and for his work on the Entscheidungsproblem, the decision problem closely tied to the Halting Problem: here he was pipped at the post by Alonzo Church, but his solution included the elegant formalisation of the idea of digital computing embodied in the Turing Machine, recognised as the foundation stone of modern computing. In a famous paper of 1950 he also effectively launched the field of Artificial Intelligence, and it is there that we find what we now call the Turing Test, a much-debated proposal that the ability of machines to think might be tested by having a short conversation with them.

Turing’s optimism about artificial intelligence has not been justified by developments since: he thought the Test would be passed by the end of the twentieth century. For many years the Loebner Prize contest has invited contestants to provide computerised interlocutors to be put through a real Turing Test by a panel of human judges, who attempt to tell which of their conversational partners, communicating remotely by text on a screen, is human and which machine. None of the ‘chat-bots’ has succeeded in passing itself off as human so far. But then, so far as I can tell, none of the candidates has ever pretended to be a genuinely thinking machine: they are simply designed to scrape through the test by means of various cunning tricks, so by Turing’s own lights none of them should have succeeded.

One lesson which has emerged from the years of trials – often inadvertently hilarious – is that success depends strongly on the judges. If the judge allows the chat-bot to take the lead and steer the conversation, a good impression is likely to be possible; but judges who set out to make things difficult for the computer never fail to expose it. So how do you go about tripping up a chat-bot?

Well, we could try testing its general knowledge. Human beings have a vast repository of facts which even the largest computer finds difficult to match. The first problem with this approach is that human beings cannot be relied on to know anything in particular: not knowing the year of the Battle of Hastings, for example, does not prove that you’re not human. The second problem is that computers have been getting much better at this. Some clever chat-bots these days are permanently accessible online; they save the inputs made by casual visitors and later discreetly feed them back to another subject, noting the response for future use. Over time they accumulate a large database of what humans say in these circumstances and what other humans say in response. The really clever part of this strategy is that it not only provides good responses, it means the database is automatically weighted towards the most likely topics and queries. It turns out that human beings are fairly predictable, and so the chat-bot can come back with responses that are sometimes eerily good, embodying human-style jokes, finishing quotations, apparently picking up web-culture references, and so on.
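Mechanically, the save-and-replay trick is quite simple. Here is a minimal sketch in Python of how such a bot might work; the class and method names are purely illustrative, and a real chat-bot would of course normalise inputs and match them approximately rather than exactly:

```python
import random
from collections import Counter

class ReplayBot:
    """Toy sketch of the 'save and replay' strategy: remember what humans
    say in reply to each line, and reuse the most popular replies."""

    def __init__(self):
        self.replies = {}      # line the bot said -> Counter of human replies
        self.last_said = None  # the bot's previous utterance, if any

    def respond(self, text):
        # Whatever the visitor says is a human reply to our previous line:
        # record it, so the database grows weighted towards real usage.
        if self.last_said is not None:
            self.replies.setdefault(self.last_said, Counter())[text] += 1
        if text in self.replies:
            # We have fed this line to humans before: reuse the most
            # common human reply, which can be eerily apt.
            answer = self.replies[text].most_common(1)[0][0]
        elif self.replies:
            # Unknown input: discreetly feed back a saved human line and
            # note what the visitor says to it, for future use.
            answer = random.choice(list(self.replies))
        else:
            # Cold start: nothing saved yet, so just echo the visitor.
            answer = text
        self.last_said = answer
        return answer
```

The self-weighting is the clever part: because every stored line was originally typed by a human, the corpus automatically concentrates on the topics and phrasings people actually produce.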

If we’re subtle, we might try to turn this tactic of saving real human input against the chat-bot, looking for responses that seem more appropriate to someone speaking to a chat-bot than to someone engaged in normal conversation, or for replies that refer to earlier phases of the conversation that never happened. But this is a tricky strategy to rely on, and generally requires some luck.

Perhaps, rather than testing established facts, it might be better to ask the chat-bot questions which have never been asked before in the entire history of the world, but which any human can easily answer. When was the last time a mole fought an octopus? How many emeralds were in the crown worn by Shakespeare during his visit to Tashkent?

It might be possible to make things a little more difficult for the chat-bot by asking questions that require an answer in a specific format; but it’s hard to do that effectively in a Turing Test, because normal usage is extremely flexible about what it will accept as an answer, and failing to match the prescribed format might be more human rather than less. Moreover, rephrasing is another field where the computers have come on a lot: we only have to think of the Watson system’s performance at the quiz game Jeopardy!, which besides rapid retrieval of facts required just this kind of reformulation.

So it might be better to move away from general stuff and ask the chat-bot about specifics that any human would know but which are unlikely to be in a database – the weather outside, or the hotel it is supposedly staying at. Perhaps we should ask it about its mother, as they did in similar circumstances in Blade Runner – though probably not for her maiden name.

On a different tack, we might try to exploit the weakness of many chat-bots when it comes to holding a context: instead of falling into the standard rhythm of one input, one response, we can allude to something we mentioned three inputs ago. Although they have got a little better, most chat-bots still seem to have great difficulty maintaining a topic across several inputs or ensuring consistency of response. Being cruel, we might deliberately introduce oddities that the bot needs to remember: we tell it our cat is called Fish, and then a little later ask whether it thinks the Fish we mentioned likes to swim.
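A probe of this kind is easy to script. The sketch below assumes a hypothetical bot object exposing a respond(text) method – the interface is invented for illustration – and the keyword check at the end is only a crude stand-in for a human judge’s reading of the reply:

```python
def context_probe(bot):
    """Plant an odd fact, pad with unrelated turns, then test recall."""
    bot.respond("By the way, my cat is called Fish.")
    bot.respond("What's your favourite colour?")          # padding turn
    bot.respond("Mine is green, like the sea.")           # more padding
    reply = bot.respond("Do you think the Fish I mentioned likes to swim?")
    # A human-like answer connects 'Fish' back to the cat rather than to
    # fish in general; checking for 'cat' is a rough proxy for that.
    return "cat" in reply.lower()
```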

Wherever possible we should fall back on Gricean implicature and provide good enough clues without spelling things out. Perhaps we could observe to the chat-bot that poor grammar is very human – which to a human more or less invites an ungrammatical response, although of course we can never rely on a real human’s getting the point. The same thing is true, alas, in the case of some of the simplest and deadliest strategies, which involve changing the rules of discourse. We tell the chat-bot that all our inputs from now on lliw eb delleps tuo sdrawkcab and ask it to reply in the same way, or we js mss t ll th vwls.
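The transformations themselves are trivial to code, as this quick sketch shows (note that a strict vowel-stripper renders ‘just’ as ‘jst’, slightly tidier than the playful spelling above); what defeats the chat-bot is recognising and adopting the new convention mid-conversation:

```python
VOWELS = set("aeiouAEIOU")

def backwards(text):
    """Spell each word out backwards: 'will be' -> 'lliw eb'."""
    return " ".join(word[::-1] for word in text.split())

def strip_vowels(text):
    """Miss out all the vowels: 'the vowels' -> 'th vwls'."""
    return "".join(ch for ch in text if ch not in VOWELS)

print(backwards("will be spelled out backwards"))    # lliw eb delleps tuo sdrawkcab
print(strip_vowels("just miss out all the vowels"))  # jst mss t ll th vwls
```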

Devising these strategies makes us think in potentially useful ways about the special qualities of human thought. If we bring all our insights together, can we devise an Ultra-Turing Test? That would be a single question which no computer ever answers correctly but which all reasonably alert and intelligent humans get right. We’d have to make some small allowance for chance, as there is obviously no answer that couldn’t be generated at random in some tiny proportion of cases. We’d also have to allow for the fact that as soon as any question became known, artful chat-bot programmers would seek to build in an answer; the question would have to be such that they couldn’t do that successfully.

Perhaps the question would allude to some feature of the local environment which would be obvious but not foreseeable (perhaps just the time?), but pick it out in a non-specific, allusive way which relied on the ability to generate implications quickly from a vast store of background knowledge. It doesn’t sound impossible…


A General Taxonomy of Lust

A general taxonomy of lust… is not really what this piece is about (sorry). It’s an idea I had years ago for a short story or a novella. ‘Lust’ here would have been interpreted broadly, as any state which impels a human being towards sex. I had in mind a number of axes defining a general ‘lust space’. One of the axes, if I remember rightly, had specific attraction to one person at one end and generalised indiscriminate enthusiasm at the other; another went from sadistic to masochistic, and so on. I think I had eighty-one basic forms of lust, and the idea was to write short episodes exemplifying each one: in fact, to weave all of them into a coherent narrative.

My creative gifts were not up to that challenge, but I mention it here because one of the axes went from the purely intellectual to the purely physical. At the intellectual extreme you might have an elderly homosexual aristocrat who, on inheriting a title, realises it is his duty to attempt to procure an heir. At the purely physical end you might have an adolescent boy on a train who notices he has an erection which is unrelated to anything that has passed through his mind.

That axis would have made a lot of sense (perhaps) to Luca Barlassina and Albert Newen, whose paper in Philosophy and Phenomenological Research sets out an impure somatic theory of the emotions. In short, they claim that emotions are constituted by the integration of bodily perceptions with representations of external objects and states of affairs.

Somatic theories say that emotions are really just bodily states. We don’t get red in the face because we’re angry; we get angry because we’ve become red in the face. As no less an authority than William James had it:

The more rational statement is that we feel sorry because we cry, angry because we strike, afraid because we tremble, and not that we cry, strike, or tremble, because we are sorry, angry, or fearful, as the case may be. Without the bodily states following on the perception, the latter would be purely cognitive in form, pale, colorless, destitute of emotional warmth.

This view did not appeal to everyone, but the elegantly parsimonious reduction it offers has retained its appeal, and Jesse Prinz has put forward a sophisticated 21st century version. It is Prinz’s theory that Barlassina and Newen address; they think it needs adulterating, but they clearly want to build on Prinz’s foundations, not reject them.

So what does Prinz say? His view of emotions fits into the framework of his general view about perception: for him, a state is a perceptual state if it is a state of a dedicated input system – e.g. the visual system. An emotion is simply a state of the system that monitors our own bodies; in other words, emotions are just perceptions of our own bodily states. Even for Prinz, that’s a little too pure: emotions, after all, are typically about something. They have intentional content. We don’t just feel angry, we feel angry about something or other. Prinz regards emotions as having dual content: they register bodily states but also represent core relational themes (as against, say, fatigue, which both registers and represents a bodily state). On top of that, they may involve propositional attitudes – thoughts about some evocative future event, for example – but the propositional attitudes only evoke the emotions; they don’t play any role in constituting them. Further still, certain higher emotions are recalibrations of lower ones: the simple emotion of sadness is recalibrated so that it can be controlled by a particular set of stimuli, and becomes guilt.

So far so good. Barlassina and Newen have four objections. First, if Prinz is right, then the neural correlates of emotion and the perception of the relevant bodily states must just be the same. Taking the example of disgust, B&N argue that the evidence suggests otherwise: interoception, the perception of bodily changes, may indeed cause disgust, but does not equate to it neurologically.

Second, they see problems with Prinz’s method of bringing in intentional content. For Prinz emotions differ from mere bodily feeling because they represent core relational themes. But, say B&N, what about ear pressure? It tells us about unhealthy levels of barometric pressure and oxygen, and so relates to survival, surely a core relational theme: and it’s certainly a perception of a bodily state – but ear pressure is not an emotion.

Third, Prinz’s account only allows emotions to be about general situations; but in fact they are about particular things. When we’re afraid of a dog, we’re afraid of that dog; we’re not just experiencing a general fear in the presence of a specific dog.

Fourth, Prinz doesn’t fully accommodate the real phenomenology of emotions. For him, fear of a lion is fear accompanied by some beliefs about a lion: but B&N maintain that the directedness of the emotion is built in, part of its inherent phenomenology.

Barlassina and Newen like Prinz’s somatic leanings, but they conclude that he simply doesn’t account sufficiently for the representational characteristics of emotions: consequently they propose an ‘impure’ theory by which emotions are cognitive states constituted when interoceptive states are integrated with perceptions of external objects or states of affairs.

This pollution, or elaboration, of the pure theory seems pretty sensible, and B&N give a clear and convincing exposition. At the end of the day it leaves me cold, not because they haven’t done a good job but because I suspect that somatic theories are always going to be inadequate, for two reasons.

First, they just don’t capture the phenomenology. There’s no doubt at all that emotions are often or typically characterised or coloured by perception of distinctive bodily states, but is that what they are in essence? It doesn’t seem so. It seems possible to imagine that I might be angry or sad without a body at all: not, of course, in the same good old human way, but angry or sad nevertheless. There seems to be something almost qualic about emotions, something over and above any of the physical aspects, characteristic though they may be.

Second, surely emotions are often essentially about dispositions to behave in a certain way? An account of anger which never mentions that anger makes me more likely to hit people just doesn’t seem to cut the mustard. Even William James spoke of striking people. In fact, I think one could plausibly argue that the physical changes associated with an emotion can often be related to the underlying propensity to behave in a certain way. We begin to breathe deeply and our heart pounds because we are getting ready for violent exertion, just as parallel cognitive changes get us ready to take offence and start a fight. Not all emotions are as neat as this: we’ve talked in the past about the difficulty of explaining what grief is for. Still, these considerations seem enough to show that a somatic account, even an impure one, can’t quite cover the ground.

Still, just as Barlassina and Newen built on Prinz, it may well be that they have provided some good foundation work for an even more impure theory.