Robots on drugs

Over the years many variants and improvements to the Turing Test have been proposed, but surely none more unexpected than the one put forward by Andrew Smart in this piece, anticipating his forthcoming book Beyond Zero and One. He proposes that in order to be considered truly conscious, a robot must be able to take an acid trip.

He starts out by noting that computers seem to be increasing in intelligence (whatever that means), and that many people see them attaining human levels of performance by 2100 (actually quite a late date compared to the optimism of recent decades; Turing talked about 2000, after all). Some people, indeed, think we need to be concerned about whether the powerful AIs of the future will like us or behave well towards us. In my view these worries tend to blur together two different things: improving processing speeds and sophistication of programming on the one hand, and transformation from a passive data machine into a spontaneous agent on the other, which is quite a different matter. Be that as it may, Smart reasonably suggests we could give some thought to whether and how we should make machines conscious.
It seems to me – this may be clearer in the book – that Smart divides things up in a slightly unusual way. I’ve got used to the idea that the big division is between access and phenomenal consciousness, which I take to be the same distinction as the one defined by the terminology of Hard versus Easy Problems. In essence, we have the kind of consciousness that’s relevant to behaviour, and the kind that’s relevant to subjective experience.
Although Smart alludes to the Chalmersian zombies that demonstrate this distinction, I think he puts the line a bit lower; between the kind of AI that no-one really supposes is thinking in a human sense and the kind that has the reflective capacities that make up the Easy Problem. He seems to think that experience just goes with that (which is a perfectly viable point of view). He speaks of consciousness as being essential to creative thought, for example, which to me suggests we’re not talking about pure subjectivity.
Anyway, what about the drugs? Smart seems to think that requiring robots to be capable of an acid trip is raising the bar, because it is in these psychedelic regions that the highest, most distinctive kind of consciousness is realised. He quotes Hofmann as believing that LSD…

…allows us to become aware of the ontologically objective existence of consciousness and ourselves as part of the universe…

I think we need to be wary here of the distinction between becoming aware of the universal ontology and having the deluded feeling of awareness. We should always remember the words of Oliver Wendell Holmes Sr:

…I once inhaled a pretty full dose of ether, with the determination to put on record, at the earliest moment of regaining consciousness, the thought I should find uppermost in my mind. The mighty music of the triumphal march into nothingness reverberated through my brain, and filled me with a sense of infinite possibilities, which made me an archangel for the moment. The veil of eternity was lifted. The one great truth which underlies all human experience, and is the key to all the mysteries that philosophy has sought in vain to solve, flashed upon me in a sudden revelation. Henceforth all was clear: a few words had lifted my intelligence to the level of the knowledge of the cherubim. As my natural condition returned, I remembered my resolution; and, staggering to my desk, I wrote, in ill-shaped, straggling characters, the all-embracing truth still glimmering in my consciousness. The words were these (children may smile; the wise will ponder): “A strong smell of turpentine prevails throughout.”…

A second problem is that Smart believes (with a few caveats) that any digital realisation of consciousness will necessarily have the capacity for the equivalent of acid trips. This seems doubtful. To start with, LSD is clearly a chemical matter and digital simulations of consciousness generally neglect the hugely complex chemistry of the brain in favour of the relatively tractable (but still unmanageably vast) network properties of the connectome. Of course it might be that a successful artificial consciousness would necessarily have to reproduce key aspects of the chemistry and hence necessarily offer scope for trips, but that seems far from certain. Think of headaches; I believe they generally arise from incidental properties of human beings – muscular tension, constriction of the sinuses, that sort of thing – I don’t believe they’re in any way essential to human cognition and I don’t see why a robot would need them. Might not acid trips be the same, a chance by-product of details of the human body that don’t have essential functional relevance?

The worst thing, though, is that Smart seems to have overlooked the main merit of the Turing Test; it’s objective. OK, we may disagree over the quality of some chat-bot’s conversational responses, but whether it fools a majority of people is something testable, at least in principle. How would we know whether a robot was really having an acid trip? Writing a chat-bot to sound as if it were tripping seems far easier than the original test; but other than talking to it, how can we know what it’s experiencing? Yes, if we could tell it was having intense trippy experiences, we could conclude it was conscious… but alas, we can’t. That seems a fatal flaw.

Maybe we can ask tripbot whether it smells turpentine.

The Turing Test – is it all over?

So, was the brouhaha over the Turing Test justified? It was widely reported last week that the test had been passed for the first time by a chatbot named ‘Eugene Goostman’. I think the name itself is a little joke: it sort of means ‘well-made ghost man’.

This particular version of the test was not the regular Loebner, which we have discussed before (Hugh Loebner must be grinding his teeth in frustration at the apparent ease with which Warwick garnered international media attention), but a session at the Royal Society apparently organised by Kevin Warwick. The bar was set unusually low in this case: to succeed the chatbot only had to convince 30% of the judges that it was human. This was based on the key sentence in the paper by Turing which started the whole thing:

I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 10⁹, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.

Fair enough, perhaps, but less impressive than the 50% demanded by other versions; and if 30% is the benchmark, this wasn’t actually the first ever pass, because other chatbots like Cleverbot have scored higher in the past. Goostman doesn’t seem to be all that special: in an “interview” on BBC radio, with only the softest of questioning, it used one response twice over, word for word.

The softness of the questioning does seem to be a crucial variable in Turing tests. If the judges stick to standard small talk and allow the chatbot to take the initiative, quite a reasonable dialogue may result: if the judges are tough it is easy to force the success rate down to zero by various tricks and traps.  Iph u spel orl ur werds rong, fr egsampl, uh bott kanot kope, but a human generally manages fine.

Perhaps that wouldn’t have worked for Goostman, though, because he was presented as a relatively ignorant young boy whose first language was not English, giving him some excuse for not understanding things. This stratagem attracted some criticism, but really it is of a piece with chatbot strategy in general; faking and gaming is what it’s all about. No-one remotely supposes that Goostman, or Cleverbot, or any of the others, has actually attained consciousness, or is doing anything that could properly be called thinking. Many years ago I believe there were serious efforts to write programs that to some degree imitated the probable mental processes of a human being: they identified a topic, accessed a database of information about it, retained a set of ‘attitudes’ towards things and tried to construct utterances that made sense in relation to them. It is a weakness of the Turing test that it does not reward that kind of effort; a robot with poor grammar and general knowledge might be readily detectable even though it gave signs of some nascent understanding, while a bot which generates smooth responses without any attempt at understanding has a much better chance of passing.
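
For illustration only, here is a minimal sketch, in Python with invented data, of the older topic-and-attitude approach described above; it is the shape of the idea rather than any historical system:

```python
# A hypothetical, much-simplified sketch of the older "understanding-first"
# chatbot design: identify a topic, consult a small knowledge base, check a
# stored attitude, and compose an utterance from them.
# All data and names here are illustrative, not taken from any real system.

KNOWLEDGE = {
    "weather": "it has been raining for three days straight",
    "chess": "chess programs now beat the strongest human players",
}

ATTITUDES = {
    "weather": "rather gloomy about",
    "chess": "honestly impressed by",
}


def identify_topic(utterance):
    """Crudely pick out the first known topic word mentioned."""
    for word in utterance.lower().split():
        if word in KNOWLEDGE:
            return word
    return None


def reply(utterance):
    topic = identify_topic(utterance)
    if topic is None:
        return "I'm not sure I follow - can you say a little more?"
    return f"Well, {KNOWLEDGE[topic]}, and I'm {ATTITUDES[topic]} that."


if __name__ == "__main__":
    print(reply("What do you make of chess these days?"))
```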

So perhaps the curtain should be drawn down on the test; not because it has been passed, but because it’s no use.

The bots are back in town…

Botprize is a version of the Turing Test for in-game AIs: they don’t have to talk, just run around playing Unreal Tournament (a first-person shooter game) in a way that convinces other players that they are human. In the current version players use a gun to tag their opponents as bots or humans; the bots, of course, do the same.

The contest initially ran from 2008 up to 2012; in the last year, two of the bots exceeded the 50% benchmark of humanness. The absence of a 2013 contest might have suggested that that had wrapped things up for good; but the 2014 contest is now under way, and it’s not too late to enter if you can get your bot sorted by 12 May. This time there will be two methods of judging; one called ‘first person’ (rather confusingly – that sounds as if participants will ask themselves: am I a bot?) is the usual in-game judging; the other (third person) will be a ‘crowd-sourced’ judgement based on people viewing selected videos after the event.

How does such a contest compare with the original Turing Test, a version of which is run every year as the Loebner Prize? The removal of any need to talk seems to make the test easier. Judges cannot use questions to test the bots’ memory (at least not in any detail), general knowledge, or ability to carry the thread of a conversation and follow unpredictable linkages of the kind human beings are so good at. They cannot set traps for the bots by making quirky demands (‘please reverse the order of the letters in each word when you respond’) or looking for a sense of humour.

In practice a significant part of the challenge is simply making a bot that plays the game at an approximately human level. This means the bot must never get irretrievably stuck in a corner or attempt to walk through walls; but also, it must not be too good – not a perfect shot that never misses and is inhumanly quick on the draw, for example. This kind of thing is really not different from the challenges faced by every game designer, and indeed the original bots supplied with the game don’t perform all that badly as human imitators, though they’re not generally as convincing as the contestants.

The way to win is apparently to build in typical or even exaggerated human traits. One example is that when a human player is shot at, they tend to go after the player that attacked them, even when a cool appraisal of the circumstances suggests that they’d do better to let it go. It’s interesting to reflect that if humans reliably seek revenge in this way, that tendency probably had survival value in the real world when the human brain was evolving; there must be important respects in which the game theory of the real world diverges from that of the game.

Because Botprize is in some respects less demanding than the original Turing Test, the conviction it delivers is less; the 2012 wins did not really make us believe that the relevant bots had human thinking ability, still less that they were conscious. In that respect a proper conversation carries more weight. The best chat-bots in the Loebner, however, are not at all convincing either, partly for a different reason – we know that no attempt has been made to endow them with real understanding or real thought; they are just machines designed to pass the test by faking thoughtful responses.

Ironically some of the less successful Botprize entrants have been more ambitious. In particular, Neurobot, created by Zafeiros Fountas as an MSc project, used a spiking neural network with a Global Workspace architecture; while not remotely on the scale of a human brain, this is in outline a plausible design for human-style cognition; indeed, one of the best we’ve got (which may not be saying all that much, of course). The Global Workspace idea, originated by Bernard Baars, situates consciousness as a general purpose space where inputs from different modules can be brought together and handled effectively. Although I have my reservations about that concept, it could at least reasonably be claimed that Neurobot’s functional states were somewhere on a spectrum which ultimately includes proper consciousness (interestingly, they would presumably be cognitive states of a kind which have never existed in nature, far simpler than those of most animals yet in some respects more like states of a human brain).
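
As a rough illustration of the Global Workspace idea, here is a toy sketch in Python of specialist modules competing for a shared workspace and receiving the winning broadcast. It is emphatically not Neurobot or a spiking network, and every name in it is made up:

```python
# A toy illustration of the Global Workspace architecture: modules compete
# for access to a shared workspace, and the winning content is broadcast
# back to all of them. This sketches the bare outline of the idea only.

import random


class Module:
    def __init__(self, name):
        self.name = name
        self.last_broadcast = None

    def propose(self):
        # A real module would derive salience from its own inputs;
        # here a random number stands in for that judgement.
        return (random.random(), f"{self.name} report")

    def receive(self, content):
        self.last_broadcast = content


def workspace_cycle(modules):
    bids = [m.propose() for m in modules]   # competition for access
    salience, winner = max(bids)            # the most salient bid wins
    for m in modules:
        m.receive(winner)                   # global broadcast to every module
    return winner


if __name__ == "__main__":
    mods = [Module(n) for n in ("vision", "threat", "navigation")]
    for _ in range(3):
        print("broadcast:", workspace_cycle(mods))
```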

The 2012 winners, by contrast, like the most successful Loebner chat-bots, relied on replaying recorded sequences of real human behaviour. Alas, this seems in practice to be the Achilles heel of Turing-style tests; canned responses just work too well.

Turing Test tactics

2012 was Alan Turing Year, marking the centenary of his birth. The British Government, a little late perhaps, announced recently that it would support a Bill giving Turing a posthumous pardon; Gordon Brown, then the Prime Minister, had already issued an official apology in 2009. As you probably know, Turing, who was gay, was threatened with blackmail by one of his lovers (homosexuality being still illegal at the time) and reported the matter to the authorities; he was then tried and convicted and offered a choice of going to jail or taking hormones, effectively a form of oestrogen. He chose the latter, but subsequently died of cyanide poisoning in what is generally believed to have been suicide, leaving by his bed a partly-eaten apple, thought by many to be a poignant allusion to the story of Snow White. In fact it is not clear that the apple had any significance or that his death was actually suicide.

The pardon was widely but not universally welcomed: some thought it an empty  gesture; some asked why Turing alone should be pardoned; and some even saw it as an insult, confirming by implication that Turing’s homosexuality was indeed an offence that needed to be forgiven.

Turing is generally celebrated for wartime work at Bletchley Park, the code-breaking centre, and for his work on the Entscheidungsproblem, the decision problem: on the latter he was pipped at the post by Alonzo Church, but his solution included the elegant formalisation of the idea of digital computing embodied in the Turing Machine, recognised as the foundation stone of modern computing. In a famous paper from 1950 he also effectively launched the field of Artificial Intelligence, and it is here that we find what we now call the Turing Test, a much-debated proposal that the ability of machines to think might be tested by having a short conversation with them.

Turing’s optimism about artificial intelligence has not been justified by developments since: he thought the Test would be passed by the end of the twentieth century. For many years the Loebner Prize contest has invited contestants to provide computerised interlocutors to be put through a real Turing Test by a panel of human judges, who attempt to tell which of their conversational partners, communicating remotely by text on a screen, is human and which machine.  None of the ‘chat-bots’  has succeeded in passing itself off as human so far – but then so far as I can tell none of the candidates ever pretended to be a genuinely thinking machine – they’re simply designed to scrape through the test by means of various cunning tricks – so according to Turing, none of them should have succeeded.

One lesson which has emerged from the years of trials – often inadvertently hilarious – is that success depends strongly on the judges. If the judge allows the chat-bot to take the lead and steer the conversation, a good impression is likely to be possible; but judges who try to make things difficult for the computer never fail. So how do you go about tripping up a chat-bot?

Well, we could try testing its general knowledge. Human beings have a vast repository of facts, which even the largest computer finds it difficult to match. One problem with this approach is that human beings cannot be relied on to know anything in particular – not knowing the year of the battle of Hastings, for example, does not prove that you’re not human. The second problem is that computers have been getting much better at this. Some clever chat-bots these days are permanently accessible online; they save the inputs made by casual visitors and later discreetly feed them back to another subject, noting the response for future use. Over time they accumulate a large database of what humans say in these circumstances and what other humans say in response. The really clever part of this strategy is that not only does it provide good responses, it means your database is automatically weighted towards the most likely topics and queries. It turns out that human beings are fairly predictable, and so the chat-bot can come back with responses that are sometimes eerily good, embodying human-style jokes, finishing quotations, apparently picking up web-culture references, and so on.
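
A hedged sketch of how such an eavesdropping chat-bot might accumulate and replay human responses, assuming nothing more sophisticated than bag-of-words matching (the class name and the matching rule are my own inventions, not a description of any real bot):

```python
# Every visitor's input is treated as a genuine human response to whatever
# the bot said last; it is stored and replayed when something similar comes
# up again. Real chat-bots are presumably far subtler than this.

from collections import defaultdict


class EavesdropBot:
    def __init__(self):
        self.replies = defaultdict(list)   # word-set of an utterance -> human replies to it
        self.last_bot_utterance = None

    @staticmethod
    def _key(text):
        return frozenset(text.lower().split())

    def respond(self, user_input):
        # The visitor is human, so their input is a real human response to
        # whatever we said last - save it for reuse on future visitors.
        if self.last_bot_utterance is not None:
            self.replies[self._key(self.last_bot_utterance)].append(user_input)

        # Replay a human reply once given to the most similar stored utterance.
        key = self._key(user_input)
        best_reply, best_overlap = None, 0
        for stored_key, answers in self.replies.items():
            overlap = len(key & stored_key)
            if overlap > best_overlap:
                best_reply, best_overlap = answers[-1], overlap

        reply = best_reply or "Interesting - tell me more."
        self.last_bot_utterance = reply
        return reply
```

Over many conversations the stored table ends up weighted towards whatever people most often say, which is the point made above.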

If we’re subtle we might try to turn this tactic of saving real human input against the chat-bot, looking for responses that seem more appropriate for someone speaking to a chat-bot than someone engaging in normal conversation, or perhaps referring to earlier phases of the conversation that never happened. But this is a tricky strategy to rely on, generally requiring some luck.

Perhaps rather than trying established facts, it might be better to ask the chat-bot questions which have never been asked before in the entire history of the world, but which any human can easily answer. When was the last time a mole fought an octopus? How many emeralds were in the crown worn by Shakespeare during his visit to Tashkent?

It might be possible to make things a little more difficult for the chat-bot by asking questions that require an answer in a specific format; but it’s hard to do that effectively in a Turing Test because normal usage is generally extremely flexible about what it will accept as an answer; and failing to match the prescribed format might be more human rather than less. Moreover, rephrasing is another field where the computers have come on a lot: we only have to think of the Watson system’s performance at the quiz game Jeopardy, which besides rapid retrieval of facts required just this kind of reformulation.

So it might be better to move away from general stuff and ask the chat-bot about specifics that any human would know but which are unlikely to be in a database – the weather outside, which hotel it is supposedly staying at. Perhaps we should ask it about its mother, as they did in similar circumstances in Blade Runner, though probably not for her maiden name.

On a different tack, we might try to exploit the weakness of many chat-bots when it comes to holding a context: instead of falling into the standard rhythm of one input, one response, we can allude to something we mentioned three inputs ago. Although they have got a little better, most chat-bots still seem to have great difficulty maintaining a topic across several inputs or ensuring consistency of response. Being cruel, we might deliberately introduce oddities that the bot needs to remember: we tell it our cat is called Fish  and then a little later ask whether it thinks the Fish we mentioned likes to swim.

Wherever possible we should fall back on Gricean implicature and provide good enough clues without spelling things out. Perhaps we could observe to the chat-bot that poor grammar is very human – which to a human more or less invites an ungrammatical response, although of course we can never rely on a real human’s getting the point. The same thing is true, alas, in the case of some of the simplest and deadliest strategies, which involve changing the rules of discourse. We tell the chat-bot that all our inputs from now on lliw eb delleps tuo sdrawkcab and ask it to reply in the same way, or we js mss t ll th vwls.
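
For what it is worth, the two rule-changing traps are trivial to state as code; here is a purely illustrative sketch of the transformations a judge might apply:

```python
# The two "rule-changing" traps mentioned above: spelling each word backwards,
# and missing out the vowels. A human judge can apply and undo these on the
# fly; a bot matching surface patterns presumably cannot.

def spell_backwards(text):
    """Reverse the letters of each word, keeping word order."""
    return " ".join(word[::-1] for word in text.split())


def miss_out_vowels(text):
    """Drop all the vowels from the text."""
    return "".join(ch for ch in text if ch.lower() not in "aeiou")


if __name__ == "__main__":
    print(spell_backwards("will be spelled out backwards"))  # lliw eb delleps tuo sdrawkcab
    print(miss_out_vowels("just miss out all the vowels"))   # jst mss t ll th vwls
```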

Devising these strategies makes us think in potentially useful ways about the special qualities of human thought. If we bring all our insights together, can we devise an Ultra-Turing Test? That would be a single question which no computer ever answers correctly and all reasonably alert and intelligent humans get right. We’d have to make some small allowance for chance, as there is obviously no answer that couldn’t be generated at random in some tiny number of cases. We’d also have to allow for the fact that as soon as any question was known, artful chat-bot programmers would seek to build in an answer; the question would have to be such that they couldn’t do that successfully.

Perhaps the question would allude to some feature of the local environment which would be obvious but not foreseeable (perhaps just the time?) but pick it out in a non-specific allusive way which relied on the ability to generate implications quickly from a vast store of background knowledge. It doesn’t sound impossible…

 

Most Human

In The Most Human Human: A Defence of Humanity in the Age of the Computer, Brian Christian gives an entertaining and generally sensible account of his campaign to win the ‘most human human’ award at the Loebner prize.

As you probably know, the Loebner prize is an annual staging of the Turing test: judges conduct online conversations with real humans and with chatbots and try to determine which is which.  The main point of the contest is for the chatbot creators to deceive as many judges as they can, but in order to encourage the human subjects there’s also an award for the participant considered to be least like a computer.  Christian set himself to win this prize, and intersperses the story of his effort with reflections on the background and significance of the enterprise.

He tries to ramp up the drama a bit by pointing out that in 2008 a chatbot came close to succeeding, persuading 3 of the judges that it was human: four would have got it the first-ever win. This year, therefore, he suggests, might in some sense prove to be humanity’s last stand.  I think it’s true that Loebner entrants have improved over the years. In the early days none of the chatbots was at all convincing and few of their efforts at conversation rose above the ludicrous – in fact, many early transcripts are worth reading for the surreal humour they inadvertently generated. Nowadays, if they’re not put under particular pressure the leading chatbots can produce a lengthy stream of fairly reasonable responses, mixed with occasional touches of genius and – still – the inevitable periodic lapses into the inapposite or plain incoherent. But I don’t think the Turing Test is seriously about to fall. The variation in success at the Loebner prize has something to do with the quality of the bots, but more with the variability of the judges. I don’t think it’s made very clear to the judges what their task is, and they seem to divide into hawks and doves: some appear to feel they should be sporting and play along with the bots, while others approach the conversations inquisitorially and do their best to catch their  ‘opponents’ out. The former approach sometimes lets the bots look good; the latter, I’m afraid, never really fails to unmask them.  I suspect that in 2008 there just happened to be a lot of judges who were ready to give the bots a fighting chance.

How do you demonstrate your humanity? In the past some human contenders have tried to signal their authenticity with displays of emotional affect, but I should have thought that that approach was susceptible to fakery. However, compared to any human being, the bots have little information and no understanding. They can therefore be thrown off either by allusions to matters of fact that a human participant would certainly know (the details of the hotel breakfast; a topical story from the same day’s news) but would not be in the bots’ databases; or they can be stumped by questions that require genuine comprehension (prhps w cld sk qstns wth n vwls nd sk fr rpls n the sm frmt?). In one way or another they rely on scripts, so as Christian deduces, it is basically a matter of breaking the pattern.

I’m not altogether sure how it comes about that humans can break pattern so easily while remaining confident that another human will readily catch their drift. Sometimes it’s a matter of human thought operating on more than one level, so that where two topics intersect we can leap from one to the other (a feature it might be worth trying to build into bots to some extent). In the case of the Loebner, though, hawkish judges are likely to make a point of leaving no thread of relevance whatever between each input, confident that any human being will be able to pick up a new narrative instantly from a standing start. I think it has something to do with the un-named human faculty that allows us to deal with pragmatics in language, evade the frame problem, and effortlessly catch and attribute meanings (at least, I think all those things rely at least partly on a common underlying faculty, or perhaps on an unidentified common property of mental processes).

Christian quotes an example of a bot which appeared to be particularly impoverished, having only one script: if it could persuade the judge to talk about Bill Clinton it looked very good, but as soon as the subject was changed it was dead meat. The best bots, like Rollo Carpenter’s Jabberwacky, seem to have a very large repertoire of examples of real human responses to the kind of thing real humans say in a conversation with chatbots (helpfully real humans are not generally all that original in these circumstances, so it’s almost possible to treat chatbot conversation as a large but limited domain in itself). They often seem to make sense, but still fall down on consistency, being liable to give random and conflicting answers, for example about their own supposed gender and marital status.

Reflecting on this, Christian notes that a great deal of ordinary everyday human interaction effectively follows scripts, too. In the routine of small talk, shopping, or ordering food, there tend to be ritual formulas to be followed and only a tiny exchange of real information. Where communication is poor, for example where there is no shared language, it’s still often relatively easy to get the tiny nuggets of actual information across and complete the transaction successfully (though not always: Christian tells against himself the story of a small disaster he suffered in Paris through assuming, by analogy with Spanish, that ‘station est’ must mean ‘this station’).

Doesn’t this show, Christian asks, that half the time we’re wasting time? Wouldn’t it be better if we dropped the stereotyped phatic exchanges and cut to the chase? In speed-dating it is apparently necessary to have rules forbidding participants to ask certain standard questions (Where do you live? What do you do for a living?) which eat up scarce time without people getting any real feel for each other’s personality. Wouldn’t it be more rewarding if we applied similar rules to all our conversations?

This,  Christian thinks, might be the gift which artificial intelligence ultimately bestows on us. Unlike some others, he’s not worried that dialogue with computers will make us start to think of ourselves as machines – the difference, he thinks, is too obvious. On the contrary, the experience of dealing with robots will bring home to us for the first time how much of our own behaviour is needlessly stereotyped and robotic and inspire us to become more original – more human – than we ever were before.

In some ways this makes sense. On a similar note, it has sometimes occurred to me in the past to wonder whether our time is best spent by so many of us watching the same films and reading the same books. Too often I have had conversations sharing identical memories of the same television programme and quoting the same passages from reviews in the same newspapers. Mightn’t it be more productive, mightn’t we cover more ground, if we all had different experiences of different things?

Maybe, but it would be hard work. If Christian had his way, we should no longer be saying things like this.

– Hi, how are you?

– Fine, thanks, you too?

– Yeah, not so bad. I see the rain’s stopped.

– Mm, hope it stays fine for the weekend.

– Oh, yeah, the weekend.

– Well, it’s Friday tomorrow – at last!

– That’s right. One more day to go.

Instead I suppose our conversations would be earnest, informative, intense, and personal.

– Tell me something important.

– Sometimes I’m secretly glad my father died young rather than living to see his hopes crushed.

– Mithridates, he died old.

– Ugh: Housman’s is the only poetry which is necessarily improved by parody.

– I’ve never respected your taste or your intellect, but I’ve still always felt protective towards you.

– There’s a useful sociological framework theory of amorance I can summarise if it would help?

– What are you really thinking?

Perhaps the second kind of exchange is more interesting than the first, but all day every day it would be tough to sustain and wearing to endure. It seems to me there’s something peculiarly Western about the idea that even our small talk should be made to yield a profit. I believe historically most civilisations have been inclined to believe that the world was gradually deteriorating from a previous Golden Age, and that keeping things the way they had been in the past was the most anyone could generally aspire to. Since the Renaissance, perhaps, we have become more accustomed to the idea of improvement and tend to look restlessly for progress: a culture constantly gearing up and apparently preparing itself for some colossal future undertaking the nature of which remains obscure. This driven quality clearly yields its benefits in prosperity for us, but when it gets down to the personal level it has its dangers: at worst it may promote slave-like levels of work, degrade friendship into networking and reinterpret leisure as mere recuperation. I’m not sure I want to see self-help books about leveraging those moments of idle chat. (In fairness, that’s not what Christian has in mind either.)

Christian may be right, in any case, that human interaction with machines will tend to emphasise the differences more than the similarities. I won’t reveal whether he ultimately succeeded in his quest to be Most Human Human (or perhaps was pipped at the post when a rival and his judge stumbled on a common and all-too-human sporting interest?), but I can tell you that this was not on any view humanity’s last stand:  the bots were routed.

Google consciousness

Bitbucket: I was interested to see this Wired piece recently; specifically the points about how Google picks up contextual clues. I’ve heard before about how Google’s translation facilities basically use the huge database of the web: instead of applying grammatical rules or anything like that, they just find equivalents in parallel texts, or alternatives that people use when searching, and this allows them to do a surprisingly good – not perfect – job of picking up those contextual issues that are the bane of most translation software. At least, that’s my understanding of how it works. Somehow it hadn’t quite occurred to me before, but a similar approach lends itself to the construction of a pretty good kind of chatbot – one that could finally pass the Turing Test unambiguously.

Blandula: Ah, the oft-promised passing of the Turing Test. Wake me up when it happens – we’ve been round this course so many times in the past.

Bitbucket: Strangely enough, this does remind me of one of the things we used to argue about a lot in the past. You’ve always wanted to argue that computers couldn’t match human performance in certain respects in principle. As a last resort, I tried to get you to admit that in principle we could get a computer to hold a conversation with human-level responses just by the brutest of brute force solutions. You just can a perfect response for every possible sentence. When you get that sentence as input, you send the canned response as output. The longest sentence ever spoken is not infinitely long, and the number of sentences of any finite length is finite; so in principle we can do it.

Blandula: I remember: what you could never grasp was that the meaning of a sentence depends on the context, so you can’t devise a perfect response for every sentence without knowing what conversation it was part of. What would the canned response be to ‘What do you mean?’, to take just one simple example?

Bitbucket: What you could never grasp was that in principle we can build in the context, too. Instead of just taking one sentence, we can have a canned response to sets of the last ten sentences if we like – or the last hundred sentences, or whatever it takes. Of course the resources required get absurd, but we’re talking about the principle, so we can assume whatever resources we want. The point I wanted to make is that by using the contents of the Internet and search enquiries, Google could implement a real-world brute-force solution of broadly this kind.
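
To make Bitbucket’s in-principle point concrete, here is a toy sketch of context-keyed canned responses, with an absurdly small table and a two-utterance context window; everything in it is invented for illustration only:

```python
# Canned responses keyed not on the last sentence alone but on the last N
# utterances, so that a line like "What do you mean?" can be answered
# differently in different conversational contexts. Only the principle matters;
# a real version would need an astronomically large table.

N = 2  # how much preceding conversation counts as context

CANNED = {
    ("The Loebner Prize is a farce.", "What do you mean?"):
        "The judges vary so much that the results mean very little.",
    ("My cat is called Fish.", "What do you mean?"):
        "Just that - Fish is the cat's name.",
}


def respond(history):
    """history is the list of utterances so far, most recent last."""
    key = tuple(history[-N:])
    return CANNED.get(key, "I'm sorry, could you rephrase that?")


if __name__ == "__main__":
    print(respond(["My cat is called Fish.", "What do you mean?"]))
    print(respond(["The Loebner Prize is a farce.", "What do you mean?"]))
```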

Blandula: I don’t think the Internet actually contains every set of a hundred sentences ever spoken during the history of the Universe.

Bitbucket: No, granted; but it’s pretty good, and it’s growing rapidly, and it’s skewed towards the kind of thing that people actually say. I grant you that in practice there will always be unusual contextual clues that the Google chatbot won’t pick up, or will mishandle. But don’t forget that human beings miss the point sometimes, too. It seems to me a realistic aspiration that the level of errors could fairly quickly be pushed down to human levels based on Internet content.

Blandula: It would of course tell us nothing whatever about consciousness or the human mind; it would just be a trick. And a damaging one. If Google could fake human conversation, many people would ascribe consciousness to it, however unjustifiably. You know that quite poor, unsophisticated chatbots have been treated by naive users as serious conversational partners ever since Eliza, the grandmother of them all. The internet connection makes it worse, because a surprising number of people seem to think that the Internet itself might one day accidentally attain consciousness. A mad idea: so all those people working on AI get nowhere, but some piece of kit which is carefully designed to do something quite different just accidentally hits on the solution? It’s as though Jethro Tull had been working on his machine and concluded it would never be a practical seed-drill; but then realised he had inadvertently built a viable flying machine. Not going to happen. Thing is, believing some machine is a person when it isn’t is not a trivial matter, because you then naturally start to think of people as being no more than machines. It starts to seem natural to close people down when they cease to be useful, and to work them like slaves while they’re operative. I’m well aware that a trend in this direction is already established, but a successful chatbot would make things much, much worse.

Bitbucket: Well, that’s a nice exposition of the paranoia which lies behind so many of your attitudes. Look, you can talk to automated answering services as it is: nobody gets het up about it, or starts to lose their concept of humanity.

Of course you’re right that a Google chatbot in itself is not conscious. But isn’t it a good step forward? You know that in the brain there are several areas that deal with speech; Broca’s area seems to put coherent sentences together while Wernicke’s area provides the right words and sense. People whose Wernicke’s area has been destroyed, but who still have a sound Broca’s area, apparently talk fluently and sort of convincingly, but without ever really making sense in terms of the world around them. I would claim that a working Google chatbot is in essence a Broca’s area for a future conscious AI. That’s all I’ll claim, just for the moment.

New Turing tests

The New Scientist, under the title ‘Tests that show machines closing in on human abilities’, has a short review of some different ways in which robotic or computer performance is being tested against human achievement. The piece is not really reporting fresh progress in the way its title suggests – the success of Weizenbaum’s early chat-bot Eliza is not exactly breaking news, for example – but I think the overall point is a good one. In the last half of the last century, it was often assumed that progress towards AI would see all the barriers come down at once: as soon as we had a robot that could meet Turing’s target of a few minutes’ successful small-talk, fully-functioning androids would rapidly follow.

In practice, Turing’s test has not been passed as he expected, although some stalwart souls continue to work on it. But we have seen the overall problem of human mentality unpicked into a series of substantial, but lesser challenges. Human levels of competence remain the target, but competence in different narrow fields, with no expectation that solving the problem in one area solves it in all of them.

The piece ends with a quote from Stevan Harnad which suggests he clings to the old way of looking at this:

“If a machine can prove indistinguishable from a human, we should award it the respect we would to a human”

That may be true, in fact, but the question is, indistinguishable in which respects? People often quote a particular saying in this respect: if it walks like a duck and quacks like a duck, it’s a duck. Actually, even in the case of ducks this isn’t as straightforward as it might seem, since other wildfowl may be ducklike in some respects. Given a particular bird, how many of us could say with any certainty whether it were a Velvet Scoter, a White-winged Coot – or just a large sea-duck? But it’s worse than that. The Duck Law, as we may call it, works fairly well in real life; but that’s because as a matter of fact there aren’t all that many anatine entities in the world other than outright ducks. If there were a cunning artificer who was bent on turning out false, mechanical ducks like the legendary one made by Vaucanson, which did not merely walk and quack like a duck, but ate and pooped like one, we should need a more searching principle. When it comes to the Turing Test, that is pretty much the position we find ourselves in.

There is, of course, a more rigorous version of Duck Law which is intellectually irreproachable, namely Leibniz’s Law. Loosely, this says that if two objects share the same properties, they are the same. The problem is that, in order to work, Leibniz’s Law has to be applied in the most rigorous fashion. It requires that all properties must be the same. To be indistinguishable from a human being in this sense means literally indistinguishable, i.e. having human guts, a human mother and so on.
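
For readers who want the principle spelled out, the direction of Leibniz’s Law paraphrased above (the identity of indiscernibles) is standardly written along these lines; this is a textbook rendering rather than a quotation from Leibniz:

```latex
% Identity of indiscernibles: if x and y share every property F,
% they are one and the same object.
\[
  \forall F \,\bigl( F(x) \leftrightarrow F(y) \bigr) \;\rightarrow\; x = y
\]
```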

So, in which respects must a robot resemble a human being in order to be awarded the same respect as a human? It now seems unlikely that a machine will pass the original Turing Test soon; but even if it did, would that really be enough?  Just looking like a human, even in flexible animation which reproduces the pores of the skin and typical human movements, is clearly not enough. Nor is being able to improvise jazz tunes seamlessly with a human player. But these things are all significant achievements nevertheless. Perhaps this is the way we make progress.

Or possibly, at some stage in the future, someone will notice that if he were to bolt together a hundred different software modules and some appropriate bits of hardware, all by then readily available, he could theoretically produce a machine able to do everything a human can do; but he won’t by then see any point in actually doing it.

Is the Turing Era over?

Blandula: Can machines think? That was the question with which Alan Turing opened his famous paper of 1950, ‘Computing machinery and intelligence’. The question was not exactly new, but the answer he gave opened up a new era in our thinking about minds. It had been more or less agreed up to that time that consciousness required a special and particularly difficult kind of explanation. If it didn’t require spiritual intervention, or outright magic, it still needed some special power which no mere machine could possibly reproduce. Turing boldly predicted that by the end of the century we should have machines which everyone habitually treated as conscious entities, and his paper inspired a new optimism about our ability to solve the problems. But that was 1950. I’m afraid that fifty years of work since then have effectively shown that the answer is no – machines can’t think.

Bitbucket: A little premature, I think. You have to remember that until 1950 there was very little discussion of consciousness. Textbooks on psychology never mentioned the subject. Any scientist who tried to discuss it seriously risked being taken for a loony by his colleagues. It was effectively taboo. Turing changed all that, partly by making the notion of a computer a clear and useful mathematical concept, but also through the ingenious suggestion of the Turing Test. It transformed the debate and during the second half of the century it made consciousness the hot topic of the day, the one all the most ambitious scientists wanted to crack: a subject eminent academics would take up after their knighthood or Nobel. The programme got under way, and although we have yet to achieve anything like a full human consciousness, it’s already clear that there is no insurmountable barrier after all. I’d argue, in fact, that some simple forms of artificial consciousness have already been achieved.

Blandula: But Turing’s deadline, the year 2000, is past. We know now that his prediction, and others made since, were just wrong. Granted, some progress has been made: no-one now would claim that computers can’t play chess. But they haven’t done that well, even against Turing’s own test, which in some ways is quite undemanding. It’s not that computers failed it; they never got good enough even to put up a serious candidate. You say that consciousness used to be a taboo subject, but perhaps it was just that earlier generations of scientists knew how to shut up when they had nothing worth saying…

Bitbucket: Of course, people got a bit over-optimistic during the last half of the last century. People always quote the story about Marvin Minsky giving one of his graduate students the job of sorting out vision over the course of the summer (I have a feeling that if that ever happened it was a joke in the first place). Of course it’s embarrassing that some of the wilder predictions have not come true. But you’re misrepresenting Turing. The way I read him, he wasn’t saying it would all be over by 2000, he was saying, look, let’s put the philosophy aside until we’ve got a computer that can at least hold some kind of conversation.

But really I’m wasting my breath – you’ve just got a closed mind on the subject. Let’s face it, even if I presented you with a perfectly human robot (even if I suddenly revealed that I myself had been a robot all along), you still wouldn’t accept that it proved anything, would you?

Blandula: Your version of Turing sounds relatively sensible, but I just don’t think his paper bears that interpretation. As for your ‘perfectly human’ robot, I look forward to seeing it, but no, you’re right, I probably wouldn’t think it proved anything much. Imitating a person, however brilliantly, and being a person are two different things. I’d need to know what was going on inside the robot, and have a convincing theory of why it added up to real consciousness.

Bitbucket: No theory is going to be convincing if you won’t give it fair consideration. I think you must sometimes have serious doubts about the so-called problem of other minds. Do you actually feel sure that all your fellow human beings are really fully conscious entities?

Blandula: Well…