Botprize is a version of the Turing Test for in-game AIs: they don’t have to talk, just run around playing Unreal Tournament (a first-person shooter game) in a way that convinces other players that they are human. In the current version players use a gun to tag their opponents as bots or humans; the bots, of course, do the same.
The contest initially ran from 2008 to 2012; in the final year, two of the bots exceeded the 50% benchmark of humanness. The absence of a 2013 contest might have suggested that this had wrapped things up for good, but the 2014 contest is now under way, and it’s not too late to enter if you can get your bot sorted by 12 May. This time there will be two methods of judging: one called ‘first person’ (rather confusingly – that sounds as if participants will ask themselves: am I a bot?) is the usual in-game judging; the other (‘third person’) will be a ‘crowd-sourced’ judgement based on people viewing selected videos after the event.
How does such a contest compare with the original Turing Test, a version of which is run every year as the Loebner Prize? The removal of any need to talk seems to make the test easier. Judges cannot use questions to test the bots’ memory (at least not in any detail), general knowledge, or ability to carry the thread of a conversation and follow unpredictable linkages of the kind human beings are so good at. They cannot set traps for the bots by making quirky demands (‘please reverse the order of the letters in each word when you respond’) or looking for a sense of humour.
In practice a significant part of the challenge is simply making a bot that plays the game at an approximately human level. This means the bot must never get irretrievably stuck in a corner or attempt to walk through walls; but it must also not be too good – not a perfect shot that never misses and is inhumanly quick on the draw, for example. This kind of thing is really no different from the challenges faced by every game designer, and indeed the original bots supplied with the game don’t perform all that badly as human imitators, though they’re not generally as convincing as the contestants.
The way to win is apparently to build in typical or even exaggerated human traits. One example is that when a human player is shot at, they tend to go after the player that attacked them, even when a cool appraisal of the circumstances suggests that they’d do better to let it go. It’s interesting to reflect that if humans reliably seek revenge in this way, that tendency probably had survival value in the real world when the human brain was evolving; there must be important respects in which the game theory of the real world diverges from that of the game.
Because Botprize is in some respects less demanding than the original Turing Test, a win carries less conviction; the 2012 wins did not really make us believe that the relevant bots had human thinking ability, still less that they were conscious. In that respect a proper conversation carries more weight. The best chat-bots in the Loebner, however, are not at all convincing either, partly for a different reason – we know that no attempt has been made to endow them with real understanding or real thought; they are just machines designed to pass the test by faking thoughtful responses.
Ironically, some of the less successful Botprize entrants have been more ambitious. In particular, Neurobot, created by Zafeirios Fountas as an MSc project, used a spiking neural network with a Global Workspace architecture; while not remotely on the scale of a human brain, this is in outline a plausible design for human-style cognition; indeed, one of the best we’ve got (which may not be saying all that much, of course). The Global Workspace idea, originated by Bernard Baars, situates consciousness as a general-purpose space where inputs from different modules can be brought together and handled effectively. Although I have my reservations about that concept, it could at least reasonably be claimed that Neurobot’s functional states were somewhere on a spectrum which ultimately includes proper consciousness (interestingly, they would presumably be cognitive states of a kind which have never existed in nature, far simpler than those of most animals yet in some respects more like states of a human brain).
The 2012 winners, by contrast, like the most successful Loebner chat-bots, relied on replaying recorded sequences of real human behaviour. Alas, this seems in practice to be the Achilles heel of Turing-style tests; canned responses just work too well.