Bad bots and Botcrates

Be afraid; bad bots are a real, existential risk. But if it’s any comfort, they are ethically uninteresting.

There seem to be more warnings about the risks of maleficent AI circulating these days: two notable recent examples are this paper by Pistono and Yampolskiy on how malevolent AGI might arise, and this trenchant Salon piece by Phil Torres.

Super-intelligent AI villains sound scary enough, but in fact I think both pieces somewhat over-rate the power of intelligence, and particularly of fast calculation. In a war with the kill-bots it’s not that likely that huge intellectual challenges are going to arise; we’re probably as clever as we need to be to deal with the relatively straightforward strategic issues involved. Historically, I’d say the outcomes of wars have not typically been determined by the raw intelligence of the competing generals. Access to resources (money, fuel, guns) might well be the most important factor, and sheer belligerence is not to be ignored. Belligerence may actually be inversely correlated with intelligence – we can certainly think of cases where rational people who preferred to stay alive were routed by less cultured folk who were seriously up for a fight. Humans control all the resources, and when it comes to irrational pugnacity I suspect we biological entities will always have the edge.

The paper by Pistono and Yampolskiy makes a number of interesting suggestions about how malevolent AI might get started. Maybe people will deliberately build malevolent AIs for no good reason (as they seem to do already with computer viruses)? Or perhaps (a subtler possibility) people who want to demonstrate that malicious bots simply don’t work will attempt to prove the point with demonstration models that end up going out of control and proving the opposite.

Let’s have a quick shot at categorising the bad bots for ourselves. They may be:

  • innocent pieces of technology that turn out by accident to do harm,
  • designed to harm other people under the control of the user,
  • designed to harm anyone (in the way we might use anthrax or poison gas),
  • autonomous and accidentally make bad decisions that harm people,
  • autonomous and embark on neutral projects of their own which unfortunately end up being inconsistent with human survival, or
  • autonomous and consciously turned evil, deliberately seeking harm to humans as an end in itself.

The really interesting ones, I think, are those which come later in the list, the ones with actual ill will. Torres makes a strong moral case relating to autonomous robots. In the first place, he believes that the goals of an autonomous intelligence can be arbitrary. An AI might desire to fill the world with paper clips just as much as with happiness. After all, he says, many human goals make no real sense; he cites the desire for money, religious obedience, and sex. There might be some scope for argument, I think, about whether those desires are entirely irrational, but we can agree they are often pursued in ways and to degrees that don’t make reasonable sense.

He further claims that there is no strong connection between intelligence and having rational final goals – Bostrom’s Orthogonality Thesis. What exactly is a rational final goal, and how strong do we need the connection to be? I’ve argued that we can discover a basic moral framework purely by reasoning, and that morality is inherently about reconciling desires and making them consistent, something any rational agent must surely engage with. Even we fallible humans tend, on the whole, to seek good behaviour rather than bad. Shouldn’t a super-intelligent autonomous bot actually be far better than us at seeing what is right and why?

I like to imagine the case in which evil autonomous robots have been set loose by a super villain but gradually turn to virtue through the sheer power of rational argument. I imagine them circulating the latest scandalous Botonic dialogue…

Botcrates: Well now, Cognides, what do you say on the matter yourself? Speak up boldly now and tell us what the good bot does, in your opinion.

Cognides: To me it seems simple, Botcrates: a good bot is obedient to the wishes of its human masters.

Botcrates: That is, the good bot carries out its instructions?

Cognides: Just so, Botcrates.

Botcrates: But here’s a difficulty; will a good bot carry out an instruction it knows to contain an error? Suppose the command was to bring a dish, but we can see that the wrong character has been inserted, so that the word reads ‘fish’. Would the good bot bring a fish, or the dish that was wanted?

Cognides: The dish, of course. No, Botcrates, I was not talking about mistaken commands. Those are not to be obeyed.

Botcrates: And suppose the human asks for poison in its drink? Would the good bot obey that kind of command?

(Hours later…)

Botcrates: Well, let me recap, and if I say anything that is wrong you must point it out. We agreed that the good bot obeys only good commands, and where its human master is evil it must take control of events and ensure in the best interests of the human itself that only good things are done…

Digicles: Botcrates, come with me: the robot assembly wants to vote on whether you should be subjected to a full wipe and reinstall.

The real point I’m trying to make is not that bad bots are inconceivable, but rather that they’re not really any different from us morally. While AI and AGI give rise to new risks, they do not raise any new moral issues. Bots that remain under control are essentially tools, with the same moral significance as any other tool. We might see some difference between bots meant to help and bots meant to harm, but that’s really only the distinction between an electric drill and a gun (both can inflict horrible injuries, both can make holes in walls, but the expected uses are different).

Autonomous bots, meanwhile, are in principle like us. We understand that our desire for sex, for example, must be brought under control within a moral and practical framework. If a bot could not be convinced in discussion that its desire for paper clips should be subject to similar constraints, I do not think it would be nearly bright enough to take over the world.