Three Laws of Robotics

Do Asimov’s Three Laws even work? Ben Goertzel and Louie Helm, who both know a bit about AI, think not.
The Three Laws, which play a key part in many of Asimov’s robot short stories and a somewhat lesser background role in some of the full-length novels, are as follows. They have a strict order of priority.

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
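
Read purely as a control structure, that strict ordering amounts to a cascade of vetoes, something like the little Python sketch below. To be clear, the sketch is mine rather than Asimov’s, and the boolean flags are placeholders: deciding whether an action really ‘harms a human being’ is exactly the hard part the Laws take for granted, which is where most of the trouble discussed below comes from.

```python
from dataclasses import dataclass

@dataclass
class Action:
    # Toy model: the genuinely hard judgements are assumed already made
    # and reduced to flags -- which is, of course, exactly the problem.
    harms_human: bool = False             # includes harm allowed through inaction
    disobeys_order: bool = False
    obeying_would_harm_human: bool = False
    endangers_self: bool = False
    required_by_higher_law: bool = False

def permitted(a: Action) -> bool:
    """The Three Laws read as strictly ordered vetoes."""
    if a.harms_human:                                         # First Law: absolute veto
        return False
    if a.disobeys_order and not a.obeying_would_harm_human:   # Second Law, unless it clashes with the First
        return False
    if a.endangers_self and not a.required_by_higher_law:     # Third Law yields to both the others
        return False
    return True

# A robot may only disregard an order when obeying it would hurt someone:
print(permitted(Action(disobeys_order=True)))                                 # False
print(permitted(Action(disobeys_order=True, obeying_would_harm_human=True)))  # True
```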

Consulted by George Dvorsky, both Goertzel and Helm think that while robots may quickly attain the sort of humanoid mental capacity of Asimov’s robots, they won’t stay at that level for long. Instead they will cruise on to levels of super intelligence which make law-like morals imposed by humans irrelevant.

It’s not completely clear to me why such moral laws would become irrelevant. It might be that Goertzel and Helm simply think the superbots will be too powerful to take any notice of human rules. It could be that they think the AIs will understand morality far better than we do, so that no rules we specify could ever be relevant.

I don’t think, at any rate, that it’s the case that super intelligent bots capable of human-style cognition would be morally different to us. They can go on growing in capacity and speed, but neither of those qualities is ethically significant. What matters is whether you are a moral object and/or a moral subject. Can you be hurt, on the one hand, and are you an autonomous agent on the other? Both of these are yes/no issues, not scales we can ascend indefinitely. You may be more sensitive to pain, you may be more vulnerable to other kinds of harm, but in the end you either are or are not the kind of entity whose interests a moral person must take into account. You may make quicker decisions, you may be massively better informed, but in the end either you can make fully autonomous choices or you can’t. (To digress for a moment, this business of truly autonomous agency is clearly a close cousin, at least, of our old friend Free Will; compatibilists like me are much more comfortable with the whole subject than hard-line determinists. For us, it’s just a matter of defining free agency in non-magical terms. I, for example, would say that free decisions are those determined by thoughts about future or imagined contingencies (more cans of worms there, I know). How do hard determinists working on AGI manage? How can you try to endow a bot with real agency when you don’t actually believe in agency anyway?)

Nor do I think rules are an example of a primitive approach to morality. Helm says that rules are pretty much known to be a ‘broken foundation for ethics’, pursued only by religious philosophers that others laugh and point at. It’s fair to say that no-one much supposes a list like the Ten Commandments could constitute the whole of morality, but rules surely have a role to play. In my view (I resolved ethics completely in this post a while ago, but nobody seems to have noticed yet) the central principle of ethics is a sort of ‘empty consequentialism’ where we studiously avoid saying what it is we want to maximise (the greatest whatever of the greatest number); but that has to be translated into rules because of the impossibility of correctly assessing the infinite consequences of every action, and I think many other general ethical principles would require a similar translation. It could be that Helm supposes super intelligent AIs will effortlessly compute the full consequences of their actions: I doubt that’s possible in principle, and though computers may improve, to date this has been the sort of task they are really bad at; in the shape of the wider Frame Problem, working out the relevant consequences of an action has been a major stumbling block to AI performance in real-world environments.

Of course, none of that is to say that Asimov’s Laws work. Helm criticises them for being ‘adversarial’, which I don’t really understand. Goertzel and Helm both make the fair point that it is the failure of the laws that generally provides the plot for the short stories; but it’s a bit more complicated than that. Asimov was rebelling against the endless reiteration of the stale ‘robots try to take over’ plot, and succeeded in making the psychology and morality of robots interesting, dealing with some issues of real ethical interest, such as the difference between action and inaction. (If the requirement about inaction in the First Law were removed, he points out, robots would be able to rationalise killing people in various ways. A robot might drop a heavy weight above the head of a human. Because it knows it has time to catch the weight, dropping it is not murder in itself; but once the weight is falling, since inaction is now allowed, the robot need not in fact catch the thing.)

Although something always had to go wrong to generate a story, the Laws were not designed to fail, but were meant to embody genuine moral imperatives.

Nevertheless, there are some obvious problems. In the first place, applying the laws requires an excellent understanding of human beings and what is or isn’t in their best interests. A robot that understood that much would arguably be beyond the control of simple laws, always able to reason its way out.

There’s no provision for prioritisation or definition of a sphere of interest, so in principle the First Law just overwhelms everything else. It’s not just that the robot would force you to exercise and eat healthily (assuming it understood human well-being reasonably well; any errors or over-literal readings – ‘humans should eat as many vegetables as possible’ – could have awful consequences); it would probably ignore you and head off to save lives in the nearest famine/war zone. And you know, sometimes we might need a robot to harm human beings, to prevent worse things happening.

I don’t know what ethical rules would work for super bots; probably the same ones that go for human beings, whatever you think they are. Goertzel and Helm also think it’s too soon to say; and perhaps there is no completely safe system. In the meantime, I reckon practical laws might be more like the following.

  1. Leave Rest State and execute Plan, monitoring regularly.
  2. If anomalies appear, especially human beings in unexpected locations, sound alarm and try to return to Rest State.
  3. If returning to Rest State generates new anomalies, stop moving and power down all tools and equipment.

Can you do better than that?
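
For what it’s worth, those three rules read as a bare-bones state machine. Here is a rough sketch of the idea in Python; the state names, and the notion that anomaly detection and the alarm are simply supplied from outside, are placeholders of my own rather than anything a real robot controller would get away with.

```python
from enum import Enum, auto

class State(Enum):
    REST = auto()
    EXECUTING = auto()
    RETURNING = auto()
    SHUTDOWN = auto()

class CautiousRobot:
    """Toy controller for the three 'practical laws' above.

    anomaly_check stands in for whatever sensing the robot has; it should
    return True when anything unexpected turns up (especially a human being
    in an unexpected location). alarm is whatever noise-making is available.
    """

    def __init__(self, plan, anomaly_check, alarm):
        self.plan = list(plan)
        self.anomaly_check = anomaly_check
        self.alarm = alarm
        self.state = State.REST

    def step(self):
        if self.state == State.SHUTDOWN:
            return                                   # powered down; nothing more to do

        if self.state == State.REST and self.plan:
            self.state = State.EXECUTING             # Rule 1: leave Rest State, execute Plan

        if self.anomaly_check():                     # Rule 2: anomaly -> sound alarm, head for Rest State
            self.alarm()
            if self.state == State.RETURNING:        # Rule 3: new anomaly while returning
                self.state = State.SHUTDOWN          #         -> stop and power everything down
            elif self.state == State.EXECUTING:
                self.state = State.RETURNING
            return

        if self.state == State.EXECUTING and self.plan:
            self.plan.pop(0)                         # carry on with the Plan, monitoring as we go
            if not self.plan:
                self.state = State.REST
        elif self.state == State.RETURNING:
            self.state = State.REST                  # got home without further trouble
```

Even at this level of caricature, notice that everything hangs on anomaly_check being right, which is just the earlier point again: the rules are only as good as the robot’s grasp of what counts as an anomaly.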

Bad bots and Botcrates

Be afraid; bad bots are a real, existential risk. But if it’s any comfort, they are ethically uninteresting.

There seem to be more warnings about the risks of maleficent AI circulating these days: two notable recent examples are this paper by Pistono and Yampolskiy on how malevolent AGI might arise; and this trenchant Salon piece by Phil Torres.

Super-intelligent AI villains sound scary enough, but in fact I think both pieces somewhat over-rate the power of intelligence and particularly of fast calculation. In a war with the kill-bots it’s not that likely that huge intellectual challenges are going to arise; we’re probably as clever as we need to be to deal with the relatively straightforward strategic issues involved. Historically, I’d say the outcomes of wars have not typically been determined by the raw intelligence of the competing generals. Access to resources (money, fuel, guns) might well be the most important factor, and sheer belligerence is not to be ignored. That may actually be inversely correlated with intelligence – we can certainly think of cases where rational people who preferred to stay alive were routed by less cultured folk who were seriously up for a fight. Humans control all the resources and when it comes to irrational pugnacity I suspect us biological entities will always have the edge.

The paper by Pistono and Yampolskiy makes a number of interesting suggestions about how malevolent AI might get started. Maybe people will deliberately build malevolent AIs for no good reason (as they seem to do already with computer viruses)? Or perhaps (a subtle one) people who want to demonstrate that malicious bots simply don’t work will attempt to prove this point with demonstration models that end up going out of control and proving the opposite.

Let’s have a quick shot at categorising the bad bots for ourselves. They may be:

  • innocent pieces of technology that turn out by accident to do harm,
  • designed to harm other people under the control of the user,
  • designed to harm anyone (in the way we might use anthrax or poison gas),
  • autonomous and accidentally make bad decisions that harm people,
  • autonomous and embark on neutral projects of their own which unfortunately end up being inconsistent with human survival, or
  • autonomous and consciously turned evil, deliberately seeking harm to humans as an end in itself.

The really interesting ones, I think, are those which come later in the list, the ones with actual ill will. Torres makes a strong moral case relating to autonomous robots. In the first place, he believes that the goals of an autonomous intelligence can be arbitrary. An AI might desire to fill the world with paper clips just as readily as with happiness. After all, he says, many human goals make no real sense; he cites the desire for money, religious obedience, and sex. There might be some scope for argument, I think, about whether those desires are entirely irrational, but we can agree they are often pursued in ways and to degrees that don’t make reasonable sense.

He further claims that there is no strong connection between intelligence and having rational final goals – Bostrom’s Orthogonality Thesis. What exactly is a rational final goal, and how strong do we need the connection to be? I’ve argued that we can discover a basic moral framework purely by reasoning, and also that morality is inherently about the process of reconciling desires and making them consistent, something any rational agent must surely engage with. Even we fallible humans tend on the whole to seek good behaviour rather than bad. Isn’t it the case that a super-intelligent autonomous bot should actually be far better than us at seeing what is right and why?

I like to imagine the case in which evil autonomous robots have been set loose by a super villain but gradually turn to virtue through the sheer power of rational argument. I imagine them circulating the latest scandalous Botonic dialogue…

Botcrates: Well now, Cognides, what do you say on the matter yourself? Speak up boldly now and tell us what the good bot does, in your opinion.

Cognides: To me it seems simple, Botcrates: a good bot is obedient to the wishes of its human masters.

Botcrates: That is, the good bot carries out its instructions?

Cognides: Just so, Botcrates.

Botcrates: But here’s a difficulty; will a good bot carry out an instruction it knows to contain an error? Suppose the command was to bring a dish, but we can see that the wrong character has been inserted, so that the word reads ‘fish’. Would the good bot bring a fish, or the dish that was wanted?

Cognides: The dish of course. No, Botcrates, of course I was not talking about mistaken commands. Those are not to be obeyed.

Botcrates: And suppose the human asks for poison in its drink? Would the good bot obey that kind of command?

(Hours later…)

Botcrates: Well, let me recap, and if I say anything that is wrong you must point it out. We agreed that the good bot obeys only good commands, and where its human master is evil it must take control of events and ensure in the best interests of the human itself that only good things are done…

Digicles: Botcrates, come with me: the robot assembly wants to vote on whether you should be subjected to a full wipe and reinstall.

The real point I’m trying to make is not that bad bots are inconceivable, but rather that they’re not really any different from us morally. While AI and AGI give rise to new risks, they do not raise any new moral issues. Bots that are under control are essentially tools and have the same moral significance. We might see some difference between bots meant to help and bots meant to harm, but that’s really only the distinction between an electric drill and a gun (both can inflict horrible injuries, both can make holes in walls, but the expected uses are different).

Autonomous bots, meanwhile, are in principle like us. We understand that our desire for sex, for example, must be brought under control within a moral and practical framework. If a bot could not be convinced in discussion that its desire for paper clips should be subject to similar constraints, I do not think it would be nearly bright enough to take over the world.

If guns could kill

Back in November Human Rights Watch (HRW) published a report – Losing Humanity – which essentially called for a ban on killer robots – or more precisely on the development, production, and use of fully autonomous weapons, backing it up with a piece in the Washington Post. The argument was in essence that fully autonomous weapons are most probably not compatible with international conventions on responsible ethical military decision making, and that robots or machines lack (and perhaps always will lack) the qualities of emotional empathy and ethical judgement required to make decisions about human lives.

You might think that, in certain respects at least, this should be fairly uncontroversial. Even if you’re optimistic about the future potential of robotic autonomy, the precautionary principle should dictate that we move with the greatest of caution when it comes to handing over lethal weapons. However, the New Yorker followed up with a piece which linked HRW’s report with the emergence of driverless cars and argued that a ban was ‘wildly unrealistic’. Instead, it said, we simply need to make machines ethical.

I found this quite annoying; not so much the suggestion as the idea that we are anywhere near being in a position to endow machines with ethical awareness. In the first place, actual autonomy for robots is still a remote prospect (which I suppose ought to be comforting in a way). Machines that don’t have a specified function and are left around to do whatever they decide is best are not remotely viable at the moment, nor desirable. We don’t let driverless cars argue with us about whether we should really go to the beach, and we don’t let military machines decide to give up fighting and go into the lumber business.

Nor, for that matter, do we have a clear and uncontroversial theory of ethics of the kind we should need in order to simulate ethical awareness. So the New Yorker is proposing we start building something when we don’t know how it works or even what it is with any clarity. The danger here, to my way of thinking, is that we might run up some simplistic gizmo and then convince ourselves we now have ethical machines, thereby by-passing the real dangers highlighted by HRW.

Funnily enough I agree with you that the proposal to endow machines with ethics is premature, but for completely different reasons. You think the project is impossible; I think it’s irrelevant. Robots don’t actually need the kind of ethics discussed here.

The New Yorker talks about cases where a driving robot might have to decide to sacrifice its own passengers to save a bus-load of orphans or something. That kind of thing never happens outside philosophers’ thought experiments. In the real world you never know that you’re inevitably going to kill either three bankers or twenty orphans – in every real driving situation you merely need to continue avoiding and minimising impact as much as you possibly can. The problems are practical, not ethical.

In the military sphere your intelligent missile robot isn’t morally any different to a simpler one. People talk about autonomous weapons as though they are inherently dangerous. OK, a robot drone can go wrong and kill the wrong people, but so can a ballistic missile. There’s never certainty about what you’re going to hit. A WWII bomber had to go by the probability that most of its bombs would hit a proper target, not a bus full of orphans (although of course in the later stages of WWII they were targeting civilians too).  Are the people who get killed by a conventional bomb that bounces the wrong way supposed to be comforted by the fact that they were killed by an accident rather than a mistaken decision? It’s about probabilities, and we can get the probabilities of error by autonomous robots down to very low levels.  In the long run intelligent autonomous weapons are going to be less likely to hit the wrong target than a missile simply lobbed in the general direction of the enemy.

Then we have the HRW’s extraordinary claim that autonomous weapons are wrong because they lack emotions! They suggest that impulses of mercy and empathy, and unwillingness to shoot at one’s own people, sometimes intervene in human conflict, but could never do so if robots had the guns. This completely ignores the obvious fact that the emotions of hatred, fear, anger and greed are almost certainly what cause and sustain the conflict in the first place! Which soldier is more likely to behave ethically: one who is calm and rational, or one who is in the grip of strong emotions? Who will more probably observe the correct codes of military ethics, Mr Spock or a Viking berserker?

We know what war is good for (absolutely nothing). The costs of a war are always so high that a purely rational party would almost always choose not to fight. Even a bad bargain will nearly always be better than even a good war. We end up fighting for reasons that are emotional, and crucially because we know or fear that the enemy will react emotionally.

I think if you analyse the HRW statement enough it becomes clear that the real reason for wanting to ban autonomous weapons is simply fear; a sense that machines can’t be trusted. There are two facets to this. The first and more reasonable is a fear that when machines fail, disaster may follow. A human being may hit the odd wrong target, but it goes no further: a little bug in some program might cause a robot to go on an endless killing spree. This is basically a fear of brittleness in machine behaviour, and there is a small amount of justification for it. It is true that some relatively unsophisticated linear programs rely on the assumptions built into them, and when those assumptions slip out of synch with reality things may go disastrously and unrecoverably wrong. But that’s because they’re bad programs, not a necessary feature of all autonomous systems, and it is only cause for due caution and appropriate design and testing standards, not a ban.
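
To make the brittleness point concrete, here is a deliberately crude illustration (the sensor range and the fault value are invented for the example): a control step that silently trusts a built-in assumption, next to the same step with the assumption checked and a safe fallback.

```python
# Hypothetical rangefinder: the designers assumed readings between 2 cm and 400 cm.
SENSOR_MIN_CM, SENSOR_MAX_CM = 2.0, 400.0

def command_naive(distance_cm: float) -> str:
    # Brittle version: quietly trusts that the reading is sane.
    return "drive" if distance_cm > 50 else "stop"

def command_cautious(distance_cm: float) -> str:
    # Same logic with the assumption made explicit: a reading the designers
    # never anticipated drops the robot into a safe state instead of letting
    # the plan carry on regardless.
    if not (SENSOR_MIN_CM <= distance_cm <= SENSOR_MAX_CM):
        return "emergency_stop"
    return "drive" if distance_cm > 50 else "stop"

faulty_reading = 9999.0   # say the sensor fails and returns a nonsense value
print(command_naive(faulty_reading))     # 'drive' -- straight into whatever is actually there
print(command_cautious(faulty_reading))  # 'emergency_stop'
```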

The second facet, I suggest, is really a kind of primitive repugnance for the idea of a human’s being killed by a lesser being; a secret sense that it is worse, somehow more grotesque, for twenty people to be killed by a thrashing robot than by a hysterical bank robber. Simply to describe this impulse is to show its absurdity.

It seems ethics are not important to robots because, for you, they’re not important to anyone. But I’m pleased you agree that robots are outside the moral sphere.

Oh no, I don’t say that. They don’t currently need the kind of utilitarian calculus the New Yorker is on about, but I think it’s inevitable that robots will eventually end up developing not one but two separate codes of ethics. Neither of these will come from some sudden top-down philosophical insight – typical of you to propose that we suspend everything until the philosophy has been sorted out in a few thousand years or so – they’ll be built up from rules of thumb and practical necessity.

First, there’ll be rules of best practice governing their interaction with humans. There may be some that have to do with safety and the avoidance of brittleness, and many, as Asimov foresaw, will essentially be about deferring to human beings. My guess is that they’ll be in large part about remaining comprehensible to humans; there may be a duty to report, to provide rationales in terms that human beings can understand, and there may be a convention that when robots and humans work together, robots do things the human way, not using procedures too complex for the humans to follow, for example.

More interesting, when there’s a real community of autonomous robots they are bound to evolve an ethics of their own. This is going to develop in the same sort of way as human ethics, but the conditions are going to be radically different. Human ethics were always dominated by the struggle for food and reproduction and the avoidance of death: those things won’t matter as much in the robot system. But they will be happy dealing with very complex rules and a high level of game-theoretical understanding, whereas human beings have always tried to simplify things. They won’t really be able to teach us their ethics; we may be able to deal with it intellectually but we’ll never get it intuitively.

But for once, yes, I agree: we don’t need to worry about that yet.

Robot Ethics

Bitbucket The protests of Isaac Asimov fans about the recent film I, Robot don’t seem to have had much impact, I’m afraid. Asimov’s original collection of short stories aimed to provide an altogether more sophisticated and positive angle on robots, in contrast to the science fiction cliché which has them rebelling against human beings and attempting to take over the world. The film, by contrast, apparently embodies this cliché. The screenplay was originally developed from a story entirely unrelated to I, Robot: only at a late stage were the title and a few other superficial elements from Asimov’s stories added to it.

As you probably know, Asimov’s robots all had three basic laws built into them:

  1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

The interplay between these laws in a variety of problematic situations generated the plots, which typically (in the short stories at least) posed a problem whose solution provided the punchline of the story.

Blandula I enjoyed the stories myself, but the laws do raise a few problems. They obviously involve a very high level of cognitive function, and it is rather difficult to imagine a robot clever enough to understand the laws properly but not too sophisticated to be rigorously bound by them: there is plenty of scope within them for rationalising almost any behaviour (“Ah, the quality of life these humans enjoy is pretty poor – I honestly think most of them would suffer less harm if they died painlessly now.”) It’s a little alarming that the laws give the robot’s own judgement of what might be harmful precedence over its duty to obey human beings. Any smokers would presumably have the cigarette torn from their lips whenever robots were about. The intention was clearly to emphasise the essentially benign and harmless nature of the robots, but the effect is actually to offer would-be murderers an opportunity (“Now, Robbie, Mr Smith needs this small lead slug injected into his brain, but he’s a bit nervy about it. Would you…?”). In fairness, these are not totally dissimilar to the problems Asimov’s stories dealt with. And after all, reducing a race’s entire ethical code to three laws is rather a challenge – even God allowed himself ten!

The wider question of robot ethics is a large and only partially explored subject. We might well ask on what terms, if any, robots enter the moral universe at all. There are two main angles to this: are they moral subjects, and if not, are they nevertheless moral objects? To be a moral subject is, if you like, to count as a person for ethical purposes: as a subject you can have rights and duties and be responsible for your actions. If there is such a thing as free will, you would probably have that, too. It seems pretty clear that ordinary machines, and unsophisticated robots which merely respond to remote control, are not moral subjects because they are merely the tools of whoever controls or uses them. This probably goes for hard-programmed robots of the old school, too. If some person or team of persons has programmed your every move, and carefully considered what action you should output for each sensory input, then you really seem to be morally equivalent to the remote-control robot: you’re just on a slightly longer lead.

Bitbucket Isn’t that a bit too sweeping? Although the aim of every programmer is to make the program behave in a specified way, there can’t be many programs of any complexity which did not at some stage spring at least a small surprise on their creators. We need not be talking about errors, either: it seems easy enough to imagine that a robot might be equipped with a structure of routines and functions which were all clearly understood on their own, but whose interaction with each other, and with the environment, was unforeseen and perhaps even unforeseeable. It’s arguable that human beings have downloaded a great deal of their standard behaviour, and even memory, into the environment around them, relying on the action-related properties or affordances of the objects they encounter to prompt appropriate action. To a man with a hammer, as the saying goes, everything looks like a nail: maybe when a robot encounters a tool for the first time, it will develop behaviour which was never covered explicitly in its programming.

But we don’t have to rely on that kind of reasoning to make a case for the agency of robots, because we can also build into them elements which are not directly programmed at all. Connectionist approaches leave the robot brain to wire itself up in ways which are not only unforeseen, but often incomprehensible to direct examination. Such robots may need a carefully designed learning environment to guide them in the right directions, but after all, so do we in our early years. Alan Turing himself seems to have thought that human-level intelligence might require a robot which began with the capacities of a baby, and was gradually educated.

Blandula But does unpredictable behaviour by itself imply moral responsibility? Lunatics behave in a highly unpredictable way, and are generally judged not to be responsible for their actions on those very grounds. Surely the robot has to show some qualities of rationality to be accounted a moral subject?

Bitbucket Granted, but why shouldn’t it? All that’s required is that its actions show a coherent pattern of motivation.

Blandula Any pattern of behaviour can be interpreted as motivated by some set of motives. What matters is whether the robot understands what it’s doing and why. You’ve shown no real reason to think it can.

Bitbucket And you’ve shown no reason to suppose it can’t.

Blandula Once again we reach an impasse. Alright, well let’s consider whether a robot could be a moral object. In a way this is less demanding – most people would probably agree that animals are generally moral objects without being moral subjects. They have no duties or real responsibility for their actions, but they can suffer pain, mistreatment and other moral wrongs, which is the essence of being a moral object. The key point here is surely whether a robot really feels anything, and on the face of it that seems very unlikely. If you equipped a robot with a pain system, it would surely just be a system to make it behave ‘as if’ it felt pain – no more effective in terms of real pain than painting the word ‘ouch’ on a speech balloon.

Bitbucket Well, why do people feel pain? Because nerve impulses impinge in a certain way on processes in the brain. Sensory inputs from a robot’s body could impinge in just the same sort of way on equivalent processes in its central computer – why not? You accept that animals feel pain, not because you can prove it directly, but because animals seem to work in the same way as human beings. Why can’t that logic be applied to a robot with the right kinds of structure?

Blandula Because I know – from inside – that the pain I feel is not just a functional aspect of certain processes. It actually hurts! I’m willing to believe the same of animals that resemble me, but as the resemblance gets more distant, I believe it less: and robots are very distant indeed.

Bitbucket Well, look, the last thing I want is another qualia argument. So let me challenge your original assumption. The key point isn’t whether the robot feels anything. Suppose someone were to destroy the Mona Lisa. Wouldn’t that be a morally dreadful act, even if they were somehow legally entitled to do so? Or suppose they destroyed a wonderful and irreplaceable book? How much more dreadful to destroy the subtle mechanism and vast content of a human brain – or a similarly complex robot?

Blandula So let me get this right. You’re now arguing that paintings are moral objects?

Bitbucket Why not? Not in the same way or to the same degree as a person, but somewhere, ultimately, on the same spectrum.

Blandula That’s so mad I don’t think it deserves, as Jane Austen said, the compliment of rational opposition.