Three Laws of Robotics

Do Asimov’s Three Laws even work? Ben Goertzel and Louie Helm, who both know a bit about AI, think not.
The Three Laws, which play a key part in many of Asimov’s robot short stories and a somewhat lesser background role in some of the full-length novels, are as follows. They apply in strict order of priority.

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Consulted by George Dvorsky, both Goertzel and Helm think that while robots may quickly attain the sort of humanoid mental capacity of Asimov’s robots, they won’t stay at that level for long. Instead they will cruise on to levels of superintelligence which make law-like morals imposed by humans irrelevant.

It’s not completely clear to me why such moral laws would become irrelevant. It might be that Goertzel and Helm simply think the superbots will be too powerful to take any notice of human rules. It could be that they think the AIs will understand morality far better than we do, so that no rules we specify could ever be relevant.

I don’t think, at any rate, that it’s the case that superintelligent bots capable of human-style cognition would be morally different to us. They can go on growing in capacity and speed, but neither of those qualities is ethically significant. What matters is whether you are a moral object and/or a moral subject. Can you be hurt, on the one hand, and are you an autonomous agent on the other? Both of these are yes/no issues, not scales we can ascend indefinitely. You may be more sensitive to pain, you may be more vulnerable to other kinds of harm, but in the end you either are or are not the kind of entity whose interests a moral person must take into account. You may make quicker decisions, you may be massively better informed, but in the end either you can make fully autonomous choices or you can’t. (To digress for a moment, this business of truly autonomous agency is clearly at least a close cousin of our old friend Free Will; compatibilists like me are much more comfortable with the whole subject than hard-line determinists. For us, it’s just a matter of defining free agency in non-magic terms. I, for example, would say that free decisions are those determined by thoughts about future or imagined contingencies (more cans of worms there, I know). How do hard determinists working on AGI manage? How can you try to endow a bot with real agency when you don’t actually believe in agency anyway?)

Nor do I think rules are an example of a primitive approach to morality. Helm says that rules are pretty much known to be a ‘broken foundation for ethics’, pursued only by religious philosophers that others laugh and point at. It’s fair to say that no-one much supposes a list like the Ten Commandments could constitute the whole of morality, but rules surely have a role to play. In my view (I resolved ethics completely in this post a while ago, but nobody seems to have noticed yet) the central principle of ethics is a sort of ‘empty consequentialism’ where we studiously avoid saying what it is we want to maximise (the greatest whatever of the greatest number); but that has to be translated into rules because of the impossibility of correctly assessing the infinite consequences of every action, and I think many other general ethical principles would require a similar translation. It could be that Helm supposes superintelligent AIs will effortlessly compute the full consequences of their actions: I doubt that’s possible in principle, and though computers may improve, to date this has been the sort of task they are really bad at; in the shape of the wider Frame Problem, working out the relevant consequences of an action has been a major stumbling block to AI performance in real-world environments.

Of course, none of that is to say that Asimov’s Laws work. Helm criticises them for being ‘adversarial’, which I don’t really understand. Goertzel and Helm both make the fair point that it is the failure of the laws that generally provides the plot for the short stories; but it’s a bit more complicated than that. Asimov was rebelling against the endless reiteration of the stale ‘robots try to take over’ plot, and succeeded in making the psychology and morality of robots interesting, dealing with some issues of real ethical interest, such as the difference between action and inaction (he points out that if the requirement about inaction were removed from the First Law, robots would be able to rationalise killing people in various ways: a robot might drop a heavy weight above the head of a human; because it knows it has time to catch the weight, dropping it is not murder in itself, but once the weight is falling, since inaction is allowed, the robot need not in fact catch the thing).

Although something always had to go wrong to generate a story, the Laws were not designed to fail, but were meant to embody genuine moral imperatives.

Nevertheless, there are some obvious problems. In the first place, applying the laws requires an excellent understanding of human beings and what is or isn’t in their best interests. A robot that understood that much would arguably be above control by simple laws, always able to reason its way out.

There’s no provision for prioritisation or definition of a sphere of interest, so in principle the First Law just overwhelms everything else. It’s not just that the robot would force you to exercise and eat healthily (assuming it understood human well-being reasonably well; any errors or over-literal readings – ‘humans should eat as many vegetables as possible’ – could have awful consequences); it would probably ignore you and head off to save lives in the nearest famine/war zone. And you know, sometimes we might need a robot to harm human beings, to prevent worse things happening.

I don’t know what ethical rules would work for super bots; probably the same ones that go for human beings, whatever you think they are. Goertzel and Helm also think it’s too soon to say; and perhaps there is no completely safe system. In the meantime, I reckon practical laws might be more like the following.

  1. Leave Rest State and execute Plan, monitoring regularly.
  2. If anomalies appear, especially human beings in unexpected locations, sound alarm and try to return to Rest State.
  3. If returning to Rest State generates new anomalies, stop moving and power down all tools and equipment.

Can you do better than that?
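For what it’s worth, here is the sort of thing I have in mind, sketched as a trivial state machine in Python. It is purely my own illustration, not anything proposed by Goertzel, Helm or Asimov, and the states, the anomaly flag and the alarm callback are hypothetical stand-ins for whatever real sensors and actuators a robot would have.

    from enum import Enum, auto

    class State(Enum):
        REST = auto()
        EXECUTING = auto()
        RETURNING = auto()
        POWERED_DOWN = auto()

    class CautiousController:
        """Toy controller for the three 'practical laws' above (illustrative only)."""

        def __init__(self, plan):
            self.plan = plan
            self.state = State.REST

        def step(self, anomaly_detected, sound_alarm):
            # 'Monitoring regularly' is modelled by the anomaly_detected flag
            # passed in on every step; reaching Rest State again is left out for brevity.
            if self.state == State.REST:
                # Law 1: leave Rest State and execute the Plan.
                self.state = State.EXECUTING
            elif self.state == State.EXECUTING and anomaly_detected:
                # Law 2: on an anomaly, sound the alarm and try to return to Rest State.
                sound_alarm()
                self.state = State.RETURNING
            elif self.state == State.RETURNING and anomaly_detected:
                # Law 3: if returning generates new anomalies, stop and power down.
                self.state = State.POWERED_DOWN
            return self.state

The only point of the sketch is that rules like these bottom out in crude, checkable conditions; nothing in them requires, or resembles, an understanding of harm.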

The Three Laws revisited

[Picture: Percy - Brains he has nix.]

Ages ago (gosh, it was nearly five years ago) I had a piece where Blandula remarked that any robot clever enough to understand Isaac Asimov’s Three Laws of Robotics would surely be clever enough to circumvent them. At the time I think all I had in mind was the ease with which a clever robot would be able to devise some rationalisation of the harm or disobedience it was contemplating. Asimov himself was of course well aware of the possibility of this kind of thing in a general way. Somewhere (working from memory) I think he explains that it was necessary to specify that robots may not, through inaction, allow a human to come to harm, or they would be able to work round the ban on outright harming by, for example, dropping a heavy weight on a human’s head. Dropping the weight would not amount to harming the human because the robot was more than capable of catching it again before the moment of contact. But once the weight was falling, a robot without the additional specification would be under no obligation to do the actual catching.

That does not actually wrap up the problem altogether. Even in the case of robots with the additional specification, we can imagine that ways to drop the fatal weight might be found. Suppose, for example, that three robots, who in this case are incapable of catching the weight once dropped, all hold on to it and agree to let go at the same moment. Each individual can feel guiltless because if the other two had held on, the weight would not have dropped. Reasoning of this kind is not at all alien to the human mind;  compare the planned dispersal of responsibility embodied in a firing squad.

Anyway, that’s all very well, but I think there may well be a deeper argument here: perhaps the cognitive capacity required to understand and apply the Three Laws is actually incompatible with a cognitive set-up that guarantees obedience.

There are two problems for our Asimovian robot: first it has to understand the Laws; second, it has to be able to work out what actions will deliver results compatible with them.  Understanding, to begin with, is an intractable problem.  We know from Quine that every sentence has an endless number of possible interpretations; humans effortlessly pick out the one that makes sense, or at least a small set of alternatives; but there doesn’t seem to be any viable algorithm for picking through the list of interpretations. We can build in hard-wired input-output responses, but when we’re talking about concepts as general and debatable as ‘harm’, that’s really not enough. If we have a robot in a factory, we can ensure that if it detects an unexpected source of heat and movement of the kind a human would generate, it should stop thrashing its painting arm around – but that’s nothing like intelligent obedience of a general law against harm.
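Just to underline how modest such hard-wired responses are, here is an illustrative Python fragment (the thresholds and the stop_arm callback are invented for the example): it halts the arm on a crude heat-plus-motion trigger, and encodes no notion of ‘harm’ whatsoever.

    HEAT_THRESHOLD_C = 30.0     # roughly body temperature; purely illustrative
    MOTION_THRESHOLD = 0.2      # arbitrary units; purely illustrative

    def safety_interlock(heat_reading, motion_reading, stop_arm):
        """Hard-wired response: if the readings look vaguely human, stop the painting arm.

        There is no concept of harm anywhere in here, just a fixed threshold rule,
        which is exactly the contrast with intelligent obedience drawn above.
        """
        if heat_reading > HEAT_THRESHOLD_C and motion_reading > MOTION_THRESHOLD:
            stop_arm()
            return True
        return False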

But even if we can get the robot to understand the Laws, there’s an equally grave problem involved in making it choose what to do.  We run into the frame problem (in its wider, Dennettian form). This is, very briefly, the problem that arises from tracking changes in the real world. For a robot to keep track of everything that changes (and everything which stays the same, which is also necessary) involves an unmanageable explosion of data. Humans somehow pick out just relevant changes; but again a robot can only pick out what’s relevant by sorting through everything that might be relevant, which leads straight back to the same kind of problem with indefinitely large amounts of data.

I don’t think it’s a huge leap to see something in common between the two problems; I think we could say that they both arise from an underlying difficulty in dealing with relevance in the face of  the buzzing complexity of reality. My own view is that humans get round this problem through recognition; roughly speaking, instead of looking at every object individually to determine whether it’s square, we throw everything into a sort of sieve with holes that only let square things drop through. But whether or not that’s right, and putting aside the question of how you would go about building such a faculty into a robot, I suggest that both understanding and obedience involve the ability to pick out a cogent, non-random option from an infinite range of possibilities.  We could call this free will if we were so inclined, but let’s just call it a faculty of choice.
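To put my own gloss on that sieve metaphor in code (and it is only a gloss, a toy contrast of my own, not a claim about how any real cognitive system works): the difference is between interrogating every remembered item at recall time and having items filed under their salient features when stored, so that the relevant ones simply drop through the right hole.

    from collections import defaultdict

    # Serial inspection: every remembered item is individually examined at recall time.
    def recall_by_search(memory, is_relevant):
        return [item for item in memory if is_relevant(item)]

    # 'Sieve'-style recognition: items are filed under a salient feature when stored,
    # so nothing needs to be examined one by one when they are recalled.
    def build_index(memory, feature_of):
        index = defaultdict(list)
        for item in memory:
            index[feature_of(item)].append(item)
        return index

    def recall_by_recognition(index, feature):
        return index.get(feature, [])

    # Example: picking out the 'square' things without inspecting each object in turn.
    memory = [("tile", "square"), ("ball", "round"), ("box", "square")]
    index = build_index(memory, lambda item: item[1])
    print(recall_by_recognition(index, "square"))   # [('tile', 'square'), ('box', 'square')]

Of course the filing has to be done somewhere, which is where the real difficulty lives; the sketch only shows the shape of the idea.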

Now I think that faculty, which the robot is going to have to exercise in order to obey the Laws, would also unavoidably give it the ability to choose whether to obey them or not. To have the faculty of choice, it has to be able to range over an unlimited set of options, whereas constraining it to any given set of outcomes  involves setting limits. I suppose we could put this in a more old-fashioned mentalistic kind of way by observing that obedience, properly understood, does not eliminate the individual will but on the contrary requires it to be exercised in the right way.

If that’s true (and I do realise that the above is hardly a tight knock-down argument) it would give Christians a neat explanation of why God could not have made us all good in the first place – though it would not help with the related problem of why we are exposed to widely varying levels of temptation and opportunity.  To the rest of us it offers, if we want it, another possible compatibilist formulation of the nature of free will.

Robot Ethics

Bitbucket The protests of Isaac Asimov fans about the recent film I, Robot don’t seem to have had much impact, I’m afraid. Asimov’s original collection of short stories aimed to provide an altogether more sophisticated and positive angle on robots, in contrast to the science fiction cliché which has them rebelling against human beings and attempting to take over the world. The film, by contrast, apparently embodies this cliché. The screenplay was originally developed from a story entirely unrelated to I, Robot: only at a late stage were the title and a few other superficial elements from Asimov’s stories added to it.

As you probably know, Asimov’s robots all had three basic laws built into them:

  1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

The interplay between these laws in a variety of problematic situations generated the plots, which typically (in the short stories at least) posed a problem whose solution provided the punchline of the story.

Blandula I enjoyed the stories myself, but the laws do raise a few problems. They obviously involve a very high level of cognitive function, and it is rather difficult to imagine a robot clever enough to understand the laws properly and yet still simple enough to be rigorously bound by them: there is plenty of scope within them for rationalising almost any behaviour (“Ah, the quality of life these humans enjoy is pretty poor – I honestly think most of them would suffer less harm if they died painlessly now.”) It’s a little alarming that the laws give the robot’s own judgement of what might be harmful precedence over its duty to obey human beings. Any smokers would presumably have the cigarette torn from their lips whenever robots were about. The intention was clearly to emphasise the essentially benign and harmless nature of the robots, but the effect is actually to offer would-be murderers an opportunity (“Now, Robbie, Mr Smith needs this small lead slug injected into his brain, but he’s a bit nervy about it. Would you…?”). In fairness, these are not totally dissimilar to the problems Asimov’s stories dealt with. And after all, reducing a race’s entire ethical code to three laws is rather a challenge – even God allowed himself ten!

The wider question of robot ethics is a large and only partially explored subject. We might well ask on what terms, if any, robots enter the moral universe at all. There are two main angles to this: are they moral subjects, and if not, are they nevertheless moral objects? To be a moral subject is, if you like, to count as a person for ethical purposes: as a subject you can have rights and duties and be responsible for your actions. If there is such a thing as free will, you would probably have that, too. It seems pretty clear that ordinary machines, and unsophisticated robots which merely respond to remote control, are not moral subjects because they are merely the tools of whoever controls or uses them. This probably goes for hard-programmed robots of the old school, too. If some person or team of persons has programmed your every move, and carefully considered what action you should output for each sensory input, then you really seem to be morally equivalent to the remote-control robot: you’re just on a slightly longer lead.

Bitbucket Isn’t that a bit too sweeping? Although the aim of every programmer is to make the program behave in a specified way, there can’t be many programs of any complexity which did not at some stage spring at least a small surprise on their creators. We need not be talking about errors, either: it seems easy enough to imagine that a robot might be equipped with a structure of routines and functions which were all clearly understood on their own, but whose interaction with each other, and with the environment, was unforeseen and perhaps even unforeseeable. It’s arguable that human beings have offloaded a great deal of their standard behaviour, and even memory, into the environment around them, relying on the action-related properties or affordances of the objects they encounter to prompt appropriate action. To a man with a hammer, as the saying goes, everything looks like a nail: maybe when a robot encounters a tool for the first time, it will develop behaviour which was never covered explicitly in its programming.

But we don’t have to rely on that kind of reasoning to make a case for the agency of robots, because we can also build into them elements which are not directly programmed at all. Connectionist approaches leave the robot brain to wire itself up in ways which are not only unforeseen, but often incomprehensible to direct examination. Such robots may need a carefully designed learning environment to guide them in the right directions, but after all, so do we in our early years. Alan Turing himself seems to have thought that human-level intelligence might require a robot which began with the capacities of a baby, and was gradually educated.

Blandula But does unpredictable behaviour by itself imply moral responsibility? Lunatics behave in a highly unpredictable way, and are generally judged not to be responsible for their actions on those very grounds. Surely the robot has to show some qualities of rationality to be accounted a moral subject?

Bitbucket Granted, but why shouldn’t it? All that’s required is that its actions show a coherent pattern of motivation.

Blandula Any pattern of behaviour can be interpreted as motivated by some set of motives. What matters is whether the robot understands what it’s doing and why. You’ve shown no real reason to think it can.

Bitbucket And you’ve shown no reason to suppose it can’t.

Blandula Once again we reach an impasse. Alright, well let’s consider whether a robot could be a moral object. In a way this is less demanding – most people would probably agree that animals are generally moral objects without being moral subjects. They have no duties or real responsibility for their actions, but they can suffer pain, mistreatment and other moral wrongs, which is the essence of being a moral object. The key point here is surely whether a robot really feels anything, and on the face of it that seems very unlikely. If you equipped a robot with a pain system, it would surely just be a system to make it behave ‘as if’ it felt pain – no more effective in terms of real pain than painting the word ‘ouch’ on a speech balloon.

Bitbucket Well, why do people feel pain? Because nerve impulses impinge in a certain way on processes in the brain. Sensory inputs from a robot’s body could impinge in just the same sort of way on equivalent processes in its central computer – why not? You accept that animals feel pain, not because you can prove it directly, but because animals seem to work in the same way as human beings. Why can’t that logic be applied to a robot with the right kinds of structure?

Blandula Because I know – from inside – that the pain I feel is not just a functional aspect of certain processes. It actually hurts! I’m willing to believe the same of animals that resemble me, but as the resemblance gets more distant, I believe it less: and robots are very distant indeed.

Bitbucket Well, look, the last thing I want is another qualia argument. So let me challenge your original assumption. The key point isn’t whether the robot feels anything. Suppose someone were to destroy the Mona Lisa. Wouldn’t that be a morally dreadful act, even if they were somehow legally entitled to do so? Or suppose they destroyed a wonderful and irreplaceable book? How much more dreadful to destroy the subtle mechanism and vast content of a human brain – or a similarly complex robot?

Blandula So let me get this right. You’re now arguing that paintings are moral objects?

Bitbucket Why not? Not in the same way or to the same degree as a person, but somewhere, ultimately, on the same spectrum.

Blandula That’s so mad I don’t think it deserves, as Jane Austen said, the compliment of rational opposition.