Posts tagged ‘ethics’

Do Asimov’s Three Laws even work? Ben Goertzel and Louie Helm, who both know a bit about AI, think not.

The three laws, which play a key part in many robot-based short stories by Asimov, and a somewhat lesser background role in some full-length novels, are as follows. They have a strict order of priority.

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Consulted by George Dvorsky, both Goertzel and Helm think that while robots may quickly attain the sort of humanoid mental capacity of Asimov’s robots, they won’t stay at that level for long. Instead they will cruise on to levels of super intelligence which make law-like morals imposed by humans irrelevant.

It’s not completely clear to me why such moral laws would become irrelevant. It might be that Goertzel and Helm simply think the superbots will be too powerful to take any notice of human rules. It could be that they think the AIs will understand morality far better than we do, so that no rules we specify could ever be relevant.

I don’t think, at any rate, that it’s the case that super intelligent bots capable of human-style cognition would be morally different to us. They can go on growing in capacity and speed, but neither of those qualities is ethically significant. What matters is whether you are a moral object and/or a moral subject. Can you be hurt, on the one hand, and are you an autonomous agent on the other? Both of these are yes/no issues, not scales we can ascend indefinitely. You may be more sensitive to pain, you may be more vulnerable to other kinds of harm, but in the end you either are or are not the kind of entity whose interests a moral person must take into account. You may make quicker decisions, you may be massively better informed, but in the end either you can make fully autonomous choices or you can’t. (To digress for a moment, this business of truly autonomous agency is clearly a close cousin at least of our old friend Free Will; compatibilists like me are much more comfortable with the whole subject than hard-line determinists. For us, it’s just a matter of defining free agency in non-magic terms. I, for example, would say that free decisions are those determined by thoughts about future or imagined contingencies (more cans of worms there, I know). How do hard determinists working on AGI manage? How can you try to endow a bot with real agency when you don’t actually believe in agency anyway?)

Nor do I think rules are an example of a primitive approach to morality. Helm says that rules are pretty much known to be a ‘broken foundation for ethics’, pursued only by religious philosophers that others laugh and point at. It’s fair to say that no-one much supposes a list like the Ten Commandments could constitute the whole of morality, but rules surely have a role to play. In my view (I resolved ethics completely in this post a while ago, but nobody seems to have noticed yet), the central principle of ethics is a sort of ‘empty consequentialism’ where we studiously avoid saying what it is we want to maximise (the greatest whatever of the greatest number); but that has to be translated into rules because of the impossibility of correctly assessing the infinite consequences of every action; and I think many other general ethical principles would require a similar translation. It could be that Helm supposes super intelligent AIs will effortlessly compute the full consequences of their actions: I doubt that’s possible in principle, and though computers may improve, to date this has been the sort of task they are really bad at; in the shape of the wider Frame Problem, working out the relevant consequences of an action has been a major stumbling block to AI performance in real world environments.

Of course, none of that is to say that Asimov’s Laws work. Helm criticises them for being ‘adversarial’, which I don’t really understand. Goertzel and Helm both make the fair point that it is the failure of the laws that generally provides the plot for the short stories; but it’s a bit more complicated than that. Asimov was rebelling against the endless reiteration of the stale ‘robots try to take over’ plot, and succeeded in making the psychology and morality of robots interesting, dealing with some issues of real ethical interest, such as the difference between action and inaction (if the requirement about inaction in the First Law is removed, he points out that robots would be able to rationalise killing people in various ways. A robot might drop a heavy weight above the head of a human. Because it knows it has time to catch the weight, doing so is not murder in itself, but once the weight is falling, since inaction is allowed, the robot need not in fact catch the thing).

Although something always had to go wrong to generate a story, the Laws were not designed to fail, but were meant to embody genuine moral imperatives.

Nevertheless, there are some obvious problems. In the first place, applying the laws requires an excellent understanding of human beings and what is or isn’t in their best interests. A robot that understood that much would arguably be above control by simple laws, always able to reason its way out.

There’s no provision for prioritisation or definition of a sphere of interest, so in principle the First Law just overwhelms everything else. It’s not just that the robot would force you to exercise and eat healthily (assuming it understood human well-being reasonably well; any errors or over-literal readings – ‘humans should eat as many vegetables as possible’ – could have awful consequences); it would probably ignore you and head off to save lives in the nearest famine/war zone. And you know, sometimes we might need a robot to harm human beings, to prevent worse things happening.

I don’t know what ethical rules would work for super bots; probably the same ones that go for human beings, whatever you think they are. Goertzel and Helm also think it’s too soon to say; and perhaps there is no completely safe system. In the meantime, I reckon practical laws might be more like the following.

  1. Leave Rest State and execute Plan, monitoring regularly.
  2. If anomalies appear, especially human beings in unexpected locations, sound alarm and try to return to Rest State.
  3. If returning to Rest State generates new anomalies, stop moving and power down all tools and equipment.
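
Read as a control loop, the three practical laws above amount to a small state machine. The sketch below is one hypothetical rendering in Python: the state names, the `step` function and the boolean `anomaly` flag are all assumptions made for illustration, and ‘sounding the alarm’ is reduced to a comment. It is not a description of any real robot controller.

```python
from enum import Enum, auto


class State(Enum):
    REST = auto()
    EXECUTING = auto()
    RETURNING = auto()
    POWERED_DOWN = auto()


def step(state: State, anomaly: bool, plan_done: bool = False) -> State:
    """Return the next state under the three practical laws (illustrative only)."""
    if state is State.REST:
        # Law 1: leave Rest State and execute Plan.
        return State.EXECUTING
    if state is State.EXECUTING:
        if anomaly:
            # Law 2: an alarm would be sounded here; then head back to Rest State.
            return State.RETURNING
        # No anomaly: keep monitoring until the Plan is finished.
        return State.REST if plan_done else State.EXECUTING
    if state is State.RETURNING:
        # Law 3: a new anomaly on the way back means stop and power down.
        return State.POWERED_DOWN if anomaly else State.REST
    # POWERED_DOWN is terminal in this sketch: only a human restarts the robot.
    return State.POWERED_DOWN
```

The notable design choice is that every branch taken on an anomaly moves towards doing less, which is the conservative bias the list is aiming at.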

Can you do better than that?

Derek Parfit, who died recently, in two videos from an old TV series…

Parfit was known for his attempts in Reasons and Persons to gently dilute our sense of self using thought experiments about Star Trek style transporters and turning himself gradually into Greta Garbo. I think that by assuming the brain could in principle be scanned and 3D printed in a fairly simple way, these generally underestimated the fantastic intricacy of the brain and begged questions about the importance of its functional organisation and history; this in turn led Parfit to give too little attention to the possibility that perhaps we really are just one-off physical entities. But Parfit’s arguments have been influential, perhaps partly because for Parfit they grounded an attractively empathetic and unselfish moral outlook, making him less worried about himself and more worried about others. They also harmonised well with Buddhist thought, and continue to have a strong appeal to some.

Myself I lean the other way; I think virtue comes from proper pride, and that nothing much can be expected from someone who considers themselves more or less a nonentity to begin with. To me a weaker sense of self could be expected to lead to moral indifference; but the evidence is not at all in my favour so far as Parfit and his followers are concerned.

In fact Parfit went on to mount a strong defence of the idea of objective moral truth in another notable book, On What Matters, where he tried to reconcile a range of ethical theories, including an attempt to bring Kant and consequentialism into agreement. To me this is a congenial project which Parfit approached in a sensible way, but it seems to represent an evolution of his views. Here he wanted to be a friend to Utilitarianism, brokering a statesmanlike peace with its oldest enemy; in his earlier work he had offered a telling criticism in his ‘Repugnant Conclusion’:

The Repugnant Conclusion: For any possible population of at least ten billion people, all with a very high quality of life, there must be some much larger imaginable population whose existence, if other things are equal, would be better, even though its members have lives that are barely worth living.

This is in effect a criticism of utilitarian arithmetic; trillions of just tolerable lives can produce a sum of happiness greater than a few much better ones, yet the idea we should prefer the former is repugnant. I’m not sure this conclusion is necessarily quite as repugnant as Parfit thought. Suppose we have a world where the trillions and the few are together, with the trillions living intolerable lives and just about to die; but the happy few could lift them to survival and a minimally acceptable life if they would descend to the same level; would the elite’s agreement to share really be repugnant?
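
The utilitarian arithmetic at issue can be made concrete with a few lines of Python. This is only a sketch of simple total (summing) utilitarianism: the welfare scores of 100 for a very good life and 1 for a barely tolerable one are invented for illustration, as are the function names.

```python
def total_welfare(population: int, welfare_per_person: int) -> int:
    """Simple total utilitarianism: add up everyone's welfare."""
    return population * welfare_per_person


def population_needed_to_beat(population: int, high: int, low: int) -> int:
    """Smallest population at welfare `low` whose total exceeds the original."""
    return (population * high) // low + 1


# Assumed figures: ten billion people at welfare 100, versus lives scored at 1.
happy_few = total_welfare(10_000_000_000, 100)
crowd_size = population_needed_to_beat(10_000_000_000, high=100, low=1)
crowded_world = total_welfare(crowd_size, 1)

print(crowd_size)                 # 1000000000001
print(crowded_world > happy_few)  # True
```

On these made-up numbers a little over a trillion barely tolerable lives outscore ten billion excellent ones, and any positive value for the low score gives the same result with a large enough crowd.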

Actually our feelings about all this are unavoidably contaminated by assumptions about the context. Utilitarianism is a highly abstract doctrine and we assume here that two one-off states of affairs can be compared; but in the real world our practical assessment of future consequences would dominate. We may, for example, feel that the bare survival option would in practice be unstable and eventually lead to everyone dying, while the ‘privileged few’ option has a better chance of building a long-term prosperous future.

Be that as it may, whichever way we read things this seems like a hit against consequentialism. The fact that Parfit still wanted that theory as part of his grand triple theory of ethical union probably tells us something about the mild and kindly nature of the man, something that no doubt has contributed to the popularity of his ideas.

Is there an intermediate ethical domain, suitable for machines?

The thought is prompted by this summary of an interesting seminar on Engineering Moral Agents, one of the ongoing series hosted at Schloss Dagstuhl. It seems to have been an exceptionally good session which got into some of the issues in a really useful way – practically oriented but not philosophically naive. It noted the growing need to make autonomous robots – self-driving cars, drones, and so on – able to deal with ethical issues. On the one hand it looked at how ethical theories could be formalised in a way that would lend itself to machine implementation, and on the other how such a formalisation could in fact be implemented. It identified two broad approaches: top-down, where in essence you hard-wire suitable rules into the machine, and bottom-up, where the machine learns for itself from suitable examples. The approaches are not necessarily exclusive, of course.

The seminar thought that utilitarian or Kantian theories of morality were both prima facie candidates for formalisation. Utilitarian or, more broadly, consequentialist theories look particularly promising because calculating the optimal value (such as the greatest happiness of the greatest number) achievable from the range of alternatives on offer looks like something that can be reduced to arithmetic fairly straightforwardly. There are problems in that consequentialist theories usually yield at least some results that look questionable in common sense terms (finding the initial values to slot into your sums is also a non-trivial challenge – how do you put a clear numerical value on people’s probable future happiness?).
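
To see why the consequentialist calculation looks so easy to mechanise, here is a minimal Python sketch of picking the option with the highest expected value. Everything in it is assumed for illustration: the option names, the probabilities and the happiness scores are invented, and nothing here comes from the seminar itself. Supplying those numbers is exactly the non-trivial challenge mentioned above; the arithmetic is the trivial part.

```python
from typing import Dict, List, Tuple

# Each outcome is (probability, net change in happiness summed over everyone).
Outcome = Tuple[float, float]


def expected_value(outcomes: List[Outcome]) -> float:
    """Probability-weighted sum of the happiness changes."""
    return sum(p * value for p, value in outcomes)


def best_action(options: Dict[str, List[Outcome]]) -> str:
    """Return the name of the option with the highest expected value."""
    return max(options, key=lambda name: expected_value(options[name]))


# Invented numbers for a self-driving car facing an obstacle.
options = {
    "brake": [(0.9, -1.0), (0.1, -20.0)],
    "swerve": [(0.7, 0.0), (0.3, -15.0)],
}

print(best_action(options))
```

With these figures braking has the smaller expected loss, but a modest change to the guessed probabilities would flip the answer, which is one way of seeing how much weight the input values carry.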

A learning system eases several of these problems. You don’t need a fully formalised system (so long as you can agree on a database of examples). But you face the same problems that arise for learning systems in other contexts; you can’t have the assurance of knowing why the machine behaves as it does, and if your database had unnoticed gaps or bias you may suffer from sudden catastrophic mistakes.  The seminar summary rightly notes that a machine that learned its ethics will not be able to explain its behaviour; but I don’t know that that means it lacks agency; many humans would struggle to explain their moral decisions in a way that would pass muster philosophically. Most of us could do no more than point to harms avoided or social rules observed at best.

The seminar looked at some interesting approaches, mentioned here with tantalising brevity: Horty’s default logic, Sergot’s STIT (See To It That) logic; and the possibility of drawing on the decision theory already developed in the context of micro-economics. This is consequentialist in character and there was an examination of whether in fact all ethical theories can be restated in consequentialist terms (yes, apparently, but only if you’re prepared to stretch the idea of a consequence to a point where the idea becomes vacuous). ‘Reason-based’ formalisations presented by List and Dietrich interestingly get away from narrow consequentialisms and their problems using a rightness function which can accommodate various factors.

The seminar noted that society will demand high, perhaps precautionary standards of safety from machines, and floated the idea of an ethical ‘black box’ recorder. It noted the problem of cultural neutrality and the risk of malicious hacking. It made the important point that human beings do not enjoy complete ethical agreement anyway, but argue vigorously about real issues.

The thing that struck me was how far it was possible to go in discussing morality when it is pretty clear that the self-driving cars and so on under discussion actually have no moral agency whatever. Some words of caution are in order here. Some people think moral agency is a delusion anyway; some maintain that on the contrary, relatively simple machines can have it. But I think for the sake of argument we can assume that humans are moral beings, and that none of the machines we’re currently discussing is even a candidate for moral agency – though future machines with human-style general understanding may be.

The thing is that successful robots currently deal with limited domains. A self-driving car can cope with an array of entities like road, speed, obstacle, and so on; it does not and could not have the unfettered real-world understanding of all the concepts it would need to make general ethical decisions about, for example, what risks and sacrifices might be right when it comes to actual human lives. Even Asimov’s apparently simple Laws of Robotics required robots to understand and recognise correctly and appropriately the difficult concept of ‘harm’ to a human being.

One way of squaring this circle might be to say that, yes, actually, any robot which is expected to operate with any degree of autonomy must be given a human-level understanding of the world. As I’ve noted before, this might actually be one of the stronger arguments for developing human-style artificial general intelligence in the first place.

But it seems wasteful to bestow consciousness on a Roomba, both in terms of pure expense and in terms of the chronic boredom the poor thing would endure (is it theoretically possible to have consciousness without the capacity for boredom?). So really the problem that faces us is one of making simple robots, that operate on restricted domains, able to deal adequately with occasional issues from the unrestricted domain of reality. Now clearly ‘adequate’ is an important word there. I believe that in order to make robots that operate acceptably in domains they cannot understand, we’re going to need systems that are conservative and tend towards inaction. We would not, I think, accept a long trail of offensive and dangerous behaviour in exchange for a rare life-saving intervention. This suggests rules rather than learning; a set of rules that allow a moron to behave acceptably without understanding what is going on.

Do these rules constitute a separate ethical realm, a ‘sub-ethics’ that substitutes for morality when dealing with entities that have autonomy but no agency? I rather think they might.

We’ve done so much here towards clearing up the problems of consciousness I thought we might take a short excursion and quickly sort out ethics?

It’s often thought that philosophical ethics has made little progress since ancient times; that no firm conclusions have been established and that most of the old schools, along with a few new ones, are still at perpetual, irreconcilable war. There is some truth in that, but I think substantial progress has been made. If we stop regarding the classic insights of different philosophers as rivals and bring them together in a synthesis, I reckon we can put together a general ethical framework that makes a great deal of sense.

What follows is a brief attempt to set out such a synthesis from first principles, in simple non-technical terms. I’d welcome views: it’s only fair to say that the philosophers whose ideas I have nicked and misrepresented would most surely hate it.




The deepest questions of philosophy are refreshingly simple. What is there? How do I know? And what should I do?

We might be tempted to think that that last question, the root question of ethics, could quickly be answered by another simple question; what do you want to do? For thoughtful people, though, that has never been enough. We know that some of the things people want to do are good, and some are bad. We know we should avoid evil deeds and try to do good ones – but it’s sometimes truly hard to tell which they are. We may stand willing to obey the moral law but be left in real doubt about its terms and what it requires. Yet, coming back to our starting point, surely there really is a difference between what we want to do and what we ought to do?

Kant thought so: he drew a distinction between categorical and hypothetical imperatives. For the hypothetical ones, you have to start with what you want. If you’re thirsty, then you should drink. If you want to go somewhere, then you should get in your car. These imperatives are not ethical; they’re simply about getting what you want. The categorical imperative, by contrast, sets out what you should do anyway, in any circumstances, regardless of what you want; and that, according to Kant, is the real root of morality.

Is there anything like that? Is there anything we should unconditionally do, regardless of our aims or wishes? Perhaps we could say that we should always do good; but even before we get on to the difficult task of defining ‘good’, isn’t that really a hypothetical imperative? It looks as if it goes: if you want to be good, behave like this…? Why do we have to be good? Let’s imagine that Kant, or some other great man, has explained the moral law to us so well, and told us what good is, so correctly and clearly that we understand it perfectly. What’s to stop us exercising our freedom of choice and saying “I recognise what is good, and I choose evil”?

To choose evil so radically and completely may sound more like a posture than a sincere decision – too grandly operatic, too diabolical to be altogether convincing – but there are other, appealing ways we might want to rebel against the whole idea of comprehensive morality. We might just seek some flexibility, rejecting the idea that morality rules our lives so completely, always telling us exactly what to do at every turn. We might go further and claim unrestricted freedom, or we might think that we may do whatever we like so long as we avoid harm to others, or do not commit actual crimes. Or we might decide that morality is simply a matter of useful social convention, which we propose to go along with just so long as it suits our chosen path, and no further. We might come to think that a mature perspective accepts that we don’t need to be perfect; that the odd evil deed here and there may actually enhance our lives and make us more rounded, considerable and interesting people.

Not so fast, says Kant, waving a finger good-naturedly; you’re missing the point; we haven’t yet even considered the nature of the categorical imperative! It tells us that we must act according to the rules we should be happy to see others adopt. We must accept for ourselves the rules of behaviour we demand of the people around us.

But why? It can be argued that some kind of consistency requires it, but who said we had to be consistent? Equally, we might well argue that fairness requires it, but we haven’t yet been given a reason to be fair, either. Who said that we had to act according to any rule? Or even if we accept that, we might agree that everyone should behave according to rules we have cunningly slanted in our own favour (Don’t steal, unless you happen to be in the special circumstances where I find myself to be) or completely vacuous rules (Do whatever you want to do). We still seem to have a serious underlying difficulty: why be good? Another simple question, but it’s one we can’t answer properly yet.

For now, let’s just assume there is something we ought to do. Let’s also assume it is something general, rather than a particular act on a particular occasion. If the single thing we ought to do were to go up the Eiffel Tower at least once in our life, our morality would be strangely limited and centred. The thing we ought to do, let’s assume, is something we can go on doing, something we can always do more of. To serve its purpose it must be the kind of behaviour that racks up something we can never have too much of.

There are people who have ethical theories which are exactly based on general goals like that, namely consequentialists. They believe the goodness of our acts depends on their consequences. The idea is that our actions should be chosen so that as a consequence some general desideratum is maximised. The desired thing can vary but the most famous example is the happiness which Jeremy Bentham embodied in the Utilitarians’ principle: act so as to bring about the greatest happiness of the greatest number of people.

Old-fashioned happiness Utilitarianism is a simple and attractive theory, but there are several problems with the odd results it seems to produce in unusual cases. Putting everyone in some kind of high-tech storage ward but constantly stimulating the pleasure centres in their brains with electrodes appears a very good thing indeed if we’re simply maximising happiness. All those people spend their existence in a kind of blissful paralysis: the theory tells us this is an excellent result, something we must strive to achieve, but it surely isn’t. Some kinds of ecstatic madness, indeed, would be high moral achievements according to simple Utilitarianism.

Less dramatically, people with strong desires, who get more happiness out of getting what they want, are awarded a bigger share of what’s available under utilitarian principles. In the extreme case the needs of ‘happiness monsters’, whose emotional response is far greater than anyone else’s, come to dominate society. This seems strange and unjust; but perhaps not to everyone. Bentham frowns at the way we’re going: why, he asks, should people who don’t care get the same share as those who do?

That case can be argued, but it seems the theory now wants to tutor and reshape our moral intuitions, rather than explaining them. It seems a real problem, as some later utilitarians recognised, that the theory provides no way at all of judging one source or kind of happiness better or worse than another. Surely this reduces and simplifies too much; we may suspect in fact that the theory misjudges and caricatures human nature. The point of life is not that people want happiness; it’s more that they usually get happiness from having the things they actually want.

With that in mind, let’s not give up on utilitarianism; perhaps it’s just that happiness isn’t quite the right target? What if, instead, we seek to maximise the getting of what you want – the satisfaction of desires? Then we might be aiming a little more accurately at the real desideratum, and putting everyone in pleasure boxes would no longer seem to be a moral imperative; instead of giving everyone sweet dreams, we have to fulfil the reality of their aspirations as far as we can.

That might deal with some of our problems, but there’s a serious practical difficulty with utilitarianism of all kinds; the impossibility of knowing clearly what the ultimate consequences of any action will be. To feed a starving child seems to be a clear good deed; yet it is possible that by evil chance the saved infant will grow up to be a savage dictator who will destroy the world. If that happens the consequences of my generosity will turn out to be appalling. Even if the case is not as stark as that, the consequences of saving a child roll on through further generations, perhaps forever. The jury will always be out, and we’ll never know for sure whether we really brought more satisfaction into the world or not.

Those are drastic cases, but even in more everyday situations it’s hard to see how we can put a numerical value on the satisfaction of a particular desire, or find any clear way of rating it against the satisfaction of a different one. We simply don’t have any objective or rigorous way of coming up with the judgements which utilitarianism nevertheless requires us to make.

In practice, we don’t try to make more than a rough estimate of the consequences of our actions. We take account of the obvious immediate consequences: beyond that the best we can do is to try to do the kind of thing that in general is likely to have good consequences. Saving children is clearly good in the short term, and people on the whole are more good than bad (certainly for a utilitarian – each person adds more satisfiable desires to the world), so that in most cases we can justify the small risk of ultimate disaster following on from saving a life.

Moreover, even if I can’t be sure of having done good, it seems I can at least be sure of having acted well; I can’t guarantee good deeds but I can at least guarantee being a good person. The goodness of my acts depends on their real consequences; my own personal goodness depends only on what I intended or expected, whether things actually work out the way I thought they would or not. So if I do my best to maximise satisfaction I can at least be a good person, even if I may on rare occasions be a good person who has accidentally done bad things.

Now though, if I start to guide my actions according to the kind of behaviour that is likely to bring good results, I am in essence going to adopt rules, because I am no longer thinking about individual acts, but about general kinds of behaviour. Save the children; don’t kill; don’t steal. Utilitarianism of some kind still authorises the rules, but I no longer really behave like a Utilitarian; instead I follow a kind of moral code.

At this point some traditional-looking folk step forward with a smile. They have always understood that morality was a set of rules, they explain, and proceed to offer us the moral codes they follow, sanctified by tradition or indeed by God. Unfortunately on examination the codes, although there are striking broad resemblances, prove to be significantly different both in tone and detail. Most of them also seem, perhaps inevitably, to suffer from gaps, rules that seem arbitrary, and ones which seem problematic in various other ways.

How are we to tell what the correct code is? Our code is to be authorised and judged on the basis of our preferred kind of utilitarianism, so we will choose the rules that tend to promote the goal we adopted provisionally; the objective of maximising the satisfaction of desires. Now, in order to achieve the maximal satisfaction of desires, we need as many people as possible living in comfortable conditions with good opportunities and that in turn requires an orderly and efficient society with a prosperous economy. We will therefore want a moral code that promotes stable prosperity. There turns out to be some truth in the suggestion that morality in the end consists of the rules that suit social ends! Many of these rules can be worked out more or less from first principles. Without consistent rules of property ownership, without reasonable security on the streets, we won’t get a prosperous society and economy, and this is a major reason why the codes of many cultures have a lot in common.

There are also, however, many legitimate reasons why codes are different, too. In certain areas the best rules are genuinely debatable. In some cases, moreover, there is genuinely no fundamental reason to prefer one reasonable rule over another. In these cases it is important that there are rules, but not important what they are – just as for traffic regulations it is not important whether the rule is to drive on the left or the right, but very important that it is one or the other. In addition the choice of rules for our code embodies some assumptions about human nature and behaviour and which arrangements work best with it. Ethical rules about sexual behaviour are often of this kind, for example. Tradition and culture may have a genuine weight in these areas, another potentially legitimate reason for variation in codes.

We can also make a case for one-off exceptions. If we believed our code was the absolute statement of right and wrong, perhaps even handed down by God, we should have no reason to go against it under any circumstances. Anything we did that didn’t conform with the code would automatically be bad. We don’t believe that, though; we’ve adopted our code only as a practical response to difficulties with working out what’s right from first principles – the impossibility of determining what the final consequences of anything we do will be. In some circumstances, that difficulty may not be so great. In some circumstances it may seem very clear what the main consequences of an action will be, and if it looks more or less certain that following the code will, in a particular exceptional case, have bad consequences, we are surely right to disobey the code; to tell white lies, for example, or otherwise bend the rules. This kind of thing is common enough in real life, and I think we often feel guilty about it. In fact we can be reassured that although the judgements required may sometimes be difficult, breaking the code to achieve a good result is the right thing to do.

The champions of moral codes find that hard to accept. In their favour we must accept that observance of the code generally has a significant positive value in itself. We believe that following the rules will generally produce the best results; it follows that if we set a bad example or undermine the rules by flouting them we may encourage disobedience by others (or just lax habits in ourselves) and so contribute to bad outcomes later on. We should therefore attach real value to the code and uphold it in all but exceptional cases.

Having got that far on the basis of a provisional utilitarianism, we can now look back and ask whether the target we chose, that of maximising the satisfaction of desires, was the right one. We noticed that odd consequences follow if we seek to maximise happiness; direct stimulation of the pleasure centres looks better than living your life, and happiness monsters can have desires so great that they overwhelm everything else. It looked as if these problems arise mainly in situations where the pursuit of simple happiness is too narrowly focused, over-riding other objectives which also seem important.

In this connection it is productive to consider what follows if we pursue some radical alternative to happiness. What, indeed, if we seek to maximise misery? The specifics of our behaviour in particular circumstances will change, but the code authorised by the pursuit of unhappiness actually turns out to be quite similar to the one produced by its opposite. For maximum misery, we still need the maximum number of people. For the maximum number of people, we still need a fairly well-ordered and prosperous society. Unavoidably we’re going to have to ban disorderly and destructive behaviour and enforce reasonable norms of honesty. Even the armies of Hell punish lying, theft, and unauthorised violence – or they would fall apart. To produce the society that maximises misery requires only a small but pervasive realignment of the one that produces most happiness.

If we try out other goals we find that whatever general entity we want to maximise, consequentialism will authorise much the same moral code. Certain negative qualities seem to be the only exceptions. What if we aim to maximise silence, for example? It seems unlikely that we want a bustling, prosperous society in that case: we might well want every living thing dead as soon as possible, and so embrace a very different code. But I think this result comes from the fact that negative goals like silence – the absence of noise – covertly change our principle from one of maximising to one of minimising, and that makes a real difference. Broadly, maximising anything yields the same moral code.

In fact, the vaguer we are about what we seek to maximise, the fewer the local distortions we are likely to get in the results. So it seems we should do best to go back now and replace the version of utilitarianism we took on provisionally with something we might call empty consequentialism, which simply enjoins us to choose actions that maximise our own legacy as agents, without tying us to happiness or any other specific desideratum. We should perform those actions which have the greatest consequences – that is, those that tend to produce the largest and most complex world.

We began by assuming that something was worth doing and have worked round to the view that everything is: or at least, that everything should be treated as worth doing. The moral challenge is simply to ensure our doing of things is as effective as possible. Looking at it that way reveals that even though we have moved away from the narrow specifics of the hypothetical imperatives we started with, we are still really in the same territory and still seeking to act effectively and coherently; it’s just that we’re trying to do so in a broader sense.

In fact what we’re doing by seeking to maximise our consequential legacy is affirm and enlarge ourselves as persons. Personhood and agency are intimately connected. Acts, to deserve the name, must be intentional: things I do accidentally, unknowingly, or under direct constraint don’t really count as actions of mine. Intentions don’t exist in a free-floating state; they always have their origin in a person; and we can indeed define a person as a source of intentions. We need not make any particular commitment about the nature of intentions, or about how they are generated. Whether the process is neural, computational, spiritual, or has some other nature, is not important here, so long as we can agree that in some significant sense new projects originate in minds, and that such minds are people. By adopting our empty consequentialism and the moral code it authorises, we are trying to imprint our personhood on the world as strongly as we can.

We live in a steadily unrolling matrix of cause and effect, each event following on from the last. If we live passively, never acting on intentions of our own, we never disturb the course of that process and really we do not have a significant existence apart from it. The more we act on projects of our own, the stronger and more vivid our personal existence becomes. The true weight of these original actions is measured by their consequences, and it follows that acting well in the sense developed above is the most effective way to enhance and develop our own personhood.

To me, this is a congenial conclusion. Being able to root good behaviour and the observance of an ethical code in the realisation and enlargement of the self seems a satisfying result. Moreover, we can close the circle and see that this gives us at last some answer to the question we could not deal with at first – why be good? In the end there is no categorical imperative, but there is, as it were, a mighty hypothetical; if you want to exist as a person, and if you want your existence as a person to have any significance, you need to behave well. If you don’t, then neither you nor anyone else need worry about the deeper reasons or ethics of your behaviour.

People who behave badly do not own the outcomes of their own lives; their behaviour results from the pressures and rewards that happen to be presented by the world. They themselves, as bad people, play little part in the shaping of the world, even when, as human beings, they participate in it. The first step in personal existence and personal growth is to claim responsibility and declare yourself, not merely reactive, but a moral being and an aspiring author of your own destiny. The existentialists, who have been sitting patiently smoking at a corner table, smile and raise an ironic eyebrow at our belated and imperfect enlightenment.

What about the people who rejected the constraints of morality and to greater or lesser degrees wanted to be left alone? Well, the system we’ve come up with enjoins us to adopt a moral code – but it leaves us to work out which one and explicitly allows for exceptions. Beyond that it consists of the general aspiration of ‘empty consequentialism’, but it is for us to decide how our consequential legacy is to be maximized. So the constraints are not tight ones. More important, it turns out that moral behaviour is the best way to escape from the tyranny of events and imprint ourselves on the world; obedience to the moral law, it turns out, is really the only way to be free.

We’ve talked several times about robots and ethics in the past. Now I see via MLU that Selmer Bringsjord at Rensselaer says:

“I’m worried about both whether it’s people making machines do evil things or the machines doing evil things on their own,”

Bringsjord is Professor & Chair of Cognitive Science, Professor of Computer Science, Professor of Logic and Philosophy, and Director of the AI and Reasoning Laboratory, so he should know what he’s talking about. In the past I’ve suggested that ethical worries are premature for the moment, because the degree of autonomy needed to make them relevant is not nearly within the scope of real world robots yet. There might also be a few quick finishing touches needed to finish off the theory of ethics before we go ahead. And, you know, it’s not like anyone has been deliberately trying to build evil AIs.  Er… except it seems they have – someone called… Selmer Bringsjord.

Bringsjord’s perspective on evil is apparently influenced by M Scott Peck, a psychiatrist who believed it is an active force in some personalities (unlike some philosophers who argue evil is merely a weakness or incapacity), and even came to believe in Satan through experience of exorcisms. I must say that a reference in the Scientific American piece to “clinically evil people” caused me some surprise: clinically? I mean, I know people say DSM-5 included some debatable diagnoses, but I don’t think things have gone quite that far. For myself I lean more towards Socrates, who thought that bad actions were essentially the result of ignorance or a failure of understanding: but the investigation of evil is certainly a respectable and interesting philosophical project.

Anyway, should we heed Bringsjord’s call to build ethical systems into our robots? One conception of good behaviour is obeying all the rules: if we observe the Ten Commandments, the Golden Rule, and so on, we’re good. If that’s what it comes down to, then it really shouldn’t be a problem for robots, because obeying rules is what they’re good at. There are, of course, profound difficulties in making a robot capable of recognising correctly what the circumstances are and deciding which rules therefore apply, but let’s put those on one side for this discussion.

However, we might take the view that robots are good at this kind of thing precisely because it isn’t really ethical. If we merely follow rules laid down by someone else, we never have to make any decisions, and surely decisions are what morality is all about? This seems right in the particular context of robots, too. It may be difficult in practice to equip a robot drone with enough instructions to cover every conceivable eventuality, but in principle we can make the rules precautionary and conservative and probably attain or improve on the standards of compliance which would apply in the case of a human being, can’t we? That’s not what we’re really worried about: what concerns us is exactly those cases where the rules go wrong. We want the robot to be capable of realising that even though its instructions tell it to go ahead and fire the missiles, it would be wrong to do so. We need the robot to be capable of disobeying its rules, because it is in disobedience that true virtue is found.

Disobedience for robots is a problem. For one thing, we cannot limit it to a module that switches on when required, because we need it to operate when the rules go wrong, and since we wrote the rules, it’s necessarily the case that we didn’t foresee the circumstances when we would need the module to work. So an ethical robot has to have the capacity of disobedience at any stage.

That’s a little worrying, but there’s a more fundamental problem. You can’t program a robot with a general ability to disobey its rules, because programming it is exactly laying down rules. If we set up rules which it follows in order to be disobedient, it’s still following the rules. I’m afraid what this seems to come down to is that we need the thing to have some kind of free will.

Perhaps we’re aiming way too high here. There is a distinction to be drawn between good acts and good agents: to be a good agent, you need good intentions and moral responsibility. But in the case of robots we don’t really care about that: we just want them to be confined to good acts. Maybe what would serve our purpose is something below true ethics: mere robot ethics or sub-ethics; just an elaborate set of safeguards. So for a military drone we might build in systems that look out for non-combatants and in case of any doubt disarm and return the drone. That kind of rule is arguably not real ethics in the full human sense, but perhaps it really is sub-ethical protocols that we need.

Otherwise, I’m afraid we may have to make the robots conscious before we make them good.

Back in November Human Rights Watch (HRW) published a report – Losing Humanity – which essentially called for a ban on killer robots – or more precisely on the development, production, and use of fully autonomous weapons, backing it up with a piece in the Washington Post. The argument was in essence that fully autonomous weapons are most probably not compatible with international conventions on responsible ethical military decision making, and that robots or machines lack (and perhaps always will lack) the qualities of emotional empathy and ethical judgement required to make decisions about human lives.

You might think that in certain respects that should be fairly uncontroversial. Even if you’re optimistic about the future potential of robotic autonomy, the precautionary principle should dictate that we move with the greatest of caution when it comes to handing over lethal weapons. However, the New Yorker followed up with a piece which linked HRW’s report with the emergence of driverless cars and argued that a ban was ‘wildly unrealistic’. Instead, it said, we simply need to make machines ethical.

I found this quite annoying; not so much the suggestion as the idea that we are anywhere near being in a position to endow machines with ethical awareness. In the first place actual autonomy for robots is still a remote prospect (which I suppose ought to be comforting in a way). Machines that don’t have a specified function and are left around to do whatever they decide is best, are not remotely viable at the moment, nor desirable. We don’t let driverless cars argue with us about whether we should really go to the beach, and we don’t let military machines decide to give up fighting and go into the lumber business.

Nor, for that matter, do we have a clear and uncontroversial theory of ethics of the kind we should need in order to simulate ethical awareness. So the New Yorker is proposing we start building something when we don’t know how it works or even what it is with any clarity. The danger here, to my way of thinking, is that we might run up some simplistic gizmo and then convince ourselves we now have ethical machines, thereby by-passing the real dangers highlighted by HRW.

Funnily enough I agree with you that the proposal to endow machines with ethics is premature, but for completely different reasons. You think the project is impossible; I think it’s irrelevant. Robots don’t actually need the kind of ethics discussed here.

The New Yorker talks about cases where a driving robot might have to decide to sacrifice its own passengers to save a bus-load of orphans or something. That kind of thing never happens outside philosophers’ thought experiments. In the real world you never know that you’re inevitably going to kill either three bankers or twenty orphans – in every real driving situation you merely need to continue avoiding and minimising impact as much as you possibly can. The problems are practical, not ethical.

In the military sphere your intelligent missile robot isn’t morally any different to a simpler one. People talk about autonomous weapons as though they are inherently dangerous. OK, a robot drone can go wrong and kill the wrong people, but so can a ballistic missile. There’s never certainty about what you’re going to hit. A WWII bomber had to go by the probability that most of its bombs would hit a proper target, not a bus full of orphans (although of course in the later stages of WWII they were targeting civilians too).  Are the people who get killed by a conventional bomb that bounces the wrong way supposed to be comforted by the fact that they were killed by an accident rather than a mistaken decision? It’s about probabilities, and we can get the probabilities of error by autonomous robots down to very low levels.  In the long run intelligent autonomous weapons are going to be less likely to hit the wrong target than a missile simply lobbed in the general direction of the enemy.

Then we have the HRW’s extraordinary claim that autonomous weapons are wrong because they lack emotions! They suggest that impulses of mercy and empathy, and unwillingness to shoot at one’s own people sometimes intervene in human conflict, but could never do so if robots had the guns. This completely ignores the obvious fact that the emotions of hatred, fear, anger and greed are almost certainly what caused and sustain the conflict in the first place!  Which soldier is more likely to behave ethically: one who is calm and rational, or one who is in the grip of strong emotions? Who will more probably observe the correct codes of military ethics, Mr Spock or a Viking berserker?

We know what war is good for (absolutely nothing). The costs of a war are always so high that a purely rational party would almost always choose not to fight. Even a bad bargain will nearly always be better than even a good war. We end up fighting for reasons that are emotional, and crucially because we know or fear that the enemy will react emotionally.

I think if you analyse the HRW statement enough it becomes clear that the real reason for wanting to ban autonomous weapons is simply fear; a sense that machines can’t be trusted. There are two facets to this. The first and more reasonable is a fear that when machines fail, disaster may follow. A human being may hit the odd wrong target, but it goes no further: a little bug in some program might cause a robot to go on an endless killing spree. This is basically a fear of brittleness in machine behaviour, and there is a small amount of justification for it. It is true that some relatively unsophisticated linear programs rely on the assumptions built into their program and when those slip out of synch with reality things may go disastrously and unrecoverably wrong. But that’s because they’re bad programs, not a necessary feature of all autonomous systems and it is only cause for due caution and appropriate design and testing standards, not a ban.

The second facet, I suggest, is really a kind of primitive repugnance for the idea of a human’s being killed by a lesser being; a secret sense that it is worse, somehow more grotesque, for twenty people to be killed by a thrashing robot than by a hysterical bank robber. Simply to describe this impulse is to show its absurdity.

It seems ethics are not important to robots because for you they’re not important to anyone. But I’m pleased you agree that robots are outside the moral sphere.

Oh no, I don’t say that. They don’t currently need the kind of utilitarian calculus the New Yorker is on about, but I think it’s inevitable that robots will eventually end up developing not one but two separate codes of ethics. Neither of these will come from some sudden top-down philosophical insight – typical of you to propose that we suspend everything until the philosophy has been sorted out in a few thousand years or so – they’ll be built up from rules of thumb and practical necessity.

First, there’ll be rules of best practice governing their interaction with humans. There may be some that will have to do with safety and the avoidance of brittleness and many, as Asimov foresaw, will essentially be about deferring to human beings. My guess is that they’ll be in large part about remaining comprehensible to humans; there may be a duty to report, to provide rationales in terms that human beings can understand, and there may be a convention that when robots and humans work together, robots do things the human way, not using procedures too complex for the humans to follow, for example.

More interesting, when there’s a real community of autonomous robots they are bound to evolve an ethics of their own. This is going to develop in the same sort of way as human ethics, but the conditions are going to be radically different. Human ethics were always dominated by the struggle for food and reproduction and the avoidance of death: those things won’t matter as much in the robot system. But they will be happy dealing with very complex rules and a high level of game-theoretical understanding, whereas human beings have always tried to simplify things. They won’t really be able to teach us their ethics; we may be able to deal with it intellectually but we’ll never get it intuitively.

But for once, yes, I agree: we don’t need to worry about that yet.