Is there an intermediate ethical domain, suitable for machines?
The thought is prompted by this summary of an interesting seminar on Engineering Moral Agents, one of the ongoing series hosted at Schloss Dagstuhl. It seems to have been an exceptionally good session which got into some of the issues in a really useful way – practically oriented but not philosophical naive. It noted the growing need to make autonomous robots – self-driving cars, drones, and so on – able to deal with ethical issues. On the one hand it looked at how ethical theories could be formalised in a way that would lend itself to machine implantation, and on the other how such a formalisation could in fact be implemented. It identified two broad approaches: top-down, where in essence you hard-wire suitable rules into the machine, and bottom-up, where the machine learns for itself from suitable examples. The approaches are not necessarily exclusive, of course.
The seminar thought that utilitarian or Kantian theories of morality were both prima facie candidates for formalisation. Utilitarian or more broadly, consequentialist theories look particularly promising because calculating the optimal value (such as the greatest happiness of the greatest number) achievable from the range of alternatives on offer looks like something that can be reduced to arithmetic fairly straightforwardly. There are problems in that consequentialist theories usually yield at least some results that look questionable in common sense terms (finding the initial values to slot into your sums is also a non-trivial challenge – how do you put a clear numerical value on people’s probable future happiness?)
A learning system eases several of these problems. You don’t need a fully formalised system (so long as you can agree on a database of examples). But you face the same problems that arise for learning systems in other contexts; you can’t have the assurance of knowing why the machine behaves as it does, and if your database had unnoticed gaps or bias you may suffer from sudden catastrophic mistakes. The seminar summary rightly notes that a machine that learned its ethics will not be able to explain its behaviour; but I don’t know that that means it lacks agency; many humans would struggle to explain their moral decisions in a way that would pass muster philosophically. Most of us could do no more than point to harms avoided or social rules observed at best.
The seminar looked at some interesting approaches, mentioned here with tantalising brevity: Horty’s default logic, Sergot’s STIT (See To It That) logic; and the possibility of drawing on the decision theory already developed in the context of micro-economics. This is consequentialist in character and there was an examination of whether in fact all ethical theories can be restated in consequentialist terms (yes, apparently, but only if you’re prepared to stretch the idea of a consequence to a point where the idea becomes vacuous). ‘Reason-based’ formalisations presented by List and Dietrich interestingly get away from narrow consequentialisms and their problems using a rightness function which can accommodate various factors.
The seminar noted that society will demand high, perhaps precautionary standards of safety from machines, and floated the idea of an ethical ‘black box’ recorder. It noted the problem of cultural neutrality and the risk of malicious hacking. It made the important point that human beings do not enjoy complete ethical agreement anyway, but argue vigorously about real issues.
The thing that struck me was how far it was possible to go in discussing morality when it is pretty clear that the self-driving cars and so on under discussion actually have no moral agency whatever. Some words of caution are in order here. Some people think moral agency is a delusion anyway; some maintain that on the contrary, relatively simple machines can have it. But I think for the sake of argument we can assume that humans are moral beings, and that none of the machines we’re currently discussing is even a candidate for moral agency – though future machines with human-style general understanding may be.
The thing is that successful robots currently deal with limited domains. A self-driving car can cope with an array of entities like road, speed, obstacle, and so on; it does not and could not have the unfettered real-world understanding of all the concepts it would need to make general ethical decisions about, for example, what risks and sacrifices might be right when it comes to actual human lives. Even Asimov’s apparently simple Laws of Robotics required robots to understand and recognise correctly and appropriately the difficult concept of ‘harm’ to a human being.
One way of squaring this circle might be to say that, yes, actually, any robot which is expected to operate with any degree of autonomy must be given a human-level understanding of the world. As I’ve noted before, this might actually be one of the stronger arguments for developing human-style artificial general intelligence in the first place.
But it seems wasteful to bestow consciousness on a roomba, both in terms of pure expense and in terms of the chronic boredom the poor thing would endure (is it theoretically possible to have consciousness without the capacity for boredom?). So really the problem that faces us is one of making simple robots, that operate on restricted domains, able to deal adequately with occasional issues from the unrestricted domain of reality. Now clearly ‘adequate’ is an important word there. I believe that in order to make robots that operate acceptably in domains they cannot understand, we’re going to need systems that are conservative and tend towards inaction. We would not, I think, accept a long trail of offensive and dangerous behaviour in exchange for a rare life-saving intervention. This suggests rules rather than learning; a set of rules that allow a moron to behave acceptably without understanding what is going on.
Do these rules constitute a separate ethical realm, a ‘sub-ethics’ that substitute for morality when dealing with entities that have autonomy but no agency? I rather think they might.