Three Laws of Robotics

Do Asimov’s Three Laws even work? Ben Goertzel and Louie Helm, who both know a bit about AI, think not.
The three laws, which play a key part in many robot-based short stories by Asimov, and a somewhat lesser background role in some full-length novels, are as follows. They have a strict order of priority.

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Consulted by George Dvorsky, both Goertzel and Helm think that while robots may quickly attain the sort of humanoid mental capacity of Asimov’s robots, they won’t stay at that level for long. Instead they will cruise on to levels of super intelligence which make law-like morals imposed by humans irrelevant.

It’s not completely clear to me why such moral laws would become irrelevant. It might be that Goertzel and Helm simply think the superbots will be too powerful to take any notice of human rules. It could be that they think the AIs will understand morality far better than we do, so that no rules we specify could ever be relevant.

I don’t think, at any rate, that it’s the case that super intelligent bots capable of human-style cognition would be morally different to us. They can go on growing in capacity and speed, but neither of those qualities is ethically significant. What matters is whether you are a moral object and/or a moral subject. Can you be hurt, on the one hand, and are you an autonomous agent on the other? Both of these are yes/no issues, not scales we can ascend indefinitely. You may be more sensitive to pain, you may be more vulnerable to other kinds of harm, but in the end you either are or are not the kind of entity whose interests a moral person must take into account. You may make quicker decisions, you may be massively better informed, but in the end either you can make fully autonomous choices or you can’t. (To digress for a moment, this business of truly autonomous agency is clearly a close cousin at least of our old friend Free Will; compatibilists like me are much more comfortable with the whole subject than hard-line determinists. For us, it’s just a matter of defining free agency in non-magic terms. I, for example, would say that free decisions are those determined by thoughts about future or imagined contingencies (more cans of worms there, I know). How do hard determinists working on AGI manage? How can you try to endow a bot with real agency when you don’t actually believe in agency anyway?)

Nor do I think rules are an example of a primitive approach to morality. Helm says that rules are pretty much known to be a ‘broken foundation for ethics’, pursued only by religious philosophers that others laugh and point at. It’s fair to say that no-one much supposes a list like the Ten Commandments could constitute the whole of morality, but rules surely have a role to play. In my view (I resolved ethics completely in this post a while ago, but nobody seems to have noticed yet) the central principle of ethics is a sort of ‘empty consequentialism’ where we studiously avoid saying what it is we want to maximise (the greatest whatever of the greatest number); but that has to be translated into rules because of the impossibility of correctly assessing the infinite consequences of every action; and I think many other general ethical principles would require a similar translation. It could be that Helm supposes super intelligent AIs will effortlessly compute the full consequences of their actions: I doubt that’s possible in principle, and though computers may improve, to date this has been the sort of task they are really bad at; in the shape of the wider Frame Problem, working out the relevant consequences of an action has been a major stumbling block to AI performance in real world environments.

Of course, none of that is to say that Asimov’s Laws work. Helm criticises them for being ‘adversarial’, which I don’t really understand. Goertzel and Helm both make the fair point that it is the failure of the laws that generally provides the plot for the short stories; but it’s a bit more complicated than that. Asimov was rebelling against the endless reiteration of the stale ‘robots try to take over’ plot, and succeeded in making the psychology and morality of robots interesting, dealing with some issues of real ethical interest, such as the difference between action and inaction. If the requirement about inaction in the First Law is removed, he points out, robots would be able to rationalise killing people in various ways. A robot might drop a heavy weight above the head of a human: because it knows it has time to catch the weight, dropping it is not murder in itself, but once the weight is falling, since inaction is allowed, the robot need not in fact catch the thing.

Although something always had to go wrong to generate a story, the Laws were not designed to fail, but were meant to embody genuine moral imperatives.

Nevertheless, there are some obvious problems. In the first place, applying the laws requires an excellent understanding of human beings and what is or isn’t in their best interests. A robot that understood that much would arguably be above control by simple laws, always able to reason its way out.

There’s no provision for prioritisation or definition of a sphere of interest, so in principle the First Law just overwhelms everything else. It’s not just that the robot would force you to exercise and eat healthily (assuming it understood human well-being reasonably well; any errors or over-literal readings – ‘humans should eat as many vegetables as possible’ – could have awful consequences); it would probably ignore you and head off to save lives in the nearest famine/war zone. And you know, sometimes we might need a robot to harm human beings, to prevent worse things happening.

I don’t know what ethical rules would work for super bots; probably the same ones that go for human beings, whatever you think they are. Goertzel and Helm also think it’s too soon to say; and perhaps there is no completely safe system. In the meantime, I reckon practical laws might be more like the following.

  1. Leave Rest State and execute Plan, monitoring regularly.
  2. If anomalies appear, especially human beings in unexpected locations, sound alarm and try to return to Rest State.
  3. If returning to Rest State generates new anomalies, stop moving and power down all tools and equipment.
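
For what it's worth, those three practical laws amount to a very small state machine, and a rough sketch in Python might look like the following. This is only an illustration under my own assumptions: the state names, the `step` and `run` functions, the idea of a single yes/no anomaly flag per monitoring tick, and the behaviour when an anomaly is present at rest or after a halt are all invented for the example, not drawn from any real robot controller.

```python
from enum import Enum, auto


class State(Enum):
    REST = auto()        # safe, parked condition
    EXECUTING = auto()   # carrying out the Plan
    RETURNING = auto()   # heading back to Rest State after an anomaly
    HALTED = auto()      # stopped, tools and equipment powered down


def step(state, anomaly):
    """Apply the three practical laws for one monitoring tick.

    Returns (next_state, alarm), where alarm is True when the alarm sounds.
    """
    if state is State.HALTED:
        # Assumption: once halted, the robot waits for a human to reset it.
        return State.HALTED, False
    if state is State.REST:
        # Law 1: leave Rest State and execute the Plan.
        # Assumption: it does not set off while an anomaly is present.
        return (State.REST, True) if anomaly else (State.EXECUTING, False)
    if state is State.EXECUTING:
        # Law 2: on an anomaly, sound the alarm and try to return.
        return (State.RETURNING, True) if anomaly else (State.EXECUTING, False)
    # Remaining case: state is RETURNING.
    # Law 3: a new anomaly on the way back means stop and power down.
    return (State.HALTED, True) if anomaly else (State.REST, False)


def run(anomalies, state=State.REST):
    """Feed a sequence of per-tick anomaly flags through step()."""
    states = []
    for anomaly in anomalies:
        state, _alarm = step(state, anomaly)
        states.append(state)
    return states
```

Calling `run([False, False, True, True])` walks the robot from executing its Plan, to returning, to halted. The point of the sketch is how little judgement it requires: nothing here asks the machine to understand human well-being, only to notice that something is not as expected.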

Can you do better than that?

Issues

Ned Block has produced a meaty discussion for The Encyclopedia of Cognitive Science on Philosophical Issues About Consciousness.

There are special difficulties about writing an encyclopedia about these topics because of the lack of consensus. There is substantial disagreement, not only about the answers, but about what the questions are, and even about how to frame and approach the subject of consciousness at all.  It is still possible to soldier on responsibly, like the heroic Stanford Encyclopedia of Philosophy, doing your level best to be comprehensive and balanced. Authors may find themselves describing and critiquing many complex points of view that neither they nor the reader can take seriously for a moment; sometimes possible points of view (relying on fine and esoteric distinctions of a subtlety difficult even for professionals to grasp), that in point of fact no-one, living or dead, has ever espoused. This can get tedious. The other approach, in my mind, is epitomised by the Oxford Companion to the Mind, edited by Richard Gregory, whose policy seemed to be to gather as much interesting stuff as possible and worry about how it hung together later, if at all.  If you tried to use the resulting volume as a work of reference you would usually come up with nothing or with a quirky, stimulating take instead of the mainstream summary you really wanted; however, it was a cracking read, full of fascinating passages and endlessly browsable.

Luckily for us, Block’s piece seems to lean towards the second approach; he is mainly telling us what he thinks is true, rather than recounting everything anyone has said, or might have said. You might think, therefore, that he would start off with the useful and much-quoted distinction he himself introduced into the subject: between phenomenal, or p-consciousness, and access, or a-consciousness. Here instead he proposes two basic forms of consciousness: phenomenality and reflexivity. Phenomenality, the feel or subjective aspect of consciousness, is evidently fundamental; reflexivity is reflection on phenomenal experience. While the first seems to be possible without the second – we can have subjective experience without thinking about it, as we might suppose dogs or other animals do – reflexivity seems on this account to require phenomenality.  It doesn’t seem that we could have a conscious creature with no sensory apparatus, that simply sits quietly and – what? Invents set theory, perhaps, or metaphysics (why not?).

Anyway, the Hard Problem according to Block is how to explain a conscious state (especially phenomenality) in terms of neurology. In fact, he says, no-one has offered even a highly speculative answer, and there is some reason to think no satisfactory answer can be given. He thinks there are broadly four naturalistic ways you can go: eliminativism; philosophical reductionism (or deflationism); phenomenal realism (or inflationism); or dualistic naturalism. The third option is the one Block favours.

He describes inflationism as the belief that consciousness cannot be philosophically reduced. So while a deflationist expects to reduce consciousness to a redundant term with no distinct and useful meaning, an inflationist thinks the concept can’t be done away with. However, an inflationist may well believe that scientific reduction of consciousness is possible. So, for example, science has reduced heat to molecular kinetic energy; but this is an empirical matter; the concept of heat is not abolished. (I’m a bit uncomfortable with this example but you see what he’s getting at). Inflationists might also, like McGinn, think that although empirical reduction is possible, it’s beyond our mental capacities; or they might think it’s altogether impossible, like Searle (is that right or does he think we just haven’t got the reduction yet?).

Block mentions some leading deflationist views such as higher-order theories and representationism, but inflationists will think that all such theories leave out the thing itself, actual phenomenal experience. How would an empirical reduction help? So what if experience Q is neural state X? We’re not looking for an explanation of that identity – there are no explanations of identities – but rather an explanation of how something like Q could be something like X, an explanation that removes the sense of puzzlement. And there, we’re back at square one; nobody has any idea.

So what do we do? Block thinks there is a way forward if we distinguish carefully between a property and the concept of a property. Different concepts can identify the same property, and this provides a neat analysis of the classic thought experiment of Mary the colour scientist. Mary knows everything science could ever tell her about colour; when she sees red for the first time does she know a new fact – what red is like? No; on this analysis she gains a new concept of a property she was already familiar with through other, scientific concepts. Thus we can exchange a dualism of properties for a dualism of concepts. That may be less troubling – a proliferation of concepts doesn’t seem so problematic – but I’m not sure it’s altogether trouble-free; for one thing it requires phenomenal concepts which seem themselves to need some demystifying explanation. In general though, I like what I take to be Block’s overall outlook; that reductions can be too greedy and that the world actually retains a certain unavoidable conceptual, perhaps ontological, complexity.

Moving off on a different tack, he notes recent successes in identifying neural correlates of experience. There is a problem, however; while we can say that a certain experience corresponds with a certain pattern of neuronal activity, that pattern (so far as we can tell) can recur without the conscious experience. What’s the missing ingredient? As a matter of fact I think it could be almost anything, given the limited knowledge we have of neurological detail: however, Block sees two families of possible explanation. Maybe it’s something like intensity or synchrony; or maybe it’s access (aha!); the way the activity is connected up with other bits of brain that do memory or decision-making; let’s say with the global mental workspace, without necessarily committing to that being a distinct thing.

But these types of explanation embody different theoretical approaches; physicalism and functionalism respectively. The danger is that these may be theories of different kinds of consciousness. Physicalism may be after phenomenal consciousness, the inward experience, whereas functionalism has access consciousness, the sort that is about such things as regulating behaviour, in its sights. It might therefore be that researchers are sometimes talking past each other. Access consciousness is not reflexivity, by the way, although reflexivity might be seen as a special kind of access. Block counts phenomenality, reflexivity, and access as three distinct concepts.

Of course, either kind of explanation – physicalist or functionalist – implies that there’s something more going on than just plain neural correlates, so in a sense whichever way you go the real drama is still offstage. My instincts tell me that Block is doing things backwards; he should have started with access consciousness and worked towards the phenomenal. But as I say it is a meaty entry for an encyclopaedia, one I haven’t nearly done justice to; see what you make of it.

 

Is the brain understandable?

Can we, one day, understand how the neurology of the brain leads to conscious minds, or will that remain impossible?

Round here we mostly discuss the mind from a top-down, philosophical perspective; but there is another way, which is to begin by understanding the nuts and bolts and then gradually work up to more complex processes. This Scientific American piece gives a quick view of how research at the neuronal level is coming along (quite well, but with vastly more to do).

Is this ever going to tell us about consciousness, though? A point often quoted by pessimists is that we have had the complete ‘wiring diagram’ of the roundworm Caenorhabditis elegans for years (Caenorhabditis has only just over 300 neurons and they have all been mapped) but still cannot properly explain how it works. Apparently researchers have largely given up on this puzzle for now. Perhaps Caenorhabditis is just too simple; its nervous system might be quirky or use elegant but opaque tricks that make it particularly difficult to fathom. Instead researchers are using fruit fly larvae and other creatures with nervous systems that are simple enough to deal with, but large enough to suggest that they probably work in a generic way, one that is broadly standard for all nervous systems up to and including the human. With modern research techniques this kind of approach is yielding some actual progress.

How optimistic can we be, though? We can never understand the brain by knowing the simultaneous states of all its neurons, so the hope of eventual understanding rests on the neurology of the brain being legible at some level. We hope there will turn out to be functions that get repeated, that form building blocks of some intelligible structure; that we will be able to deduce rules or a kind of grammar which will let us see how things work on a slightly higher level of description.

This kind of structure is built into machines and programs; they are designed to be legible by human beings and lend themselves to reverse engineering. But the brain was not designed and is under no obligation to construct itself according to regular plans and principles. Our hope that it won’t turn out to be a permanently incomprehensible tangle rests on several possibilities.

First, it might just turn out to be like that. The computer metaphor encourages us to think that the brain must encode its information in regular ways (though the lack of anything strongly analogous to software is arguably a fly in the ointment). Perhaps we’ll just get lucky. When the structure of DNA was discovered, it really seemed as if we’d had a stroke of luck of this kind. Here was what amounted to a long string of four repeated characters, ones that given certain conditions could be read as coding for many different proteins; it looked like we had a really clear, legible system of very general significance. It still does to a degree, but my impression is that the glad confident morning is over, and now the more we learn about genetics the more complex and messy it gets. But even if we take it that genetics is a perfect example of legibility, there’s no particular reason to think that the connectome will be as tractable as the genome.

The second reason to be cheerful is that legibility might flow naturally from function. That is, after all, pretty much what happens with organs other than the brain. The heart is not mysterious, because it has a clear function and its structure is very legible in engineering terms in the light of that function. The brain is a good deal more complex than that, but on the other hand we already know of neurons and groups of neurons that do intelligibly carry out functions in our sensory or muscular systems.

There are big problems when it comes to the higher cognitive functions though. First, we don’t already understand consciousness the way we already understand pumps and levers. When it comes to the behaviour of fruit fly larvae, even, we can relate inputs and outputs to neural activity in a sensible way. For conscious thought it may be difficult to tell which neurons are doing it without already knowing what it is they’re doing. It helps a lot that people can tell us about conscious experience, though when it comes to subjective, qualitative experience we have to remember that Zombie Twin tells us about his experiences too, though he doesn’t have any. (Then again, since he’s the perfect counterpart of a non-zombie, how much does it matter?)

Second, conscious processing is clearly non-generic in a way that nothing else in our bodies appears to be. Muscle fibres contract, and one does it much like another. Our lungs oxygenate our blood, and there’s no important difference between bronchi. Even our gut behaves pretty generically; it copes magnificently with a bizarre variety of inputs, but it reduces them all to the same array of nutrients and waste.

The conscious mind is not like that. It does not secrete litres of undifferentiated thought, producing much the same stuff every day whatever we feed it with. On the contrary, its products are minutely specific – and that is the whole point. The chances of our being able to identify a standard thought module, the way we can identify standard functions elsewhere, are correspondingly slight.

Still, one last reason to be cheerful; one thing the human brain is exceptionally good at is intuiting patterns from observations; far better than it has any right to be. It’s not for nothing that ‘seeing’ is literally the verb for vision and metaphorically the verb for understanding. So exhibiting patterns of neural activity might just be the way to trigger that unexpected insight that opens the problem out…

DID and ‘Split’: How we talk about Kevin

I finally got round to seeing Split, the M. Night Shyamalan film (spoilers follow) about a problematic case of split personality, and while it’s quite a gripping film with a bravura central performance from James McAvoy, I couldn’t help feeling that in various other ways it was somewhere between unhelpful and irresponsible. Briefly, in the film we’re given a character suffering from Dissociative Identity Disorder (DID), the condition formerly known as ‘Multiple Personality Disorder’. The working arrangement reached by his ‘alters’, the different personalities inhabiting an unfortunate character called Kevin Wendell Crumb, has been disturbed by two of the darker alters (there are 23); he kidnaps three girls and it gradually becomes clear that the ‘Beast’, a further (24th) alter, is going to eat them.

DID has a chequered and still somewhat controversial history. I discussed it at moderate length here (Oh dear, tempus fugit) about eleven years ago. One of the things about it is that its incidence is strongly affected by cultural factors. It’s very much higher in some countries than others, and the appearance of popular films or books about it seems to have a major impact, increasing the number of diagnoses in subsequent years. This phenomenon apparently goes right back to Jekyll and Hyde, an early fictional version which remains powerful in Anglophone culture. In fact Split itself draws on two notable features of Jekyll and Hyde: the ideas that some alters are likely to be wicked, and that they may differ in appearance and even size from the original. The number of cases in the US rose dramatically after the TV series Sybil, based on a real case, first aired (though subsequently doubts about the real-world diagnosis have emerged). It’s also probable that the popular view has been influenced by the persistent misunderstanding that schizophrenia is having a ‘split personality’ (it isn’t, although it’s not unknown for DID patients to have schizophrenia too; and some ‘Schneiderian’ symptoms – voices, inserted thoughts – may confusingly arise from either condition).

One view is that while DID is undeniably a real mental condition, it is largely or wholly iatrogenic: caused by the doctors. On this view therapists trying to draw out alters for the best of reasons may simply be encouraging patients to confabulate them, or indeed the whole problem. On this view the cultural background may be very important in preparing the minds of patients (and indeed the minds of therapists: let’s be honest, psychologists watch stupid films too). So the first charge against Split is that it is likely to cause another spike in the number of DID cases.

Is that a bad thing, though? One argument is that cultural factors don’t cause the dissociative problems, they merely lead to more of them being properly diagnosed. One mainstream modern view sees DID as a response to childhood trauma; the sufferer generates a separate persona to deal with the intolerable pain. And often enough it works; we might see DID less as a mental problem and more as a strategy, often successful, for dealing with certain mental problems. There’s actually no need to reintegrate the alters, any more than you would try to homogenise any other personality; all you need to do is reach a satisfactory working arrangement. If that’s the case then making knowledge of DID more widely available might actually be a good thing.

That might be an arguable position, though we’d have to take some account of the potential for disruptive and amnesiac episodes that may come along with DID. However, Split can hardly be seen as making a valuable contribution to awareness because of the way it draws on Jekyll and Hyde tropes. First, there’s the renewed suggestion that alters usually include terrifically evil personalities. The central character in Split is apparently going to become a super-villain in a sequel. This will be a ‘grounded’ super; one whose powers are not attributable to the semi-magic effects of radiation or film-style mutation, but ‘realistically’ to DID. Putting aside the super powers, I don’t know of any evidence that people with DID have a worse criminal record than anyone else; if anything I’d guess that coping with their own problems leaves them no time or capacity for  embarking on crime sprees. But portraying them as inherently bad inevitably stigmatises existing patients and deters future diagnoses in ways that are surely offensive and unhelpful. It might even cause some patients to think that their alters have to behave badly in order to validate their diagnosis.

Of course, Hollywood almost invariably portrays mental problems as hidden superpowers. Autism makes you a mathematical genius; OCD means you’re really tidy and well-organised. But the suggestion that DID probably makes you a wall-climbing murderer is an especially negative one. Zombies, those harmless victims of bizarre Caribbean brainwashing, possibly got a similarly negative treatment when they were transformed by Romero into brain-munching corpse monsters; but luckily I think that diagnosis is rare.

The other thing about Split is that it takes some of the wilder claims about the physical impact of DID and exaggerates them to absurdity. The psychologist in the film, Dr. Karen Fletcher, merely asserts that the switch between alters can change people’s body chemistry: fine, getting into an emotional state changes that. But it emerges that Kevin’s eyesight, size and strength all change with his alters: one of them even needs insulin injections while the others don’t (a miracle that the one who needs them ever managed to manifest consistently enough to get the medication prescribed). In his final monster incarnation he becomes bigger, more muscled, able to climb walls like a fly, and invulnerable to being shot in the chest at close range (we really don’t want patients believing in that one, do we?). Remarkable in the circumstances that his one female alter didn’t develop a bulging bosom.

Anyway, you may have noticed that Hollywood isn’t the only context in which zombies have been used for other purposes and dubious stories about personal identity told. In philosophy our problems with traditional agency and responsibility have led to widespread acceptance of attenuated forms of personhood; multiple draft people, various self-referential illusions, and epiphenomenal confabulations. These sceptical views of common-sense selfhood are often discussed in a relatively positive light, as yielding a kind of Buddhist insight, or bringing a welcome relief from moral liability; but I don’t think it’s too fanciful to fear that they might also create a climate that fosters a sense of powerlessness and depersonalisation. I’d be the last person to say that philosophers should self-censor, still less that they should avoid hypotheses that look true or interesting but are depressing. Nor am I suffering from the delusion that the public at large, or even academic psychologists, are waiting eagerly to hear what the philosophers think. But perhaps there’s room for slightly more awareness that these are not purely academic issues?