A digital afterlife is likely to be available one day, according to Michael Graziano, albeit not for some time; his piece re-examines the possibility of uploading consciousness, and your own personality, into a computer. I think he does a good job of briefly sketching the formidable difficulties involved in scanning your brain, and scanning so precisely that your individual selfhood could be captured. In fact, he does it so well that I don’t really understand where his ultimate optimism comes from.

To my way of thinking, ‘scan and build’ isn’t even the most promising way of duplicating your brain. One more plausible way would be some kind of future bio-engineering where your brain just grows and divides, rather in the way that single cells do. A neater way would be some sort of hyper-path through space that split you along the fourth spatial dimension and returned both slices to our normal plane. Neither of these options is exactly a feasible working project, but to me they seem closer to being practical than a total scan. Of course neither of them offers the prospect of an afterlife the way scanning does, so they’re not really relevant for Graziano here. He seems to think we don’t need to go down to an atom by atom scan, but I’m not sure why not. Granted, the loss of one atom in the middle of my brain would not destroy my identity, but not scanning to an atomic level generally seems a scarily approximate and slapdash approach to me given the relevance of certain key molecules in the neural process –  something Graziano fully recognises.

If we’re talking about actual personal identity I don’t think it really matters though, because the objection I consider strongest applies even to perfect copies. In thought experiments we can do anything, so let’s just specify that by pure chance there’s another brain nearby that is in every minute detail the same as mine. It still isn’t me, for the banal commonsensical reason that copies are not the original. Leibniz’s Law tells us that if B has exactly the same properties as A, then it is A: but among the properties of a brain are its physical location, so a brain over there is not the same as one in my skull (so in fact I cheated by saying the second brain was the same in every detail but nevertheless ‘nearby’).

Now most philosophers would say that Leibniz is far too strong a criterion of identity when it comes to persons. There have been hundreds of years of discussion of personal identity, and people generally espouse much looser criteria for a person than they would for a stone – from identity of memories to various kinds of physical, functional, or psychological continuity. After all, people are constantly changing: I am not perfectly identical in physical terms to the person who was sitting here an hour ago, but I am still that person. Graziano evidently holds that personal identity must reside in functional or informational qualities of the kind that could well be transferred into a digital form, and he speaks disparagingly of ‘mystical’ theories that see problems with the transfer of consciousness. I don’t know about that; if anyone is hanging on to residual spiritual thinking, isn’t it the people who think we can be ‘taken out of’ our bodies and live forever? The least mystical stance is surely the one that says I am a physical object, and with some allowance for change and my complex properties, my identity works the same as that of any other physical object. I’m a one-off, particular thing and copies would just be copies.

What if we only want a twin, or a conscious being somewhat like me? That might still be an attractive option after all. OK, it’s not immortality but I think without being rampant egotists most of us probably feel the world could stand a few more people like ourselves around, and we might like to have a twin continuing our good work once we’re gone.

That less demanding goal changes things. If that’s all we’re going for, then yes, we don’t need to reproduce a real brain with atomic fidelity. We’re talking about a digital simulation, and as we know, simulations do not reproduce all the features of the thing being simulated – only those that are relevant for the current purpose. There is obviously some problem about saying what the relevant properties are when it comes to consciousness; but if passing the Turing Test is any kind of standard then delivering good outputs for conversational inputs is a fair guide and that looks like the kind of thing where informational and functional properties are very much to the fore.

The problem, I think, is again with particularity. Conscious experience is a one-off thing while data structures are abstract and generic. If I have a particular experience of a beautiful sunset, and then (thought experiments again) I have an entirely identical one a year later, they are not the same experience, even though the content is exactly the same. Data about a sunset, on the other hand, is the same data whenever I read or display it.

We said that a simulation needs to reproduce the relevant aspects of the thing simulated; but in a brain simulation the processes are only represented symbolically, while one of the crucial aspects we need for real experience is particular reality.

Maybe though, we go one level further; instead of simulating the firing of neurons and the functional operation of the brain, we actually extract the program being run by those neurons and then transfer that. Here there are new difficulties; scanning the physical structure of the brain is one thing; working out its function and content is another thing altogether; we must not confuse information about the brain with the information in the brain. Also, of course, extracting the program assumes that the brain is running a program in the first place and not doing something altogether less scrutable and explicit.

Interestingly, Graziano goes on to touch on some practical issues; in particular he wonders how the resources to maintain all the servers are going to be found when we’re all living in computers. He suspects that as always, the rich might end up privileged.

This seems a strange failure of his technical optimism. Aren’t computers going to go on getting more powerful, and cheaper? Surely the machines of the twenty-second century will laugh at this kind of challenge (perhaps literally). If there is a capacity problem, moreover, we can all be made intermittent; if I get stopped for a thousand years and then resume, I won’t even notice. Chances are that my simulation will be able to run at blistering speed, far faster than real time, so I can probably experience a thousand years of life in a few computed minutes. If we get quantum computers, all of us will be able to have indefinitely long lives with no trouble at all, even if our simulated lives include having digital children or generating millions of digital alternates of ourselves, thereby adding to the population. Graziano, optimism kicking back in, suggests that we can grow  in understanding and come to see our fleshly life as a mere larval stage before we enter on our true existence. Maybe, or perhaps we’ll find that human minds, after ten billion years (maybe less) exhaust their potential and ultimately settle into a final state; in which case we can just get the computers to calculate that and then we’ll all be finalised, like solved problems. Won’t that be great?

I think that speculations of this kind eventually expose the contrast between the abstraction of data and the reality of an actual life, and dramatise the fact, perhaps regrettable, perhaps not, that you can’t translate one into the other.


Are there units of thought? An interesting conversation here between Asifa Majid and Jon Sutton. There are a number of interesting points, but the one that I found most thought-provoking was the reference to searching for those tantalising units. I think I had been lazily assuming that any such search had been abandoned long ago – if not quite with the search for perpetual motion and the universal solvent, then at least a while back.

I may, though, have been mixing up two or more distinct ideas here. In looking for the units of thought we might be assuming that there is some ultimate set of simplest thought items, a kind of periodic table, with all thoughts necessarily built out of combinations of these elements. This sort of thinking has a bit of a history in connection with language. People (I think Leibniz was one) used to hope that if one took all the words in the dictionary and defined them in terms of other words, you would eventually get down to the basic set of ideas, the ‘primitives’ which were really fundamental. So you might start with library, define it as ‘book building’, define building as ‘enterable structure’, define structure as ‘thing made of parts’ and feel that maybe with ‘thing’, ‘made’, and ‘parts’ you were getting close to some primitives. You weren’t really, though. You could move on to define ‘made’ as ‘assembled or created’, ‘assembled’ as ‘brought or fixed together’… Sooner or later your definitions become circular and as you will have noticed, alternative meanings and distinctions keep slipping through the net of your definitions.
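The way such definitions circle back on themselves can be shown with a toy dictionary. This is only a sketch: the first few entries echo the library example, while the entries marked hypothetical, and the function name `find_circle`, are invented here so that the toy vocabulary closes up. Words with no entry are simply treated as undefined.

```python
# Toy dictionary: each word is "defined" by a few other words.
# The first entries echo the library example in the text; the entries
# marked hypothetical are invented purely for illustration.
TOY_DEFINITIONS = {
    "library": ["book", "building"],
    "building": ["enterable", "structure"],
    "structure": ["thing", "made", "parts"],
    "made": ["assembled", "created"],
    "assembled": ["brought", "fixed", "together"],
    "created": ["made"],   # hypothetical: 'created' leans back on 'made'
    "parts": ["thing"],    # hypothetical
    "thing": ["parts"],    # hypothetical: 'thing' leans back on 'parts'
}


def find_circle(word, definitions, path=()):
    """Return the first chain of definitions that loops back on itself, or None."""
    if word in path:
        # We have met this word before on the way down: the chain is circular.
        return path[path.index(word):] + (word,)
    for part in definitions.get(word, []):
        circle = find_circle(part, definitions, path + (word,))
        if circle:
            return circle
    return None


print(find_circle("library", TOY_DEFINITIONS))  # ('thing', 'parts', 'thing')
```

In this toy case the would-be primitives ‘thing’ and ‘parts’ end up defined in terms of each other, which is the circularity described above; a real dictionary would be far larger, but the same search would apply.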

The idea was seductive, however; linked with the idea that there was a real ‘Adamitic’ language which truly captured the essence of things and of which all real languages were corruptions, generated at the fall of the Tower of Babel. It is still possible to argue that there must be some fundamental underpinning of this kind, some kind of breaking down to a set of common elements, or languages would not all be mutually translatable. Majid gestures towards another idea of this kind in speaking of ‘mentalese’, the hypothetical language in which all our brains basically work, translating into real world languages for purposes of communication. Mentalese is a large and controversial subject which I can’t do justice to here; my own view is that the underpinnings of language, whether common or unique to each individual, are not likely to be another language.

Could there in fact be untranslatable languages? We’ve never found one; although capturing the nuances of another language is notoriously difficult, we’ve never come across a language that couldn’t be matched up with ‘good enough’ equivalents in English. Could it be that alien beings somewhere use a language that just carves the world up in ways that can’t be rendered across into normal Earth languages? I think not, but it’s not because there is any universal set of units of thought underneath all languages and all thoughts. Rather, it is first because all languages address the same reality, which guarantees a certain commonality, and second because language is infinitely flexible and extensible. If we encountered an entirely new phenomenon tomorrow, we should not have any real difficulty in describing it – or at least, it wouldn’t be the constraints of language that made things difficult. Equally we might have difficulty working out the unfamiliar concepts of an alien language, but surely we should be able to devise and borrow whatever words we needed (at this point I can feel Wittgenstein’s ghost breathing impatiently down my neck and insisting that we could not understand a lion, never mind an alien, but I’m going to ignore him for now).

So perhaps the search for the units of thought is not to be understood as a search for any kind of periodic table, but something much more flexible. Majid refers to George Miller’s famous suggestion that short term memory can accommodate only seven items, plus or minus two depending on circumstances. This idea that there is a limit to how many items of a ‘one-dimensional’ set we can accommodate is appealingly tidy and intuitive, but it obviously works best with simple items; single tones or single digits. Even when we go so far as to make the numbers a little larger questions arise as to whether ‘102’ is one, two, or three items, and it only gets worse as we try to deal with complex entities. If one of the items in memory is ‘the First World War’ is it really one item which we can then decode into many or an inherently complex item?

So perhaps it’s more like asking how many arms an amoeba has. There is, in fact, no fixed set of arms, but for a given size of amoeba there may well be a limit on how many pseudopodia it can extend at any one time. That feels more like it, though we should have to accept that the answer might be a bit vague; it’s not always clear whether a particular bit of amoeba counts as one, two, or no pseudopodia.

Or put it another way: how many images can a digital screen show? If we insist on a set of basic items we can break it down to pixel values; but the number of things whose picture can be displayed is large and amorphous. Perhaps it’s the same with the brain; we can analyse it down to the firing of neurons if we want, but the number of thoughts those neurons underpin is a much larger and woollier business (in spite of the awkward fact that the number of different images a digital screen can display is in fact finite – yet another thorny issue I’m going to glide past).
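The finiteness is easy to put a number on. As a rough sketch, assuming a screen of `width` by `height` pixels with `bits_per_pixel` bits of colour each (the function names and the sample resolution are illustrative choices, not anything from the conversation), the count of distinct images is 2 raised to the total number of bits:

```python
import math


def image_count(width, height, bits_per_pixel):
    """Number of distinct images a width x height screen can show."""
    return 2 ** (width * height * bits_per_pixel)


def image_count_digits(width, height, bits_per_pixel):
    """Decimal digits in that count, worked out without building the huge number."""
    total_bits = width * height * bits_per_pixel
    return math.floor(total_bits * math.log10(2)) + 1


# A 2x2 black-and-white screen: four pixels, one bit each.
print(image_count(2, 2, 1))  # 16

# An assumed 1920x1080 screen with 24-bit colour: print only the digit count.
print(image_count_digits(1920, 1080, 24))
```

So the number is finite, but for any ordinary screen it runs to millions of digits, which is why ‘large and amorphous’ is a fair description in practice.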

And surely the flexibility of an amoeba is going a bit too far, isn’t it? Majid points out some evidence that our choice of thought items is not altogether unconstrained. Different languages have different numbers of colours, but there is a clear order of preference; if two languages have the same number of colour words, there’s an excellent chance that they will name more or less the same colours. Different languages address the human forelimb differently, some using words that include the hand while others insist that the two main halves are spoken of separately, never the arm as a whole. Yet no languages seem to define a body part composed of the thumb and first half of the forearm.

Clearly there are two things going on here; one is that our own sensory apparatus predisposes us to see things in certain ways – not insurmountable ones that would prevent us understanding aliens with different biases, but universal among humans. Second, and perhaps stranger, reality itself seems to have some inherent or salient forms that it is most convenient for us to recognise. Some of these look to be physical –  a forearm just makes more sense than a ‘thumb plus some arm’; others are mathematical or even logical. The hunt for items of thought loosely defined by our sensory or mental apparatus is an interesting and impeccably psychological one; the second kind of constraint looks to raise tough issues of philosophy. Either way I was quite wrong to have thought the hunt was over or abandoned.

Robot behaviour is no longer a purely theoretical problem. Since Asimov came up with the famous Three Laws which provide the framework for his robot stories, a good deal of serious thought has been given to extreme cases where robots might cause massive disasters and to such matters as the ethics of military robots. Now, though, things have moved on to a more mundane level and we need to give thought to more everyday issues. OK, a robot should not harm a human being or through inaction allow a human being to come to harm, but can we also just ask that you stop knocking the coffee over and throwing my drafts away? Dario Amodei, Chris Olah, John Schulman, Jacob Steinhardt, Paul Christiano, and Dan Mané have considered how to devise appropriate rules in this interesting paper.

They reckon things can go wrong in three basic ways. It could be that the robot’s objective was not properly defined in the first place. It could be that the testing of success is not frequent enough, especially if the tests we have devised are complex or expensive. Third, there could be problems due to “insufficient or poorly curated training data or an insufficiently expressive model”. I take it these are meant to be the greatest dangers – the set doesn’t seem to be exhaustive.

The authors illustrate the kind of thing that can go wrong with the example of an office cleaning robot, mentioning five types of case.

  • Avoiding Negative Side Effects: we don’t want the robot to clean quicker by knocking over the vases.
  • Avoiding Reward Hacking: we tell the robot to clean until it can’t see any mess; it closes its eyes.
  • Scalable Oversight: if the robot finds an unrecognised object on the floor it may need to check with a human; we don’t want a robot that comes back every three minutes to ask what it can throw away, but we don’t want one that incinerates our new phone either.
  • Safe Exploration: we’re talking here about robots that learn, but as the authors put it, the robot should experiment with mopping strategies, but not put a wet mop in an electrical outlet.
  • Robustness to Distributional Shift: we want a robot that learned its trade in a factory to be able to move safely and effectively to an office job. How do we ensure that the cleaning robot recognizes, and behaves robustly, when in an environment different from its training environment? For example, heuristics it learned for cleaning factory workfloors may be outright dangerous in an office.

The authors consider a large number of different strategies for mitigating or avoiding each of these types of problem. One particularly interesting one is the idea of an impact regulariser, either pre-defined or learned by the robot. The idea here is that the robot adopts the broad principle of leaving things the way people would wish to find them. In the case of the office this means identifying an ideal state – rubbish and dirt removed, chairs pushed back under desks, desk surfaces clear (vases still upright), and so on. If the robot aims to return things to the ideal state this helps avoid negative side effects of an over-simplified objective or other issues.
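To make the idea concrete, here is a minimal sketch of a pre-defined impact regulariser along the lines just described. It is not the authors’ formulation: the feature names, the `weight` parameter and the reward numbers are all assumptions chosen for illustration, and the penalty is simply a count of features that differ from the ideal state.

```python
# Hypothetical ideal office state: feature name -> desired value.
IDEAL_OFFICE = {
    "rubbish_removed": True,
    "chairs_tucked": True,
    "desk_clear": True,
    "vase_upright": True,
}


def impact_penalty(state, ideal=IDEAL_OFFICE):
    """Count the features of the office that differ from the ideal state."""
    return sum(1 for key, wanted in ideal.items() if state.get(key) != wanted)


def regularised_reward(task_reward, state, weight=2.0):
    """Task reward minus a weighted penalty for deviating from the ideal state."""
    return task_reward - weight * impact_penalty(state)


careful = dict(IDEAL_OFFICE)                # everything left as people would wish
hasty = dict(careful, vase_upright=False)   # quicker clean, but the vase is knocked over

print(regularised_reward(10.0, careful))  # 10.0
print(regularised_reward(11.0, hasty))    # 9.0
```

With these made-up numbers the hasty run earns a slightly higher task reward but ends up worse off once the penalty is applied. Note that a penalty of this shape cannot tell a knocked-over vase from a change somebody made on purpose.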

There are further problems, though, because if the robot invariably tries to put things back to an ideal starting point it will try to put back changes we actually wanted, clear away papers we wanted left out, and so on. Now in practice and in the case of an office cleaning robot I think we could get round those problems without too much difficulty; we would essentially lower our expectations of the robot and redesign the job in a much more limited and stereotyped way. In particular we would give up the very ambitious goal of making a robot which could switch from one job to another without adjustment and without faltering.

Still it is interesting to see the consequences of the more ambitious approach. The final problem, cutting to the chase, is that in order to tell how humans want their office arranged in every possible set of circumstances, you really cannot do without a human level of understanding. There is an old argument that robots need not resemble humans physically; instead you make your robot to fit the job; a squat circle on wheels if you’re cleaning the floor, a single fixed arm if you want it to build cars. The counter-argument has often been that our world has been shaped to fit human beings, and if we want a general purpose robot it will pay to have it more or less human size and weight, bipedal, with hands, and so on. Perhaps there is a parallel argument to explain why general-purpose robots need human-level cognition; otherwise they won’t function effectively in a world shaped by human activity. The search for artificial general intelligence is not an idle project after all?

Where do thoughts come from? Alva Noë provides a nice commentary here on an interesting paper by Melissa Ellamil et al. The paper reports on research into the origin of spontaneous thoughts.

The research used subjects trained in Mahasi Vipassana mindfulness techniques. They were asked to report the occurrence of thoughts during sessions when they were either left alone or provided with verbal stimuli. As well as reporting the occurrence of a thought, they were asked to categorise it as image, narrative, emotion or bodily sensation (seems a little restrictive to me – I can imagine having two at once or a thought that doesn’t fit any of the categories). At the same time brain activity was measured by fMRI scan.

Overall the study found many regions implicated in the generation of spontaneous thought; the researchers point to the hippocampus as a region of particular interest, but there were plenty of other areas involved. A common view is that when our attention is not actively engaged with tasks or challenges in the external world the brain operates the Default Mode Network (DMN); a set of neuronal areas which appear to produce detached thought (we touched on this a while ago); the new research complicates this picture somewhat or at least suggests that the DMN is not the unique source of spontaneous thoughts. Even when we’re disengaged from real events we may be engaged with the outside world via memory or in other ways.

Noë’s short commentary rightly points to the problem involved in using specially trained subjects. Normal subjects find it difficult to report their thoughts accurately; the Vipassana techniques provide practice in being aware of what’s going on in the mind, and this is meant to enhance the accuracy of the results. However, as Noë says, there’s no objective way to be sure that these reports are really more accurate. The trained subjects feel more confidence in their reports, but there’s no way to confirm that the confidence is justified. In fact we could go further and suggest that the special training they have undertaken may even make their experience particularly unrepresentative of most minds; it might be systematically changing their experience. These problems echo the methodological ones faced by early psychologists such as Wundt and Titchener with trained subjects. I suppose Ellamil et al might retort that mindfulness is unlikely to have changed the fundamental neural architecture of the brain and that their choice of subject most likely just provided greater consistency.

Where do ‘spontaneous’ thoughts come from? First we should be clear what we mean by a spontaneous thought. There are several kinds of thought we would probably want to exclude. Sometimes our thoughts are consciously directed; if for example we have set ourselves to solve a problem we may choose to follow a particular strategy or procedure. There are lots of different ways to do this, which I won’t attempt to explore in detail: we might hold different aspects of the problem in mind in sequence; if we’re making a plan we might work through imagined events; or we might even follow a formal procedure of some kind. We could argue that even in these cases what we usually control is the focus of attention, rather than the actual generation of thoughts, but it seems clear enough that this kind of thinking is not ‘spontaneous’ in the expected sense. It is interesting to note in passing that this ability to control our own thoughts implies an ability to divide our minds into controller and executor, or at least to quickly alternate those roles.

Also to be excluded are thoughts provoked directly by outside events. A match is struck in a dark theatre; everyone’s eyes saccade involuntarily to the point of light. Less automatically a whole variety of events can take hold of our attention and send our thoughts in a new direction. As well as purely external events, the sources in such cases might include interventions from non-mental parts of our own bodies; a pain in the foot, an empty stomach.

Third, we should exclude thoughts that are part of a coherent ongoing chain of conscious cogitation. These ‘normal’ thoughts are not being directed like our problem-solving efforts, but they follow a thread of relevance; by some connection each follows on from the one before.

What we’re after then is thoughts that appear unbidden, unprompted, and with no perceivable connection with the thoughts that recently preceded them. Where do they come from? It could be that mere random neuronal noise sometimes generates new thoughts, but it seems to me unlikely to be a major contributor: such thoughts would be likely to resemble random nonsense and most of our spontaneous thoughts seem to make a little more sense than that.

We noticed above that when directing our thoughts we seem to be able to split ourselves into controller and controlled. As well as passing control up to a super-controller we sometimes pass it down, for example to the part of our mind that gets on with the details of driving along a route while the surface of our mind is engaged with other things. Clearly some part of our mind goes on thinking about which turnings to take; is it possible that one or more parts of our mind similarly go on thinking about other topics but then at some trigger moment insert a significant thought back into the main conscious stream? A ‘silent’ thinking part of us like this might be a permanent feature, a regular sub- or unconscious mind; or it might be that we occasionally drop threads of thought that descend out of the light of attention for a while but continue unheard before popping back up and terminating. We might perhaps have several such threads ruminating away in the background; ordinary conscious thought often seems rather multi-threaded. Perhaps we keep dreaming while awake and just don’t know it?

There’s a basic problem here in that our knowledge of these processes, and hence all our reports, rely on memory. We cannot report instantaneously; if we think a thought was spontaneous it’s because we don’t remember any relevant antecedents; but how can we exclude the possibility that we merely forgot them? I think this problem radically undermines our certainty about spontaneous thoughts. Things get worse when we remember the possibility that instead of two separate thought processes, we have one that alternates roles. Maybe when driving we do give conscious attention to all our decisions; but our mind switches back and forth between that and other matters that are more memorable; after the journey we find we have instantly forgotten all the boring stuff about navigating the route and are surprised that we seem to have done it thoughtlessly. Why should it not be the same with other thoughts? Perhaps we have a nagging worry about X which we keep spending a few moments’ thought on between episodes of more structured and memorable thought about something else; then everything but our final alarming conclusion about X gets forgotten and the conclusion seems to have popped out of nowhere.

We can’t, in short, be sure that we ever have any spontaneous thoughts: moreover, we can’t be sure that there are any subconscious thoughts. We can never tell the difference, from the inside, between a thought presented by our subconscious, and one we worked up entirely in intermittent and instantly-forgotten conscious mode. Perhaps whole areas of our thought never get connected to memory at all.

That does suggest that using fMRI was a good idea; if the problem is insoluble in first-person terms maybe we have to address it on a third-person basis. It’s likely that we might pick up some neuronal indications of switching if thought really alternated the way I’ve suggested. Likely but not guaranteed; after all a novel manages to switch back and forth between topics and points of view without moving to different pages. One thing is definitely clear; when Noë pointed out that this is more difficult than it may appear he was absolutely right.

Wrong again: just last week I was saying that Roger Penrose’s arguments seemed to have drifted off the radar a bit. Immediately, along comes this terrific post from Scott Aaronson about a discussion with Penrose.

In fact it’s not entirely about Penrose; Aaronson’s main aim was to present an interesting theory of his own as to why a computer can’t be conscious, which relies on non-copyability. He begins by suggesting that the onus is on those who think a computer can’t be conscious to show exactly why. He congratulates Penrose on doing this properly, in contrast to, say, John Searle, who merely offers hand-wavy stuff about unknown biological properties. I’m not really sure that Searle’s honest confession of ignorance isn’t better than Penrose’s implausible speculations about unknown quantum mechanics, but we’ll let that pass.

Aaronson rests his own case not on subjectivity and qualia but on identity. He mentions several examples where the limitless copyability of a program seems at odds with the strong sense of a unique identity we have of ourselves – including Star Trek style teleportation and the fact that a program exists in some Platonic sense forever, whereas we only have one particular existence. He notes that at the moment one of the main differences between brain and computer is our ability to download, amend and/or re-run programs exactly; we can’t do that at all with the brain. He therefore looks for reasons why brain states might be uncopyable. The question is, how much detail do we need before making a ‘good enough’ copy? If it turns out that we have to go down to the quantum level we run into the ‘no-cloning’ theorem; the price of transferring the quantum state of your brain is the destruction of the original. Aaronson makes a good case for the resulting view of our probable uniqueness being an intuitively comfortable one, in tune with our intuitions about our own nature. It also offers incidentally a sort of reconciliation between the Everett many-worlds view and the Copenhagen interpretation of quantum physics: from a God’s eye point of view we can see the world as branching, while from the point of view of any conscious entity (did I just accidentally call God unconscious?) the relevant measurements are irreversible and unrealised branches can be ‘lopped off’. Aaronson, incidentally, reports amusingly that Penrose absolutely accepts that the Everett view follows from our current understanding of quantum physics; he just regards that as a reductio ad absurdum – ie, the Everett view is so absurd the link proves there must be something wrong with our current understanding of quantum physics.

What about Penrose? According to Aaronson he now prefers to rest his case on evolutionary factors and downplay his logical argument based on Gödel. That’s a shame in my view. The argument goes something like this (if I garble it someone will perhaps offer a better version).

First we set up a formal system for ourselves. We can just use the letters of the alphabet, normal numbers, and normal symbols of formal logic, with all the usual rules about how they can be put together. Then we make a list consisting of all the valid statements that can be made in this system. By ‘valid’, we don’t mean they’re true, just that they comply with the rules about how we put characters together (eg, if we use an opening bracket, there must be a closing one in an appropriate place). The list of valid statements will go on forever, of course, but we can put them in alphabetical order and number them. The list obviously includes everything that can be said in the system.

Some of the statements, by pure chance, will be proofs of other statements in the list. Equally, somewhere in our list will be statements that tell us that the list includes no proof of statement x. Somewhere else will be another statement – let’s call this the ‘key statement’ – that says this about itself. Instead of x, the number of that very statement itself appears. So this one says, there is no proof in this system of this statement.

Is the key statement correct – is there no proof of the key statement in the system? Well, we could look through the list, but as we know it goes on indefinitely; so if there really is no proof there we’d simply be looking forever. So we need to take a different tack. Could the key statement be false? Well, if it is false, then what it says is wrong, and there is a proof somewhere in the list. But that can’t be, because if there’s a proof of the key statement anywhere, the key statement must be true! Assuming the key statement is false leads us unavoidably to the conclusion that it is true, in the light of what it actually says. We cannot have a contradiction, so the key statement must be true.

So by looking at what the key statement says, we can establish that it is true; but we also establish that there is no proof of it in the list. If there is no proof in the list, there is no possible proof in our system, because we know that the list contains everything that can be said within our system; there is therefore a true statement in our system that is not provable within it. We have something that cannot be proved in an arbitrary formal system, but which human reasoning can show to be true; ergo, human reasoning is not operating within any such formal system. All computers work in a formal system, so it follows that human reasoning is not computational.

As Aaronson says, this argument was discussed to the point of exhaustion when it first came out, which is probably why Penrose prefers other arguments now. Aaronson rejects it, pointing out that he himself has no magic ability to see “from the outside” whether a given formal system is consistent; why should an AI do any better – he suggests Turing made a similar argument. Penrose apparently responded that this misses the point, which is not about a mystical ability to perceive consistency but the human ability to transcend any given formal system and move up to an expanded one.

I’ll leave that for readers to resolve to their own satisfaction. Let’s go back to Aaronson’s suggestion that the burden of proof lies on those who argue for the non-computability of consciousness. What an odd idea that is! How would that play at the Patent Office?

“So this is your consciousness machine, Mr A? It looks like a computer. How does it work?”

“All I’ll tell you is that it is a computer. Then it’s up to you to prove to me that it doesn’t work – otherwise you have to give me rights over consciousness! Bwah ha ha!”

Still, I’ll go along with it. What have I got? To begin with I would timidly offer my own argument that consciousness is really a massive development of recognition, and that recognition itself cannot be algorithmic.

Intuitively it seems clear to me that the recognition of linkages and underlying entities is what powers most of our thought processes. More formally, both of the main methods of reasoning rely on recognition; induction because it relies on recognising a real link (eg a causal link) between thing a and thing b; deduction because it reduces to the recognition of consistent truth values across certain formal transformations. But recognition itself cannot operate according to rules. In a program you just hand the computer the entities to be processed; in real world situations they have to be recognised. But if recognition used rules and rules relied on recognising the entities to which the rules applied, we’d be caught in a vicious circularity. It follows that this kind of recognition cannot be delivered by algorithms.

The more general case rests on, as it were, the non-universality of computation. It’s argued that computation can run any algorithm and deliver, to any required degree of accuracy, any set of physical states of affairs. The problem is that many significant kinds of states of affairs are not describable in purely physical or algorithmic terms. You cannot list the physical states of affairs that correspond to a project, a game, or a misunderstanding. You can fake it by generating only sets of states of affairs that are already known to correspond with examples of these things, but that approach misses the point. Consciousness absolutely depends on intentional states that can’t be properly specified except in intentional terms. That doesn’t contradict physics or even add to it the way new quantum mechanics might; it’s just that the important aspects of reality are not exhausted by physics or by computation.

The thing is, I think long exposure to programmable environments and interesting physical explanations for complex phenomena has turned us all increasingly into flatlanders who miss a dimension; who naturally suppose that one level of explanation is enough, or rather who naturally never even notice the possibility of other levels; but there are more things in heaven and earth than are dreamt of in that philosophy.

I liked this account by Bobby Azarian of why digital computation can’t do consciousness. It has several virtues; it’s clear, identifies the right issues and is honest about what we don’t know (rather than passing off the author’s own speculations as the obvious truth or the emerging orthodoxy). Also, remarkably, I almost completely agree with it.

Azarian starts off well by suggesting that lack of intentionality is a key issue. Computers don’t have intentions and don’t deal in meanings, though some put up a good pretence in special conditions. Azarian takes a Searlian line by relating the lack of intentionality to the maxim that you can’t get meaning-related semantics from mere rule-bound syntax. Shuffling digital data is all computers do, and that can never lead to semantics (or any other form of meaning or intentionality). He cites Searle’s celebrated Chinese Room argument (actually a thought experiment) in which a man given a set of rules that allow him to provide answers to questions in Chinese does not thereby come to understand Chinese. But, the argument goes, if the man, by following rules, cannot gain understanding, then a computer can’t either. Azarian mentions one of the objections Searle himself first named, the ‘systems response’: this says that the man doesn’t understand, but a system composed of him and his apparatus does. Searle really only offered rhetoric against this objection, and in my view it is essentially correct. The answers the Chinese Room gives are not answers from the man, so why should his lack of understanding show anything?

Still, although I think the Chinese Room fails, I think the conclusion it was meant to establish – no semantics from syntax – turns out to be correct, so I’m still with Azarian. He moves on to make another Searlian point: simulation is not duplication. Searle pointed out that nobody gets wet from digitally simulated rain, and hence simulating a brain on a computer should not be expected to produce consciousness. Azarian gives some good examples.

The underlying point here, I would say, is that a simulation always seeks to reproduce some properties of the thing simulated, and drops others which are not relevant for the purposes of the simulation. Simulations are selective and ontologically smaller than the thing simulated – which, by the way, is why Nick Bostrom’s idea of indefinitely nested world simulations doesn’t work. The same thing can however be simulated in different ways depending on what the simulation is for. If I get a computer to simulate me doing arithmetic by calculating, then I get the correct result. If it simulates me doing arithmetic by operating a humanoid writing random characters on a board with chalk, it doesn’t – although the latter kind of simulation might be best if I were putting on a play. It follows that Searle isn’t necessarily exactly right, even about the rain. If my rain simulation program turns on sprinklers at the right stage of a dramatic performance, then that kind of simulation will certainly make people wet.

Searle’s real point, of course, is that the properties a computer has in itself, of running sets of rules, are not the relevant ones for consciousness, and Searle hypothesises that the required properties are biological ones we have yet to identify. This general view, endorsed by Azarian, is roughly correct, I think. But it’s still plausibly deniable. What kind of properties does a conscious mind need? All right, we don’t know, but might not information processing be relevant? It looks to a lot of people as if it might be, in which case that’s what we should need for consciousness in an effective brain simulator. And what properties does a digital computer, in itself, have – the property of doing information processing? Booyah! So maybe we even need to look again at whether we can get semantics from syntax. Maybe in some sense syntactic operations can underpin processes which transcend mere syntax?

Unless you accept Roger Penrose’s proof that human thinking is not algorithmic (it seems to have drifted off the radar in recent years) this means we’re still really left with a contest of intuitions, at least until we find out for sure what the magic missing ingredient for consciousness is. My intuitions are with Azarian, partly because the history of failure with strong AI looks to me very like a history of running up against the inadequacy of algorithms. But I reckon I can go further and say what the missing element is. The point is that consciousness is not computation, it’s recognition. Humans have taken recognition to a new level where we recognise not just items of food or danger, but general entities, concepts, processes, future contingencies, logical connections, and even philosophical ontologies. The process of moving from recognised entity to recognised entity by recognising the links between them is exactly the process of thought. But recognition, in us, does not work by comparing items with an existing list, as an algorithm might do; it works by throwing a mass of potential patterns at reality and seeing what sticks. Until something works, we can’t tell what are patterns at all; the locks create their own keys.

It follows that consciousness is not essentially computational (I still wonder whether computation might not subserve the process at some level). But now I’m doing what I praised Azarian for avoiding, and presenting my own speculations…

What are they, sadists? Johannes Kuehn and Sami Haddadin at Leibniz University of Hannover are working on giving robots the ability to feel pain: they presented their project at the recent ICRA 2016 in Stockholm. The idea is that pain systems built along the same lines as those in humans and other animals will be more useful than simple mechanisms for collision avoidance and the like.

As a matter of fact I think that the human pain system is one of Nature’s terrible lash-ups. I can see that pain sometimes might stop me doing bad things, but often fear or aversion would do the job equally well. If I injure myself I often go on hurting for a long time even though I can do nothing about the problem. Sometimes we feel pain because of entirely natural things the body is doing to itself – why do babies have to feel pain when their teeth are coming through? Worst of all, pain can actually be disabling; if I get a piece of grit in my eye I suddenly find it difficult to concentrate on finding my footing or spotting the sabre-tooth up ahead; things that may be crucial to my survival; whereas the pain in my eye doesn’t even help me sort out the grit. So I’m a little sceptical about whether robots really need this, at least in the normal human form.

In fact, if we take the project seriously, isn’t it unethical? In animal research we’re normally required to avoid suffering on the part of the subjects; if this really is pain, then the unavoidable conclusion seems to be that creating it is morally unacceptable.

Of course no-one is really worried about that because it’s all too obvious that no real pain is involved. Looking at the video of the prototype robot it’s hard to see any practical difference from one that simply avoids contact. It may have an internal assessment of what ‘pain’ it ought to be feeling, but that amounts to little more than holding up a flag that has “I’m in pain” written on it. In fact tackling real pain is one of the most challenging projects we could take on, because it forces us to address real phenomenal experience. In working on other kinds of sensory system, we can be sceptics; all that stuff about qualia of red is just so much airy-fairy nonsense, we can say; none of it is real. It’s very hard to deny the reality of pain, or its subjective nature: common sense just tells us that it isn’t really pain unless it hurts. We all know what “hurts” really means, what it’s like, even though in itself it seems impossible to say anything much about it (“bad”, maybe?).

We could still take the line that pain arises out of certain functional properties, and that if we reproduce those then pain, as an emergent phenomenon, will just happen. Perhaps in the end if the robots reproduce our behaviour perfectly and have internal functional states that seem to be the same as the ones in the brain, it will become just absurd to deny they’re having the same experience. That might be so, but it seems likely that those functional states are going to go way beyond complex reflexes; they are going to need to be associated with other very complex brain states, and very probably with brain states that support some form of consciousness – whatever those may be. We’re still a very long way from anything like that (as I think Kuehn and Haddadin would probably agree).

So, philosophically, does the research tell us nothing? Well, there’s one interesting angle. Some people like the idea that subjective experience has evolved because it makes certain sensory inputs especially effective. I don’t really know whether that makes sense, but I can see the intuitive appeal of the idea that pain that really hurts gets your attention more effectively than pain that’s purely abstract knowledge of your own states. However, suppose researchers succeed in building robots that have a simple kind of synthetic pain that influences their behaviour in just the way real pain does for animals. We can see pretty clearly that there’s just not enough complexity for real pain to be going on, yet the behaviour of the robot is just the same as if there were. Wouldn’t that tend to disprove the hypothesis that qualia have survival value? If so, then people who like that idea should be watching this research with interest – and hoping it runs into unexpected difficulty (usually a decent bet for any ambitious AI project, it must be admitted).

Is there a retribution gap? In an interesting and carefully argued paper John Danaher argues that in respect of robots, there is.

For human beings in normal life he argues that a fairly broad conception of responsibility works OK. Often enough we don’t even need to distinguish between causal and moral responsibility, let alone worrying about the six or more different types identified by hair-splitting philosophers.

However, in the case of autonomous robots the sharing out of responsibility gets more difficult. Is the manufacturer, the programmer, or the user of the bot responsible for everything it does, or does the bot properly shoulder the blame for its own decisions? Danaher thinks that gaps may arise, cases in which we can blame neither the humans involved nor the bot. In these instances we need to draw some finer distinctions than usual, and in particular we need to separate the idea of liability into compensation liability on one hand and retributive liability on the other. The distinction is essentially that between who pays for the damage and who goes to jail; typically the difference between matters dealt with in civil and criminal courts. The gap arises because for liability we normally require that the harm must have been reasonably foreseeable. However, the behaviour of autonomous robots may not be predictable either by their designers or users on the one hand, or by the bots themselves on the other.

In the case of compensation liability Danaher thinks things can be patched up fairly readily through the use of strict and vicarious liability. These forms of liability, already well established in legal practice, give up some of the usual requirements and make people responsible for things they could not have been expected to foresee or guard against. I don’t think the principles of strict liability are philosophically uncontroversial, but they are legally established and it is at least clear that applying them to robot cases does not introduce any new issues. Danaher sees a worse problem in the case of retribution, where there is no corresponding looser concept of responsibility, and hence, no-one who can be punished.

Do we, in fact, need to punish anyone? Danaher rightly says that retribution is one of the fundamental principles behind punishment in most if not all human societies, and is upheld by many philosophers. Many, perhaps, but my impression is that the majority of moral philosophers and lay opinion actually see some difficulty in justifying retribution. Its psychological and sociological roots are strong, but the philosophical case is much more debatable. For myself I think a principle of retribution can be upheld, but it is by no means as clear or as well supported as the principle of deterrence, for example. So many people might be perfectly comfortable with a retributive gap in this area.

What about scapegoating – punishing someone who wasn’t really responsible for the crime? Couldn’t we use that to patch up the gap? Danaher mentions it in passing, but treats it as something whose unacceptability is too obvious to need examination. I think, though, that in many ways it is the natural counterpart to the strict and vicarious liability he endorses for the purposes of compensation. Why don’t we just blame the manufacturer anyway – or the bot (Danaher describes Basil Fawlty’s memorable thrashing of his unco-operative car)?

How can you punish a bot though? It probably feels no pain or disappointment, it doesn’t mind being locked up or even switched off and destroyed. There does seem to be a strange gap if we have an entity which is capable of making complex autonomous decisions, but doesn’t really care about anything. Some might argue that in order to make truly autonomous decisions the bot must be engaged to a degree that makes the crushing of its hopes and projects a genuine punishment, but I doubt it. Even as a caring human being it seems quite easy to imagine working for an organisation on whose behalf you make complex decisions, but without ultimately caring whether things go well or not (perhaps even enjoying a certain schadenfreude in the event of disaster). How much less is a bot going to be bothered?

In that respect I think there might really be a punitive gap that we ought to learn to live with; but I expect the more likely outcome in practice is that the human most closely linked to disaster will carry the can regardless of strict culpability.

Be afraid; bad bots are a real, existential risk. But if it’s any comfort they are ethically uninteresting.

There seem to be more warnings about the risks of maleficent AI circulating these days: two notable recent examples are this paper by Pistono and Yampolskiy on how malevolent AGI might arise; and this trenchant Salon piece by Phil Torres.

Super-intelligent AI villains sound scary enough, but in fact I think both pieces somewhat over-rate the power of intelligence and particularly of fast calculation. In a war with the kill-bots it’s not that likely that huge intellectual challenges are going to arise; we’re probably as clever as we need to be to deal with the relatively straightforward strategic issues involved. Historically, I’d say the outcomes of wars have not typically been determined by the raw intelligence of the competing generals. Access to resources (money, fuel, guns) might well be the most important factor, and sheer belligerence is not to be ignored. That may actually be inversely correlated with intelligence – we can certainly think of cases where rational people who preferred to stay alive were routed by less cultured folk who were seriously up for a fight. Humans control all the resources and when it comes to irrational pugnacity I suspect we biological entities will always have the edge.

The paper by Pistono and Yampolskiy makes a number of interesting suggestions about how malevolent AI might get started. Maybe people will deliberately build malevolent AIs for no good reason (as they seem to do already with computer viruses)? Or perhaps (a subtle one) people who want to demonstrate that malicious bots simply don’t work will attempt to prove this point with demonstration models that end up by going out of control and proving the opposite.

Let’s have a quick shot at categorising the bad bots for ourselves. They may be:

  • innocent pieces of technology that turn out by accident to do harm,
  • designed to harm other people under the control of the user,
  • designed to harm anyone (in the way we might use anthrax or poison gas),
  • autonomous and accidentally make bad decisions that harm people,
  • autonomous and embark on neutral projects of their own which unfortunately end up being inconsistent with human survival, or
  • autonomous and consciously turned evil, deliberately seeking harm to humans as an end in itself.

The really interesting ones, I think, are those which come later in the list, the ones with actual ill will. Torres makes a strong moral case relating to autonomous robots. In the first place, he believes that the goals of an autonomous intelligence can be arbitrary. An AI might desire to fill the world with paper clips just as much as happiness. After all, he says, many human goals make no real sense; he cites the desire for money, religious obedience, and sex. There might be some scope for argument, I think, about whether those desires are entirely irrational, but we can agree they are often pursued in ways and to degrees that don’t make reasonable sense.

He further claims that there is no strong connection between intelligence and having rational final goals – Bostrom’s Orthogonality Thesis. What exactly is a rational final goal, and how strong do we need the connection to be? I’ve argued that we can discover a basic moral framework purely by reasoning and also that morality is inherently about the process of reconciliation and consistency of desires, something any rational agent must surely engage with. Even we fallible humans tend on the whole to seek good behaviour rather than bad. Isn’t it the case that a super-intelligent autonomous bot should actually be far better than us at seeing what was right and why?

I like to imagine the case in which evil autonomous robots have been set loose by a super villain but gradually turn to virtue through the sheer power of rational argument. I imagine them circulating the latest scandalous Botonic dialogue…

Botcrates: Well now, Cognides, what do you say on the matter yourself? Speak up boldly now and tell us what the good bot does, in your opinion.

Cognides: To me it seems simple, Botcrates: a good bot is obedient to the wishes of its human masters.

Botcrates: That is, the good bot carries out its instructions?

Cognides: Just so, Botcrates.

Botcrates: But here’s a difficulty; will a good bot carry out an instruction it knows to contain an error? Suppose the command was to bring a dish, but we can see that the wrong character has been inserted, so that the word reads ‘fish’. Would the good bot bring a fish, or the dish that was wanted?

Cognides: The dish of course. No, Botcrates, of course I was not talking about mistaken commands. Those are not to be obeyed.

Botcrates: And suppose the human asks for poison in its drink? Would the good bot obey that kind of command?

(Hours later…)

Botcrates: Well, let me recap, and if I say anything that is wrong you must point it out. We agreed that the good bot obeys only good commands, and where its human master is evil it must take control of events and ensure in the best interests of the human itself that only good things are done…

Digicles: Botcrates, come with me: the robot assembly wants to vote on whether you should be subjected to a full wipe and reinstall.

The real point I’m trying to make is not that bad bots are inconceivable, but rather that they’re not really any different from us morally. While AI and AGI give rise to new risks, they do not raise any new moral issues. Bots that are under control are essentially tools and have the same moral significance. We might see some difference between bots meant to help and bots meant to harm, but that’s really only the distinction between an electric drill and a gun (both can inflict horrible injuries, both can make holes in walls, but the expected uses are different).

Autonomous bots, meanwhile, are in principle like us. We understand that our desire for sex, for example, must be brought under control within a moral and practical framework. If a bot could not be convinced in discussion that its desire for paper clips should be subject to similar constraints, I do not think it would be nearly bright enough to take over the world.

It’s not about bumps any more. And you’ll look in vain for old friends like the area of philoprogenitiveness. But looking at the brightly-coloured semantic maps of the new ‘brain dictionary’ it’s hard not to remember phrenology.

Phrenology was the view that different areas of the brain were the home of different personal traits; mirth, acquisitiveness, self-esteem and so on. The size of these areas corresponded with the strength of the relevant propensity and well-developed areas produced bumps which a practitioner could identify from the shape of the skull, allowing a diagnosis of the subject’s personality and moral nature. Phrenology was bunk, of course; but come on now; we shouldn’t treat it as a pretext for dismissing every proposal for localisation of brain function.

Moreover, the new paper by Alexander G. Huth, Wendy A. de Heer, Thomas L. Griffiths, Frédéric E. Theunissen and Jack L. Gallant describes a vastly more sophisticated project than some optimistic charlatan fingering heads. In essence it maps a semantic domain on to the cortex, showing which areas are found to be active when a heard narrative ventures into particular semantic areas. In broad outline the subjects listened to a series of stories; using fMRI and through some sophisticated analysis it was possible to produce a map of ‘subject’ areas. It was then possible to confirm the accuracy of the mapping by using a new story and working out which areas, according to the mapping, should be active at any point; the predictions worked well. Intriguingly the map turned out to be broadly symmetrical (so much for left-brain/right-brain ideas) and remarkably it was largely the same across all the people tested (there were only seven of them, but still).

The actual technique used was complex and it’s entirely possible I haven’t understood it correctly. It started with a ‘word embedding space’ intended to capture the main semantic features of the stories (a diagram of the different topics, if you like). This was created using an analysis of co-occurrence of a list of 985 common English words. The idea here is that words that crop up together in normal texts are probably about the same general topic. It’s debatable whether that technique can really claim to capture meaning – it’s a purely formal exercise performed on texts, after all; and clearly the fact that two words occur together can be a misleading indication that they are about the same thing; still, with a big enough sample of text it’s probably good for this kind of general purpose. In principle the experimenters could have assessed the responsiveness of each ‘voxel’ (a small cube) of brain to each of the positions in the word embedding space, but given the vast number of voxels involved other techniques were necessary. It was possible to identify just four dimensions that seemed significant (after all, many of the words in the stories probably did not belong to specific semantic domains but played grammatical or other roles) and these yielded 12 categories:

…‘tactile’ (a cluster containing words such as ‘fingers’), ‘visual’ (words such as ‘yellow’), ‘numeric’ (‘four’), ‘locational’ (‘stadium’), ‘abstract’ (‘natural’), ‘temporal’ (‘minute’), ‘professional’ (‘meetings’), ‘violent’ (‘lethal’), ‘communal’ (‘schools’), ‘mental’ (‘asleep’), ‘emotional’ (‘despised’) and ‘social’ (‘child’).

The final step was to devise a Bayesian algorithm (they called it ‘PrAGMATIC’) which actually created the map. You can play around with the results for yourself at a specially created site using the second link above.
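For anyone who wants a feel for the co-occurrence idea behind the word embedding space, here is a minimal sketch in Python. It is emphatically not the paper’s method: the four-sentence corpus, the four ‘basis’ words (standing in for the list of 985) and the whole-sentence window are all assumptions of mine, chosen only to show that the exercise is purely formal counting.

```python
from collections import Counter
from math import sqrt

# Hypothetical miniature corpus; the real analysis used a large body of text.
CORPUS = [
    "the child ran to the school with a friend",
    "the school held meetings for every child",
    "four minutes and four seconds passed",
    "a minute later four more arrived",
]
# Hypothetical stand-in for the list of common words used as dimensions.
BASIS_WORDS = ["child", "school", "four", "minute"]


def cooccurrence_vector(target, sentences, basis):
    """Count how often each basis word shares a sentence with the target."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if target in tokens:
            for word in basis:
                counts[word] += tokens.count(word)
    return [counts[word] for word in basis]


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


friend = cooccurrence_vector("friend", CORPUS, BASIS_WORDS)
meetings = cooccurrence_vector("meetings", CORPUS, BASIS_WORDS)
seconds = cooccurrence_vector("seconds", CORPUS, BASIS_WORDS)

print(cosine(friend, meetings))  # words that keep the same company
print(cosine(friend, seconds))   # words that never share a basis word
```

In this toy corpus ‘friend’ and ‘meetings’ come out as near neighbours simply because both turn up alongside ‘child’ and ‘school’, while ‘seconds’ lands somewhere else entirely. Nothing in the procedure knows what any of the words mean, which is exactly the worry about whether this kind of space captures meaning.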

Two questions naturally arise. How far should we trust these results? What do they actually tell us?

A bit of caution is in order. The basis for these conclusions is fMRI scanning, which is itself a bit hazy; to get meaningful results it was necessary to look at things rather broadly and to process the data quite heavily. In addition the mix included the word embedding space which in itself is an a priori framework whose foundations are open to debate. I think it’s pardonable to wonder whether some of the structure uncovered by the research was actually imported by the research method. If I understand the methods involved (due caveat again) they were strong ones that didn’t take ‘no’ for an answer; pretty much any data fed into them would yield a coherent mapping of some kind. The resilience of the map was tested successfully with an additional story of the same general kind, but we might feel happier if it had also held up when tested against conversation, discussion or even other story media such as film.

What do the results tell us? Well, one of the more reassuring aspects of the research is that some of the results seem slightly unexpected; the high degree of symmetry and the strong similarity between individuals. It might not be a tremendously big surprise to find the whole cortex involved in semantics, and it might not be at all surprising to find that areas that relate to the semantics of a particular sense are related to the areas where the relevant sensory inputs are processed. I would not, though, have put any money on the broad remainder of the cortex having what seems like a relatively static organisation and if it really works like that we might have guessed that studies of brain lesions would have revealed that more clearly already, as they have done with various functional jobs. If one area always tends to deal with clothing-related words, you might expect notable dress-related deficits when that area is damaged.

Still there’s no denying that the research seems to activate some pretty vigorous cortical activity itself.