Explaining to humans

We’ve discussed a few times recently how deep learning systems are sometimes inscrutable. They work out how to do things for themselves, and there may be no way to tell what method they are using. I’ve suggested that it’s worse than that; they are likely to be using methods which, though sound, are in principle incomprehensible to human beings. These unknowable methods may well be a very valuable addition to our stock of tools, but without understanding them we can’t be sure they won’t fail; and the worrying thing about that is that unlike humans, who often make small or recoverable mistakes, these systems have a way of failing that is sudden, unforgiving, and catastrophic.

So I was interested to see this short interview with Been Kim, who works for Google Brain (is there a Google Eyes?) and has the interesting job of building a system that can explain to humans what the neural networks are up to.

That ought to be impossible, you’d think. If you use the same deep learning techniques to generate the explainer, you won’t be able to understand how it works, and the same worries will recur on another level. The other approach is essentially the old-fashioned one of directly writing an algorithm to do the job; but to write an explicit algorithm to explain system X you surely need to understand system X to begin with?

I suspect there’s an interesting general problem here about the transmission of understanding. Perhaps data can be transferred, but understanding just has to happen, and sometimes, no matter how helpfully the relevant information is transmitted, understanding just fails to occur (I feel teachers may empathise with this).

Suppose the explainer is in place; how do we know that its explanations are correct? We can see over time whether it successfully distinguishes systems that work reliably from those that are going to fail, but that is only ever going to be a provisional answer. The risk of sudden unexpected failure remains. For real certainty, the only way is for us to validate the explainer by understanding it, and that is off the table.

So is Been Kim wasting her time on a quixotic project – perhaps one that Google has cynically created to reassure the public and politicians while knowing the goal is unattainable? No; her goal is actually a more modest, negative one. Her interpreter is not meant to provide an assurance that a given system is definitely reliable; rather, it is supposed to pick out ones that are definitely dodgy; and this is much more practical. After all, we may not always understand how a given system executes a particular complex task, but we do know in general how neural networks and deep learning work. We know that the output decisions come from factors in the input data, and the interpreter ought to be able to tell us what factors are being taken into account. Then, using the unique human capacity to identify relevance, we may be able to spot some duds – cases where the system is using a variable that tracks the relevant stuff only unreliably, or where there was some unnoticed problem with the corpus of examples the system learnt from.
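To make that concrete, here is a minimal sketch of one generic probe of this kind: permutation importance on a toy classifier. To be clear, this is not Kim’s interpreter, and the model, the data and the feature roles are all invented for illustration; it simply shows what “telling us which factors are being taken into account” can look like in practice.

```python
# Illustrative only: a toy "black box" plus a permutation-importance probe.
# Nothing here corresponds to Been Kim's actual system; names and data are invented.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: feature 0 genuinely tracks the label, feature 1 is a noisy
# proxy for it, feature 2 is pure noise.
n = 2000
y = rng.integers(0, 2, size=n)
X = np.column_stack([
    y + rng.normal(0, 0.5, n),   # reliable signal
    y + rng.normal(0, 2.0, n),   # unreliable proxy
    rng.normal(0, 1.0, n),       # irrelevant noise
])

# A tiny logistic-regression "black box", trained by plain gradient descent.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y) / n)
    b -= 0.5 * np.mean(p - y)

def accuracy(X_):
    p = 1 / (1 + np.exp(-(X_ @ w + b)))
    return np.mean((p > 0.5) == y)

# The probe: break one input factor at a time and see how much the model cares.
base = accuracy(X)
for j in range(X.shape[1]):
    X_shuffled = X.copy()
    X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
    print(f"feature {j}: accuracy drop {base - accuracy(X_shuffled):.3f}")
```

If a probe of this sort shows the system leaning heavily on a factor that a human can see is only an unreliable proxy for the relevant thing, that is exactly the kind of dud described above.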

Is that OK? Well, in principle there’s the further risk that the system is actually cleverer than we realise; that it is using features (perhaps very complex ones) that actually work fine, but which we’re too dim to grasp. Our best reassurance here is again understanding; if we can see how things seem to be working, we have to be very unlucky to hit a system which is actually superior but just happens, in all the examined cases, to look like a dodgy one. We may not always understand the system, but if we understand something that’s going wrong, we’re probably on firm ground.

Of course, weeding out egregiously unreliable systems does not solve the basic problem of efficient but inscrutable systems. Without accusing Google of a cunning sleight of hand after all, I can well imagine that the legislators and bureaucrats who are gearing up to make rules about this issue might mistake interpreter systems like Kim’s for a solution, require them in all cases, and assume that the job is done and dusted…

16 thoughts on “Explaining to humans”

  1. When a student comes up with the right answers but can’t explain their thinking, a good teacher will suspect cheating. Even if no cheating occurred, they would not trust that the student had learned the lesson. After all, getting the answers right on a test is only a proxy for knowledge. I have a hard time believing we will ever trust deep learning systems even with attached explainers. Of course, if the application does not require a high level of trust perhaps that’s ok. On the other hand, if an AI says that it can land the plane when severe wind shear has been detected around the airport, I am not sure we would trust it. We also can’t prove competence empirically as failures would still be considered too expensive.

  2. Polluted water-air-land come to mind…

    Computers test and predict outcomes for man’s survival via our entanglements here on Earth…
    …we doggedly input while dodgingly compare consciousness and money…

    Could computers be towards ‘deep learning entanglements’ which fault in mankind’s survival…

  3. Been Kim’s work is excellent, and if I could just get Deep Learning methods to work better than old-fashioned machine learning methods at my job, I’d definitely want her “interpreter” add-on. But for AI safety, I’d definitely want more. At least when we get to the point of, say, appointing robots to be CEOs of major corporations.

  4. On the last point in the post, I think it’s worth remembering that organic intelligences are often unreliable as well, although evolution weeds out the most egregious cases. AIs will make mistakes. The problems that they’re useful for are too complicated to eliminate that possibility. The trick is to remember that when using them, and to make the error rate better than that of your average bored distracted human.

  5. This development is interesting.
    So computer intelligence would exceed the grasp of your typical Mensa member.
    In my idle time I have chanced across IQ tests on the internet exponentially beyond your typical Mensa member.
    Would these applications exceed the grasp of an Einstein who could visualize a chair at the speed of light, or a Picasso, with his ingenious and agile visualization? Or Shakespeare for that matter?
    If that’s apples and oranges, how about mathematicians such as Mandelbrot or Erdős or von Neumann? I’d guess that would be up their alley.
    Or Cantor who could grasp the highest infinity, would this new AI be higher than the highest infinity?
    Is it really that far out if not far off?

  6. SelfAware,
    Many “mistakes” of human intelligence are actually good things. I’m using “mistakes” as taken from the point of view of our “designer”, Mother Nature. For example, we value sex independently of its contribution to reproduction. Of course, we also make mistakes we ourselves regret, such as math errors. But when we make “mistakes” that amount to valuing things we weren’t designed for, that’s totally OK.

    When AIs do likewise, it will be totally not OK.

  7. Paul,
    Certainly some mistakes have consequences that turn out to be beneficial. But many mistakes have awful consequences that are not at all OK.

    I can’t see why AI mistakes necessarily wouldn’t have both categories.

  8. On “weeding out egregiously unreliable systems”…
    …Are AI and HI (human intelligence) entanglements becoming a devolving complementarity…

    Can we posit quantum-quanta, as preceding objects, in procession to entanglements…
    …for then, objective philosophical conversations could begin…

    With, perhaps Mr. Einstein’s questioning ‘light before objectivity’…
    …then trying to understand light as material, becomes food for the philosopher and the physicist…

  9. Paul Topping,

    They’re certainly letting forms of AI drive cars without those AIs explaining anything first.

    Basically the situation seems to be headed towards encrypted behaviors. Behaviors which are hidden and not open to access – like dealing with a spy or double agent.

  10. Callan S,

    Sure, that’s true right now, but we are in a mostly experimental phase with respect to AI-driven cars. Once the general public starts having accidents, people are going to be looking for explanations. Designers will also want to know why accidents happened, but are likely to want to keep any “explanations” to themselves unless those explanations let their AI creation off the hook. The public and insurers, on the other hand, will know that the designers have more information and will demand it be released. Government will likely have to set some standards for such information. It’s going to take a while to sort things out.

  11. AI as it is right now, particularly in the guise of deep learning systems, essentially shows what psychologists call ‘System 1’-thinking: fast, automatic, unconscious kind of recognition. An AI judges an image to be similar to another without reference to what the image shows—essentially, all it cares about is the distance between two sets of pixels in some high-dimensional space, computed according to an opaque metric that it has ‘learned’. An AI will identify a nine-legged hairy black thing as a spider, while every human would immediately identify it as a fake: there are no nine-legged spiders. But the AI doesn’t know what a spider is; all it ‘knows’ is that the image is similar to a cluster of images that it has tagged with ‘spider’.

    In order to be explicable to humans, AI would need to add some kind of ‘System 2’, model-based facility—that is, the semantic component that a spider is an eight-legged thing. Then, we can start asking AI questions regarding why it makes a certain identification—we ask it, ‘Why have you identified this as a spider?’, and it may answer, ‘because it has eight legs’, which we might take as a good justification.

    As to how we can get to this point, well, I have my own ideas regarding model-building—which are still mostly along the lines Peter discussed previously—but I recently read ‘The Enigma of Reason’, by Dan Sperber and Hugo Mercier, who propose that there is really no distinct ‘System 2’-thinking. Rather, the reason we come up with reasons is to justify and to convince; and what counts as a valid justification, or what convinces us, is just as opaque to us as the grounds for tagging an image ‘spider’ are to a regular deep learning AI. That is, it is ultimately rooted in ‘System 1’-thinking again: reasons are just the kind of justification that we recognize, in a ‘System 1’, intuitive manner, as belonging to the set of valid reasons.

    This way of viewing things has certain advantages—most notably, that the faculty of reason, rather than being curiously bad at its job of finding out the truth (what with all its biases), turns out to be actually pretty damn good at its real job, namely, finding justifications. I’m still not sure how it all hangs together, but I think there’s much fertile ground there.
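    A minimal, purely illustrative sketch of that contrast, with toy numbers and invented names rather than any real vision model: a ‘System 1’ classifier that only measures distance to previously tagged clusters, next to a ‘System 2’ rule that can actually answer ‘because it has eight legs’.

    ```python
    # Illustrative toy code: pixel-distance matching versus an explicit rule.
    import numpy as np

    rng = np.random.default_rng(1)

    # "System 1": nearest-centroid matching in raw "pixel" space.
    spider_images = rng.normal(0.7, 0.1, size=(50, 64))  # cluster tagged "spider"
    beetle_images = rng.normal(0.3, 0.1, size=(50, 64))  # cluster tagged "beetle"
    centroids = {
        "spider": spider_images.mean(axis=0),
        "beetle": beetle_images.mean(axis=0),
    }

    def system1_label(image):
        # No notion of legs or bodies: just distance to the tagged clusters.
        return min(centroids, key=lambda k: np.linalg.norm(image - centroids[k]))

    # "System 2": an explicit, human-legible rule over a symbolic description.
    def system2_label(description):
        # To this rule, a spider is literally "an eight-legged thing".
        return "spider" if description["legs"] == 8 else "not a spider"

    fake = rng.normal(0.7, 0.1, size=64)      # pixel-wise spider-ish
    print(system1_label(fake))                # "spider" -- similarity alone
    print(system2_label({"legs": 9}))         # "not a spider" -- rule gives a reason
    ```

    The first function can be fooled by anything that lands near the ‘spider’ cluster; only the second can give a reason a human would recognize as one.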

  12. Jochen,
    Very good post… I have not read ‘The Enigma of Reason’, but from your brief synopsis it sounds like a good read. Your analysis corresponds with what I’ve discovered about rationality being a closed-loop, discrete binary system. The vast majority of people, including the scientific community, will not understand what that statement means, and a Google search will not garner further discussion on the topic.

    Rationality is the only tool that we have in our toolbox and yet, because rationality is a discrete binary system, it will not and cannot accommodate a continuous, linear system such as consciousness. That is exactly why consciousness is considered to be a subjective experience. To add clarity to the discussion, consciousness is not a subjective experience; it is an objective experience of some “thing” that is radically indeterminate. Rationality is where it begins, and rationality is where it ends. Unless or until one is willing to address the distinction of rationality being a discrete binary system, nothing will change, because nothing can change.

    In conclusion: Rationality is the meta-problem of consciousness.

  13. That ration is akin to scale dimension degree measure…

    Our evolution a ration of probability…

    A ration of ‘fundamental interaction’ for this locality…

  14. “Understanding” is inference to the best explanation, right?—One of three kinds of belief-fixation (the other two: induction and deduction). Abduction (“understanding”) involves implicit and explicit statistical and probabilistic assessment, with evolved meta-theoretical criteria working as “framing effects” in constraint-satisfaction networks. “Transfer” of understanding, wrought in these terms, doesn’t seem so quixotic. I like to put the exercise of our intelligence in the following terms: When do we reason? When do we think? We think, we reason, when we do not know. When we do not know what? When we do not know what to believe or what to do. These are the only two generic kinds of informational appetite that motivate exercise of the mechanisms of intelligence. What are those mechanisms? Well, there are mechanisms of epistemic intelligence (induction, deduction and abduction) and there are mechanisms of practical intelligence, of choice and decision-making (utility maximization). Cognitive science needs to answer the following questions: To what extent do evolved architectures support the exercise of practical and epistemic intelligence? To what extent do culturally evolved software(s) (e.g. fast and frugal heuristics) support that exercise? If it’s all entangled, though, so much the worse for the prospects of “transfer” of the reasoning behind one’s belief-fixation or decision-making.

  15. Understanding is more like semiotics semiology…
    …“but science is usually based on ‘inference’, not unvarnished observation”… Mr Stephen Jay Gould

    Explanation and digital explanation, seem before and after ‘signs’ of their times…

    Observation-presence-present, the ‘inscrutable deep learning system’…

  16. While waiting for Peter’s next post, check out Wikipedia’s “Transformational grammar”…

    My take, ‘moving’ between “Deep (learning) Structure” and “Surface Structure”…
    …explains our words, when possible, can be a means to meaning/experiencing transformation/observation…

    A variation of ‘let it be’, thanks
