Global Workspace beats frame problem?

Picture: global workspace. Global Workspace theories have been popular ever since Bernard Baars put forward the idea back in the eighties; in ‘Applying global workspace theory to the frame problem’*, Murray Shanahan and Baars suggest that among its other virtues, the global workspace provides a convenient solution to that old bugbear, the frame problem.

What is the frame problem, anyway? Initially, it was a problem that arose when early AI programs were attempting simple tasks like moving blocks around. It became clear that when they moved a block, they not only had to update their database to correct the position of the block, they had to update every other piece of information to say it had not been changed. This led to unexpected demands on memory and processing. In the AI world, this problem never seemed too overwhelming, but philosophers got hold of it and gave it a new twist. Fodor, and in a memorable exposition, Dennett, suggested that there was a fundamental problem here. Humans had the ability to pick out what was relevant and ignore everything else, but there didn’t seem to be any way of giving computers the same capacity. Dennett’s version featured three robots: the first happily pulled a trolley out of a room to save it from a bomb, without noticing that the bomb was on the trolley and so came along too; the second attempted to work out all the implications of pulling the trolley out of the room, but there were so many logical implications that it was still working through them when the bomb went off. The third was designed to ignore irrelevant implications, but it was still busy identifying all the many irrelevant implications when again the bomb exploded.
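
To see the flavour of the original technical problem, here is a toy sketch (my own illustration, not anyone’s actual planner): with a naive logical representation, an action has to say not only what changes but also, fact by fact, what stays the same.

```python
# Toy illustration of the original frame problem (hypothetical example only).
# With a naive logical world-model, every action must assert not just what
# changes but also that everything else has NOT changed.

facts = {
    "on(A, table)": True,
    "on(B, A)": True,
    "colour(A, red)": True,
    "lights_on": True,
    # ... in a realistic database, thousands more ...
}

def move_block(state, block, destination):
    new_state = {}
    for fact, value in state.items():
        if fact.startswith(f"on({block},"):
            continue  # the one fact that actually changes
        # "frame axiom": explicitly carry over every unaffected fact
        new_state[fact] = value
    new_state[f"on({block}, {destination})"] = True
    return new_state

# One fact changed, yet we touched every other fact to say it hadn't.
print(len(move_block(facts, "B", "table")))
```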

Shanahan and Baars explain this background and rightly point out that the original frame problem arose in systems which used formal logic as their only means of drawing conclusions about things, no longer an approach that many people would expect to succeed. They don’t really believe that the case for the insolubility of the problem has been convincingly made. What exactly is the nature of the problem, they ask: is it combinatorial explosion? Or is it just that the number of propositions the AI has to sort through to find the relevant one is very large (and by the way, aren’t there better ways of finding it than searching every item in order?). Neither of those is really all that frightening; we have techniques to deal with them.

I think Shanahan and Baars, understandably enough, under-rate the task a bit here. The set of sentences we’re asking the AI to sort through is not just very large; it’s infinite. One of the absurd deductions Dennett assigns to his robots is that the number of revolutions the wheels of the trolley will perform in being pulled out of the room is less than the number of walls in the room. This is clearly just one member of a set of valid deductions which goes on forever; the number of revolutions is also less than the number of walls plus one; it’s less than the number of walls plus two… It may be obvious that these deductions are uninteresting; but what is the algorithm that tells us so? More fundamentally, the superficial problems are proxies for a deeper concern: that the real world isn’t reducible to a set of propositions at all; that, as Borges put it,

“it is clear that there is no classification of the Universe that is not arbitrary and full of conjectures. The reason for this is very simple: we do not know what thing the universe is.”

There’s no encyclopaedia which can contain all possible facts about any situation. You may have good heuristics and terrific search algorithms, but when you’re up against an uncategorisable domain of infinite extent, you’re surely still going to have problems.
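
Just to make the earlier point concrete, here is a toy generator (my own illustration) of Dennett-style deductions; it never runs dry, and nothing in the list itself tells you where relevance stops.

```python
from itertools import count, islice

# My own toy illustration: an endless stream of "valid" deductions of the
# kind Dennett's robot gets stuck on. Relevance is not a property you can
# read off the list.
def trolley_deductions(revolutions, walls):
    for k in count():
        yield f"the number of revolutions ({revolutions}) is less than the number of walls plus {k}"

for statement in islice(trolley_deductions(revolutions=3, walls=4), 5):
    print(statement)
# ...and so on, forever.
```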

However, the solution proposed by Shanahan and Baars is interesting. Instead of the mind having to search through a large set of sentences, it has a global workspace where things are decided and a series of specialised modules which compete to feed in information (there’s an issue here about how radically different inputs from different modules manage to talk to each other: Shanahan and Baars mention a couple of options and then say rather loftily that the details don’t matter for their current purposes. It’s true that in context we don’t need to know exactly what the solution is – but we do need to be left believing that there is one).

Anyway, the idea is that while the global workspace is going about its business, each module is looking out for just one thing. When eventually the bomb-is-coming-too module gets stimulated, it begins signalling very vigorously and that information gets into the workspace. Instead of having to identify relevant developments, the workspace is automatically fed with them.
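
As a rough sketch of the mechanism being described (my own simplification, not the architecture in the paper): each specialist watches for its one thing, and the workspace simply admits whichever module shouts loudest.

```python
import random

# A bare-bones sketch of the competition/broadcast idea described above
# (my own simplification, not Shanahan and Baars's actual architecture).
class Module:
    def __init__(self, name, trigger):
        self.name = name
        self.trigger = trigger  # the one thing this specialist looks out for

    def salience(self, scene):
        # shouts loudly only when its trigger is present; otherwise murmurs
        return 1.0 if self.trigger in scene else random.uniform(0.0, 0.1)

def workspace_cycle(modules, scene):
    # the workspace admits whichever specialist is currently loudest
    winner = max(modules, key=lambda m: m.salience(scene))
    return f"broadcast: {winner.name}"

modules = [
    Module("bomb-is-coming-too", trigger="bomb on trolley"),
    Module("door-is-open", trigger="open door"),
    Module("wheel-revolution-counter", trigger="wheels"),
]
print(workspace_cycle(modules, scene={"bomb on trolley"}))
```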

That looks good on the face of it; instead of spending time endlessly sorting through propositions, we’ll just be alerted when it’s necessary. Notice, however, that instead of requiring an indefinitely large amount of time, we now need an indefinitely large number of specialised modules. Moreover, if we really cover all the bases, many of those modules are going to be firing off all the time. So when the bomb-is-coming-too module begins to signal frantically, it will be competing with the number-of-rotations-is-less-than-the-number-of-walls module and all the others, and will be drowned out. If we only want to have relevant modules, or only listen to relevant signals, we’re back with the original problem of determining just what is relevant.

Still, let’s not dismiss the whole thing too glibly. It reminded me to some degree of Edelman’s analogy with the immune system, which in a way really does work like that. The immune system cannot know in advance what antibodies it will need to produce, so instead it produces lots of random variations; then when one gets triggered it is quickly reproduced in large numbers. Perhaps we can imagine that if the global workspace were served by modules which were not pre-defined, but arose randomly out of chance neural linkages, it might work something like that. However, the immune system has the advantage of knowing that it has to react against anything foreign, whereas we need relevant responses for relevant stimuli. I don’t think we have the answer yet.
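
If you wanted to toy with that thought, it might look something like this (pure speculation on my part, loosely modelled on clonal selection; nothing here comes from Edelman or from the paper).

```python
import random
import string

# Purely speculative sketch of the immune-system analogy: detectors are
# generated at random, and whichever ones happen to match the current
# situation get copied, clonal-selection style.
def random_detector(length=3):
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))

def respond(stimulus, repertoire):
    matches = [d for d in repertoire if d in stimulus]
    repertoire.extend(matches * 10)  # amplify whatever happened to work
    return matches

repertoire = [random_detector() for _ in range(5000)]
# usually finds a few lucky matches among the random detectors
print(respond("the bomb is on the trolley and it is ticking", repertoire)[:5])
```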

*Thanks to Lloyd for the reference.

69 thoughts on “Global Workspace beats frame problem?”

  1. Another very cool article, Peter, and one that I’ve read just recently before it was mentioned here by Lloyd. You’re correct in your analysis, especially regarding so-called “massive modularity.” It beggars belief that this is still a plausible (and popular) thesis among some philosophers of cognitive science. Aside from being evolutionarily nonsensical, it’s architecturally implausible. Modules are supposed to be fast, always on, and informationally encapsulated from central cognition, but if one starts to modularize cognitive functions that might even be plausible to modularize, such as theory-of-mind, one immediately wonders how such a module could operate.

    If there’s a natural solution to the frame problem, I suspect it has to do with the (generalized) faculty of attention. I’ve been thinking more lately about emotional appraisals of particular propositions or states of affairs acting as something like “processing sinkholes” which grab attention and fixate it on being happy, sad, angry, etc… till the environment (or the cognitive system) generates a goal onto which attention *must* be focused, forcing a task-switch. Maybe I’ll publish this… the attentional pot-hole solution to the frame problem. lol.

  2. Thanks, Paul. I like the idea of attentional pot-holes – apart from anything else it has a certain psychological plausibility. You should definitely write it up.

  3. Paul: I, too, have a problem with the GWT design, but not the same problem as you describe.

    We are not told exactly how a process module is to be built, but we are given some hints. In the paper referenced above, it is said that [A specialist process can be responsible for some aspect of perception, long-term planning, …, or indeed any posited cognitive process.] A few pages further on, discussing how a Rorschach blot might be interpreted as an elephant, it is suggested that there might be a process [always on the lookout for elephantine shapes.] But then footnote 16 asks whether this might be taking specialization too far. Thus, we have a broad idea of the range of capabilities being considered. Considering that the field is limited by the individual’s experience, not by the logical range of possibilities, I might propose that, as a rough estimate, we might need several thousands, but not millions, of such modules.

    Another paper, “Cognition, Action Selection, and Inner Rehearsal”, also available on Shanahan’s site, presents a detailed model of a very simplified version of a process, describing it as a pattern-matching matrix somewhat like an artificial neural network. A key characteristic of the model shown is that it would be able to respond to approximate matches to a target pattern. Thus, we have the general concept that a process would be something like an artificial neural network, a Hopfield network, or perhaps something like a Hawkins hierarchical memory unit.
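
    For what it’s worth, the approximate-match behaviour is easy to demonstrate with a toy Hopfield-style associative memory (my own illustration, not the model from the paper):

```python
import numpy as np

# Toy Hopfield-style associative memory (my own illustration, not the model
# in the "Cognition, Action Selection, and Inner Rehearsal" paper).
patterns = np.array([
    [ 1, -1,  1, -1,  1, -1,  1, -1],   # stored "elephantine shape" template
    [ 1,  1, -1, -1,  1,  1, -1, -1],   # some other stored shape
])

# Hebbian storage: sum of outer products, with self-connections removed
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)

probe = patterns[0].copy()
probe[0] *= -1                          # an approximate (smudged) version

for _ in range(5):                      # let the network settle
    probe = np.where(W @ probe >= 0, 1, -1)

print(np.array_equal(probe, patterns[0]))   # True: the stored pattern is recovered
```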

    Your criticism of the proposed process model seems to be directed toward a module that would implement some kind of formal logic, rather than a network organization. I see no real difficulty, for example, in thinking how a network module might be organized to sense whether or not you were thinking of an elephant based on your responses when I mentioned it. Maybe that’s not a very extreme example of the possible complexities required to implement a theory of mind, but I do not see any insurmountable barriers against extending such ideas to accomplish that task.

    On the other hand, I do have a problem with the nature of the communications among modules or between modules and the posited workspace. The workspace itself is said to do very little other than to accept communications from the modules and distribute those signals back to all of the modules. But just what sort of message would be meaningful to all of the various modules? To me, there is a major unanswered issue here concerning the nature of the information said to be transmitted by a module and, in turn, received back from the workspace or from other modules.

    The “Cognition” paper cited in my paragraph 3 above seems to suggest that each module would send out a code which would, in some sense, directly represent its results. But I do not see how such messages would make any sense to other modules. What sort of code could be transmitted by a visual perception module that would be meaningful to an auditory module?

    An alternative communication plan might be that each module does no more than identify itself when it has detected its perceptual specialty. This is the coding method suggested in another Shanahan paper, “Towards a Computational Account of Reflexive Consciousness”, also on his site. The idea is that the signal initiated by a module would be timed so as to occur within a specific “time slot” within the workspace competition/broadcast cycle. In this way, additional “bits” would be available to encode the messages from the various modules. However, in the example given, only three time slots are shown. I find it hard to see how more than, say, ten or so time slots could be encoded within a workspace cycle time of thirty or forty Hertz, given typical neural response times, even if one considers the ensemble response of a large number of neurons.
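
    Just to make that arithmetic explicit (my own back-of-the-envelope numbers, not figures from the paper):

```python
# Back-of-the-envelope check of the time-slot idea (my own rough numbers).
cycle_frequency_hz = 40                      # assumed workspace cycle in the gamma band
cycle_period_ms = 1000 / cycle_frequency_hz  # 25 ms per competition/broadcast cycle
slot_width_ms = 2.5                          # assumed smallest resolvable slot, given
                                             # a few milliseconds of neural jitter
print(int(cycle_period_ms // slot_width_ms)) # about 10 distinguishable time slots
```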

    And, in any case, what is the value of interconnecting a selected group of modules unless some sort of commonly meaningful message can be communicated?

    Suppose each module simply represents the top level of a perceptual pattern-detection hierarchy, performing an integration of lower-level feature detectors at a level that would serve to identify objects, individuals, or events in the world. Signals from the temporal lobes would indicate objects or people, signals from the parietal would identify where these things are, other signals from the hippocampus would relate the locations to familiar landmarks in the known world, and signals from frontal regions would indicate how these objects fit into my future plans. Is there anything to be gained by simply interconnecting these various regions to each other? What is it that needs to be communicated in order for the detected objects to be useful to me in pursuing my life’s objectives?

    Is it sufficient that the interconnection as described above could also be connected to motor organizational regions that would eventually guide my hands to pick up, or my feet to move toward, the recognized object? To me, there seems to be a major missing link here.

  4. Why restrict these intricate systems analyses to the brain ? Why not another physical object ? I would look forward to the description of all information flows within a developing orange, or maybe a nice, juicy apple. Not exactly much use, but cheaper and probably more accurate than 90% of cognitive science.

  5. Heh. Didn’t Hume somewhere have one of his characters say that the processes going on in the brain were not fundamentally all that different from those in a decaying cauliflower?

  6. Very interesting post; I just managed to finish reading the papers, and I have a very different impression.

    If I am right, one of the baseline ideas is that in our brain there is a federation of processes looking for all kinds of things in the “surroundings”, competing among themselves to take control.

    Paul: it reminds me of the distributed computing systems based on CORBA. I worked with this architecture a few years ago, and it was tough! I don’t know whether experts like you still use it to simulate these schemes; is it still used?

    So the question is, in a certain environment or scenario, how do we manage to filter the important, relevant stuff from what is not? And I would add: who is the master? Who is the agent that all these processes looking for elephants (Lloyd) report to?

    I believe the approach is just wrong.

    I look at it from an evolutionary perspective. So, in the beginning, conscious beings had NEEDS. Subsequently their processes evolved to satisfy those needs, ignoring what was not useful for the purpose.

    Imagine an amoeba at the very beginning. Using chemotaxis, it just follows a gradient of nutrients and pays no “attention” to the gradient of light, because its need is to get food, not light. So the issue is not how the amoeba knows how to discriminate between the gradient of food and the gradient of light. The need and purpose came first, and determined the action.

    Now extend this to us. We usually have an objective, a plan, a need, so we filter every input according to our objective. For example, if we want to get out of a room, we consider where the door is, or the height of the windows above the floor, but not the ratio between the wall height and the door height, which is irrelevant to our intention, i.e. to exit the room.

    So maybe this is the problem with AI systems: that you need to determine what is, and what is not, relevant to the objective of your system.

    So, from an evolutionary point of view, it is not that we ignore, or we discriminate what is irrelevant, it is simply that we never considered it.

    Somewhere on this site, I read the example of somebody saying “I need to buy milk”, and somebody answering “I am going to post a letter”; so how is it that the first one understands that the second one is volunteering to buy the milk? Because he is ready to search for things that help with his purpose, “buy milk”, and nothing else. If the answer had been “I am taking a flight to Sydney”, it would have been ignored.

    In my understanding, we are in a permanent problem solving process, and we just consider what is relevant and useful to solve the current problem.

    I guess that to code this in an AI system, you need to set the objective and then enumerate all the parameters and information needed to achieve the goal, so the system looks for that information and ignores the rest.

    My view is that we are designed to consider a few relevant things and ignore most of the information around us; that’s it, and for reasons of economy it cannot be any other way.

    It reminds me of spy movies, where a trainee agent is asked to describe every item and person around him in a cafe after talking for a few seconds with the experienced agent. Unless you are trained for that, you can’t do it; we just pay attention to what is useful for our current task.

    How to code this in a computer is a different matter, of course. But as far as humans are concerned, I think we are inflating a little problem into a big one. Or maybe I don’t understand the issue.

  7. Ah, I forgot, the other point I had in mind is the “law of averages”: we usually scan around, and it is also when things are outliers from their class average that we consider them. So in comment #6, if the door-height to ceiling-height ratio differed much from the average, we would also consider it.

  8. John: I am sorry if you are put off that I have used this space for my software design ideas. Perhaps I should try to flesh out my ideas a bit more and publish them somewhere, such as in an IEEE Transactions paper. And indeed, this site seems to be more of a philosophy forum than a software design forum. However, the subject of my thoughts does, I believe, qualify as material appropriate to this site.

  9. Lloyd:
    I think you’re somewhat right in saying that connectionist systems escape the frame problem as it was formulated by John McCarthy and Pat Hayes. But this isn’t the frame problem that is considered in Shanahan’s paper; it’s only a close cousin. The frame problem that Shanahan discusses is the one proposed by Jerry Fodor some years ago. It’s actually a bit more accurate to call it the “isotropy” problem, or the problem of relevance. Basically, the problem of relevance asks how, in a system where beliefs, desires, memories, images, etc. can all be potentially related to one another, one manages to infer anything in a reasonable amount of time. In a connectionist system, you’d still have all of these elements, with activation flowing from input layers to output layers and to other input layers and on and on… There are other ways in which the “frame problem” might be rendered otiose, if for example knowledge in the mind were arranged optimally according to some independence assumptions like those you find in Bayesian Networks. Unfortunately, this isn’t a silver bullet either, because humans are awfully good at relating disparate things through analogy and metaphor. The fact that everything is related to everything else seems to be almost inescapable.

    As far as your analysis of Baars’ GWT goes, I think you’re on the mark. What neuroscience seems to be telling us is that there are relatively few specialized areas of the brain, and those we find can often become co-opted for different functions with enough training. The brain seems to be very plastic, even into adulthood. The answer to your question about communication in the workspace seems to be that if all of cognition is the product of neural firing, then there isn’t a problem. I think the real problem is the profligate number of posited modules needed to accomplish even the simplest of tasks.

  10. Paul and Vicente: You have both commented that the GW model does not seem to fit in with the progress of evolution. I have a different view on this point.

    I would cite two places where I have seen the idea that the brain was built up by continually adding layers at the junction between sensory inputs and motor outputs. Nauta and Feirtag in their book “Fundamental Neuroanatomy” (1986) and Rodolfo Llinas in his “The Workings of the Brain” (1990) both state this principle in pretty much those words. I suspect the same idea could also be found in other places, such as Arbib, Damasio, or Edelman, just to name a few.

    It seems to me that the GW model exactly fits this scenario. Various perceptual modules would have evolved first, initially interconnected among themselves and connected directly to various motor output modules. Later, where more flexibility was required in the types of interconnections, a new layer would emerge allowing more general combinations of interconnections between inputs and outputs. Initially, such new “interconnect” layers might be specific to just a few selected sensory and motor modules. But as the number of such interconnect layers grew, it would seem to make sense that they would be combined into a smaller number of more general control layers, eventually leading to a single master control layer.

    What I missed in my earlier comment #3 was that the interconnect layers could essentially recode the messages. They need not rebroadcast the same messages as they received from the modules. The task is basically to make a decision based on which modules are transmitting and, as a result, send out activations to a newly selected set of modules, both sensory and motor. The messages need be no more than “Hey, I am seeing something interesting here.” and “You: get going!”. What is required is specific pathways between each module and the interconnect layer. This is just what all the white matter axons do. I had earlier thought that the beta and gamma wave oscillation patterns would need to encode some specific messages. But they do not. The task may be as simple as to synchronize the transmissions between modules and the workspace.

    Paul: Thanks for the comments on the frame problem(s). I believe Shanahan says he has solved both versions (or maybe that one of the versions is irrelevant).

    I believe that the issue of choosing which inputs are the most relevant to a given task is still not solved, although I agree with your earlier comments that attention must have a lot to do with how this works. My question would be whether the known abilities of connectionist-type systems to find similarities can somehow be applied to determine relevance.

    John: Again sorry that my comments seem to run on and on…

  11. If someone solved the relevance problem, it’d be news to me. Shanahan has been working on the so-called “event calculus” and its variants over the years. Much of his research has focused on solving the logical frame problem (the McCarthy/Hayes problem) and its related cousins: the qualifications and ramifications problems. Qualifications just has to do with the fact that there are infinitely many exceptions to simple action rules such as “when I turn the key in the ignition, the car will start.” If the car doesn’t start, people don’t normally check to see if there’s a potato wedged in the exhaust pipe, though it’s perfectly possible that this is the case. So qualifications have to do with the formulation of ceteris paribus principles, and solving the problem involves minimizing the consideration of exceptions. Ramifications deals with unanticipated side-effects. Still, none of these deals with the more general problem of everything (propositionally speaking, that is) being related to everything else.
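
    A crude way to picture “minimizing the consideration of exceptions” (a toy default rule of my own, nothing to do with the actual event calculus):

```python
# A crude sketch of a ceteris paribus rule with a short exception list
# (my own toy example, not the event calculus).
def car_starts(observations, known_exceptions=("dead battery", "no fuel")):
    # consider only the short list of likely exceptions, not the endless
    # tail of possible ones (potato in the exhaust pipe, etc.)
    for exception in known_exceptions:
        if exception in observations:
            return False
    return True     # ceteris paribus: turning the key starts the car

print(car_starts({"cold morning"}))                  # True
print(car_starts({"dead battery", "cold morning"}))  # False
```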

    Certainly similarity is a huge part of the story that none of us really understands. Neural networks compute similarity in some fairly standard ways, but most of these computations have to do with similarity among features in a feature vector. What’s more puzzling is how people can generalize from hitting a nail with a hammer to using a rock to drive a tent-spike into the ground when setting up a tent. The objects don’t share any obvious problem-specific features. This kind of “functional” similarity is really mysterious to me. My hunch is that we evolved specialized functions for something like naive physical reasoning, and most of what we see and do can be represented using physical metaphors.
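
    For contrast, the feature-vector kind of similarity is the easy part; a toy example of my own:

```python
import numpy as np

# Feature-vector similarity is the easy part (toy example of my own); the
# hammer and the rock overlap only modestly on surface features, yet we see
# at once that either will drive a tent-spike.
def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

#                 [has handle, made of metal, flat face, heavy, graspable]
hammer = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
rock   = np.array([0.0, 0.0, 0.2, 1.0, 1.0])

print(round(cosine(hammer, rock), 2))   # modest overlap on surface features
```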

    and finally, about GWT: I agree that the scenario you sketched out re: the building-up of layers between sensory and motor pieces in the brain is plausible; but there seems to be no evidence for hyperspecialized *modules* of the form that both Jerry Fodor and Baars discuss. Abstraction layers are both plausible and have already been by and large identified neuroscientifically.

  12. Does the GW model really need such a very large (even infinite?) number of modules? I think not. For example, Peter describes a workspace served by [modules which were not pre-defined, but arose randomly out of chance neural linkages]. I believe the situation is even better than that. If each module potentially consists of a hierarchy of connectionist-like networks (see, for example, Jeff Hawkins – on this site, Peter, help [Here – Peter]), it would seem reasonable that a module could restructure itself on the spot, keeping in mind that both bottom-up and top-down information is available from the perceptual inputs and from the last round of workspace notifications. So we effectively get a total number of logical modules very much larger than the actual number of available physical modules. I would argue that it is in this way that elephants can be detected (comments #3 and #6) and that the concept of elephants can be used in various scenarios without overly burdening the module inventory.

    And in any case, the range of tasks should not really be considered as infinite. That, I would argue, is a product of the formal-logic way of thinking. As an example, Peter cites the list "the number of walls plus one, plus two, …". But evidently every capable 8-month old has learned the concept "more than two" and can use that to form arbitrary number constructs (see Lakoff and Nunez, "Where Mathematics Comes From"). Admittedly, this was a simplified example. But I use it to highlight the ever-present temptation to apply formal methods to metaphoric problems.

    Paul: I will re-read the Shanahan sections on the frame problem.

  13. Hawkins’ book is “On Intelligence”. But his home site seems to have been taken down and the forum topics have been closed for some time. I do not know what’s going on there, but I still like some of his ideas.

  14. “That, I would argue, is a product of the formal-logic way of thinking. As an example, Peter cites the list “the number of walls plus one, plus two, …”. But evidently every capable 8-month old has learned the concept “more than two” and can use that to form arbitrary number constructs”

    Trust me, I’m no slave to logical formalization, but it’s equally as mistaken to assume the primacy of any particular (computational) form of representation. I think what I’m trying to say when I talk about “infinite” possibilities is that the human mind has evolved such that it affords us an imagination that allows us to disconnect from immediate environmental stimuli. It’s what makes us more than just stimulus-response pattern-matchers. I can sit here and think about flying pigs. I’ve never seen a pig fly, but I can certainly form a mental image. I can do this for arbitrary entities with arbitrary sets of features. We *do* have access to an infinitude of thoughts — just as we can produce a never-ending variety of different sentences on paper that are grammatical. For some good philosophy along these lines, check out the early debates between Fodor/Pylyshyn vs. Smolensky. See these refs:

    Fodor, J. A., & Pylyshyn, Z. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28: 3-71.

    Fodor, J. A., & McLaughlin, B. P. (1990). Connectionism and the problem of systematicity: Why Smolensky’s solution doesn’t work. Cognition, 35: 183-204.

    and

    Smolensky, P. (1987). The constituent structure of connectionist mental states: A reply to Fodor and Pylyshyn. Southern Journal of Philosophy, 26: 137-163.

    Van Gelder, T. (1990). Compositionality: A connectionist variation on a Classical theme. Cognitive Science, 14.

    for connectionist replies.

  15. Peter: The Hawkins reference was in a piece called “On the Future of Computing”. I do not know if it had a page number.

    Paul: What is similarity? Good question. Yes, ANNs deal with features in a feature vector. But what kinds of things can be cast that way? For example, we know that all of the varieties of “t” that can follow “n” or a vowel in English can be cast as features in a vector. But what are the limits?

    I would submit that whatever the answer to that question is, it can be augmented in a major way by the ability to include a memory access cycle in the process. For example, if my engine will not start, I can recall at least some of the occasions when this happened in the past. If those events include a potato in the tail-pipe, then there is a good chance I will look there before I give up. I can use a similar strategy to find that a rock can be used in place of a hammer. I see these both as cases of, first, being able to cast the task in terms amenable to feature search and memory search and, second, being able to use the memory results in the same ways.

  16. I think it must be fundamentally important that we find no neurobiological disease in which a person is conscious without their built-in solution to the frame problem (or whatever more appropriately named variation of this problem is actually being discussed here). I think this is strong evidence for the assertion that framing the problem in terms of “what the thing is that sorts out relevant information and what algorithm does it use” is inaccurate. The solution is inherent, at a very fundamental level, in human neurobiological structure. I suppose the nearest to a person without their built-in solution would be someone with ADD. However, this is not really a case of not being able to figure out what is relevant. It is more a case of not being able to “lock on” to that relevant thing; despite knowing what it is, one is unable to stop devoting cognitive resources to less important aspects of the current state of affairs.

    My favorite example of solving the problem, which I think points us in the direction of an explanation for our solution, is our linguistic ability. We don’t say things by going through a list of all possible words and picking out the one we want. We don’t even say things by going through a list of all grammatically appropriate words – because we often goof up and choose an incorrect form of whatever word we settle on. Out of tens of thousands of possible words, the right ones just find their way out of our mouths. It would be ridiculous to suggest that subconsciously we are sorting through a long list, because then we are left with the problem of ruling out incorrect words by considering their meaning, a task which quickly becomes overwhelming. It *feels* to me like the correct word is found with some sort of pattern recognition technique. I don’t know what else to do with the tip-of-the-tongue phenomenon, where we can find the first letter, or the first sound, or “sounds like/rhymes with”, but not actually know the word.

    Also, what can we tell from developmental psychology? It seems like infants start solving the frame problem by only paying attention to the most intense stimuli: loud noises, bright lights, fast movements, etc. And there is a definite spectrum of improvement between that time and later in life. We don’t just solve the frame problem one day; something happens over time and eventually we can find the relevant things. Children are more easily distracted by irrelevant features of their environments than adults. They’re not as good at solving the problem.

    I think that all of these factors point to some essential feature of pattern recognition in our neural networks. I think that the ability to identify what is relevant is learned, through reinforcement. Starting with intense stimuli as relevant, the more a recognizable quality is relevant, the more attention is paid to it. I don’t think we start by filtering out everything. I think we start by building up what sorts of things to pay attention to, based on our appraisal of their past relevance. We look for features in a piece of information which have been relevant in the past, and if we find such features we pay attention to that thing. Saying “we look for” isn’t even right; it’s more that we are wired to be more receptive to such features because the structures of our neural networks are based on past reception of similar information.
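
    A caricature of that learned receptivity, purely as a sketch of the idea (not a model of real neural learning):

```python
from collections import defaultdict

# Sketch of relevance learned by reinforcement: features that have mattered
# in the past get more weight, so they are more likely to capture attention.
attention_weight = defaultdict(lambda: 0.1)   # everything starts faint

def attend(features):
    return max(features, key=lambda f: attention_weight[f])

def reinforce(feature, reward, rate=0.5):
    attention_weight[feature] += rate * reward

# a child learns that "sharp" matters after a couple of painful episodes
for _ in range(2):
    reinforce("sharp", reward=1.0)

print(attend({"sharp", "shiny", "loud"}))     # "sharp" now wins attention
```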

    The sharpness of a thing is only relevant to you if you know about getting cut. We learn to pay attention to this feature in the same way that we learn that touching the stove is a bad idea. A lion tamer will pay attention to very different things in the presence of a lion than someone who has never seen one before. If I were to find myself on the bridge of a nuclear submarine, I would have no idea what to pay attention to.

    I think neural network learning is the basis for our solution to the frame problem.

    *All along I have been saying “solving the frame problem.” What I mean by this is the way that our brains actually go about solving this problem, not the solution to the question that philosophers are asking. When I say I’m solving the frame problem, I mean that I’m navigating the world by attending to what is relevant and ignoring what is not, not that I am proposing an answer to a philosophical problem.

  17. Paul: “when I turn the key in the ignition, the car will start.” If the car doesn’t start, people don’t normally check to see if there’s a potato wedged in the exhaust pipe, though it’s perfectly possible that this is the case. So qualifications have to do with the formulation of ceteris paribus principles, and solving the problem involves minimizing the consideration of exceptions…

    And does the consideration of exceptions and the formulation of ceteris paribus principles have to do with probability and previous experience?

    So the potato answer is perfectly possible, yes, “but very unlikely”, unless there is, for example, a nasty neighbour’s kid involved. So isn’t it our experience that creates the probability-estimation criteria that determine the selection of relevant perceptions?

    Could this be the reason why very young researchers (<25 years old) are responsible for so many great scientific and technological advances? Their lack of experience leaves them free to look for unlikely new solutions to problems, free to find “relevant” what experienced researchers couldn’t or wouldn’t.

    On the other hand, experience saves a lot of time, because you can consider lots of data irrelevant and discard it beforehand.

    So probably, in understanding this problem, we have to look for the balance between learning and the way learning conditions us.

  18. Paul: With my linguistics background, I have many times run across various Fodor books and have a few of them on the shelf. It’s about all I can stomach, though, to get through one of them — much harder than Searle. I have been thinking for some time I should try something by Pylyshyn, perhaps the book with Fodor. On the other hand, I’m a great fan of Smolensky.

    The anterior cingulate gyrus seems to be able to cook up any hypothetical scenario and propose it as a starting point for a hypothetical tour through imaginary places. Apparently, this capability evolved as a way to imagine arbitrary situations so you could figure out how to get out of trouble before getting into it. Once a scenario has been so proposed, it can be cycled through the GW loop just as if it were actually happening, except that motor output is (usually) inhibited. Great thing, this brain.

  19. Lloyd: “… The anterior cingulate gyrus seems to be able to cook up any hypothetical scenario and propose it as a starting point for a hypothetical tour through imaginary places. Apparently, this capability evolved as a way to imagine arbitrary situations so you could figure out how to get out of trouble before getting into it…”

    How true!! I’ve always thought that we have built-in mission-planning facilities, including scenario simulation (imagination). The thing is that for the caveman this was useful, but for modern man… if you abuse it, I see it more as a handicap, because you might arrive at the real situation with too much prejudice, which will make you slow to react. I believe modern scenarios are so complex that it is also very useful to be open-minded and ready to change plans on the spot.

    It is also dangerous because, as you said, motor output can be inhibited, but I am afraid that emotional output (fear, anger, …) is much more difficult to control. This is a great source of unnecessary psychological discomfort.

    So, getting back to the issue, probably this pre-imagination also helps the individual to focus on the relevant aspects of the situation when it comes.

  20. Vicente: There is a talk by John Allman of CalTech on some of the characteristics of this part of the brain, video at
    http://video.google.com/videoplay?docid=6417527326328810589&ei=li51S9qYFZeyrAPg6qmUBw&q=brain%2C+mind+and+consciousness+-+session+2&hl=en&client=firefox-a#docid=-4692065277230230087
    He does not specifically discuss the hypothetical scenario part, but does discuss anticipation and the fact that the emotional modules, such as parts of the amygdala, appear to be in full effect during these scenarios.

  21. It seems to me that the AI frame problem is not solvable because of the conjectural nature of human intuition. It is not really the level of modularity but rather the inability of formalized computing systems to deal with variability and randomness, as Paul suggested.
    Humans cannot solve a new frame problem heuristically either, but they have the ability to re-frame, or to change the contextual frame into something they recognize from learned experience, thus creating a new frame of relevance.
    People able to re-frame context make very powerful lawyers, because they can re-frame modular facts into a big picture that otherwise could not be recognized.

  22. I thought the artificial neural network approach has achieved a certain level of efficiency in pattern recognition, such as deciphering the hand-written address on an envelope. So, why is it a surprise that the real neural network in our brain is so good at arriving at a matching pattern pre-existing in our memory, and thus providing a starting point for a response?

    Isn’t it true that the frame problem exists only because we start from an algorithm-driven approach to problem solving?

    Then, a wild guess: consider factoring a large number into its two prime factors using the traditional algorithm-driven way, as in public-key encryption. For a sufficiently large number, the process takes a long time. But a quantum computer can do it quickly through interference/coherence. If the neural network approach isn’t the right explanation for why our brain is so good at zeroing in on the relevant issue, couldn’t quantum effects play some role here? Just a wild guess.

  23. Kar Lee: Correct me if I am wrong, but a Neural Network is just a mathematical function that depends on a set of parameters that can be tuned using a training cycle (like least-squares fitting) based on a historical database, so we can check against known correct outputs, reinforcing certain parameter values. I believe it was called a “Neural” Network because it looks a bit like nervous tissue, but in real terms any further resemblance is difficult to defend.
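
    In miniature, what I mean looks something like this (just my own minimal sketch: one artificial neuron fitted to a known truth table by gradient descent):

```python
import numpy as np

# Minimal sketch of the description above: a "network" is just a parameterised
# function tuned against historical examples (here, one sigmoid neuron fitted
# to the logical AND function by gradient descent).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w, b = rng.normal(size=2), 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):                      # the "training cycle"
    pred = sigmoid(X @ w + b)
    error = pred - y                       # compare against the known outputs
    w -= 0.5 * X.T @ error / len(X)        # nudge parameters to reduce error
    b -= 0.5 * error.mean()

print(np.round(sigmoid(X @ w + b)))        # approximately [0, 0, 0, 1]
```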

    Yes, I like the wild-guess policy. I was thinking how to combine this post and the one about Integrated Information Theory (the Phi function). I also guess that this process for integrating information is well designed to discard irrelevant chunks of the picture. But how?

    Peter: you were reminding me of the CEMI (conscious electromagnetic information) theories. From a physics standpoint they are so weak that they cannot really be considered, although it is interesting to have the idea in mind; the JCS papers you referred me to would never get through the editorial board of a high-impact physics journal. CEMI is something that sooner or later had to be raised. I agree with Kar Lee that the quantum approach is much more promising; Orch OR models could be part of a solution, and R. Penrose’s signature is always a stamp of quality.

    So, as a wild guess, maybe making a sauce with GW + IIT + quantum Orch OR could be part of a rich recipe.

  24. Doru and Kar Lee: I am certainly in agreement that the nature of classification/pattern-recognition is at the heart of the matter, but I am slightly more “liberal” than Paul in my enthusiasm for the potentialities of the method. However, Kar Lee, I absolutely do NOT believe that quantum effects play a role in the matter. In this, I agree with Michael Shermer, who notes that there are 3 or 4 orders of magnitude difference in the size of physical things which are affected by quantum vs. neural activities. Yes, I agree that ANNs are “purely algorithmic”, but I would claim that any limits posed by that definition do not get in the way of using such algorithms to make a brain work. Dan Dennett gave a wonderful talk, I think it was in Seattle, where he dramatically demonstrated the broad compass of algorithms. I don’t see that as a limitation.

    Vicente: As for “wild guesses”, it is my belief that a certain amount of noise probably does play a role at some point in how creativity works, but I believe that role will turn out to be very minimal, maybe even to the vanishing point. For the most part, creativity seems to be based on rational/emotional choices made from material in the remembered past. Arthur Koestler would probably say it a bit more elaborately than that.

  25. Lloyd, et al.
    Neural networks are universal function approximators, and are just Turing machines at bottom, just like an inference engine over logical statements. What I’ve been trying to say is that there’s nothing particularly “magical” about them, unless you move into the realm of analog chaotic networks, where I think you find qualitatively different computational properties.

    As for Shermer and his merry band of skeptics … he’s the worst sort. He’ll advocate any interpretation of the data that suits his needs. Quantum effects have been demonstrated at the macro level across multiple applications and in warm-ish environs. See Hameroff’s website, but if you think he’s too biased, just do a Google search on entanglement in large-scale systems. One of my favorite Shermer moments was the editorial he wrote in Nature that completely misrepresented (by 180 degrees) the conclusions of a study by Pim van Lommel on Near Death Experience (which suggested it might be possible for consciousness to exist outside the brain), with van Lommel then having to come in and issue a scathing correction. In any case, I’m not sure what so-called skeptics are so scared of that they need to have their own society. If you have any trust in the scientific method, these things take care of themselves.

  26. Paul and Doru: It’s not that I reject quantum theory. It certainly seems that confirmation of Fernentwicklung has indeed been achieved. (Or even, if you prefer, Spuckhaft Fernentwicklung). But the experiments I am aware of involved huge apparatus (OK, could be reduced), and VERY cold temperatures (not easy to get around). And the conditions are generally tricky, but that never stopped a determined researcher.

    Paul: As for the effectiveness of ANNs, the questions are not so much what the architecture can (in theory) do, but rather, how to apply the technology effectively in various yet-to-be-explored situations. The limitations are at best speculative.

  27. Interesting that I got some support for the quantum guess but none on the neural network side, which I think is far more plausible.

    I agree with Paul that an artificial neural net is nothing magical. However, after conditioning (training, just like a human going through a crash course of some sort), given an input pattern, the neural net immediately falls into a local minimum and calls out the closest output. To a certain degree, I think humans solve problems in this fashion. One particular situation reminds you of a certain experience you have had, and that in turn reminds you of something else, and so on and so forth, and you pick out all those relevant factors. Many times one sees the solution before realizing how he (or she) solved it. At the end, you have to carefully trace back your steps and reconstruct a logical sequence of steps to arrive at a formal, presentable solution. The insight thing definitely bears a lot of resemblance to how a neural net works. And the step-by-step solution one reconstructs is like the algorithmic approach, which constitutes the formal solution.

    Even though a neural net is not all that magical, the way it works is very different from how an “if… then… or else…” type of algorithm-driven approach works. If the frame problem arises because of the latter approach, I have hope that a different approach like a neural net can avoid the problem altogether. Why not?

  28. I meant to say “spukhaft Fernwirkung”. Sorry, my German is a little rusty. And I was not saying that I thought entanglement depended on cryogenic temperatures. It’s just that, as I understand it, current work has been done at cold temps. Obviously, these conditions do not hold in the brain.

    But aside from all that, my argument about quantum effects in the brain is based entirely on my experience with electronics as chip design has edged ever closer to quantum realities over the last few decades. My point is that the engineers are well aware of the nature of the forces and effects they are dealing with and can predict with reasonable accuracy how various chip structures will behave.

    I can’t claim that we know everything that goes on inside a living cell, but from what I read, I see no reason to believe that there are things going on significantly more mysterious than the stuff that happens on the surface of a chip.

    As for ANNs etc., the traditional Adaline/Madaline/McClelland/Rumelhart back-propagation network is not a particularly powerful classifier. Much of the newer work with high-dimensionality feature warping, support vector machines, and a few other technologies suggests more general methods of pattern classification. My view is that there is still considerable ground to be covered before anyone can say that the field has been explored.

  29. Kar, Lloyd, et al: I had no intention of turning this into a discussion on the merits and demerits of connectionist representations. I’m generally of the mind that they are indispensable to building human-level AI. On the other hand, I think rich logical representations are equally as indispensable. I spent many years trying to build large-scale neural systems that could reproduce moderately complex proofs that students in my intro to logic courses were able to solve, and had little success in doing so. My objection to a pure network-driven solution is an engineering/computer-science objection, not an in-principle dismissal. As an engineer, it makes more sense to me to start with the richest representation possible (e.g. some form of truth-bearing propositional representation) and augment it such that it’s capable of making probabilistic and other forms of uncertain inference. In fact, much of what’s hot in AI these days adopts this same strategy. Machine learning people are flocking in droves of hundreds to the idea of probabilistic relational inference (keyword *relational*). It’s a young area, but one that I think holds much promise.

    Lloyd: from a philosophical perspective, the question about quantum computation and mind is ultimately about final causes rather than efficient ones. When we ask “what is the nature of mind,” we are asking about final causes, even undetectable quantum ones, even if they may somehow be predictable by classical theory in the large — much like your microchip examples. The problem is, nobody has produced a satisfactory account of consciousness that doesn’t involve some suspicious non-reductive notion or another. I personally (like many or most philosophers of mind) do not believe there is a satisfactory (purely) reductive account. This may have to do with an incomplete notion of physics, or it may have to do with an unsatisfactory set of bridging laws between cognitive psychology and neuroscience… it’s hard to say. Issues with qualia lead me to believe it’s the former, which makes the quantum consciousness idea somewhat plausible in my mind.

  30. I hesitate to flesh out the following argument to “demonstrate” the impossibility of a non-quantum brain, because it is not air-tight. But anyway, let’s see where it leads us.

    If quantum coherency does not play a role in consciousness, then consider the following two scenarios (similar to many Star Trek thought experiments, but with a new twist):

    Scenario 1,
    You are deconstructed, atom by atom, and the atoms are carried to a different hospital, where you are re-constructed, much like with the Transporter in Star Trek. After you have been reconstructed, you decide that you want to look at the body blueprint from which you were re-constructed. For some reason, you end up reconstructing a clone using brand-new material and the blueprint. Now you have a clone. You have the original body materials; your clone has brand-new but otherwise identical materials in his body. You and your clone are structurally identical.

    Scenario 2,
    You are undergoing an atom-replacement procedure in the same hospital, in which your body’s atoms are taken out and replaced by brand-new atoms, one atom at a time, in a continuous fashion. During the entire process, you are fully awake and you can feel your continuous stream of consciousness by introspection. Since we can surgically replace our organs with functionally identical replacement organs without changing our consciousness or identity, replacing one atom in our body with an identical atom, which is functionally identical to the one it is replacing, should not cause any change. So, when this atom-replacement procedure is 100% complete, you have a new body, but it is you nonetheless, as your continuous stream of conscious introspection can attest. After the procedure, you decide you don’t want to waste the old materials from your original body, which are just now piling up somewhere in a holding area. So, you decide to reconstruct it using the body blueprint. So, you give yourself a clone. Now you have a new body; your clone has the old body materials.

    Same initial state (you going in), and same final state (you and your clone coming out), but you and your clone have just switched positions depending on how the cloning procedure is carried out. We therefore arrive at an identity crisis. Logically this does not make sense.

    The thought experiment relies on the materialistic assumption that your body is like an atomic Lego structure, therefore, can be duplicated at will, which leads to this identity crisis.

    However, if quantum coherency is a factor, then our bodies cannot be duplicated, because of the no-cloning theorem (if a quantum system could be duplicated, the uncertainty principle would be violated; therefore, cloning a quantum system necessarily destroys the original one), and the identity crisis is avoided.

    Therefore, I will argue that quantum coherency is required for consciousness. Otherwise we will have an identity crisis.

  31. Rethinking some of the items posted here recently, I have been reading some of Shanahan’s papers on logical languages for AI, how to represent reality, etc. I have not been very much interested in logical proofs, logical languages, philosophical “proofs”, or several other somewhat related fields. So how do I square this with an interest in understanding and programming AI systems? Well, I just proceed as a programmer. My analogy is that while I may not completely understand the parser, I have no hesitation in writing code in the language. It’s not a very good analogy, but perhaps it makes my point, which is that I do not really want to argue the fine points of quantum mechanics, ANNs, or logic for AI. Nevertheless, I will continue to be interested in how consciousness comes about and whether we might someday be able to produce it or its effects in a man-made device.

  32. Kar Lee: What’s wrong with an identity crisis? I see no theoretical problem in having two “me”s. If my body/brain got identically reproduced, there would be two of me and each would think it was “me”, just as I now think I am “me”. Lots of practical problems, but this does not logically bother me.

  33. Paul: I have no idea what the difference is between final causes and efficient causes. I have said that I believe that a properly constructed device will experience consciousness as it experiences itself in the world. But I have also said that I have no idea how it really works that I “experience”. Perhaps these views are irreconcilable. I don’t know about that.

  34. Paul: As I recall, one of the Greeks said there were four kinds of causes. But I suspect that generations of philosophers have since redefined whatever it was that Greek said. And I could not begin to follow their thoughts through the ages. So I just let it go. Sorry if that seems cold.

  35. OK. If I understand what you’re saying, Paul, it would be something like this: Yes, quantum effects are certainly under there somewhere, but I may be safe to ignore them, depending on what questions I want to ask.

  36. I think that’s right Lloyd. But if one wants to ask the fundamental kind of question, like “what is mind?” then we have to be as reductive as possible.

  37. And by that, I assume you mean that we have to ask questions at the lowest possible level of explanation. I’m not so sure about that. After all, I can “completely” understand how a complicated mechanical object works without having to understand the orbitals in all the molecules which make up that object. So why would “mind” be any different?

    Granted that in a sense, I have not “completely” understood the object. But my claim is that I can segregate various aspects of the various realities underlying the object and that I do not really need to understand all of those aspects in order to have an adequate understanding of all I need to know about the object in order to do whatever it was I wanted to do with that object.

  38. Lloyd, I think you are right in saying that you can understand a complicated mechanical object without knowing its subatomic structure, because that structure is irrelevant. However, you cannot understand superconductivity without understanding the underlying quantum coherency of the Cooper pairs. So it really depends on what is being understood. I suspect quantum effects are important for consciousness because a Lego block structure mechanical view leads nowhere, or worse, to an identity crisis.

    Your comfort with, and ability to imagine, two selves (taking the third-person view; I don’t think you can take the first-person view in this case) is exactly why I hesitated to flesh out the argument: it only works for certain groups of people.

  39. Lloyd,
    Of course, you’re right about being able to have an incomplete understanding of an object. The issue with the mind is really regarding the need for an account of subjectivity (and willful actions/omissions) that makes any sense under the causal closure of the physical and our current set of postulates for physics. If one can’t be had under classical assumptions, then perhaps quantum cognition is worth taking into consideration.

  40. It is my view that we have not yet reached the impasse in studying the mind where we need to jump to a lower (more fundamental) level of explanation. It would certainly not be prudent to start at the lowest, most detailed level of explanation, unless one sees clear evidence that the higher level explanations are not adequate. Truly, the brain is a most challenging reverse engineering project because, unfortunately, the designer did not publish the construction notes. But that itself does not mean that it is too complex for, as Kar Lee says, “the Lego block structure mechanical view”.

    Kar Lee: As for two selves, each would appear to the other as just another person, with the unique situation that the other person shares my memories up to the moment of the copy process. From that moment on, the experiences are unique to each individual. However, I do understand that our current moral/social climate does not have a place for multiple examples of a single ego.

  41. Hi Kar Lee, regarding #32 just a couple of comments:

    – It might not be a good idea to base your reasoning on experiments that probably will never be done.

    – In terms of particles and atoms this expression,

    “…replacing one atom in our body with an identical atom, which is functionally identical to the one it is replacing…”

    makes no sense.

    It is like the experiment of replacing neurons, one by one, with functionally identical chips (here “functionally identical” does make sense). The day that can be done, we can talk about it; meanwhile it fits well in an Asimov story.

    Lloyd: “…I can “completely” understand how a complicated mechanical object works without having to understand the orbitals in all the molecules which make up that object…”

    What do you mean by “completely”? Maybe you can understand the logic of the mechanism, but not all of its properties: for example, how much stress it can take, or how the different parts are worn down by friction. For that you need to know the tribological properties, and for that you need to know the molecular structure. So I agree with Paul, it depends on the depth of knowledge about the system that you want.

    In programming terms, as you were mentioning, knowing how the parser works, or how instructions are handled by the processor, is sometimes very useful for grasping the meaning of high-level language commands.

    Regarding the two identical selves issue: if it could be done, the identity would only hold for a very short time after duplication. Then each individual would evolve in a different way; even their pasts would become different, because their memories would deteriorate and get modified in different ways. Of course both would look quite alike for some time. Think of twins.

  42. Vicente: As I said in the second paragraph of #39, the “understanding” gained was to be sufficient for “whatever it was that I wanted to do with the object”. So if I needed to explore friction or Rockwell hardness or something else, then I might indeed need to dig deeper.

    But my basic claim is that it would not make sense to start at the most complex level. To learn how a JK flip-flop works, I want the logic diagram, not tables of transistor conduction currents. It may turn out, for example, that I do not really need to learn how neurons fire in spike trains in order to understand how waves of attractor basins wash across large ensembles of neurons.

    The real problem here is that you do not really know how deeply you need to dig until you already have a good understanding of the subject matter. The big advantage of being human is that we have language and can share the load. I can read your summary of how your part of the thing works. I acknowledge a deep desire to know everything, but, unfortunately, that will not happen in my lifetime. So I am happy to make use of your results. Please carry on.

  43. Lloyd, I’d like to follow up on your comment about having multiple instances of a single “ego”. I’d like to invite you to imagine this in the most personal way possible, so that you won’t jump back to the third-person viewpoint but stay with the first-person viewpoint, in a most up-close and personal way. For the purpose of this argument, imagine you are a person who is about to be executed and is fighting for his life. Say you are the person who is going to go through this atom-replacement procedure, and you know that upon completion, one of the two, you or your clone, will be executed. However, before you enter the procedure, you have a choice of which one will be executed. What would your choice be? The answer to this question (think of it in a first-person, personal way) will reveal your underlying thinking about this problem. The same goes for the other thought experiment.

    If you can still claim that it does not matter, because one will survive and one will not, then I will liken that to someone who stays calm before his execution and tells the world that it does not matter, because someone will be executed momentarily, as if he were talking about someone else. That is a great ability which I certainly don’t have; I remain trapped in a first-person viewpoint on my personal mortality, and I sweat about it.

  44. If I really believe all your premises, that is, if I truly trust the executioners to do exactly as promised, then I would have to say that it is really a random choice, 50/50 for A or B. Because your original specification was that the “copy” would be identical to the original. Therefore, it was not really a “copy” at all, but a second instance of the original. So I see no grounds upon which to choose one or the other. And I say this from a true first-person position: that is, “I” am both people who come out of the copy process. Just before the execution, both believe “I” have a 50/50 chance of dying. As I say this, I imagine my “ego” (in the Metzinger sense) as flipping back and forth between the two, something like a Necker cube changing orientation.

    Do you find this answer to be satisfactory?

    Actually, your scenario did leave one gaping hole. It was not clearly stated whether the duplication machine would produce the “copy” in such a way that one or the other of the resulting “me”s would in fact have reason to think “I am the copy” vs “I am the original”. For example, one of the outputs might be standing on a platform in the same position as “I” went in, while the other output was in a different position. That makes the task a bit harder but does not change the outcome. “My” view as the “copy” would still be that “I” went in and “I” came out the same as I went in.

    However, I should also note that I am getting on in years and have already given considerable thought to issues surrounding my death. I have completed a will, have discussed many related issues with friends and family, and sincerely hope I have another 40 years to go.

    That was an interesting exercise. Thank you.

  45. Perhaps I was not clear in the last sentence of paragraph 1. I was flipping back and forth only as a way to sample both egos, not that one disappeared while “I” was the other.

  46. Another thought on whether the machine discriminates:

    If one output is in the same position as I went in and the other is in a different position, then as I came out in the changed position, I would experience a moment of disorientation. Aside from the thought that “I am somehow inferior” (I have accepted your promise that that is not an issue), the sudden jump in orientation could cause a bit of vertigo or some other such reaction. That would put me at a disadvantage.

  47. Lloyd, very interesting answer indeed. I can grasp your Necker cube description. But I think it is physically impossible to have that kind of perception. It is almost like two bodies being occupied by one conscious being, through which the two bodies can have some sort of communication (think about it…), while in fact there is no physical link between them.

    On a different front, just to respond to Vicente: I think that in philosophy, thought experiments like this are invaluable, regardless of whether they are actually doable. “Mary the color scientist”, “Twin Earth”, zombies, and so on each provide a way to look deeper into the issue. They challenge our fundamental concepts. I enjoy these kinds of thought experiments a lot.

  48. Kar Lee: No. I said that badly. What I meant was that I as a reporter was flipping back and forth, trying to present views of both individuals. I do agree that each of the copies is a separate person.

  49. Lloyd, so which copy would you rather have stay alive, so as to save yourself from execution? (Assuming the machine is non-discriminatory; and answer that even before going into the machine, not after.)

  50. Kar Lee: I repeat — it does not matter. Both wish for life. It is not a matter of saving “me” as I am now, before the copy. After the copy, each would take the same position.

    I gather this is a difficult position for you to take.

  51. Kar Lee: I was thinking some more about what happens if one or the other of the people stepping out of the machine experiences the transition differently. If there is any sense that one of the copies might be of lesser quality, even if such a possibility has been vehemently denied, I believe it could lead to serious psychological consequences for that copy.

    Option 1: Have the subject asleep during the process and each copy wakes up in the same position.
    Option 2: Each person steps out of the machine alone, in the same posture as the original went in, separate from the other, and goes off to a separate waiting room.

    Good luck on getting your machine to work.

  52. Lloyd, “…I repeat — it does not matter. Both wish for life…” I think we have to agree that at this point, we have reached an impasse. I respect your opinion. On another point, I believe the machine won’t work because of quantum coherency. I understand that you disagree on the need for quantum coherency as well. For that we will have to wait for more progress in brain science to tell.

  53. Kar Lee: When it comes to copying (transporting) complicated stuff like living tissue, where the detailed structure is essential, I would probably agree that you have to get the details right at a quantum level. But I think that does not imply that you need to invoke that level in order to understand the operation of the brain.

    Are you saying that “you” are, for some reason, not copyable? Or that the copy would not be the same as “you”, even though you promised me it would be the same? Maybe you can’t imagine two identical “you”s, both knowing they are “you”. Each of those alternatives seems to imply that there is a component of your consciousness that is not in your brain. Are you not a materialist?
    Is that question too personal for this blog? Are we off the topic?

  54. Lloyd, the best way to describe my position is that I am a skeptic. At this point, I think I am more aligned with an extended materialist view, i.e. I will allow for the existence of things beyond what is currently materially detectable. And the definition of detectability may warrant a separate discussion in its own right.

  55. Detectability! Hmmm… with this point you could connect back to the introspection blog.

    To detect actually means to establish a cause-effect relation between detector and detected. The trouble comes when you observe an effect with no clear cause; see for example the dark matter and dark energy hypotheses, where “detectability” plays a nice role.

    I wonder how many effects there are in the brain with no clear cause (or a cause that cannot be detected). This line leads (in my opinion) to the core of the free-will problem: unless a cause-effect chain can be established for every decision, you have a blank zone…

    Lloyd, could it be a component of consciousness that is not in the brain, still related or connected to the brain?

  56. I believe that our brains are what we are and what we are is all in our brains. I do not believe there is any part of my consciousness outside of my brain. Of course, I leave traces all over the place as I inhabit this world and I sense those traces continually and as a result, such things alter my brain and who I am. But that alteration occurs only via the known perceptual channels (allowing for things that may not yet have been discovered or cataloged, such as pheromone smells).

  57. I can think of numerous ways in which we might approach the frame problem, and I think a working system would use more than one method. I will just give one example, now, as it is the easiest one to describe.

    If we have an AI system, then it should be able to learn to make outputs that give good results – that is, to learn to make the right outputs to manipulate the outside world in such a way as to do whatever it is supposed to be doing.

    Now, suppose we regard the AI system itself as being part of the outside world – something that the AI system is going to manipulate using its outputs. We add special outputs to the AI system that actually manipulate the AI system itself – for example, focusing its modeling on particular aspects of the world. The AI system can then “choose” to “think” about certain things by issuing the appropriate outputs, which are then used to act on itself. It does not need any special system to do this, because it is supposed to be able to work out what outputs are best anyway. It does not even need to “know” that these outputs affect itself – it should be able to work out, from experience, how they affect its situation. A robot might learn, for example, that sending the correct outputs to stand on a chair and see over a wall gives a better view, and also that sending the correct outputs to manipulate itself gives it better information in other ways – by focusing its thinking properly.

    If anyone has trouble imagining this, try thinking of a type of robot which has switches on its head that “tune” its brain’s modeling systems. One of these robots is trained as a robot engineer and it can walk up to another robot and use its hands to adjust the switches on the other robot’s head – to adjust its brain – depending on what the other robot is trying to do. There is no reason, in principle, why a machine should not be able to do this, but if it can, there is no reason why it cannot do it to itself just as easily. Then we dispense with the idea of using hands and switches, and just wire special outputs directly to the system itself, so they feed back in as “tuning” controls. In fact, they don’t even need to get anywhere near the outside of the machine.

    Another feature of this is that it is self-improving to some extent. If the system does a bit of tuning that works, it should be more focused and thinking better, which makes it better at tasks, better at tuning itself, and so better still at thinking, and so on.

    This also raises the issue of what is really going on when you “decide to think about something”. Maybe, to some extent, it involves something like this? Maybe “choosing to think about France” is very similar to “choosing to kick a ball” – except that the outputs are reflected back – and it only seems different because we don’t see a physical action in the outside world?
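    To make this a bit more concrete, here is a minimal sketch in Python of one way the “reflexive outputs” idea could be wired up. This is only an illustration, not something taken from the comment above: the class, the action names and the tabular learning rule are invented stand-ins. The only point it is meant to show is that internal “focus” actions sit in the same action set as external ones and are trained by exactly the same update.

```python
# Minimal sketch (illustrative only): internal actions that retune the agent's
# own focus live alongside external actions and are learned by the same rule.

import random
from collections import defaultdict

EXTERNAL_ACTIONS = ["push_trolley", "lift_package", "stand_on_chair"]
INTERNAL_ACTIONS = ["focus_on_bomb", "focus_on_trolley", "broaden_focus"]
ACTIONS = EXTERNAL_ACTIONS + INTERNAL_ACTIONS


class FocusAgent:
    """Tabular learner whose own 'focus' setting is just more world state."""

    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)      # (state, action) -> estimated value
        self.focus = "broaden_focus"     # the part of itself it can act on
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def state(self, observation):
        # The agent's focus is folded into the state it learns about,
        # exactly as if it were any other feature of the outside world.
        return (observation, self.focus)

    def choose(self, state):
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def apply(self, action):
        # Internal actions are wired straight back into the agent itself;
        # external actions would go to motors or other actuators here.
        if action in INTERNAL_ACTIONS:
            self.focus = action

    def learn(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

    Nothing in the update rule knows which outputs are “thoughts” and which are limb movements; if focusing on the bomb has tended to precede reward, that output gets chosen in just the way kicking a ball would.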

  58. The Frame problem is only an objection to reasoning systems, which may well trouble Murray but doesn’t necessarily demand an explanation outside of such (to my mind anachronistic) AI approaches. It seems to me as if it has something of the character of Zeno’s Paradoxes: these are a problem in terms of any linear, linguistic description of time, but time itself simply isn’t bothered by them – events still take place, hares beat tortoises and all is well with the world. In other words, Zeno’s “problems” aren’t a problem for time at all, just a problem with the way we tend to think about it. The same must be true of the Frame problem – we don’t need a mechanism for solving it; we need to think about thinking in a different way, and then the problem will go away. So it shows us that logical propositions are not the currency of the brain (surprise, surprise). Who cares whether global workspace theory can be made to help? It’s the wrong level of description.

    If we take Dan’s bomb-on-a-trolley example and imagine ourselves solving it, it seems to me that we don’t do any reasoning at all. We simulate. First we perceive the situation as it is now, and simulate what will happen if we do nothing. Ouch! Then we look at the affordances of the objects – there aren’t that many and we’re already programmed with the urge to carry them out (who can see a lollipop without wanting to lick it or read a word without feeling the phonemes trying to form on our lips?). Starting with the objects deemed most salient from our simulation – e.g. the bomb – we play around with things we can do to them and simulate what the consequences will be. It’s true that we could, in principle, do a huge variety of things – lick the trolley, speak angrily to the bomb, etc. – but these are pretty inventive possibilities and are actually quite hard to dream up. On the whole, from our past experience we know that small packages can be lifted, trolleys can be pushed, and a few other possibilities may suggest themselves, so we try these first. We imagine ourselves pushing the trolley from the room and see that the bomb still explodes. We imagine a few other scenarios and then fix on one that works, refining away any snags.

    It seems to me that simulation is not a combinatorial problem. Trolleys don’t ordinarily turn into petunias in my experience. Dealing with the physical world is not arbitrary symbol manipulation either, so it is vastly more constrained. Each decision is taken within a pre-existing or computed context, which constrains the possibilities further. So where’s the problem?
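    For what it’s worth, that simulate-and-refine loop can be caricatured in a few lines of Python. This is an invented framing rather than anything proposed above: the affordance lists, the one-step “physics” and the goal (save the trolley) are all made up for illustration.

```python
# Toy "don't reason, simulate" loop: candidate plans come only from what the
# objects afford; each is judged by running it forward, not by enumerating
# its logical consequences.

from itertools import product

AFFORDANCES = {
    "bomb":    ["leave_bomb", "lift_bomb_off"],
    "trolley": ["push_trolley_out", "leave_trolley"],
}

def simulate(plan):
    """One-step toy physics: the bomb always goes off, the question is where."""
    bomb_still_on_trolley = "lift_bomb_off" not in plan
    trolley_outside = "push_trolley_out" in plan
    # The trolley survives only if it is outside and the bomb is not with it.
    return trolley_outside and not bomb_still_on_trolley

# A handful of affordance combinations, not every logically possible action.
candidate_plans = [list(p) for p in product(*AFFORDANCES.values())]

for plan in candidate_plans:
    if simulate(plan):
        print("adopt plan:", plan)
        break
    print("simulation shows the trolley is still lost:", plan)
```

    The search space stays tiny because the objects’ affordances generate only a few candidate plans, and each candidate is checked by imagining its outcome rather than by deduction.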

  59. I agree with Steve that the issue of licking walls, etc. should not concern us unduly in planning. Steve seems to be saying that this view comes from some kind of expectation that the planning process works in a simplistic way – like a chess algorithm’s tree search, in which the further we look ahead the broader the tree gets and we have all these branches that need pruning, with the problem that if we prune branches too early we might miss trying some sequence of actions that is useful.

    Steve seems to think that this problem is a non-issue, and I would agree with this. Planning in humans is not like a simple chess algorithm. Steve seems to be saying that it uses domain specific knowledge. I would go a bit further, actually: I think that the entire paradigm that the brain “plans” using any separate, well-defined planning process is wrong. The brain is not just better at this than a chess algorithm: A chess algorithm is completely the wrong paradigm.

    I think that practically everything is done by modeling. The brain looks at its history of inputs/outputs and constructs a model of the future development of the world. Such a model does not distinguish between the brain and the “outside world” – both are part of “the world” – the system causing the inputs/outputs to happen. The brain, in modeling “the world”, is modeling its own behavior, just as it models trees and cats. The model does not even explicitly distinguish (in any architectural sense) between the brain and anything else. As far as the brain is concerned, it has had a lot of inputs and outputs in the past, and it is trying to predict future inputs/outputs. The predictions of its future outputs can be used to actually drive those outputs. A history of competent past behavior can be used to generate current competent behavior, using just modeling. (Some people may notice a slightly tautological aspect to this – where does a history of competent behavior come from in the first place? However, this is very easily dealt with.)

    So, I agree that this tree of options, with regard to planning, is a non-issue.

    That does not solve the whole problem, though. Even if we do not have this problem in planning, we still have it in modeling. We have to decide what features of the world should be modeled. For example, it never occurred to me today that Toblerone would be of any relevance at all in working out what was going to happen. I don’t think this problem is as bad as some people might think, however. We don’t just have some “model everything” system. We clearly have systems that model selectively. When I suggested the “reflexive outputs”, earlier in this discussion, I was suggesting it in this context of modeling, really. We could do other things. A system’s model of the world is going to be a web of inferences, probabilities, facts, etc. – depending on how it is built. What is going to happen is that this web is not going to cover all of reality with the same detail.

    There is going to be a basic, “skeletal” model – one in which a given piece of information only gives rise to a small number of “local”, closely related pieces of information. For example, for a traditional, inferential system, the statement “Socrates is a man” would not be connected to many other statements within a small number of logical steps. The system will use various processes to increase and decrease the density of this web where appropriate: where it seems appropriate, the model will be made denser, to represent some interesting aspect of the world in greater detail; where it seems less appropriate, the density will be reduced. One way of doing this is to look at what we are trying to predict and look at those parts of the web which seem to be having a substantial effect on the predictions. Those should be parts where we want more modeling density. We might do this by experimentally reducing or increasing modeling density. For example, we could experiment by reducing the density with which the Swiss chocolate industry is represented in our model. If this causes the uncertainty in the predictions that interest us to alter very little, then we know this is safe, and we may do more. If the predictions that interest us suddenly become much more uncertain, we know this was a bad idea. We might even try increasing the modeling density around the Swiss chocolate industry, to see if we can use it to get a further reduction in uncertainty. Of course, such a process might be applied on a large scale. Most of the model would be reduced to almost zero density in a few such operations.
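    As a toy illustration of that “experimentally thin out part of the model and see whether the predictions you care about get worse” step, here is a sketch in Python. It is an invented example, not the commenter’s scheme: the linear predictor, the synthetic data and the feature-group names (including “swiss_chocolate”) are stand-ins, and prediction error is used as a crude proxy for the uncertainty discussed above.

```python
# Toy sketch: drop a group of features, refit, and see whether the predictions
# we care about get much worse. If not, that region of the model can stay thin.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 situations, 6 features, only the first 3 matter.
X = rng.normal(size=(200, 6))
true_w = np.array([1.5, -2.0, 0.7, 0.0, 0.0, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=200)

FEATURE_GROUPS = {
    "trolley":         [0, 1],
    "bomb":            [2],
    "swiss_chocolate": [3, 4, 5],   # intuitively irrelevant region of the model
}

def prediction_error(active_features):
    """Fit and score a least-squares model using only the listed columns."""
    Xa = X[:, active_features]
    w, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    return np.mean((Xa @ w - y) ** 2)

full = list(range(6))
baseline = prediction_error(full)

for name, cols in FEATURE_GROUPS.items():
    reduced = [c for c in full if c not in cols]
    err = prediction_error(reduced)
    verdict = "safe to thin out" if err < 2 * baseline else "keep dense"
    print(f"dropping {name:15s}: error {err:.3f} vs baseline {baseline:.3f} -> {verdict}")
```

    The same loop, run repeatedly, would shrink the model everywhere except where thinning it visibly hurts the predictions of interest.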

    Other approaches might work by looking at the “paths” through the model between inputs and outputs of interest. We might look at what in the model most affects the outputs directly, what most affects that, what most affects that, etc. – tracing back through the model to see where the causes of the outputs are coming from. We could similarly do this with the inputs, looking at what part of the model is most directly affected by the inputs, what that in turn affects, and so on. What we would be interested in is where these two processes meet in the model. It would give us an idea of “relevant paths” through the model, and we could then increase modeling density along these paths. These regions of the model, with now increased density, would be looked at again later, and may have still further increases.

    An analogy for this is a bolt of lightning, the electricity itself increasing the conductivity of the air and allowing more electricity to flow. We are doing a similar kind of thing, through the model, between inputs and outputs of interest – we are finding paths through the model where there should be more modeling, which in turn is likely to reveal even better paths through the model, and so on. You might ask how we get the resources to do this – but the fallacy here would be viewing this as if we had an inefficient, dense model. If it is done properly, there should hardly be any model there with which to do all this! Most of the model should be so “thin” it is practically non-existent. Hardly any of the world is being modeled in any detail at any time. Not having most of the model explicitly represented at any time makes it viable to perform intensive operations on the model to determine where it should, and should not, be explicitly represented in more detail.
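    The path-tracing idea can be sketched just as roughly: reach forward from the inputs, backward from the outputs, and treat the overlap as the region worth modeling more densely. Again this is only a toy, with a tiny hand-made web standing in for what would really be a learned, probabilistic and vastly larger model.

```python
# Rough illustration: forward reachability from inputs, backward reachability
# from outputs; the intersection marks the "relevant paths" to densify.

from collections import deque

# A tiny hand-made web of model elements (directed edges: cause -> effect).
EDGES = {
    "sensor_sees_bomb":    ["bomb_on_trolley"],
    "sensor_sees_trolley": ["bomb_on_trolley", "trolley_position"],
    "bomb_on_trolley":     ["explosion_risk"],
    "trolley_position":    ["explosion_risk", "wheel_revolutions"],
    "wheel_revolutions":   [],            # true but irrelevant, as in Dennett's robot
    "explosion_risk":      ["pull_trolley_decision"],
    "swiss_chocolate":     ["toblerone_supply"],
    "toblerone_supply":    [],
}

def reachable(starts, edges):
    """Breadth-first reachability over a dict-of-lists graph."""
    seen, queue = set(starts), deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Build the reversed graph so we can trace backwards from the outputs.
reverse = {}
for src, dsts in EDGES.items():
    for dst in dsts:
        reverse.setdefault(dst, []).append(src)

inputs = {"sensor_sees_bomb", "sensor_sees_trolley"}
outputs = {"pull_trolley_decision"}

forward = reachable(inputs, EDGES)        # what the inputs can influence
backward = reachable(outputs, reverse)    # what can influence the outputs
relevant = forward & backward             # where to increase modeling density

print(sorted(relevant))
```

    In this toy web the wheel-revolution and chocolate nodes never appear in the overlap, so no modeling density would ever be spent on them.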

    Of course, as humans we will not experience this. As soon as something becomes relevant to you, the local density of the model around it will increase – and it will seem to you as though your brain “contains” a detailed model of it. When it becomes less relevant, all this will go back to a lower density model.

    (I expect to be discussing issues like this in that series I am doing now.)

  60. Steve’s sceptical point is a strong one; it’s quite true that if we don’t expect to get by with propositional/predicate calculus or whatever (and I agree it was always a bit bonkers to think we could), we need never run into the frame problem in its traditional guise.

    But I think the problem in that guise is just one head of a hydra-like problem which pops up in several places without us ever quite getting the measure of it. I think, for example, that the whole problem of meaning and intentionality has a strong connection with the sort of relevance-divining abilities which underlie the frame problem.

    In another form, the problem surely affects modelling, doesn’t it? You can never just model the whole reality of something; you have to model certain features of it. You have to pick out in advance those features which are (aha!) relevant, so that the respects in which your model is unlike the real thing don’t matter. But which are the relevant features? The danger is that the model goes along quite nicely and then suddenly runs into a situation where its colour, or something else you had assumed was unimportant, makes a difference. (I may not be expressing this very well!)

  61. I think you’re probably right, Peter, that the Frame issue is just the symbolic-AI expression of a more general (or at least unspecified) factor. Yet I prefer to think of it as a beacon towards the truth, rather than a problem. What bothered me about Murray’s attempt to apply global workspace theory is that it sticks a bandaid over the wound instead of cutting out the gangrene. He may be right at some level, but misleadingly so.

    As for whether modeling reality creates a combinatorial explosion, I don’t think so. I think Paul’s thunderbolt analogy is very apt. Lightning seeks a minimum path by bottom-up means (it doesn’t have to “search” every air molecule), and the speculative path gets strengthened and refined through reflexive processes and positive feedback. Simulation (and therefore planning) is surely like this too: Although it can *potentially* ramify massively and become computationally intractable, in practice it doesn’t. Things usually do more or less what they’ve done before.

    Consider the difference between these two scenarios:

    1. A little girl sits crying on the side of the street. In front of her an ice-cream lies squashed with tyre marks all over it. You have money in your pocket. What do you do?

    2. A squamish sits farmishing on the side of the dinkum. In front of it a trumince lies carbled. You have syncopathy in your sarvelt. What do you do?

    Obviously the second situation can’t be modeled, predicted and planned for at all, because we know nothing about how squamishes behave or what we might do with some syncopathy. But we know what girls are, why they tend to eat ice-cream, where tyre marks probably come from, etc. In fact these properties and affordances essentially DEFINE the objects. It COULD be that the girl is crying because it’s her hobby, the ice-cream was thrown at her by a passing chicken and the tyre marks are produced by some form of mutant fungus, but it’s not likely to have been our experience in the past and so is unlikely to be true now. Tyre marks usually arise when cars run over things; ice-cream tends to be eaten by children, who tend to enjoy it and thus would resent its loss, etc.

    If we assemble the most probable elemental inferences into a sequence it is quite likely to be correct. If we “excite” possibilities in proportion to how likely we’ve found them to be in the past, and the most probable path turns out not to be consistent with future facts, then we suppress that possibility and allow others to come to the fore instead.

    At no point during this do we need to consider whether the girl has been magically transported from another planet, or whether the thing that looks like ice-cream is really some sort of cloth. Nothing in our experience causes us to leap to these conclusions, so we only consider them if the more probable interpretations fail to link up. Experience not only gives us knowledge but also probabilities, and so we start with the most probable and then expand outwards until something works.
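    A deliberately over-symbolic toy of that “start with the most probable and expand outwards until something works” loop might look like the Python below. It is only a sketch, not Steve’s mechanism (as he stresses below, the real process would not be a serial search over inferences tagged with probabilities), and the priors and the consistency check are invented.

```python
# Toy best-first interpretation: pop the most probable reading of the scene,
# keep it if it fits the observed facts, otherwise suppress it and move on.

import heapq

# Prior probability of each interpretation, standing in for past experience.
INTERPRETATIONS = {
    "car ran over her ice-cream, she is upset":     0.70,
    "she is crying as a hobby":                     0.02,
    "the ice-cream is actually a painted cloth":    0.01,
    "a passing chicken threw the ice-cream at her": 0.001,
}

OBSERVED_FACTS = {"tyre marks on ice-cream", "child crying", "money in pocket"}

def consistent(interpretation, facts):
    # Stand-in for checking a simulation against sensory input: only the
    # mundane interpretation predicts the tyre marks in this toy example.
    return "car ran over" in interpretation or "tyre marks on ice-cream" not in facts

# Max-heap by prior probability (heapq is a min-heap, so negate).
queue = [(-p, interp) for interp, p in INTERPRETATIONS.items()]
heapq.heapify(queue)

while queue:
    neg_p, interp = heapq.heappop(queue)
    if consistent(interp, OBSERVED_FACTS):
        print(f"accepted: {interp!r} (prior {-neg_p:.3f})")
        break
    print(f"suppressed: {interp!r}")
```

    In this toy the mundane interpretation is accepted at once and the exotic ones are never examined at all; only if the simulated outcome clashed with the sensory facts would the search expand outwards to them.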

    We never have to search our entire knowledgebase to FIND OUT which options are most probable – that would be to make the mistake of treating the brain like it’s a serial computer. All the possible pathways are constantly vying for supremacy with an amplitude determined by their probability and how well they fit our sensory inputs. In that sense Murray may well be right to use global workspace theory as a metaphor, although the old Pandemonium model might be closer and neither really captures the reality.

    My example is too abstract for my own good – I don’t mean to suggest that the brain actually deals in the currency of inferences and tags them with probabilities. I think the process is far more low-level than this. That’s why I’d not be happy with GWT as an explanation – it’s too symbolic, too high-level and too linear. But if the brain at the neural level is computing with “clouds” of possible pathways, in which past experience of probabilities and degree of match to the sensory facts are what determine the intensity of sub-paths that compete to predict the next outcome, then we have a system which models the future (i.e. interprets, anticipates, plans and decides) starting from what is most probable and working outwards only as required. The frame is implicitly there – past probabilities provide the context that makes most beliefs/actions so irrelevant that they’re never even examined.

  62. Peter:

    “…But which are the relevant features? The danger is that the model goes along quite nicely and then suddenly runs into a situation where its colour, or something else you had assumed was unimportant, makes a difference. (I may not be expressing this very well!)…”

    I have the impression that from the beginning of the discussion there is the implicit assumption that humans cope with this problem very well, while for AI systems it is an unbeatable barrier. But humans themselves can be quite lousy at picking out the relevant features of a scenario. For example, think of police detectives: some are good at getting the case details and some are not. In fact, one of the signs of intelligence is being able to pick up the relevant details of a certain situation. So maybe this is a general problem, common to humans and AI systems, or to any system that has to perceive an arbitrary evolving scenario. Of course, for the time being humans perform much better than AI, though not perfectly.

  63. Yes, good point – we ought to try not to forget that humans go wrong too. Actually, it’s quite possible that we are rubbish at picking out relevant factors in most areas, we’ve just got good at a few practical ones. It’s an uncomfortable thought, but if we were stumbling around making mistakes because of a relevance deficiency in certain domains, we probably wouldn’t even notice. We’d just wonder why certain things seemed to go wrong a lot.

  64. I certainly accept that humans go wrong too. Humans have a certain amount of ability in this area, but they will miss things, because the parts of the model that they explicitly represent must be limited. There may be useful information about parts of reality that are “so far away” from any explored parts that they are effectively unreachable: there would be no way for the modeling system to reach them without running a process that involves a huge combinatorial explosion. It is not about perfection, but about getting machines up to something like this level. Combinatorial explosion will always be a limit, because if you don’t have it, that just means you are avoiding it by not searching too far away from known parts of the model. This does not stop you reaching a distant part of the model, but only if it can be reached by progressive stages which all give useful modeling.

    Another point: maybe this could be just one feature of what we regard as “genius” in some people – the ability to go into parts of a model that incur a considerable computing cost to reach, due to combinatorial explosion, so that such parts of the model are inaccessible to everyone else?
