On Intelligence - Jeff Hawkins
How a new understanding of the brain
will lead to the creation of truly intelligent machines
Jeff Hawkins with Sandra Blakeslee
Times Books, 2004
Want to understand how the brain works, not just from a philosophical perspective ... but in a detailed nuts and bolts engineering way.
Not only understand what intelligence is ... but how to build machines that work the same way as the brain does.
The question of intelligence is the last great terrestrial frontier of science.
... what it is to be human ...
Intelligent machines will arise from a new set of principles about the nature of intelligence.
... a large industry will be created ...
We live at a time when the problem of understanding intelligence can be solved.
Most neurobiologists don't think much about overall theories of the brain because they are engrossed in doing experiments to collect more data about the brain's many subsystems.
Computer programmers have tried to make computers intelligent but they have failed and will continue to fail as long as they keep ignoring the differences between computers and brains.
What then is intelligence such that brains have it but computers don't?
The best way to look for an answer to this question is to use the detailed biology of the brain as a constraint and as a guide, yet think about intelligence as a computational problem.
The book is ambitious: it describes a comprehensive theory of how the brain works, packaging and interpreting ideas that existed in one form or another, but not together in a coherent fashion.
We need to distinguish "real intelligence" from the old "artificial intelligence", and before we attempt to build intelligent machines we first have to understand how the brain thinks.
The core idea of the theory presented is a memory-prediction framework.
Brains and computers do fundamentally different things.
Neural networks will not be more successful at creating intelligent machines than computer programs have been.
Complexity is a symptom of confusion, not a cause.
The biggest mistake is to think that intelligence is defined by intelligent behavior.
The brain uses vast amounts of memory to create a model of the world, and it uses this model to make continuous predictions of future events. It is the ability to make predictions about the future that is the crux of intelligence.
The seat of intelligence is the neocortex. In spite of its great number of abilities and powerful flexibility, the neocortex is surprisingly regular in its structural details and it has a hierarchical structure.
A fully mature theory will take years to develop, but that doesn't diminish the power of the core idea.
There must be a straightforward explanation: the most powerful things are simple.
Francis Crick, the codiscoverer of the structure of DNA, wrote an article, Thinking about the Brain, for the September 1979 issue of Scientific American, which was entirely devoted to the brain. He argued that, in spite of a steady accumulation of detailed knowledge about the brain, how the brain worked was still a profound mystery. According to Crick, neuroscience was a lot of data without a theory. This was a rallying call.
Ted Hoff, Intel's chief scientist in 1980, said he didn't believe it would be possible to figure out how the brain works in the foreseeable future, and so it didn't make sense for Intel to support this kind of research.
At MIT, researchers were not interested in how real brains worked. In their view, studying brains would limit your thinking; it was better to study the ultimate limits of computation as best expressed in digital computers.
Computers and brains are built on completely different principles: one is programmed, one is self-learning; one has to be perfect to work at all, one is naturally flexible and tolerant of failures.
In 1981, MIT rejected Jeff Hawkins's application for graduate studies.
AI suffers from a fundamental flaw in that it fails to adequately address what intelligence is or what it means to understand something. It has a central dogma: the brain is just another computer.
This assumption was bolstered by the idea that neurons could act as logic gates, but no one bothered to ask if that was how neurons actually were wired in the brain.
AI philosophy was also bolstered by the then dominant trend in psychology: behaviourism. Behaviourists believed that it was not possible to know what goes on inside the brain, which could only be considered an impenetrable "black box".
A high point of AI was the computer program Eliza, which mimicked a psychoanalyst but could not be generalized to do anything useful.
Then there was a large stir about "expert systems", databases that could answer questions posed by human users. But again they turned out to be of limited use.
Computers could play checkers, and Deep Blue finally beat Garry Kasparov at chess, but not by being intelligent. It won by being millions of times faster than a human. Deep Blue had no intuition. It played chess but did not understand chess, in the same way that a calculator performs arithmetic but does not understand mathematics.
Even today, no computer can understand language as well as a three-year-old child.
There are still people who believe that AI's problems can be solved with faster computers, but most scientists think the entire endeavor was flawed. No matter how cleverly a computer is designed to simulate intelligence by producing the same behavior as a human, it has no understanding and it is not intelligent.
Understanding cannot be measured by external behavior. It is, instead, an internal metric of how the brain remembers things and uses its memories to make predictions.
Computers could, in theory, simulate the entire brain, but you cannot simulate something whose workings you do not understand.
Neural networks fall short of real brains in three respects:
- They are static structures with no time dimension, whereas real brains process rapidly changing streams of information.
- They have very limited feedback mechanisms, whereas the brain is saturated with feedback connections: feedback dominates most connections throughout the neocortex.
- They do not take into account the physical architecture of the brain, which is organized as a repeating hierarchy.
Neural networks settled on a class of ultrasimple models that did not meet any of the basic criteria for resembling a brain: they processed only static information, did not use feedback and didn't look like brains. They share a trait with AI: they focus on behavior, assuming that intelligence lies in what is produced after processing a given input.
The field became dominated by people not interested in understanding how the brain works, or understanding what intelligence is.
Auto-Associative Memories
Auto-associative memories are another connectionist approach, one that came much closer to describing how real brains work. They are built out of simple neurons that connect to each other using lots of feedback and fire when they reach a given threshold.
When a pattern of activity is imposed on the network, it forms a memory of this pattern. To retrieve a pattern you must provide the pattern to be retrieved. This may seem ridiculous, but these memories have an important property: you don't have to provide the entire pattern. They can retrieve the correct pattern, as originally stored, even when you start with a messy version of it.
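The pattern-completion property can be sketched with a tiny Hopfield-style network, one common form of auto-associative memory. This is an illustration under stated assumptions, not a method taken from the book: the function names `train` and `recall`, the 8-unit patterns and the +1/-1 encoding are all invented for the example.

```python
def train(patterns):
    """Build a symmetric weight matrix with the Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, cue, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = list(cue)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            total = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if total >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two stored patterns over 8 units (+1 = firing, -1 = silent); values are illustrative.
stored_a = [1, 1, 1, 1, -1, -1, -1, -1]
stored_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([stored_a, stored_b])

# A "messy" cue: stored_a with its first unit flipped.
noisy_cue = [-1, 1, 1, 1, -1, -1, -1, -1]
recovered = recall(weights, noisy_cue)  # settles back onto stored_a
```

The cue is the pattern itself, only degraded, which is what "auto-associative" means here: the network is addressed by content rather than by location.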
In addition, an auto-associative memory can be designed to store sequences of patterns, or temporal patterns. When presented with part of a sequence, the memory can recall the rest.
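Sequence recall from a partial cue can be illustrated with a deliberately non-neural stand-in: a list of stored sequences that is searched for the cue. The class `SequenceMemory`, its methods and the example sequences are hypothetical; a real auto-associative network would store the transitions in its connection weights instead.

```python
class SequenceMemory:
    """Stores whole sequences and recalls the remainder from a partial cue."""

    def __init__(self):
        self.sequences = []

    def store(self, sequence):
        self.sequences.append(list(sequence))

    def recall(self, cue):
        """Return what follows the cue in the first stored sequence containing it."""
        cue = list(cue)
        for seq in self.sequences:
            for start in range(len(seq) - len(cue) + 1):
                if seq[start:start + len(cue)] == cue:
                    return seq[start + len(cue):]
        return None  # the cue matches nothing in memory


memory = SequenceMemory()
memory.store(["do", "re", "mi", "fa", "sol"])
memory.store(["a", "b", "c", "d"])

rest_of_scale = memory.recall(["re", "mi"])  # ["fa", "sol"]
rest_of_letters = memory.recall(["b"])       # ["c", "d"]
```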
People learn practically everything this way, as sequences of patterns, so it is probable that the brain uses circuits similar to auto-associative memories.
A Misguided Intuitive Assumption
The quest for intelligent machines is burdened by an intuitive assumption that so far has hampered our progress: that intelligence is to be found in behavior.
This is very intuitive because we humans demonstrate our intelligence through speech, writing and actions, but it holds only up to a point: intelligence is something that happens in our heads. Behavior is an optional ingredient.
History shows that the best solutions to scientific problems are simple and elegant. The details may be forbidding and the road to a final theory may be arduous, but the ultimate conceptual framework is generally simple (think, for example, of the theory of evolution).
Without a core explanation to guide inquiry, neuroscientists don't have much to go on as they try to assemble all the details they collect into a coherent picture. The failing isn't one of not having enough data or the right pieces of data; what we need is a shift in perspective.
Not everyone thinks we can understand how the brain works. Many people think that somehow the brain and intelligence are beyond explanation.
Functionalism
Being a functionalist means believing that being intelligent or having a mind is purely a property of organization and has nothing to do with what you are organized out of. A mind exists in any system whose constituent parts have the right causal relationships with each other, but those parts can be neurons, silicon chips or something else.
For half a century we have been trying to program intelligence into computers, but intelligent machines still aren't anywhere in the picture.
To succeed, we will need to crib heavily from nature's engine of intelligence: the neocortex.
We have to extract intelligence from within the brain. No other road will get us there.
3. The Human Brain
What is so unusual about the brain's design, and why does it matter?
The brain's architecture has a great deal to tell us about how the brain really works and why it is fundamentally different from a computer.
We are going to focus most of our attention on the neocortex, a thin sheet of neural tissue that envelops most of the older parts of the brain.
The suggestion that we can get to the bottom of intelligence by understanding just the neocortex will raise objections from other researchers in the neuroscience community, who think that one cannot understand the neocortex without understanding other brain regions. But we are not interested in understanding other human functions or in building humans.
Being human and being intelligent are separate matters. An intelligent machine need not have sexual urges, hunger, a pulse, muscles or emotions. We are biological creatures with all the necessary and sometimes unwanted baggage that comes from eons of evolution.
To build machines that are undoubtedly intelligent but not exactly humans, we can focus on the part of the brain strictly related to intelligence.
The neocortex is about 2 mm thick and has six layers. Stretched flat, it is roughly the size of a large dinner napkin. Humans are smarter because our cortex, relative to body size, covers a larger area, not because our layers are thicker or contain some special class of "smart cells". Evolution folded our cortex, stuffing it into our skull like a sheet of crumpled paper.
Some anatomists estimate that the typical human neocortex contains 30 billion cells (3x10^10), but the true number could be significantly higher or lower. These 30 billion cells are us. They contain almost all our memories, knowledge, skills and accumulated life experiences.
Francis Crick wrote a book, The Astonishing Hypothesis, whose hypothesis is that the mind is what the brain does. There is nothing else, no magic, no mind/matter duality, only neurons and a dance of information. There is a large philosophical gulf between a collection of cells and our conscious experience, yet brain and mind are one and the same.
We need to understand what these 30 billion cells do and how they do it.
Fortunately, the cortex is not just an amorphous blob of cells. We can take a closer look at its structure for ideas about how it gives rise to the human mind.
Just about everywhere you look, the convoluted surface looks pretty much the same. There are no visible boundary lines demarcating areas that specialize in different types of information or thought. However, neuroscientists know that some mental functions are located in different regions, because a stroke in different areas produces different effects.
The cortex has many functional regions or areas. Each of these regions is semi-independent and seems specialized for certain aspects of perception or thought, forming an irregular patchwork quilt which varies only a little from person to person.
Functionally, they are arranged in a branching hierarchy. What makes one region "higher" or "lower" than another is how they are connected to one another.
Lower areas feed information up to higher areas by way of certain neural patterns of connectivity, while higher areas send feedback down to lower areas using a different connection pattern. There are also lateral connections between areas that are in separate branches of the hierarchy.
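The idea that a region's place in the hierarchy is defined purely by its connections can be sketched as a small graph. The region names and links below are hypothetical placeholders, not anatomical claims, and each feedforward link is mirrored by exactly one feedback link, which is a simplification.

```python
from functools import lru_cache

# Feedforward connections: lower region -> higher regions it feeds. Names are hypothetical.
feedforward = {
    "visual_1": ["visual_2"],
    "visual_2": ["association"],
    "auditory_1": ["association"],
    "association": [],
}

# Feedback runs the opposite way along every feedforward link.
feedback = {region: [] for region in feedforward}
for lower, highers in feedforward.items():
    for higher in highers:
        feedback[higher].append(lower)


@lru_cache(maxsize=None)
def level(region):
    """A region's level is one more than the highest region feeding into it."""
    sources = feedback[region]
    return 0 if not sources else 1 + max(level(s) for s in sources)


levels = {region: level(region) for region in feedforward}
```

Here `levels` places the two sensory regions at level 0 and the region that receives from both branches at level 2; nothing about a region itself marks it as "higher", only what feeds it.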
The monkey cortex shows dozens of regions connected together in a complex hierarchy. We can assume that the human cortex has a similar structure.
The lowest of the functional areas receive sensory information. The higher an area sits in the hierarchy, the more specialized or abstract the aspects of information it is concerned with.
Eventually, sensory information passes into "association areas", which receive inputs from more than one sense. Most of these areas receive highly processed inputs and their functions remain unclear.
The motor system of the cortex is also hierarchically organized. The hierarchies of the motor and the sensory areas look remarkably similar. They seem to be put together in the same way.
Although the "up" hierarchy is real, we have to be careful not to think that the information flow is all one way. Information in the cortex always flows in the opposite direction as well, with many more projections feeding back down the hierarchy than up.
The six layers of the neocortex are formed by variations in the density of cell bodies, cell types and their connections.
All neurons have features in common. In addition to the cell body, they have wirelike structures called axons and dendrites. When the axon of one neuron touches a dendrite of another, they form small connections called synapses. The nerve impulses of one cell influence the behavior of another cell through these synaptic connections. A synapse can either contribute to firing the receiving cell or inhibit it: synapses can be excitatory or inhibitory.
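A minimal sketch of how excitatory and inhibitory synapses might combine, assuming a simple threshold model: positive weights stand for excitatory synapses, negative weights for inhibitory ones. The function name `neuron_fires`, the weights and the threshold are illustrative assumptions; real neurons are far more complicated.

```python
def neuron_fires(inputs, weights, threshold):
    """Return True if the weighted sum of input spikes reaches the threshold.

    inputs:  1 if the presynaptic cell spiked, 0 otherwise
    weights: positive for excitatory synapses, negative for inhibitory ones
    """
    drive = sum(x * w for x, w in zip(inputs, weights))
    return drive >= threshold


synapse_weights = [0.6, 0.5, -0.8]  # two excitatory, one inhibitory (made-up values)
threshold = 1.0

# Both excitatory inputs active: the cell fires.
excited = neuron_fires([1, 1, 0], synapse_weights, threshold)
# The inhibitory input is active too: the cell stays silent.
inhibited = neuron_fires([1, 1, 1], synapse_weights, threshold)
```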
The strength of the synapses can change depending on the behavior of the two cells. When two neurons generate a spike at nearly the same time, for example, the connection strength between them will be increased.
The formation and strengthening of synapses is what causes memories to be stored.
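The strengthening rule described above (a connection grows when two neurons spike at nearly the same time) can be sketched as follows. The coincidence window of 20 ms and the learning rate of 0.1 are arbitrary assumptions, as is the function name `hebbian_update`.

```python
def hebbian_update(weight, pre_spike_ms, post_spike_ms, window_ms=20.0, rate=0.1):
    """Strengthen a synapse when both cells spike within window_ms of each other."""
    if abs(pre_spike_ms - post_spike_ms) <= window_ms:
        return weight + rate
    return weight


weight = 0.5

# Spikes 5 ms apart count as coincident, so the synapse is strengthened.
weight = hebbian_update(weight, pre_spike_ms=100.0, post_spike_ms=105.0)
after_coincident = weight

# Spikes 300 ms apart are unrelated, so the synapse is left unchanged.
weight = hebbian_update(weight, pre_spike_ms=100.0, post_spike_ms=400.0)
after_distant = weight
```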
There are many types of neurons in the neocortex, but 80% of them are pyramidal neurons. Except for the top layer, which has miles of axons but very few cells, every layer contains pyramidal cells.
Pyramidal cells have a body shaped like a pyramid. Each pyramidal neuron connects to many other neurons in its immediate neighborhood, and each sends a lengthy axon laterally to more distant regions of the cortex or down to lower brain structures like the thalamus.
A typical pyramidal cell has several thousand synapses, varying from cell to cell, layer to layer and region to region.
Altogether, our neocortex would have about 30 trillion (3x10^13) synapses, apparently enough to store everything we learn in our lives.
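A quick arithmetic check of these figures: dividing the quoted synapse total by the quoted cell count gives the average per cell that the 30 trillion estimate implies. The 5,000-per-cell case is only one possible reading of "several thousand", included here as an assumption.

```python
cells = 30 * 10**9            # 3x10^10 neocortical cells, from the notes
total_synapses = 30 * 10**12  # 3x10^13 synapses, from the notes

# Average synapses per cell implied by the two figures above.
implied_per_cell = total_synapses // cells

# If "several thousand" is read as 5,000 per cell (an assumption), the total grows accordingly.
total_if_5000 = cells * 5_000
```

The quoted total corresponds to about one thousand synapses per cell, so it reads as a conservative estimate next to "several thousand".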
An Astonishing and Ignored Fact
One fact about the neocortex is so surprising that some neuroscientists refuse to believe it, and most of the rest ignore it because they don't know what to make of it. It comes from the basic anatomy of the cortex itself: the neocortex is remarkably uniform in appearance and structure, in spite of performing very different functions.
This fact led Vernon Mountcastle to suggest in 1978 that since these regions all look the same, perhaps they are actually performing the same basic operation, i.e., the cortex would use the same computational tool to accomplish everything it does.
Regions of cortex vary in thickness, cell density, relative proportion of cell types, length of horizontal connections, synapse density, and in many other ways that can be tricky to discover. However, it is their similarity that is surprising and interesting, much more so than their differences.
[A common man marvels at uncommon things; a wise man marvels at the commonplace - Confucius]
Mountcastle argues that the reason one region of cortex looks slightly different from another is because of what it is connected to, and not because its basic function is different. He does not say that hearing and vision are the same. What he says is that the way the cortex processes signals from the ear is the same as the way it processes signals from the eyes.
If Mountcastle is correct, the algorithm of the cortex must be expressed independently of any particular function or sense.
If true, it would be the most important discovery in neuroscience, even though most scientists and engineers either refuse to believe it, choose to ignore it, or aren't aware of it.
Part of the problem stems from the poverty of tools for studying how information flows within the six-layered cortex. The tools available operate on a macro level and are generally aimed at locating where in the cortex, as opposed to when and how, various capabilities arise.
Where versus How: Plasticity
MRI and PET scanning focus on brain maps, gradually building up a picture of where certain functions happen in a typical brain, assuming that the brain carries out the various activities in different ways.
The extreme flexibility of the neocortex provides support for Mountcastle's proposal.
For example, a special visual area seems specifically devoted to representing written letters and digits. Because written language is far too recent an innovation, and different cultures use different signs, our genes could not have evolved a specific mechanism for it. That's why we have to learn to read. Therefore the cortex is still dividing itself into task-specific functional areas long into childhood, based purely on experience.
This argues for an extremely flexible system, not one with one thousand solutions to one thousand problems.
The neocortex must be extremely "plastic", meaning that it can change and rewire itself depending on the type of inputs flowing into it. Cells are not born to specialize in specific tasks. Probably, no area of the cortex is predetermined to accomplish a certain function.
Brain regions develop specialized functions based largely on the kind of information that flows to them during development. The cortex is not rigidly designed to perform different functions using different algorithms, any more than the earth's surface was predestined to end up with a given arrangement of nations.
Genes dictate the overall architecture of the cortex, including which regions are connected together, but within this structure the system is highly flexible.
Everything points to a single algorithm performed by every region of the cortex. Connect regions of cortex in a suitable hierarchy, provide them with a stream of inputs, and the system will learn about its environment.
Thus, it must be possible to deploy the cortical algorithm in novel ways, with novel senses, in a machined cortical sheet, in such a way that genuine intelligence emerges outside of the biological realm.
Inputs to the Cortex
The inputs to the cortex are also basically alike. We can visualize these inputs as a bundle of electrical wires or optical fibers.
The sense organs supplying these signals are different, but once the signals are turned into action potentials, they are all the same: just patterns.
Each kind of pattern, whether seeing, hearing or feeling, is experienced differently because each gets channeled through a different path in the cortical hierarchy.
It matters where the "cables" go inside the brain, but at the abstract level, all are handled in similar ways in the six-layer cortex.
All our brain knows is patterns. All the information that enters our mind comes in as spatial and temporal patterns on the axons.
Our conscious impression of a stable world is only made possible by our brain's ability to deal with a torrent of changing patterns that never repeat exactly. Vision, for example, is more like a song than a painting. Even if we "fix" our eyes on an object, they make sudden movements, called saccades, three times per second.
Time must play a central role in vision in the same way that space must play a central role in hearing, since temporal patterns are converted into spatial patterns by the cochlea's frequency filters.
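The conversion of a temporal pattern into a spatial one can be illustrated, by loose analogy only, with a discrete Fourier transform: a signal that varies over time becomes a set of "places" (frequency bins) that respond. The cochlea is not literally a Fourier transform, and the signal, the bin count and the function name `frequency_profile` are assumptions made for this sketch.

```python
import cmath
import math


def frequency_profile(samples):
    """Naive discrete Fourier transform: magnitude for each frequency bin up to Nyquist."""
    n = len(samples)
    profile = []
    for k in range(n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        profile.append(abs(coeff) / n)
    return profile


# A temporal pattern: 64 samples containing two tones, at 4 and 10 cycles per window.
n_samples = 64
signal = [
    math.sin(2 * math.pi * 4 * t / n_samples)
    + 0.5 * math.sin(2 * math.pi * 10 * t / n_samples)
    for t in range(n_samples)
]

profile = frequency_profile(signal)

# The spatial pattern: which "places" (bins) respond strongly.
active_bins = [k for k, magnitude in enumerate(profile) if magnitude > 0.1]
```

With this signal only bins 4 and 10 respond, so a sound unfolding in time has become a pattern of which positions are active.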
Even touch is a spatio-temporal sense. If we cannot move our fingers around an object we do not see, we have no clue about what object it may be.
We usually say that we have five senses, but we have many more. Vision is in fact three senses: motion, color and luminance. Touch has pressure, temperature, pain and vibration. We also have a proprioceptive system that tells us the position and angles of our body parts and a vestibular system which gives us a sense of balance.
Some senses are richer and more apparent than others, but all enter our brain as streams of spatial patterns flowing through time on axons.
The only thing our cortex knows about the world is the patterns streaming into the input axons.
The fact that patterns from different senses are equivalent inside the brain is quite surprising, and it supports the idea of a common algorithm. For example, experiments have been conducted that allow blind people to "see" through their tongues by linking a camera to a chip that transforms pixel images into pressure points. The brain quickly learns to interpret the patterns correctly. Images first experienced as tongue sensations are soon experienced as images in space.
Brains as Pattern Machines
The cortex doesn't care if the patterns come from vision, hearing or any other sense. This means that we do not need any particular combination of senses to be intelligent.
As long as we can decipher the neocortical algorithm and come up with a science of patterns, we can apply it to any system that we want to make intelligent.
If all our knowledge of the world comes from patterns, are we certain that the world is real?
Our certainty of the world's existence can only be based on the consistency of patterns and the way we interpret them. Existence may be objective, but the spatial-temporal patterns flowing into the axon bundles are all we have to go on.
This fact explains why some people may experience hallucinations, for example, hearing sounds no one else hears. If some malfunction makes the brain produce input patterns by itself, the patient really "hears" whatever other parts of the brain produce. Dreams are also self-induced patterns of the brain.
Through the input patterns the cortex constructs a model of the world that is probably close to the real thing and then, remarkably, stores it into memory.
4. Memory
The world is an ocean of constantly changing patterns that come lapping and crashing into our brain. How do we manage to make sense of this onslaught? What happens when these patterns reach the neocortex?
Neurons are quite slow when compared, for example, to transistors. A neuron collects inputs from its synapses and combines them to decide when to output a spike to other neurons. A typical neuron can do this and reset itself in about 5 ms, or around 200 times per second, which is about five million times slower than a computer can perform an operation.
How is it then possible that a brain could be faster and more powerful than a digital computer? The standard answer is that the brain is a parallel computer. But is it true?
Consider the following thought experiment. Show a photograph to a child and ask whether there is a cat in the image. This task is difficult or impossible for a computer, yet the child can solve it in less than half a second. But in half a second, the information entering the brain can only traverse a chain about one hundred neurons long. That is, a brain solves problems like this in one hundred steps or fewer, regardless of how many neurons are involved.
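The numbers behind this argument can be checked directly. The sketch below uses only the figures quoted in these notes; the variable names are illustrative, and the computer speed is the value implied by the "five million times slower" comparison rather than a measured one.

```python
neuron_cycle_s = 0.005  # about 5 ms for a neuron to fire and reset, from the notes
task_time_s = 0.5       # the child answers in under half a second

neuron_rate_hz = round(1 / neuron_cycle_s)              # firings per second
max_serial_steps = round(task_time_s / neuron_cycle_s)  # longest possible chain of neurons

# "Five million times slower" implies a computer doing this many operations per second.
implied_computer_ops_per_s = neuron_rate_hz * 5_000_000
```

This gives 200 firings per second, a chain of at most 100 neurons in half a second, and an implied computer speed of one billion operations per second.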
But if we have many millions of neurons working together, isn't that like a parallel computer?
Not really. Parallel computers divide the task into pieces, but for each piece they perform thousands of operations. The largest conceivable parallel computer can't do anything useful in one hundred steps, all of which are, moreover, essentially the same operation.
The difference is that the brain does not "compute" the answers to problems: it retrieves the answers from memory.
The entire cortex is a memory system. It isn't a computer at all.
Retrieving versus Computing
To understand the difference between retrieving and computing, consider the task of catching a ball. From a computational point of view, we would have to calculate the flight of the ball, solve equations to determine the timing and trajectory of all our body joints, and repeat this many times as the ball approaches, taking millions of steps to solve the numerous equations.
A brain solves this problem in a few hundred steps at most, because it does it differently: it uses memory.
Our brain has stored the sequences of muscle commands required to catch a ball. When we sight the ball, the appropriate sequence is retrieved, and it is adjusted as it is recalled to accommodate the particulars of a given situation, such as the ball's trajectory and our body position.
The memory of how to catch a ball was not programmed or computed into our brain: it was learned over years of repetitive practice, and it is stored, not calculated. That is why goalkeepers, or pianists for that matter, have to train a lot.
To accommodate the variations of a particular situation, the brain does not need to compute anything, because the cortex creates invariant representations of events, which handle variations automatically.
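The contrast between retrieving and computing can be sketched with a lookup that returns a stored sequence of commands for the most similar remembered situation, instead of solving any equations. Everything here is a made-up placeholder: the table `learned_catches`, the offsets in metres, the command strings and the function `recall_catch`. Nearest-match lookup is only a crude stand-in for invariant representations, and the sketch does not adjust the sequence as it is recalled.

```python
# Learned through practice: ball landing offset (metres) -> sequence of motor commands.
# All entries are invented for illustration.
learned_catches = {
    -1.0: ["step left", "raise hands", "close fingers"],
    0.0: ["raise hands", "close fingers"],
    1.0: ["step right", "raise hands", "close fingers"],
}


def recall_catch(observed_offset):
    """Retrieve the stored sequence for the most similar remembered situation."""
    nearest = min(learned_catches, key=lambda offset: abs(offset - observed_offset))
    return learned_catches[nearest]


# A ball landing 0.8 m to the right was never practiced exactly,
# but it recalls the sequence learned for 1.0 m.
commands = recall_catch(0.8)
```

Retrieval costs one comparison per stored memory, regardless of how hard the underlying physics would be to compute.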
The Brain is not a Computer
The neocortex is not a computer, parallel or otherwise; it uses stored memories to solve learned problems. But the characteristics of brain memory are fundamentally different from those of computer memory:
- The neocortex stores sequences of patterns
- The neocortex recalls patterns auto-associatively
- The neocortex stores patterns in invariant forms
- The neocortex stores patterns in a hierarchy
