How the Brain Inspired AI (and Won a Nobel Prize)

In a darkened room at Johns Hopkins University in 1958, two young scientists were running out of patience. David Hubel and Torsten Wiesel had pushed a tungsten microelectrode into the primary visual cortex of an anaesthetized cat, and for hours they had flashed spots of light onto a screen, trying to make the neuron fire. The audio monitor that turned the cell's electrical spikes into clicks stayed stubbornly quiet. Then a glass slide jammed in the projector. As they jiggled it free, the dark edge of the slide swept across the screen, and the monitor suddenly erupted into a clean, rhythmic crackle. The neuron did not care about spots of light at all. It cared about a moving edge tilted at a particular angle.

That accidental crackle is one of the founding sounds of modern neuroscience, and, improbably, of modern artificial intelligence. The line that runs from that cat's visual cortex to the image classifiers and chatbots of the 2020s is direct and traceable, and in October 2024 the Royal Swedish Academy of Sciences certified it by awarding the Nobel Prize in Physics to two pioneers of artificial neural networks. This article follows that line: how a discovery about how the brain sees edges seeded an entire family of machines, and what the relationship between brains and the systems they inspired actually is, once you look closely.

The cat's cortex and the architecture of seeing

Between 1958 and 1965, working first at Johns Hopkins and from 1959 at Harvard Medical School, Hubel and Wiesel mapped the response properties of neurons in the primary visual cortex, the region also known as V1 or Brodmann area 17. Recording from anaesthetized cats and monkeys, they found that individual neurons were exquisitely fussy. Some cells, which they called simple cells, fired only when an edge of a specific orientation fell on a specific spot of the retina; tilt the edge or shift it a little and the cell went silent. Other cells, the complex cells, were just as orientation-selective but far more forgiving about position, responding to an edge of the right angle anywhere within a region.

The crucial insight was not the individual cells but the relationship between them. Hubel and Wiesel proposed a hierarchy, in which the precise, position-locked simple cells feed into the more tolerant complex cells, so that the system builds up a representation that recognizes a feature regardless of exactly where it sits. Specificity at the bottom, invariance built up by layering above it. For showing how the visual world is decomposed and reassembled in stages of cortical processing, the two shared the 1981 Nobel Prize in Physiology or Medicine with Roger Sperry. The idea that vision is a layered hierarchy of feature detectors, each stage combining the outputs of the one below into something more abstract and more stable, would turn out to be one of the most fertile ideas in the history of computing.

From cortex to silicon: the Neocognitron

The first engineer to take that hierarchy seriously as a blueprint was Kunihiko Fukushima. Working at the NHK Broadcasting Science Research Laboratories in Tokyo, he published a model in the journal Biological Cybernetics in 1980 with a title that announced its ambition plainly: "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position." The phrase "unaffected by shift in position" is Hubel and Wiesel translated into the language of machines, because position invariance, the ability to recognize a shape no matter where it appears, was exactly the problem the complex cells solved.

The Neocognitron copied the cortex almost layer for layer. It alternated what Fukushima called S-cell layers, modeled directly on Hubel-Wiesel simple cells and tuned to local features, with C-cell layers, modeled on complex cells and pooling over position to grant tolerance to small shifts. Stacked into a deep hierarchy, the network was trained to recognize handwritten digits. It worked, and it demonstrated something profound: a machine built on the brain's wiring diagram could solve a real perceptual task. What it lacked was an efficient way to learn its own connection strengths from data, the piece that would arrive later and change everything.
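
The alternation of S-cell and C-cell layers can be made concrete with a toy sketch in Python. It is an illustration only, not Fukushima's model: the one-dimensional input, the two-element edge kernel, the pooling width, and the names s_layer and c_layer are all assumptions chosen for brevity. The S-layer responds at the exact position of an edge; the C-layer keeps only the strongest response inside each window, so a small shift in the input leaves its output unchanged.

```python
# Toy 1-D sketch of an S-layer followed by a C-layer.
# EDGE_KERNEL, s_layer and c_layer are hypothetical names for illustration.

EDGE_KERNEL = [-1, 1]  # responds to a dark-to-light step


def s_layer(signal, kernel=EDGE_KERNEL):
    """Slide the kernel across the signal and rectify the result,
    so a unit is active only where the local pattern matches."""
    k = len(kernel)
    responses = []
    for i in range(len(signal) - k + 1):
        match = sum(w * x for w, x in zip(kernel, signal[i:i + k]))
        responses.append(max(0, match))
    return responses


def c_layer(responses, pool=4):
    """Max-pool over non-overlapping windows: keeps whether the
    feature occurred in a window, discards exactly where."""
    return [max(responses[i:i + pool]) for i in range(0, len(responses), pool)]


image_a = [0, 0, 1, 1, 1, 1, 1, 1, 1]  # step edge
image_b = [0, 0, 0, 1, 1, 1, 1, 1, 1]  # same edge, shifted one position

s_a, s_b = s_layer(image_a), s_layer(image_b)
c_a, c_b = c_layer(s_a), c_layer(s_b)

print(s_a, s_b)  # S responses differ: the active unit moves with the edge
print(c_a, c_b)  # C responses agree: the shift has been absorbed
```

The tolerance holds only while the shifted edge stays inside the same pooling window; stacking several such pairs of layers is what extends it to larger shifts.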

The convolutional revolution: LeCun to AlexNet

That missing piece came together in the hands of Yann LeCun. At Bell Labs in 1989, LeCun published the first practical convolutional neural network for reading handwritten digits, a design later refined and named LeNet-5 in 1998. The convolutional network kept Fukushima's brain-inspired skeleton, the alternation of feature-detecting and pooling layers, but trained it with backpropagation, an algorithm that efficiently adjusts every connection in the network by tracing errors backward from the output. LeNet was deployed commercially to read the digits on bank checks, one of the first neural networks to do real economic work in the world.

For more than two decades the approach simmered without boiling over, limited by the data and computing power available. Then, in 2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton at the University of Toronto entered an eight-layer convolutional network, soon known universally as AlexNet, into the ImageNet Large Scale Visual Recognition Challenge, a contest to classify photographs into a thousand categories. AlexNet did not merely win; it won by a margin so wide that it embarrassed every competing method. Within roughly a year the entire field of computer vision abandoned its old hand-crafted techniques and pivoted to deep learning. The lineage was unbroken: AlexNet's layered feature detectors were the great-grandchildren of the simple and complex cells in that 1958 cat, scaled up and trained on a million images.

The other tradition: Hopfield, energy, and memory

The convolutional line is only half the story, and the 2024 Nobel honored the other half too. In 1982, the physicist John Hopfield published a paper in the Proceedings of the National Academy of Sciences titled "Neural networks and physical systems with emergent collective computational abilities." Hopfield came at neural networks from statistical physics rather than biology, and introduced what is now called the Hopfield network, a recurrent model in which the connections define an energy landscape. Present the network with a corrupted or partial pattern, and its dynamics roll downhill, like a ball settling into a valley, until they reach a stored memory. This was a mathematical theory of associative memory, the capacity to retrieve a whole from a fragment, the way a snatch of melody can pull back an entire song.
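
The downhill roll toward a stored memory can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a reproduction of Hopfield's paper: it uses units that take the values +1 and -1, a Hebbian storage rule, one-unit-at-a-time updates, and two small hand-picked patterns, and the function names train, energy, and recall are hypothetical.

```python
# Toy Hopfield-style associative memory (illustrative sketch only).

def train(patterns):
    """Hebbian storage: strengthen the link between units that agree
    in a stored pattern. No unit connects to itself."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, s):
    """Energy of a state; stored patterns sit at low points."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def recall(w, state, max_sweeps=10):
    """Update one unit at a time until nothing changes.
    Each flip can only keep the energy equal or lower it."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new_value = 1 if field >= 0 else -1
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s


memory_1 = [1, 1, 1, 1, -1, -1, -1, -1]
memory_2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train([memory_1, memory_2])

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # memory_1 with its first unit flipped
recalled = recall(w, noisy)

print(recalled == memory_1)                   # True
print(energy(w, noisy), energy(w, recalled))  # energy falls during recall
```

With only eight units the network can hold very few patterns reliably; the example is sized to show the mechanism, not the capacity.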

Hopfield's energy-based framing seeded a research program that Geoffrey Hinton extended through the Boltzmann machine, a probabilistic network built on similar physical principles, and on through the deep belief networks that helped reignite interest in many-layered architectures in the mid-2000s. The tradition's reach became remarkably broad. The transformer, the architecture published by Ashish Vaswani and colleagues at Google in the 2017 paper "Attention Is All You Need" and now the engine inside large language models, belongs to this same world of learned associations and emergent collective computation, though the kinship is conceptual rather than a direct line of descent: its self-attention mechanism is a feedforward design rather than a recurrent one. Hopfield supplied the physics of memory, Hinton supplied the learning machinery, and between them they shaped the two great lineages of the field.

October 8, 2024: physics claims the neural network

On October 8, 2024, the Royal Swedish Academy of Sciences awarded the Nobel Prize in Physics jointly to John J. Hopfield, emeritus at Princeton University, and Geoffrey E. Hinton, of the University of Toronto and formerly of Google, "for foundational discoveries and inventions that enable machine learning with artificial neural networks." A physics prize for the science behind machine learning surprised many observers, but the choice was internally consistent: Hopfield's contribution was rooted in the statistical mechanics of physical systems, and the energy-based tradition he opened runs in a clean arc through Hinton's Boltzmann machine, his championing of backpropagation, and his deep belief networks, into the technology that now reshapes daily life. The award was the discipline's acknowledgment that abstractions borrowed from brains and from physics had become an intellectual achievement worthy of its highest honor.

When the machines started predicting the brain

So far the influence has flowed one way, from neuroscience into engineering. But one of the most striking developments of the last decade is the influence flowing back, with artificial networks turning into tools for understanding the brain that inspired them. In 2014, Daniel Yamins and James DiCarlo at MIT published a study in the same journal that had carried Hopfield's work three decades earlier. They trained deep convolutional networks on object recognition, then compared the activations inside those trained networks against actual single-neuron recordings from the inferotemporal cortex of macaque monkeys, a high-level visual region where objects are recognized. The networks predicted real neural firing rates better than any previous model, and, tellingly, the deepest, most categorization-relevant layers matched the high-level visual neurons best. A system built to imitate the brain had circled back to become its best model.

A parallel convergence appeared in the study of reward. In 1997, Wolfram Schultz, Peter Dayan, and Read Montague published a paper in Science showing that dopamine neurons in the midbrain, in the ventral tegmental area and the substantia nigra pars compacta, do not simply signal pleasure but encode a reward prediction error, the gap between the reward an animal expected and the reward it received. That biological signal turned out to look remarkably like the temporal-difference learning signal at the heart of the reinforcement-learning theory developed by Richard Sutton and Andrew Barto. A concept invented by computer scientists to make machines learn from trial and error was found, almost line for line, written into the chemistry of the brain. The same principles later powered DeepMind's deep reinforcement-learning systems, from the Atari-playing DQN in 2013 to AlphaGo in 2016 and AlphaZero in 2017.
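
The reward prediction error has a compact form in temporal-difference learning: the reward received, plus the discounted value expected next, minus the value that was predicted. The Python sketch below is a toy under stated assumptions, not the model from the 1997 paper: it tracks a single "cue" state followed by a reward, and the learning rate, discount factor, and names td_error, ALPHA, and GAMMA are illustrative choices.

```python
# Toy temporal-difference sketch: one cue, then a reward of 1.0.
# ALPHA, GAMMA and td_error are hypothetical names for illustration.

GAMMA = 1.0   # discount factor (assumed)
ALPHA = 0.5   # learning rate (assumed)


def td_error(reward, value_next, value_now, gamma=GAMMA):
    """Prediction error: what arrived minus what was expected."""
    return reward + gamma * value_next - value_now


value = {"cue": 0.0, "end": 0.0}  # "end" is terminal, so its value stays 0
errors = []

for trial in range(10):
    delta = td_error(reward=1.0, value_next=value["end"], value_now=value["cue"])
    value["cue"] += ALPHA * delta  # nudge the prediction toward the outcome
    errors.append(delta)

# After learning, withhold the expected reward.
omitted = td_error(reward=0.0, value_next=value["end"], value_now=value["cue"])

print(errors[0], errors[-1])  # large surprise at first, almost none later
print(omitted)                # negative: an expected reward failed to arrive
```

The shrinking positive error and the negative error on omission are the two signatures the dopamine recordings were compared against.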

A useful caution: networks are not neurons

For all these resonances, it would be a serious error to conclude that today's artificial networks are realistic models of biological brains, and this is perhaps the most consequential misconception in the whole conversation. Real neurons communicate with discrete electrical spikes, not the smooth continuous activations of an artificial unit. Biological learning does not appear to use gradient backpropagation, and how the brain actually adjusts its synapses remains an open question. A single neuron's dendrites perform computations far richer than the simple weighted sum a typical artificial unit calculates. And the scale is humbling: the human brain as a whole holds roughly 86 billion neurons, wired through on the order of 100 trillion synapses and embedded in cellular machinery that no current artificial network reproduces. The borrowing was an inspiration, not a copy, and the honest position is that brains and the machines they seeded are cousins, sharing an ancestor in Hubel and Wiesel's hierarchy while differing profoundly in their biology.

This is also where neuroscience and engineering are converging most directly on new hardware. A field sometimes called neuromorphic or brain-inspired computing builds silicon that mimics neural dynamics in the chip itself, rather than emulating them on conventional graphics processors. The leading efforts include Intel's Loihi, IBM's TrueNorth, the Neurogrid system from Kwabena Boahen at Stanford, and SpiNNaker, the spiking-network machine built under Steve Furber at the University of Manchester. Each runs spiking neural networks in hardware designed to use far less energy than conventional processors would need for the same task. None has yet displaced GPU-based deep learning, but they mark the frontier where the brain's design principles and practical AI hardware meet most directly.
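
The gap between a conventional artificial unit and the spiking units these chips run can be shown with a small Python sketch. It is a deliberately simplified illustration and not how Loihi, TrueNorth, Neurogrid, or SpiNNaker are programmed: the leaky integrate-and-fire rule, the leak and threshold values, and the names artificial_unit and spiking_unit are all assumptions.

```python
# Continuous artificial unit vs. a toy leaky integrate-and-fire unit.
# Parameter values and function names are hypothetical.

def artificial_unit(inputs, weights):
    """Conventional unit: a weighted sum passed through a ReLU,
    producing a single continuous number."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return max(0.0, total)


def spiking_unit(input_current, steps, leak=0.5, threshold=1.0):
    """Leaky integrate-and-fire: the potential decays each step,
    accumulates input, and emits a spike (then resets) at threshold."""
    potential = 0.0
    spikes = []
    for _ in range(steps):
        potential = potential * leak + input_current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0
        else:
            spikes.append(0)
    return spikes


activation = artificial_unit([0.5, 0.25], [1.0, 0.5])
spike_train = spiking_unit(input_current=activation, steps=9)

print(activation)   # one number, available immediately
print(spike_train)  # a sequence of discrete events spread over time
```

In the spiking version the information lies in when and how often the unit fires, which is part of why such networks map naturally onto event-driven, low-power hardware.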

Key Takeaways

The story of how the brain inspired AI is a single traceable lineage. It begins with a jammed projector slide in 1958, when Hubel and Wiesel discovered that neurons in the visual cortex act as layered feature detectors, with simple cells feeding position-tolerant complex cells. Fukushima rendered that architecture into the Neocognitron in 1980, LeCun made it trainable as the convolutional network in 1989, and it exploded into the modern era when AlexNet won ImageNet in 2012. A second tradition, born from physics, runs from Hopfield's 1982 energy-based model of associative memory through Hinton's Boltzmann machines and deep belief networks toward the transformers behind today's language models, and the 2024 Nobel Prize in Physics went to Hopfield and Hinton for work at its foundations. The influence now runs in both directions: deep networks predict real firing in macaque inferotemporal cortex, and the dopamine reward prediction error described by Schultz, Dayan, and Montague closely mirrors the temporal-difference signal of reinforcement learning theory. Yet the resemblance has firm limits. Real neurons spike, appear to learn without backpropagation, and compute in their dendrites, and the brain packs some 86 billion of them into circuits no artificial network reproduces. The most accurate description of brains and AI is therefore not identity but a deep and productive family resemblance.
