I will be joining a few colleagues for a symposium at CogSci2014 and I’ve been gathering some notes for my talk. The talk will focus on the impact of embodiment theories on a philosophy of mathematics. As I looked again at some of the things I’ve chosen to highlight in my blogs, I came upon a talk given by Josh Tenenbaum, Professor in the Department of Brain and Cognitive Sciences at MIT. I’ve read about aspects of his work before, but after listening to the talk he prepared for the Simons Foundation, I became even more interested in the implications that his investigation of Bayesian models of cognition might have for mathematics. One of the goals of the Simons Foundation talk was to highlight what Tenenbaum called “some of the new and deep math that has come out of the quest to understand intelligence.” I was particularly struck by his straightforward suggestion that statistics is not only intuitive, but that it may be part of our intuition.

The following remarks appear in the text introduction to the talk:

The mind and brain can be thought of as computational systems — but what kinds of computations do they carry out, and what kinds of mathematics can best characterize these computations? The last sixty years have seen several prominent proposals: the mind/brain should be viewed as a logic engine, or a probability engine, or a high-dimensional vector processor, or a nonlinear dynamical system. Yet none of these proposals appears satisfying on its own. The most important lessons learned concern the central role of mathematics in bridging different perspectives and levels of analysis — different views of the mind, or how the mind and the brain relate — and the need to integrate traditionally disparate branches of mathematics and paradigms of computation in order to build these bridges.

…The recent development of probabilistic programs offers a way to combine the expressiveness of symbolic logic for representing abstract and composable knowledge with the capacity of probability theory to support useful inferences and decisions from incomplete and noisy data. Probabilistic programs let us build the first quantitatively predictive mathematical models of core capacities of human common-sense thinking: intuitive physics and intuitive psychology, or how people reason about the dynamics of objects and infer the mental states of others from their behavior.

There are a few themes in this talk that are worth noting. There is certainly the suggestion that the brain’s computations are a kind of mathematics that happens within the body itself. Tenenbaum was clear, however, about the extent to which brain processes are not understood. No one yet knows how the brain actually learns, how it translates symbols into meaning, or how any of this can be understood with respect to the activity of neurons. But he has been gathering evidence, on more than one front, that supports the idea that probability theory has a lot to say about “the things that the brain is good at,” such as visual perception, learning cause-and-effect relationships in the physical world, and understanding words and the meaning of actions.

In one of the studies Tenenbaum discussed, Bayesian estimates that were computed from priors (defined by empirical statistics) were compared to the judgments that individuals made when asked to make predictions about those same variables. Subjects in the study were asked to estimate things like the amount of money a movie will gross, a person’s life expectancy, or a movie’s run time. For example, they might be asked: if you read about a movie that has made $60 million to date, how much will it make in total? There were five groups of subjects, and each group was given a different number. In our example one group may have been given $30 million, another $50 million, etc.

The prior for a parameter describes what is known, a priori, about the parameter being estimated. In this case, what is known was gathered from empirical data. The prior distributions for the various categories (movie run times, life expectancies, movie grosses) are qualitatively different from each other. The data for one of the variables, for example, may produce a power-law distribution, while the data for another produces a Gaussian distribution. You can apply Bayes’ rule to each of these distributions to compute a *posterior distribution*, that is, the prior updated by experience or evidence. The median of the posterior distribution provides what’s called a posterior predictive estimate. In the studies Tenenbaum cites, the estimates obtained by applying Bayesian inference to these priors fit the estimates people actually made very well, both quantitatively and qualitatively.
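To make the procedure concrete, here is a minimal sketch of how a posterior median could be computed on a grid. It is my own illustration, not code from the study. The prior shapes and their parameters (a power law with exponent 1.5, a Gaussian with mean 75 and standard deviation 15) are made-up stand-ins for the empirical priors, and the likelihood is an assumption on my part: the observed value is treated as equally likely to fall anywhere between zero and the true total.

```python
import numpy as np

def posterior_median(t_observed, grid, prior):
    """Median of the posterior over the total, evaluated on a uniform grid.

    Assumed likelihood (not stated in the post): the observation is equally
    likely to land anywhere in [0, total], so p(t | total) = 1 / total
    whenever total >= t, and 0 otherwise.
    """
    likelihood = np.where(grid >= t_observed, 1.0 / grid, 0.0)
    posterior = prior * likelihood          # Bayes' rule, up to a constant
    posterior = posterior / posterior.sum() # normalize over the grid
    cdf = np.cumsum(posterior)
    # First grid point at which half of the posterior mass has accumulated.
    return grid[np.searchsorted(cdf, 0.5)]

# Candidate totals (hypothetical units, e.g. millions of dollars or years).
grid = np.linspace(1.0, 1000.0, 100000)

# Hypothetical priors; the study used priors estimated from real data.
power_law_prior = grid ** -1.5
gaussian_prior = np.exp(-0.5 * ((grid - 75.0) / 15.0) ** 2)

print(posterior_median(60.0, grid, power_law_prior))
print(posterior_median(60.0, grid, gaussian_prior))
```

With the same observed value, the two priors lead to different predictions, which is the qualitative point: the shape of the prior determines how far beyond the observation the best guess should go.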

A recent paper entitled *A tutorial introduction to Bayesian models of cognitive development*, which Tenenbaum co-authored, is a broad treatment of how and why Bayesian inference is used in probabilistic models of cognitive development. There, the point is made that the Bayesian framework is generative, meaning that the observed data are assumed to have been generated by some underlying process. One of the strengths of Bayesian models is their flexibility.

…because a Bayesian model can be defined for any well-specified generative framework, inference can operate over any representation that can be specified by a generative process. This includes, among other possibilities, probability distributions in a space (appropriate for phonemes as clusters in phonetic space); directed graphical models (appropriate for causal reasoning); abstract structures including taxonomies (appropriate for some aspects of conceptual structure); objects as sets of features (appropriate for categorization and object understanding); word frequency counts (convenient for some types of semantic representation); grammars (appropriate for syntax); argument structure frames (appropriate for verb knowledge); Markov models (appropriate for action planning or part-of-speech tagging); and even logical rules (appropriate for some aspects of conceptual knowledge).

The representational flexibility of Bayesian models allows us to move beyond some of the traditional dichotomies that have shaped decades of research in cognitive development: structured knowledge vs. probabilistic learning (but not both), or innate structured knowledge vs. learned unstructured knowledge (but not the possibility of knowledge that is both learned and structured) (p. 10)

In his talk, Tenenbaum says that intelligence is about *finding structure in data*. This is key, I think, to why mathematics has played so prominent a role in physics. And, as Tenenbaum says, “it is the math, not a body of empirical phenomena that supports reduction and bridge-building.” His talk for the Simons Foundation is aimed at highlighting the significance of mathematics in the study of cognition, from Bayesian inference to Bayesian networks (probabilistic graphical models), where arrows describe probabilistic dependencies and algorithms compute inferences. His arguments lead to the development of probabilistic programming. He also made a brief reference to mathematics being explored by some of his colleagues that would serve to translate inferences into stochastic (random) circuits, suggesting a potential parallelism to the brain.

Tenenbaum points out that neuroscience is using the language of electrical engineering, and that common sense is at the heart of human intelligence. He goes on to explain that the toolkit of graphical models is not enough to capture the causal processes underlying our intuitive reasoning about the physical world or our intuitive psychological/social reasoning. To do this, one needs to define probabilities “not over a fixed number of variables, but over something much more like a program.” Current work is focused on designing programs that can be run forward for prediction and backward for inference, explanation and learning.
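The idea of running a program forward and backward can be illustrated with a toy example. This is only a sketch of mine, not anything taken from the talk: the coin scenario and its numbers are hypothetical, and the backward step uses plain rejection sampling, a simple stand-in for the far more capable inference machinery in real probabilistic programming systems.

```python
import random

def generative_program(rng):
    """Forward run: sample a hidden cause, then the data it produces."""
    coin_is_biased = rng.random() < 0.5          # hypothetical prior
    p_heads = 0.9 if coin_is_biased else 0.5     # hypothetical causal rule
    flips = [rng.random() < p_heads for _ in range(5)]
    return coin_is_biased, flips

def infer_biased(observed_heads, n_samples=20000, seed=0):
    """Backward run: estimate P(biased | number of heads observed).

    Rejection sampling: run the program forward many times and keep only
    the runs whose simulated data match the observation.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n_samples):
        biased, flips = generative_program(rng)
        if sum(flips) == observed_heads:
            kept.append(biased)
    if not kept:
        raise ValueError("no forward run matched the observation")
    return sum(kept) / len(kept)

print(infer_biased(5))  # five heads out of five: bias looks likely
print(infer_biased(2))  # two heads out of five: bias looks unlikely
```

The same program serves both purposes. Run as written, it predicts what data a given cause would produce; conditioned on data, it yields a judgment about the hidden cause.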

You can find a nice account of recent work on probabilistic programming on Radar.

The tutorial paper that Tenenbaum co-authored quotes Laplace (1816), who summed things up nicely when he said:

Probability theory is nothing but common sense reduced to calculation.
