"A wing would be a most mystifying structure if one did not know that birds flew." Nearly 50 years ago, Barlow (1961) used this opening sentence in his landmark paper about the organization of the (sensory) nervous system to motivate why the properties of neurons and neural circuits should be studied in a normative way. According to this approach, one should start by thinking about the functions neurons ought to serve, and then derive their properties from the premise that they serve those functions well -- rather than amassing all the teeny-weeny details known about those properties and then searching for some function that might explain some subset of them. Indeed, both neuronal and synaptic dynamics can be so complex that it is important to have in mind their potential functional roles, otherwise "one will get lost in a mass of irrelevant detail and fail to make the crucial observations" (Barlow 1961). In the past decades, computational neuroscience has seen a burgeoning of normative approaches. These studies made significant advances in formulating formal theories of optimality and optimal computations, identifying relevant physical and computational constraints under which those computations need to be implemented, developing analytical methods and numerical algorithms to solve the resulting constrained optimization problems, and relating these solutions to biological substrates. However, only a relatively small fraction of these studies attempted to make specific predictions about, and thus interpret in normative terms, the cellular-level electrophysiological properties of individual neurons or synapses. Small in number though it may be, this particular line of research has a potential impact that cannot be ignored: such theories may provide a way to bridge the gap between the cellular-molecular and the systems-level branches of neuroscience by connecting low-level properties of the nervous system to its high-level functions.
Our workshop aims to highlight and discuss recent work in this field. We will consider three different, though not necessarily unrelated, organizational principles that have been pursued in explaining cellular properties of neurons and synapses, and assess their predictive and interpretive power:
- The redundancy reduction hypothesis, which was later formulated as the infomax principle, assumes that the amount of transmitted or stored (Shannon) information should be maximized in neural circuits. Modern reincarnations of this idea seek to show that neural adaptation and long-term synaptic plasticity have key properties which are optimal for this function.
- The Bayesian approach assumes that neurons need to perform statistical inferences on their inputs for efficient computation, and recent studies show how neural spiking dynamics and short-term synaptic plasticity may contribute to such computations.
- The constraint-based approach assumes that basic biophysical constraints (energy, space, intrinsic noise, etc.) profoundly affect signalling in neurons and synapses, and shows how properties of spike generation and synaptic transmission reflect these constraints.
The non-exhaustive list above aptly illustrates that each of these principles can be applied to study both neurons and synapses, and conversely, the same neuronal-synaptic properties may feature in several functional theories. Identifying these overlaps, conflicts, and alternative interpretations in lively discussions and debates will be a central aspect of the workshop. Since much of the theoretical background in this field has been adopted from information theory, machine learning, and related fields, we expect that not only experimental and computational neuroscientists, but also machine learning researchers will be interested in the general topic and the specific talks.
|7.40|| Máté Lengyel, U Cambridge
A beginner's guide to constructing neural networks that remember
One of the classical computational tasks faced by the brain is that of autoassociative memory: memories stored in the past need to be recalled from fragmented and noisy information in the present. I will analyse auto-associative memory from an information theoretic perspective and treat it at the computational level as a problem of Bayesian inference. Unlike most previous approaches to autoassociative memory, this approach is fairly agnostic at the level of representation: it can be applied to cases when memories are represented by firing rates, or spike timings, or a combination of these. Therefore, the resulting theories have the potential to provide general guidelines for constructing optimal autoassociative networks, or to interpret properties of neural circuits in normative terms.
First, I will show how to optimise recall dynamics in a network of neurons. At the level of implementation, we predict how the synaptic plasticity rule that stores memories and the form of network dynamics that recalls them need to be matched to each other. We applied this theory to a case when memories are (at least partially) represented by spike times, and obtained experimental evidence confirming such a match in hippocampal area CA3. Second, I will present recent work about optimising synaptic plasticity for storing memories, treating it initially 'just' as a problem of information maximization. However, the theory points out a fundamental incompatibility between local learning rules for storage and local network dynamics for recall, which implies that the 'formatting' of information is just as relevant as its maximization. These results suggest normative interpretations for heterosynaptic learning rules and for wiring rules in sparsely connected networks.
|8.00|| Sophie Denève, École Normale Supérieure
Models of sensory coding and computation usually consider sensory cells as representing static stimuli in their receptive fields. In particular, this view pervades theories of visual perception where neurons are primarily seen as responding to stereotyped "patterns" which may include a temporal dimension (i.e. spatio-temporal receptive fields) but are otherwise largely independent of stimulus history.
However, all sensory systems, including the visual system, respond more strongly and precisely to dynamic stimuli than to steady ones. For instance, a hallmark of visual receptors is that they adapt so quickly that visual perception requires the constant retinal motion induced by small eye movements. Thus, it is likely that one of the major roles of sensory processing is to detect and signal sudden changes in the sensory environment.
To test this hypothesis, we explored the idea that sensory circuits are tuned to respond as quickly and reliably as possible to sudden transients in their inputs. To this end, we developed a Bayesian model of change detection under the assumption that the appearance (or disappearance) of a stimulus is unpredictable and causes rapid changes in the firing rates of noisy input spike trains. From this "ideal observer" (normative) model, we derived a minimal neural circuit estimating on-line the probability of stimulus appearance. This minimal circuit couples an excitatory synapse exhibiting short-term synaptic depression (STD) and an inhibitory synapse with short-term facilitation (STF). This mechanism has anatomical correlates in the neocortex, e.g. through Martinotti inhibitory interneurons and feed-forward inhibition.
We next explored the implication of this simple mechanism for sensory coding and adaptation, in particular in early stages of visual processing. A neural structure tuned to detect binary changes (i.e. "ON" and "OFF" transitions) will respond very differently from a system signalling continuous levels of stimulation (i.e. local luminance or motion energy). Assuming a simple firing mechanism corresponding to a decision threshold, we found properties analogous to the bi-phasic temporal receptive fields (tRFs) reported in the retina, LGN and V1. However, the predicted responses to time-varying stimuli are much sparser and more temporally precise than would be predicted by the tRF alone. Moreover, response gain and tRF shapes adapt to stimulus variance, also as reported experimentally. This invites us to revise current theories of the computational role and mechanism of this form of adaptation.
Our model predicts how biophysical parameters such as the time constants of synaptic plasticity should be tuned to the assumed statistics of the stimulus, i.e. its probability of appearance, its duration and the level of input noise. We derived on-line learning rules for these parameters based on the expectation maximization algorithm.
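The ideal-observer computation in this abstract, tracking on-line the probability that a stimulus is present given noisy input spikes, can be sketched as a two-state Bayesian filter. The sketch below is only an illustration under stated assumptions and is not the circuit derived in the talk: the function names, the Poisson input model, and all parameter values (`rate_off`, `rate_on`, `p_appear`, `p_disappear`, `dt`) are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Probability of observing k spikes when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def detect_changes(spike_counts, rate_off=5.0, rate_on=20.0,
                   p_appear=0.01, p_disappear=0.01, dt=0.01, p_init=0.5):
    """Return P(stimulus present | spikes so far) after each time bin."""
    p_on = p_init
    posterior = []
    for k in spike_counts:
        # Prediction: the stimulus may have appeared or disappeared since the last bin.
        prior_on = p_on * (1.0 - p_disappear) + (1.0 - p_on) * p_appear
        # Update: weigh both hypotheses by the Poisson likelihood of k input spikes.
        like_on = poisson_pmf(k, rate_on * dt)
        like_off = poisson_pmf(k, rate_off * dt)
        evidence = prior_on * like_on + (1.0 - prior_on) * like_off
        p_on = prior_on * like_on / evidence
        posterior.append(p_on)
    return posterior

# Hypothetical input: 50 silent bins, then a burst of input spikes.
counts = [0] * 50 + [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
trace = detect_changes(counts)
print(round(trace[49], 3), round(trace[-1], 3))
```

In this toy setting the posterior stays low during the silent period and rises quickly once the burst begins; the speed of that rise depends on the assumed appearance probability and on the ratio of the two input rates.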
|8.30|| Adrienne Fairhall, U Washington
Neural systems use strategies on several timescales for efficiently encoding different stimulus components. We review evidence for some of these strategies in several sensory systems. We will show how the biophysics of single neurons leads to dynamics and input/output transformations that can be seen to optimally track stimulus statistics and to implement efficient coding.
|9.30|| Tatyana Sharpee, Salk Institute
Predictable irregularities in retinal receptive fields based on normative approaches
Neural variability is present throughout the nervous system. Yet, in some tasks, motor output variability can be almost exclusively attributed to sensory input variability, despite the many levels of neural circuitry that are involved in sensory-motor transformations. This suggests that mechanisms for noise reduction exist. Here, we find one such mechanism by studying the properties of retinal cells. In vision, retinal ganglion cells partition visual space into approximately circular regions termed receptive fields (RFs). Average RF shapes are such that they would provide maximal spatial resolution if they were centered on a perfect lattice. However, individual shapes have fine-scale irregularities. Here, we find that irregular RF shapes increase the spatial resolution in the presence of lattice irregularities from 60% to 92% of that possible for a perfect lattice. Optimization of RF boundaries around their fixed center positions reproduced experimental observations neuron-by-neuron. Our results suggest that lattice irregularities determine the shapes of retinal RFs and that similar algorithms can improve the performance of retinal prosthetics where substantial irregularities arise at their interface with neural tissue.
|10.00|| Taro Toyoizumi, Columbia U
Randomly connected networks maximize Fisher information at the edge of chaos
A randomly connected network is known to show a transition from non-chaotic to chaotic behavior as the strengths of connections increase. Although this chaotic state has been proposed as the origin of the irregular activity seen in the cortex, its functional significance is largely unknown. In this study, I analytically derived the Fisher information of a recurrently connected network about its external input. I found that the Fisher information is maximized at the edge of chaos, where the system is most sensitive to the external input. Moreover, with observation noise, the chaotic state is more informative than the non-chaotic state around the critical point. The analytical expression of the Fisher information provides an intuitive picture of the trade-off between increasing signal and decreasing noise and shows how the input-output nonlinearity influences information coding. The optimal variation in synaptic strengths is predicted based on the input-output nonlinearity of neurons.
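The transition mentioned in the first sentence of this abstract can be illustrated numerically. The following is a minimal sketch, assuming a standard rate model dx/dt = -x + g J tanh(x) with Gaussian couplings of variance 1/n; the model form, the function name `simulate_rate_network`, and the parameter values are assumptions for illustration. It does not compute the Fisher information analysed in the talk.

```python
import numpy as np

def simulate_rate_network(g, n=200, steps=2000, dt=0.1, seed=0):
    """Euler-integrate dx/dt = -x + g * J @ tanh(x) and return the final state."""
    rng = np.random.default_rng(seed)
    # Coupling entries have variance 1/n, so g sets the overall connection strength.
    J = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
    x = rng.normal(0.0, 1.0, size=n)
    for _ in range(steps):
        x = x + dt * (-x + g * J @ np.tanh(x))
    return x

weak = simulate_rate_network(g=0.5)    # weak coupling: activity decays to zero
strong = simulate_rate_network(g=2.0)  # strong coupling: activity is self-sustained
print(np.abs(weak).mean(), np.abs(strong).mean())
```

With weak coupling the quiescent state is stable and the initial perturbation dies out, whereas with strong coupling the quiescent state is unstable and activity persists. In the large-network limit the change of regime for this model occurs near g = 1.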
Paul Adams, Kingsley Cox, Stony Brook U
What requirements must real synapses meet to successfully implement useful learning, especially from large data sets with higher-order correlations? Quite apart from detailed algorithms, one overriding, but often neglected, requirement is that synaptic weight adjustments be highly connection-specific. Learning theorists assume adjustments are completely specific, but recent experiments revealed small but detectable chemical coupling (presumably reflecting high synapse density and necessarily good spine-shaft electrical coupling). We studied the effect of minor coupling of weight adjustments in unsupervised learning by single neurons, using two paradigms. In PCA learning, using a linear Hebb rule, minor coupling never prevents almost perfect learning of the principal component. However, in ICA learning, using a nonlinear rule, minor coupling often leads to catastrophic failure to learn an independent component. We now report that when this collapse occurs, the nonlinear rule learns instead just the appropriate principal component: if a nonlinear rule is not sufficiently accurate, it behaves as a linear rule. This has important implications for hierarchical learning in the brain: if one layer of a network cannot learn anything useful (i.e. using nonlinear rules), it cannot supply other layers. Since the neocortex manifestly does hierarchically learn complex relationships, it must have solved this "Hebbian inspecificity" problem, despite apparently insurmountable biophysical constraints. We suggest that it does so using a layer-6-based proofreading strategy. Corticothalamic neurons would fire in response to spike coincidences across thalamocortical connections, and their firing would "approve" draft Hebbian changes at those connections occurring within the preceding 100 milliseconds, by shifting thalamic relays to "burst" mode. In this view the key to successful massively parallel learning is superaccurate weight adjustment allowed by cortical proofreading.
We show that the required thalamocorticothalamic circuitry matches closely that observed. We also suggest that weight coupling might undermine "neuromorphic" approaches to massively parallel machine learning.
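The PCA case described in this abstract, a linear Hebb rule whose weight adjustments leak slightly between synapses, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' simulation: it uses the averaged form of Oja's rule, applies a hypothetical coupling matrix `E` to the Hebbian term only, and the `quality` parameter, the covariance matrix, and all other values are made up for the example.

```python
import numpy as np

def coupled_hebb_pca(C, quality=0.95, eta=0.01, steps=5000, seed=0):
    """Averaged Oja rule whose Hebbian term leaks between synapses."""
    n = C.shape[0]
    rng = np.random.default_rng(seed)
    # Hypothetical coupling matrix: a fraction `quality` of each update reaches
    # the intended synapse; the rest is spread evenly over the other n - 1.
    leak = (1.0 - quality) / (n - 1)
    E = (quality - leak) * np.eye(n) + leak * np.ones((n, n))
    w = 0.1 * rng.normal(size=n)
    for _ in range(steps):
        hebb = C @ w               # expected value of y * x, with y = w . x
        decay = (w @ C @ w) * w    # expected value of y**2 * w (Oja normalisation)
        w = w + eta * (E @ hebb - decay)
    return w

# Hypothetical input covariance with one dominant direction (first basis column).
rng = np.random.default_rng(1)
basis, _ = np.linalg.qr(rng.normal(size=(5, 5)))
C = basis @ np.diag([5.0, 1.0, 0.5, 0.3, 0.2]) @ basis.T
pc1 = basis[:, 0]

w = coupled_hebb_pca(C)
alignment = abs(w @ pc1) / np.linalg.norm(w)   # |cosine| between w and the true PC
print(alignment)
```

In this sketch the weights converge to the leading eigenvector of `E @ C` rather than of `C`, so for small leak the learned direction stays very close to the principal component. The nonlinear (ICA) case discussed in the abstract is not covered here.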
Sharat Chikkerur, Thomas Serre, Cheston Tan, Tomaso Poggio, MIT
Bayesian inference theory predicts physiological effects of attention
The past four decades of research in visual neuroscience have generated a large and disparate body of literature on the role and effects of attention. One influential proposal by Tsotsos (1997) is that attention reflects evolution's attempt to fix the processing bottleneck in the visual system (Broadbent 1958). This is done by directing the finite computational capacity of the visual system preferentially to relevant stimuli within the visual field while ignoring everything else. The feature integration theory of Treisman & Gelade (1980) suggested that attention is used to bind different features (e.g., color and form) of an object during visual perception. The biased competition hypothesis (Desimone 1998) suggested that the goal of attention is to bias the choice between competing stimuli within the visual field. The 'guided search' hypothesis (Wolfe 2007) suggests that attention guides the search towards locations that share features with the target. These proposals, however, remain agnostic about how attention should be implemented in the visual cortex and do not yield any prediction about the various behavioral and physiological effects of attention. On the other hand, several computational models have attempted to model specific behavioral and physiological effects of attention. Behavioral effects include pop-out of salient objects (Itti & Koch 2001), top-down bias of target features (Wolfe 2007), serial vs. parallel search effects (Wolfe 2007), etc. At the neurophysiological level, several phenomena have been described, including a multiplicative modulation of neural responses under spatial attention (Rao 2005, McAdams and Maunsell 1999) and feature-based attention (Bichot et al. 2005). A unifying framework that can account for these disparate effects while being consistent with other mechanisms of visual perception is, however, still missing.
In this work, we suggest that the idea of perception as Bayesian inference (Knill and Richards, 1996, Lee and Mumford, 2003) may be extended to include visual attention as well. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information while feature-based attention reduces the uncertainty in spatial information. We propose an algorithmic implementation: a Bayesian network that can be mapped into the basic functional anatomy of attention involving the ventral stream (V4 and PIT) and the dorsal stream (LIP and FEF). The resulting model predicts, somewhat surprisingly, a number of psychophysical and physiological properties of attention. Attentional phenomena such as pop-out, multiplicative modulation and change in contrast response, which have been described in the recent literature as fundamentally different and in some cases as conflicting findings, are all directly predicted by the same model. Thus, the theory proposes a computational role for attention and leads to a model that performs well in recognition tasks and that predicts some of the main properties of attention at the level of psychophysics and physiology.
|16.00|| Jean-Pascal Pfister, U Cambridge
Far from being static relays, synapses are complex dynamical elements. The effect of a presynaptic spike on the postsynaptic neuron depends on the history of the activity of both pre- and postsynaptic neurons, and thus the efficacy of a synapse undergoes perpetual modification. These changes in efficacy can last from hundreds of milliseconds or minutes (short-term plasticity) to hours or months (long-term plasticity). In order to regulate their efficacy over these different time scales, synapses use more than 1000 different proteins. In the face of this complexity, it seems reasonable to study synaptic plasticity by starting from first principles rather than by modelling every biophysical detail.
In this talk, I will present two normative models of synaptic plasticity: one for long-term plasticity and one for short-term plasticity. The first model considers a synaptic learning rule that maximises, under some constraints, the mutual information between the pre- and postsynaptic spike trains. This type of learning rule is consistent with data about spike timing-dependent plasticity and can also be mapped to the well-known BCM learning rule.
The second model focuses on short-term plasticity and views it in a Bayesian framework. It starts from the commonplace observation that the spiking of a neuron is an incomplete, digital report of the analog quantity that contains all the critical information, namely its membrane potential. We therefore suggest that a synapse solves the inverse problem of estimating the presynaptic membrane potential from the spikes it receives, acting as a recursive filter. I will show that the dynamics of short-term synaptic depression closely resemble those required for optimal filtering.
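The inverse problem described here, estimating a hidden membrane potential recursively from the spikes it generates, can be sketched with a generic grid-based Bayesian filter. This is only an illustration and not the filter derived in the talk: the Ornstein-Uhlenbeck prior, the exponential spike-rate model, the function name `filter_membrane_potential`, and all parameter values are assumptions.

```python
import numpy as np

def filter_membrane_potential(spikes, dt=0.001, tau=0.02, sigma=1.0,
                              base_rate=10.0, gain=1.0, n_grid=201):
    """Posterior mean of the presynaptic potential u after each time bin."""
    u = np.linspace(-4.0 * sigma, 4.0 * sigma, n_grid)
    # Assumed prior dynamics: discretised Ornstein-Uhlenbeck process relaxing
    # to 0 with time constant tau and stationary std of approximately sigma.
    mean_next = u * (1.0 - dt / tau)
    sd_next = sigma * np.sqrt(2.0 * dt / tau)
    trans = np.exp(-0.5 * ((u[None, :] - mean_next[:, None]) / sd_next) ** 2)
    trans /= trans.sum(axis=1, keepdims=True)   # row i: P(next u | current u_i)
    # Assumed spiking model: Poisson firing with rate base_rate * exp(gain * u).
    p_spike = 1.0 - np.exp(-base_rate * np.exp(gain * u) * dt)
    belief = np.exp(-0.5 * (u / sigma) ** 2)
    belief /= belief.sum()
    estimates = []
    for s in spikes:
        belief = belief @ trans                               # predict one step ahead
        belief = belief * (p_spike if s else 1.0 - p_spike)   # condition on the bin
        belief /= belief.sum()
        estimates.append(float(belief @ u))
    return estimates

# Hypothetical spike train: silence, a single spike, then silence again.
spikes = [0] * 20 + [1] + [0] * 20
est = filter_membrane_potential(spikes)
print(est[19], est[20], est[21])
```

In this sketch the estimate drifts downward while no spikes arrive, jumps upward when a spike is observed, and then relaxes again. The grid filter is chosen for transparency rather than efficiency.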
|16.30|| Aldo Faisal, Imperial College
Normative neurophysiology from first-principle biophysical constraints
Do hard physical limits constrain the structure and function of neural circuits? We studied this problem from first-principle biophysics, looking at three fundamental constraints (noise, energy and time) and how the basic properties of a neuron's components set up a trade-off between them. We focus on the action potential as the fundamental signal used by neurons to transmit information rapidly and reliably to other neurons along neural circuits.
|17.30|| Dmitri Chklovskii, HHMI Janelia Farm
We have developed a theory of dendritic and axonal shape based on two principles: minimization of the wiring cost and maximization of the connectivity repertoire. These two principles can be expressed mathematically as an optimization problem. We solved this optimization problem using the methods of statistical physics and found good agreement with experimental measurements. Remaining discrepancies between theory and experiment should point to other factors affecting neuronal shape such as non-linear computations in dendrites.
How do processing delays influence the design of the cortex?
What normative models can describe heterogeneity in the nervous system, and quantify its constructive contribution to neural coding?
Is there any reason why higher level information processing strategies should constrain individual neurons?
Is there any link between the infomax principle and the organism's requirement for survival?
Are informational optimality (infomax) and computational optimality (e.g. Bayes) in conflict, and if so, how do we resolve this conflict?
|18.30||end of workshop|