``Dynamical Systems Hypothesis in Cognitive Science''

Robert F. Port, Indiana University

December 5, 2000 draft

Draft entry for the `Encyclopedia of Cognitive Science', MacMillan Reference Ltd, London. Amy Lockyer, Assoc. Ed.

Charge from the editor.

Article Title: DYNAMICAL SYSTEMS HYPOTHESIS IN COGNITIVE SCIENCE

Maximum text length in words: 2000

Level of article: Focussed (suitable for undergraduates/postgraduates)

Suggested first level headings:

1. The computational framework and the dynamical systems framework

2. Strengths of the dynamical systems approach

3. The role of time in cognitive models

4. The role of discrete and continuous representations in cognitive models

5. Dynamical systems models of sensory-motor capabilities

6. Dynamical systems models of decision making and high-level cognition

7. Situated cognition and the coupling of systems

8. Connectionism and the dynamical systems hypothesis

9. Summary

Outline of Draft (current size excluding references = 4,099 words)

A. Overview (proposed 1)

B. Mathematical Context (proposed 1, 3)

C. Perceptual and Motor Models (proposed 5)

D. High-level Cognition (proposed 6)

E. Relation to Situated Cognition and Connectionism (proposed 7, 8)

F. Contrasting the Dynamical Hypothesis with Traditional Approaches (proposed 1, 3)

G. Strengths and Weaknesses of the Dynamical Models (proposed 2)

H. Discrete vs. Continuous Representations (proposed 4)

A. Overview

The dynamical hypothesis in cognition refers to various research paradigms applying the mathematics of dynamical systems to understanding cognitive function. The approach is allied with and inspired by research in neural science over the past fifty years, in which dynamical equations have been found to provide excellent models for the behavior of single neurons (e.g., Hodgkin and Huxley, 1952). It also derives inspiration from work on gross motor activity by the limbs (e.g., Bernstein, 1967; Feldman, 1966). In the early 1950s, Ashby made the startling proposal that all of cognition might be accounted for with dynamical system models (1952), but little work directly followed from his speculative suggestion due to a lack of appropriate mathematical tools as well as the lack of computational methods to implement such models in a practical way. More recently, the connectionist movement (Rumelhart and McClelland, 1986) provided insights and mathematical implementations of learning, for example, that have helped restore interest in dynamical modeling.

The dynamical approach to cognition is also closely related to ideas about the embodiment of mind and the environmental situatedness of human cognition, since it emphasizes commonalities between behavior in neural and cognitive processes on one hand and physiological and environmental events on the other. The most important commonality is the dimension of time, which is shared by all of these domains. This permits real-time coupling between them, where the dynamics of one system influence another. Humans often couple many systems together, such as when dancing to music -- where the subject's auditory perception system is coupled with environmental sounds, and the gross motor system is coupled with both audition and environmental sound. Because of this commonality between the world, the body and cognition, according to this view, the method of differential equations is applicable to events at all levels of analysis over a wide range of time scales. This approach directs explicit attention to change over time and rates of change over time of system variables.
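The idea of real-time coupling can be made concrete with a minimal sketch. It assumes, purely for illustration, that the listener's motor rhythm behaves like a single phase oscillator pulled toward the phase of an external beat; the function name `entrain`, the sinusoidal coupling term and all parameter values are hypothetical choices, not a model taken from the literature cited here.

```python
import math

def entrain(stim_hz=2.0, own_hz=1.8, coupling=3.0, dt=0.001, seconds=20.0):
    """Euler-integrate one phase oscillator driven by a periodic stimulus.

    d(phi)/dt = 2*pi*own_hz + coupling * sin(theta - phi)
    where theta is the stimulus phase and phi the internal phase.
    Returns the final phase lag (theta - phi), wrapped to (-pi, pi].
    """
    theta = 0.0   # phase of the external beat
    phi = 1.0     # phase of the internal (motor) oscillator, arbitrary start
    for _ in range(int(seconds / dt)):
        dphi = 2 * math.pi * own_hz + coupling * math.sin(theta - phi)
        theta += 2 * math.pi * stim_hz * dt
        phi += dphi * dt
    lag = theta - phi
    return math.atan2(math.sin(lag), math.cos(lag))

# With sufficient coupling the lag settles to a constant value (phase locking);
# with coupling=0.5 the frequency difference is too large and the lag keeps drifting.
locked_lag = entrain()
```

When the coupling strength exceeds the difference in natural frequencies, the slower internal oscillator is pulled to the tempo of the beat and holds a fixed phase lag behind it, which is one simple sense in which the dynamics of one system can influence another.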

B. Mathematical Context

The mathematical models employed by dynamical systems research derive from many sources in biology and physics. Only two schemas will be pointed out here. The first is the neural network idea, partially inspired by the remarkable equations of Hodgkin and Huxley (1952), which account for many known phenomena about neurons in terms of the dynamics of the cell membrane. They proposed a set of differential equations for the flow of sodium and potassium ions through the axonal membrane during the passage of an action potential down the axon. These equations, which turn out to apply with slight modification to nearly all neurons, inspired extensions to account for whole cells (rather than just a patch of membrane) in terms of their likelihood to fire given various excitatory and inhibitory inputs. Interesting circuits of neuron-like units were also constructed and simulated on computer. The Hodgkin-Huxley equations also inspired many psychological models, like those of Grossberg (1976; 1980), the neural network models (Rumelhart and McClelland, 1986; Hinton, 1986) and models of neural oscillations (Kopell, 1995).

In this general framework, each cell (or group of cells) in a network of interconnected cells (or cell groups) is hypothesized to follow an equation like:

Equation 1: dA/dt = gA(t) + d[aE(t) – bI(t) + cS(t)] + bias

indicating that the change in activation (i.e., the likelihood of firing) at time t, dA/dt, depends on the decay, g, of the previous value of A plus a term representing inputs from other cells that are either excitatory (that is, tending to increase the likelihood of firing), E(t), or inhibitory (tending to decrease the likelihood of firing), I(t). For some units there may be an external physical stimulus, S(t). The nonlinearity, d(x), encourages all-or-none firing behavior and the bias term adjusts the value of the firing threshold. An equation of this general form can describe any neuron. Over the past 50 years, networks of units like these have demonstrated a wide variety of behaviors, including many specific patterns of activity that animal nervous systems exhibit.
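A small simulation may help show how an equation of this form behaves. The sketch below integrates Equation 1 with Euler steps for two units that inhibit each other. Several things are assumptions made only for illustration: the nonlinearity d is taken to be a logistic function, g is given a negative value so that the first term acts as decay, there is no excitatory input between the units, and the weights and function names (`squash`, `step`, `run`) are hypothetical.

```python
import math

def squash(x):
    """Logistic nonlinearity standing in for d(.) in Equation 1 (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))

def step(A, S, g=-1.0, a=0.0, b=6.0, c=4.0, bias=0.0, dt=0.01):
    """One Euler step of dA/dt = g*A + d[a*E - b*I + c*S] + bias for two units."""
    new = []
    for i in range(2):
        E = 0.0          # no excitatory input from other cells in this toy network
        I = A[1 - i]     # inhibition comes from the other unit's activation
        dA = g * A[i] + squash(a * E - b * I + c * S[i]) + bias
        new.append(A[i] + dt * dA)
    return new

def run(S, seconds=50.0, dt=0.01):
    """Start both units at zero activation and integrate with stimulus S."""
    A = [0.0, 0.0]
    for _ in range(int(seconds / dt)):
        A = step(A, S, dt=dt)
    return A

# The unit receiving the stronger stimulus ends up highly active
# and holds the other unit near zero.
final = run([1.0, 0.5])
```

Even this two-unit network shows the competitive, nearly all-or-none behavior described above: a modest difference in stimulus strength is amplified into a clear winner.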

A second concrete schema for the dynamical approach to cognition is simply the classical equation for a simple oscillator like a pendulum. Indeed, it is obvious that arms and feet have many of the properties of pendula. Students of motor control have discovered that pendular motion is a reasonable archetype for many limb gestures. A nearly identical system (lacking the complication of arc-shaped motion) is the equation for a mass-spring system. In this form:

Equation 2: m(d²x/dt²) + d(dx/dt) + k(x - x₀) = 0

it specifies simple harmonic motion in terms of the mass, m, times the acceleration, (d²x/dt²), the damping, d, scaling the velocity, (dx/dt), and the spring's stiffness, k, times the deviation from the neutral position, x₀, of the mass. Feldman (1966) used heavily damped harmonic motion to model a simple reach with the arm. If the neutral position, x₀ (the attractor position in this heavily damped system), can be externally set to the intended target angle, then an infinity of motions from different distances and directions toward the target will result – simply by allowing the neuromuscular system for the arm to settle to its fixed point, x₀. A number of experimental results, such as maximum velocity in the middle of the gesture, larger maximum velocity for longer movements, the apparently automatic correction for an external perturbation plus the naturalness of limb oscillation, are accounted for with such a model.
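The settling behavior of Equation 2 can be sketched numerically. The code below assumes a critically damped system (d² = 4mk) as one convenient instance of heavy damping; the parameter values and the function name `reach` are illustrative only and are not taken from Feldman's work.

```python
def reach(x_start, x_target, m=1.0, d=4.0, k=4.0, dt=0.001, seconds=10.0):
    """Integrate m*x'' + d*x' + k*(x - x_target) = 0, starting at rest.

    Returns (final_position, peak_speed, time_of_peak_speed).
    """
    x, v = x_start, 0.0
    peak_speed, t_peak = 0.0, 0.0
    for n in range(int(seconds / dt)):
        acc = -(d * v + k * (x - x_target)) / m
        v += acc * dt            # semi-implicit Euler: update velocity first
        x += v * dt              # then position, using the new velocity
        if abs(v) > peak_speed:
            peak_speed, t_peak = abs(v), (n + 1) * dt
    return x, peak_speed, t_peak

# Reaches from different starting points all settle at the same target,
# and a longer reach has a larger peak speed.
short = reach(0.0, 1.0)
long_ = reach(0.0, 2.0)
other_side = reach(3.0, 1.0)
```

Setting only the target x₀ is enough to generate the whole movement from any starting position; no trajectory has to be specified in advance, and the speed profile rises and falls on its own.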

In the most general terms, a dynamical system may be defined as a set of quantitative variables that change simultaneously and interdependently over quantitative time in accordance with some set of equations (van Gelder and Port, 1995). From this perspective, Newton's equations of motion for physical bodies were the earliest dynamical models. Mathematical developments over the past 30 years have revolutionized the field. Until the 1950s, the analysis of dynamical models was restricted to linear systems (where the dynamic rule has linearly proportional effects on system variables, as in Equation 2 above) containing no more than a couple of variables. Now, through the use of computational simulations (using discrete approximations, of course) and computer graphics to facilitate geometric interpretations of these systems, practical methods for studying nonlinear models with many variables are available.

Although models of this general class have proven useful for many problems, it remains to be seen whether such models will be viable in the long run to account for the full range of cognitive behavior.

C. Perceptual Models

Dynamical models seem particularly appropriate to account for recognition and motor control, since temporal data about the process of perception have been gathered for many years (in contrast to `general thinking' or reasoning) and because motor control is clearly a task that requires refined temporal control.

One well-known example of a dynamical model for general perception is the ART model (adaptive resonance theory) of Grossberg (1995). This neural network is defined by a series of differential equations, similar to the network equation shown above, describing how the activation of a node is increased or decreased by stimulus inputs, excitation and inhibition from other nodes and intrinsic decay. This depends on weights (represented above as matrices for a, b and c in Equation 1) which are modified by successful perceptual experiences (to simulate learning from experience). The model can discover the low-level features that are most useful for differentiating frequent patterns in its stimulus environment (using unsupervised learning), identify specific high-level patterns (even from noisy or incomplete inputs) and reassign resources whenever a new significant pattern appears in its environment without forgetting earlier patterns.

To recognize an object such as a letter from visual input, the signal from a spatial retina-like system excites low-level features in a set of nodes called collectively F1. The pattern of activated features in F1 feeds excitation through weighted connections to all the nodes of F2, the set of nodes that will do the identification. These nodes compete with each other by inhibiting each other in proportion to their activation. Thus, the best matching unit in F2 will quickly win and suppress all its neighbors. But this is only the first step in the process. At this point, the winning F2 node feeds activation back to those F1 nodes that it predicts should be active (as determined by the weighted connections). If not enough of the F1 nodes turn out to be active (that is, if the pattern does not match well enough), then the system rejects this identification by shutting down the F2 node that had won (for some time interval). Then the activation of F2 by F1 begins again and a different F2 node will win. If on feedback this one matches sufficiently well, then a ``resonance loop'' is established where F1 and F2 reinforce each other. Only at this point is successful (and, according to Grossberg, conscious) identification achieved. This perceptual model is dynamic because it depends on differential equations that gradually increase and decrease the activation of nodes in a network at various rates. Grossberg's group has shown that variants of this model can account in a very natural way for many phenomena of visual perception, including those involving backward masking, reaction time and so on.
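The search cycle just described can be sketched in a few lines. This is a deliberately reduced, discrete-step illustration: binary feature vectors stand in for F1 activity, each stored prototype stands in for one F2 node, and the differential equations that make the real model dynamic are omitted. The function name `recognize`, the `vigilance` threshold and the particular scoring formula are assumptions chosen for this sketch.

```python
def recognize(pattern, prototypes, vigilance=0.75):
    """Search F2 nodes until one passes the top-down match test.

    pattern    : list of 0/1 features active in F1 (at least one must be on)
    prototypes : list of 0/1 lists, one per F2 node
    Returns (index_of_resonating_node_or_None, nodes_tried_in_order).
    """
    active = sum(pattern)

    def overlap(proto):
        return sum(p & w for p, w in zip(pattern, proto))

    # Bottom-up pass: each F2 node is excited by its overlap with the input,
    # scaled so that small prototypes are not swamped by large ones.
    scores = [overlap(w) / (0.5 + sum(w)) for w in prototypes]
    tried = []
    while len(tried) < len(prototypes):
        # Competition: the most active node that has not been shut down wins.
        winner = max((j for j in range(len(prototypes)) if j not in tried),
                     key=lambda j: scores[j])
        tried.append(winner)
        # Top-down pass: does the winner's prediction cover enough of the input?
        if overlap(prototypes[winner]) / active >= vigilance:
            return winner, tried      # match is good enough: resonance
        # Otherwise the winner stays shut down and the search continues.
    return None, tried

result = recognize([1, 1, 1, 1, 0, 0],
                   [[1, 1, 0, 0, 0, 0], [1, 1, 1, 1, 1, 0]])
```

In this example the first prototype wins the initial competition but predicts only half of the active features, so it is rejected, and the second prototype then wins and matches.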

D. High-level Models

In recent years, dynamical models have also been applied to more high-level cognitive phenomena. First, Grossberg and colleagues have elaborated the ART model with mechanisms like `masking fields' that permit the model to be extended to tasks like word recognition from auditory input arriving over time. Several time-sensitive phenomena of speech perception can be successfully modeled this way (Grossberg, 1986). Second, models of human decision making for many years have applied `expected utility theory,' where a choice is based on evaluation of relative advantages and disadvantages of each choice at some particular point in time. But Townsend and Busemeyer (1995) have been developing their decision field theory that not only accounts for the likelihood of each eventual choice, but also accounts for many time-varying aspects of decision making, such as approach-avoidance effects, vacillations, and the fact that some decisions need more time than others.
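To illustrate how a model that unfolds in time can yield both a choice and the time taken to reach it, here is a minimal sketch of a noisy preference accumulator. It is a much-reduced relative of the kind of model described above, not decision field theory itself: a single preference state drifts toward one of two bounds, and the names `decide` and `summarize` and all parameter values are hypothetical.

```python
import math
import random

def decide(drift, noise=1.0, threshold=1.0, dt=0.01, rng=None, max_time=60.0):
    """Accumulate a noisy preference until it crosses +threshold or -threshold.

    Returns (choice, decision_time); choice is 'A' for the upper bound,
    'B' for the lower bound, or None if no bound is reached by max_time.
    """
    rng = rng or random.Random()
    p, t = 0.0, 0.0
    while t < max_time:
        p += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if p >= threshold:
            return "A", t
        if p <= -threshold:
            return "B", t
    return None, t

def summarize(drift, trials=1000, seed=0):
    """Return (share of 'A' choices, mean decision time) over many trials."""
    rng = random.Random(seed)
    results = [decide(drift, rng=rng) for _ in range(trials)]
    share_A = sum(1 for c, _ in results if c == "A") / trials
    mean_time = sum(t for _, t in results) / trials
    return share_A, mean_time

easy = summarize(2.0)   # one option is clearly better
hard = summarize(0.1)   # the options are nearly equal
```

When the options are nearly equal the preference state wanders back and forth before reaching a bound, so choices are less consistent and take longer on average, which is the kind of time-varying behavior a static utility comparison does not describe.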

Finally, it is important to note that phenomena that at first glance seem to depend on high-level cognitive skills may turn out to reflect much more low-level properties of cognition. One of the most startling results of this kind is in the ``A-not-B problem''. This is a traditional puzzle in early cognitive development first discovered by Piaget (1954) and interpreted by him as showing that very young children (roughly 9-12 months) do not have the concept of `object permanence', that is, that children have inadequate representations of objects, thinking falsely that objects intrinsically belong to the specific place where they were first found. Here is the supporting experimental procedure in canonical form. With the child watching, the experimenter places an interesting object under a cover on the table (at position A) in front of the child and lets the child reach over to lift the cover and grab the toy. This is done several times. Now the object is put under a different cover in front of the child (at position B). Astonishingly, children at this age will often reach back to lift the cover they had reached to earlier rather than the correct one – to A not to B! Many experiments have explored this phenomenon over the past forty years and a wide variety of cognitive accounts have been proposed in addition to Piaget's notion of inadequate `object permanence'. Recently, Thelen, Schöner, Schrier and Smith (2000) have demonstrated that the hypothesized cognitive representation of the object has nothing to do with this behavior. Instead, their dynamical model for choosing the direction of reach shows that the children have generated a strong bias (or habituation) toward reaching in the A direction due to the repeated earlier reaches. Watching the experimenter put the object in a new location is not sufficient to overcome the bias to repeat the same gesture as before.

The authors show that the bias can be overcome, however, by such simple revisions as (a) inserting a time delay between the first reaches and the final one (giving time for the first bias to decay), or (b) only letting the child reach once to the first location rather than multiple times (so the directional bias to A is less strong) or (c) using an object in the B location that is new and more interesting to the child than the earlier one (thus boosting the directional bias toward the new position relative to the old). Their dynamical model predicts sensitivity to just the same variables. The lesson is that sometimes what seems at first to be a property of abstract, high-level, static representations may turn out to result from rather low-level time-sensitive effects – most of which are naturally modeled in terms of dynamical equations.
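The logic of the three manipulations can be shown with a toy calculation. This is not the authors' field model: it simply assumes that each reach to A adds a fixed amount to a bias toward A, that the bias decays exponentially with time, and that the child reaches to B only if the cue at B outweighs the remaining bias. The function name `reaches_to_B` and all numerical values are hypothetical.

```python
import math

def reaches_to_B(n_A_reaches=4, delay=0.0, cue_B=1.0,
                 gain=0.4, decay_rate=0.5):
    """Return True if the cue at B outweighs the accumulated bias toward A.

    n_A_reaches : number of earlier reaches to location A
    delay       : seconds between the last A reach and the B trial
    cue_B       : strength (salience) of the object hidden at B
    """
    bias_A = n_A_reaches * gain * math.exp(-decay_rate * delay)
    return cue_B > bias_A

baseline = reaches_to_B()                      # perseverates: reaches to A
after_delay = reaches_to_B(delay=2.0)          # (a) bias has had time to decay
single_reach = reaches_to_B(n_A_reaches=1)     # (b) bias never grew strong
novel_object = reaches_to_B(cue_B=2.0)         # (c) stronger pull toward B
```

Each of the three revisions flips the outcome from A to B without any change to how the object is represented, which is the point of the dynamical account.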

E. Relation to Situated Cognition and Connectionism

Although the issue of situated cognition can be interpreted in many ways that have little to do with dynamical systems, the dynamical systems approach is highly compatible with this concern. The primary reason is that, from this perspective, the world, the body and the cognitive functions of the brain can all be analyzed using the same conceptual tools. This is important because it greatly simplifies our understanding of the relationship between these systems, and is readily interpreted as an instance of the biological adaptation of the body and the brain to the environment.

Connectionist models are discrete approximations of dynamical systems and so are the learning algorithms used with them. But when computational models are used simply to do identification or make a choice by settling to a fixed point (as was the case in the early connectionist models of the mid 1980s), they seem to be focused on tasks that are largely inspired by issues in symbolic cognitive models. Even if they attempt to solve temporal problems yet address only time specified as serial order, they would seem to be minimally dynamical. The touchstone of a thoroughly dynamical approach is the study of phenomena that occur in continuous time. The source of `situatedness', we would say, is situation in real time. Of course, neural networks are frequently used to study such phenomena, but other dynamical methods are also available for some problems that do not employ network simulations. Thus the development of connectionist modeling since the 1980s has certainly helped to move the field in the direction of dynamical thinking, but these models are not always good illustrations of the dynamical hypothesis of cognition.

F. Contrasting the Dynamical Systems Framework with Traditional Approaches

It should be acknowledged that the most widespread conceptualization of the mechanism of human cognition is that cognition resembles computational processes, like deductive reasoning or long division, by making use of symbolic representations of objects and events in the world that are manipulated by cognitive operations (typically serially ordered) which might reorder or replace symbols, and draw deductions from them. This approach has been called the computational approach and its best-known articulation is the physical symbol system hypothesis (Newell and Simon, 1975). The theoretical framework of modern linguistics (Chomsky, 1965) also falls within this tradition since it views sentence generation and interpretation as a serially ordered process of manipulating word-like symbols (such as table and go), abstract syntactic symbols (like NounPhrase or Sentence) and letter-like symbols representing minimal speech sounds (such as /t/, /a/ or features like [Voiceless] or [Labial]) in discrete time. In considering skills like the perceptual recognition of letters and sounds or recognizing a person's distinctive gait, or the motor control that produces actions like reaching, walking or pronouncing a word, the traditional approach hypothesizes that all processes of cognition are accomplished by computational operations that manipulate digital representations in discrete time. The mathematics of such systems is based on an abstract algebra dealing with the manipulation of strings and graphs of distinct symbol tokens. Indeed, Chomsky's work on the foundation of such abstract algebras (Chomsky, 1961) served as a theoretical foundation both for computer science and cognitive science, as well as modern linguistic theory.

It should be noted that the dynamical systems hypothesis for cognition is in no way incompatible with serially ordered processes on discrete symbol tokens. Some possible examples are forms of human cognition like doing arithmetic and generating a sentence. Still, the dynamical systems approach denies that all cognition (or even most) can be satisfactorily understood in computational terms and insists that, since the physiological brain, the body and the environment are best accounted for in dynamical terms, any explanation of human symbolic processing must include an account of its implementation in dynamical terms. The dynamical approach points out the inadequacy of simply assuming that a `symbol processing mechanism' is somehow available to human cognition, the way a computer happens to be available to a programmer. Instead, wherever either discrete-time or digital functions are found in cognition, the continuous processes which give rise to the discreteness and digitality demand, sooner or later, a complete, continuous-time account. Indeed, the dynamical approach tends to deny that cognition can be separated from the physical body or from the environment. A fundamental contrast is that the discrete time of computational models is replaced with continuous time, for which rate of change and changes in the rate of change (first and second derivatives) etc. are meaningful at each instant. Thus the dynamical approach is invariably dependent upon studying parameters that change over time, and it attempts to understand those changes by modelling them.

G. Strengths and Weaknesses of Dynamical Models

Dynamical modeling offers many important strengths relative to traditional symbol processing or computational models of cognition. First, the biological plausibility of digital, discrete-time models is always a problem. How and where in the brain might there be a device that would behave like a computer chip, clicking along performing infallible operations on digital units? The answer often offered in the past was "We don't really know how the brain works, anyway, so this hypothesis is as plausible as any other." Such an argument does not seem as reasonable today as it did 30 or 40 years ago. Certainly neurophysiological function exhibits many forms of discreteness. But that does not justify simply assuming whatever kinds of units and operations we like.

Second, temporal data can finally, by this means, be incorporated directly into cognitive models. Phenomena like (a) processing time (observable in data on reaction time, recognition time, response time, etc.), (b) temporal structure in motor behaviors (like motor patterns of reaching, speech production, locomotion, dance, etc.), and (c) temporal structure in stimulation (e.g., for speech and music perception, interpersonal coordination in, e.g., watching a tennis match, etc.) can now be linked together if events can be predicted in time.

The language of dynamical systems provides a conceptual vocabulary that permits unification of cognitive processes in the brain with physiological processes in our bodily periphery and with environmental events external to the organism. Unification of processes across these fuzzy and partly artificial boundaries makes possible a truly embodied and situated understanding of human behavior of all types. The discrete time modeling of traditional approaches was always forced to draw a boundary separating the discrete time, digital aspects of cognition from continuous time functions (as, e.g., in Chomsky's distinction of Competence vs. Performance).

Third, cognitive development and `runtime processing' can now be integrated, since learning and perceptuo-motor behavior are governed by similar processes even if on different time scales. Symbolic or computational models were forced to treat learning and development as totally different processes unrelated to motor and perceptual activity.

Finally, trumping the reasons given above, there is the important fact that dynamical models include discrete-time, digital models as a special case whereas the other way around is not possible. (The sampling of continuous events permits discrete simulation of continuous functions, but the simulation itself remains discrete and digital at all times and only models a continuous function up to its Nyquist frequency, that is, up to half the sampling rate. See Port, Cummins and McAuley, 1995.) Thus, any actual digital computer is also a dynamical system with real voltage values in continuous time that are subjected to discretization by an independent clock. Of course, computer scientists prefer not to look at them as continuous valued dynamical systems (because it is much simpler to treat them as digital machines) but computer engineers certainly do. Hardware engineers have learned to constrain the dynamics so that it is governed with great reliability by powerful attractors for each binary cell. These attractors assure that each bit is able to settle into one of its two possible states before the next clock tick comes round, when each cell is subject to a `poke' with a voltage.
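The Nyquist limit mentioned above can be checked with a short calculation. The sketch samples two sine waves at 10 samples per second, so the Nyquist frequency is 5 Hz; the frequencies and the helper name `sample` are chosen only for illustration.

```python
import math

def sample(freq_hz, rate_hz, n):
    """Return n samples of sin(2*pi*freq*t) taken at rate_hz samples per second."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

fast = sample(9.0, 10.0, 20)   # 9 Hz: above the 5 Hz Nyquist frequency
slow = sample(1.0, 10.0, 20)   # 1 Hz: below it

# Sample by sample, the 9 Hz wave equals the 1 Hz wave with its sign flipped,
# so the discrete record cannot tell the two signals apart.
largest_difference = max(abs(f + s) for f, s in zip(fast, slow))
```

Once sampled, the 9 Hz signal is indistinguishable from an inverted 1 Hz signal, so the discrete simulation captures the continuous function only below half the sampling rate.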

These strengths of dynamical modelling are of great importance to our understanding of human and animal cognition. As for weaknesses of dynamical modelling, there are certainly several. First, the mathematics of dynamical models is quite a bit more inscrutable and less developed than the mathematics of digital systems. It is clearly much more difficult, for the time being, to construct actual models except for carefully constrained simple cases.

Second, during some cognitive phenomena (such as when a student is performing long division, or a mathematician is doing a proof or planning the design of an algorithm, and possibly to some degree during human sentence generation and interpretation) humans appear to rely on serially ordered operations on digital symbols. Although, as noted, dynamical models are in principle capable of exhibiting digital behavior, how a neurally plausible model could do this remains beyond our grasp for the time being. For understanding such phenomena, it seems that computational models are, at the very least, simpler and more direct, even if they are inherently inadequate.

H. Discrete vs. Continuous Representations

One of the great strengths of the classical computational approach to cognition is the seeming clarity of the traditional notion of a cognitive representation. If cognition is conceived as functioning rather like a program in Lisp, then the representations resemble Lisp atoms and s-expressions. Representations are distinct data structures of arbitrary form that happen to have semantic content with respect to the world outside or inside the cognitive system. They can be moved around or transformed as needed. Of course, they closely resemble words and sentences in natural language. Thus, if I think a thought about making a sandwich from bread and ham in the refrigerator, one can imagine that I employ cognitive representations of bread, the sandwich, my refrigerator, and so on. Thinking about sandwich assembly might be cognitively modeled by representations of sandwich components. Similarly, constructing the past tense of a word like walk can be modeled as a process of concatenating the representation of walk with the representation of -ed to yield walked. However, this traditional view runs into more difficulties when we try to imagine thinking about actually slicing the bread or spreading the mayonnaise.

What representations might play a role here? -- a knife, perhaps, but what about moving my body and arm to just the right place at just the right velocity? How could discrete, wordlike representations be employed to yield successful slicing of bread? And if this is, instead, to be handled by a nonrepresentational system (perhaps a dynamical one), then how could we determine the boundary between these two distinct and seemingly incompatible types of systems?

The development of connectionist models in the 1980s, employing networks of interconnected nodes, provided the first alternative to the view of representations as discrete context-invariant manipulable tokens. In connectionist models, the result of a process of identification (of, say, a letter of the alphabet or a human face) is only a temporary pattern of activations across a particular set of nodes, not something resembling a context-free, self-contained object. The possibility of representation in this more flexible form led to the notion of distributed representations, where no apparent `object' can be found to do the representing, but only a particular pattern distributed over the same set of nodes as are used for many other patterns. Such a representation does not resemble its meaning in any way and would not seem to be a good candidate for a symbol as conventionally conceived.

The development of dynamical models of perception and motor tasks has led to further extension of the notion of representation to include time-varying trajectories, limit cycles, coupled limit cycles and attractors toward which the system state may tend but which may never be achieved. From the dynamical viewpoint, static representations play a far more limited role in cognition. Indeed, a few researchers in this tradition deny that static representations are ever needed (Brooks, 1997).

References

Ashby, R. (1952) Design for a Brain. (Chapman-Hall, London)

Brooks, Rodney (1997) Intelligence without Representation. In J. Haugeland (ed.) Mind Design II (MITP, Cambridge, MA), pp. 395-420.

Chomsky, Noam (1961) On the notion `rule of grammar.' Proceedings of the 12th Symposium in Applied Mathematics, 6-24.

Chomsky, Noam (1965) Aspects of the Theory of Syntax. (MITP, Cambridge, MA)

Fel’dman, A. G. (1966) Functional tuning of the nervous system with control of movement or maintenance of a steady posture---III. Mechanographic analysis of the execution by man of the simplest motor tasks. Biophysics 11, 766-775.

Grossberg, Stephen (1995) Neural dynamics of motion perception, recognition, learning and spatial cognition. In Port and van Gelder (eds) Mind as Motion: Explorations in the Dynamics of Cognition (MITP, Cambridge, MA), 449-490.

Grossberg, Stephen (1986) The adaptive self-organization of serial order in behavior: Speech, language and motor control. In E. Schwab and H. Nusbaum (eds.) Pattern Recognition by Humans and Machines: Speech Perception (Academic Press, Orlando, FL).

Haken, H., Kelso, J. A. S., & Bunz, H. (1985). A theoretical model of phase transitions in human hand movements. Biological Cybernetics, 51, 347-356.

Haugeland, John. (1985). Artificial Intelligence: The Very Idea. Cambridge, MA: Bradford Books, MITP.

Kopell, Nancy (1995) Chains of coupled oscillators. In M. Arbib (ed) Handbook of Brain Theory and Neural Networks (MITP; Cambridge MA), pp. 178-183.

Newell, Allen, & Herbert Simon. (1975) Computer science as empirical inquiry. Communications of the ACM, pp. 113-126.

Piaget, Jean (1954) The Construction of Reality in the Child (Basic Books)

Port, Robert F., Fred Cummins, and J. Devin McAuley. Naive time, temporal patterns and human audition. In Robert F. Port and Timothy van Gelder, editors, Mind as Motion: Explorations in the Dynamics of Cognition. MITP, Cambridge, MA, 1995.

Port, Robert & Timothy van Gelder, editors. Mind as Motion: Explorations in the Dynamics of Cognition. Bradford Books/MITP, 1995.

Thelen, Esther, G. Schöner, C. Schrier and L. B. Smith (2000) The dynamics of embodiment: A field theory of infant perseverative reaching. Behavioral and Brain Sciences (in press).

Townsend, James and Jerome Busemeyer (1995) Dynamic representation of decision making. In Robert F. Port and Timothy van Gelder, editors, Mind as Motion: Explorations in the Dynamics of Cognition. MITP, Cambridge, MA, 1995.

van Gelder, Timothy, & Robert Port. It's about time: Overview of the dynamical approach to cognition. In Robert Port and Timothy van Gelder, editors, Mind as motion: Explorations in the dynamics of cognition, pages 1-43. Bradford Books/MITP, 1995.

Books

Abraham, Ralph H. and Christopher D. Shaw (1982) Dynamics: The Geometry of Behavior. Parts 1-4. (Ariel Press; Santa Cruz, CA).

Clark, Andy (1997) Being There: Putting the Brain, Body and World Together Again (MITP, Cambridge, MA).

Haugeland, John. (1985). Artificial Intelligence: The Very Idea. Cambridge, MA: Bradford Books, MIT Press.

Kelso, J. A. Scott (1995) Dynamic Patterns: The Self-organization of Brain and Behavior (MIT Press; Cambridge, MA).

Port, Robert and Tim van Gelder (1995) Mind as Motion: Explorations in the Dynamics of Cognition. (MIT Press, Cambridge, MA.).

Thelen, Esther and Linda Smith (1994) A Dynamical Systems Approach to the Development of Cognition and Action (MIT Press; Cambridge, MA).

Articles

Thelen, E., G. Schöner, C. Schrier and L. B. Smith (2000) The dynamics of embodiment: A field theory of infant perseverative reaching. Behavioral and Brain Sciences (in press).

van Gelder, Tim, & Robert Port. It's about time: Overview of the dynamical approach to cognition. In Robert Port and Timothy van Gelder, editors, Mind as Motion: Explorations in the dynamics of cognition, pages 1-43. Bradford Books/MIT Press, 1995.