Self-entrainment in Animal Behavior
and Human Speech

Robert Port, Keiichi Tajima, and Fred Cummins
Indiana University
Linguistics, Computer Science, Cognitive Science

April 25, 1996

Introduction

What is `self-entrainment'?

There is a very important characteristic of the behavior `of mice and men' that seems to be largely overlooked as a general feature of cognitive systems: humans and animals typically exhibit self-entrainment in their physical activity. But what is self-entrainment? When one physically oscillating system entrains another, the timing of repetitive motions by one system influences motions by the other oscillator such that they fall into a simple temporal relationship with each other -- that is, they tend to perform their motions in the same amount of time, or in half (or double) the time, or in some other simple integer ratio of time. By self-entrainment we mean that in the actions of a complex body, a gesture by one part of the body tends to entrain gestures by other parts of the body.

This property is well known in the literature on motor control. Motor-control researchers have commented many times on its general importance and have tried to encourage others to see its relevance to an understanding of the temporal structure of human cognition (e.g., Bernstein, 1967; Kelso, 1995; Turvey, 1990). Cognitive science as a whole, however, tends to underappreciate the significance of this phenomenon. This probably reflects a lack of appreciation of the problem of time in general within cognitive science (see Port and van Gelder, 1995). If one takes the timing of events in cognition to be important, then any such simple temporal constraint immediately stands out as potentially of enormous significance for understanding patterns like words and sentences.

In this paper we will remind readers of some everyday illustrations, review a couple of well-documented cases from the literature, and then report the results of some simple experiments suggesting that ordinary human speech exhibits self-entrainment more or less whenever given the opportunity to do so. We shall conclude by suggesting that self-entrainment is deeply revealing about the way time is handled in the nervous systems of animals for many purposes going well beyond the coordination of limbs.

Self-entrainment in Everyday Activity.

Research on motor behavior, especially in humans but also in other animals like fish, shows that when one gesture is performed simultaneously with another gesture -- even by a distant part of the body -- the two gestures have a strong tendency to constrain each other. That is, cyclic gestures that could in principle be completely independent -- like walking and waving the hand, or repeated reaching with the two hands, or wagging the index fingers of each hand -- tend not to be independent. Gestures tend to cycle in the ratio 1:1 or 1:n or m:n (where m and n are small integers). For example, most joggers have noticed that during steady-state jogging, one's breathing tends to lock into a fixed relationship with the step cycle -- with, say, 2 or 3 steps to each inhalation cycle, or perhaps 3 steps to 2 inhalations. To the runner it feels easier and less effortful, and it probably is (cf. Bramble and Carrier, 1983).

Similar phenomena have been observed in the laboratory in various forms for over a hundred years (see references in Collier and Wright, 1995, and Treffner and Turvey, 1993). For example, in one recent study (Treffner and Turvey, 1993, Experiment 3) subjects were asked to sit in a chair with their arms resting on the chair arms and then to swing a pendulum along the side of the chair. The pendula were essentially pieces of broomstick with various weights attached at one end and thus had different natural frequencies, that is, different frequencies at which they would swing if suspended from a fulcrum and bumped. This frequency is quite close to the rate at which subjects swing them with least effort. The experimenters studied how each pendulum was swung just by itself and also when the other arm had a different pendulum to swing, for various pendulum combinations. Basically, when swinging two pendula, subjects had a strong tendency to entrain them to each other in simple ratios like 1:1, 2:1, 3:1 or 2:3 -- depending on the difference in the natural frequencies of the two pendula. Other, more complex ratios would occur sometimes (like 4 cycles on the left to 5 cycles on the right), but these were quite unstable and tended to quickly slip over to simpler ratios, like either 2:3 or 1:2.
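
The notion of a natural frequency can be made concrete with the textbook formula for an idealized simple pendulum, f = sqrt(g/L)/(2 pi). The weighted broomsticks in these experiments are really compound pendula, so the sketch below is only a first approximation:

```python
import math

def simple_pendulum_frequency(length_m, g=9.81):
    """Natural frequency (Hz) of an idealized simple pendulum,
    f = sqrt(g / L) / (2 * pi). A weighted broomstick is a compound
    pendulum, so this gives only a rough estimate of the rate at
    which it would be most comfortable to swing."""
    return math.sqrt(g / length_m) / (2.0 * math.pi)
```

Shortening the pendulum (or moving the weight toward the fulcrum) raises its natural frequency, which is how the different pendulum combinations produced different preferred ratios.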

This phenomenon, where one arm entrains the other, is not restricted to cyclic activity like waving a pendulum or a finger. Even a single, one-time gesture, if performed by two hands, tends to exhibit entrainment between the two hands. Kelso et al. (1979) asked subjects to perform easy and hard reaching gestures with one or two arms, and to do so as quickly as they could without making many errors. Subjects sat at a table and placed, for example, their right index finger on a spot (with a touch sensor). Then on a signal, they reached to their right along the table to touch either a nearby target area with a large touch sensor or else a small, more distant target area. Kelso et al. found that subjects can perform the short, easy reach quite a bit faster than the harder, more distant reach with either hand. But when subjects were asked to perform these reaches with both hands simultaneously, the easy reach was strongly constrained by the harder one. Both gestures started and ended together and took about the same amount of time. It seemed that each hand entrained the other in the ratio 1:1 for the duration of the reach. Kelso et al. interpret this as evidence about the attractor structure of the dynamical system that provides the cognitive mechanism coordinating the two gestures. Coordinating the two reaches so that they have the same duration is apparently easier and more reliable than allowing (or forcing) each gesture to be independent of the other. It is not claimed that this kind of self-entrainment is the only way the two arms can be moved. With practice and appropriate payoffs, subjects could presumably learn many possible relations between the arms. But entrainment seems to be "the most natural way" -- apparently the most straightforward way for humans to perform such a novel coordinated act.

Other examples of the strong tendency toward self-entrainment may be found in the difficulties faced when learning to play musical instruments that require using both hands -- instruments like guitar, piano, flute and drums (as opposed to trumpet). On these instruments, the novice must learn to control the phase relations of the two hands appropriately. To produce an extended trill on a bongo drum, for example, the two hands must be kept at opposite phase even at very high rates. For novice drummers the hands tend to slip over to zero phase lag (where they hit the drum head simultaneously). In the case of plucked string instruments, like the guitar or mandolin, the fingers of the left hand must clamp onto the fingerboard just before the plectrum strokes the string with the right hand. It seems to us that for a novice player, there is a tendency for the left-hand finger to slap down on the string simultaneously with the plectrum's stroke across the string. Of course, you don't get a good sound in this case (because the plectrum excites the string while it is still partially damped by the soft flesh of the fingertip). If they keep practicing, learners eventually get the phase offset correct and can then produce fast runs and arpeggios. To play the piano in various styles, there may be rather different phase relationships that are typical of each genre. Compare playing a military march on the piano (where a feature of the style is that the left and right hands frequently strike the keys simultaneously) with playing boogie-woogie (where the left hand beats out a steady bass pattern on the beat while the right hand operates quite independently, with comparatively few of its strokes in phase with the left hand). Getting the knack of such a style of performance requires decoupling the two arms from each other in some sense. So it seems that musical performance skills frequently involve careful control of self-entrainment by parts of the body.

Another characteristic of the self-entrainment of separate gestures is that any similarity between the gestures tends to be automatically increased or exaggerated. For example, the old task of rubbing the tummy while patting one's head is notoriously tricky until one has practiced it a little. One reason it is hard is that, although each gesture by itself -- the rotary gesture and the pat -- is an easy and familiar skill, one discovers when performing them together that they have very similar natural frequencies (at least if you are moving the whole arm at the elbow in each case). This similarity of natural period seems to induce the different hand gestures to interfere with each other. In contrast, if you rest your arm on a table and tap one finger rapidly while rubbing the belly slowly, there doesn't appear to be much interference (although if you keep it up, the gestures will still tend to fall into a regular 1:n harmonic ratio with each other). Why does the degree of similarity of period between the competing gestures matter? Probably because a one-to-one temporal relationship tends to encourage treatment of the cyclic events as the same event. The novice guitar player has the same difficulty: strokes with the left and right hands get merged or confused with each other. So these familiar examples demonstrate further types of entrainment of one body part by another.

Theoretical models of these phenomena invariably employ dynamical models of coupled oscillators (Yamanishi, Kawato and Suzuki, 1979; Kelso, 1995). Each gesture is described by an equation specifying a vector field within which the system state moves, but the equation for each limb includes a term reflecting the current state of the other limb. For example, the sine circle map (Glass and Mackey, 1988) shows why simple rational relations (e.g., 1:1, 1:2, 2:3, etc.) should be more stable than other relationships. It is the `coupling terms' in the oscillatory equations that account for how each oscillator affects the other.
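
The mode-locking behavior of the sine circle map can be sketched in a few lines of Python (our own illustration of the map discussed by Glass and Mackey, 1988; the parameter names are ours). The winding number -- the mean phase advance per iteration -- locks onto simple rational values over whole intervals of the driving frequency whenever the coupling is nonzero:

```python
import math

def winding_number(omega, k, n_transient=500, n=4000):
    """Mean phase advance per iteration of the sine circle map
        theta[i+1] = theta[i] + omega - (k / 2 pi) * sin(2 pi theta[i]),
    computed on the unwrapped phase. With k > 0 this winding number
    locks onto simple rationals (1/2, 1/3, 2/3, ...) over whole
    intervals of omega -- the mode-locking behind the observed ratios."""
    theta = 0.0
    for _ in range(n_transient):
        theta += omega - (k / (2 * math.pi)) * math.sin(2 * math.pi * theta)
    start = theta
    for _ in range(n):
        theta += omega - (k / (2 * math.pi)) * math.sin(2 * math.pi * theta)
    return (theta - start) / n

# Uncoupled (k = 0): the winding number simply equals omega.
# Coupled (k = 1): omegas near 0.5 all yield 1/2 -- a mode-locked plateau.
```

The plateaus (the `Arnold tongues') around simple ratios are wide, while those around complex ratios like 4:5 are narrow, matching the observation that complex ratios are unstable and slip toward simpler ones.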

Perceptual Self-entrainment

For the cases mentioned above, the coupling could be argued to be due largely to the physical link between these physical oscillators: the legs and trunk in the jogging case, or the two arms in the pendulum data. That is, the observed coupling effect could be due merely to physical forces - and thus arguably to have little relevance to theories of cognition. However, similar effects have also been observed where one of the oscillations involved has extremely low stimulus energy so that the coupling is strictly `informational' -- that is, made available by the auditory or visual system -- and is not due to an interfering force. In one experiment (Schmidt, Carello and Turvey, 1990), subjects swung their leg from a seated position on the edge of a table. They were asked to watch another subject sitting next to them on the table and to swing their leg either in phase or out of phase with the other person at various frequencies. At slow rates, the subjects were able to keep their phase close to the assigned values of 0 (=1) or 0.5 with respect to each other. However, they showed a strong tendency to fall from the out-of-phase pattern (one leg forward, the other back) to the in-phase pattern when rate was increased. So, although nothing but visual information links the two systems, the behavior is exactly the same as the behavior we discussed when the novice tries to produce a trill on a bongo drum. It is also identical to what is observed in the well-studied laboratory task where subjects wag their two index fingers in phase and out of phase at various rates (Kelso, 1995).
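
The rate-induced switch from anti-phase to in-phase coordination seen in both the finger-wagging and two-person leg tasks is standardly modeled with the Haken-Kelso-Bunz potential V(phi) = -a cos(phi) - b cos(2 phi) over relative phase phi, where the ratio b/a falls as movement rate rises (see Kelso, 1995). The stability condition can be checked in a minimal sketch of our own (not code from that literature):

```python
import math

def hkb_potential(phi, a=1.0, b=0.5):
    """Haken-Kelso-Bunz potential over relative phase phi (radians)."""
    return -a * math.cos(phi) - b * math.cos(2.0 * phi)

def antiphase_stable(b_over_a):
    """Anti-phase (phi = pi) is a local minimum of V iff V''(pi) > 0.
    Since V''(phi) = a*cos(phi) + 4*b*cos(2*phi), V''(pi) = -a + 4*b,
    so anti-phase remains stable only while b/a > 1/4."""
    return b_over_a > 0.25
```

At slow rates (large b/a) both phi = 0 and phi = pi are stable; as rate increases and b/a drops below 1/4, the anti-phase minimum disappears and the system falls into the in-phase pattern, whether the coupling is mechanical or purely visual.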

Note that it makes little difference whether the two limbs belong to the same body (where physical forces might account for the coupling) or to different bodies (where only stimulus information could account for the coupling). Apparently, then, the entrainment phenomena include cases of entrainment between the visual system and the motor control of limbs. From this perspective, the common tendency to tap our feet or nod our heads to music is just another example of self-entrainment, here between the auditory system and the motor system.

The conclusion we draw from these experimental and anecdotal observations is that a physical link between independent oscillators is certainly not the only mechanism, and probably not even the primary mechanism, accounting for the widespread observation of mutual entrainment in humans and animals. It seems that the coupling of different kinds of oscillation is a ubiquitous and intrinsic property of cognitive systems. If this is so, then we might expect to see broad exploitation of this property in other aspects of human cognition -- for example, in acts of speech communication.

Self-entrainment in Speech Timing

Our primary interest is in the temporal structure of speech. The orientation we propose is that speech perception and production involve several kinds of specialized entrainment between auditory information and motor control, as well as within the auditory and motor systems. Phoneticians and phonologists seek to specify the kind of information speakers and listeners employ in understanding and producing speech in various languages. Within linguistics the standard assumption is that this information has an essentially static structure consisting of segments and features, and hierarchical trees of such objects. Although linguists have long sought support for static units of speech (Jakobson, Fant and Halle, 1952; Chomsky and Halle, 1968; Ladefoged, 1972), there have been many difficulties with such attempts (e.g., Lisker and Abramson, 1971; Port and Dalby, 1982).

It may be that, by addressing continuous time directly with dynamical models of entrainment, units will be easier to find than by trying to avoid time in a search for static segments. This ironic situation would arise if it turns out that building a recognizer and predictor for events in time is easier than building one for supposedly static, ordered objects like the traditional segments, distinctive features, feet, phonetic phrases, etc. What kind of temporal events might be simple? What kind would be easy to build a recognizer for? Our suggestion is a simple repetitive event, a `same' that recurs on a regular basis (of whatever type a recognizer can be built for). Such a recurring identity, if supported by an oscillatory predictor, might serve the function of dividing time into discrete units.

A number of kinds of time-like units have been proposed for speech in various languages: inter-stress intervals in English, the mora in Japanese, the bisyllabic foot in Finnish and Estonian. The notorious difficulty with these units is that they do not seem to be quite as regular as one might hope. But, of course, this will necessarily depend on what method is employed to measure them. A clock measuring in milliseconds is probably not the most appropriate mechanism.

Can self-entrainment be found in speech? What should one look for? One kind of supportive evidence would be if syllable onsets (or moras, or stressed-syllable onsets) could be shown to be evenly spaced in ordinary spoken prose. Although there is some evidence in support of regular spacing of moras in Japanese (Port, Dalby and O'Dell, 1987), attempts to find isochronous stresses in English (see, e.g., Abercrombie, 1967) have not been very successful (Lehiste, 1977; Dauer, 1983).

A second place to look would be in performances of songs or poetry (see, e.g., Boomsliter and Creel, 1977). Notice that if we are looking for temporal effects, only actual performances will provide relevant data, not the kind of orthographic descriptions of performance customarily used by linguists. Similarly, looking seriously at temporal events requires differentiating many styles of speech, since any piece of text can be pronounced with a wide range of possible rhythmic and intonational styles. The `linguistic competence' of speech rhythm cannot be investigated with any kind of symbolic description; it intrinsically requires audio recordings for analysis.

Aside from music and poetry recordings, a simple speaking task that is easily amenable to laboratory work might be to create an artificial speech style that encourages an interaction between the natural temporal rhythm of speech and some other periodic event. To propose an extreme case, we might ask subjects to pound a large hammer into a pillow and then repeat some phrase while doing so: "He POUNDed the HAMmer, He POUNDed the HAMmer...". Or they might be marched along a treadmill while repeating a phrase. If we find speech timing to be severely entrained to the nonspeech actions in such cases, no one would be surprised. But what if we take a more gentle approach and simply ask subjects to repeat a phrase at some suggested rate? For example, if subjects hear a metronome that periodically signals them to produce some phrase or sentence, will we find that prominent events in the phrase -- such as the stressed-syllable onsets of metrical feet -- tend to be located at harmonic fractions of the phrase-repetition cycle? The metronome signal produces a periodic perceptual pattern which might tend to entrain any harmonically related periodicities within the produced speech. If this kind of self-entrainment turns out to occur easily, then we might conclude that normal speech timing probably employs mechanisms that are fundamentally oscillatory or quasi-oscillatory. Only if such quasi-oscillation were already present could a weak periodic tone affect the timing of speech production -- or so we reason.

Experiment 1

Task. We propose a Phrase-Repetition Task, in which a phrase is repeated over and over in a steady fashion under the control of an auditory metronome. The repetition of the phrase itself provides one periodicity (whose rate is set by the metronome), while foot onsets, marked by stressed syllables within the phrase, provide a potential second periodicity. The self-entrainment we are looking for should show up in the relation between the feet and the phrase as a whole.

Subject. The subject for this experiment was a single college-aged male native speaker of American English who volunteered to participate.

Methods. The subject was asked to repeat a short phrase or sentence, with each repetition triggered by a 50 ms, 500 Hz sine wave presented at a comfortable level over a loudspeaker. The phrase was `Talk about the game'. In the Increasing Rate condition, the metronome period was changed in 11 approximately logarithmic steps from slow to fast, from 3 sec down to 460 msec: at each step the period was decreased to 75% of the previous period down to 1 sec, and below that to 87.5% of the previous period. In the Decreasing Rate condition the same tempos were presented from fast to slow. In this experiment we explored the use of minimal instructions. The subject was asked: "Say this phrase in time with each metronome beep." He was told that if the metronome got too fast for him, he should just slow down. He was not instructed to align any particular part of the phrase with anything else, and was given no particular practice or reinforcement for the task.
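
For concreteness, the tempo ladder can be reconstructed as follows (our own illustrative sketch; with these exact ratios the eleventh period lands near 426 ms rather than 460 ms, so the stated endpoint presumably involves some rounding):

```python
def metronome_periods(start=3000.0, n_steps=11):
    """Tempo ladder for Experiment 1 (values in ms): each period is
    75% of the previous one while the previous period exceeds 1 sec,
    and 87.5% of the previous one below that."""
    periods = [float(start)]
    for _ in range(n_steps - 1):
        prev = periods[-1]
        periods.append(prev * (0.75 if prev > 1000.0 else 0.875))
    return periods
```

Reversing this list gives the Decreasing Rate condition.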

For each rate, the subject listened to the metronome for about 10 pulses and then jumped in to repeat the target phrase 10 times while listening to the beeps. Then the subject paused and waited until the next metronome rate was produced and then repeated the task. Thus Ss produced 110 repetitions of the phrase in each of the Increasing and Decreasing rate conditions. All experimental productions were recorded directly onto computer disk.

Since the onset of a vowel is an acoustically prominent event -- one known to produce a strong burst of neural activity in the auditory nerve (Delgutte, 1996) -- and because it has the advantage of occurring in most syllables, we measured the location of the vowel onset in each syllable from computer-generated waveform displays.

Results. We report here only the data on the onset of the vowel in the initial word talk and in the nuclear-stressed word game, which began the second foot of our test phrase. The location of the onset of game was computed as a phase angle with respect to the phrase-onset cycle by dividing the time interval from phrase onset (defined as the beginning of the vowel in talk) to the onset of game by the interval between two successive phrase onsets. For this reason, the final phrase of each group of 10 had to be dropped, leaving 9 data points at each rate in each condition.
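
The phase computation can be stated compactly (an illustrative helper with hypothetical names, not code from the experiment):

```python
def onset_phases(cycle_onsets, target_onsets):
    """Phase of each target-word onset within its phrase-repetition
    cycle: (target - cycle_start) / (next_cycle_start - cycle_start).
    cycle_onsets: successive phrase onsets in seconds; target_onsets:
    one target-word onset per phrase. The final phrase has no
    following onset to close its cycle, so it is dropped, as in the
    text."""
    phases = []
    for i in range(len(cycle_onsets) - 1):
        period = cycle_onsets[i + 1] - cycle_onsets[i]
        phases.append((target_onsets[i] - cycle_onsets[i]) / period)
    return phases
```

A phase of 0.5 thus means the target word began exactly halfway between successive phrase onsets, regardless of the metronome rate.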

In Figure 1 we display the sequence of phase angles of the onset of the final, stressed word in the phrase as they occurred through a single run of trials from the slowest to the fastest rate. Beginning on the left, we show the first trial of 9 productions, with the metronome period set to 3 seconds. It can be seen that the onset of game occurs at a phase angle of about 0.2 (0.2 x 3 sec = 600 ms after the onset of talk). As the rate is increased, the phase angles get larger with each trial up through the 8th trial; then they drop back to just below 0.5. It appears that for quite a bit of the range, the phase angle hovers just below 0.5. What happened in trial 9 is that the subject apparently felt too hurried by the metronome and spontaneously dropped back to repeating the phrase only every other metronome cycle. Thus, the time in milliseconds almost doubled between trials 8 and 9, while the phase measurement continues to locate the onset of game about halfway through the phrase-onset cycle.

Figure 1

Figure 1. The phase angle of the onset of game as produced in the increasing-rate condition of Experiment 1. For each trial (that is, each metronome rate), there is a series of 9 sequentially produced data points. Note that the speaker spends considerable time in the region of phase just below 0.5.

To see the preference for this phase more clearly, we collapsed the data across the two presentation conditions (increasing and decreasing rate) and constructed a frequency histogram of the phase of the target-word onset, as shown in Figure 2. This display shows a very strong bias in favor of phases close to, but just below, 0.5.
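
Binning the produced phases for such a histogram is straightforward (an illustrative sketch; the bin count is arbitrary):

```python
def phase_histogram(phases, n_bins=20):
    """Counts of produced phases in equal-width bins over [0, 1),
    as for the frequency histogram in Figure 2. Phases at or above
    1.0 are clipped into the last bin."""
    counts = [0] * n_bins
    for p in phases:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts
```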

Figure 2

Figure 2. A frequency histogram of the phase of game onset across both the increasing and decreasing rate conditions of Experiment 1. A strong preference for phases just below 0.5 is evident.

Of course, the preferred phase in both Figures 1 and 2 is not exactly 0.5; it appears to be closer to 0.46. We should point out, however, that when listening with our `musicians' ears' (all three authors are at least amateur musicians), we can only say that the word sounds, impressionistically, as though it falls halfway through the phrase cycle. That is, if asked to use musical notation to write down the subject's speech rhythm, we would be very confident in starting this word at beat 2 of a 2-beat measure for the repeated phrase. The question of why our measurement method locates the apparent preferred phase at 0.46 rather than 0.5 cannot be properly treated here. First of all, we have not yet validated our choice of measurement point. As is well known from the "P-center" research (Marcus, Scott), the location of the "perceptual pulse" for these words may lie at some distance from the vowel onsets measured here. Other factors might also play a role, including asymmetries between the speech articulators.

These results suggest that subjects will, with minimal instructions, exhibit a tendency to place a stressed syllable at music-like phase angles. That is, this subject showed a tendency to entrain the phonological foot as a harmonic of the repetition cycle. The results are thus quite in accord with the self-entrainment hypothesis for speech. With no instructions to do anything but repeat a phrase in time with a metronome, the subject aligned the two stressed syllables of the phrase "Talk about the game" as though they were the beats of a two-beat musical measure.

The next step is to verify this result with other speakers, somewhat different instructions, and a different text fragment.

Experiment 2

In this variant of the previous experiment, we employed the phrase Beat about the bush and a different speaker. Again a male volunteer was given a repetition rate by the metronome signal, but this time the tempos were presented in three orders: increasing rate, decreasing rate, and randomly ordered rates. The subject was told to "Pronounce the phrase once for each beep of the metronome." The metronome periods ranged from 2 sec down to a minimum of 565 ms, reduced in 20% decrements, making a total of 13 steps. Again the temporal location of the onset of bush was measured as a phase angle with respect to the onsets of successive productions of beat. This experiment employed an automatic `beat extractor' that measured beat location (very close to the vowel onset) from a rectified, smoothed energy measure over the formant frequency region (250-2500 Hz) of the speech signal (see Cummins and Port, 1996a).
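
A minimal version of such a beat extractor might look like the following sketch (the FFT-based band-pass, the window length and the threshold value are our own assumptions, not the implementation of Cummins and Port, 1996a):

```python
import numpy as np

def beat_times(signal, sr, band=(250.0, 2500.0), smooth_ms=20.0, thresh=0.5):
    """Sketch of a vowel-onset ('beat') extractor: band-pass the
    signal to the formant region, rectify it, smooth the resulting
    energy envelope, and mark the times where the rising envelope
    crosses a threshold (given as a fraction of the envelope peak)."""
    # crude band-pass by zeroing FFT bins outside the band
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    spec[(freqs < band[0]) | (freqs > band[1])] = 0.0
    filtered = np.fft.irfft(spec, n=len(signal))
    # rectified, smoothed energy envelope, normalized to its peak
    win = max(1, int(sr * smooth_ms / 1000.0))
    env = np.convolve(np.abs(filtered), np.ones(win) / win, mode="same")
    env /= env.max() + 1e-12
    # rising threshold crossings, converted to seconds
    above = env > thresh
    onsets = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    return onsets / sr
```

Because vowel onsets produce an abrupt rise in energy within the formant band, each phrase repetition yields one sharp rising crossing per stressed syllable.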

The results are shown in Figure 3 as a histogram of observed phases for the onset of bush across all three order conditions. Here the subject showed a preference for placing the onset of bush at a phase of either 0.47 (about 1/2) or 0.62 (roughly 2/3). Again, listening as musicians, the tokens with a phase near 0.5 sounded as though the word started halfway through the cycle, while the ones above 0.6 sounded as though they were pronounced in waltz time -- that is, with bush occurring on the third beat of a three-beat pattern.

Figure 3

Figure 3. A frequency histogram of the phase of bush in Experiment 2 across the increasing, decreasing and random rate conditions. A strong preference for phases just below 0.5 and just above 0.6 is evident.

This result reinforces the conclusion from Experiment 1 that when given a periodic signal to which speech can be entrained, speakers find it easy to locate stressed syllables at simple integral fractions of a longer period. This is true across a wide range of rates and across a variety of experimental details. The results suggest that speakers tend to self-entrain when talking.

Concluding Discussion

The main results of these simple experiments have also been verified in other experiments in our lab. For example, in another variant of the Phrase-Repetition Task (Cummins, 1995; Cummins and Port, 1996), subjects were asked to repeat the phrase Take a pack of cards while listening to a periodic signal. The signal was not simply a metronome beep, but rather a synthetic voice saying both take and cards. Thus the phase angle of the second foot onset was specified for these subjects, as well as the phrase-repetition rate (which was kept constant) (cf. Yamanishi, Kawato and Suzuki, 1979). The target phase for cards ranged in equal steps from 0.3 to 0.65. Again an automatic beat extractor was employed to locate syllable onsets. Although the target phase angles for cards were varied from 0.3 to 0.65, subjects showed a strong tendency, as expected, to bunch their produced phases for cards at certain preferred values. These preferred values were quite close to 1/3, 1/2 and 2/3 -- the simplest harmonic fractions of the cycle.
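
The bunching at simple harmonic fractions can be quantified by snapping each produced phase to the nearest fraction with a small denominator (a hypothetical helper using Python's standard library, not an analysis from the cited studies):

```python
from fractions import Fraction

def nearest_simple_fraction(phase, max_denominator=4):
    """The closest fraction with a small denominator to a produced
    phase, e.g. 0.47 -> 1/2 and 0.62 -> 2/3. A strong self-entrainment
    effect shows up as small residuals between produced phases and
    their nearest simple fractions."""
    return Fraction(phase).limit_denominator(max_denominator)
```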

Thus, speech timing at this gross, phrase-and-foot level appears to have many previously unnoticed similarities to other kinds of motor behavior. One of the most striking of these similarities is self-entrainment.

It seems to us, then, that self-entrainment, the tendency of one cyclic pattern to entrain other patterns to it in simple harmonically related temporal ratios, may be a natural mode of timing in human (and presumably animal) activity - whether in motor control or perception. This behavior may be a natural and nearly unavoidable temporal constraint for systems of mutually coupled oscillators.

Thus it is likely that oscillators play a central role in the timing control of speech as well as of the limbs -- and not just in motor control, but probably in perception as well; and probably not only in repetitive, cyclical patterns, but in many apparently noncyclic ones as well. If this is so, then models of cyclic performance of speech (as was imposed here by our experimental task) may well be informative about non-cyclic performance of speech. Our expectation is that related effects will be found in the speech of most natural languages.

It seems possible that oscillator entrainment is the primary method for global timing control in complex animals. Self-entrainment may be the most basic and easily observable evidence of this method of timing. It is likely that such properties will turn up in many other aspects of cognition and language.


References

Abercrombie, David (1967) Elements of General Phonetics. Aldine Pub. Co, Chicago.

Bernstein, N. (1967) The Coordination and Regulation of Movement. Pergamon, London.

Boomsliter, P. C. and W. Creel (1977) The secret springs: Housman's outline on metrical rhythm and language. Language and Style 10, 296-323.

Bramble, D. M. and D. R. Carrier (1983) Running and breathing in mammals. Science 219, 251-256.

Chomsky, N. and M. Halle (1968) The Sound Pattern of English. Harper & Row, New York.

Collier, Geoffrey L. and Charles E. Wright (1995) Temporal rescaling of simple and complex ratios in rhythmic tapping. Journal of Experimental Psychology: Human Perception and Performance, 602-627.

Cummins, Fred (1995) Identification of rhythmic forms of speech production. Paper presented at Acoustical Society of America, Fall, 1995. J. Acous. Soc. Amer.

Cummins, Fred and Robert Port (1996) Rhythmic constraints on English stress timing. Proceedings of the Fourth International Conference on Spoken Language Processing. To appear.

Cummins, Fred and Robert Port (1996) Rhythmic commonalities between hand gestures and speech. Proceedings of the Eighteenth Meeting of the Cognitive Science Society. To appear.

Dauer, Rebecca (1983) Stress-timing and syllable-timing reanalyzed. J. Phonetics 11, 51-62.

Delgutte, Bertrand (1996) Auditory neural processing of speech. In W. J. Hardcastle and J. Laver (eds) Handbook of Phonetic Sciences. (Blackwell, Oxford).

Glass, Leon and Michael Mackey (1988) From Clocks to Chaos. Princeton Univ. Press, Princeton, NJ.

Jakobson, R., G. Fant and M. Halle (1952) Preliminaries to Speech Analysis. MIT Press, 1952/1963.

Kelso, J. A. S., D. Southard and D. Goodman (1979) On the nature of human interlimb coordination. Science 203, 1029-1031.

Kelso, J. A. Scott (1995) Dynamic Patterns: The Self-Organization of Brain and Behavior. Bradford Books, MIT Press, Cambridge.

Ladefoged, Peter (1972) A Course in Phonetics. Harcourt Brace Jovanovich.

Lehiste, Ilse (1977) Isochrony reconsidered. J. Phonetics 5, 253-263.

Lisker, Leigh and Arthur Abramson (1971) Distinctive features and laryngeal control. Language 47, 767-785.

Port, Robert and Jonathan Dalby (1982) C/V ratio as a cue for voicing in English. Perception and Psychophysics 32, 141-152.

Port, Robert, Fred Cummins and Michael Gasser (1995) A dynamic approach to rhythm in language: Toward a temporal phonology. In B. Luka and B. Need (eds) Proceedings of the Chicago Linguistics Society, 1996 (Department of Linguistics, University of Chicago), pp. 375-397.

Port, Robert, Jonathan Dalby and Michael O'Dell (1987) Evidence for mora timing in Japanese. J. Acous. Soc. Amer. 81, 1574-1585.

Port, Robert and Timothy van Gelder (1995) Mind as Motion: Explorations in the Dynamics of Cognition. (Bradford Books/MIT Press, Cambridge, MA).

Schoner, G. and J. A. S. Kelso (1988) Dynamic pattern generation in behavioral and neural systems. Science 239, 1513-1520.

Schmidt, R. C., C. Carello and M. T. Turvey (1990) Phase transition and critical fluctuations in the visual coordination of rhythmic movements between people. Journal of Experimental Psychology: Human Perception and Performance 16, 227-247.

Treffner, Paul and M. T. Turvey (1993) Resonance constraints on rhythmic movement. Journal of Experimental Psychology: Human Perception and Performance 19, 1221-1237.

Turvey, Michael T. (1990) Coordination. American Psychologist 45, 938-953.

Yamanishi, Junichi, Mitsuo Kawato and Ryoji Suzuki (1979) Two coupled oscillators as a model for the coordinated finger tapping by both hands. Biological Cybernetics 37, 219-225.