Language Evolution in a Dynamic Environment

James T. Newkirk

Indiana University Department of Computer Science
LH-215, Bloomington, Indiana
(jnewkir@indiana.edu)

Abstract

Given the innate ability to produce and recognize a few discrete sounds, a simulated population evolved "sentences" of sound sequences to describe events in its environment by means of a genetic algorithm. This outcome was a direct result of the spatial organization of the environment acting with a progressive increase in the complexity of events occurring within it. As more specific responses to an event were required, the population adapted by evolving longer sentences from shorter ones to meet the new demands. This process is analogous to successful grammatical acquisition in neural networks by incremental learning, and indicates that specifically graduated environmental changes can contribute to the development of sentences in the evolution of language. The implication is that systematic interaction between population and environment plays a significant role in the evolution of grammar in language.

Introduction

Interest in the evolution of language has directed attention to genetic simulations that capture the salient aspects of the evolutionary process -- such as selection of genetic characteristics, survival, reproduction, and inheritance -- and apply them to a population of model organisms to explore basic language features. Holland [1] provides an overview of these concepts and techniques. A fundamental first step toward language is coordinating the ability to emit and receive sounds to form cooperative signaling; this step has been examined by MacLennan [2], among others. Even simple, unstructured signals can exchange enough semantic content to improve the survivability of a species over time.

An equally important step toward human language is the evolution of grammaticalization, in which signals are combined syntactically to effect communications that are semantically richer. Diller [3] investigates this property with respect to an evolutionary system.

This paper explores the evolution of the simplest grammatic structure, the composition of unit sounds into meaningful sequences. Such a sequence constitutes a complete and independent semantic unit, and can be regarded as either a sentence, or a word formed from phonologic constituents. The communication between individuals in the population is naturally crucial to this linguistic evolution, but I especially focus on the significant effect on the evolutionary process exerted by a dynamic environment: one which alters in accordance with the progress of adaptation.

The model

Evolutionary fitness was determined by the response of the population to events in a radically simplified artificial environment. The static elements of this environment which are constant throughout the course of evolution (its structure and the actions which it affects) have been greatly abstracted to place their effects in the background. In contrast, the environment's active elements -- events which affected the population -- and the process of communication were represented with more realistic details likely to influence the course of evolution.

The static structure of the environment comprised four categories to which any individual of the population might be assigned. Abstractly, these exclusive categories served simply to segregate the population; concretely, they could be thought of as spatial locations into which any individual might move at will. Initially, however, no spatial relationship was imposed upon these locations -- they were not distinguished by distance or proximity. The various locations were also the semantic space of the model, since the effect of a communication was to move an individual into one of these four locations.

The elements of communication were a vocabulary of several distinguished, atomic sounds which could be announced either singly or in short sequences, and which had the possibility of being recognized by each member of the species. Thus a vocabulary of {X, Y, Z} yielded signals such as "XX", "ZYZ" or simply "Y". This communication was "vocal": a broadcast signal arriving simultaneously and uniformly to all individuals. Such non-directional speech augmented a directed sense that corresponded to sight, and which served to detect events when they occurred. In common with many natural systems, what one individual detected in its local area could be quickly signaled to the remainder of the population.
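The signal space described above can be enumerated directly. The following is a minimal sketch, not the original implementation: the function name all_signals is hypothetical, and the maximum length of three sounds is taken from the limit stated later in the paper.

```python
from itertools import product

def all_signals(vocabulary, max_len=3):
    """Return every sound sequence of length 1..max_len over the vocabulary."""
    signals = []
    for length in range(1, max_len + 1):
        for combo in product(vocabulary, repeat=length):
            signals.append("".join(combo))
    return signals

# The three-sound vocabulary used as the example in the text.
signals = all_signals(["X", "Y", "Z"])
```

With three sounds and sentences of up to three elements, this yields 3 + 9 + 27 = 39 distinct signals, including "XX", "ZYZ" and "Y".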

The active components of the environment were events introduced to the population periodically and prompting an announcement upon their detection. A listener's response to such an announcement was simply movement into one of the four possible areas. Events were either hostile or beneficial, and appeared at fixed locations according to their type. Appropriate responses were to move away from the location of a hostile event, and into the area of a beneficial one. Failure to avoid a hostile event sharply reduced the chance of survival; conversely, successful attraction to a beneficial event increased the chance of both survival and breeding. To increase the difficulty of adaptation, events could be diversified by increasing the number of locations in which they might be chosen to appear.

Speaking and hearing were both innate acts whose details were entirely configured for each individual by its genetic inheritance, determining which signal to announce in the presence of possible events, and (separately) which signals would be recognized and the responses to be made when they were heard. These two structures were initially uncorrelated and evolved independently of each other; indeed there was no guarantee that an individual recognized the sounds it produced itself, or assigned useful responses to them if they were recognized. Speaking and hearing thus began as uncoordinated acts, and the initial task of the system was to produce a correspondence between events, signals, and reactions by listeners that advanced the survivability of the population. It was hoped that over time the population might not restrict itself to single sounds from the vocabulary, but evolve a repertory of sound sequences, in effect moving from single words to simple sentences whose meanings correlated with surrounding events.
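One way to picture the two independent genetic structures is as a pair of lookup tables per individual. The sketch below is an assumption about representation, not the encoding actually used in the simulation: the names random_signal and random_individual, the event labels, and the parameter n_heard are all hypothetical.

```python
import random

LOCATIONS = (1, 2, 3, 4)
VOCABULARY = ("X", "Y", "Z")
MAX_LEN = 3  # three-element sentences are the longest possible

def random_signal(rng):
    """Draw a random sound sequence of length 1..MAX_LEN."""
    length = rng.randint(1, MAX_LEN)
    return "".join(rng.choice(VOCABULARY) for _ in range(length))

def random_individual(events, rng, n_heard=4):
    """Build uncorrelated speaking and hearing tables (n_heard is an assumed size)."""
    # Speaking: which signal to announce for each possible event.
    speak = {event: random_signal(rng) for event in events}
    # Hearing: which signals are recognized, and the location(s) each one maps to.
    hear = {}
    for _ in range(n_heard):
        n_targets = rng.randint(1, 2)
        hear[random_signal(rng)] = set(rng.sample(LOCATIONS, n_targets))
    return {"speak": speak, "hear": hear}

# Hypothetical event labels: (kind, location).
EVENTS = [("hostile", 1), ("beneficial", 4)]
individual = random_individual(EVENTS, random.Random(0))
```

Because the two tables are drawn separately, nothing guarantees that a signal in speak also appears in hear, which matches the uncoordinated starting point described above.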

Genetic influence on sentences

The successful communication of event arrivals depended on the range of vocabulary sounds the species could employ. A large vocabulary -- in particular, a vocabulary with at least as many sounds as events -- suffices to describe all possible events by agreeing upon a single sound for each. In fact, this was the most common outcome with a generous vocabulary, and sentences neither appeared nor were necessary. Clearly, the burden of coordinating long arbitrary sequences between speaker and listener was unnecessary when shorter, simpler solutions were available.
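The relationship between vocabulary size, sentence length, and the number of events to be distinguished is simple arithmetic, sketched below. The function names are hypothetical, and the calculation assumes one distinct signal per event with no allowance for shared prefixes or synonyms.

```python
def signal_capacity(vocab_size, max_len):
    """Number of distinct signals of length 1..max_len over vocab_size sounds."""
    return sum(vocab_size ** length for length in range(1, max_len + 1))

def min_sentence_length(n_events, vocab_size, max_len=3):
    """Shortest maximum sentence length that can give each event its own signal.

    Returns None if even max_len-sound sentences cannot distinguish all events.
    """
    for length in range(1, max_len + 1):
        if signal_capacity(vocab_size, length) >= n_events:
            return length
    return None
```

For example, min_sentence_length(4, 4) is 1, since four sounds name four events directly, while min_sentence_length(4, 2) is 2, since two sounds alone cannot.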

Reducing vocabulary size or increasing the diversity of events -- or both -- forced the species to turn to more expressive signals than single sounds, and so to exploit sequences more frequently. These circumstances evolved a narrower repertory that included some sentences comprising a pair of sounds. Since three-element sentences were the longest possible, this was regarded only as partial success at best. At this point it seemed that increasing the ratio between the number of locations in which events appeared and the vocabulary size would rapidly guide evolution toward sentences of desired length, but this turned out not to be the case.

If such demands on the innate vocabulary were increased very much, the survivability of the population was soon threatened in a number of ways. Coherent communications took longer to emerge, and often failed to appear at all. The population also grew disproportionately sensitive to a few initial announcers in the early generations, whose choice of signals biased the genetic diversity toward a small and often ineffective set of possible sentences. Solutions, in the form of sentences long or short, usually failed to appear.

Environmental influence on sentences

The ability to evolve sentences rose greatly, however, from environmental effects rather than genetic ones. Two changes improved sentence evolution, and proved mutually reinforcing.

One change was static: the enforcement of spatial organization on the locations in the model world. By treating the four locations as a one-dimensional line and allowing movement only between adjacent locations, the number of semantic possibilities in the population was quickly reduced to those which conformed with this structure. This had the effect of narrowing the search space for signal interpretation, and led to more dependable communications sooner.
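The adjacency constraint can be sketched as a filter on the responses a listener may act upon. This is an illustrative assumption rather than the original code: the names neighbors and legal_responses are hypothetical, and whether remaining in place counts as a legal response is not stated in the text, so it is exposed as the allow_stay flag.

```python
LOCATIONS = (1, 2, 3, 4)

def neighbors(location, locations=LOCATIONS):
    """Locations one step away from `location` on a one-dimensional line."""
    index = locations.index(location)
    adjacent = set()
    if index > 0:
        adjacent.add(locations[index - 1])
    if index < len(locations) - 1:
        adjacent.add(locations[index + 1])
    return adjacent

def legal_responses(candidates, current, allow_stay=True):
    """Keep only the candidate destinations reachable from `current`."""
    legal = set(candidates) & neighbors(current)
    # Assumption: staying put is allowed when the current location is a candidate.
    if allow_stay and current in candidates:
        legal.add(current)
    return legal
```

From location 2, for instance, a signal interpreted as {1, 3, 4} reduces to {1, 3}, which illustrates how the line structure prunes the space of usable interpretations.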

The other change was dynamic: as the population was evolving, hostile events were made successively more demanding. Initially requiring the avoidance of only one location, they were gradually made less predictable, until the best response was narrowed to specific movement to a single location. Usually this series of location changes also obeyed the linear spatial order, on the principle that events in the world are bound by the same rules of movement as the individuals within it.
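A graduated schedule of this kind might be sketched as a sequence of stages in which the set of safe locations shrinks from three to one. The particular schedule below, in which the hostile region grows contiguously from one end of the line, is an assumption chosen to respect the linear order; the text does not specify the exact sequence used, and the name graduated_stages is hypothetical.

```python
def graduated_stages(locations=(1, 2, 3, 4)):
    """Return (danger, safe) location sets, from least to most demanding.

    Assumed schedule: the hostile region starts at the first location and
    extends one adjacent location per stage until a single safe location remains.
    """
    stages = []
    for size in range(1, len(locations)):
        danger = set(locations[:size])
        safe = set(locations[size:])
        stages.append((danger, safe))
    return stages

stages = graduated_stages()
```

With four locations this gives three stages: first any of {2, 3, 4} is an adequate response, then only {3, 4}, and finally only location 4.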

This latter graduation in the complexity of the environment was markedly more successful in evolving stable populations using three-sound sentences -- the maximum length possible -- to identify and avoid hostile events. This outcome is in contrast to the poor survival of the population when faced with a restrictive set of events from the outset. Analysis of this success focuses upon the roles that ambiguity and synonyms play in communication, and how they are exploited by the population to solve problems progressively.

Incomplete speech and ambiguous responses

An initial assumption in this model was that communication would not be perfect, but spoken and heard with uncertainty. Among the genetic possibilities for speech responses were movements to one of several possible locations: the sentence "XY" might be associated with the locations {3, 4}, for instance. In this case, the sentence was considered ambiguous, and one of the locations would be chosen at random to decide the actual movement. Responses to partially-recognized sequences also led to an ambiguous interpretation: if two sentences shared a common prefix, and only that prefix was heard by the listener, their indicated responses were pooled, and a choice made between them. This arose frequently due to overlap in sentence components: if "XY" was associated with location 1, and "XZ" with 4, hearing "X" alone would force a random decision to be made between locations 1 and 4 as a response. Of course, both of these responses might not be appropriate for the given event, so this incompletely understood sentence would not be as effective a message as a longer one. Nevertheless, this gave individuals the ability to respond (with lesser accuracy) to shorter sentences than their genes may have dictated, a flexibility that proved important.
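The pooling of responses for a partially heard sentence can be sketched as follows. This is a minimal illustration under stated assumptions, not the original code: the names candidate_locations and respond are hypothetical, and the rule that an exactly recognized sentence takes priority over longer sentences sharing it as a prefix is an assumption the text does not settle.

```python
import random

def candidate_locations(hear, heard):
    """Return the set of locations a listener may move to on hearing `heard`.

    `hear` maps each recognized sentence to a set of locations.
    """
    # Assumption: an exact match is used as-is, without pooling.
    if heard in hear:
        return set(hear[heard])
    # Otherwise pool the responses of every known sentence starting with `heard`.
    pooled = set()
    for sentence, locations in hear.items():
        if sentence.startswith(heard):
            pooled |= set(locations)
    return pooled

def respond(hear, heard, rng):
    """Pick one candidate location at random, or None if nothing is recognized."""
    candidates = candidate_locations(hear, heard)
    if not candidates:
        return None
    return rng.choice(sorted(candidates))

# The example from the text: "XY" means location 1, "XZ" means location 4.
hear = {"XY": {1}, "XZ": {4}}
pooled = candidate_locations(hear, "X")
choice = respond(hear, "X", random.Random(1))
```

Here pooled is {1, 4}, so hearing "X" alone forces a random choice between locations 1 and 4, as in the example above.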

This semantic ambiguity had a grammatical counterpart in the development of synonyms by speakers. Synonyms arose because a single individual was selected at random from the population to announce each event. Since an individual's speech for an event was determined by its arbitrarily-constructed gene, sentences announced for the same event varied from instance to instance. This variety of announcers led to various dialectal groups, sometimes overlapping, each responding to some synonym for the same event.

Moreover, announcements might omit sounds from sentences. Events might take effect before an announcement could be completed, in which case only some prefix of a sentence might be spoken. Such interruptions made longer sentences somewhat more difficult to evolve, since they were less reliably communicated, and contributed to the ambiguity of sentences. This uncertainty in both elements of communication helped maintain a diversity of sentence interpretations in the species at the expense of accuracy. The survival of an individual and the success of the species depended on how often actions selected from an ambiguous set strayed from the ideal response.
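The announcement step, with a randomly selected speaker and possible interruption, might look like the sketch below. The name announce, the interrupt_prob parameter and its default value, and the dictionary layout of an individual are all hypothetical; the text states only that the announcer is chosen at random and that an interrupted announcement delivers some prefix of the sentence.

```python
import random

def announce(population, event, rng, interrupt_prob=0.2):
    """Pick a random speaker and return the (possibly truncated) sentence heard.

    Each individual is assumed to be a dict with a "speak" table mapping
    events to sentences. interrupt_prob is an assumed parameter.
    """
    speaker = rng.choice(population)
    sentence = speaker["speak"][event]
    # An interruption leaves only a proper, non-empty prefix of the sentence.
    if len(sentence) > 1 and rng.random() < interrupt_prob:
        cut = rng.randint(1, len(sentence) - 1)
        sentence = sentence[:cut]
    return sentence

# Two speakers carrying different sentences (synonyms) for the same event.
population = [{"speak": {"hostile": "XYZ"}}, {"speak": {"hostile": "ZY"}}]
heard = announce(population, "hostile", random.Random(0))
```

Since single-sound sentences cannot be shortened in this sketch, only longer sentences bear the cost of interruption, consistent with the observation that they were less reliably communicated.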

Ambiguity preserves linguistic diversity

The existence of ambiguous yet useful responses to incomplete communication served to create a population with a wider potential repertory of signals than a system which elicited and rewarded exactitude alone. In particular, it encouraged the perpetuation of genes for longer sentences even though few or no sentences of that length were announced. As long as a sentence prefix led to a range of responses sufficiently close to the best choice to ensure survival on the average, the gene for it would continue to be carried in the population. Conversely, as long as a lengthy sentence could be partially responded to, announcers carrying such genes would continue to benefit the population, and survive themselves. Inexactitude leading to diversity could later be exploited by the population for responses to future events -- as long as they were semantically related to earlier ones.

Semantic refinement extends spoken forms

When a change in the environment introduces new events whose appropriate responses are similar to, or a restriction of, existing ones, the population can accommodate them in one of two ways: by evolving completely new sound associations for the new events, or by extending the present grammar and refining its semantics. In practice, this extension of sentences was more likely to emerge precisely because its grammatic predecessors had been developed and preserved in the population. Thus the species tended to adapt an existing broader solution to meet a more constrained problem that was spatially (and hence semantically) related.

Co-evolution of species and environment

Gradual alteration in stages of the events in the world in this way -- from those which can be generally interpreted to those which require more specific responses -- reliably served to breed the expression of full-length (three-element) sentences. Similar results have been reported by Elman [4] in the process of training neural nets to recognize sentences with non-trivial grammatic structure: in that experiment, the proportion of complex forms to simple ones was methodically increased as the training of the network progressed in order to successfully recognize a complete suite of sentence forms and make predictions from them. That grammatic forms can be evoked in both of these diverse learning media indicates common principles at work, among them the ones suggested here: the reinforcement of incomplete solutions, and the preservation of diverse potential solutions as long as possible over time.

That grammar can spring from the progressive alteration of environmental events acknowledges that semantics are not abstract and arbitrary assignments made to sounds, but are shaped by realistic and continuous changes in the environment which they represent. Such environmental evolution mirrors the population's linguistic evolution. It is reasonable to wonder, then, if environmental adjustment in turn can be produced not by deliberate manipulation from outside the model, but by an internal cycle of effects between the population and surrounding events.

If such a series of environmental changes occurs naturally, it must come as a direct result of the actions of the population. In essence, this means that the environment alters its inhabitants at the same time as they alter the environment. Effect on the environment to such degree is not unprecedented, and is a likely result of sustained success in linguistic evolution, particularly if humans are taken as an example. When successful communication produces successful growth, a population expands into a wider environment and thereby exposes itself to novel events. It is reasonable to speculate that the pattern of exploration is self-sustaining and cyclic: local adaptation to a simpler environment leads to expansion into a more complex one; once it is accommodated, expansion begins again.

Summary

A governed rate of change in environmental complexity, if undertaken in realistic communication conditions, seems to exert a strong adaptive pressure on a population to acquire and sustain more complex linguistic attributes than would arise in a static environment. That such progressive change cannot be arbitrary leads to the conclusion that environment and inhabitants engage in a coordinated system of mutual interaction in the evolution of language, mediated by linguistic flexibility such as the successful tolerance of ambiguity. The similarity of this finding to one in the alternative domain of neural networks suggests that this principle is involved in adaptive systems in general.

References

  1. L.B. Booker, D.E. Goldberg and J.H. Holland, "Classifier Systems and Genetic Algorithms"; Artificial Intelligence 40(1-3):235-282, September 1989.

  2. Bruce J. MacLennan and Gordon M. Burghardt, "Synthetic Ethology and the Evolution of Cooperative Communication"; Adaptive Behavior, Vol. 2, No. 2, Fall 1993.

  3. Diller, Karl, "Language Acquisition and Evolving Systems"; Proceedings of the XVth International Congress of Linguists (Quebec, August 1992).

  4. Elman, J.L., "Learning and development in neural networks: The importance of starting small."; Cognition, 48.