Research
Old overview

Research overview (1987-2005)


What I want to try to figure out is where human language comes from, especially as it develops over the long term in the language learner, but also as it emerges over the short term in the course of communicative acts. One very influential position is that we are all born with a significant component of language (or of the concepts that language is about) already in our brains, and that all that is left is to figure out the details of the particular language or languages that we're being exposed to. This position is based on the premise that the environment provides the language learner with such an impoverished set of linguistic data that the only way a language could be learned is for much of it to already be there. Of course this position is only tenable if human languages can be shown to share a sizable set of properties, and much of the research within this framework has been dedicated to finding that set of properties and showing that they are already in place as language is acquired.

A competing position, which is growing in influence, is based on the premise that there is considerable regularity in the environment, including regularity in the things that language is about, regularity in the language input itself, and unconscious strategies on the part of adults and older children to make language accessible. In place of innate constraints on language, this view posits that learners have access to powerful, general-purpose statistical learning mechanisms, to sophisticated perceptual and motor control systems, and to a propensity for social interaction. The research within this framework has been dedicated to elaborating the capacities of statistical learning, especially algorithms inspired by nervous systems, and to studying the nature of particular kinds of regularities in the environment and their effects on learning.

From early on I have gravitated to the second framework because it seems to me the simpler, default position. It is already obvious from other work in cognitive science that humans are sophisticated statistical learners. If this capacity and other general-purpose ones suffice, then we could do without the innate knowledge of language proposed by the proponents of the other view. In addition, calling something innate leaves open the question of how that innate knowledge gets implemented in neural hardware and, perhaps more significantly for cognitive science, how it could be linked to experience. Finally, the research in the other paradigm just has not been convincing. The search for what is universal, and presumably innate, in languages has had to resort to such abstract constructs that it seems less and less likely that such things have anything to do with what people know about language, let alone with their innate "endowment".

The strategy, then, is to take a phenomenon, a relatively simple linguistic behavior, and attempt to show how a simple statistical learning device, given plausible input, could acquire it. This normally leads to failure, and the simple device gets augmented in ways that make it more powerful. But power comes not from building in the solution to the problem by providing the knowledge that is needed to solve it. Power comes from special-purpose learning mechanisms and from modularity, from components of the system which learn to specialize as they are exposed to input. My recent projects have looked at three separate areas of linguistic behavior from this perspective.

  1. How do words and sentences mean what they mean? One fruitful approach to this question is to treat it from the perspective of the young child. For the child, language takes on significance as it is grounded in experience, in perception, action, and affective states. Three questions of interest in this project are:
    1. Language makes us relational; it gives us symbols (words) that refer directly to relations, for example, verbs such as put and prepositions such as behind. What are the implications of being relational for the architecture of cognition, and how does the learning of relational concepts and words in children help us understand the architecture? A major collaborator on this topic has been former IU Cognitive Science PhD student Eliana Colunga, now at the University of Colorado.
    2. How do word forms and word meanings relate to one another? Why is it that the relation is mostly arbitrary (for example, there is nothing about the pronunciation of the word dog that suggests the concept of DOG), and why is it that the relation is not arbitrary in isolated regions of language (for example, for verbs such as splash and flutter and for the signs of signed languages)?
    3. How do simple perceptual inputs, along with particular linguistic tasks, lead to the orders of acquisition for different forms that are observed in children? For example, what is it that seems to make nouns easier to learn than adjectives?
  2. How do children learn the internal structure of words, how words are composed out of constituent morphemes (morphology) and how the primitive sounds of a language combine with one another (phonology)? My earlier work focused on the kind of computational device that could achieve this learning. The outcome of this project was a neural-network architecture which could be trained to recognize and produce words which are representative of the kinds of morphological combination that occur in the world's languages. One contribution was the discovery that separate modules responsible for learning roots (the basic forms of words) and for learning inflections (the prefixes, suffixes, etc. that get attached to roots) improved performance for all types of morphological combination. An early collaborator on this project was IU Cognitive Science student Chan-do Lee, now at Taejon University, South Korea.
  3. How are music and language organized in terms of nested periodic beats, and how do people process this structure? Another earlier project was concerned with developing computational models which can track the rhythm of simple musical and linguistic patterns and learn to beat along. Collaborators on this project included former Cognitive Science PhD student Douglas Eck, now at the University of Montreal, and IU Linguistics Professor Robert F. Port.