Introduction
The Interactive Activation and Competition network (IAC, McClelland
1981; McClelland & Rumelhart 1981; Rumelhart & McClelland 1982)
embodies many of the properties that make neural networks useful
information processing models. In this chapter, we will use the IAC
network to demonstrate several of these properties, including content
addressability, robustness in the face of noise, generalisation across
exemplars and the ability to provide plausible default values for
unknown variables. The chapter begins with an example of an IAC network
to allow you to see a full network in action. Then we delve into the
IAC mechanism in detail, creating a number of small networks to
demonstrate the network dynamics. Finally, we return to the original
example and show how it embodies the information processing
capabilities outlined above.
An IAC network consists of a number of competitive pools of units (see figure 1). Each unit represents some micro-hypothesis or feature. The units within each competitive pool represent mutually exclusive features and are interconnected with negative weights. Between the pools, positive weights indicate features or micro-hypotheses which are consistent with one another. When the network is cycled, units connected by positive weights to active units become more active, while units connected by negative weights to active units are inhibited. The connections are, in general, bidirectional, making the network interactive (i.e. the activation of one unit both influences and is influenced by the units to which it is connected).
The network in figure 1 represents information about two rival gangs - the Jets and Sharks (McClelland 1981). The central pool of units represents members of the two gangs. The pools around the edges represent features of these members, including their name, occupation, marital status, gang membership, age and educational level. Within each pool the units are connected with negative weights, indicating that they are mutually exclusive. If you are in your 20s you can't also be in your 30s, for example. Between the pools, positive weights hold the specific information about each gang member. The instance unit corresponding to Art is connected to the units representing the name Art, Pusher, Single, Jet, 40's and Junior High (J.H.) - the characteristics of Art.
Table 1 shows the facts known about each of the gang members (Note: This is a subset of the full database presented in McClelland 1981).
Name | Gang | Age | Education | Marital Status | Occupation |
---|---|---|---|---|---|
Art | Jets | 40's | J.H. | Single | Pusher |
Rick | Sharks | 30's | H.S. | Divorced | Burglar |
Sam | Jets | 20's | College | Single | Bookie |
Ralph | Jets | 30's | J.H. | Single | Pusher |
Lance | Jets | 20's | J.H. | Married | Burglar |
Exercise 2: Now click on the reset button to return all of the units to zero activation and reselect the Toggle tool. Activate the instance unit for Art and the pusher and bookie units. Now cycle. Describe what happens to the Pusher and Bookie units.
In this introductory section we have provided an overview of the IAC network. In the next section, we will examine the IAC in more detail to see exactly how it computes activation values. Then we take a step back and look at some of the important properties of the way in which neural networks process information, using the Jets and Sharks network as an example.
The Interactive Activation and Competition Mechanism
The activations of each of the units can be thought of as the "degree
of belief" in that hypothesis or the existence of that feature. The
weights are then an indication of how strongly belief in one hypothesis
or feature implies belief in another hypothesis or feature. The
Interactive Activation and Competition mechanism, embodied primarily in
the activation function, is a cyclic process of updating the
belief in a hypothesis according to existing evidence.
Consider a single unit. The evidence on whether it should increase or decrease its activation comes from its incoming weights. If most of the active units in the network (i.e. the hypotheses that the net believes are true) are connected to this unit with positive weights, then we should increase its activation. If most of the active units are connected to this unit with negative weights then its activation should be decreased. The net input function does this by multiplying each of the incoming weights by the activation of the unit it comes from and adding these values [1]. So the net input for unit i is:
neti = Σj wij aj
where wij is the weight from unit j to unit i and aj is the activation of unit j.
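The net input calculation can be sketched in a few lines of Python (a minimal illustration with made-up weights, not the simulator's own code):

```python
def net_input(weights, activations, i):
    """Net input to unit i: sum of incoming weights times sender activations.

    weights[i][j] holds wij, the weight from unit j to unit i.
    """
    return sum(w * a for w, a in zip(weights[i], activations))

# Two units joined by a mutual weight of -1; unit 1 is fully active,
# so unit 0 receives a net input of -1.
weights = [[0.0, -1.0],
           [-1.0, 0.0]]
activations = [0.0, 1.0]
print(net_input(weights, activations, 0))  # -1.0
```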
Once the net input has been calculated for all units, activations are updated according to the following equation:
if (neti > 0)
Δai = (max - ai) neti - decay (ai - rest)
Otherwise,
Δai = (ai - min) neti - decay (ai - rest)
where Δai is the amount by which we change ai.
The most important part of this equation is neti which determines whether the activation of unit i will increase or decrease. If neti is positive the evidence that this unit should be on is strong and the activation will increase. If it is negative the activation will decrease.
If the activation of a unit is equal to max then the net believes the hypothesis completely. If it is equal to min then the net disbelieves the hypothesis completely. The rest value corresponds to an "I don't know" state. The (max - ai) and (ai - min) terms ensure that the activation remains between min and max and doesn't continue to either grow or shrink without bound [2]. The -decay (ai - rest) part of the equation forces the activation to return to the rest value in the absence of external input.
Typically, we choose max > 0 >= rest >= min. By default max is set to 1, min is set to -0.2, rest is set to -0.1 and decay is 0.1. To change any of these parameters you first create a global value for that parameter by selecting the appropriate item from the Actions menu. Using the arrow tool you can select that global value and alter its value in the parameter panel at the bottom of the screen. Make sure that you hit return to make the value change.
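Putting the update equation and the default parameter values together, a single unit's cycle can be sketched as follows (an illustrative re-implementation, not the tutorial software itself):

```python
MAX, MIN, REST, DECAY = 1.0, -0.2, -0.1, 0.1  # the default parameter values

def delta_a(a, net, mx=MAX, mn=MIN, rest=REST, decay=DECAY):
    """Change in activation for one cycle under the IAC update rule."""
    if net > 0:
        return (mx - a) * net - decay * (a - rest)
    return (a - mn) * net - decay * (a - rest)

# A resting unit with zero net input stays put ...
assert delta_a(REST, 0.0) == 0.0
# ... while a positive net input pushes it up towards max.
a = REST + delta_a(REST, 0.5)   # rises by (1 - (-0.1)) * 0.5 = 0.55
```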
IAC networks exhibit a number of interesting dynamics, including decay towards rest, bounded activations, settling to equilibrium, competition, resonance, blocking, hysteresis and oscillation.
The Dynamics of IAC Networks
We can get a feel for the dynamics outlined above by considering a
number of special cases:
Decay: Consider what happens when the net input to a unit is zero (i.e. neti = 0) so that the unit is only subject to the effect of decay. The "otherwise" clause of the if is taken, which reduces to Δai = -decay (ai - rest). If the current activation is above rest then Δai is negative, which moves the activation down. If the current activation is below rest then Δai is positive, moving the value up. If ai = rest then there will be no change, that is, the activation will have settled on the rest value.
Figure 2: IAC network demonstrating decay.
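The decay case is easy to check numerically (a small sketch assuming the default rest and decay values):

```python
DECAY, REST = 0.1, -0.1  # default parameter values

a = 0.8                  # start well above rest, with zero net input
for _ in range(100):
    a += -DECAY * (a - REST)   # the "otherwise" clause with neti = 0
print(round(a, 3))  # -0.1: the activation has settled back to rest
```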
Boundaries: The activation function is designed to automatically constrain the activation to lie between max and min. To see why, suppose the decay value is set to zero. Since decay only moves the activation towards rest, which lies between min and max, ignoring it will not affect the argument. Now, if the net input is positive and ai is greater than max then Δai = (max - ai) neti, which will be negative, forcing the activation down below max. If the net input is negative and ai is less than min then Δai = (ai - min) neti, which will be positive, forcing the activation above min.
Figure 3: IAC network demonstrating boundaries.
Equilibrium: If the net input to an IAC unit is held fixed the unit will eventually settle to an equilibrium point. We can see where this equilibrium point is by setting Δai to zero and solving for ai. Let's suppose that neti is positive.
0 = (max - ai) neti - decay (ai - rest)
decay ai + neti ai = max neti + decay rest
ai = (max neti + decay rest) / (decay + neti)
Assuming max = 1 and rest = 0:
ai = neti / (decay + neti)
or
ai = (neti / decay) / (neti / decay + 1)
The latter expression shows how decay acts as a scaling factor on the equilibrium point. Note also that regardless of how large the net input becomes, the activation at equilibrium is below one. An equivalent analysis is possible when neti is negative (see exercise 6).
Exercise 5: Calculate the equilibrium point when neti = 1, max = 1, rest = -0.1 and decay = 0.1. Now rerun the previous simulation. Does it settle at the appropriate equilibrium point?
Exercise 6: In the text above the equilibrium point is calculated when neti is fixed and positive. Perform the same calculation for the case where neti is negative.
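The closed-form equilibrium derived above can be checked against the iterated update rule (a quick numerical sketch; the parameter values here are just examples):

```python
def equilibrium(net, mx=1.0, rest=0.0, decay=0.1):
    """Closed-form equilibrium point for a fixed positive net input."""
    return (mx * net + decay * rest) / (decay + net)

def settle(net, mx=1.0, rest=0.0, decay=0.1, cycles=200):
    """Iterate the positive-net-input update rule until it settles."""
    a = rest
    for _ in range(cycles):
        a += (mx - a) * net - decay * (a - rest)
    return a

# With neti = 0.1 and decay = 0.1, both routes give 0.1 / (0.1 + 0.1) = 0.5.
print(equilibrium(0.1), round(settle(0.1), 6))
```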
So far we have been considering single units with fixed net input. As the name suggests, the Interactive Activation and Competition (IAC) network becomes most interesting when multiple units interact. Different dynamics can be established through the interconnection structure and the values of the weights, and in the following sections we will examine some of these.
Competition: Placing units in pools where each unit in the pool is connected to each of the other members by negative weights (with no self weights) sets up a competitive dynamic. Small differences in the net input to units in such a pool become amplified as time progresses. Such a dynamic is useful when the units in the pool represent mutually exclusive hypotheses. For instance, in the Jets and Sharks example a gang member cannot be in their 20s and in their 30s or 40s. Consequently, the units in the "Age" pool are connected by negative weights and as processing continues one of them will eventually win.
To see how competition operates, suppose that we have two units connected with weights of -1. Unit 1 receives external input (i.e. input from units other than unit 2) of I1 and unit 2 receives external input of I2. Furthermore, let's suppose that decay is zero, the initial activations of the units are zero and max = 1 > I1 > I2 > 0 = min.
At the first time step:
Δai = (max - ai) neti - decay (ai - rest)
= (1 - 0) Ii
So Δa1 = I1 and Δa2 = I2, giving a1 = I1 and a2 = I2 (since both activations started at zero).
At the second timestep:
net1 = I1 - a2 = I1 - I2 and net2 = I2 - a1 = I2 - I1
So: Δa1 = (1 - I1)(I1 - I2) and Δa2 = I2 (I2 - I1)
Because 1 - I1 and I2 are positive and I1 > I2: Δa1 > 0 and Δa2 < 0. So a1 grows while a2 dies away. The stronger a1 becomes the more it inhibits a2. The weaker a2 becomes the less it inhibits a1.
Figure 4: IAC network demonstrating competition between units.
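The runaway effect of competition can be seen by iterating the two-unit derivation above (decay 0, min 0 and max 1 as assumed in the text; the inputs I1 = 0.4 and I2 = 0.3 are arbitrary illustrative values):

```python
def step(a1, a2, i1, i2, mx=1.0, mn=0.0):
    """One cycle for two units coupled by a weight of -1, with no decay."""
    n1, n2 = i1 - a2, i2 - a1
    d1 = (mx - a1) * n1 if n1 > 0 else (a1 - mn) * n1
    d2 = (mx - a2) * n2 if n2 > 0 else (a2 - mn) * n2
    return a1 + d1, a2 + d2

a1 = a2 = 0.0
for _ in range(30):
    a1, a2 = step(a1, a2, 0.4, 0.3)
# The small initial advantage is amplified: unit 1 wins outright
# while unit 2 is driven down towards min.
```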
Resonance: If units are connected by mutually excitatory weights they will tend to make each other active and to keep each other active in the face of decay. This phenomenon is called resonance (by analogy with the resonance of waves).
For instance, suppose we have two units connected by excitatory connections with strength 2 x decay. Furthermore, suppose that rest is set to 0 and each unit has an activation of 0.5. How will they change?
Δai = (1 - ai) neti - decay ai
= (1 - 0.5) x 2 x decay x 0.5 - decay x 0.5
= 0
So the activations will remain at 0.5 indefinitely, that is, the units will resonate. The above example might leave one with the impression that resonance occurs only when the parameters are carefully selected. This is not the case, although usually the units will not come to have exactly the same values as the next exercise demonstrates.
Figure 5: IAC network demonstrating resonance of units.
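The resonance example can be verified directly (weights of 2 x decay, rest 0 and both activations at 0.5, as in the text):

```python
DECAY = 0.1
W = 2 * DECAY          # mutual excitatory weight
a1 = a2 = 0.5

for _ in range(100):
    n1, n2 = W * a2, W * a1
    d1 = (1 - a1) * n1 - DECAY * a1   # rest = 0, positive net input
    d2 = (1 - a2) * n2 - DECAY * a2
    a1, a2 = a1 + d1, a2 + d2

print(a1, a2)  # both units hold steady at 0.5
```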
Blocking and Hysteresis: The initial state of a network can have long term consequences for the way that the units respond to external input. A unit which is initially active can slow down the response of another unit (hysteresis) or in extreme cases block it completely.
Suppose we have two units with mutually inhibitory connections of strength -2 x decay. Unit 1 begins with an activation of 0.5 and both units have external input of decay. Supposing rest is 0:
Δa1 = (1 - a1) net1 - decay a1
= (1 - 0.5) decay - decay x 0.5
= 0
Δa2 = (a2 - min) net2 - decay a2
= (a2 - min) (-2 decay x 0.5 + decay) - decay x 0
= 0
The activation of unit 1 will be maintained at 0.5, but despite a positive external input the activation of unit 2 will remain at zero. Unit 2 is being blocked by the activation of unit 1.
Figure 6: IAC network demonstrating blocking.
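The blocking calculation can likewise be run out over many cycles (rest 0 as in the text, with the default decay of 0.1 and min of -0.2 assumed; the external input of decay goes to both units):

```python
DECAY, REST, MN, MX = 0.1, 0.0, -0.2, 1.0  # rest = 0 as in the text

def delta(a, net):
    """One-cycle activation change under the IAC update rule."""
    if net > 0:
        return (MX - a) * net - DECAY * (a - REST)
    return (a - MN) * net - DECAY * (a - REST)

w = -2 * DECAY     # mutual inhibitory weight
ext = DECAY        # external input to both units
a1, a2 = 0.5, 0.0  # unit 1 starts active

for _ in range(50):
    n1, n2 = ext + w * a2, ext + w * a1
    a1, a2 = a1 + delta(a1, n1), a2 + delta(a2, n2)

print(a1, a2)  # unit 1 holds at 0.5; unit 2 stays blocked at 0.0
```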
Oscillation: Up to this point we have concentrated on networks which settle to a steady state of activation. While this is often the case, IAC networks can oscillate (i.e. enter a limit cycle) given the appropriate parameters. Figure 7 shows such a network and Exercise 10 outlines how to construct it.
Figure 7: IAC network demonstrating oscillation.
Exercise 11: Calculate Δa1 and Δa2 and explain how the network oscillates.
Back to the Jets and Sharks Network
Now that you have a more in depth understanding of how the IAC network
operates, let's return to the exercises performed in the introductory
section. Reset the Jets and Sharks network and activate the Art instance
unit using the toggle tool. Click on the cycle button.
Exercise 12: Why is Ralph's instance unit active?
Exercise 13: Why isn't the Ralph name unit active?
Exercise 14: Cycle for another 90 cycles (making a total of 100 cycles). What happens to Ralph's name unit? Why?
Exercise 15: Why are some of Art's features more active than others? Note: you can select the arrow tool and click on a unit to see its exact activation in the parameter panel at the bottom of the screen.
Now click on the reset button to return all of the units to zero activation and activate the instance unit for Art and the pusher and bookie units. Now run one cycle.
Exercise 16: What are the activations of the bookie, pusher and burglar units after one cycle?
Exercise 17: Why is the pusher unit more active than the bookie unit?
Exercise 18: Explain the activation of the burglar unit.
How Neural Networks Process Information
Neural networks process information in a very different way from
standard computers based on the Von Neumann architecture. Whereas
Von Neumann machines rely on discrete sequential processing, neural
networks are highly parallel and continuous in nature. These
differences are important both in terms of designing useful devices and
because they seem to provide a closer match to the way that people
operate. There are four properties of neural networks that we will be
demonstrating. These are: content addressability, robustness in the face of noise, generalisation across exemplars and default assignment.
Content Addressability
In a conventional computer, a record is retrieved by its address or by a designated key; without that key, the record is hard to find. Contrast this with human memory. We can be reminded of facts and episodes by what are often quite obscure cues, which may not be unique when taken in isolation. We are able to locate records (or memories) based on any of the information stored in that record. For instance, if I say to a friend, "I like your blue shirt with the stripes.", they will often know exactly which one I'm talking about, despite the fact that I have not provided a unique identifier for that shirt. Furthermore, the cues given, that is, blue and striped, may only specify the shirt uniquely when taken in combination (if your friend has other blue shirts and other striped shirts).
Of course, it is possible with the appropriate query to retrieve the same information from a computer database. In human memory, however, retrieval from the content of a memory is automatic - human memory is fundamentally content addressable.
Exercise 19: To illustrate content addressability in the Jets and Sharks network, first reset the activations. Now activate the Jet and 30's units and run 10 cycles. These characteristics, while not unique individually, do specify a unique gang member when taken together. Who is it?
Robustness to Noise
Von Neumann architectures are discrete in nature. This discrete nature
allows them to retain information completely faithfully when subjected
to small amounts of noise. Provided the noise is not sufficient to
switch a bit from a one to a zero or vice versa the information will be
interpreted as intended. This is, of course, the secret behind the
improved performance of digital recording. For a great many
applications this is a very useful property. I would prefer that the bank
retained the exact balance of my account, not an approximation or best guess.
In contrast, neural networks use redundancy in their structure to provide a best guess of the information to be retrieved. Such an approach is very useful in situations of extreme noise (such as speech recognition) where the information is incomplete or even incorrect. The next exercise demonstrates how the IAC network is resistant to erroneous information.
Exercise 20: Reset the Jets and Sharks network and toggle the Shark, 30's, H.S., Burglar and Married units. These are the characteristics of Rick except that Rick is actually Divorced not Married (see table 1). Now run 10 cycles. Which name unit comes on?
Exercise 21: The Lance unit also becomes partially active. Why?
Exercise 22: Why does the Lance instance unit deactivate?
Exercise 23: Run another 40 cycles. What happens to the marital status units?
Generalization
One operation that people seem to be very good at is collapsing over a
set of instances to establish a general trend. For instance, we might
ask "Are Americans more extroverted than Australians?". Unless you have
read the studies claiming that in fact they are, then your only
recourse would be to collapse across the set of Americans and the set
Australians you know and to extract some form of central tendency
measure on the extrovert/introvert dimension. This is quite a
difficult computation, but one that people perform routinely. The IAC
network can accomplish spontaneous generalisation of this kind by
activating a property and cycling.
Exercise 24: To ask the question "What are Single gang members like?" reset the network and toggle the Single unit on. Run ten cycles. Which characteristics become active?
Exercise 25: The Art and Ralph instance units become active but not the Sam instance unit, despite the fact that Sam is Single also. Why is this?
Exercise 26: Why does the 40s unit become active?
Default Assignment
The final property that we will examine is the ability of the IAC
network to provide plausible default values if it does not "know" a
given piece of information. The human memory system makes extensive use
of plausible defaults. In fact, people can have difficulty
distinguishing actual memories from those that have been reconstructed
from other related information. In the IAC network the provision of
plausible defaults is closely related to generalisation. Items which
are similar to the target item are used to extrapolate what the missing
information should be. In the following exercise we will remove some of
the weights in the Jets and Sharks network and see if it can provide
reasonable answers.
Exercise 27: Firstly, reset the network, toggle the Ralph instance unit on and run 10 cycles. Note which properties become active. Now use the delete tool to remove the weights between the Ralph instance unit and the 30s unit and the Ralph instance unit and the Single unit (remove the weights both to and from the units - that is 4 weights all up). Reset the network, toggle the Ralph instance unit on and run 30 cycles. How successful was the network at guessing Ralph's age and marital status properties? Explain the results.
Here endeth the IAC tutorial. If you are interested in the IAC mechanism, see the Word Superiority network (in the Networks menu), which uses the same mechanism to model people's ability to recognise letters in the context of words.
References
McClelland, J. L. (1981). Retrieving general and specific information
from stored knowledge of specifics. Proceedings of the Third Annual
Meeting of the Cognitive Science Society, 170-172.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88, 375-407.
McClelland, J. L. & Rumelhart, D. E. (Eds.). (1988). Explorations in parallel distributed processing: A handbook of models, programs, and exercises. Cambridge, MA: MIT Press.
Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89, 60-94.
Rumelhart, D. E., & McClelland, J. L. (Eds.). (1986). Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1). Cambridge, MA: MIT Press.
Endnotes
[1] In the McClelland and Rumelhart formulation,
negative activations coming from other units are set to zero before entering
into the net input calculation. For simplicity's sake we have not included
this thresholding process. This restriction has little impact on the major
points we are trying to address.
[2] In the McClelland and Rumelhart version, an activation which falls outside of the boundaries is set to either max or min, whichever is closest. This prevents the activations from quickly becoming very large or very small, which can occur if parameter values are large. We have not included this component so that the natural tendency of the activation function to keep activations within bounds can be observed. Be warned, though: if your activations are growing very large you may need to decrease some of the parameter values.