The Effects of Arousing Message Content and Structural Complexity on Television Viewers' Arousal and Allocation of Processing Resources.

Annie Lang
Associate Professor
Department of Telecommunications
Indiana University
515 N. Park Ave.
Bloomington, IN 47405
(812) 855-5824
ANLANG@INDIANA.EDU


Paul Bolls
and
Karlynn Kawahara


Graduate Students
School of Communication
Washington State University
Pullman, WA 99164-2520


April 5, 1996



Paper presented to the Midwest Artificial Intelligence and
Cognitive Science Conference

This paper examines how two aspects of a television message (the arousingness of the content and the structural complexity of the production) affect viewers' physiological and cognitive responses to the message. In particular, this study examines how these two variables affect viewers' arousal levels and viewers' allocation of processing resources.

This paper uses a limited capacity information processing model of television viewing. Television viewers are limited capacity information processors. To process a message, the information in the message must be encoded, processed, and stored for later retrieval. How well a message is processed is jointly determined by the amount of capacity required by the message and the amount of capacity allocated by the viewer to the task of processing the message.

The amount of capacity required by a message is determined jointly by the content and the structure of the message. Many aspects of television's content (e.g. difficulty, familiarity, emotion, and complexity) and structure (e.g. cuts, edits, camera movements, and audio video redundancy) have been shown to alter capacity allocation (Lang, et al., 1993; Lang, 1995).

The amount of capacity allocated to the message by the viewer is jointly determined by individual characteristics of the viewer and the structure of the TV message. Viewers choose how much attention they will pay to a message as a result of their own motivations, interests, knowledge, mood, etc. Many basic elements of television's structure (e.g. cuts, edits, movement, and video graphics) also determine capacity allocation through the elicitation of orienting responses and accompanying brief increases in capacity allocation (Lang, et al., 1993).

The global effects of these brief increases in capacity allocation are not entirely clear. Locally, these structural features clearly result in orienting responses and increased capacity allocation. For example, Lang et al. (1993) have shown that cuts (defined as a sudden change from one visual scene to another) elicit cardiac orienting responses and increases in secondary task reaction times (STRTs) in attentive television viewers if STRTs are measured at the precise point of the cut. However, STRTs measured at random points in highly complex television messages (defined as messages with many cuts, edits, and camera techniques) are faster than STRTs measured at random points in simple television messages.

This creates a difficulty in our understanding of how people allocate capacity to viewing television. A single structural feature seems to create a local or short term increase in capacity allocation, and yet if we introduce many such structural features, which might logically be thought to raise the overall level of capacity being allocated to a message, reaction times decrease, suggesting that the overall (or global) capacity allocation has gone down.

This is similar to research done by Britton on capacity allocation to simple and complex text. This work also showed that STRTs were faster during complex text than they were during simple text. Many of the same possible explanations have surfaced to explain both sets of results. One possibility is that somehow simple texts or television messages "fill capacity" to a greater extent. The problem with this explanation is that it is difficult to understand what it means to "fill capacity" or to determine what stimuli might result in capacity being filled.

A second possibility is that complex texts and television messages elicit increases in arousal and that at increased levels of arousal viewers actually have a larger pool of capacity (Kahneman, 1973), and therefore reaction times are faster even though more capacity is being allocated. The major drawback to this position is that in both the work on text and television, complex messages are remembered less well than simple ones, which is somewhat inconsistent with the notion that there is more capacity and a smaller fraction of it is being required. If that were the case, then performance on the television watching task should not decrease.

Recently Lang and Basil (1996) have suggested a third possibility based on a reinterpretation of what is being measured by STRTs. In their model, watching television (for example) requires the viewer to encode the information contained in a television message, process it, and store it in memory. Television messages are ongoing streams of visual and verbal information. Hence at any given moment a viewer is encoding new information, processing the previously encoded information, and storing the previously processed information. They suggest that capacity may be allocated independently to each of these processes, and that the structural variables of television (like cuts, edits, etc.) increase the allocation of capacity to the sub-process of encoding the message while content variables (like difficulty or emotion) may increase the allocation of capacity to storage. Thus, in this model, complex video (defined as video with many structural features) might be expected to increase the capacity allocated to encoding the message.

They go on to suggest that STRTs are not actually measuring the overall level of capacity being allocated to a task, but rather that STRTs index the amount of capacity allocated to the encoding sub-process. They further suggest that rather than measuring the overall level of capacity allocated to encoding, the STRT measures how much of the capacity allocated to encoding (as a result of both structural and voluntary dimensions of allocation) is available (i.e. is not being used) at the encoding level of processing. This notion of availability suggests that we may overallocate capacity to the task of encoding a stimulus (possibly as a result of its structural features) and therefore have available capacity at the encoding stage to respond to a secondary task probe. This model suggests that the secondary task response is primarily an encoding task and therefore may be most sensitive to shortages of capacity during encoding.

This study is designed to investigate two hypotheses: 1) that structural complexity of a television message elicits arousal, which increases capacity; or 2) that structural complexity of a television message results in an increase in capacity allocated to encoding television messages and that STRTs index either capacity allocated to encoding or capacity available at encoding. Arousal was manipulated by choosing television messages that had arousing or calm contents. Structural complexity was manipulated by choosing messages with varying levels of cuts.

Methodology


Design and Independent Variables

The experiment is a mixed 3 (Order of Presentation) X 3 (Structural Complexity) X 2 (Arousal) X 5 (Message) design. To construct the stimulus tapes 30 messages were chosen from a pool of 312 coherent 30 second messages which had been taped off the local cable system (not including premium channels). Two levels of arousingness (calm and arousing) and three levels of Structural Complexity (simple, medium, and complex) were completely crossed. Five messages were chosen in each arousing/complexity category, resulting in a total of thirty messages. Three semi-random presentation orders were constructed and order of presentation was the only between subjects variable. Orders were constructed in blocks of six messages. The six messages in each block contained one message from each arousing/complexity category. The messages making up each block were randomly chosen for each order with the constraint that, across the three orders, each individual message had to appear in the first or last block of six one time.
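The blocked randomization described above can be sketched roughly as follows. This is an illustration only, not the procedure actually used to build the stimulus tapes: the message labels and the function name build_order are hypothetical, and the additional constraint on which messages appear in the first or last block across the three orders is not implemented here.

```python
import random

AROUSAL = ["calm", "arousing"]
COMPLEXITY = ["simple", "medium", "complex"]
MESSAGES_PER_CELL = 5  # five messages in each arousal/complexity category


def build_order(seed):
    """Return one presentation order as 5 blocks of 6 messages.

    Each block holds exactly one message from each of the six
    arousal/complexity categories. Message labels are hypothetical.
    """
    rng = random.Random(seed)
    # Shuffle the five messages within every category independently.
    cells = {}
    for a in AROUSAL:
        for c in COMPLEXITY:
            ids = [(a, c, i) for i in range(1, MESSAGES_PER_CELL + 1)]
            rng.shuffle(ids)
            cells[(a, c)] = ids
    blocks = []
    for b in range(MESSAGES_PER_CELL):
        # Take the b-th message of each category, then shuffle within the block.
        block = [cells[key][b] for key in cells]
        rng.shuffle(block)
        blocks.append(block)
    return blocks


# Three orders, one per level of the between-subjects factor.
orders = [build_order(seed) for seed in (1, 2, 3)]
```

Shuffling within categories first and then within blocks guarantees that every block is balanced across the six categories while the position of any particular message still varies from order to order.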

Structural Complexity was operationalized as the number of cuts in a 30 second television message. Simple messages had 0 or 1 cut, medium messages had 4-6 cuts, and complex messages had 11 or more cuts in thirty seconds.
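As a minimal sketch, the operationalization above amounts to a simple mapping from cut counts to levels. The function name complexity_level is hypothetical, and returning None for counts that fall between the stated ranges (2-3 or 7-10 cuts) is an assumption, since the text does not say how such messages were treated.

```python
def complexity_level(cuts):
    """Map the number of cuts in a 30-second message to a complexity level.

    Returns None for counts that fall between the categories used in the
    study (2-3 or 7-10 cuts); treating those messages as excluded is an
    assumption.
    """
    if cuts < 0:
        raise ValueError("cut count cannot be negative")
    if cuts <= 1:
        return "simple"
    if 4 <= cuts <= 6:
        return "medium"
    if cuts >= 11:
        return "complex"
    return None


levels = [complexity_level(n) for n in (0, 5, 12)]
```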

Arousingness of content was operationalized by having at least three undergraduate coders rate the pool of 312 messages using SAM (the Self-Assessment Manikin) developed by P.J. Lang (Bradley et al., 1992). SAM is a pictorial arousal scale which translates into a 9 point scale ranging from 1=very aroused or excited to 9=calm, sleepy, not aroused. Messages were chosen that were rated 1-3 or 7-9 by all the original coders. Arousing contents included fights, sexual scenes, and chases. Calm contents included nature, science, and meetings.
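The selection rule can be illustrated with a short sketch. The function name content_category and the example ratings are hypothetical; the only rules taken from the text are the requirement of at least three coders, the 1-3 and 7-9 ranges, and the scale direction (low numbers indicate high arousal).

```python
def content_category(ratings):
    """Classify a message from its coders' SAM arousal ratings (1-9).

    On this scale 1 means very aroused and 9 means calm, so unanimous
    ratings of 1-3 mark arousing content and 7-9 mark calm content.
    """
    if len(ratings) < 3:
        raise ValueError("at least three coder ratings are required")
    if all(1 <= r <= 3 for r in ratings):
        return "arousing"
    if all(7 <= r <= 9 for r in ratings):
        return "calm"
    return None  # coders disagreed or ratings were mid-scale: not selected


example = content_category([2, 1, 3])
```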

Dependent Variables
Arousal was measured in two different ways. First, all subjects in the experiment used SAM to rate how aroused they felt immediately following each message. Second, a group of subjects in the experiment was assigned to a physiological condition. In this condition subjects' heart rate (HR) and skin conductance (SC) were measured during viewing.

Capacity allocation was measured using STRTs. One reaction time probe was placed randomly within each ten second period of each message with the constraint that no probe was placed within one second of a cut (Geiger & Reeves (1993) have shown that the local increase in STRT associated with a cut occurs primarily within 500 ms of the cut).
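One way to satisfy the probe placement constraint is rejection sampling within each ten second window, sketched below. The function name place_probes, its parameters, and the example cut times are hypothetical, and the use of rejection sampling is an assumption about how "randomly" was implemented.

```python
import random


def place_probes(cut_times, duration=30.0, window=10.0, min_gap=1.0, seed=None):
    """Pick one probe time (in seconds) inside each window of the message.

    A candidate is rejected and redrawn if it falls within min_gap seconds
    of any cut.
    """
    rng = random.Random(seed)
    probes = []
    start = 0.0
    while start < duration:
        end = min(start + window, duration)
        for _ in range(10_000):  # guard against windows with no valid slot
            t = rng.uniform(start, end)
            if all(abs(t - c) >= min_gap for c in cut_times):
                probes.append(t)
                break
        else:
            raise ValueError(f"no valid probe position in {start}-{end} s")
        start += window
    return probes


# Hypothetical medium-complexity message with five cuts.
cuts = [2.0, 9.5, 14.0, 21.0, 27.5]
probes = place_probes(cuts, seed=1)
```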

The experiment was controlled by a Zenith 286 computer with a Labmaster A/D D/A board. Reaction time probes were placed on audio track two of the video tape and came out of the television speakers (as part of the tape). The tones were clearly audible but not significantly louder than the message audio. The tones and the subjects' responses were both recorded as digital events by the computer, and the number of milliseconds between them was recorded as the reaction time.
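The computation of reaction time from the two event streams can be sketched as follows. The function name reaction_times and the example timestamps are hypothetical, and scoring a probe as None when no button press precedes the next tone is an assumption, since the handling of missed probes is not described.

```python
def reaction_times(tone_ms, response_ms):
    """Pair each tone with the first response that follows it.

    Both arguments are sorted lists of event timestamps in milliseconds.
    Returns one reaction time per tone, or None when no response occurs
    before the next tone.
    """
    rts = []
    j = 0
    for i, tone in enumerate(tone_ms):
        next_tone = tone_ms[i + 1] if i + 1 < len(tone_ms) else float("inf")
        # Skip responses that occurred before this tone.
        while j < len(response_ms) and response_ms[j] < tone:
            j += 1
        if j < len(response_ms) and response_ms[j] < next_tone:
            rts.append(response_ms[j] - tone)
            j += 1
        else:
            rts.append(None)
    return rts


rts = reaction_times([1000, 12000, 25000], [1800, 25900])
```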

SC was measured by placing two Beckman standard Ag/AgCl electrodes on the subject's non-dominant hand after washing the skin with distilled water to control hydration. The signal was passed to a Coulbourn SC module. SC level was sampled and recorded 20 times per second throughout viewing.

HR was measured by placing two Beckman mini Ag/AgCl electrodes on the subject's forearms. A ground electrode was placed on the subject's non-dominant forearm. HR was recorded using a Coulbourn bio-amplifier with filters. Data were recorded as milliseconds between beats and were later converted to HR per second.
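The conversion from inter-beat intervals to heart rate rests on the identity HR (beats per minute) = 60000 / interval (ms). A minimal sketch of a per-second conversion follows; the function name hr_per_second and the time-weighted averaging are assumptions, since the text states only that the data were converted to HR per second.

```python
def hr_per_second(ibis_ms, n_seconds):
    """Convert inter-beat intervals (ms) to heart rate (bpm) for each second.

    Each interval contributes 60000 / IBI beats per minute, weighted by the
    share of the second it occupies.
    """
    # Start time, end time (ms), and bpm of every inter-beat interval.
    bounds = []
    t = 0.0
    for ibi in ibis_ms:
        bounds.append((t, t + ibi, 60000.0 / ibi))
        t += ibi
    hr = []
    for s in range(n_seconds):
        lo, hi = s * 1000.0, (s + 1) * 1000.0
        total = 0.0
        covered = 0.0
        for start, end, bpm in bounds:
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                total += bpm * overlap
                covered += overlap
        # None marks seconds with no recorded beats.
        hr.append(total / covered if covered else None)
    return hr


hr = hr_per_second([1000, 1000, 500, 500], n_seconds=3)
```

In this example the first two seconds each contain one 1000 ms interval (60 bpm) and the third second contains two 500 ms intervals (120 bpm).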

Participants
Ninety-six undergraduates at a large Western University participated in this experiment for extra credit in a Communications course. Fifty-one participants were assigned to the reaction time condition and 30 were assigned to the physiology condition.

Procedure
Participants in the reaction time condition viewed the stimulus tape in groups of 2-6 on a 19 inch color television. Subjects held a button in their dominant hand and were instructed to watch the television closely, since they would be tested on what they could remember, and to press the button as fast as they could whenever they heard a tone. Subjects viewed a practice message with five reaction time beeps and were given a chance to ask questions before viewing the stimulus tape. Participants in the physiology condition also viewed the practice tape in order to become accustomed to the electrodes and the environment before viewing the stimulus tape. Following viewing, both groups of subjects filled out several questionnaires.

In both conditions there was a 30 second pause between messages. Participants were given 15 seconds to rate their emotional responses to the message using the SAM scale described above. This was followed by either a 10 second recovery period and a five second baseline data collection period in the physiology condition or a 15 second recovery period in the reaction time condition.

Hypotheses and Results


The variable capacity explanation discussed above predicts that when subjects are aroused they will have larger capacity pools and therefore faster STRTs. Thus, the first questions to be answered involve whether or not subjects were in fact aroused. Messages were chosen to have contents that were either arousing or calm. Thus it is logical to predict that participants will rate the arousing messages as more arousing and have higher SC levels during arousing messages. Previous research suggests that HR will be lower during arousing messages, suggesting greater attention to these messages (Lang & Bolls, 1995), because parasympathetic activation tends to dominate sympathetic activation of the heart during television viewing.

H1: There will be a main effect for Arousingness of Content such that arousing messages will have higher SC and SAM levels but lower HR levels than calm messages.

The results are shown in Table 1. As predicted, SAM ratings indicated greater arousal for arousing messages (lower SAM scores indicate greater arousal), and HR was lower for arousing messages. There was no significant difference for SC, primarily due to a significant complexity by arousal interaction, which will be discussed later.

Table 1: Means and ANOVAs for the Arousing Main Effect

Measure   Calm      Arousing  F         df        p<
SAM       7.22      5.66      196.66    (1,49)    .001
SC        8.03      7.96      n.s.
HR        76.10     75.21     21.33     (1,27)    .001

Previous research suggests that complexity elicits arousal in viewers, though this has rarely been measured. Hence it is predicted that complex messages will be rated as more arousing and show higher SC and lower HR levels than medium or simple messages.

H2: There will be a main effect for Structural Complexity such that as Complexity increases SAM and SC will increase, and HR will decrease.

Results from the SAM data showed the predicted main effect for Structural Complexity (F(2,98)=26.71, p<.001). Simple messages had the lowest arousal ratings (M=6.97) followed by medium (M=6.29) followed by complex (M=5.97). The Complexity main effect on the HR and SC data was not significant. Figure 1 and Figure 2 show HR and SC data averaged over 5 second intervals for the six types of messages. As can be seen in these figures, HR distinguishes fairly clearly among the arousing and calm messages, but not among the different levels of complexity. However, for SC there is a significant Arousal X Complexity X Time interaction (F(4,108)=2.79, p<.03). As time goes on, distinctions among the complexity levels do become significant even though the overall main effect is not significant.
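The averaging over 5 second intervals used for the figures can be sketched as a simple binning step. The function name interval_means and the example values are hypothetical; the default rate of 20 samples per second matches the SC sampling rate reported in the methodology, and a rate of 1 would apply to the per-second HR data.

```python
def interval_means(samples, rate_hz=20, interval_s=5):
    """Average a sampled signal over consecutive fixed-length intervals.

    Trailing samples that do not fill a whole interval are dropped.
    """
    per_bin = rate_hz * interval_s
    means = []
    for start in range(0, len(samples) - per_bin + 1, per_bin):
        chunk = samples[start:start + per_bin]
        means.append(sum(chunk) / per_bin)
    return means


# Hypothetical 10 seconds of SC data sampled at 20 Hz.
sc = [1.0] * 100 + [3.0] * 100
sc_means = interval_means(sc)
```

A 30 second message sampled at 20 Hz yields 600 samples and therefore six interval means.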

These results show that both complexity and content increase viewers' self-reported levels of arousal in a seemingly additive way. On the other hand, they have different effects on the two physiological measures. Increasing complexity results in an increase in SC levels, while greater content arousingness is manifested by a decrease in HR. But Complexity does not appear to alter HR, and Content Arousingness does not affect SC much (unless the message is simple).

What will be the effect of these two types of arousal on capacity allocation? The Kahneman/variable capacity prediction suggests that the most arousing messages should have the fastest reaction times, hence this theory would predict the fastest RTs for Arousing complex messages and the slowest RTs for calm simple messages. It would be logical, from this perspective to predict:

H3 alternative 1: There will be main effects for Arousingness of Content and Complexity such that Arousing messages will have faster RTs than calm messages and Complex messages will have faster RTs than simple messages.

On the other hand, the second explanation proposed makes somewhat different predictions. First, this hypothesis predicts that cuts result in an automatic call for capacity to encode the message. Thus, as more cuts are introduced into the message, more capacity will be allocated to the message. However, the STRT is now conceived of as measuring capacity allocated to encoding or, even more specifically, capacity available at encoding.

In previous research, complexity was measured by counting the number of cuts, edits, and camera movements in a message. Cuts, edits, and camera movements differ in the amount of information they introduce into the message, which should affect how much of the capacity allocated to the structural feature will be needed to encode the accompanying information. Edits (which are changes from one camera to another in the same visual scene) and camera movements add little new information, so increasing the number of edits and camera movements will increase the capacity allocated to encoding without greatly increasing the amount of information available to be encoded. In this case it is likely that more capacity will be allocated to encoding than is required. Hence reaction times should decrease, since there would be available capacity at encoding. At the same time, because this capacity was allocated to encoding, there would be less capacity available to be allocated to storage, so memory for the messages would go down (as has been shown in previous research, e.g. faster RTs and worse memory).

Cuts, on the other hand, do introduce new information, since they are defined as a change to a new visual scene. As a result, cuts likely require more of the capacity that they elicit to encode the information that is associated with them. This should result in an increase in capacity allocated to encoding and an increase in the capacity needed to encode the message, and therefore a decrease in the capacity available at encoding. Given that this manipulation involved only cuts (unlike previous research, which included edits and camera movements), it is logical to predict that for this experiment increases in structural complexity will increase rather than decrease reaction times.

The second prediction made by this theoretical explanation is that content variables are more likely to affect the capacity allocated to storage of the message. Hence, arousingness of content should increase the capacity allocated to storage. STRTs will only index this increased allocation indirectly, in that as more capacity is allocated to storage there will be less capacity available to be allocated to encoding. Thus, if arousal causes more capacity to be allocated to storage, the encoding process will be overloaded sooner for arousing messages than for calm messages. So while there is unlikely to be a main effect for arousal, there should be an interaction between Structural Complexity and Arousingness of Content such that RTs increase at a faster rate for arousing messages than they do for calm messages. Hence the alternative hypothesis:

H3 alternative 2: There will be a main effect for Structural Complexity on the RT data such that as complexity increases RTs will increase. Further, there will be an interaction of Structural Complexity and Arousal such that RTs will increase faster for arousing messages than they will for calm messages.

Analysis of the RT data shows a significant main effect for Arousingness of Content (F(1,48)=15.36, p<.001) such that arousing messages had slower RTs (M=845.71) than calm messages (M=799.37). There was no main effect for Structural Complexity (F<1). The interaction of Complexity and Arousal was significant (F(2,96)=3.42, p<.037) and is shown in Figure 3.

These results do not precisely match the predictions of either alternative. The variable capacity hypothesis predicted faster RTs for arousing messages compared to calm messages. But these data show slower reaction times for arousing messages. On the other hand, arousing complex messages had faster RTs than arousing simple or arousing medium messages which is consistent with the variable capacity hypothesis.

The encoding allocation hypothesis, for its part, predicted a main effect for structural complexity on the RT data, which was not significant. However, structural complexity clearly has an impact on RTs, but the direction of the effect is determined by the Arousingness of the message. For calm messages the effect of complexity is in the direction predicted, that is, slower reaction times indicative of less available capacity as structural complexity increases. But for arousing messages the effect is in the opposite direction.

Discussion

Overall, this study suggests that the relationships among message content and structure, arousal, and capacity allocation are not simple. While both message content and structural complexity appear to elicit arousal independently, as measured by viewers' SAM ratings, they do not appear to elicit the same pattern of physiological response. Content appears to have its greatest effect on HR and complexity has its greatest effect on SC.

References
Bradley, M., Greenwald, M., Petry, M., & Lang, P. (1992). Remembering pictures: Pleasure and arousal in memory. Journal of Experimental Psychology, 18(2), 379-390.

Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice-Hall.

Lang, A., & Basil, M. D. (1996). What do secondary task reaction times measure anyway? Paper to be presented to the International Communication Association, Chicago, IL, May.

Lang, A., Dhillon, P., & Dong, Q. (1995). Arousal, emotion, and memory for television messages. Journal of Broadcasting and Electronic Media, 38, 1-15.

Lang, A. (1995). Defining audio/video redundancy from a limited capacity information processing perspective. Communication Research, 22, 86-115.

Lang, A., Geiger, S., Strickwerda, M., & Sumner, J. (1993). The effects of related and unrelated cuts on viewers' memory for television: A limited capacity theory of television viewing. Communication Research, 20(1), 4-29.