The CIRCSIM-Tutor v.3 input understander is the module which will process student answers during a tutoring dialog about blood pressure regulation. We have example transcripts of such dialog using both human and computer tutors. Even though the student is answering comparatively simple questions, we can observe a number of complicating phenomena, ranging from creative abbreviation to varied syntax. Students also have habits such as hedging answers and giving appropriate but unexpected responses. An important goal of the input understander is to recognize and correctly respond to some of these phenomena.
The tutoring protocol has several phases. The phase of interest to us consists of tutoring dialog, where the student and the computer take turns communicating in written language. The computer has control over the conversation, alternately teaching concepts and asking questions. The student's input is almost entirely responses to questions.
The "Input Understander" for CIRCSIM-Tutor [2] is a module which accepts the student's typed utterance and matches it to the needs of the pedagogical and language planners and the student modeller. Note that "understanding" is not a very good word for what the input understander does. There is a limited, finite set of ideas which can possibly be of use to the planners and the modeller. Any student-expressed ideas outside this limited set are uninteresting.
To a first approximation, then, the job of the input understander is to match the answer to the question. Here are some short questions CIRCSIM-Tutor version 2 produces, together with typical correct student responses which we have recorded in log files:
Tu: What is the correct value of stroke volume?
St: Increased
Tu: What are the determinants of cardiac output?
St: Heart rate and stroke volume
Tu: What is the relationship between heart rate and cardiac output?
St: Direct
Although we want to create more sophisticated language and tutoring strategies in version 3, resulting in richer language and occasionally more complicated questions, typical questions will not be more demanding than those illustrated above.
However, even within the domain of fairly simple questions, the input understander needs to handle a number of phenomena beyond the correct one-word or one-phrase answers illustrated above. The increased demand on the input understander comes from both the variety of student behaviors which we are interested in handling correctly and the increased capabilities of the pedagogical and language planners.
A primary source of data on student behavior is the set of keyboard-to-keyboard tutoring transcripts the CIRCSIM-Tutor project has accumulated. We have more than sixty of them, each one or two hours long, including forty-seven with expert tutors. In this paper we use cleaned-up extracts (fixed spelling, expanded abbreviations) from the transcripts to illustrate the behaviors we want the input understander to handle. We have also mined examples from logs of students using the existing CIRCSIM-Tutor version 2.
Having these extensive transcripts makes it possible for us to have some confidence that our lists of short answers and hedges, which are derived from transcript data, are sufficiently comprehensive.
Unfortunately, there is often little syntactic resemblance between the question and the answer. Here are some responses, most verbatim and the rest derived from real examples, to the question emitted by version 2: "What is the correct value of (some parameter)?"
up (adverb)
increase (verb)
increases
increased (predicate adjective or past participle)
i (drastic but common abbreviation)
unchanged
no change
goes up (phrasal verb)
went up
it goes up (a whole sentence)
negative (adjective)
+ (symbol)
zero
remains same (curious grammar)
There is no profit in parsing many of these answers with a sentence grammar. Instead, the lexical entries for observed short-answer words and phrases contain pointers to the relevant logical concept. The above examples cover only three concepts: up, down, and no change. It is the job of the input understander to try to match these concepts, taken from the lexical entries, to the question, which is available from the pedagogical planner in logical form. In our keyboard-to-keyboard transcripts we have fairly extensive examples of these kinds of answers.
For some topics students generate new phrases at will, what linguists call creativity. For instance students have been observed uttering the following, among other answers, all in response to the same basic question:
neural nervous system parasympathetic nervous system sympathetics sympathetic stimulation sympathetic tone reflex
Given such data, we would not be surprised to see a student creating "neural stimulation" or "reflex system." It would be difficult to try to anticipate all possible phrases in the lexicon. Thus if the student's input is not a known word or fixed phrase, it will be parsed as a phrase or a sentence.
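The lookup-then-parse behavior for short answers can be sketched as follows. This is only an illustration, not the CIRCSIM-Tutor implementation: the dictionary layout, the function names, and the concept labels `up`, `down`, and `no_change` are assumptions, and the assignment of "negative" to down and "zero" to no change is our reading of the example list.

```python
# Hypothetical lexicon fragment: each observed short answer points at a concept.
# The concept labels and this particular mapping are illustrative assumptions.
SHORT_ANSWER_LEXICON = {
    "up": "up", "increase": "up", "increases": "up", "increased": "up",
    "i": "up", "goes up": "up", "went up": "up", "+": "up",
    "unchanged": "no_change", "no change": "no_change",
    "zero": "no_change", "remains same": "no_change",
    "negative": "down",
}


def normalize(text):
    """Lower-case the input and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def lookup_short_answer(text):
    """Return the concept for a known word or fixed phrase, else None."""
    return SHORT_ANSWER_LEXICON.get(normalize(text))


def understand(text):
    """Try the lexicon first; otherwise hand the input to the parser."""
    concept = lookup_short_answer(text)
    if concept is not None:
        return ("lexicon", concept)
    # Placeholder: a real system would call the phrase/sentence parser here.
    return ("parse", None)
```

Under these assumptions, `understand("Went  up")` is resolved from the lexicon, while a novel phrase such as "reflex system" falls through to the parser.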
Real-time typed communication tends to be fragmentary, so the input understander parses the input string bottom-up, attempting to identify phrase constituents. Information as to the meaning can come from either the head constituent or a modifier. For example, "system" as the head noun in "reflex system" tells you nothing; it is "reflex" that carries the meaning of interest.
Occasionally a student will state the answer as a complete proposition, e.g. "cardiac output increases" or even "it increases" when just "increases" would be adequate. We have to parse and produce a logical form for such a sentence, and then match it to the question. It is not sufficient in this example to simply recognize the word "increases" and propose it as an answer to the question, viz:
Tu: So what happens to mean arterial pressure?
St: Cardiac output increases.
Tu: Correct! Mean arterial pressure increases.
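The mismatch in the dialog above can be caught by comparing the subject of the student's proposition with the parameter the question asked about. The following is a minimal sketch under stated assumptions: the `Question` and `Proposition` records are hypothetical stand-ins for the logical forms, and treating a bare answer or the pronoun "it" as referring to the asked parameter is our simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Question:
    parameter: str  # the parameter the tutor asked about


@dataclass(frozen=True)
class Proposition:
    subject: Optional[str]  # None for a bare answer such as "increases"
    change: str             # "up", "down", or "no_change"


def answers_question(question, proposition):
    """True if the proposition is about the parameter that was asked for."""
    # Assumption: a bare answer or the pronoun "it" refers to the asked parameter.
    if proposition.subject is None or proposition.subject == "it":
        return True
    return proposition.subject == question.parameter


q = Question("mean arterial pressure")
off_topic = Proposition("cardiac output", "up")  # "Cardiac output increases."
bare = Proposition(None, "up")                   # "increases"
```

Here `answers_question(q, off_topic)` is false, so the word "increases" alone would not be accepted as an answer about mean arterial pressure.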
Since there are roughly 3000 strings in our lexicon at this time, a number which will inevitably grow, the spelling corrector often proposes several different possible corrections. We have several techniques for arbitrating.
The first technique involves comparing each candidate word with the expected answer. This takes two forms: first is a direct comparison to the expected correct answer, next is a search through an ontology of concepts referred to by the candidate replacement words. The point here is that if the question is asking for a control mechanism (either neural or hemodynamic in our case), we will prefer spelling corrections which yield the control mechanism concept.
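A minimal sketch of this two-step arbitration follows. The ontology fragment, the category label `control_mechanism`, the function name, and the candidate words are all hypothetical; the sketch also assumes the spelling corrector has already produced the candidate list.

```python
# Hypothetical ontology fragment: word -> concept category.
ONTOLOGY = {
    "neural": "control_mechanism",
    "hemodynamic": "control_mechanism",
}


def arbitrate(candidates, expected_answer, expected_category):
    """Pick one spelling correction, or None if the candidates stay ambiguous."""
    # Step 1: direct comparison with the expected correct answer.
    for word in candidates:
        if word == expected_answer:
            return word
    # Step 2: prefer a candidate whose concept has the category the question asks for.
    for word in candidates:
        if ONTOLOGY.get(word) == expected_category:
            return word
    return None  # leave the choice to some other technique


# The question asks for a control mechanism and expects "neural".
choice = arbitrate(["hemodynamic", "thermodynamic"], "neural", "control_mechanism")
```

In this example the direct comparison fails, and the ontology step selects "hemodynamic" because it names a control mechanism, even though it is not the expected answer.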
This ontology currently exists, containing many nouns from our lexicon. It needs to be expanded to include more concepts which are represented by other parts of speech and by phrases. The same data structure is used when sentences are parsed, helping to match up complements against the expected arguments of verbs.
Another technique for arbitrating possible spelling corrections might be to pick from among the working set of recently-used words. Defining this working set is something of a problem; we hypothesize that it contains the terms which occur in all the open discourse segments. The discourse planner and pedagogical planner log when various pedagogical goals open and close. At any given time we can identify a nested set of open planning goals, each extending over a number of turns. We could then look back through the discourse history which CIRCSIM-Tutor keeps and collect a prioritized list of the words which have occurred in both tutor and student utterances, with words from the most recently opened segments having the highest priority.
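The working-set idea might be sketched as below. This is a hypothetical illustration, not the system's data structures: the `Segment` record, its fields, and the example goals are assumptions, and each segment's words are assumed to have been collected already from the discourse history.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    goal: str         # pedagogical goal that opened this discourse segment
    opened_at: int    # turn number at which the goal was opened
    words: List[str]  # words seen in tutor and student utterances in the segment


def working_set(open_segments):
    """Words from the open segments, most recently opened segment first."""
    ordered, seen = [], set()
    for seg in sorted(open_segments, key=lambda s: s.opened_at, reverse=True):
        for word in seg.words:
            if word not in seen:
                seen.add(word)
                ordered.append(word)
    return ordered


def pick_correction(candidates, open_segments):
    """Prefer the candidate with the highest priority in the working set."""
    ranked = working_set(open_segments)
    in_set = [c for c in candidates if c in ranked]
    return min(in_set, key=ranked.index) if in_set else None


outer = Segment("teach TPR control", 3, ["tpr", "nervous", "system"])
inner = Segment("explain Ra", 7, ["radius", "arteriolar", "tpr"])
```

With these two open segments, a choice between "nervous" and "radius" would go to "radius", since it occurs in the more recently opened segment.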
You can see this kind of locality of reference in impromptu abbreviations. When conversing with human tutors, students often abbreviate with abandon. "Inotropic state" can become "inotropic s." "Parasympathetic" can become "parasymp" or "para." Such impromptu abbreviations should not be confused with the stock of standard abbreviations which can be used at any time. For instance, "cardiac output" is usually typed "co" or sometimes "c.o.," and can occur in those forms anywhere. These stock abbreviations are in the lexicon, and thus are not subject to spelling correction. But a term like "Frank-Starling" has no stock abbreviation; it might be abbreviated as "f.s.," but only on second use in a nearby location.
It is hard to discern a common principle which enables the input understander to recognize hedges. We have a list of hedge phrases from our transcripts (including the popular question mark), which the input understander will convert to a <hedge> token in the input. It makes no practical sense to parse "I think cardiac output increases" into a main verb "think" with a complement sentence. Consider first that few real answers are complicated enough to contain an embedded sentence, and second that the student may also utter "cardiac output, I think." Having said that, it must be noted that students do sometimes hedge by simply asking a question, perhaps with no punctuation at all, so parsing may be useful to identify answers hedged in this fashion.
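Converting listed hedge phrases to a token could look roughly like this. The phrase list here is a tiny sample taken from the examples in this paper, not the transcript-derived list, and the function name is hypothetical. Hedges expressed as unpunctuated questions are not covered by this sketch.

```python
import re

HEDGE_TOKEN = "<hedge>"

# Sample phrases only; the real list is derived from the transcripts.
HEDGE_PHRASES = ["i think", "probably"]

_HEDGE_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in HEDGE_PHRASES) + r")\b",
    re.IGNORECASE,
)


def mark_hedges(text):
    """Replace known hedge phrases and question marks with the hedge token."""
    text = _HEDGE_RE.sub(HEDGE_TOKEN, text)
    text = text.replace("?", " " + HEDGE_TOKEN)
    return " ".join(text.split())  # tidy up whitespace left by the substitutions
```

Because the phrase is replaced wherever it occurs, "I think cardiac output increases" and "cardiac output, I think" are both handled without parsing an embedded sentence.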
We have some examples where the tutor asks what controls total peripheral resistance (TPR). The desired answer is the nervous system, but the student answers arteriolar radius (Ra). The nervous system happens to work by varying Ra in this case. Ra is not a wrong answer--it should not be contradicted--but it doesn't serve the immediate tutoring goal either. It can be made to serve, if the tutor leads the student backward from Ra to the nervous system. We have a number of similar cases in our transcripts.
An important point here is that the input understander needs to make some sort of relevance judgment. If we look in the concept ontology, we discover that the nervous system (a control mechanism) and Ra (a measurable parameter) are not related. It would appear that "Ra" cannot be an answer to the tutor's question. However, CIRCSIM-Tutor also has a "concept map," a graph showing the physiological causal relations among various concepts. The concept map shows a causal chain from the nervous system to Ra to TPR. If the input understander is to properly handle this kind of answer, it needs to refer to the causal relation map to see where the student's answer (Ra) lies with respect to the tutor's desired answer (nervous system).
To be more precise, the input understander must know the current tutoring goal. Before the input understander can report that Ra is a relevant answer, it must know that the pedagogical planner was trying to teach that TPR is neurally controlled. The knowledge that this is the current tutoring goal, combined with the knowledge of the causal relations, enables us to judge that Ra is a relevant answer.
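One way to sketch this relevance judgment is as a reachability test on the concept map. The graph fragment, the function names, and the three result labels below are assumptions made for illustration. The desired answer and the target parameter are taken to come from the current tutoring goal.

```python
from collections import deque

# Hypothetical fragment of the concept map: cause -> directly affected concepts.
CONCEPT_MAP = {
    "nervous system": ["Ra"],
    "Ra": ["TPR"],
}


def reaches(graph, start, goal):
    """True if a causal chain leads from start to goal (breadth-first search)."""
    queue, visited = deque([start]), {start}
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return False


def classify_answer(student_answer, desired_answer, target):
    """Judge an answer against the goal 'desired_answer controls target'."""
    if student_answer == desired_answer:
        return "expected"
    # Relevant if the answer lies on some causal chain between the desired
    # answer and the parameter being discussed.
    if (student_answer != target
            and reaches(CONCEPT_MAP, desired_answer, student_answer)
            and reaches(CONCEPT_MAP, student_answer, target)):
        return "relevant"
    return "unrelated"
```

With the goal that TPR is neurally controlled, "Ra" is classified as relevant because it sits between the nervous system and TPR in the causal chain.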
Another important point is that this process can be thought of as a kind of student initiative from the pedagogical planner's point of view. Even though the student's answer is not the expected one, it does not cause the planner to give up on any goals. In response to a normal wrong answer, the planner notices that some goal has not been achieved, so the planner adopts a new strategy. However, in this case the planner pushes its current goal and proceeds to satisfy it using an unexpected refinement. In effect, the student has interrupted the course of the conversation, causing the tutor to deal with the student's idea.
Of course this is not a student initiative from the student's point of view--the student never wavers from the single goal of providing the correct answer. The student did not intend to introduce a new goal into a cooperative conversation, even though in effect this is what happened. Trying to recognize the student's "plan" in this case would be unnecessary.
We like this example because it might be implementable, it is something that students demonstrably do, and it would be a minuscule first step towards a planner which adapts its tutoring plans in a cooperative response to the student's input.
Because people have prior expectations about how to answer computer-generated questions, we expect that some of the phenomena we hope to capture and utilize won't often occur in nature. For example, we observe that human-tutored students often hedge their answers, turning "increase" into "increase?" and "cardiac output" into "probably cardiac output." If we want CIRCSIM-Tutor version 3 to respond to hedges we may need to give students permission to hedge.
Instead of giving students explicit permission, we plan to include some sample tutoring dialog in the instructions. Finding neutral illustrative dialog may be a bit tricky, as we don't want to corrupt possible experiments by giving away answers to some of the problems. But illustrating hedged answers with question marks and adverbs, for example, may have more impact on the student than explicit instruction. Some of the phenomena we hope to handle, such as spelling errors, occur naturally. We don't need to give students permission to misspell words in order for misspellings to occur. But with human tutors students seem to misspell with abandon, and even invent impromptu abbreviations. It might be nice to increase the student's expectation that the computer is capable of such treatment, but if students receive positive reinforcement when typing mistakes occur, giving permission may not be necessary.
Students frequently hedge answers when communicating with people, so we ought to be able to recognize a variety of hedges in CIRCSIM-Tutor, though we might need to persuade students to hedge to a computer tutor. Finally, some answers which are appropriate but unexpected, which a human tutor can handle by adapting plans, may conceivably be within our reach.
[2] Lee, Yoon-Hee. (1990) Handling Ill-Formed Natural Language Input for an Intelligent Tutoring System. Ph.D. Thesis, Illinois Institute of Technology.
[3] Elmi, Mohammad Ali. (1994) A Natural Language Parser with Interleaved Spelling Correction Supporting Lexical Functional Grammar and Ill-Formed Input. Ph.D. Thesis, Illinois Institute of Technology.