Received: from RIGA.MT.CS.CMU.EDU by moose.cs.indiana.edu
	(8.7.1/IUCS.1.39) id FAA20828; Fri, 15 Dec 1995 05:44:01 -0500 (EST)
Message-Id: <199512151044.FAA20828@moose.cs.indiana.edu>
From: Yan Qu <yqu@RIGA.MT.CS.CMU.EDU>
Date: Fri, 15 Dec 95 05:43:20 EST
To: ITALLC96@cs.indiana.edu
CC: yqu@cs.cmu.edu
Subject: ITALLC96 Submission
Status: RO


Dear ITALLC96 Organizer,

A figure (in .eps file) in this paper is commented out since in order
to view the figure, a separate format file has to be sent.  If you
need to look at this figure, please let me know how you want the ps
file to be sent.  I'll be out of town from today, but can be reached
through email after Jan 17.  Thank you and have a good holiday.

Yan Qu

-------Cut under this line for abstract---
\documentstyle[12pt]{article}
%\input /afs/cs/usr/yqu/src/psfig.sty
\begin{document}
\begin{center}
{\Large \bf Information-Theoretic Account \\ of Dialogue Utterances}
\vskip 5mm
Yan Qu \\
Computational Linguistics Program \\
Carnegie Mellon University \\
yqu@cs.cmu.edu
\end{center}

\vspace*{.5cm}

\section{Thesis Statement}
Agents engaged in cooperative problem solving face the problem of what
information to communicate from among the wealth of information of
each agent.  Only recently have researchers begun to study what
information to include in a dialogue, when to include it, and the
efficiency consideration for achieving the communication effect of the
information (\cite{walker93,WR94,WW94,guinn95,guinn93,turner95}).
However, previous researches fail to deliver an objective model of
selecting information for efficient communicative dialogue design.  In
this paper, I describe an objective measure of measuring information
of statements and the communicative effectiveness of statements.  The
theoretical framework is Bar-Hillel and Carnap's fundamental work on
Semantic Information \cite{bc53}.  I first introduce the semantic
information theory.  Then I apply this theory to measure the
information of two kinds of dialogue utterances, {\it positive
constraint background statements} and {\it negative constraint
background statements}.  These utterances are taken from a corpus of
naturally occuring dialogues, in which two people try to schedule a
time to meet.  Lastly, I show the objective measure of information in
dialogue utterances is helpful for determining the communicative
effectiveness of these utterances in multi-agent communication.

\section{Semantic Information Theory}

Bar-Hillel and Carnap's semantic information theory provides a
definition of content of statements as well as an objective way of
measuring content and information.  {\it Content} of a statement {\it
s} is defined as the set of all statements logically excluded by {\it
s}.  We expect that the more precise a statement is, the less probable
it is.  As a statement grows in length and precision, the content, or
set of excluded statements, grows.  Thus the size of the content of a
message is inversely related to the message's probability and
proportional to the information in the statement.  A content-based
information measure is defined as
\begin{equation}
 I(s) = -logPr(s) 
\end{equation}
and 
\begin{equation}
 Pr(s) \propto sizeof[Cont(s)]
\end{equation}
where {\it s} is a statement, and $Pr(s)$ is the probability of
statement $s$.  From (1) and (2), the information in statement {\it s}
can be calculated by:
\begin{equation}
I_{s} = log_{2} \frac{1}{1-sizeof(s)}
\end{equation}

Bar-Hillel and Carnap's semantic model provides a numeric measure of
the amount of information equal to that provided by Shannon's measure.
However, their measure is derived from the principles of logic rather
than from probability theory, and thus allows for an interpretation of
information based on the notion of an informative {\it statement} in a
language.

\section{Information-Theoretic Account of Dialogue Utterances}
\subsection{Information measure of atomic positive and negative constraint background statements}

Suppose in the scheduling domain, there are $n$ atomic time slots $t1,
t2, ... tn$ for negotiation.  To simplify the problem, suppose there
is only one one-argument predicate {\it free} in this domain.  The
attribute {\it busy} is represented as the negation of {\it free}.
Thus the whole language is $L_n ^ {1}$, with a conceptual space of
size $2^n$.

Now consider the case of positive constraint background statements.
The content of statement $free(t_i)$ is the set of all statements
logically excluded by the statement.  For example, it logically
excludes all statement with literal $-free(t_i)$, denoted by
$Cont(free(t_i))$. The proportion of the space occupied by
$Cont(free(t_i))$, as opposed to the entire conceptual space, is
${\frac{1}{2}}$.  The information is $I_s =
log_{2}\frac{1}{1-\frac{1}{2}} = 1$.

Along the same line of calculations, we see negative constraint
background statements are not very different from positive constraint
background statements.  The content of statement $-free(t_i)$ is the
set of all statements logically excluded by the statement.  In this
case, it logically excludes all statement with literal $free(t_i)$,
denoted by $Cont(-free(t_i))$.  The proportion of the space occupied
by $Cont(-free(t_i))$, as opposed to the entire conceptual space is
again $\frac{1}{2}$.  The information is $I_s =
log_{2}\frac{1}{1-\frac{1}{2}} = 1$.

A negative constraint background statement is a negated factual
statements of a positive constraint background statement.  Though the
contents of the two types are different, the information carried by
both types is actually the same.

\subsection{Information measure of multiple positive and negative constraint background statements}

% conjunctions
The above discussion deals with statements with individual elements at
the atomic clause level. Atomic statements can be combined together to
form more complex statements.  This section illustrates how to
calculate information for conjunction of statements.

Intuitively, conjunctions should carry more information than the
individual statements that constitute the conjunction.  The intuition
can bear out graphically by looking at the set of statements logically
excluded by the conjunction and its individual components.

%insert vien diagram
%\begin{figure}
%\centerline{\psfig{figure=/afs/cs.cmu.edu/project/nnspeech-4/yqu/papers/itallc/venn.eps}}
%\caption{{\bf Atomic Statement vs. Conjunction}}\label{fig:conj}
%\end{figure}

Figure~\ref{fig:conj} gives the graphical illustration for statement
$free(t_{1})$ and statement $free(t_{1}) \wedge free(t_{2})$.  By
stating statement $free(t_{1})$, the whole conceptual space is cut by
half as shown in the left diagram in \ref{fig:conj}.  The set of
statements excluded by saying statement $free(t_{1})$ is represented
by the shaded area.  By stating $free(t_{1})$ and $free(t_{2})$, the
conceptual space is first cut by half by statement $free(t_{1})$, and
then the statement $free(t_{2})$ also cuts the conceptual space in
half.  The total set of statements excluded by the conjunction of
statements is the total of the shaded areas excluded by both
statements.

The content of the conjunction $free(t_{1}) \wedge free(t_{2})$ is
therefore:
\begin{eqnarray*}
cont(free(t_{1}) \wedge free(t_{2})) & = & cont(free(t_{1})) + cont(free(t_{2})) \\
& & - cont(free(t_{1}) \vee free(t_{2})) \\
& = & sizeof(free(t_{1})) + sizeof(free(t_{2})) \\
& & - sizeof(free(t_{1}) \vee free(t_{2})) \\
& = & \frac{1}{2} + \frac{1}{2} - \frac{1}{4} \\
& = & \frac{3}{4}
\end{eqnarray*}

The information carried by the individual statement and the
conjunction of both statements are: \\
\begin{math}
I_{free(t_{1})} = I_{free(t_{2})} = log_{2}\frac{1}{1- \frac{1}{2}} = 1 \\
I_{free(t_{1}) \wedge free(t_{2})} = log_{2}\frac{1}{1-\frac{3}{4}} = 2
\end{math}
\\
Note that the set of statements consistent with $free(t_{1}) \wedge
free(t_{2})$ contains the set of statements consistent with
$free(t_{1})$ or $free(t_{2})$. That is, the set of statements
excluded by $free(t_{1}) \wedge free(t_{2})$ contains and implies the
set of statements excluded by $free(t_{1})$ or $free(t_{2})$.  The
result confirms our intuition that conjunctions are more informative
than the individual statements that constitute the conjunction.

In general, the content of a conjunction of $n$ atomic clauses is
$cont = ({\frac{1}{2}})^{1} + ({\frac{1}{2}})^{2} + ... +
({\frac{1}{2}})^{n}$.  Information of the conjunction is $I =
log_{2}\frac{1}{1-{(1-cont)}} = n$.

\section{Information Transfer Effect}

In the previous section, I provided the information measure for
statements in term of a {\it single} agent's belief space. Next, I
show how information is calculated in multi-agent interaction.  Clark
et all \cite{cs92,cw92} propose that agents engaged in a conversation
need to bring a certain amount of common ground to a conversation in
order to understand each other.  The process of adding to this common
ground is {\it grounding}.  Grounding can be seen as a process of
updating the mutual beliefs of the agents.

I focus on a particular mutual belief space -- the conceptual problem
solving space, and on how this shared problem solving space is updated
by information transfer.  First, I describe the shared conceptual
problem solving space, and then illustrate how positive and negative
constraint background statements update the problem-solving space.
 
\subsection{Mutual beliefs}

A general view of mutual beliefs is to assume that each agent has a
mutual belief space containing all beliefs he thinks of sharing with
another agents.  Mutual belief is generally defined in term of belief
following \cite{schiffer}.  The connection between belief and mutual
belief is defined by the fix-point axiom, which captures the
circularity of mutual belief \cite{harman}: \\
$SH_{xy}p \equiv BEL_{x}(p \wedge SH_{yx}p)$ \\
where $SH_{xy}p$ means that agents $x$ and $y$ mutually share 
the belief that $p$.

\subsection{Negotiation as search}

Problem solving can be characterized as a task of searching through a
large, predefined space of hypotheses to find the hypothesis that is
the goal of the problem.  In the scheduling domain, though agents are
assumed to have complete knowledge of their respective calendars, they
have incomplete (or no) knowledge as to the other's calendar.  Thus
the scheduling process is to search through a hypothesis space to find
out the goal for both agents.  While a variety of representations are
possible to represent a hypothesis, I choose a simple representation
in which each hypothesis is described by a conjunction of individuals
with constraints on the values of one single attribute.  For example,
the entire hypothesis space for negotiation over $n$ time slots $t_1,
t_2, ..., t_n$ with one attribute $free$ will be represented as:

$<?,?, ..., ?>$ \\ 
in which each question mark corresponds to a clause which shows
whether an individual time slot $t_i$ is $free$ or not.  Any
instantiation of the question mark with a positive clause is
considered the goal of the problem solving.  The process of
negotiation is to find the positive instantiation of any of
individuals.

\subsection{Update the mutual belief space}

The design of multi-agent interaction systems needs making many
assumptions about the agents and the task.  I make the following
assumptions as to the scheduling agents:

\begin{enumerate}
\item {\bf Assumption 1}: Agents have no prior beliefs of the 
other agent's calendar.
\item {\bf Assumption 2}: Agents have perfect understanding of 
each other during negotiation.
\item {\bf Assumption 3}: If a time $t$ is bad for one agent, then
negotiation is closed for this time $t$.
\item {\bf Assumption 4}: If a time $t$ is good for one agent, then
the other agent can either accept the time $t$, in which case the
negotiation ends successfully, or reject the time $t$, in which case
time $t$ is closed for negotiation.  I postulate that when
alternatives are involved, the partner must respond in order to give a
status for time $t$.
\end{enumerate}

It follows from assumption 1 and 2 that the agents start the
negotiation with a shared problem solving space with all possible
hypotheses.  Since perfect communication is assumed, at each stage of
negotiation, each agent updates mutual beliefs.  However, the mutual
problem solving space may or may not be updated depending on what
information is conveyed.

\noindent{{\bf Definition. Communication Effectiveness (CE):} CE
measures the effectiveness of a statement on the reduction of mutual
search space.  The formula for calculating $CE$ is:
\begin{equation}
CE = \frac{sizeof(s)}{N}
\end{equation}
where $sizeof(s)$ denotes the proportion of the set of statements
excluded by the statement $s$, and $N$ is the minimal total number of
statements that are required in order to finalize the reduction of
mutual search space.

It follows from assumptions 2 and 3 that a negative constraint
background statement $i$ has $CE = sizeof(i)$.  In particular, suppose
that agent A utters statement $-free(t)$. By assumption 2,
$Bel_{B}Bel_{A}(-free(t))$ holds.  By assumption 3, since the time
slot $t$ is closed for negotiation, the mutual search space is
updated.  Since no additional statement is required to finalize the
mutual search space reduction, the total numbers of statements
required is just one, i.e. $i$ itself.

Now consider the case of positive constraint background statement
$i$. In particular, suppose agent A utters $free(t)$, agent B will
update his belief model as $Bel_{B}Bel_{A}(free(t))$.  In this case,
however, the mutual belief space can not be updated since the status
of time slot $t$ for agent B is not known to agent A.  By assumption
4, agent B has to respond with either acceptance or rejection to inform
agent A of the status of the time slot $t$.  In the case of B
rejecting $t$, the mutual problem solving space is reduced.  If B
accepts $t$, the mutual problem solving space is also reduced.  At the
same time, the agents reach a common goal state and negotiation is
over.  In either case, B's response is needed to finalize the search
space reduction implied by A's proposal.  Thus the minimal total
number of sentences required to finalize the search space reduction is
2.  Consequently $CE = \frac{sizeof(i)}{2}$.

\section{Conclusions}

In this paper, I apply the semantic information theory to two types of
dialogue utterances: the positive constraint background statements and
the negative constraint background statements.  The objective measure
of the information and communicative effectiveness of the dialogue
utterances is useful for developing efficient human-computer or
computer-computer interaction systems.

\begin{thebibliography}{10}

\bibitem{bc53}
Yehoshua Bar-Hillel and Rudolf Carnap.  
\newblock Semantic Information.  
\newblock {\em The British Journal for the Philosophy of Science}, 
4(13):147--157, 1953.

\bibitem{cs92}
Herbert~H. Clark and Edward~F. Schaefer.
\newblock Contributing to discourse.
\newblock {\em Cognitive Science}, 13, 1989.

\bibitem{guinn93}
C.~I. Guinn.
\newblock A computational model of collaborative discourse.
\newblock In {\em Workshop Proceedings from AI-ED 93 World Conference on
  Artificial Intelligence in Education}, 1993.

\bibitem{guinn95}
Curry~I. Guinn.
\newblock {\em Meta-Dialogue Behaviors: Improving the Efficiency of
  Human-Machine Dialogue}.
\newblock PhD thesis, Duke University, 1995.

\bibitem{turner95}
Elise~H. Turner.
\newblock Selecting information to communicate.
\newblock In {\em AAAI Workshop on Planning for Inter-Agent Communication},
  1995.

\bibitem{walker93}
Marilyn Walker.
\newblock {\em Information Redundancy and Resource Bounds in Dialogue}.
\newblock PhD thesis, University of Pennsylvania, 1993.

\bibitem{WR94}
Marilyn Walker and Owen Rambow.
\newblock The role of cognitive modeling in achieving communicative intentions.
\newblock In {\em Proceedings of the Seventh International Workshop on Natural
  Language Generation}, 1994.

\bibitem{WW94}
Marilyn Walker and S.~Whittaker.
\newblock Mixed-initiative in dialogue: An investigation into discourse
  segmentation.
\newblock In {\em Proceedings of the 28th Annual meeting of the Association for
  Computational Linguistics}, 1994.

\bibitem{cw92}
Deanna Wilks-Gibbs and Herbert~H. Clark.
\newblock Coordinating beliefs in conversation.
\newblock {\em Journal of Memory and Language}, 31(2):183--194, 1992.

\bibitem{schiffer}
S. R. Schiffer.
\newblock Meaning.
\newblock {\em Oxford University Press}, 1972.

\bibitem{harman}
G. Harman.
\newblock Review of ``Language Behavior'' by Jonathan Bennett.
\newblock {\em Language}, 53:417--424, 1977.

\end{thebibliography}
\end{document}








