Syntax, semantics, and grammars (1)
What is syntax for?
(A sample text)
Constituency and structure
He took a rake out of the closet.
He took [ a rake ] [ out of the closet ].
He took [ a rake [ out of the closet ] ].
He took a rake out of the closet.
- What goes with what?
- What modifies or "complements" what?
- How do languages indicate this?
Relations
The pile of leaves that Toad had raked for Frog blew everywhere.
The pile of leaves that Toad had raked [0] for Frog blew everywhere.
He will never guess who raked his leaves.
She sent herself an email.
- How are words tied together through relations?
- Coreference
- Grammatical relations: SUBJECT, DIRECT OBJECT, INDIRECT OBJECT
- How do languages indicate this?
- Word order
- Morphology on the word on one end or the other or both
- Syntactic and lexical conventions
Compositionality, syntax-semantics mappings
messy leaves
Toad ran through the grass so that Frog would not see him.
- How do the constituents of a phrase or sentence map onto
the constituents of the associated meaning?
- How are novel combinations of words and phrases interpreted?
(Here is an extreme example involving
"colorless, green ideas".)
Constituent order
Rana llegó a la casa de Sapo.
'Frog got to Toad's house.'
Llegó Rana a la casa de Sapo.
Llegó a la casa de Sapo Rana.
A la casa de Sapo llegó Rana.
- What constraints are there on the order in which constituents
can appear?
Selectional restrictions
messy room, messy hair, messy leaves, messy thoughts,
messy star
devote your time, devote yourself, devote your blood,
devote your name
- What constraints are there on the kinds of modifiers or
complements that can be associated with particular words?
Grammaticality and obligatoriness
He took a rake out of the closet.
- What does a particular language require in order for a sentence
to be "grammatical"?
- Why does this matter?
Generalizations and abstraction
I left the keys in the car.
What did I leave in the car?
Where did I leave the keys?
The keys I left in the car.
the keys (that) I left in the car
Al broke the iPod.
The iPod broke.
The iPod was broken by Al.
- How can abstraction reveal how surface sentences are related to one another?
Constituency grammars
- Sentences consist of constituents (phrases), which consist
of constituents or words
Context-free grammars (pushdown automata, recursive transition networks)
A grammar for a fragment of English
S → NP VP
NP → (Det) AdjP* N PP*
NP → NP CoordConj NP
NP → Pro
VP → Aux VP
VP → IV
VP → TV NP
VP → LV AdjP
VP → LV PP
VP → CV that S
AdjP → Adv* Adj
PP → P NP
N → tomatoes
Pro → this
CoordConj → and
Aux → can
IV → died
CV → said
TV → can
LV → seemed
Adj → pretty
Adv → pretty
P → under
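A minimal sketch of how this fragment could be used to recognize sentences, written in Python. It is an illustration, not a parser from the course. Assumptions: the optional (Det) and the starred AdjP*, Adv*, and PP* are expanded by hand into a few concrete rules, the word "that" is treated as its own category, and the names RULES, LEXICON, and recognize are made up for this example.

```python
# Rules of the fragment, with (Det), AdjP*, Adv*, PP* expanded by hand
# to at most one occurrence each (an assumption made to keep this short).
RULES = [
    ("S", ["NP", "VP"]),
    ("NP", ["N"]),
    ("NP", ["AdjP", "N"]),
    ("NP", ["N", "PP"]),
    ("NP", ["AdjP", "N", "PP"]),
    ("NP", ["NP", "CoordConj", "NP"]),
    ("NP", ["Pro"]),
    ("VP", ["Aux", "VP"]),
    ("VP", ["IV"]),
    ("VP", ["TV", "NP"]),
    ("VP", ["LV", "AdjP"]),
    ("VP", ["LV", "PP"]),
    ("VP", ["CV", "that", "S"]),
    ("AdjP", ["Adj"]),
    ("AdjP", ["Adv", "Adj"]),
    ("PP", ["P", "NP"]),
]

# Lexical rules; "can" and "pretty" are ambiguous, as in the fragment.
LEXICON = {
    "tomatoes": {"N"}, "this": {"Pro"}, "and": {"CoordConj"},
    "can": {"Aux", "TV"}, "died": {"IV"}, "seemed": {"LV"},
    "pretty": {"Adj", "Adv"}, "under": {"P"}, "said": {"CV"},
    "that": {"that"},
}


def recognize(words, start="S"):
    """Return True if the word list can be analyzed as category `start`."""
    n = len(words)
    # The chart holds triples (category, from, to) for every constituent found.
    chart = {(cat, i, i + 1)
             for i, w in enumerate(words)
             for cat in LEXICON.get(w, ())}

    def ends(rhs, i):
        """Positions where the category sequence rhs can end if it starts at i."""
        if not rhs:
            return {i}
        result = set()
        for cat, a, b in list(chart):
            if cat == rhs[0] and a == i:
                result |= ends(rhs[1:], b)
        return result

    # Keep applying rules until no new constituent is added (a fixed point).
    changed = True
    while changed:
        changed = False
        for lhs, rhs in RULES:
            for i in range(n):
                for j in ends(rhs, i):
                    if (lhs, i, j) not in chart:
                        chart.add((lhs, i, j))
                        changed = True
    return (start, 0, n) in chart


print(recognize("this can can tomatoes".split()))  # True
print(recognize("died tomatoes".split()))          # False
```

The first example succeeds only because "can" is listed both as Aux and as TV; the recognizer tries every category for each word, which is how lexical ambiguity shows up in a CFG.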
Problems with simple context-free grammars
- Agreement
- (in Swahili)
Wavulana watatu wadogo walianguka.
boys three small fell
'Three small boys fell.'
Viti vitatu vidogo vilianguka.
chairs three small fell
'Three small chairs fell.'
Nyumba tatu ndogo zilianguka.
house(s) three small fell
'Three small houses fell.'
Meno matatu madogo yalianguka.
teeth three small fell
'Three small teeth fell.'
- In a CFG for Swahili, we seem to need separate NP rules and
separate S rules for each class of nouns. For English Ss
(which are much simpler), we would have
S → NPsing VPsing
S → NPplur VPplur
- Long-distance dependencies
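The agreement problem above can be made concrete with a small Python sketch that counts how the rules multiply when categories are atomic. The class labels (taken from the verb prefixes in the Swahili examples), the category names Num and IV, and the function expand are assumptions made for illustration only.

```python
# Hypothetical labels for the four agreement classes seen in the examples,
# named after the prefix on the verb (wa-, vi-, zi-, ya-).
NOUN_CLASSES = ["wa", "vi", "zi", "ya"]

# Rule templates; {c} stands for the agreement class.
TEMPLATES = [
    ("S", ["NP_{c}", "VP_{c}"]),
    ("NP_{c}", ["N_{c}", "Num_{c}", "Adj_{c}"]),
    ("VP_{c}", ["IV_{c}"]),
]


def expand(templates, classes):
    """Copy every template once per class, as a plain CFG would require."""
    rules = []
    for c in classes:
        for lhs, rhs in templates:
            rules.append((lhs.format(c=c), [sym.format(c=c) for sym in rhs]))
    return rules


rules = expand(TEMPLATES, NOUN_CLASSES)
print(len(rules))  # 12 rules where 3 templates would do
```

Each additional agreement class adds another full copy of the rule set, which is the motivation for replacing atomic categories with feature structures.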
Context-free grammars with complex categories (feature structures)
[HPSG, LFG]
- Handling agreement
[cat=S] → [cat=NP, num=?n] [cat=VP, num=?n]
[cat=NP, num=?n] → [cat=N, num=?n]
[cat=VP, num=?n] → [cat=V, arg1=0, num=?n]
Willard: [cat=N, num=sing, lex=willard]
people: [cat=N, num=plur, lex=person]
dances: [cat=V, arg1=0, num=sing, lex=dance]
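A rough Python sketch of how the shared variable ?n enforces agreement in the rules above. Feature structures are plain dictionaries, arg1=0 is stored as the string "0", and the helper names match, project, and sentence_ok are assumptions for this example; real HPSG or LFG systems use full unification over typed structures.

```python
LEXICON = {
    "Willard": {"cat": "N", "num": "sing", "lex": "willard"},
    "people": {"cat": "N", "num": "plur", "lex": "person"},
    "dances": {"cat": "V", "arg1": "0", "num": "sing", "lex": "dance"},
}


def match(pattern, fs, bindings):
    """Match a rule pattern against a feature structure.

    Values starting with "?" are variables. Returns the extended variable
    bindings, or None if the match fails.
    """
    bindings = dict(bindings)
    for feat, val in pattern.items():
        if feat not in fs:
            return None
        if isinstance(val, str) and val.startswith("?"):
            if val in bindings and bindings[val] != fs[feat]:
                return None  # variable already bound to a different value
            bindings[val] = fs[feat]
        elif fs[feat] != val:
            return None
    return bindings


def project(word, mother_cat, daughter_pattern):
    """Apply a unary rule such as [cat=NP, num=?n] -> [cat=N, num=?n]."""
    b = match(daughter_pattern, LEXICON[word], {})
    return None if b is None else {"cat": mother_cat, "num": b["?n"]}


def sentence_ok(subject, verb):
    """Apply [cat=S] -> [cat=NP, num=?n] [cat=VP, num=?n]."""
    np = project(subject, "NP", {"cat": "N", "num": "?n"})
    vp = project(verb, "VP", {"cat": "V", "arg1": "0", "num": "?n"})
    if np is None or vp is None:
        return False
    b = match({"cat": "NP", "num": "?n"}, np, {})
    if b is None:
        return False
    # The same ?n must now also fit the VP.
    return match({"cat": "VP", "num": "?n"}, vp, b) is not None


print(sentence_ok("Willard", "dances"))  # True
print(sentence_ok("people", "dances"))   # False
```

A single S rule covers both numbers because the constraint lives in the variable binding rather than in the category name.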
- Handling long-distance dependencies
- A set of rules using the slash feature
[cat=S] → [+WH, cat=?c] [cat=Aux, num=?n] [cat=S, slash=?c, num=?n]
[cat=S, slash=?s, num=?n] → [cat=NP, num=?n] [cat=VP, slash=?s]
[cat=VP, slash=?s] → [cat=V, arg1=[cat=NP]] [cat=NP, slash=?s]
[cat=?c, slash=?c] → 0
- Representation of the sentence "Who does Max love?"
(dtrs is a feature that takes the set of daughter constituents as its value)
[cat=S, dtrs={[+WH, cat=NP, form='who'],
              [cat=Aux, num=sing, form='does'],
              [cat=S, slash=NP, num=sing,
               dtrs={[cat=NP, form='Max', num=sing],
                     [cat=VP, slash=NP,
                      dtrs={[cat=V, arg1=[cat=NP], form='love'],
                            [cat=NP, slash=NP, form='0']}]}]}]
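The same structure can be written as nested Python dictionaries, which makes it easy to read off the surface string and the gap. This is only a sketch: dtrs is a list here rather than a set so that word order is kept, +WH is stored as "wh": True, and the function names surface and gaps are assumptions.

```python
who_does_max_love = {
    "cat": "S",
    "dtrs": [
        {"wh": True, "cat": "NP", "form": "who"},
        {"cat": "Aux", "num": "sing", "form": "does"},
        {"cat": "S", "slash": "NP", "num": "sing", "dtrs": [
            {"cat": "NP", "form": "Max", "num": "sing"},
            {"cat": "VP", "slash": "NP", "dtrs": [
                {"cat": "V", "arg1": {"cat": "NP"}, "form": "love"},
                {"cat": "NP", "slash": "NP", "form": "0"},  # the gap
            ]},
        ]},
    ],
}


def surface(node):
    """Pronounced words, left to right; the empty constituent '0' is skipped."""
    if "dtrs" in node:
        return [w for d in node["dtrs"] for w in surface(d)]
    return [] if node["form"] == "0" else [node["form"]]


def gaps(node):
    """Categories of all empty constituents in the structure."""
    if "dtrs" in node:
        return [g for d in node["dtrs"] for g in gaps(d)]
    return [node["cat"]] if node.get("form") == "0" else []


print(surface(who_does_max_love))  # ['who', 'does', 'Max', 'love']
print(gaps(who_does_max_love))     # ['NP']
```

The category of the gap matches the category of the fronted WH phrase, which is what the slash feature passes down from the inner S through the VP.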
- How syntax and semantics are related (in an old version of HPSG)
- Order may be represented separately from constituency relations (for example, in HPSG).
- In HPSG, words and phrases are typed feature structures organized in a type hierarchy.
Probabilistic context-free grammars
- Simple CFGs, with or without categories replaced by feature structures, can't help us prefer one analysis of a sentence over another.
- Probabilistic (or stochastic) CFGs have a probability associated with each rule: the probability of the right-hand side, given the left-hand side of the rule.
Thus the sum of the probabilities for the rules with a given LHS must be 1.
- The probability of a given analysis of a sentence is the product of the probabilities of all of the rules used in the analysis (assuming statistical independence of all of the rules).
- For disambiguation, select the parse for a sentence that has the highest probability.
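A small Python sketch of these three points, applied to the two bracketings of "He took a rake out of the closet." The rule probabilities are invented for illustration, lexical rules are ignored (treated as probability 1), "out of" is treated as a single preposition, and the names RULE_PROB, check_normalized, and tree_prob are assumptions.

```python
import math
from collections import defaultdict

# Invented probabilities P(RHS | LHS); they sum to 1 for each LHS.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP", "PP")): 0.4,
    ("VP", ("V", "NP")): 0.6,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("Det", "N")): 0.5,
    ("NP", ("Pro",)): 0.3,
    ("PP", ("P", "NP")): 1.0,
}


def check_normalized(rule_prob):
    """True if the probabilities of the rules sharing each LHS sum to 1."""
    totals = defaultdict(float)
    for (lhs, _), p in rule_prob.items():
        totals[lhs] += p
    return all(math.isclose(t, 1.0) for t in totals.values())


def tree_prob(tree):
    """Product of the probabilities of all phrasal rules used in the tree.

    A tree is (label, child, ...); a preterminal is (label, word) and
    contributes 1.0 because lexical rules are left out of this sketch.
    """
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return 1.0
    p = RULE_PROB[(label, tuple(child[0] for child in children))]
    for child in children:
        p *= tree_prob(child)
    return p


A_RAKE = ("NP", ("Det", "a"), ("N", "rake"))
OUT_OF_THE_CLOSET = ("PP", ("P", "out of"),
                     ("NP", ("Det", "the"), ("N", "closet")))
HE = ("NP", ("Pro", "He"))

# He took [a rake] [out of the closet].
VP_ATTACH = ("S", HE, ("VP", ("V", "took"), A_RAKE, OUT_OF_THE_CLOSET))
# He took [a rake [out of the closet]].
NP_ATTACH = ("S", HE, ("VP", ("V", "took"),
                       ("NP", A_RAKE, OUT_OF_THE_CLOSET)))

best = max([VP_ATTACH, NP_ATTACH], key=tree_prob)
print(check_normalized(RULE_PROB))  # True
print(best is VP_ATTACH)            # True with these invented numbers
```

With these numbers the verb-attachment analysis has probability 0.03 and the noun-attachment analysis 0.009; different rule probabilities could reverse the preference.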
Dependency grammars
- The structure of a sentence can be represented as a set of relations (dependencies) between words; there are no phrasal nodes.
- Each dependency is a directed arc between a head and a
dependent word.
- Each sentence has one root word, which has no head.
Each word in a sentence can have at most one head.
- Examples and advantages of dependency over constituency grammars (Nivre)
- A dependency grammar
- A set of arc labels
- Constraints on arcs into and out of particular words or word classes
- (Sometimes) constraints on the structure of an analysis, for example,
the projectivity constraint: every word occurring between a word d and its head h is dominated either by d or by h.
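The structural conditions above (one root, at most one head per word, projectivity) can be checked mechanically. A minimal Python sketch, assuming an analysis is stored as a list heads in which heads[i] is the position of the head of word i, or None for the root; the function names and the head assignments in the examples are assumptions. Storing one head per word builds the "at most one head" condition into the representation.

```python
def is_well_formed(heads):
    """Exactly one root, and every word reaches it (no cycles)."""
    roots = [i for i, h in enumerate(heads) if h is None]
    if len(roots) != 1:
        return False
    for start in range(len(heads)):
        seen = set()
        node = start
        while heads[node] is not None:
            if node in seen:
                return False  # we came back to a word: a cycle
            seen.add(node)
            node = heads[node]
    return True


def dominates(heads, a, b):
    """True if a is b itself or an ancestor of b (heads must be acyclic)."""
    while b is not None:
        if b == a:
            return True
        b = heads[b]
    return False


def is_projective(heads):
    """Every word between d and its head h is dominated by d or by h."""
    for d, h in enumerate(heads):
        if h is None:
            continue
        lo, hi = sorted((d, h))
        for w in range(lo + 1, hi):
            if not (dominates(heads, d, w) or dominates(heads, h, w)):
                return False
    return True


# "She sent herself an email": sent is the root; an depends on email.
heads = [1, None, 1, 4, 1]
print(is_well_formed(heads), is_projective(heads))  # True True

# An abstract four-word analysis in which word 0 depends on word 2,
# across the root at position 1.
crossing = [2, None, 1, 1]
print(is_well_formed(crossing), is_projective(crossing))  # True False
```

is_projective should only be called on analyses that pass is_well_formed, since dominates assumes there are no cycles.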