Lexemes or lexical classes that the unit inherits from
One or more grammatical constraint specifications, each invoking one principle on a given dimension
Example: a main clause transitive verb invokes the Valency principle on the Syntax dimension
Lexicon/grammar as an inheritance hierarchy
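
The inheritance hierarchy can be pictured with a minimal Python sketch. The class name `Entry`, the method `all_constraints`, the entry names, and the `"!"` marker (read here as "exactly one") are all assumptions made for illustration; they are not taken from the implementation described in these slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entry:
    """A lexeme or lexical class (hypothetical structure)."""
    name: str
    parents: List["Entry"] = field(default_factory=list)
    # Each specification is keyed by (dimension, principle).
    constraints: Dict[Tuple[str, str], dict] = field(default_factory=dict)

    def all_constraints(self) -> Dict[Tuple[str, str], dict]:
        """Collect inherited specifications; the entry's own specs override them."""
        merged: Dict[Tuple[str, str], dict] = {}
        for parent in self.parents:
            merged.update(parent.all_constraints())
        merged.update(self.constraints)
        return merged


# Illustrative hierarchy: verb class -> transitive verb class -> lexeme.
verb = Entry("V", constraints={("syntax", "valency"): {"out": {"sbj": "!"}}})
transitive = Entry(
    "V_TRANS",
    parents=[verb],
    constraints={("syntax", "valency"): {"out": {"sbj": "!", "obj": "!"}}},
)
eat = Entry("eat", parents=[transitive])

inherited = eat.all_constraints()
```

In this sketch the lexeme `eat` states nothing itself and picks up the transitive valency specification from its class, which is the point of organizing the lexicon as a hierarchy.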
XDG: processing
(Morphological analysis)
Lexicalization
Search for matching entries
Invocation of principles referenced in entries
Creation of variables; examples:
Each node n has a daughters variable whose value is the set of indices of daughter nodes of n
Each node that must agree with another has an agr variable for the possible values of the relevant agreement feature (for example, person-number-gender)
Instantiation of constraints relating variables
Creation of a disambiguation variable for each node; its value is the index of an entry all of whose constraints are satisfied
Constraint satisfaction
If it succeeds, it returns all possible complete variable assignments, each corresponding to a single grammatical analysis of the sentence (a multigraph across the nodes)
XDG grammars are declarative, so they can be used for generation as well as analysis
Each node needs an explicit position variable representing its output position in the generated sentence
(Morphological generation)
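
The analysis steps above can be sketched on a toy scale. This is a brute-force illustration only: it enumerates every value of each node's daughters variable and keeps the assignments that satisfy a tree constraint and a simplified valency constraint. A real constraint solver would prune domains by propagation instead of enumerating them. The function names, the toy lexicon, and the `cat`/`needs` fields are assumptions, and agreement and disambiguation variables are left out for brevity.

```python
from itertools import combinations, product

# Hypothetical toy lexicon: each word has a category and the categories
# of the daughters it requires.
LEXICON = {
    "the": {"cat": "det", "needs": []},
    "water": {"cat": "noun", "needs": ["det"]},
    "boils": {"cat": "verb", "needs": ["noun"]},
}


def powerset(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_tree(daughters):
    """Exactly one root, at most one mother per node, all nodes reachable."""
    n = len(daughters)
    mothers = [[m for m in range(n) if d in daughters[m]] for d in range(n)]
    roots = [d for d in range(n) if not mothers[d]]
    if len(roots) != 1 or any(len(m) > 1 for m in mothers):
        return False
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(daughters[node])
    return len(seen) == n


def valency_ok(daughters, words, lexicon):
    """Each node's daughters must have exactly the required categories."""
    for node, word in enumerate(words):
        got = sorted(lexicon[words[d]]["cat"] for d in daughters[node])
        if got != sorted(lexicon[word]["needs"]):
            return False
    return True


def analyses(words, lexicon):
    """Return every assignment of daughters variables that satisfies all constraints."""
    n = len(words)
    # Domain of node i's daughters variable: any set of other node indices.
    domains = [powerset(j for j in range(n) if j != i) for i in range(n)]
    return [a for a in product(*domains)
            if is_tree(a) and valency_ok(a, words, lexicon)]


result = analyses(["the", "water", "boils"], LEXICON)
```

Each element of `result` is one complete variable assignment, that is, one analysis. An input that violates the toy valency requirements, such as `["water", "boils"]`, yields an empty list.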
Analysis example
Multilingual XDG (1)
For each language
Constraints (by way of principles) on one or more syntactic dimensions
Interface dimension relating one syntactic dimension to one semantic dimension, constraining how arc labels in syntax relate to arc labels in semantics
Semantics shared by all languages
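
As a rough illustration of what an interface dimension constrains, the sketch below checks that each semantic arc is realized by the syntactic arc that the head's entry prescribes. It covers only the simplest case, in which the semantic head and the syntactic head are the same node. The function name `interface_ok`, the arc labels, and the example sentence are assumptions chosen for illustration.

```python
def interface_ok(syn_arcs, sem_arcs, linking):
    """Check each semantic arc against the syntactic arc its head prescribes.

    Arcs are (head, label, dependent) triples over node indices.
    linking maps a head index to {semantic label: syntactic label}.
    """
    for head, sem_label, dep in sem_arcs:
        syn_label = linking.get(head, {}).get(sem_label)
        if syn_label is None or (head, syn_label, dep) not in syn_arcs:
            return False
    return True


# Illustrative nodes: 0 = "she", 1 = "eats", 2 = "fish"
syn_arcs = {(1, "sbj", 0), (1, "obj", 2)}
sem_arcs = {(1, "agent", 0), (1, "patient", 2)}
linking = {1: {"agent": "sbj", "patient": "obj"}}

consistent = interface_ok(syn_arcs, sem_arcs, linking)
swapped = interface_ok(syn_arcs, {(1, "agent", 2), (1, "patient", 0)}, linking)
```

Because the semantics is shared, only the `linking` tables and the syntactic labels would differ from one language to the next in a setup like this.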
An Amharic translation of "the water is contaminated"
Multilingual XDG (2)
The English predicate adjective "contaminated" and the Amharic present perfect passive verb ተበክሏል play the same role in the semantics despite their very different syntax
These differences are reflected in the English entries for predicate adjectives and main clause copulas and the Amharic entry for present perfect intransitive verbs
Synchronous XDG
One big multilingual grammar with syntactic dimensions for each language, as well as semantics and the syntax-semantics interface dimension for each language
A sentence and its translation into one or more languages: just a multigraph connecting the nodes of the sentence
A multilingual analysis
Connecting the languages
Entries in one language associated with entries in the other through cross-lingual links that behave like lexical inheritance links
Partial Amharic and English lexicons with three cross-lingual links
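
One way to read "behave like lexical inheritance links" is that a node picks up the linked entry's constraints on the target language's dimensions in addition to its own. The sketch below assumes that reading. The lexicon layout, the dimension names `am-syntax` and `en-syntax`, the constraint specs, and the particular link between ተበክሏል and "contaminated" are illustrative assumptions, not the contents of the actual grammars.

```python
# Toy lexicon keyed by (language, form). Every spec below is illustrative.
LEXICON = {
    ("am", "ተበክሏል"): {
        "constraints": {("am-syntax", "valency"): {"out": {"sbj": "!"}}},
        "xlinks": [("en", "contaminated")],
    },
    ("en", "contaminated"): {
        "constraints": {("en-syntax", "valency"): {"in": {"pred": "!"}}},
        "xlinks": [],
    },
}


def with_target_constraints(key, lexicon):
    """Merge an entry's own constraints with those reached by cross-lingual links."""
    entry = lexicon[key]
    merged = dict(entry["constraints"])
    for target in entry["xlinks"]:
        for spec_key, spec in lexicon[target]["constraints"].items():
            # Keep the source entry's own spec if both define the same key.
            merged.setdefault(spec_key, spec)
    return merged


node_constraints = with_target_constraints(("am", "ተበክሏል"), LEXICON)
```

After the merge, the node carries constraints on both the Amharic and the English syntactic dimensions, so a single round of constraint satisfaction can cover both languages.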
Translation
Analysis + generation
Ordinary constraint satisfaction with additional target language constraints
Additional generation steps
Ordering of output words
Morphological generation
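
The ordering step can be illustrated with explicit position variables. In this minimal sketch each target node gets a position, and every assignment is tested against pairwise precedence constraints. The function name `order_words` and the precedence pairs are assumptions made for the example; an actual grammar would derive the ordering constraints from its principles rather than list them by hand.

```python
from itertools import permutations


def order_words(words, before):
    """Return every word order that satisfies the precedence constraints.

    before is a list of (a, b) index pairs meaning node a precedes node b.
    """
    n = len(words)
    results = []
    for positions in permutations(range(n)):
        # positions[i] is the value of node i's position variable.
        if all(positions[a] < positions[b] for a, b in before):
            ordered = sorted(range(n), key=lambda i: positions[i])
            results.append([words[i] for i in ordered])
    return results


# Target-language nodes in arbitrary order, with illustrative constraints:
# determiner before noun, subject before copula, copula before predicate.
nodes = ["contaminated", "is", "the", "water"]
precedence = [(2, 3), (3, 1), (1, 0)]

sentences = order_words(nodes, precedence)
```

With these constraints the position variables have a single consistent assignment, giving the word order "the water is contaminated"; looser constraints would leave several orders in `sentences`.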
Translation example
Not discussed
Mismatch in number of words across dimensions
Disambiguation
Multi-word units
Project status
Our own (mostly complete) Python implementation of XDG
Morphological analyzers/generators for Amharic, Oromo, Tigrinya, Quechua, and Spanish
(see Gasser (2009) for the approach)
Tiny XDG grammars for English and Amharic plus cross-lingual links