Homework 2: morphotactics and finite-state automata

Email me your solution (a single Python file) by Wednesday, January 23, 11:59pm.

Background: Swahili noun morphotactics

Swahili nouns belong to one of a set of classes that are characterized by particular prefixes for their singulars and plurals. Nouns in the "N class" are the same in the singular and plural forms (and often have no n- prefix). Here are examples of nouns in four of the main classes.

M-WA class
M-MI class
KI-VI class
N class

Singular nouns may also be followed by the suffix -ni, meaning 'at, in, on'. So the following are possible: mlimani, kitandani, mezani. But -ni is not added to plural nouns. So milimani is not possible, for example.

The FSA program

Download FSA.py, a debugged version of a program written by Oliver Steele which creates and manipulates finite-state automata and tests strings for membership in the languages represented by the FSAs.

The program lets you create FSAs from regular expressions, including the metacharacters "*" (0 or more instances), "+" (1 or more instances), "?" (0 or 1 instances), "|" (or) "()" (grouping), using compileRE and manipulate FSAs using concatenation(), union(), intersection(), minimized(), determinized(), etc. compileRE(), but not the other functions, minimizes the resulting FSA before returning it. You can test whether a given FSA accepts a string with the function accepts(). You can examine an FSA with the function tuple(), which returns a tuple consisting of the states, alphabet (if one has been specified), transitions, initial state, and final states of the FSA. Here are some examples.

>>> from FSA import *
>>> fsa1 = compileRE('ab+')
>>> fsa1.tuple()
([0, 2, 1], None, [(0, 1, 'a'), (1, 2, 'b'), (2, 2, 'b')], 0, [2])
>>> fsa1.accepts('ab')
True
>>> fsa1.accepts('abb')
True
>>> fsa1.accepts('a')
False
>>> fsa2 = compileRE('(ab+)|(ba+)')
>>> fsa2.accepts('ba')
True
>>> fsa2.accepts('abba')
False
>>> fsa2.accepts('baa')
True
>>> fsa2.accepts('bab')
False
>>> fsa3 = compileRE('c?')
>>> fsa4 = concatenation(fsa1, fsa3)
>>> fsa4.tuple()
([0, 2, 1, 3, 4], None, [(0, 1, 'a'), (1, 2, 'b'), (2, 2, 'b'), (3, 4, 'c'), (2, 3, None)], 0, [3, 4])
>>> fsa5 = fsa4.minimized()
>>> fsa5.tuple()
([2, 0, 3, 1], None, [(0, 1, 'a'), (1, 2, 'b'), (2, 3, 'c'), (2, 2, 'b')], 0, [2, 3])
>>> fsa5.accepts('ab')
True
>>> fsa5.accepts('abc')
True
>>> fsa5.accepts('ac')
False

What you have to do

Using FSA.py, create an FSA that accepts all of the nouns in the list above and fails to accept nouns that don't agree with Swahili morphotactics. (Minimized, my FSA had 42 states.) Here are some examples.

>>> swahili_noun.accepts('mtoto')
True
>>> swahili_noun.accepts('watoto')
True
>>> swahili_noun.accepts('kitoto')
False
>>> swahili_noun.accepts('mlima')
True
>>> swahili_noun.accepts('milima')
True
>>> swahili_noun.accepts('walima')
False
>>> swahili_noun.accepts('vitu')
True
>>> swahili_noun.accepts('mtu')
True
>>> swahili_noun.accepts('tu')
False
>>> swahili_noun.accepts('mlimani')
True
>>> swahili_noun.accepts('milimani')
False
>>> swahili_noun.accepts('limani')
False
>>> swahili_noun.accepts('mezani')
True
>>> len(swahili_noun.states)
42

Email your single file to me (gasser AT indiana DOT edu).

Home

Calendar

Coursework

Notes

Readings

Code

Resources


IU | INFO | CSCI

Contact instructor