
The WordSieve Architecture

The WordSieve network consists of three small, interdependent levels of nodes. Together they act like a sieve, "trapping" the partitioning words that reflect a user's context. Each node in each level holds a small number of attributes: a word and one or two real-valued numbers. The word associated with any node can change over time.
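As a rough illustration of the node description above, the following Python sketch models a node as a word plus one or two real values. The class name `SieveNode`, the field names `primary` and `secondary`, and the choice to reset the values when the word changes are all assumptions made for this example; the text specifies only that a node holds a word and one or two real numbers, and that the word can change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SieveNode:
    """One node in a level: a word plus one or two real values.

    The field names are hypothetical; the text does not name the values.
    """
    word: str
    primary: float = 0.0               # first real value (always present)
    secondary: Optional[float] = None  # second real value, if the level keeps one

    def reassign(self, new_word: str) -> None:
        """Attach the node to a different word.

        Assumption: the values restart from zero when the word changes.
        """
        self.word = new_word
        self.primary = 0.0
        if self.secondary is not None:
            self.secondary = 0.0


# A node that currently tracks "sieve" is later reused for "context".
node = SieveNode(word="sieve", primary=1.5)
node.reassign("context")
```

Because each level holds only a small number of such nodes, the memory footprint of a level would stay bounded no matter how many distinct words appear in the document stream.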

The architecture of the current version of WordSieve is shown in Figure 2. WordSieve processes a document by passing each word, as it is encountered, through the MostFrequentWords level. Levels 2 and 3 then get their information solely from what survives in the MostFrequentWords level. We will examine each level individually, then consider how well the system performs.

**Figure 2:** WordSieve diagram
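The data flow described above, in which every word enters level 1 and the upper levels see only what survives there, might be sketched as follows. This is a minimal illustration and not the WordSieve implementation: the class names, the fixed capacity, the evict-the-weakest replacement policy, and the choice to update the upper levels once per document are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, Iterable, List


class MostFrequentWordsSketch:
    """Fixed-capacity stand-in for level 1 (replacement policy is assumed)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.counts: Dict[str, float] = {}

    def observe(self, word: str) -> None:
        if word in self.counts:
            self.counts[word] += 1.0
        elif len(self.counts) < self.capacity:
            self.counts[word] = 1.0
        else:
            # Assumption: the weakest word gives up its slot to the newcomer.
            weakest = min(self.counts, key=self.counts.get)
            del self.counts[weakest]
            self.counts[word] = 1.0

    def survivors(self) -> FrozenSet[str]:
        return frozenset(self.counts)


class UpperLevelSketch:
    """Placeholder for level 2 or 3: it only records what level 1 passes up."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.seen: List[FrozenSet[str]] = []

    def update(self, survivors: FrozenSet[str]) -> None:
        self.seen.append(survivors)


def process_document(words: Iterable[str],
                     level1: MostFrequentWordsSketch,
                     upper_levels: List[UpperLevelSketch]) -> None:
    # Every word goes through level 1 as it is encountered.
    for word in words:
        level1.observe(word)
    # The upper levels never see raw text, only level 1's survivors.
    survivors = level1.survivors()
    for level in upper_levels:
        level.update(survivors)


level1 = MostFrequentWordsSketch(capacity=2)
level2 = UpperLevelSketch("document sequences: occurring")
level3 = UpperLevelSketch("document sequences: absent")
process_document("sieve word sieve text sieve".split(), level1, [level2, level3])
```

In this run "word" is displaced by "text" because level 1 has only two slots, so both upper levels receive the set containing "sieve" and "text". Routing everything through one small level keeps the work done by levels 2 and 3 independent of vocabulary size.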

Level 1: MostFrequentWords
Level 2: Words Occurring in Document Sequences
Level 3: Words Absent in Document Sequences
WordSieve Output
- User Profiles
- Context Profiles

Travis Bauer
2002-01-25