This last stage in the Text Track Project is essentially a persistent version of the previous BETA version of the project. All the iterations performed so far have used main-memory-only databases, which limited the size of the database and imposed two restrictions:
The persistence of the database is achieved through the use of the E Programming Language, developed in Wisconsin as part of the Exodus Project. The original program is converted from a main-memory database to a persistent database by converting the class declarations for the Elements and Attributes to special persistent dbclass declarations in E, and by using E's transaction processing syntax for accessing and modifying the persistent data. E was chosen over the normal Exodus structure because of its high-level syntax and the ease of migrating a non-persistent program to a persistent one.
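As a rough illustration of the starting point for this conversion, the following is a minimal C++ sketch of what main-memory Element and Attribute classes might look like. The class members and the `findAttribute` helper are hypothetical and are not taken from the project source. In the persistent version, declarations of this kind are the ones rewritten as E dbclass declarations; the E syntax itself is not reproduced here.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical main-memory classes. Names and members are illustrative
// assumptions, not the project's actual declarations.
class Attribute {
public:
    Attribute(std::string name, std::string value)
        : name_(std::move(name)), value_(std::move(value)) {}
    const std::string& name() const { return name_; }
    const std::string& value() const { return value_; }
private:
    std::string name_;
    std::string value_;
};

class Element {
public:
    explicit Element(std::string tag) : tag_(std::move(tag)) {}

    void addAttribute(const Attribute& a) { attributes_.push_back(a); }

    const std::string& tag() const { return tag_; }
    std::size_t attributeCount() const { return attributes_.size(); }

    // Returns a pointer to the first attribute with the given name,
    // or nullptr if the element has no such attribute.
    const Attribute* findAttribute(const std::string& name) const {
        for (const Attribute& a : attributes_) {
            if (a.name() == name) {
                return &a;
            }
        }
        return nullptr;
    }
private:
    std::string tag_;
    // Lives only in main memory: lost when the process exits, which is
    // the limitation the persistent version is meant to remove.
    std::vector<Attribute> attributes_;
};
```

Everything in this sketch is held in process memory, so the data must be rebuilt on every run. That is the behavior the persistent declarations are intended to replace.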
The server is set up using the Exodus 3.0 storage manager. The version of E that is presently installed did not seem to be compatible with Exodus 3.1, so after many futile attempts to run the examples, we decided to move back to Exodus 3.0. Setting up the server consists essentially of creating a data volume and a log volume, and running the server on the machine specified in the server configuration file. Details of how to set up the server can be found in the Exodus Users' Manual.
The client in our case is simply the set of executable programs in this project. In this iteration, we have two executables:
% parse [SGML-filename]
As in the previous iteration, the SGML file should contain the DOCTYPE information with the full pathname of the DTD. If the filename is not specified, the program will automatically use ~asengupt/mmdb/dtd/starwars.sgm.
Both programs are located in /u/asengupt/mmdb/exodus/demo on cs machines.
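The default-filename behavior of parse described above can be sketched as follows. This is only an illustration: the function name `chooseSgmlFile` and the way the argument is read are assumptions, not the project's actual code. Only the default path comes from the text.

```cpp
#include <string>

// Default path as stated in the text. The "~" is left unexpanded here;
// a real program would have to expand it itself, since only the shell
// does so for command-line arguments.
const char* const kDefaultSgmlFile = "~asengupt/mmdb/dtd/starwars.sgm";

// Hypothetical helper: returns the SGML filename given on the command
// line, or the default when none (or an empty one) is supplied.
std::string chooseSgmlFile(int argc, const char* const argv[]) {
    // argv[0] is the program name; argv[1], if present, is the SGML file.
    if (argc > 1 && argv[1] != nullptr && argv[1][0] != '\0') {
        return argv[1];
    }
    return kDefaultSgmlFile;
}
```

With this helper, `parse` invoked with no argument would fall back to the starwars.sgm file, while `parse ww.sgm` would use the named file.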
For more sophisticated demos, subsets of the poetry data can be used. One such subset is kept in /u/asengupt/mmdb/test/poetries.sgm, and another subset, with more than 85,000 elements, is in /u/asengupt/temp/poetry/wrdswrth/ww.sgm. This data contains all the works of William Wordsworth, and the index is already built on megamouth. The total index building time was about half an hour, and the total data volume size dumped by Exodus was about 100 MB.