DocBase is the successor to SGMLQuery. It contains all the features of SGMLQuery and also includes SQL support.
DocBase started as a research project in the Department of Computer Science at Indiana University, Bloomington. This research was part of my dissertation, under the guidance of Prof. Dirk Van Gucht, Prof. Edward Robertson, Prof. Andrew Dillon and Prof. David Leake.
In its current implementation, DocBase acts primarily as a query processing system for structured documents. It currently supports SGML (Standard Generalized Markup Language, ISO 8879) and XML with DTDs. Support for DTD-less XML is planned for a future release.
DocBase never really had a big development team. After I started the initial implementation in January 1996, I had a few students work on parts of the code and make very strong contributions. I am very thankful to these students for their help with the project.
Arijit Sengupta. "The compleat closure: toward a unified view of structured document database objects." Accepted for publication at the Fifth International Conference on Information Systems Analysis and Synthesis (ISAS '99). Orlando, Florida. July 31-August 4, 1999.
Arijit Sengupta. "Toward the union of databases and document management: The design of DocBase." Accepted for publication in Proceedings: Conference on Management of Data (COMAD'98), Hyderabad, India, December 17-19, 1998. Available in postscript [548K]. (Text Abstract)
Arijit Sengupta. "DocBase - A Database Environment for Structured Documents." Ph.D. Thesis. Indiana University, Bloomington. December, 1997. Available as gzipped postscript [600K]. (Text Abstract)
Arijit Sengupta and Andrew Dillon. "Extending SGML to Accommodate Database Functions: A Methodological Overview." Journal of the American Society for Information Science (JASIS), special issue on structured information/standards for document architectures. Pages 629-637, July, 1997. Available in postscript [744K]. (Text Abstract)
Arijit Sengupta and Andrew Dillon. "Query By Templates: A Generalized Approach for Visual Query Formulation for Text Dominated Databases." In Proceedings: Conference on Advanced Digital Libraries (ADL'97), Library of Congress, Washington, D.C., pages 36-47, May 7-9, 1997. Available in postscript [776K]. (Text Abstract)
Arijit Sengupta. "Standardizing the Querying Process with SGML: The SQL DTD." In Tommie Usdin and Debbie Lapeyre, editors, Proceedings of the SGML'96 Conference. Graphic Communications Association, pages 323-337, November, 1996. (Presented at conference.) Available in SGML, also available in postscript [800K] (Text Abstract)
Arijit Sengupta. "Demand More from Your SGML Database! Bringing SQL Under the SGML Limelight." <TAG>, 9(4): pages 1-7, April 1996. Available in postscript [352K]. (Text Abstract)
Currently, there is one completed system demonstration available with DocBase, for the Chadwyck-Healey English Poetry Database. Because of copyright restrictions, the actual poems are not available for access outside Indiana University. To access the demonstration, please refer to the QBT page.
The source code of DocBase and QBT is not currently publicly available. However, after the initial release is completed (projected date: August 1999), we will make part or all of the system available for download under the GNU public license. If you would like to be notified when a release is available, please contact me at asengupt@cs.indiana.edu.
Last modified: Mon May 24 22:48:04 EST 1999