Selected publications by members of the Database Group
Unless otherwise noted, links are to PostScript files (possibly zipped)
or to web pages offering a variety of formats.
- Data & Web Mining, Network Analysis, Information Retrieval
- On Approximation Measures for Functional Dependencies, C. Giannella and E. Robertson, Information Systems, to appear, 2004.
- Mining Frequent Itemsets Over Arbitrary Time Intervals in Data Streams,
C. Giannella, J. Han, E. Robertson, C. Liu.
Computer Science Department Technical Report 587,
Indiana University, Nov 2003.
An older version:
Mining Frequent Patterns in Data Streams at Multiple Time Granularities,
C. Giannella, J. Han, J. Pei, X. Yan and P.S. Yu. Data Mining: Next
Generation Challenges and Future Directions,
AAAI/MIT Press, H. Kargupta, A. Joshi, K. Sivakumar, and Y. Yesha (eds.),
2003.
- A Note on Approximation Measures for Multi-valued Dependencies in
Relational Databases, C. Giannella and E. Robertson, Information Processing Letters, Volume 85, Issue 3, pages 153-158, 2003.
- Topical Web Crawlers: Evaluating Adaptive Algorithms, F. Menczer, G. Pant, P. Srinivasan. To appear in ACM Trans. on Internet Technology.
Download site
- Search engine-crawler symbiosis: Adapting to Community Interests, G. Pant, S. Bradshaw, F. Menczer. Proc. ECDL 2003
Download site
- Topical crawling for business intelligence, G. Pant, F. Menczer. Proc. ECDL 2003
Download site
- Defining Evaluation Methodologies for Topical Crawlers, P. Srinivasan, F. Menczer, G. Pant. Position paper, SIGIR 2003 Workshop on Defining Evaluation Methodologies for Terabyte-Scale Collections
Download site
- Complementing Search Engines with Online Web Mining Agents, F. Menczer. Decision Support Systems 35(2): 195-212, 2003
Download site
- Crawling the Web, G. Pant, P. Srinivasan, F. Menczer. To appear in M. Levene and A. Poulovassilis, eds.: Web Dynamics, Springer, 2003
Download site
- Feature Selection in Data Mining, Y.S. Kim, N. Street, F. Menczer. In J. Wang, ed.: Data Mining: Opportunities and Challenges, Idea Group Publishing, pp. 80-105, 2003
Download site
- Growing and Navigating the Small World Web by Local Content, F. Menczer. Proc. Natl. Acad. Sci. USA 99(22): 14014-14019, 2002
Download site
- Adaptive Assistants for Customized E-Shopping, F. Menczer, A. Monge, N. Street. IEEE Intelligent Systems 17(6): 12-19, Nov-Dec 2002
Download site
- MySpiders: Evolve your own intelligent Web crawlers, G. Pant, F. Menczer. Autonomous Agents and Multi-Agent Systems 5(2): 221-229, 2002
Download site
- Evolutionary model selection in unsupervised learning, Y.S. Kim, N. Street, F. Menczer. Intelligent Data Analysis 6(6): 531-556, 2002
Download site
- IntelliShopper: A Proactive, Personal, Private Shopping Assistant, F. Menczer, N. Street, N. Vishwakarma, A. Monge, M. Jakobsson. Proc. 1st ACM Int. Joint Conf. on Autonomous Agents and MultiAgent Systems (AAMAS 2002) pp. 1001-1008
Download site
- Web Crawling Agents for Retrieving Biomedical Information, P. Srinivasan, J. Mitchell, O. Bodenreider, G. Pant, F. Menczer. Proc. Int. Workshop on Agents in Bioinformatics (NETTAB 2002)
Download site
- Meta-Evolutionary Ensembles, Y.S. Kim, N. Street, F. Menczer. Proc. IEEE Intl. Joint Conf. on Neural Networks (IJCNN'02)
Download site
- Exploration versus Exploitation in Topic Driven Crawlers, G. Pant, P. Srinivasan, F. Menczer. Proc. WWW 2002 Workshop on Web Dynamics
Download site
- An Axiomatic Approach to Defining Approximation Measures for Functional
Dependencies, Chris Giannella. Lecture Notes in Computer Science vol 2435, pg. 37-51 (proceedings of the
6th East-European Conference on Advances in Databases and Information
Systems), 2002.
- Discovering Frequent Itemsets in the Presence
of Highly Frequent Items, Dennis P. Groth and Edward L. Robertson.
Workshop on Rule Based Data Mining, in Conjunction with the 14th
International Conference On Applications of Prolog, 2001.
- FastFDs: A Heuristic-Driven Depth-First Algorithm for Mining
Functional Dependencies from Relation Instances. Cathy Wyss, Chris
Giannella, and Edward Robertson, Proceedings of the 3rd International
Conference on Data Warehousing and Knowledge Discovery (DaWaK 2001),
Munich, Germany, September 2001. Published in Lecture Notes in Computer
Science 2112.
- On an Information Theoretic Approximation Measure for Functional
Dependencies. Chris Giannella and Edward Robertson, Indiana University,
Computer Science Department Technical Report 555, Aug 2001.
- Information Dependencies,
Mehmet Dalkilic and Edward Robertson. Indiana University, Computer Science
Department Technical Report 531, Nov 1999, also in ACM PODS,
2000.
- Average Case Performance of the Apriori Algorithm,
Paul Purdom and Dirk Van Gucht. Indiana University, Computer Science
Department Technical Report 529, Oct 1999.
- CE: The Classifier-Estimator Framework for Data Mining,
Mehmet Dalkilic, Edward Robertson, and Dirk Van Gucht,
Proceedings 7th IFIP 2.6 Working Conference on Database Semantics,
Chapman & Hall, 1998.
Full version available as
Computer Science Department Technical Report 480,
Indiana University, May 1997.
- Query Languages and Processing
- An Information Theoretic Histogram for Single Dimensional Selectivity Estimation, Bassem Sayrafi and Chris Giannella. Indiana University, Computer Science Department Technical
Report 584, June 2003.
- Using Horizontal-Vertical Decompositions to Improve Query
Evaluation,
Chris Giannella, Mehmet Dalkilic, Dennis Groth, and Edward Robertson.
Lecture Notes in Computer Science vol 2405, 2002
(proceedings of the 19th British National Conference on Databases BNCOD)
26-41.
Computer Science Department Technical Report 558,
Indiana University, Feb 2002.
- Providing Better Support for
Quantified Query Processing,
Sudhir Rao, Antonio Badia, and Dirk Van Gucht.
ACM SIGMOD 1996
- Processing Generalized Quantified Queries,
Sudhir Rao, Antonio Badia, and Dirk Van Gucht,
Indiana University, Computer Science Department Technical Report 452,
May 1996.
- Unnesting and Optimization
Techniques for Queries Containing Generalized
Quantifiers, Sudhir Rao, DRAFT, in preparation.
- Polynomial-Time Query Languages for Untyped
Lists, Edward Robertson, Lawrence V. Saxton, and Dirk Van Gucht,
DRAFT, in preparation.
- A graph-oriented object database model, M. Gyssens, J. Paredaens,
J. Van den Bussche, D. Van Gucht.
IEEE Transactions on Knowledge and Data Engineering,
vol 6, no 4, pages 572-586, 1994.
- An overview of GOOD, J. Paredaens, J. Van den Bussche, D.
Van Gucht, et al., SIGMOD Record, vol 21, no 1, pages 25-31, 1992.
- Spatial Database Theory
- On Adding a Connectedness Operator to FO+poly (linear), Chris
Giannella and Dirk Van Gucht,
Acta Informatica 38(9), pages 621-648, 2002.
An earlier (less polished) version appears as
Computer Science Department Technical Report 530,
Indiana University, 2000
Download.
- Adding a Connectedness Operator to FO+poly -- Extended Abstract,
Chris Giannella, Proceedings of the 2000 student session of the
European summer school in logic, language, and information (ESSLLI2000) in
Birmingham, England.
- Complete geometrical query languages, M. Gyssens, J. Van den
Bussche, D. Van Gucht. Journal of Computer and System Sciences,
vol 58, no 3, pages 483-511, 1999. (A preliminary version was presented at
PODS'97.)
- An Expressive Language for Linear Spatial Database Queries,
L. Vandeurzen, M. Gyssens, D. Van Gucht, PODS'98
- Genericity in Spatial Databases, Bart Kuijpers, Dirk Van Gucht, to
appear in Constraint Databases (eds. G. Kuper, L.
Libkin, and J. Paredaens), 1998
- Towards a Theory of Movie Database Queries, Bart Kuijpers, Jan
Paredaens, and Dirk Van Gucht, Technical Report University of
Antwerp 98-02, 1997
- On the Decidability of Semi-Linearity for Semi-Algebraic Sets and
its Implications for Spatial Databases, F. Dumortier, M. Gyssens, L.
Vandeurzen, D. Van Gucht, PODS'97
- On Query Languages for Linear Queries Definable with Polynomial
Constraints, L. Vandeurzen, M. Gyssens, and D. Van Gucht, Lecture Notes
in Computer Science (Proceedings of Second International Conference on
Principles and Practice of Constraint Programming, Cambridge, Massachusetts,
USA, August 19-22, 1996), vol. 1118, Springer, 1996, pp. 468-481.
- On the Desirability and Limitations of Linear Spatial Database
Models, L. Vandeurzen, M. Gyssens, and D. Van Gucht, Lecture Notes in
Computer Science (Proceedings of the 4th International Symposium on
Large Spatial Databases (SSD'95)), M.J. Egenhofer and J.R. Herring, eds.,
vol. 951, Springer, 1995, pp. 14-28.
- First-order queries on finite structures over the reals,
Jan Paredaens, Jan Van den Bussche, Dirk Van Gucht, Logic In
Computer Science, 79-89, 1995
- Towards a Theory of Spatial Database Queries, Jan Paredaens, Jan
Van den Bussche, Dirk Van Gucht, PODS'94, 279-288, 1994
- Reflective and Meta-Data Query Languages
- Optimal Tuple Merge is NP-Complete.
Edward L. Robertson and Catharine M. Wyss.
- A Relational Algebra for Data/Metadata Integration in a Federated
Database System. Cathy Wyss and Dirk Van Gucht, CIKM 2001, Atlanta,
Georgia.
- Augmenting SQL with Dynamic Typing to Support Interoperability in
a Relational Federation. Cathy Wyss, Felix Wyss, and Dirk Van Gucht, EFIS
2001, Berlin, Germany.
- MD-SQL: A Language for Meta-Data Queries over Relational Databases,
C. M. Rood, D. Van Gucht and F. I. Wyss.
Indiana University, Computer Science Department Technical Report 528,
Jul 1999
- Typed query languages for databases containing queries, F. Neven,
J. Van den Bussche, D. Van Gucht, G. Vossen. Information Systems,
vol 24, no 7, pages 569-595, 1999. (A preliminary version was presented at
PODS'98.)
- Design and Implementation of Reflective SQL,
Mehmet M. Dalkilic, Manoj Jain, Dirk Van Gucht, and Anurag Mendhekar.
Indiana University, Computer Science Department, Technical Report 451,
Feb 1996
- Reflective Programming in the Relational Algebra, Jan Van den
Bussche, Dirk Van Gucht, Gottfried Vossen, ACM PODS 1993
- Complex Object Databases
- On the completeness of object-creating database transformation
languages, J. Van den Bussche, D. Van Gucht, M. Andries, M. Gyssens.
Journal of the ACM, vol 44, no 2, pages 272-319, 1997. (A preliminary
version was presented at FOCS'92.)
- A Polynomial-Time Query Language for
Hierarchically Structured Documents,
Arijit Sengupta and Dirk Van Gucht,
DRAFT, in preparation.
- Query By Templates:
A Generalized Approach for Visual Query Formulation for Text
Dominated Databases,
Arijit Sengupta and Andrew Dillon,
Conference on Advanced Digital Libraries (ADL'97), 1996.
- Standardizing the Querying Process with SGML:
The SQL DTD (PostScript version),
Arijit Sengupta.
Tommie Usdin and Debbie Lapeyre, editors, Proceedings
of the SGML'96 Conference. Graphic Communications Association, 1996.
An
SGML version is also available (of course), for those with SGML viewers.
- Extending SGML to Accommodate
Database Functions: A Methodological Overview,
Arijit Sengupta and Andrew Dillon,
Journal of the American Society for Information Science (JASIS),
special issue on structured information/standards for
document architectures. August, 1996.
- Demand More from Your SGML Database! Bringing
SQL Under the SGML Limelight,
Arijit Sengupta,
<TAG>, April 1996.
- Structured Document Databases,
Arijit Sengupta, September 1996.
Arijit's thesis proposal and project summary.
- The expressive power of cardinality-bounded set values in
object-based data models, J. Van den Bussche, D. Van Gucht. Theoretical
Computer Science, vol 149, no 1, pages 49-66, 1995. (A preliminary
version was presented at ICDT'92.)
- Expressiveness of efficient semi-deterministic choice constructs, M.
Gyssens, J. Van den Bussche, D. Van Gucht. Automata, Languages and
Programming - ICALP'94 (S. Abiteboul, E. Shamir, editors), Lecture
Notes in Computer Science, vol 820, pages 106-117. Springer, 1994. (A
full version presenting polynomial-time semi-deterministic choice
constructs that are more general than swap-choice, is in preparation.)
- Non-deterministic aspects of database transformations involving
object creation, J. Van den Bussche, D. Van Gucht. Modeling Database
Dynamics (U. Lipeck, B. Thalheim, editors), Workshops in Computing,
pages 3-16. Springer, 1993.
- Modeling Information and Information Systems
- Architectural Principles for Enterprise Frameworks,
Richard A. Martin, Edward L. Robertson, and John A. Springer,
Evaluation of Modeling Methods (EMMSAD'04),
Riga, Latvia, June 2004.
Full version available as
Computer Science Department Technical Report 594.
- A Comparison of Frameworks for Enterprise Architecture Modeling,
Richard Martin and Edward Robertson,
ER2003 - 22nd International Conference on Conceptual Modeling,
Chicago IL,
pages - (abstract only),
complete presentation in PowerPoint.
- Frameworks: Comparison and Correspondence for Three Archetypes
(pdf version of presentation),
Richard Martin and Edward Robertson,
Zachman Information Framework Architectures, 2002.
- Formalization of Multi-level Zachman Frameworks,
Richard Martin and Edward Robertson,
Indiana University,
Computer Science Department Technical Report 522,
April 1999.
- Leveled Entity-Relationship Model,
an extension of ER notions to provide for a hierarchical model,
using O-O-like encapsulation,
Munish Gandhi, Edward Robertson, and Dirk Van Gucht.
Entity-Relationship 95.
- Leveled Entity-Relationship Model,
a fuller version of the above,
Munish Gandhi, Edward Robertson, and Dirk Van Gucht.
Indiana University, Computer Science Department Technical Report 404,
May 1994
- A Data Model for Audio-Video Data,
Munish Gandhi and Edward Robertson,
Advances in Data Management '94.
P. Sadanandan and S. Chakravarthy (eds),
Tata McGraw-Hill, New Delhi (1994), 135-150.
- Modeling and Querying Primitives
for Digital Media,
Munish Gandhi, Edward Robertson, and Dirk Van Gucht.
1994 ACM SIGMOD Conference on Management of Data.
- Semantic-Based Data Model,
an application of ER models to capture notions of
design and versioning,
Munish Gandhi and Edward Robertson.
Entity-Relationship 92.
- Visualization
- An Entropy-Based Approach to Visualizing Database Structure,
Dennis Groth and Edward Robertson, The Sixth IFIP Working Conference on
Visual Database Systems (VDB6), May 2002.
- An Integrated System for Database Visualization (Poster), Dennis
Groth, Edward Robertson, Advanced Visual Interfaces 2002, May 2002.
- Architectural support for database
visualization,
Dennis Groth and Edward Robertson.
Workshop on New Paradigms in Information Visualization and
Manipulation, 1999
- Nonlinear magnification fields,
T. Alan Keahey and Edward L. Robertson.
IEEE Information Visualization '97.
- Techniques for Non-Linear
Magnification Transformations,
T. Alan Keahey and Edward L. Robertson.
IEEE Information Visualization '96.
- Viewing Text with Non-Linear Magnification: An Experimental Study,
T. Alan Keahey and Julianne Marley.
Indiana University, Computer Science Department Technical Report 459,
April 1996.
- Non-linear image magnification,
T. Alan Keahey and Edward L. Robertson.
Indiana University, Computer Science Department, Technical Report
460, April 1996.
- Semi-structured Data Management
- Tree Logical Classes for Efficient Evaluation of XQuery.
Stelios Paparizos, Yuqing Wu, Laks V.S. Lakshmanan, H.V. Jagadish. SIGMOD
2004.
- Storing XML (with XSD) in SQL Databases: Interplay of Logical and Physical
designs. Zhiyuan Chen, Surajit Chaudhuri, Kyuseok Shim, Yuqing
Wu. ICDE 2004.
- TIMBER: A Native System for Querying XML.
Stelios Paparizos, Shurug Al-Khalifa, Adriane Chapman, H.V.
Jagadish, Laks V.S. Lakshmanan, Andrew Nierman, Jignesh M. Patel, Divesh
Srivastava, Nuwee Wiwatwattana, Yuqing Wu and Cong Yu. SIGMOD (demo) 2003.
- Structural Join Order Selection for XML Query Optimization.
Yuqing Wu, Jignesh Patel and H.V. Jagadish, ICDE 2003.
- Using Histograms to Estimate Answer Size for XML Queries. Yuqing Wu, Jignesh Patel, H. V. Jagadish.
Information Systems 28
(1-2): 33-59 (2003) -- Special Issue: Best Papers from EDBT 2002.
- TIMBER: A Native XML Database. H. V. Jagadish, Shurug Al-Khalifa, Adriane Chapman,
Laks V.S. Lakshmanan, Andrew Nierman, Stelios Paparizos, Jignesh M. Patel,
Divesh Srivastava, Nuwee Wiwatwattana, Yuqing Wu and Cong Yu. VLDB Journal, Vol. 11, Issue 4
(2002).
- COMMIX: Towards Effective Web Information Extraction, Integration and Query Answering.
Tengjiao Wang, Shiwei Tang, Dongqing Yang, Jun Gao,
Yuqing Wu, Jian Pei. SIGMOD (demo) 2002.
- Estimating Answer Sizes for XML Queries.
Yuqing Wu, Jignesh M. Patel and H.V. Jagadish.
EDBT 2002.
- Grouping in XML. Stelios Paparizos, Shurug Al-Khalifa, H. V. Jagadish,
Laks V.S. Lakshmanan, Andrew Nierman, Divesh Srivastava and Yuqing Wu.
EDBT Workshop on XML Data Management (XMLDM'02), Published in
Springer-Verlag, Lecture Notes in Computer Science Vol.2490, 2002.
- Structural Joins: A Primitive for Efficient XML Query Pattern Matching.
Shurug Al-Khalifa, H. V. Jagadish, Nick Koudas,
Jignesh M. Patel, Divesh Srivastava and Yuqing Wu. ICDE 2002.
- Miscellaneous (Algorithms, Logic Programming, Logic in Database Theory)
- Useful Transformations in Answer Set Programming. J.C. Nieves
Sanchez, M. Osorio, and C. Giannella, Workshop on Answer Set Programming as
part of AAAI 2001 Spring Symposium Series, March 26-28, Stanford CA,
Technical Report SS-01-01, pg 146-152.
- Polynomially
orderable classes of structures,
J. Van den Bussche, D. Van Gucht. Unpublished DRAFT
- An Empirical Study of the 4-Valued Kripke Kleene Semantics and
4-Valued Well-Founded Semantics in Random Propositional Logic Programs,
C. Giannella, J. Schlipf, Annals of Mathematics and Artificial
Intelligence 25 (1999) 3,4, pg 275-309, ed. J. Dix, J. Lobo
- An Empirical Study of the 3-Valued Kripke Kleene Semantics in Random
Propositional Logic Programs, C. Giannella, J. Schlipf, Proceedings
of the Logic Programming Track of the 7th International Workshop on
Non-Monotonic Reasoning 1998, pg. 41-50, ed. J. Dix, J. Lobo
- On the Complexity of Partitioning
Sparse Matrix Representations,
Jóhann P. Malmquist and Edward Robertson,
BIT 1982.
This deals with the partitioning of network databases in
order to minimize interpage links.
This is the paper that got Ed Robertson into databases.
- Pedagogy
Computer Science on-line
Technical Report collection.