Indiana University Bloomington

School of Informatics and Computing

Technical Report TR602:
Measures in Databases and Data Mining

Bassem Sayrafi, Dirk Van Gucht, Marc Gyssens
(Nov 2004), 39
We present a framework to study the properties of measures as they occur in various areas of databases and data mining, such as aggregation in queries, measurement of data uniformity, and frequency calculation. The framework generalizes the theory of mathematical measures: it is built on principles that relax the additivity principle for mathematical measures. Besides using the framework to classify measures, we derive general bounds and rules that they must satisfy. By considering the analogue of the first and second derivatives of functions, which in our case are the first and second finite differences of measures, we obtain inference systems that allow us to reason about constraints that exist between data objects relative to measurements.
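As a rough illustration of the finite differences mentioned in the abstract, the following sketch computes first and second differences of a frequency (support) measure over itemsets. The sign convention, the function names, and the toy transactions are all assumptions made for this example; the report itself may define the differences differently.

```python
def support(transactions, itemset):
    """Frequency measure: number of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def first_difference(f, x, a):
    """Assumed convention: f(X) - f(X ∪ {a})."""
    x = frozenset(x)
    return f(x) - f(x | {a})


def second_difference(f, x, a, b):
    """Assumed convention: f(X) - f(X ∪ {a}) - f(X ∪ {b}) + f(X ∪ {a, b})."""
    x = frozenset(x)
    return f(x) - f(x | {a}) - f(x | {b}) + f(x | {a, b})


# Hypothetical toy data, not taken from the report.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"butter"}),
]


def f(itemset):
    """The support measure restricted to the toy transactions above."""
    return support(transactions, itemset)


# Transactions that do not contain "bread".
d1 = first_difference(f, set(), "bread")
# Transactions that contain neither "bread" nor "milk".
d2 = second_difference(f, set(), "bread", "milk")
print(d1, d2)
```

Under this convention both differences of the support measure are non-negative, since each one counts a set of transactions. A first difference of zero at X with respect to an item a would mean that every transaction containing X also contains a, which is one example of a constraint between data objects expressed through a measurement.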

