Research » Provenance


If computing is to become a genuine third pillar of science, scientists need to be able to apply the same rigor of reproducibility that has been applied to theory and experimentation and that has been the keystone of science. This is where data provenance becomes critical. To ensure verifiability in computational simulations, the results must be accurately traceable to the input as well as to the code that produced them. So far, this problem has been addressed only in an ad hoc manner.

The goal of this effort is to develop a precise mathematical model of provenance and to apply it to analyzing and instrumenting programs, so that provenance information is collected automatically as an integral part of a scientific workflow.
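To make the traceability requirement concrete, here is a minimal sketch of a hand-written provenance record that ties a result to the input and the code that produced it. It is only an illustration of the manual, ad hoc style of bookkeeping, not the mathematical model or the automatic instrumentation this project aims for. The names `digest`, `make_record`, `is_traceable`, `simulate`, and `CODE_TEXT` are hypothetical, and the "simulation" is a toy stand-in.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def make_record(input_data: bytes, code_text: str, output_data: bytes) -> dict:
    """Build a (hypothetical) provenance record: fingerprints of input, code, and output."""
    return {
        "input_sha256": digest(input_data),
        "code_sha256": digest(code_text.encode("utf-8")),
        "output_sha256": digest(output_data),
    }


def is_traceable(record: dict, input_data: bytes, code_text: str,
                 output_data: bytes) -> bool:
    """Check that the given input, code, and output all match the record."""
    return record == make_record(input_data, code_text, output_data)


# Assumption: this string stands in for the program source (or a revision
# identifier) that a real workflow would capture.
CODE_TEXT = "simulate-v1: sum of whitespace-separated integers"


def simulate(input_data: bytes) -> bytes:
    """Toy stand-in for a simulation: sum the integers in the input."""
    total = sum(int(tok) for tok in input_data.decode("ascii").split())
    return str(total).encode("ascii")


inputs = b"1 2 3"
outputs = simulate(inputs)
record = make_record(inputs, CODE_TEXT, outputs)
```

Calling `is_traceable(record, inputs, CODE_TEXT, outputs)` confirms that a result matches its recorded input and code, and any change to one of the three makes the check fail. The burden of building such records by hand at every step is one reason to collect provenance automatically instead.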

Related publications:

  1. Devarshi Ghoshal, Beth Plale and Arun Chauhan. Regeneration, Referencing and Quality Assessment of Benchmarking Data using Provenance. In Proceedings of the 5th International Provenance and Annotation Workshop (IPAW), 2014.
    [Article DOI]
  2. Devarshi Ghoshal, Arun Chauhan and Beth Plale. Static Compiler Analysis for Workflow Provenance. In The 8th Workshop on Workflows in Support of Large-Scale Science (WORKS13), 2013. Held in conjunction with the 2013 International Conference for High Performance Computing, Networking, Storage and Analysis (SC13).
    [Article DOI]
Arun Chauhan / Computer Science / Indiana University