Readings:
-- Scientific Data Management: Challenges, Technology, and Deployment, A. Shoshani and D. Rotem, Eds. CRC Press, 2010. ISBN 978-1-4200-6980-8.
-- The Fourth Paradigm http://research.microsoft.com/en-us/collaboration/fourthparadigm/
-- Select journal, conference and workshop readings.
Course Description: As supercomputers and modern scientific instruments allow scientists to generate data on everything from the human genome to the origin of distant planets and the changing climate of our own planet, we find ourselves awash in computational data, a problem often referred to as the data deluge. Data produced by these technologies are precious and irreplaceable, holding the potential for greater scientific knowledge and understanding in perpetuity. In this seminar course we will explore multiple dimensions of scientific data management, including the issues that arise at scale and in preservation and archiving.
The course uses lectures, presentations, and discussions. If student interest and background merit, students will get hands-on experience with research tools and web services through a class project. See http://pti.iu.edu for the kinds of research tools to be explored.
Prerequisite: A moderate level of mastery of programming in a traditional programming language such as Java or C++, with experience on something more substantial than toy standalone programs. Interdisciplinary teams that draw on complementary skill sets are a possibility, depending on class makeup.