CSCI B534 Distributed Systems

School of Informatics and Computing, Indiana University Bloomington
Spring 2010; Tue and Thur 5:30-6:45 p.m., Informatics East Room 130
Meets with CSCI-B 490


Much of the computing taking place today is distributed. Web services, cloud computing, virtualization, peer-to-peer, and Internet computing all have distributed systems concepts at their foundation. In this course we study those foundational concepts, which you need in order to move from technology to technology: cloud computing today, and something entirely different a decade from now. You will also get hands-on programming experience with today's technologies, specifically cloud computing and web services for large-scale data. The course is intended to give the computer science student (or other serious technologist) a balanced experience: it grows your foundational understanding of distributed systems while providing hands-on experience that puts that understanding in a context of immediate use.

Instructors: Professor Beth Plale and Chathura Herath
Plale: LH301C, 812-855-4373, e-mail, office hours: Tue 3:00 - 5:00 p.m.
Herath: e-mail. Chathura is available as chathurah on Skype, Yahoo, and Gmail.
Associate Instructors:
Yuan Luo: LH301H, e-mail, office hours: Thur 2:30 - 3:50 p.m., or by appointment.
Pairoj Rattadilok: LH310, e-mail, office hours: Wed 2:00 - 3:30 p.m., or by appointment.

Topics and Agenda
Goals of distributed systems 13 Jan
LEAD Portal for scientific discovery 15 Jan
Architectures: web services, cloud computing, overlay networks, ... 19-28 Jan
Virtualization and Communication: threads, VMs, RPC, ... 2-18 Feb
Performance and benchmarks 23-25 Feb
Midterm exam 2 Mar
Naming: global naming, name spaces 4 Mar
Synchronization: clocks, elections, mutual exclusion 9-11 Mar
Spring Break 16, 18 Mar
Consistency: data centric, eventual 23-25 Mar
Fault tolerance: resilience, reliable group communication, recovery 30 Mar - 01 Apr
Distributed file systems and distributed storage systems: Google File System, Bigtable, NFS ... 6-13 Apr
Workflow systems and geoscience informatics: research topics 15-20 Apr
Student presentations of synthesis paper (grad only) 22-29 Apr
Final Exam Mon 3 May 7:15-9:15 p.m.

Textbook and materials: The course textbook is Distributed Systems: Principles and Paradigms, 2nd Ed., by Andrew S. Tanenbaum and Maarten van Steen, Prentice Hall, 2007. You are strongly advised to get the book. Other readings will come from conference and journal papers that can be downloaded from sources such as the IEEE Digital Library, the ACM Digital Library, or CiteSeer. We will be using the Oncourse site for this course.

Abstracts

You will write abstracts for assigned readings from papers; there will be about a dozen papers in all. The abstract serves to organize your thoughts for the class discussion. Each abstract should be about 500 words in length and should (i) identify the problem being solved, (ii) identify the solution the author proposes and how the author validates it, and (iii) provide an assessment of the importance of the work. Abstracts will be submitted via Oncourse and are due at the beginning of the class in which the paper is discussed.

B534 enrollees are responsible for submitting abstracts for all 12 papers; B490 enrollees are responsible for submitting abstracts for 9 of the 12 papers.

Required Readings
Goals of Distributed Systems (12-14 Jan) Chapter 1, Sections 1.1 - 1.3
Architectures (19-28 Jan) Chapter 2, Sections 2.1, 2.2
Curbera, F., et al. Unraveling the Web Services Web: an Introduction to SOAP, WSDL, and UDDI, IEEE Internet Computing, 6, 2, Mar/Apr 2002 [Link]
Curbera, F. et al. The Next Step in Web Services: How three specifications support creating robust service compositions, CACM 46, 10, Oct 2003 [Link]
Armbrust, M. et al. Above the Clouds: A Berkeley View of Cloud Computing, U Calif Berkeley Tech Report UCB/EECS-2009-28, Feb 2009 [Link]
Virtualization and Communication (2-18 Feb) Chapter 3, Sections 3.1 - 3.4
Chapter 4, Sections 4.1 - 4.3
Barham, P. et al., Xen and the Art of Virtualization, ACM Symposium on Operating Systems Principles, 2003. [Link]
Performance Evaluation (23-25 Feb) Vivek S. Pai, Peter Druschel, and Willy Zwaenepoel, Flash: An Efficient and Portable Web Server, Proceedings of the USENIX 1999 Annual Technical Conference, Monterey, CA, June 1999 [Link]
Naming (2-4 Mar) Chapter 5, Sec 5.1 – 5.4
Synchronization (9-11 Mar) Chapter 6, Sec 6.1, 6.3
Lamport, L., Time, Clocks, and the Ordering of Events in a Distributed System, Communications of the ACM, 21, 7, Jul 1978 [Link]
Consistency (23-25 Mar) W. Vogels, Eventually Consistent, Communications of the ACM, 52, 1, Jan 2009 [Link]
Terry, D.B., et al. Session Guarantees for Weakly Consistent Replicated Data, Proceedings of the ACM Third International Conference on Parallel and Distributed Information Systems, 1994 [Link]
Fault Tolerance (30 Mar – 01 Apr) Chapter 8, Sections 8.1, 8.2, 8.3, 8.6
Distributed File and Storage Systems (6 – 13 Apr) Chapter 11, 11.1 – 11.9
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber, Bigtable: A Distributed Storage System for Structured Data, OSDI 2006. [Link]
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, The Google File System, 19th ACM Symposium on Operating Systems Principles, Lake George, NY, October 2003. [Link]
Santry, D., et al. Deciding When to Forget in the Elephant File System, ACM Symposium on Operating Systems Principles, 1999 [Link]
Workflow Systems and Geoscience Informatics (15-22 Apr) Paper TBA

Projects

The course includes three projects: two programming projects and one synthesis paper. Students enrolled in B490 will do the two programming projects but not the synthesis paper.

The programming projects require prior programming experience and will grow your skills at systems programming. Distributed systems today are too large for any one person to write, so the systems programmer must be comfortable working with APIs, libraries, and code from other programmers and other organizations. You will likely work in Java on a Linux platform, though you may choose other languages or platforms. The programming projects are group projects. The project grade will be based on a demo, the quality of the code, and a written report.

For the synthesis paper (B534 students), you will research an area of distributed systems by selecting and reading three related conference papers from selective conference venues. From these readings you will develop a taxonomy and use the taxonomy to guide you in structuring your paper. The synthesis paper can be a breakthrough experience in independent scholarship for a student. When coupled with one of the projects or outside work, the synthesis paper can provide a path for independent scholarship beyond the spring semester.

The course prerequisite is CSCI P536 Advanced Operating Systems, CSCI P436 Operating Systems, or consent of the instructor. The prerequisite is there because the programming projects are intended as a systems programming experience that builds on core competency in synchronization, concurrency, file systems, and single-image programming. If you think you have the requisite skills but have not taken P536 or P436, talk to the instructor.

Grading

The course grade is determined by the student's performance in three areas: projects (50%), readings and discussion (25%), and exams (25%).

Academic Misconduct

Your academic conduct while taking this course is bound by the IU Code of Student Rights, Responsibilities, and Conduct. In particular, Part II discusses your responsibility to uphold and maintain academic and professional honesty and integrity: http://www.iu.edu/~code/code/responsibilities/academic/index.shtml