CSCI B534 Distributed Systems

School of Informatics, Prof. Plale
Spring 2009; Tue and Thur 2:30-3:45 p.m., BH 321


Much of the computing taking place today is distributed. Web services, cloud computing, virtualization, peer-to-peer and Internet computing all have distributed systems concepts at their foundation. In this course we study the foundational concepts of distributed systems, foundations that you need to move from technology to technology: cloud computing today, and something entirely different a decade from now. You will also get hands-on programming experience with today's technologies, specifically cloud computing and web services for large-scale data. The course is intended to give the computer science student (or other serious technologist) a balanced experience that will grow their foundational understanding of distributed systems, and at the same time provide valuable hands-on experience that puts the foundational understanding in a context that can be of immediate use.

Instructor: Professor Beth Plale (with Yiming Sun and Eran Chinthaka)
Associate Instructors: Yiming Sun, Sharanya Chinnusamy
Plale: LH301C, 812-855-4373, e-mail, office hours: 1:00 - 2:00pm, Tuesdays and Thursdays
Yiming: LH301H, e-mail, office hours: 3:00 - 4:30pm, Mondays and Wednesdays, or by appointment.

Topics and Agenda
Goals of distributed systems (e.g., transparencies) 13-15 Jan
Architectures (client-server, peer-to-peer, autonomic computing; programming models: Dryad and Map-Reduce) 20-29 Jan
Virtualization (VMs) and Communication 3-19 Feb
Cloud computing and Web Services 24 Feb-5 Mar
Naming (global naming, name spaces) 10-12 Mar
Spring Break 17-19 Mar
Synchronization (clocks, elections, mutual exclusion) 24-26 Mar
Consistency 31 Mar-9 Apr
Fault tolerance 14-16 Apr
Distributed file systems and models (e.g., Google File System, File system workloads) 21-23 Apr
Student presentations of synthesis paper 28 Apr-5 May (final exam period 2:45-4:45)

Textbook and materials: The textbook is Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems: Principles and Paradigms, 2nd Ed., Prentice Hall, 2007. You are strongly advised to get the book. Other readings will come from conference and journal papers that can be downloaded from sources such as the IEEE Digital Library, the ACM Digital Library, or Citeseer.

Abstracts

You will write abstracts for assigned readings from papers. The abstract serves to organize your thoughts for the class discussion. It should be about 500 words in length and should i.) identify the problem being solved, ii.) identify the solution the author proposes and how the author validates it, and iii.) provide an assessment of the importance of the work.

Projects

The course includes three projects: two programming projects and one synthesis paper. The programming projects require prior programming experience and will grow your systems programming skills. Distributed systems today are too large for any one person to write, so the systems programmer must be comfortable working with APIs, libraries, and code from other programmers and other organizations. We will likely work in Java on a Linux platform.

For the synthesis paper, you will research an area by selecting and reading three related works from highly selective conference or journal venues, then construct a taxonomy to use as an organizing framework for the paper. The synthesis paper can be a breakthrough experience in independent scholarship for a student. When coupled with one of the projects or outside work, the synthesis paper can provide a path for independent scholarship beyond the spring semester.

The course prerequisite is CSCI P536 Advanced Operating Systems or consent of instructor. The prereq is there because the programming projects are intended as a systems programming experience that builds off core competency in synchronization, concurrency, file systems, and single-image programming. If you think you've got the requisite skills but haven't taken P536, let me know.

Grading

The course grade is determined by the student's performance in three areas: projects (60%), readings and discussion (30%), and problem solving (homework, 10%). There will be no exams.