CIS 505: Software Systems (Spring 2018)

This course introduces the fundamental concepts of distributed systems and the design principles for building large-scale computational systems.

We will study some of the key building blocks – such as synchronization primitives, group communication protocols, and replication techniques – that form the foundation of modern distributed systems, such as cloud-computing platforms or the Internet. We will also look at some real-world examples of distributed systems, such as GFS, MapReduce, Spark, and Dynamo, and we will gain some hands-on experience with building and running distributed systems.

CIS 505 is one of the core courses in the MSE and EMBS programs, and its final exam qualifies as one of the four WPE-I exams in the PhD program.


Instructor:
Linh Thi Xuan Phan
Office hours: Wednesdays 1:00-2:00pm (Levine 464)

When and where:
MW 4:30-6:00pm, TOWNE 100

Teaching assistants:

Sarvesh Surana
Office hours: Mondays 2:00-3:30pm
Location: Levine 6th floor bump space

Xiaozhou Pu
Office hours: Mondays 3:00-4:00pm
Location: Levine 5th floor bump space

Krishna Bharathala
Office hours: Tuesdays 1:30-2:30pm
Location: Levine 5th floor bump space

Juncheng Chen
Office hours: Tuesdays 4:00-5:00pm
Location: Levine 6th floor bump space

Alexander Thurston
Office hours: Wednesdays 9:30-10:30am
Location: Levine 5th floor bump space

Oshin Agarwal
Office hours: Wednesdays 3:00-4:00pm
Location: Levine 5th floor bump space

Devesh Dayal
Office hours: Thursdays 12:00-1:30pm
Location: Levine 6th floor bump space

Natasha Narang
Office hours: Thursdays 1:30-2:30pm
Location: Levine 6th floor bump space

Jacob Kahn
Office hours: Thursdays 2:00-3:00pm
Location: Levine 5th floor bump space

Nikheel Savant
Office hours: Fridays 2:00-3:00pm
Location: Levine 6th floor bump space

Saeed Abedi
Office hours: Fridays 3:00-4:00pm
Location: Levine 6th floor bump space

Thomas Greening
Office hours: Fridays 4:00-5:00pm
Location: Levine 5th floor bump space

Course policies

Course textbook:
Distributed Systems: Principles and Paradigms, 3rd edition (by M. van Steen and A. Tanenbaum; ISBN 978-1543057386). You can get a digital version of this book for free; hardcopies are available, e.g., from Amazon. Additional material will be drawn from selected research publications.

An undergraduate course in either networking or operating systems is required. You should also be comfortable programming in C/C++.

The course will involve three substantial programming assignments, a group project, a midterm, and a final examination.

Your letter grade will be based on the programming assignments (30%), the group project (25%), the midterm exam (15%), the final exam (25%), and your participation (5%).


We will be using Piazza for all course-related discussions.

Homework assignments and the project are available for download; you can submit your solutions online. If necessary, you can request an extension for your homework.

PennCloud Award

Winners of Fall 2017 PennCloud Award
Bhairavi Mehta, Sarvesh Surana, Animesh Shah, Mihir Pattani, and Swathi Rajanna.
Fall 2017 PennCloud Project
Example services of the best PennCloud platform.

The Fall 2017 PennCloud Award went to Bhairavi Mehta, Mihir Pattani, Swathi Rajanna, Animesh Shah, and Sarvesh Surana for the best final project. The team presented a solid cloud platform backed by a highly scalable, fault-tolerant key-value datastore that supports strong consistency and efficient replication, checkpointing, and recovery. The platform offers a diverse set of feature-rich services, including a webmail service that supports remote users, email attachments, mail folders, and address books; a storage service that supports uploading, downloading, and sharing large files in any format; and a one-to-one and group chat service with FIFO ordering semantics.

Schedule (tentative)

Date | Topic | Details | Reading | Remarks
Jan 10 | Introduction | Course overview | Chapter 1 |
Jan 15 | No class (Martin Luther King, Jr. Day) | | | HW0
Jan 17 | Processes and threads | Basic concepts; The UNIX model; Implementation in the kernel | Chapter 3.1 (Sections 1+2) |
Jan 22 | System calls | System calls; The file API; Kernel entry/exit | | HW0 due; HW1
Jan 24 | Concurrency control | Synchronization primitives; Race conditions, critical sections; Deadlock and starvation | |
Jan 29 | Synchronization | Semaphores; Classical synchronization problems; Monitors and condition variables | |
Jan 31 | Communication | Sockets; Socket programming; Handling multiple connections | Chapters 4.1+4.3 + 3.1 (Section 3) |
Feb 5+7 | Remote Procedure Calls | Programming model; Stub code, marshalling, binding; Handling failures | Chapters 4.2+8.3 | HW1 due; HW2
Feb 12 | Naming | Kinds of names, name spaces; The Domain Name System | Chapter 5 |
Feb 14 | Clock synchronization | Logical clocks; Distributed mutual exclusion | Chapters 6.1–6.3 |
Feb 16 | Last day to drop | | | HW2MS1 due
Feb 19 | Distributed coordination | Distributed mutual exclusion; Leader election; Bully algorithm, token ring | Chapter 6.4 |
Feb 21 | Group communication | Reliable multicast; IP multicast; FIFO, causal and total ordering | Chapter 8.4 |
Feb 26 | Class is canceled – Linh is at NSF Meeting | | |
Feb 28 | Midterm | The midterm exam will cover topics from the first lecture | |
Mar 3–11 | Spring break | | | HW2MS2+3 due (on 3/7)
Mar 12 | Group communication (cont.) | Algorithms for FIFO, causal and total ordering | Chapter 8.4 | HW3; Project
Mar 14 | Replication | Primary/backup protocols; Quorum protocols; Sequential and causal consistency; Client-centric models | Chapter 7 |
Mar 19 | Bigtable and Project | Bigtable case study; Project overview | |
Mar 21 | No class – University is closed | | |
Mar 26 | Fault tolerance | 2PC and 3PC; Logging and recovery; Chandy-Lamport algorithm | Chapters 8.5+8.6; [Chandy-Lamport] | HW3 due (on 3/27)
Mar 28 | State-machine replication | Failure models; The Consensus problem | Chapters 8.1+8.2; [Paxos] |
Mar 30 | Last day to withdraw | | |
Apr 2 | Non-crash Fault Tolerance | The Byzantine Generals problem; Impossibility results | |
Apr 4 | File systems | File operations, name space; Data structures on disk; Space management | |
Apr 5 | Distributed file systems | NFS; Disconnected operation | Chapter 2.4.2; [Coda] |
Apr 9 | | | |
Apr 11 | Class is canceled – Linh is at CPSWeek | | |
Apr 16 | Google File System | Google cluster architecture; Reading and writing in GFS; Consistency and fault tolerance | [Cluster] [GFS] |
Apr 18 | MapReduce | MapReduce programming model; System architecture | |
Apr 23 | Spark | Differences to MapReduce; Case study: PageRank | |
Apr 25 | DHTs and Dynamo | Distributed hash tables; The CAP dilemma; Amazon Dynamo | |
Apr 26–27 | Reading days | | |
May 2 | Final Exam (6-8pm) | The final exam will cover all topics, from Jan 10 through Apr 25 | |
Apr 30–May 8 | Project demos and reports | | |
Web site contact: Linh Thi Xuan Phan