NETS 212: Scalable and Cloud Computing (Fall 2017)
Location: 560 Levine Hall
Office hour: Tuesdays 2-3pm (Levine 560)
|Time and location||Tuesdays/Thursdays 4:30-6:00pm
Dhruv Agarwal (firstname.lastname@example.org)
Office hour: Mondays noon-1pm (5th floor GRW bump space)
Max Doppelt (email@example.com)
Office hour: Fridays 2-3pm (5th floor GRW bump space)
Benjamin Judd (firstname.lastname@example.org)
Office hour: Thursdays 3-4pm (5th floor GRW bump space)
Tony Kang (email@example.com)
Office hour: Fridays 1-2pm (5th floor GRW bump space)
Aspyn Palatnick (firstname.lastname@example.org)
Office hour: Mondays 3-4pm (5th floor GRW bump space)
Sumit Shyamsukha (email@example.com)
Office hour: Tuesdays 3-4pm (5th floor GRW bump space)
Robert Zajac (firstname.lastname@example.org)
Office hour: Wednesdays 11am-noon (5th floor GRW bump space)
What is the "cloud"? How do we build software systems and components that scale to millions of users and petabytes of data, and are "always available"?
In the modern Internet, virtually all large Web services run atop multiple geographically distributed data centers: Google, Yahoo, Facebook, iTunes, Amazon, eBay, Bing, etc. Services must scale across thousands of machines, tolerate faults, and support thousands of concurrent requests. Increasingly, the major providers (including Amazon, Google, Microsoft, HP, and IBM) are looking at "hosting" third-party applications in their data centers - forming so-called "cloud computing" services. A significant number of these services also process "streaming" data: geocoding information from cell phones, tweets, streaming video, etc.
This course, aimed at a sophomore with exposure to basic programming within the context of a single machine, focuses on the issues and programming models related to such cloud and distributed data processing technologies: data partitioning, storage schemes, stream processing, and "mostly shared-nothing" parallel algorithms.
|Topics covered||Datacenter architectures, the MapReduce programming model, Hadoop, cloud algorithms (PageRank, adsorption, friend recommendation, TF/IDF), web programming basics (servlets, AJAX, Node.js/Express, Bootstrap), higher-level programming (Hive, Pig Latin), ...|
|Format||The format will be two 1.5-hour lectures per week, plus assigned readings. There will be regular homework assignments and a term project, plus a midterm and a final exam.|
|Prerequisites||
CIS 120, Introduction to Programming
CIS 160, Discrete Mathematics
Co-requisite: CIS 121, Data Structures
|Texts and readings||
Hadoop: The Definitive Guide, Fourth Edition, by Tom White (O'Reilly) (ISBN 978-1-4919-0163-2; read online for free, or buy for approx. $32)
Data-Intensive Text Processing with MapReduce, by Jimmy Lin and Chris Dyer (Morgan & Claypool) (ISBN 978-1608453429; read online for free, or buy for approx. $40)
Additional materials will be provided as handouts or in the form of light technical papers.
|Grading||Homework 30%, Term project 30%, Exams 35%, Participation 5%|
You are encouraged to discuss your homework assignments with your classmates; however, any code you submit must be your own work. You may not share code with others or copy code from outside sources, except where the assignment specifically allows it.
Violations of this policy can have serious consequences.
We will be using Piazza for course-related discussions.
|Term project||In three-person teams, build a small Facebook-like application using Node.js and Amazon's SimpleDB. Based on network analysis, the application should make friend recommendations; it should also visualize the social network.|
|Facebook award||In previous years, Facebook sponsored an award for the best term project. You can learn more about the winners in the Hall of Fame.|
|Assignments||Homework assignments will be available for download; you can submit your solution here. If necessary, you can request an extension.|
|Lab sessions||The TAs may occasionally hold lab sessions to provide additional help with topics related to the class.|
Below is the tentative schedule for the course: