CIS 650: Implementation of Data Management Systems (Spring 2014)

Location

512 Levine Hall, Monday/Wednesday 3:00-4:30



Instructor



Zachary Ives
Location: 576 Levine Hall North
Office hours: Wed 4:30-5:30 and by appointment

Piazza: piazza.com/penn/spring2014/cis650

Description

What are the basic algorithms, architectures, and principles of building high-performance, reliable systems for processing large volumes of (semi)structured data? Examples include database systems, graph data stores, data integration systems, search engines, cloud or cluster compute engines, and more. Such systems often separate the set of operations to be performed from their algorithmic execution; they might provide consistency and atomicity semantics (as well as other "ACID" properties); and they frequently manage data that is too large to fit in memory. This course will focus on these questions, using foundational research papers, a recent text on data integration and data management for the Web, as well as recent papers from journals and conferences.

Prerequisites

CIS 550 (or equivalent) and basic familiarity with the relational data model, algebra, and calculus are required. Programming ability in Java, C, or C# is also required.

Format

The format will be two one-and-a-half-hour lectures a week, plus assigned readings from the textbook and supplementary materials. In general, a one-page summary/review of each paper must be submitted at least one hour prior to class. In some cases, students will be expected to lead the discussion and/or analysis of a paper. In some weeks, we may assign groups to argue for the significance or superiority of one paper vs. the others. Additionally, there will be a research-oriented course project that will take the place of a final exam. The course project will include an implementation with experimental validation, a project report, and a brief (~15-minute) presentation.

Texts
  • Principles of Data Integration, AnHai Doan, Alon Halevy, and Zachary Ives, Morgan-Kaufmann
  • Readings in Database Systems, Joseph M. Hellerstein and Michael Stonebraker, editors, 4th ed, MIT Press. (Direct links to papers from this book will generally be provided.)
  • Additional materials will be provided in the form of technical papers.

Grading

In addition to paper summaries/reviews (20%), there will be a "midterm report" that synthesizes and comments on one of the topics of the first half of the semester (25%), a final project (50%), and participation (5%).

The schedule is available in (frequently updated) electronic form here.

You should also read this document on how to read and review technical papers.

Project ideas are available in (frequently updated) electronic form here.