Date Topic Reading
1/10/18 Course overview Slides
1/17/18 Conjunctive queries and Datalog Slides
Ch. 24 (Deductive Databases) of Database Management Systems by Ramakrishnan and Gehrke.
Alternatively, Chapter 3 of Principles of Database and Knowledge-Base Systems: Volume 1 by J.D. Ullman (available here)
or Ch. 12, 13:1-3, 15:1-3 of Foundations of Databases by Abiteboul, Hull and Vianu (available here).
1/22/18 Datalog, recursion and negation Slides
1/24/18 Class cancelled
1/29/18 Datalog, recursion and negation, cont.
Applications: query optimization, data integration
Answering queries using views: A survey by Alon Halevy. VLDB J. 10:4, pp. 270-294. (Focus on Sec. 1-4, no summary necessary.)
Slides (negation) (views)
1/31/18 Semi-structured data: XML and XQuery Slides
XQuery tutorial
2/5/18 Semi-structured data: JSON, MongoDB and JSONiq Slides
MongoDB and JSONiq
2/7/18 Semi-structured data: JSON, MongoDB and JSONiq, cont.
Datalog and Big Data Analytics (Qizhen Zhang)
Paper summary and discussion: Big Data Analytics with Datalog Queries on Spark, SIGMOD 2016.
2/12/18 PowerGraph (Max) Paper summary and discussion: PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs by Gonzalez, Low, Gu, Bickson and Guestrin. OSDI 2012: 17-30.
2/14/18 Data Citation (Yinjun Wu) Data Citation: Giving credit where credit is due
2/19/18 Graph databases: Foundations, RDF/SPARQL, Neo4J Foundations of Modern Query Languages for Graph Databases, by Angles, Arenas, Barcelo, Hogan, Reutter, Vrgoc. ACM Computing Surveys 50:5, 2017.
See also this analysis.
2/21/18 Pregel (Qizhen Zhang) Paper summary and discussion: Pregel: A System for Large-Scale Graph Processing, by Malewicz, Austern, Bik, Dehnert, Horn, Leiser and Czajkowski, SIGMOD 2010.
2/26/18 Class cancelled
2/28/18 Data Streams: Query languages, mining
Data Streams: STREAM (Davidson)
Query Languages and Data Models for Database Sequences and Data Streams, by Law, Wang, Zaniolo. VLDB 2004.
Mining Data Streams: A Review by Gaber, Zaslavsky and Krishnaswamy. SIGMOD Record 34:2, 2005
STREAM: The Stanford Data Stream Management System, by Arasu, Babcock, Babu, Cieslewicz, Datar, Ito, Motwani, Srivastava and Widom.
3/5-7/18 Spring break
3/12/18 Discretized Streams (Hongru Du) Paper summary and discussion: Discretized Streams: Fault-Tolerant Streaming Computation at Scale, by Zaharia, Das, Li, Hunter, Shenker, Stoica. SOSP 2013.
3/14/18 Versioning XML (Davidson) Archiving Scientific Data, by Buneman, Khanna, Tajima and Tan.
3/19/18 Streaming Environments: Drizzle (Max) Paper summary and discussion: Drizzle: Fast and Adaptable Stream Processing at Scale, by Venkataraman, Armbrust, Panda, Ghodsi, Ousterhout and Franklin. SOSP 2017.
3/21/18 Streaming Environments: Flink (Hongru Du) Paper summary and discussion: Apache Flink: Stream and Batch Processing in a Single Engine, by Carbone, Katsifodimos, Ewen, Markl, Haridi, Tzoumas. IEEE Data Eng. Bull. 38:4, 2015.
3/26/18 Transactions and NoSQL (Davidson) CAP Twelve Years Later: How the 'Rules' Have Changed, by Brewer. Computer 45:2, pp. 23-29, Feb. 2012.
Eventually Consistent, by Vogels. Communications of the ACM 52:1, pp. 40-44, 2009.
Don't Settle for Eventual Consistency, CACM 57:5, May 2014.
See also this blog.
3/28/18 Distributed databases: Spanner (Leshang Chen) Paper summary and discussion: Spanner: Google's Globally-Distributed Database, OSDI 2012.
Note that Spanner: Becoming a SQL System (SIGMOD 2017) will be presented by Qizhen Zhang during the Database Group Meeting on 3/2/18.
4/2/18 Storage (Azure: Hui Lyu) Paper summary and discussion: Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics, SIGMOD 2017.
4/4/18 DS ethics and databases (Julia Stoyanovich) Towards a platform for responsible Data Science, by Stoyanovich, Howe, Abiteboul, Miklau, Sahuguet and Weikum. SSDBM 2017.
DataSynthesizer: Privacy-preserving synthetic datasets, by Ping, Stoyanovich and Howe. SSDBM 2017.
4/9/18 DS ethics and databases, cont. (Hui Lyu) Paper summary and discussion: PrivateClean: Data Cleaning and Differential Privacy, by Krishnan, Wang, Franklin, Goldberg, and Kraska. SIGMOD 2016.
4/11/18 Main memory databases How to Build a Non-Volatile Memory Database Management System, by Arulraj and Pavlo. SIGMOD 2017
Let's Talk About Storage and Recovery Methods for Non-Volatile Memory Database Systems, by Arulraj, Pavlo and Dulloor. SIGMOD 2015.
4/16/18 Main memory databases, cont. (Leshang Chen) Paper summary and discussion: Revisiting Reuse in Main Memory Database Systems, SIGMOD 2017
ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads, by Kim, Wang, Johnson and Pandis.
4/18/18 Main memory databases, cont. Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last, by Menon, Mowry and Pavlo. VLDB 2017.
Write-limited sorts and joins for persistent memory
4/23/18 Project presentations
4/25/18 Project presentations