Date Topic Reading
1/10/18 Course overview Slides
1/17/18 Conjunctive queries and Datalog Slides
Ch. 24 (Deductive Databases) of Database Management Systems by Ramakrishnan and Gehrke.
Alternatively, Chapter 3 of Principles of Database and Knowledge-Base Systems: Volume 1 by J.D. Ullman (available here)
or Ch. 12, 13:1-3, 15:1-3 of Foundations of Databases by Abiteboul, Hull and Vianu (available here).
1/22/18 Datalog, recursion and negation Slides
1/29/18 Datalog, recursion and negation, cont.
Applications: query optimization, data integration
Answering queries using views: A survey by Alon Halevy. VLDB J. 10:4, pp. 270-294. (Focus on Sec. 1-4, no summary necessary.)
Slides (negation) (views)
1/31/18 Semi-structured data: XML and XQuery Slides
XQuery tutorial
2/5/18 Semi-structured data: JSON, MongoDB and JSONiq Slides
MongoDB and JSONiq
2/7/18 Semi-structured data: JSON, MongoDB and JSONiq (cont.)
Datalog and Big Data Analytics (Qizhen Zhang)
Paper summary and discussion: Big Data Analytics with Datalog Queries on Spark, SIGMOD 2016.
2/12/18 PowerGraph (Max) Paper summary and discussion: PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs by Gonzalez, Low, Gu, Bickson and Guestrin. OSDI 2012: 17-30.
2/14/18 Data Citation (Yinjun Wu) Data Citation: Giving credit where credit is due
2/19/18 Graph databases: Foundations, RDF/SPARQL, Neo4J Slides
Foundations of Modern Query Languages for Graph Databases, by Angles, Arenas, Barceló, Hogan, Reutter and Vrgoč. ACM Computing Surveys 50:5, 2017.
See also this analysis.
2/21/18 Pregel (Qizhen Zhang) Paper summary and discussion: Pregel: A System for Large-Scale Graph Processing, by Malewicz, Austern, Bik, Dehnert, Horn, Leiser and Czajkowski, SIGMOD 2010.
2/28/18 Data Streams: STREAM (Davidson) Slides
STREAM: The Stanford Data Stream Management System, by Arasu, Babcock, Babu, Cieslewicz, Datar, Ito, Motwani, Srivastava and Widom.
3/5-7/18 Spring break
3/12/18 Discretized Streams (Hongru Du) Paper summary and discussion: Discretized Streams: Fault-Tolerant Streaming Computation at Scale, by Zaharia, Das, Li, Hunter, Shenker and Stoica. SOSP 2013.
3/14/18 Versioning XML (Davidson) Slides
Archiving Scientific Data, by Buneman, Khanna, Tajima and Tan.
3/19/18 Streaming Environments: Drizzle (Max) Paper summary and discussion: Drizzle: Fast and Adaptable Stream Processing at Scale, by Venkataraman, Armbrust, Panda, Ghodsi, Ousterhout and Franklin. SOSP 2017.
3/26/18 Streaming Environments: Flink (Hongru Du) Paper summary and discussion: Apache Flink: Stream and Batch Processing in a Single Engine, by Carbone, Katsifodimos, Ewen, Markl, Haridi and Tzoumas. IEEE Data Eng. Bull. 38:4, 2015.
3/28/18 Distributed databases: Spanner (Leshang Chen) Paper summary and discussion: Spanner: Google's Globally-Distributed Database, OSDI 2012.
Note that Spanner: Becoming a SQL System (SIGMOD 2017) will be presented by Qizhen Zhang during the Database Group Meeting on 3/2/18.
4/2/18 Distributed file services: Azure (Hui Lyu) Paper summary and discussion: Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics, SIGMOD 2017.
4/4/18 Updates (Mert) Paper summary and discussion: bLSM: A General Purpose Log Structured Merge Tree, by Sears and Ramakrishnan. SIGMOD 2012.
4/9/18 DS ethics and databases (Hui Lyu) Paper summary and discussion: PrivateClean: Data Cleaning and Differential Privacy, by Krishnan, Wang, Franklin, Goldberg, and Kraska. SIGMOD 2016.
4/11/18 DS ethics and databases (Julia Stoyanovich) Towards a platform for responsible Data Science, by Stoyanovich, Howe, Abiteboul, Miklau, Sahuguet and Weikum. SSDBM 2017.
DataSynthesizer: Privacy-preserving synthetic datasets, by Ping, Stoyanovich and Howe. SSDBM 2017.
4/16/18 Main memory databases (Mert) Paper summary and discussion: How to Build a Non-Volatile Memory Database Management System, by Arulraj and Pavlo. SIGMOD 2017.
4/18/18 Main memory databases, cont. (Leshang Chen) Paper summary and discussion: Revisiting Reuse in Main Memory Database Systems, SIGMOD 2017.
4/23/18 Writing a compelling conference/journal paper Slides (Material drawn from Black, Leen and Maier's course on Scholarship Skills.)
4/25/18 Project presentations