CIS 700-003: Big Data Analytics (Spring 2017, Beta version)

This is an archived Web site. For the current offering, please see the CIS 545 course site.

All assignments here are for reference only, and are subject to change in future years!

Assignment 0 Getting Started. This simple assignment gets you started with Docker and Jupyter, as well as our course submission site.
Assignment 1 Data Wrangling. Learn to read, save, combine, and clean data in Pandas and SQL. Integrate across sources. Conduct simple analyses of data.
Assignment 2 Big Data, Graph Data. Learn to use Apache Spark. Traverse graph data. Compute measures of graph centrality. Recommend friends in a social network.
Assignment 3 The Cloud, Matrices, and Arrays. Learn to use Spark on Elastic MapReduce. Perform computations over graphs and documents using NumPy. Learn a bit about document vectors and information retrieval.
Assignment 4 Optimization, Search, and Clustering. Learn algorithmic primitives such as gradient descent, search, genetic algorithms, dynamic programming, and clustering. Along the way you'll build a simple artificial neuron and learn about gene sequence alignment.
Assignment 5 Classification and TensorFlow. Build a spam classifier, experimenting with different classification methods and ensembles. Construct a neural network in Google's TensorFlow.
Assignment 6 Tuning, Time-Series Data and Visualization. Experiment with tuning classifiers for detecting seizures in dog EEG data. Visualize spatiotemporal activity from earthquakes.
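
To give a flavor of the primitives mentioned under Assignment 4, here is a minimal sketch of a single artificial neuron trained with gradient descent. It is not taken from the assignment itself: the function names (`train_neuron`, `predict`), the sigmoid activation, the log-loss objective, the learning rate, and the toy logical-OR data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, labels, lr=0.5, epochs=2000):
    """Fit one sigmoid neuron with full-batch gradient descent on log loss.

    Hypothetical helper for illustration; lr and epochs are arbitrary choices.
    """
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y  # derivative of log loss w.r.t. the pre-activation
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Step against the average gradient.
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def predict(weights, bias, x):
    """Threshold the neuron's output at 0.5 to get a 0/1 label."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Toy data (an assumption for this sketch): the logical OR function.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 1]

weights, bias = train_neuron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

Logical OR is linearly separable, so a single neuron is enough here; a function such as XOR would need more than one neuron.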