




CIS 700003: Big Data Analytics (Spring 2017, Beta version)

 





Assignment 0 

Getting Started.
This simple assignment gets you started with Docker and Jupyter, as well as our
course submission site.

Assignment 1 

Data Wrangling.
Learn to read, save, combine, and clean data in Pandas and SQL. Integrate across sources. Conduct simple analyses of data.

Assignment 2 

Big Data, Graph Data.
Learn to use Apache Spark. Traverse graph data. Compute measures of graph centrality. Recommend friends in a social network.
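
For a flavor of the graph measures involved, here is a minimal plain-Python sketch (not Spark, and not the assignment's own code). It assumes an undirected friendship graph given as a list of edge pairs. The function names `degree_centrality` and `recommend_friends`, and the choice to rank candidates by number of mutual friends, are illustrative assumptions.

```python
from collections import defaultdict


def build_neighbors(edges):
    """Map each node to the set of its neighbors in an undirected graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
    return neighbors


def degree_centrality(edges):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    neighbors = build_neighbors(edges)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def recommend_friends(edges, person):
    """Rank non-friends of `person` by how many friends they share."""
    neighbors = build_neighbors(edges)
    friends = neighbors.get(person, set())
    mutual = defaultdict(int)
    for friend in friends:
        for candidate in neighbors[friend]:
            if candidate != person and candidate not in friends:
                mutual[candidate] += 1
    # Most mutual friends first; ties broken alphabetically.
    return sorted(mutual, key=lambda c: (-mutual[c], c))


# Hypothetical toy network
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
centrality = degree_centrality(EDGES)
suggestions = recommend_friends(EDGES, "d")
```

In a Spark setting the same idea would typically be expressed as operations over a distributed collection of edges; the sketch only shows the logic on a single machine.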

Assignment 3 

The Cloud, Matrices, and Arrays.
Learn to use Spark on Elastic MapReduce. Perform computations over graphs and documents using NumPy.
Learn a bit about document vectors and information retrieval.

Assignment 4 

Optimization, Search, and Clustering.
Learn algorithmic primitives such as gradient descent, search, genetic
algorithms, dynamic programming, and clustering. Along the way you'll build a
simple artificial neuron and learn about gene sequence alignment.
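
As a rough illustration of how gradient descent and an artificial neuron fit together, the following sketch trains a single sigmoid neuron by batch gradient descent on a toy problem. It is an assumption-laden example, not the assignment's code: the learning rate, epoch count, cross-entropy loss, and the logical-OR data set are all chosen here for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, lr=0.5, epochs=2000):
    """Fit the weights and bias of one sigmoid neuron by batch gradient descent."""
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    b = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in samples:
            pred = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = pred - y  # gradient of cross-entropy loss w.r.t. the pre-activation
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Step against the average gradient.
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b


def predict(w, b, x):
    """Threshold the neuron's output at 0.5."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical toy data: logical OR of two inputs
OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_neuron(OR_DATA)
predictions = [predict(weights, bias, x) for x, _ in OR_DATA]
```

Because logical OR is linearly separable, a single neuron is enough here; problems that are not linearly separable need more than one unit.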

Assignment 5 

Classification and TensorFlow.
Build a spam classifier, experimenting with different classification
methods and ensembles. Construct a neural network in Google's TensorFlow.

Assignment 6 

Tuning, Time Series Data, and Visualization.
Experiment with
tuning classifiers for detecting seizures in dog EEG data. Visualize
spatiotemporal activity from earthquakes.
