CIS 700: Software Analysis and Testing (Fall 2017)

[Recent News] [Logistics] [Syllabus] [Schedule] [Homeworks] [Project]

Recent News

Logistics

Time: Mondays and Wednesdays, 1:30-3 pm
Location: Moore Building 212
Instructor: Mayur Naik
Email: mhnaik@cis.upenn.edu
Office: Levine 610
Office hours: 11-12 Tuesdays (or by appointment)
Teaching Assistant: Xujie Si
Email: xsi@cis.upenn.edu
Office: Levine 613
Office hours: 2-3 Thursdays (or by appointment)

Syllabus

Course Overview: Aspects of software development besides programming, such as diagnosing bugs, testing, and debugging, comprise over 50% of development costs. Modern technology has come a long way in aiding programmers with these aspects of development, and at the heart of this technology lies software analysis: a body of work that concerns discovering facts about a given program. Many diverse software analysis approaches exist, each with its own strengths and weaknesses. In this course, you will learn the principles underlying these approaches and gain hands-on experience applying them to automate the testing of software and find bugs in complex, real-world programs.

Course Objectives: Upon successfully completing the course, you will be able to:

Evaluation: Evaluation will be based on three components: homeworks, two in-class exams, and a project. Exams will cover only material presented in the lectures and will be open to lecture slides and notes.

Collaboration Policy: Discussing homework solutions is allowed, but each individual's submission must be their own work.

Prerequisites: The course has two prerequisites:

Course Material: There is no required textbook; all relevant materials will be made available online.

Schedule

Note: This is a tentative list subject to revision!

Date Topic Notes
Introduction to software analysis and software testing. Presents the landscape of different techniques, introduces basic concepts, and discusses capabilities and limitations.
Sep 06 Introduction to Software Analysis [PPTX] [PDF] [video]
Sep 11 Introduction to Software Testing [PPTX] [PDF] [video]
Presents techniques for automated testing in important application domains (mobile and multi-threaded applications) and languages (object-oriented and functional).
Sep 13 Random Testing [PPTX] [PDF] [video]
Sep 18, 20 Automated Test Generation [PPTX] [PDF] [video]
Sep 27 Property-Based Testing (Guest lecture by John Hughes) [PPTX] [PDF]
Presents building blocks of dataflow analysis. Covers abstractions and algorithms for analyzing structured language constructs, primitive data, and objects.
Sep 25 Dataflow Analysis [PPTX] [PDF] [video]
TBD Pointer Analysis [PPTX] [PDF] [video]
TBD Type-State Analysis [PPTX] [PDF]
Presents theoretical foundations of static analysis. Introduces topics of constraint-based analysis, type-based analysis, and abstract interpretation.
TBD Constraint-Based Analysis [PPTX] [PDF] [video]
TBD Type-Based Analysis [PPTX] [PDF] [video]
TBD Abstract Interpretation [PPTX] [PDF]
Presents techniques to automate debugging. Covers algorithms for statistical debugging, fault localization, and performance debugging.
TBD Statistical Debugging [PPTX] [PDF] [video]
TBD Delta Debugging [PPTX] [PDF] [video]
TBD Performance Debugging [PPTX] [PDF]
Presents path-sensitive analysis and its incarnation in dynamic symbolic execution and software model checking.
TBD Dynamic Symbolic Execution [PPTX] [PDF] [video]
TBD Software Model Checking [PPTX] [PDF]
Presents advanced topics in dataflow analysis with an emphasis on scaling to very large programs.
TBD Inter-procedural Analysis [PPTX] [PDF]
TBD Sensitivity in Dataflow Analysis [PPTX] [PDF]
TBD Sparsity in Dataflow Analysis [PPTX] [PDF]

Homeworks

Homeworks will be of four kinds: (1) reading technical papers and writing reviews; (2) solving practice questions for exams; (3) applying software analysis/testing tools and reporting your findings; and (4) implementing software analysis/testing algorithms.

Homeworks will be published on Canvas and solutions should be submitted through Canvas.

Project

Project Overview: A hands-on project will span the duration of the course. Students will form teams of up to two and pick a topic in consultation with the instructor. A list of suggested topics is provided below. There will be three milestones to provide feedback on each team's progress. Each team will develop their implementation on GitHub and maintain a technical report to be submitted at each milestone. The project will conclude with an in-class presentation by each team at the end of the semester.

Each project must encompass a problem statement, a formal/conceptual formulation of your proposed solution, a realistic implementation, and an experimental evaluation. The project will be evaluated based on creativity, correctness, comprehensiveness, and clarity.

Project Milestones: All project-related deliverables must be submitted through Canvas by midnight on the dates below.

When    What    Deliverable
TBD    Propose project topic    Project topic description [instructions]
TBD    Finalize project topic
TBD    Project milestone 1    Implementation and report [instructions]
TBD    Project milestone 2
TBD    Project milestone 3
TBD    In-class project presentations    Presentation slides

Suggested Project Topics: Below is a list of project topics. You are not obligated to choose from this list. Feel free to propose your own project topic!

Note: This is a tentative list subject to revision!

Topic    Papers    Tools
Call-Graph Construction Algorithms Frank Tip, Jens Palsberg. Scalable propagation-based call graph construction algorithms. OOPSLA 2000. Chord, Soot, WALA
Static/Dynamic Slicing Algorithms Susan Horwitz, Thomas Reps, David Binkley. Interprocedural Slicing Using Dependence Graphs. ACM Trans. Program. Lang. Syst. 12(1), 1990. Chord
Using Machine Learning for Program Analysis Benjamin Livshits, Aditya Nori, Sriram Rajamani, Anindya Banerjee. Merlin: specification inference for explicit information flow problems. PLDI 2009. Chord plus Infer.NET library
Pointer Analysis Algorithms Michael Hind. Pointer analysis: Haven't we solved this problem yet? PASTE 2001. This is just a survey paper; you will have to pick a more technical paper to present.
Static Taint Analysis for Mobile Apps Omer Tripp, Marco Pistoia, Stephen J. Fink, Manu Sridharan, Omri Weisman. TAJ: effective taint analysis of web applications. PLDI 2009. STAMP (built on Chord)
Dynamic Taint Analysis for Mobile Apps Edward J. Schwartz, Thanassis Avgerinos, David Brumley. All You Ever Wanted to Know about Dynamic Taint Analysis and Forward Symbolic Execution (but Might Have Been Afraid to Ask). IEEE S&P 2010. A3T
Input Generation via Random Testing Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, Thomas Ball. Feedback-Directed Random Test Generation. ICSE 2007. Randoop, Dynodroid
Input Generation via Concolic Testing Patrice Godefroid, Nils Klarlund, Koushik Sen. DART: directed automated random testing. PLDI 2005.
Edward J. Schwartz, Thanassis Avgerinos, David Brumley. All You Ever Wanted to Know about Dynamic Taint Analysis and Forward Symbolic Execution (but Might Have Been Afraid to Ask). IEEE S&P 2010.
Saswat Anand, Mayur Naik, Hongseok Yang, Mary Jean Harrold. Automated Concolic Testing of Smartphone Apps. FSE 2012.
A3T
Dataflow Analysis for Type-State Checking David Greenfieldboyce, Jeffrey S. Foster. Type qualifier inference for Java. OOPSLA 2007. JQual, SAFE
Model Checking for Type-State Checking SLAM or BLAST
Program Performance Prediction Ling Huang, Jinzhu Jia, Bin Yu, Byung-Gon Chun, Petros Maniatis, Mayur Naik. Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression. NIPS 2010.
Sangmin Lee, Byung-Gon Chun, Ling Huang, Petros Maniatis, Mayur Naik. Mantis: Automatic Generation of System Performance Predictors. Technical Report, 2011.
Mantis (built on Chord and GNU Octave)
Program Analysis for Mobile-Cloud Computing Byung-Gon Chun, Sunghwan Ihm, Petros Maniatis, Mayur Naik, Ashwin Patti. CloneCloud: Elastic Execution between Mobile Device and Cloud. EuroSys 2011.
Cong Shi, Mostafa Ammar, Ellen Zegura, Mayur Naik. Computing in Cirrus Clouds: The Challenge of Intermittent Connectivity. SIGCOMM-MCC'12.
Android device emulator
Static Analysis of Concurrent Programs Mayur Naik, Chang-Seo Park, Koushik Sen, David Gay. Effective static deadlock detection. ICSE 2009.
Mayur Naik, Alex Aiken, John Whaley. Effective static race detection for Java. PLDI 2006.
Chord
Dynamic Analysis of Concurrent Programs Madanlal Musuvathi, Shaz Qadeer, Thomas Ball, Gerard Basler, Piramanayagam Arumuga Nainar, Iulian Neamtiu. Finding and Reproducing Heisenbugs in Concurrent Programs. OSDI 2008.
Pallavi Joshi, Mayur Naik, Chang-Seo Park, Koushik Sen. CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs. CAV 2009.
Pallavi Joshi, Mayur Naik, Koushik Sen, David Gay. An effective dynamic analysis for detecting generalized deadlocks. FSE 2010.
CalFuzzer, Chord, CHESS