Current Projects


Code2Inv: Learning to Verify and Synthesize Programs

Automated tools to verify and synthesize programs rely on hand-engineered search heuristics that hinder them in aspects such as flexibility, scalability, extensibility, and naturalness of generated solutions. Code2Inv is a deep learning framework that aims to overcome these limitations. We have demonstrated the framework on two kinds of tasks: finding loop invariants in small C programs and solving CHC constraints.

Website: http://code2inv.cis.upenn.edu/

Papers:

Hoppity: Learning to Find and Fix Bugs in Programs

Automated code assistants stand to greatly aid programmers and improve software quality. Hoppity is a deep learning framework that aims to detect and fix a broad range of bugs in programs. It is designed for use in real-time while developing or committing code.

Website: http://hoppity.cis.upenn.edu/

Papers:

Improving Software Security via Program Debloating

Prevalent software engineering practices have significantly increased the complexity and bloat of software. This in turn has led to decreased performance and increased security vulnerabilities. This project is addressing a multitude of research problems to effectively counter this trend.

Chisel is a source-level system for debloating C/C++ programs. It automatically removes unnecessary functionality guided by a high-level specification of the desired program.

Website: http://chisel.cis.upenn.edu/

Papers:

Learning Logic Programs from Structured Data

The problem of synthesizing logic rule-based programs from structured data has applications in knowledge discovery, program analysis, and databases. This project is developing effective algorithms and tools to synthesize sophisticated declarative programs from structured data ranging from a a few input-output examples to Big Data.

Our current focus is on synthesizing Datalog programs with rich features such as multi-way joins, recursion, invented predicates, and aggregation.

We have curated a set of benchmarks called Datalog Bench and developed a series of prototypes with different capabilities:

  • EGS, based on example-guided synthesis (PLDI 2021)
  • Gensynth, based on evolutionary search (AAAI 2021)
  • Prosynth, based on counterexample-guided inductive synthesis (POPL 2020)
  • Difflog, based on gradient descent-style numerical relaxation (IJCAI 2019); and
  • ALPS, an interactive approach based on combinatorial discrete search (FSE 2018)

Combining Neural Learning and Symbolic Reasoning

Many emerging applications in machine learning and AI involve effectively combining perception and reasoning. This project is developing techniques and tools to synergistically combine neural learning and symbolic reasoning, with a current focus on visual question-answering.

Papers: