# Caleb Stanford

I’m beyond excited to be joining the University of California, Davis as an assistant professor of computer science starting in Fall 2023! If you’re interested in doing computer science research, or you just want to chat, send me an email! You can apply to the PhD program at UC Davis here.

I graduated in July 2022 with my PhD in computer science from the University of Pennsylvania, where I was advised by Rajeev Alur. Before that, I got my ScB from Brown University in math and computer science in 2016.

During 2022-2023, I’ll be taking a gap year working with Deian Stefan and the ProgSys group at UCSD.

You can view my CV online or as a pdf.

## Research

My research is in programming languages and systems. Generally, I am interested in any problem space where I think mathematical abstractions will lead to practical impact in the safety, correctness, expressiveness, usability, or security of systems. Some of my past and ongoing projects include:

• Abstractions for processing data streams. How can we write programs that operate on high-rate distributed data streams correctly? My work proposes using partially ordered sets to combine and process streams. Good starting points are DiffStream (OOPSLA 2020) (for a software engineering tool) and Synchronization Schemas (invited paper at PODS 2021) (for some of the theory behind it).

• Derivatives of regular expressions. If you know calculus, you may be familiar with rules like the sum rule and product rule, which say that d(f + g) = df + dg and d(fg) = d(f)g + fd(g), where d is the derivative operator and f and g are functions. It turns out there is an interesting (and practically useful!) theory of similar rules for regular expressions, but where the rules are a bit different: for example, d(AB) = d(A)B + (A∩ε)d(B). If you’re interested, check out our PLDI paper on implementing these things in the Z3 theorem prover.

• Fuzzing programmable networks. One surprisingly effective technique for finding bugs in software is called fuzzing, where we just spam the software with random garbage inputs and see if it crashes or produces any errors. In the FP4 project, we worked on applying fuzzing to programmable network switches (used to program things like firewalls, load balancers, and routing programs, which form the backbone of the internet). Check out pystate, which is a piece of this project that I wrote, or FP4, the main repository for the network fuzzer.

• Models of streaming computation. Programs that process streaming data are fundamentally different mathematical objects from ordinary programs: in theoretical computer science terms, they are more like DFAs and NFAs (finite state machines) than Turing machines (unbounded memory). However, most streaming computation in practice goes beyond finite state machines to other primitive forms of quantitative computation that aren’t quite finite-space. For some of my work on quantitative state machines, see Modular Quantitative Monitoring (POPL 2019).
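
To make the derivative rules for regular expressions mentioned above a bit more concrete, here is a minimal sketch of a matcher built on classical (Brzozowski-style) derivatives. It is not the symbolic Boolean derivative construction from the PLDI paper or the Z3 implementation; the class and function names (`Empty`, `Eps`, `Char`, `Alt`, `Cat`, `Star`, `nullable`, `deriv`, `matches`) are hypothetical and chosen only for illustration. The `Cat` case follows the rule d(AB) = d(A)B + (A∩ε)d(B): the second summand is included only when A accepts the empty string.

```python
from dataclasses import dataclass


# A tiny regular expression AST (illustrative names, not from any library).
@dataclass(frozen=True)
class Empty:      # matches no string at all
    pass

@dataclass(frozen=True)
class Eps:        # matches only the empty string
    pass

@dataclass(frozen=True)
class Char:       # matches the single character c
    c: str

@dataclass(frozen=True)
class Alt:        # union: left + right
    left: object
    right: object

@dataclass(frozen=True)
class Cat:        # concatenation: left followed by right
    left: object
    right: object

@dataclass(frozen=True)
class Star:       # Kleene star: zero or more repetitions of inner
    inner: object


def nullable(r) -> bool:
    """Return True if r accepts the empty string (the 'A ∩ ε' test)."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Char)):
        return False
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Cat):
        return nullable(r.left) and nullable(r.right)
    raise TypeError(f"unknown regex node: {r!r}")


def deriv(r, a: str):
    """Derivative of r with respect to character a: what remains to be matched."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Char):
        return Eps() if r.c == a else Empty()
    if isinstance(r, Alt):
        # Sum rule: d(A + B) = d(A) + d(B)
        return Alt(deriv(r.left, a), deriv(r.right, a))
    if isinstance(r, Cat):
        # Product rule: d(AB) = d(A)B + (A ∩ ε) d(B)
        first = Cat(deriv(r.left, a), r.right)
        if nullable(r.left):
            return Alt(first, deriv(r.right, a))
        return first
    if isinstance(r, Star):
        # d(A*) = d(A) A*
        return Cat(deriv(r.inner, a), r)
    raise TypeError(f"unknown regex node: {r!r}")


def matches(r, s: str) -> bool:
    """Match s by taking one derivative per character, then testing nullability."""
    for ch in s:
        r = deriv(r, ch)
    return nullable(r)


# Example: the regular expression (ab)*
ab_star = Star(Cat(Char("a"), Char("b")))
print(matches(ab_star, "abab"))
print(matches(ab_star, "aba"))
```

This sketch performs no simplification, so the intermediate expressions grow with the input; a practical implementation would normalize terms (for example, rewriting a concatenation with `Empty` to `Empty`) after each step.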

### Publications

*Equal-contribution authors, listed in alphabetical order.

1. FP4: Line-Rate Greybox Fuzz Testing for P4 Switches. Nofel Yaseen, Liangcheng Yu, Caleb Stanford, Ryan Beckett, and Vincent Liu. In submission.

2. A Robust Theory of Series-Parallel Graphs. Rajeev Alur, Caleb Stanford, and Christopher Watson. In submission.

3. Guided Incremental Dead State Detection. Caleb Stanford and Margus Veanes. In submission.

4. Stream Processing with Dependency-Guided Synchronization. Konstantinos Kallas,* Filip Niksic,* Caleb Stanford,* and Rajeev Alur, Principles and Practice of Parallel Programming (PPoPP), February 2022. Extended version; 2-Minute Elevator Pitch (September 2019); Poster (October 2019)

5. Correctness in Stream Processing: Challenges and Opportunities. Caleb Stanford, Konstantinos Kallas, and Rajeev Alur, Conference on Innovative Data Systems Research (CIDR), January 2022. Slides; Video

6. Symbolic Boolean Derivatives for Efficiently Solving Extended Regular Expression Constraints. Caleb Stanford, Margus Veanes, and Nikolaj Bjørner, Programming Language Design and Implementation (PLDI), June 2021. Slides; Talk (lightning and full)

7. Synchronization Schemas. Rajeev Alur, Phillip Hilliard, Zachary Ives, Konstantinos Kallas, Konstantinos Mamouras, Filip Niksic, Caleb Stanford, Val Tannen, and Anton Xue, Principles of Database Systems (PODS), June 2021. Invited paper.

8. DiffStream: Differential Output Testing for Stream Processing Programs. Konstantinos Kallas,* Filip Niksic,* Caleb Stanford,* and Rajeev Alur, Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), November 2020. Slides; Video

9. Streamable Regular Transductions. Rajeev Alur, Dana Fisman, Konstantinos Mamouras, Mukund Raghothaman, and Caleb Stanford, Theoretical Computer Science (TCS), February 2020.

10. Data-Trace Types for Distributed Stream Processing Systems. Konstantinos Mamouras, Caleb Stanford, Rajeev Alur, Zachary Ives, and Val Tannen, Programming Language Design and Implementation (PLDI), June 2019. Video Abstract

11. Modular Quantitative Monitoring. Rajeev Alur, Konstantinos Mamouras, and Caleb Stanford, Principles of Programming Languages (POPL), January 2019. Slides; Video

12. Interfaces for Stream Processing Systems. Rajeev Alur, Konstantinos Mamouras, Caleb Stanford, and Val Tannen, Principles of Modeling: Festschrift Symposium in honor of Edward A. Lee, October 2017. Invited paper.

13. Automata-Based Stream Processing. Rajeev Alur, Konstantinos Mamouras, and Caleb Stanford, International Colloquium on Automata, Languages, and Programming (ICALP), July 2017. Slides

### Software

• pystate, a call-sensitive state tracker for Python objects based on CRC-32, intended for use in fuzzing.

• FP4, a hardware fuzzer for P4 switches; we used pystate for control-plane state tracking.

• Guided incremental digraphs: a data structure for incrementally tracking live and dead states for SMT applications.

• dZ3: A new constraint solver for regular expressions, now the default in Z3. (benchmarks; experimental scripts)

• DiffStream: Differential testing for Apache Flink programs.

• Flumina: A programming model for stream processing with parallelizable synchronization primitives and predictable semantics.

• Data transducers: A general-purpose intermediate representation for data stream monitoring with performance guarantees.

## Teaching

CIS 198: Rust Programming (spring 2021)

I was the instructor for CIS 198: Rust Programming at UPenn, an undergraduate introduction to Rust. The course lecture notes are publicly available.


Archived material

## Contact

castan at cis upenn edu