Haswell die photo, from Intel Free Press

CIS-800-003, Topics in Parallel Programmability, Spring 2013


Joe Devietti

Office Hours: by appointment in Levine 572

When & Where

Monday/Wednesday 1:30-3:00pm, Towne 319

Course Description

Parallel programming is substantially more difficult than its sequential counterpart. This graduate seminar will cover in detail various methods that have been proposed for coping with the performance and correctness challenges of shared-memory parallel programming, both in principle and in practice. Prior graduate-level coursework in computer architecture and programming languages (CIS 501/500) will be very helpful.

Course Materials

No textbooks are required; links to all the papers we read will be provided on this website.


Grading

  • In-class presentations: 30%
  • Reading quizzes: 20%
  • Future Work write-ups: 30%
  • Participation: 20%

There will be no exams.

Reading quizzes and a place to upload future-work write-ups can be found on Blackboard.


Schedule

This schedule is subject to change.

Many of the paper links below are to publisher sites (such as the ACM Digital Library). You’ll need to download the papers from an on-campus computer or via the UPenn Library proxy.

Each entry gives the date and topic, then the readings and the presenter.

Wed 9 Jan: intro, sequential consistency
  Presenter: Joe [slides pdf]

Mon 14 Jan: hardware memory models
  • How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs [TOC ’79]
  • A Primer on Memory Consistency and Cache Coherence (Chapters 3-4, read Chapters 1-2 for more background)
  Presenter: Joe [slides pdf]

Wed 16 Jan: language memory models
  • The Java Memory Model [POPL ’05]
  • On Validity of Program Transformations in the Java Memory Model [ECOOP ’08]
  Presenter: Joe [slides pdf]

Mon 21 Jan: no class, MLK Jr Day

Wed 23 Jan: Hadi Esmaeilzadeh guest lecture
  • Dark Silicon and the End of Multicore Scaling [IEEE Micro Top Picks ’12]

Mon 28 Jan: linearizability
  • Linearizability: a correctness condition for concurrent objects [TOPLAS ’90]
  Presenter: Joe [slides pdf]

Wed 30 Jan: data race detection
  • Eraser: a dynamic data race detector for multithreaded programs [TOCS ’97]
  • FastTrack: efficient and precise dynamic race detection [PLDI ’09]

Mon 4 Feb: atomicity violations
  • High-level Data Races [VVEIS ’03]
  • AVIO: detecting atomicity violations via access interleaving invariants [ASPLOS ’06]

Wed 6 Feb: Brandon Lucia guest lecture
  • Cooperative Empirical Failure Avoidance for Multithreaded Programs [ASPLOS ’13]

Mon 11 Feb: general concurrency bugs
  • Learning from mistakes: a comprehensive study on real world concurrency bug characteristics [ASPLOS ’08]

Wed 13 Feb: liveness
  • Simple, fast, and practical non-blocking and blocking concurrent queue algorithms [PODC ’96]
  Presenter: Chen Y.

Mon 18 Feb: transactional memory
  • Transactional memory: architectural support for lock-free data structures [ISCA ’93]
  • Subtleties of Transactional Memory Atomicity Semantics [CAL ’06]

Wed 20 Feb: hardware TM
  • Bulk Disambiguation of Speculative Threads in Multiprocessors [ISCA ’06]
  • Haswell ISA documentation (Sections 8.1, 8.2 and 8.3.8)

Mon 25 Feb: software TM
  • Compiler and runtime support for efficient software transactional memory [PLDI ’06]
  • Software transactional memory: why is it only a research toy? [CACM ’08]

Wed 27 Feb: SC with speculation
  • BulkSC: bulk enforcement of sequential consistency [ISCA ’07]

Mon 4 Mar: no class, Spring Break

Wed 6 Mar: no class, Spring Break

Mon 11 Mar: SC without speculation
  • Efficient sequential consistency via conflict ordering [ASPLOS ’12]

Wed 13 Mar: SC across the stack
  • DRFX: a simple and efficient memory model for concurrent programming languages [PLDI ’10]
  Presenter: Chen C.

Mon 18 Mar: no class, Joe travelling

Wed 20 Mar: no class, Joe travelling

Mon 25 Mar: software record+replay
  • DoublePlay: Parallelizing Sequential Logging and Replay [ASPLOS ’11]

Wed 27 Mar: hardware record+replay
  • DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Efficiently [ISCA ’08]

Mon 1 Apr: execution-level determinism
  • DMP: deterministic shared memory multiprocessing [ASPLOS ’09]

Wed 3 Apr: deterministic languages
  • A type and effect system for deterministic parallel Java [OOPSLA ’09]
  Presenter: Christian + Laurel

Mon 8 Apr: task parallelism
  • Cilk: an efficient multithreaded runtime system [PPoPP ’95]
  Presenter: Chen Y.

Wed 10 Apr: error detection with tasks
  • Scalable and precise dynamic datarace detection for structured parallelism [PLDI ’12]

Mon 15 Apr: no class, Joe travelling

Wed 17 Apr: no class, Joe travelling

Mon 22 Apr: verification
  • A randomized scheduler with probabilistic guarantees of finding bugs [ASPLOS ’10]