Nvidia Ada Lovelace die photo, courtesy of Nvidia

CIS 6010: Special Topics in Computer Architecture: GPGPU Architecture and Programming (Fall 2023)

Course Information

instructor: Joe Devietti
when: Monday/Wednesday 12-1:30pm
where: Towne 305
contact: email, Canvas

office hours:

  • by appointment

Course Description

Graphics Processing Units (GPUs) have become extremely popular and are used to accelerate an increasingly diverse set of non-graphics workloads. This seminar will examine modern GPU architectures, the programming models used to write general-purpose code for GPUs, and the complexities of programming such highly parallel architectures. There will be a special emphasis on concurrency correctness issues as they relate to GPUs, including GPU memory consistency models and GPU concurrency bugs. Graduate-level coursework in computer architecture (e.g., CIS 5710) will be very helpful.

Course Materials

No textbooks are required; links to all readings will be provided on this website.


Grading

  • Project: 50%
  • Participation: 30%
  • Assignments: 20%

There will be no exams.

Submit homework via Canvas.

The class project can be done in groups of up to 2. The project is open-ended: it should be something related to GPUs but the specifics are up to you. Choosing a project that incorporates your interests (research or otherwise) is a great idea!

Course Schedule

This schedule is subject to change.

Date | Topic | Presenter
Wed 30 Aug | Intro | Joe
Mon 4 Sep | no class - Labor Day |
Wed 6 Sep | General-Purpose Graphics Processor Architectures (accessible via Penn VPN), Chapters 1 & 2 | Joe
Mon 11 Sep | General-Purpose Graphics Processor Architectures, Sections 3.1 - 3.3 | Joe
Wed 13 Sep | General-Purpose Graphics Processor Architectures, Sections 3.4 - 3.6 | Joe
Mon 18 Sep | General-Purpose Graphics Processor Architectures, Chapter 4 | Joe
Wed 20 Sep | Real-world GPU design | Joe
Mon 25 Sep | no class - Yom Kippur |
Wed 27 Sep | CUDA Programming Guide | Joe
Mon 2 Oct | CUDA synchronization | Joe
Wed 4 Oct | A Primer on Memory Consistency and Cache Coherence, Chapters 3-5 (SC, TSO, RC) | Joe
Mon 9 Oct | MCM Primer |
Wed 11 Oct | MCM Primer |
Mon 16 Oct | MCM Primer |
Wed 18 Oct | Dynamic Warp Subdivision for Integrated Branch and Memory Divergence Tolerance |
Mon 23 Oct | The Dual-Path Execution Model for Efficient GPU Control Flow |
Wed 25 Oct | Heterogeneous-Race-Free Memory Models |
Mon 30 Oct | GPU concurrency: Weak Behaviours and Programming Assumptions (slides) |
Wed 1 Nov | Dynamic Warp Formation |
Mon 6 Nov | A Formal Analysis of the NVIDIA PTX Memory Consistency Model |
Wed 8 Nov | Cache Coherence for GPU Architectures |
Mon 13 Nov | Cache-Conscious Wavefront Scheduling |
Wed 15 Nov | SIMR: Single Instruction Multiple Request Processing for Energy-Efficient Data Center Microservices |
Mon 20 Nov | Understanding The Security of Discrete GPUs |
Wed 22 Nov | no class - Thanksgiving |
Mon 27 Nov | GPUfs: integrating a file system with GPUs |
Wed 29 Nov | GPUnet: Networking Abstractions for GPU Programs |
Mon 4 Dec | TBD |
Wed 6 Dec | Project Presentations |
Mon 11 Dec | Project Presentations |