GART: Gaussian Articulated Template Models

Jiahui Lei¹ Yufu Wang¹ Georgios Pavlakos² Lingjie Liu¹ Kostas Daniilidis¹,³

University of Pennsylvania¹ UC Berkeley² Archimedes, Athena RC³

CVPR 2024 (Highlight)

Paper: PDF
ArXiv: Link
Code: Github

4min Video Intro+Results

Watch this video here on Bilibili in China

Abstract

We introduce Gaussian Articulated Template Model (GART), an explicit, efficient, and expressive representation for non-rigid articulated subject capturing and rendering from monocular videos. GART utilizes a mixture of moving 3D Gaussians to explicitly approximate a deformable subject's geometry and appearance. It takes advantage of a categorical template model prior (SMPL, SMAL, etc.) with learnable forward skinning while further generalizing to more complex non-rigid deformations with novel latent bones. GART can be reconstructed via differentiable rendering from monocular videos in seconds or minutes and rendered in novel poses faster than 150 fps.
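To make the idea of forward skinning concrete, here is a minimal sketch of how canonical Gaussian centers could be moved into a posed frame by blending per-bone rigid transforms. It assumes the skinning is linear blend skinning in the style of SMPL, and it treats latent bones simply as extra transforms appended to the template's bone list. The function names (`transform_point`, `skin_centers`) and the toy data are hypothetical and are not taken from the GART code. The sketch moves only the Gaussian centers; rotations, scales, and appearance are left out.

```python
# Minimal linear blend skinning (LBS) sketch for Gaussian centers.
# All names are hypothetical; this is not the GART implementation.

def transform_point(T, p):
    """Apply a 3x4 rigid transform [R | t] (nested lists) to a 3D point."""
    return [sum(T[r][c] * p[c] for c in range(3)) + T[r][3] for r in range(3)]


def skin_centers(centers, weights, bone_transforms):
    """Forward-skin canonical Gaussian centers into a posed frame.

    centers:         list of N canonical 3D points
    weights:         N x B skinning weights, each row summing to 1
                     (in a learnable setup these would be optimized)
    bone_transforms: list of B 3x4 transforms; template bones first,
                     any extra "latent" bones appended after them
    """
    posed = []
    for p, w in zip(centers, weights):
        out = [0.0, 0.0, 0.0]
        for w_b, T in zip(w, bone_transforms):
            q = transform_point(T, p)  # where this bone alone would move p
            out = [o + w_b * qi for o, qi in zip(out, q)]  # weighted blend
        posed.append(out)
    return posed


# Toy example: two bones, one fixed and one translated by 2 along x.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
shift_x = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0]]
centers = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
weights = [[1.0, 0.0], [0.5, 0.5]]  # second point is split between bones

posed = skin_centers(centers, weights, [identity, shift_x])
print(posed)
```

In this toy setup the first center follows only the fixed bone and stays at the origin, while the second is split evenly between the two bones and so moves by half of the translation. Adding a latent bone under these assumptions amounts to appending one more transform and one more weight column.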