"Seeing the big picture in vision:
from local structure to global pattern"

Jianbo Shi
Robotics Institute
Carnegie Mellon University

The imagery of this world is full of interesting details and patterns. Much of my research in vision has been motivated by the question: how can we efficiently sort through details to see the “big picture” in a scene? In this context, I will talk about my work in image segmentation and human recognition.

In image segmentation, we focused on two problems: 1) how to extract global grouping structure from local image features, and 2) how to guide the grouping process to achieve higher-level vision tasks, such as recognizing familiar object shapes. To address the first problem, we have proposed a hierarchical graph partitioning formulation called normalized cuts, which defines a global criterion that optimally balances the requirements of segmentation with those of grouping. It can also be computed efficiently using a sparse generalized eigensolver. For the second problem, we have developed a graph-based method that guides the bottom-up grouping process to follow the constraints of object shape and partial grouping priors. We use dual graphs to encode the coupled interactions between region intensity and contour shape, and we encode partial grouping priors as subspace constraints on the normalized cuts group indicator variables.
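
As a rough illustration of the normalized cuts idea, the sketch below splits a small weighted graph in two using the second-smallest eigenvector of the generalized problem (D - W) y = lambda D y, where W holds pairwise affinities and D is the diagonal matrix of node degrees. It makes several simplifying assumptions that are not from the talk: a dense NumPy solver stands in for the sparse eigensolver, the generalized problem is rewritten as an equivalent standard symmetric one, the split threshold is fixed at zero, and only a single two-way cut is made rather than a hierarchy. The function name and the toy affinity matrix are hypothetical.

```python
import numpy as np


def ncut_bipartition(W):
    """Two-way normalized-cut split of a graph with symmetric affinities W.

    Assumes every node has positive total affinity (degree > 0).
    Returns a boolean array marking which side each node falls on.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # node degrees
    inv_sqrt_d = 1.0 / np.sqrt(d)

    # (D - W) y = lambda * D y  becomes a standard symmetric problem
    # with z = D^{1/2} y:  (I - D^{-1/2} W D^{-1/2}) z = lambda * z
    L_sym = np.eye(len(d)) - inv_sqrt_d[:, None] * W * inv_sqrt_d[None, :]

    # eigh returns eigenvalues in ascending order; column 0 is the
    # trivial solution, column 1 carries the partition information.
    _, vecs = np.linalg.eigh(L_sym)
    y = inv_sqrt_d * vecs[:, 1]            # map back from z to y

    # Simplification: threshold at zero instead of searching for the
    # split point that minimizes the normalized cut value.
    return y > 0


# Toy graph: two groups of three nodes, strong affinity inside each
# group and weak affinity between groups (made-up values).
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

labels = ncut_bipartition(W)
```

For an image, W would typically be very sparse (each pixel connected only to nearby pixels), which is what makes a sparse eigensolver attractive at realistic sizes.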

In human recognition, I will talk about two of our recent projects: extracting and tracking detailed body movements, and detecting typical/atypical human behavior in a video sequence. We show again that by integrating and reasoning about a large number of local image constraints in space and time, we can make these complex problems simpler to solve.

This is joint work with Stella Yu, Hua Zhong, Ralph Gross, Jiang Gao, and Jake Sprouse.



Friday, May 10, 2002
Moore School Bldg. - Room #216
11:00 a.m. - 12:30 p.m.