CIS Home | Penn Engineering | PENN
 

 
Jitendra Malik: Recognizing objects and actions in images and video

The object recognition problem is that of finding instances of object classes in images or video sequences: faces, giraffes, the digit 5, chairs, etc. We base our approach on deformable shape matching using relational descriptors based on "shape contexts" and "geometric blur". These descriptors make it possible to compute similarity measures between shapes which, together with similarity measures for texture and color, can be used to drive object recognition. I will show results on a variety of 2D and 3D datasets, such as handwritten digits and the Caltech-101 dataset of visual categories.
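As a rough illustration of the "shape context" idea mentioned above, the following minimal sketch builds, for each sample point on a shape, a log-polar histogram of where the other points lie relative to it, and compares two shapes' descriptors with a chi-squared cost. The function names, bin counts, and radius limits are assumptions chosen for illustration; they are not taken from the talk, and the sketch leaves out the deformable matching step.

```python
import numpy as np

def shape_context(points, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Return one log-polar histogram per point (hypothetical parameters)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # diff[i, j] is the vector from point i to point j.
    diff = pts[None, :, :] - pts[:, None, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    # Normalize by the mean pairwise distance for scale invariance.
    dist = dist / dist[~np.eye(n, dtype=bool)].mean()
    angle = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Log-spaced radial bin edges; points outside [r_inner, r_outer) are dropped.
    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_r + 1)
    r_bin = np.searchsorted(r_edges, dist, side="right") - 1
    t_bin = np.minimum((angle / (2 * np.pi) * n_theta).astype(int), n_theta - 1)

    hist = np.zeros((n, n_r, n_theta))
    for i in range(n):
        for j in range(n):
            if i != j and 0 <= r_bin[i, j] < n_r:
                hist[i, r_bin[i, j], t_bin[i, j]] += 1
    return hist.reshape(n, -1)

def chi2_cost(h1, h2, eps=1e-12):
    """Pairwise chi-squared cost between two sets of normalized histograms."""
    p = h1 / (h1.sum(axis=1, keepdims=True) + eps)
    q = h2 / (h2.sum(axis=1, keepdims=True) + eps)
    num = (p[:, None, :] - q[None, :, :]) ** 2
    den = p[:, None, :] + q[None, :, :] + eps
    return 0.5 * (num / den).sum(axis=2)
```

The resulting cost matrix could then be handed to an assignment solver to find point correspondences between two shapes; that step is omitted here.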


The action recognition problem is that of finding instances of actions in video sequences: run, jump, kick, etc. We have developed two approaches to action recognition. In low-resolution ("far field") data, the approach is based on collecting low-resolution optical flow measurements over a spatiotemporal volume for each moving figure, constructing a robust descriptor from this volume, and then matching the descriptor against stored sequences. In high-resolution ("near field") data, the approach is based on extracting stick figures in each frame and relying on joint-level human body tracking to provide a complete intermediate representation that is robust to variations in lighting, clothing, and pose.
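The far-field pipeline described above (flow measurements over a volume, a descriptor, matching to stored sequences) might be sketched as follows. This is only an illustration under stated assumptions: the optical flow is taken as already computed on a figure-centred window, the descriptor uses half-wave rectified flow channels as one plausible choice, and matching is a plain normalized correlation. The names `motion_descriptor` and `best_match` are hypothetical, and the abstract does not specify any of these details.

```python
import numpy as np

def motion_descriptor(flow):
    """Build a unit-length descriptor from a flow volume.

    flow: array of shape (T, H, W, 2) holding per-pixel (fx, fy) over
    T frames of a figure-centred window (assumed precomputed).
    """
    flow = np.asarray(flow, dtype=float)
    fx, fy = flow[..., 0], flow[..., 1]
    # Half-wave rectify into four non-negative channels (an assumed design).
    channels = np.stack(
        [np.maximum(fx, 0), np.maximum(-fx, 0),
         np.maximum(fy, 0), np.maximum(-fy, 0)],
        axis=-1,
    )
    vec = channels.ravel()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def best_match(query, stored):
    """Return the label of the stored descriptor most correlated with query."""
    scores = {label: float(np.dot(query, d)) for label, d in stored.items()}
    return max(scores, key=scores.get), scores
```

For example, descriptors for labelled clips could be stored in a dictionary keyed by action name and a new clip classified with `best_match(motion_descriptor(new_flow), stored)`, provided all volumes share the same shape.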


This talk is based on joint work; for pointers to publications, please visit: http://http.cs.berkeley.edu/projects/vision/vision_group.html

 
