Zachary G. Ives

Associate Professor
Computer & Information Science Department
University of Pennsylvania

Associated Faculty, Penn Center for Bioinformatics
Undergraduate Chair, Singh Program in Market & Social Systems Engineering (launching Fall 2011)

Teaching CIS 550 in Fall 2009
Office hours for Fall 2009 semester: Wednesdays, 3:30-4:30

 

Contact Information

576 Levine Hall North
Computer and Information Science Department
University of Pennsylvania
3330 Walnut Street
Philadelphia, PA 19104-6389
zives @ atcis.upenn.edu
(215) 746-2789    Fax: (215) 898-0587

Biographical Sketch

Zachary Ives is an Associate Professor at the University of Pennsylvania and an Associated Faculty Member of the Penn Center for Bioinformatics. He received his B.S. from Sonoma State University and his PhD from the University of Washington. His research interests include data integration, peer-to-peer models of data sharing, processing and security of heterogeneous sensor streams, and data exchange between autonomous systems. He is a recipient of the NSF CAREER award and a member of the 2006 (first) DARPA Computer Science Study Panel.  He has been a co-program chair for the XML Symposium (2006) and New Trends in Information Integration (2008, 2010) workshops.

Research

My research interests lie in the areas of databases and distributed systems, especially as they relate to the Web, Web-scale information sharing, and distributed networks of devices (e.g., sensors, actuators). I am a member of the database, wireless/mobile systems, and systems research groups at Penn. My research projects relate to making it easier to exchange, locate, and analyze networked information.

  • ORCHESTRA focuses on the problem of collaborative data sharing: exchanging data and updates among loose confederations of databases, when the different database owners have different schemas and different ideas of what is the "right" content. We have developed techniques to map data and updates among different sites, maintain data provenance, and use that provenance as the basis for assessing trust and, ultimately, resolving conflicts. We specifically target biological data sharing applications. See here for an overview paper. Funded by NSF CAREER #IIS-0477972.
  • The Q query system addresses the challenges of querying in a system like Orchestra, when one does not know a priori where to find the most relevant data. Q takes as input a keyword query, which it matches against schema elements to produce potential data integration queries. The system returns answers from the most promising queries and takes user feedback on the results. This feedback is used to learn which sources are most relevant to the information need that motivated the query. Funded by NSF CAREER #IIS-0477972 and SEIII #IIS-0513778.
  • Aspen addresses the problem of programming and integrating large-scale and complex sensor networks. The system focuses on a setting in which large numbers of distributed sensors, with varying capabilities, must be coordinated in order to manage and reason about collections of physical entities and phenomena. My focus is on sensor data integration, i.e., integration of data streams from multiple sensor (and other) sources. A target application is data center monitoring for energy, temperature, load, and other factors. Different aspects of the research are funded by NSF III #IIS-0713267, NOSS #CNS-0721541, and a University Research Initiative grant from Lockheed Martin.
  • CopyCat, in collaboration with USC Information Sciences Institute (led by Craig Knoblock) and Fetch Technologies (led by Steve Minton), considers the problem of how to make it easy for users to author, use, and debug mappings for one-time integration tasks. The system presents a spreadsheet-like workspace, into which the user may paste columns and rows of data from source applications. The system attempts to learn what data is being extracted and what queries are being asked, and it makes auto-complete suggestions that generalize the user's work. The user provides feedback (either explicitly or by pasting more data) and the system refines its suggestions accordingly. Provenance information is used to explain and debug results, and it is also a foundation for the learning process. See here for an overview paper. CopyCat was funded in part by a DARPA IPTO seedling in the area of "best effort data integration," and is also funded in part by DARPA DSO funding through the CSSG program.

I also participate in several projects that are led by my colleagues at Penn:

  • pPOD (led by Val Tannen) focuses on the modeling and management of information related to phylogenetic trees.  pPOD leverages the Orchestra engine.
  • PIRIS (led by Doug Wiebe) focuses on integrating data records relating to gunshot wound cases in Philadelphia, in order to help support intervention.  Funded by the State of Pennsylvania.

 

Acknowledgments: I have also received grants from DARPA CSSG (#HR0011-06-1-0016), Penn ISTAR, the State of Pennsylvania, and Lockheed Martin, and software donations from MarkLogic, Electric Software, and IBM Corp.

Teaching

Selected recent courses and seminars:

Detailed information is here.

Publications

To appear / accepted for publication:

Selected recent publications:

A complete list is here.

PhD Advisees

Collaborators

Graduated Students

Tips on Interviewing

Finishing your PhD and going on the job market? I have previously compiled a list of references on interviewing, which you can find here.


Last modified: Wed Aug 20 19:42:31 EDT 2008