Mark Dredze

4th year PhD Student
Computer and Information Science
University of Pennsylvania
Contact Info:

www.cis.upenn.edu/~mdredze   www.dredze.com

About Me

NEW: AAAI 2008 Workshop on Enhanced Messaging. Email, IM, AI, HCI, and all that. Check it out.

As a graduate student, I have a range of interests. My work falls at the intersection of machine learning and natural language processing, often with a dash of user interfaces. My advisor is Fernando Pereira and I am part of the Structured Learning at Penn group. I spend a lot of time working on CALO, a multi-institution effort to develop a Cognitive Assistant that Learns and Organizes.

Much of my work concerns email and how to improve the email experience. I have also worked at a few places in industry, including Google, IBM and Microsoft.

I work on a range of topics, including:
Semi-supervised learning, sequence learning, online learning, email classification, user modeling, domain adaptation, email activity management, among others.

"What do you work on?"
I am often asked this question. I can easily refer you to the list of topics above or you can look at my publications, but I wanted to provide a deeper answer.

I began my research as an undergraduate working on intelligent user interfaces, specifically an interface for bringing contextual information to television news viewers. The work required various learning components, e.g., generating queries, segmenting stories, and classification. I wanted to focus on these components and on learning that would support good UIs. A natural medium for these applications was email (projects like email activity management). I branched off into other prediction tasks in email, including reply prediction and attachment prediction, both designed to improve the email experience. This work has taught me the importance of building learning models specific to each user, as different users operate in different environments.

Subsequently, I have worked on related technologies and problems, such as domain/user adaptation, semi-supervised learning, and online learning. I'm interested in exploring how we can learn to adapt intelligent systems to new situations. While finding alignments between domains is important, I am exploring how to learn a user representation that can be shared across tasks and transferred to new users.

Publications

     2008
Mark Dredze, Joel Wallenberg. Further Results and Analysis of Icelandic Part of Speech Tagging. Technical Report MS-CIS-08-13, University of Pennsylvania, Department of Computer and Information Science, 2008. [PDF]
Data driven POS tagging has achieved good performance for English, but can still lag behind linguistic rule based taggers for morphologically complex languages, such as Icelandic. We extend a statistical tagger to handle fine grained tagsets and improve over the best Icelandic POS tagger. Additionally, we develop a case tagger for non-local case and gender decisions. An error analysis of our system suggests future directions. This paper presents further results and analysis to the original work.
 
Mark Dredze, Koby Crammer, Fernando Pereira. Confidence-Weighted Linear Classification. International Conference on Machine Learning (ICML), 2008. [PDF]
We introduce confidence-weighted linear classifiers, which add parameter confidence information to linear classifiers. Online learners in this setting update both classifier parameters and the estimate of their confidence. The particular online algorithms we study here maintain a Gaussian distribution over parameter vectors and update the mean and covariance of the distribution with each instance. Empirical evaluation on a range of NLP tasks shows that our algorithm improves over other state of the art online and batch methods, learns faster in the online setting, and lends itself to better classifier combination after parallel training.
 
Mark Dredze, Koby Crammer. Active Learning with Confidence. Association for Computational Linguistics (ACL), 2008. [PDF]
Active learning is a machine learning approach to achieving high accuracy with a small number of labels by letting the learning algorithm choose instances to be labeled. Most previous approaches based on discriminative learning use the margin for choosing instances. We present a method for incorporating confidence into the margin by using a newly introduced online learning algorithm and show empirically that confidence improves active learning.
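For readers unfamiliar with margin-based selection, here is a minimal sketch of the baseline idea only: query the unlabeled instance whose score is closest to the decision boundary. It does not reproduce how the paper incorporates confidence into the margin; the function names, weights, and pool values are hypothetical and chosen purely for illustration.

```python
def margin(weights, x):
    """Absolute value of the linear score w.x (distance-like margin)."""
    return abs(sum(w * xi for w, xi in zip(weights, x)))


def pick_query(weights, pool):
    """Return the index of the unlabeled instance with the smallest margin."""
    return min(range(len(pool)), key=lambda i: margin(weights, pool[i]))


# Hypothetical linear model and unlabeled pool (illustration only).
weights = [1.0, -1.0]
pool = [[2.0, 0.0], [1.0, 0.9], [0.0, 3.0]]
query = pick_query(weights, pool)  # index of the instance to send for labeling
```

The instance selected is the one the current model is least sure about under this simple criterion; a confidence-aware variant would change how the margin is computed, not the selection loop.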
 
Mark Dredze, Joel Wallenberg. Icelandic Data-Driven Part of Speech Tagging. Association for Computational Linguistics (ACL), 2008. [PDF]
Data driven POS tagging has achieved good performance for English, but can still lag behind linguistic rule based taggers for morphologically complex languages, such as Icelandic. We extend a statistical tagger to handle fine grained tagsets and improve over the best Icelandic POS tagger. Additionally, we develop a case tagger for non-local case and gender decisions. An error analysis of our system suggests future directions.
 
Kuzman Ganchev, Mark Dredze. Small Statistical Models by Random Feature Mixing. Workshop on Mobile NLP at ACL, 2008. [PDF]
The application of statistical NLP systems to resource constrained devices is limited by the need to maintain parameters for a large number of features and an alphabet mapping features to parameters. We introduce random feature mixing to eliminate alphabet storage and reduce the number of parameters without severely impacting model performance.
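As a rough illustration of the idea described in this abstract, the sketch below hashes feature strings directly into a fixed-size weight vector, so no feature-to-index alphabet is stored and colliding features simply share a parameter. This is an assumption-laden toy, not the paper's implementation: the choice of CRC32 as the hash, the perceptron update, and all names and sizes are hypothetical.

```python
import zlib


def feature_index(feature, num_params):
    """Map a feature string straight to a parameter slot; no alphabet is kept."""
    return zlib.crc32(feature.encode("utf-8")) % num_params


def score(features, weights):
    """Linear score for a bag of binary features over the shared weight vector."""
    return sum(weights[feature_index(f, len(weights))] for f in features)


def perceptron_update(features, label, weights, lr=1.0):
    """One mistake-driven update; label is +1 or -1. Colliding features share a slot."""
    if label * score(features, weights) <= 0:
        for f in features:
            weights[feature_index(f, len(weights))] += lr * label


# Hypothetical example: 16 parameters regardless of how many features exist.
weights = [0.0] * 16
example = ["word=free", "has_link"]
perceptron_update(example, +1, weights)
```

The model size is fixed by the length of `weights`, which is the point of the technique: memory no longer grows with the number of distinct features, at the cost of occasional collisions.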
 
Mark Dredze, Hanna Wallach, Danny Puller, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira. Intelligent Email: Aiding Users with AI. American National Conference on Artificial Intelligence (AAAI), NECTAR Paper, 2008. [PDF]
Email occupies a central role in the modern workplace. This has led to a vast increase in the number of email messages that users are expected to handle daily. Furthermore, email is no longer simply a tool for asynchronous online communication - email is now used for task management, personal archiving, as well as both synchronous and asynchronous online communication. This explosion can lead to "email overload" - many users are overwhelmed by the large quantity of information in their mailboxes. In the human-computer interaction community, there has been much research on tackling email overload. Recently, similar efforts have emerged in the artificial intelligence (AI) and machine learning communities to form an area of research known as intelligent email. In this paper, we take a user-oriented approach to applying AI to email. We identify enhancements to email user interfaces and employ machine learning techniques to support these changes. We focus on three tasks - summary keyword generation, reply prediction and attachment prediction - and summarize recent work in these areas.
 
      Kevin Lerman, Ari Gilder, Mark Dredze, Fernando Pereira. Reading the Markets: Forecasting Public Opinion of Political Candidates by News Analysis. North East Student Colloquium on Artificial Intelligence (NESCAI), 2008.
 
Mark Dredze, Hanna Wallach. User Models for Email Activity Management. The IUI 2008 Workshop on Ubiquitous User Modeling, 2008. [PDF]
A single user activity, such as planning a conference trip, typically involves multiple actions. Although these actions may involve several applications, the central point of co-ordination for any particular activity is usually email. Previous work on email activity management has focused on clustering emails by activity. Dredze et al. accomplished this by combining supervised classifiers based on document similarity, authors and recipients, and thread information. In this paper, we take a different approach and present an unsupervised framework for email activity clustering. We use the same information sources as Dredze et al. - namely, document similarity, message recipients and authors, and thread information - but combine them to form an unsupervised, non-parametric Bayesian user model. This approach enables email activities to be inferred without any user input. Inferring activities from a user's mailbox adapts the model to that user. We next describe the statistical machinery that forms the basis of our user model, and explain how several email properties may be incorporated into the model. We evaluate this approach using the same data as Dredze et al., showing that our model does well at clustering emails by activity.
 
Mark Dredze, Hanna Wallach, Danny Puller, Fernando Pereira. Generating Summary Keywords for Emails Using Topics. Proceedings of the 2008 International Conference on Intelligent User Interfaces, 2008. [PDF]
Email summary keywords, used to concisely represent the gist of an email, can help users manage and prioritize large numbers of messages. We develop an unsupervised learning framework for selecting summary keywords from emails using latent representations of the underlying topics in a user's mailbox. This approach selects words that describe each message in the context of existing topics rather than simply selecting keywords based on a single message in isolation. We present and compare four methods for selecting summary keywords based on two well-known models for inferring latent topics: latent semantic analysis and latent Dirichlet allocation. The quality of the summary keywords is assessed by generating summaries for emails from twelve users in the Enron corpus. The summary keywords are then used in place of entire messages in two proxy tasks: automated foldering and recipient prediction. We also evaluate the extent to which summary keywords enhance the information already available in a typical email user interface by repeating the same tasks using email subject lines.
 
Mark Dredze, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira. Intelligent Email: Reply and Attachment Prediction. Proceedings of the 2008 International Conference on Intelligent User Interfaces, 2008. [PDF]
We present two prediction problems under the rubric of Intelligent Email that are designed to support enhanced email interfaces that relieve the stress of email overload. Reply prediction alerts users when an email requires a response and facilitates email response management. Attachment prediction alerts users when they are about to send an email missing an attachment or triggers a document recommendation system, which can catch missing attachment emails before they are sent. Both problems use the same underlying email classification system and task specific features. Each task is evaluated for both single-user and cross-user settings.
 
Koby Crammer, Mark Dredze, John Blitzer, Fernando Pereira. Batch Performance for an Online Price. The NIPS 2007 Workshop on Efficient Machine Learning, 2008. [PDF]
Batch learning techniques achieve good performance, but at the cost of many (sometimes even hundreds of) passes over the data. For many tasks, such as web-scale ranking of machine translation hypotheses, making many passes over the data is prohibitively expensive, even in parallel over thousands of machines. Online algorithms, which treat data as a stream of examples, are conceptually appealing for these large scale problems. In practice, however, online algorithms tend to underperform batch methods, unless they are themselves run in multiple passes over the data.
In this work we explore a new type of online learning algorithm that incorporates a measure of confidence into the algorithm. The model maintains a confidence for each parameter, reflecting previously observed properties of the data. While this requires an additional parameter for each feature of the data, this is a minimal cost when compared to running the algorithm multiple times over the data. The resulting algorithm learns faster, requiring both fewer training instances and fewer passes over the training data, often approaching batch performance with only a single pass through the data.
 
Mark Dredze, Krzysztof Czuba. Learning to Admit You're Wrong: Statistical Tools for Evaluating Web QA. The NIPS 2007 Workshop on Machine Learning for Web Search, 2008. [PDF]
Web search engines provide specialized results to specific queries, often relying on the output of a QA system. However, targeted answers, while helpful, are embarrassing when wrong. Automated techniques are required to avoid wrong answers and improve system performance. We present the Expected Answer System, a statistical data-driven framework that analyzes the performance of a QA system with the goal of improving system accuracy. Our system is used for wrong answer prediction, missing answer discovery, and question class analysis. An empirical study of a production QA system, one of the first such evaluations presented in the literature, motivates our approach.
 
Kedar Bellare, Partha Pratim Talukdar, Giridhar Kumaran, Fernando Pereira, Mark Liberman, Andrew McCallum, Mark Dredze. Lightly-Supervised Attribute Extraction for Web Search. The NIPS 2007 Workshop on Machine Learning for Web Search, 2008. [PDF]
Web search engines can greatly benefit from knowledge about attributes of entities present in search queries. In this paper, we introduce lightly-supervised methods for extracting entity attributes from natural language text. Using these methods, we are able to extract large numbers of attributes of different entities at fairly high precision from a large natural language corpus. We compare our methods against a previously proposed pattern-based relation extractor, showing that the new methods give considerable improvements over that baseline. We also demonstrate that query expansion using extracted attributes improves retrieval performance on underspecified information-seeking queries.
 

     2007
      Danny Puller, Hanna Wallach, Mark Dredze, Fernando Pereira. Generating Summary Keywords for Emails Using Topics. Women in Machine Learning Workshop (WiML) at Grace Hopper, 2007.
 
Neal Parikh, Mark Dredze. Graphical Models for Primarily Unsupervised Sequence Labeling. Technical Report MS-CIS-07-18, University of Pennsylvania, Department of Computer and Information Science, 2007. [PDF]
Most models used in natural language processing must be trained on large corpora of labeled text. This tutorial explores a 'primarily unsupervised' approach (based on graphical models) that augments a corpus of unlabeled text with some form of prior domain knowledge, but does not require any fully labeled examples. We survey probabilistic graphical models for (supervised) classification and sequence labeling and then present the prototype-driven approach of Haghighi and Klein (2006) to sequence labeling in detail, including a discussion of the theory and implementation of both conditional random fields and prototype learning. We show experimental results for English part of speech tagging.
 
Mark Dredze, Reuven Gevaryahu, Ari Elias-Bachrach. Learning Fast Classifiers for Image Spam. CEAS, 2007. [PDF] [Data]
Recently, spammers have proliferated image spam, emails which contain the text of the spam message in a human readable image instead of the message body, making detection by conventional content filters difficult. New techniques are needed to filter these messages. Our goal is to automatically classify an image directly as being spam or ham. We present features that focus on simple properties of the image, making classification as fast as possible. Our evaluation shows that they accurately classify spam images in excess of 90% and up to 99% on real world data. Furthermore, we introduce a new feature selection algorithm that selects features for classification based on their speed as well as predictive power. This technique produces an accurate system that runs in a tiny fraction of the time. Finally, we introduce Just in Time (JIT) feature extraction, which creates features at classification time as needed by the classifier. We demonstrate JIT extraction using a JIT decision tree that further increases system speed. This paper makes image spam classification practical by providing both high accuracy features and a method to learn fast classifiers.
 
Mark Dredze, John Blitzer, Partha Pratim Talukdar, Kuzman Ganchev, Joao Graca, Fernando Pereira. Frustratingly Hard Domain Adaptation for Dependency Parsing. Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL, 2007. [PDF]
We describe some challenges of adaptation in the 2007 CoNLL Shared Task on Domain Adaptation. Our error analysis for this task suggests that a primary source of error is differences in annotation guidelines between treebanks. Our suspicions are supported by the observation that no team was able to improve target domain performance substantially over a state of the art baseline.
 
Koby Crammer, Mark Dredze, Kuzman Ganchev, Partha Pratim Talukdar, Steven Carroll. Automatic Code Assignment to Medical Text. BioNLP Workshop at ACL, 2007. [PDF]
Code assignment is important for handling large amounts of electronic medical data in the modern hospital. However, only expert annotators with extensive training can assign codes. We present a system for the assignment of ICD-9-CM clinical codes to free text radiology reports. Our system assigns a code configuration, predicting one or more codes for each document. We combine three coding systems into a single learning system for higher accuracy. We compare our system on a real world medical dataset with both human annotators and other automated systems, achieving nearly the maximum score on the Computational Medicine Center's challenge.
 
John Blitzer, Mark Dredze, Fernando Pereira. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. Association for Computational Linguistics (ACL), 2007. [PDF]
Automatic sentiment classification has been extensively studied and applied in recent years. However, sentiment is expressed differently in different domains, and annotating corpora for every possible domain of interest is impractical. We investigate domain adaptation for sentiment classifiers, focusing on online reviews for different types of products. First, we extend to sentiment classification the recently-proposed structural correspondence learning (SCL) algorithm, reducing the relative error due to adaptation between domains by an average of 30% over the original SCL algorithm and 46% over a supervised baseline. Second, we identify a measure of domain similarity that correlates well with the potential for adaptation of a classifier from one domain to another. This measure could for instance be used to select a small set of domains to annotate whose trained classifiers would transfer well to many other domains.
 
      Mark Dredze, Hanna M. Wallach. Email Keyword Summarization and Visualization with Topic Models. North East Student Colloquium on Artificial Intelligence (NESCAI), 2007.
 
      John Blitzer, Mark Dredze, Fernando Pereira. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. North East Student Colloquium on Artificial Intelligence (NESCAI), 2007.
 

     2006
Mark Dredze, John Blitzer, Fernando Pereira. "Sorry, I Forgot the Attachment:" Email Attachment Prediction. CEAS, 2006. [PDF]
The missing attachment problem: a missing attachment generates a wave of emails from the recipients notifying the sender of the error. We present an attachment prediction system to reduce the volume of missing attachment mail. Our classifier could prompt an alert when an outgoing email is missing an attachment. Additionally, the system could activate an attachment recommendation system, whereby suggested documents are offered once the system determines the user is likely to include an attachment, effectively reminding the user to include the attachment. We present promising initial results and discuss implications of our work.
 
      Nicholas Kushmerick, Tessa Lau, Mark Dredze, Rinat Khoussainov. Activity-Centric Email: A Machine Learning Approach. Proceedings of the 2006 American National Conference on Artificial Intelligence (AAAI '06), NECTAR Paper, 2006. [PDF]
 
      Mark Dredze, John Blitzer, Koby Crammer, Fernando Pereira. Feature Design for Transfer Learning. North East Student Colloquium on Artificial Intelligence (NESCAI), 2006. [PDF]
 
Mark Dredze, Tessa Lau, Nicholas Kushmerick. Automatically Classifying Emails into Activities. Proceedings of the 2006 International Conference on Intelligent User Interfaces, 2006. [PDF]
Email-based activity management systems promise to give users better tools for managing increasing volumes of email, by organizing email according to a user's activities. Current activity management systems do not automatically classify incoming messages by the activity to which they belong, instead relying on simple heuristics (such as message threads), or asking the user to manually classify incoming messages as belonging to an activity. This paper presents several algorithms for automatically recognizing emails as part of an ongoing activity. Our baseline methods are the use of message reply-to threads to determine activity membership and a naive Bayes classifier. Our SimSubset and SimOverlap algorithms compare the people involved in an activity against the recipients of each incoming message. Our SimContent algorithm uses IRR (a variant of latent semantic indexing) to classify emails into activities using similarity based on message contents. An empirical evaluation shows that each of these methods provides a significant improvement over the baseline methods. In addition, we show that a combined approach that votes the predictions of the individual methods performs better than each individual method alone.
 

     2005
      Rie Kubota Ando, Mark Dredze, Tong Zhang. TREC 2005 Genomics Track Experiments at IBM Watson. TREC, 2005. [PDF] (Group invited talk at TREC 2005, ranked 3rd and 4th out of 53 entries)
 
Catalina Danis, Wendy Kellogg, Tessa Lau, Mark Dredze, Jeffrey Stylos, Nicholas Kushmerick. Managers Email: Beyond Tasks and To-Dos. CHI, 2005. [PDF]
In this paper, we describe preliminary findings that indicate that managers and non-managers think about their email differently. We asked three research managers and three research non-managers to sort about 250 of their own email messages into categories that "would help them to manage their work." Our analyses indicate that managers create more categories and a more differentiated category structure than non-managers. Our data also suggest that managers create "relationship-oriented" categories more often than non-managers. These results are relevant to research on "email overload" that has highlighted the use of email for activities beyond communication. In particular, our findings suggest that too strong a focus on task management may be incomplete, and that a user's organizational role has an impact on their conceptualization and likely use of email.
 
Mark Dredze, John Blitzer, Fernando Pereira. Reply Expectation Prediction for Email Management. CEAS, 2005. [PDF]
We reduce email overload by addressing the problem of waiting for a reply to one's email. We predict whether sent and received emails necessitate a reply, enabling the user to both better manage his inbox and to track mail sent to others. We discuss the features used to discriminate emails, show promising initial results with a logistic regression model, and outline future directions for this work.
 

     2004
      Mark Dredze, Jeffrey Stylos, Tessa Lau, Wendy Kellogg, Catalina Danis, and Nicholas Kushmerick. Taxie: Automatically identifying tasks in email. Unpublished manuscript, 2004.
 

     2003
      Kevin Livingston, Mark Dredze, Kristian Hammond, and Larry Birnbaum. Beyond Broadcast. Proceedings of the 2003 International Conference on Intelligent User Interfaces, 2003.
 

Data/Code

I get a lot of emails asking me for data or code from one of my papers. If you are wondering, the answer is yes! I try to provide both data and code so that others can reproduce or compare against my results. Sadly, I don't post them for download, mostly due to the lack of time. However, if I know you want them, I usually make them available. Just email me.

Datasets
Image Spam Dataset [Link]
A collection of ham and spam images taken from real user email.

Multi-Domain Sentiment Dataset [Link]
Product reviews from several different product types taken from Amazon.com.

Attachment Prediction Email (Email for data)
Enron emails annotated with attachment information and cleaned of numerous artifacts inserted by email programs.

Code
Structured Learning at Penn [Link]
This is a collection of software developed by my group for doing a range of machine learning tasks, such as dependency parsing, structured learning, gene prediction and gene mention finding.



Colleagues

I have worked with a lot of amazing people on a wide variety of projects. Here are a few of them:

Rie Johnson (Ando)
Kedar Bellare
Larry Birnbaum
John Blitzer
Koby Crammer
Krzysztof Czuba
Kris Hammond
Ryan Gabbard
Kuzman Ganchev
David Johnson
Nicholas Kushmerick
Tessa Lau
Kevin Lerman
David Mimno
Fernando Pereira
Doug Riecken
Jeff Reynar
Partha Pratim Talukdar
Hanna M. Wallach
Joel Wallenberg
Casey Whitelaw
Tong Zhang

Students

I am a graduate advisor on a number of student research projects. Email me if you are interested in working on an independent study or senior project.

Previous and Current Student Projects

Project: Student(s)
Email keyword summarization: Danny Puller (UPenn Summer Provost Fellowship)
Sentiment classification: Ian Cohen
Email Attachment Prediction: Josh Magarick
Prototype Driven Learning and Graphical Models: Neal Parikh
Machine Learning in Prediction Markets: Ari Gilder, Kevin Lerman [Link] (Winner Best CS Senior Design Project, Honorary Mention Best Engineering Design Project)
User Adaptation in Email Reply Prediction: Tova Brooks, Josh Carroll
Formal and Informal Meeting Extraction from Email: Lauren Paone [Link]

Links

Josh Goldman's Blog
The Audhumlan Conspiracy (Ryan Gabbard)
Arona's blog (coolest person in the world)