Mitchell P. Marcus

RCA Professor of Artificial Intelligence

Office Address:
Dept. of Computer & Information Science
University of Pennsylvania
3330 Walnut Street
503 Levine Hall
Philadelphia, PA 19104-6389

Email: mitch@cis.upenn.edu
Tel: (215) 898-2538
FAX: (215) 898-0587

I'm the RCA Professor of Artificial Intelligence in the Department of Computer and Information Science at the University of Pennsylvania, where I'm also Professor of Linguistics. I received my Ph.D. in 1978 from the MIT Artificial Intelligence Lab, and was a Member of Technical Staff at AT&T Bell Laboratories before coming to Penn in 1987. I have served as chair of Penn's Computer and Information Science Department, as chair of the Penn Faculty Senate, and as president of the Association for Computational Linguistics. I currently serve as chair of the Advisory Committee of the Center of Excellence in Human Language Technology at Johns Hopkins University. I was named a Fellow of the American Association for Artificial Intelligence in 1992.

I created and ran the Penn Treebank Project through the mid-1990s, which developed the primary training corpus that led to a breakthrough in the accuracy of natural language parsers for unrestricted text. My collaborators and I continue to develop hand-annotated corpora for use worldwide as training materials for statistical natural language systems. I am currently the principal investigator for an ARO-funded MURI project to investigate natural language understanding for human-robot interaction, with co-PIs at Stanford, Cornell, UMass Amherst, UMass Lowell and George Mason. My research interests include statistical natural language processing, human-robot communication, and cognitively plausible models for automatic acquisition of linguistic structure.

My past PhD students have gone on to teach at such schools as MIT, Johns Hopkins, University of Arizona, Queens College and the Naval Postgraduate School, and to work at such industrial research labs as IBM Research, BBN Technologies, and Microsoft Research.


Current Projects

Situation Understanding Bot Through Language And Environment (SUBTLE)

OntoNotes


Selected Publications

Q. Zhao, M. Marcus, A simple unsupervised learner for POS disambiguation rules given only a minimal lexicon, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp. 688-697, 2009.

S. Pradhan, E. Hovy, M. Marcus, M. Palmer, L. Ramshaw, and R. Weischedel, OntoNotes: A Unified Relational Semantic Representation, International Journal of Semantic Computing, Vol. 1, No. 4, 2007.

N. Habash, R. Gabbard, O. Rambow, S. Kulick, and M. Marcus, Determining Case in Arabic: Learning Complex Linguistic Behavior Requires Complex Linguistic Features, Proceedings of the 2007 Conference on Empirical Methods in Natural Language Processing, 2007.

R. Gabbard, S. Kulick, M. Marcus, Fully Parsing the Penn Treebank, HLT-NAACL 2006, New York, New York.

M. Marcus, B. Santorini, and M. Marcinkiewicz, Building a large annotated corpus of English: the Penn Treebank, Corpus Linguistics: Readings in a Widening Discipline, G. Sampson and D. McCarthy (eds.), Continuum, 2004. Also in Using Large Corpora, S. Armstrong (ed.), MIT Press, 1994. (reprinted from Computational Linguistics, 19(2), 1993)

M. Marcus (ed.), HLT 2002: Proceedings of the Second International Conference on Human Language Technology Research, Morgan Kaufmann, 2002.

L. Ramshaw, M. Marcus, Text Chunking using Transformation-Based Learning, Natural Language Processing Using Very Large Corpora, Armstrong et al. (eds.), Kluwer, 1998.

E. Brill, D. Magerman, M. Marcus, and B. Santorini, Deducing linguistic structure from the statistics of large corpora, Proceedings of DARPA Speech and Natural Language Workshop, June 1990, Morgan Kaufmann.

D. Magerman, M. Marcus, Parsing a natural language using mutual information statistics, Proceedings of AAAI 90.

M. Marcus, D. Hindle, and M. Fleck, D-Theory: Talking about talking about trees, Proceedings of the 21st Annual Meeting of the ACL, 1983.

M. Marcus, A Theory of Syntactic Recognition for Natural Language, MIT Press, 1980.

(Please note: Under many circumstances, I don't put my name on my students' papers. Please see my students' web sites for other current work.)

Current PhD Students

Former PhD Students