Partha Pratim Talukdar

Doctoral Student
Computer & Information Science Department
University of Pennsylvania
Philadelphia, PA - 19104

Email: partha at cis dot upenn dot edu

CV
Advisors: Mark Liberman, Fernando Pereira, Zack Ives.

Research

I am primarily interested in machine learning, natural language processing and data integration. My recent research has focused on learning algorithms for large-scale information extraction and data integration.

Research Groups: Structured Learning at Penn, Penn Research in Machine Learning (PRIML), Penn Natural Language Processing and Penn BioIE Group


Publications

2009

New Regularized Algorithms for Transductive Learning [ Slides ] [ Video ]
Partha Pratim Talukdar, Koby Crammer
European Conference on Machine Learning (ECML-PKDD) 2009, Bled, Slovenia.

Sequence Learning from Data with Multiple Labels [ Slides ]
Mark Dredze, Partha Pratim Talukdar, Koby Crammer
ECML-PKDD 2009 workshop on Learning from Multi-Label Data (MLD 09), Bled, Slovenia.

Interactive Data Integration through Smart Copy and Paste
Zack Ives, Craig Knoblock, Steve Minton, Marie Jacob, Partha Talukdar, Rattapoom Tuchinda, Jose Luis Ambite, Maria Muslea, Cenk Gazen.
Conference on Innovative Data Systems Research (CIDR) 2009, Asilomar, California.

Regularized Learning with Networks of Features.
Ted Sandler, John Blitzer, Partha Pratim Talukdar, Lyle H. Ungar.
Advances in Neural Information Processing Systems (NIPS) 2009.

Topics in Graph Construction for Semi-Supervised Learning
Partha Pratim Talukdar
UPenn CIS Technical Report MS-CIS-09-13 (WPE-II review report, non-refereed).

2008

Weakly Supervised Acquisition of Labeled Class Instances using Graph Random Walks [ Slides ]
Partha Pratim Talukdar, Joseph Reisinger, Marius Pasca, Deepak Ravichandran, Rahul Bhagat, Fernando Pereira.
EMNLP 2008, Honolulu, Hawaii.

The Orchestra Collaborative Data Sharing System.
Todd J. Green, Grigoris Karvounarakis, Nicholas E. Taylor, Val Tannen, Partha Pratim Talukdar, Marie Jacob, Fernando Pereira.
ACM SIGMOD Record, September 2008.

Learning to Create Data-Integrating Queries [ Slides ]
Partha Pratim Talukdar, Marie Jacob, Mohammad Salman Mehmood, Koby Crammer, Zack Ives, Fernando Pereira, Sudipto Guha.
34th International Conference on Very Large Databases (VLDB 2008), Auckland, New Zealand.

A Rate-Distortion One-Class Model and its Applications to Clustering. [ Slides ] [ Video ]
Koby Crammer, Partha Pratim Talukdar, Fernando Pereira.
International Conference on Machine Learning (ICML) 2008, Helsinki, Finland.

DRASO: Declaratively Regularized Alternating Structural Optimization. [ Slides ] [ Video ]
Partha Pratim Talukdar, John Blitzer, Ted Sandler, Mark Dredze, Koby Crammer, Fernando Pereira.
ICML 2008 Workshop on Prior Knowledge for Text and Language Processing, Helsinki, Finland.

2007

Lightly-Supervised Attribute Extraction.
Kedar Bellare, Partha Pratim Talukdar, Giridhar Kumaran, Fernando Pereira, Mark Liberman, Andrew McCallum and Mark Dredze.
NIPS 2007 Workshop on Machine Learning for Web Search.

Frustratingly Hard Domain Adaptation for Dependency Parsing.
Mark Dredze, John Blitzer, Partha Pratim Talukdar, Kuzman Ganchev, Joao Graca, and Fernando Pereira.
CoNLL Shared Task Session of EMNLP-CoNLL 2007, Prague.

Automatic Code Assignment to Medical Text.
Koby Crammer, Mark Dredze, Kuzman Ganchev, Partha Pratim Talukdar and Steve Caroll.
BioNLP 2007, Prague.

2006

A Context Pattern Induction Method for Named Entity Extraction [ Slides ]
Partha Pratim Talukdar, Thorsten Brants, Mark Liberman and Fernando Pereira
Tenth Conference on Computational Natural Language Learning (CoNLL-X), New York City, June 8-9, 2006.

2004

Hindi Text Normalization.
K. Panchapagesan, Partha Pratim Talukdar, N. Sridhar Krishna, Kalika Bali, A.G. Ramakrishnan.
Fifth International Conference on Knowledge Based Computer Systems (KBCS), 19-22 December 2004, Hyderabad, India.

Phonetic Distance Based Cross-lingual Search.
Sriram S., Partha Pratim Talukdar, Sameer Badaskar, Kalika Bali, A.G. Ramakrishnan.
International Conference on Natural Language Processing, 19-22 December 2004, Hyderabad, India.

Optimal Creation of Speech Databases for Indian Language Speech Technology
Satinder Singh, Partha Talukdar, Sridhar Krishna, Sandeep Manocha, Kalika Bali, Sitaram R.N.V.
International Conference on Speech and Language Technology / O-COCOSDA, 17-19 November 2004, New Delhi, India.

Tools for the Development of a Hindi Speech Synthesis System
Kalika Bali, A.G. Ramakrishnan, Partha Pratim Talukdar, N. Sridhar Krishna.
5th ISCA Speech Synthesis Workshop, 14th-16th June 2004, Carnegie Mellon University, USA.

Duration Modeling for Hindi Text-to-Speech Synthesis.
N. Sridhar Krishna, Partha Pratim Talukdar, Kalika Bali, A.G. Ramakrishnan.
8th International Conference on Spoken Language Processing (ICSLP), 4th-8th October 2004, Jeju Island, Korea.

Automatic Generation of Compound Word Lexicon for Hindi Speech Synthesis.
Deepa S.R., A.G. Ramakrishnan, Kalika Bali, Partha Pratim Talukdar.
Language Resources and Evaluation Conference (LREC) 2004, Portugal, 26-28 May 2004.


Software

OCRD: One Class Algorithm based on Rate-Distortion theory (download)
An algorithm to choose a coherent subset of points from a large set. Please see A Rate-Distortion One-Class Model and its Applications to Clustering for details.



You can't miss this: Meet your meat
What is your ecological footprint? Check here
Here is something for a surprise!
Stanford commencement speech by Steve Jobs (I was also there)

Murals in Philadelphia

You can check Henry's current status here!