Welcome to Andrew Schein's Web Page




Andrew Schein is a Ph.D. candidate in the Department of Computer and Information Science at the University of Pennsylvania.


Contact Information:
Email is the best way to reach me.


Research Interests:
Recommender systems (e.g., recommending new movies based on movies a customer has previously seen), statistical natural language processing, information extraction, and statistical methods in machine learning and data mining. Current research focuses on extracting knowledge from biomedical text corpora.

Advisor: Lyle Ungar
Data Mining Research Group
The Roos Laboratory
Computational Biology and Informatics Laboratory



Publications:

Andrew I. Schein, Alexandrin Popescul, Lyle H. Ungar, and David M. Pennock. Evaluating Hot- and Cold-Start Recommendations. Submitted.

Andrew I. Schein, Sharon J. Diskin, S. Ted Sandler, Fernando C. N. Pereira and Lyle H. Ungar. Bootstrapping Annotation for Information Extraction using Pre-Existing Knowledge Sources. Submitted.

Andrew I. Schein, Lawrence K. Saul, and Lyle H. Ungar. A Generalized Linear Model for Principal Component Analysis of Binary Data. Appeared in Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics. January 3-6, 2003. Key West, FL. [.ps.gz] [.pdf] [talk]

Andrew I. Schein. Notes on the CROC Curve. Unpublished. [.ps.gz] [.pdf]

Andrew I. Schein, Alexandrin Popescul, Lyle H. Ungar, and David M. Pennock. Methods and Metrics for Cold-Start Recommendations. Appeared in Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002), pp. 253-260. August 11-15, 2002. Tampere, Finland. [.ps.gz] [.pdf]

Andrew I. Schein, Alexandrin Popescul, Lyle H. Ungar, and David M. Pennock. Generative Models for Cold-Start Recommendations. Workshop on Recommender Systems at SIGIR, 2001. [.ps.gz] [.pdf]

Andrew I. Schein, Alexandrin Popescul, and Lyle H. Ungar. PennAspect: A Two-Way Aspect Model Implementation. University of Pennsylvania Department of Computer and Information Science, Technical Report MS-CIS-01-25. [.ps]

Andrew I. Schein, Jessica C. Kissinger, and Lyle H. Ungar. Chloroplast Transit Peptide Prediction: a Peek Behind the Black Box. Nucleic Acids Research, 2001, Vol. 29, No. 16, e82. View Online.

Software Downloads:
1. PennAspect - an implementation of the two-way aspect model.
2. ROCtools - Java code to build ROC curves and variants. Includes GROC and CROC varieties for recommender system evaluation.
3. LPCA - ALS model fitting for Logistic Principal Component Analysis (written in MATLAB).

TA Duties:
I have fulfilled my TA requirements. In the past, I have TA'd:
1. CSE240 - Computer Architecture
2. CIS535/BIOL536 - Computational Biology

Eniac 2000:
I used to be the administrator for Lyle's Eniac 2000 group. The resource page is located here. Eniac 2000 has been superseded by Liniac.

WPE II:
You can read my Written Preliminary Examination II paper on various probabilistic models for machine learning applications: [.ps.gz] [.pdf].

Random Bits:
Link Page here.
Software Development Links.
Deadlines.
A Guide to Machine Learning Courses at Penn.