Penn Data Mining Group: Publications


Main Research Areas

Feature Selection

When one has far more potentially predictive features than observations, standard penalty methods such as AIC and BIC fail. We have developed a set of Streamwise Feature Selection (SFS) methods which support interleaving the generation and selection of features. SFS excels when a huge number of features can be generated (given enough computer time), but only a small fraction of them are significant.
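
As a rough illustration of the streamwise idea, the sketch below implements a toy alpha-investing selector: candidate features are tested one at a time as they are generated, each test spends part of a shrinking alpha-wealth, and each accepted feature earns wealth back. The pay-out constants and the simple F-test used here are simplifications chosen for the example, not the exact rules from the papers below.

```python
import numpy as np
from scipy import stats

def _rss(Z, y):
    """Residual sum of squares from a least-squares fit of y on the columns of Z."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ beta
    return float(r @ r)

def streamwise_select(feature_stream, y, w0=0.5, payout=0.5):
    """Toy alpha-investing streamwise feature selection (illustrative constants)."""
    n = len(y)
    kept_cols, kept_idx = [np.ones(n)], []   # the model always keeps an intercept
    wealth = w0                              # alpha-wealth left for future tests
    for j, x in enumerate(feature_stream):
        if wealth <= 0:
            break
        alpha_j = wealth / (2 * (j + 1))     # spend a shrinking share of wealth
        Z0 = np.column_stack(kept_cols)
        Z1 = np.column_stack(kept_cols + [x])
        rss0, rss1 = _rss(Z0, y), _rss(Z1, y)
        df = n - Z1.shape[1]
        F = (rss0 - rss1) / (rss1 / df + 1e-12)   # 1-df test for the new feature
        if stats.f.sf(F, 1, df) < alpha_j:
            kept_cols.append(x)
            kept_idx.append(j)
            wealth += payout                  # a discovery earns alpha-wealth back
        else:
            wealth -= alpha_j                 # a failed test costs its alpha
    return kept_idx

# Synthetic check: 200 candidate features, only the first three are predictive.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 200))
y = X[:, 0] - 2 * X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=200)
print(streamwise_select((X[:, j] for j in range(200)), y))
```
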
  • Multi-Task Feature Selection using the Multiple Inclusion Criterion (MIC). Paramveer S. Dhillon, Brian Tomasik, Dean Foster and Lyle Ungar. ECML-PKDD (European Conference on Machine Learning) Bled, Slovenia, Sept. 2009
  • Transfer Learning, Feature Selection and Word Sense Disambiguation. Paramveer S. Dhillon and Lyle Ungar. ACL-IJCNLP (Annual Meeting of the Association of Computational Linguistics), Singapore, Aug. 2009
  • Efficient Feature Selection in the Presence of Multiple Feature Classes. Paramveer S. Dhillon, Dean Foster and Lyle H. Ungar, IEEE International Conference on Data Mining (ICDM), 2008.
  • Streamwise Feature Selection, Jing Zhou, Bob Stine, Dean Foster, and Lyle Ungar, Journal of Machine Learning Research (JMLR) 7, 1861-1885, 2006.
  • Streaming Feature Selection using alpha investing, Jing Zhou, Bob Stine, Dean Foster, and Lyle Ungar, SIGKDD-2005, 384-393, 2005.
  • Streaming Feature Selection, Lyle Ungar, Jing Zhou, Dean Foster and Bob Stine, AI and Statistics, 2005.
  • Streaming Feature Selection, Lyle Ungar, Dean Foster and Bob Stine, Snowbird Learning Conference, 2004.
  • Work in progress: feature selection methods make implicit assumptions about the distributions of the features; estimating these distributions leads to methods with superior performance.
  • Characterizing the generalization performance of model selection strategies, D. Schuurmans, D.P. Foster and L.H. Ungar, presented at ICML 1997. (pdf)

Text Mining

  • Positioning Knowledge: Schools of Thought and New Knowledge Creation, S. Phineas Upham, Lori Rosenkopf and Lyle H. Ungar, Scientometrics
  • Using Text Mining to Analyze User Forums. In the special issue "Web Mining for E-commerce & E-services", Journal of Online Information Review, 2009.
  • Efficient Clustering of Web-Derived Data Sets. Luis Sarmento, Alexander Kehlenbeck, Eugenio Oliveira, and Lyle Ungar, International Conference on Machine Learning and Data Mining (MLDM), 2009.
  • An Approach to Web-scale Named-Entity Disambiguation. Luis Sarmento, Alexander Kehlenbeck, Eugenio Oliveira, and Lyle Ungar, International Conference on Machine Learning and Data Mining (MLDM), 2009.
  • Finding cohesive clusters for analyzing knowledge communities. Vasileios Kandylas, S. Phineas Upham and Lyle H. Ungar, IEEE Knowledge and Information Systems 17(3) p. 335, (2008)
  • Multiway Clustering for Creating Biomedical Term Sets. V Kandylas, L Ungar, T Sandler, S Jensen Proceedings of the 2008 IEEE International Conference on Bioinformatics and Biomedicine (BIBM '08), 2008
  • Using Text Mining to Analyze User Forums R. Feldman, M. Fresko, J. Goldenberg, O. Netzer, L. Ungar 5th IEEE ICSSSM'08, Melbourne, 2008.
  • Web-Scale Named Entity Recognition Casey Whitelaw, Alex Kehlenbeck, Nemanja Petrovic and Lyle Ungar ACM 17th Conference on Information and Knowledge Management (CIKM), 2008
  • Information Extraction from Informal Texts. Presentation at ICDM, 2007.
  • Knowledge Positioning: Schools of Thought and New Knowledge Creation, Phin Upham, Lori Rosenkopf, and Lyle Ungar, 2006 Academy of Management Annual Meeting, August 11-16, Atlanta, Georgia.
  • Automatic Term List Generation for Entity Tagging, Ted Sandler, Andrew I. Schein and Lyle H. Ungar, Bioinformatics 22(6): 651, 2006.
  • Integrated Annotation for Biomedical Information Extraction, Seth Kulick, Ann Bies, Mark Liberman, Mark Mandel, Ryan McDonald, Martha Palmer, Andrew Schein and Lyle Ungar, HLT/NAACL, Boston, May 2004.
  • Shallow Semantic Annotation of Biomedical Corpora for Information Extraction Seth Kulick, Mark Liberman, Martha Palmer, and Andrew Schein. Proceedings of the 2003 ISMB Special Interest Group Meeting on Text Mining (a.k.a. BioLink). June 27, 2003. Brisbane, Australia. (slides)
    • As part of a large project on mining the Bibliome (Information Extraction from the Biomedical Literature), we annotated MEDLINE documents with entities and their relations, and used machine learning methods to do automatic tagging and information extraction.
See also Statistical Relational Learning.

Genomics and Proteomics

We apply a variety of statistical methods to problems in genomics and proteomics. Much of our recent work studies motifs involved in protein-protein and protein-DNA binding, including interactions between HIV and human proteins.

Clustering and Collaborative Filtering

Collaborative filtering problems (e.g., recommending movies based on what other people have liked) can be modeled and optimally solved using generative statistical models. Current (not yet published) work shows how EM methods on mixture models can be systematically changed to winner-take-all hard clustering methods. The evolution of citation clusters over time has implications for theories of the growth and decline of knowledge communities.
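
As a hedged sketch of that contrast (a simple spherical Gaussian mixture stands in for the generative models in our work; the data and parameters below are invented), the only change between soft EM and the winner-take-all variant is whether the E-step keeps fractional responsibilities or hard-assigns each row to its most probable cluster.

```python
import numpy as np

def mixture_em(X, k, hard=False, iters=50, seed=0):
    """EM for a spherical Gaussian mixture; hard=True uses winner-take-all
    assignments (classification EM), degenerating toward k-means-style clustering."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]          # initial cluster centers
    sigma2 = np.full(k, X.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: log responsibilities under spherical Gaussians.
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)          # (n, k)
        logp = np.log(pi) - 0.5 * d * np.log(2 * np.pi * sigma2) - sq / (2 * sigma2)
        if hard:
            r = np.zeros((n, k))
            r[np.arange(n), logp.argmax(1)] = 1.0    # winner-take-all assignment
        else:
            logp -= logp.max(1, keepdims=True)
            r = np.exp(logp)
            r /= r.sum(1, keepdims=True)             # soft responsibilities
        # M-step: update mixing weights, means, and variances from r.
        nk = r.sum(0) + 1e-9
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        sigma2 = (r * sq).sum(0) / (d * nk) + 1e-9
    return mu, r

# Toy "user taste" matrix: three groups of users with different mean ratings.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.3, size=(30, 5)) for m in (0.0, 2.0, 4.0)])
mu_soft, _ = mixture_em(X, 3)
mu_hard, _ = mixture_em(X, 3, hard=True)
print(np.round(np.sort(mu_soft.mean(1)), 2), np.round(np.sort(mu_hard.mean(1)), 2))
```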

Statistical Relational Learning

To build predictive models from data in relational databases, one needs to intelligently search the space of database queries. Clustering can be used to create new database tables, augmenting the relational database and reducing data sparsity problems. Feature selection on infinite streams of features requires careful control of false discovery rates.
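
The following sketch (table and column names are invented for illustration) shows the flavor of that query search: aggregation queries over a related table generate a stream of candidate features keyed to the target table, which could then be screened by a streamwise selector such as the one sketched under Feature Selection above.

```python
import numpy as np
import pandas as pd

# Hypothetical relational data: a target table of customers and a related table
# of their transactions (all names and values invented for this example).
rng = np.random.default_rng(2)
customers = pd.DataFrame({"cust_id": range(100),
                          "churned": rng.integers(0, 2, 100)})
transactions = pd.DataFrame({
    "cust_id": rng.integers(0, 100, 2000),
    "amount": rng.gamma(2.0, 20.0, 2000),
    "category": rng.choice(["food", "travel", "tech"], 2000),
})

def relational_feature_stream(target, related, key):
    """Yield (name, column) feature candidates built from aggregation queries over
    the related table -- a tiny slice of the query space one would search."""
    numeric = [c for c in related.columns
               if c != key and pd.api.types.is_numeric_dtype(related[c])]
    for col in numeric:
        for agg in ("mean", "sum", "max", "count"):
            feat = related.groupby(key)[col].agg(agg)
            yield f"{agg}({col})", target[key].map(feat).fillna(0.0).to_numpy()
    # Categorical columns become one count feature per value (a crude stand-in
    # for the cluster-derived tables mentioned above).
    for col in related.select_dtypes("object").columns:
        counts = related.pivot_table(index=key, columns=col,
                                     aggfunc="size", fill_value=0)
        for val in counts.columns:
            yield (f"count({col}={val})",
                   target[key].map(counts[val]).fillna(0).to_numpy())

for name, x in relational_feature_stream(customers, transactions, "cust_id"):
    print(name, x[:3])
```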

Active Learning

The Bayesian A-optimality condition from experimental design provides a principled foundation for active learning. Active learning significantly reduces the cost of tagging for word sense disambiguation. Multi-armed bandits and other methods support active learning for determining how to grasp an object or what path a robot should follow.
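
A minimal sketch of the A-optimality idea for Bayesian linear regression (the prior, noise level, and pool below are made up): each candidate query is scored by how much labeling it would shrink the trace of the posterior covariance of the weights, which has a closed form via a rank-one update.

```python
import numpy as np

def a_optimal_query(X_pool, Sigma, noise_var=1.0):
    """Score each pool point by how much labeling it would reduce trace(Sigma),
    the A-optimality criterion for Bayesian linear regression, and return the best."""
    scores = []
    for x in X_pool:
        Sx = Sigma @ x
        # trace reduction = ||Sigma x||^2 / (noise_var + x' Sigma x)  (rank-one update)
        scores.append((Sx @ Sx) / (noise_var + x @ Sx))
    return int(np.argmax(scores))

def posterior_update(Sigma, x, noise_var=1.0):
    """Sherman-Morrison update of the weight covariance after observing point x."""
    Sx = Sigma @ x
    return Sigma - np.outer(Sx, Sx) / (noise_var + x @ Sx)

# Toy run: actively pick 5 query points from a random pool; trace(Sigma) shrinks.
rng = np.random.default_rng(3)
X_pool = rng.normal(size=(200, 10))
Sigma = np.eye(10)                     # unit-variance Gaussian prior on the weights
for _ in range(5):
    i = a_optimal_query(X_pool, Sigma)
    Sigma = posterior_update(Sigma, X_pool[i])
    print("queried point", i, "trace now", round(float(np.trace(Sigma)), 3))
```
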
  • Active Learning for Vision-Based Robot Grasping, Salganicoff, M., L.H. Ungar, and R. Bajcsy, Machine Learning Journal 23, 251-278, 1996.
  • Active Exploration-Based ID-3 Learning for Robot Grasping, Salganicoff, M., L.G. Kunin and L.H. Ungar, Proceedings of the Workshop on Robot Learning, 11th Intl. Conf. on Machine Learning, July, 1994.
  • Active Exploration and Learning in Real-Valued Spaces using Multi-Armed Bandit, M. Salganicoff and L.H. Ungar, Proc. 12th Intl. Conf. on Machine Learning, July, 1995.

Neural Networks and Nonlinear Modeling

Neural networks can be viewed as regression models. Accurate prediction intervals can be derived by correctly estimating the degrees of freedom of the neural net model. Alternative statistical models such as MARS may or may not be more accurate, depending on the type of problem being solved. Combining neural nets with first-principles models improves performance.
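
A rough sketch of the prediction-interval idea: treat a small network as a nonlinear regression model and apply the delta method, using n - p as a crude stand-in for the degrees of freedom. The papers below refine this (e.g. estimating the effective degrees of freedom properly), so the constants and formulas here are only indicative.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy import stats

def net(w, x, h=5):
    """Tiny one-hidden-layer tanh network with parameters packed in a flat vector."""
    W1 = w[:2 * h].reshape(h, 2)                 # hidden weights and biases
    W2 = w[2 * h:]                               # output weights and bias
    z = np.tanh(W1 @ np.vstack([x, np.ones_like(x)]))
    return W2[:h] @ z + W2[h]

# Fit the network to noisy data by nonlinear least squares.
rng = np.random.default_rng(4)
x = np.linspace(-3, 3, 80)
y = np.sin(x) + 0.1 * rng.normal(size=x.size)
h = 5
w0 = 0.1 * rng.normal(size=3 * h + 1)
fit = least_squares(lambda w: net(w, x, h) - y, w0)

# Delta-method intervals: J is the Jacobian of the fitted values w.r.t. the weights,
# and n - p plays the role of the (here crudely counted) degrees of freedom.
J = fit.jac
n, p = x.size, w0.size
s2 = 2 * fit.cost / (n - p)                      # residual variance estimate
cov = s2 * np.linalg.pinv(J.T @ J)               # approximate weight covariance
pred_var = s2 + np.einsum("ij,jk,ik->i", J, cov, J)
half = stats.t.ppf(0.975, n - p) * np.sqrt(pred_var)
print("95% half-widths at the first five x values:", np.round(half[:5], 3))
```
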
  • Hybrid neural network models for environmental process control, R.D. De Veaux, R. Bain, and L.H. Ungar, Environmetrics 10(3), 225-236, 1999.
  • Prediction Intervals for Neural Networks via Nonlinear Regression, R. de Veaux, J. Schumi, J. Schweinsberg, D. Shellington and L.H. Ungar, Technometrics 40(4), 273-282, 1998. (pdf)
  • Estimating Monotonic Functions and Their Bounds. H. Kay and L.H. Ungar, AIChE Journal, 46(12), 2425-2434 (ps)
  • A Brief Introduction to Neural Networks R. De Veaux and L.H. Ungar, Unpublished, 1997. (pdf)
  • Estimating Prediction Intervals for Artificial Neural Networks , E. Rosengarten and R. de Veaux, Ninth Yale Workshop on Adaptive and Learning Systems, 1996. (pdf)
  • Multicollinearity: A Tale of Two Non-parametric Regressions, R.D. de Veaux and L.H. Ungar. In Selecting Models from Data: AI and Statistics IV (ed. P. Cheeseman and R.W. Oldford), pp. 293-302, Springer-Verlag, 1994. (pdf)
  • SVD-Net: An Algorithm which Automatically Selects Network Structure, Psichogios, D.C. and L.H. Ungar, IEEE Transactions on Neural Networks, 5(3) 513-515, 1994.
  • A Hybrid Neural Network - First Principles Approach to Process Modeling, D.C. Psichogios and L.H. Ungar, AIChE Journal, 1499-1512, October, 1992.
  • Using Radial Basis Functions to Approximate a Function and Its Error Bounds, Leonard, J.A., M.A. Kramer and L.H. Ungar, IEEE Transactions on Neural Networks, 3(4), 624-627, 1992.
  • A Neural Network Architecture that Computes its own Reliability, Leonard, J.A., M.A. Kramer, and L.H. Ungar, Computers and Chem. Engr., 16(9), 819-837, 1992.
Neural networks, particularly radial basis functions, can be attractive models for model-based process control, but accurate models do not guarantee stable control.
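
As a hedged illustration (the process, basis widths, and center selection below are made up), the sketch fits a Gaussian RBF network as a one-step-ahead process model; in model-based control such a model would sit inside the control optimization loop, and, as the papers note, model accuracy alone does not guarantee closed-loop stability.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian RBF features for inputs X given centers and a shared width."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

def fit_rbf(X, y, n_centers=20, width=0.5, ridge=1e-6, seed=0):
    """Fit an RBF network by ridge-regularized linear least squares; returns a predictor."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_centers, replace=False)]
    Phi = rbf_design(X, centers, width)
    w = np.linalg.solve(Phi.T @ Phi + ridge * np.eye(n_centers), Phi.T @ y)
    return lambda Xnew: rbf_design(Xnew, centers, width) @ w

# Toy one-step-ahead process model y[t+1] = f(y[t], u[t]) for an invented
# nonlinear process; the fitted model could then be used for model-based control.
rng = np.random.default_rng(5)
y_t = rng.uniform(-1, 1, 500)
u_t = rng.uniform(-1, 1, 500)
y_next = 0.8 * y_t + 0.3 * np.tanh(2 * u_t) + 0.02 * rng.normal(size=500)
model = fit_rbf(np.column_stack([y_t, u_t]), y_next)
print(model(np.array([[0.2, -0.5]])))
```
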
  • Radial Basis Function Neural Networks for Process Control, L.H. Ungar, T. Johnson and R.D. de Veaux, Computer-Integrated Manufacturing in the PROcess industries (CIMPRO) Proceedings, 357-364, 1994. (pdf)
  • A Statistical Basis for Using Radial Basis Functions for Process Control L.H. Ungar and R. de Veaux, American Control Conference (ACC) Proceedings, 1995. (pdf)
  • Neural Networks for Process Control, Ungar, L.H. E.D. Hartman J.D. Keeler and G.D Martin, Proc. Intelligent Systems in Process Engineering (ISPE '95), 1995.
  • Stability of Neural Net Based Model Predictive Control, Eaton, J.W., J.B. Rawlings, and L.H. Ungar, Proceedings of the ACC, 2481-85, 1994.
  • Direct and Indirect Model Based Control Using Artificial Neural Networks, Psichogios, D.C., and L.H. Ungar, I & EC Res. 30, 2564-2573, 1991.

Reinforcement Learning for Robotics and Multi-agent Systems

Robotic learning requires search for optimal control policies. This is often best done by exploring in the vicinity of a decision boundary between different control policies.
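
A minimal sketch of plain policy-gradient (REINFORCE) learning on an invented one-dimensional corridor task; the boundary-localized methods in the papers below refine how and where the gradient is estimated (concentrating effort near action transitions), which this vanilla version does not attempt.

```python
import numpy as np

def reinforce_corridor(episodes=2000, T=30, lr=0.05, seed=0):
    """Vanilla REINFORCE on a toy 1-D corridor: states 0..9, actions left/right,
    reward +1 for reaching the right end. Softmax policy with tabular parameters."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = 10, 2
    theta = np.zeros((n_states, n_actions))          # policy parameters
    for _ in range(episodes):
        s, traj = 0, []
        for _ in range(T):
            p = np.exp(theta[s] - theta[s].max())
            p /= p.sum()                             # softmax action probabilities
            a = rng.choice(n_actions, p=p)
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0
            traj.append((s, a, p))
            s = s_next
            if r > 0:
                break
        G = r                                        # episode return (0 or 1 here)
        for s_t, a_t, p_t in traj:
            grad = -p_t                              # gradient of log softmax ...
            grad[a_t] += 1.0                         # ... w.r.t. theta[s_t]
            theta[s_t] += lr * G * grad
    return theta

theta = reinforce_corridor()
print("greedy action per state:", theta.argmax(1))   # mostly 1 ('right') after training
```
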
  • Using Policy Gradient Reinforcement Learning on Autonomous Robot Controllers G. Z. Grudic, Vijay Kumar and L. H. Ungar, IEEE-RSJ International Conference on Intelligent Robots and Systems (IROS03), 2003
  • Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning, G. Z. Grudic and L. H. Ungar, Neural Information Processing Systems (NIPS*2001), Vancouver, Canada, 2001.
  • Learning Multi-agent Co-ordination using Secondary Reinforcers, G. Z. Grudic and L. H. Ungar, submitted, 2003.
  • Using Reinforcement Learning to Refine Autonomous Robot Controllers, G. Z. Grudic, V. Kumar, L. H. Ungar, submitted, 2003.
  • Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning, G. Z. Grudic and L. H. Ungar, Seventeenth International Joint Conference on Artificial Intelligence (IJCAI 01), Seattle, USA, August, 2001.
  • Localizing Search in Reinforcement Learning, G. Z. Grudic and L. H. Ungar, Proc. 18th National Conference on Artificial Intelligence (AAAI-00), 590-595, 2000. (postscript) (pdf)
  • Localizing Policy Gradient Estimates to Action Transitions, G. Z. Grudic and L. H. Ungar, ICML 2000. (postscript) (pdf)
  • Active Learning for Vision-Based Robot Grasping, Salganicoff, M., L.H. Ungar, and R. Bajcsy, Machine Learning Journal 23, 251-278, 1996.
  • Active Exploration-Based ID-3 Learning for Robot Grasping, Salganicoff, M., L.G. Kunin and L.H. Ungar, Proceedings of the Workshop on Robot Learning, 11th Intl. Conf. on Machine Learning, July, 1994.
  • Active Exploration and Learning in Real-Valued Spaces using Multi-Armed Bandit, M. Salganicoff and L.H. Ungar, Proc. 12th Intl. Conf. on Machine Learning, July, 1995.