CIS Seminars & Events

Spring 2014 Colloquium Series

January 16

Farnam Jahanian
Assistant Director for CISE, National Science Foundation
"Innovating for Society: Realizing the Transformative Impact of Computing and Communication in a Data-Driven World"

Abstract: Advances in computer and information science and engineering (CISE) are catalyzing a societal transformation.  We are witnessing unprecedented growth of scientific and social data, deep integration of the cyber and physical worlds, wireless connectivity at broadband speeds, and seamless virtualized access to resources in the cloud.  These advances are transforming the way we work, play, communicate, learn, and discover. Investments in ambitious, long-term research and infrastructure, as well as in the development of a computing and information technology workforce, enable these advances and are a national imperative. 

Foundational research is an integral part of a comprehensive innovation ecosystem with the capacity to yield new benefits for society.  CISE research provides a foundation for economic competitiveness and is critical to achieving our societal priorities including healthcare, smart transportation, sustainability, education and life-long learning, and public safety.

This talk will focus on technological advances and emerging trends that are shaping our future and accelerating the pace of discovery and innovation across all science and engineering disciplines. It will describe how these trends inform priorities and programs at the National Science Foundation. It will also highlight several recent national initiatives, including the National Big Data Research and Development Initiative, the National Robotics Initiative (NRI), the U.S. Ignite Initiative, and the Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative.

Bio: Farnam Jahanian leads the National Science Foundation Directorate for Computer and Information Science and Engineering (CISE).  He guides CISE, with a budget of over $850 million, in its mission to uphold the nation’s leadership in scientific discovery and engineering innovation through its support of fundamental research in computer and information science and engineering and of transformative advances in cyberinfrastructure. He has testified before Congress on a broad range of topics, including cybersecurity and Big Data.

Dr. Jahanian is on leave from the University of Michigan, where he holds the Edward S. Davidson Collegiate Professorship and served as Chair of Computer Science and Engineering from 2007 to 2011 and as Director of the Software Systems Laboratory from 1997 to 2000. His research on Internet infrastructure security formed the basis for the Internet security company Arbor Networks, which he co-founded in 2001 and chaired until its acquisition in 2010. Dr. Jahanian holds a master's degree and a Ph.D. in Computer Science from the University of Texas at Austin. He is a Fellow of the ACM, IEEE, and AAAS.

January 13

Victor Vianu
Computer Science Department, University of California San Diego
"Automatic Verification of Data-Centric Workflows"

Abstract: Workflows centered around data are increasingly common. Recently, tools have been developed for high-level specification of such workflows and other data-driven applications.  Such specification tools not only allow fast prototyping and improved programmer productivity but, as a side effect, provide convenient targets for automatic verification. A notable example is IBM's business artifact framework, successfully deployed in practice.

In this talk I will present a formal model of data-centric workflows based on business artifacts, and results on the automatic verification of such processes. Artifacts are tuples of relevant values, equipped with local state relations and access to an underlying database. They evolve under the action of services specified by pre- and post-conditions that correspond to workflow tasks. The verification problem consists of statically checking whether all runs of an artifact system satisfy desirable properties, expressed in an extension of linear-time temporal logic. I will exhibit several classes of specifications and properties that can be automatically verified. The results are quite encouraging and suggest that, unlike arbitrary software systems, significant classes of data-centric workflows may be amenable to fully automatic verification. This relies on a novel marriage of techniques from the database and computer-aided verification areas.
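
A toy sketch of the setup, in Python: an artifact evolves under services with pre- and post-conditions, and an LTL-style eventuality is checked over all bounded runs. The order-processing services and the property below are hypothetical, and far simpler than the artifact systems the talk considers.

    # Toy illustration (not the talk's formal model): an "artifact" is a record
    # of values; a "service" fires when its precondition holds and applies its
    # postcondition. We enumerate all bounded runs and check an LTL-style
    # property, G(approved -> F shipped), over each of them.
    SERVICES = {
        "approve": (lambda a: a["status"] == "created",  {"status": "approved"}),
        "ship":    (lambda a: a["status"] == "approved", {"status": "shipped"}),
        "cancel":  (lambda a: a["status"] == "created",  {"status": "cancelled"}),
    }

    def runs(artifact, depth):
        """Yield every run (sequence of states) of length <= depth."""
        enabled = [(name, post) for name, (pre, post) in SERVICES.items()
                   if depth > 0 and pre(artifact)]
        if not enabled:
            yield [artifact]
            return
        for name, post in enabled:
            for rest in runs({**artifact, **post}, depth - 1):
                yield [artifact] + rest

    def holds(run):
        """Once approved, the order is eventually shipped."""
        for i, state in enumerate(run):
            if state["status"] == "approved":
                if not any(s["status"] == "shipped" for s in run[i:]):
                    return False
        return True

    print(all(holds(r) for r in runs({"status": "created"}, depth=3)))  # True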

February 18

Hal Daume III
Computer Science, University of Maryland, College Park
"The Many Flavors of Language: Understanding and Adapting Statistical Models"

Abstract: Language use can vary along many axes, including genre, topic, register and communication medium. Rounded to two decimal places, 0.00% of all text produced today is newswire. Yet most of our statistical models are built from labeled data drawn from news and related media. These systems fall apart when applied to other types of language, often falling short of the performance of oft-maligned "rule-based systems." If we want statistical systems that we can use on the diverse types of language we see today (social media, scientific texts, speech, etc.), we essentially have two choices: annotate new types of data for all relevant tasks, or develop better learning technology. We take the second approach because it scales better to the large variety of types of language and the large number of interesting tasks.

I'll begin this exploration into language flavors by asking the question: when statistical models are applied to new domains, what goes wrong? Despite almost a decade of research in domain adaption, very little effort has gone into answering this question. My goal is to convince you that by taking this analysis problem seriously, we can develop much better hypotheses about how to build better systems. Once we understand the problem, I'll discuss my work that addresses the various aspects of the adaptation problem with applications ranging from simple text categorization through structured prediction and all the way to machine translation. (Along the way I'll also highlight applications of these technologies to other domains like vision and robotics.)

This is joint work with a large number of students and collaborators: Arvind Agarwal, Marine Carpuat, Larry Davis, Shobeir Fakhraei, Katharine Henry, Ann Irvine, David Jacobs, Jagadeesh Jagarlamudi, Abhishek Kumar, Daniel Marcu, John Morgan, Dragos Munteanu, Jeff Phillips, Chris Quirk, Rachel Rudinger, Avishek Saha, Abhishek Sharma, Suresh Venkatasubramanian.
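
One concrete point of reference for the adaptation theme is the speaker's well-known "frustratingly easy" feature-augmentation baseline (Daume III, ACL 2007), which maps each source example x to (x, x, 0) and each target example to (x, 0, x) so that a linear model learns shared and domain-specific weights jointly. A minimal numpy sketch, with hypothetical toy data (whether this particular method appears in the talk is an assumption):

    import numpy as np

    def easy_adapt(X, domains):
        """Feature augmentation: (x, x, 0) for source rows, (x, 0, x) for
        target rows."""
        n, d = X.shape
        out = np.zeros((n, 3 * d))
        out[:, :d] = X                      # shared copy
        for i, dom in enumerate(domains):
            if dom == "source":
                out[i, d:2*d] = X[i]        # source-specific copy
            else:
                out[i, 2*d:] = X[i]         # target-specific copy
        return out

    # Hypothetical usage: stack labeled source and (few) labeled target rows,
    # then train any off-the-shelf linear classifier on the augmented features.
    X = np.random.randn(6, 4)
    domains = ["source"] * 4 + ["target"] * 2
    print(easy_adapt(X, domains).shape)     # (6, 12)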

Bio: Hal Daume III is an associate professor in Computer Science at the University of Maryland, College Park. He holds joint appointments in UMIACS and Linguistics. He was previously an assistant professor in the School of Computing at the University of Utah. His primary research interest is in developing new learning algorithms for prototypical problems that arise in the context of language processing and artificial intelligence. This includes topics like structured prediction, domain adaptation and unsupervised learning, as well as multilingual modeling and affect analysis. He associates himself most with conferences like ACL, ICML, NIPS and EMNLP. He earned his PhD at the University of Southern California with a thesis on structured prediction for language (his advisor was Daniel Marcu). He spent the summer of 2003 working with Eric Brill in the machine learning and applied statistics group at Microsoft Research. Prior to that, he studied math (mostly logic) at Carnegie Mellon University while working at the Language Technology Institute.

February 27

Alekh Agarwal
Microsoft Research New York
"Representation, modeling and computation: Opportunities and challenges of modern datasets"

Abstract: Machine learning from modern datasets presents novel opportunities and challenges. Larger and more diverse datasets enable us to answer more complex statistical questions, but present computational challenges in designing algorithms that can scale. In this talk I will present two results, the first one about computational challenges and the second about an opportunity enabled by modern datasets in the context of representation learning.  

I will start by presenting a distributed machine learning system we developed to address the computational scalability problem. Our system obtains state-of-the-art computational results in many common classification and regression tasks. I will discuss both the communication and computational components of the system, along with experimental evaluation on industry-scale data as well as large datasets in the academic literature.  

In the second part of my talk, I will present my recent work on dictionary learning, also known as sparse coding. The goal here is to efficiently learn a basis such that each data point is a combination of only a small number of basis elements; applications arise in signal processing as well as machine learning. I will present an efficient algorithm which is guaranteed to recover the true dictionary, given enough data samples. This is the first recovery result for overcomplete dictionaries, and comes with an easy-to-implement algorithm.
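
For reference, the standard dictionary-learning (sparse coding) objective, in its generic form (the talk's exact assumptions may differ): given data x_1, ..., x_n, find an overcomplete dictionary D and sparse coefficients alpha_i minimizing

    \min_{D \in \mathbb{R}^{d \times k},\; \alpha_1, \dots, \alpha_n}
    \sum_{i=1}^{n} \| x_i - D \alpha_i \|_2^2
    + \lambda \sum_{i=1}^{n} \| \alpha_i \|_1,
    \qquad k > d,

where the \ell_1 penalty encourages each x_i to use only a few columns of D.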

March 6

John Duchi
Computer Science Department, University of California, Berkeley
"Computation, Communication, and Privacy Constraints on Learning"

Abstract: How can we maximally leverage available resources--such as computation, communication, multi-processors, or even privacy--when performing machine learning? In this talk, I will suggest statistical risk (a rigorous notion of the accuracy of learning procedures) as a way to incorporate such criteria in a framework for development of algorithms. In particular, we follow a two-pronged approach, where we (1) study the fundamental difficulties of problems, bringing in tools from optimization, information theory, and statistical minimax theory, and (2) develop algorithms that optimally trade among multiple criteria for improved performance. The resulting algorithms are widely applied in industrial and academic settings, giving up to order of magnitude improvements in speed and accuracy for several problems. To illustrate the practical benefits that a focus on the tradeoffs of statistical learning procedures brings, we explore examples from computer vision, speech recognition, document classification, and web search.
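
For concreteness, the "statistical risk" framing here is standard minimax risk: for a loss \ell, a family of distributions \mathcal{P}, and estimators \hat{\theta} built from n samples,

    \mathfrak{M}_n(\mathcal{P}, \ell)
    = \inf_{\hat{\theta}} \sup_{P \in \mathcal{P}}
      \mathbb{E}_P\!\left[ \ell\big( \hat{\theta}(X_1, \dots, X_n),\, \theta(P) \big) \right],

and resource constraints (communication budgets, privacy guarantees, computation) enter by restricting the class of estimators \hat{\theta} over which the infimum is taken.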

Bio: John is currently a PhD candidate in computer science at Berkeley, where he started in the fall of 2008. His research interests include optimization, statistics, machine learning, and computation. He works in the Statistical Artificial Intelligence Lab (SAIL) under the joint supervision of Michael Jordan and Martin Wainwright. He obtained his MA in statistics in Fall 2012, and received a BS and MS from Stanford University in computer science under the supervision of Daphne Koller. John has won several awards and fellowships, including a best student paper award at the International Conference on Machine Learning (ICML) and the NDSEG and Facebook graduate fellowships. John has also worked as a software engineer and researcher at Google Research under the supervision of Yoram Singer and Fernando Pereira.

March 18

Rishabh Singh
Computer Science and Artificial Intelligence Laboratory, MIT
"Program Synthesis for the Masses"

Abstract: New computing platforms have greatly increased the demand for programmers, but learning to program remains a big challenge. Program synthesis has the potential to revolutionize programming by making it more accessible. My work has focused on two goals: enabling programming for end-users through the use of more intuitive specification mechanisms, and generating automated feedback to help students learn programming. In this talk, I will present my work on two systems that work towards these goals: Autograder and FlashFill. The Autograder system provides automated feedback to students on introductory programming assignments, and was successfully piloted on thousands of programming exercises from edX. The FlashFill system helps end-users perform repetitive data transformations over strings, numbers, and tables using input-output examples. A part of FlashFill was shipped as part of Excel 2013 and was quoted as one of the top features by many press reviews. I will describe how algorithms built on advances in constraint-solving, machine learning, and formal verification enabled the new forms of interaction required by these systems.
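
A toy illustration of the programming-by-example idea behind FlashFill (the real system searches a far richer string-transformation language): enumerate a tiny, hypothetical DSL and return the first program consistent with all input-output examples.

    # Hypothetical four-operator DSL; FlashFill's actual language and search
    # are much more sophisticated.
    DSL = {
        "upper":      lambda s: s.upper(),
        "lower":      lambda s: s.lower(),
        "first_word": lambda s: s.split()[0],
        "initials":   lambda s: "".join(w[0].upper() + "." for w in s.split()),
    }

    def synthesize(examples):
        """Return the name of a DSL program consistent with every example."""
        for name, prog in DSL.items():
            if all(prog(inp) == out for inp, out in examples):
                return name
        return None

    examples = [("Ada Lovelace", "A.L."), ("Alan Turing", "A.T.")]
    print(synthesize(examples))  # initials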

Bio: Rishabh Singh is a PhD candidate in the Computer Science and Artificial Intelligence Laboratory at MIT. His research interests are broadly in formal methods and programming languages. His PhD work focuses on developing program synthesis techniques for making programming accessible to end-users and students. He is a Microsoft Research PhD fellow and winner of MIT's William A. Martin Outstanding Master's Thesis Award. He obtained his BTech in Computer Science and Engineering from IIT Kharagpur in 2008, where he was awarded the Institute Silver Medal and the Bigyan Sinha Memorial Award. He was also selected as a Prime Minister's National Guest at the Republic Day Parade in New Delhi in 2005.

March 25

Zachary Kincaid
Computer Science Dept., University of Toronto
"Parallel proofs for parallel programs"

Abstract: Writing correct concurrent software is hard.  My research aims to make it easier by developing algorithms for automatically proving that concurrent programs are free of certain types of errors (e.g., dividing by zero, overflowing buffers, or dereferencing null pointers).  In this talk, I will argue that in order to effectively reason about concurrent programs, we must abandon certain principles that we take for granted when reasoning about sequential programs; in particular, a global view of time.  Global time is an inherently sequential concept which imposes artificial ordering constraints between parallel events, and leads to a combinatorial explosion in the number of control states in a concurrent program.  My work proposes that the flow of data is the fundamental feature of interest in concurrent programs.  Data flow based reasoning allows the parallelism present in a program to be maintained in its proof of correctness, which enables tractable automated reasoning and succinct proofs.  In other words: my talk will be on finding parallel proofs for parallel programs.
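
The combinatorial explosion mentioned here is easy to quantify: two threads executing m and n atomic steps respectively admit

    \binom{m+n}{m}

interleavings, so two threads of just ten steps each already have \binom{20}{10} = 184{,}756 distinct schedules that a global-time analysis must, in principle, distinguish.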

March 27

Michael Carbin
Electrical Engineering and Computer Science Dept., MIT
"Reasoning about Approximate Computing"

Abstract: Many modern applications implement large-scale computations (e.g., machine learning, big data analytics, and financial analysis) in which there is a natural trade-off between the quality of the results that the computation produces and the performance and cost of executing the computation.

Exploiting this fact, researchers have recently developed a variety of new mechanisms that automatically change the structure and execution of an application to enable it to meet its performance requirements. Examples of these mechanisms include skipping portions of the application's computation and executing the application on fast and/or energy-efficient unreliable hardware systems whose operations may silently produce incorrect results.
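
One mechanism of the first kind is loop perforation: run only a fraction of a loop's iterations and accept an approximate result. A minimal sketch of the idea (an illustration, not any particular system):

    def mean_perforated(xs, k=4):
        """Approximate the mean using every k-th element: roughly 1/k of
        the work, at some cost in accuracy."""
        kept = xs[::k]
        return sum(kept) / len(kept)

    data = list(range(1_000_000))
    print(mean_perforated(data, k=1))   # exact mean: 499999.5
    print(mean_perforated(data, k=4))   # approximate, ~4x fewer iterations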

I present a program verification and analysis system, Rely, whose novel verification approach makes it possible to verify the safety, security, and accuracy of the approximate applications that these mechanisms produce. Rely also provides a program analysis that makes it possible to verify the probability that an application executed on unreliable hardware produces the same result as if it were executed on reliable hardware.
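
The probability Rely reasons about can be illustrated empirically. Below is a Monte Carlo sketch under a hypothetical fault model in which each addition silently returns a wrong value with probability 0.01; Rely derives such bounds statically rather than by sampling.

    import random

    def add_unreliable(a, b, p_correct=0.99):
        """Hypothetical unreliable-hardware model: the sum is silently off
        by one with probability 1 - p_correct."""
        r = a + b
        return r if random.random() < p_correct else r + random.choice([-1, 1])

    def reliability(trials=100_000):
        """Estimate Pr[unreliable result == reliable result] for a
        three-addition computation (roughly 0.99**3, about 0.970)."""
        ok = 0
        for _ in range(trials):
            x = add_unreliable(1, 2)
            y = add_unreliable(x, 3)
            z = add_unreliable(y, 4)
            ok += (z == 10)
        return ok / trials

    print(reliability())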

Bio: Michael Carbin is a Ph.D. Candidate in Electrical Engineering and Computer Science at MIT. His interests include the design and implementation of programming systems that deliver improved performance and resilience by incorporating approximate computing and self-healing. His undergraduate research at Stanford University received an award for Best Computer Science Undergraduate Honors Thesis. As a graduate student, he has received the MIT Lemelson Presidential and Microsoft Research Graduate Fellowships. His recent research on verifying the reliability of programs that execute on unreliable hardware received a best paper award at OOPSLA 2013.

April 1

Finale Doshi-Velez
School of Engineering and Applied Sciences and the Center for Biomedical Informatics, Harvard University
"Prediction and Interpretation with Complex Data"

Abstract: Not only are today's data more complex than ever before, but the tasks and explanations that we want from them are growing in sophistication.  Latent variable models provide a powerful tool for summarizing data through a set of hidden variables.  These models are generally trained to maximize prediction accuracy, and modern latent variable models now do an excellent job of finding compact summaries of the data with high predictive power.  However, there are many situations in which good predictions alone are not sufficient. Whether the hidden variables have inherent value by providing insights about the data, or whether we wish to interface with domain experts on how to improve a system, understanding what the discovered hidden variables mean is an important first step.

In this talk, I will discuss how the language of probabilistic modeling naturally and flexibly allows us to incorporate information about how humans organize knowledge in addition to finding predictive summaries of data.  In particular, I will talk about how a new model, GraphSparse LDA, discovers interpretable latent structures without sacrificing (and sometimes improving upon!) prediction accuracy.  The model incorporates knowledge about the relationships between observed dimensions into a probabilistic framework to find a small set of human-interpretable "concepts" that summarize the observed data.  This general approach can be applied to a wide variety of domains, and in particular we use it to recover interpretable descriptions of novel, clinically-relevant autism subtypes from a medical dataset with thousands of dimensions.

Bio: Finale Doshi-Velez is a postdoctoral fellow jointly between Harvard's School of Engineering and Applied Sciences and the Center for Biomedical Informatics.  She received her MSc from the University of Cambridge in 2009 (as a Marshall Scholar) and her PhD from MIT in 2012.  She was selected as one of IEEE's "AI 10 to  Watch" in 2013.  Her research interests include latent variable modeling, sequential decision-making, and clinical applications.

April 3

Sivan Sabato
Microsoft Research New England
"Learning with Lower Information Costs"

Abstract: In this talk I will consider learning with lower information costs, focusing on linear regression. Linear regression is one of the most widely used methods for prediction and forecasting, with widespread use in fields such as the natural sciences, economics and medicine. I will show how to improve the information costs of linear regression in two settings. First, I will present a new estimation algorithm for the standard supervised regression setting. This is the first efficient estimator that enjoys minimax optimal sample complexity, up to log factors, for general heavy-tailed distributions. The technique is general and can be applied to a larger class of smooth and strongly convex losses. Second, I will consider the challenge of using crowdsourcing for labeling in tasks that usually require experts, and show how to achieve this using linear regression combined with a feature multi-selection approach.

Based on joint work with Daniel Hsu and Adam Kalai.
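
A sketch in the spirit of median-of-means estimators for heavy-tailed regression (the Hsu-Sabato line of work; the talk's estimator may differ in detail): fit ordinary least squares on disjoint subsamples, then return the candidate with the smallest median distance to the others.

    import numpy as np

    def heavy_tailed_regression(X, y, k=5):
        """Fit OLS on k disjoint subsamples; pick the estimate closest (in
        median distance) to its peers, which is robust to a few subsamples
        being corrupted by heavy-tailed noise."""
        n = len(y)
        groups = np.array_split(np.random.permutation(n), k)
        cands = [np.linalg.lstsq(X[g], y[g], rcond=None)[0] for g in groups]
        med = lambda w: np.median([np.linalg.norm(w - v) for v in cands])
        return min(cands, key=med)

    # Toy usage with heavy-tailed (Student-t, df=2) noise:
    np.random.seed(0)
    X = np.random.randn(500, 3)
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + np.random.standard_t(2, size=500)
    print(heavy_tailed_regression(X, y, k=5))   # close to w_true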

Bio: Sivan Sabato is a post-doctoral researcher at Microsoft Research New England. Her main research interests are in statistical machine learning theory and its applications. Sivan received her M.Sc. in Computer Science from the Technion, and her Ph.D. in Computer Science from the Hebrew University of Jerusalem. She is an alumna of the Adams fellowship program for outstanding Ph.D. students, and has been awarded several honors, including the Wolf Prize for outstanding M.Sc. thesis, the Google Anita Borg Scholarship, and the Intel Excellence Award.

April 8

Wen-Hann Wang
Managing Director of Intel Labs
"Energy Efficient Computing: Enabling The New Digital Society"

Abstract: By 2015, over 2.5 billion people with more than 10 billion devices will be connected to the Internet. This explosion of connected devices will extend beyond datacenters and handheld devices to a new world of more intelligent systems such as appliances, cars, clothing, furniture, roads and smart cities, giving way to a new era of Integrated Computing, where computing will be everywhere. This new era of Integrated Computing promises to usher in compelling new, personalized user experiences, and will demand innovations in circuits, architectures, power management and energy efficiency with minimal impact on the environment.

The challenge is to deliver a mobile user experience where neither power nor efficiency is compromised. Simply put, we must find a way to provide longer battery life in the mobile devices that we rely on without sacrificing computing power.

In his session, Wen-Hann Wang will highlight innovations in Intel Labs technologies that are delivering dramatic gains in platform power management and energy efficiency. These innovations are enabling smaller form factors like wearables and opening up new realms of capabilities such as voice, touch, and gestures in integrated computing.

Bio: Wen-Hann Wang is corporate vice president and managing director of Intel Labs. He is responsible for Intel's global research efforts in computing and communications. Prior to his current assignment, he served as vice president of Intel Labs and director of circuits and systems research. Before returning to the labs in 2009, Wang served for nine years on the Software and Services Group (SSG) staff, where he held general management positions for the Core Software, Managed Runtime, and Middleware Products divisions. He was also instrumental in establishing SSG's presence in the PRC and served as general manager for Intel Asia Pacific R&D Limited.

Wang joined Intel in 1991 as an Intel® Pentium® Pro platform architect, working on the highly successful P6 product family. His platform architecture and analysis work was instrumental in the creation of the Intel® Xeon® processor product line. He served as platform infrastructure research manager of the newly formed Intel Microprocessor Research Lab (MRL) in 1995 and later became director of the Emerging Platforms Lab, delivering cutting-edge technologies and reference platforms for Intel product groups.

Wang holds 15 patents and has received numerous technical awards, including election as an IEEE Fellow. Prior to joining Intel in 1991, Wang served as a research staff member at the IBM T. J. Watson Research Lab. He has worked and studied on three continents.

Wang received his bachelor's degree in electrical engineering from National Taiwan University, a master's degree in electrical engineering from Philips International Institute of Technological Studies (Netherlands), and a Ph.D. in computer science from the University of Washington.

April 10

Emina Torlak
Computer Science Department, University of California, Berkeley
"Programming for Everyone: From Solvers to Solver-Aided Languages and Beyond"

Abstract: We live in a software-driven world.  Software helps us communicate and collaborate; create art and music; and make discoveries in biological, physical, and social sciences. Yet the growing demand for new software, to solve new kinds of problems, remains largely unmet.  Because programming is still hard, developer productivity is limited, and so is end-users' ability to program on their own.

In this talk, I present a new approach to constructing programs, which exploits advances in constraint solving to make programming easier for experts and more accessible to everyone else.  The approach is based on two observations.  First, much of everyday programming involves the use of domain-specific languages (DSLs) that are embedded, in the form of APIs and interpreters, into modern host languages (for example, JavaScript, Scala or Racket). Second, productivity tools based on constraint solvers (such as verification or synthesis) work best when specialized to a given domain.  Rosette is a new kind of host language, designed for easy creation of DSLs that are equipped with solver-based tools.  These Solver-Aided DSLs (SDSLs) use Rosette's symbolic virtual machine (SVM) to automate hard programming tasks, including verification, debugging, synthesis, and programming with angelic oracles.   The SVM works by compiling SDSL programs to logical constraints understood by SMT solvers, and then translating the solver's output to counterexamples (in the case of verification), traces (in the case of angelic execution), or code snippets (in the case of synthesis and debugging).  Rosette has hosted several new SDSLs, including imperative SDSLs for data-parallel and spatial programming; a functional SDSL for specifying executable semantics of secure stack machines; and a declarative SDSL for web scraping by example.
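
The solver-aided recipe can be illustrated with Z3's Python bindings (Rosette itself is hosted in Racket; this sketch shows only the underlying idea): compile a program fragment to an SMT term, then verification becomes a search for a counterexample and synthesis a search for a constant.

    from z3 import Int, If, ForAll, Solver, unsat

    x = Int("x")
    abs_x = If(x < 0, -x, x)           # a tiny program, compiled to an SMT term

    # Verification: look for an input violating the spec abs_x >= 0.
    s = Solver()
    s.add(abs_x < 0)                   # negated specification
    print("verified" if s.check() == unsat else f"counterexample: {s.model()}")

    # Synthesis: find a constant c such that x + c == x + 1 for every x.
    c = Int("c")
    s2 = Solver()
    s2.add(ForAll([x], x + c == x + 1))
    s2.check()
    print("synthesized c =", s2.model()[c])    # c = 1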

Bio: Emina Torlak is a researcher at U.C. Berkeley, working at the intersection of software engineering, formal methods, and programming languages.  Her focus is on developing tools that help people build better software more easily.  She received her B.Sc. (2003), M.Eng. (2004) and Ph.D. (2009) from MIT, where she developed Kodkod, an efficient SAT-based solver for relational logic.  Kodkod has since been used in over 70 tools for verification, debugging, and synthesis of code and specifications.  Emina has also worked on a wide range of domain-specific formal methods.  She won an ACM SIGSOFT distinguished paper award for her work at LogicBlox, where she built a system for synthesizing massive data sets, used in testing of decision support applications.  As a member of IBM Research, she led the development of a tool for bounded verification of memory models, enabling the first fully automatic analysis of the Java Memory Model.  These experiences inspired her current research on solver-aided languages, which aims to reduce the effort of applying formal methods to new problem domains.

April 15

Heming Cui
Computer Science Dept., Columbia University
"Adding Simplicity and Order in Today's Complex Software"

Abstract: Real-world software becomes increasingly complex and hard to get right due to two trends. First, driven by the rise of the multi-core hardware, software is moving from single-threaded to multi-threaded, and the concurrently running threads can interleave in too many different ways. These messy thread interleavings aggravate understanding, testing, and analyzing of software, and can easily lead to wrong outputs and security breaches. Second, driven by the diverse computational demands, software is getting larger and larger, leading to exponential growth of program paths. It is extremely hard for existing techniques to check critical programming rules (e.g., allocated memory must be freed, and assertions must always succeed) on these exponentially many program paths, thus software is prone to harmful bugs (e.g., resource leaks and crashes). In this talk, I will present two new techniques that can make software much easier to get right by removing unnecessary software complexity. First, I will present Stable Multi-Threading (StableMT), a reliable and secure parallelism paradigm that can greatly reduce the amount of thread interleavings, and a practical StableMT runtime system that is fast and scalable with a wide range of popular software. Second, I will introduce rule-directed pruning, a new algorithm that can soundly prune program paths irrelevant to a checked rule, leading to exponential speedup of a state-of-the-art verification tool, and eight new rule violations confirmed or fixed in widely used software. These two new techniques have broad applications, and they have been leveraged by researchers to greatly improve several existing reliability and security techniques.
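
A minimal illustration of the StableMT idea (not the actual runtime system): force threads through one deterministic round-robin schedule, collapsing the space of interleavings to a single one.

    import threading

    class DeterministicScheduler:
        """Round-robin turn-passing: each thread's steps run at fixed points
        in one global order, so every execution sees the same interleaving."""
        def __init__(self, n_threads):
            self.turn, self.n = 0, n_threads
            self.cv = threading.Condition()

        def step(self, tid, action):
            with self.cv:
                while self.turn != tid:
                    self.cv.wait()
                action()                          # the thread's next step
                self.turn = (self.turn + 1) % self.n
                self.cv.notify_all()

    log, sched = [], DeterministicScheduler(2)

    def worker(tid):
        for i in range(3):
            sched.step(tid, lambda: log.append((tid, i)))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(log)   # always [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]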

Bio: Heming Cui is a PhD candidate in Computer Science at Columbia University. His thesis research is focused on improving software reliability and security with runtime and program analysis techniques. His general interests are in operating systems, programming languages, and distributed systems. Before joining Columbia, he obtained his master's and bachelor's degrees from Tsinghua University in Beijing, China.

April 22

Saul Gorn Memorial Lecture
Eric Horvitz
Distinguished Scientist & Managing Director, Microsoft Research
"Data, Predictions, and Decisions in Support of People and Society"

Abstract: I will present work on harnessing data to build predictive models that can guide decision making. First, I will discuss opportunities to harness data for prediction and decision making in healthcare. I will focus on efforts on readmissions reduction and hospital-associated infections to highlight broader possibilities for using machine learning and decision support to enhance the quality and reduce the cost of healthcare. Then, I will describe the use of anonymized behavioral data drawn from web services as a large-scale sensor network for public health. I will describe several efforts, including the use of data from logs of web search to identify adverse effects of medications. Finally, I will discuss directions in building systems that are designed to leverage the complementary aspects of machine and human intelligence to assist people in daily life and to solve problems in science and society.

Bio: Eric Horvitz is distinguished scientist and director at Microsoft Research.  His interests span theoretical and practical challenges with developing systems that perceive, learn, and reason, with a focus on inference and decision making under uncertainty.  He has been elected a fellow of AAAI, AAAS, the American Academy of Arts and Sciences, and the National Academy of Engineering, and has been inducted into the CHI Academy.  He received PhD and MD degrees at Stanford University.  Information on publications, collaborations, and activities can be found at http://research.microsoft.com/~horvitz.

April 24

Byron Boots
Robotics and State Estimation, University of Washington
"Machine Learning For Modeling Real-World Dynamical Systems"

Abstract: A major challenge in machine learning is to reliably and automatically discover hidden structure in high-dimensional data. This is an especially formidable problem for sequential data: revealing the dynamical system that governs a complex time series is often not just difficult, but provably intractable. Popular maximum likelihood strategies for learning dynamical system models are slow in practice and often get stuck at poor local optima, problems that greatly limit the utility of these techniques when learning from real-world data. Although these drawbacks were long thought to be unavoidable, recent work has shown that progress can be made by shifting the focus of learning to realistic instances that rule out the intractable cases.

In this talk, I will present a new family of computational approaches for learning dynamical system models. The key insight is that low-order moments of observed data often possess structure that can be revealed by powerful spectral decomposition methods, and, from this structure, model parameters can be directly recovered. Based on this insight, we design highly effective algorithms for learning parametric models like Kalman Filters and Hidden Markov Models, as well as an expressive new class of nonparametric models via reproducing kernels. Unlike maximum likelihood-based approaches, these new learning algorithms are statistically consistent, computationally efficient, and easy to implement using established matrix-algebra techniques. The result is a powerful framework for learning dynamical system models with state-of-the-art performance on video, robotics, and biological modeling problems.
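
A sketch of the spectral idea for a noiseless linear dynamical system x_{t+1} = A x_t, y_t = C x_t: a Hankel matrix of outputs factors through the hidden state, so an SVD reveals the state dimension, and shift-invariance recovers A up to similarity. The talk's algorithms handle noise, inputs, and nonparametric models; the system below is hypothetical.

    import numpy as np

    rng = np.random.default_rng(1)
    A = np.array([[0.9, 0.2],
                  [0.0, 0.8]])          # true dynamics, eigenvalues 0.9 and 0.8
    C = np.array([[1.0, 0.5]])
    x = rng.normal(size=2)

    y = []                              # observe a short output trajectory
    for _ in range(20):
        y.append((C @ x).item())
        x = A @ x

    # Hankel matrix H[i, j] = y[i + j] factors as (observability) x (states),
    # so its numerical rank reveals the hidden state dimension.
    H = np.array([[y[i + j] for j in range(10)] for i in range(10)])
    U, S, Vt = np.linalg.svd(H)
    r = int(np.sum(S > 1e-8))           # estimated state dimension: 2
    Ur = U[:, :r]

    # Shift-invariance of the observability matrix, Ur[1:] = Ur[:-1] @ A_hat,
    # recovers A up to a similarity transform (eigenvalues are preserved).
    A_hat = np.linalg.lstsq(Ur[:-1], Ur[1:], rcond=None)[0]
    print(r, np.linalg.eigvals(A_hat))  # 2, eigenvalues ~ {0.9, 0.8}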

Bio: Byron Boots is a postdoctoral researcher working with Dieter Fox in the Robotics and State Estimation Lab at the University of Washington. He received his Ph.D. in Machine Learning from Carnegie Mellon University in 2012 where he was advised by Geoffrey Gordon. Byron’s work on learning models of dynamical systems received the 2010 Best Paper award at the International Conference on Machine Learning (ICML-2010). His research focuses on modeling and control problems at the intersection of statistical machine learning, artificial intelligence, and robotics.

May 6

Harsha V. Madhyastha
Computer Science and Engineering Dept., University of California, Riverside
"Enabling Cloud-Based Services: The Cloud Isn't Everywhere"

Abstract: We are currently in the midst of a revolution with regards to how applications are delivered to users. Instead of simply shipping binaries that users install on their devices, application providers are increasingly shifting to the model of offering services. Google Docs, Instagram, Dropbox, and Words with Friends are popular examples of this paradigm shift. In all of these cases, the use of services enables application providers to offload application functionality from resource-constrained client devices such as smartphones and tablets, offer a seamless user experience across multiple devices, and enable content sharing.

However, the service-based application model requires application providers to incur additional costs associated with hosting and managing service deployments. Services also implicitly threaten user privacy and are constrained by the Internet in terms of the performance and availability perceived by users. In this talk, I will highlight some of the problems associated with existing best practices to address these challenges, and I will present an overview of three systems that we have developed to address these problems: WhyHigh, SPANStore,

Bio: Harsha V. Madhyastha is an Assistant Professor in the Computer Science and Engineering department at the University of California, Riverside. His research interests broadly span the areas of distributed systems, networking, and security and privacy. Many of the systems developed as part of his research have been widely used and have had significant impact. For example, WhyHigh has reduced latencies to Google by an order of magnitude for millions of users, the MyPageKeeper system for detecting social malware is in use by over 20,000 Facebook users, and Internet topology and performance data from the iPlane system has been used in research projects at over 100 institutions. His work has also resulted in award-winning papers at the USENIX NSDI and ACM SIGCOMM IMC conferences. His research is supported by the Army Research Laboratory (ARL), the Defense Threat Reduction Agency (DTRA), Amazon, VMware, a Google Research award, a NetApp Faculty Fellowship, and an NSF CAREER award.

May 14

Aarti Gupta
NEC Labs America
"Software Verification: Locks and MLoCs*"

Abstract: Software verification deals with proving or falsifying correctness properties of programs, and has broad applications from ensuring safety of mission-critical software to improving program robustness and programmer productivity. Approaches based on automatic techniques for state space exploration, such as model checking and symbolic execution, have seen a resurgence of interest in recent years, driven largely by advancements in solvers for Boolean Satisfiability (SAT) and Satisfiability Modulo Theories (SMT). These solvers perform the underlying search for proofs or bugs, typically by reasoning over abstract models of programs. While abstraction is critical in scaling to real-life programs, it is equally important to ensure enough precision in the models and the analysis to verify specific properties of interest. In this talk, I will describe my work in foundational techniques that address these twin challenges of precision and scalability in verifying sequential and multi-threaded programs. I will also share the lessons learnt in my efforts in driving them to industry practice where they are routinely used in a modular methodology to find complex bugs in software projects, some with millions of lines of code. Finally, I will discuss future extensions and applications of this research framework.

*MLoC: Million Lines of Code
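
A minimal bounded model checking sketch with Z3's Python bindings, illustrating the SAT/SMT-driven state-space search the abstract describes (the talk's tools operate on real C programs, not toy encodings): unroll a loop K steps, encode each iteration as constraints, and search for an assertion violation.

    from z3 import Int, If, Solver, sat

    # Program under test: x = n; s = 0; while x > 0: s += x; x -= 1
    # Assertion after the loop: 2*s == n*(n+1)
    K = 5                                        # unrolling bound
    n = Int("n")
    solver = Solver()
    solver.add(n >= 0, n <= K)                   # loop terminates within K steps

    x = [Int(f"x{i}") for i in range(K + 1)]
    acc = [Int(f"s{i}") for i in range(K + 1)]
    solver.add(x[0] == n, acc[0] == 0)
    for i in range(K):
        # One unrolled iteration; the state freezes once the guard fails.
        solver.add(x[i + 1] == If(x[i] > 0, x[i] - 1, x[i]))
        solver.add(acc[i + 1] == If(x[i] > 0, acc[i] + x[i], acc[i]))

    solver.add(2 * acc[K] != n * (n + 1))        # negated assertion
    if solver.check() == sat:
        print("assertion violated:", solver.model())
    else:
        print("assertion holds for all n in [0, %d]" % K)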

Bio: Aarti Gupta received her PhD in Computer Science from Carnegie Mellon University. Her broad research interests are in the areas of formal analysis of programs/systems and automatic decision procedures. At NEC Labs she has led research in systems analysis and verification. She has given several invited keynotes highlighting the foundational research contributions of her group in verification and their successful industrial deployment. She has served as an Associate Editor for Formal Methods in System Design (since 2005) and for the ACM Transactions on Design Automation of Electronic Systems (2008-2012). She has also served as Co-Chair for the International Workshop on Satisfiability Modulo Theories (SMT) 2010, the International Conference on Computer Aided Verification (CAV) 2008, and Formal Methods in Computer Aided Design (FMCAD) 2006. She is currently serving on the Steering Committee of CAV.