CIS 261 Fall 2012
Discrete Probability, Stochastic Processes, and Statistical Inference
Course Information
Course Description:
- Purpose:
The purpose of this course is to provide a 1 CU educational
experience which tightly integrates the theory and applications of discrete probability,
discrete stochastic processes, and discrete statistical inference in the study of
computer science.
- Intended Audience and Prerequisites:
The intended audience for this class is both current CS majors and students intending to
major in CS. The only prerequisite for this course is CIS 160.
Specifically, it will be assumed that students know Set Theory, Mathematical
Induction, Number Theory, Functions, Equivalence Relations, Partial-Order Relations,
Combinatorics, and Graph Theory at the level currently covered in CIS 160.
This course could be taken immediately following CIS 160. Computation and programming
will play an essential role in this course. Students will be expected to use either the
Maple or the Mathematica programming environment in homework exercises, which will include
numerical and symbolic computations, simulations, and graphical displays (see the illustrative
sketch at the end of this course description).
- Connections to Other CIS Courses:
The primary focus on the discrete case, i.e., countable sample spaces and state spaces,
fits very nicely with the focus on discrete structures in CIS 160, 262, and 320.
There will be a heavy emphasis on events defined in terms of strings.
- Other Features:
Topics on discrete stochastic processes will be selected from finite- and infinite-state Markov chains.
We will also show how these processes are related to Finite-State Machines.
Topics on discrete statistical inference will be selected from parameter estimation and
hypothesis testing for probability models for string generation, including finite-state
Markov chains. There will be some coverage of models with continuous state spaces, e.g., R;
we will address Central Limit Theorem ideas through exercises using both symbolic and
numerical computation. We will introduce the Minimum Description Length Paradigm to unite
basic ideas about randomness, inference, and computation.
The illustrations of the concepts and applications of the theory will all be selected from
computer science and other disciplines.
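- Illustrative Computational Sketch:
To give a flavor of the computational exercises described above (simulation, numerical checks of
Central Limit Theorem approximations, and graphical or tabular displays), here is a minimal sketch.
It is written in Python purely for illustration here; actual coursework would use Maple or
Mathematica, and the function names and parameters below are our own, not part of the course materials.

    # A minimal illustrative sketch (assumptions: Python instead of Maple/Mathematica;
    # a fair coin and n = 100 flips).  It simulates standardized sums of coin flips and
    # compares the empirical distribution with the standard normal approximation
    # suggested by the Central Limit Theorem.
    import math
    import random

    def simulate_standardized_sums(n_flips=100, n_trials=10000, seed=0):
        """Return (S_n - n/2) / sqrt(n/4) for n_trials runs of n_flips fair coin flips."""
        rng = random.Random(seed)
        mean, sd = n_flips * 0.5, math.sqrt(n_flips * 0.25)
        return [(sum(rng.random() < 0.5 for _ in range(n_flips)) - mean) / sd
                for _ in range(n_trials)]

    def standard_normal_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    if __name__ == "__main__":
        z = simulate_standardized_sums()
        for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
            empirical = sum(v <= t for v in z) / len(z)
            print(f"P(Z <= {t:+.1f}): empirical {empirical:.3f}, normal {standard_normal_cdf(t):.3f}")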
Instructor:
- Professor Max Mintz
- Previous CIS 261 Course Evaluations (2007-2012): See Penn Course Review
Teaching Assistant:
- Name: Ilana Arbisser
- E-Mail: iarbisser@gmail.com
- Office: TBA
- Office Hours: TBA
Class Schedule:
- Lectures: Tuesday, Thursday, 13:30-14:50, Room: 321 TB
- Recitation: Friday 14:00-14:50, Room: 309 TB
Preliminary Examination Schedule:
- Preliminary Examination I:
5 October, 14:00-14:50; Room: 309 TB
- Preliminary Examination II:
2 November, 14:00-14:50; Room: 309 TB
- Preliminary Examination III:
30 November, 14:00-14:50; Room: 309 TB
- NB: Makeup preliminary examinations will only be given for verified
medical reasons. All makeup examinations are oral.
Weekly Quizzes:
- There will be weekly written quizzes lasting 10-15 minutes.
The quizzes are scheduled on Fridays.
The purpose of the quizzes is to aid students in keeping
current on the homework and lecture material. The subject material of the
quizzes will be taken from the homework assignments and lectures. The tentative
quiz schedule is: September 14, 21, 28; October 12, 19, 26; November 9, 16; December 7.
No makeup quizzes will be given.
Examination Policy:
- NB: All preliminary examinations and quizzes are closed-book examinations. No
written notes, personal digital assistants, or calculators are permitted.
Written Assignments Policy:
- Written assignments (problem sets) will be given weekly during lecture and
recitation. The problems will vary in difficulty and will be designed to reinforce and augment the
material in the lectures and the distributed lecture notes.
- Assignments include: analytical work, derivations, proofs, algorithms,
and computer program implementations.
- The assignments will be collected and graded. Each assignment must be submitted
at the beginning of the class in which it is due. Late submissions will not be accepted.
Our goal is to grade all submitted homework in a timely fashion. However, due to the
size of the class, it may become necessary from time to time to limit grading to a selected subset of
the assigned problems. For example, if five problems are assigned in a given set, we might select
three of the five problems for grading.
- Problem set solutions will be discussed in the recitation section
with additional commentary from time to time in the lectures.
Solutions to the homework problems will be distributed in class.
- You are permitted to discuss the homework problems with other class members, subject to the
following limitations. These discussions are to be limited to high-level concepts. You are
not permitted to copy or share written work or implementation details. It is understood that the
work you submit may be based on these discussions, but it must be neither copied directly from another
student's paper nor, in part or in whole, the product of impermissible collaboration.
- All written work must be neat and well-organized and must include sufficient explanation of
the solutions. Messy, poorly organized, or illegible material will be returned ungraded.
- Each homework set will be graded based on the following grade levels: E (excellent),
S (satisfactory), U (unsatisfactory), NC (no credit).
Grading Policy:
- The final course grade
will be based entirely on the three preliminary examinations, the comprehensive two-hour final examination,
the aggregate quiz scores, and the homework assignments. No exemptions from examinations will be made.
The final grade will be based on the following units: (1) the quizzes; (2) prelim I; (3) prelim II;
(4) prelim III; (5) the final exam, part I; and (6) the final exam, part II. The sum of these six components
will constitute 90% of the final course grade. The remaining 10% of the final course grade will be based on
the aggregate homework scores.
Texts:
- In lieu of a required text, Lecture Notes will be distributed
at the beginning of the course.
CIS 261 Course Topics will be selected from:
Discrete Probability Theory:
Randomness and Description Length; Pseudo-Random-Number Generators; An Axiomatic Approach to Discrete
Probability Spaces; The Generation of Finite Sigma-Fields via Finite Partitions; Derived Properties
of the Probability Function; The Definition of Conditional Probability; Bayes' Theorems; The Polya
Urn Scheme; Independent Events; Product Probability Spaces; Important Discrete Probability Laws;
An Axiomatic Derivation of the Poisson Probability Law; Discrete Random Variables; The Univariate
Point-Mass Function; The Joint Point-Mass Function; Independent Random Variables; Functions of one
or more Random Variables; The Conditional Point-Mass Function; The Definition of the Expected Value
of a Random Variable; The Variance of a Random Variable; The Expected Value of a Function of one or
more Random Variables; Uncorrelated versus Independent Pairs of Random Variables; The Definition of
Conditional Expectation; Convergence Concepts for a Sequence of Random Variables.
Discrete Stochastic Processes:
Finite and Infinite Sequences of Discrete Random Variables on a Common Probability Space; Markov
Processes; Matrix Algebra; Time-Homogeneous Finite Markov Chains; State Transition Matrices;
State Transition Diagrams; Classification of States: Absorbing, Transient, Recurrent, and Periodic;
Classification of Chains: Absorbing, Ergodic, and Regular; Canonical Forms; The Fundamental Matrix;
Absorption Probabilities; The Fundamental Limit Theorem for Regular Chains.
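To illustrate the absorbing-chain machinery above (canonical form, fundamental matrix, absorption
probabilities), here is a minimal sketch, assuming Python with NumPy and a made-up five-state
symmetric random walk with absorbing barriers; it is not taken from the course materials.

    # Illustrative sketch: absorbing Markov chain analysis for a symmetric random walk
    # on states {0, 1, 2, 3, 4} with absorbing barriers at 0 and 4.
    # Canonical form P = [[Q, R], [0, I]] with transient states {1, 2, 3}.
    import numpy as np

    Q = np.array([[0.0, 0.5, 0.0],    # transitions among transient states 1, 2, 3
                  [0.5, 0.0, 0.5],
                  [0.0, 0.5, 0.0]])
    R = np.array([[0.5, 0.0],         # transitions from transient states into absorbing 0 and 4
                  [0.0, 0.0],
                  [0.0, 0.5]])

    N = np.linalg.inv(np.eye(3) - Q)  # fundamental matrix: expected visits to transient states
    t = N @ np.ones(3)                # expected number of steps before absorption, by start state
    B = N @ R                         # absorption probabilities into states 0 and 4

    print("Fundamental matrix N:\n", N)
    print("Expected steps to absorption from states 1, 2, 3:", t)
    print("Absorption probabilities [into 0, into 4]:\n", B)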
Discrete Statistical Inference:
Sampling Theory and the Laws of Large Numbers; The Sample Cumulative Distribution Function;
Order Statistics; Statistical Inference: Parametric versus Nonparametric Models; Robustness Issues:
The Mean versus the Alpha-Trimmed Mean; Discrete Exponential Families with Vector Parameters;
Sufficient Statistics; Understanding What an Estimator Is; Desirable Properties of Estimators;
Methods of Parameter Estimation: Maximum Likelihood Estimation, Maximum A Posteriori Probability
Estimation, Decision-Theoretic Procedures, The Minimum Description Length Paradigm; Understanding
What a Hypothesis Test Is; Desirable Properties of Tests; Neyman-Pearson Tests; An Example of a
Nonparametric Test; Monte Carlo Methods: The Bootstrap.
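As a small taste of the inference topics, the following sketch (Python with NumPy; the data set and
parameters are fabricated for illustration only) uses the nonparametric bootstrap to estimate the
standard error of a 10%-trimmed mean, tying together the robustness and Monte Carlo items above.

    # Illustrative sketch: nonparametric bootstrap standard error of an alpha-trimmed mean.
    import numpy as np

    def trimmed_mean(x, alpha=0.10):
        """Mean after discarding the lowest and highest alpha fraction of the sample."""
        x = np.sort(np.asarray(x, dtype=float))
        k = int(np.floor(alpha * len(x)))
        return x[k:len(x) - k].mean() if len(x) > 2 * k else x.mean()

    def bootstrap_se(x, statistic, n_boot=2000, seed=0):
        """Bootstrap estimate of the standard error of `statistic` on sample x."""
        rng = np.random.default_rng(seed)
        x = np.asarray(x, dtype=float)
        reps = [statistic(rng.choice(x, size=len(x), replace=True)) for _ in range(n_boot)]
        return float(np.std(reps, ddof=1))

    if __name__ == "__main__":
        rng = np.random.default_rng(42)
        # Fabricated data: mostly standard normal observations with a few gross outliers.
        sample = np.concatenate([rng.normal(0.0, 1.0, 95), rng.normal(0.0, 10.0, 5)])
        print("trimmed mean:", round(trimmed_mean(sample), 3))
        print("bootstrap SE of trimmed mean:", round(bootstrap_se(sample, trimmed_mean), 3))
        print("bootstrap SE of ordinary mean:", round(bootstrap_se(sample, np.mean), 3))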