(3 units) Second term, 2000-2001
Mon and Wed, 9:00 - 10:30am
W4007 Hygiene
This is a course in modern statistics (i.e., statistics using the computer), for the sophisticated user of statistics and computers. We will introduce topics in numerical analysis useful for statistical modeling and analysis. We will emphasize computing over statistics, and algorithms over programming. Example methods include deterministic and stochastic methods for optimization and integration, the EM algorithm, Monte Carlo simulation (both non-iterative and iterative), and kernel density estimation. Applications include Bayesian hierarchical models, mixture models, time series, nonlinear regression, smoothing, classification, and modern variable selection.
Prerequisites: 140.751-752 and 140.771-772, or equivalent; computer programming experience is required (e.g., R/S-Plus/Matlab and/or C/C++/Fortran).
Student evaluation will be based on two programming projects (to be done in R/S-Plus/Matlab or any other language you prefer).
Note: I will try to prepare documents to help people get started with C, Perl and R. See the following links: [ C | perl | R ]
Month | Day | Topic | Materials
October | 30 | Introduction; statistical computing in practice | Notes: [ pdf (560k) ]
November | 1 | R (in brief) | Notes: [ pdf (191k) ]; R problem set: [ Data | Problems (pdf 13k) | Solutions: Part A / Part B ]; Reading: MASS (ch 1-4); Additional comments
November | 6 | Random number generation | Notes: [ pdf (362k) ]; Reading: NAS (ch 20), NRC (ch 7), MASS (§ 5.2)
November | 8 | Permutation test and the bootstrap | Notes: [ pdf (208k) ]; Reading: NAS (ch 22), Efron and Tibshirani (§ 9.5); Additional comments
November | 13 | Numerical linear algebra | Notes: [ pdf (301k) ]; Reading: NAS (ch 8-9), Thisted (ch 3); Additional comments; Assignment 1 (due Nov 29): [ latex (3.6k) | pdf (17k) | data ]
November | 15 | EM algorithm | Notes: [ pdf (243k) ]; Reading: NAS (ch 10)
November | 20 | Newton-Raphson, Fisher scoring | Notes: [ pdf (133k) ]; Reading: NAS (ch 11), Thisted (ch 4)
November | 22 | Nonlinear regression, iteratively reweighted least squares | Notes: [ pdf (254k) ]; Reading: NAS (§ 11.4, 11.5), Thisted (§ 4.5.5, 4.5.6)
November | 27 | EM algorithm extensions | Notes: [ pdf (222k) ]; Reading: NAS (ch 12)
November | 29 | Downhill simplex method, Lp regression and constrained optimization | Notes: [ pdf (269k) ]; Reading: NAS (ch 14), NRC (§ 10.4), Thisted (§ 4.5.7); Assignment 2 (due Dec 20): [ latex (2.4k) | pdf (13k) | data | code ]
December | 4 | Numerical integration | Notes: [ pdf (345k) ]; Reading: NAS (ch 16), NRC (ch 4)
December | 6 | Hidden Markov models | Notes: [ pdf (231k) ]; Reading: NAS (§ 23.3)
December | 11 | Markov chain Monte Carlo I | Notes: [ pdf (749k) ]; Reading: NAS (ch 24); Additional comments
December | 13 | Markov chain Monte Carlo II | Notes: [ pdf (1,334k) ]; Additional comments
December | 18 | Tree-based models (aka recursive partitioning) and neural networks | Notes: [ pdf (220k) ]; Reading: MASS (ch 10, § 9.4)
December | 20 | Program design | Notes: [ pdf (423k) ]; Reading: S programming (§ 8.4), Writing R extensions [ pdf (387k; 60 pgs) ]