(3 units) Second term, 2000-2001

Mon and Wed, 9:00 - 10:30am

W4007 Hygiene

This is a course in modern statistics (i.e., statistics using the computer), for the sophisticated user of statistics and computers. We will introduce topics in numerical analysis useful for statistical modeling and analysis. The emphasis will be on computing over statistics and on algorithms over programming. Example methods include deterministic and stochastic methods for optimization and integration, the EM algorithm, Monte Carlo simulation (both non-iterative and iterative), and kernel density estimation. Applications include Bayesian hierarchical models, mixture models, time series, nonlinear regression, smoothing, classification, and modern variable selection.

**Prerequisites**: 140.751-752 and 140.771-772 or equivalent; computer programming required (e.g., R/S-Plus/Matlab and/or C/C++/Fortran).

Student evaluation will be based on two programming projects (to be done in R/S-Plus/Matlab or whatever else you may prefer).

**Note**: I will be trying to prepare documents to help people get started with C, Perl and R. See the following links: [ C | perl | R ]

| Date | Topic | Notes | Reading | Other materials |
|---|---|---|---|---|
| October 30 | Introduction; statistical computing in practice | pdf (560k) | | |
| November 1 | R (in brief) | pdf (191k) | MASS (ch 1-4) | R problem set: Data, Problems (pdf 13k), Solutions: Part A / Part B; Additional comments |
| November 6 | Random number generation | pdf (362k) | NAS (ch 20); NRC (ch 7); MASS (§ 5.2) | |
| November 8 | Permutation test and the bootstrap | pdf (208k) | NAS (ch 22), Efron and Tibshirani (§ 9.5) | Additional comments |
| November 13 | Numerical linear algebra | pdf (301k) | NAS (ch 8-9), Thisted (ch 3) | Additional comments; Assignment 1 (due Nov 29): latex (3.6k), pdf (17k), data |
| November 15 | EM algorithm | pdf (243k) | NAS (ch 10) | |
| November 20 | Newton-Raphson, Fisher scoring | pdf (133k) | NAS (ch 11), Thisted (ch 4) | |
| November 22 | Nonlinear regression, iteratively reweighted least squares | pdf (254k) | NAS (§ 11.4, 11.5), Thisted (§ 4.5.5, 4.5.6) | |
| November 27 | EM algorithm extensions | pdf (222k) | NAS (ch 12) | |
| November 29 | Downhill simplex method, $L_p$ regression and constrained optimization | pdf (269k) | NAS (ch 14), NRC (§ 10.4), Thisted (§ 4.5.7) | Assignment 2 (due Dec 20): latex (2.4k), pdf (13k), data, code |
| December 4 | Numerical integration | pdf (345k) | NAS (ch 16), NRC (ch 4) | |
| December 6 | Hidden Markov models | pdf (231k) | NAS (§ 23.3) | |
| December 11 | Markov chain Monte Carlo I | pdf (749k) | NAS (ch 24) | Additional comments |
| December 13 | Markov chain Monte Carlo II | pdf (1,334k) | | Additional comments |
| December 18 | Tree-based models (aka recursive partitioning) and neural networks | pdf (220k) | MASS (ch 10, § 9.4) | |
| December 20 | Program design | pdf (423k) | S programming (§ 8.4), Writing R extensions [pdf (387k; 60 pgs)] | |

- K Lange (1999) Numerical analysis for statisticians. Springer-Verlag, New York. **[Required.]** (“NAS”)
- RA Thisted (1988) Elements of statistical computing: Numerical computation. Chapman and Hall, New York.
- P Spector (1994) An introduction to S and S-PLUS. Wadsworth, Belmont, CA.
- WN Venables and BD Ripley (1999) Modern applied statistics with S-PLUS, 3rd edition. Springer-Verlag, New York. **[Highly recommended for this course.]** [Online complements] (“MASS”)
- WN Venables and BD Ripley (2000) S programming. Springer-Verlag, New York.
- BW Kernighan and DM Ritchie (1988) The C programming language, 2nd edition. Prentice Hall, Englewood Cliffs, NJ.
- PJ Plauger (1992) The standard C library. Prentice Hall, Englewood Cliffs, NJ.
- WH Press et al. (1992) Numerical recipes in C: The art of scientific computing, 2nd edition. Cambridge University Press. (“NRC”)
- BW Kernighan and R Pike (1999) The practice of programming. Addison-Wesley, Reading, MA.
- S Oualline (1992) C Elements of Style. M&T Books, San Mateo, CA.
- Various O'Reilly books:

- CRAN R Archive
- R mailing lists
- R FAQ
- S FAQ
- Search S-news archive
- StatLib
- Perl links
- List of unix tools
- Biostat IT committee
- Introduction to [ C | perl | R ]

kbroman at jhsph.edu

http://www.biostat.wisc.edu/~kbroman

Last modified: Wed Apr 15 23:02:19 2009