Syllabus, Readings and Lecture Notes
- Introduction to Molecular Biology, Genomics, Bioinformatics, Probability
- topics: DNA, RNA, proteins, the Central Dogma, RNA processing in
eukaryotes, networks, available genomics data, basic probability theory for
discrete variables
- required reading
- recommended reading
- lecture notes
- Pairwise Sequence Alignment
- topics: dynamic programming methods for global and local alignment,
linear and affine gap penalty functions, the BLAST algorithm,
alignment statistics, substitution matrices
- required reading
- lecture notes
- Probabilistic Sequence Models
- topics: Markov chains, high-order Markov models, inhomogeneous Markov models, hidden Markov models, Forward/Backward/Viterbi algorithms, applications to gene finding and motif modeling
- required reading
- Sections 3.0, 3.1, 3.5 in Durbin et al.
- Sections 3.2, 3.3 in Durbin et al.
- lecture notes
- Multiple Sequence Alignment and Phylogenetic Tree Inference
- topics: dynamic programming for MSA, heuristic methods for
MSA, profile HMMs, distance-based phylogeny methods, UPGMA,
neighbor-joining, parsimony-based algorithms, Fitch's algorithm,
weighted parsimony, nearest-neighbor interchange, branch and bound search
- required reading
- Chapter 6 in Durbin et al.
- Sections 7.1-7.6 in Durbin et al.
- lecture notes
- Analyzing Data from Microarray, SNP-chip, and other High-Throughput Experiments
- topics: high-throughput technologies, detecting differential expression, multiple hypothesis testing and false-discovery-rate methods, clustering algorithms, biclustering, classification algorithms
- required reading
- Spotted Array Flash Animation Tutorial
- J. Storey and R. Tibshirani.
Statistical Significance for Genomewide Studies.
Proceedings of the National Academy of Sciences,
100:9440-9445, 2003.
- C. Manning and H. Schutze. Chapter 15: Clustering.
Foundations of Statistical Natural Language Processing,
MIT Press, 1999.
- J. Nevins, E. Huang, H. Dressman, J. Pittman, A. Huang, M. West.
Towards integrated clinico-genomic models for personalized medicine: combining
gene expression signatures and clinical factors in breast cancer
outcomes prediction. Human Molecular Genetics, 12:R153-R157, 2003.
- recommended reading
- M. Eisen, P. Spellman, P. Brown, D. Botstein. Cluster Analysis
and Display of Genome-Wide Expression Patterns.
Proceedings of the National Academy of Sciences,
95:14863-14868, 1998.
- A. Tanay, R. Sharan and R. Shamir.
Discovering Statistically Significant Biclusters in Gene Expression Data.
Bioinformatics 18(Suppl. 1):S136-S144, 2002.
- J. Hardin, M. Waddell, D. Page, F. Zhan, B. Barlogie,
J. Shaughnessy, and J. Crowley.
Evaluation of Multiple Models to Distinguish Closely Related Forms of Disease Using
DNA Microarray Data: an Application to Multiple Myeloma.
Statistical Applications in Genetics and Molecular Biology,
3(1), 2004.
- M. Molla, M. Waddell, D. Page and J. Shavlik.
Using Machine Learning to Design and Interpret Gene-Expression Microarrays.
AI Magazine, 25(1):23-44, 2004.
- lecture notes
- Inferring and Modeling Cellular Networks
- topics: Bayesian networks, module networks, exact and approximate inference methods, parameter and structure learning
- required reading
- N. Friedman, M. Linial, I. Nachman, D. Pe'er.
Using Bayesian networks to analyze expression data.
Journal of Computational Biology 7(3-4):601-620, 2000.
- K. Sachs, O. Perez, D. Pe'er, D. Lauffenburger, G. Nolan.
Causal protein-signaling networks derived from multiparameter single-cell data.
Science 308(5721):523-529, 2005.
- E. Segal, M. Shapira, A. Regev, D. Pe'er, D. Botstein, D. Koller and N. Friedman.
Module networks: identifying regulatory modules and their condition-specific regulators from
gene expression data.
Nature Genetics 34(2):166-176, 2003.
- recommended reading
- lecture notes
- Protein Structure Prediction
- topics: secondary structure prediction, threading, the ROSETTA method, docking
- required reading
- recommended reading
- lecture notes
Last modified: Fri Dec 14 11:14:46 CST 2007