Syllabus, Readings and Lecture Notes
Course Overview
Motif and cis-Regulatory Module (CRM) Modeling
- topics: learning motif models, learning models of cis-regulatory modules, Gibbs sampling, Dirichlet priors, parameter tying, heuristic search, HMM structure search, sequence entropy and mutual information
- required reading
- T. Bailey and C. Elkan. The value of prior knowledge in discovering motifs with MEME. In Proceedings of the 3rd International Conference on Intelligent Systems for Molecular Biology, pp. 21-29, 1995.
- C. Lawrence, S. Altschul, M. Boguski, J. Liu, A. Neuwald, and J. Wootton. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262:208-214, 1993.
- O. Elemento, N. Slonim and S. Tavazoie. A universal framework for regulatory element discovery across all genomes and data types. Molecular Cell 28(2):337-350, 2007. (Supplemental materials containing key methodological details)
- optional reading
- optional viewing
- lecture notes
- Learning Sequence Motif Models using EM (PDF, PPTX) (1/21, 1/26)
- Learning Sequence Motif Models using Gibbs Sampling (PDF, PPTX) (1/26, 1/28, 2/2)
- Inferring Models of cis-Regulatory Modules using Information Theory (PDF, PPTX) (2/2, 2/4)
Genotype Analysis
- topics: haplotype inference, genome-wide association studies (GWAS), quantitative trait loci (QTL) mapping, interpreting noncoding variants, epigenomic data, neural networks, multiple hypothesis testing
- required reading
- optional reading
- lecture notes
- Linking Genetic Variation to Important Phenotypes (PDF, PPTX) (2/9, 2/11)
- GWAS and multiple testing correction (PDF, PPTX) (2/11, 2/16, 2/18)
- Interpreting noncoding variants (PDF, PPTX) (2/18, 2/23, 2/25)
RNA-Seq and Mass Spectrometry
- topics: RNA-Seq technology, transcript quantification with RNA-Seq, peptide and protein identification with mass spectrometry, alternative splicing, splice graphs
- required reading
- optional reading
- lecture notes
- Transcript quantification with RNA-Seq (PDF, PPTX) (3/1, 3/10)
- Analysis of alternative splicing with RNA-Seq and probabilistic splice graphs (PDF, PPTX) (class canceled)
- Mass spectrometry (PDF, PPTX) (3/15, 3/17)
Biological Network Analysis
- topics: biological network evolution, network modules, network alignment, pathway identification
- required reading
- R. Sharan and T. Ideker. Modeling cellular machinery through biological network comparison. Nat Biotech, 24(4):427-433, 2006.
- E. Yeger-Lotem, L. Riva, L.J. Su, A.D. Gitler, A.G. Cashikar, O.D. King, P.K. Auluck, M.L. Geddie, J.S. Valastyan, D.R. Karger, S. Lindquist, and E. Fraenkel. Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity. Nat Genet 41(3):316-323, 2009.
- optional reading
- D-Y. Cho, Y-A. Kim, and T.M. Przytycka. Chapter 5: Network Biology Approach to Complex Diseases. PLoS Comput Biol, 8(12):e1002820, 2012.
- R. Sharan, S. Suthram, R.M. Kelley, T. Kuhn, S. McCuine, P. Uetz, T. Sittler, R.M. Karp, and T. Ideker. Conserved patterns of protein interaction in multiple species. Proceedings of the National Academy of Sciences 102(6):1974–1979, 2005.
- J.W. Chinneck. Practical Optimization: A Gentle Introduction.
- lecture notes
- Comparative network analysis (PDF, PPTX) (3/17, 3/29)
- Identifying signaling pathways (PDF, PPTX) (3/31, 4/5)
Gene Finding
- topics: the gene finding task, maximal dependence decomposition, interpolated Markov models, pair HMMs, GENSCAN
- required reading
- optional reading
- lecture notes
- Interpolated Markov Models for Gene Finding (PDF, PPTX) (4/5, 4/7)
- Eukaryotic Gene Finding: The GENSCAN System (PDF, PPTX) (4/7, 4/12)
- Comparative Gene Finding (abridged) (PDF, PPTX) (4/12, 4/14)
Large-Scale and Whole-Genome Sequence Alignment
- topics: large-scale alignment, whole-genome alignment, suffix trees, k-mer tries, sparse dynamic programming, longest increasing subsequence problem, Markov random fields, MUMmer, LAGAN, Mercator
- required reading
- A. Delcher, S. Kasif, R. Fleischmann, J. Peterson, O. White and S. Salzberg. Alignment of Whole Genomes. Nucleic Acids Research 27(11):2369-2376, 1999.
- M. Brudno, C. Do, G. Cooper, M. Kim, E. Davydov, NISC Comparative Sequencing Program, E. Green, A. Sidow, and S. Batzoglou. LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale Multiple Alignment of Genomic DNA. Genome Research 13:721-731, 2003.
- optional reading
- lecture notes
- Alignment of Long Sequences (PDF, PPTX) (4/14, 4/19)
- Multiple Whole Genome Alignment (PDF, PPTX) (4/21, 4/26)
RNA Structure Analysis
- topics: predicting RNA secondary structure, Nussinov/energy-minimization algorithms, stochastic context free grammars, Inside/Inside-Outside/CYK algorithms
- required reading
- Chapter 9 in Durbin et al.
- Sections 10.1, 10.2 in Durbin et al.
- optional reading
- lecture notes
- RNA Secondary Structure Prediction (PDF, PPTX) (4/26, 4/28)
- Stochastic Context Free Grammars for RNA Structure Modeling (PDF, PPTX) (4/28)
Protein Structure Prediction
- topics: secondary structure prediction, threading, branch and bound search, ROSETTA
- required reading
- optional reading
- lecture notes
- Introduction to Protein Structure Prediction (PDF, PPTX) (5/3)
- Protein Threading (PDF, PPTX) (5/5)
Lecture Notes
Thank you to Professors Mark Craven and Colin Dewey for providing lecture material.