 Syllabus, Readings and Lecture Notes
 Course Overview
 Motif and cis-Regulatory Module (CRM) Modeling
 
    topics: learning motif models, learning models of cis-regulatory
   modules, Gibbs sampling, Dirichlet priors,
   parameter tying, sequence entropy, mutual information
    required reading
      
      T. Bailey and C. Elkan.
      The value
      of prior knowledge in discovering motifs with MEME.
      In Proceedings of the 3rd International Conference on
      Intelligent Systems for Molecular Biology, pp. 21-29, 1995.
      C. Lawrence, S. Altschul, M. Boguski, J. Liu, A. Neuwald, and
      J. Wootton.  Detecting
      subtle sequence signals: a Gibbs sampling strategy for multiple alignment.
      Science 262:208-214, 1993.
      O. Elemento, N. Slonim, and S. Tavazoie.
      A universal framework for regulatory element discovery across all genomes and data types.
      Molecular Cell 28(2):337-350, 2007.
      (Supplemental materials containing key methodological details)
        optional reading
      
	 optional viewing
	  
    lecture notes
      
	 Learning Sequence Motif Models using EM (PDF, PPTX) (1/19, 1/24, 1/26)
	 Learning Sequence Motif Models using Gibbs Sampling (PDF, PPTX, Gamma example, Dirichlet example) (1/26, 1/31, 2/2)
	 Inferring Models of cis-Regulatory Modules using Information Theory (PDF, PPTX) (2/2, 2/7, 2/9)
       Genotype Analysis
 
   topics: haplotype inference, genome-wide association studies (GWAS),
  quantitative trait loci (QTL) mapping, multiple hypothesis testing
   required reading
  
   optional reading
  
   lecture notes
        
	 Linking Genetic Variation to Important Phenotypes (PDF, PPTX) (2/9, 2/14)
	 GWAS and multiple testing correction (PDF, PPTX) (2/14, 2/16, 2/21)
         Epigenomics
 
   topics: epigenomic data types, DNase I hypersensitivity, Gaussian processes,
  convolutional neural networks, interpreting noncoding genetic variants
   required reading
  
	R.I. Sherwood, T. Hashimoto, C.W. O'Donnell, S. Lewis, A.A. Barkal, J.P. van Hoff, V. Karun, T. Jaakkola, and D.K. Gifford.  Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape. Nat Biotechnol 32(2): 171–178, 2014.
  	J. Lever, M. Krzywinski, and N. Altman. Points of Significance: Classification evaluation. Nat Methods 13(8):603-604, 2016.
  	C. Angermueller, T. Pärnamaa, L. Parts, and O. Stegle. Deep learning for computational biology. Mol Syst Biol 12(7):878, 2016.
   optional reading
  
   lecture notes
        
 RNA-Seq and Mass Spectrometry
 
    topics: RNA-Seq technology, transcript quantification,
   peptide and protein identification with mass spectrometry
    required reading
	
    optional reading
    
    lecture notes
	
		Transcript quantification with RNA-Seq (PDF, PPTX) (3/14, 3/16)
		Mass spectrometry (PDF, PPTX) (3/28, 3/30, 4/4)
	 Biological Network Analysis
 
    topics: protein interactions, pathway identification, linear programming, minimum-cost flow
    required reading
	
		E. Yeger-Lotem, L. Riva, L.J. Su, A.D. Gitler, A.G. Cashikar, O.D. King, P.K. Auluck, M.L. Geddie, J.S. Valastyan, D.R. Karger, S. Lindquist, and E. Fraenkel. Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity. Nat Genet 41(3):316-323, 2009.
	 optional reading
	
    lecture notes
	
		Identifying signaling pathways (PDF, PPTX) (4/4, 4/6)
	 Gene Finding
 
    topics: gene finding, interpolated Markov models, generalized HMMs, pair HMMs
    required reading
      
    optional reading
      
    lecture notes
      
	 Interpolated Markov Models for Gene Finding (PDF, PPTX) (4/11, 4/13)
	 Eukaryotic Gene Finding (PDF, PPTX) (4/13)
       Large-Scale and Whole-Genome Sequence Alignment
 
    topics: large-scale alignment, whole-genome alignment,
        suffix trees, k-mer tries, longest increasing
        subsequence problem, MUMmer
        
    required reading
      
      A. Delcher, S. Kasif, R. Fleischmann, J. Peterson, O. White,
      and S. Salzberg.
      Alignment of Whole Genomes.
      Nucleic Acids Research 27(11):2369-2376, 1999.
       optional reading
      
      E. Ukkonen.
      On-line Construction of Suffix Trees.
      Algorithmica 14(3):249-260, 1995.
      M. Brudno, C. Do, G. Cooper, M. Kim, E. Davydov, NISC Comparative
      Sequencing Program, E. Green, A. Sidow, and S. Batzoglou.
      LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale
      Multiple Alignment of Genomic DNA.
      Genome Research 13:721-731, 2003.
      Chapter 3 of C. Dewey.
      Whole-genome alignments and polytopes for comparative genomics.
      PhD thesis.  University of California, Berkeley, 2006.
       lecture notes
      
	Alignment of Long Sequences (PDF, PPTX) (4/18, 4/20, 4/25)
       RNA Structure Analysis
 
    topics: predicting RNA secondary structure, Nussinov/energy-minimization algorithms,
        stochastic context-free grammars
    required reading
      
       Chapter 9 in Durbin et al.
       Sections 10.1, 10.2 in Durbin et al.
       optional reading
      
    lecture notes
      
	 RNA Secondary Structure Prediction (PDF, PPTX) (4/25, 4/27)
	 Stochastic Context Free Grammars for RNA Structure Modeling (PDF, PPTX) (4/27, 5/2)
	  
	
	
        Protein Structure Prediction
 
    topics: secondary structure prediction, threading, branch and bound search
    required reading
   
    optional reading
   
   lecture notes
        
	 Introduction to Protein Structure Prediction (PDF, PPTX) (5/2)
	 Protein Threading (PDF, PPTX) (5/4)
         Lecture Notes
 
Thank you to Professors Mark Craven and Colin Dewey for providing lecture material.  These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Mark Craven, Colin Dewey, and Anthony Gitter.
 