Biostatistics & Medical Informatics 776
Computer Sciences 776
Advanced Bioinformatics (Spring 2004)
General Course Information
Course Overview
This course (BMI-776) is the second part of a two-course series. Part 1 is
BMI-576, taught by Mark Craven in the Fall Semester.
His numerous contributions in developing this two-course series and many of
the lecture notes for BMI-776 are gratefully acknowledged. BMI-576 is a
prerequisite for BMI-776, and the following course description applies
to both BMI-576 and BMI-776:
The biological sciences are undergoing a revolution in how they
are practiced. In the last decade, a vast amount of data (DNA
sequences, protein sequences, etc.) has become available, and
computational methods are playing a fundamental role in transforming
this data into scientific understanding.
Bioinformatics (also called Computational Biology or Computational Molecular Biology) involves developing and applying computational methods for managing and analyzing information about the sequence, structure and function of biological molecules and systems.
The goals of this two-course series are to provide an understanding of:
- the types and sources of data available for computational biology,
- the fundamental computational problems in molecular biology and genomics,
- a core set of widely used algorithms in computational biology,
- a set of algorithms that have important applications in computational
biology, but which have key applications outside of biology as well.
BMI-776 Spring 2003 course web page
BMI-776 Spring 2004 course web page
Course Requirements
The grading for the course will be based on:
- homework assignments (4): 40%
- midterm exam: 25%
- final exam: 35%
The primary focus of the homework assignments will be programming and
experimenting with various algorithms discussed in class. Some
homework assignments may also involve written exercises.
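As a rough illustration of how the weights listed above combine, the following sketch computes an overall course score. It assumes each component is scored on a 0-100 scale and that the four homeworks count equally, neither of which is stated on this page; the names `WEIGHTS` and `course_score` are hypothetical.

```python
# Weights come from the grading list above; all names here are illustrative.
WEIGHTS = {"homework": 0.40, "midterm": 0.25, "final": 0.35}


def course_score(homework_scores, midterm, final):
    """Combine component scores (each assumed to be on a 0-100 scale).

    Assumption: the four homeworks are weighted equally within the 40%.
    """
    if len(homework_scores) != 4:
        raise ValueError("expected four homework scores")
    homework_avg = sum(homework_scores) / len(homework_scores)
    return (WEIGHTS["homework"] * homework_avg
            + WEIGHTS["midterm"] * midterm
            + WEIGHTS["final"] * final)
```

For example, homework scores of 80, 90, 70 and 100 with a midterm of 80 and a final of 90 give 0.40 * 85 + 0.25 * 80 + 0.35 * 90 = 85.5.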
Syllabus, Readings and Lecture Notes
- Probabilistic Sequence Models
- topics: pairwise alignment using HMMs, profile HMMs, multiple alignment by profile HMM training, semi-Markov models, interpolated Markov models, EM for motif finding, Gibbs sampling.
- required reading
- Chapter 4, R. Durbin et al.
- Chapter 5, R. Durbin et al.
- Section 6.5, R. Durbin et al.
- S. Salzberg, A. Delcher, S. Kasif, and O. White.
Microbial Gene Identification Using Interpolated Markov Models.
Nucleic Acids Research 26(2):544-548, 1998.
- C. Burge and S. Karlin. Prediction of Complete Gene Structures in Human
Genomic DNA. Journal of Molecular Biology 268(1):78-94, 1997.
- T. Bailey and C. Elkan.
The Value of Prior Knowledge in Discovering Motifs with MEME (PDF).
In Proceedings of the 3rd International Conference on
Intelligent Systems for Molecular Biology, pp. 21-29, 1995.
- C. Lawrence, S. Altschul, M. Boguski, J. Liu, A. Neuwald, and
J. Wootton. Detecting Subtle Sequence Signals: A Gibbs Sampling Strategy for Multiple Alignment. Science 262:208-214, 1993.
- optional reading
- lecture notes
- Advanced Sequence Alignment
- topics: banded alignments, PSI-BLAST, whole genome alignment,
multiple whole genome alignment, motif finding.
- required reading
- S. Altschul, T. Madden, A. Schaffer, J. Zhang, Z. Zhang, W. Miller
and D. Lipman.
Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search
Programs. Nucleic Acids Research 25(17):3389-3402, 1997.
- A. Delcher, S. Kasif, R. Fleischmann, J. Peterson, O. White and
S. Salzberg.
Alignment of Whole Genomes. Nucleic Acids Research 27(11):2369-2376, 1999.
- M. Höhl, S. Kurtz and E. Ohlebusch.
Efficient Multiple Genome Alignment. Bioinformatics 18:S312-S320, 2002.
- J. Buhler and M. Tompa.
Finding Motifs Using Random Projections. Journal of Computational Biology
9(2):225-242, 2002.
- lecture notes
- Probabilistic Approaches to Phylogeny
- required reading
- Chapter 8 in Durbin et al.
- lecture notes
- RNA Structure Modeling
- topics: RNA secondary structure prediction, covariance models
- required reading
- Chapter 10 (sections 3 and up) of Durbin et al.
- lecture notes
- Probabilistic Gene Expression Analysis
- required reading
- lecture notes
- Biomedical Text Analysis
- required reading
- lecture notes
Homework Assignments
Grades
Late Policy on Homeworks:
- Homeworks are due by 11:59pm. Use electronic handin as described with each homework.
- Each student has five free late days to use during the semester (Saturdays, Sundays and UW holidays
do not count as late days). Once these are exhausted, late homeworks will be penalized 10% per day.
- No homeworks will be accepted more than one week late (not counting UW holidays).
- If you submit the 4th homework late, your initial grade will be an incomplete and your final grade will be submitted late.
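The late-day arithmetic above can be sketched as follows. This is only one reading of the policy, not an official calculator. It assumes that lateness is counted in whole days after the due date, that the one-week limit is measured in calendar days with UW holidays excluded, and that the 10% penalty is a fraction of the homework's score. The function names and the `holidays` argument are hypothetical.

```python
from datetime import date, timedelta

PENALTY_PER_DAY = 0.10   # 10% per penalized late day, from the policy above
MAX_DAYS_LATE = 7        # "one week"; measured in calendar days (assumption)


def countable_late_days(due, submitted, holidays=frozenset()):
    """Count days after `due`, up to and including `submitted`,
    skipping Saturdays, Sundays and the given holidays."""
    count = 0
    day = due + timedelta(days=1)
    while day <= submitted:
        if day.weekday() < 5 and day not in holidays:
            count += 1
        day += timedelta(days=1)
    return count


def late_penalty(due, submitted, free_days_left, holidays=frozenset()):
    """Return (penalty_fraction, free_days_left_after), or None if the
    homework is too late to be accepted."""
    calendar_days = (submitted - due).days
    if calendar_days <= 0:
        return 0.0, free_days_left          # on time
    holiday_days = sum(1 for d in holidays if due < d <= submitted)
    if calendar_days - holiday_days > MAX_DAYS_LATE:
        return None                         # more than one week late
    late = countable_late_days(due, submitted, holidays)
    used = min(late, free_days_left)        # spend free late days first
    return PENALTY_PER_DAY * (late - used), free_days_left - used
```

Under these assumptions, a homework due on Monday, March 1, 2004 and handed in on Thursday, March 4 uses three of the five free late days and carries no penalty.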
Sample Exams
Academic Misconduct:
All examinations, programming assignments, and written homeworks must
be done individually. Cheating and
plagiarism will be dealt with in accordance with University
procedures (see the
Academic Misconduct Guide for Students).
Hence, for example, code for programming assignments must not
be developed in groups, nor should code be shared. You are
encouraged to discuss ideas, approaches and techniques broadly
with your peers, the TAs or the instructor, but not at a level
of detail where specific implementation issues are described by anyone.
If you have any questions on this, please ask the instructor before you act.
Last modified: Nov 11, 2004