Predoctoral Training Program in Bio-Data Science (BDS)



Research to improve the analysis of big biomedical data is active at the interface of computer sciences, statistics, and various biomedical disciplines, including genomics, molecular biology, neuroscience, cancer research, and population health.  The mission of the Bio-Data Science (BDS) training program is to provide predoctoral research training at this interface, preparing graduate students for key roles in academia, industry, or government. The BDS training program is supported by a T32 grant from the National Library of Medicine.


Problems in the generation, acquisition, management, analysis, visualization, and interpretation of data, which have always been important players in biomedical science, now assume leading roles in the massive effort to understand health and disease. The unprecedented size, complexity, and heterogeneity of big biomedical data demand research that will allow us to extract knowledge from data more efficiently in order to make better predictions, to characterize biological systems, and generally to enable subsequent investigation. Modern biological, medical, and health studies often involve data sets from which useful, accurate information cannot be efficiently extracted with available methods. Resolving research challenges in this domain requires advanced skill in statistics and computer sciences as well as advanced knowledge of the relevant biomedical context. The BDS training program is designed to provide research training in these three areas.


Predoctoral trainees will be students in good standing from one of the three affiliated PhD programs (Statistics, Computer Sciences, or Cellular and Molecular Biology). All supported trainees must be permanent residents or U.S. citizens. Information about individual graduate programs can be obtained from the relevant biological and computational departments. Ideal candidates will be students in their second or third year of study who have completed the pre-requisites for the required courses. Graduate credits in Stat/Biostat often require Stat 309/310, Introduction to Mathematical Statistics, which creates an extra requirement for some CS PhD students. Similarly, graduate credits in CS typically require CS367, Introduction to Data Structures. However, promising students are encouraged to apply even if they haven't taken these specific pre-requisites.



Trainee Guidelines

Student training will consist of strategic coursework, guided research rotations, and participation in the interdisciplinary research community, both on campus and in a national arena.


In conjunction with the requirements of their home PhD program, trainees will attain competency in core areas of bio-data science through course work in all three focal areas. At least 6 graduate credits (2 courses) will be required in each area. The table below lists course offerings from the three separate program areas that could be used for this purpose.

Computer Sciences | Stat/Biostat | Other Biomedical
CS 726. Nonlinear Optimization | STAT 610. Introduction to Statistical Inference | GEN 629. Evolutionary Genetics
CS 733. Sparse Numerical Analysis | STAT/BMI 641. Statistical Methods for Clinical Trials | GEN 633. Population Genetics
CS 760. Machine Learning | STAT 741. Survival Analysis | GEN 677. Genomic Science
CS 761. Advanced Machine Learning | STAT 771. Computational Statistics | NTP 619. Biology of Mind
CS 764. Topics in Database Management Systems | STAT 775. Bayesian Inference | NTP 675. Functional Neuroimaging in Cognitive Disorders
CS 766. Computer Vision | STAT/BMI 768. Statistical Methods in Medical Image Analysis | ONC 703. Carcinogenesis and Tumor Cell Biology
CS 776. Advanced Bioinformatics | STAT/BMI 877. Statistical Methods in Molecular Biology | PHS 795. Principles of Population Health Science
CS/BMI 767. Methods in Medical Image Analysis | STAT 992. High Dimensional Statistical Inference | BIOCHEM 620. Eukaryotic Molecular Biology
CS 784. Data Models and Languages | |
CS 858. Visualization | |

Credit will also be required in the responsible conduct of research (RCR), computing infrastructure, and big biomedical data. These and other relevant courses are selected with guidance from the student's mentor, subject to approval by the Steering Committee.

Research Rotations

To contextualize data science course work, trainees will complete at least 6 research credits through the lab rotation system. Typically, this will be through 3 one-semester-long projects, each mentored by 2 collaborating trainers from different program areas.


BDS trainers include faculty from the BMI, Statistics, and Computer Sciences departments and the Cellular and Molecular Biology program. A current list of possible trainers is available online.

Interdisciplinary Scientific Community at UW and Beyond

To foster the interdisciplinary community of scholars, trainees will participate in the monthly bio-data science seminar series.

To enhance professional development and expand the students' network of scholars, trainees will attend national meetings relevant to bio-data science. BDS trainees will typically be expected to attend two or three professional conferences per year during the course of their program. Examples of relevant meetings include:

Spring Meeting of the Eastern North American Region (ENAR) of the International Biometric Society

The Joint Statistical Meetings of the American Statistical Association (JSM)

The Intelligent Systems for Molecular Biology (ISMB) Meeting

The Research in Computational Molecular Biology (RECOMB) Meeting

Meetings at Cold Spring Harbor Laboratory


Interested in Participating?

Whitney A. Sweeney
Student Services Coordinator
Department of Biostatistics and Medical Informatics
Email: sweeney [at]