My research program focuses on developing and applying machine-learning methods to infer models of, and reason about, networks of interactions among genes, proteins, clinical and environmental factors, and phenotypes of interest. Of particular interest are machine-learning problems that involve sequences, time series, networks, natural language, and high-dimensional data. The work in my group also emphasizes active learning and learning with weak supervision.

Current projects in my group include the following:

Characterizing host-virus interactions, including the host networks involved in viral replication and the associations between viral genotypes and disease phenotypes.

Representative publications:
K. Lee, A. Kolb, Y. Sverchkov, J. Cuellar, M. Craven and C. Brandt (2015).
Recombination Analysis of Herpes Simplex Virus Type 1 Reveals a Bias towards GC Content and the Inverted Repeat Region.
Journal of Virology.

D. Chasman, B. Gancarz, L. Hao, M. Ferris, P. Ahlquist and M. Craven (2014).
Inferring Host Gene Subnetworks Involved in Viral Replication.
PLoS Computational Biology 10(5).

L. Hao, Q. He, Z. Wang, M. Craven, M. Newton and P. Ahlquist (2013).
Limited Agreement of Independent RNAi Screens for Virus-Required Host Genes Owes More to False-Negative than False-Positive Factors.
PLoS Computational Biology 9(9).

Inferring intracellular networks from genome-wide experiments.

Representative publication:
D. Chasman, Y.-H. Ho, D. Berry, C. Nemec, M. MacGilvray, A. Merrill, J. Hose, M. V. Lee, J. Will, J. Coon, A. Ansari, M. Craven and A. Gasch (2014).
Pathway Connectivity and Signaling Coordination in the Yeast Stress-Activated Signaling Network.
Molecular Systems Biology 10(11):759.

Learning models to assess risk for clinical events, such as asthma exacerbations and post-hospitalization venous thromboembolisms (VTEs), from electronic health records.

Representative publication:
E. Kawaler, A. Cobian, P. Peissig, D. Cross, S. Yale and M. Craven (2012).
Learning to Predict Post-Hospitalization VTE Risk from EHR Data.
Proceedings of the American Medical Informatics Association (AMIA) Annual Symposium.

Extracting information from the scientific literature and exploiting it to understand biological data.

Representative publications:
H. Shatkay and M. Craven (2012).
Mining the Biomedical Literature.
MIT Press.

A. Vlachos and M. Craven (2012).
Biomedical Event Extraction from Abstracts and Full Papers using Search-Based Structured Prediction.
BMC Bioinformatics 13(Suppl. 11):S5.

Modeling, aligning and classifying complex biomedical time series.

Representative publication:
A. Smith, A. Vollrath, C. Bradfield and M. Craven (2009).
Clustered Alignments of Gene-Expression Time Series Data.
Bioinformatics 25:i119-i127. (special issue: Proceedings of the 17th ISMB and 8th ECCB Conferences)