UW Biostatistics & Medical Informatics
 
UW School of Medicine and Public Health UW Madison






Educational Resources
Computational Biology and Biostatistics Summer Research Program
- Research Projects - 2006-

The following research topics are from the Summer 2007 Research Program in Biostatistics. Research project descriptions from previous years are also available. Next year's program will probably have different topics, but at the same level as those listed below. Further information on each research project is given under its title below.


Student | Mentor(s) | Title of Project
Samantha Bromfield | Moo Chung and Houri Vorperian | Functional Principal Component Analysis (PCA) in Vocal Tract Development
David Gasca | Marjorie Rosenberg | Smoking Cessation Benefit
Linda Liu | Rick Nordheim | Statistical Analysis of the Shannon Diversity Index
Corina Prieto | Julie Mitchell and Steve Darnell | Computational Analysis of Protein Interfaces
Nicholas Stong | George Phillips and Roman Aranda | 3D Printing of Protein Models
Aline Thomas | Mark Craven and Keith Noto | Finding CRMs with Background Modeling

Title of Project: Functional Principal Component Analysis (PCA) in Vocal Tract Development
Student: Samantha Bromfield
Mentor(s): Moo Chung and Houri Vorperian
Abstract:

One objective of the Vocal Tract Development Lab is to characterize the anatomic growth of the vocal tract and surrounding structures as a first step towards understanding the biological basis of speech development. The vocal tract (VT) has been described as a two-tube model or resonator with an anterior-oral portion in the horizontal plane (VT-H) and a posterior-pharyngeal portion in the vertical plane (VT-V). The growth of the vocal tract is non-uniform across sex, implying that the two portions may develop differently in males and females. The purpose of this statistical analysis was to assess the relational growth of the anterior versus posterior portions of the vocal tract as a function of sex. Sex differences were compared using polynomial regression models. Three linear measurements were used in this analysis: 1) vocal tract length (VTL); 2) anterior vocal tract length (VT-H); and 3) posterior vocal tract length (VT-V). These measurements were obtained from imaging studies (MRI and CT) of 229 cases (83 female and 146 male) ranging in age from birth to 20 years. Results indicate that there are significant sex differences in the growth rate and growth pattern of all three vocal tract measurements analyzed.
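
As a rough illustration of the kind of comparison described above, the sketch below fits a separate polynomial growth curve to each of two groups and compares the fitted growth rates at a few ages. It is only a minimal sketch under stated assumptions: the polynomial degree, the function names `fit_polynomial` and `growth_rate`, and the two curves are all hypothetical, and the numbers are made-up placeholders, not measurements or results from the study. The formal significance test used in the project is not reproduced here.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients [b0, b1, ..., b_degree]."""
    n = degree + 1
    # Normal equations (X^T X) b = X^T y for the design matrix X[i][j] = xs[i]**j.
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def growth_rate(coeffs, age):
    """Derivative of the fitted polynomial at the given age."""
    return sum(j * c * age ** (j - 1) for j, c in enumerate(coeffs) if j > 0)


ages = list(range(21))  # birth to 20 years
# Made-up placeholder curves, NOT values from the study.
group_a = [7.0 + 0.45 * t - 0.008 * t ** 2 for t in ages]
group_b = [7.0 + 0.50 * t - 0.006 * t ** 2 for t in ages]

fit_a = fit_polynomial(ages, group_a, degree=2)
fit_b = fit_polynomial(ages, group_b, degree=2)
for age in (2, 10, 18):
    print(age, round(growth_rate(fit_a, age), 3), round(growth_rate(fit_b, age), 3))
```

Fitting each group separately keeps the sketch short; a single model with group-by-age interaction terms would be the more usual way to test whether the curves differ.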


Title of Project: Smoking Cessation Benefit
Student: David Gasca
Mentor(s): Marjorie Rosenberg
Abstract:

Data were collected after the state of Wisconsin introduced a new insurance benefit for smoking cessation, with the aim of giving employers the pharmaceutical cost of such a benefit. Pharmacotherapy claims include use of one of the four FDA-approved medications for smoking cessation. The three-year study, covering 2001-2003, included State of Wisconsin employees, retirees, and adult dependents, for whom individual-level claims were collected. The study concentrated on single-medication users, who made up 91% of all smoking cessation claimants, and compared their individual activity within the pharmacotherapy plan with the clinical guidelines provided by the Agency for Healthcare Research and Quality (AHRQ).


Title of Project: Statistical Analysis of the Shannon Diversity Index
Student: Linda Liu
Mentor(s): Rick Nordheim
Abstract: The purpose of this project was to assess the sampling properties of a widely used ecological measure, the Shannon diversity index. Results are presented from a study of the effects of sample size on the index, especially when applied to theoretical communities with different underlying species abundance distributions. The two distributions used in the study were geometric and lognormal, each with a range of parameters. We found that the response of the index to sample size depends on the shape of the distribution, but that for each distribution the bias decreased as sample size increased. We also analyzed the reliability of the t test used for comparing the Shannon indices of two samples. The study found that the realized performance of the t test came into closer agreement with the nominal performance (the significance level of the test) for larger samples from each simulated community.
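
The sample-size effect described above can be illustrated with a small simulation. The sketch below is not the project's code: it assumes a geometric community truncated to a fixed number of species, uses the plug-in estimate of the index (sample proportions substituted into H = -sum(p ln p)), and all function names, the species count, the parameter `k`, and the sample sizes are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def shannon_index(proportions):
    """Shannon diversity H = -sum(p * ln p) over the nonzero proportions."""
    return -sum(p * math.log(p) for p in proportions if p > 0)


def geometric_community(n_species, k):
    """Relative abundances proportional to k * (1 - k)**i, renormalized to sum to 1."""
    weights = [k * (1 - k) ** i for i in range(n_species)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(abundances, sample_size, rng):
    """Draw sample_size individuals and return the plug-in estimate of H."""
    species = list(range(len(abundances)))
    counts = Counter(rng.choices(species, weights=abundances, k=sample_size))
    return shannon_index(c / sample_size for c in counts.values())


def mean_bias(abundances, sample_size, replicates, rng):
    """Average of (estimated H - true H) over repeated samples."""
    true_h = shannon_index(abundances)
    estimates = [sample_index(abundances, sample_size, rng) for _ in range(replicates)]
    return sum(estimates) / replicates - true_h


rng = random.Random(0)  # fixed seed so the run is repeatable
community = geometric_community(n_species=20, k=0.3)  # assumed parameters
for n in (20, 100, 500):
    print(n, round(mean_bias(community, n, replicates=2000, rng=rng), 4))
```

Swapping `geometric_community` for a function that returns lognormal relative abundances would give the second kind of community mentioned in the abstract.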

Title of Project: Computational Analysis of Protein Interfaces
Student: Corina Prieto
Mentor(s): Julie Mitchell and Steve Darnell

Abstract: Protein-protein interactions are important for understanding biological pathways. The physical properties of a protein-protein interface are among the factors that determine its interaction behavior. Studying protein-protein interfaces, specifically the “hot spot” residues that account for the majority of the free energy of binding, may reveal more about the properties associated with binding. Predicting hot spots is important because a better understanding of their interaction behavior can help us design novel protein interfaces. We used machine learning to generate knowledge-based decision-tree models that explore protein interfaces with respect to flexibility, and determined whether this feature would improve the prediction of hot spots. However, important trends in the data were observed regarding the flexibility of hot spots based on whether they were hydrophobic or polar.

Title of Project: 3D Printing of Protein Models
Student: Nicholas Stong
Mentor(s): George Phillips and Roman Aranda

Abstract: The Protein Data Bank (PDB) was established in 1971 at the Brookhaven National Laboratory; in 1998 it was transferred to the Research Collaboratory for Structural Bioinformatics, which is composed of Rutgers University, The University of Wisconsin-Madison, NIST, and the San Diego Supercomputer Center. Together they maintain a variety of information on 37,136 structures, all of which is made publicly available. Scientists all over the world determine this information experimentally, using a variety of techniques including X-ray crystallography and NMR spectroscopy. Most important is the 3-D model information available in .PDB files (Wikipedia). Using PyMOL, an open-source molecular visualization system, this data can be viewed and manipulated, and proteins can be explored and better understood.

Title of Project: Finding CRMs with Background Modeling
Student: Aline Thomas
Mentor(s): Mark Craven and Keith Noto

Abstract: Promoter regions of co-regulated genes often contain sets of protein binding sites called cis-regulatory modules (CRMs). Probabilistic methods are often used in models that represent both binding site DNA and background DNA so that algorithms can find CRMs. In this project, an algorithm containing a two-state 5th-order hidden Markov model is compared to one containing a 5th-order Markov chain model to determine whether a more sophisticated model represents background DNA more accurately and therefore improves the ability of machines to find CRMs. In representing DNA sequences, the hidden Markov model performed significantly better. However, the Markov chain model and the hidden Markov model found CRMs equally well with the motif finder.
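
To make the simpler of the two background models concrete, the sketch below estimates a k-th order Markov chain over DNA and scores a sequence by its log-likelihood. It is an illustrative sketch only, not the project's implementation: the function names, the add-one pseudocount, and the training sequences are assumptions, and the two-state hidden Markov model and the motif finder are not shown.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"  # sequences are assumed to contain only these four bases


def train_markov_chain(sequences, order, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) with pseudocount smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0
    model = {}
    # Enumerate every possible context so unseen contexts get a smoothed distribution.
    for context in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[context].values()) + pseudocount * len(ALPHABET)
        model[context] = {
            base: (counts[context][base] + pseudocount) / total for base in ALPHABET
        }
    return model


def log_likelihood(model, seq, order):
    """Log-probability of seq[order:] given its first `order` bases."""
    return sum(
        math.log(model[seq[i - order:i]][seq[i]]) for i in range(order, len(seq))
    )


# Made-up training sequences standing in for background DNA.
background = ["ACGT" * 50, "AACCGGTT" * 25]
model = train_markov_chain(background, order=5)
print(log_likelihood(model, "ACGTACGTACGTACGT", order=5))
print(log_likelihood(model, "TTTTTTTTTTTTTTTT", order=5))
```

A sequence that resembles the training data receives a higher log-likelihood than one that does not, which is the sense in which a background model "represents" DNA; a held-out log-likelihood of this kind is one plausible way to compare two background models.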

 


CBB Research Projects Index: 2001, 2002, 2003, 2004, 2005, 2006, 2007

 


Copyright © 2006 The Board of Regents of the
University of Wisconsin System

 
