2009 RESEARCH PROJECTS
| Project Title: |
Breakage Models of Chromosomal Evolution |
| Student |
Ali Al-Hanooti |
| Mentor(s): |
Colin Dewey and Farzad Rastegar |
| Abstract: |
Genomes constantly undergo mutations through events such as deletion, duplication, inversion, insertion, and translocation. For the purposes of our research, we decided to focus on comparing bacterial genome sequences. Several theories have been proposed to explain how breakage happens in genome sequences. The two most common are the Fragile breakage model and the Random breakage model. Up until 2003, the scientific community embraced the Random breakage model, which holds that genomic links break at random locations. However, in 2003, Pevzner and Tesler challenged the scientific status quo by arguing that “hotspots” exist in the human genome. These “hotspots” allow links of the particular genome to detach from the main link. We will be using data made available by GenBank. The data comprise a number of bacterial genome sequences. We will perform comparative analysis on the sequences to determine whether they support the Fragile or the Random breakage model. |
| Project Title: |
Measuring County Level Excess Deaths |
| Student |
Mariangely Almenas-Santiago |
| Mentor(s): |
Ronald Gangnon |
| Abstract: |
Excess deaths are defined as deaths due to preventable or treatable causes that could be addressed with appropriately targeted strategies. For county level data, empirical percentiles of the observed mortality rates, particularly in the tails, are unstable and biased estimates of the corresponding true percentiles of the underlying mortality rates. Estimates of target percentiles for county-level mortality rates for United States mortality data, for years 1999 to 2005, were obtained using an empirical Bayes procedure based on a posterior distribution. Results compared with two other estimation methods show that empirical Bayes estimation produces less variable estimates for county mortality rates, especially for low definitions of the target. Excess death rates are lower for women than for men, even though excess death counts for women become higher after the age of 75. Poorly performing states or counties are not always the principal source of excess deaths. Strategies to reduce the number of excess deaths should focus not on reducing the excess mortality rates of the poorly performing counties or states, but on pushing for more stringent standards for those that contribute the largest number of excess deaths. |
| Project Title: |
Modeling Morphine Titration and Conversions in Palliative Care |
| Student |
Brittney Bailey |
| Mentor(s): |
Paul R. Hutson |
| Abstract: |
This study uses the R environment to simulate and evaluate two common morphine dosing scenarios. The first is a conversion from a basal intravenous (IV) infusion of morphine to an oral morphine dose of MS-Contin® (MSC); the second is a basal IV infusion augmented by patient- and nurse-initiated IV boluses. Pharmacokinetic parameters were derived from published clinical trials, and plasma concentration models were based on a one-compartment model of the body. Results showed that 1) MSC should not be introduced before the discontinuation of the basal infusion; and 2) patient- and nurse-initiated IV boluses were effective in raising the overall plasma-morphine concentration levels but did not have a significant additive effect on concentration from one bolus to the next. |
| Project Title: |
Warfarin Dose Model |
| Student |
Ashwin P. Devendiran |
| Mentor(s): |
David Page |
| Abstract: |
Warfarin is widely used as an oral anticoagulant in many countries. The correct dose of warfarin depends on various factors, such as the genetic variability and geographic area of the patients. The pharmacogenetic algorithm, which uses an ordinary least-squares regression model, has various limitations, including a high mean absolute error. As a result, a new model is required to minimize the adverse effects caused by an incorrect dose of warfarin. In this research, three regression methods, namely least median square regression, pace regression, and support vector regression, are used in an attempt to create a new, better model. Warfarin dose data collected by the International Warfarin Pharmacogenetic Consortium (IWPC), comprising clinical and genetic data of the patients, were used to fit the models. Using these data, the pharmacogenetic model was first reproduced to make sure there were no experimental errors. In addition, various statistical models were tested to achieve a minimal mean absolute error. The mean absolute error of two of the models, least median square regression and pace regression, was close to that of the pharmacogenetic algorithm; however, neither improved significantly on the pharmacogenetic algorithm. On the other hand, support vector regression had a higher mean absolute error than any other algorithm tested. The pharmacogenetic algorithm is still the better model for predicting the initial dose of warfarin. In the future, a more complex model that uses more attributes might predict the initial dose of warfarin with a lower mean absolute error. |
| Project Title: |
A Bayesian Approach to Estimating Phylogenies Using Both Gene Order and Sequence Data Analysis |
| Student |
Emily Lundt |
| Mentor(s): |
Bret Larget and Alex Richard Kreibich |
| Abstract: |
Bryophytes comprise three classes that share some of the earliest ancestors of land plants. A long-standing open question is which of these classes (mosses, liverworts, or hornworts) shares the most recent ancestor with land plants. Determining which class shares the more recent ancestor is crucial to gaining insight about the characteristics leading to the widespread success of land plants. Chloroplast organelles, and thus chloroplast DNA, are common among bryophytes, land plants, and algae, providing a means to study their relationships. Phylogenetic tree estimation was conducted for a subset of 19 species with sequenced chloroplasts. The free software BADGER was used to analyze gene order, and the free software MrBayes was used to analyze the nucleotide sequence data of the rbcL, atpA, and atpB genes. Analysis of gene order was inconclusive in that all possible phylogenetic trees were equally likely. Analysis of sequence data for rbcL and atpB supports hornworts as the closest relative of land plants; however, atpA most strongly supports a sister clade of hornworts and moss. |
| Project Title: |
Socioeconomic Status and Breast Cancer Survival |
| Student |
Ritesh Ramchandani |
| Mentor(s): |
Amy Trentham-Dietz, Ronald Gangnon and Brian Sprague |
| Abstract: |
Some studies have observed lower breast cancer survival rates in women of lower socioeconomic status (SES). The primary aim of this study is to determine if there is an association between SES and breast cancer mortality in a cohort of Wisconsin women. We analyzed data from 5,865 Wisconsin women, ages 20-69, who were diagnosed with invasive breast cancer during 1995-2003. Vital status was determined up to December 31, 2006 from the National Death Index. We considered both individual and census tract SES variables as predictors of survival. Individual variables considered were education, marital status, and household income and size. Census tract variables were percent above age 25 without a high school diploma, percent in poverty, percent urbanicity, and percent in working class jobs. Cox proportional hazards models were used to measure the associations of SES and breast cancer-specific survival when controlling for other prognostic factors. When controlling for age at diagnosis, stage at diagnosis, tumor histologic type, year of interview, and mammography screening history, greater mortality was observed for women without a college education versus those with a college degree (Hazard Ratio: 1.36; 95% CI: 1.06-1.74), and for divorced, separated, or widowed women (1.33; 1.03-1.66) as compared with married women. Higher mortality was associated with women in census tracts with ≥ 20 percent without a high school diploma (1.43; 1.07-1.91) and 10-19.9 percent without a high school diploma (1.30; 1.05-1.62) versus < 10 percent. Women in census tracts with high working class populations also had higher mortality rates (1.26; 1.00-1.57). Controlling for age, lower SES was also associated with lower odds of having received a mammogram 5 years before diagnosis. Lower SES was associated with reduced survival and lack of mammography screening.
This suggests the need to improve knowledge of and access to breast cancer screening in women of lower SES and to achieve optimal treatment and care for all women after diagnosis. |
2008 RESEARCH PROJECTS
| Project Title: |
QTL Mapping in an Outcross in Sticklebacks |
| Student |
Leah Fehr |
| Mentor(s): |
Karl Broman |
| Abstract: |
Stickleback fish are a valuable model organism for studies of evolution, particularly for the process of speciation (how new species are formed). Different stickleback species, living in different environments, show large morphological and behavioral differences. Such quantitative phenotypes are generally affected by multiple genetic loci (often called quantitative trait loci, QTL). We seek to map the QTL contributing to two phenotypes, body size and the number of lateral plates, in a complex outcross derived from two stickleback fish of two different species. The key methodological question concerns the joint analysis of a set of four families derived from the two founder fish. We hope that our new approach, analyzing the families jointly rather than individually, will give greater power to detect QTL and will provide improved estimates of the effects and locations of the QTL. |
| Project Title: |
Fertility Timing Within Marriage |
| Student |
Jaclyn Herrera |
| Mentor(s): |
Tara Becker |
| Abstract: |
Since the 1960s, the relationship between marriage and fertility has changed greatly in the United States. Using data from the 1969-2005 waves of the Panel Study of Income Dynamics, we investigate the timing of fertility within married couples, with the hypothesis that the time span from the start of the marriage to the first birth has increased over time. Life table methodology is used to determine the expected number of years a couple will remain married before experiencing their first birth and the proportion of couples expected to experience a birth in each year of marriage. Results show that there is a slight delay, as time goes on, in childbearing within the first 4 years of marriage. In addition, it appears that a greater proportion of births occur in the fourth and fifth years of marriage in the 1984-1988 cohort group than in earlier cohort groups. Finally, there appears to be no dominant pattern in the expected number of years until a first birth within married couples and across cohort groups. |
| Project Title: |
A Computational Analysis of MicroRNA Impact on Gene Expression Levels in a Diabetes Study |
| Student |
Christine Muganda |
| Mentor(s): |
Mark Craven |
| Abstract: |
As a means of better understanding type II diabetes mellitus, we study the regulation of certain sets of seemingly co-regulated disease-related genes. Specifically, we investigate the extent to which the expression levels of these genes are impacted by microRNAs (miRNAs). Using the mouse genome as a model, the 3’ untranslated regions of the genes are tested for miRNA target enrichment using TargetScan and statistical analysis. Co-regulated gene sets are then searched for conserved motifs using MEME. Finally, a decision tree is learned on the gene set, and cross-validation is used to explore the validity of the regularities found by this machine learning technique. The data sets obtained from adipose and islet tissue were found to have a statistically significant amount of miRNA target enrichment as identified by TargetScan. |
2007 RESEARCH PROJECTS
| Project Title: |
Similarity Functions for Toxicogenomic Microarray Data |
| Student |
Deborah Muganda |
| Mentor(s): |
Mark Craven and Adam Smith |
| Abstract: |
Current techniques to assay a chemical for toxicity involve many separate tests, which can take an inordinate amount of time, person-hours, and money. However it is well known that the exposure of an animal to a chemical alters the expression of many of its genes, particularly in its liver. These expression changes can be measured in parallel using microarray technology. The goal of Wisconsin's EDGE (Environment, Drugs, and Gene Expression) program is to create a central database for this data, in order to simplify and make more accurate the comparison of new chemicals to ones that are already well studied and thus to speed up testing. A central problem in this type of analysis is calculating the similarity of a pair of gene-expression profiles, each of which represents measurements over thousands of genes. The goal of this project is to implement and empirically evaluate various functions for computing expression profile similarity. |
| Project Title: |
Heavy Metal Exposure and Breast Cancer Risk |
| Student |
Luis Crouch |
| Mentor(s): |
Ronald Gangnon and Jane McElroy |
| Abstract: |
While heavy metals are known to have a variety of adverse effects on health, there has to this point been little investigation of their possible associations with carcinogenesis. In particular, there are no known links between barium, cesium, cobalt, lead, manganese, molybdenum, thallium, tungsten, or uranium exposure and development of breast cancer. In this population-based case-control study, urinary analysis was used to obtain metal concentrations which were scaled using both creatinine and specific gravity-adjustment. Multivariable logistic regression was performed for each metal, correcting for both age and a set of established risk factors. Metal concentrations were treated as both categorical (quartiles) and continuous (linear) variables. Cobalt (OR=0.48), lead (OR=1.99), and tungsten (OR=2.02) were found to be significantly associated with development of breast cancer when comparing the highest quartile of exposure to the lowest quartile of exposure, after adjusting metal concentrations for specific gravity and controlling for age, parity, age at first birth, family history of breast cancer, recent alcohol consumption, body mass index, age at menarche, menopausal status, age at menopause, type of postmenopausal hormone use, education, and marital status. Additional study is needed to verify these results, but they provide an important starting point in evaluating heavy metal exposure as an environmental risk factor for breast cancer. |
| Project Title: |
Gene Expression Analysis of Rhino-Virus Induced Asthma |
| Student |
Nadia Abuelezam |
| Mentor(s): |
Sunduz Keles and Hyonho Chun |
| Abstract: |
Rhino-virus infection has been shown to worsen respiratory symptoms in asthmatic individuals. We hypothesize that the effect rhino-virus has on asthmatic individuals has a genetic basis. Using 22 microarray data sets from cell cultures of healthy and asthmatic individuals, we sought to determine a list of genes differentially expressed in asthmatic cells after rhino-virus infection. After fitting two gene-level models and an empirical Bayes model, such a list was created. Using gene set enrichment analysis, genes important to the interaction between asthma and rhino-virus were also identified. Further research and analysis will be completed on the data by the researchers at the University of Wisconsin at Madison Hospital. |
| Project Title: |
Empirical Study of Side Chain Cross-Peak Intensity Patterns |
| Student |
John Lewis |
| Mentor(s): |
John Markley, Hamib Eghbalnia and Arash Bahrami |
| Abstract: |
Side chains are an important component in understanding the biological properties of a protein’s structure. The cross-peak intensities for these side chains are useful in analyzing the interaction of atoms in the amino acids that make up the protein. Using Nuclear Magnetic Resonance (NMR) spectroscopy, the intensities for the proton coupling interactions for the proteins were obtained. We initially came across problems such as patterns missing from the HCCH TOCSY files and too few patterns to discover trends. However, by studying the coupling intensities of the amino acids, we were able to identify many different coupling patterns and intensity trends that applied to many of the amino acids in both the same and different proteins. |
| Project Title: |
Medicare Part D: An Evaluation of the Prescription Drug Benefit |
| Student |
Erin Fisher |
| Mentor(s): |
Margie Rosenberg |
| Abstract: |
We conducted an investigation into Medicare Part D, the new prescription drug benefit made available to Medicare beneficiaries in 2006. The introduction of this benefit marked the first time prescription drugs, which have assumed an important role in health care, were covered by Medicare. Our evaluation of this benefit was carried out in two ways. First, we conducted a literature review of Medicare Part D, from which we present conclusions from previously published articles on some issues encountered with Part D. We also conducted an independent analysis of the cost of medications that would be covered by Part D by modeling prescription expenditures of the eligible population, based on consumer data from the Medicare Current Beneficiary Survey. Our analysis produced an estimate of $1,331 per beneficiary in insurance-covered drug expenditures, which led to a ten-year estimate of $520 billion. This figure is comparable to the projection of $534 billion released by the Centers for Medicare and Medicaid Services (CMS) Office of the Actuary in February of 2004, but is $125 billion in excess of the budget accepted by Congress when the bill was passed in 2003 (Foster, 2004). This raises concerns about the future of both the drug benefit and Medicare, and cost-containing reforms altering the program may need to be introduced. |
| Project Title: |
Computational Methods for Toxin Characterization from Transcription Profile |
| Student |
Deborah Muganda |
| Mentor(s): |
Mark Craven and Adam Smith |
| Abstract: |
Determining the toxicity of a new chemical is a costly and time-consuming process. Past research has shown that computational methods can be used to classify the toxicity of chemicals based on the transcription profiles that they induce. Since we expect some methods of comparing transcription profiles to result in more accurate classifications, we tested four distance algorithms: Euclidean distance, scale-independent Euclidean distance, Pearson’s correlation, and Spearman’s rank correlation. Pearson’s rank correlation and scale-independent Euclidean were able to correctly classify the toxicity of a chemical based on its transcriptional profile more accurately than the baseline algorithm, Euclidean distance. |
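The four distance algorithms named in the abstract above can be sketched as follows. This is a minimal illustration, not the project's code: the reading of "scale-independent" as rescaling each profile to unit length is an assumption, the correlations are turned into distances as one minus the coefficient, and `nearest_label` and the toy profiles are hypothetical.

```python
import numpy as np


def euclidean(x, y):
    """Plain Euclidean distance between two expression profiles."""
    return float(np.linalg.norm(x - y))


def scale_independent_euclidean(x, y):
    """Euclidean distance after rescaling each profile to unit length.

    Assumption: this is only one possible reading of "scale-independent".
    """
    return float(np.linalg.norm(x / np.linalg.norm(x) - y / np.linalg.norm(y)))


def pearson_distance(x, y):
    """One minus Pearson's correlation coefficient."""
    return float(1.0 - np.corrcoef(x, y)[0, 1])


def _ranks(v):
    """Ranks 0..n-1 (ties are not averaged; adequate for continuous data)."""
    return np.argsort(np.argsort(v)).astype(float)


def spearman_distance(x, y):
    """One minus Spearman's rank correlation (Pearson's correlation on ranks)."""
    return pearson_distance(_ranks(x), _ranks(y))


def nearest_label(query, profiles, labels, distance):
    """Label of the reference profile closest to the query (hypothetical helper)."""
    dists = [distance(query, p) for p in profiles]
    return labels[int(np.argmin(dists))]


# Toy data: the query has the same shape as the "toxic" profile at a tenth of the scale.
profiles = [np.array([10.0, 20.0, 30.0, 40.0]), np.array([4.0, 3.0, 2.0, 1.0])]
labels = ["toxic", "nontoxic"]
query = np.array([1.0, 2.0, 3.0, 4.0])

for fn in (euclidean, scale_independent_euclidean, pearson_distance, spearman_distance):
    print(fn.__name__, nearest_label(query, profiles, labels, fn))
```

On this toy input the plain Euclidean distance is dominated by overall magnitude, while the other three measures respond to the shape of the profile.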
2006 RESEARCH PROJECTS
| Project Title: |
Functional Principal Component Analysis (PCA) in Vocal Tract Development |
| Student |
Samantha Bromfield |
| Mentor(s): |
Moo Chung and Houri Vorperian |
| Abstract: |
One objective of the Vocal Tract Development Lab is to characterize the anatomic growth of the vocal tract and surrounding structures as a first step towards understanding the biological basis of speech development. The vocal tract (VT) has been described as a two-tube model or resonator with an anterior-oral portion in the horizontal plane (VT-H) and a posterior-pharyngeal portion in the vertical plane (VT-V). The growth of the vocal tract is non-uniform across sex, implying that the two portions may develop differently for males and females. The purpose of this statistical analysis was to assess the relational growth of the anterior versus posterior portions of the vocal tract as a function of sex. The comparison of sex differences was done using polynomial regression models. Three linear measurements were used in this analysis: 1) vocal tract length (VTL); 2) anterior vocal tract length (VT-H); and 3) posterior vocal tract length (VT-V). These measurements were obtained from imaging studies (MRI and CT) of 229 cases (83 female and 146 male) between birth and age 20. Results indicate that there are significant sex differences in the growth rate and growth pattern of all three vocal tract measurements analyzed. |
| Project Title: |
Smoking Cessation Benefit |
| Student |
David Gasca |
| Mentor(s): |
Marjorie Rosenberg |
| Abstract: |
Data were collected after the state of Wisconsin introduced a new insurance benefit for smoking cessation, with the aim of giving employers an estimate of the pharmaceutical cost of such a benefit. Pharmacotherapy claims include use of one of the four FDA-approved medications for smoking cessation. The subjects of the three-year study (2001-2003), for whom individual-level claims were collected, were state of Wisconsin employees, retirees, and adult dependents. The study concentrated on single-medication users, who made up 91% of all smoking cessation claimants, and compared their individual activity within the pharmacotherapy plan with the clinical guidelines provided by the Agency for Healthcare Research and Quality (AHRQ). |
| Project Title: |
Statistical Analysis of the Shannon Diversity Index |
| Student |
Linda Liu |
| Mentor(s): |
Rick Nordheim |
| Abstract: |
The purpose of this project was to assess the sampling properties of a widely used ecological measure, the Shannon diversity index. Results are presented from a study of the effects of sample size on the index, especially when applied to theoretical communities with different underlying species abundance distributions. The two distributions used in the study were geometric and lognormal, each with a range of parameters. We found that the response of the index to sample size depends on the shape of the distribution, but that for each distribution the bias decreased as sample size increased. We also analyzed the reliability of the t test used for comparing the Shannon indices for two samples. The study found that the realized performance of the t test reached closer agreement with the nominal performance (significance level of the test) for larger samples from each simulated community. |
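The effect of sample size on the index can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than the study's procedure: it uses the plug-in estimate H = -Σ p_i ln p_i, a hypothetical geometric community of 30 species with parameter k = 0.2, and multinomial sampling; none of these specifics come from the abstract.

```python
import numpy as np


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over the observed species."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())


def geometric_abundances(n_species, k):
    """Geometric-series relative abundances, p_i proportional to k * (1 - k)**i."""
    p = k * (1.0 - k) ** np.arange(n_species)
    return p / p.sum()


def mean_sample_index(p, sample_size, n_reps, rng):
    """Average plug-in Shannon index over repeated multinomial samples."""
    samples = rng.multinomial(sample_size, p, size=n_reps)
    return float(np.mean([shannon_index(s) for s in samples]))


rng = np.random.default_rng(0)
p = geometric_abundances(n_species=30, k=0.2)  # hypothetical community
true_h = shannon_index(p)

# Estimated bias of the sample index at a few (arbitrary) sample sizes.
biases = {n: mean_sample_index(p, n, 500, rng) - true_h for n in (25, 100, 1000)}
for n, bias in biases.items():
    print(f"n={n:5d}  estimated bias={bias:+.3f}")
```

Swapping `geometric_abundances` for a lognormal abundance generator would give the second kind of community the abstract mentions.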
| Project Title: |
Computational Analysis of Protein Interfaces |
| Student |
Corina Prieto |
| Mentor(s): |
Julie Mitchell and Steve Darnell |
| Abstract: |
Protein-protein interactions are important for the understanding of biological pathways. The physical properties of protein-protein interfaces are among the factors that determine their interaction behavior. Studying protein-protein interfaces, specifically “hot spot” residues that account for the majority of free energy in binding, may reveal more information about the properties associated with binding. It is important to predict hot spots because they can help us design novel protein interfaces through a better understanding of their interaction behavior. We used machine learning to generate knowledge-based decision-tree models that explore protein interfaces relative to flexibility, to determine whether this feature would improve the prediction of hot spots. However, important trends in the data were observed regarding the flexibility of hot spots based on whether they were hydrophobic or polar. |
| Project Title: |
3D Printing of Protein Models |
| Student |
Nicholas Stong |
| Mentor(s): |
George Phillips and Roman Aranda |
| Abstract: |
The Protein Data Bank (PDB) was established in 1971 at the Brookhaven National Laboratory; in 1998 it was transferred to the Research Collaboratory for Structural Bioinformatics, which is composed of Rutgers University, The University of Wisconsin-Madison, NIST, and the San Diego Supercomputer Center. They maintain a variety of information on 37,136 structures, all of which are made publicly available. Scientists all over the world determine this information experimentally, using a variety of techniques including x-ray crystallography and NMR spectroscopy. Most important is the 3-D model information available in .PDB files (Wikipedia). Using PyMOL, an open source molecular visualization system, this data can be viewed and manipulated. Using this program, proteins can be explored and better understood. |
| Project Title: |
Finding CRMs with Background Modeling |
| Student |
Aline Thomas |
| Mentor(s): |
Mark Craven and Keith Noto |
| Abstract: |
Promoter regions of co-regulated genes often contain sets of protein binding sites called cis-regulatory modules (CRMs). Probabilistic methods are often used in models that represent both binding site DNA and background DNA so that algorithms can find CRMs. In this project, an algorithm containing a two-state 5th-order hidden Markov model is compared to one containing a 5th-order Markov chain model to determine if a more sophisticated model will more accurately represent background DNA, and therefore, improve the ability of machines to find CRMs. Comparing their representation of DNA sequences, the hidden Markov model performed significantly better. However, the Markov chain model and the hidden Markov model found CRMs equally well with the motif finder. |
2005 RESEARCH PROJECTS
| Project Title: |
Correlation, Permutation, and the Brain |
| Mentor(s): |
Moo Chung |
| Abstract: |
In brain imaging studies, it is necessary to correlate psychological measures with neuroanatomical measures and compare the correlation measure between groups while removing the effect of gender and age. Partial correlation can be used for this purpose. We will test if the partial correlation measure is different between the groups of autistic and normal subjects using the permutation test. Since it is required to test the correlation in more than 40,000 sites in the brain, we will also extend the permutation test to the problem of testing multiple hypotheses simultaneously. |
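The test described above, at a single brain site, might be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: partial correlation is computed by regressing both measures on the covariates and correlating the residuals, the null distribution comes from shuffling group labels, and the synthetic data and the names `residualize`, `partial_corr`, and `permutation_test` are hypothetical.

```python
import numpy as np


def residualize(v, covariates):
    """Residuals of v after least-squares regression on covariates plus an intercept."""
    design = np.column_stack([np.ones(len(v)), covariates])
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coef


def partial_corr(x, y, covariates):
    """Correlation of x and y after removing the linear effect of the covariates."""
    return float(np.corrcoef(residualize(x, covariates),
                             residualize(y, covariates))[0, 1])


def permutation_test(x, y, covariates, group, n_perm=1000, seed=0):
    """Two-sided permutation p-value for a group difference in partial correlation.

    `group` is a boolean array; its labels are shuffled to build the null distribution.
    """
    rng = np.random.default_rng(seed)

    def stat(g):
        return (partial_corr(x[g], y[g], covariates[g])
                - partial_corr(x[~g], y[~g], covariates[~g]))

    observed = stat(group)
    null = np.array([stat(rng.permutation(group)) for _ in range(n_perm)])
    # Add one to numerator and denominator so the p-value is never exactly zero.
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (n_perm + 1)
    return observed, float(p_value)


# Synthetic data (hypothetical): the two groups have opposite-signed correlations.
rng = np.random.default_rng(1)
n = 80
group = np.arange(n) < 40                 # True marks the first group
covariates = rng.normal(size=(n, 2))      # stand-ins for age and gender
x = rng.normal(size=n)
y = np.where(group, x, -x) + 0.3 * rng.normal(size=n) + covariates[:, 0]

observed, p_value = permutation_test(x, y, covariates, group)
print(f"difference in partial correlation = {observed:.2f}, p = {p_value:.4f}")
```

For many sites tested simultaneously, one common extension is to record the maximum statistic over all sites in each permutation and compare each site against that distribution; whether the project took this route is not stated in the abstract.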
| Project Title: |
Prediction of cardiorespiratory fitness (CRF) in obese children |
| Mentor(s): |
Jens Eickhoff |
| Abstract: |
Low cardiorespiratory fitness (CRF) in children can lead to heart disease, stroke, diabetes, and hypertension in adult life. CRF is typically assessed by measuring maximum oxygen uptake (VO2max), which is performed in a laboratory setting (using graded exercise tests) and may require considerable expense in terms of time, trained personnel, and medical supervision (American College of Sports Medicine, 2000). Moreover, accurate assessment of VO2max may be difficult in obese children. However, less expensive standard measurements, e.g., BMI, lean body mass, fat percentage, etc., can be used to predict VO2max. In this project, we will develop a model for predicting VO2max in obese children using various standard demographic and body composition variables. |
| Project Title: |
Insulin Production in Diabetes Model: Spatial Comparison of Strains |
| Mentor(s): |
Brian S. Yandell |
| Abstract: |
Insulin is produced in the pancreas, specifically in the Islets of Langerhans by beta cells. (Yes, there are alpha cells as well.) Scientists in biochemistry have images of hundreds of these islets for two types of mice. One strain, BTBR.ob, mimics Type II (mature onset) diabetes. A second strain, B6.ob, appears healthy. We have roughly a hundred islets per mouse, with 3 mice per strain. We can visually see differences in the production of insulin in the islet beta cells in these images. And we have summary measurements on each islet that we can use to compare the two types of mice. Further, there is opportunity to consider how we might improve on the measurement process and the experimental design. The goal is to clearly infer differences in the quality of insulin production between BTBR.ob and B6.ob. You can/will meet with the scientists and work with the measurement system as well. |
2004 RESEARCH PROJECTS
| Project Title: |
Assessing the effect of stratification on balance and efficiency in studies with stratified, blocked randomization |
| Mentor(s): |
Tom Cook |
| Abstract: |
Randomization is essential for assessing treatment effects in controlled experiments. Randomization achieves several goals: ensuring that there are no systematic differences between treatment groups, ensuring that sample sizes in each treatment group are approximately the same, and providing a theoretically rigorous basis for statistical inference. Blocked randomization can be used to increase the degree to which balance is achieved; however, if stratification is used, the balance is decreased with increasing number of strata. We will explore the impact that the number of strata has on balance and on the operating characteristics of the corresponding statistical tests. |
| Project Title: |
Measuring inter-rater agreement between pathologists in oncology studies |
| Mentor(s): |
Jens Eickhoff |
| Abstract: |
In many oncology studies, tumor tissues are examined and categorized (e.g., benign, malignant) by at least two pathologists. Consequently, it is often of interest to quantify the magnitude of agreement between the pathologists' ratings. A common way to measure rater agreement is to use the kappa coefficient. However, the use of such a measure has become increasingly controversial, in part because the kappa coefficient is unnecessarily stringent in crediting so much of the observed agreement to chance; if certain categories predominate, seemingly good agreement can still result in low values of the kappa coefficient.
In this project, latent variable models will be utilized to measure inter-rater agreement. Latent variable models are intuitively appealing in this framework and furnish meaningful interpretation and statistical inference. Specifically, parameter estimates can be used to quantify inter-rater variability and rater classification thresholds. This project will involve simulation studies as well as analysis of real data.
|
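The point about predominant categories can be made concrete with a short calculation. The sketch below computes Cohen's kappa for two raters from a table of counts; both tables are made-up numbers chosen so that raw agreement is 90% in each, and `cohen_kappa` is an illustrative helper, not part of the project.

```python
import numpy as np


def cohen_kappa(table):
    """Cohen's kappa from a square table of counts (rater A in rows, rater B in columns)."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p_observed = np.trace(table) / n
    # Agreement expected by chance, from the two raters' marginal frequencies.
    p_chance = float((table.sum(axis=1) / n) @ (table.sum(axis=0) / n))
    return float((p_observed - p_chance) / (1.0 - p_chance))


balanced = [[45, 5], [5, 45]]  # hypothetical counts: categories equally common
skewed = [[85, 5], [5, 5]]     # hypothetical counts: one category predominates

print(f"balanced: kappa = {cohen_kappa(balanced):.3f}")
print(f"skewed:   kappa = {cohen_kappa(skewed):.3f}")
```

With the same raw agreement, the skewed table yields a much lower kappa because more of the agreement is attributed to chance.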
| Project Title: |
Empirical Bayes methods for matched microarray data |
| Mentor(s): |
Christina Kendziorski and Timothy Grant |
| Abstract: |
DNA microarrays allow for large scale coordinate monitoring of gene expression. In the late 1990's, they were referred to as the "first great hope" for providing global views of biological processes (Nature Genetics Chipping Forecast, 1999). The enthusiasm was not misguided. Microarrays are now the most widely used tool to efficiently measure an organism's gene expression levels.
Microarray experiments result in tens of thousands of measurements for a single individual and, most often, groups of individuals are studied across multiple biological conditions. The complexities of the data structure pose new challenges for the statistician. A number of statistical methods have been developed to identify genes differentially expressed between two conditions. However, no methods explicitly address the situation where the data is matched. Accounting for the matched structure could lead to much needed increases in sensitivity.
In the summer of 2004, we will extend our empirical Bayes methods for identifying differentially expressed genes to account for matched data. We will also consider simple extensions of other approaches to allow for matching. The operating characteristics of the approaches will be assessed using simulations and data from the laboratories of Drs. Gould and Ahlquist. The Gould lab studies mammary cancer and has collected microarray data from control and RXR treated rats that have been pair fed to control for treatment induced weight loss. The Ahlquist lab is studying nasal cancer and has microarray data from both healthy and cancerous tissues of individual patients.
|
| Project Title: |
Cognitive outcomes in a randomized trial of raloxifene |
| Mentor(s): |
Rebecca Koscik |
| Abstract: |
Background: Recent findings from the Women’s Health Initiative (WHI) raise serious concerns about the safety and feasibility of prolonged therapy with opposed conjugated estrogen. Consequently, there is a critical need to identify alternatives to traditional hormone therapy. One such alternative, raloxifene, is a selective estrogen receptor modulator (SERM) used commonly for the treatment of osteoporosis, and known to exhibit neuromodulatory and neurotrophic actions in the brain. Further, raloxifene has the potential to enhance cognitive function of healthy older women. However, no clinical study has evaluated whether administration of raloxifene could improve memory or other cognitive symptoms of postmenopausal women with Alzheimer’s disease (AD).
Objective: To determine if short-term administration of raloxifene would improve performance on a comprehensive battery of tests of cognition and mood for postmenopausal women with AD.
Methods: The present ongoing randomized, placebo-controlled, double-blind, parallel-group clinical study evaluates the cognition-enhancing efficacy of 120 mg/day of raloxifene in postmenopausal women with probable AD. The study involves three months of therapy with raloxifene, with comprehensive evaluation of cognition and mood at baseline, at weeks 4, 8, and 12 on treatment, and again at week 20, two months after treatment termination. |
| Project Title: |
Analysis of nutrient intake data from 3-day food records |
| Mentor(s): |
Hui-Chuan Lai |
| Abstract: |
Epidemiologic investigations of diet and various measures of health-related outcomes are common in the scientific literature. However, obtaining accurate and reliable dietary data that are representative of usual intake for the time period of interest is challenging. Conducting or reading about dietary analyses requires an understanding of the potential sources of error involved in utilizing dietary data. Knowledge of the Dietary Reference Intakes (US nutrient intake standards) is also essential. Dietary assessment tools vary; dietary data in this project come from 3-day food records. Potential sources of error may include (1) under- or over-reporting of food intake by study participants (intentional or unintentional), (2) inadequate information in the nutrient database for unusual food items, (3) differing data entry of similar food items by study personnel, (4) variation in intake over the 3 days, and (5) inter-individual differences in intake that may differ according to other factors (e.g., gender, age, body weight). Another important consideration in analyzing dietary data is whether nutrient intake should account for total calorie intake. These issues will be explored with selected nutrients in a sample of healthy, college-aged subjects. |
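One common way to account for total calorie intake is to express a nutrient as a density (amount per 1000 kcal). The sketch below is a minimal illustration of that idea for a single 3-day food record; the abstract does not say which adjustment the project uses, and the intake values and function names are hypothetical.

```python
from statistics import mean


def mean_daily_intake(daily_values):
    """Average intake across the recorded days."""
    return mean(daily_values)


def nutrient_density(nutrient_by_day, kcal_by_day, per_kcal=1000):
    """Nutrient amount per `per_kcal` kilocalories over the whole record."""
    total_kcal = sum(kcal_by_day)
    if total_kcal <= 0:
        raise ValueError("total energy intake must be positive")
    return per_kcal * sum(nutrient_by_day) / total_kcal


# Hypothetical 3-day food record for one subject.
calcium_mg = [900, 1100, 1000]
energy_kcal = [1800, 2200, 2000]

avg_calcium = mean_daily_intake(calcium_mg)
calcium_density = nutrient_density(calcium_mg, energy_kcal)
print(avg_calcium, "mg/day;", calcium_density, "mg per 1000 kcal")
```

Two subjects with the same absolute intake can have different densities if their total energy intake differs, which is why the choice of adjustment matters for the analysis.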
| Project Title: |
Machine learning for drug design |
| Mentor(s): |
David Page |
| Abstract: |
This project will involve applying a machine learning algorithm to determine structural properties common to molecules that work as anti-cancer agents. The student(s) will use a molecular modeling software package to draw the molecules and estimate their three-dimensional structures. They then will label the molecules as "active" or "inactive" according to chemical tests performed robotically (though the student(s) will not be involved in running these tests). Finally, the student(s) will run the machine learning algorithm to produce a model that predicts anti-cancer activity from molecular structure. Cross-validation will be used to estimate the accuracy of the model(s) produced by this approach. |
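The cross-validation step can be sketched as follows. The abstract does not name the learning algorithm, so a 1-nearest-neighbor classifier is used here purely as a stand-in, and the two-number molecular descriptors are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_1nn(train, query):
    """Label of the training example closest to `query` (squared Euclidean)."""
    nearest = min(
        train,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], query)),
    )
    return nearest[1]


def cross_validated_accuracy(examples, k=5, seed=0):
    """Hold out each fold in turn, train on the rest, and score the held-out fold."""
    folds = k_fold_indices(len(examples), k, seed)
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [ex for i, ex in enumerate(examples) if i not in held]
        for i in held_out:
            features, label = examples[i]
            if predict_1nn(train, features) == label:
                correct += 1
    return correct / len(examples)


# Hypothetical (descriptor vector, activity label) pairs.
examples = [
    ((0.1, 0.2), "inactive"), ((0.2, 0.1), "inactive"),
    ((0.0, 0.3), "inactive"), ((0.3, 0.0), "inactive"),
    ((0.2, 0.2), "inactive"),
    ((0.9, 1.0), "active"), ((1.0, 0.8), "active"),
    ((0.8, 0.9), "active"), ((1.1, 1.0), "active"),
    ((0.9, 0.9), "active"),
]

accuracy = cross_validated_accuracy(examples, k=5)
print(f"5-fold cross-validated accuracy: {accuracy:.2f}")
```

Because every molecule is scored only by a model that never saw it during training, the resulting accuracy is a less optimistic estimate than accuracy measured on the training data itself.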
| Project Title: |
Statistical genomics: microarray gene mapping in mice |
| Mentor(s): |
Brian Yandell |
| Abstract: |
I have been working closely with the laboratory of Prof. Alan Attie in Biochemistry on a mouse model for diabetes and obesity. Recently Christina Kendziorski joined our team. We now have an F2 genetic cross with 60 mice and over 40,000 measurements per mouse. These measurements are mRNA expression abundances obtained with Affymetrix chips. They represent chemical signals inside liver tissue at the time of mouse sacrifice. Differences among mice may shed light on the biochemical basis of diabetes and obesity. The key question: what are the primary genetic factors, or genomic regions, that influence these mRNA expression measurements? Our studies and those of others suggest that the story is not simple, but we are beginning to get some clues. There are interesting statistical questions at a variety of levels, depending on a student’s abilities and inclination. |
2003 RESEARCH PROJECTS
To view a listing of 2003 through 2001 Research Projects please click the link below:
2003 - 2001 Research Projects
|