Is a set of genes enriched?
Brad Efron PhD
Department of Statistics
Wednesday, October 18, 2006, 4:00 pm
The statistical microarray literature has focused on detecting individual genes that are expressed differently under Treatment and Control conditions. A combination of insufficient data per gene and highly multiple simultaneous inference often produces disappointing results. "Enrichment" analysis instead considers predefined groups of genes, such as pathways, using the results from all of the genes in a group to detect patterns of differential expression. Subramanian et al. (2005) proposed an interesting enrichment test based on the Kolmogorov-Smirnov statistic, while the "Limma" Bioconductor algorithm is based on means of z scores. We consider a general class of enrichment tests from the point of view of efficient detection. A new enrichment statistic, "maxmean", is shown to have efficiency advantages over both the mean and the Kolmogorov-Smirnov tests. Obtaining a legitimate p-value for the enrichment of a given gene-set can involve permuting both the rows and the columns of the expression matrix.
This is joint work with Rob Tibshirani.
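The abstract does not spell out the statistics, so the following is only a rough sketch under stated assumptions. It takes "maxmean" to mean the larger of two averages over a gene set: the average of the positive parts of the z scores and the average of the negative parts, each divided by the full set size. The function names, the `n_draws` parameter, and the add-one p-value convention are hypothetical choices made for illustration. The p-value shown uses row randomization only (comparing the set against random gene sets of the same size); it does not perform the column (sample label) permutation that the abstract says may also be needed.

```python
import random


def mean_stat(z):
    """Plain average of the z scores in a gene set."""
    return sum(z) / len(z)


def maxmean_stat(z):
    """Assumed maxmean: larger of the averaged positive and negative parts.

    Both averages divide by the full set size, so a few large scores in one
    direction are not cancelled by scores in the other direction.
    """
    m = len(z)
    pos = sum(max(v, 0.0) for v in z) / m
    neg = sum(max(-v, 0.0) for v in z) / m
    return max(pos, neg)


def row_randomization_pvalue(all_z, set_idx, stat, n_draws=2000, seed=0):
    """Compare a gene set's statistic with random gene sets of equal size.

    Illustrative only: this randomizes rows (genes) and leaves out the
    column (sample label) permutation step.
    """
    rng = random.Random(seed)
    observed = stat([all_z[i] for i in set_idx])
    m = len(set_idx)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(all_z, m)  # random gene set, without replacement
        if stat(draw) >= observed:
            hits += 1
    # Add-one convention keeps the estimate strictly positive.
    return (hits + 1) / (n_draws + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    # Simulated z scores: 1000 null genes, then a 20-gene set in which half
    # the genes are shifted up and half are shifted down.
    null_z = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    set_z = [rng.gauss(2.5, 1.0) for _ in range(10)]
    set_z += [rng.gauss(-2.5, 1.0) for _ in range(10)]
    all_z = null_z + set_z
    set_idx = list(range(1000, 1020))

    print("mean    :", mean_stat(set_z))
    print("maxmean :", maxmean_stat(set_z))
    print("p-value :", row_randomization_pvalue(all_z, set_idx, maxmean_stat))
```

In the simulated set the up and down shifts largely cancel in the plain mean, while the maxmean sketch still registers the set as unusual, which is the kind of two-directional pattern that motivates looking beyond the mean.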