singleDC {GSCA} | R Documentation |
This function runs a single-study gene set co-expression analysis (GSCA),
a differential co-expression (DC) analysis described in Choi and Kendziorski (2009).
The condition-specific gene-gene pairwise correlations
are calculated first; then, for each gene set defined in GSdefList, the
dispersion index (DI) is calculated across the condition-specific correlations.
Samples are randomly permuted across conditions nperm times.
Permutation-based p-values are then calculated from the rank of the
observed DI among the permuted DI values.
singleDC(data, group, GSdefList, nperm, permDI = FALSE)
data |
A data matrix with rows representing genes and columns
representing arrays. rownames(data) is used to subset a
sub-matrix from data for each gene set, so rows must be named
by the gene IDs used in GSdefList. For example, if
GSdefList defines gene sets in Entrez Gene IDs,
rownames(data) should be Entrez Gene IDs. |
group |
A numeric vector that specifies the number of arrays
(columns) in each condition. For example, if c(10, 5) is
provided, the first 10 columns of the data matrix are used for
one condition and the next 5 for the other condition. |
GSdefList |
A list of character vectors that define gene sets. Each entry of this list is a gene set. |
nperm |
The desired number of permutations. |
permDI |
Logical (TRUE/FALSE). If TRUE, the dispersion index values from the permutations are saved and returned; if FALSE, they are not returned. Default is FALSE. |
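As an illustration of how group and GSdefList select parts of data, the following minimal sketch splits a toy matrix by hand. The matrix, the gene IDs, and the names toy, cond, ids, sub1, and sub2 are hypothetical; dropping IDs that are absent from rownames(data) is an assumption made for this sketch, not a documented behavior of singleDC.

```r
# Toy expression matrix: 6 genes (rows) by 15 arrays (columns).
set.seed(1)
toy <- matrix(rnorm(6 * 15), nrow = 6,
              dimnames = list(paste0("g", 1:6), paste0("a", 1:15)))

group <- c(10, 5)   # 10 arrays in condition 1, then 5 in condition 2
GSdefList <- list(setA = c("g1", "g2", "g3"),
                  setB = c("g4", "g5", "g6", "g99"))  # g99 is not in toy

# Expand the group sizes into one condition label per column.
cond <- rep(seq_along(group), times = group)

# Subset rows by gene ID, keeping only IDs present in rownames(toy)
# (assumption for this sketch).
ids <- intersect(GSdefList$setB, rownames(toy))
sub <- toy[ids, , drop = FALSE]

# Split the gene-set sub-matrix into condition-specific blocks.
sub1 <- sub[, cond == 1, drop = FALSE]
sub2 <- sub[, cond == 2, drop = FALSE]
dim(sub1)
dim(sub2)
```

The column order of data therefore matters: arrays from the same condition must be stored contiguously, in the order given by group.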
Samples (columns) are permuted across conditions. For each permutation, condition-specific correlations are re-calculated from the permuted samples, and dispersion indices (DIs) are calculated from those permutation-based correlations. Because the focus is on differences between conditions, which correspond to large DI values, the p-value for each gene set is calculated as:
p = sum(permutation DIs >= observed DI) / nperm.
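The permutation procedure above can be sketched for a single gene set as follows. This is an illustration, not the package's internal code: the helper di_two is hypothetical, and its form (root mean squared difference between the two condition-specific correlation vectors) is an assumption about how the dispersion index is defined.

```r
# Hypothetical helper: DI for one gene set under two conditions, assumed here
# to be the root mean squared difference of the pairwise correlations.
di_two <- function(x, cond) {
  # x has genes in rows; cor() correlates columns, so transpose first.
  r1 <- cor(t(x[, cond == 1, drop = FALSE]))
  r2 <- cor(t(x[, cond == 2, drop = FALSE]))
  up <- upper.tri(r1)                  # count each gene pair once
  sqrt(mean((r1[up] - r2[up])^2))
}

set.seed(42)
x <- matrix(rnorm(5 * 20), nrow = 5)   # toy data: 5 genes, 20 arrays
cond <- rep(1:2, times = c(12, 8))     # same meaning as group = c(12, 8)
nperm <- 200

obs <- di_two(x, cond)

# Shuffling the condition labels is equivalent to permuting samples
# across conditions; the DI is recomputed for every shuffle.
perm <- replicate(nperm, di_two(x, sample(cond)))

# One-sided p-value, as in the formula above.
p <- sum(perm >= obs) / nperm
p
```

With this formula the smallest attainable non-zero p-value is 1 / nperm, so a small nperm limits the resolution of the test.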
DI |
A vector of dispersion indices, one per gene set. |
pvalue |
The permutation-based p-value for each gene set. |
permv |
The permutation-based DI matrix, with nperm columns, returned only when
permDI = TRUE. The first column is identical to what is returned by DI. |
Currently, singleDC implements DC analysis for two conditions
(e.g., tumor vs. normal) and three conditions (e.g., AA, AB, and BB
genotypes). For three conditions, pairwise DIs are calculated first
and then averaged internally.
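The three-condition case can be sketched as below, again for a single gene set. The helper di_pair and the toy data are hypothetical, and the DI form (root mean squared difference of pairwise correlations) is an assumption; only the step of averaging the pairwise DIs comes from the note above.

```r
# Hypothetical helper: DI between conditions a and b for one gene set.
di_pair <- function(x, cond, a, b) {
  ra <- cor(t(x[, cond == a, drop = FALSE]))
  rb <- cor(t(x[, cond == b, drop = FALSE]))
  up <- upper.tri(ra)                  # count each gene pair once
  sqrt(mean((ra[up] - rb[up])^2))
}

set.seed(7)
x <- matrix(rnorm(4 * 30), nrow = 4)       # toy data: 4 genes, 30 arrays
cond <- rep(1:3, times = c(10, 12, 8))     # same meaning as group = c(10, 12, 8)

# All condition pairs: (1,2), (1,3), (2,3).
pairs <- combn(3, 2)
pair_di <- apply(pairs, 2, function(ab) di_pair(x, cond, ab[1], ab[2]))

# Average the three pairwise DIs into one value for the gene set.
di3 <- mean(pair_di)
di3
```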
YounJeong Choi
Choi and Kendziorski, submitted.
data(LungCancer3)
GS <- LungCancer3$info$GSdef
GSdesc <- LungCancer3$info$Name
# nperm = 3 only keeps this example fast
dc.M <- singleDC(data = LungCancer3$data$Michigan, group = c(86, 10),
                 GSdefList = GS, nperm = 3, permDI = TRUE)