singleDC {GSCA}    R Documentation

The function to run a single-study GSCA, differential co-expression (DC) analysis

Description

This function runs the single-study GSCA differential co-expression (DC) analysis described in Choi and Kendziorski (2009). Condition-specific gene-gene pairwise correlations are calculated first; then, for each gene set defined in GSdefList, a dispersion index (DI) is calculated across the condition-specific correlations.

Samples are then randomly permuted across conditions nperm times. Permutation-based p-values are calculated from the rank of the observed DI among the permuted DI values.
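The two steps above (condition-specific correlations, then a dispersion index per gene set) can be sketched for a single toy gene set. This page does not give the DI formula, so the sketch assumes a root-mean-square difference between the two sets of pairwise correlations; toy_di and the simulated matrix are hypothetical and are not part of GSCA.

```r
# Hypothetical helper (not a GSCA function): an ASSUMED form of the
# dispersion index, the root-mean-square difference between the pairwise
# gene-gene correlations of two conditions.
toy_di <- function(x1, x2) {
  r1 <- cor(t(x1))      # genes are rows, so transpose before cor()
  r2 <- cor(t(x2))
  ut <- upper.tri(r1)   # count each gene pair once
  sqrt(mean((r1[ut] - r2[ut])^2))
}

set.seed(1)
# Simulated gene set: 4 genes, 10 arrays in condition 1 and 5 in condition 2
x <- matrix(rnorm(4 * 15), nrow = 4,
            dimnames = list(paste0("g", 1:4), NULL))
di_obs <- toy_di(x[, 1:10], x[, 11:15])
```

Under this assumed definition, di_obs is 0 when both conditions have identical correlations and grows as the correlation structures diverge.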

Usage

singleDC(data, group, GSdefList, nperm, permDI = FALSE)

Arguments

data A data matrix with rows representing genes and columns representing arrays. rownames(data) is used to subset a sub-matrix from data for each gene set. Rows must be named by the gene IDs used in GSdefList. For example, if GSdefList defines gene sets in Entrez Gene IDs, rownames(data) should be Entrez Gene IDs.
group A numeric vector that specifies the number of arrays (columns) in each condition. For example, if c(10, 5) is provided, the first 10 columns of the data matrix are used for one condition and the next 5 columns are used for the other condition.
GSdefList A list of character vectors that define gene sets. Each entry of this list is a gene set.
nperm The desired number of permutations.
permDI TRUE/FALSE. If set TRUE, dispersion index values from permutation are saved and returned; if FALSE, permutation-based dispersion index values are not returned. Default is FALSE.
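As an illustration of how these arguments fit together, the following sketch builds toy inputs and shows how a group vector such as c(10, 5) maps onto columns. All object names (toy_data, toy_group, toy_GS) are hypothetical, and dropping gene IDs that are absent from rownames(data) is an assumption made for the sketch, not documented behavior of singleDC.

```r
set.seed(42)
# Toy expression matrix: 6 genes (rows) by 15 arrays (columns);
# row names carry the gene IDs that the gene sets refer to.
toy_data <- matrix(rnorm(6 * 15), nrow = 6,
                   dimnames = list(paste0("gene", 1:6), NULL))
toy_group <- c(10, 5)   # 10 arrays in condition 1, then 5 in condition 2
toy_GS <- list(setA = c("gene1", "gene2", "gene3"),
               setB = c("gene4", "gene5", "gene6", "gene99"))

# One condition label per column, in column order
cond <- rep(seq_along(toy_group), times = toy_group)
cond1 <- toy_data[, cond == 1, drop = FALSE]
cond2 <- toy_data[, cond == 2, drop = FALSE]

# Sub-matrix for one gene set, matched on row names.
# Assumption: IDs missing from the data ("gene99") are simply dropped.
sub_B <- toy_data[intersect(toy_GS$setB, rownames(toy_data)), , drop = FALSE]
```

The entries of group should sum to ncol(data); otherwise some columns would be left unassigned to a condition.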

Details

Samples (columns) are permuted across conditions. For each permutation, condition-specific correlations are re-calculated from the permuted samples, and dispersion indices (DIs) are calculated from those permutation-based correlations. Because the focus is on differences between conditions, large DI values count as evidence, and the p-value for each gene set is calculated as:

p = sum(permutation DIs >= observed DI) / nperm
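The permutation scheme and the p-value formula can be sketched for a gene set of just two genes. The statistic used here, the absolute difference between the two condition-specific correlations, is a hypothetical stand-in for the dispersion index, and none of the names below come from GSCA. The Value section indicates that the first permutation column equals the observed DI; this sketch does not reproduce that detail.

```r
set.seed(7)
n1 <- 10                 # arrays in condition 1
n2 <- 5                  # arrays in condition 2
g1 <- rnorm(n1 + n2)     # simulated expression of gene 1
g2 <- rnorm(n1 + n2)     # simulated expression of gene 2

# Hypothetical statistic: the first n1 entries of `cols` are treated as
# condition 1 and the rest as condition 2.
stat <- function(cols) {
  c1 <- cols[seq_len(n1)]
  c2 <- cols[-seq_len(n1)]
  abs(cor(g1[c1], g2[c1]) - cor(g1[c2], g2[c2]))
}

obs_di <- stat(seq_len(n1 + n2))                    # original sample order
nperm <- 200
perm_di <- replicate(nperm, stat(sample(n1 + n2)))  # shuffled sample order

# p = sum(permutation DIs >= observed DI) / nperm
p <- sum(perm_di >= obs_di) / nperm
```

With this formula the p-value can only take multiples of 1/nperm, so nperm limits its resolution.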

Value

DI A vector of dispersion indices, one for each gene set.
pvalue A vector of permutation-based p-values, one for each gene set.
permv The permutation-based DI matrix, with nperm columns, returned only when permDI = TRUE. The first column is identical to what is returned by DI.

Note

Currently, singleDC implements DC analysis for two conditions (e.g., tumor vs. normal) and three conditions (e.g., AA, AB, and BB genotypes). For three conditions, pairwise DIs are calculated first and then averaged internally.
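The three-condition case can be sketched by enumerating the condition pairs and averaging their DIs. As before, toy_di is a hypothetical helper that assumes a root-mean-square form of the dispersion index, and the data are simulated; the sketch mirrors the averaging described in this note, not the internal code of singleDC.

```r
# Hypothetical helper (not a GSCA function), assuming an RMS-type DI.
toy_di <- function(x1, x2) {
  r1 <- cor(t(x1))
  r2 <- cor(t(x2))
  ut <- upper.tri(r1)
  sqrt(mean((r1[ut] - r2[ut])^2))
}

set.seed(3)
x <- matrix(rnorm(5 * 30), nrow = 5)        # 5 genes, 30 arrays
grp <- c(12, 10, 8)                         # e.g., AA, AB, and BB genotypes
cond <- rep(seq_along(grp), times = grp)    # condition label per column

pairs <- combn(length(grp), 2)              # columns: (1,2), (1,3), (2,3)
pair_di <- apply(pairs, 2, function(k) {
  toy_di(x[, cond == k[1]], x[, cond == k[2]])
})
di3 <- mean(pair_di)                        # average of the pairwise DIs
```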

Author(s)

YounJeong Choi

References

Choi and Kendziorski, submitted.

Examples

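# Load the example data set used on this page
data(LungCancer3)
GS <- LungCancer3$info$GSdef      # gene set definitions
GSdesc <- LungCancer3$info$Name   # gene set names
# nperm = 3 only keeps the example fast; p-values then have a resolution of 1/3
dc.M <- singleDC(data = LungCancer3$data$Michigan, group = c(86, 10),
                 GSdefList = GS, nperm = 3, permDI = TRUE)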

[Package GSCA version 1.1.0 Index]