metaDI {GSCA}    R Documentation

The function to run a meta-GSCA.

Description

This function runs a meta-GSCA, as described in Choi and Kendziorski (2009). Unlike singleDC, the study-specific values (e.g., gene-gene pairwise correlations, or differences in condition-specific correlation signs) must be pre-calculated and provided as input. For each gene set defined in GSdefList, the dispersion index (DI) between the two given studies is calculated and returned.

Gene pairs are randomly permuted across gene sets nperm times. Permutation-based p-values are then calculated from the rank of the observed DI among the permuted DI values.

Usage

metaDI(corr1, corr2, GSdefList, nperm, permDI = FALSE)

Arguments

corr1, corr2 Two symmetric matrices, one from each study of interest. For meta-GSCA, each is the difference between the two condition-specific correlation sign matrices within that study. Only the lower triangle is used. The row names of each matrix are used to extract the sub-matrix for each gene set, so rows must be named with the gene IDs used in GSdefList. (For example, if GSdefList defines gene sets in Entrez Gene IDs, the row names should be Entrez Gene IDs.)
GSdefList A list of character vectors that define gene sets. Each entry of this list is a gene set.
nperm The desired number of permutations.
permDI TRUE/FALSE. If set TRUE, dispersion index values from permutation are saved and returned; if FALSE, permutation-based dispersion index values are not returned. Default is FALSE.
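
As a minimal sketch of how the row names and GSdefList are expected to line up, the following toy code extracts the lower-triangle values of one matrix for each gene set. The gene IDs, the matrix values, and the helper pairs_for_set are all hypothetical and are not part of GSCA. Dropping IDs that are absent from the matrix is an assumption made for the illustration, not documented metaDI behaviour.

```r
# Toy symmetric matrix; gene IDs and values are made up for illustration.
genes <- c("g1", "g2", "g3", "g4")
m <- matrix(0, 4, 4, dimnames = list(genes, genes))
m[lower.tri(m)] <- c(1, -1, 0, 1, 1, -1)
m <- m + t(m)  # mirror the lower triangle so the matrix is symmetric

# GSdefList-style input: a list of character vectors of gene IDs.
gs_toy <- list(setA = c("g1", "g2", "g3"),
               setB = c("g2", "g4", "g9"))

# Hypothetical helper (not a GSCA function): subset the matrix by the
# gene set's IDs and return the lower-triangle values of the sub-matrix.
pairs_for_set <- function(mat, ids) {
  ids <- intersect(ids, rownames(mat))  # assumption: unknown IDs are dropped
  sub <- mat[ids, ids, drop = FALSE]
  sub[lower.tri(sub)]
}

pair_values <- lapply(gs_toy, pairs_for_set, mat = m)
```

If the row names do not use the same ID type as the gene sets, the subsetting step finds no matching genes, which is why the naming requirement above matters.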

Details

Gene pairs are permuted across gene sets, as described in Choi and Kendziorski (2009). For each permutation, annotated gene pairs (gene pairs that belong to at least one gene set) are randomly re-assigned to gene sets, and dispersion indices (DIs) are calculated for those random gene sets. Because the focus is on preservation, small DI values are of interest, and the p-value for each gene set is calculated as:

p = sum(permutation DIs <= observed DI) / nperm
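
The formula above can be sketched in a few lines of R. The helper perm_pvalue and the toy numbers are hypothetical and only mirror the formula as written; they are not the internal code of metaDI. The layout of the permuted values (one row per gene set, one column per permutation) is also an assumption.

```r
# Hypothetical helper (not a GSCA function): permutation p-value per gene set.
# observed: numeric vector with one observed DI per gene set.
# permuted: matrix with one row per gene set and one column per permutation.
perm_pvalue <- function(observed, permuted) {
  nperm <- ncol(permuted)
  # observed is recycled down the rows, so each row of permuted is
  # compared with the observed DI of the same gene set.
  rowSums(permuted <= observed) / nperm
}

# Toy values, made up for illustration.
observed <- c(setA = 0.2, setB = 0.9)
permuted <- rbind(setA = c(0.5, 0.1, 0.7, 0.3),
                  setB = c(0.4, 0.95, 0.6, 0.8))

p <- perm_pvalue(observed, permuted)
```

In this toy case one of the four permuted DIs for setA is at or below the observed value, so its p-value is 0.25; for setB three of four are, giving 0.75.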

Value

DI A vector of dispersion indices, one per gene set.
pvalue A vector of permutation-based p-values, one per gene set.
permv The permutation-based DI matrix, with nperm columns. Returned only when permDI = TRUE. The first column is identical to DI.

Note

Currently, metaDI implements meta-analysis of gene-gene pairwise correlations from two studies. In addition to the meta-GSCA described in Choi and Kendziorski (2009), which uses the sign difference, raw correlations can be supplied as input to investigate their preservation.

Author(s)

YounJeong Choi

References

Choi and Kendziorski, submitted

Examples

# Load the example data and the gene set definitions
data(LungCancer3)
GS <- LungCancer3$info$GSdef
GSdesc <- LungCancer3$info$Name

# Split each study's expression matrix into tumor and normal samples
data.grouped <- list(
  Tumor = list(Harvard = LungCancer3$data$Harvard[, 1:139],
               Michigan = LungCancer3$data$Michigan[, 1:86]),
  Normal = list(Harvard = LungCancer3$data$Harvard[, 140:156],
                Michigan = LungCancer3$data$Michigan[, 87:96]))

# Gene-gene correlations per condition and study, then tumor minus normal
corr.t <- lapply(lapply(data.grouped$Tumor, t), cor, use = "pairwise.complete.obs")
corr.n <- lapply(lapply(data.grouped$Normal, t), cor, use = "pairwise.complete.obs")
cor.diff <- list(Harvard = corr.t$Harvard - corr.n$Harvard,
                 Michigan = corr.t$Michigan - corr.n$Michigan)

# Sign of the correlation difference (+1, 0, or -1) for each gene pair
cor.diff.sign <- list(
  Harvard = apply((cor.diff$Harvard > 0), 2, as.numeric) -
    apply((cor.diff$Harvard < 0), 2, as.numeric),
  Michigan = apply((cor.diff$Michigan > 0), 2, as.numeric) -
    apply((cor.diff$Michigan < 0), 2, as.numeric))

# apply() drops the row names, so restore them from the column names
for (i in seq_along(cor.diff.sign)) {
  rownames(cor.diff.sign[[i]]) <- colnames(cor.diff.sign[[i]])
}

# Run meta-GSCA; nperm is kept very small (3) so the example runs quickly
dist.HM <- metaDI(cor.diff.sign$Harvard, cor.diff.sign$Michigan, GS, 3, permDI = TRUE)

[Package GSCA version 1.1.0 Index]