metaDI {GSCA} | R Documentation |
This function runs a meta-GSCA, as described in Choi and Kendziorski (2009). Unlike singleDC, the study-specific values (e.g., gene-gene pairwise correlations, difference in condition-specific correlation signs) need to be pre-calculated and provided as input. For each gene set defined in GSdefList, the dispersion index (DI) is calculated between the two given studies and returned. Gene pairs are randomly permuted across gene sets nperm times. Permutation-based p-values are calculated from the rank of the observed DI among the permuted index values.
metaDI(corr1, corr2, GSdefList, nperm, permDI = FALSE)
corr1, corr2 |
Two symmetric matrices, one from each study of interest. For meta-GSCA, each is a difference matrix of the two condition-specific correlation sign matrices within a study. Only the lower triangle is used. The row names of each matrix are used to extract a sub-matrix for each gene set, so rows must be named by the gene IDs used in GSdefList. For example, if GSdefList defines gene sets in Entrez Gene IDs, the row names should be Entrez Gene IDs. |
GSdefList |
A list of character vectors that define gene sets. Each entry of this list is a gene set. |
nperm |
The desired number of permutations. |
permDI |
TRUE/FALSE. If set TRUE, dispersion index values from permutation are saved and returned; if FALSE, permutation-based dispersion index values are not returned. Default is FALSE. |
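The input requirements above can be illustrated with base R alone. The following is a minimal sketch on simulated data, not part of GSCA: the gene IDs, the helper sign.diff, and the toy gene sets are hypothetical, and the sign of the correlation difference is taken in the same way as in the example at the end of this page.

```r
set.seed(1)
genes <- c("1001", "1002", "1003", "1004", "1005")  # hypothetical Entrez-style IDs

# Simulated expression data: rows are samples, columns are genes
tumor  <- matrix(rnorm(20 * 5), nrow = 20, dimnames = list(NULL, genes))
normal <- matrix(rnorm(15 * 5), nrow = 15, dimnames = list(NULL, genes))

# Hypothetical helper: sign (-1, 0 or 1) of the difference between two
# condition-specific correlation matrices; sign() keeps the dimnames
sign.diff <- function(x1, x2) {
  sign(cor(x1, use = "pairwise.complete.obs") -
       cor(x2, use = "pairwise.complete.obs"))
}
corr1 <- sign.diff(tumor, normal)

# A gene set list is a named list of character vectors of gene IDs
GSdefList <- list(setA = c("1001", "1002", "1003"),
                  setB = c("1003", "1004", "1005", "9999"))  # "9999" is absent from the data

# Subset by row names; intersect() is a precaution taken in this sketch
ids <- intersect(GSdefList$setB, rownames(corr1))
sub <- corr1[ids, ids]
gene.pairs <- sub[lower.tri(sub)]  # one value per gene pair in the set
```

Because the matrix is symmetric, the lower triangle holds each gene pair exactly once, which is why only that part needs to be read.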
Gene pairs are permuted across gene sets, as described in Choi and Kendziorski (2009). For each permutation, annotated gene pairs (gene pairs that belong to at least one gene set) are randomly re-assigned to gene sets, and dispersion indices (DIs) are calculated for those random gene sets. Because the focus is on preservation, the p-value for each gene set is calculated as:
p = sum(permutation DIs <= observed DI) / nperm.
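The formula above can be checked by hand for a single gene set. This is only a sketch with made-up numbers: observed.DI and perm.DI are hypothetical values, not output of metaDI.

```r
# Hypothetical dispersion indices for one gene set
observed.DI <- 0.42
perm.DI <- c(0.35, 0.61, 0.38, 0.70, 0.55, 0.47, 0.66, 0.40, 0.59, 0.73)
nperm <- length(perm.DI)

# Proportion of permutation DIs that are no larger than the observed DI
pvalue <- sum(perm.DI <= observed.DI) / nperm
pvalue
```

Three of the ten hypothetical permutation values fall at or below 0.42, so the p-value here is 0.3. A small p-value means the observed DI is low relative to the permutation values.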
DI |
The dispersion index vector for each gene set. |
pvalue |
The permutation-based p-value for each gene set. |
permv |
The permutation-based DI matrix, with nperm columns, returned when permDI is TRUE. The first column is identical to what is returned by DI. |
Currently, metaDI implements meta-analysis of gene-gene pairwise correlations from two studies. In addition to the meta-GSCA described in Choi and Kendziorski (2009), which uses the sign difference, raw correlations can be supplied as input to investigate whether they are preserved across studies.
YounJeong Choi
Choi and Kendziorski, submitted
data(LungCancer3)
GS <- LungCancer3$info$GSdef
GSdesc <- LungCancer3$info$Name

# Split each study's samples into tumor and normal groups
data.grouped <- list(
  Tumor = list(Harvard = LungCancer3$data$Harvard[, 1:139],
               Michigan = LungCancer3$data$Michigan[, 1:86]),
  Normal = list(Harvard = LungCancer3$data$Harvard[, 140:156],
                Michigan = LungCancer3$data$Michigan[, 87:96]))

# Gene-gene correlations within each condition and study
corr.t <- lapply(lapply(data.grouped$Tumor, t), cor, use = "pairwise.complete.obs")
corr.n <- lapply(lapply(data.grouped$Normal, t), cor, use = "pairwise.complete.obs")

# Tumor-minus-normal correlation difference, then its sign (-1, 0 or 1)
cor.diff <- list(Harvard = corr.t$Harvard - corr.n$Harvard,
                 Michigan = corr.t$Michigan - corr.n$Michigan)
cor.diff.sign <- list(
  Harvard = apply((cor.diff$Harvard > 0), 2, as.numeric) -
    apply((cor.diff$Harvard < 0), 2, as.numeric),
  Michigan = apply((cor.diff$Michigan > 0), 2, as.numeric) -
    apply((cor.diff$Michigan < 0), 2, as.numeric))

# apply() with as.numeric drops the row names, so restore them from the column names
for (i in seq_along(cor.diff.sign)) {
  rownames(cor.diff.sign[[i]]) <- colnames(cor.diff.sign[[i]])
}

# Meta-GSCA between the two studies with 3 permutations, keeping the permutation DIs
dist.HM <- metaDI(cor.diff.sign$Harvard, cor.diff.sign$Michigan, GS, 3, permDI = TRUE)