Perform p-value combination for sets of differential expression tests

This function combines the p-values of each gene across all experiments in the input SummarizedExperiment object and computes summary statistics (e.g. mean, median, minimum and maximum) of the effect sizes across those experiments.

Usage

meta_de(x, FUN, pval = "PValue", lfc = "logFC", impute_missing = TRUE, ...)

Arguments

x: SummarizedExperiment object containing combined differential expression results from different studies. The SE object must contain at least two assays: one holding the P-values to combine and one holding the effect sizes to summarize (e.g. logFC).
FUN: One of the 'parallel' functions provided by metapod: "parallelBerger", "parallelFisher", "parallelHolmMin", "parallelPearson", "parallelSimes", "parallelStouffer", or "parallelWilkinson".
pval: Assay name in the SE object containing the P-values to combine.
lfc: Assay name in the SE object containing the logFC values to summarize.
impute_missing: TRUE/FALSE. Should missing values in the logFC and P-value assays be imputed prior to p-value combination? Default TRUE, in which case missing P-values are imputed with 1 and missing logFCs are imputed with 0.
...: Additional arguments passed to FUN. See the metapod package for details.
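
The imputation rule described for impute_missing can be illustrated with plain matrices. This is only a sketch of the rule (missing P-values become 1, missing logFCs become 0), not the package's internal code; the matrices `pvals` and `lfcs` are hypothetical stand-ins for the two assays, filled with the values from the example data below.

```r
# Hypothetical stand-ins for the P-value and logFC assays (genes x experiments).
# NA marks a gene that was not tested in that experiment.
genes <- c("geneA", "geneB", "geneC", "geneD")
exps  <- c("experiment1", "experiment2")

pvals <- matrix(c(0.01, 0.5, 0.05, NA,
                  0.07, 0.3, NA,   0.8),
                ncol = 2, dimnames = list(genes, exps))
lfcs  <- matrix(c(1.2, -2.5, 3.7, NA,
                  1.5, -2.0, NA,  3.0),
                ncol = 2, dimnames = list(genes, exps))

# Apply the documented rule: a missing test carries no evidence (P = 1)
# and no effect (logFC = 0).
pvals[is.na(pvals)] <- 1
lfcs[is.na(lfcs)]   <- 0

pvals
lfcs
```

Imputing with 1 and 0 keeps every gene in the combination, at the cost of pulling its summary effect sizes toward zero when it is absent from an experiment.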

Value

data.table with summary statistics of the p-value combination across all experiments. Please see the documentation in the metapod package for more details. The returned columns "Rep.logFC" and "Rep.Pval" contain the representative effect size and P-value extracted from the influential tests. These are the individual tests in the data that are particularly important for calculating the combined effect.

Examples

# Example taken from ?dfs2se()

# Define two differential expression dataset data.frames
exp1 <- data.frame(
  feature_id = c("geneA", "geneB", "geneC"),
  PValue = c(0.01, 0.5, 0.05),
  FDR = c(0.02, 0.5, 0.07),
  logFC = c(1.2, -2.5, 3.7),
  logCPM = c(12, 9, 0)
)

exp2 <- data.frame(
  feature_id = c("geneA", "geneB", "geneD"),
  PValue = c(0.07, 0.3, 0.8),
  FDR = c(0.08, 0.4, 1.0),
  logFC = c(1.5, -2.0, 3.0),
  logCPM = c(14, 10, 2)
)

# Combine into a single list
l <- list(experiment1 = exp1, experiment2 = exp2)

# Convert the data to a SummarizedExperiment
se <- dfs2se(l)

# Perform p-value combination across experiments for each gene
# using Wilkinson's method, passing an additional argument (min.prop) on to FUN
result <- meta_de(se, metapod::parallelWilkinson, min.prop = 0.1)
head(result)
#>    Feature Combined.Pval Direction Rep.logFC Rep.Pval Median.logFC Mean.logFC
#>     <char>         <num>    <char>     <num>    <num>        <num>      <num>
#> 1:   geneA        0.0199        up       1.2     0.01         1.35       1.35
#> 2:   geneB        0.5100      down      -2.0     0.30        -2.25      -2.25
#> 3:   geneC        0.0975        up       3.7     0.05         1.85       1.85
#> 4:   geneD        0.9600        up       3.0     0.80         1.50       1.50
#>    Min.logFC Max.logFC
#>        <num>     <num>
#> 1:       1.2       1.5
#> 2:      -2.5      -2.0
#> 3:       0.0       3.7
#> 4:       0.0       3.0
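
The numbers printed above can be checked by hand with base R. The sketch below assumes that, with two experiments and min.prop = 0.1, Wilkinson's method reduces to using the smallest P-value per gene, whose null distribution across n independent tests is Beta(1, n). The matrices `pvals` and `lfcs` are hypothetical stand-ins holding the already-imputed assay values; this is an arithmetic check, not how meta_de() or metapod compute the result internally.

```r
# Already-imputed example values (missing P-values = 1, missing logFCs = 0).
genes <- c("geneA", "geneB", "geneC", "geneD")
exps  <- c("experiment1", "experiment2")

pvals <- matrix(c(0.01, 0.5, 0.05, 1,
                  0.07, 0.3, 1,    0.8),
                ncol = 2, dimnames = list(genes, exps))
lfcs  <- matrix(c(1.2, -2.5, 3.7, 0,
                  1.5, -2.0, 0,   3.0),
                ncol = 2, dimnames = list(genes, exps))

n <- ncol(pvals)

# Smallest P-value per gene, referred to its Beta(1, n) null distribution.
# This equals 1 - (1 - p_min)^n.
p_min    <- apply(pvals, 1, min)
combined <- pbeta(p_min, shape1 = 1, shape2 = n)

# Per-gene effect size summaries across experiments.
mean_lfc <- rowMeans(lfcs)
min_lfc  <- apply(lfcs, 1, min)
max_lfc  <- apply(lfcs, 1, max)

round(combined, 4)
mean_lfc
```

Under these assumptions, `combined` and `mean_lfc` should agree with the Combined.Pval and Mean.logFC columns shown above, and the zeros in Min.logFC for geneC and geneD come from the imputed logFC values.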