Perform differential expression (DE) analysis on an expression matrix
Source: R/DEfuns.R
DEanalysis.Rd
This function performs DE analysis on an expression matrix using edgeR or DESeq2, given a vector of sample conditions.
Usage
DEanalysis_edger(expression.matrix, condition, var1, var2, anno)
DEanalysis_deseq2(expression.matrix, condition, var1, var2, anno)
Arguments
- expression.matrix
the expression matrix; rows correspond to genes and columns correspond to samples; usually preprocessed by preprocessExpressionMatrix; a list (of the same length as modality) can be provided if length(modality) > 1
- condition
a vector of the same length as the number of columns of expression.matrix, containing the sample conditions; this is usually the last column of the metadata
- var1, var2
conditions (contained in condition) to perform DE between; note that DESeq2 requires at least two replicates per condition
- anno
annotation data frame containing a match between the row names of the expression.matrix (usually ENSEMBL IDs) and the gene names that should be rendered within the app and in output files; this object is created by generateShinyApp using the org.db specified
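The constraints on condition, var1 and var2 described above can be checked before calling either function. The following is a minimal base-R sketch: check_de_inputs is a hypothetical helper, not part of bulkAnalyseR, and the toy matrix contains made-up values used only for illustration.

```r
# Hypothetical helper (not part of bulkAnalyseR): checks the documented
# constraints on the inputs before running a DE analysis.
check_de_inputs <- function(expression.matrix, condition, var1, var2) {
  # one condition label per sample (column)
  stopifnot(length(condition) == ncol(expression.matrix))
  # both conditions to compare must be present in the condition vector
  stopifnot(var1 %in% condition, var2 %in% condition)
  # number of replicates for each of the two compared conditions
  n.reps <- c(sum(condition == var1), sum(condition == var2))
  names(n.reps) <- c(var1, var2)
  if (any(n.reps < 2)) {
    warning("DESeq2 requires at least two replicates per condition")
  }
  n.reps
}

# Toy data: 3 genes x 4 samples (values are made up)
toy.matrix <- matrix(
  c(10, 12, 30, 28,
    5, 7, 6, 4,
    100, 90, 20, 25),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("sample", 1:4))
)
toy.condition <- rep(c("0h", "12h"), each = 2)
n.reps <- check_de_inputs(toy.matrix, toy.condition, "0h", "12h")
```

A check like this fails early with a clear message, instead of surfacing as an error from inside edgeR or DESeq2.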
Value
A tibble with the differential expression results for all genes. Columns are
gene_id (usually ENSEMBL ID matching one of the rows of the expression matrix)
gene_name (name matched through the annotation)
log2exp (average log2(expression) of the gene across samples)
log2FC (log2(fold-change) of the gene between conditions)
pval (p-value of the gene being called DE)
pvalAdj (adjusted p-value using the Benjamini-Hochberg correction)
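To illustrate how the pval, pvalAdj and log2FC columns relate, the sketch below applies the Benjamini-Hochberg adjustment with base R's p.adjust to a made-up results table. The table is hypothetical and only mimics the documented column names. This shows what the adjustment does, not necessarily how the package computes it internally.

```r
# Hypothetical results table mimicking the documented columns (made-up values)
toy.results <- data.frame(
  gene_id = paste0("id", 1:5),
  gene_name = paste0("Gene", 1:5),
  log2FC = c(2.1, -1.5, 0.3, 1.2, -0.2),
  pval = c(0.001, 0.01, 0.5, 0.04, 0.9),
  stringsAsFactors = FALSE
)

# Benjamini-Hochberg adjustment of the raw p-values
toy.results$pvalAdj <- p.adjust(toy.results$pval, method = "BH")

# Genes passing both a fold-change and an adjusted p-value threshold
is.de <- abs(toy.results$log2FC) > 1 & toy.results$pvalAdj < 0.05
de.genes <- toy.results$gene_name[is.de]
```

Here Gene4 has a raw p-value below 0.05, but its adjusted p-value is above it, so only Gene1 and Gene2 are retained.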
Examples
expression.matrix.preproc <- as.matrix(read.csv(
  system.file("extdata", "expression_matrix_preprocessed.csv", package = "bulkAnalyseR"),
  row.names = 1
))[1:100, 1:4]
anno <- AnnotationDbi::select(
  getExportedValue('org.Mm.eg.db', 'org.Mm.eg.db'),
  keys = rownames(expression.matrix.preproc),
  keytype = 'ENSEMBL',
  columns = 'SYMBOL'
) %>%
  dplyr::distinct(ENSEMBL, .keep_all = TRUE) %>%
  dplyr::mutate(NAME = ifelse(is.na(SYMBOL), ENSEMBL, SYMBOL))
#> 'select()' returned 1:1 mapping between keys and columns
edger <- DEanalysis_edger(
  expression.matrix = expression.matrix.preproc,
  condition = rep(c("0h", "12h"), each = 2),
  var1 = "0h",
  var2 = "12h",
  anno = anno
)
deseq <- DEanalysis_deseq2(
  expression.matrix = expression.matrix.preproc,
  condition = rep(c("0h", "12h"), each = 2),
  var1 = "0h",
  var2 = "12h",
  anno = anno
)
# DE genes with |log2(fold-change)| > 1 and adjusted p-value < 0.05 in both pipelines
intersect(
  dplyr::filter(edger, abs(log2FC) > 1, pvalAdj < 0.05)$gene_name,
  dplyr::filter(deseq, abs(log2FC) > 1, pvalAdj < 0.05)$gene_name
)
#> [1] "Gm16041" "Adhfe1" "Tcf24" "Prdm14" "Eya1" "Msc"
#> [7] "Jph1" "Crispld1" "Paqr8" "Efhc1"