Perform differential expression (DE) analysis on an expression matrix

These functions perform DE analysis on an expression matrix using edgeR or DESeq2, given a vector of sample conditions.

Usage

DEanalysis_edger(expression.matrix, condition, var1, var2, anno)

DEanalysis_deseq2(expression.matrix, condition, var1, var2, anno)

Arguments

expression.matrix: the expression matrix; rows correspond to genes and columns correspond to samples; usually preprocessed by preprocessExpressionMatrix; a list (of the same length as modality) can be provided if length(modality) > 1
condition: a vector of the same length as the number of columns of expression.matrix, containing the sample conditions; this is usually the last column of the metadata
var1, var2: conditions (contained in condition) to perform DE between; note that DESeq2 requires at least two replicates per condition
anno: annotation data frame containing a match between the row names of the expression.matrix (usually ENSEMBL IDs) and the gene names that should be rendered within the app and in output files; this object is created by generateShinyApp using the org.db specified
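
The relationship between condition, var1 and var2 can be checked before calling either function. The following is a minimal sketch in base R, assuming a hypothetical metadata table whose last column holds the conditions; the column names and sample labels are made up for illustration and are not part of the package.

```r
# Hypothetical metadata table (names are assumptions, not package objects);
# the last column holds the sample conditions.
metadata <- data.frame(
  sample = c("s1", "s2", "s3", "s4"),
  timepoint = c("0h", "0h", "12h", "12h"),
  stringsAsFactors = FALSE
)

# Take the condition vector from the last metadata column.
condition <- metadata[[ncol(metadata)]]

# Both conditions to compare must be present in the condition vector.
var1 <- "0h"
var2 <- "12h"
replicates <- table(condition)
stopifnot(all(c(var1, var2) %in% names(replicates)))

# DESeq2 requires at least two replicates for each compared condition.
enough.replicates <- all(replicates[c(var1, var2)] >= 2)
```

If enough.replicates is FALSE, the DESeq2 comparison should not be attempted for that pair of conditions.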

Value

A tibble with the differential expression results for all genes. Columns are

gene_id (usually ENSEMBL ID matching one of the rows of the expression matrix)
gene_name (name matched through the annotation)
log2exp (average log2(expression) of the gene across samples)
log2FC (log2(fold-change) of the gene between conditions)
pval (p-value of the test for differential expression of the gene)
pvalAdj (adjusted p-value using the Benjamini-Hochberg correction)
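
To illustrate how the pvalAdj column relates to pval, the sketch below applies the Benjamini-Hochberg correction to a made-up vector of p-values using p.adjust from base R. It only shows what the correction does; it is not necessarily how the two functions compute the column internally, and the input values are invented.

```r
# Made-up raw p-values for five genes (illustrative values only).
pval <- c(0.001, 0.01, 0.02, 0.04, 0.5)

# Benjamini-Hochberg adjustment: each sorted p-value is scaled by n / rank,
# then forced to be monotone by taking a running minimum from the largest down.
pvalAdj <- p.adjust(pval, method = "BH")

# Genes passing a 5% false discovery rate threshold.
significant <- pvalAdj < 0.05
```

The adjusted values are never smaller than the raw ones, so filtering on pvalAdj is always at least as strict as filtering on pval with the same threshold.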

Examples

expression.matrix.preproc <- as.matrix(read.csv(
  system.file("extdata", "expression_matrix_preprocessed.csv", package = "bulkAnalyseR"), 
  row.names = 1
))[1:100, 1:4]

anno <- AnnotationDbi::select(
  getExportedValue('org.Mm.eg.db', 'org.Mm.eg.db'),
  keys = rownames(expression.matrix.preproc),
  keytype = 'ENSEMBL',
  columns = 'SYMBOL'
) %>%
  dplyr::distinct(ENSEMBL, .keep_all = TRUE) %>%
  dplyr::mutate(NAME = ifelse(is.na(SYMBOL), ENSEMBL, SYMBOL))
#> 'select()' returned 1:1 mapping between keys and columns
  
edger <- DEanalysis_edger(
  expression.matrix = expression.matrix.preproc,
  condition = rep(c("0h", "12h"), each = 2),
  var1 = "0h",
  var2 = "12h",
  anno = anno
)
deseq <- DEanalysis_deseq2(
  expression.matrix = expression.matrix.preproc,
  condition = rep(c("0h", "12h"), each = 2),
  var1 = "0h",
  var2 = "12h",
  anno = anno
)
# genes called DE in both pipelines: |log2(fold-change)| > 1 and adjusted p-value < 0.05
intersect(
  dplyr::filter(edger, abs(log2FC) > 1, pvalAdj < 0.05)$gene_name,
  dplyr::filter(deseq, abs(log2FC) > 1, pvalAdj < 0.05)$gene_name
)
#>  [1] "Gm16041"  "Adhfe1"   "Tcf24"    "Prdm14"   "Eya1"     "Msc"     
#>  [7] "Jph1"     "Crispld1" "Paqr8"    "Efhc1"