Online documentation
This vignette displays pre-computed results. Run the targets pipeline locally for interactive analysis.
Overview
See the Glossary for term definitions used throughout this project.
- Compares tumor vs normal or baseline vs relapse samples
- Three complementary methods: DESeq2, edgeR, limma-voom
- Consensus genes (significant in all three) used for pathway analysis
- Visualizations: PCA, volcano plot, MA plot, heatmap, method comparison
Note: This vignette was built in CI with sample_limit=20. Local builds default to 200 samples. Numbers below reflect the CI subset.
Method Comparison
Summary of DE genes detected by each method (DESeq2, edgeR, limma-voom) and their consensus overlap.
Generating code
{
  # Nothing to display when the summary target is missing or empty
  if (is.null(de_method_summary) || nrow(de_method_summary) == 0)
    return(NULL)
  caption <- paste0(
    "Differential expression results across methods. ",
    "Method = DE analysis package. ",
    "DESeq2: negative binomial GLM with Wald test and apeglm LFC shrinkage. ",
    "edgeR: quasi-likelihood F-test with TMM normalization. ",
    "limma-voom: linear models with precision weights on log2-CPM. ",
    "Genes Tested = number of genes after filtering. ",
    "Significant = genes with |log2FC| > 1 AND adjusted p-value < 0.05. ",
    "Up/Down = direction of fold change (tumor vs normal)."
  )
  # Interactive table with per-column filters and a top-aligned caption
  DT::datatable(
    de_method_summary,
    rownames = FALSE,
    filter = "top",
    options = list(pageLength = 10, scrollX = TRUE),
    colnames = c("Method", "Genes Tested", "Significant", "Up", "Down"),
    caption = htmltools::tags$caption(
      style = "caption-side: top; text-align: left;",
      caption
    )
  )
}
Differential expression results across methods. Method = DE analysis package. DESeq2: negative binomial GLM with Wald test and apeglm LFC shrinkage. edgeR: quasi-likelihood F-test with TMM normalization. limma-voom: linear models with precision weights on log2-CPM. Genes Tested = number of genes after filtering. Significant = genes with |log2FC| > 1 AND adjusted p-value < 0.05. Up/Down = direction of fold change (tumor vs normal).
| method | n_tested | n_sig | n_up | n_down |
|--------|----------|-------|------|--------|
| DESeq2 | 30675 | 1615 | 581 | 377 |
| edgeR | 30675 | 3738 | 1733 | 758 |
| limma | 30675 | 159 | 82 | 13 |
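The significance rule in the caption (|log2FC| > 1 and adjusted p-value < 0.05) can be sketched in base R as follows. This is only an illustration under stated assumptions: the `res` data frame, its values, and the `classify_de()` helper are hypothetical and are not taken from the pipeline; the column names follow the DESeq2 convention.

```r
# Hypothetical results table; values are invented for illustration only.
res <- data.frame(
  gene = c("g1", "g2", "g3", "g4", "g5"),
  log2FoldChange = c(2.5, -1.8, 0.4, 3.1, -2.2),
  padj = c(0.001, 0.01, 0.0001, 0.2, NA)
)

# classify_de() is a hypothetical helper, not a pipeline function.
classify_de <- function(res, lfc_cut = 1, padj_cut = 0.05) {
  # Genes with a missing padj are never called significant.
  sig <- !is.na(res$padj) &
    res$padj < padj_cut &
    abs(res$log2FoldChange) > lfc_cut
  ifelse(sig & res$log2FoldChange > 0, "Up",
         ifelse(sig, "Down", "NS"))
}

res$direction <- classify_de(res)
table(res$direction)
```

Both conditions must hold at once, so a gene with a tiny p-value but a small fold change (g3) and a gene with a large fold change but a weak p-value (g4) are both left unclassified.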
Generating code
{
  # Nothing to plot when the summary target is missing or empty
  if (is.null(de_method_summary) || nrow(de_method_summary) == 0)
    return(NULL)
  # Long format: one row per method and direction; down counts are negated
  # so their bars extend below zero
  long <- rbind(
    data.frame(method = de_method_summary$method, direction = "Up",
               count = de_method_summary$n_up),
    data.frame(method = de_method_summary$method, direction = "Down",
               count = -de_method_summary$n_down)
  )
  ggplot2::ggplot(long, ggplot2::aes(x = method, y = count, fill = direction)) +
    ggplot2::geom_col() +
    ggplot2::scale_fill_manual(values = c(Up = "#DC3545", Down = "#0066CC"),
                               name = "Direction") +
    ggplot2::geom_hline(yintercept = 0, linewidth = 0.3) +
    ggplot2::labs(
      title = "DE Genes by Method",
      x = NULL,
      y = "Number of genes (down shown as negative)",
      caption = paste0(
        "DESeq2: negative binomial GLM with apeglm shrinkage. ",
        "edgeR: quasi-likelihood F-test with TMM normalization. ",
        "limma: empirical Bayes moderated t-test with voom weights. ",
        "Up (red) = log2FC > 1. Down (blue) = log2FC < -1. ",
        "Thresholds: |log2FC| > 1, padj < 0.05. ",
        "With sample_limit=20, zero significant genes is expected. ",
        "Source: DESeq2/edgeR/limma results. ",
        "See method table for exact counts and annotated DE table for top genes."
      )
    ) +
    ggplot2::theme_minimal(base_size = 12) +
    ggplot2::theme(plot.caption = ggplot2::element_text(size = 7, hjust = 0,
                                                        lineheight = 1.2))
}

Principal Component Analysis
- Computed from top 500 most variable genes after VST
- VST stabilizes variance across the expression range
- Preferred over raw counts or logCPM for exploratory visualization
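The selection of the most variable genes followed by PCA can be sketched in base R. This is a rough illustration, not the pipeline's code: the matrix is simulated, and `log2(counts + 1)` is used only as a stand-in for a proper variance-stabilizing transformation so the example has no package dependencies.

```r
set.seed(1)

# Simulated count matrix: 2000 genes x 12 samples (invented data).
counts <- matrix(
  rnbinom(2000 * 12, mu = 100, size = 5),
  nrow = 2000,
  dimnames = list(paste0("gene", 1:2000), paste0("s", 1:12))
)

# Assumption: log2(count + 1) stands in for VST in this sketch.
expr <- log2(counts + 1)

# Keep the 500 genes with the highest variance across samples.
gene_var <- apply(expr, 1, var)
top <- order(gene_var, decreasing = TRUE)[1:500]

# prcomp() expects samples in rows, so the matrix is transposed.
pca <- prcomp(t(expr[top, ]), center = TRUE, scale. = FALSE)

# Percentage of variance explained by each component.
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)
head(pca$x[, 1:2])
```

Restricting to high-variance genes keeps the components from being dominated by the many genes that barely change between samples.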

Volcano Plot
- Statistical significance (-log10 adjusted p-value) vs biological effect size (log2 fold change)
- Genes in upper corners are both statistically significant and biologically meaningful
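A minimal base R sketch of the volcano coordinates is shown here, using invented values. One practical detail it illustrates: adjusted p-values can underflow to exactly 0, which would give an infinite y-coordinate, so the sketch floors them at the smallest positive double before taking the logarithm. That flooring choice is an assumption of this example, not something the pipeline is known to do.

```r
# Invented results for illustration only.
res <- data.frame(
  log2FoldChange = c(2.5, -1.8, 0.4, 3.1),
  padj = c(0, 0.01, 1e-4, 0.2)
)

# Floor padj so that -log10() stays finite when padj underflows to 0.
floor_p <- .Machine$double.xmin
res$neg_log10_padj <- -log10(pmax(res$padj, floor_p))

plot(
  res$log2FoldChange, res$neg_log10_padj,
  xlab = "log2 fold change",
  ylab = "-log10 adjusted p-value",
  pch = 19
)
# Dashed guides at the thresholds |log2FC| = 1 and padj = 0.05.
abline(v = c(-1, 1), h = -log10(0.05), lty = 2)
```

Points above the horizontal guide and outside the two vertical guides are the ones that meet both thresholds.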

MA Plot
- Bland-Altman plot: mean expression (x-axis) vs fold change (y-axis)
- Reveals whether DE signals concentrate at particular expression levels
- Detects bias in fold-change estimates
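The two axes of an MA plot can be computed directly from per-condition expression values. The sketch below uses invented normalized counts and a pseudocount of 1; note that this is the classic two-sample definition, whereas DESeq2-style MA plots typically put the mean of normalized counts (baseMean) on the x-axis instead.

```r
# Invented normalized counts for four genes in two conditions.
tumor  <- c(g1 = 120, g2 = 15, g3 = 800, g4 = 40)
normal <- c(g1 = 30,  g2 = 15, g3 = 790, g4 = 160)

# A pseudocount of 1 avoids log2(0).
log_t <- log2(tumor + 1)
log_n <- log2(normal + 1)

ma <- data.frame(
  A = (log_t + log_n) / 2,  # mean log expression (x-axis)
  M = log_t - log_n         # log2 fold change, tumor vs normal (y-axis)
)
ma
```

A gene with identical counts in both conditions (g2) sits exactly on M = 0, which is where most genes are expected to cluster when normalization has worked.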

Heatmap of Top DE Genes
The heatmap shows Z-score scaled VST expression for the most significant genes. Rows (genes) and columns (samples) are hierarchically clustered.
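Row-wise Z-scoring and hierarchical clustering, as described above, might look like the following in base R. The expression matrix is simulated, and the default Euclidean distance with complete linkage is an assumption; the pipeline's actual clustering settings are not stated here.

```r
set.seed(2)

# Simulated expression matrix: 20 genes x 6 samples (invented data).
expr <- matrix(
  rnorm(20 * 6, mean = 8),
  nrow = 20,
  dimnames = list(paste0("gene", 1:20), paste0("s", 1:6))
)

# scale() works on columns, so transpose to standardize each gene (row).
z <- t(scale(t(expr)))

# Cluster genes and samples separately, then reorder the matrix.
row_order <- hclust(dist(z))$order
col_order <- hclust(dist(t(z)))$order
z_ordered <- z[row_order, col_order]

dim(z_ordered)
```

Scaling per gene means colors show each gene's deviation from its own mean, so highly and lowly expressed genes are comparable on one color scale.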

Consensus Genes
Genes identified as significant by all three methods (DESeq2, edgeR, limma) represent the highest-confidence DE candidates.
60 consensus DE genes found across all three methods.
Top 20 consensus genes (sorted by mean rank): ENSG00000006704.11, ENSG00000027869.12, ENSG00000100065.15, ENSG00000100628.12, ENSG00000104833.12, ENSG00000106025.9, ENSG00000114948.13, ENSG00000127533.4, ENSG00000128283.7, ENSG00000130055.14, ENSG00000131711.15, ENSG00000134533.6, ENSG00000136960.13, ENSG00000138172.11, ENSG00000143341.12, ENSG00000149742.10, ENSG00000154734.16, ENSG00000157214.14, ENSG00000162878.13, ENSG00000163362.11, … (40 more)
These genes are used for pathway analysis via ORA.
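One way to derive a consensus set and order it by mean rank is sketched below. The gene names and p-values are invented, and ranking by adjusted p-value within each method's significant set is an assumption; the vignette does not specify which statistic the pipeline ranks on.

```r
# Invented adjusted p-values for the significant genes of each method.
sig <- list(
  DESeq2 = c(gA = 1e-8, gB = 1e-5, gC = 1e-3, gD = 0.01),
  edgeR  = c(gB = 1e-9, gA = 1e-4, gD = 1e-3, gE = 0.02),
  limma  = c(gA = 1e-6, gD = 1e-4, gB = 1e-3)
)

# Consensus = genes significant in every method.
consensus <- Reduce(intersect, lapply(sig, names))

# Rank genes within each method (1 = smallest p-value), then keep only
# the consensus genes; the result is a genes x methods matrix.
ranks <- sapply(sig, function(p) rank(p)[consensus])

# Order consensus genes by their average rank across methods.
mean_rank <- sort(rowMeans(ranks))
mean_rank
```

Requiring agreement across all three methods is conservative: the consensus can be no larger than the smallest per-method set.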
Paired Longitudinal DE
For patients with samples at multiple timepoints (baseline and relapse), a paired design (~ patient_id + visit) controls for inter-patient variability and tests for within-patient expression changes over time.
Paired analysis: 2 patients, 4 samples
12 DE genes at padj < 0.05
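The paired design formula can be inspected with `model.matrix()` from base R. The sketch below builds hypothetical sample metadata for 2 patients with 2 visits each (matching the counts reported above, but with invented patient labels) and shows which coefficients the model estimates.

```r
# Hypothetical sample metadata: 2 patients x 2 visits.
coldata <- data.frame(
  patient_id = factor(c("P1", "P1", "P2", "P2")),
  visit = factor(
    c("baseline", "relapse", "baseline", "relapse"),
    levels = c("baseline", "relapse")  # baseline is the reference level
  )
)

design <- model.matrix(~ patient_id + visit, data = coldata)
colnames(design)

# Residual degrees of freedom left for estimating dispersion.
resid_df <- nrow(design) - qr(design)$rank
resid_df
```

The `visitrelapse` column carries the within-patient effect of interest, while `patient_id` absorbs each patient's baseline level. With 4 samples and 3 coefficients only 1 residual degree of freedom remains, so results from so few pairs should be read with caution.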

Annotated DE Results
Gene symbols make Ensembl IDs interpretable. The annotate_genes() function maps Ensembl IDs to HGNC symbols using MSigDB gene mappings.
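The general idea of mapping versioned Ensembl IDs to symbols can be sketched as a named-vector lookup. This is not the implementation of annotate_genes(): the `lookup` table, the `strip_version()` helper, and the version suffixes are hypothetical, and the two-entry table stands in for a full MSigDB-derived mapping.

```r
# Tiny illustrative lookup (Ensembl gene ID without version -> HGNC symbol).
lookup <- c(
  ENSG00000141510 = "TP53",
  ENSG00000012048 = "BRCA1"
)

# Versioned IDs as they might appear in a count matrix; the version
# suffixes and the last ID are made up for this example.
ids <- c("ENSG00000141510.17", "ENSG00000012048.23", "ENSG00000999999.1")

# Hypothetical helper: drop the trailing ".<version>" before matching.
strip_version <- function(x) sub("\\.[0-9]+$", "", x)

symbols <- unname(lookup[strip_version(ids)])

# Fall back to the original ID when no symbol is found.
missing <- is.na(symbols)
symbols[missing] <- ids[missing]
symbols
```

Stripping the version suffix matters because annotation tables are usually keyed on the unversioned ID, so a direct match on the versioned string would fail.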
Generating code
{
  # Nothing to display when the annotated results are missing or empty
  if (is.null(de_results_annotated) || nrow(de_results_annotated) == 0) {
    return(NULL)
  }
  # Top 15 genes by adjusted p-value
  top <- utils::head(
    de_results_annotated[order(de_results_annotated$padj), ],
    15
  )
  # Show only the columns that are actually present
  display_cols <- intersect(
    c("gene_symbol", "log2FoldChange", "baseMean", "padj"),
    names(top)
  )
  # Round numeric columns for display
  if ("log2FoldChange" %in% names(top))
    top$log2FoldChange <- round(top$log2FoldChange, 2)
  if ("baseMean" %in% names(top))
    top$baseMean <- round(top$baseMean, 1)
  if ("padj" %in% names(top))
    top$padj <- signif(top$padj, 4)
  caption <- paste0(
    "Top 15 DE genes by adjusted p-value (DESeq2). ",
    "gene_symbol = HGNC symbol mapped from Ensembl IDs via MSigDB. ",
    "log2FoldChange = log2 ratio (positive = upregulated in tumor/relapse). ",
    "baseMean = mean normalized count across all samples. ",
    "padj = Benjamini-Hochberg adjusted p-value. ",
    "See data dictionary for gene ID details."
  )
  DT::datatable(
    top[, display_cols],
    rownames = FALSE,
    filter = "top",
    options = list(pageLength = 15, scrollX = TRUE),
    caption = htmltools::tags$caption(
      style = "caption-side: top; text-align: left;",
      caption
    )
  )
}
Top 15 DE genes by adjusted p-value (DESeq2). gene_symbol = HGNC symbol mapped from Ensembl IDs via MSigDB. log2FoldChange = log2 ratio (positive = upregulated in tumor/relapse). baseMean = mean normalized count across all samples. padj = Benjamini-Hochberg adjusted p-value. See data dictionary for gene ID details.
| gene_symbol | log2FoldChange | baseMean | padj |
|-------------|----------------|----------|------|
| PDIA2 | 2.78 | 1027.4 | 0 |
| IGKV1-5 | -6.09 | 135865.3 | 0 |
| IGLV3-25 | 7.46 | 23154.4 | 0 |
| IGKV1-33 | -0.10 | 8165.9 | 0 |
| IGKV1D-33 | -0.10 | 9086.9 | 0 |
| GIPC3 | 0.09 | 33.0 | 0 |
| IGKV1-12 | 1.20 | 15950.1 | 0 |
| TRIM31 | 3.71 | 54.4 | 0 |
| HS6ST2 | 8.98 | 16.8 | 0 |
| DQX1 | 2.58 | 55.9 | 0 |
| LOC283731 | -0.09 | 106.0 | 0 |
| KCNK13 | 2.88 | 85.5 | 0 |
| NCALD | 2.32 | 207.8 | 0 |
| ADAM23 | 4.45 | 46.7 | 0 |
| STEAP1 | 3.34 | 90.2 | 0 |
Pathway Enrichment Visualizations
GSEA Enrichment Dot Plot
Gene Set Enrichment Analysis using the full ranked gene list against MSigDB Hallmark pathways.
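The core quantity in GSEA is a running-sum enrichment score over the ranked gene list. The base R sketch below computes a weighted score for one gene set using invented statistics; `gsea_es()` is a hypothetical helper, and the permutation step used to assess significance is deliberately left out. It is not the pipeline's implementation, which the vignette does not describe.

```r
# gsea_es() is a hypothetical helper for illustration.
gsea_es <- function(stat, gene_set, p = 1) {
  # Rank genes from most positive to most negative statistic.
  stat <- sort(stat, decreasing = TRUE)
  in_set <- names(stat) %in% gene_set

  # Step up at each hit (weighted by |stat|^p), step down at each miss.
  hit <- ifelse(in_set, abs(stat)^p, 0)
  hit <- hit / sum(hit)
  miss <- ifelse(in_set, 0, 1) / sum(!in_set)

  running <- cumsum(hit - miss)
  # The enrichment score is the maximum deviation from zero.
  running[which.max(abs(running))]
}

# Invented ranking statistics for ten genes.
stat <- c(g1 = 5, g2 = 4, g3 = 3, g4 = 2, g5 = 1,
          g6 = -1, g7 = -2, g8 = -3, g9 = -4, g10 = -5)

es_top <- gsea_es(stat, gene_set = c("g1", "g2", "g3"))
es_bottom <- gsea_es(stat, gene_set = c("g8", "g9", "g10"))
c(top = es_top, bottom = es_bottom)
```

A set concentrated at the top of the ranking gives a positive score and one concentrated at the bottom gives a negative score, which is why GSEA can use the full ranked list rather than a thresholded gene set.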

ORA Enrichment Bar Plot
Over-representation analysis testing whether consensus DE genes are enriched in specific pathways.
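Over-representation is commonly tested with a one-sided hypergeometric test, which can be written with `phyper()` from base R. The sketch below uses an invented universe, pathway, and DE gene list; `ora_pvalue()` is a hypothetical helper and no multiple-testing correction is applied.

```r
# ora_pvalue() is a hypothetical helper for illustration.
ora_pvalue <- function(de_genes, pathway, universe) {
  # Restrict both sets to the tested universe.
  de_genes <- intersect(de_genes, universe)
  pathway <- intersect(pathway, universe)
  overlap <- length(intersect(de_genes, pathway))

  # P(X >= overlap) when drawing length(de_genes) genes without replacement.
  phyper(
    overlap - 1,
    m = length(pathway),
    n = length(universe) - length(pathway),
    k = length(de_genes),
    lower.tail = FALSE
  )
}

# Invented example: 1000 tested genes, a 50-gene pathway, 60 DE genes,
# 10 of which fall in the pathway (about 3 expected by chance).
universe <- paste0("g", 1:1000)
pathway <- paste0("g", 1:50)
de_genes <- paste0("g", c(1:10, 501:550))

p_ora <- ora_pvalue(de_genes, pathway, universe)
p_ora
```

The choice of universe matters: using all tested genes, rather than the whole genome, avoids inflating significance for pathways whose genes were mostly filtered out before testing.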

Next Steps
- Pathway analysis: Consensus DE genes are tested for enrichment in MSigDB Hallmark, KEGG, and other collections. See the pathway analysis vignette.
- Survival stratification: DE gene signatures can inform survival analysis. See the survival analysis vignette.
- Cytogenetic context: See the EDA vignette for the cytogenetic landscape underlying these expression changes.
Data Sources
Results in this vignette are derived from the MMRF CoMMpass study (MMRF-COMMPASS, ~1,143 patients), downloaded via TCGAbiolinks. The pipeline runs with a configurable sample_limit (default 200; CI uses 20).
For full citations, data access tiers, and the distinction between pipeline data and synthetic test data, see the Data Sources vignette.
Recent Changes
Recent project commits with lines added, files changed, and change categories.
Last 20 project commits with change statistics. Date = commit date; Type = conventional-commit prefix (feat/fix/docs/ci/refactor/test/chore). Files = number of files modified; +Lines/-Lines = lines added/removed. Source: git log --numstat. See changes-by-type table for aggregate breakdown.
| date | type | summary | n_files | lines_added | lines_removed | file_categories |
|------|------|---------|---------|-------------|---------------|-----------------|
| 2026-03-14 | Bug Fix | fix(pipeline): Fix 11 NULL targets — DE condition, ID matching, consensus type | 41 | 146 | 47 | Other, R Source |
| 2026-03-14 | Bug Fix | fix(cachix): Remove --watch-mode auto flag (already default) | 1 | 1 | 1 | Other |
| 2026-03-14 | Bug Fix | fix(pipeline): Fix 3 NULL-target bugs, auto-generate package.nix (#93) | 87 | 235 | 80 | Config, Docs, Other, R Source |
| 2026-03-14 | Bug Fix | fix(nix): Fix cachix signing key, rebuild Bioconductor-dependent targets | 2 | 0 | 0 | Other |
| 2026-03-14 | New Feature | feat(captions): Add dynamic captions to 34 table/plot targets | 22 | 579 | 89 | Other, R Source |
| 2026-03-14 | Bug Fix | fix(vignettes): Enforce zero-computation rule — 22 violations → 0 | 32 | 360 | 764 | Other, R Source, Vignettes |
| 2026-03-13 | Bug Fix | fix(vignettes): Convert kable RDS to data.frames, fix telemetry eval guards | 18 | 8 | 2 | Other, Vignettes |
| 2026-03-13 | Bug Fix | fix(ci): Save data frames (not DT widgets) to RDS for Nix portability | 3 | 0 | 0 | Other |
| 2026-03-13 | Bug Fix | fix(vignettes): Use Quarto #\| eval syntax for pkgdown-banner chunks | 11 | 44 | 11 | Vignettes |
| 2026-03-13 | Refactoring | refactor(targets): Move Bioconductor packages to per-target declarations | 11 | 35 | 17 | Other, R Source, Vignettes |
| 2026-03-13 | New Feature | feat(vignettes): Add code provenance, kable→DT conversion, caption compliance | 35 | 1004 | 437 | CI/CD, Other, R Source, Vignettes |
| 2026-03-13 | Bug Fix | fix(vignettes): Skip NULL RDS in safe_tar_read, return invisible(NULL) | 11 | 22 | 22 | Vignettes |
| 2026-03-13 | Bug Fix | fix(glossary): Prevent double DT::datatable() wrapping in glossary-table chunk | 1 | 3 | 1 | Vignettes |
| 2026-03-13 | CI/CD | ci: Show quarto errors with quiet=FALSE, render individual vignettes in diagnostic | 1 | 20 | 6 | CI/CD |
| 2026-03-13 | CI/CD | ci: Add verbose quarto error diagnostics on build failure | 1 | 14 | 1 | CI/CD |
| 2026-03-13 | Bug Fix | fix(vignettes): Strip Nix paths from DT widgets, auto-wrap data frames | 25 | 66 | 28 | CI/CD, Other, Vignettes |
| 2026-03-13 | CI/CD | ci: Add diagnostic quarto render step to debug build failure | 1 | 17 | 0 | CI/CD |
| 2026-03-13 | Bug Fix | fix(vignettes): Revert safe_tar_read placeholder, guard gene-report | 11 | 12 | 56 | Vignettes |
| 2026-03-13 | Maintenance | chore: Export vig_count_distribution_plot as ggplot RDS (513KB) | 1 | 0 | 0 | Other |
| 2026-03-13 | Bug Fix | fix(vignettes): Enable code eval in CI with RDS fallback | 80 | 113 | 74 | CI/CD, Other, R Source, Vignettes |
Reproducibility
Session Info
sessionInfo()
#> R version 4.5.3 (2026-03-11)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] base64url_1.4 gtable_0.3.6 jsonlite_2.0.0
#> [4] dplyr_1.2.0 compiler_4.5.3 tidyselect_1.2.1
#> [7] callr_3.7.6 scales_1.4.0 yaml_2.3.12
#> [10] fastmap_1.2.0 ggplot2_4.0.2 R6_2.6.1
#> [13] labeling_0.4.3 generics_0.1.4 igraph_2.2.2
#> [16] knitr_1.51 backports_1.5.0 targets_1.12.0
#> [19] tibble_3.3.1 pillar_1.11.1 RColorBrewer_1.1-3
#> [22] rlang_1.1.7 xfun_0.57 S7_0.2.1
#> [25] otel_0.2.0 cli_3.6.5 withr_3.0.2
#> [28] magrittr_2.0.4 ps_1.9.1 digest_0.6.39
#> [31] grid_4.5.3 processx_3.8.6 secretbase_1.2.0
#> [34] lifecycle_1.0.5 prettyunits_1.2.0 vctrs_0.7.2
#> [37] evaluate_1.0.5 glue_1.8.0 data.table_1.18.2.1
#> [40] farver_2.1.2 codetools_0.2-20 rmarkdown_2.30
#> [43] tools_4.5.3 pkgconfig_2.0.3 htmltools_0.5.9