safe_tar_read("glossary_table")
#> <table>
#> <caption>CoMMpass glossary: 65 terms across 8 categories (Disease, Cytogenetics, Staging, RNA-seq, DE, Survival, Pathway, Infrastructure). Definition = concise explanation; Appears_In = vignettes where the term is used; See_Also = external references and cross-links to other vignettes. Searchable: use the filter boxes to find terms by category or keyword. Source: curated from GDC documentation, Bioconductor, and domain literature.</caption>
#> <thead>
#> <tr>
#> <th style="text-align:left;"> Term </th>
#> <th style="text-align:left;"> Category </th>
#> <th style="text-align:left;"> Definition </th>
#> <th style="text-align:left;"> Appears_In </th>
#> <th style="text-align:left;"> See_Also </th>
#> </tr>
#> </thead>
#> <tbody>
#> <tr>
#> <td style="text-align:left;"> Multiple Myeloma </td>
#> <td style="text-align:left;"> Disease & Study </td>
#> <td style="text-align:left;"> Cancer of plasma cells in the bone marrow. The most common indication for autologous stem cell transplant in adults </td>
#> <td style="text-align:left;"> survival (3), exploratory (2), data-sources (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Multiple_myeloma), [NCI](https://www.cancer.gov/types/myeloma) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> CoMMpass </td>
#> <td style="text-align:left;"> Disease & Study </td>
#> <td style="text-align:left;"> Relating Clinical Outcomes in Multiple Myeloma to Personal Assessment of Genetic Profile -- MMRF longitudinal study of ~1,143 newly diagnosed MM patients </td>
#> <td style="text-align:left;"> data-sources (5), exploratory (3), survival (2), gene-report (2) </td>
#> <td style="text-align:left;"> [MMRF](https://themmrf.org/finding-a-cure/personalized-treatment-approaches/), [GDC Portal](https://portal.gdc.cancer.gov/projects/MMRF-COMMPASS) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> GDC </td>
#> <td style="text-align:left;"> Disease & Study </td>
#> <td style="text-align:left;"> Genomic Data Commons -- NCI repository hosting CoMMpass RNA-seq and clinical data </td>
#> <td style="text-align:left;"> data-acquisition (8), data-sources (3), data-dictionary (2) </td>
#> <td style="text-align:left;"> [GDC Portal](https://portal.gdc.cancer.gov/), [GDC Docs](https://docs.gdc.cancer.gov/) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> MMRF </td>
#> <td style="text-align:left;"> Disease & Study </td>
#> <td style="text-align:left;"> Multiple Myeloma Research Foundation -- sponsor of the CoMMpass study </td>
#> <td style="text-align:left;"> data-sources (3), data-acquisition (2) </td>
#> <td style="text-align:left;"> [MMRF](https://themmrf.org/), [Data Sources](data-sources.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> TCGAbiolinks </td>
#> <td style="text-align:left;"> Disease & Study </td>
#> <td style="text-align:left;"> Bioconductor R package for querying and downloading GDC data programmatically </td>
#> <td style="text-align:left;"> data-acquisition (4), data-sources (2) </td>
#> <td style="text-align:left;"> [Bioconductor](https://bioconductor.org/packages/TCGAbiolinks/), [Data Acquisition](data-acquisition.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> FISH </td>
#> <td style="text-align:left;"> Cytogenetics & Markers </td>
#> <td style="text-align:left;"> Fluorescence in situ hybridization -- detects cytogenetic abnormalities such as t(4;14), t(14;16), del(17p) </td>
#> <td style="text-align:left;"> exploratory (6), survival (4), gene-report (3) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Fluorescence_in_situ_hybridization), [EDA vignette](exploratory-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Cytogenetic Risk </td>
#> <td style="text-align:left;"> Cytogenetics & Markers </td>
#> <td style="text-align:left;"> Classification of patients into high-risk or standard-risk based on FISH-detected chromosomal abnormalities per IMWG criteria </td>
#> <td style="text-align:left;"> survival (5), exploratory (4), gene-report (3) </td>
#> <td style="text-align:left;"> [Survival vignette](survival-analysis.html), [IMWG](https://doi.org/10.1200/JCO.2014.55.1519) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> t(4;14) </td>
#> <td style="text-align:left;"> Cytogenetics & Markers </td>
#> <td style="text-align:left;"> Translocation between chromosomes 4 and 14 -- associated with poor prognosis in MM. Detected by FISH </td>
#> <td style="text-align:left;"> survival (3), exploratory (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Chromosomal_translocation), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> t(14;16) </td>
#> <td style="text-align:left;"> Cytogenetics & Markers </td>
#> <td style="text-align:left;"> Translocation between chromosomes 14 and 16 -- high-risk marker associated with aggressive disease </td>
#> <td style="text-align:left;"> survival (3), exploratory (2) </td>
#> <td style="text-align:left;"> [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> del(17p) </td>
#> <td style="text-align:left;"> Cytogenetics & Markers </td>
#> <td style="text-align:left;"> Deletion of the short arm of chromosome 17 -- loss of TP53 tumor suppressor, high-risk marker </td>
#> <td style="text-align:left;"> survival (3), exploratory (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Chromosome_17_(human)#Deletions), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> High-Risk / Standard-Risk </td>
#> <td style="text-align:left;"> Cytogenetics & Markers </td>
#> <td style="text-align:left;"> High-risk: presence of t(4;14), t(14;16), or del(17p). Standard-risk: absence of all three. Per IMWG 2014 criteria </td>
#> <td style="text-align:left;"> survival (4), exploratory (3) </td>
#> <td style="text-align:left;"> [IMWG criteria](https://doi.org/10.1200/JCO.2014.55.1519), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> ISS </td>
#> <td style="text-align:left;"> Staging & Risk </td>
#> <td style="text-align:left;"> International Staging System -- classifies myeloma severity (I-III) by serum albumin and beta-2 microglobulin (Greipp et al. 2005, doi:10.1200/JCO.2005.04.242) </td>
#> <td style="text-align:left;"> survival (6), exploratory (5), data-dictionary (2) </td>
#> <td style="text-align:left;"> [Greipp et al. 2005](https://doi.org/10.1200/JCO.2005.04.242), [EDA vignette](exploratory-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> IMWG </td>
#> <td style="text-align:left;"> Staging & Risk </td>
#> <td style="text-align:left;"> International Myeloma Working Group -- defines cytogenetic risk criteria (Sonneveld et al. 2016, doi:10.1200/JCO.2014.55.1519) </td>
#> <td style="text-align:left;"> survival (3), exploratory (2) </td>
#> <td style="text-align:left;"> [IMWG](https://doi.org/10.1200/JCO.2014.55.1519), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> B2M / Serum Albumin </td>
#> <td style="text-align:left;"> Staging & Risk </td>
#> <td style="text-align:left;"> Beta-2 microglobulin and serum albumin -- the two biomarkers used to determine ISS stage. B2M >= 5.5 mg/L = Stage III </td>
#> <td style="text-align:left;"> exploratory (2), data-dictionary (2) </td>
#> <td style="text-align:left;"> [ISS definition](https://doi.org/10.1200/JCO.2005.04.242), [Data Dictionary](data-dictionary.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> B2M </td>
#> <td style="text-align:left;"> Staging & Risk </td>
#> <td style="text-align:left;"> Beta-2 microglobulin -- serum protein used in ISS staging. B2M >= 5.5 mg/L indicates Stage III myeloma </td>
#> <td style="text-align:left;"> survival (3), exploratory (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Beta-2_microglobulin), [ISS staging](https://doi.org/10.1200/JCO.2005.04.242) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> RNA-seq </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> RNA sequencing -- high-throughput method to quantify gene expression. CoMMpass uses bulk RNA-seq from bone marrow aspirates </td>
#> <td style="text-align:left;"> data-acquisition (6), exploratory (3), differential-expression (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/RNA-Seq), [Data Acquisition](data-acquisition.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Read count </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> The number of sequencing reads aligned to a gene in one sample. Raw integer counts are the input for DE methods (DESeq2, edgeR) </td>
#> <td style="text-align:left;"> data-acquisition (4), differential-expression (3) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/RNA-Seq), [Data Acquisition](data-acquisition.html#rnaseq-data) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Library size </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> Total number of sequencing reads mapped to genes in one sample (= sum of all gene counts). A proxy for sequencing depth. Synonym: Total Counts </td>
#> <td style="text-align:left;"> data-acquisition (5), exploratory (3) </td>
#> <td style="text-align:left;"> [Data Acquisition](data-acquisition.html#rnaseq-data), [Wikipedia](https://en.wikipedia.org/wiki/RNA-Seq#Analysis) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> TPM </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> Transcripts per million -- normalized expression measure for cross-sample comparison. Accounts for gene length and sequencing depth </td>
#> <td style="text-align:left;"> data-acquisition (3), data-dictionary (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Transcripts_per_million), [Data Dictionary](data-dictionary.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Gene detection </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> A gene is 'detected' in a sample if it has >= 1 mapped read (count > 0). The number of detected genes per sample is a QC metric; low detection suggests poor depth or degradation </td>
#> <td style="text-align:left;"> data-acquisition (4), exploratory (2) </td>
#> <td style="text-align:left;"> [Data Acquisition QC](data-acquisition.html#quality-control) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> MAD </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> Median absolute deviation -- robust measure of spread. In a QC context, the MAD of gene counts within a single sample measures expression variability </td>
#> <td style="text-align:left;"> data-acquisition (3), exploratory (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Median_absolute_deviation), [Data Acquisition QC](data-acquisition.html#quality-control) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Outlier (QC) </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> A sample is flagged as an outlier if it falls below the 5th percentile of library size OR of genes detected. The flag is binary (Yes/No) </td>
#> <td style="text-align:left;"> data-acquisition (3), exploratory (2) </td>
#> <td style="text-align:left;"> [Data Acquisition QC](data-acquisition.html#quality-control) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> VST </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> Variance-stabilizing transformation -- normalizes count data for visualization and clustering, reducing mean-variance dependence </td>
#> <td style="text-align:left;"> differential-expression (4), exploratory (3), survival (2) </td>
#> <td style="text-align:left;"> [DESeq2 docs](https://bioconductor.org/packages/DESeq2/), [DE vignette](differential-expression.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> STAR </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> Spliced Transcripts Alignment to a Reference -- RNA-seq aligner used in the GDC pipeline </td>
#> <td style="text-align:left;"> data-acquisition (3), data-sources (2) </td>
#> <td style="text-align:left;"> [GitHub](https://github.com/alexdobin/STAR), [GDC Pipeline](https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> GENCODE </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> Comprehensive gene annotation project providing reference gene models for genome analysis </td>
#> <td style="text-align:left;"> data-acquisition (2), data-sources (1) </td>
#> <td style="text-align:left;"> [gencodegenes.org](https://www.gencodegenes.org/), [Data Acquisition](data-acquisition.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Ensembl </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> Genome database providing gene IDs (ENSG*) used in RNA-seq quantification </td>
#> <td style="text-align:left;"> data-acquisition (3), gene-report (2) </td>
#> <td style="text-align:left;"> [ensembl.org](https://www.ensembl.org/), [Data Acquisition](data-acquisition.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> HGNC </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> HUGO Gene Nomenclature Committee -- authority for standardized human gene symbols </td>
#> <td style="text-align:left;"> gene-report (3), differential-expression (2) </td>
#> <td style="text-align:left;"> [genenames.org](https://www.genenames.org/), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Entrez </td>
#> <td style="text-align:left;"> RNA-seq & QC </td>
#> <td style="text-align:left;"> NCBI gene identifier system used for cross-database gene referencing </td>
#> <td style="text-align:left;"> gene-report (2), data-acquisition (1) </td>
#> <td style="text-align:left;"> [NCBI Gene](https://www.ncbi.nlm.nih.gov/gene/), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> DE </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> Differential expression -- genes with significantly different expression between conditions (e.g. high-risk vs standard-risk) </td>
#> <td style="text-align:left;"> differential-expression (8), gene-report (5), survival (2) </td>
#> <td style="text-align:left;"> [DE vignette](differential-expression.html), [Wikipedia](https://en.wikipedia.org/wiki/Differential_gene_expression) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> DESeq2 </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> Bioconductor package for DE analysis using negative binomial GLMs with shrinkage estimation. The primary DE method in this pipeline </td>
#> <td style="text-align:left;"> differential-expression (6), gene-report (3) </td>
#> <td style="text-align:left;"> [Bioconductor](https://bioconductor.org/packages/DESeq2/), [PMID:25516281](https://pubmed.ncbi.nlm.nih.gov/25516281/) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> edgeR </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> Bioconductor package for DE analysis using empirical Bayes moderation of tagwise dispersions. Used as consensus method alongside DESeq2 </td>
#> <td style="text-align:left;"> differential-expression (4), gene-report (2) </td>
#> <td style="text-align:left;"> [Bioconductor](https://bioconductor.org/packages/edgeR/), [PMID:19910308](https://pubmed.ncbi.nlm.nih.gov/19910308/) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> limma-voom </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> Linear modelling framework (limma) with voom precision weights for RNA-seq. Third consensus DE method in the pipeline </td>
#> <td style="text-align:left;"> differential-expression (4), gene-report (2) </td>
#> <td style="text-align:left;"> [Bioconductor](https://bioconductor.org/packages/limma/), [PMID:25605792](https://pubmed.ncbi.nlm.nih.gov/25605792/) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Log2 Fold Change </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> Log-base-2 ratio of expression between conditions. LFC > 0 = upregulated; LFC < 0 = downregulated. Shrinkage-corrected LFC used for ranking </td>
#> <td style="text-align:left;"> differential-expression (5), gene-report (3) </td>
#> <td style="text-align:left;"> [DE vignette](differential-expression.html), [Wikipedia](https://en.wikipedia.org/wiki/Fold_change) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> FDR / Adjusted p-value </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> False discovery rate -- p-values adjusted for multiple testing (Benjamini-Hochberg). FDR < 0.05 is a conventional significance threshold for DE </td>
#> <td style="text-align:left;"> differential-expression (4), gene-report (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/False_discovery_rate), [DE vignette](differential-expression.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Volcano Plot </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> Scatter plot of log2 fold change (x) vs -log10 adjusted p-value (y). Highlights significantly DE genes in the upper-left and upper-right quadrants </td>
#> <td style="text-align:left;"> differential-expression (3), gene-report (2) </td>
#> <td style="text-align:left;"> [DE vignette](differential-expression.html), [Wikipedia](https://en.wikipedia.org/wiki/Volcano_plot_(statistics)) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> PCA </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> Principal component analysis -- dimensionality reduction technique for visualizing sample clustering and batch effects </td>
#> <td style="text-align:left;"> exploratory (4), differential-expression (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Principal_component_analysis), [EDA vignette](exploratory-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> LFC </td>
#> <td style="text-align:left;"> Differential Expression </td>
#> <td style="text-align:left;"> Log2 fold change -- effect size measure for differential expression. Positive = upregulated; negative = downregulated </td>
#> <td style="text-align:left;"> differential-expression (5), gene-report (3) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Fold_change), [DE vignette](differential-expression.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Overall Survival (OS) </td>
#> <td style="text-align:left;"> Survival Analysis </td>
#> <td style="text-align:left;"> Time from diagnosis to death from any cause. The primary survival endpoint in CoMMpass </td>
#> <td style="text-align:left;"> survival (8), exploratory (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Overall_survival), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Progression-Free Survival (PFS) </td>
#> <td style="text-align:left;"> Survival Analysis </td>
#> <td style="text-align:left;"> Time from diagnosis to disease progression or death. A secondary endpoint capturing earlier clinical events </td>
#> <td style="text-align:left;"> survival (4) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Progression-free_survival), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> KM </td>
#> <td style="text-align:left;"> Survival Analysis </td>
#> <td style="text-align:left;"> Kaplan-Meier -- non-parametric survival curve estimator. Handles right-censored data (patients still alive at last follow-up) </td>
#> <td style="text-align:left;"> survival (6), exploratory (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Log-Rank Test </td>
#> <td style="text-align:left;"> Survival Analysis </td>
#> <td style="text-align:left;"> Non-parametric test comparing survival distributions between groups. Tests whether KM curves differ significantly (p < 0.05) </td>
#> <td style="text-align:left;"> survival (4) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Logrank_test), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Cox PH </td>
#> <td style="text-align:left;"> Survival Analysis </td>
#> <td style="text-align:left;"> Cox proportional hazards -- semi-parametric regression for survival data. Estimates hazard ratios adjusting for covariates </td>
#> <td style="text-align:left;"> survival (5), exploratory (1) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Proportional_hazards_model), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> HR </td>
#> <td style="text-align:left;"> Survival Analysis </td>
#> <td style="text-align:left;"> Hazard ratio -- ratio of the event hazard rates between two groups. HR > 1 = increased risk (worse survival); HR < 1 = reduced risk (better survival) </td>
#> <td style="text-align:left;"> survival (6), gene-report (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Hazard_ratio), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Forest Plot </td>
#> <td style="text-align:left;"> Survival Analysis </td>
#> <td style="text-align:left;"> Visual display of hazard ratios with confidence intervals for multiple covariates from a Cox model. Each row is a covariate; dashed line at HR = 1 </td>
#> <td style="text-align:left;"> survival (3) </td>
#> <td style="text-align:left;"> [Survival vignette](survival-analysis.html#forest-plot) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Confidence Interval </td>
#> <td style="text-align:left;"> Survival Analysis </td>
#> <td style="text-align:left;"> Range of plausible values for an estimate (typically 95%). For HRs, a 95% CI that includes 1.0 indicates the effect is not significant at the 0.05 level </td>
#> <td style="text-align:left;"> survival (4) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Confidence_interval), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> GSEA </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> Gene Set Enrichment Analysis -- tests whether predefined gene sets are enriched at the top or bottom of a ranked gene list. Uses all genes, not just significant ones </td>
#> <td style="text-align:left;"> gene-report (6), differential-expression (3) </td>
#> <td style="text-align:left;"> [GSEA](https://www.gsea-msigdb.org/gsea/), [PMID:16199517](https://pubmed.ncbi.nlm.nih.gov/16199517/), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> ORA </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> Over-Representation Analysis -- tests whether a set of DE genes contains more members of a pathway than expected by chance (Fisher's exact test) </td>
#> <td style="text-align:left;"> gene-report (4), differential-expression (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Gene_set_enrichment_analysis#Over-representation_analysis), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> MSigDB </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> Molecular Signatures Database -- curated collection of gene sets for GSEA/ORA. Categories include Hallmark, GO, KEGG, Reactome </td>
#> <td style="text-align:left;"> gene-report (5), differential-expression (2) </td>
#> <td style="text-align:left;"> [MSigDB](https://www.gsea-msigdb.org/gsea/msigdb/), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Hallmark Gene Sets </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> MSigDB Hallmark collection -- 50 curated gene sets representing well-defined biological states and processes (e.g. EMT, p53 pathway, hypoxia) </td>
#> <td style="text-align:left;"> gene-report (4) </td>
#> <td style="text-align:left;"> [MSigDB Hallmarks](https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp#H), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Pathway </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> A set of genes involved in a common biological process (e.g. cell cycle, apoptosis). Annotated in databases like KEGG, Reactome, GO </td>
#> <td style="text-align:left;"> gene-report (5), differential-expression (3) </td>
#> <td style="text-align:left;"> [KEGG](https://www.genome.jp/kegg/pathway.html), [Reactome](https://reactome.org/), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Gene Set </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> Any defined collection of genes tested together in enrichment analysis (broader than 'pathway' -- includes GO terms, TF targets, etc.) </td>
#> <td style="text-align:left;"> gene-report (4), differential-expression (2) </td>
#> <td style="text-align:left;"> [Gene Report](gene-report.html), [MSigDB](https://www.gsea-msigdb.org/gsea/msigdb/) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> GO </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> Gene Ontology -- structured vocabulary of gene/protein functions (Biological Process, Molecular Function, Cellular Component) </td>
#> <td style="text-align:left;"> gene-report (3), differential-expression (2) </td>
#> <td style="text-align:left;"> [geneontology.org](http://geneontology.org/), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> KEGG </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> Kyoto Encyclopedia of Genes and Genomes -- pathway and molecular interaction database </td>
#> <td style="text-align:left;"> gene-report (3), differential-expression (2) </td>
#> <td style="text-align:left;"> [genome.jp/kegg](https://www.genome.jp/kegg/), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Reactome </td>
#> <td style="text-align:left;"> Pathway & Enrichment </td>
#> <td style="text-align:left;"> Open-source curated pathway database of biological reactions and processes </td>
#> <td style="text-align:left;"> gene-report (2), differential-expression (1) </td>
#> <td style="text-align:left;"> [reactome.org](https://reactome.org/), [Gene Report](gene-report.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Targets </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> R-based pipeline tool (targets package) for reproducible, cached computation. Each analysis step is a 'target' with dependency tracking </td>
#> <td style="text-align:left;"> pipeline-dag (6), telemetry (4), data-sources (2) </td>
#> <td style="text-align:left;"> [targets package](https://docs.ropensci.org/targets/), [Pipeline DAG](pipeline-dag.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> sample_limit </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> Pipeline parameter controlling the number of patient samples included. Default: local=200, CI=20. Lower values speed up development; higher values improve statistical power </td>
#> <td style="text-align:left;"> data-sources (3), survival (2), exploratory (2) </td>
#> <td style="text-align:left;"> [Data Sources](data-sources.html), [Survival vignette](survival-analysis.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Nix </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> Reproducible build system providing isolated, version-pinned R environments via nixpkgs. Ensures all collaborators use identical package versions </td>
#> <td style="text-align:left;"> data-sources (2) </td>
#> <td style="text-align:left;"> [Nix](https://nixos.org/), [rix package](https://docs.ropensci.org/rix/) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> pkgdown </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> R package for building package documentation websites. Renders vignettes, function reference, and news into a static site hosted on GitHub Pages </td>
#> <td style="text-align:left;"> data-sources (2) </td>
#> <td style="text-align:left;"> [pkgdown](https://pkgdown.r-lib.org/), [Data Sources](data-sources.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> API </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> Application Programming Interface -- structured endpoints for programmatic data access </td>
#> <td style="text-align:left;"> api-usage (6), data-sources (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/API), [API Usage](api-usage.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> DAG </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> Directed acyclic graph -- dependency structure used by the targets pipeline for reproducible execution </td>
#> <td style="text-align:left;"> pipeline-dag (5), telemetry (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Directed_acyclic_graph), [Pipeline DAG](pipeline-dag.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> DuckDB </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> In-process analytical database used for efficient parquet querying in the pipeline </td>
#> <td style="text-align:left;"> data-sources (3), data-dictionary (2) </td>
#> <td style="text-align:left;"> [duckdb.org](https://duckdb.org/), [Data Sources](data-sources.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Parquet </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> Columnar storage format used for efficient data storage and querying in the pipeline </td>
#> <td style="text-align:left;"> data-sources (3), api-usage (2) </td>
#> <td style="text-align:left;"> [parquet.apache.org](https://parquet.apache.org/), [Data Sources](data-sources.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> CI </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> Continuous integration -- automated testing and deployment via GitHub Actions </td>
#> <td style="text-align:left;"> telemetry (3), pipeline-dag (2) </td>
#> <td style="text-align:left;"> [Wikipedia](https://en.wikipedia.org/wiki/Continuous_integration), [Telemetry](telemetry.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> CRAN </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> Comprehensive R Archive Network -- primary repository for R packages </td>
#> <td style="text-align:left;"> data-sources (1) </td>
#> <td style="text-align:left;"> [cran.r-project.org](https://cran.r-project.org/), [Data Sources](data-sources.html) </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> Bioconductor </td>
#> <td style="text-align:left;"> Data Infrastructure </td>
#> <td style="text-align:left;"> Open-source software project for genomics and bioinformatics R packages </td>
#> <td style="text-align:left;"> data-acquisition (4), differential-expression (3) </td>
#> <td style="text-align:left;"> [bioconductor.org](https://www.bioconductor.org/), [Data Acquisition](data-acquisition.html) </td>
#> </tr>
#> </tbody>
#> </table>
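
# A minimal sketch of the QC terms defined in the glossary above (Library size,
# Gene detection, MAD, Outlier (QC)), computed on a simulated count matrix.
# Assumptions: `counts`, `qc`, `lib_cut`, `gene_cut`, and the simulated data are
# hypothetical and are not pipeline objects. The 5th-percentile cutoff follows
# the glossary wording; the pipeline's own implementation may differ.
set.seed(1)
counts <- matrix(
  rnbinom(200 * 40, mu = 50, size = 0.5),   # overdispersed toy counts
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("s", 1:40))
)
counts[, 1] <- counts[, 1] %/% 20           # make sample s1 deliberately shallow

qc <- data.frame(
  sample         = colnames(counts),
  library_size   = colSums(counts),         # sum of all gene counts per sample
  genes_detected = colSums(counts > 0),     # genes with at least one mapped read
  mad_counts     = apply(counts, 2, mad)    # within-sample spread of gene counts
)

# Flag a sample when it falls below the 5th percentile of either metric
lib_cut  <- quantile(qc$library_size, 0.05)
gene_cut <- quantile(qc$genes_detected, 0.05)
qc$outlier <- ifelse(
  qc$library_size < lib_cut | qc$genes_detected < gene_cut,
  "Yes", "No"
)

# Inspect the shallowest samples first; s1 should carry the "Yes" flag
head(qc[order(qc$library_size), ])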