Tests whether a set of significant genes is over-represented in each gene set of a collection, using Fisher's exact test (hypergeometric distribution). Gene sets are retrieved from MSigDB via the msigdbr package, or supplied as a custom named list.
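To make the test concrete, the following is a minimal base R sketch of an over-representation test for a single gene set. It is not taken from the `run_ora()` source: the counts are made up, the variable names are illustrative, and the one-sided alternative (`"greater"`) is an assumption, since this page does not state which alternative is used.

```r
# Toy counts for one gene set (illustrative values, not real data)
universe_size <- 1000L  # all tested genes (background)
n_sig         <- 50L    # significant genes
gene_set_size <- 40L    # genes of the set that are in the universe
overlap       <- 8L     # significant genes that belong to the set

# 2x2 contingency table: rows = in set / not in set,
# columns = significant / not significant
tab <- matrix(
  c(overlap,         gene_set_size - overlap,
    n_sig - overlap, universe_size - gene_set_size - n_sig + overlap),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("in_set", "not_in_set"), c("sig", "not_sig"))
)

# Assumption: one-sided test for over-representation
ft <- fisher.test(tab, alternative = "greater")

# The same p-value as the hypergeometric upper tail P(X >= overlap)
p_hyper <- phyper(overlap - 1L,
                  gene_set_size,                  # "successes" in the universe
                  universe_size - gene_set_size,  # "failures" in the universe
                  n_sig,                          # number of draws
                  lower.tail = FALSE)

ft$p.value   # Fisher's exact test p-value
ft$estimate  # conditional maximum likelihood estimate of the odds ratio
p_hyper      # matches ft$p.value
```

The two p-values agree because a one-sided Fisher's exact test on a 2x2 table is the upper tail of the hypergeometric distribution.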

Usage

run_ora(
  sig_genes,
  universe,
  gene_sets = "hallmark",
  gene_id_type = "ensembl_gene",
  min_size = 10L,
  max_size = 500L
)

Arguments

sig_genes

Character vector of significant gene identifiers.

universe

Character vector of all tested gene identifiers (background).

gene_sets

Character string specifying the MSigDB collection (same options as run_gsea()), or a named list of character vectors, one vector of gene identifiers per gene set.

gene_id_type

Type of gene identifiers: "ensembl_gene" (default), "gene_symbol", or "entrez_gene".

min_size

Minimum gene set size (default: 10).

max_size

Maximum gene set size (default: 500).
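As a rough illustration of how `min_size` and `max_size` act on a custom named list, the sketch below filters hypothetical gene sets by size. The gene identifiers and set names are invented, and counting set size after restricting each set to the universe is an assumption; this page does not say whether `run_ora()` measures size before or after that restriction.

```r
# Hypothetical gene sets in the named-list form accepted by `gene_sets`
gene_sets <- list(
  SET_A = paste0("gene", 1:15),   # 15 genes: kept
  SET_B = paste0("gene", 10:14),  # 5 genes: below min_size
  SET_C = paste0("gene", 1:600)   # 600 genes: above max_size
)
universe <- paste0("gene", 1:1000)
min_size <- 10L
max_size <- 500L

# Assumption: size is counted after restricting each set to the universe
in_universe <- lapply(gene_sets, intersect, universe)
sizes <- lengths(in_universe)

# Keep only sets whose size lies within [min_size, max_size]
kept <- in_universe[sizes >= min_size & sizes <= max_size]
names(kept)
```

With these toy sets, only `SET_A` falls inside the size bounds.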

Value

List with components:

results

Data frame with one row per tested pathway and the columns pathway, overlap, gene_set_size, universe_size, p_value, padj, odds_ratio, and overlapping_genes.

n_pathways_tested

Number of pathways tested.

n_pathways_enriched

Number of pathways significant at padj < 0.05.

n_sig_genes

Number of input significant genes.

top_pathways

Top 20 enriched pathways as a data frame.
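The summary components can be related to the results table as in the sketch below. It uses a toy data frame with made-up p-values and only two of the documented columns. Both the Benjamini-Hochberg adjustment and the ordering of top pathways by padj are assumptions; this page does not specify the adjustment method or the ranking.

```r
# Toy results table (made-up values; column names follow the Value section)
results <- data.frame(
  pathway = paste0("PATHWAY_", 1:5),
  p_value = c(0.0001, 0.004, 0.04, 0.2, 0.8)
)

# Assumption: Benjamini-Hochberg adjustment across all tested pathways
results$padj <- p.adjust(results$p_value, method = "BH")

# Summary values analogous to the documented list components
n_pathways_tested   <- nrow(results)
n_pathways_enriched <- sum(results$padj < 0.05)

# Assumption: "top" means smallest adjusted p-value first, at most 20 rows
top_pathways <- head(results[order(results$padj), ], 20)
```

Here two of the five toy pathways pass the padj < 0.05 threshold.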