How Reliable Are These Numbers? • micromort

Risk numbers disagree. The WHO and the Institute for Health Metrics and Evaluation (IHME) report malaria deaths as 550,000 and 760,000 respectively: a 38% gap between two estimates of the same underlying deaths. Our World in Data’s Deadliest Animals chart is visually compelling, but converting its annual death counts to per-encounter micromorts requires an exposure denominator for each animal, which is often hard to find. This vignette documents how we handle that uncertainty.

1. Why Risk Numbers Disagree

Three factors drive disagreement between sources:

Numerator uncertainty: Death attribution varies by coding system (ICD-10 codes, verbal autopsy, hospital records)
Denominator uncertainty: How many people were exposed? A “deaths per year” figure means nothing without knowing the exposure population
Temporal and geographic aggregation: A global annual average hides enormous regional and seasonal variation

Our inclusion criteria: traceable numerator + defined denominator + reproducible calculation. We reject risks where we cannot identify both the death count and the population at risk.

2. The Confidence System

Every entry in atomic_risks() carries a confidence tier:

Confidence tiers with examples from the micromort dataset
Tier	Criteria	Example	Source type
high	Peer-reviewed, large-N studies with defined denominators	Medical radiation (NRC dosimetry)	Regulatory agency
medium	Reputable sources, reasonable denominators, some extrapolation	Wikipedia micromort list, CDC injury data	Secondary compilation
low	Limited sources, regional uncertainty, or extrapolated denominators	Snake bite in rural Africa (WHO estimate)	Expert estimate
estimated	Derived by calculation from a model (e.g., LNT for radiation)	Annual cosmic radiation from LNT model	Model-derived

Validation status (new)

Within each confidence tier, we now track how thoroughly the estimate has been cross-checked:

Validation status levels
Status	Definition	Source count	Example
`single_source`	One citation, no cross-check	1	Most legacy entries from Wikipedia/micromorts.rip
`corroborated`	2+ sources agree within 2x	2+	Flight risks (Boeing + NCRP + medical literature)
`cross_validated`	3+ sources, range documented, outliers explained	3+	(Future: entries with systematic literature review)
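
The status definitions above can be sketched as a small classifier. This is a hypothetical helper, not part of the micromort API: the function name, the `range_documented` flag (standing in for the manual review step), and the NA fallback for sources that disagree by more than 2x are all assumptions made for illustration.

```r
# Hypothetical helper: classify validation status from independent estimates.
# `estimates` holds one micromort value per source.
# `range_documented` stands in for the manual "range documented,
# outliers explained" review, which cannot be computed automatically.
classify_validation <- function(estimates, range_documented = FALSE) {
  stopifnot(is.numeric(estimates), length(estimates) >= 1, all(estimates > 0))
  n <- length(estimates)
  if (n == 1) {
    return("single_source")
  }
  if (n >= 3 && range_documented) {
    return("cross_validated")
  }
  if (max(estimates) / min(estimates) <= 2) {
    return("corroborated")
  }
  # Assumption: the text does not say how disagreeing sources are labelled,
  # so this sketch returns NA rather than inventing a status.
  NA_character_
}

# Illustrative inputs only, not real source values
classify_validation(6.7)
classify_validation(c(6.7, 8.0))
classify_validation(c(5, 8, 10), range_documented = TRUE)
```

With these inputs the three calls return "single_source", "corroborated", and "cross_validated" respectively.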

Current validation status across all entries
confidence	corroborated	single_source
high	29	9
low	3	0
medium	12	76
estimated	0	2
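
To read the counts above as proportions, the following sketch re-enters the table by hand and computes the corroborated share per tier. The counts are copied from the table; in practice they would presumably be tabulated from atomic_risks(), but the column names needed for that are not shown here, so the sketch avoids guessing them.

```r
# Counts copied from the validation status table above
counts <- rbind(
  high      = c(corroborated = 29, single_source = 9),
  low       = c(corroborated = 3,  single_source = 0),
  medium    = c(corroborated = 12, single_source = 76),
  estimated = c(corroborated = 0,  single_source = 2)
)

# Share of entries in each tier that have been corroborated
share_corroborated <- counts[, "corroborated"] / rowSums(counts)
round(100 * share_corroborated)

# Share across all entries
overall <- sum(counts[, "corroborated"]) / sum(counts)
round(100 * overall)
```

About three quarters of the high tier is corroborated, against roughly one in seven medium entries and about a third of all 131 entries.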

3. Geographic Conditioning: The Biggest Source of Variation

The same animal encounter can produce dramatically different micromort values depending on location and healthcare access:

Geographic conditioning: same encounter, different risk
activity	micromorts	condition_value	confidence	notes
Dog bite (US)	6.7	high_income	medium	CDC: ~30 deaths/yr among ~4.5M bites requiring medical attention
Dog bite (rabies-endemic)	160.0	low_income	low	WHO: ~40k rabies deaths/yr, mostly dog-mediated
Snake bite (US, with antivenom)	0.5	high_income	medium	CDC: ~5 deaths/yr among ~10k bites
Snake bite (rural sub-Saharan Africa)	18.5	low_income	low	WHO/Lancet: ~100k deaths/yr among ~5.4M bites in sub-Saharan Africa

Snake bite: 0.5 micromorts (US, with antivenom) vs 18.5 micromorts (rural sub-Saharan Africa), a 37x difference. Dog bite: 6.7 micromorts (US, with rabies PEP) vs 160 micromorts (rabies-endemic, no treatment), a 24x difference.

The hedgeability asymmetry

Geography is a hedgeable conditional risk — but only for some people:

A tourist can hedge: choose destination, get travel vaccines, carry antivenom kit, buy travel insurance
A resident cannot hedge: they live there, and may lack healthcare infrastructure, vaccines, or economic choice

This parallels the existing health profile conditioning. A bee sting is 0.03 micromorts for someone who is not allergic, but 31 micromorts for someone with a known allergy, roughly a 1,000x difference. The allergic person can hedge (carry an epinephrine auto-injector), but they cannot eliminate the underlying vulnerability.

Health profile conditioning: same sting, different risk
activity	micromorts	condition_value	hedge_description	hedge_reduction_pct
Bee/wasp sting (general)	0.03	healthy	Avoid nests, wear shoes outdoors	30
Bee/wasp sting (allergic)	31.00	allergic	Carry epinephrine auto-injector, immunotherapy	95
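
One way to read the hedge_reduction_pct column is as a multiplicative reduction of the baseline risk. That interpretation is an assumption of this sketch, not something the package is confirmed to do; the values are copied from the table above.

```r
# Values copied from the health profile table above
stings <- data.frame(
  activity            = c("Bee/wasp sting (general)", "Bee/wasp sting (allergic)"),
  micromorts          = c(0.03, 31),
  hedge_reduction_pct = c(30, 95)
)

# Assumption: the hedge scales the baseline risk multiplicatively
stings$hedged_micromorts <-
  stings$micromorts * (1 - stings$hedge_reduction_pct / 100)

stings[, c("activity", "micromorts", "hedged_micromorts")]
```

Under that assumption the hedged allergic risk is about 1.55 micromorts, still roughly 50 times the unhedged general value of 0.03. The hedge reduces the risk but does not remove the underlying vulnerability.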

Using geographic filtering

library(micromort)
library(dplyr)  # filter() comes from dplyr

# Default: returns high-income estimates
common_risks() |> filter(category == "Wildlife")

# Explicitly request low-income geography
common_risks(profile = list(geography = "low_income")) |> filter(category == "Wildlife")

# Combine geography with a health profile
common_risks(profile = list(geography = "low_income", health_profile = "allergic"))

4. Cross-Validation Methods

We use five methods to assess data reliability:

Source triangulation

Compare the same risk across independent sources. For wildlife risks, we cross-reference:

OWID annual death counts (numerator)
CDC injury surveillance (US denominator)
WHO fact sheets (global denominator)
ISAF shark attack database (species-specific data)

Denominator audit

A missing denominator is the most common failure mode. The audit asks whether the source reports both a numerator (deaths) and a denominator (exposures):

Animal	Numerator available?	Denominator available?	Included?
Shark	Yes (ISAF)	Yes (~100M swims/yr)	Yes
Dog	Yes (CDC, WHO)	Yes (4.5M bites US)	Yes
Mosquito	Yes (WHO: 600k+)	No per-encounter rate	No
Crocodile	Yes (CrocBITE)	No exposure estimate	No
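
The audit reduces to a logical AND over the two availability columns. A minimal sketch, with the table re-entered by hand and column names chosen for illustration:

```r
# Audit table re-entered from above; column names are illustrative
audit <- data.frame(
  animal          = c("Shark", "Dog", "Mosquito", "Crocodile"),
  has_numerator   = c(TRUE, TRUE, TRUE, TRUE),
  has_denominator = c(TRUE, TRUE, FALSE, FALSE)
)

# An entry is included only when both pieces are available
audit$included <- audit$has_numerator & audit$has_denominator

audit$animal[audit$included]
#> [1] "Shark" "Dog"
```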

Temporal stability

Has the number changed significantly across editions of the source? Stable estimates across 5+ years increase confidence.

Geographic consistency

Do US, UK, and global estimates agree within an order of magnitude? Large discrepancies suggest unmeasured confounders (see Confounding Variables).
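
As a sketch of this check, the helper below flags pairs of estimates that differ by more than a factor of ten. The function name is hypothetical, and the inputs are the high-income and low-income values from the geographic conditioning table, used in place of US, UK, and global figures that are not listed here.

```r
# Hypothetical helper: TRUE when two estimates agree within a factor of ten
within_order_of_magnitude <- function(a, b) {
  pmax(a, b) / pmin(a, b) <= 10
}

# Values copied from the geographic conditioning table
regional <- data.frame(
  encounter   = c("Dog bite", "Snake bite"),
  high_income = c(6.7, 0.5),
  low_income  = c(160, 18.5)
)

regional$fold_difference <- regional$low_income / regional$high_income
regional$consistent <- within_order_of_magnitude(regional$high_income,
                                                 regional$low_income)
regional
```

Both rows fail the check (about 24x and 37x), which is why these entries are stored as separate geography-conditioned values rather than averaged into one global figure.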

Order-of-magnitude test

Is the number physically plausible? A micromort value that implies more deaths than the population can support is a red flag.
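
A minimal sketch of this test, assuming two red flags: a per-encounter value above one million micromorts (a death probability greater than 1), and implied annual deaths that exceed the population at risk. Both function names are hypothetical, and the population figures are rough illustrative inputs.

```r
# Deaths per year implied by a per-encounter micromort value
implied_deaths <- function(micromorts, exposures_per_year) {
  micromorts * exposures_per_year / 1e6
}

# Hypothetical plausibility check covering the two red flags
is_plausible <- function(micromorts, exposures_per_year, population) {
  probability_ok <- micromorts <= 1e6  # above 1e6 would mean P(death) > 1
  deaths_ok <- implied_deaths(micromorts, exposures_per_year) <= population
  probability_ok & deaths_ok
}

# Shark row: 0.06 micromorts over ~100M swims implies about 6 deaths per year
implied_deaths(0.06, 100e6)

# 330e6 is an approximate US population, used only for illustration
is_plausible(0.06, 100e6, population = 330e6)  # TRUE

# Made-up implausible value: implies 500,000 deaths in a population of 100,000
is_plausible(5000, 100e6, population = 1e5)    # FALSE
```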

5. Worked Example: Animal Risks from OWID

Our World in Data reports annual deaths by animal. Converting to per-encounter micromorts requires:

$\text{micromorts} = \frac{\text{deaths per year}}{\text{encounters per year}} \times 10^6$

Converting OWID annual counts to per-encounter micromorts
Animal	Annual deaths (approx)	Encounters/yr (approx)	Micromorts	Source for denominator	In dataset?
Shark	~6 (US)	~100M ocean swims	0.06	ISAF	Yes
Dog (US)	~30	~4.5M bites	6.7	CDC	Yes
Bee/wasp (US)	~62	~2M stings	0.03	CDC	Yes
Snake (US)	~5	~10,000 bites	0.5	CDC	Yes
Snake (Africa)	~100,000	~5.4M bites	18.5	WHO/Lancet	Yes
Mosquito	~600,000+	Unknown per-bite	—	—	No
Crocodile	~1,000	Unknown	—	—	No
Elephant	~500	Unknown	—	—	No

Mosquito, crocodile, and elephant fail our inclusion criteria: there is no defensible per-encounter denominator. Mosquito bites are ubiquitous in endemic regions, making a per-bite risk meaningless. We cite OWID for context but do not include these as micromort entries.
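
The conversion formula can be sketched directly in R. The function name is hypothetical; the inputs are the shark and dog rows from the table, and an unknown denominator is passed as NA so that it propagates to the result instead of producing a number.

```r
# Hypothetical helper implementing deaths / encounters * 1e6
to_micromorts <- function(deaths_per_year, encounters_per_year) {
  stopifnot(all(encounters_per_year > 0, na.rm = TRUE))
  deaths_per_year / encounters_per_year * 1e6
}

# Shark and dog rows from the table above
round(to_micromorts(c(shark = 6, dog = 30), c(100e6, 4.5e6)), 2)

# Unknown denominator (e.g. crocodile): the result is NA, not a number
to_micromorts(1000, NA_real_)
```

The first call gives about 0.06 and 6.67, matching the shark and dog entries. The second returns NA, mirroring the missing entries for animals without a defensible denominator.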

6. Estimate Ranges

For wildlife entries, we document plausible ranges reflecting source disagreement:

Estimate ranges for wildlife entries
activity	micromorts	estimate_range	source_count	validation_status
Shark encounter (ocean swim)	0.06	0.03-0.10	2	corroborated
Dog bite (US)	6.70	5-10	2	corroborated
Dog bite (rabies-endemic)	160.00	100-250	2	corroborated
Bee/wasp sting (general)	0.03	0.02-0.05	2	corroborated
Bee/wasp sting (allergic)	31.00	20-50	2	corroborated
Snake bite (US, with antivenom)	0.50	0.3-1.0	2	corroborated
Snake bite (rural sub-Saharan Africa)	18.50	10-30	2	corroborated

The range reflects uncertainty in both the numerator (death counts vary by year and reporting) and denominator (exposure estimates are often rough). The point estimate is our best central value; the range brackets the plausible minimum and maximum.
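
A quick consistency check on the table above: parse each range, confirm the point estimate falls inside it, and compute how wide the range is as a ratio. This sketch assumes estimate_range is stored as a "low-high" string, as displayed; the data frame is re-entered by hand.

```r
# Values copied from the estimate range table above
ranges <- data.frame(
  activity = c("Shark encounter (ocean swim)", "Dog bite (US)",
               "Dog bite (rabies-endemic)", "Bee/wasp sting (general)",
               "Bee/wasp sting (allergic)", "Snake bite (US, with antivenom)",
               "Snake bite (rural sub-Saharan Africa)"),
  micromorts = c(0.06, 6.7, 160, 0.03, 31, 0.5, 18.5),
  estimate_range = c("0.03-0.10", "5-10", "100-250", "0.02-0.05",
                     "20-50", "0.3-1.0", "10-30"),
  stringsAsFactors = FALSE
)

# Split "low-high" strings into numeric bounds (assumed storage format)
bounds <- do.call(rbind, lapply(strsplit(ranges$estimate_range, "-", fixed = TRUE),
                                as.numeric))
ranges$low  <- bounds[, 1]
ranges$high <- bounds[, 2]

# Point estimate should sit inside its own range
ranges$in_range <- ranges$micromorts >= ranges$low & ranges$micromorts <= ranges$high

# Width of each range expressed as a fold difference
ranges$spread <- ranges$high / ranges$low

ranges[, c("activity", "in_range", "spread")]
```

Every point estimate falls inside its range, and the ranges span between 2x and about 3.3x from minimum to maximum.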

7. What You Can Contribute

If you find a better source for an existing entry, or want to propose a new risk: open an issue at github.com/johngavin/micromort with:

Numerator: Death count and source citation
Denominator: Exposure count and source citation
Geography/condition: Does the estimate apply globally, or to a specific population?
Time period: When was the data collected?

Entries start at validation_status = "single_source" and get upgraded as more sources confirm them.

Reproducibility

sessionInfo()
#> R version 4.5.2 (2025-10-31)
#> Platform: aarch64-apple-darwin25.2.0
#> Running under: macOS Tahoe 26.3.1
#> 
#> Matrix products: default
#> BLAS:   /nix/store/ab8sq4g14lg45192ykfqcklgw6fvaswh-blas-3/lib/libblas.dylib 
#> LAPACK: /nix/store/ssl6kfm7w37gz5pn57jn2x7xzw3bss24-openblas-0.3.30/lib/libopenblasp-r0.3.30.dylib;  LAPACK version 3.12.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Belfast
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.1.4     micromort_0.1.0 testthat_3.3.2 
#> 
#> loaded via a namespace (and not attached):
#>  [1] generics_0.1.4      digest_0.6.39       magrittr_2.0.4     
#>  [4] evaluate_1.0.5      grid_4.5.2          RColorBrewer_1.1-3 
#>  [7] pkgload_1.4.1       fastmap_1.2.0       rprojroot_2.1.1    
#> [10] jsonlite_2.0.0      processx_3.8.6      pkgbuild_1.4.8     
#> [13] backports_1.5.0     brio_1.1.5          secretbase_1.1.1   
#> [16] ps_1.9.1            purrr_1.2.1         scales_1.4.0       
#> [19] codetools_0.2-20    cli_3.6.5           rlang_1.1.7        
#> [22] bit64_4.6.0-1       withr_3.0.2         yaml_2.3.12        
#> [25] otel_0.2.0          tools_4.5.2         checkmate_2.3.3    
#> [28] ggplot2_4.0.1       base64url_1.4       credentials_2.0.3  
#> [31] assertthat_0.2.1    vctrs_0.7.1         R6_2.6.1           
#> [34] lifecycle_1.0.5     fs_1.6.6            bit_4.6.0          
#> [37] usethis_3.2.1       targets_1.11.4      arrow_22.0.0       
#> [40] callr_3.7.6         pkgconfig_2.0.3     desc_1.4.3         
#> [43] pillar_1.11.1       gtable_0.3.6        data.table_1.18.2.1
#> [46] glue_1.8.0          gert_2.3.1          xfun_0.56          
#> [49] tibble_3.3.1        tidyselect_1.2.1    sys_3.4.3          
#> [52] knitr_1.51          farver_2.1.2        igraph_2.2.1       
#> [55] htmltools_0.5.9     rmarkdown_2.30      compiler_4.5.2     
#> [58] prettyunits_1.2.0   S7_0.2.1            askpass_1.2.1      
#> [61] openssl_2.3.4