Computes distribution-free confidence intervals for population quantiles from order statistics. The Beta distribution is used to find order-statistic indices j < k such that the interval (X_(j), X_(k)) covers the p-th quantile with probability at least the specified confidence level.

Usage

ci_order_statistics(x, probs, conf_level = 0.95)

Arguments

x

Numeric vector of observations. NA values are removed internally.

probs

Numeric vector of probabilities for which to compute CIs (e.g. c(0.95, 0.99)).

conf_level

Confidence level (default 0.95).

Value

A data.frame with columns: probability, quantile, lower, upper, j, k, actual_coverage, method.

Details

For a sample of size n, the probability that the interval (X_(j), X_(k)) contains the p-th quantile is pbeta(p, j, n-j+1) - pbeta(p, k, n-k+1). We search for the tightest such interval achieving at least conf_level coverage.
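The coverage formula above can be evaluated directly in base R. The sketch below is illustrative only: the helper names os_coverage() and tightest_pair() are hypothetical and are not part of this package, and "tightest" is assumed here to mean the smallest index span k - j, which may differ from the criterion ci_order_statistics() actually uses.

```r
# Coverage of (X_(j), X_(k)) for the p-th quantile, as given in Details.
# Hypothetical helper, not exported by the package.
os_coverage <- function(n, p, j, k) {
  pbeta(p, j, n - j + 1) - pbeta(p, k, n - k + 1)
}

# Brute-force search for the pair (j, k) with the smallest index span
# k - j whose coverage is at least conf_level. Assumption: "tightest"
# means smallest k - j; ties are resolved in favour of the smallest j.
tightest_pair <- function(n, p, conf_level = 0.95) {
  idx <- seq_len(n)
  # P(X_(i) <= p-th quantile); decreasing in i
  tail_prob <- pbeta(p, idx, n - idx + 1)
  best <- NULL
  for (j in idx[-n]) {
    target <- tail_prob[j] - conf_level
    if (target < 0) next                      # no k can reach conf_level
    k_candidates <- which(tail_prob <= target & idx > j)
    if (length(k_candidates) == 0) next
    k <- k_candidates[1]                      # smallest admissible k for this j
    if (is.null(best) || (k - j) < (best$k - best$j)) {
      best <- list(j = j, k = k, coverage = tail_prob[j] - tail_prob[k])
    }
  }
  best                                        # NULL if no pair qualifies
}

os_coverage(n = 1000, p = 0.95, j = 935, k = 963)
tightest_pair(n = 1000, p = 0.95)
```

The same coverage can be written in binomial form as pbinom(k - 1, n, p) - pbinom(j - 1, n, p), which is a convenient cross-check.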

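This method is distribution-free: it requires no parametric assumptions, although the coverage formula treats the observations as independent draws from a common continuous distribution. With about 8 years of hourly data (roughly 70,000 observations), order-statistic CIs are well-defined even for extreme quantiles like the 99th percentile.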
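To see why sample size matters for extreme quantiles, note that the widest possible interval, (X_(1), X_(n)), has coverage 1 - p^n - (1 - p)^n under the formula in Details. The sketch below uses that bound to find the smallest n at which any two-sided interval can reach the requested level. The helper names are hypothetical, and how ci_order_statistics() itself behaves when n is too small is not documented here.

```r
# Largest coverage any (X_(j), X_(k)) can reach: take j = 1, k = n.
# Hypothetical helper, not exported by the package.
max_coverage <- function(n, p) 1 - p^n - (1 - p)^n

# Smallest n for which a two-sided order-statistic CI at conf_level exists.
# n_max is an arbitrary search cap chosen for this sketch.
min_n_two_sided <- function(p, conf_level = 0.95, n_max = 1e5) {
  n <- seq_len(n_max)
  ok <- max_coverage(n, p) >= conf_level
  if (!any(ok)) return(NA_integer_)
  n[which(ok)[1]]
}

min_n_two_sided(0.95)  # 59
min_n_two_sided(0.99)  # 299
```

A sample of about 70,000 observations is far above both thresholds, which is why intervals for the 95th and 99th percentiles exist with room to spare.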

Examples

set.seed(42)
x <- rnorm(1000)
ci_order_statistics(x, probs = c(0.95, 0.99))
#>   probability quantile    lower    upper   j   k actual_coverage
#> 1        0.95 1.533487 1.429338 1.689459 935 963       0.9544052
#> 2        0.99 2.242351 2.041313 2.727196 983 996       0.9574810
#>             method
#> 1 order_statistics
#> 2 order_statistics