
Estimates the upper tail dependence coefficient (lambda_U) for every unique pair of stations by fitting a Gumbel copula. For a Gumbel copula with parameter alpha >= 1, lambda_U = 2 - 2^(1/alpha), so alpha = 1 (independence) gives lambda_U = 0 and lambda_U approaches 1 as alpha grows. Bootstrap confidence intervals assess whether lambda_U is significantly greater than zero (H1: spatial coherence of extremes).
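
As a quick check of the formula above, the mapping from alpha to lambda_U can be written in a few lines of base R. The helper name gumbel_lambda_upper() is hypothetical and is not part of the package; it only illustrates the closed-form relationship.

```r
# Hypothetical helper (not exported by the package): upper tail dependence
# of a Gumbel copula with parameter alpha, valid for alpha >= 1.
gumbel_lambda_upper <- function(alpha) {
  stopifnot(is.numeric(alpha), all(alpha >= 1))
  2 - 2^(1 / alpha)
}

# alpha = 1 is the independence case; larger alpha means stronger
# dependence in the upper tail.
gumbel_lambda_upper(c(1, 2, 5))
```

With alpha = 1 the coefficient is exactly 0, and with alpha = 2 it is 2 - sqrt(2), roughly 0.586.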

Also computes empirical chi statistics at multiple quantile levels and Kendall's tau for overall rank dependence.
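
For intuition, the two supplementary statistics can be sketched in base R. This is a minimal illustration that assumes chi at level q is estimated as the conditional exceedance proportion P(V > q | U > q) on rank-based pseudo-observations; the package may use a different estimator, and empirical_chi() is a hypothetical name.

```r
# Hypothetical sketch: empirical chi(q) = P(V > q | U > q), where U and V
# are rank-based pseudo-observations of the two series.
empirical_chi <- function(x, y, q) {
  ok <- stats::complete.cases(x, y)   # keep concurrent observations only
  n <- sum(ok)
  u <- rank(x[ok]) / (n + 1)
  v <- rank(y[ok]) / (n + 1)
  vapply(q, function(level) {
    exceed_u <- u > level
    if (!any(exceed_u)) return(NA_real_)  # no exceedances at this level
    mean(v[exceed_u] > level)
  }, numeric(1))
}

# Simulated pair of positively dependent series (illustrative data only)
set.seed(1)
x <- rnorm(2000)
y <- 0.7 * x + rnorm(2000, sd = 0.5)

empirical_chi(x, y, q = c(0.90, 0.95, 0.99))  # tail dependence by level
stats::cor(x, y, method = "kendall")          # overall rank dependence
```

Chi estimates become noisier as q approaches 1 because fewer observations exceed the threshold, which is one reason to inspect several quantile levels rather than a single one.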

Usage

compute_extremal_dependence(
  data,
  variable = "wave_height",
  threshold_quantile = seq(0.9, 0.99, by = 0.01),
  n_bootstrap = 100,
  boot_subsample = 5000,
  station_info = NULL
)

Arguments

data

Data frame with columns: time (POSIXct), station_id (character), and the variable specified by variable.

variable

Variable to analyze (default: "wave_height").

threshold_quantile

Quantile levels at which to compute empirical chi (default: seq(0.9, 0.99, by = 0.01)).

n_bootstrap

Number of bootstrap replicates for lambda CI (default: 100).

boot_subsample

Maximum observations per bootstrap replicate. Subsampling speeds computation for large datasets (default: 5000).

station_info

Optional data frame with station metadata (from get_station_info()). If NULL, uses the default 5-station network.
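
How n_bootstrap and boot_subsample interact can be illustrated with a percentile bootstrap for a single station pair. The sketch below is an assumption-laden stand-in, not the package's implementation: it estimates alpha by inverting Kendall's tau (tau = 1 - 1/alpha for the Gumbel copula) instead of fitting the copula with the copula package, and boot_lambda_upper() is a hypothetical name.

```r
# Hypothetical sketch of a percentile bootstrap CI for lambda_U.
# Assumption: alpha is obtained by Kendall's tau inversion rather than
# by a likelihood fit.
boot_lambda_upper <- function(x, y, n_bootstrap = 100, boot_subsample = 5000,
                              level = 0.95) {
  ok <- stats::complete.cases(x, y)
  x <- x[ok]
  y <- y[ok]
  n <- length(x)
  m <- min(n, boot_subsample)  # cap the size of each replicate

  lambda_from_tau <- function(tau) {
    tau <- min(max(tau, 0), 1 - 1e-8)  # Gumbel covers tau in [0, 1) only
    alpha <- 1 / (1 - tau)
    2 - 2^(1 / alpha)
  }

  reps <- replicate(n_bootstrap, {
    idx <- sample.int(n, size = m, replace = TRUE)  # resample pairs jointly
    lambda_from_tau(stats::cor(x[idx], y[idx], method = "kendall"))
  })

  tail_prob <- (1 - level) / 2
  stats::quantile(reps, probs = c(tail_prob, 1 - tail_prob), names = FALSE)
}

# Illustrative data: two series sharing a common component
set.seed(42)
x <- rexp(300)
y <- x + rexp(300, rate = 2)
boot_lambda_upper(x, y, n_bootstrap = 50, boot_subsample = 200)
```

A lower bound above zero would support H1 for that pair. In this sketch, a smaller boot_subsample shortens each replicate but also widens the interval, because every replicate is based on fewer observations.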

Value

List with:

dependence_table

Data frame with columns: station1, station2, distance_km, kendall_tau, lambda_upper, lambda_lower, lambda_upper_ci_low, lambda_upper_ci_high, n_concurrent, copula_alpha, chi_q95, chi_q99, h1_significant (logical).

method

Character: "gumbel_copula".

n_bootstrap

Integer: number of bootstrap replicates used.

threshold_quantile

Numeric vector of quantile levels for chi.

If the copula package is unavailable or no valid pairs exist, returns a list with an error field.
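
Because a failure is reported through the return value rather than by signalling a condition, callers may want to check for it before using the table. A minimal sketch, assuming the field is named error and holds a message string (the wrapper extract_dependence_table() is hypothetical):

```r
# Hypothetical wrapper: fail loudly if the result carries an error field,
# otherwise return the pairwise dependence table.
extract_dependence_table <- function(result) {
  if (!is.null(result[["error"]])) {
    stop("compute_extremal_dependence() failed: ", result[["error"]],
         call. = FALSE)
  }
  result[["dependence_table"]]
}

# Mock results standing in for real output (illustrative values only)
ok_result <- list(
  dependence_table = data.frame(
    station1 = "A", station2 = "B", h1_significant = TRUE
  ),
  method = "gumbel_copula"
)
tab <- extract_dependence_table(ok_result)
tab[tab$h1_significant, ]  # pairs for which H1 is supported
```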

Examples

if (FALSE) { # \dontrun{
con <- connect_duckdb()
data <- query_buoy_data(con, variables = c("time", "station_id", "wave_height"))
result <- compute_extremal_dependence(data)
result$dependence_table
DBI::dbDisconnect(con)
} # }