Identifies anomalous values by flagging observations that lie more than a given number of standard deviations from the seasonal (monthly) norm.

Usage

detect_anomalies(
  data,
  variable = "wave_height",
  time_col = "time",
  threshold = 3
)

Arguments

data

Data frame containing the time column and the column of values to check

variable

Name of the column in data holding the values to check for anomalies (default: "wave_height")

time_col

Name of the time column (default: "time")

threshold

Number of standard deviations from the seasonal norm beyond which an observation is flagged as an anomaly (default: 3)

Value

List with:

  • anomalies: data frame of anomalous observations

  • seasonal_norms: monthly mean and standard deviation used as the baseline

  • summary: count of anomalies by month
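
The rule described above can be sketched in base R. This is a minimal illustration, not the package's actual source: the function name sketch_detect_anomalies, the column names of seasonal_norms, and the handling of missing values are assumptions made here. It groups observations by calendar month, computes each month's mean and standard deviation, and flags values lying more than threshold standard deviations from their month's mean.

```r
# Hypothetical re-implementation for illustration only; the real
# detect_anomalies() may differ in details such as NA handling.
sketch_detect_anomalies <- function(data, variable = "wave_height",
                                    time_col = "time", threshold = 3) {
  values <- data[[variable]]
  # Calendar month (1-12) of each observation
  month <- as.integer(format(data[[time_col]], "%m"))

  # Monthly mean and standard deviation, repeated for every observation
  month_mean <- ave(values, month, FUN = function(x) mean(x, na.rm = TRUE))
  month_sd <- ave(values, month, FUN = function(x) sd(x, na.rm = TRUE))

  # Flag values more than `threshold` SDs away from their monthly mean;
  # observations with a missing value or undefined SD are never flagged
  is_anomaly <- !is.na(values) & !is.na(month_sd) &
    abs(values - month_mean) > threshold * month_sd

  months <- sort(unique(month))
  list(
    anomalies = data[is_anomaly, , drop = FALSE],
    seasonal_norms = data.frame(
      month = months,
      mean = as.vector(tapply(values, month, mean, na.rm = TRUE)),
      sd = as.vector(tapply(values, month, sd, na.rm = TRUE))
    ),
    summary = data.frame(
      month = months,
      n_anomalies = as.vector(tapply(is_anomaly, month, sum)),
      month_name = month.abb[months]
    )
  )
}

# Usage: the synthetic series from the Examples section, with one
# artificial spike added so that there is something to detect
set.seed(1)
demo <- data.frame(
  time = seq(as.POSIXct("2020-01-01"), by = "hour", length.out = 1000),
  wave_height = 2 + sin(seq(0, 20, length.out = 1000)) + rnorm(1000, 0, 0.3)
)
demo$wave_height[500] <- 20
sketch_result <- sketch_detect_anomalies(demo)
sketch_result$anomalies
```

Because the standard deviation is computed per month, any slow variation within a month (such as the sine component in the example data) widens the baseline, so only values far outside that month's overall spread are flagged.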

Examples

set.seed(1)
data <- data.frame(
  time = seq(as.POSIXct("2020-01-01"), by = "hour", length.out = 1000),
  wave_height = 2 + sin(seq(0, 20, length.out = 1000)) + rnorm(1000, 0, 0.3)
)
result <- detect_anomalies(data)
#>  Detecting anomalies...
#>  Detected 0 anomalies (>3 SD from seasonal norm)
#>  Detecting anomalies... [7ms]
#> 
nrow(result$anomalies)
#> [1] 0
result$summary
#>   month n_anomalies month_name
#> 1     1           0        Jan
#> 2     2           0        Feb