Initialize Parquet Storage Structure

Initializes a storage structure that uses Parquet files as the storage backend and DuckDB as the query engine. This provides strong compression (roughly 5-10x) while maintaining query performance.

The architecture:

  • Raw data is stored in Parquet files partitioned by year and month

  • DuckDB serves as the query engine and reads the Parquet files directly

  • Optionally, a DuckDB database holds metadata and indexes only
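
The layout described above can be illustrated with a minimal sketch. It is not the package's implementation: it only shows the general pattern of writing Parquet files partitioned by year and month and querying them directly from DuckDB. The `toy` data frame, the temporary directory, and the column names `year`, `month`, and `value` are assumptions made for illustration; the sketch requires the `DBI` and `duckdb` packages.

```r
library(DBI)

# Hypothetical scratch directory; DuckDB creates it when writing partitions.
data_path <- tempfile("parquet_demo")

# In-memory DuckDB connection: no database file is needed to query Parquet.
con <- dbConnect(duckdb::duckdb())

# Toy data (made up for illustration) with the two partition columns.
toy <- data.frame(
  year  = c(2023L, 2023L, 2024L),
  month = c(1L, 2L, 1L),
  value = c(10.5, 20.0, 30.25)
)
duckdb::duckdb_register(con, "toy", toy)

# Write one directory per year/month, e.g. year=2023/month=1/.
dbExecute(con, sprintf(
  "COPY (SELECT * FROM toy) TO '%s' (FORMAT PARQUET, PARTITION_BY (year, month))",
  data_path
))

# Query the Parquet files directly; year and month are recovered from the
# directory names, so the filter on year can skip other partitions.
res <- dbGetQuery(con, sprintf(
  "SELECT year, COUNT(*) AS n, SUM(value) AS total
   FROM read_parquet('%s/*/*/*.parquet', hive_partitioning = true)
   WHERE year = 2023
   GROUP BY year",
  data_path
))
print(res)

dbDisconnect(con, shutdown = TRUE)
```

Because the partition values live in the directory names, a filter on `year` or `month` lets DuckDB read only the matching directories, which is the main reason to partition by date in a layout like this.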

Usage

init_parquet_storage(
  data_path = "inst/extdata/parquet",
  db_path = "inst/extdata/metadata.duckdb"
)

Arguments

data_path

Base path for Parquet files

db_path

Optional path for metadata database