# Configuration

Snowpack is configured through environment variables. Each component reads its own set of variables at startup. Variables without a default are required.

## API

The API server connects to Postgres for job state, the table cache, and cached health data. It also creates PyIceberg catalogs for table discovery and live health checks, and Spark/Kyuubi clients for direct maintenance paths.

| Variable | Default | Description |
| --- | --- | --- |
| `SNOWPACK_SPARK_HOST` | `kyuubi-maintenance.data-platform.us-east-1.test-dataops.fetchrewards.com` | Spark Thrift Server / Kyuubi hostname. |
| `SNOWPACK_SPARK_PORT` | `10009` | Spark Thrift Server / Kyuubi port. |
| `SNOWPACK_CATALOG` | `lakehouse_dev` | Iceberg catalog name used in Spark SQL statements. |
| `SNOWPACK_TABLE_CACHE_REFRESH_SECONDS` | `300` | Cadence for the API `TableCacheSyncWorker` to refresh the table inventory. |
| `SNOWPACK_TABLE_CACHE_STALENESS_SECONDS` | 2x refresh cadence | Maximum acceptable table-cache age before `/readyz` fails. Set to `-1` to use the default. |
| `SNOWPACK_POSTGRES_HOST` | `localhost` | PostgreSQL hostname. |
| `SNOWPACK_POSTGRES_PORT` | `5432` | PostgreSQL port. |
| `SNOWPACK_POSTGRES_DATABASE` | `snowpack` | PostgreSQL database name. |
| `SNOWPACK_POSTGRES_USER` | `snowpack` | PostgreSQL username. |
| `SNOWPACK_POSTGRES_PASSWORD` | (none) | PostgreSQL password. Required; no default. |
| `SNOWPACK_DRAIN_MODE` | `off` | Set to `on` to reject new maintenance submissions. Running jobs are unaffected. Useful during planned Spark downtime. |
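Pulled together, a local/dev API configuration might look like the sketch below. The concrete values (localhost Spark, the placeholder password, the 10-minute staleness bound) are illustrative assumptions, not recommended settings.

```shell
# Postgres connection (password is required and has no default).
export SNOWPACK_POSTGRES_HOST=localhost
export SNOWPACK_POSTGRES_PORT=5432
export SNOWPACK_POSTGRES_DATABASE=snowpack
export SNOWPACK_POSTGRES_USER=snowpack
export SNOWPACK_POSTGRES_PASSWORD='change-me'   # placeholder value

# Spark/Kyuubi and catalog (overriding the in-cluster host default).
export SNOWPACK_SPARK_HOST=localhost
export SNOWPACK_SPARK_PORT=10009
export SNOWPACK_CATALOG=lakehouse_dev

# Refresh every 5 minutes; fail /readyz once the cache is >10 minutes old.
export SNOWPACK_TABLE_CACHE_REFRESH_SECONDS=300
export SNOWPACK_TABLE_CACHE_STALENESS_SECONDS=600
```

Note that the staleness bound here is set explicitly to the 2x-refresh value it would default to anyway; leaving it unset (or `-1`) gives the same behavior.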

## Health Sync Worker

The health sync worker periodically loads table metadata from the PyIceberg catalog and writes health snapshots to Postgres. It also optionally pushes metrics to Mimir via OTLP.

| Variable | Default | Description |
| --- | --- | --- |
| `SNOWPACK_HEALTH_SYNC_INTERVAL_SECONDS` | `900` | Health sync cadence in seconds (15 minutes). Set to `0` to disable the sync loop entirely. |
| `SNOWPACK_HEALTH_SYNC_DATABASES` | (all) | Comma-separated list of databases to sync. When unset, all databases in the catalog are synced. |
| `SNOWPACK_HEALTH_SYNC_CONCURRENCY` | `10` | Maximum concurrent PyIceberg table loads. Use ~2 on memory-constrained pods to avoid OOM kills. |
| `SNOWPACK_MIMIR_ENDPOINT` | (unset) | OTLP gRPC endpoint for the Mimir metrics push. Leave empty to disable the push. |
| `SNOWPACK_GLUE_CATALOG` | `lakehouse_dev` | Glue catalog name used by PyIceberg for direct metadata access. |
| `AWS_REGION` | `us-east-1` | AWS region for Glue and S3 API calls. |
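The defaulting behavior of these variables can be sketched as a small parser. This is illustrative only: the function and field names below are hypothetical, not the health sync worker's actual internals.

```python
import os

def load_health_sync_config(env=None):
    """Hypothetical parser mirroring the documented defaults above."""
    env = os.environ if env is None else env
    raw_dbs = env.get("SNOWPACK_HEALTH_SYNC_DATABASES", "")
    return {
        # 0 would disable the sync loop entirely.
        "interval_seconds": int(env.get("SNOWPACK_HEALTH_SYNC_INTERVAL_SECONDS", "900")),
        # Unset/empty means "sync every database in the catalog" (None here).
        "databases": [d.strip() for d in raw_dbs.split(",") if d.strip()] or None,
        "concurrency": int(env.get("SNOWPACK_HEALTH_SYNC_CONCURRENCY", "10")),
        # Unset endpoint disables the Mimir metrics push.
        "mimir_endpoint": env.get("SNOWPACK_MIMIR_ENDPOINT") or None,
        "glue_catalog": env.get("SNOWPACK_GLUE_CATALOG", "lakehouse_dev"),
        "aws_region": env.get("AWS_REGION", "us-east-1"),
    }

# Example: a memory-constrained pod syncing only two databases.
cfg = load_health_sync_config({
    "SNOWPACK_HEALTH_SYNC_DATABASES": "analytics, raw",
    "SNOWPACK_HEALTH_SYNC_CONCURRENCY": "2",
})
print(cfg["databases"], cfg["concurrency"])
```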

## Orchestrator

The orchestrator is a CronJob that queries the API for table health, decides which tables need maintenance, and submits jobs. It does not connect to Spark directly.

| Variable | Default | Description |
| --- | --- | --- |
| `SNOWPACK_API_URL` | `http://snowpack-api.snowpack.svc.cluster.local:443` | Snowpack API base URL. The orchestrator calls this for health checks and job submissions. |
| `SNOWPACK_MAINTENANCE_CADENCE_HOURS` | `6` | Global minimum hours between maintenance runs for a given table. Individual tables can override this via the `snowpack.maintenance_cadence_hours` table property. |
| `SNOWPACK_HEALTH_CONCURRENCY` | `10` | Maximum concurrent health-check requests to the API during the discovery phase. |
| `SNOWPACK_MAX_SUBMIT` | `3` | Maximum jobs the orchestrator will queue in a single run. Prevents overloading Spark when many tables need maintenance at once. |
| `SNOWPACK_POLL_INTERVAL` | `30` | Seconds between job-status polls while waiting for submitted jobs to complete. |
| `SNOWPACK_OPT_IN_MODE` | `true` | When `true`, only tables with `snowpack.maintenance_enabled = true` are considered. When `false`, all tables are eligible unless explicitly excluded via `compaction_skip`. |
| `SNOWPACK_INCLUDE_DATABASES` | (unset) | Comma-separated database allowlist. When set, only tables in these databases are considered. Maps to the Helm `orchestrator.includeDatabases` value. |
| `SNOWPACK_DRY_RUN` | `false` | When `true`, the orchestrator logs all decisions but submits no maintenance jobs. Useful for validating configuration changes. |
| `SNOWPACK_SLACK_WEBHOOK_URL` | (unset) | Optional Slack incoming webhook URL. When set, the orchestrator posts a summary after each run. |
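The opt-in, skip, and cadence rules above can be sketched as a filtering step. This is a hypothetical illustration of how the variables interact, not the orchestrator's actual code; the table-record shape (`properties`, `last_maintenance`, `name`) is assumed.

```python
import os
from datetime import datetime, timedelta, timezone

CADENCE_HOURS = int(os.environ.get("SNOWPACK_MAINTENANCE_CADENCE_HOURS", "6"))
OPT_IN = os.environ.get("SNOWPACK_OPT_IN_MODE", "true").lower() == "true"
MAX_SUBMIT = int(os.environ.get("SNOWPACK_MAX_SUBMIT", "3"))

def eligible(table, now):
    """Apply opt-in, skip, and per-table cadence overrides to one table record."""
    props = table.get("properties", {})
    # In opt-in mode, only explicitly enabled tables are considered.
    if OPT_IN and props.get("snowpack.maintenance_enabled") != "true":
        return False
    # Explicit exclusion applies in either mode.
    if props.get("compaction_skip") == "true":
        return False
    # A table property overrides the global cadence.
    cadence = int(props.get("snowpack.maintenance_cadence_hours", CADENCE_HOURS))
    last_run = table.get("last_maintenance")
    return last_run is None or now - last_run >= timedelta(hours=cadence)

def select_candidates(tables, now=None):
    """Return at most SNOWPACK_MAX_SUBMIT table names due for maintenance."""
    now = now or datetime.now(timezone.utc)
    return [t["name"] for t in tables if eligible(t, now)][:MAX_SUBMIT]
```

Capping the result at `SNOWPACK_MAX_SUBMIT` is what keeps a backlog of unhealthy tables from flooding Spark in a single run; the remainder are picked up on subsequent runs.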