
Quick Start

This guide walks you through confirming that a table is eligible for Snowpack maintenance, checking its health, and running your first maintenance job. You will need Spark SQL access and curl.

1. Confirm your database is in scope

Snowpack maintains every table in an allowlisted database by default. If your table's database is on the orchestrator allowlist, the table is already eligible and no per-table action is needed. Ask the Data Platform team whether your database is allowlisted, or request that it be added (a one-time change to the Helm values).

Until the database is allowlisted, you can still run maintenance manually (steps 4-5).

2. (Optional) Opt a table out

If you do not want a table maintained automatically, opt it out explicitly:

ALTER TABLE lakehouse_dev.my_database.my_table
SET TBLPROPERTIES ('snowpack.maintenance_enabled' = 'false');

To hard-exclude a table from all maintenance (for example, during a migration), set compaction_skip = 'true' instead.

3. Check table health

Verify that Snowpack can see your table and inspect its current health:

curl https://snowpack-api.internal/tables/my_database/my_table/health/cached

The response includes metrics like small file count, snapshot count, and whether the table needs_maintenance. Use the /health/live endpoint instead if you need real-time data directly from the catalog (slower, but always current).
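The same check can be scripted. The following is a minimal Python sketch rather than an official client: the helper names (health_url, fetch_health, needs_maintenance) are hypothetical, and it assumes the response is a JSON object with a boolean needs_maintenance field, as the description above suggests. It makes no assumption about the names of the other metrics.

```python
import json
import urllib.request

BASE_URL = "https://snowpack-api.internal"  # host taken from the curl example above


def health_url(database: str, table: str, live: bool = False) -> str:
    """Build the health URL; live=True selects /health/live instead of /health/cached."""
    mode = "live" if live else "cached"
    return f"{BASE_URL}/tables/{database}/{table}/health/{mode}"


def fetch_health(database: str, table: str, live: bool = False, timeout: float = 30.0) -> dict:
    """GET the health document and decode it as JSON (the response format is assumed)."""
    with urllib.request.urlopen(health_url(database, table, live), timeout=timeout) as resp:
        return json.load(resp)


def needs_maintenance(health: dict) -> bool:
    """Read the needs_maintenance flag; the exact field name and type are assumptions."""
    return bool(health.get("needs_maintenance", False))
```

For example, needs_maintenance(fetch_health("my_database", "my_table")) reads the cached view, and passing live=True switches to the slower, always-current endpoint.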

4. Submit a manual maintenance job

Trigger maintenance for specific actions:

curl -X POST https://snowpack-api.internal/tables/my_database/my_table/maintenance \
-H "Content-Type: application/json" \
-d '{"actions": ["rewrite_data_files", "expire_snapshots"]}'

The API returns 202 Accepted with a job ID:

{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "pending"
}

You can request any combination of the five maintenance actions. See Key Concepts for the full list and execution order.
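If you prefer to submit jobs from a script, the curl call above might translate to Python roughly as follows. This is a sketch under stated assumptions: build_maintenance_request and submit_maintenance are hypothetical helper names, and only the two action names shown in the example are used here.

```python
import json
import urllib.request
from typing import Sequence

BASE_URL = "https://snowpack-api.internal"  # host taken from the curl example above


def build_maintenance_request(database: str, table: str,
                              actions: Sequence[str]) -> urllib.request.Request:
    """Build the POST request for the maintenance endpoint without sending it."""
    body = json.dumps({"actions": list(actions)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/tables/{database}/{table}/maintenance",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_maintenance(database: str, table: str, actions: Sequence[str],
                       timeout: float = 30.0) -> str:
    """Send the request and return the job ID from the 202 Accepted response."""
    request = build_maintenance_request(database, table, actions)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        if resp.status != 202:
            raise RuntimeError(f"expected 202 Accepted, got {resp.status}")
        return json.load(resp)["job_id"]
```

Separating request construction from sending keeps the payload easy to inspect or log before anything is submitted.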

5. Monitor the job

Poll the job endpoint to track progress:

curl https://snowpack-api.internal/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890

The response includes the current status (pending, running, completed, failed, or cancelled) and details for each action.
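Polling by hand gets tedious for long jobs, so a small loop can help. The sketch below treats completed, failed, and cancelled as terminal and keeps polling while the job is pending or running. The helper names, the polling interval, and the poll limit are illustrative choices, and the code assumes the job response carries the same status field shown in the 202 response.

```python
import json
import time
import urllib.request

BASE_URL = "https://snowpack-api.internal"  # host taken from the curl example above
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def fetch_job(job_id: str) -> dict:
    """GET /jobs/<job_id> and decode the JSON body (the response format is assumed)."""
    with urllib.request.urlopen(f"{BASE_URL}/jobs/{job_id}", timeout=30.0) as resp:
        return json.load(resp)


def wait_for_job(job_id: str, fetch=fetch_job, interval: float = 10.0,
                 max_polls: int = 60, sleep=time.sleep) -> dict:
    """Poll until the job reaches a terminal status or max_polls is exhausted."""
    if max_polls < 1:
        raise ValueError("max_polls must be at least 1")
    for attempt in range(max_polls):
        job = fetch(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval)  # wait before the next poll, but not after the last one
    raise TimeoutError(f"job {job_id} still {job['status']} after {max_polls} polls")
```

The fetch and sleep parameters are injectable so the loop can be exercised without network access. A caller would typically check the returned status and raise or alert on failed or cancelled.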


Once your database is allowlisted, the orchestrator CronJob handles all of this automatically — it discovers eligible tables, checks their health, and submits maintenance jobs on a schedule.

For deeper coverage, see the Guides section.