Databricks
Databricks Jobs, DLT pipelines, SQL warehouses, workspace notebooks/files, secrets, Unity Catalog, permissions, DBSQL queries, and Git Repos as Swamp models. Compose Databricks pipelines with non-Databricks resources in Swamp workflows.
v0.11 (2026.05.30.11) - bug fix
Fix: create_or_update tombstone bug. Previously, after a delete, readResource returned the tombstoned data rather than null, so create_or_update took the PATCH/PUT/reset path against a workspace resource that no longer existed and failed with a 404. Affected: job, dlt_pipeline, sql_warehouse, secret_scope, uc_catalog, uc_schema, uc_volume.
Fix is workspace-first reconciliation: each create_or_update now checks the workspace via GET (or list for secret_scope) before deciding which path to take. New helper existsOnWorkspace in _lib/databricks.ts. Smoke-validated on Free: uc_schema and job delete-then-create_or_update both correctly take create path now.
Side benefit: also handles out-of-band workspace deletes (someone deletes the job via UI, then create_or_update via Swamp correctly recreates instead of failing).
No breaking changes from v0.10.
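The difference between the old data-layer-first decision and the workspace-first decision described above can be sketched as follows. This is a minimal illustration, not the extension's code: the release notes name `existsOnWorkspace` but do not show its signature, so here it is an injected, synchronous stand-in, and `StoredResource`, `choosePathDataLayerFirst`, and `choosePathWorkspaceFirst` are hypothetical names.

```typescript
// Hypothetical sketch of workspace-first reconciliation; not the extension's actual code.
type StoredResource = { id: string; tombstoned: boolean } | null;
type Path = "create" | "update";

// Behaviour before v0.11 as described in the notes: trust the Swamp data layer only.
// Tombstoned data is non-null, so a deleted resource still routes to "update".
function choosePathDataLayerFirst(stored: StoredResource): Path {
  return stored !== null ? "update" : "create";
}

// Behaviour from v0.11 as described: ask the workspace before choosing a path.
// `existsOnWorkspace` is injected (assumption) so the sketch stays self-contained;
// the real helper presumably performs an HTTP GET or list call.
function choosePathWorkspaceFirst(
  stored: StoredResource,
  existsOnWorkspace: (id: string) => boolean,
): Path {
  if (stored === null) return "create";
  return existsOnWorkspace(stored.id) ? "update" : "create";
}

// A deleted job: Swamp still holds tombstoned data, the workspace holds nothing.
const deletedJob: StoredResource = { id: "job-123", tombstoned: true };
const workspaceIds = new Set<string>();
const probe = (id: string): boolean => workspaceIds.has(id);

const oldPath = choosePathDataLayerFirst(deletedJob);
const newPath = choosePathWorkspaceFirst(deletedJob, probe);
console.log(oldPath, newPath);
```

The same check covers the out-of-band case: if someone removes the job in the UI, the probe reports it missing and the create path is taken.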
| Argument | Type | Description |
|---|---|---|
| name | string | |
| tasks | array | |
| job_clusters? | array | |
| schedule? | object | |
| tags? | record | |
| timeout_seconds? | number | |
| max_concurrent_runs? | number | |
| queue? | object |
| Argument | Type | Description |
|---|---|---|
| job_ref | string |
| Argument | Type | Description |
|---|---|---|
| job_ref | string |
| Argument | Type | Description |
|---|---|---|
| job_ref | string |
| Argument | Type | Description |
|---|---|---|
| job_ref | string | |
| job_parameters? | record | |
| notebook_params? | record | |
| idempotency_token? | string |
| Argument | Type | Description |
|---|---|---|
| run_id | number | |
| poll_seconds | number | |
| timeout_seconds | number |
| Argument | Type | Description |
|---|---|---|
| run_id | number |
| Argument | Type | Description |
|---|---|---|
| name | string | |
| tasks | array | |
| job_clusters? | array | |
| schedule? | object | |
| tags? | record | |
| timeout_seconds? | number | |
| max_concurrent_runs? | number | |
| queue? | object |
Resources
| Argument | Type | Description |
|---|---|---|
| path | string | |
| content | string | Raw notebook source text (NOT base64) |
| language | enum | |
| overwrite | boolean |
| Argument | Type | Description |
|---|---|---|
| path | string | |
| format | enum |
| Argument | Type | Description |
|---|---|---|
| path | string | |
| recursive | boolean |
Resources
| Argument | Type | Description |
|---|---|---|
| name | string | |
| storage? | string | DBFS/UC volume path for pipeline storage. Optional on Free/serverless. |
| configuration? | record | Spark conf passed into the pipeline runtime |
| catalog? | string | Unity Catalog target catalog (use with target schema) |
| target? | string | Default schema/database for pipeline outputs |
| libraries | array | Notebooks or files that define the pipeline |
| clusters? | array | |
| continuous? | boolean | true = streaming, false = triggered (default false) |
| development? | boolean | true = dev mode (no auto-restart on failure) |
| photon? | boolean | |
| edition? | enum | DLT pricing edition; ignored on Free/serverless |
| channel? | enum | |
| serverless? | boolean | Use serverless compute (required on Databricks Free) |
| Argument | Type | Description |
|---|---|---|
| pipeline_ref | string |
| Argument | Type | Description |
|---|---|---|
| pipeline_ref | string |
| Argument | Type | Description |
|---|---|---|
| pipeline_ref | string |
| Argument | Type | Description |
|---|---|---|
| pipeline_ref | string | |
| full_refresh | boolean | |
| full_refresh_selection? | array | Subset of tables to fully refresh |
| refresh_selection? | array | Subset of tables to incrementally refresh |
| cause? | string |
| Argument | Type | Description |
|---|---|---|
| pipeline_ref | string | |
| update_id | string | |
| poll_seconds | number | |
| timeout_seconds | number |
| Argument | Type | Description |
|---|---|---|
| pipeline_ref | string |
| Argument | Type | Description |
|---|---|---|
| name | string | |
| storage? | string | DBFS/UC volume path for pipeline storage. Optional on Free/serverless. |
| configuration? | record | Spark conf passed into the pipeline runtime |
| catalog? | string | Unity Catalog target catalog (use with target schema) |
| target? | string | Default schema/database for pipeline outputs |
| libraries | array | Notebooks or files that define the pipeline |
| clusters? | array | |
| continuous? | boolean | true = streaming, false = triggered (default false) |
| development? | boolean | true = dev mode (no auto-restart on failure) |
| photon? | boolean | |
| edition? | enum | DLT pricing edition; ignored on Free/serverless |
| channel? | enum | |
| serverless? | boolean | Use serverless compute (required on Databricks Free) |
Resources
| Argument | Type | Description |
|---|---|---|
| name | string | |
| min_num_clusters | number | |
| max_num_clusters | number | |
| auto_stop_mins | number | Minutes idle before auto-stop. 0 disables auto-stop. |
| enable_photon? | boolean | |
| enable_serverless_compute | boolean | Required on Databricks Free (serverless-only). |
| warehouse_type | enum | |
| spot_instance_policy? | enum | |
| channel? | object | |
| tags? | object |
| Argument | Type | Description |
|---|---|---|
| warehouse_ref | string |
| Argument | Type | Description |
|---|---|---|
| name | string | |
| warehouse_id | string |
| Argument | Type | Description |
|---|---|---|
| warehouse_ref | string |
| Argument | Type | Description |
|---|---|---|
| warehouse_ref | string |
| Argument | Type | Description |
|---|---|---|
| warehouse_ref | string |
| Argument | Type | Description |
|---|---|---|
| warehouse_ref | string |
| Argument | Type | Description |
|---|---|---|
| warehouse_ref | string | |
| statement | string | |
| catalog? | string | |
| schema? | string | |
| wait_timeout_seconds | number | 0 = async (returns statement_id, poll with wait_statement). 5-50 = sync wait. |
| on_wait_timeout | enum | |
| row_limit? | number |
| Argument | Type | Description |
|---|---|---|
| statement_id | string | |
| poll_seconds | number | |
| timeout_seconds | number |
| Argument | Type | Description |
|---|---|---|
| statement_id | string |
| Argument | Type | Description |
|---|---|---|
| name | string | |
| min_num_clusters | number | |
| max_num_clusters | number | |
| auto_stop_mins | number | Minutes idle before auto-stop. 0 disables auto-stop. |
| enable_photon? | boolean | |
| enable_serverless_compute | boolean | Required on Databricks Free (serverless-only). |
| warehouse_type | enum | |
| spot_instance_policy? | enum | |
| channel? | object | |
| tags? | object |
Resources
| Argument | Type | Description |
|---|---|---|
| path | string | |
| content | string | Raw file content (NOT base64). UTF-8 text only in v0.6. |
| overwrite | boolean |
| Argument | Type | Description |
|---|---|---|
| path | string |
| Argument | Type | Description |
|---|---|---|
| path | string |
Resources
| Argument | Type | Description |
|---|---|---|
| scope | string | |
| initial_manage_principal? | string | Principal granted MANAGE on the scope (e.g. 'users'). |
| scope_backend_type? | enum | DATABRICKS = workspace-managed (default). AZURE_KEYVAULT only on Azure. |
| Argument | Type | Description |
|---|---|---|
| scope | string | |
| initial_manage_principal? | string | Principal granted MANAGE on the scope (e.g. 'users'). |
| scope_backend_type? | enum | DATABRICKS = workspace-managed (default). AZURE_KEYVAULT only on Azure. |
| Argument | Type | Description |
|---|---|---|
| scope_ref | string |
Resources
| Argument | Type | Description |
|---|---|---|
| scope | string | |
| key | string | |
| string_value | string | The secret value. Pass via CEL vault.get to avoid surfacing the literal: |
| Argument | Type | Description |
|---|---|---|
| scope | string | |
| key | string |
| Argument | Type | Description |
|---|---|---|
| scope | string |
Resources
| Argument | Type | Description |
|---|---|---|
| name | string | |
| comment? | string | |
| properties? | record | |
| storage_root? | string | Managed-storage URI for the catalog (s3://, abfss://, gs://). |
| Argument | Type | Description |
|---|---|---|
| catalog_ref | string |
| Argument | Type | Description |
|---|---|---|
| catalog_ref | string | |
| new_name? | string | |
| comment? | string | |
| owner? | string | |
| properties? | record |
| Argument | Type | Description |
|---|---|---|
| catalog_ref | string | |
| force | boolean |
| Argument | Type | Description |
|---|---|---|
| name | string | |
| comment? | string | |
| properties? | record | |
| storage_root? | string | Managed-storage URI for the catalog (s3://, abfss://, gs://). |
Resources
| Argument | Type | Description |
|---|---|---|
| name | string | Schema name (NOT full_name) |
| catalog_name | string | Parent catalog (e.g. 'workspace' on Free) |
| comment? | string | |
| properties? | record | |
| storage_root? | string | External storage root (managed-storage UC volumes path) |
| Argument | Type | Description |
|---|---|---|
| schema_ref | string |
| Argument | Type | Description |
|---|---|---|
| schema_ref | string | |
| new_name? | string | |
| comment? | string | |
| owner? | string | |
| properties? | record |
| Argument | Type | Description |
|---|---|---|
| schema_ref | string | |
| force | boolean |
| Argument | Type | Description |
|---|---|---|
| catalog_name | string |
| Argument | Type | Description |
|---|---|---|
| name | string | Schema name (NOT full_name) |
| catalog_name | string | Parent catalog (e.g. 'workspace' on Free) |
| comment? | string | |
| properties? | record | |
| storage_root? | string | External storage root (managed-storage UC volumes path) |
Resources
| Argument | Type | Description |
|---|---|---|
| full_name | string |
| Argument | Type | Description |
|---|---|---|
| full_name | string |
| Argument | Type | Description |
|---|---|---|
| catalog_name | string | |
| schema_name | string | |
| max_results? | number |
Resources
| Argument | Type | Description |
|---|---|---|
| name | string | |
| catalog_name | string | |
| schema_name | string | |
| comment? | string | |
| storage_location? | string | Required for EXTERNAL volumes (cloud URI). |
| Argument | Type | Description |
|---|---|---|
| volume_ref | string |
| Argument | Type | Description |
|---|---|---|
| volume_ref | string | |
| new_name? | string | |
| comment? | string | |
| owner? | string |
| Argument | Type | Description |
|---|---|---|
| volume_ref | string |
| Argument | Type | Description |
|---|---|---|
| catalog_name | string | |
| schema_name | string | |
| max_results? | number |
| Argument | Type | Description |
|---|---|---|
| name | string | |
| catalog_name | string | |
| schema_name | string | |
| comment? | string | |
| storage_location? | string | Required for EXTERNAL volumes (cloud URI). |
Resources
| Argument | Type | Description |
|---|---|---|
| object_id | string |
| Argument | Type | Description |
|---|---|---|
| object_id | string | |
| access_control_list | array |
| Argument | Type | Description |
|---|---|---|
| object_id | string | |
| access_control_list | array |
| Argument | Type | Description |
|---|---|---|
| object_id | string | Sample object id; needed because levels are returned per-object |
Resources
| Argument | Type | Description |
|---|---|---|
| full_name | string | Securable identifier: '<catalog>' for catalog, '<catalog>.<schema>' for schema, '<catalog>.<schema>.<table>' for table/volume/function |
| Argument | Type | Description |
|---|---|---|
| full_name | string | Securable identifier: '<catalog>' for catalog, '<catalog>.<schema>' for schema, '<catalog>.<schema>.<table>' for table/volume/function |
| Argument | Type | Description |
|---|---|---|
| full_name | string | |
| changes | array |
Resources
| Argument | Type | Description |
|---|---|---|
| name | string | |
| query | string | SQL text |
| warehouse_id | string | SQL warehouse the query runs against (NOT the data_source_id) |
| description? | string | |
| parent? | string | Folder path in the workspace, e.g. 'folders/<id>' |
| run_as_role? | enum | |
| tags? | array |
| Argument | Type | Description |
|---|---|---|
| query_ref | string |
| Argument | Type | Description |
|---|---|---|
| query_ref | string | |
| name? | string | |
| query? | string | |
| warehouse_id? | string | |
| description? | string | |
| run_as_role? | enum | |
| tags? | array |
| Argument | Type | Description |
|---|---|---|
| query_ref | string |
| Argument | Type | Description |
|---|---|---|
| page_size? | number |
Resources
| Argument | Type | Description |
|---|---|---|
| name | string | Swamp-side handle. Workspace path becomes /Repos/<user>/<name> unless `path` is set. |
| url | string | Git repository URL |
| path? | string | Absolute workspace path. Defaults to /Repos/<current-user>/<name>. |
| branch? | string | Initial branch to check out. Defaults to remote default. |
| Argument | Type | Description |
|---|---|---|
| repo_ref | string |
| Argument | Type | Description |
|---|---|---|
| repo_ref | string | |
| branch? | string | New branch to check out. Triggers a pull. Mutually exclusive with `tag`. |
| tag? | string | Tag to check out. Mutually exclusive with `branch`. |
| Argument | Type | Description |
|---|---|---|
| repo_ref | string |
| Argument | Type | Description |
|---|---|---|
| repo_ref | string |
| Argument | Type | Description |
|---|---|---|
| path_prefix? | string | Filter by workspace path prefix (e.g. /Repos/me) |
| next_page_token? | string |
Resources
v0.10 (2026.05.30.10) - Phase 3 (query + repo)
Two new models close the last two critical gaps from the v0.7 review:
@mfbaig35r/databricks/query (create, read, update, delete, list) Manages DBSQL saved queries via /api/2.0/sql/queries. The query_id that this model returns pairs with the job model's sql_task.query.query_id field.
@mfbaig35r/databricks/repo (create, read, update, pull, delete, list) Manages Databricks Git Repos via /api/2.0/repos. Real Databricks jobs reference notebooks via repo paths rather than uploading to /Shared/.
pull re-sends the stored branch in the PATCH body (Databricks rejects empty-body PATCH).
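As a rough sketch of what re-sending the branch could look like, the function below builds the request for a pull. The `StoredRepo` shape, the `buildPullRequest` name, and the exact `/api/2.0/repos/<repo_id>` path are assumptions for illustration; only the idea of a PATCH carrying the stored branch comes from the note above.

```typescript
// Hypothetical sketch: build the PATCH request that a `pull` could send.
interface StoredRepo {
  repo_id: number; // assumed field name for the workspace repo id
  branch: string;  // branch recorded on the Swamp resource at create/update time
}

interface HttpRequest {
  method: "PATCH";
  path: string;
  body: string;
}

function buildPullRequest(repo: StoredRepo): HttpRequest {
  // An empty body is what the note says Databricks rejects, so refuse to build one.
  if (repo.branch.length === 0) {
    throw new Error("stored branch is required to pull");
  }
  return {
    method: "PATCH",
    path: `/api/2.0/repos/${repo.repo_id}`,
    body: JSON.stringify({ branch: repo.branch }),
  };
}

const pullRequest = buildPullRequest({ repo_id: 42, branch: "main" });
console.log(pullRequest.method, pullRequest.path, pullRequest.body);
```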
Smoke validated end-to-end on Databricks Free serverless:
- query: list (empty), create with SELECT 1 on Starter warehouse, read, list (1 visible), delete
- repo: create against github.com/mfbaig35r/swamp-databricks (public, no Git PAT setup required), read, pull (after fix), delete
No breaking changes from v0.9.
Total models: 15. End-to-end smoke validated since v0.1: job, notebook, dlt_pipeline, sql_warehouse, workspace_file, secret_scope, secret, uc_schema, uc_table, uc_volume, uc_catalog (list+schema-validated create), workspace_permissions, uc_permissions, query, repo. uc_table has no create (use sql_warehouse.run_query).
Added 2 models. updated labels
v0.9 (2026.05.30.9) - Phase 2 (permissions)
Phase 1 (uc_catalog + idempotency):
- New model: @mfbaig35r/databricks/uc_catalog (create, read, update, delete, list, create_or_update). Completes the UC top-down tree.
- New create_or_update method on: job, dlt_pipeline, sql_warehouse, secret_scope, uc_catalog, uc_schema, uc_volume. Reconcile semantics via Swamp data layer.
Phase 2 (permissions):
- New model: @mfbaig35r/databricks/workspace_permissions (get, set, update, list_levels). Workspace-level ACLs for jobs, pipelines, warehouses, notebooks, repos, queries, dashboards, alerts, experiments, registered-models, serving-endpoints, clusters, cluster-policies, instance-pools.
- New model: @mfbaig35r/databricks/uc_permissions (get, get_effective, update). UC grants on catalogs, schemas, tables, volumes, functions, external_locations, storage_credentials, models. Changes-style PATCH (add/remove per principal).
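To illustrate the changes-style PATCH mentioned for uc_permissions, here is a small sketch that turns a current and a desired grant map into per-principal add/remove entries. The `PermissionChange` shape and the `diffGrants` helper are hypothetical; the model's real argument schema for `changes` is not shown in these notes.

```typescript
// Hypothetical sketch: compute add/remove changes per principal.
interface PermissionChange {
  principal: string;
  add?: string[];
  remove?: string[];
}

type GrantMap = Record<string, string[]>;

function diffGrants(current: GrantMap, desired: GrantMap): PermissionChange[] {
  // Union of all principals, sorted so the output order is deterministic.
  const principals = Array.from(
    new Set(Object.keys(current).concat(Object.keys(desired))),
  ).sort();

  const changes: PermissionChange[] = [];
  for (const principal of principals) {
    const have = current[principal] || [];
    const want = desired[principal] || [];
    const add = want.filter((p) => have.indexOf(p) === -1).sort();
    const remove = have.filter((p) => want.indexOf(p) === -1).sort();
    if (add.length === 0 && remove.length === 0) continue; // nothing to change
    const change: PermissionChange = { principal };
    if (add.length > 0) change.add = add;
    if (remove.length > 0) change.remove = remove;
    changes.push(change);
  }
  return changes;
}

const grantChanges = diffGrants(
  { "account users": ["USE_CATALOG"], analysts: ["SELECT"] },
  { "account users": ["USE_CATALOG", "USE_SCHEMA"] },
);
console.log(JSON.stringify(grantChanges));
```

Sending only the difference keeps grants that are managed elsewhere untouched, which is the usual reason to prefer a changes-style PATCH over a full replace.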
Smoke validated end-to-end on Free serverless:
- uc_schema.create_or_update fresh create + same-call patch
- workspace_permissions.list_levels (5 levels for sql/warehouses), get on Starter Warehouse, update to grant CAN_VIEW on a job to `users` group
- uc_permissions.get on workspace catalog, update to grant USE_SCHEMA on a smoke schema to `account users`, verify
Tombstone caveat (Phase 1): create_or_update checks Swamp data layer not workspace. After delete, the resource is tombstoned and create_or_update will mistakenly take the update path. Use create() explicitly for delete-then-recreate flows. v0.9 candidate fix: workspace-first lookup.
uc_catalog.create requires storage_root on Free (no default metastore storage); ships as schema-validated only.
No breaking changes from v0.7.
Added 2 models. updated labels
v0.8 (2026.05.30.8) - Phase 1: uc_catalog + idempotency
New model: @mfbaig35r/databricks/uc_catalog (create, read, update, delete, list, create_or_update). Completes the UC top-down tree: catalog -> schema -> table/volume.
New create_or_update method on: job, dlt_pipeline, sql_warehouse, secret_scope, uc_catalog, uc_schema, uc_volume. Reconcile semantics: if a resource with args.name exists in Swamp's data layer, take the update path (PATCH/PUT/reset); otherwise create. Closes the "create errors on second run" gap that blocked real automation.
Tombstone caveat: create_or_update checks Swamp's data layer, not
the workspace. If you call delete and then create_or_update with
the same name, the second call will hit the patch path against a
workspace resource that no longer exists and fail. For now, delete +
recreate flows should use create explicitly. v0.9 may add a
workspace-first lookup variant.
notebook.upload and workspace_file.upload already have idempotency via
overwrite: true. secret.put is already idempotent (Databricks
replaces on put). uc_table has no create endpoint so no
create_or_update.
Smoke validated on Free serverless: catalog list (3 catalogs visible), schema create_or_update fresh-call (create path) + same-call-again (patch path), delete cleanup. uc_catalog.create requires storage_root on Free (no default metastore storage available); ships as schema-validated.
No breaking changes from v0.7.
Added 1, modified 6 models
v0.7 (2026.05.30.7)
Five new models, ten total.
Secrets (workspace-level, distinct from Swamp vault):
- @mfbaig35r/databricks/secret_scope: create, list, delete.
- @mfbaig35r/databricks/secret: put, delete, list (keys only, never values). Secret values pass through to Databricks and are NEVER persisted in Swamp's data layer. Pass values via CEL ${{ vault.get(...) }}.
Unity Catalog:
- @mfbaig35r/databricks/uc_schema: create, read, update, delete, list.
- @mfbaig35r/databricks/uc_table: read, delete, list. Tables are NOT created via this API; use sql_warehouse.run_query or a job notebook task to CREATE TABLE, then this model captures the snapshot.
- @mfbaig35r/databricks/uc_volume: create, read, update, delete, list.
No breaking changes from v0.6.
Added 5 models. updated labels
v0.6 (2026.05.30.6)
New model: @mfbaig35r/databricks/workspace_file. Owns upload, read (export), and delete for workspace files (FILE object type). Distinct from @mfbaig35r/databricks/notebook (which owns NOTEBOOK objects). Use this when a downstream task references a plain file at a workspace path: sql_task.file, spark_python_task.python_file, dbt project files, etc.
sql_task end-to-end validated. Closes the v0.5 gap: workspace_file upload -> job with sql_task.file -> run -> wait_run COMPLETED + SUCCESS on Databricks Free serverless.
upload uses POST /api/2.0/workspace/import with format=AUTO and no language, then calls /api/2.0/workspace/get-status to verify and record the resulting object_type on the resource. Modern workspaces produce FILE; older workspaces may produce NOTEBOOK depending on content, and the resource reflects what was created.
No breaking changes from v0.5.
Added 1 model
v0.5 (2026.05.30.5)
Expanded job task-type schema from 3 types to 10. Added: spark_python_task, spark_jar_task, python_wheel_task, dbt_task, run_job_task, condition_task, for_each_task (Zod recursive via z.lazy). sql_task now also covers dashboard and alert variants alongside query and file. Workflows that use any of these no longer reject at the schema layer.
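The recursive part of that schema is the for_each_task, whose nested task is itself a task. The sketch below shows the same idea without Zod, as a dependency-free stand-in: a hand-written validator that calls itself on the nested task, which is the role z.lazy plays in a Zod schema. The field names checked here (`task_key`, `for_each_task.inputs`, `for_each_task.task`) are assumptions about the task shape, and `validateTask` is a hypothetical helper, not part of the extension.

```typescript
// Hypothetical sketch: recursive validation of a task that may nest another task.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function validateTask(value: unknown, path: string = "task"): string[] {
  if (!isRecord(value)) return [`${path}: expected an object`];

  const issues: string[] = [];
  if (typeof value["task_key"] !== "string") {
    issues.push(`${path}.task_key: expected a string`);
  }

  const forEach = value["for_each_task"];
  if (forEach !== undefined) {
    if (!isRecord(forEach)) {
      issues.push(`${path}.for_each_task: expected an object`);
    } else {
      if (typeof forEach["inputs"] !== "string") {
        issues.push(`${path}.for_each_task.inputs: expected a string`);
      }
      // Recursion: the nested task is checked by the same validator.
      const nested = validateTask(forEach["task"], `${path}.for_each_task.task`);
      for (const issue of nested) issues.push(issue);
    }
  }
  return issues;
}

const validLoop = {
  task_key: "loop",
  for_each_task: {
    inputs: "[1, 2, 3]",
    task: { task_key: "inner", notebook_task: { notebook_path: "/Shared/example" } },
  },
};
const brokenLoop = {
  task_key: "loop",
  for_each_task: { inputs: "[1]", task: { notebook_task: {} } },
};

console.log(validateTask(validLoop));
console.log(validateTask(brokenLoop));
```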
End-to-end smoke validation in this release covers notebook_task only (matches what the Databricks Free smoke environment can reasonably exercise). Other task types are schema-validated; the Databricks API accepts the schemas at job-create time. If you hit an edge case on a real workload, open an issue at github.com/mfbaig35r/swamp-databricks.
Not yet covered: spark_submit_task, clean_rooms_notebook_task. Add in a later release if there is real demand.
No breaking changes from v0.4.
v0.4 (2026.05.30.4)
New model: @mfbaig35r/databricks/sql_warehouse. Lifecycle (create, read, update, delete, start, stop) plus SQL Statement Execution (run_query, wait_statement, cancel_statement). run_query waits synchronously up to 50 seconds; longer-running statements get a last_statement resource so wait_statement can take over.
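The hand-off between run_query and wait_statement can be sketched as two small pieces: deciding whether a call is async or a bounded synchronous wait, and polling until the statement reaches a terminal state. Everything below is a simplified illustration under assumptions: `resolveWaitMode` and `pollUntilTerminal` are hypothetical names, the state list reflects the Databricks statement states as I understand them, and time is simulated rather than slept so the example runs instantly.

```typescript
// Hypothetical sketch of the run_query / wait_statement hand-off.
type WaitMode = { kind: "async" } | { kind: "sync"; seconds: number };

// 0 means return immediately; 5-50 means wait synchronously for that many seconds.
function resolveWaitMode(waitTimeoutSeconds: number): WaitMode {
  if (waitTimeoutSeconds === 0) return { kind: "async" };
  if (waitTimeoutSeconds >= 5 && waitTimeoutSeconds <= 50) {
    return { kind: "sync", seconds: waitTimeoutSeconds };
  }
  throw new RangeError("wait_timeout_seconds must be 0 or between 5 and 50");
}

type StatementState = "PENDING" | "RUNNING" | "SUCCEEDED" | "FAILED" | "CANCELED" | "CLOSED";
const TERMINAL_STATES: StatementState[] = ["SUCCEEDED", "FAILED", "CANCELED", "CLOSED"];

interface PollResult {
  state: StatementState;
  waitedSeconds: number;
  timedOut: boolean;
}

// Polls `getState` every `pollSeconds` until terminal or until the budget is used up.
function pollUntilTerminal(
  getState: () => StatementState,
  pollSeconds: number,
  timeoutSeconds: number,
): PollResult {
  let waited = 0;
  let state = getState();
  while (TERMINAL_STATES.indexOf(state) === -1 && waited + pollSeconds <= timeoutSeconds) {
    waited += pollSeconds; // a real implementation would sleep here
    state = getState();
  }
  return { state, waitedSeconds: waited, timedOut: TERMINAL_STATES.indexOf(state) === -1 };
}

// Simulated statement that succeeds on the fourth status check.
const script: StatementState[] = ["PENDING", "RUNNING", "RUNNING", "SUCCEEDED"];
let cursor = 0;
const nextState = (): StatementState => script[Math.min(cursor++, script.length - 1)];

const pollResult = pollUntilTerminal(nextState, 2, 60);
console.log(resolveWaitMode(0).kind, pollResult.state, pollResult.waitedSeconds);
```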
sql_warehouse.adopt method: register an existing workspace warehouse as a Swamp resource without creating a new one. Required pattern on Databricks Free where warehouse quotas are small and the workspace ships with a Starter Warehouse already.
This closes the DLT cleanup gap from v0.3: run_query can execute the DROP TABLE for tables materialized by a deleted pipeline.
No breaking changes from v0.3.
Added 1 model. updated labels
v0.3 (2026.05.30.3)
DLT pipeline model validated end-to-end on Databricks Free serverless. The "preview" caveat from v0.2 is removed.
Schema fix: PipelineSettings now requires `catalog` when `serverless: true` via a Zod .refine(). Surfaces the Databricks API constraint at validation time instead of as a 400 INVALID_PARAMETER_VALUE on create.
README clarifies that DELETE /api/2.0/pipelines/{id} does NOT drop the Delta tables a pipeline materialized. After deleting a pipeline, tables in the target schema persist; remove them with a DROP TABLE in a SQL warehouse or notebook. A future @mfbaig35r/databricks/sql_warehouse model will make this a Swamp-native operation.
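The serverless/catalog rule from the schema fix is a cross-field constraint, which is what a refinement expresses. The sketch below shows the equivalent check in plain TypeScript as a dependency-free stand-in for the Zod .refine() the notes describe; `PipelineSettingsSketch` and `validatePipelineSettings` are hypothetical names and cover only the fields needed for the illustration.

```typescript
// Hypothetical sketch: cross-field check equivalent to the described refinement.
interface PipelineSettingsSketch {
  name: string;
  serverless?: boolean;
  catalog?: string;
  target?: string;
}

// Returns a list of issues; an empty list means the settings pass this check.
function validatePipelineSettings(settings: PipelineSettingsSketch): string[] {
  const issues: string[] = [];
  const hasCatalog = settings.catalog !== undefined && settings.catalog.length > 0;
  if (settings.serverless === true && !hasCatalog) {
    issues.push("catalog is required when serverless is true");
  }
  return issues;
}

const missingCatalog = validatePipelineSettings({ name: "smoke", serverless: true });
const withCatalog = validatePipelineSettings({
  name: "smoke",
  serverless: true,
  catalog: "workspace",
  target: "default",
});
console.log(missingCatalog, withCatalog);
```

Failing at validation time means the workflow stops before any API call is made, rather than after the workspace returns a 400.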
No breaking changes from v0.2.
v0.2 (2026.05.30.2)
New model: @mfbaig35r/databricks/notebook. Owns upload, read (export), and delete for workspace notebooks. The upload_notebook / delete_notebook methods and 'notebook' resource have moved off the job model into this dedicated model. Workflows that called those methods on the job model in v0.1 must update their `model:` references to the new model.
New model: @mfbaig35r/databricks/dlt_pipeline. Lifecycle for Delta Live Tables pipelines: create, read, update (full replace via PUT), delete, start_update (POST /pipelines/{id}/updates), wait_update (poll until COMPLETED / FAILED / CANCELED), stop. Libraries support notebook or file references. On Databricks Free serverless, set `serverless: true` on create. Note: DLT pipeline behavior on Databricks Free has not been validated end-to-end yet; treat as preview until smoke-tested.
Refactor: workspace auth + bearer fetch + global args schema extracted into extensions/models/_lib/databricks.ts. All three models share one auth surface.
Breaking from v0.1: upload_notebook / delete_notebook removed from @mfbaig35r/databricks/job. Workflows that used those must switch to @mfbaig35r/databricks/notebook upload / delete.
Added 2, modified 1 models. updated labels
Initial release. Adds the @mfbaig35r/databricks/job model with create, read, update (full reset), delete, run, wait_run, and cancel_run methods. Auth via PAT (resolved through CEL: vault.get) or OAuth M2M client_credentials. Azure MSI is stubbed.
v1 task types validated by Zod: notebook_task, sql_task, pipeline_task. Other Databricks task types (spark_python_task, python_wheel_task, dbt_task, run_job_task, for_each_task, condition_task, spark_jar_task) will reject at the schema layer until added in subsequent releases.
Preview surface (will move): upload_notebook and delete_notebook methods, and the 'notebook' resource, are convenience APIs for smoke-testing the run/wait_run loop end-to-end. They will split into a dedicated @mfbaig35r/databricks/notebook model in v0.2. Workflows that call them by name will need to update the model reference at that point.
Validated end-to-end on Databricks Free (AWS serverless): upload_notebook -> create -> run -> wait_run (TERMINATED+SUCCESS) -> delete -> delete_notebook, zero orphan workspace state.
- Has README or module doc: 2/2 earned
- README has a code example: 1/1 earned
- README is substantive: 1/1 earned
- Most symbols documented: 1/1 earned
- No slow types: 1/1 earned
- Dependencies pass trust audit: 2/2 earned
- Has description: 1/1 earned
- Platform support declared (or universal): 2/2 earned
- License declared: 1/1 earned
- Verified public repository: 2/2 earned