
DATA OUTPUTS

Data outputs are the versioned artifacts produced when model methods execute. Each method execution can write structured data (resources) and unstructured content (files). Data outputs are stored in .swamp/data/ within the repository, organized by model type, model ID, and data name.

Output Types

Model types declare their output specifications using two categories: resources and files. Each declared spec has a name (the spec name) that identifies it within the model type.

Resource Outputs

Structured JSON data validated against a schema.

Field Type Required Default Description
description string No None Human-readable description
schema Zod schema Yes Validates data on write
lifetime Lifetime Yes Retention policy
garbageCollection GarbageCollectionPolicy Yes Version retention policy
tags Record<string, string> No {} Default tags (auto-includes type: "resource")
sensitiveOutput boolean No false Treat all fields as sensitive
vaultName string No First available vault Vault for storing sensitive field values

Resource content is always stored as application/json.

File Outputs

Binary or text content identified by MIME type.

Field Type Required Default Description
description string No None Human-readable description
contentType string Yes MIME type (e.g., text/plain)
lifetime Lifetime Yes Retention policy
garbageCollection GarbageCollectionPolicy Yes Version retention policy
streaming boolean No false Line-oriented append mode
tags Record<string, string> No {} Default tags (auto-includes type: "file")
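The two spec shapes above can be sketched as TypeScript interfaces (illustrative names only, not swamp's actual internal types):

```typescript
// Hypothetical type sketches of the two output spec shapes described above.
// Field names and optionality follow the tables; the names are illustrative.
type Lifetime = string; // "ephemeral" | "infinite" | "job" | "workflow" | a duration like "7d"
type GarbageCollectionPolicy = number | string; // keep-N integer or a duration string

interface ResourceOutputSpec {
  description?: string;
  schema: unknown;                    // a Zod schema in practice
  lifetime: Lifetime;
  garbageCollection: GarbageCollectionPolicy;
  tags?: Record<string, string>;      // auto-includes type: "resource"
  sensitiveOutput?: boolean;          // default false
  vaultName?: string;                 // default: first available vault
}

interface FileOutputSpec {
  description?: string;
  contentType: string;                // MIME type, e.g. "text/plain"
  lifetime: Lifetime;
  garbageCollection: GarbageCollectionPolicy;
  streaming?: boolean;                // line-oriented append mode, default false
  tags?: Record<string, string>;      // auto-includes type: "file"
}

// A spec resembling the command/shell log output shown below:
const logSpec: FileOutputSpec = {
  description: "Shell command output",
  contentType: "text/plain",
  lifetime: "infinite",
  garbageCollection: 10,
  streaming: true,
};
```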

Example: command/shell Model Type

The built-in command/shell model type declares one resource and one file output:

resources:
  result        [resource]  — Shell command execution result (infinite)
  
files:
  log           [file]      — Shell command output, text/plain (infinite, streaming)

After execution, both are visible via swamp data list, along with auto-generated method summary reports:

Data for hello-world (command/shell)

file (1 item):
  log  v2  text/plain  19B  2026-04-07

resource (1 item):
  result  v1  application/json  135B  2026-04-07

report (2 items):
  report-swamp-method-summary  v2  text/markdown  482B  2026-04-07
  report-swamp-method-summary-json  v2  application/json  2.6KB  2026-04-07

Lifetime

Lifetime determines how long data is retained before it becomes eligible for garbage collection.

Value Description
Duration string Time-based retention (e.g., 1h, 5m, 10d, 2w, 1mo, 10y)
ephemeral Deleted when the process ends
infinite Never automatically deleted
job Lives until the job completes
workflow Lives until the workflow completes

Duration format: {number}{unit} where unit is h (hours), m (minutes), d (days), w (weeks), mo (months), or y (years).

Zero-duration strings (e.g., 0h, 0d) are normalized to workflow.

Duration Conversion

Unit Conversion
m value × 60,000 ms
h value × 3,600,000 ms
d value × 86,400,000 ms
w value × 604,800,000 ms
mo value × 2,592,000,000 ms (30 days)
y value × 31,536,000,000 ms (365 days)
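The duration grammar, conversion table, and zero-duration normalization can be sketched as a small parser (a hedged illustration, not swamp's implementation):

```typescript
// Unit-to-milliseconds table from the conversion rules above.
const UNIT_MS: Record<string, number> = {
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
  w: 604_800_000,
  mo: 2_592_000_000,     // 30 days
  y: 31_536_000_000,     // 365 days
};

// Parses "{number}{unit}" (e.g. "5m", "2w", "1mo").
// Zero-duration strings normalize to the "workflow" lifetime.
function parseLifetimeDuration(input: string): number | "workflow" {
  const match = /^(\d+)(mo|[mhdwy])$/.exec(input);
  if (!match) throw new Error(`invalid duration: ${input}`);
  const value = Number(match[1]);
  if (value === 0) return "workflow";
  return value * UNIT_MS[match[2]];
}
```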

Expiration Rules

  • infinite: Never expires.
  • ephemeral: Not yet implemented — treated as non-expiring.
  • job / workflow: Expires when the associated workflow run no longer exists. Requires workflowId and workflowRunId in the owner definition. If either is missing, the data is not expired.
  • Duration strings: Expires when createdAt + duration is in the past.
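Taken together, the expiration rules above can be sketched as follows. Here runExists stands in for the real check that the owning workflow run still exists; all names are illustrative, not swamp's internals:

```typescript
const LIFETIME_UNIT_MS: Record<string, number> = {
  m: 60_000, h: 3_600_000, d: 86_400_000,
  w: 604_800_000, mo: 2_592_000_000, y: 31_536_000_000,
};

function lifetimeDurationMs(s: string): number {
  const m = /^(\d+)(mo|[mhdwy])$/.exec(s);
  if (!m) throw new Error(`invalid duration: ${s}`);
  return Number(m[1]) * LIFETIME_UNIT_MS[m[2]];
}

function isExpired(
  lifetime: string,
  createdAt: string,
  nowMs: number,
  owner: { workflowId?: string; workflowRunId?: string },
  runExists: (workflowId: string, runId: string) => boolean,
): boolean {
  // infinite never expires; ephemeral is not yet implemented, so treated the same.
  if (lifetime === "infinite" || lifetime === "ephemeral") return false;
  if (lifetime === "job" || lifetime === "workflow") {
    // Both IDs are required; if either is missing, the data is not expired.
    if (!owner.workflowId || !owner.workflowRunId) return false;
    return !runExists(owner.workflowId, owner.workflowRunId);
  }
  // Duration string: expired once createdAt + duration is in the past.
  return Date.parse(createdAt) + lifetimeDurationMs(lifetime) < nowMs;
}
```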

Garbage Collection Policy

Garbage collection controls how many versions of a data item are retained.

Value Description
integer Keep the N most recent versions
Duration string Keep versions created within the duration

The integer must be a positive integer. The duration string uses the same format as lifetime durations and must be greater than zero.

# Keep the 10 most recent versions
garbageCollection: 10

# Keep versions from the last 7 days
garbageCollection: 7d

Garbage collection runs via swamp data gc and as part of the lifecycle service. It operates in two phases:

  1. Expired data deletion — removes all versions of data items whose lifetime has elapsed.
  2. Version pruning — for non-expired data, removes old versions that exceed the garbage collection policy.
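The version-pruning phase can be sketched like this (illustrative only, not swamp's code):

```typescript
// A keep-N integer policy keeps the N most recent versions; a duration policy
// keeps versions created within the window. Everything else is prunable.
interface VersionInfo { version: number; createdAt: string; }

const GC_UNIT_MS: Record<string, number> = {
  m: 60_000, h: 3_600_000, d: 86_400_000,
  w: 604_800_000, mo: 2_592_000_000, y: 31_536_000_000,
};

function gcDurationMs(s: string): number {
  const m = /^(\d+)(mo|[mhdwy])$/.exec(s);
  if (!m) throw new Error(`invalid duration: ${s}`);
  return Number(m[1]) * GC_UNIT_MS[m[2]];
}

function versionsToPrune(
  versions: VersionInfo[],
  policy: number | string,
  nowMs: number,
): VersionInfo[] {
  const newestFirst = [...versions].sort((a, b) => b.version - a.version);
  if (typeof policy === "number") {
    return newestFirst.slice(policy); // everything beyond the N most recent
  }
  const windowMs = gcDurationMs(policy);
  return newestFirst.filter(v => Date.parse(v.createdAt) + windowMs < nowMs);
}
```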

Versioning

Each data item is versioned with sequential positive integers starting at 1. Every method execution that writes to the same data name produces a new version.

$ swamp data versions hello-world result --json
{
  "dataName": "result",
  "modelName": "hello-world",
  "modelType": "command/shell",
  "versions": [
    {
      "version": 2,
      "createdAt": "2026-04-07T18:03:08.737Z",
      "size": 135,
      "checksum": "d58d16...",
      "isLatest": true
    },
    {
      "version": 1,
      "createdAt": "2026-04-07T18:02:58.146Z",
      "size": 157,
      "checksum": "c631b6...",
      "isLatest": false
    }
  ],
  "total": 2
}

The "latest" Pointer

Each data item has a latest file in its directory containing the current version number as plain text. When data is retrieved without an explicit --version flag, the latest version is returned.

.swamp/data/command/shell/{model-id}/result/
  1/
  2/
  latest          # Contains: "2"

The name latest is reserved — it cannot be used as a data name.

Checksums

Each version includes a SHA-256 checksum computed from the content file at finalization time. The checksum is stored in metadata.yaml and returned by data access commands.
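A SHA-256 checksum of this kind can be computed with Node's built-in crypto module (a minimal sketch, not necessarily how swamp computes it):

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of the raw content, as stored in metadata.yaml.
function contentChecksum(content: Buffer | string): string {
  return createHash("sha256").update(content).digest("hex");
}
```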


Storage Layout

Data is stored on disk under .swamp/data/ in a hierarchical directory structure:

.swamp/data/
  {model-type-path}/
    {model-id}/
      {data-name}/
        1/
          metadata.yaml
          raw
        2/
          metadata.yaml
          raw
        latest

  • {model-type-path} — the model type as a directory path (e.g., command/shell).
  • {model-id} — the UUID of the model definition.
  • {data-name} — the instance name given when data is written.
  • {version}/ — a numbered directory for each version.
  • metadata.yaml — full metadata for the version.
  • raw — the content (JSON for resources, binary or text for files).
  • latest — text file containing the current version number.

Metadata File

Each version's metadata.yaml contains the complete data record:

id: e204ea55-3d64-48a0-aa78-32fea656fdac
name: result
version: 1
contentType: application/json
lifetime: infinite
garbageCollection: 10
streaming: false
tags:
  type: resource
  specName: result
  modelName: hello-world
ownerDefinition:
  ownerType: model-method
  ownerRef: 7347cf2c-cc9e-4203-8897-e10845af9732
createdAt: "2026-04-07T18:02:58.146Z"
size: 157
checksum: c631b676cd069af1decf4f20c27568f44bcccf062846bb32bbeae187573c2fe6

Tags

Tags are key-value string pairs attached to data. They are used for filtering, discovery, and categorization.

Tag Resolution Chain

Tags are resolved in order, with later steps overriding earlier ones:

  1. Type auto-tag — type: "resource" or type: "file" (always present).
  2. Definition tags — tags from the model definition.
  3. Spec defaults — tags declared on the output specification.
  4. Method write overrides — tags passed by the method when writing.
  5. specName auto-tag — the output spec name (always injected).
  6. modelName auto-tag — the definition name (always injected).
  7. Workflow tag overrides — tags from workflow step context.
  8. Runtime tags — tags provided via --tag CLI flags.
  9. Data output overrides — tags from workflow dataOutputOverrides.

The type tag is required on all data. Data without a type tag fails validation.
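The chain amounts to a left-to-right object merge where later layers win. A minimal sketch:

```typescript
// Merge tag layers in resolution order; later layers override earlier ones.
// The final result must carry a type tag, per the validation rule above.
function resolveTags(layers: Array<Record<string, string> | undefined>): Record<string, string> {
  const merged = layers.reduce<Record<string, string>>(
    (acc, layer) => ({ ...acc, ...(layer ?? {}) }),
    {},
  );
  if (!merged.type) throw new Error("data must carry a type tag");
  return merged;
}

// Illustrative layers (a subset of the nine steps above):
const tags = resolveTags([
  { type: "resource" },   // 1. type auto-tag
  { team: "infra" },      // 2. definition tags
  { env: "dev" },         // 3. spec defaults
  { env: "prod" },        // 4. method write overrides (wins over the spec default)
]);
```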

Common Tag Values

Tag Key Auto-Injected Description
type Yes resource, file, or report
specName Yes Output spec name from the model type
modelName Yes Definition name for orphan data recovery

Streaming

When streaming: true is set on a file output spec, the data writer operates in line-oriented append mode. Lines are written incrementally to disk as they are produced, rather than buffered in memory.

The command/shell model type declares streaming: true on its log file output to capture stdout and stderr as the command executes.

Streaming file writers support three write patterns:

  • writeLine(line) — appends a single line with a newline character.
  • writeStream(stream) — pipes a ReadableStream to disk, invoking optional line callbacks as newlines are encountered.
  • getFilePath() — returns the allocated content path for direct file writes.

Non-streaming outputs use writeAll(content) or writeText(text) to write the complete content at once.


Sensitive Output

Resource output specs can mark fields as sensitive. Sensitive values are stored in a vault and replaced with vault.get() reference expressions before the data is persisted to disk.

Field-Level Sensitivity

Individual fields are marked sensitive through Zod schema metadata:

schema: z.object({
  apiKey: z.string().meta({ sensitive: true }),
  publicId: z.string(),
});

Only apiKey is stored in the vault. publicId is persisted as-is.

Whole-Output Sensitivity

When sensitiveOutput: true is set on the resource spec, all top-level fields are treated as sensitive.

Vault Resolution Order

The vault used for storing sensitive fields is resolved in this order:

  1. Field-level vaultName from schema metadata
  2. Spec-level vaultName from the resource output specification
  3. First available vault from the vault service

If sensitive fields exist but no vault is configured, an error is thrown.

Vault Key Format

Auto-generated vault keys follow this pattern:

{sanitized-model-type}-{model-id}-{method-name}-{field-path}

Sanitization: @ and null bytes are removed, / and \ are replaced with -, .. is replaced with ..
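A minimal sketch of the key construction, implementing the @/null-byte removal and path-separator replacement described above (the helper names are illustrative):

```typescript
// Remove @ and null bytes; replace / and \ with -.
function sanitizeKeyPart(part: string): string {
  return part.replace(/[@\0]/g, "").replace(/[/\\]/g, "-");
}

// Compose the documented pattern:
// {sanitized-model-type}-{model-id}-{method-name}-{field-path}
function vaultKey(modelType: string, modelId: string, method: string, fieldPath: string): string {
  return [sanitizeKeyPart(modelType), modelId, method, fieldPath].join("-");
}
```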

Persisted Format

After processing, persisted resource data contains vault references:

apiKey: "${{ vault.get('my-secrets', 'command-shell-abc-execute-apiKey') }}"
publicId: "pk_12345"

On read, vault references are automatically resolved back to their original values. Resolved secrets are registered with the secret redactor to prevent log leakage.

Non-string sensitive values are JSON-stringified before vault storage.
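The field replacement can be sketched as follows (illustrative names; the actual vault write is omitted):

```typescript
// Replace sensitive top-level fields with vault.get() reference expressions;
// non-sensitive fields pass through unchanged. keyFor stands in for the
// auto-generated vault key construction.
function toPersistedForm(
  data: Record<string, unknown>,
  sensitiveFields: Set<string>,
  vaultName: string,
  keyFor: (field: string) => string,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [field, value] of Object.entries(data)) {
    if (sensitiveFields.has(field)) {
      // The real value goes to the vault (JSON-stringified if non-string);
      // only the reference expression is persisted to disk.
      out[field] = `\${{ vault.get('${vaultName}', '${keyFor(field)}') }}`;
    } else {
      out[field] = value;
    }
  }
  return out;
}
```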


Lifecycle States

Each data entry has a lifecycle state.

State Description
active Normal, live data (default)
deleted Tombstone marker — the data has been deleted or renamed

Deletion Markers

A deletion marker is a version with lifecycle: "deleted", application/json content type, and streaming: false. It signals that the data was intentionally removed.

Rename Markers

A rename marker is a deletion marker with an additional renamedTo field pointing to the new data name. When the latest version of a data item is a rename marker, lookups without an explicit version follow the forward reference to the new name (up to 5 levels deep).

$ swamp data rename hello-world result execution-result --json
{
  "oldName": "result",
  "newName": "execution-result",
  "modelId": "7347cf2c-cc9e-4203-8897-e10845af9732",
  "modelName": "hello-world",
  "modelType": "command/shell",
  "copiedVersion": 2,
  "newVersion": 1,
  "warning": "Any workflows or models that produce data under \"result\" will overwrite the forward reference. Update them to use \"execution-result\" instead."
}

The rename process:

  1. Copies the latest version of the old data name to version 1 under the new name.
  2. Writes a tombstone with a forward reference on the old name.
  3. Updates the latest marker on the old name to point to the tombstone.
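Forward-reference resolution can be sketched as a bounded loop over rename markers (illustrative types, not swamp's internals):

```typescript
// The latest version of a data item, reduced to the fields that matter here.
interface LatestVersion { lifecycle: "active" | "deleted"; renamedTo?: string; }

// Follow renamedTo pointers up to 5 levels deep, as described above.
function resolveDataName(
  name: string,
  latestOf: (name: string) => LatestVersion | undefined,
  maxDepth = 5,
): string {
  let current = name;
  for (let depth = 0; depth < maxDepth; depth++) {
    const latest = latestOf(current);
    if (!latest?.renamedTo) return current; // active data, or no rename marker
    current = latest.renamedTo;             // follow the forward reference
  }
  return current;
}
```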

Owner Definition

Every data entry tracks its owner — who created it.

Field Type Required Description
ownerType string Yes model-method, workflow-step, or manual
ownerRef string Yes Identifier of the owner (model ID, step ref)
workflowId string No Workflow UUID (for job/workflow lifetimes)
workflowRunId string No Workflow run UUID

Ownership is validated on write — new versions of an existing data name must have the same ownerType and ownerRef as the original.


Data Output Overrides

Workflow steps can override the default output spec settings for data produced by their tasks. See dataOutputOverrides in the workflows reference.

Field Type Description
specName string Output spec name to override
lifetime Lifetime Override retention policy
garbageCollection GarbageCollectionPolicy Override version retention
tags Record<string, string> Additional tags merged with output tags
vary string[] Input key names to vary by (composite data names)

When vary is set, the resolved values of the named input keys are appended as a suffix to the data instance name. This produces distinct data items per iteration in forEach steps.
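One plausible sketch of the composite-name construction, assuming the resolved vary values are joined onto the base name with hyphens (the exact suffix format is an assumption here, not documented above):

```typescript
// Hypothetical: append resolved vary values as a suffix so each forEach
// iteration writes a distinct data item.
function composeDataName(baseName: string, varyValues: string[]): string {
  if (varyValues.length === 0) return baseName;
  return `${baseName}-${varyValues.join("-")}`;
}
```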


CEL Access

Data outputs are accessible in CEL expressions through the data namespace:

Function Description
data.latest(modelName, dataName) Latest version of a data item
data.latest(modelName, dataName, varyValues[]) Latest version with vary suffix
data.version(modelName, dataName, version) Specific version
data.version(modelName, dataName, varyValues[], version) Specific version with vary suffix
data.listVersions(modelName, dataName) All version numbers
data.listVersions(modelName, dataName, varyValues[]) All version numbers with vary suffix
data.findByTag(tagKey, tagValue) Search by tag
data.findBySpec(modelName, specName) Find by output spec name
data.query(predicate, select?) CEL predicate query

These functions return DataRecord objects with the following fields:

Field Type Description
id string Data UUID
name string Data instance name
version number Version number
createdAt string ISO 8601 timestamp
attributes Record<string, unknown> Parsed JSON content (resources only)
tags Record<string, string> All tags
modelName string Model definition name
modelType string Model type path
specName string Output spec name
dataType string resource or file
contentType string MIME type
lifetime string Lifetime policy
ownerType string Owner type
streaming boolean Whether streaming is enabled
size number Content size in bytes
content string Raw content string

CLI Commands

All data commands accept --json to output structured JSON instead of human-readable text.

swamp data get <model> <data_name>

Retrieve data by model name and data name. Returns the latest version by default.

Option Description
--version Retrieve a specific version number
--workflow Get data produced by a workflow
--run Specific workflow run ID
--no-content Show metadata only, without content
--repo-dir Repository directory (default .)

$ swamp data get hello-world result --json
{
  "id": "e204ea55-3d64-48a0-aa78-32fea656fdac",
  "name": "result",
  "modelName": "hello-world",
  "modelType": "command/shell",
  "version": 1,
  "contentType": "application/json",
  "lifetime": "infinite",
  "garbageCollection": 10,
  "streaming": false,
  "tags": {
    "type": "resource",
    "specName": "result",
    "modelName": "hello-world"
  },
  "ownerDefinition": {
    "ownerType": "model-method",
    "ownerRef": "7347cf2c-cc9e-4203-8897-e10845af9732"
  },
  "createdAt": "2026-04-07T18:02:58.146Z",
  "size": 157,
  "checksum": "c631b676cd...",
  "contentPath": ".swamp/data/command/shell/.../result/1/raw",
  "content": {
    "exitCode": 0,
    "executedAt": "2026-04-07T18:02:58.143Z",
    "command": "echo \"Hello from the swamp!\"",
    "durationMs": 4,
    "stdout": "Hello from the swamp!",
    "stderr": ""
  }
}

swamp data list [model]

List all data for a model, grouped by type.

Option Description
--type Filter by data type (resource, file, report)
--workflow List data produced by a workflow
--run Specific workflow run ID
--repo-dir Repository directory (default .)

$ swamp data list hello-world
Data for hello-world (command/shell)

file (1 item):
  log  v2  text/plain  19B  2026-04-07

resource (1 item):
  result  v1  application/json  135B  2026-04-07

report (2 items):
  report-swamp-method-summary  v2  text/markdown  482B  2026-04-07
  report-swamp-method-summary-json  v2  application/json  2.6KB  2026-04-07

swamp data versions <model> <data_name>

Show all versions of a data item.

Option Description
--repo-dir Repository directory (default .)

$ swamp data versions hello-world result --json
{
  "dataName": "result",
  "modelName": "hello-world",
  "modelType": "command/shell",
  "versions": [
    {
      "version": 2,
      "createdAt": "2026-04-07T18:03:08.737Z",
      "size": 135,
      "checksum": "d58d1607...",
      "isLatest": true
    },
    {
      "version": 1,
      "createdAt": "2026-04-07T18:02:58.146Z",
      "size": 157,
      "checksum": "c631b676...",
      "isLatest": false
    }
  ],
  "total": 2
}

swamp data search [query]

Search across all data in the repository. Opens an interactive picker in a terminal, or returns JSON with --json.

Option Description
--type Filter by data type tag (resource, file, report)
--lifetime Filter by lifetime (ephemeral, infinite, job, workflow, or duration)
--owner-type Filter by owner type (model-method, workflow-step, manual)
--workflow Filter to data tagged with this workflow name
--model Filter to data owned by this model name
--content-type Filter by MIME content type
--since Only data created within duration (1h, 1d, 7d, 1w, 1mo)
--output Filter by output ID
--run Filter by workflow run ID
--tag Filter by tag (KEY=VALUE, repeatable)
--streaming Only show streaming data
--limit Maximum results (default 50)
--repo-dir Repository directory (default .)

$ swamp data search --type resource --json
{
  "query": "",
  "filters": {
    "type": "resource"
  },
  "results": [
    {
      "id": "5e7d72ab-7e0d-492e-ab3d-61463d9d4a85",
      "name": "execution-result",
      "version": 1,
      "contentType": "application/json",
      "type": "resource",
      "lifetime": "infinite",
      "ownerType": "model-method",
      "modelName": "hello-world",
      "modelType": "command/shell",
      "streaming": false,
      "size": 135,
      "createdAt": "2026-04-07T18:03:27.361Z",
      "tags": {
        "type": "resource",
        "specName": "result",
        "modelName": "hello-world"
      }
    }
  ],
  "total": 1,
  "limited": false
}

swamp data query [predicate]

Query data using a CEL predicate. The predicate evaluates against DataRecord fields directly (not prefixed with data.).

Option Description
--select CEL expression to project fields (e.g., name)
--limit Maximum results (default 100)
--repo-dir Repository directory (default .)

Available fields in the predicate: attributes, content, contentType, createdAt, dataType, id, lifetime, modelName, modelType, name, ownerType, size, specName, streaming, tags, version.

$ swamp data query 'tags.type == "resource"' --json
{
  "predicate": "tags.type == \"resource\"",
  "results": [
    {
      "id": "85c471af-a4c8-4f03-a5df-768351388d09",
      "name": "result",
      "version": 2,
      "tags": {
        "type": "resource",
        "specName": "result",
        "modelName": "hello-world"
      },
      "modelName": "hello-world",
      "modelType": "command/shell",
      "dataType": "resource",
      "contentType": "application/json",
      "lifetime": "infinite",
      "streaming": false,
      "size": 135
    }
  ],
  "total": 1,
  "limited": false
}

With --select to project a single field:

$ swamp data query 'tags.type == "resource"' --select 'name' --json
{
  "results": ["result", "execution-result"],
  "total": 2,
  "limited": false
}

swamp data rename <model> <old_name> <new_name>

Rename a data item. Creates a copy under the new name and writes a tombstone with a forward reference on the old name.

Option Description
--repo-dir Repository directory (default .)

$ swamp data rename hello-world result execution-result --json
{
  "oldName": "result",
  "newName": "execution-result",
  "modelId": "7347cf2c-cc9e-4203-8897-e10845af9732",
  "modelName": "hello-world",
  "modelType": "command/shell",
  "copiedVersion": 2,
  "newVersion": 1,
  "warning": "Any workflows or models that produce data under \"result\" will overwrite the forward reference. Update them to use \"execution-result\" instead."
}

Lookups for the old name without an explicit version follow the forward reference to the new name (up to 5 levels deep).

swamp data gc

Run garbage collection — delete expired data and prune old versions.

Option Description
--dry-run Show what would be deleted
-f, --force Skip confirmation prompt
--repo-dir Repository directory (default .)

Two phases execute in sequence:

  1. Expired data deletion — removes all versions of data whose lifetime has elapsed.
  2. Version pruning — removes old versions exceeding the garbage collection policy for non-expired data.

$ swamp data gc --dry-run --json
{
  "dataEntriesExpired": 0,
  "versionsDeleted": 0,
  "bytesReclaimed": 0,
  "dryRun": true,
  "expiredEntries": []
}

Validation Rules

  • Data names must be non-empty strings.
  • Data names must not contain .., /, \, or null bytes (path traversal protection).
  • The name latest is reserved (case-insensitive) and cannot be used as a data name.
  • Resource data is validated against the spec's Zod schema on write. Schema mismatches produce a warning, not an error.
  • New writes to an existing data name must have the same owner (ownerType + ownerRef) as the original.
  • Tags must include a type key.
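The name rules above can be sketched as a validation function (illustrative):

```typescript
// Enforce the data-name rules: non-empty, no path-traversal characters,
// and the (case-insensitive) reserved name "latest".
function validateDataName(name: string): void {
  if (name.length === 0) throw new Error("data name must be non-empty");
  if (/\0/.test(name) || name.includes("..") || name.includes("/") || name.includes("\\")) {
    throw new Error("data name must not contain '..', '/', '\\', or null bytes");
  }
  if (name.toLowerCase() === "latest") {
    throw new Error("'latest' is reserved and cannot be used as a data name");
  }
}
```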