DATA OUTPUTS
Data outputs are the versioned artifacts produced when model methods execute.
Each method execution can write structured data (resources) and unstructured
content (files). Data outputs are stored in .swamp/data/ within the
repository, organized by model type, model ID, and data name.
Output Types
Model types declare their output specifications using two categories: resources and files. Each declared spec has a name (the spec name) that identifies it within the model type.
Resource Outputs
Structured JSON data validated against a schema.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `description` | string | No | None | Human-readable description |
| `schema` | Zod schema | Yes | — | Validates data on write |
| `lifetime` | Lifetime | Yes | — | Retention policy |
| `garbageCollection` | GarbageCollectionPolicy | Yes | — | Version retention policy |
| `tags` | Record<string, string> | No | `{}` | Default tags (auto-includes `type: "resource"`) |
| `sensitiveOutput` | boolean | No | `false` | Treat all fields as sensitive |
| `vaultName` | string | No | First available vault | Vault for storing sensitive field values |
Resource content is always stored as application/json.
File Outputs
Binary or text content identified by MIME type.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `description` | string | No | None | Human-readable description |
| `contentType` | string | Yes | — | MIME type (e.g., `text/plain`) |
| `lifetime` | Lifetime | Yes | — | Retention policy |
| `garbageCollection` | GarbageCollectionPolicy | Yes | — | Version retention policy |
| `streaming` | boolean | No | `false` | Line-oriented append mode |
| `tags` | Record<string, string> | No | `{}` | Default tags (auto-includes `type: "file"`) |
Example: command/shell Model Type
The built-in command/shell model type declares one resource and one file
output:
resources:
  result [resource] — Shell command execution result (infinite)
files:
  log [file] — Shell command output, text/plain (infinite, streaming)

After execution, both are visible via swamp data list:
Data for hello-world (command/shell)
file (1 item):
log v2 text/plain 19B 2026-04-07
resource (1 item):
result v1 application/json 135B 2026-04-07
report (2 items):
report-swamp-method-summary v2 text/markdown 482B 2026-04-07
  report-swamp-method-summary-json v2 application/json 2.6KB 2026-04-07

Lifetime
Lifetime determines how long data is retained before it becomes eligible for garbage collection.
| Value | Description |
|---|---|
| Duration string | `1h`, `5m`, `10d`, `2w`, `1mo`, `10y` |
| `ephemeral` | Deleted when the process ends |
| `infinite` | Never automatically deleted |
| `job` | Lives until the job completes |
| `workflow` | Lives until the workflow completes |
Duration format: {number}{unit} where unit is h (hours), m (minutes), d
(days), w (weeks), mo (months), or y (years).
Zero-duration strings (e.g., 0h, 0d) are normalized to workflow.
Duration Conversion
| Unit | Conversion |
|---|---|
| `m` | value × 60,000 ms |
| `h` | value × 3,600,000 ms |
| `d` | value × 86,400,000 ms |
| `w` | value × 604,800,000 ms |
| `mo` | value × 2,592,000,000 ms (30 days) |
| `y` | value × 31,536,000,000 ms (365 days) |
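As a sketch, the conversion table maps onto a small parser like this (illustrative only; `parseDurationMs` is not part of the swamp API):

```typescript
// Milliseconds per unit, matching the conversion table above.
const UNIT_MS: Record<string, number> = {
  m: 60_000,          // minutes
  h: 3_600_000,       // hours
  d: 86_400_000,      // days
  w: 604_800_000,     // weeks
  mo: 2_592_000_000,  // months (30 days)
  y: 31_536_000_000,  // years (365 days)
};

// Parses a {number}{unit} duration string into milliseconds.
// Returns null when the string does not match the documented format.
function parseDurationMs(input: string): number | null {
  const match = /^(\d+)(mo|[mhdwy])$/.exec(input);
  if (!match) return null;
  return Number(match[1]) * UNIT_MS[match[2]];
}
```

Note that a zero value (e.g., 0h) would parse to 0 here; per the rules above, the tool normalizes zero durations to workflow before any conversion applies.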
Expiration Rules
- `infinite`: Never expires.
- `ephemeral`: Not yet implemented — treated as non-expiring.
- `job`/`workflow`: Expires when the associated workflow run no longer exists. Requires `workflowId` and `workflowRunId` in the owner definition. If either is missing, the data is not expired.
- Duration strings: Expires when `createdAt + duration` is in the past.
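These rules can be sketched as a predicate, assuming simplified stand-ins for the owner record and the workflow-run lookup (the real services are more involved):

```typescript
// Illustrative owner shape; only the fields relevant to expiration.
interface Owner { workflowId?: string; workflowRunId?: string }

const DURATION_MS: Record<string, number> = {
  m: 60_000, h: 3_600_000, d: 86_400_000,
  w: 604_800_000, mo: 2_592_000_000, y: 31_536_000_000,
};

function isExpired(
  lifetime: string,
  createdAt: Date,
  owner: Owner,
  runExists: (workflowId: string, runId: string) => boolean,
  now: Date = new Date(),
): boolean {
  // infinite never expires; ephemeral is not yet implemented (non-expiring).
  if (lifetime === "infinite" || lifetime === "ephemeral") return false;
  if (lifetime === "job" || lifetime === "workflow") {
    // Without both workflow IDs, the data is never considered expired.
    if (!owner.workflowId || !owner.workflowRunId) return false;
    return !runExists(owner.workflowId, owner.workflowRunId);
  }
  // Duration string: expired once createdAt + duration is in the past.
  const match = /^(\d+)(mo|[mhdwy])$/.exec(lifetime);
  if (!match) return false;
  const durationMs = Number(match[1]) * DURATION_MS[match[2]];
  return createdAt.getTime() + durationMs < now.getTime();
}
```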
Garbage Collection Policy
Garbage collection controls how many versions of a data item are retained.
| Value | Description |
|---|---|
| integer | Keep the N most recent versions |
| Duration string | Keep versions created within the duration |
The integer must be a positive integer. The duration string uses the same format as lifetime durations and must be greater than zero.
# Keep the 10 most recent versions
garbageCollection: 10
# Keep versions from the last 7 days
garbageCollection: 7d

Garbage collection runs as part of swamp data gc and during the lifecycle
service. It operates in two phases:
- Expired data deletion — removes all versions of data items whose lifetime has elapsed.
- Version pruning — for non-expired data, removes old versions that exceed the garbage collection policy.
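Phase two can be sketched as follows; the `Version` shape and the policy representation here are illustrative, not the tool's actual types:

```typescript
// Minimal version record for illustration.
interface Version { version: number; createdAt: Date }

// A policy is either "keep the N newest" or "keep versions within a window".
type GcPolicy =
  | { keep: number }
  | { withinMs: number };

// Returns the versions that exceed the policy and are eligible for pruning.
function versionsToPrune(
  versions: Version[],
  policy: GcPolicy,
  now: Date = new Date(),
): Version[] {
  const newestFirst = [...versions].sort((a, b) => b.version - a.version);
  if ("keep" in policy) {
    return newestFirst.slice(policy.keep); // everything beyond the N newest
  }
  return newestFirst.filter(
    (v) => now.getTime() - v.createdAt.getTime() > policy.withinMs,
  );
}
```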
Versioning
Each data item is versioned with sequential positive integers starting at 1. Every method execution that writes to the same data name produces a new version.
$ swamp data versions hello-world result --json
{
"dataName": "result",
"modelName": "hello-world",
"modelType": "command/shell",
"versions": [
{
"version": 2,
"createdAt": "2026-04-07T18:03:08.737Z",
"size": 135,
"checksum": "d58d16...",
"isLatest": true
},
{
"version": 1,
"createdAt": "2026-04-07T18:02:58.146Z",
"size": 157,
"checksum": "c631b6...",
"isLatest": false
}
],
"total": 2
}

The "latest" Pointer
Each data item has a latest file in its directory containing the current
version number as plain text. When data is retrieved without an explicit
--version flag, the latest version is returned.
.swamp/data/command/shell/{model-id}/result/
  1/
  2/
  latest   # Contains: "2"

The name latest is reserved — it cannot be used as a data name.
Checksums
Each version includes a SHA-256 checksum computed from the content file at
finalization time. The checksum is stored in metadata.yaml and returned by
data access commands.
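For reference, the same digest can be reproduced with Node's standard crypto module (an illustration, not the tool's internal code):

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Hex-encoded SHA-256 digest of a version's content file on disk.
function checksumOfFile(path: string): string {
  return createHash("sha256").update(readFileSync(path)).digest("hex");
}

// The same digest over an in-memory string, for illustration.
function checksumOf(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}
```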
Storage Layout
Data is stored on disk under .swamp/data/ in a hierarchical directory
structure:
.swamp/data/
  {model-type-path}/
    {model-id}/
      {data-name}/
        1/
          metadata.yaml
          raw
        2/
          metadata.yaml
          raw
        latest

- `{model-type-path}` — the model type as a directory path (e.g., `command/shell`).
- `{model-id}` — the UUID of the model definition.
- `{data-name}` — the instance name given when data is written.
- `{version}/` — a numbered directory for each version.
- `metadata.yaml` — full metadata for the version.
- `raw` — the content (JSON for resources, binary or text for files).
- `latest` — text file containing the current version number.
Metadata File
Each version's metadata.yaml contains the complete data record:
id: e204ea55-3d64-48a0-aa78-32fea656fdac
name: result
version: 1
contentType: application/json
lifetime: infinite
garbageCollection: 10
streaming: false
tags:
type: resource
specName: result
modelName: hello-world
ownerDefinition:
ownerType: model-method
ownerRef: 7347cf2c-cc9e-4203-8897-e10845af9732
createdAt: "2026-04-07T18:02:58.146Z"
size: 157
checksum: c631b676cd069af1decf4f20c27568f44bcccf062846bb32bbeae187573c2fe6

Tags
Tags are key-value string pairs attached to data. They are used for filtering, discovery, and categorization.
Tag Resolution Chain
Tags are resolved in order, with later steps overriding earlier ones:
1. Type auto-tag — `type: "resource"` or `type: "file"` (always present).
2. Definition tags — tags from the model definition.
3. Spec defaults — tags declared on the output specification.
4. Method write overrides — tags passed by the method when writing.
5. `specName` auto-tag — the output spec name (always injected).
6. `modelName` auto-tag — the definition name (always injected).
7. Workflow tag overrides — tags from workflow step context.
8. Runtime tags — tags provided via `--tag` CLI flags.
9. Data output overrides — tags from workflow `dataOutputOverrides`.
The type tag is required on all data. Data without a type tag fails
validation.
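A minimal sketch of the chain, assuming the resolver simply merges tag maps in the documented order (the option names here are invented for illustration):

```typescript
type Tags = Record<string, string>;

// Later spreads override earlier keys, mirroring the documented order.
function resolveTags(opts: {
  dataType: "resource" | "file";
  specName: string;
  modelName: string;
  definitionTags?: Tags;     // 2. model definition
  specDefaults?: Tags;       // 3. output spec
  methodOverrides?: Tags;    // 4. method write
  workflowOverrides?: Tags;  // 7. workflow step context
  runtimeTags?: Tags;        // 8. --tag CLI flags
  outputOverrides?: Tags;    // 9. dataOutputOverrides
}): Tags {
  return {
    type: opts.dataType,       // 1. type auto-tag
    ...opts.definitionTags,
    ...opts.specDefaults,
    ...opts.methodOverrides,
    specName: opts.specName,   // 5. specName auto-tag
    modelName: opts.modelName, // 6. modelName auto-tag
    ...opts.workflowOverrides,
    ...opts.runtimeTags,
    ...opts.outputOverrides,
  };
}
```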
Common Tag Values
| Tag Key | Auto-Injected | Description |
|---|---|---|
| `type` | Yes | `resource`, `file`, or `report` |
| `specName` | Yes | Output spec name from the model type |
| `modelName` | Yes | Definition name for orphan data recovery |
Streaming
When streaming: true is set on a file output spec, the data writer operates in
line-oriented append mode. Lines are written incrementally to disk as they are
produced, rather than buffered in memory.
The command/shell model type declares streaming: true on its log file
output to capture stdout and stderr as the command executes.
Streaming file writers support three write patterns:
- `writeLine(line)` — appends a single line with a newline character.
- `writeStream(stream)` — pipes a ReadableStream to disk, invoking optional line callbacks as newlines are encountered.
- `getFilePath()` — returns the allocated content path for direct file writes.
Non-streaming outputs use writeAll(content) or writeText(text) to write the
complete content at once.
Sensitive Output
Resource output specs can mark fields as sensitive. Sensitive values are stored
in a vault and replaced with vault.get() reference
expressions before the data is persisted to disk.
Field-Level Sensitivity
Individual fields are marked sensitive through Zod schema metadata:
schema: z.object({
apiKey: z.string().meta({ sensitive: true }),
publicId: z.string(),
});

Only `apiKey` is stored in the vault. `publicId` is persisted as-is.
Whole-Output Sensitivity
When sensitiveOutput: true is set on the resource spec, all top-level fields
are treated as sensitive.
Vault Resolution Order
The vault used for storing sensitive fields is resolved in this order:
1. Field-level `vaultName` from schema metadata
2. Spec-level `vaultName` from the resource output specification
3. First available vault from the vault service
If sensitive fields exist but no vault is configured, an error is thrown.
Vault Key Format
Auto-generated vault keys follow this pattern:
{sanitized-model-type}-{model-id}-{method-name}-{field-path}

Sanitization: `@` and null bytes are removed, `/` and `\` are replaced with `-`, and `..` is replaced with `.`.
Persisted Format
After processing, persisted resource data contains vault references:
apiKey: "${{ vault.get('my-secrets', 'command-shell-abc-execute-apiKey') }}"
publicId: "pk_12345"

On read, vault references are automatically resolved back to their original values. Resolved secrets are registered with the secret redactor to prevent log leakage.
Non-string sensitive values are JSON-stringified before vault storage.
Lifecycle States
Each data entry has a lifecycle state.
| State | Description |
|---|---|
| `active` | Normal, live data (default) |
| `deleted` | Tombstone marker — the data has been deleted or renamed |
Deletion Markers
A deletion marker is a version with lifecycle: "deleted", application/json
content type, and streaming: false. It signals that the data was intentionally
removed.
Rename Markers
A rename marker is a deletion marker with an additional renamedTo field
pointing to the new data name. When the latest version of a data item is a
rename marker, lookups without an explicit version follow the forward reference
to the new name (up to 5 levels deep).
$ swamp data rename hello-world result execution-result --json
{
"oldName": "result",
"newName": "execution-result",
"modelId": "7347cf2c-cc9e-4203-8897-e10845af9732",
"modelName": "hello-world",
"modelType": "command/shell",
"copiedVersion": 2,
"newVersion": 1,
"warning": "Any workflows or models that produce data under \"result\" will overwrite the forward reference. Update them to use \"execution-result\" instead."
}

The rename process:
- Copies the latest version of the old data name to version 1 under the new name.
- Writes a tombstone with a forward reference on the old name.
- Updates the latest marker on the old name to point to the tombstone.
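The forward-reference lookup described under Rename Markers can be sketched like this, with an illustrative marker shape and lookup callback:

```typescript
// Illustrative latest-version marker: live data, or a tombstone that may
// carry a forward reference to the new name.
interface LatestMarker { lifecycle: "active" | "deleted"; renamedTo?: string }

// Follows rename markers to the current name, capped at 5 levels deep
// as documented.
function resolveDataName(
  name: string,
  getLatest: (name: string) => LatestMarker | undefined,
  maxDepth = 5,
): string {
  let current = name;
  for (let i = 0; i < maxDepth; i++) {
    const latest = getLatest(current);
    // Stop at live data, missing data, or a plain deletion marker.
    if (!latest || latest.lifecycle !== "deleted" || !latest.renamedTo) {
      return current;
    }
    current = latest.renamedTo;
  }
  return current;
}
```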
Owner Definition
Every data entry tracks its owner — who created it.
| Field | Type | Required | Description |
|---|---|---|---|
| `ownerType` | string | Yes | `model-method`, `workflow-step`, or `manual` |
| `ownerRef` | string | Yes | Identifier of the owner (model ID, step ref) |
| `workflowId` | string | No | Workflow UUID (for `job`/`workflow` lifetimes) |
| `workflowRunId` | string | No | Workflow run UUID |
Ownership is validated on write — new versions of an existing data name must
have the same ownerType and ownerRef as the original.
Data Output Overrides
Workflow steps can override the default output spec settings for data produced
by their tasks. See
dataOutputOverrides in the
workflows reference.
| Field | Type | Description |
|---|---|---|
| `specName` | string | Output spec name to override |
| `lifetime` | Lifetime | Override retention policy |
| `garbageCollection` | GarbageCollectionPolicy | Override version retention |
| `tags` | Record<string, string> | Additional tags merged with output tags |
| `vary` | string[] | Input key names to vary by (composite data names) |
When vary is set, the resolved values of the named input keys are appended as
a suffix to the data instance name. This produces distinct data items per
iteration in forEach steps.
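A hedged sketch of the composite naming; the separator and value ordering are assumptions, since only the suffix behavior itself is documented:

```typescript
// Appends resolved vary values to the base data name. The "-" separator
// is an assumption for illustration.
function composeDataName(baseName: string, varyValues: string[]): string {
  if (varyValues.length === 0) return baseName;
  return [baseName, ...varyValues].join("-");
}
```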
CEL Access
Data outputs are accessible in
CEL expressions through the data
namespace:
| Function | Description |
|---|---|
| `data.latest(modelName, dataName)` | Latest version of a data item |
| `data.latest(modelName, dataName, varyValues[])` | Latest version with vary suffix |
| `data.version(modelName, dataName, version)` | Specific version |
| `data.version(modelName, dataName, varyValues[], version)` | Specific version with vary suffix |
| `data.listVersions(modelName, dataName)` | All version numbers |
| `data.listVersions(modelName, dataName, varyValues[])` | All version numbers with vary suffix |
| `data.findByTag(tagKey, tagValue)` | Search by tag |
| `data.findBySpec(modelName, specName)` | Find by output spec name |
| `data.query(predicate, select?)` | CEL predicate query |
These functions return DataRecord objects with the following fields:
| Field | Type | Description |
|---|---|---|
| `id` | string | Data UUID |
| `name` | string | Data instance name |
| `version` | number | Version number |
| `createdAt` | string | ISO 8601 timestamp |
| `attributes` | Record<string, unknown> | Parsed JSON content (resources only) |
| `tags` | Record<string, string> | All tags |
| `modelName` | string | Model definition name |
| `modelType` | string | Model type path |
| `specName` | string | Output spec name |
| `dataType` | string | `resource` or `file` |
| `contentType` | string | MIME type |
| `lifetime` | string | Lifetime policy |
| `ownerType` | string | Owner type |
| `streaming` | boolean | Whether streaming is enabled |
| `size` | number | Content size in bytes |
| `content` | string | Raw content string |
CLI Commands
All data commands accept --json to output structured JSON instead of
human-readable text.
swamp data get <model> <data_name>
Retrieve data by model name and data name. Returns the latest version by default.
| Option | Description |
|---|---|
| `--version` | Retrieve a specific version number |
| `--workflow` | Get data produced by a workflow |
| `--run` | Specific workflow run ID |
| `--no-content` | Show metadata only, without content |
| `--repo-dir` | Repository directory (default `.`) |
$ swamp data get hello-world result --json
{
"id": "e204ea55-3d64-48a0-aa78-32fea656fdac",
"name": "result",
"modelName": "hello-world",
"modelType": "command/shell",
"version": 1,
"contentType": "application/json",
"lifetime": "infinite",
"garbageCollection": 10,
"streaming": false,
"tags": {
"type": "resource",
"specName": "result",
"modelName": "hello-world"
},
"ownerDefinition": {
"ownerType": "model-method",
"ownerRef": "7347cf2c-cc9e-4203-8897-e10845af9732"
},
"createdAt": "2026-04-07T18:02:58.146Z",
"size": 157,
"checksum": "c631b676cd...",
"contentPath": ".swamp/data/command/shell/.../result/1/raw",
"content": {
"exitCode": 0,
"executedAt": "2026-04-07T18:02:58.143Z",
"command": "echo \"Hello from the swamp!\"",
"durationMs": 4,
"stdout": "Hello from the swamp!",
"stderr": ""
}
}

swamp data list [model]
List all data for a model, grouped by type.
| Option | Description |
|---|---|
| `--type` | Filter by data type (resource, file, report) |
| `--workflow` | List data produced by a workflow |
| `--run` | Specific workflow run ID |
| `--repo-dir` | Repository directory (default `.`) |
$ swamp data list hello-world
Data for hello-world (command/shell)
file (1 item):
log v2 text/plain 19B 2026-04-07
resource (1 item):
result v1 application/json 135B 2026-04-07
report (2 items):
report-swamp-method-summary v2 text/markdown 482B 2026-04-07
  report-swamp-method-summary-json v2 application/json 2.6KB 2026-04-07

swamp data versions <model> <data_name>
Show all versions of a data item.
| Option | Description |
|---|---|
| `--repo-dir` | Repository directory (default `.`) |
$ swamp data versions hello-world result --json
{
"dataName": "result",
"modelName": "hello-world",
"modelType": "command/shell",
"versions": [
{
"version": 2,
"createdAt": "2026-04-07T18:03:08.737Z",
"size": 135,
"checksum": "d58d1607...",
"isLatest": true
},
{
"version": 1,
"createdAt": "2026-04-07T18:02:58.146Z",
"size": 157,
"checksum": "c631b676...",
"isLatest": false
}
],
"total": 2
}

swamp data search [query]
Search across all data in the repository. Opens an interactive picker in a
terminal, or returns JSON with --json.
| Option | Description |
|---|---|
| `--type` | Filter by data type tag (resource, file, report) |
| `--lifetime` | Filter by lifetime (ephemeral, infinite, job, workflow, or duration) |
| `--owner-type` | Filter by owner type (model-method, workflow-step, manual) |
| `--workflow` | Filter to data tagged with this workflow name |
| `--model` | Filter to data owned by this model name |
| `--content-type` | Filter by MIME content type |
| `--since` | Only data created within duration (1h, 1d, 7d, 1w, 1mo) |
| `--output` | Filter by output ID |
| `--run` | Filter by workflow run ID |
| `--tag` | Filter by tag (KEY=VALUE, repeatable) |
| `--streaming` | Only show streaming data |
| `--limit` | Maximum results (default 50) |
| `--repo-dir` | Repository directory (default `.`) |
$ swamp data search --type resource --json
{
"query": "",
"filters": {
"type": "resource"
},
"results": [
{
"id": "5e7d72ab-7e0d-492e-ab3d-61463d9d4a85",
"name": "execution-result",
"version": 1,
"contentType": "application/json",
"type": "resource",
"lifetime": "infinite",
"ownerType": "model-method",
"modelName": "hello-world",
"modelType": "command/shell",
"streaming": false,
"size": 135,
"createdAt": "2026-04-07T18:03:27.361Z",
"tags": {
"type": "resource",
"specName": "result",
"modelName": "hello-world"
}
}
],
"total": 1,
"limited": false
}

swamp data query [predicate]
Query data using a CEL predicate. The predicate evaluates against DataRecord
fields directly (not prefixed with data.).
| Option | Description |
|---|---|
| `--select` | CEL expression to project fields (e.g., `name`) |
| `--limit` | Maximum results (default 100) |
| `--repo-dir` | Repository directory (default `.`) |
Available fields in the predicate: attributes, content, contentType,
createdAt, dataType, id, lifetime, modelName, modelType, name,
ownerType, size, specName, streaming, tags, version.
$ swamp data query 'tags.type == "resource"' --json
{
"predicate": "tags.type == \"resource\"",
"results": [
{
"id": "85c471af-a4c8-4f03-a5df-768351388d09",
"name": "result",
"version": 2,
"tags": {
"type": "resource",
"specName": "result",
"modelName": "hello-world"
},
"modelName": "hello-world",
"modelType": "command/shell",
"dataType": "resource",
"contentType": "application/json",
"lifetime": "infinite",
"streaming": false,
"size": 135
}
],
"total": 1,
"limited": false
}With --select to project a single field:
$ swamp data query 'tags.type == "resource"' --select 'name' --json
{
"results": ["result", "execution-result"],
"total": 2,
"limited": false
}

swamp data rename <model> <old_name> <new_name>
Rename a data item. Creates a copy under the new name and writes a tombstone with a forward reference on the old name.
| Option | Description |
|---|---|
| `--repo-dir` | Repository directory (default `.`) |
$ swamp data rename hello-world result execution-result --json
{
"oldName": "result",
"newName": "execution-result",
"modelId": "7347cf2c-cc9e-4203-8897-e10845af9732",
"modelName": "hello-world",
"modelType": "command/shell",
"copiedVersion": 2,
"newVersion": 1,
"warning": "Any workflows or models that produce data under \"result\" will overwrite the forward reference. Update them to use \"execution-result\" instead."
}

Lookups for the old name without an explicit version follow the forward reference to the new name (up to 5 levels deep).
swamp data gc
Run garbage collection — delete expired data and prune old versions.
| Option | Description |
|---|---|
| `--dry-run` | Show what would be deleted |
| `-f, --force` | Skip confirmation prompt |
| `--repo-dir` | Repository directory (default `.`) |
Two phases execute in sequence:
- Expired data deletion — removes all versions of data whose lifetime has elapsed.
- Version pruning — removes old versions exceeding the garbage collection policy for non-expired data.
$ swamp data gc --dry-run --json
{
"dataEntriesExpired": 0,
"versionsDeleted": 0,
"bytesReclaimed": 0,
"dryRun": true,
"expiredEntries": []
}

Validation Rules
- Data names must be non-empty strings.
- Data names must not contain `..`, `/`, `\`, or null bytes (path traversal protection).
- The name `latest` is reserved (case-insensitive) and cannot be used as a data name.
- Resource data is validated against the spec's Zod schema on write. Schema mismatches produce a warning, not an error.
- New writes to an existing data name must have the same owner (`ownerType` + `ownerRef`) as the original.
- Tags must include a `type` key.
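Taken together, the name-related rules can be sketched as a small validator (illustrative only; the error messages are invented):

```typescript
// Returns an error message for an invalid data name, or null when valid,
// following the documented rules: non-empty, no path-traversal characters,
// and "latest" reserved case-insensitively.
function validateDataName(name: string): string | null {
  if (name.length === 0) return "data name must be a non-empty string";
  if (name.includes("..") || /[\/\\\0]/.test(name)) {
    return "data name must not contain .., /, \\, or null bytes";
  }
  if (name.toLowerCase() === "latest") return '"latest" is a reserved name';
  return null;
}
```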