DATA QUERYING
swamp data query searches across all data artifacts in a repository using CEL
predicates. Each predicate is evaluated against every data record, and matching
records are returned. The same query engine powers the data.query() function
in CEL expressions.
Command
swamp data query [predicate]
| Option | Type | Default | Description |
|---|---|---|---|
| --select | string | None | CEL expression to project fields from matching records |
| --limit | number | 100 | Maximum number of results returned |
| --repo-dir | string | . | Repository directory |
| --json | flag | - | Output in JSON format |
When no predicate is provided and stdout is a TTY, the command opens an
interactive TUI for browsing and filtering data. When no predicate is provided
in non-interactive mode (piped output or --json), the command returns an
error.
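For scripted use, the predicate requirement can be checked before the command is launched. The Python sketch below assembles an argument list from the options documented in the table above. The helper name build_query_argv is hypothetical and not part of swamp, and the sketch only builds the list; it does not run the command.

```python
import shlex
from typing import List, Optional


def build_query_argv(
    predicate: str,
    select: Optional[str] = None,
    limit: int = 100,
    repo_dir: str = ".",
) -> List[str]:
    """Assemble arguments for a non-interactive `swamp data query` call.

    Hypothetical helper: only the flags documented on this page are used.
    """
    if not predicate:
        # With --json there is no TUI fallback, so a predicate is mandatory.
        raise ValueError("a CEL predicate is required in non-interactive mode")
    argv = ["swamp", "data", "query", predicate, "--json"]
    if select is not None:
        argv += ["--select", select]
    argv += ["--limit", str(limit), "--repo-dir", repo_dir]
    return argv


argv = build_query_argv('modelName == "scanner"', limit=2)
# shlex.join quotes the predicate so the line can be pasted into a shell.
print(shlex.join(argv))
```

Passing the list to a process launcher (rather than a joined string) avoids a second layer of shell quoting around the CEL predicate.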
DataRecord Fields
Predicates evaluate against DataRecord fields as top-level variables. No
namespace prefix is needed — use modelName, not data.modelName.
| Field | Type | Description |
|---|---|---|
| id | string | Record UUID |
| name | string | Data item name |
| version | int | Version number |
| createdAt | string | ISO 8601 timestamp |
| attributes | map<string, dyn> | Parsed JSON content (resource data) |
| tags | map<string, string> | Metadata tags |
| modelName | string | Owning model definition name |
| modelType | string | Model type path (e.g., command/shell) |
| specName | string | Output spec name |
| dataType | string | resource, file, or report |
| contentType | string | MIME type (e.g., application/json) |
| lifetime | string | Retention policy (e.g., infinite, 30d) |
| ownerType | string | model-method, workflow-step, or manual |
| streaming | bool | Whether the data uses streaming writes |
| size | int | Content size in bytes |
| content | string | Raw text content |
Referencing an unknown field produces an error listing the available fields:
$ swamp data query 'badField == "test"' --json
{
"error": "Unknown field \"badField\" in query predicate.\nAvailable: attributes, content, contentType, createdAt, dataType, id, lifetime, modelName, modelType, name, ownerType, size, specName, streaming, tags, version"
}
Lazy-Loaded Fields
The attributes and content fields are loaded from disk only when referenced
in the predicate or the --select expression. All other fields are read from
the metadata catalog without touching data files.
- attributes: Populated for application/json data. The raw content is parsed as JSON. Invalid JSON is treated as an empty map.
- content: Populated for text content types (text/plain, text/markdown, application/json, application/yaml, etc.). Binary content types produce an empty string.
When neither field is referenced, queries run entirely against the catalog index.
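To make the idea of reference detection concrete, here is a rough Python sketch of how one might decide whether an expression mentions attributes or content. It is an illustration only: the function name lazy_fields_needed and the regex approach are assumptions, and the actual query engine may analyze expressions quite differently (for example, by walking a parsed CEL syntax tree).

```python
import re

LAZY_FIELDS = ("attributes", "content")

# Matches double- or single-quoted string literals, including escaped quotes.
_STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'')


def lazy_fields_needed(predicate: str, select: str = "") -> set:
    """Return the lazy-loaded fields referenced by either expression.

    Hypothetical helper; a text-level approximation, not swamp's own logic.
    """
    needed = set()
    for expr in (predicate, select):
        # Blank out string literals so a value like "content" is not
        # mistaken for a field reference.
        code = _STRING_LITERAL.sub('""', expr)
        for field in LAZY_FIELDS:
            # The lookbehind skips member access such as tags.content, and
            # \b keeps `content` from matching inside `contentType`.
            if re.search(r"(?<![.\w])" + field + r"\b", code):
                needed.add(field)
    return needed


print(lazy_fields_needed('contentType == "application/json"'))
print(lazy_fields_needed('dataType == "resource"', "attributes.stdout"))
```

In this sketch an empty result corresponds to a query that could be answered from the catalog alone, while a non-empty result means data files would have to be read.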
Predicates
A predicate is a CEL expression that evaluates to a boolean. Records where the
predicate returns true are included in the results.
Comparison
$ swamp data query 'modelName == "scanner"' --json
{
"predicate": "modelName == \"scanner\"",
"results": [
{
"id": "7a3708d4-4767-4e45-912e-6e4ab42f5ea5",
"name": "log",
"version": 1,
"tags": {
"type": "file",
"specName": "log",
"modelName": "scanner",
"env": "prod"
},
"modelName": "scanner",
"modelType": "command/shell",
"specName": "log",
"dataType": "file",
"contentType": "text/plain",
"lifetime": "infinite",
"streaming": true,
"size": 22
},
{
"id": "696937a4-6517-4002-93f3-029435fec355",
"name": "result",
"version": 1,
"tags": {
"type": "resource",
"specName": "result",
"modelName": "scanner",
"env": "prod"
},
"modelName": "scanner",
"modelType": "command/shell",
"specName": "result",
"dataType": "resource",
"contentType": "application/json",
"lifetime": "infinite",
"streaming": false,
"size": 141
}
],
"total": 2,
"limited": false
}
Note
Examples on this page show a subset of DataRecord fields for brevity. Actual
JSON output includes all fields listed in the table above (e.g., createdAt,
ownerType, attributes, content).
Numeric Comparison
$ swamp data query 'size > 100' --limit 2 --json
{
"predicate": "size > 100",
"results": [
{
"id": "1ea7486a-4619-41dd-9fb6-cde6a70819df",
"name": "report-swamp-method-summary",
"version": 1,
"dataType": "report",
"contentType": "text/markdown",
"size": 493
},
{
"id": "75046034-50b6-47e2-bac8-613e119e17b8",
"name": "report-swamp-method-summary-json",
"version": 1,
"dataType": "report",
"contentType": "application/json",
"size": 2653
}
],
"total": 2,
"limited": true
}
When results are truncated by --limit, the response includes
"limited": true.
Boolean Fields
$ swamp data query 'streaming == true' --json
Returns all data records with streaming enabled.
Compound Predicates
Combine conditions with && (and) and || (or):
$ swamp data query 'modelName == "scanner" && specName == "result"' --json
{
"predicate": "modelName == \"scanner\" && specName == \"result\"",
"results": [
{
"id": "696937a4-6517-4002-93f3-029435fec355",
"name": "result",
"version": 1,
"modelName": "scanner",
"specName": "result",
"dataType": "resource",
"contentType": "application/json",
"size": 141
}
],
"total": 1,
"limited": false
}
String Methods
CEL string methods work on string fields:
swamp data query 'name.contains("result")'
swamp data query 'modelName.startsWith("scan")'
swamp data query 'contentType.matches("application/.*")'
See String Methods in the CEL reference for the complete list.
Attribute Filtering
Access nested fields within attributes to filter on resource content. This
triggers lazy loading of the content from disk.
$ swamp data query 'dataType == "resource" && attributes.exitCode == 0' --json
{
"predicate": "dataType == \"resource\" && attributes.exitCode == 0",
"results": [
{
"id": "f6dffa8a-1fe9-4f7b-bb1d-b7a64f6edf22",
"name": "result",
"version": 1,
"attributes": {
"exitCode": 0,
"executedAt": "2026-04-07T23:45:02.601Z",
"command": "echo \"Hello from the swamp!\"",
"durationMs": 3,
"stdout": "Hello from the swamp!",
"stderr": ""
},
"modelName": "hello-world",
"specName": "result",
"dataType": "resource",
"size": 157
},
{
"id": "696937a4-6517-4002-93f3-029435fec355",
"name": "result",
"version": 1,
"attributes": {
"exitCode": 0,
"executedAt": "2026-04-07T23:45:08.739Z",
"command": "echo \"scan complete\"",
"durationMs": 3,
"stdout": "scan complete",
"stderr": ""
},
"modelName": "scanner",
"specName": "result",
"dataType": "resource",
"size": 141
}
],
"total": 2,
"limited": false
}
When a record's attributes map does not contain the referenced key, the record
is excluded from results rather than producing an error.
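The exclusion rule can be modeled in a few lines of Python. This is a sketch of the documented behavior, not of swamp's implementation: filter_records is a hypothetical name, and using KeyError to represent a missing key is an assumption made for the illustration.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def filter_records(
    records: List[Record], predicate: Callable[[Record], bool]
) -> List[Record]:
    """Keep records where the predicate is true; a missing key excludes the record."""
    matches = []
    for record in records:
        try:
            if predicate(record):
                matches.append(record)
        except KeyError:
            # Models the documented outcome: no error is raised and the
            # record is simply left out of the results.
            continue
    return matches


records = [
    {"name": "result", "attributes": {"exitCode": 0}},
    {"name": "log", "attributes": {}},  # has no exitCode key
]
matched = filter_records(records, lambda r: r["attributes"]["exitCode"] == 0)
print([r["name"] for r in matched])  # ['result']
```

One practical consequence is that an empty result set does not distinguish between "no record has exitCode 0" and "no record has an exitCode key at all".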
Tag Filtering
Tags are accessible as a nested map via the tags field. Use dot notation or
bracket notation to access tag values:
swamp data query 'tags.env == "prod"'
swamp data query 'tags["env"] == "prod"'
swamp data query 'tags.type == "resource"'
$ swamp data query 'tags.env == "prod"' --json
{
"predicate": "tags.env == \"prod\"",
"results": [
{
"id": "7a3708d4-4767-4e45-912e-6e4ab42f5ea5",
"name": "log",
"version": 1,
"tags": {
"type": "file",
"specName": "log",
"modelName": "scanner",
"env": "prod"
},
"modelName": "scanner",
"dataType": "file",
"size": 22
},
{
"id": "696937a4-6517-4002-93f3-029435fec355",
"name": "result",
"version": 1,
"tags": {
"type": "resource",
"specName": "result",
"modelName": "scanner",
"env": "prod"
},
"modelName": "scanner",
"dataType": "resource",
"size": 141
}
],
"total": 2,
"limited": false
}
Records that do not have the referenced tag key are silently excluded from results (no error).
Tag Sources
Tags on data records come from multiple sources, resolved in order. See the Tag Resolution Chain in the data outputs reference for the full precedence.
Three tags are always present on every data record:
| Tag Key | Description |
|---|---|
| type | resource, file, or report |
| specName | Output spec name from the model type |
| modelName | Model definition name |
Custom tags are added via --tag flags on method runs, workflow
dataOutputOverrides, or the output spec's tags field in the
model definition.
Projections
The --select flag transforms each matching record into a specified shape. The
select expression is a CEL expression evaluated against each matching record's
fields.
Scalar Projection
Extract a single field value. Returns an array of values.
$ swamp data query 'dataType == "resource"' --select name --json
{
"results": [
"result",
"result"
],
"total": 2,
"limited": false
}
Map Projection
Build an object from selected fields. Returns an array of objects.
$ swamp data query 'dataType == "resource"' --select '{"name": name, "model": modelName, "size": size}' --json
{
"results": [
{
"name": "result",
"model": "hello-world",
"size": 157
},
{
"name": "result",
"model": "scanner",
"size": 141
}
],
"total": 2,
"limited": false
}
List Projection
Build an array from selected fields. Returns an array of arrays.
$ swamp data query 'dataType == "resource"' --select '[name, modelName, size]' --json
{
"results": [
[
"result",
"hello-world",
157
],
[
"result",
"scanner",
141
]
],
"total": 2,
"limited": false
}
Accessing Nested Data in Projections
Select expressions can reference attributes and content even if the
predicate does not. The query engine detects field references in both the
predicate and select expression to determine which fields to load from disk.
swamp data query 'dataType == "resource"' --select 'attributes.stdout'
If a record's attributes do not contain the referenced key, the projection
produces null for that record.
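The projection shapes described in this section can be mirrored with plain Python comprehensions. The snippet is only an analogy for the result shapes: the two sample records are made up for the illustration, and Python's None stands in for the null that a projection yields when a key is missing.

```python
# Two illustrative records; only the second lacks a stdout attribute.
records = [
    {"name": "result", "modelName": "hello-world", "size": 157,
     "attributes": {"stdout": "Hello from the swamp!"}},
    {"name": "log", "modelName": "scanner", "size": 22, "attributes": {}},
]

# --select name  ->  one scalar per record
scalar = [r["name"] for r in records]

# --select '{"name": name, "model": modelName, "size": size}'  ->  one map per record
as_map = [{"name": r["name"], "model": r["modelName"], "size": r["size"]} for r in records]

# --select '[name, modelName, size]'  ->  one list per record
as_list = [[r["name"], r["modelName"], r["size"]] for r in records]

# --select 'attributes.stdout'  ->  None where the key is missing
nested = [r["attributes"].get("stdout") for r in records]

print(scalar)
print(nested)
```

Note the difference from predicates: a missing key drops a record from a filter, but in a projection the record stays and its projected value is null.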
data.query() in CEL Expressions
The data.query() function provides the same query capability inside
CEL expressions used in model definitions,
workflow steps, and data output overrides.
data.query("modelName == \"scanner\" && size > 1000")
data.query("modelName == \"scanner\"", "attributes.status")
| Signature | Returns | Description |
|---|---|---|
| data.query(predicate) | list<DataRecord> | Records matching the predicate |
| data.query(predicate, select) | list<dyn> | Projected values from matches |
The predicate and select arguments are strings containing CEL expressions. The same DataRecord fields and operators are available as in the CLI command.
Interactive Mode
When invoked without a predicate in a terminal, swamp data query opens an
interactive TUI for browsing data. The TUI supports:
- Filtering by tag keys and values
- Text search across record fields
- Selecting and inspecting individual records
Non-interactive invocations (piped output, --json flag, or no TTY) require a
CEL predicate argument.
Result Structure
JSON output includes these top-level fields:
| Field | Type | Description |
|---|---|---|
| predicate | string | The CEL predicate used (omitted with --select) |
| results | list | Matching DataRecords or projected values |
| total | int | Number of results returned |
| limited | bool | true when results were truncated by --limit |
Without --select, each result is a full DataRecord. With --select, each
result is the projected value (scalar, map, or list depending on the select
expression shape).
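A consumer of the JSON output might load it into a small typed structure. The Python sketch below is one possible approach, with hypothetical names (QueryResult, parse_query_result). It assumes that an error response is a JSON object with a single "error" string, as in the unknown-field example earlier on this page.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class QueryResult:
    results: List[Any]
    total: int
    limited: bool
    predicate: Optional[str] = None  # absent when --select is used


def parse_query_result(text: str) -> QueryResult:
    """Parse `swamp data query --json` output (hypothetical helper)."""
    payload = json.loads(text)
    if "error" in payload:
        # Assumed error shape: {"error": "..."} with no results key.
        raise RuntimeError(payload["error"])
    return QueryResult(
        results=payload["results"],
        total=payload["total"],
        limited=payload["limited"],
        predicate=payload.get("predicate"),
    )


# Sample copied from the scalar projection example on this page.
sample = '{"results": ["result", "result"], "total": 2, "limited": false}'
parsed = parse_query_result(sample)
if parsed.limited:
    print("results were truncated by --limit; raise it to see more")
print(parsed.results, parsed.predicate)
```

Checking limited before trusting total is worthwhile, since total counts the results returned rather than all records that match.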