HOW SWAMP WORKS

Swamp is an adaptive automation framework designed to be operated by AI agents. It decomposes automation into a small set of composable primitives: models, workflows, vaults, extensions, skills, and a data layer. Each primitive does one thing. What matters is how they fit together.

Models are the unit of work

A model represents a typed interaction with an external system. It has a type (like command/shell or @swamp/aws/ec2/vpc), a set of methods (like create, sync, delete), and typed schemas for both its inputs and outputs. The model's type defines the shape of data it accepts and produces. A model definition is a YAML file that fills in the specifics: which VPC CIDR block, which shell command, which API endpoint.

This separation matters. The model type is reusable logic written in TypeScript. The model definition is configuration written in YAML. Because the definition is plain data rather than code, an agent can produce one using only its general reasoning ability. It writes structured YAML that conforms to the type's schema, and the type handles execution.
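As a sketch, a definition for the command/shell type mentioned above might look like this. The field names here are illustrative, not the type's actual schema:

```yaml
# Hypothetical model definition -- field names are illustrative.
# The model type (command/shell) supplies the execution logic;
# this file supplies only the configuration.
type: command/shell
name: disk-check
inputs:
  command: df -h /var/lib/data
```

The agent never writes execution code for this: it writes data that conforms to the type's input schema, and the type does the rest.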

Every method execution produces versioned, immutable data artifacts. A command/shell model's execute method writes a result resource containing stdout, stderr, and exit code. A cloud model's create method writes the resource's state. This data is stored in .swamp, versioned automatically, and queryable. Any subsequent operation can look up what previous operations produced.
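The stored artifact for a shell execution might look roughly like this; the exact shape is illustrative, but the stdout/stderr/exit-code fields and automatic versioning come from the description above:

```yaml
# Illustrative shape of a versioned result resource in .swamp.
name: disk-check
kind: result
version: 3            # incremented automatically on each execution
attributes:
  stdout: "/dev/sda1  98G  41G  52G  45% /var/lib/data\n"
  stderr: ""
  exit_code: 0
```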

Workflows wire models together

A workflow is a DAG of jobs, where each job contains steps that run model methods. Jobs execute in dependency order. Steps within a job run in parallel.

Workflows become useful through data chaining. A step can reference the output of a previous step using CEL expressions. If step A runs a shell command that looks up the latest AMI, step B can read that AMI ID from step A's output and pass it as an argument to an EC2 model's create method. The expression data.latest("ami-lookup", "result").attributes.stdout reaches into the stored data from step A and extracts what step B needs.

This means workflows don't just sequence actions -- they compose data flows. The output of one model becomes the input of another through typed, validated references. If the data doesn't exist yet (because a dependency hasn't run), the expression fails loudly rather than passing through a blank value.
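A two-job workflow built around the AMI example might be sketched like this. The data.latest expression is taken from the text above; the surrounding YAML structure and the ${ } interpolation syntax are assumptions for illustration:

```yaml
# Hypothetical workflow -- structure and interpolation syntax are illustrative.
jobs:
  lookup:
    steps:
      - model: ami-lookup          # a command/shell model
        method: execute
  provision:
    needs: [lookup]                # jobs execute in dependency order
    steps:
      - model: web-server          # an EC2-style model
        method: create
        inputs:
          # CEL expression reaching into the stored output of the lookup step
          ami: ${ data.latest("ami-lookup", "result").attributes.stdout }
```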

Workflows can also nest. A step can invoke another workflow instead of a model method, passing inputs down. This lets you build higher-order automation: a provisioning workflow that calls a networking workflow that calls individual resource workflows.
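A nested invocation might be sketched as a step that names a workflow instead of a model, with inputs passed down; again, the field names are illustrative:

```yaml
# Illustrative: a step invoking another workflow rather than a model method.
jobs:
  network:
    steps:
      - workflow: networking       # nested workflow
        inputs:
          cidr: 10.0.0.0/16        # passed down to the child workflow
```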

Vaults keep secrets out of the data path

The data flow described above has a problem: some values are sensitive, and they shouldn't be stored in plaintext YAML or versioned alongside the data they help produce. Vaults provide a separate path for secrets. Model definitions and workflows reference secrets by name rather than by value, and swamp resolves those references at runtime -- not when the definition is created or the workflow is planned, but when the step actually executes. This means secrets are never frozen into YAML files, never written to .swamp data, and never cached between runs. The same principle applies to output: when a model's schema marks a field as sensitive, swamp redirects that value into vault storage automatically, so secrets never persist in the data layer either.
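A definition referencing a secret by name might look like this. The vault reference syntax here is an assumption; what the text guarantees is only that the reference is resolved at step execution time, not frozen into the file:

```yaml
# Illustrative secret reference -- the vault syntax is hypothetical.
type: command/shell
name: publish
inputs:
  command: npm publish
  env:
    NPM_TOKEN: ${ vault.get("npm-token") }   # resolved when the step executes
```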

There's a design tension between solo developers, who need secrets to work without external infrastructure, and teams, who need secrets to live in managed systems (AWS Secrets Manager, 1Password, and the like). Vaults resolve this through swappable providers: a local encryption provider works out of the box, while provider-backed vaults delegate to external secret management. The vault configuration lives in the repo, but the secrets themselves live wherever the provider puts them. Because a vault provider is one of the things an extension can package, teams can integrate with whatever secret backend they already use.
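A vault configuration under this model might be sketched as follows; the provider names and field layout are hypothetical:

```yaml
# Illustrative vault configuration -- provider names are hypothetical.
vaults:
  - name: default
    provider: local-encryption       # works out of the box, no infrastructure
  - name: team
    provider: aws-secrets-manager    # provider-backed; secrets live in AWS
    config:
      region: us-east-1
```

The configuration is committed to the repo; the secret values themselves never are.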

Extensions package reusable model types

When no built-in model type exists for a task, someone writes one. An extension model is a TypeScript file that exports a type definition with Zod schemas for inputs and outputs, and execute functions for each method. The file lives in extensions/models/ and swamp discovers it at startup.

Extensions aren't limited to models. They can also package vault providers (custom secret backends), execution drivers (custom runtimes like Docker, SSH, or Lambda), datastores (custom backends for .swamp data), and reports. A single extension manifest bundles whichever of these it needs; the extension is versioned with CalVer and published to a registry.
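A manifest bundling several of these might be sketched like this; every field name here is hypothetical, but the CalVer version and the mix of component kinds come from the description above:

```yaml
# Illustrative extension manifest -- field names are hypothetical.
name: "@acme/netops"
version: 2025.06.1          # CalVer
models:
  - models/subnet.ts
vaultProviders:
  - providers/acme-vault.ts
drivers:
  - drivers/ssh.ts
```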

The registry exists so that no single person or team needs to build every integration. Someone writes a model type for their own use, publishes it, and anyone else can install it. Trusted collectives go further -- referencing an @swamp/aws model type causes swamp to resolve and install the extension automatically.

Skills teach agents how to use swamp

None of this works if the agent doesn't know how to use it. Skills are the bridge between swamp's capabilities and an agent's ability to exercise them.

A skill is a markdown document -- not executable code -- that teaches an agent how to work with a specific part of swamp. The swamp-model skill explains how to search for model types, create definitions, run methods, and inspect outputs. The swamp-workflow skill covers DAG construction, data chaining, and run history. There are skills for data management, vault configuration, extension authoring, troubleshooting, and more.

Each skill has trigger patterns that tell the agent when to load it. When a user says "create a workflow," the agent loads the swamp-workflow skill and gains the knowledge to do it correctly -- the right CLI commands, the right YAML structure, the validation steps, the common pitfalls.
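A skill file might be sketched like this. The frontmatter-with-triggers layout is an assumption; the text specifies only that a skill is markdown with trigger patterns and instructional content:

```markdown
---
name: swamp-workflow
triggers: ["create a workflow", "chain steps", "DAG"]
---

# Working with swamp workflows

1. List the available model definitions before wiring steps together.
2. Express cross-step references with `data.latest(...)` CEL expressions.
3. Validate the workflow file before running it, and check run history after.
```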

Skills are bundled with swamp and live in the agent's workspace. Rather than building agent logic into swamp itself, swamp teaches the agent through documentation. The agent uses its general reasoning abilities to apply the skill's instructions to the user's specific situation. Because the interface is markdown rather than a plugin API, any agent that can read and follow instructions can use them.

Data querying ties it all together

Every model execution, every workflow run, every step produces data. Swamp stores this data with rich metadata: which model produced it, which spec it belongs to, tags from the model and workflow, a version number, a timestamp.

The data query system uses CEL (Common Expression Language) to filter across all stored data. You can find all failed results (attributes.status == "failed"), all data from a specific workflow (tags.workflow == "deploy"), or all resources larger than a megabyte (size > 1048576). Projection with --select extracts specific fields from matching records.

Data querying serves two audiences. For humans, it's an inspection tool -- what did the last run produce? What's the current state of my infrastructure? For agents and models, it's a composition mechanism. Extension models can call context.queryData() from within their execute functions to find and process data from other models. Workflow expressions can use data.query() to conditionally branch based on what exists in the datastore.
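To make the composition mechanism concrete, here is a sketch of an extension model using the query hook from inside its execute logic. The context is mocked with an in-memory datastore, and the predicate-function signature is an assumption -- swamp's real queryData presumably takes a CEL expression string:

```typescript
// Illustrative use of the data-query hook inside an extension model.
// The context shape is mocked; swamp's real interface may differ.
type DataRecord = { name: string; attributes: { status: string } };

interface Context {
  queryData(filter: (r: DataRecord) => boolean): DataRecord[];
}

// A toy in-memory context standing in for swamp's datastore.
const context: Context = {
  queryData: (filter) =>
    [
      { name: "deploy-1", attributes: { status: "failed" } },
      { name: "deploy-2", attributes: { status: "ok" } },
    ].filter(filter),
};

// Inside execute: find every failed result produced by other models.
export function retryTargets(ctx: Context): string[] {
  return ctx
    .queryData((r) => r.attributes.status === "failed")
    .map((r) => r.name);
}
```

The point is the pattern, not the signature: one model's execute function reads what other models produced, without those models knowing about each other.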

The data layer also manages lifecycle. Resources have lifetimes (ephemeral, temporary, infinite) and garbage collection policies. Old versions get cleaned up automatically. Data can be renamed with forward references so that refactoring names doesn't break existing workflows. The system is designed for data to accumulate and be useful, not to become debt.
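Lifecycle settings might be expressed roughly like this; the field names are hypothetical, but the three lifetimes and automatic cleanup of old versions come from the text:

```yaml
# Illustrative lifecycle settings -- field names are hypothetical.
lifecycle:
  lifetime: temporary      # ephemeral | temporary | infinite
  gc:
    keepVersions: 5        # older versions are collected automatically
```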

The composition

These primitives compose in a specific pattern:

  1. A skill teaches an agent what swamp can do and how to do it
  2. The agent creates model definitions from existing types, or writes extension models when no type exists
  3. The agent wires models into workflows with data chaining expressions
  4. Vault expressions supply secrets at runtime, keeping sensitive values out of configuration files and stored data
  5. Each execution produces versioned data that's queryable and referenceable
  6. Data queries feed into future decisions -- by humans inspecting results, by agents reasoning about state, or by models reading each other's output

Several properties of the system support this composition. Model definitions and workflows are YAML files committed to git, so they're reviewable and versioned alongside the code they automate. Data is immutable and versioned, so you can always see what happened. Schemas are validated at creation time and execution time, so type mismatches surface early. Extensions are published and versioned, so teams can share and build on each other's work.

Swamp is an architecture for adaptive, AI-driven automation. It provides typed building blocks and a data layer. The agent decides how to assemble them for a given task.