extension unyank needs -y/--yes flag for non-interactive use

The audit identified the in-process Deno import cache as a structural cause of brittleness in the extension layer. When a long-running swamp process (workflow runner, eventually a daemon mode, the orchestrator) reloads an extension via await import(toFileUrl(path).href), Deno caches the module by URL. If the same URL is imported twice, the second import returns the cached module — even if the underlying bundle file has changed. The first import's module graph is sticky; subsequent reloads see stale code in memory while the bundle on disk is fresh.

The model loader has a workaround: an allExtensionMethodsAttached flag that re-attaches methods after each reload. The four sibling loaders don't have it. The workaround is silently load-bearing and undocumented.

Predecessors W1a-W4 (and the bug fixes around them — #265 / #273 / #274 / #284) close the catalog-side and bundle-cache-side correctness issues. W5 closes the runtime-import-side issue: bundles on disk are fresh, but V8's module graph keeps serving stale code.

Full architectural context: design/extension-rearchitecture.md ("W5 — Per-fingerprint import URLs + subprocess test harness" section) — referenced from #211.

Scope

Phase 1 — Per-fingerprint import URLs

Modify importBundleByPath (post-W4 lives once in the unified ExtensionLoader) to append a fingerprint query parameter to the bundle URL when importing:

import(toFileUrl(bundlePath).href + \"?fp=\" + sourceFingerprint)

The fingerprint is the catalog row's source_fingerprint (consistent with how freshness is determined elsewhere in the system). Different fingerprints produce different URLs, which Deno treats as distinct modules. If the bundle changes, the fingerprint changes, V8 imports the new module instead of returning the cached old one.

V8 retains both old and new modules in heap memory until garbage collected. For one-shot CLI invocations this doesn't matter (process exits, memory freed). For long-running processes, modules accumulate per fingerprint change. RAM growth is bounded in practice (each module is small) but worth measuring.

Phase 2 — Delete the `allExtensionMethodsAttached` workaround

The model loader's workaround in user_model_loader.ts becomes unnecessary once per-fingerprint URLs work correctly. Per-W4 the model loader is gone — the workaround lives in the unified ExtensionLoader if it was preserved during W4. Phase 2 deletes it and confirms the regression tests still pass.

Phase 3 — Subprocess test harness

In-process Deno can't exercise the cache-aliasing bug because the test process IS the process being tested. Module imports inside the test see the same V8 heap as production code. The fix's verification needs a subprocess test harness: spawn a Deno subprocess that imports, modify the bundle, ask the subprocess to import again, verify it sees the new content.

New src/infrastructure/testing/subprocess_test_harness.ts (or similar) provides:

spawnExtensionProcess(args): starts a subprocess that imports bundles via the unified loader
triggerReload(processHandle, source): signals the subprocess to re-import after the source is modified
assertModuleVersion(processHandle, expectation): verifies which version the subprocess sees

The harness becomes shared infrastructure for any future test that needs cross-process module-reload semantics.

Phase 4 — RAM growth benchmark

A new benchmark in src/libswamp/extensions/ (or alongside W3's reconcile_from_disk_bench.ts) measures heap delta after N module reloads in a subprocess. Pin a threshold (e.g., ≤ 50 MB after 100 reloads) and assert it. If the threshold is blown, surface for redesign — possible mitigations include WeakRef-tracked module versions, periodic subprocess restart, or an explicit eviction API.

Pre-work decisions to pin in the PR description

Fingerprint source. Recommend: catalog row's source_fingerprint column (consistent with the rest of the freshness model). Alternative: hash of the bundle file content. Source fingerprint is path-stable across deno bundle runs that produce identical output; bundle fingerprint changes per build invocation. Source fingerprint is the right choice — it matches what findStaleFiles already computes.
URL format. Query parameter (?fp=<hash>) vs fragment (#fp=<hash>). Recommend: query parameter. Deno treats distinct query strings as distinct modules; fragments may not be respected by Deno's module loader.
RAM growth strategy. Recommend: accept indefinite retention for v1; document as known limitation. Don't add eviction logic speculatively. If post-merge measurement shows real RAM pressure for daemon use cases, file as a follow-up workstream.
Subprocess harness location. Recommend: src/infrastructure/testing/subprocess_test_harness.ts. Shared across W5 + future module-reload tests.
Cross-platform shape. Subprocess harness must work on Linux + macOS. Windows isn't a merge gate per W-series precedent. Use Deno's standard process spawning APIs; avoid platform-specific shell tricks.

Out of scope (deferred to later workstreams)

swamp doctor extensions aggregate-state rendering → W6
Bundle cache file eviction (orphaned bundles) → swamp-club#267 (post-W4)
sourceToRow mtime carrying through Source entity → swamp-club#271 (no urgency)
Daemon-mode subprocess isolation as an alternative to per-fingerprint URLs (out of scope; #267 territory if RAM growth turns out to be a problem)

Success criteria

Per-fingerprint import URLs in use across the unified loader. Bundle imports happen via import(toFileUrl(bundlePath).href + \"?fp=\" + sourceFingerprint).
allExtensionMethodsAttached workaround deleted. No remaining trace in the unified loader. Code grep returns zero occurrences in src/.
Subprocess test harness exists and verifies module reload behavior. Test: spawn subprocess → import bundle → modify bundle → trigger reload → assert subprocess sees new content. Parameterized over all 5 extension kinds.
RAM growth benchmark passes. ≤ 50 MB heap delta after 100 module reloads in a subprocess (or the threshold the agent pre-commits to).
Long-running workflow scenario verified. Run a multi-step workflow that reloads extensions across steps; confirm the workflow runner sees fresh code on each reload, not the first-import-cached code.
All existing tests pass on Linux + macOS (Windows not a merge gate per W-series precedent).
Auto-ship-on-merge readiness verified via diversity-matrix soak.

Suggested test additions

Subprocess module-reload test (parameterized over 5 kinds): import → modify → re-import → assert new content. The load-bearing structural verification.
Fingerprint-collision test: two bundles with the same fingerprint produce the same URL and share V8's cached module (the fast path; this is correct behavior).
Fingerprint-divergence test: same bundle path, different fingerprints (sequential reloads) → V8 sees distinct modules → loader uses the latest.
allExtensionMethodsAttached regression: scenario that previously needed the workaround. Without the workaround AND with per-fingerprint URLs, the scenario passes. (This is the test that proves the workaround is genuinely no longer needed.)
RAM growth benchmark: 100 reloads in a subprocess; assert heap delta within threshold.
Workflow runner integration: a workflow with multiple steps each reloading the same extension after edits; assert each step sees its expected version.

Auto-ship-on-merge constraint

Same gates as W2/W3/W4:

CI green (all new + existing tests + type-check + lint + fmt)
Author smoke on real repo: workflow runs across multiple reloads see fresh code each time
Reviewer smoke on different real repo
Diversity-matrix soak (multiple machines × OS × extension shape × workflow patterns)
- Specifically watch for: long-running process scenarios; workflow runner reloads; any module-cache-aliasing behavior
Subprocess test harness verified on Linux + macOS
RAM growth benchmark passes threshold
No allExtensionMethodsAttached references remaining in src/
Forward-only revert posture documented

Push-back encouraged

If the design doesn't fit the ground, surface before implementation. Specific watch list:

Deno's module loader behavior with query parameters. Verify empirically that import(\"...js?fp=abc\") and import(\"...js?fp=def\") produce distinct V8 modules. If Deno strips query parameters or normalizes them, the fingerprint approach breaks; surface and pivot to an alternative (e.g., temporary symlinks, content-addressed bundle paths).
allExtensionMethodsAttached may be doing more than the documented workaround suggests. If deletion breaks tests in non-obvious ways, the workaround is load-bearing for something else; surface and investigate before forcing the deletion.
Subprocess harness complexity. If spawning + signal handling + cross-platform behavior turns out to be substantially more code than expected, surface — it might warrant its own scoped PR before W5's behavior change.
RAM growth blown. If the benchmark exceeds threshold by a lot, surface for redesign discussion before merging. Mitigation options: WeakRef-tracked modules, periodic subprocess restart, fingerprint-bucketing (group nearby fingerprints into the same URL), or explicit eviction API.

The two most expensive misses to watch for

Deno strips or normalizes query parameters in import URLs. If this is true, per-fingerprint URLs don't work and the entire approach is wrong. Catch by: empirical test EARLY in implementation. A 5-line script that imports two URLs with different query parameters and confirms they're distinct modules. If they're the same module, surface immediately.
allExtensionMethodsAttached is load-bearing for something the audit didn't catch. If its deletion silently breaks production behavior in a way tests don't catch, soak might miss it. Mitigation: explicit regression test for the original failure mode the workaround was added for; if no such failure mode is documented, treat as unsafe and investigate before deletion.

References

Predecessors: #211 (W1 tracking), #223 (W1b), #231 (W2), #252 (W3), #269 (W4)
Closes: the allExtensionMethodsAttached workaround in the model loader (preserved through W4 if it survived loader unification)
Trackers it doesn't address: #267 (extension layer GC), #271 (sourceToRow mtime)
Design doc: design/extension-rearchitecture.md

02Bog Flow

Shipped

5/11/2026, 3:56:20 PM

Click a lifecycle step above to view its details.

03Sludge Pulse

stack72 assigned stack725/11/2026, 2:22:34 PM

extension unyank needs -y/--yes flag for non-interactive use

Step key parsing in execution_service uses naive split(":") and silently truncates colon-containing step names

Workflow-scope report artifacts unreachable via `swamp data get --workflow`

Add --stdin support to method run and workflow run for Unix pipe composition

Docs: add run namespace to CEL expressions reference

Fix invalidate-then-reconcile sequencing in doctor extensions; failure-mode RowStates unreachable in sourceDetails

doctor extensions invalidateAll does not trigger fingerprint recheck for existing Indexed rows

Extension failure recording has dual write paths (legacy buildIndex vs W3 reconcile)

Expose workflow run ID in CEL so resource keys can be run-scoped

Add `swamp doctor workflows` subcommand to surface workflow YAML parse errors during preflight

Add 'swamp issue comment' command for updating existing issues

Close #327 — fixed in 20260511.160514.0-sha.9d03b09a

Nested workflow task fails with 'Bad resource ID' when workflowIdOrName uses @collective/ prefix

Detect stale skill directories and prompt for repo upgrade

Improve skill trigger routing for cross-model edge cases

Add type search for driver, datastore, and report kinds

Surface ReconcileFromDisk dryRun transitions in swamp doctor extensions

Docs: update doctor extensions reference for W6 aggregate-state rendering + repair flags

swamp issue: check if reporter is on an outdated binary before opening

Implement W6: swamp doctor extensions aggregate-state rendering + repair surface (extension catalog rearchitecture)

extension pull fails referencing removed source after extension source rm

Investigate whether allExtensionMethodsAttached guard can be removed via registration-path consolidation

Trace context is lost: CLI ignores inbound TRACEPARENT and raw driver doesn't propagate active context into in-process methods

swamp model create --global-arg KEY=VALUE doesn't coerce strings to z.number() schemas

swamp model create --global-arg KEY=VALUE doesn't coerce strings to z.number() schemas

swamp issue get should rate-limit unauthenticated users instead of blocking

swamp issue get should not require authentication

telemetry: emit child entries for follow-up action method invocations

Publish release-candidate / unstable extension versions

extension source rm leaves stale @local/. rows in catalog, blocking later pulls of registry types

extension push drops binaries: field from re-emitted archive manifest.yaml

macOS launchd autoupdate (club.swamp.autoupdate) silently fails — binary stays stale

Clicking @hivemq/honeycomb extension card on /extensions shows 'Something went wrong'

Docs: Update model-definitions.md and workflows.md for direct type execution

Docs: document binaries manifest field in extension-manifest.md

Add an official @swamp/ssh extension for general-purpose SSH (brownfield-friendly)

Accept and display binaries field from extension push metadata

Direct type execution: collapse model create + method run into one command

Per-method telemetry events for workflow runs

Workflow run liveness: orphaned 'running' records when originating CLI process dies mid-run

Provide a CLI-shape primer for AI agents to reduce rediscovery overhead

quality rubric: don't penalize extensions whose upstream constrains them to a single platform

Extension update rejects multiple .ts files extending the same target type within one local extension (regression)

swamp extension push deadlocks when invoked from inside a swamp workflow step

Collective-scoped auth keys + OIDC federation for CI publishing

forEach self.* in modelIdOrName not resolved in runtime execution path

Award leaderboard points for referrals and collective invites

Add agent harness detection and AiTool to telemetry

Workflow-level runtime expressions (env.*, vault.*) not resolved in driverConfig — docker driver receives literal ${{ ... }} strings

Implement W5: Per-fingerprint import URLs + subprocess test harness (extension catalog rearchitecture)

swamp config set crashes with YAML serialization error

datastore compact: VACUUM fails in compiled binary (SQLITE_LIMIT_ATTACHED=0)

Repo-level version gating: minSwampVersion high-water mark for team consistency

Docs: document self.* expressions in modelIdOrName during forEach

Docs: How-to guide for background autoupdating

Manifest version bumps silently ignored for existing local extension aggregates

materialiseExtensions misclassifies pulled rows when manifest name collides with a pulled extension

Local extension edits don't reliably trigger rebundle

Missing unique indexes on user.email and user.username allow duplicate users

Resolve self.* expressions in modelIdOrName during forEach expansion

discord-bot double-sends sign_up notifications

Discord bot sends duplicate signup notifications

Add 'swamp workflow list' as alias for 'swamp workflow search'

Add 'swamp auth status' as alias for 'swamp auth whoami'

Extension bundle cache does not invalidate on source edits

extension pull fails on @local/[email protected] phantom-claim collision when local repo has its own extension

Lab search by numeric issue ID returns no results

W3 sourceToRow writes empty source_mtime — should carry filesystem mtime through Source entity

Warm-start rebundleAndUpdateCatalog should respect terminal RowStates set by reconcile

Implement W4: KindAdapter + unified loader (extension catalog rearchitecture)

Docs: add vault read-secret command to reference manual

Extension layer garbage collection: prune catalog rows + evict orphaned bundles

Docs: document workflow concurrency limits in reference manual

Locally-sourced extension: source_mtime updates without regenerating stale bundle

Extension push: allow shipping executable host helpers (bin/mudroom blocker)

Vault expressions silently deliver __SWAMP_VSEC__ sentinels under the docker driver

First-class shell-shim support for extensions, with registry-level visibility

Local extension model bundles don't rebuild when source changes (no rebuild CLI; manual cache delete breaks the runner)

Configurable concurrency limits for workflow fan-out (forEach, parallel jobs/steps)

feat(security): redact sensitive method arg values from audit log

Workflow-level runtime expressions (env., vault.) not resolved in driverConfig — docker driver receives literal ${{ ... }} strings