01 Issue
Bug · Shipped · Swamp CLI
Assignees: stack72

#218 S3 datastore global lock enters infinite stale-lock loop after interrupted workflow

Opened by bixu · 5/4/2026 · Shipped 5/4/2026

Description

After an interrupted swamp workflow run leaves a stale global lock in the S3 datastore, force-releasing it with swamp datastore lock release --force does not resolve the problem: subsequent swamp workflow run invocations immediately enter an infinite stale-lock loop and never make progress.

Steps to Reproduce

  1. Start a swamp workflow run against an S3-backed datastore
  2. Interrupt it mid-execution (Ctrl-C or process kill)
  3. Verify stale lock with swamp datastore lock status --json — lock is held by a dead PID
  4. Release with swamp datastore lock release --force --json — returns {"released": true}
  5. Run swamp workflow run <name> again
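
Step 3 depends on recognizing a lock as stale. The following is a minimal sketch of a TTL-based staleness check, not swamp's actual implementation. The LockRecord shape, its field names, and the isStale function are hypothetical; only the 30000 ms TTL is taken from the log output in this report.

```typescript
// Hypothetical lock record: the real output of
// `swamp datastore lock status --json` may use different fields.
interface LockRecord {
  holder: string; // identifier of the process that took the lock
  acquiredAtMs: number; // epoch milliseconds of the last acquire or refresh
}

// TTL value as reported in the log line "exceeded TTL of 30000ms".
const LOCK_TTL_MS = 30_000;

// A lock counts as stale once more than ttlMs has elapsed since it was
// last acquired or refreshed. Exactly ttlMs elapsed is still considered live.
function isStale(
  lock: LockRecord,
  nowMs: number,
  ttlMs: number = LOCK_TTL_MS,
): boolean {
  return nowMs - lock.acquiredAtMs > ttlMs;
}
```

A check like this only says that the holder stopped refreshing the lock. It does not prove the holder process is dead, which is why the PID is also inspected in step 3.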

Expected Behavior

The workflow runs normally after the stale lock is released.

Actual Behavior

The workflow immediately enters an infinite cycle (~1s per iteration):

INF datastore·lock Global lock acquired by "..." during per-model lock acquisition — releasing and retrying
WRN datastore·lock Global lock held by "..." appears stale (exceeded TTL of 30000ms) — proceeding
INF datastore·lock Global lock released, proceeding with per-model locks
INF datastore·lock Global lock acquired by "..." during per-model lock acquisition — releasing and retrying

The workflow eventually crashes with:

error: Top-level await promise never resolved
    await runCli(Deno.args);
node:_tls_wrap:17670: Uncaught TypeError: this._handle.start is not a function
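
The log shows the release-and-retry cycle repeating without any visible upper bound. One common way to turn such a loop into a clear failure is a retry budget with capped exponential backoff. The sketch below is illustrative only: RetryBudget and RetryBudgetExceededError are hypothetical names, and nothing here is taken from the swamp codebase or from the fix that eventually shipped.

```typescript
// Hypothetical helper, not part of swamp: caps how often a lock-acquisition
// loop may retry, so a lock that keeps reappearing ends in an error
// instead of an endless cycle.
class RetryBudgetExceededError extends Error {
  readonly attempts: number;

  constructor(attempts: number, reason: string) {
    super(`gave up after ${attempts} retries: ${reason}`);
    this.name = "RetryBudgetExceededError";
    this.attempts = attempts;
  }
}

class RetryBudget {
  private used = 0;
  private readonly maxRetries: number;
  private readonly baseDelayMs: number;
  private readonly maxDelayMs: number;

  constructor(maxRetries: number, baseDelayMs: number, maxDelayMs: number) {
    this.maxRetries = maxRetries;
    this.baseDelayMs = baseDelayMs;
    this.maxDelayMs = maxDelayMs;
  }

  // Records one retry and returns how long to wait before the next attempt.
  // Throws once the budget is spent instead of allowing another iteration.
  consume(reason: string): number {
    if (this.used >= this.maxRetries) {
      throw new RetryBudgetExceededError(this.used, reason);
    }
    // Delay doubles with each retry, capped at maxDelayMs.
    const delay = Math.min(this.baseDelayMs * 2 ** this.used, this.maxDelayMs);
    this.used += 1;
    return delay;
  }
}
```

A caller would invoke consume() each time the "releasing and retrying" branch is taken and sleep for the returned number of milliseconds. With new RetryBudget(3, 1000, 3000), the first three calls yield delays of 1000, 2000, and 3000 ms, and the fourth throws.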

Environment

  • swamp version: 20260502.153639.0-sha.907c6883
  • datastore: S3 (AWS)
  • OS: macOS (darwin)
  • Reproduced across multiple sequential and parallel workflow run attempts

Additional Notes

Running two workflows in parallel originally caused the lock contention. Every individual run after that continued to exhibit this loop, even with the global lock force-released between attempts. Per-model locks (e.g. data/@hivemq/harvester-host-kernel/<id>/.lock) may also remain stale. However, swamp datastore lock release --force --model <type/id>, using the normalized type/id format, returned released: false, reason: no lock held, which suggests the per-model lock key format may differ from what the CLI expects.
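
To compare a stored lock key with the value passed to --model, it can help to split the key into its type and id parts. The sketch below assumes keys follow the data/<type>/<id>/.lock layout seen in the example path above, where the type may itself contain slashes. parseModelLockKey and ModelLockRef are hypothetical names, and the layout is inferred from that one example rather than confirmed against swamp.

```typescript
// Hypothetical parser: assumes keys look like data/<type>/<id>/.lock, where
// <type> may contain slashes (e.g. "@hivemq/harvester-host-kernel").
interface ModelLockRef {
  type: string;
  id: string;
}

function parseModelLockKey(key: string): ModelLockRef | null {
  const parts = key.split("/");
  // Need at least: "data", one type segment, an id, and ".lock".
  if (
    parts.length < 4 ||
    parts[0] !== "data" ||
    parts[parts.length - 1] !== ".lock"
  ) {
    return null;
  }
  // The id is the segment directly before ".lock"; everything between
  // "data" and the id is treated as the type.
  const id = parts[parts.length - 2] ?? "";
  const type = parts.slice(1, -2).join("/");
  if (id === "" || type === "") return null;
  return { type, id };
}
```

For a key such as data/@hivemq/harvester-host-kernel/abc123/.lock (abc123 is a made-up id), this yields the type @hivemq/harvester-host-kernel and the id abc123. Whether the CLI expects exactly that type/id pair is the open question raised above.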

02 Bog Flow
OPEN · TRIAGED · IN PROGRESS · SHIPPED · + 1 MORE · ASSIGNED · + 5 MORE · REVIEW · + 3 MORE · PR_MERGED · SHIPPED

Shipped

5/4/2026, 3:26:43 PM


03 Sludge Pulse
stack72 assigned stack72 · 5/4/2026, 1:01:44 PM