context.readModelData returns different results depending on invocation context (manual vs workflow)
Opened by stack72 · 4/7/2026 · GitHub #1113
Description
context.readModelData(modelName, specName) produces different results depending on whether the method is invoked manually (swamp model method run) or within a workflow. This makes manual runs unreliable for debugging workflow behavior.
- Manual run: readModelData returns ALL historical data for the source model (no workflowRunId available → no scoping)
- Workflow run: readModelData is scoped to data produced by the current workflow run (via workflowRunId tag filtering in raw_execution_driver.ts lines 138-140)
This means a method that works correctly in a workflow can produce wildly different (and incorrect) results when run manually for debugging.
Concrete Example
anime-source has 27 configured shows. search_configured produces 182 episodes per run.
# Manual run — returns 921 items (all historical data, including removed shows)
swamp model method run dedup filter --input sourceModel=anime-source
→ Read 921 episodes from "anime-source"
→ 304 "new" episodes (many are false positives from orphaned data)
# Workflow run — returns 182 items (current run only)
swamp workflow run discover-and-download
→ Read 182 episodes from "anime-source"
→ correct dedup results
The 921 items include data from shows that were removed from the config months ago (e.g., "Dark Gathering" removed from globalArgs, but its data persists with lifetime: infinite). This orphaned data is invisible in workflow runs but pollutes manual runs.
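To make the orphaned-data effect concrete, here is a toy sketch of how an unscoped read could be split into records for currently configured shows and orphans. The Episode shape, the show field, and the partitionOrphans helper are all hypothetical and are not part of swamp or the dedup extension; the sample data is made up for illustration.

```typescript
// Hypothetical episode shape; the real "episode" spec may look different.
interface Episode {
  show: string;
  title: string;
}

// Split records into those for currently configured shows and orphans
// (records whose show is no longer in the config).
function partitionOrphans(episodes: Episode[], configuredShows: string[]) {
  const configured = new Set(configuredShows);
  const current: Episode[] = [];
  const orphaned: Episode[] = [];
  for (const ep of episodes) {
    (configured.has(ep.show) ? current : orphaned).push(ep);
  }
  return { current, orphaned };
}

// Made-up sample of what an unscoped (manual) read might contain.
const unscoped: Episode[] = [
  { show: "Show A", title: "Episode 1" },
  { show: "Dark Gathering", title: "Episode 1" }, // removed from config, data persists
  { show: "Show A", title: "Episode 2" },
];

const { current, orphaned } = partitionOrphans(unscoped, ["Show A"]);
console.log(current.length, orphaned.length); // 2 1
```

A check like this would only be a diagnostic for manual runs; it does not address the underlying scoping difference.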
Why This Matters
- You can't debug workflows with manual runs. The primary way to test a model method is swamp model method run. If it returns different data than the workflow, you're debugging a different system.
- False confidence in fixes. A dedup fix that looks correct in manual testing may behave completely differently in the workflow (or vice versa). We spent significant time chasing dedup bugs that only manifested in one invocation context.
- No way to opt into scoping manually. There's no --scope-to-latest-run flag or equivalent. Manual runs always get the unscoped path.
Current Implementation
In raw_execution_driver.ts:
const workflowRunId = this.context.tagOverrides?.["workflowRunId"];
const readModelData = (modelName: string, specName?: string) =>
  dataAccessService.readModelData(modelName, specName, workflowRunId);
When workflowRunId is undefined (manual run), readModelData returns everything. When set (workflow run), it filters by workflowRunId tag.
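The behavior described above can be modeled with a small in-memory stand-in. This is a sketch under assumptions, not the actual dataAccessService code: the DataRecord shape, the explicit store parameter, and the run ids are hypothetical, and only the "filter by workflowRunId tag when an id is present" rule comes from the snippet above.

```typescript
// Hypothetical record shape; swamp's real data model may differ.
interface DataRecord {
  modelName: string;
  specName: string;
  name: string;
  tags: Record<string, string>;
}

// Toy stand-in for dataAccessService.readModelData: the workflowRunId
// filter is applied only when an id is supplied.
function readModelData(
  store: DataRecord[],
  modelName: string,
  specName?: string,
  workflowRunId?: string,
): DataRecord[] {
  return store.filter((r) =>
    r.modelName === modelName &&
    (specName === undefined || r.specName === specName) &&
    (workflowRunId === undefined || r.tags["workflowRunId"] === workflowRunId)
  );
}

// Made-up data: two records from an older run, one from the current run.
const store: DataRecord[] = [
  { modelName: "anime-source", specName: "episode", name: "ep-a", tags: { workflowRunId: "run-1" } },
  { modelName: "anime-source", specName: "episode", name: "ep-b", tags: { workflowRunId: "run-1" } },
  { modelName: "anime-source", specName: "episode", name: "ep-c", tags: { workflowRunId: "run-2" } },
];

// Manual run: no workflowRunId, so every historical record comes back.
const manual = readModelData(store, "anime-source", "episode");
// Workflow run: only records tagged with the current run id.
const scoped = readModelData(store, "anime-source", "episode", "run-2");
console.log(manual.length, scoped.length); // 3 1
```

The same call therefore returns different result sets purely because of whether the caller's context carries a workflowRunId.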
Proposed Solution
readModelData should behave consistently regardless of invocation context. Options:
- Default to latest execution's output: when no workflowRunId is available, scope to the source model's most recent method output instead of returning all historical data
- Add a CLI flag: swamp model method run ... --scope-to-latest to simulate workflow scoping during manual runs
- Always scope by default: return only the latest version of each unique data name, with an explicit opt-in for historical data
Any of these would make manual runs trustworthy for debugging.
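As a rough illustration of how the first and third options differ, here is a sketch over an in-memory list. The VersionedRecord shape (including the version, runId, and createdAt fields) and both helper functions are hypothetical; how swamp actually tracks versions and executions may differ.

```typescript
// Hypothetical record shape; version, runId, and createdAt are assumed fields.
interface VersionedRecord {
  name: string;
  version: number;
  runId: string;
  createdAt: number; // epoch milliseconds
}

// "Default to latest execution's output": keep only records from the
// run that produced the most recently created record.
function scopeToLatestRun(records: VersionedRecord[]): VersionedRecord[] {
  if (records.length === 0) return [];
  const newest = records.reduce((a, b) => (b.createdAt > a.createdAt ? b : a));
  return records.filter((r) => r.runId === newest.runId);
}

// "Always scope by default": keep the highest version of each unique data name.
function latestVersionPerName(records: VersionedRecord[]): VersionedRecord[] {
  const latest = new Map<string, VersionedRecord>();
  for (const r of records) {
    const seen = latest.get(r.name);
    if (seen === undefined || r.version > seen.version) latest.set(r.name, r);
  }
  return Array.from(latest.values());
}

// Made-up history: ep-b was written only by the older run.
const history: VersionedRecord[] = [
  { name: "ep-a", version: 1, runId: "run-1", createdAt: 1000 },
  { name: "ep-b", version: 1, runId: "run-1", createdAt: 1000 },
  { name: "ep-a", version: 2, runId: "run-2", createdAt: 2000 },
];

const byRun = scopeToLatestRun(history).map((r) => r.name);
const byName = latestVersionPerName(history).map((r) => `${r.name}@${r.version}`);
console.log(byRun);  // [ "ep-a" ]
console.log(byName); // [ "ep-a@2", "ep-b@1" ]
```

Under these assumptions, the per-name variant still returns ep-b, a record the latest run did not rewrite, so on its own it may not exclude orphaned data from removed shows. The run-scoped variant is closer to what the workflow path does today.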
Environment
- swamp version: 20260206.200442.0
- Extension: @keeb/mms/dedup calling readModelData("anime-source", "episode")
Related Issues
- #1020 — closed as not-a-bug (findBySpec run-scoped, but same inconsistency exists)
- #966 — forEach data.findBySpec resolves empty when data written by prior job
- #914 — context.readModelData feature request
Automoved by swampadmin from GitHub issue #1113
Open