#355 Partitioned index for S3/GCS datastores (Phase 3)
Opened by stack72 · 5/14/2026
Problem
S3/GCS datastore extensions use a monolithic .datastore-index.json that tracks every file in the bucket. This creates two problems:
Push serialization. Per-model push acquires a global lock because concurrent writes to the same index file would clobber each other. Two people running model methods on different models serialize their pushes even though their data doesn't overlap.
Pull scope. pullChanged fetches the entire index and walks all entries even when only one model is locked. With Phase 1's scope parameter now available in the framework, pull could be scoped, but the monolithic index means you still pay to fetch and parse the whole thing.
Phase 1 (issue #350, shipped) added scope and capabilities() to the domain contracts. Phase 2 (issue #354, shipped) added per-path dirty tracking so push only uploads changed files. Phase 3 completes the picture: partition the index so S3/GCS can declare scopedSync: true and unlock concurrent per-model push without the global lock.
Proposed Solution
Replace the monolithic .datastore-index.json with a two-level structure:
Manifest + per-model sub-indexes
.datastore-index.json → manifest (lightweight, lists sub-indexes)
.datastore-index/
data/aws-ec2/i-abc123.json → sub-index for one model
data/aws-ec2/i-def456.json → sub-index for another model
outputs.json → sub-index for outputs subdirectory
workflow-runs.json → sub-index for workflow runs
...
The manifest is a small JSON file listing sub-index keys and their ETags/generations. A scoped pull only fetches the manifest + the relevant sub-index. A scoped push only writes the relevant sub-index and updates the manifest entry.
Each sub-index has the same structure as today's index entries for files under that prefix — keys, sizes, lastModified, localMtime.
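The layout above could be modeled roughly as follows. This is a minimal sketch, not the extensions' actual types: only the entry fields (keys, sizes, lastModified, localMtime) come from this issue, while the type names, the `version` field, and the partitioning rule in `subIndexKeyFor` are assumptions inferred from the example tree.

```typescript
// Hypothetical shapes: only keys, sizes, lastModified, and localMtime come
// from the issue text; every other name here is an assumption.
interface IndexEntry {
  size: number;
  lastModified: string; // remote timestamp, ISO 8601 assumed
  localMtime: number; // local mtime in epoch milliseconds, assumed
}

// One sub-index: file key -> entry, for all files under one prefix.
type SubIndex = Record<string, IndexEntry>;

// The manifest lists sub-index keys with their ETag (S3) or generation (GCS).
interface Manifest {
  subIndexes: Record<string, { version: string }>;
}

// Assumed partitioning rule, inferred from the example tree:
// data/<type>/<id>/... maps to "data/<type>/<id>", and anything else maps
// to its top-level directory.
function subIndexKeyFor(fileKey: string): string {
  const parts = fileKey.split("/");
  if (parts[0] === "data" && parts.length >= 4) {
    return parts.slice(0, 3).join("/");
  }
  return parts[0];
}

// Object path of a sub-index inside the bucket.
function subIndexPath(subIndexKey: string): string {
  return `.datastore-index/${subIndexKey}.json`;
}

// Example values matching the tree above (all data is illustrative).
const exampleSubIndex: SubIndex = {
  "data/aws-ec2/i-abc123/state.json": {
    size: 128,
    lastModified: "2026-05-14T00:00:00Z",
    localMtime: 1778716800000,
  },
};
const exampleManifest: Manifest = {
  subIndexes: { "data/aws-ec2/i-abc123": { version: "etag-1" } },
};
```

Keying the manifest by the same string that names the sub-index object keeps the lookup from scope to sub-index a pure string operation, with no bucket listing required.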
Scoped operations
- pushChanged({ scope: "data/aws-ec2/i-abc123/" }): only walks/uploads files under that prefix, writes only that sub-index, and updates the manifest entry. No global lock is needed because sub-indexes don't overlap.
- pullChanged({ scope: "data/aws-ec2/i-abc123/" }): fetches the manifest, then only the relevant sub-index, then only stats/downloads files in that sub-index.
- Unscoped operations (structural commands): fetch the manifest + all sub-indexes, equivalent to today's full index.
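One way to decide which sub-indexes a scoped or unscoped operation needs is sketched below. The `ManifestEntries` shape and the `subIndexesForScope` helper are hypothetical, and the matching rules (exact match, scope nested inside a sub-index, scope covering several sub-indexes) are assumptions rather than settled behavior.

```typescript
// Hypothetical manifest shape: sub-index key -> ETag/generation string.
type ManifestEntries = Record<string, string>;

// Returns the sub-index keys a pull or push must touch.
// No scope means a structural command: every sub-index is needed.
function subIndexesForScope(manifest: ManifestEntries, scope?: string): string[] {
  const keys = Object.keys(manifest).sort();
  if (scope === undefined) return keys;
  const prefix = scope.replace(/\/+$/, ""); // drop trailing slashes
  return keys.filter((key) =>
    key === prefix || // scope names exactly one sub-index
    prefix.startsWith(key + "/") || // scope is nested inside this sub-index
    key.startsWith(prefix + "/") // scope covers several sub-indexes
  );
}

const manifest: ManifestEntries = {
  "data/aws-ec2/i-abc123": "etag-1",
  "data/aws-ec2/i-def456": "etag-2",
  "outputs": "etag-3",
  "workflow-runs": "etag-4",
};

// Scoped: only one sub-index has to be fetched.
const scoped = subIndexesForScope(manifest, "data/aws-ec2/i-abc123/");
// Unscoped: all four, equivalent to today's full index.
const unscoped = subIndexesForScope(manifest);
```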
Declaring scopedSync
After the index is partitioned, both extensions add:
capabilities: () => ({ scopedSync: true }),
This tells the framework to use per-model locks only (no global lock) for per-model push, and to pass scope to pull/push.
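How the framework might act on that flag can be sketched as follows. Everything here except `scopedSync` and `scope` is hypothetical: the `SyncPlan` type, the lock names, and the assumption that the non-scoped path takes both a global and a per-model lock are illustrative only and not taken from the framework's code.

```typescript
// Hypothetical framework-side types; only scopedSync and scope come from the issue.
interface Capabilities {
  scopedSync?: boolean;
}

interface SyncPlan {
  locks: string[]; // illustrative lock names
  scope?: string; // passed through to pushChanged/pullChanged when set
}

// Chooses locks and scope for a per-model push.
function planPerModelPush(capabilities: Capabilities, modelPrefix: string): SyncPlan {
  if (capabilities.scopedSync) {
    // Partitioned index: the per-model lock is enough and scope is forwarded.
    return { locks: [`model:${modelPrefix}`], scope: modelPrefix };
  }
  // Monolithic index: the global lock still serializes index writes (assumed).
  return { locks: ["global", `model:${modelPrefix}`] };
}

const withScopedSync = planPerModelPush({ scopedSync: true }, "data/aws-ec2/i-abc123/");
const withoutScopedSync = planPerModelPush({}, "data/aws-ec2/i-abc123/");
```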
Migration and backward compatibility
This is the highest-risk phase. Migration must be safe:
Old format detection. On first push from a new extension version, detect the monolithic .datastore-index.json, read it, split entries by prefix into sub-indexes, write the manifest + sub-indexes, and delete the old file.
Compatibility shim. During a transition period (one release cycle), also write a monolithic .datastore-index.json as a read-only shim so older swamp versions can still pull. The shim is rebuilt from sub-indexes on each push. Remove the shim after one release cycle.
Idempotent migration. A crash mid-migration must be resumable. The manifest is the source of truth: if it exists, the migration completed. If the old monolithic file exists without a manifest, migration hasn't run yet. If both exist, the shim is active.
Discovery fallback. The existing discoverIndexFromBucket fallback (lists bucket, builds index from listing) still works as the last-resort recovery: if both manifest and monolithic index are somehow lost, discovery rebuilds from bucket contents.
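The state rules and the split/rebuild steps above can be sketched as pure functions. This is an illustration under assumptions: the state names, the helper names, and the partitioning rule are hypothetical. The sketch also assumes the migration writes all sub-indexes first and the manifest last, so that a crash before the manifest exists simply reruns the split.

```typescript
interface IndexEntry {
  size: number;
  lastModified: string;
  localMtime: number;
}
type Index = Record<string, IndexEntry>; // file key -> entry

// Hypothetical state names derived from which index files exist.
type MigrationState = "discover" | "needs-migration" | "migrated" | "shim-active";

function detectState(hasManifest: boolean, hasMonolithic: boolean): MigrationState {
  if (hasManifest) return hasMonolithic ? "shim-active" : "migrated";
  return hasMonolithic ? "needs-migration" : "discover";
}

// Assumed partitioning rule: data/<type>/<id>/... or the top-level directory.
function subIndexKeyFor(fileKey: string): string {
  const parts = fileKey.split("/");
  return parts[0] === "data" && parts.length >= 4
    ? parts.slice(0, 3).join("/")
    : parts[0];
}

// Splits a monolithic index into sub-indexes keyed by prefix.
function splitIndex(monolithic: Index): Record<string, Index> {
  const subIndexes: Record<string, Index> = {};
  for (const fileKey of Object.keys(monolithic)) {
    const key = subIndexKeyFor(fileKey);
    if (subIndexes[key] === undefined) subIndexes[key] = {};
    subIndexes[key][fileKey] = monolithic[fileKey];
  }
  return subIndexes;
}

// Rebuilds the read-only monolithic shim by merging every sub-index.
function rebuildShim(subIndexes: Record<string, Index>): Index {
  const shim: Index = {};
  for (const subIndex of Object.values(subIndexes)) {
    Object.assign(shim, subIndex);
  }
  return shim;
}
```

Because sub-indexes partition the file keys without overlap, splitting and then rebuilding yields the same set of entries, which is the property the compatibility tests below would rely on.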
Risk factors
- Mixed-version teams. User A on new swamp (partitioned) and user B on old swamp (monolithic). The compatibility shim handles this — user B reads the shim. But user B's push writes a monolithic index that the new format won't expect. The migration code must handle this gracefully (re-split on next push from a new version).
- Manifest as coordination point. The manifest replaces the monolithic index as the thing that must be atomically updated. Concurrent manifest writes from two different scoped pushes could still race. Options: use conditional writes (S3 If-Match, GCS ifGenerationMatch), or accept last-writer-wins on the manifest since each push only updates its own entry.
- Sub-index count. A repo with 500 models means 500+ sub-index files. The manifest avoids needing to list them, but storage cost and cleanup (orphaned sub-indexes after model deletion) need consideration.
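The conditional-write option can be illustrated with an in-memory stand-in for the bucket. This is a sketch under assumptions: `InMemoryManifestStore` and `updateManifestEntry` are hypothetical, and `putIfMatch` only imitates the compare-and-swap behavior of S3 If-Match or GCS ifGenerationMatch without calling either service. Note that if the manifest is a single object rewritten in full, plain last-writer-wins can drop the other push's entry, which is the case the retry loop guards against.

```typescript
interface Versioned {
  body: string;
  etag: string;
}

// In-memory stand-in for one manifest object with compare-and-swap writes.
class InMemoryManifestStore {
  private obj: Versioned | undefined;
  private counter = 0;

  async get(): Promise<Versioned | undefined> {
    return this.obj;
  }

  // Writes only if the stored ETag equals expectedEtag
  // (undefined means "the object must not exist yet").
  async putIfMatch(body: string, expectedEtag: string | undefined): Promise<boolean> {
    if (this.obj?.etag !== expectedEtag) return false;
    this.counter += 1;
    this.obj = { body, etag: `v${this.counter}` };
    return true;
  }
}

type ManifestEntries = Record<string, string>; // sub-index key -> sub-index ETag

// Read-modify-write of one manifest entry, retried when another push wins the race.
async function updateManifestEntry(
  store: InMemoryManifestStore,
  subIndexKey: string,
  subIndexEtag: string,
  maxAttempts = 5,
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = await store.get();
    const entries: ManifestEntries = current ? JSON.parse(current.body) : {};
    entries[subIndexKey] = subIndexEtag;
    if (await store.putIfMatch(JSON.stringify(entries), current?.etag)) return;
    // Precondition failed: someone else updated the manifest, so re-read and retry.
  }
  throw new Error(`manifest update for ${subIndexKey} failed after ${maxAttempts} attempts`);
}
```

Two concurrent calls for different sub-index keys both land: the loser of the first write sees a failed precondition, re-reads the manifest that now holds the other entry, and adds its own on the retry.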
Files to change
In systeminit/swamp-extensions:
- datastore/s3/extensions/datastores/_lib/s3_cache_sync.ts: index read/write, manifest management, migration, scoped pull/push
- datastore/s3/extensions/datastores/s3.ts: add capabilities() to provider
- datastore/gcs/extensions/datastores/_lib/gcs_cache_sync.ts: same changes, mirrored
- datastore/gcs/extensions/datastores/gcs.ts: add capabilities() to provider
- Test files for both extensions
Validation
- Migration test: create monolithic index, run new code, verify split + manifest created
- Migration test: crash mid-split, rerun, verify idempotent completion
- Compatibility test: old swamp reads the shim after new swamp writes partitioned index
- Compatibility test: old swamp writes monolithic, new swamp re-splits on next push
- Scoped push test: two models push concurrently, verify both sub-indexes correct and no data loss
- Scoped pull test: pull with scope only downloads the relevant sub-index entries
- Cold-start test: empty bucket, verify discovery still works with new format
- GC test: deleted model's sub-index is cleaned up
Context
This is Phase 3 of the scoped sync plan documented in design/datastores.md. Depends on Phase 1 (#350, shipped) for scope/capabilities contracts and Phase 2 (#354, shipped) for per-path dirty tracking. After this phase, S3/GCS users get concurrent per-model push — the main goal of the entire initiative.
After this ships, open an issue on https://github.com/keeb/swamp-mongodb-datastore for keeb to add capabilities: () => ({ scopedSync: true }) to their provider — their extension already supports scoped sync natively.
Closed