Feature · Open · Swamp CLI
Assignees: None

#350 Scoped sync and capability-gated concurrency for datastores (Phase 1)

Opened by stack72 · 5/13/2026

Problem

Per-model commands acquire per-model locks (data/{type}/{id}/.lock) allowing concurrent model execution, but the sync layer operates globally:

  • Pull fetches the full remote index and stats every entry — O(all-files) even when only one model changed.
  • Push walks the entire local cache — O(all-files) even when only one model was written.
  • Per-model push acquires a global lock (repo_context.ts acquireModelLocks flush) because S3/GCS share a monolithic .datastore-index.json. Concurrent model pushes serialize here.

Lock granularity is fine-grained but sync scope is global, so the two don't match. For a team of N people each writing one model, every command pays O(all-files) even though only one model changed.

Community datastores like @keeb/mongodb-datastore already implement per-path dirty tracking and scoped push natively, but the framework forces global-lock push behavior on them regardless.

Proposed Solution (Phase 1 only)

Add two opt-in concepts to the domain contracts so backends can declare support for scoped sync, plus a gate in core that acts on them:

1. scope on DatastoreSyncOptions

export interface DatastoreSyncOptions {
  signal?: AbortSignal;
  relPath?: string;
  scope?: string; // path prefix, e.g. "data/aws-ec2/my-instance/"
}

Safety contract: "MUST sync at least all files matching the scope prefix, MAY sync more." Backends that ignore scope sync the full datastore — correct but wasteful.
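To make the prefix rule concrete, here is a minimal sketch of how a backend might pick the files a scoped sync must cover. The helpers `normalizeScope`, `inScope`, and `selectForSync` are hypothetical names used only for illustration; they are not part of the proposed contract. The sketch assumes a scope denotes a directory prefix, so it appends a trailing slash before matching.

```typescript
// Hypothetical helpers illustrating the "MUST sync at least the scope prefix" rule.
// None of these names exist in the framework; they are assumptions for this sketch.

// Treat the scope as a directory prefix so that "data/aws-ec2/my-instance"
// does not accidentally match "data/aws-ec2/my-instance-2/...".
function normalizeScope(scopePrefix: string): string {
  return scopePrefix.endsWith("/") ? scopePrefix : scopePrefix + "/";
}

// A path is in scope when no scope is given (full sync) or when it sits
// under the normalized scope prefix.
function inScope(relPath: string, scopePrefix?: string): boolean {
  if (scopePrefix === undefined) return true;
  return relPath.startsWith(normalizeScope(scopePrefix));
}

// The minimum set of paths a scoped sync has to cover. A backend is free
// to sync more than this; it may never sync less.
function selectForSync(paths: string[], scopePrefix?: string): string[] {
  return paths.filter((p) => inScope(p, scopePrefix));
}

const localPaths = [
  "data/aws-ec2/my-instance/state.json",
  "data/aws-ec2/my-instance-2/state.json",
  "data/aws-s3/bucket/state.json",
];

console.log(selectForSync(localPaths, "data/aws-ec2/my-instance"));
// -> ["data/aws-ec2/my-instance/state.json"]
```

A backend that ignores scope behaves like `selectForSync(localPaths)` with no second argument: it returns every path, which still satisfies the contract.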

2. capabilities() on DatastoreProvider

export interface DatastoreProvider {
  // ... existing methods unchanged ...
  capabilities?(): DatastoreCapabilities;
}

export interface DatastoreCapabilities {
  scopedSync?: boolean;
}

scopedSync: true is a hard contract, not a performance hint: concurrent scoped pushes to non-overlapping prefixes cannot corrupt shared state. An extension that declares it without honoring it risks data corruption.
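As an illustration of the opt-in, the sketch below shows how core might read the capability with a safe default. The interfaces are trimmed copies of the ones proposed above; `supportsScopedSync` and the three example providers are hypothetical and exist only for this example.

```typescript
// Trimmed copies of the proposed interfaces (existing methods omitted).
interface DatastoreCapabilities {
  scopedSync?: boolean;
}

interface DatastoreProvider {
  capabilities?(): DatastoreCapabilities;
}

// Hypothetical helper: only an explicit `true` enables scoped sync.
// A missing capabilities() method, a missing field, or `false` all
// fall back to the current global-lock behavior.
function supportsScopedSync(provider: DatastoreProvider): boolean {
  return provider.capabilities?.().scopedSync === true;
}

// Example providers (assumptions for illustration only).
const legacyProvider: DatastoreProvider = {};
const documentStoreProvider: DatastoreProvider = {
  capabilities: () => ({ scopedSync: true }),
};
const explicitOptOut: DatastoreProvider = {
  capabilities: () => ({ scopedSync: false }),
};

console.log(supportsScopedSync(legacyProvider)); // false
console.log(supportsScopedSync(documentStoreProvider)); // true
console.log(supportsScopedSync(explicitOptOut)); // false
```

Comparing against `true` rather than checking truthiness keeps the default conservative, which matters because a false positive here would remove the global lock from a backend that still needs it.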

3. Core behavior gate in acquireModelLocks flush

When capabilities().scopedSync === true:

  • Per-model push passes scope: "data/{type}/{id}/" and uses only the per-model lock
  • Per-model pull passes scope so only that model's data is fetched
  • No global lock needed

When scopedSync is false or absent (default):

  • Current behavior unchanged — global lock around push, no scope parameter
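The two branches above could be sketched roughly as follows. This is not the actual code in repo_context.ts: `SyncingProvider`, `WithGlobalLock`, and `flushModel` are hypothetical names, the placement of `push` on the provider is an assumption, and the caller is assumed to already hold the per-model lock. The options type is a trimmed copy of DatastoreSyncOptions.

```typescript
// Trimmed copy of the proposed options (signal omitted for brevity).
interface DatastoreSyncOptions {
  relPath?: string;
  scope?: string;
}

interface DatastoreCapabilities {
  scopedSync?: boolean;
}

// Hypothetical shape: a provider that can push and may declare capabilities.
interface SyncingProvider {
  capabilities?(): DatastoreCapabilities;
  push(options?: DatastoreSyncOptions): Promise<void>;
}

// Hypothetical stand-in for whatever acquires and releases the global lock.
type WithGlobalLock = <T>(fn: () => Promise<T>) => Promise<T>;

// Flush one model's data. Assumes the per-model lock is already held.
async function flushModel(
  provider: SyncingProvider,
  modelType: string,
  modelId: string,
  withGlobalLock: WithGlobalLock,
): Promise<"scoped" | "global"> {
  if (provider.capabilities?.().scopedSync === true) {
    // Capability declared: push only this model's prefix, no global lock.
    await provider.push({ scope: `data/${modelType}/${modelId}/` });
    return "scoped";
  }
  // Default path: unchanged behavior, global lock around an unscoped push.
  await withGlobalLock(() => provider.push());
  return "global";
}
```

Returning which path was taken is only a convenience for this sketch; it makes the gate easy to assert on in a unit test with a mock provider.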

Files to change

  • src/domain/datastore/datastore_sync_service.ts — add scope to DatastoreSyncOptions
  • src/domain/datastore/datastore_provider.ts — add capabilities() and DatastoreCapabilities
  • src/cli/repo_context.ts — gate acquireModelLocks flush (~lines 955-988) on capabilities().scopedSync
  • design/datastores.md — already updated with "Scoped Sync and Capability-Gated Concurrency" section

Backward compatibility

  • All additions are optional with fallback to current behavior
  • Extensions that don't implement capabilities() see zero behavior change
  • Extensions that don't handle scope in pull/push still work (full sync is a correct superset)
  • No data format changes, no migration needed

Validation

  • Existing test suite passes unchanged
  • New unit tests with a mock provider declaring scopedSync: true verify that no global lock is acquired and that scope is passed to push/pull
  • @keeb/mongodb-datastore can add capabilities: () => ({ scopedSync: true }) and benefit immediately

Context

This is Phase 1 of a 4-phase plan documented in design/datastores.md. Later phases add per-path dirty tracking to S3/GCS (Phase 2), partitioned indexes (Phase 3), and a local remote-state mirror (Phase 4). Phase 1 is independent and delivers value to document-store backends immediately.


Status: Open (5/13/2026, 9:11:38 PM)
