
feat: provision --local (umbrella PR) #287

Draft
aram356 wants to merge 69 commits into main from feature/provision-local-impl


aram356 commented Jun 29, 2026


Umbrella PR for the provision --local workstream.

Closes #288.

Source artifacts

Architecture

A new ProvisionMode::Local arm threads through Adapter::provision. Local mode:

  • synthesises minimal per-adapter manifests on a clean clone via toml_edit::DocumentMut (CLI-owned bootstrap, before validation)
  • merges per-store bindings + env labels on top
  • writes adapter-specific env files (.edgezero/.env, .dev.vars, <spin_crate>/.env)
  • never shells out to cloud CLIs

Dry-run stages a real fs::copy into a tempfile::TempDir and diffs the result back. Cloudflare/Fastly/Spin manifests become gitignored generated state; Axum's axum.toml stays tracked. Generated <app-cli> runs run_provision_typed::<C> to add #[secret]-field placeholders the bundled edgezero can't see.

Execution plan

This umbrella PR opens as draft and stays draft. Each plan section gets its own follow-up PR that merges into this branch; once Sections 1–9 are all green, this PR converts to ready-for-review and merges to main.

Section breakdown

Each section tracks one sub-issue:

CI gates (from the plan's Global Constraints)

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test --workspace --all-targets
  • cargo check --workspace --all-targets --features "fastly cloudflare spin"
  • cargo check -p edgezero-adapter-spin --target wasm32-wasip2 --features spin

Test plan

  • All five CI gates pass on the merged branch
  • All four smoke scripts (scripts/smoke_test_{config,kv,secrets,config_key_override}.sh) pass with warm-up
  • Per-adapter contract tests cover the four provision_local_* cases + Spin's env-label alignment quartet (Section 9)
  • Worktree is clean after smoke matrix (Task 43 step 4 — umbrella gate)

Empty tracking commit for the implementation work tracked in:
  docs/superpowers/plans/2026-06-27-provision-local.md

Issues:
  - Epic: <epic-issue-url>
  - Per-section sub-issues linked from the epic.

This PR opens as a DRAFT and stays draft until Section 1 lands its
first real commit. Each section opens as its own follow-up PR
that lands here before the umbrella merges to main.
aram356 self-assigned this Jun 30, 2026
aram356 added 14 commits June 30, 2026 00:46
run_shared_checks iterates every declared adapter and dispatches
validate_adapter_manifest, which for Spin does
fs::read_to_string(manifest_root.join(rel)). With the containment
guard sitting after run_shared_checks, a manifest declaring
[adapters.spin.adapter].manifest = "../outside/spin.toml" could
trigger a filesystem read outside the project root before the
guard rejected it — a spec violation of §"Path containment (MUST)"
which requires the helper run BEFORE any manifest-path use.

Fix by relocating the check to fire immediately after
load_push_context, and looping over every declared adapter (not
just ctx.adapter) since run_shared_checks reads all of them.
Also close Task 7's Minor about the duplicate adapter_entry call
by removing the now-redundant per-adapter guard block.

Regression test:
config_push_local_rejects_parent_traversal_in_sibling_spin_adapter
declares a poisoned Spin adapter alongside the pushed axum adapter,
and asserts that the error names the containment violation (not Spin's
"failed to read spin manifest" message that would surface under the
old ordering).
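
The ordering fix above can be illustrated with a small, std-only sketch. The names `contained_join` and `check_all_adapters` are hypothetical and the check is purely lexical; the real helper may canonicalise paths or differ in shape. The point is only that every declared adapter is checked before any manifest path is read.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically resolve `rel` against `root`, rejecting anything that escapes it.
/// Hypothetical helper, not the PR's actual containment guard.
fn contained_join(root: &Path, rel: &str) -> Result<PathBuf, String> {
    let rel_path = Path::new(rel);
    if rel_path.is_absolute() {
        return Err(format!("manifest path `{rel}` must be relative to the project root"));
    }
    let mut depth: usize = 0; // how many components below `root` we currently are
    let mut out = root.to_path_buf();
    for comp in rel_path.components() {
        match comp {
            Component::Normal(part) => {
                out.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return Err(format!("manifest path `{rel}` escapes the project root"));
                }
                out.pop();
                depth -= 1;
            }
            Component::RootDir | Component::Prefix(_) => {
                return Err(format!("manifest path `{rel}` must be relative to the project root"));
            }
        }
    }
    Ok(out)
}

/// Check every declared adapter (not just the pushed one) before any read happens.
fn check_all_adapters(root: &Path, manifests: &[(&str, &str)]) -> Result<(), String> {
    for (adapter, rel) in manifests {
        contained_join(root, rel).map_err(|e| format!("adapter `{adapter}`: {e}"))?;
    }
    Ok(())
}
```

Calling `check_all_adapters` immediately after loading the push context, and only then running the shared checks, means a poisoned sibling adapter fails with the containment error rather than a read error.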

Also tighten copy_tree's else-branch to explicitly gate on
is_regular_file() rather than "everything non-dir non-symlink",
add a Unix symlink-skip test, and drop a stale
#[expect(dead_code)] on ValidationContext::manifest_path that
now has real callers.
The bare-cwd variant of the accept-test (--manifest edgezero.toml)
previously wrote the manifest to a tempdir and set EDGEZERO_MANIFEST,
but run_provision reads args.manifest directly (no env fallback).
The test therefore failed on manifest load ("failed to load
edgezero.toml") and its negative !contains(path-safety-markers)
assertion vacuously passed — no actual coverage of the
`args.manifest.parent() == ""` fallback.

Fix by adding a CwdGuard RAII helper that chdirs into the tempdir
under the manifest_guard() serialisation lock and restores the
previous cwd on drop. Both accept-tests now also assert positively
that the error is the (true, true) dispatch stub ("local dry-run
staging lands in Task 10/11"), proving the manifest loaded AND
path-safety passed AND we reached the dispatch matrix. Drop the
now-unnecessary EnvOverride from both tests.
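
A minimal sketch of what such an RAII cwd guard could look like, using only the standard library. `CWD_LOCK` is a stand-in for the `manifest_guard()` serialisation lock, and the `enter` constructor name is an assumption; the real `CwdGuard` may be structured differently.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Mutex, MutexGuard};

// Stand-in for the PR's manifest_guard() lock (assumption).
static CWD_LOCK: Mutex<()> = Mutex::new(());

/// Changes the process cwd on construction and restores it on drop.
struct CwdGuard {
    previous: PathBuf,
    // Held for the guard's whole lifetime because cwd is process-wide state.
    _lock: MutexGuard<'static, ()>,
}

impl CwdGuard {
    fn enter(dir: &Path) -> io::Result<Self> {
        // Recover from poisoning so one panicking test does not wedge the rest.
        let lock = CWD_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
        let previous = env::current_dir()?;
        env::set_current_dir(dir)?;
        Ok(Self { previous, _lock: lock })
    }
}

impl Drop for CwdGuard {
    fn drop(&mut self) {
        // Best effort: a failed restore must not panic inside drop.
        let _ = env::set_current_dir(&self.previous);
    }
}
```

Because `Drop::drop` runs before the fields are dropped, the previous cwd is restored while the lock is still held, so no other test observes the temporary directory as its cwd.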

Review note: the Task 9 reviewer raised this as a Low to fix ahead of
Task 10, because run_with_staging depends on manifest-root/cwd
correctness.
aram356 added 30 commits July 2, 2026 15:29
…-in)

Task 28b: drop the `#[ignore]` on
provision_local_dry_run_worktree_clean_and_no_tempdir_paths_in_stdout.
Section 5's per-adapter Local writers are all landed (Tasks 17-28),
so the test now has real behavior to lock — every adapter's Local
arm stages into a tempdir under dry-run, leaving `examples/app-demo`
byte-identical.

`snapshot_dir` needed an exclusion path first: `examples/app-demo/target/`
alone is 29 GB of Rust build output, and walking it reads every byte,
so the test was SIGKILLed after 60 seconds on the first
attempt. Added `snapshot_dir_excluding(root, excluded_dir_names)`
that skips directories by name at any depth, and the test now
excludes `[target, .git, .spin, .wrangler]` — build artifacts and
adapter runtime state, all gitignored and outside the "did dry-run
touch the worktree" question. The one-arg `snapshot_dir` stays for
the fake-fixture callers where nothing needs excluding.
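
For illustration, a std-only sketch of a snapshot walker that skips directories by name at any depth. The function names come from the text above, but the return type (a map from relative path to file bytes) and the decision to ignore symlinks are assumptions.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Snapshot every regular file under `root`, keyed by path relative to `root`.
/// Directories whose name is in `excluded_dir_names` are never entered.
fn snapshot_dir_excluding(
    root: &Path,
    excluded_dir_names: &[&str],
) -> io::Result<BTreeMap<PathBuf, Vec<u8>>> {
    let mut snapshot = BTreeMap::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                let name = entry.file_name();
                if excluded_dir_names.iter().any(|ex| name.to_str() == Some(*ex)) {
                    continue; // skip without descending, so nothing inside is read
                }
                pending.push(path);
            } else if file_type.is_file() {
                let rel = path.strip_prefix(root).unwrap_or(&path).to_path_buf();
                snapshot.insert(rel, fs::read(&path)?);
            }
            // Symlinks and other entry kinds are ignored in this sketch.
        }
    }
    Ok(snapshot)
}

/// One-argument variant for fixtures where nothing needs excluding.
fn snapshot_dir(root: &Path) -> io::Result<BTreeMap<PathBuf, Vec<u8>>> {
    snapshot_dir_excluding(root, &[])
}
```

A dry-run cleanliness test would take one snapshot before provisioning and one after, then assert that the two maps are equal.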

Test now runs in 0.07 s across all four adapters.

Contract B (no-tempdir-path leak in stdout via captured log) stays
deferred — `log::set_logger` is a process-wide one-shot and would
require workspace churn to swap the CliLogger for a capturing sink.
The `TODO(section-5)` comment inside the loop and the module-level
comment above the test both name the two viable retro-fit
strategies (subprocess capture / `tracing`-subscriber swap) for a
follow-up.

Closes Section 5.
Important — Spin local provision no longer silently succeeds on a
malformed runtime-config.toml.

`append_key_value_store_block` used to return `bool` and hand back
`false` when `key_value_store` existed but was not a table (e.g.
`key_value_store = "oops"`). The caller treated that as "nothing
changed" and still wrote spin.toml + .env, leaving spin.toml
referencing a store label that runtime-config.toml never declared.
Spin would then fail at boot with a confusing lookup error.

Changed to `Result<bool, String>` and now errors with the same
"refusing to edit in place" pattern the Fastly and Cloudflare local
arms use. Regression:
spin_local_provision_errors_when_runtime_config_key_value_store_is_not_a_table
asserts that the error surfaces AND that spin.toml is not touched on
the error path.

Medium — provision-written line-oriented files now carry the
`# edgezero-provision: v1` schema header.

Spec §"Merge mechanics" → "Line-oriented" (line 940 of the spec)
requires the header on every provision-written file so future
migrations can detect the shape. `append_lines_dedup` wasn't
prepending it — Axum `.edgezero/.env`, Spin `.env`, and Cloudflare
`.dev.vars` all shipped without it.

Added a new `append_lines_dedup_with_header(path, header, lines,
dry_run)` that ensures the header is present. `normalised_key`
returns `None` for comment-only lines (no `=`), so the ordinary
dedup path can't self-check the header; the new fn uses trimmed-
equality against existing lines to decide whether to prepend.
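
The header check and the key-based dedup can be sketched as a pure function over file contents. This is an in-memory approximation, not the real `append_lines_dedup_with_header` (which takes a path and a `dry_run` flag); `merge_with_header` is a hypothetical name, and treating every `#` line as key-less is an assumption.

```rust
/// Header value taken from the text above.
const EDGEZERO_PROVISION_HEADER: &str = "# edgezero-provision: v1";

/// Key of a `KEY=value` line; `None` for comments and lines without `=`.
fn normalised_key(line: &str) -> Option<&str> {
    let trimmed = line.trim();
    if trimmed.starts_with('#') {
        return None; // assumption: comment lines never carry a key
    }
    trimmed.split_once('=').map(|(key, _)| key.trim())
}

/// Return the new file contents: header first, existing lines kept as-is,
/// then any of `lines` whose key is not already present.
fn merge_with_header(existing: &str, header: &str, lines: &[&str]) -> String {
    let mut out: Vec<String> = Vec::new();
    // The header has no key, so it is matched by trimmed equality instead.
    if !existing.lines().any(|l| l.trim() == header.trim()) {
        out.push(header.trim().to_string());
    }
    out.extend(existing.lines().map(str::to_string));
    for line in lines {
        let duplicate = match normalised_key(line) {
            Some(key) => out.iter().any(|l| normalised_key(l) == Some(key)),
            None => out.iter().any(|l| l.trim() == line.trim()),
        };
        if !duplicate {
            out.push(line.to_string());
        }
    }
    let mut text = out.join("\n");
    text.push('\n');
    text
}
```

Re-running the merge over its own output is a no-op in this sketch, which is the property the "no-duplicate on re-run" test checks for.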

Exported a workspace-level `EDGEZERO_PROVISION_HEADER` constant so
a future spec bump touches one line. The existing
`append_lines_dedup` wrapper stays for backward compatibility.

5 new env_file tests cover: first-write prepend, no-duplicate on
re-run, prepend above operator-written content, trim-equality
match, dry-run no-write. Updated all 6 adapter call sites (2
Cloudflare, 2 Spin, 2 Axum) to pass the header.
Important — re-provision after adding a store no longer skips the
runtime env update.

`upsert_runtime_env_config_store` used to `return Ok(false)` the
moment `[local_server.config_stores.edgezero_runtime_env]` existed.
On a second provision (operator added a new `[stores.*]` entry
between runs, or an env-overlay changed a platform name), the block
was skipped entirely: the per-store
`[local_server.config_stores.<platform>]` block for the new store
landed correctly, but its `EDGEZERO__STORES__<KIND>__<LOGICAL>__NAME`
line never made it into `edgezero_runtime_env.contents`, leaving the
local runtime unable to resolve the store from env.

This violated spec §"Merge mechanics": "preserve operator-set values;
only add what's missing".

The upsert now branches on whether the block already exists:
- First-write path (unchanged): build the full block, insert all
  managed __NAME keys, attach the commented __KEY placeholder decor.
- Additive-merge path (new): open the existing block's `.contents`
  sub-table and insert only the managed __NAME keys not already
  present. Operator values and non-managed keys stay byte-for-byte.
  The commented __KEY decor is not rewritten on re-provision —
  operators may have uncommented or removed those on purpose, so
  clobbering them would be a bigger regression.

Return semantics: `Ok(true)` if the block was newly written OR at
least one key was added; `Ok(false)` if nothing changed.
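
The two branches and the return value can be modelled with a plain map standing in for the block's `.contents` table. This is a simplified sketch: the real function edits a `toml_edit` document and returns a `Result`, and `upsert_runtime_env` with its `Option<Contents>` argument is a hypothetical shape chosen to keep the example std-only.

```rust
use std::collections::BTreeMap;

type Contents = BTreeMap<String, String>;

/// `block` models the runtime-env block: `None` means it does not exist yet.
/// Returns true if the block was created or at least one key was added.
fn upsert_runtime_env(block: &mut Option<Contents>, managed: &[(&str, &str)]) -> bool {
    match block {
        // First-write path: build the full block from the managed keys.
        None => {
            let contents: Contents = managed
                .iter()
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .collect();
            *block = Some(contents);
            true
        }
        // Additive-merge path: only insert managed keys that are missing,
        // leaving operator-set values and unmanaged keys untouched.
        Some(contents) => {
            let mut changed = false;
            for (key, value) in managed {
                if !contents.contains_key(*key) {
                    contents.insert(key.to_string(), value.to_string());
                    changed = true;
                }
            }
            changed
        }
    }
}
```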

Regression test
`fastly_local_provision_additively_merges_new_stores_into_existing_runtime_env`
runs provision twice (KV-only, then KV+CONFIG) against the same
fastly.toml and asserts:
  - KV __NAME line survives the second provision;
  - new CONFIG __NAME line lands inside the existing runtime-env
    block (not a duplicate block, not a sibling);
  - runtime-env block header appears exactly once.
Two related one-line dispatch changes so downstream CLIs actually
walk their `#[secret]` fields at provision time.

Task 30 — scaffold template:
  crates/edgezero-cli/src/templates/cli/src/main.rs.hbs
  Cmd::Provision arm now calls
  edgezero_cli::run_provision_typed::<{{NameUpperCamel}}Config>(&args)
  instead of edgezero_cli::run_provision(&args).

Task 30b — in-tree app-demo-cli:
  examples/app-demo/crates/app-demo-cli/src/main.rs
  Same substitution with AppDemoConfig — smoke fixtures in
  Section 7/8 (Tasks 35-37) warm up via `app-demo-cli provision`,
  so leaving this on the untyped bundle would silently skip
  Spin's [variables] declarations, SPIN_VARIABLE_* lines, and
  Cloudflare's .dev.vars secret placeholders in every smoke.

Also updated the existing generate_new_scaffolds_workspace_layout
test in generator.rs to require the typed dispatch string
(`run_provision_typed::<DemoAppConfig>`) instead of the untyped
one, plus a negative assertion that the untyped
`edgezero_cli::run_provision(&args)` call must NOT survive
template regeneration.
Section 7 Task 32. Adds the four synthesised adapter manifests
(`wrangler.toml`, `fastly.toml`, `spin.toml`, `runtime-config.toml`)
and Cloudflare's `.dev.vars` to the scaffold's .gitignore. axum.toml
stays tracked — Axum owns its manifest and it's operator-authored,
not provision-generated.

The generate_new_scaffolds_workspace_layout test now asserts every
required entry is present AND that no active ignore rule targets
`axum.toml` (comments mentioning axum.toml are allowed — the
template's explanatory prose calls out the exclusion).
…--local

Section 7 Task 33. Untracks the four in-tree app-demo adapter
manifests (`wrangler.toml`, `fastly.toml`, `spin.toml`,
`runtime-config.toml`) via `git rm --cached` and adds them to the
root .gitignore so subsequent operator provisions don't dirty the
worktree.

`axum.toml` stays tracked — Axum owns its manifest and it's
operator-authored, not provision-generated. Same discipline as the
scaffold's .gitignore (Task 32).

`.dev.vars` is added even though the current tree doesn't track
any; the regex mirrors the CI gate Task 37 will install so the two
runbooks can't drift.

Verified: `git ls-files | rg '(^|/)(fastly|spin|wrangler|runtime-config)\.toml$|(^|/)\.dev\.vars$'`
returns empty output; worktree files are still present locally so the
dev loop keeps working.
Section 7 Task 35 (closes Section 7). Adds
`scripts/lib/smoke_warmup.sh` — a shared helper that sources the
generated `app-demo-cli` and runs `provision --adapter <name>
--local` for the selected row. Every smoke sources it and calls
`smoke_warmup_provision_local "$ADAPTER"` right after the
ROOT_DIR/DEMO_DIR bootstrap, so fresh clones can boot each
adapter's emulator without a pre-populated worktree (the four
provision-owned manifests + `.dev.vars` were gitignored by Task
33).

`smoke_test_config_key_override.sh` additionally loses its three
`backup_in_tree` calls:
  - `fastly.toml` in the 12.7 per-adapter loop
  - `.dev.vars` in the 12.7 Cloudflare row
  - `fastly.toml` in the 9.3 Fastly chunk-pointer section

All three protected TRACKED copies of files the smoke would mutate;
Task 33 removed the tracked copies, so backup/restore is obsolete.
`smoke_warmup_provision_local` regenerates the same files from
scratch before each row runs.

`cf` remains an accepted operator alias for `cloudflare` —
`smoke_canonical_adapter` normalises it inside the warm-up so the
CLI arg is always the manifest's canonical name.

Syntax verified: `bash -n` clean on all 5 scripts. Running the
smokes end-to-end (each takes several minutes and spawns emulators)
is deferred to Task 37's CI gate.
…oml exempt)

Section 8 Task 37. Adds a pre-`cargo test` gate to the workspace
test workflow that fails the run when any of the four provision-
owned adapter manifests (`wrangler.toml`, `fastly.toml`,
`spin.toml`, `runtime-config.toml`) or Cloudflare's `.dev.vars`
appears in `git ls-files`.

The regex mirrors the pattern used in the local Task 33 runbook
and the gitignore in the scaffold + root, so the two runbooks and
the CI can't drift. axum.toml is intentionally exempt — Axum owns
its manifest and it stays tracked.
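
The gate itself is a shell pipeline, but the matching rule is easy to state in Rust for readers who want to reason about it. The sketch below is an equivalent basename check written without a regex crate; `is_provision_owned` and `offending_paths` are hypothetical names, and the input is assumed to be the `/`-separated output of `git ls-files`.

```rust
/// Basenames that must never be tracked, at any directory depth.
/// axum.toml is deliberately absent: it stays tracked.
const PROVISION_OWNED: [&str; 5] = [
    "wrangler.toml",
    "fastly.toml",
    "spin.toml",
    "runtime-config.toml",
    ".dev.vars",
];

/// True when the final path segment is exactly one of the provision-owned names.
fn is_provision_owned(tracked_path: &str) -> bool {
    let basename = tracked_path.rsplit('/').next().unwrap_or(tracked_path);
    PROVISION_OWNED.contains(&basename)
}

/// Paths from a `git ls-files` listing that would fail the gate.
fn offending_paths(ls_files_output: &str) -> Vec<&str> {
    ls_files_output
        .lines()
        .filter(|p| is_provision_owned(p))
        .collect()
}
```

Matching on the whole basename means a file such as `my-spin.toml` is not flagged, which agrees with the `(^|/)` anchor in the regex.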

Uses POSIX `grep -E` (not `rg`) so no CI-side dependency is added;
GitHub-hosted runners ship BSD/GNU grep.

Verified against the current tree: `git ls-files | grep -E ...`
returns empty, so the gate would pass on the
`feature/provision-local-impl` head as of this commit.
Split the 4,713-line `crates/edgezero-adapter-spin/src/cli.rs` into the
canonical `cli/mod.rs` + `cli/provision_local.rs` + `cli/run.rs` shape
(mirroring the Axum split at 492f774) and rename `cli/push_sqlite.rs` to
`cli/push_local.rs` so the file names describe the concern (local push,
which happens to use SQLite as the backend) instead of the storage
engine. `cli/runtime_config.rs` stays as-is: its `read` / `KeyValueBackend`
/ `ParsedRuntimeConfig` items are consumed by BOTH the push path (in
`push_local::dispatch_push`) and the read path (in `read_config_entry`),
so folding it into `provision_local.rs` would leave the push module
importing it via a non-provision path.

Placement:

- `mod.rs`: `impl Adapter for SpinCliAdapter`, statics, `register()` +
  ctor, and helpers whose primary caller is the trait impl
  (`is_valid_spin_key`, `spin_key_rule_violation`,
  `collect_spin_component_ids`, `resolve_spin_component`,
  `ensure_kv_label_in_component`).
- `provision_local.rs`: `provision` (was `provision_local`),
  `provision_typed` (was `provision_typed_local`), and their doc-editing
  helpers (`resolve_component_id`, `upsert_variables_entry`,
  `upsert_component_variable`, `append_kv_store_to_component`,
  `append_key_value_store_block`, `normalise_runtime_config_header`,
  `build_env_lines`).
- `push_local.rs`: existing SQLite writer + `dispatch_push`,
  `verify_label_declared`, `read_sqlite_entry`, `write_sqlite`,
  `read_spin_application_name` (all called from the trait impl's push /
  read paths).
- `run.rs`: `build` / `deploy` / `serve` subprocess wrappers,
  `find_spin_manifest`, `locate_artifact`, `TARGET_TRIPLE`, and the
  `synthesise_*_toml` baselines consumed by `synthesise_baseline_manifest`.

Test module split follows subject-under-test placement per the brief.
Pre-split totals: cli.rs=89, push_cloud.rs=16, push_sqlite.rs=12,
runtime_config.rs=7 (=124). Post-split totals: mod.rs=21,
provision_local.rs=29, push_local.rs=41, push_cloud.rs=17, run.rs=9,
runtime_config.rs=7 (=124). Baseline preserved.

Housekeeping while here: `env_mutation_guard()` moved from
`provision_local::tests` to `cli::mod.rs` as a shared `pub(super)` fn so
`push_cloud::tests::path_mutation_guard` can delegate to the same
process-wide mutex; without it, the two suites' PATH-prepending tests
race against each other (fake-`spin` shims collide).

File-level `#![expect(clippy::mod_module_files, ...)]` +
`#![expect(clippy::arbitrary_source_item_ordering, ...)]` on `mod.rs`
mirror the Axum split; no per-item clippy suppressions were added.
… PATH races

After the cli.rs split, each adapter's per-submodule test suite had its
own path_mutation_guard() with its own static Mutex. Under concurrent
workspace test runs those independent mutexes let PATH-mutating tests
in provision_local, provision_cloud, and push_cloud race with each
other -- the fake vendor CLI shim planted by one test would be evicted
from $PATH by another before the first test read the shim's log.

Hoist a single pub(crate) fn path_mutation_guard() -> &'static Mutex<()>
into each adapter's cli/mod.rs and have every submodule test bring it in
via `use super::super::path_mutation_guard`. Mirrors the shared-guard
pattern the Spin split already established.
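
A minimal sketch of the hoisted guard, assuming a function-local static is acceptable; the actual definition in each adapter's cli/mod.rs may place the static elsewhere.

```rust
use std::sync::Mutex;

/// One process-wide mutex shared by every PATH-mutating test in the crate.
/// Every call returns a reference to the same static, so submodule tests
/// that lock it are serialised against each other.
pub(crate) fn path_mutation_guard() -> &'static Mutex<()> {
    static GUARD: Mutex<()> = Mutex::new(());
    &GUARD
}
```

A test would hold `path_mutation_guard().lock()` for as long as its fake vendor CLI shim needs to stay on `$PATH`, including the time spent reading the shim's log.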
The commented CONFIG __KEY placeholder was emitted as
`<placeholder-{logical}-key>` which violated the spec (Task 19,
line 3081) and diverged from every other adapter — axum, spin, and
fastly all emit `{logical}_staging`.

An operator who uncomments the line expecting to switch to a staging
config blob got a nonsensical key that never resolves against push-side
tooling that matches on `{logical}_staging`. The regression test was
written to the wrong value at the same time so the divergence rode
green CI.

Fix the emitter and update the two tests that assert on the placeholder
value (writes-name-lines contract test and dedup-respects-overrides
contract test). Behaviour: byte-identical for operators who never
touched the commented line; for those who uncomment, the value now
matches other adapters.
… writeback

Two related bugs fixed together because both touch the service_id
lifecycle across local + cloud provision.

## service_id positioning (P0)

toml_edit::DocumentMut::insert appends the key at end-of-order. When
the parsed fastly.toml already carries any headed sub-table
([scripts], [local_server]), inserting a fresh root scalar lands the
key AFTER the header, and TOML re-parse assigns it as
local_server.service_id -- a silent divergence that the existing
lock test never caught because it only checked
after.contains("service_id = \"SVC1\"").

Fix with a new upsert_root_scalar_before_tables helper that hoists
sub-tables out, inserts the scalar, and re-attaches sub-tables in
original order. Preserves comments and decor via toml_edit's per-item
decor tracking. Mirrors the normalise-shape pattern the Spin split
already established for runtime-config.toml.
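
To make the failure mode concrete, here is a deliberately naive, line-oriented stand-in. It is not the `toml_edit`-based helper described above: it ignores multi-line values and escaping, and `owning_table` is a hypothetical checker added only to show why a `contains` assertion is not enough.

```rust
/// Insert or replace `key = "value"` in the root section, that is, before
/// the first `[table]` header line. Naive: assumes one key per line.
fn upsert_root_scalar_before_tables(toml: &str, key: &str, value: &str) -> String {
    let new_line = format!("{key} = \"{value}\"");
    let mut lines: Vec<String> = toml.lines().map(str::to_string).collect();
    // The root section ends where the first table header begins.
    let root_end = lines
        .iter()
        .position(|l| l.trim_start().starts_with('['))
        .unwrap_or(lines.len());
    let existing = lines[..root_end]
        .iter()
        .position(|l| l.split_once('=').map_or(false, |(k, _)| k.trim() == key));
    match existing {
        Some(idx) => lines[idx] = new_line,
        None => lines.insert(root_end, new_line),
    }
    let mut out = lines.join("\n");
    out.push('\n');
    out
}

/// Name of the table that owns `key` (`""` for the root), found by tracking
/// the most recent header while scanning top to bottom.
fn owning_table(toml: &str, key: &str) -> Option<String> {
    let mut current = String::new();
    for line in toml.lines() {
        let t = line.trim();
        if t.starts_with('[') && t.ends_with(']') {
            current = t.trim_matches(|c| c == '[' || c == ']').to_string();
        } else if t.split_once('=').map_or(false, |(k, _)| k.trim() == key) {
            return Some(current);
        }
    }
    None
}
```

Appending `service_id` at the end of a file that contains `[local_server]` makes `owning_table` report `local_server`, while the upsert keeps it at the root. A regression test that only checks for the substring passes in both cases.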

Regression test added: parses the re-emitted file and asserts
service_id lives at the TOML root, not as local_server.service_id.

## Cloud deployed writeback (P0)

Fastly's cloud provision unconditionally returned deployed: None
despite deployed_fields() advertising ownership of service_id.
Operators had to hand-copy from fastly.toml into edgezero.toml after
every service creation.

Fix: read service_id from fastly.toml at the end of cloud provision
and thread it into ProvisionOutcome.deployed. Dry-run also populates
the field -- the CLI's merge_deployed_into_manifest respects its own
dry_run flag and only reports (not writes) the pending edgezero.toml
change.

Two regression tests: (1) fastly.toml with service_id present ->
ProvisionOutcome.deployed carries it; (2) no service_id -> deployed
stays None.
run_provision_typed did not merge outcome.deployed from
Adapter::provision_typed into edgezero.toml. Every current
provision_typed impl returns deployed: None, so nothing breaks today --
but a future secrets-store-id capture would be silently dropped.
Add the same merge_deployed_into_manifest call the base run_provision
uses so the writeback hook exists for both arms.
merge_deployed_into_manifest had two hard-coded arrays
(KNOWN_SCALAR_FIELDS, KNOWN_SUB_TABLE_FIELDS) that duplicated the
field list on ManifestAdapterDeployed in a different crate. Adding a
new deployed field to the struct raised no compile error -- the CLI
would return `unknown deployed field` at runtime for any adapter
that emitted the new field.

Move the two arrays onto the struct itself as pub const SCALAR_FIELDS
and pub const SUB_TABLE_FIELDS in the same impl block as
populated_fields(), so a future field addition sits next to both
mapping arrays and the coupling is unmissable.
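
A sketch of the colocated shape, under stated assumptions: `service_id` is the only field named in this PR, `kv_namespaces` is a hypothetical sub-table field added for illustration, and `classify` stands in for the lookup the CLI performs during the merge.

```rust
#[derive(Default)]
pub struct ManifestAdapterDeployed {
    pub service_id: Option<String>,                   // named in the PR
    pub kv_namespaces: Option<Vec<(String, String)>>, // hypothetical field
}

impl ManifestAdapterDeployed {
    // Both mapping arrays live next to populated_fields(), so adding a
    // struct field puts all three edits in one impl block.
    pub const SCALAR_FIELDS: &'static [&'static str] = &["service_id"];
    pub const SUB_TABLE_FIELDS: &'static [&'static str] = &["kv_namespaces"];

    pub fn populated_fields(&self) -> Vec<&'static str> {
        let mut fields = Vec::new();
        if self.service_id.is_some() {
            fields.push("service_id");
        }
        if self.kv_namespaces.is_some() {
            fields.push("kv_namespaces");
        }
        fields
    }
}

#[derive(Debug, PartialEq)]
pub enum FieldKind {
    Scalar,
    SubTable,
}

/// CLI-side lookup: reads the struct's own arrays instead of a private copy.
pub fn classify(field: &str) -> Result<FieldKind, String> {
    if ManifestAdapterDeployed::SCALAR_FIELDS.contains(&field) {
        Ok(FieldKind::Scalar)
    } else if ManifestAdapterDeployed::SUB_TABLE_FIELDS.contains(&field) {
        Ok(FieldKind::SubTable)
    } else {
        Err(format!("unknown deployed field `{field}`"))
    }
}
```

Colocation alone still produces no compile error for a forgotten array entry. One option is a unit test that populates every field and asserts that each name from `populated_fields()` classifies successfully.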
Three doc fixes surfaced by the self-review:

## axum.toml is not a provision --local output

getting-started.md and cli-reference.md listed `axum.toml` alongside
`wrangler.toml`/`fastly.toml`/`spin.toml` as a manifest that `provision --local`
synthesises. It isn't: Axum's contract test
(crates/edgezero-adapter-axum/src/cli/provision_local.rs) asserts the
file is byte-identical after every provision run. `axum.toml` is
scaffolded from the template and stays operator-authored -- provision
never touches it.

## Spin does not source dotenv from cwd

adapters/spin.md said `cd <spin_crate>; spin up ...` would pick up
`.env` from the working directory. `spin up` has no dotenv-from-cwd
behaviour; the parent `edgezero serve` loads
`<spin_crate>/.env` into the process env before spawning `spin up`.
Replace with the honest recipe: `set -a && source .env` or
`edgezero serve` (which handles it).

## Gitignore list omitted .env and .edgezero/

getting-started.md advertised only wrangler/fastly/spin/runtime-config
+ .dev.vars. Spin also writes `<spin_crate>/.env` and axum writes
`.edgezero/.env`; both are gitignored (the scaffolder emits the rules
via crates/edgezero-cli/src/templates/root/gitignore.hbs).

No behaviour change; docs now match shipped code.
…ocal split

## Blocker: scaffolded fastly.toml carried service_id = ""

crates/edgezero-adapter-fastly/src/templates/fastly.toml.hbs shipped a
literal 'service_id = ""' line and lacked the schema-version header.
The generator writes adapter templates first, then run_provision's
write_baseline_to_disk skips existing files -- so the synthesiser's
'omit service_id until deployed' invariant was bypassed for every
edgezero new: the scaffolded fastly.toml carried the empty placeholder
straight through provision.

Strip the empty service_id line and prepend the edgezero-provision v1
header so the template matches synthesise_fastly_toml's output shape.
Add a scaffold-level assertion that no service_id = "" survives after
edgezero new: assert_scaffold_files in generator.rs. Verified the
assertion detects the regression by stash + re-run.

## Docs: cloud vs local split was misdescribed

Four doc files said cloud provision writes [local_server.*] tables and
local provision writes [setup.*] tables -- the reverse of what shipped.
Actual split: cloud owns [setup.<kind>_stores.*] (fastly compute deploy
consumes it on first deploy); local owns Viceroy
[local_server.<kind>_stores.*]. Corrections in:
- docs/guide/cli-reference.md (two cells)
- docs/guide/kv.md (Viceroy KV example)
- docs/guide/adapters/fastly.md (provision --local paragraph)
- docs/guide/cli-walkthrough.md (fastly cloud section)

## Housekeeping

- scripts/smoke_test_config_key_override.sh comment said fastly.toml
  'mutates the tracked file'; the file is gitignored per Task 32 and
  regenerated by the smoke warm-up.
- examples/app-demo/Cargo.lock picks up tempfile + toml_edit
  transitively via edgezero-cli's new provision-local paths (Tasks 10
  + 16 landed the deps).

Development

Successfully merging this pull request may close these issues.

Epic: provision --local implementation
