Skip to content

Add Trusted Server audit command#800

Merged
ChristianPavilonis merged 3 commits into
mainfrom
feature/ts-cli-audit
Jul 1, 2026
Merged

Add Trusted Server audit command#800
ChristianPavilonis merged 3 commits into
mainfrom
feature/ts-cli-audit

Conversation

@ChristianPavilonis

Copy link
Copy Markdown
Collaborator

Summary

  • Adds ts audit as a Trusted Server-specific page audit command on top of the base CLI.
  • Collects rendered page/script evidence with Chrome/Chromium and writes draft audit/config artifacts.
  • Adds audit-specific docs, specs, tests, and host dependencies.

Changes

File Change
crates/trusted-server-cli/src/audit.rs Adds audit orchestration, output planning, artifact writing, and draft config generation.
crates/trusted-server-cli/src/audit/analyzer.rs Detects JS assets, first/third-party classification, redirects, titles, and known integrations.
crates/trusted-server-cli/src/audit/browser_collector.rs Adds headless Chrome/Chromium collection for DOM scripts and network requests.
crates/trusted-server-cli/src/audit/collector.rs Defines audit collection traits/data shapes for testability.
crates/trusted-server-cli/src/args.rs, src/run.rs, src/lib.rs Wires ts audit into CLI parsing and dispatch.
Cargo.toml, crates/trusted-server-cli/Cargo.toml, Cargo.lock Adds audit-only dependencies.
README.md, docs/guide/cli.md, docs/guide/getting-started.md Documents audit workflow and Chrome/Chromium prerequisite.
docs/superpowers/...audit... Adds audit design and implementation plan artifacts.

Closes

No issue provided; this PR is split out from the combined feature/ts-cli-next branch.

Test plan

  • cargo test --workspace
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo fmt --all -- --check
  • JS tests: cd crates/js/lib && npx vitest run
  • JS format: cd crates/js/lib && npm run format
  • Docs format: cd docs && npm run format
  • WASM build: cargo build --package trusted-server-adapter-fastly --release --target wasm32-wasip1
  • Manual testing via fastly compute serve
  • Other: cargo test --package trusted-server-cli --target aarch64-apple-darwin — 42 passed

Checklist

  • Changes follow CLAUDE.md conventions
  • No unwrap() in production code — use expect("should ...")
  • Uses tracing/log macros (not println!)
  • New code has tests
  • No secrets or credentials committed

@prk-Jr prk-Jr left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Adds ts audit: loads a public page in a fresh headless Chrome/Chromium session, inventories rendered JS assets, detects known integrations, and writes a JS-asset report plus a draft trusted-server.toml. Clean AuditCollector trait makes the analysis browser-free and well-tested. One blocking item: the new scraper dependency breaks the integration dependency-parity CI gate.

Blocking

🔧 wrench

  • Dependency-parity CI failurescraper = "0.24.0" added to the workspace conflicts with integration-tests' pinned scraper = "0.21"; the prepare integration artifacts job fails on markup5ever / match_token / scraper / selectors (plus uuid, web-sys, wasm-bindgen-futures). Align the versions or extend the allowlist — see inline on Cargo.toml.

Non-blocking

🤔 thinking

  • Inline substring integration detection is false-positive proneanalyzer.rs:217; auto-enables gpt/didomi/datadome in the draft.
  • First-party classifier over-matches parent/eTLD domainsanalyzer.rs:169.
  • No overall navigation timeoutbrowser_collector.rs:98; a hanging origin stalls ts audit.

♻️ refactor

  • report_error is a misleading no-op wrappererror.rs:7; overlaps cli_error.

⛏ nitpick

  • Asymmetric URL resolutionanalyzer.rs:68; relative src from collector script tags is silently dropped.

👍 praise

  • AuditCollector trait → browser-free unit tests via FakeCollector; strong testability.
  • Browser hygiene: fresh temp user-data-dir per run, close() always called, handler task aborted on close failure, no forced --no-sandbox.
  • parse_audit_url restricts to http/https (blocks file://, data:, chrome://), with a test.
  • Overwrite protection + a pre-collection conflict check, so a refusal doesn't even launch Chrome.
  • GTM container id constrained by GTM-[A-Z0-9]+ → no TOML injection from page content into the draft.
  • audit module and all host-only deps correctly gated behind cfg(not(target_arch = "wasm32")).

CI Status

  • prepare integration artifacts (dependency parity): FAIL (caused by this PR)
  • integration / edgezero / browser tests: SKIPPED (blocked by the failed prepare job)
  • fmt / clippy / cargo test / vitest: not run — those workflows trigger only on PRs targeting main, and this PR targets feature/ts-cli-base. Author reports cargo test --package trusted-server-cli passing (42 tests) locally.

Comment thread Cargo.toml
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs Outdated
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs Outdated
Comment thread crates/trusted-server-cli/src/error.rs
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs Outdated

@aram356 aram356 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

ts audit is a well-structured, genuinely well-tested addition: a headless-Chrome page auditor that emits a js-assets.toml inventory and a draft trusted-server.toml. The AuditCollector trait makes the orchestration unit-testable without a browser, WASM gating is clean, and overwrite/--force safety is careful. One blocking issue: the new dependencies break CI. The remaining notes are heuristic-quality and design observations.

Local verification on aarch64-apple-darwin: cargo fmt --check, cargo clippy --all-targets -- -D warnings, and cargo test (37 passed) all pass for trusted-server-cli.

Blocking

🔧 wrench

  • PR breaks CI — integration-tests dependency drift: new scraper/chromiumoxide deps bumped the root Cargo.lock but the separate crates/trusted-server-integration-tests lockfile was not updated, failing the parity check. See inline comment on Cargo.toml:34.

Non-blocking

🤔 thinking

  • Public-suffix-unaware party classification (analyzer.rs:169-177): suffix matching treats example as first-party to publisher.example.
  • Loose substring integration detection (analyzer.rs:216-224audit.rs:282-291): incidental substrings like gpt can auto-enable a module in the draft config.
  • Bounded network capture (browser_collector.rs:147-165): resource-timing buffer cap (~250) and no HTTP status; large pages may drop assets.
  • Hard-fail on 4xx/5xx (browser_collector.rs:247-255): bot-protected pages (e.g. DataDome 403 to headless Chrome) yield no artifacts.

♻️ refactor

  • Dead method/status fields (collector.rs:27-33): always "GET"/None, never read.
  • Redundant script collection (analyzer.rs:49-95): parsed HTML and document.scripts are the same set, deduped by URL.

🌱 seedling

  • No timeout on browser.close() (browser_collector.rs:75-78): could hang on an unresponsive browser.
  • Heavy dependency tree: chromiumoxide pulls async-tungstenite/rustls/reqwest (+458 lockfile lines). Acceptable for a host-only operator CLI (no WASM/runtime impact); good that the fetcher feature is off so there's no Chromium auto-download.

⛏ nitpick

  • report_error/cli_error overlap (error.rs:7-9): both just .into(); report_error doesn't actually report.

📝 note

  • ~1.9k lines of design/plan docs added under docs/superpowers/. Scanned both files — only example/known-vendor domains, no secrets. Flagging the volume only.

👍 praise

  • The AuditCollector trait + FakeCollector seam and the thorough analyzer/orchestration tests (redirect warnings, dedup, relative-URL resolution, no-output guard, and collect-only-after-overwrite-check ordering) are excellent. Clean #[cfg(not(target_arch = "wasm32"))] gating throughout.

CI Status

  • prepare integration artifacts: FAIL (dependency drift — see blocking)
  • fmt (trusted-server-cli): PASS (local)
  • clippy (trusted-server-cli): PASS (local)
  • rust tests (trusted-server-cli): PASS (local, 37 passed)

Comment thread Cargo.toml
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs Outdated
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs Outdated
Comment thread crates/trusted-server-cli/src/audit/collector.rs
Comment thread crates/trusted-server-cli/src/error.rs

@aram356 aram356 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review — all prior findings addressed

Re-reviewed at 20f867451 ("Address audit review feedback"). Every item from my earlier CHANGES_REQUESTED is resolved, and all CI checks are now green (including prepare integration artifacts, previously failing).

Prior findings — status

Finding Status
🔧 CI: integration-tests dependency drift Fixedcrates/trusted-server-integration-tests Cargo.lock/Cargo.toml updated; CI green
🤔 Loose substring detection auto-enabling modules Fixed — word-boundary regexes replace contains(); new ..._avoids_short_substring_matches test
🤔 4xx/5xx hard-aborts the audit Fixedvalidate_navigation_response now returns Option<String>; HTTP errors warn and still write partial artifacts
🤔 Bounded resource-timing capture (silent drop) Mitigated — warns when entries reach the 250-entry buffer threshold
🤔 Public-suffix-unaware party classification Acknowledged — documented as an advisory heuristic
♻️ Dead method/status fields Fixed — removed
♻️ Redundant two-path script collection Fixed — consolidated to the single script_tags source
🌱 No timeout on browser.close() Fixed — wrapped in timeout(...); navigation timeouts added too
report_error didn't report Fixed — now log::error!s the message

Local verification on aarch64-apple-darwin: cargo fmt --check, cargo clippy --all-targets -- -D warnings, and cargo test (38 passed) all pass for trusted-server-cli. Correctness spot-checks: \bgoogletag\b correctly does not misfire on googletagmanager; the timeout(...).await.map_err(..)?.map_err(..)? unwrapping is correct; dropping the HTML <script> parse is safe because browser document.scripts .src values are already absolute (relative-URL and invalid-URL tests still pass).

Non-blocking notes

📝 note

  • The new warning paths — validate_navigation_response returning Some(..) and the resource-timing-buffer warning — have no dedicated unit test.
  • Party classification remains a host-suffix heuristic (now documented as such); an eTLD+1 comparison via a public-suffix list would be a future nicety.

CI Status

  • prepare integration artifacts: PASS
  • cargo fmt: PASS
  • cargo test: PASS
  • vitest: PASS
  • integration tests (all adapters): PASS

@aram356 aram356 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up to my approval: anchoring the two non-blocking 📝 notes to their exact lines. Approval stands.

Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs Outdated
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs
@ChristianPavilonis ChristianPavilonis changed the base branch from feature/ts-cli-base to main July 1, 2026 19:12
@ChristianPavilonis ChristianPavilonis merged commit b260260 into main Jul 1, 2026
18 checks passed
@aram356 aram356 deleted the feature/ts-cli-audit branch July 1, 2026 21:09
prk-Jr added a commit that referenced this pull request Jul 2, 2026
Resolve 22 conflicts from main's EdgeZero adapter/canary evolution against
the branch's config-store 503 work.

- settings_data.rs: union the branch's ConfigStoreUnavailable (503) read
  classification with main's Fastly chunk-length hardening; keep both test sets.
- adapter-fastly/main.rs: adopt main's rollout-percentage canary and
  settings_snapshot finalize refactor (branch 503 logic lives in error.rs).
- CLI audit: adopt main's #800 implementation over the branch's parallel copy.
- Manifests/lockfiles: take main's per-platform adapters and edgezero rev e483.
- proxy.rs: drop a duplicate request_body_bytes left by a false auto-merge.

Verified: fmt, clippy (5 targets), all adapter + core + cli test suites,
integration parity, and JS vitest all pass.
prk-Jr added a commit that referenced this pull request Jul 2, 2026
Rebuilds feature/ts-cli-ad-templates fresh on server-side-ad-templates-impl
(which already carries the ts CLI #799 + audit #800 via main), dropping the
redundant ts-cli-base history that caused the merge conflicts.

Adds the ad-template CLI:
- ts audit ad-templates verify — browser/CDP ad-template slot verification
- ts audit page — read-only page summary
- ts config ad-templates — server-side ad-template diagnostics
- shared CLI app-config loader + ad-template evidence/output models

Keeps #800's URL->draft-config bootstrap by relocating it under
audit/generate/ and exposing it as ts audit generate.

Core: extracts the server-side ad-stack gate into
creative_opportunities::evaluate_ad_stack_gate (three-state Yes/No/Unknown)
so the CLI and the runtime publisher path share one gate definition.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants