Add Trusted Server audit command#800
Conversation
7fdbff6 to
2ea46f1
Compare
0afc3ee to
79911d6
Compare
2ea46f1 to
edb7f0c
Compare
79911d6 to
bdb9284
Compare
prk-Jr
left a comment
There was a problem hiding this comment.
Summary
Adds ts audit: loads a public page in a fresh headless Chrome/Chromium session, inventories rendered JS assets, detects known integrations, and writes a JS-asset report plus a draft trusted-server.toml. Clean AuditCollector trait makes the analysis browser-free and well-tested. One blocking item: the new scraper dependency breaks the integration dependency-parity CI gate.
Blocking
🔧 wrench
- Dependency-parity CI failure —
scraper = "0.24.0"added to the workspace conflicts with integration-tests' pinnedscraper = "0.21"; theprepare integration artifactsjob fails on markup5ever / match_token / scraper / selectors (plus uuid, web-sys, wasm-bindgen-futures). Align the versions or extend the allowlist — see inline onCargo.toml.
Non-blocking
🤔 thinking
- Inline substring integration detection is false-positive prone —
analyzer.rs:217; auto-enables gpt/didomi/datadome in the draft. - First-party classifier over-matches parent/eTLD domains —
analyzer.rs:169. - No overall navigation timeout —
browser_collector.rs:98; a hanging origin stallsts audit.
♻️ refactor
report_erroris a misleading no-op wrapper —error.rs:7; overlapscli_error.
⛏ nitpick
- Asymmetric URL resolution —
analyzer.rs:68; relativesrcfrom collector script tags is silently dropped.
👍 praise
AuditCollectortrait → browser-free unit tests viaFakeCollector; strong testability.- Browser hygiene: fresh temp user-data-dir per run,
close()always called, handler task aborted on close failure, no forced--no-sandbox. parse_audit_urlrestricts to http/https (blocks file://, data:, chrome://), with a test.- Overwrite protection + a pre-collection conflict check, so a refusal doesn't even launch Chrome.
- GTM container id constrained by
GTM-[A-Z0-9]+→ no TOML injection from page content into the draft. - audit module and all host-only deps correctly gated behind
cfg(not(target_arch = "wasm32")).
CI Status
- prepare integration artifacts (dependency parity): FAIL (caused by this PR)
- integration / edgezero / browser tests: SKIPPED (blocked by the failed prepare job)
- fmt / clippy / cargo test / vitest: not run — those workflows trigger only on PRs targeting
main, and this PR targetsfeature/ts-cli-base. Author reportscargo test --package trusted-server-clipassing (42 tests) locally.
aram356
left a comment
There was a problem hiding this comment.
Summary
ts audit is a well-structured, genuinely well-tested addition: a headless-Chrome page auditor that emits a js-assets.toml inventory and a draft trusted-server.toml. The AuditCollector trait makes the orchestration unit-testable without a browser, WASM gating is clean, and overwrite/--force safety is careful. One blocking issue: the new dependencies break CI. The remaining notes are heuristic-quality and design observations.
Local verification on aarch64-apple-darwin: cargo fmt --check, cargo clippy --all-targets -- -D warnings, and cargo test (37 passed) all pass for trusted-server-cli.
Blocking
🔧 wrench
- PR breaks CI — integration-tests dependency drift: new
scraper/chromiumoxidedeps bumped the rootCargo.lockbut the separatecrates/trusted-server-integration-testslockfile was not updated, failing the parity check. See inline comment onCargo.toml:34.
Non-blocking
🤔 thinking
- Public-suffix-unaware party classification (
analyzer.rs:169-177): suffix matching treatsexampleas first-party topublisher.example. - Loose substring integration detection (
analyzer.rs:216-224→audit.rs:282-291): incidental substrings likegptcan auto-enable a module in the draft config. - Bounded network capture (
browser_collector.rs:147-165): resource-timing buffer cap (~250) and no HTTP status; large pages may drop assets. - Hard-fail on 4xx/5xx (
browser_collector.rs:247-255): bot-protected pages (e.g. DataDome 403 to headless Chrome) yield no artifacts.
♻️ refactor
- Dead
method/statusfields (collector.rs:27-33): always"GET"/None, never read. - Redundant script collection (
analyzer.rs:49-95): parsed HTML anddocument.scriptsare the same set, deduped by URL.
🌱 seedling
- No timeout on
browser.close()(browser_collector.rs:75-78): could hang on an unresponsive browser. - Heavy dependency tree:
chromiumoxidepulls async-tungstenite/rustls/reqwest (+458 lockfile lines). Acceptable for a host-only operator CLI (no WASM/runtime impact); good that the fetcher feature is off so there's no Chromium auto-download.
⛏ nitpick
report_error/cli_erroroverlap (error.rs:7-9): both just.into();report_errordoesn't actually report.
📝 note
- ~1.9k lines of design/plan docs added under
docs/superpowers/. Scanned both files — only example/known-vendor domains, no secrets. Flagging the volume only.
👍 praise
- The
AuditCollectortrait +FakeCollectorseam and the thorough analyzer/orchestration tests (redirect warnings, dedup, relative-URL resolution, no-output guard, and collect-only-after-overwrite-check ordering) are excellent. Clean#[cfg(not(target_arch = "wasm32"))]gating throughout.
CI Status
- prepare integration artifacts: FAIL (dependency drift — see blocking)
- fmt (trusted-server-cli): PASS (local)
- clippy (trusted-server-cli): PASS (local)
- rust tests (trusted-server-cli): PASS (local, 37 passed)
fd0322c to
4c2f0ab
Compare
dfa8a29 to
20f8674
Compare
aram356
left a comment
There was a problem hiding this comment.
Re-review — all prior findings addressed
Re-reviewed at 20f867451 ("Address audit review feedback"). Every item from my earlier CHANGES_REQUESTED is resolved, and all CI checks are now green (including prepare integration artifacts, previously failing).
Prior findings — status
| Finding | Status |
|---|---|
| 🔧 CI: integration-tests dependency drift | Fixed — crates/trusted-server-integration-tests Cargo.lock/Cargo.toml updated; CI green |
| 🤔 Loose substring detection auto-enabling modules | Fixed — word-boundary regexes replace contains(); new ..._avoids_short_substring_matches test |
| 🤔 4xx/5xx hard-aborts the audit | Fixed — validate_navigation_response now returns Option<String>; HTTP errors warn and still write partial artifacts |
| 🤔 Bounded resource-timing capture (silent drop) | Mitigated — warns when entries reach the 250-entry buffer threshold |
| 🤔 Public-suffix-unaware party classification | Acknowledged — documented as an advisory heuristic |
♻️ Dead method/status fields |
Fixed — removed |
| ♻️ Redundant two-path script collection | Fixed — consolidated to the single script_tags source |
🌱 No timeout on browser.close() |
Fixed — wrapped in timeout(...); navigation timeouts added too |
⛏ report_error didn't report |
Fixed — now log::error!s the message |
Local verification on aarch64-apple-darwin: cargo fmt --check, cargo clippy --all-targets -- -D warnings, and cargo test (38 passed) all pass for trusted-server-cli. Correctness spot-checks: \bgoogletag\b correctly does not misfire on googletagmanager; the timeout(...).await.map_err(..)?.map_err(..)? unwrapping is correct; dropping the HTML <script> parse is safe because browser document.scripts .src values are already absolute (relative-URL and invalid-URL tests still pass).
Non-blocking notes
📝 note
- The new warning paths —
validate_navigation_responsereturningSome(..)and the resource-timing-buffer warning — have no dedicated unit test. - Party classification remains a host-suffix heuristic (now documented as such); an eTLD+1 comparison via a public-suffix list would be a future nicety.
CI Status
- prepare integration artifacts: PASS
- cargo fmt: PASS
- cargo test: PASS
- vitest: PASS
- integration tests (all adapters): PASS
aram356
left a comment
There was a problem hiding this comment.
Follow-up to my approval: anchoring the two non-blocking 📝 notes to their exact lines. Approval stands.
d32f149 to
6a28b7d
Compare
20f8674 to
8c3c3fc
Compare
Resolve 22 conflicts from main's EdgeZero adapter/canary evolution against the branch's config-store 503 work. - settings_data.rs: union the branch's ConfigStoreUnavailable (503) read classification with main's Fastly chunk-length hardening; keep both test sets. - adapter-fastly/main.rs: adopt main's rollout-percentage canary and settings_snapshot finalize refactor (branch 503 logic lives in error.rs). - CLI audit: adopt main's #800 implementation over the branch's parallel copy. - Manifests/lockfiles: take main's per-platform adapters and edgezero rev e483. - proxy.rs: drop a duplicate request_body_bytes left by a false auto-merge. Verified: fmt, clippy (5 targets), all adapter + core + cli test suites, integration parity, and JS vitest all pass.
Rebuilds feature/ts-cli-ad-templates fresh on server-side-ad-templates-impl (which already carries the ts CLI #799 + audit #800 via main), dropping the redundant ts-cli-base history that caused the merge conflicts. Adds the ad-template CLI: - ts audit ad-templates verify — browser/CDP ad-template slot verification - ts audit page — read-only page summary - ts config ad-templates — server-side ad-template diagnostics - shared CLI app-config loader + ad-template evidence/output models Keeps #800's URL->draft-config bootstrap by relocating it under audit/generate/ and exposing it as ts audit generate. Core: extracts the server-side ad-stack gate into creative_opportunities::evaluate_ad_stack_gate (three-state Yes/No/Unknown) so the CLI and the runtime publisher path share one gate definition.
Summary
ts auditas a Trusted Server-specific page audit command on top of the base CLI.Changes
crates/trusted-server-cli/src/audit.rscrates/trusted-server-cli/src/audit/analyzer.rscrates/trusted-server-cli/src/audit/browser_collector.rscrates/trusted-server-cli/src/audit/collector.rscrates/trusted-server-cli/src/args.rs,src/run.rs,src/lib.rsts auditinto CLI parsing and dispatch.Cargo.toml,crates/trusted-server-cli/Cargo.toml,Cargo.lockREADME.md,docs/guide/cli.md,docs/guide/getting-started.mddocs/superpowers/...audit...Closes
No issue provided; this PR is split out from the combined
feature/ts-cli-nextbranch.Test plan
cargo test --workspacecargo clippy --workspace --all-targets --all-features -- -D warningscargo fmt --all -- --checkcd crates/js/lib && npx vitest runcd crates/js/lib && npm run formatcd docs && npm run formatcargo build --package trusted-server-adapter-fastly --release --target wasm32-wasip1fastly compute servecargo test --package trusted-server-cli --target aarch64-apple-darwin— 42 passedChecklist
unwrap()in production code — useexpect("should ...")println!)