feat(debug-files): scan inside .zip archives on upload#1141
Conversation
Add ZIP archive scanning to `debug-files upload`, matching the legacy sentry-cli behavior. `.zip` files encountered during a scan are expanded in memory and their entries run through the same filter pipeline as on-disk files. Disable with `--no-zips`. - New `src/lib/dif/zip.ts`: `readZipDifEntries()` detects archives by `.zip` extension + `PK` magic, extracts entries via `fflate.unzipSync`, and bounds decompression with a pre-decompression size filter (zip-bomb guard). Directory and empty entries are dropped; nested archives are not recursed. - `prepareDifs()` gains a `scanZips` option (default true); the per-file path is extracted into `prepareFileDif()` and the parse step into the shared `difFromBuffer()` so on-disk and in-memory entries share logic. - `upload` command: add `--no-zips` flag; thread `scanZips` through. Tests: 12 zip unit tests + 2 command wiring tests. Docs/skill updated.
|
Codecov Results 📊✅ Patch coverage is 89.80%. Project has 5114 uncovered lines. Files with missing lines (2)
Coverage diff@@ Coverage Diff @@
## main #PR +/-##
==========================================
+ Coverage 81.48% 81.52% +0.04%
==========================================
Files 396 397 +1
Lines 27588 27671 +83
Branches 17912 17966 +54
==========================================
+ Hits 22478 22557 +79
- Misses 5110 5114 +4
- Partials 1866 1868 +2Generated by Codecov Action |
Source bundles (and JVM bundles / .src.zip) are a ZIP preceded by an 8-byte SYSB+version header, so they start with SYSB, not PK. Verify readZipDifEntries returns null for them so they upload as sourcebundle DIFs rather than being expanded into their inner source files.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 4a41f2e. Configure here.
CI status: all green except an external Anthropic API outageEvery check passes except
This PR does not touch the skill or any LLM path. Will re-run once the Anthropic API recovers. |
…ession Node 24.17.0 / 22.23.0 (CVE-2026-48931 http.Agent fix) added a `data` listener on idle sockets that makes keep-alive fetch reuse throw false ERR_STREAM_PREMATURE_CLOSE errors. The skill-eval E2E planner hits this as 'Premature close' talking to api.anthropic.com. Fixed in 24.18.0 (nodejs/node#64004). A floating `node-version: "24"` silently reuses the runner's pre-cached buggy 24.17.0, so pin the exact patched version.
|
Correction: the The E2E job used floating |
Addresses Seer/Bugbot/warden comments and an adversarial review of the in-place .zip scanner: - Bound cumulative decompression per archive (maxTotalSize, default 2 GiB). unzipSync materializes every accepted entry at once, so the per-entry gate alone could not stop a flat zip-bomb or a huge legit archive from exhausting memory. - Gate the container's on-disk size before reading it into memory, mirroring the on-disk peek-then-read discipline (warden M2). - Skip entries using a compression method fflate cannot inflate instead of letting unzipSync throw and discard the whole archive (and its valid sibling DIFs). - Make oversized zip entries advisory-only: they warn per entry but no longer feed the exit-driving oversizedCount. A compressed entry's format is unknown pre-decompression, so counting it produced false "all matched files too large" failures when an unrelated large asset sat inside a .zip (Bugbot/H2). - Guard non-finite originalSize as defense-in-depth (Seer). Adds regression tests for unsupported-compression survival, the cumulative budget, the container gate, and the advisory oversized accounting.

Summary
Adds ZIP archive scanning to
sentry debug-files upload, matching the legacysentry-clibehavior (try_open_zip/walk_difs_zip). This is Tier B of thedebug-files uploadroadmap (Tier A core upload landed in #1139, limits +--derived-datain #1140)..zipfiles encountered during a scan are now expanded in memory and their entries run through the same filter pipeline (--type/--id/feature filters, size gate) as on-disk files. Pass--no-zipsto skip them.Implementation
src/lib/dif/zip.ts—readZipDifEntries():.zipextension andPKlocal-header magic (a misnamed non-archive falls back to normal parsing; a real DIF namedfoo.zipstill works).fflate.unzipSync.filterrejects entries whose declared uncompressed size exceeds the server'smax_file_size(or omitted = no gate) before inflation, so oversized entries are never materialized. Directory/empty entries are dropped..zipinside a.zipis an opaque, non-object entry), matching the legacy tool.prepareDifs()gains ascanZipsoption (defaulttrue). The on-disk per-file logic is extracted intoprepareFileDif(), and the parse step into a shareddifFromBuffer()so on-disk files and in-memory ZIP entries share the same parse + filter path. Entry display paths are synthetic"<zip>/<entry>"so logs and DIF names stay meaningful.uploadcommand — adds the--no-zipsflag and threadsscanZips: !flags["no-zips"]through.Notes / scope
fflateadded as a devDependency (bundled at build time;check:depspasses).unzipSyncis non-streaming); typical symbol zips are tens/hundreds of MB. Per-entry decompression remains bounded by the size gate.--symbol-maps(BCSymbolMap) and--il2cpp-mappingline mappings — these need newsymbolicWASM exports.Testing
test/lib/dif/zip.test.ts— 12 unit tests (magic/extension detection, malformed input, size gate, no-recursion,scanZipstoggle,--typefiltering, container not double-counted, misnamed-archive fallback). Archives built in memory withfflate.zipSync— no binary fixtures.test/commands/debug-files/upload.test.ts— 2 command-level tests (.zipscanned by default;--no-zipsignores entries).dif+debug-filestests pass;typecheck,lint,check:deps,check:patches,check:fragmentsclean.