Perry Performance Run Log

2026-06-17 - Typed feedback registration hoist

Start revision: e816fc3e4af1
Branch: codex/perry-performance-20260617
Worker assignment: single Codex pass in this worktree
Benchmark environment: Linux /usr/bin/time -v; local node cannot execute .ts benchmark inputs, so Node columns and correctness comparisons were skipped by the harness
Baseline commands:
- cargo build --release
- ./benchmarks/quick.sh
- ./benchmarks/compare.sh --quick --runs 3 --warn-only --json-out /tmp/perry-baseline-e816fc3e4.json
Baseline results:
- quick: fibonacci 260ms/18MB, math_intensive 73ms/18MB, nested_loops 3508ms/17MB, factorial 95ms/18MB, matrix_multiply 6462ms/27MB
- compare quick medians: loop_overhead 74ms/18772KB, fibonacci 262ms/18696KB, math_intensive 70ms/18696KB, nested_loops 3383ms/17724KB, factorial 96ms/18836KB
Selected gap and evidence:
- nested_loops dominated the quick compare set at 3383ms; matrix_multiply was the slowest quick.sh case at 6462ms.
- LLVM trace for benchmarks/suite/10_nested_loops.ts showed js_typed_feedback_register_site(...) emitted inside the hot for.body.21 inner loop before each typed-feedback array guard.
Change:
- Added LlFunction::entry_setup_call_void and changed typed-feedback site registration to emit once in function-entry setup instead of at every guard use site.
- Kept guard, fallback, pass, and counter calls at original use sites so runtime evidence semantics remain per-use.
- Updated benchmark harnesses to support Linux RSS measurement and skip Node .ts columns when the installed Node cannot run TypeScript directly.
Post-change benchmark commands:
- cargo build --release
- ./benchmarks/compare.sh --quick --runs 3 --warn-only --json-out /tmp/perry-final-e816fc3e4.json
- ./benchmarks/quick.sh
Post-change results:
- compare quick medians: loop_overhead 74ms/18768KB, fibonacci 261ms/18920KB, math_intensive 69ms/18944KB, nested_loops 956ms/19152KB, factorial 94ms/18896KB
- quick: fibonacci 262ms/18MB, math_intensive 55ms/18MB, nested_loops 965ms/18MB, factorial 75ms/18MB, matrix_multiply 1842ms/28MB
Measured impact:
- 10_nested_loops compare median: 3383ms -> 956ms, 71.7% faster
- 16_matrix_multiply quick: 6462ms -> 1842ms, 71.5% faster
Verification:
- bash -n benchmarks/quick.sh
- bash -n benchmarks/compare.sh
- cargo fmt --check
- cargo test -p perry-codegen --test typed_feedback
- PERRY_BIN=target/release/perry python3 tests/test_typed_feedback_runtime_evidence.py
- tests/test_benchmark_output_verifier.sh
- target/release/perry compile --no-cache benchmarks/suite/10_nested_loops.ts -o /tmp/perry-nested-loops-final --trace llvm --quiet; trace confirmed registration calls in entry setup only and no registration calls in for.body.21
- /tmp/perry-nested-loops-final produced nested_loops:963 and sum:26991000000
- target/release/perry compile --no-cache benchmarks/suite/16_matrix_multiply.ts -o /tmp/perry-matrix-multiply-final --quiet && /tmp/perry-matrix-multiply-final produced matrix_multiply:1778 and checksum:41079519680
Notes:
- benchmarks/baseline.json is stale for this Linux environment; compare was run with --warn-only and the before/after comparison above uses the captured local baseline JSON.
- Follow-up candidates remain in typed array and numeric array hot paths, but this cycle stopped at the isolated registration-hoist optimization.
PR: #5295

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Perry Performance Run Log

2026-06-17 - Typed feedback registration hoist

Uh oh!

FilesExpand file tree

PERF_RUN_LOG.md

Latest commit

History

PERF_RUN_LOG.md

File metadata and controls

Perry Performance Run Log

2026-06-17 - Typed feedback registration hoist