src: cache missing package.json files in the C++ package config cache #60425

Merged: nodejs-github-bot merged 2 commits into nodejs:main from michaelsmithxyz:fix_module_loading_perf_regression on Jan 20, 2026
@michaelsmithxyz michaelsmithxyz commented Oct 27, 2025

Fixes #60397

In #59888, the nearest-parent package.json cache in package_json_reader.js was changed from a map keyed by module path (each entry holding a representation of that module's parent package.json file) to a map from package.json paths to a deserialized representation of their content. This addressed excessive memory usage caused by repeatedly caching identical deserialized package.json objects for modules that shared a parent package.json, but it also reintroduced a filesystem traversal in package_json_reader.js that finds the nearest parent package.json file for a given module. The stat calls in this traversal are not cached, so we can end up issuing them repeatedly for the same paths. In the reported issue, this leads to poor performance for users on potentially high-latency network filesystems. Similarly poor performance is also observed in Node versions that lack #59086, which (re)introduced the JS-side cache initially.
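To make the cost of the uncached traversal concrete, here is a minimal sketch, not the actual package_json_reader.js code. The names `makeProbe` and `findNearestPackageJSON` and the example paths are hypothetical, and the filesystem is replaced by an in-memory set so that every existence check can be counted.

```javascript
'use strict';
const path = require('node:path');

// Hypothetical stand-in for a stat call: records every path it is asked about.
function makeProbe(existing) {
  const probes = [];
  const exists = (p) => {
    probes.push(p);
    return existing.has(p);
  };
  return { probes, exists };
}

// Walk up from the module's directory until a package.json is found.
function findNearestPackageJSON(modulePath, exists) {
  let dir = path.posix.dirname(modulePath);
  for (;;) {
    const candidate = path.posix.join(dir, 'package.json');
    if (exists(candidate)) return candidate;
    const parent = path.posix.dirname(dir);
    if (parent === dir) return undefined; // reached the filesystem root
    dir = parent;
  }
}

const { probes, exists } = makeProbe(
  new Set(['/app/node_modules/pkg/package.json']),
);
const modules = [
  '/app/node_modules/pkg/lib/a/x.js',
  '/app/node_modules/pkg/lib/a/y.js',
  '/app/node_modules/pkg/lib/b/z.js',
];
const found = modules.map((m) => findNearestPackageJSON(m, exists));
const uniqueProbes = new Set(probes).size;

// Three modules under one package issue 9 probes, only 4 of them distinct.
console.log(found[0], probes.length, uniqueProbes);
```

On a local disk the duplicate probes are cheap; on a high-latency filesystem each one is a round trip, which is where the regression shows up.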

This PR addresses this by unwinding the changes in #59888 and instead making the C++-side package.json cache a bit more expressive: it caches both a deserialized representation of a package.json file at a given path and an indicator that no such file exists (modeled as a std::optional). This addresses the poor performance reported in #60397 by:

  1. Removing the repeated stat calls in package_json_reader.js
  2. Avoiding repetitive attempts to read non-existent package.json paths on the C++ side, which also perform poorly on high-latency filesystems
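The idea of caching "known missing" alongside parsed content can be sketched as follows. This is a JavaScript analogy only: the PR implements it in the C++ cache with std::optional, whereas this sketch uses `null` as the "no such file" marker. `makeReader` and `PackageConfigCache` are hypothetical names, and the reader is an in-memory stand-in for the filesystem.

```javascript
'use strict';

// Hypothetical reader: returns parsed contents, or undefined when the file
// is absent. It counts calls so that cache behaviour can be observed.
function makeReader(files) {
  let reads = 0;
  return {
    read(p) {
      reads++;
      return files.has(p) ? JSON.parse(files.get(p)) : undefined;
    },
    get reads() {
      return reads;
    },
  };
}

class PackageConfigCache {
  // path -> parsed config, or null meaning "looked up before, file is missing"
  #entries = new Map();
  #reader;

  constructor(reader) {
    this.#reader = reader;
  }

  get(p) {
    if (this.#entries.has(p)) {
      // Either a cached config or a cached miss; no read is issued.
      return this.#entries.get(p) ?? undefined;
    }
    const config = this.#reader.read(p);
    this.#entries.set(p, config ?? null);
    return config;
  }
}

const reader = makeReader(new Map([['/app/package.json', '{"name":"app"}']]));
const cache = new PackageConfigCache(reader);

const hit1 = cache.get('/app/package.json');
const hit2 = cache.get('/app/package.json');
const miss1 = cache.get('/app/src/package.json');
const miss2 = cache.get('/app/src/package.json');

// Four lookups, but only two reads: one per distinct path, hit or miss.
console.log(hit1 === hit2, miss1, miss2, reader.reads);
```

Without the negative entry, every lookup of `/app/src/package.json` would go back to the filesystem, which is the second cost listed above.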

While analyzing the performance of these changes, I noticed a confounding factor: the lazy parsing and caching of imports and exports on deserialized package configuration objects in deserializePackageJSON wasn't working as expected, and this was also contributing to the varying performance we've been seeing across these changes:

  1. The attempt to define lazy properties to parse and cache the JSON on demand didn't work as expected because the resulting object was immediately spread, meaning we'd immediately run the JSON parsing code anyway
  2. Because the parsed representation of imports and exports is cached on deserialized package.json objects, it's important that a given package.json file map to the same deserialized object. If we don't do this, we repeatedly re-parse these fields redundantly across calls. This motivates the sort of strange two-level caching scheme in getNearestParentPackageJSON that these changes introduce. The downside here is that we potentially redundantly call into modulesBinding.getNearestParentPackageJSON for a given path just to resolve the path to a package.json file that we may already have cached, but I don't see any way to avoid this.
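The first point can be reproduced in isolation. The following is a minimal sketch, assuming a getter-based lazy property; `defineLazyJSON` is a hypothetical helper and not the code in deserializePackageJSON. It shows that object spread reads every enumerable own property, so a lazy getter runs immediately and the copy ends up with a plain data property.

```javascript
'use strict';

let parseCount = 0;

// Hypothetical helper: defines a getter that parses `source` on first access
// and returns the cached result afterwards.
function defineLazyJSON(target, key, source) {
  let cached;
  let parsed = false;
  Object.defineProperty(target, key, {
    enumerable: true,
    configurable: true,
    get() {
      if (!parsed) {
        parseCount++;
        cached = JSON.parse(source);
        parsed = true;
      }
      return cached;
    },
  });
  return target;
}

const lazy = defineLazyJSON({ name: 'pkg' }, 'exports', '{".":"./index.js"}');
const before = parseCount; // nothing has been parsed yet

const copy = { ...lazy }; // spread invokes the getter to copy the value
const after = parseCount; // the parse has already happened

// On the copy, `exports` is an ordinary data property, not a getter.
const desc = Object.getOwnPropertyDescriptor(copy, 'exports');

console.log(before, after, typeof desc.get);
```

The second point follows from the same mechanism: the cached value lives on one particular object, so two distinct deserialized objects for the same package.json file each pay for their own parse.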

Benchmarks

I benchmarked this change with the same scripts I used in #59888. The first is the reproduction script from #58126:

require('dd-trace').init();
const cdk = require('aws-cdk-lib');

const app = new cdk.App();
for (let i = 0; i < 1000; i++) {
  new cdk.Stack(app, `DdTraceStack${i}`);
}

The second is this:

for (let i = 0; i < 1000; i++) {
  require('date-fns');
}

Each benchmark compares v22.19.0 (which does not include #59888), v25.1.0 (the latest current release, which does include #59888), and this change (which is just the node directory in the output).

Fast disk

ddtrace + CDK

➜ hyperfine --warmup 10 -L node_path ../node/node,../node_worktrees/v25.1.0/node,../node_worktrees/v22.19.0/node "{node_path} dd-cdk-benchmark.js"
Benchmark 1: ../node/node dd-cdk-benchmark.js
  Time (mean ± σ):     161.6 ms ±   1.6 ms    [User: 170.7 ms, System: 20.9 ms]
  Range (min … max):   159.2 ms … 164.9 ms    18 runs

Benchmark 2: ../node_worktrees/v25.1.0/node dd-cdk-benchmark.js
  Time (mean ± σ):     164.9 ms ±   0.9 ms    [User: 174.7 ms, System: 22.8 ms]
  Range (min … max):   163.7 ms … 166.8 ms    17 runs

Benchmark 3: ../node_worktrees/v22.19.0/node dd-cdk-benchmark.js
  Time (mean ± σ):     169.5 ms ±   1.0 ms    [User: 172.4 ms, System: 20.6 ms]
  Range (min … max):   167.9 ms … 172.0 ms    17 runs

Summary
  ../node/node dd-cdk-benchmark.js ran
    1.02 ± 0.01 times faster than ../node_worktrees/v25.1.0/node dd-cdk-benchmark.js
    1.05 ± 0.01 times faster than ../node_worktrees/v22.19.0/node dd-cdk-benchmark.js

date-fns

➜ hyperfine --warmup 10 -L node_path ../node/node,../node_worktrees/v25.1.0/node,../node_worktrees/v22.19.0/node "{node_path} date-fns-benchmark.js"
Benchmark 1: ../node/node date-fns-benchmark.js
  Time (mean ± σ):      71.3 ms ±   1.1 ms    [User: 73.4 ms, System: 13.3 ms]
  Range (min … max):    69.7 ms …  75.2 ms    41 runs

Benchmark 2: ../node_worktrees/v25.1.0/node date-fns-benchmark.js
  Time (mean ± σ):      66.3 ms ±   0.7 ms    [User: 66.8 ms, System: 10.4 ms]
  Range (min … max):    64.9 ms …  69.1 ms    43 runs

Benchmark 3: ../node_worktrees/v22.19.0/node date-fns-benchmark.js
  Time (mean ± σ):     115.9 ms ±   0.7 ms    [User: 139.6 ms, System: 18.4 ms]
  Range (min … max):   114.1 ms … 117.5 ms    25 runs

Summary
  ../node_worktrees/v25.1.0/node date-fns-benchmark.js ran
    1.07 ± 0.02 times faster than ../node/node date-fns-benchmark.js
    1.75 ± 0.02 times faster than ../node_worktrees/v22.19.0/node date-fns-benchmark.js

Slow disk

I emulated this by mounting an NFS volume from localhost with noac (to disable most caching).

ddtrace + CDK

➜ hyperfine --warmup 10 -L node_path ../../node/node,../../node_worktrees/v25.1.0/node,../../node_worktrees/v22.19.0/node "{node_path} dd-cdk-benchmark.js"
Benchmark 1: ../../node/node dd-cdk-benchmark.js
  Time (mean ± σ):      3.976 s ±  1.193 s    [User: 0.210 s, System: 0.148 s]
  Range (min … max):    1.924 s …  5.314 s    10 runs

Benchmark 2: ../../node_worktrees/v25.1.0/node dd-cdk-benchmark.js
  Time (mean ± σ):      7.668 s ±  2.805 s    [User: 0.210 s, System: 0.345 s]
  Range (min … max):    5.307 s … 12.359 s    10 runs

Benchmark 3: ../../node_worktrees/v22.19.0/node dd-cdk-benchmark.js
  Time (mean ± σ):      3.810 s ±  1.343 s    [User: 0.205 s, System: 0.167 s]
  Range (min … max):    2.266 s …  5.542 s    10 runs

Summary
  ../../node_worktrees/v22.19.0/node dd-cdk-benchmark.js ran
    1.04 ± 0.48 times faster than ../../node/node dd-cdk-benchmark.js
    3.01 ± 1.02 times faster than ../../node_worktrees/v25.1.0/node dd-cdk-benchmark.js

date-fns

➜ hyperfine --warmup 10 -L node_path ../../node/node,../../node_worktrees/v25.1.0/node,../../node_worktrees/v22.19.0/node "{node_path} date-fns-benchmark.js"
Benchmark 1: ../../node/node date-fns-benchmark.js
  Time (mean ± σ):     699.7 ms ±  20.3 ms    [User: 65.4 ms, System: 46.3 ms]
  Range (min … max):   668.5 ms … 727.1 ms    10 runs

Benchmark 2: ../../node_worktrees/v25.1.0/node date-fns-benchmark.js
  Time (mean ± σ):     977.5 ms ±  41.0 ms    [User: 68.1 ms, System: 62.0 ms]
  Range (min … max):   923.2 ms … 1038.0 ms    10 runs

Benchmark 3: ../../node_worktrees/v22.19.0/node date-fns-benchmark.js
  Time (mean ± σ):     825.4 ms ± 311.2 ms    [User: 112.7 ms, System: 60.5 ms]
  Range (min … max):   542.0 ms … 1350.4 ms    10 runs

Summary
  ../../node/node date-fns-benchmark.js ran
    1.18 ± 0.45 times faster than ../../node_worktrees/v22.19.0/node date-fns-benchmark.js
    1.40 ± 0.07 times faster than ../../node_worktrees/v25.1.0/node date-fns-benchmark.js

Labels

c++: Issues and PRs that require attention from people who are familiar with C++.
commit-queue-squash: Add this label to instruct the Commit Queue to squash all the PR commits into the first one.
module: Issues and PRs related to the module subsystem.
needs-ci: PRs that need a full CI run.
typings

Development

Successfully merging this pull request may close these issues.

Perf regression in Node 22/24 when loading JS files

6 participants