src: cache missing package.json files in the C++ package config cache#60425
Merged
nodejs-github-bot merged 2 commits intoJan 20, 2026
Merged
Conversation
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #60397
In #59888 the nearest parent package JSON cache
package_json_reader.jswas adjusted from a map from any given module path to a representation of its parentpackage.jsonfile to a map frompackage.jsonpaths to a deserialized representation of their content. This addressed excessive memory usage caused by repeatedly caching identical deserializedpackage.jsonobjects for modules that shared a parentpackage.json, but also reintroduced a filesystem traversal inpackage_json_reader.jswhich finds the nearest parentpackage.jsonfile for a given module. Thestatcalls in this traversal are not cached, so we end up potentially issuing them for a bunch of duplicate paths. In the reported issue, this leads to poor performance for users using potentially high-latency network filesystems. Similar poor performance is also observed in Node versions that lack #59086, which (re)introduced the JS-side cache initially.This PR addresses this by unwinding the changes in #59888 and instead making the C++-side
package.jsoncache a bit more expressive, caching both a deserialized representation of apackage.jsonfile at a given path, as well as an indicator if no such file exists (modeled as anstd::optional). This addresses the poor performance reported in #60397 by:statcalls inpackage_json_reader.jspackage.jsonpaths on the C++ side, which also perform poorly on high-latency filesystemsWhile analyzing the performance of these changes, I noticed a confounding factor which is that the lazy-parsing and caching of
importsandexportson deserialized package configuration objects indeserializePackageJSONwasn't working as expected and was also contributing to the varying performance we've been seeing across these changes:importsandexportsis cached on deserializedpackage.jsonobjects, it's important that a givenpackage.jsonfile map to the same deserialized object. If we don't do this, we repeatedly re-parse these fields redundantly across calls. This motivates the sort of strange two-level caching scheme ingetNearestParentPackageJSONthat these changes introduce. The downside here is that we potentially redundantly call intomodulesBinding.getNearestParentPackageJSONfor a given path just to resolve the path to apackage.jsonfile that we may already have cached, but I don't see any way to avoid this.Benchmarks
I benchmarked this change with the same scripts I used in #59888. The first is the reproduction script from #58126:
The second is this:
Each benchmark compares v22.19.0 (which does not include #59888), v25.1.0 (the latest current release, which does include #59888), and this change (which is just the
nodedirectory in the output).Fast disk
ddtrace+ CDKdate-fnsSlow disk
I emulated this by mounting an NFS volume from
localhostwithnoac(to disable most caching).ddtrace+ CDKdate-fns