Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: don't handle SitePage nodes when deleting stale nodes #39161

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

pieh
Copy link
Contributor

@pieh pieh commented Nov 13, 2024

Description

  1. When we are clearing up stale nodes (nodes that exist from previous build that were not recreated in current build), we are deleting all SitePage nodes because at this point in time site creation didn't run yet. This results in mass nodes deletion and subsequent mass node creation - other than being wasteful this tends to lead to increase in LMDB datastore file size increasing as managing disk size is not priority for it so it tends to allocate more and more space over time with data mutations even if "logical" data remains at similar volume. This PR mainly looks to avoid wasteful LMDB operations and thus trying to slow down LMDB filesize increases over time
  2. When onCreatePage hook is used (either directly or by a plugin) it often result in temporary SitePage nodes - initial page object created, then plugins in their hook implementation might change a page path or page context. This result in unnecessary lmdb operations. This PR adds intermediate stage when page is create and there is at least one onCreatePage hook that doesn't commit db operations immediately and instead waits for onCreatePage hooks to finish and only commit final operation (this final operation might be no-op in case final page is the same as page from previous build which eliminates db mutations in those cases)

Tests

Existing tests (especially e2e and integration) still pass

https://linear.app/netlify/issue/FRB-1388/gatsby-mdb-file-size-balloons-from-120mb-to-400mb-when-not-adding

@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Nov 13, 2024
@pieh pieh added topic: core Relates to Gatsby's core (e.g. page loading, reporter, state machine) and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Nov 13, 2024
@pieh pieh force-pushed the perf/dont-cleanup-sitepage-node-after-source-nodes branch 2 times, most recently from 63be5c4 to 370b4b2 Compare December 4, 2024 11:01
@pieh pieh force-pushed the perf/dont-cleanup-sitepage-node-after-source-nodes branch from 370b4b2 to f611677 Compare December 4, 2024 11:32
@pieh pieh force-pushed the perf/dont-cleanup-sitepage-node-after-source-nodes branch from d0b1452 to c4adf6a Compare December 11, 2024 12:18
@@ -445,3 +452,99 @@ export const clearGatsbyImageSourceUrls =
type: `CLEAR_GATSBY_IMAGE_SOURCE_URL`,
}
}

// We add a counter to node.internal for fast comparisons/intersections
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the additions here were just moved from public.js file here to make it easier to reuse

@pieh pieh marked this pull request as ready for review December 17, 2024 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: core Relates to Gatsby's core (e.g. page loading, reporter, state machine)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants