Code Splitting and Prefetching
Code splitting in Gatsby occurs during webpack compilation via Dynamic Splitting. During compilation, if webpack finds an
import() call, it splits the imported file into a separate bundle. Modules that are instead loaded with
require are not code split; they are included in the original bundle.
During the Write Out Pages bootstrap phase, you output .cache/async-requires.js. This file is key to code splitting. It exports a components object that maps ComponentChunkNames to functions that import the component’s file on disk.
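A minimal sketch of what such a mapping might look like is shown here. It is an illustration rather than the generated file itself: the path /home/site/src/blog.js is hypothetical, and the real file exports the object instead of declaring a local constant.

```javascript
// Sketch of the mapping in .cache/async-requires.js (the path is hypothetical).
// Each value is a function, so nothing is loaded until it is called.
const components = {
  "component---src-blog-js": () =>
    import(
      "/home/site/src/blog.js" /* webpackChunkName: "component---src-blog-js" */
    ),
}

// The keys are the ComponentChunkNames used to look components up later.
const componentChunkNames = Object.keys(components)
```

Because every entry wraps its import() in a function, webpack sees one dynamic import per page component and can split each into its own chunk.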
The entry point to webpack (production-app.js) references ./async-requires.js, so webpack analyzes it and finds the object mapping components to functions that dynamically import page component files. Following dynamic splitting, webpack then creates a separate chunk for each of those imported files.
async-requires also exports a data function that dynamically imports the data.json file so that it too is code split.
Great! You’ve told webpack where you want to code split. But how will these chunks be named on disk? webpack lets you customize this via the chunkFilename configuration in the output section, which Gatsby sets in webpack.config.js to a template that combines a name with a content hash.
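As a rough sketch of what such a setting might look like: the template string below is an assumption inferred from the name and content hash discussed in this section, and renderChunkFilename is a hypothetical helper that only imitates webpack’s placeholder substitution for illustration.

```javascript
// Assumed shape of the output section; check webpack.config.js for the real value.
const output = {
  chunkFilename: "[name]-[contenthash].js",
}

// Hypothetical helper imitating how the placeholders are filled in.
// When a chunk has no name, the chunk id is used in its place.
function renderChunkFilename(template, chunk) {
  const name = chunk.name !== undefined ? chunk.name : String(chunk.id)
  return template
    .replace("[name]", name)
    .replace("[contenthash]", chunk.contentHash)
}

const named = renderChunkFilename(output.chunkFilename, {
  id: 7,
  name: "component---src-blog-js",
  contentHash: "2e49587d85e03a033f58",
})

const unnamed = renderChunkFilename(output.chunkFilename, {
  id: 7,
  contentHash: "2e49587d85e03a033f58",
})
```

Here named carries the chunk name in the file name, while unnamed falls back to the numeric id, which is why giving chunks explicit names matters.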
Content hash is a hash of the contents of the chunk that was code split. But what about name? You haven’t told webpack the “name” of the chunk, other than the filename, which can’t be a name since it has slashes in it. Normally, webpack would replace [name] with [id] (see webpack docs). But it gives you the opportunity to specify the name instead as a comment in the import() call. In the example above, that’s what the /* webpackChunkName: "component---src-blog-js" */ is doing.
Before showing how Gatsby maps components to the generated bundle names, it helps to understand how webpack chunks work. A chunk group represents a logical code split, e.g. a Gatsby page or the Gatsby core app. Chunk groups may share code or libraries. webpack detects these and splits them out into shared chunks, e.g. a chunk for React and other libraries. What is left over for each chunk group is a chunk holding the code specific to that group. This is most easily explained by the graph below.
In the above graph, you can see 3 chunk groups: 2 pages and the core Gatsby app. The two pages share several libraries. webpack found these common dependencies and created chunks for them, with ids 0 and 1, and both page chunkGroups depend on them. Each page also depends on its own chunk, which represents the page’s core code (from its source in the Gatsby site). These are id 7 for component---src-blog-1-js and id 8 for the other page.
You can also see the chunk group for app. It shares no dependencies with the pages, but it does include the webpack runtime, whose name is declared in webpack.config.js.
Remember that the chunk group name was assigned by the /* webpackChunkName: ... */ comment in the previous section.
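The relationship between chunk groups and chunks described above can be modelled with plain data. This is only an illustration: the ids 0, 1, 7 and 8 come from the text, while the second page’s name, the ids of the app chunks and the sharedChunks helper are assumptions.

```javascript
// Chunk groups keyed by name; each lists the ids of the chunks it depends on.
const chunkGroups = {
  "component---src-blog-1-js": [0, 1, 7],
  "component---src-blog-2-js": [0, 1, 8], // name assumed for the second page
  app: [2, 3], // ids assumed; this group includes the webpack runtime chunk
}

// Hypothetical helper: the chunk ids that two chunk groups have in common.
function sharedChunks(groups, a, b) {
  return groups[a].filter(id => groups[b].includes(id))
}

const sharedByPages = sharedChunks(
  chunkGroups,
  "component---src-blog-1-js",
  "component---src-blog-2-js"
)
const sharedWithApp = sharedChunks(chunkGroups, "app", "component---src-blog-1-js")
```

In this model the two pages share chunks 0 and 1, and app shares nothing with either page, which matches the description of the graph.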
webpack can now split your code into different bundles, and you’ve named them appropriately. But those bundles will still be named with a content hash. E.g. for a component component---src-blog-js, the output chunk bundle might be named something like component---src-blog-js-2e49587d85e03a033f58.js, and only webpack knows that generated name.
HTML file generation is covered under the Page HTML Generation docs. In summary, webpack builds static-entry.js, which is used to generate each page’s HTML.
To do this, you need to be able to create <script> tags in the HTML for the Gatsby runtime chunk and the page chunk (e.g. index). But as mentioned above, only webpack knows the generated filename for each chunk. All Gatsby knows is the chunk name it assigned.
It turns out that webpack provides a way to record the mapping: a compilation hook called done that you can register for. The hook receives a stats data structure that contains all the chunkGroups (remember that the chunk group’s name is the componentChunkName). Each chunk group contains a list of the chunks it depends on. Gatsby provides a custom webpack plugin called GatsbyWebpackStatsExtractor that implements this hook and writes the chunk information to /public/webpack.stats.json (under the assetsByChunkName key).
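To make the mechanism concrete, here is a minimal sketch of a plugin in the spirit of GatsbyWebpackStatsExtractor. It is not Gatsby’s implementation: the class name, the onStats callback (which stands in for writing the JSON file), the fake compiler and all file names are assumptions made so the example runs without webpack installed.

```javascript
// Collects, for each named chunk group, the asset files of its chunks.
function extractAssetsByChunkName(chunkGroups) {
  const assetsByChunkName = {}
  for (const group of chunkGroups) {
    if (!group.name) continue
    const files = []
    for (const chunk of group.chunks) {
      // chunk.files is an array in webpack 4 and a Set in webpack 5
      files.push(...chunk.files)
    }
    assetsByChunkName[group.name] = files
  }
  return assetsByChunkName
}

// Hypothetical plugin: taps the done hook and hands the mapping to a callback.
class StatsExtractorSketch {
  constructor(onStats) {
    this.onStats = onStats
  }
  apply(compiler) {
    compiler.hooks.done.tap("StatsExtractorSketch", stats => {
      const { chunkGroups } = stats.compilation
      this.onStats({ assetsByChunkName: extractAssetsByChunkName(chunkGroups) })
    })
  }
}

// Stand-in for a webpack compiler, only to exercise the plugin.
let doneCallback
const fakeCompiler = {
  hooks: { done: { tap: (_name, fn) => (doneCallback = fn) } },
}

let written
new StatsExtractorSketch(result => (written = result)).apply(fakeCompiler)

// Simulate the end of a compilation with made-up file names.
doneCallback({
  compilation: {
    chunkGroups: [
      { name: "app", chunks: [{ files: ["webpack-runtime-aaa.js"] }, { files: ["app-bbb.js"] }] },
      {
        name: "component---src-blog-1-js",
        chunks: [{ files: new Set(["0-ccc.js"]) }, { files: ["component---src-blog-1-js-ddd.js"] }],
      },
    ],
  },
})
```

After the simulated compilation, written.assetsByChunkName maps each chunk group name to the list of asset files it depends on, which is the shape the rest of this section relies on.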
webpack.stats.json maps chunk groups (componentChunkNames) to the chunk asset names they depend on. The same Gatsby webpack compiler hook also outputs a second file.
These two files are loaded by static-entry.js so that it can look up chunk assets for componentChunkNames. This occurs in two places.
As mentioned above, the chunks a page needs are added as link tags in the <head> (see link tag preloading), and then referenced at the bottom of the body in script tags.
The Gatsby runtime bundle is called app (output name from webpack.config.js). You look up app in the stats mapping to get its chunk asset files. Then you do the same for the component by looking up the same collection by its chunk name (e.g. component---src-blog-2-js). These two chunk asset arrays are merged together. For each chunk in the merged array, you create a preload link and add it to the headComponents.
rel="preload" tells the browser to start downloading this resource with a high priority, as it will likely be referenced further down in the document. With luck, the resource has already arrived from the server by the time the browser reaches that reference.
Then, at the end of the body, you include the actual script tag that references the preloaded asset.
If the asset is CSS, you inject it inline in the head.
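The lookup, merge and tag generation described above can be sketched as follows. This is a simplified illustration under stated assumptions: it builds plain strings rather than the components Gatsby actually creates, buildTags and readCss are hypothetical names, and every file name is made up.

```javascript
// Hypothetical excerpt of the stats mapping; the file names are made up.
const assetsByChunkName = {
  app: ["webpack-runtime-aaa.js", "app-bbb.js"],
  "component---src-blog-2-js": [
    "0-ccc.js",
    "component---src-blog-2-js-ddd.js",
    "styles-eee.css",
  ],
}

// readCss is a caller-supplied function returning the contents of a CSS asset.
function buildTags(assets, componentChunkName, readCss) {
  // Merge the runtime's assets with the page component's assets.
  const merged = [...assets.app, ...assets[componentChunkName]]
  const scripts = merged.filter(file => file.endsWith(".js"))
  const styles = merged.filter(file => file.endsWith(".css"))

  const headComponents = [
    // Preload every script so the download starts early.
    ...scripts.map(file => `<link as="script" rel="preload" href="/${file}" />`),
    // CSS is inlined in the head instead of being linked.
    ...styles.map(file => `<style>${readCss(file)}</style>`),
  ]

  // The actual script tags go at the end of the body.
  const postBodyComponents = scripts.map(file => `<script src="/${file}"></script>`)

  return { headComponents, postBodyComponents }
}

const tags = buildTags(
  assetsByChunkName,
  "component---src-blog-2-js",
  () => "body{margin:0}"
)
```

Keeping the head and body lists separate mirrors the two places where the assets are referenced: once to start the download and once to execute the script.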
Gatsby also prefetches the resources of pages that the current page links to. It does this using the <link rel="prefetch" href="..." /> tag. When the browser sees this tag, it will start downloading the resource, but at an extremely low priority and only when the resources for the current page have finished loading. Check out the MDN prefetch docs for more.
Here’s how it works. All links on Gatsby sites use the gatsby-link plugin, which provides a GatsbyLink component that uses Reach Router. The “to” attribute is the page the browser will navigate to if the link is clicked. Once the Link’s componentDidMount callback is invoked, the “to” path is enqueued into the production-app loader for prefetching.
At this stage, you know the page that you’re navigating to and can retrieve its jsonName, but how do you figure out the generated chunkGroup for the component?
You may notice that prefetching doesn’t prefetch the shared chunks (e.g. id 1). Why? This is a punt. The guess is that shared chunks will already have been loaded for other pages. If not, the main page loading logic downloads them; they just won’t be prefetched.
One more thing: prefetching can be disabled by implementing the disableCorePrefetching browser API and returning true. This value is checked in loader.enqueue. An example plugin that implements this is gatsby-plugin-guess-js.
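The enqueue step and the disableCorePrefetching check can be sketched like this. It is a hedged illustration, not the loader’s actual code: createLoaderSketch, the plugins array and the injected prefetch function are assumptions, and a real browser implementation would add a <link rel="prefetch"> tag instead of pushing to an array.

```javascript
// Sketch of a loader whose enqueue step honours disableCorePrefetching.
// `plugins` and `prefetch` are hypothetical stand-ins injected for illustration.
function createLoaderSketch({ plugins, prefetch }) {
  const seen = new Set()
  return {
    enqueue(path) {
      // Any plugin returning true from disableCorePrefetching turns prefetching off.
      const disabled = plugins.some(
        plugin =>
          typeof plugin.disableCorePrefetching === "function" &&
          plugin.disableCorePrefetching() === true
      )
      if (disabled) return false

      // Only prefetch each path once, however many links point at it.
      if (!seen.has(path)) {
        seen.add(path)
        prefetch(path)
      }
      return true
    },
  }
}

const prefetched = []
const loader = createLoaderSketch({
  plugins: [],
  prefetch: path => prefetched.push(path),
})

loader.enqueue("/blog/")
loader.enqueue("/blog/") // duplicate, not prefetched again
```

A plugin that wants to take over prefetching would expose disableCorePrefetching returning true, after which enqueue returns false and nothing is prefetched by the core loader.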