Customizing the GraphQL Schema
One of Gatsby’s main strengths is the ability to query data from a variety of sources in a uniform way with GraphQL. For this to work, a GraphQL Schema must be generated that defines the shape of the data.
Gatsby can automatically infer a GraphQL Schema from your data, and in many cases this is all you need. There are, however, situations when you want to either explicitly define the data shape or add custom functionality to the query layer. This is what Gatsby’s Schema Customization API provides.
The following guide walks through some examples to showcase the API.
This guide is aimed at plugin authors, users trying to fix GraphQL schemas created by automatic type inference, developers optimizing builds for larger sites, and anyone interested in customizing Gatsby’s schema generation. As such, the guide assumes that you’re somewhat familiar with GraphQL types and with using Gatsby’s Node APIs. For a higher level approach to using Gatsby with GraphQL, refer to the API reference.
The example project is a blog that gets its data from local Markdown files which provide the post contents, as well as author information in JSON format. There are also occasional guest contributors whose info is kept in a separate JSON file.
To be able to query the contents of these files with GraphQL, they first need to be loaded into Gatsby’s internal data store. This is what source and transformer plugins accomplish, in this case gatsby-source-filesystem, gatsby-transformer-remark and gatsby-transformer-json. Every Markdown post file is thereby transformed into a “node” object in the internal data store with a unique id and a type MarkdownRemark. Similarly, an author will be represented by a node object of type AuthorJson, and contributor info will be transformed into node objects of type ContributorJson.
This data structure is represented in Gatsby’s GraphQL schema with the Node interface, which describes the set of fields common to node objects created by source and transformer plugins (id, parent, children, as well as a couple of internal fields like type). In GraphQL Schema Definition Language (SDL), it looks like this:
Types created by source and transformer plugins implement this interface. For example, the node type created by gatsby-transformer-json for author.json will be represented in the GraphQL schema as:
It’s important to note that the data in
author.json does not provide type
information of the Author fields by itself. In order to translate the data
shape into GraphQL type definitions, Gatsby has to inspect the contents of
every field and check its type. In many cases this works very well and it is
still the default mechanism for creating a GraphQL schema.
There are however two problems with this approach: (1) it is quite time-consuming and therefore does not scale very well and (2) if the values on a field are of different types Gatsby cannot decide which one is the correct one. A consequence of this is that if your data sources change, type inference could suddenly fail.
Both problems can be solved by providing explicit type definitions for Gatsby’s GraphQL schema.
Look at the latter case first. Assume a new author joins the team, but in the
new author entry there is a typo on the
joinedAt field: “201-04-02” which is
not a valid Date.
This will confuse Gatsby’s type inference since the joinedAt field will now have both Date and String values.
To ensure that the field will always be of Date type, you can provide explicit type definitions to Gatsby with the createTypes action. It accepts type definitions in GraphQL Schema Definition Language:
Note that the rest of the fields (name, firstName etc.) don’t have to be provided; they will still be handled by Gatsby’s type inference.
There are however advantages to providing full definitions for a node type, and bypassing the type inference mechanism altogether. With smaller scale projects inference is usually not a performance problem, but as projects grow the performance penalty of having to check each field type will become noticeable.
Gatsby allows you to opt out of inference with the @dontInfer type directive, which in turn requires that you explicitly provide type definitions for all fields that should be available for querying:
Note that you don’t need to explicitly provide the Node interface fields (id, parent, etc.); Gatsby will automatically add them for you.
If you wonder about the exclamation marks: those allow specifying nullability in GraphQL, i.e. whether a field value is allowed to be null or not.
You can specify the media types handled by a node type using the @mimeTypes extension. The types passed in are used to determine child relations of the node.
The @childOf extension can be used to explicitly define what node types or media types a node is a child of and immediately add child[MyType] and children[MyType] fields on the parent.
The types argument takes an array of strings and determines what node types the node is a child of:
The mimeTypes argument takes an array of strings and determines what media types the node is a child of:
The mimeTypes and types arguments can be combined as follows:
So far, the example project has only been dealing with scalar values (String and Date; GraphQL also knows ID, Int, Float, Boolean and JSON). Fields can, however, also contain complex object values. To target those fields in GraphQL SDL, you can provide a full type definition for the nested type, which can be arbitrarily named (as long as the name is unique in the schema). In the example project, the frontmatter field on the MarkdownRemark node type is a good example. Say you want to ensure that frontmatter.tags will always be an array of strings.
Note that with createTypes you cannot directly target a Frontmatter type without also specifying that this is the type of the frontmatter field on the MarkdownRemark type. The following would fail because Gatsby would have no way of knowing which field the Frontmatter type should be applied to:
It is useful to think about your data, and the corresponding GraphQL schema, by always starting from the Node types created by source and transformer plugins.
Note that the Frontmatter type must not implement the Node interface since it is not a top-level type created by source or transformer plugins: it has no id field, and is there to describe the data shape on a nested field.
In many cases, GraphQL SDL provides a succinct way to provide type definitions for your schema. If, however, you need more flexibility, createTypes also accepts type definitions provided with the help of Gatsby Type Builders, which are more flexible than SDL syntax but less verbose than graphql-js. They are accessible on the schema argument passed to Node APIs.
Gatsby Type Builders allow referencing types as simple strings, and accept full field configs (type, args, resolve). When defining top-level types, don’t forget to pass interfaces: ['Node'], which does the same for Type Builders as adding implements Node does for SDL-defined types. It is also possible to opt out of type inference with Type Builders by setting the infer type extension to false:
Type Builders also exist for Input, Interface and Union types: buildInputType, buildInterfaceType, and buildUnionType. Note that the createTypes action also accepts graphql-js types directly, but usually either SDL or Type Builders are the better alternatives.
In the example project, you want the frontmatter.author field on MarkdownRemark nodes to expand the provided field value to a full AuthorJson node. For this to work, a custom field resolver has to be provided (see below for more info on context.nodeModel).
What is happening here is that you provide a custom field resolver that asks Gatsby’s internal data store for the full node object with the specified id and type.
Because creating foreign-key relations is such a common use case, Gatsby also provides a much easier way to do this with the help of extensions or directives. It looks like this:
This example assumes that your markdown frontmatter is in the shape of:
And your author JSON looks like this:
You provide a @link directive on a field and Gatsby will internally add a resolver that is quite similar to the one written manually above. If no argument is provided, Gatsby will use the id field as the foreign-key, otherwise the foreign-key has to be provided with the by argument. The optional from argument allows getting the field on the current type which acts as the foreign-key to the field specified in by. In other words, you link on from to by. This makes from especially helpful when adding a field for back-linking.
For the above example you can read @link this way: Use the value from the field Frontmatter.reviewers and match it by the field AuthorJson.email.
Keep in mind that in the example above, the link of posts in AuthorJson works by defining a path to a node because frontmatter and author are both objects. If, for example, the Frontmatter type had a list of authors instead (frontmatter.authors.email), you would need to define it like this with elemMatch:
Out of the box, Gatsby provides four extensions that allow adding custom functionality to fields without having to manually write field resolvers:
The link extension has already been discussed above
dateformat allows adding date formatting options
fileByRelativePath is similar to link but will resolve relative paths when linking to File nodes
proxy is helpful when dealing with data that contains field names with characters that are invalid in GraphQL or to alias fields
To add an extension to a field you can either use a directive in SDL, or the
extensions property when using Gatsby Type Builders:
The above example adds date formatting options to the AuthorJson.joinedAt and the MarkdownRemark.frontmatter.publishedAt fields. Those options are available as field arguments when querying those fields:
publishedAt is also provided a default
formatString which will be used
when no explicit formatting options are provided in the query.
If the JSON contained keys you wanted to proxy to other names, you could do it like this:
You can also combine multiple extensions (built-in and custom ones).
You can use the
@proxy directive to alias (nested) fields to another field on the same node. This is helpful if e.g. you want to keep the shape you have to query flat or if you need it to keep things backwards compatible.
If you’d add a new field using
createNodeField to the
MarkdownRemark nodes (change this check if you use another source/type) like this:
Hello World would be queryable at:
To be able to query someInformation like this instead you have to alias the fields.someInformation field.
For setting default field values, Gatsby currently does not (yet) provide an
out-of-the-box extension, so resolving a field to a default value (instead of
null) requires manually adding a field resolver. For example, to add a default
tag to every blog post:
With createFieldExtension it is possible to define custom extensions as a way to add reusable functionality to fields. Say you want to add a fullName field to AuthorJson and ContributorJson. You could write a fullNameResolver, and use it in two places:
However, to make this functionality available to other plugins as well, and make it usable in SDL, you can register it as a field extension.
A field extension definition requires a name, and an extend function, which should return a (partial) field config (an object, with type, args, resolve) which will be merged into the existing field config.
This approach becomes a lot more powerful when plugins provide custom field extensions. A very basic markdown transformer plugin could for example provide an extension to convert markdown strings into HTML:
It can then be used in any
createTypes call by adding the directive/extension
to the field:
Note that in the above example, additional configuration options have been provided with args. This is useful, for example, to provide default field arguments:
Also note that field extensions can decide themselves if an existing field resolver should be wrapped or overwritten. The above examples have all decided to return a new resolve function. Because the extend function receives the current field config as its second argument, an extension can also decide to wrap an existing resolver:
If multiple field extensions are added to a field, resolvers are processed in this order: first a custom resolver added with createTypes (or createResolvers) runs, then field extension resolvers execute from left to right.
Finally, note that in order to get the current fieldValue, you use context.defaultFieldResolver.
While it is possible to directly pass args and resolvers along the type definitions using Gatsby Type Builders, an alternative approach specifically tailored towards adding custom resolvers to fields is the createResolvers Node API.
Note that createResolvers allows adding new fields to types and modifying args and resolver, but not overriding the field type. This is because createResolvers is run last in schema generation, and modifying a field type would mean having to regenerate corresponding input types (filter, sort), which you want to avoid. If possible, specifying field types should be done with the createTypes action.
As mentioned above, Gatsby’s internal data store and query capabilities are available to custom field resolvers on the context.nodeModel argument passed to every resolver. Accessing node(s) by id (and optional type) is possible with getNodeById and getNodesByIds. To get all nodes, or all nodes of a certain type, use findAll. Running a query from inside your resolver functions can be accomplished with findAll as well, which accepts filter and sort query arguments.
You could for example add a field to the
AuthorJson type that lists all recent
posts by an author:
When using findAll to sort query results, be aware that both sort.fields and sort.order are GraphQLList fields. Also, nested fields on sort.fields have to be provided in dot-notation (not separated by triple underscores).
One powerful approach enabled by createResolvers is adding custom root query fields. While the default root query fields added by Gatsby (e.g. markdownRemark and allMarkdownRemark) provide the whole range of query options, query fields designed specifically for your project can be useful. For example, you can add a query field for all external contributors to the example blog who have received their swag:
Because you might also be interested in the reverse - which contributors haven’t received their swag yet - why not add a (required) custom query arg?
It is also possible to provide more complex custom input types which can be defined directly inline in SDL. You could for example add a field to the ContributorJson type that counts the number of posts by a contributor, and then add a custom root query field contributors which accepts min or max arguments to only return contributors who have written at least min, or at most max number of posts:
When creating custom field resolvers, it is important to ensure that Gatsby knows about the data a page depends on for hot reloading to work properly. When you retrieve nodes from the store with context.nodeModel methods, it is usually not necessary to do anything manually, because Gatsby will register dependencies for the query results automatically. If you want to customize this, you can add a page data dependency either programmatically with context.nodeModel.trackPageDependencies, or with:
Finally, say you want to have a page on the example blog that lists all team members (authors and contributors). What you could do is have two queries, one for allAuthorJson and one for allContributorJson, and manually merge those. GraphQL, however, provides a more elegant solution to these kinds of problems with “abstract types” (Interfaces and Unions). Since authors and contributors actually share most of the fields, you can abstract those up into a TeamMember interface and add a custom query field for all team members (as well as a custom resolver for full names):
To use the newly added root query field in a page query to get the full names of all team members, you could write:
Since Gatsby 3.0.0, you can use interface inheritance to achieve the same thing as above: TeamMember implements Node. This will treat the interface like a normal top-level type that implements the Node interface, and thus automatically add root query fields for the interface.
When querying, use inline fragments for the fields that are specific to the types implementing the interface (i.e. fields that are not shared):
The __typename introspection field allows you to check the node type when iterating over the query results in your component:
Note: All types implementing a queryable interface must also implement the Node interface.