As I’m sure you know, Gatsby is absolutely brills for statically generating web pages using any type of data from any kind of data source. In this post I’ll explain how to fetch data from the New York Times Archive API and use two of Gatsby’s built-in methods to add that data to Gatsby’s global data layer.
If you’re keen to get cracking you can find the demo site and src code on the links below.
Let’s talk about SSG
SSG, or Static Site Generation, is a method whereby you “bake” data right into the page at build time, making your website faster to load and easier for Google to index, which typically results in better SEO.
With SSG, pages can be created ahead of time from data available in Gatsby’s data layer, resulting in the best possible experience for your users.
In this post I’ll be covering the following:
- How to fetch data from the New York Times Archive API.
- How to add data to Gatsby’s data layer.
- How to query Gatsby’s data layer using GraphQL.
In this post I won’t be discussing how to create pages using this data, but if you’d like to follow me, @PaulieScanlon, or @GatsbyJs, we’ll be tweeting about a follow-up post that covers this technique fairly soon.
New York Times Archive API
To use the New York Times Archive API you’ll need to head over to https://developer.nytimes.com and sign up.
Once you have an account you can create an “App”.
…and then enable the Archive API in your App settings which will reveal an API key.
To ensure your API key is safe and secure you’ll need to store it as an environment variable. With Gatsby you can do this by creating a .env.production at the root of your project. You can read more about Environment Variables in the Gatsby docs.
In my demo you’ll see the .env.example file which contains the following.
With your environment variable set up you can now use it to make HTTP requests to the New York Times Archive API from
Fetch Data from the New York Times Archive API
Using the extension point sourceNodes you can write an HTTP request that will fetch data from the New York Times Archive API and then add the response to Gatsby’s data layer using createNode. You will then be able to query this data using GraphQL.
There are a few things going on in here so I’ll talk you through each step.
This allows Gatsby access to your environment variables.
You could also add this to gatsby-config.js FYI ☝️
As of Gatsby 4, Gatsby uses node-fetch under the hood, so you won’t need to install this npm package yourself. To use it, though, you will need to require it in the
As outlined in the NY Times Archive API docs, this endpoint can be used to fetch articles from a given month and year. In my demo I’ve chosen to fetch articles from the month and year I was born, November 1980.
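As a rough sketch of how the request URL for a given month could be put together: the helper name `archiveUrl` and the environment variable name `NYT_API_KEY` are my own assumptions for illustration, so swap in whatever name your .env file actually uses.

```javascript
// Hypothetical helper (not part of Gatsby or the NY Times API):
// builds the Archive API URL for one month of articles.
const archiveUrl = (year, month, apiKey) => {
  if (!apiKey) {
    // Fail early with a readable message if the key is missing.
    throw new Error("Missing New York Times API key");
  }
  // The path takes the year and the month number, followed by the key.
  return `https://api.nytimes.com/svc/archive/v1/${year}/${month}.json?api-key=${encodeURIComponent(apiKey)}`;
};

// Example: November 1980. NYT_API_KEY is an assumed variable name.
// const url = archiveUrl(1980, 11, process.env.NYT_API_KEY);
```

Keeping the URL in a small function like this makes it easy to change the month later without touching the fetch call.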
If you pop in a console.log(data) and run gatsby build you should see something similar to the below.
You should be able to see in the response there’s an array of objects in
These objects contain information about each article returned by the NY Times Archive API, and using a forEach you can iterate over each one and add it to Gatsby’s data layer using createNode.
Adding Data to Gatsby’s Data Layer
createNode is a special action that is part of Gatsby and can be used to add the data returned for each article to Gatsby’s data layer. The node object I pass to it is made up of the following:
- ...item: The article data returned from the NY Times Archive API.
- id: A unique value used by Gatsby to identify each node. I’ve used item._id, which is returned by the NY Times Archive API and is already unique.
- internal.type: The name of the node(s) you’ll query shortly using GraphQL.
- internal.contentDigest: A kind of indicator Gatsby uses to determine whether data is fresh and doesn’t yet exist in the data layer, or has changed, in which case the data layer will be updated.
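The fields listed above can be put together roughly like this. It’s a minimal sketch rather than the exact code from my demo: the helper names `toArticleNode` and `addArticles` are hypothetical, and `articles` stands for the array of article objects you saw in the logged response.

```javascript
// Hypothetical helper: shapes one article object into a Gatsby node.
const toArticleNode = (item, createContentDigest) => ({
  ...item, // the article data returned by the API
  id: item._id, // already unique, so it can double as the node id
  internal: {
    type: "NyTimesArticle", // the name you'll query with GraphQL
    contentDigest: createContentDigest(item), // changes when the data changes
  },
});

// Hypothetical helper: adds every article to Gatsby's data layer.
// `createNode` comes from `actions`, and `createContentDigest` is
// passed to sourceNodes by Gatsby alongside `actions`.
const addArticles = (articles, { createNode, createContentDigest }) => {
  articles.forEach((item) => {
    createNode(toArticleNode(item, createContentDigest));
  });
};
```

Inside sourceNodes you’d call something like `addArticles(articles, { createNode: actions.createNode, createContentDigest })` once the response has come back.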
… and that’s it, you have now added all of the objects returned by the New York Times Archive API to Gatsby’s data layer and given them a name of NyTimesArticle. Now it’s time to query them using GraphQL.
Querying Gatsby’s Data Layer using GraphQL
To query the data from Gatsby’s data layer you can use GraphQL and pass the data back to any page within src/pages. In my demo you’ll see there’s a page called raw.js and the code looks like this
The GraphQL query itself uses the node name you added to createNode earlier. Gatsby prefixes this name with “all”, which allows you to query all the nodes that are named NyTimesArticle.
You can inspect which fields are available on each node by using Gatsby’s GraphiQL explorer (pronounced graphical) which is available at http://localhost:8000/___graphql after you run
You’ll also notice I’ve used a few GraphQL filters. By adding a filter for the pub_date you’ll be able to filter out all articles that weren’t published on November 23 (my actual birthday) and any articles that have an empty value for the
You’ll see near the bottom of the page component I’ve exported a query. This is how Gatsby passes data that has been queried by GraphQL back to the page component and makes it available via the
And now all this lovely data is available for you to use in your page.
What you do with it from here is entirely up to you. In my demo I’ve used the fantastic TailwindCSS to make it look pretty. There’s a guide in the Tailwind docs that will explain how to add TailwindCSS to your Gatsby project: Install Tailwind CSS with Gatsby.
So there you have it: by using sourceNodes and createNode you can fetch data from any source, add it to Gatsby’s data layer and query it using GraphQL.
I’d love to see what you build so feel free to come find me on Twitter: @PaulieScanlon