Fetch Data From the New York Times API and Render Using SSG

Paul Scanlon
December 9th, 2021

As I’m sure you know, Gatsby is absolutely brills for statically generating web pages using any type of data from any kind of data source. In this post I’ll be explaining how to fetch data from the New York Times Archive API and use two of Gatsby’s built-in methods to add the data to Gatsby’s global data layer.

If you’re keen to get cracking you can find the demo site and src code on the links below.

Demo: https://datachampionssgnycnews.gatsbyjs.io

Repo: https://github.com/PaulieScanlon/data-champion-ssg-nyc-news

Let’s talk about SSG

SSG, or Static Site Generation, is a method whereby you “bake” data right into the page at build time, making your website faster to load and easier for Google to index, which typically results in better SEO.

With SSG, pages can be created ahead of time from data available in Gatsby’s data layer, resulting in the best possible experience for your users.

In this post I’ll be covering the following:

  • How to fetch data from the New York Times Archive API.
  • How to add data to Gatsby’s data layer.
  • How to query Gatsby’s data layer using GraphQL.

In this post I won’t be discussing how to create pages using this data, but if you’d like to follow me, @PaulieScanlon, or @GatsbyJs, we’ll be Tweeting about a follow-up post that covers this technique fairly soon.

New York Times Archive API

To use the New York Times Archive API you’ll need to head over to https://developer.nytimes.com and sign up.

Once you have an account you can create an “App”.

[Screenshot: New York Times Archive API, App]

…and then enable the Archive API in your App settings, which will reveal an API key.

[Screenshot: New York Times Archive API, API Keys]

Environment Variables

To ensure your API key is safe and secure you’ll need to store it as an environment variable. With Gatsby you can do this by creating .env.development and .env.production files at the root of your project. You can read more about Environment Variables in the Gatsby docs.

In my demo you’ll see the .env.example file which contains the following.

With your environment variable set up, you can now use it to make HTTP requests to the New York Times Archive API from gatsby-node.js.
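As a minimal sketch of reading that variable, the helper below pulls the key from the environment and fails early if it’s missing. The variable name NY_TIMES_API_KEY and the helper getApiKey are assumptions for illustration only, so swap in whatever name you used in your own .env files.

```javascript
// Hypothetical helper: NY_TIMES_API_KEY is an assumed variable name,
// replace it with the name used in .env.development / .env.production.
const getApiKey = (env = process.env) => {
  const key = env.NY_TIMES_API_KEY;
  if (!key) {
    // Failing here gives a clearer error than a rejected HTTP request later on.
    throw new Error('NY_TIMES_API_KEY is not set. Check your .env files.');
  }
  return key;
};
```

Passing env as an argument is just a convenience that makes the helper easy to try out without touching your real environment.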

Fetch Data from the New York Times Archive API

Using the extension point sourceNodes you can write an HTTP request that will fetch data from the New York Times Archive API and then add the response to Gatsby’s data layer using createNode. You will then be able to query this data using GraphQL.

There are a few things going on in here, so I’ll talk you through each step.

require dotenv

This allows Gatsby access to your environment variables. 

You could also add this to gatsby-config.js FYI ☝️

node-fetch

As of Gatsby 4, Gatsby already uses node-fetch under the hood, so you won’t need to install this npm package as it’s already part of Gatsby. To use it, though, you will need to require it in the gatsby-node.js file.

sourceNodes

This is part of the Gatsby build step and is called during Gatsby’s bootstrap sequence. Within this function you can source data from absolutely anywhere using standard JavaScript HTTP methods.

HTTP Request

As outlined in the NY Times Archive API docs, this endpoint can be used to fetch all the articles for a given month and year. In my demo I’ve chosen to fetch articles from the month and year I was born, November 1980.
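Here’s a rough sketch of how the request URL for a given month could be put together. The path follows the Archive API’s /svc/archive/v1/{year}/{month}.json format with the key passed as the api-key query parameter; the function name buildArchiveUrl and the placeholder key are mine, purely for illustration.

```javascript
// Builds the Archive API URL for a given year and month (1-12).
// buildArchiveUrl is an illustrative name, not part of any library.
const buildArchiveUrl = (year, month, apiKey) =>
  `https://api.nytimes.com/svc/archive/v1/${year}/${month}.json?api-key=${encodeURIComponent(apiKey)}`;

// November 1980, with a placeholder key.
const url = buildArchiveUrl(1980, 11, 'YOUR_API_KEY');
// https://api.nytimes.com/svc/archive/v1/1980/11.json?api-key=YOUR_API_KEY
```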

If you pop in a console.log(data) and run gatsby build you should see something similar to the below.

You should be able to see in the response there’s an array of objects in response.docs.

These objects contain information about each article returned by the NY Times Archive API and using a forEach you can iterate over each one and add it to Gatsby’s data layer using createNode.

Adding Data to Gatsby’s Data Layer

createNode is a special action that is part of Gatsby and can be used to add the data returned for each article to Gatsby’s data layer. It takes a node object, and in my demo the fields are as follows:

  • ...item: The article data returned from the NY Times Archive API.
  • id: Required. Must be unique, and is used by Gatsby to identify each node. I’ve used item._id, which is returned by the NY Times Archive API and is already unique.
  • internal.type: Required. The name of the node(s) you’ll query shortly using GraphQL.
  • internal.contentDigest: Required. A digest of the node’s content that Gatsby uses to work out whether the data is new or has changed, in which case the data layer will be updated.
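Pulling those fields together, here’s a minimal sketch of how each article could be shaped and handed to createNode. It assumes the parsed JSON body exposes the articles at data.response.docs, and that createContentDigest is the helper Gatsby passes to sourceNodes. The names toArticleNode, sourceArticles and fetchImpl are hypothetical, and the HTTP client is passed in as an argument so the sketch isn’t tied to one library.

```javascript
// Shapes one Archive API article into the object passed to createNode.
const toArticleNode = (item, createContentDigest) => ({
  ...item,
  id: item._id, // already unique, as returned by the Archive API
  internal: {
    type: 'NyTimesArticle', // the name you'll query with GraphQL
    contentDigest: createContentDigest(item),
  },
});

// Fetches the articles and creates one node per article.
// fetchImpl is any fetch-compatible function (an assumption for this sketch).
const sourceArticles = async (
  { actions: { createNode }, createContentDigest },
  fetchImpl,
  url
) => {
  const response = await fetchImpl(url);
  const data = await response.json();

  data.response.docs.forEach((item) => {
    createNode(toArticleNode(item, createContentDigest));
  });
};

// In gatsby-node.js this might be wired up roughly like so:
// const fetch = require('node-fetch');
// exports.sourceNodes = (gatsbyArgs) => sourceArticles(gatsbyArgs, fetch, archiveUrl);
```

Keeping toArticleNode separate from the request makes it easy to check the node shape without hitting the API.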

… and that’s it, you have now added all of the objects returned by the New York Times Archive API to Gatsby’s data layer and given them a name of NyTimesArticle. Now it’s time to query them using GraphQL.

Querying Gatsby’s Data Layer using GraphQL

To query the data from Gatsby’s data layer you can use GraphQL and pass the data back to any page within src/pages. In my demo you’ll see there’s a page called raw.js and the code looks like this.

The GraphQL query itself uses the node type name you passed to createNode earlier. Gatsby prefixes this name with “all” (giving allNyTimesArticle), which allows you to query all the nodes that are named NyTimesArticle.

You can inspect which fields are available on each node by using Gatsby’s GraphiQL explorer (pronounced graphical), which is available at http://localhost:8000/___graphql after you run gatsby develop.

You’ll also notice I’ve used a few GraphQL filters. By adding a filter for the pub_date you’ll be able to filter out all articles that weren’t published on November 23 (my actual birthday) and any articles that have an empty value for the abstract field.
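To make it clearer what those filters do, here’s the same logic sketched in plain JavaScript. This is a stand-in for illustration, not the GraphQL query itself, and it assumes pub_date is an ISO-style string that starts with YYYY-MM-DD. The function name publishedOnWithAbstract is hypothetical.

```javascript
// Keeps only articles published on the given day that have a non-empty abstract.
// day is a 'YYYY-MM-DD' string, e.g. '1980-11-23'.
const publishedOnWithAbstract = (articles, day) =>
  articles.filter(
    (article) =>
      typeof article.pub_date === 'string' &&
      article.pub_date.startsWith(day) &&
      Boolean(article.abstract)
  );
```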

[Screenshot: New York Times Archive API, GraphQL]

You’ll see near the bottom of the page component I’ve exported a const named query. This is how Gatsby passes data that has been queried by GraphQL back to the page component and makes it available via the data prop.

And now all this lovely data is available for you to use in your page. 

What you do with it from here is entirely up to you. In my demo I’ve used the fantastic TailwindCSS to make it look pretty. There’s a guide in the Tailwind docs that will explain how to add TailwindCSS to your Gatsby project: Install Tailwind CSS with Gatsby

So there you have it, by using sourceNodes and createNode you can fetch data from any source, add it to Gatsby’s data layer and query it using GraphQL.

I’d love to see what you build so feel free to come find me on Twitter: @PaulieScanlon 

Ttfn 

Paul 🕺

After all is said and done, structure + order = fun! Senior Software Engineer (Developer Relations) for Gatsby
