How to export a Drupal site to Gatsby

Joaquín Bravo Contreras
December 18th, 2018

This blog post explains how I reduced the cost of maintaining a simple brochure or blog site. A Drupal site needs at least a shared hosting plan (there is no WordPress.com for Drupal sites), so migrating to a static site generator like Jekyll or Gatsby seemed like a good idea. Gatsby is also a great opportunity to learn React, and the generated site can be hosted for free on something like GitHub Pages. This post describes how to migrate a simple blog (with featured images on the posts, comments, and tags) from Drupal to Gatsby.

To facilitate exporting the site, the first thing I did was export the database from the MySQL server to a SQLite file that I could use locally. To do this I used the mysql2sqlite project. As described on its project page, the conversion takes two commands: one that dumps the database with mysqldump, and one that pipes the dump through mysql2sqlite into sqlite3.

How to export a Drupal site to Gatsby yourself

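To do this yourself, you'll build a simple blog using the excellent gatsby-starter-blog project. Create a new project from that starter and then add a SQLite library as a dev dependency. This post uses better-sqlite3, which you can install with npm install --save-dev better-sqlite3.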

Useful commands for exploring the database from the sqlite3 command line are .tables to list all tables 🙂 and .schema table_name to see the definition of a specific table. Oh! And .help to learn more.

Next, create a new file in your project at src/scripts/import.js. Initially, what you want is to iterate through all your posts and export basic data like title, created date, body, and status (published or draft). All of that data is in two tables, node and field_data_body, so a first version of the script only has to join them and loop over the resulting rows.
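As a rough illustration of that first version, here is a minimal sketch. It assumes a stock Drupal 7 schema and a content type named `blog`; the helper name `loadPosts` and the shape of the returned objects are made up for this example, so adjust them to your own database.

```javascript
// Sketch of the first step of src/scripts/import.js.
// Assumptions: Drupal 7 tables `node` and `field_data_body`, content type 'blog'.
const POSTS_QUERY = `
  SELECT n.nid, n.title, n.created, n.status, b.body_value
  FROM node n
  INNER JOIN field_data_body b
    ON b.entity_id = n.nid AND b.entity_type = 'node'
  WHERE n.type = 'blog'`;

// `db` is expected to be a better-sqlite3 Database instance.
// loadPosts is a hypothetical helper, not part of any library.
function loadPosts(db) {
  return db.prepare(POSTS_QUERY).all().map(row => ({
    nid: row.nid,
    title: row.title,
    // Drupal stores timestamps as Unix seconds; Date expects milliseconds.
    date: new Date(row.created * 1000).toISOString(),
    // In Drupal, status 1 means published and 0 means unpublished.
    draft: row.status === 0,
    body: row.body_value,
  }));
}

// Usage (assumes better-sqlite3 is installed):
//   const Database = require('better-sqlite3');
//   const db = new Database(process.argv[2], { readonly: true });
//   loadPosts(db).forEach(post => console.log(post.title));
```

Passing `db` in as an argument, instead of opening it inside the function, keeps each step easy to try out against a copy of the database.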

The interesting part of that script is the initial query, which is based on a Drupal 7 database. A Drupal 8 or Drupal 6 database could be different, so check your schema. Next, load the tags into a simple JavaScript array. Each post can have more than one tag, so you can take advantage of better-sqlite3's .pluck() function, which retrieves only the first column of a database query, and the .all() function, which retrieves all rows in a single array.
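A possible version of the tag lookup is sketched below. It assumes the Drupal 7 `taxonomy_index` and `taxonomy_term_data` tables; `loadTags` is a hypothetical helper name.

```javascript
// Assumption: Drupal 7 taxonomy tables; check your own schema first.
const TAGS_QUERY = `
  SELECT td.name
  FROM taxonomy_index ti
  INNER JOIN taxonomy_term_data td ON td.tid = ti.tid
  WHERE ti.nid = ?`;

// `db` is expected to be a better-sqlite3 Database instance.
function loadTags(db, nid) {
  // .pluck() keeps only the first column of each row (the tag name),
  // and .all() returns every matching row, so the result is a flat array.
  return db.prepare(TAGS_QUERY).pluck().all(nid);
}
```

Without `.pluck()`, each row would come back as an object like `{ name: 'drupal' }`, and you would have to map over the rows yourself.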

To avoid 404 errors in case you created URL aliases on the old site, you can query the url_alias table and store the results in an aliases frontmatter property. Later (depending on your hosting platform) you can turn those aliases into redirects with Gatsby's createRedirect function and a plugin like gatsby-plugin-meta-redirect.
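The alias lookup could look something like the sketch below. It assumes the Drupal 7 convention of storing the internal path as `node/<nid>` in `url_alias.source` (no leading slash). Both helper names are hypothetical, and making the redirects permanent is a choice made for this example.

```javascript
// `db` is expected to be a better-sqlite3 Database instance.
function loadAliases(db, nid) {
  return db
    .prepare('SELECT alias FROM url_alias WHERE source = ?')
    .pluck()
    .all(`node/${nid}`);
}

// Shape the aliases into argument objects for Gatsby's createRedirect action.
// `slug` is the path of the post on the new site, e.g. '/my-post/'.
function toRedirects(aliases, slug) {
  return aliases.map(alias => ({
    fromPath: `/${alias}`,
    toPath: slug,
    isPermanent: true,
  }));
}
```

In `gatsby-node.js`, each of these objects could then be passed to `actions.createRedirect` inside `createPages`, reading the aliases back from the frontmatter of each post.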

For the featured image, retrieve only its URL so you can download the file and store it locally. To build that URL, replace the public:// prefix of the stored URI with the URL path of the files folder on your old site.
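One way to sketch this step is shown below. It assumes the image field is the default Drupal 7 `field_image` (tables `field_data_field_image` and `file_managed`). The base URL is a placeholder, and the helper names are hypothetical.

```javascript
// Assumption: the featured image lives in the default `field_image` field.
const IMAGE_QUERY = `
  SELECT f.filename, f.uri
  FROM field_data_field_image i
  INNER JOIN file_managed f ON f.fid = i.field_image_fid
  WHERE i.entity_id = ?`;

// Placeholder: replace with the public files folder of your old site.
const FILES_BASE_URL = 'https://example.com/sites/default/files/';

// Turn a Drupal stream-wrapper URI (public://...) into a downloadable URL.
function publicUriToUrl(uri, baseUrl = FILES_BASE_URL) {
  return uri.replace(/^public:\/\//, baseUrl);
}

// `db` is expected to be a better-sqlite3 Database instance.
function loadImage(db, nid) {
  // .get() returns the first matching row, or undefined if there is none.
  const image = db.prepare(IMAGE_QUERY).get(nid);
  return image
    ? { filename: image.filename, url: publicUriToUrl(image.uri) }
    : null;
}
```

Posts without a featured image simply get `null`, so the rest of the script can skip the download for them.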

Now that you have all the data you need, it is just a matter of creating one file per post, with the metadata as YAML frontmatter and the body of the text in Markdown. If your Drupal posts were already written in Markdown, the body can be used as is; otherwise, look for an HTML-to-Markdown JavaScript library like turndown.
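A dependency-free sketch of the file-writing step follows. Instead of a YAML library, it relies on JSON values being accepted as YAML flow syntax, so each frontmatter value is written with `JSON.stringify`. A dedicated library such as js-yaml would produce prettier output. The `content/blog/<slug>/index.md` layout is an assumption about how the starter organizes posts, and the helper names are hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Build the file contents: YAML frontmatter followed by the Markdown body.
function toMarkdownFile(post) {
  const { body, ...meta } = post;
  const frontmatter = Object.entries(meta)
    // JSON strings, arrays, and booleans are also valid YAML values.
    .map(([key, value]) => `${key}: ${JSON.stringify(value)}`)
    .join('\n');
  return `---\n${frontmatter}\n---\n\n${body}\n`;
}

// Write one post to <outDir>/<slug>/index.md (assumed starter layout).
function writePost(post, slug, outDir = 'content/blog') {
  const dir = path.join(outDir, slug);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.md'), toMarkdownFile(post));
}
```

Quoting every string through `JSON.stringify` also protects titles that contain characters with a special meaning in YAML, such as colons.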

The script is now finished. Run it from your shell with node, passing the path to the SQLite file as its argument, for example node src/scripts/import.js ../mysqlite.db.

To have comments on your new site, you can use a service like Disqus. Disqus has an import process that uses a custom XML format based on the WXR (WordPress eXtended RSS) schema, so the process is the same as for the posts. Create a script named src/scripts/export_comments.js that queries the database and, in this case, outputs a single XML document containing all the comments of your old site.
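The core of such a script might look like the sketch below, which renders a single comment. It assumes the Drupal 7 `comment` and `field_data_comment_body` tables. The element names follow my understanding of the Disqus custom XML import format, so verify them against the current Disqus documentation. All helper names are hypothetical, and the surrounding `<rss>`, `<channel>`, and per-post `<item>` wrappers are left out for brevity.

```javascript
// Assumption: Drupal 7 comment tables; element names per the Disqus import docs.
const COMMENTS_QUERY = `
  SELECT c.cid, c.pid, c.name, c.mail, c.hostname, c.created, c.status,
         b.comment_body_value AS body
  FROM comment c
  INNER JOIN field_data_comment_body b
    ON b.entity_id = c.cid AND b.entity_type = 'comment'
  WHERE c.nid = ?
  ORDER BY c.created`;

function escapeXml(text) {
  return String(text == null ? '' : text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function cdata(text) {
  // A literal "]]>" would end the CDATA section early, so split it in two.
  const safe = String(text == null ? '' : text).replace(/]]>/g, ']]]]><![CDATA[>');
  return `<![CDATA[${safe}]]>`;
}

// Drupal stores Unix seconds; the import format uses "YYYY-MM-DD HH:MM:SS" in GMT.
function toGmt(seconds) {
  return new Date(seconds * 1000).toISOString().slice(0, 19).replace('T', ' ');
}

function commentToXml(c) {
  return [
    '<wp:comment>',
    `  <wp:comment_id>${c.cid}</wp:comment_id>`,
    `  <wp:comment_author>${escapeXml(c.name)}</wp:comment_author>`,
    `  <wp:comment_author_email>${escapeXml(c.mail)}</wp:comment_author_email>`,
    `  <wp:comment_author_IP>${escapeXml(c.hostname)}</wp:comment_author_IP>`,
    `  <wp:comment_date_gmt>${toGmt(c.created)}</wp:comment_date_gmt>`,
    `  <wp:comment_content>${cdata(c.body)}</wp:comment_content>`,
    `  <wp:comment_approved>${c.status}</wp:comment_approved>`,
    `  <wp:comment_parent>${c.pid}</wp:comment_parent>`,
    '</wp:comment>',
  ].join('\n');
}

// `db` is expected to be a better-sqlite3 Database instance.
function loadComments(db, nid) {
  return db.prepare(COMMENTS_QUERY).all(nid);
}
```

Drupal's `status` (1 for published) and `pid` (0 for a top-level comment) appear to line up with the approved and parent fields of the import format, which is why they are copied over unchanged here.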

Run node src/scripts/export_comments.js ../mysqlite.db > comments.xml and that's it. This generates a comments.xml file that you can import into Disqus. Just remember to change the yourSite variable in the script first, so that each comment is linked to the correct post in your new blog through the same slug used in the posts import.

You now have all the posts and all comments ready to be used on your Gatsby blog. You can see a working example here: https://github.com/jackbravo/joaquin.axai.mx.
