I'm currently furloughed, so what better time to begin making some wide-ranging updates to this website? First up on the list were small improvements to RSS feeds and the way the calendar component is generated for the Journal[1], but with those out of the way I felt like tackling something a bit bigger: site search.
Search has been on my to-do list since day dot for theAdhocracy; after all, this site is meant to serve as a personal information archive, so making the info as accessible as possible makes sense. However, it also presents a slightly paradoxical challenge. Being built with Gatsby and the Jamstack, it's a static site that has been utterly decoupled from the CMS, so I can't rely on my own back-end to power search in the same way I normally would. Luckily, there are plenty of decentralised services out there which have popped up to fill this void, chief amongst them being Algolia. I did look at other self-hosted options, including some client-side equivalents which would enable offline searching (fancy!) but felt the increased page weight was probably too big an issue. Maybe, in the future, I'll take another look at those so that I own my search data too, but for now it's a compromise I'm okay making[2].
Initial Setup: A New API Endpoint
First things first, I serve data from my CMS using the Elements API plugin. There are a few reasons I use this, but the main ones are that Elements is both free and first-party, meaning that it will be closely maintained with each update to Craft and therefore very unlikely to break (which is pretty important for an API). Would I prefer to be using a GraphQL implementation? I go back and forth on this a lot, but ultimately the answer is yes and at some point in the future I'll probably make the switch.
Right now, though, I needed to set up a new endpoint which would serve all the content I want to make searchable in the future. I chose the rather silly name everything[3]:
'everything.json' => function() {
    \Craft::$app->response->headers->set('Access-Control-Allow-Origin', '*');
    return [
        'elementType' => Entry::class,
        'criteria' => [
            'section' => 'articles',
            'orderBy' => 'postDate desc'
        ],
        'paginate' => false,
        'transformer' => function(Entry $entry) {
            return [
                'id' => $entry->id,
                'type' => $entry->type->handle,
                'title' => $entry->title,
                'slug' => $entry->slug,
                'date' => $entry->postDate->format(\DateTime::ATOM),
                'tags' => array_map('strval', $entry->tags->all()),
            ];
        }
    ];
},
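For reference, the endpoint's response should look roughly like the object below: a top-level `data` array with one object per entry, carrying the six fields returned by the transformer. This is only a sketch with made-up values, and `missingFields` is a hypothetical helper for sanity-checking an entry, not part of Craft or the Elements API.

```javascript
// Made-up example of the JSON body served at /everything.json.
// The values are invented; the keys mirror the PHP transformer above.
const exampleResponse = {
  data: [
    {
      id: 101,
      type: "article",
      title: "Adding Search",
      slug: "adding-search",
      date: "2020-05-01T09:00:00+00:00",
      tags: ["gatsby", "algolia"],
    },
  ],
}

const expectedKeys = ["id", "type", "title", "slug", "date", "tags"]

// Hypothetical helper: lists any expected fields that are absent from an entry.
function missingFields(entry) {
  return expectedKeys.filter((key) => !(key in entry))
}

// One (ideally empty) array of missing field names per entry.
console.log(exampleResponse.data.map(missingFields))
```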
Once that had been added to my Elements config file, I needed to ingest it within Gatsby. I personally use axios for fetch requests:
const axios = require("axios")
const crypto = require("crypto")

// In gatsby-node.js: createNode comes from the actions passed to sourceNodes
exports.sourceNodes = async ({ actions }) => {
  const { createNode } = actions

  const fetchWholeFeed = () => axios.get(`https://cms.theadhocracy.co.uk/everything.json`)
  const getFeed = await fetchWholeFeed()

  // Loop over the feed and create a node for each entry
  getFeed.data.data.forEach((post, i) => {
    // Create node object
    const feedNode = {
      // Required fields for Gatsby
      id: `${i}`,
      parent: `__SOURCE__`,
      internal: {
        type: `Feed` // name of the graphQL query --> allFeed{}
      },
      children: [],
      // Fields specific to this endpoint
      entryId: post.id,
      title: post.title,
      slug: post.slug,
      date: post.date,
      tags: post.tags,
      contentType: post.type
    }

    // Get content digest of node. (Required field)
    const contentDigest = crypto
      .createHash(`md5`)
      .update(JSON.stringify(feedNode))
      .digest(`hex`)
    feedNode.internal.contentDigest = contentDigest

    // Create node with the gatsby createNode() API
    createNode(feedNode)
  })
}
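To try the node-building step without running a full Gatsby build, the mapping could be pulled out into a small pure function. This is a sketch only: `buildFeedNode` is a hypothetical helper name (not a Gatsby API) and the sample entry is made-up data in the shape the endpoint returns.

```javascript
const crypto = require("crypto")

// Hypothetical helper: turns one entry from everything.json into a Gatsby node object.
function buildFeedNode(post, index) {
  const feedNode = {
    // Required fields for Gatsby
    id: `${index}`,
    parent: `__SOURCE__`,
    internal: { type: `Feed` },
    children: [],
    // Fields specific to this endpoint
    entryId: post.id,
    title: post.title,
    slug: post.slug,
    date: post.date,
    tags: post.tags,
    contentType: post.type,
  }
  // The digest is a hash of the node's content, so identical input gives an identical digest.
  feedNode.internal.contentDigest = crypto
    .createHash(`md5`)
    .update(JSON.stringify(feedNode))
    .digest(`hex`)
  return feedNode
}

// Made-up sample entry for illustration
const sample = {
  id: 42,
  type: "article",
  title: "Hello",
  slug: "hello",
  date: "2020-05-01T00:00:00+00:00",
  tags: ["search"],
}
const node = buildFeedNode(sample, 0)
console.log(node.internal.type, node.internal.contentDigest)
```

Inside `sourceNodes`, the loop body would then shrink to `createNode(buildFeedNode(post, i))`.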
With that done, I could move onto implementing Algolia.
Adding Algolia to Gatsby
Gatsby has some pretty useful docs on how to go about this process, but these are fundamentally set up for using Markdown files, so I still ran into a couple of minor gotchas during the process. As a result, I've read a couple of different variants (see Further Reading below), but here's how I had to set things up.
First, you need to install the required dependencies and packages:
yarn add gatsby-plugin-algolia react-instantsearch-dom algoliasearch dotenv
Oh, and you'll need an Algolia account (the free tier is ideal).
Gatsby Config
In your gatsby-config.js file, add the following:
const queries = require("./src/utilities/algolia")
require("dotenv").config()
And then, within the `plugins` array, add the settings for Algolia:
plugins: [
  {
    resolve: `gatsby-plugin-algolia`,
    options: {
      appId: process.env.GATSBY_ALGOLIA_APP_ID,
      apiKey: process.env.GATSBY_ALGOLIA_ADMIN_KEY,
      queries,
      chunkSize: 10000
    },
  },
],
Notice a couple of small things here:
- It assumes you don't already have a `.env` file set up, and that you aren't using the environment naming convention (e.g. .env.development). If you do have one already you probably don't need to change anything on that front, but you will need to add the new variables and set them to your API keys from Algolia.
- You'll need to create a utilities folder (if not already present) and an algolia.js file within it, plus (again, if not already present), a .env file in the root directory.
- I've changed the path and `apiKey` environment variable name compared to the Gatsby docs, purely for personal preference 😉
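A typo in a variable name is easy to miss, so it can be worth checking that both values are actually set before the plugin runs. A minimal sketch, assuming the two variable names used in the config above; `missingEnvVars` is a hypothetical helper, not part of dotenv or the Algolia plugin.

```javascript
// Hypothetical helper: returns the names of any required variables that are unset or empty.
// The env parameter defaults to process.env but can be swapped out for testing.
function missingEnvVars(names, env = process.env) {
  return names.filter((name) => !env[name])
}

const required = ["GATSBY_ALGOLIA_APP_ID", "GATSBY_ALGOLIA_ADMIN_KEY"]
const missing = missingEnvVars(required)

if (missing.length > 0) {
  // Warn rather than throw, so this sketch never blocks a build by itself.
  console.warn(`Missing environment variables: ${missing.join(", ")}`)
}
```

Dropped into gatsby-config.js just after the `require("dotenv").config()` line, this would print a warning as soon as the config is loaded.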
Algolia.js
Once you've created the new utilities folder and the algolia.js file within it, you'll need to define how Algolia is going to interact with your GraphQL tree. This was the biggest headache for me and I've gotta say a massive thank you to Christina Hastenrath whose own tutorial on this topic finally helped me get to the aha! moment to get this to work 🙌
Here's my code, which is much simpler than in most other tutorials:
const postQuery = `{
  posts: allFeed {
    edges {
      node {
        title
        slug
        tags
        contentType
        date
      }
    }
  }
}`

const queries = [
  {
    query: postQuery,
    transformer: ({ data }) => data.posts.edges,
    indexName: `theAdhocracy_Feed`
  }
]

module.exports = queries
All I'm doing is defining a single query – in this case, my `allFeed` node that I set up earlier – and passing that through to Algolia to create a new index called "theAdhocracy_Feed" (though, obviously, this can be anything you want it to be). That's it. If you've got a standard, non-nested data structure like I do, you can safely get rid of most of the complexities that are shown in other tutorials and simplify this massively. Of course, if you have nested data or other reasons for creating multiple indices then you can do so as well, but for me this does what I want (for now).
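To see what the transformer actually receives and returns, it can be run against a hand-written result. The sketch below uses made-up data and shows an optional variation, `flattenPosts` (a hypothetical name), which unwraps each edge so the fields sit at the top level of the record. It also sets `objectID`, the attribute Algolia uses as a record's unique identifier; using the slug for it is an assumption for this example, not something the setup above requires.

```javascript
// Made-up sample of the GraphQL result handed to the transformer.
const sampleResult = {
  data: {
    posts: {
      edges: [
        { node: { title: "Hello", slug: "hello", tags: ["search"], contentType: "article", date: "2020-05-01T00:00:00+00:00" } },
        { node: { title: "World", slug: "world", tags: [], contentType: "note", date: "2020-05-02T00:00:00+00:00" } },
      ],
    },
  },
}

// The transformer from algolia.js: passes the edges through untouched,
// so each record is shaped like { node: { ... } }.
const passThrough = ({ data }) => data.posts.edges

// Hypothetical alternative: unwrap each node and add an objectID (assumed here to be the slug).
const flattenPosts = ({ data }) =>
  data.posts.edges.map(({ node }) => ({ objectID: node.slug, ...node }))

const nested = passThrough(sampleResult)
const records = flattenPosts(sampleResult)
console.log(Object.keys(nested[0]), Object.keys(records[0]))
```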
Here was my big gotcha: if you get a GraphQL error message about `expected Name but found )` then you've left the brackets on the initial query, i.e. `allFeed()` 🤦‍♂️ That legitimately had me stuck for about 30 minutes, until I tried Christina's console logging and GraphiQL steps to work out exactly where I was going wrong... and then felt like an idiot for another 30 minutes 😂
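The error comes from GraphQL's grammar: parentheses after a field name must contain at least one argument, so `allFeed()` is invalid while a bare `allFeed` is fine. As a rough illustration, the check below flags empty brackets in a query string. It is a regular-expression sketch rather than a real GraphQL parser, and `findEmptyArguments` is a hypothetical helper name.

```javascript
// Hypothetical helper: returns every "name()" occurrence with nothing inside the brackets.
function findEmptyArguments(query) {
  const matches = query.match(/\b\w+\s*\(\s*\)/g)
  return matches || []
}

const broken = `{ posts: allFeed() { edges { node { title } } } }`
const fixed = `{ posts: allFeed { edges { node { title } } } }`

console.log(findEmptyArguments(broken))
console.log(findEmptyArguments(fixed))
```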
At any rate, you should now be able to run `yarn build` and see the Index create itself in Algolia and populate with everything in your feed. That's your backend all set up and ready to go; I'll follow up with a post on creating the frontend once, y'know, I've managed it myself.