Monthly Archives: July 2024

Great GitHub actions for techdocs sites

There are two main options for techdocs sites. You can use a SaaS solution like Readme or DeveloperHub. Or you can use a static site generator like Jekyll, Hugo or Docusaurus. There are strengths to each approach, but for this post I want to look at continuous integration/continuous deployment (CI/CD) tasks you can run on statically generated tech documentation.

Why use CI/CD for your docs site? CI/CD tools let you automate code or documentation tasks. For documentation, you can run checks to ensure consistency and correctness which make your documentation better.

GitHub Actions is the CI/CD tool I’m going to refer to, but similar functionality is available with other CI/CD solutions such as CircleCI or Jenkins. Additionally, I’ll talk about the content living on GitHub in a repo, but you could use self-hosted Git, GitLab, Bitbucket or other version control solutions.

This post has a couple of assumptions:

  • Your content is in version control.
  • Your content is in a format that needs to be processed before it is ready for publishing. In other words, you are editing text files in a certain format (Markdown, AsciiDoc, etc.) that are then processed, rather than using a WYSIWYG editor.
  • The people writing your content are comfortable using version control and tools like Markdown.

Here are tasks you should automate:

Deploying The Documentation

This is the first one and the easiest win. You should have a workflow which automatically deploys changes when a PR is merged to your primary branch. This lowers the barrier to entry for contributing to documentation, because you only need to get a GitHub PR approved. Similar to continuous deployment of a SaaS application, it makes it easy to push changes regularly. Because it is a GitHub PR, it is also easy to roll back: you just revert the PR.

You may wonder at my including this, because it seems so obvious, but I’ve been on teams where the deploy process included sshing to a server, pulling the branch, and running a script.

Automating deployment was well worth it, because it changed “I found a typo, oh man, what a pain” to “I found a typo, I can push a PR and fix it in 3 min”.

Vale

Vale is an amazing, free linter for your prose. I haven’t witnessed it used at its full potential, but it is a spell checker, word police and living style guide all in one. It’s kinda imposing to start with since there are so many “nerd knobs”, but start simple and add to it as you learn it.

Vale runs fast enough that you can run it on every PR.
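Here is a rough sketch of how a PR check might lint only the files a PR touches. The base branch (`origin/main`), the file extensions, and the function names are all assumptions for illustration, and it presumes `vale` is installed with a `.vale.ini` in the repo.

```shell
#!/usr/bin/env bash
# Sketch: run Vale only on documentation files changed in a pull request.
# Assumptions: the base branch is origin/main and docs end in .md, .mdx or .adoc.

# Keep only documentation files from a list of paths read on stdin.
# "|| true" keeps the status at zero when nothing matches.
filter_docs() {
  grep -E '\.(md|mdx|adoc)$' || true
}

# List added, copied, modified or renamed doc files relative to a base ref.
changed_docs() {
  local base="${1:-origin/main}"
  git diff --name-only --diff-filter=ACMR "${base}...HEAD" | filter_docs
}

# Lint the changed files. Returns non-zero if Vale reports errors,
# and zero if there is nothing to lint.
lint_changed_docs() {
  local files
  files="$(changed_docs "$@")"
  if [ -z "$files" ]; then
    echo "No documentation changes to lint."
    return 0
  fi
  printf '%s\n' "$files" | tr '\n' '\0' | xargs -0 vale
}
```

A workflow step could source this file and call `lint_changed_docs origin/main`, so the PR fails when Vale finds errors in the changed files.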

Spelling

Deploy a spell checker to run against your documentation to prevent misspellings and typos.

You have two choices here:

  • spell check just the changes (Vale is good at this)
  • spell check the entire site (pyspelling is good at this)

The former lets you block a PR if there’s an issue, while the latter takes a lot longer and should run on a schedule. The latter is a good fit if you have non-technical docs (perhaps from a marketing site) that you also want to check for correctness.

You should plan to add a lot of words to the ‘known words’ file because most techdocs have jargon or names that aren’t in standard dictionaries.

Look For Dead Links

Running a link checker regularly will help your users avoid 404 pages. No matter how cute they are, they annoy folks because a 404 is a dead end. Techdocs are all about enabling end users to solve their problems and a 404 page doesn’t help.

You should run the checker periodically and fix busted links as soon as possible. I haven’t found a way to run it quickly enough to do link checking on a PR. (Update July 7: someone on a Slack pointed me to this open source partial link checker. I haven’t tested it, but they say they use it.) I like this fast link checker, which catches 404s and also errors on page anchors that don’t exist. The latter are not as annoying as the former, but they still hurt developer experience.

Side note: link checkers only prevent internal linkrot. You can prevent external linkrot by being assiduous about your redirects. Never let an external link be sent to the home page or, worse, a 404 page.

Check For Closed Issues

Often documentation references an externally facing issue tracker, such as GitHub issues or Jira. After all, if there is a known problem that has a workaround or enhancement planned, adding a link to an external tracker lets the dev audience know and can help them accomplish their goals or know to wait for a future release. Often issues are closed but are still referenced in the docs, which means that the link is confusing.

This job mitigates that by iterating over the docs, looking for issue links, then checking the status of each issue. If an issue is closed, the task reports it so that the doc can be updated.
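A minimal sketch of that job, assuming the issues live on github.com and the GitHub CLI (`gh`) is installed and authenticated. The function names and the URL pattern are my own choices for illustration; a Jira tracker would need a different pattern and status lookup.

```shell
#!/usr/bin/env bash
# Sketch: find GitHub issue links in docs and report the ones that are closed.
# Assumptions: docs live under the directory passed as $1, issues are on
# github.com, and the GitHub CLI (gh) is installed and authenticated.

# Print each unique GitHub issue URL found under the given directory.
extract_issue_links() {
  grep -rhoE 'https://github\.com/[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+/issues/[0-9]+' "$1" | sort -u
}

# Print closed issue URLs; return 1 if any were found.
report_closed_issues() {
  local found=0 url state
  # URLs contain no whitespace, so word splitting is safe here.
  for url in $(extract_issue_links "$1"); do
    state="$(gh issue view "$url" --json state --jq '.state')"
    if [ "$state" = "CLOSED" ]; then
      echo "Closed issue still referenced: $url"
      found=1
    fi
  done
  return "$found"
}
```

Because each issue needs an API call, this is better suited to a scheduled run than to a PR check.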

Shrinking Images

Small images lead to a faster experience for your users, but it can be hard to remember to shrink the images as you are creating documentation. Automating such image shrinking using a tool like tinypng is an easy way to improve user experience.

This is best done on a pull request, but will require committing the changes using a tool like git-auto-commit.
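If you are not ready to commit shrunken images automatically, a lighter-weight alternative is to flag oversized images on a PR so the author can shrink them. Note that this sketch only detects large files and does not shrink anything. The 200 KiB default, the extensions, and the function name are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: flag images above a size limit instead of shrinking them.
# Assumptions: a 200 KiB default limit and PNG/JPEG files only.

# Print image files larger than a size limit (in KiB); return 1 if any exist.
find_large_images() {
  local dir="$1" limit_kib="${2:-200}" large
  large="$(find "$dir" -type f \
    \( -name '*.png' -o -name '*.jpg' -o -name '*.jpeg' \) \
    -size +"${limit_kib}"k)"
  if [ -n "$large" ]; then
    printf '%s\n' "$large"
    return 1
  fi
  return 0
}
```

Calling `find_large_images path/to/images 200` in a PR workflow fails the step and lists the offending files.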

Generic Content Checking

You can write a shell script using grep to look for known content issues. Examples:

  • Images without alt text.
  • Documents without a title or description tag.
  • Alt text that is not a full sentence.
  • Descriptions that are not a full sentence.
  • Enforcing that every blog post is in a known category.
  • Absolute URLs that point to your docs site. Everything should be relative so it’s easy to stand up locally.
  • Markdown syntax issues

In practice, this looks like: exit `find astro/src/content/ -type f -name "*.md*" | xargs grep ']()'| wc -l |sed 's/[ ]*//g'`

That one-liner looks for Markdown links with an empty URL and uses the number of matches as the exit code. A GitHub Actions step running it will fail if the exit code is non-zero.
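One caveat with using a count as the exit code: exit statuses wrap at 256, so exactly 256 matches would be reported as success. A slightly more defensive sketch returns 1 on any match and prints where the matches are. The function names are hypothetical, and `--include` assumes GNU or BSD grep.

```shell
#!/usr/bin/env bash
# Sketch: fail when any Markdown link or image has an empty URL.
# Assumption: content files match *.md* under the directory passed as $1.

# Print every empty link as file:line:text.
find_empty_links() {
  grep -rnF --include='*.md*' ']()' "$1"
}

# Return 1 when any empty link exists; return 0 otherwise.
check_empty_links() {
  if find_empty_links "$1"; then
    echo "Found Markdown links with an empty URL." >&2
    return 1
  fi
  return 0
}
```

With the path from the one-liner, the call would be `check_empty_links astro/src/content/`. The same pattern works for the other checks in the list by swapping in a different search string.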

Custom Checks

Once you start automating content quality checks, you’ll find opportunities everywhere. Some ideas:

  • If you show example applications on your techdocs website, but store the code elsewhere in a repository, you can check that the number of example applications matches in both places.
  • If you have handcrafted JSON examples, you can make sure they parse using a tool like jq. It can be easy to miss an errant comma.
  • You can make sure every API page has example code on it.

As you look at your docs, I’m sure you’ll think of others.
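As one illustration, here is a sketch of the JSON check. It assumes the handcrafted examples are stored as standalone `.json` files under a directory you pass in, and that `jq` is installed. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: make sure every handcrafted JSON example parses.
# Assumptions: examples are standalone *.json files and jq is installed.
# If jq is missing, every file is reported as invalid.

# Print files that fail to parse; return 1 if there are any.
check_json_examples() {
  local invalid
  invalid="$(find "$1" -type f -name '*.json' \
    -exec sh -c 'jq . "$1" >/dev/null 2>&1 || printf "%s\n" "$1"' _ {} \;)"
  if [ -n "$invalid" ]; then
    echo "Files with invalid JSON:"
    printf '%s\n' "$invalid"
    return 1
  fi
  return 0
}
```

JSON embedded inside Markdown code blocks would need to be extracted first, which this sketch does not attempt.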

How To Handle Errors

Many of these tasks will throw an error when something is incorrect, such as a misspelling or syntax error. There are two ways to handle these errors.

  1. If you can run the check quickly enough, run it on every push or every opened PR and provide feedback for fixing the issue. The doc author can then handle it immediately before the PR is merged.
  2. If the check takes a long time, like spell checking your entire site, then run it on a schedule. The last person to edit the workflow will get the notification, so it’s best to catch the error and have it send an email to a shared alias to capture the issue and then fix it.

Also, configure these tasks so they can be run manually (the workflow_dispatch event for GitHub Actions). This helps with troubleshooting, or with testing after a fix has been made.

Conclusion

All of these tasks can help you remove some of the toil from creating an excellent techdocs site. You don’t have to do them all at once, but adding them will reduce your effort and increase your documentation quality.