GitHub Actions Are Amazingly Easy

GitHub Workflows are automated jobs that can be triggered by various events against a GitHub repository. They are pretty awesome.

GitHub Actions are a way to encapsulate configuration and functionality in a way that can be easily reused in GitHub Workflows.

I was thinking it’d be fun to create some GitHub Actions (yes, I’m the life of the party), so I sat down a few mornings ago to do this. I was shocked at how easy it was.

I followed a few lines of this tutorial to create a workflow. Then I created an action by following this tutorial. Finally, I edited my workflow to use the new action. That was it.

It was amazingly simple and took me about 30 minutes. I ran into one unrelated issue: to set the executable bit on a shell script on Windows, I had to modify the shell script’s contents to ensure the change was sent to the remote repo.
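
A possible alternative to editing the script: git can record the executable bit directly in its index, even on a filesystem that has no such bit. The sketch below is a minimal illustration under assumptions, not a record of what was done here. `mark_executable` and `entrypoint.sh` are hypothetical names, and the function assumes it is run from inside the action repository.

```shell
#!/bin/sh
# Hypothetical helper: stage a script and record it as executable in git's index.
# The mode is stored in the index, so this works even where the filesystem
# has no executable bit (for example on Windows).
mark_executable() {
  script="$1"
  git add -- "$script" &&
    git update-index --chmod=+x -- "$script"
}

# Usage, from inside the action repository ("entrypoint.sh" is an assumed name):
#   mark_executable entrypoint.sh
#   git ls-files --stage entrypoint.sh   # the mode column should read 100755
#   git commit -m "Mark entrypoint.sh as executable" && git push
```

Because the mode change is itself a change to the index, it should be committable without touching the file contents.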

If you take a look, you’ll see these are both toy repositories, to be sure. However, the ability to write jobs which will be executed on a git push, pull request or other events is great and removes toil. Being able to extract common functionality to an action is even better. Finally, the ability to share the action publicly by adding it to the GitHub marketplace is fantastic.

I’ve liked CircleCI for a long time, but if I were them I’d be worried.

One issue I found is that the testing/release cycle is pretty tedious (I’ve mentioned before that action debugging has been an issue for a while).

While I was troubleshooting my executable bit error, I had to do the following every time I wanted to test a change:

  • make a change in the action repository
  • create a new tag
  • push it to the remote
  • switch to the workflow repository
  • bump the action version
  • push to the remote
  • wait for the workflow to complete

Not horrific, but pretty tedious. I don’t know if there are other options such as local deployment which would reduce that cycle, but that would be swell.
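
The tag-and-push half of that loop could at least be scripted. This is a minimal sketch under assumptions: tags are assumed to look like `v1.0.3`, the remote is assumed to be named `origin`, and `next_patch_tag` is a hypothetical helper, not something git or GitHub provides.

```shell
#!/bin/sh
# Hypothetical helper: print the next patch-level tag for a tag like v1.0.3.
next_patch_tag() {
  current="$1"              # e.g. v1.0.3
  prefix="${current%.*}"    # everything before the last dot: v1.0
  patch="${current##*.}"    # everything after the last dot: 3
  echo "${prefix}.$((patch + 1))"
}

# Usage in the action repository (assumes the remote is called "origin"):
#   new_tag=$(next_patch_tag v1.0.3)
#   git tag "$new_tag" && git push origin "$new_tag"
# The workflow repository still has to be bumped to $new_tag and pushed.
```

This only shortens the first three steps; waiting for the workflow to complete stays the same.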

Other than that, 10 out of 10, would write more actions.

Collecting internet points

I’ve been pretty active on HackerNews for the last couple of years and recently made it into the top 100 posters. According to this stats page, in just over 10 years as a member, I’ve posted 3468 comments and had 7824 submissions. That is approximately 1 comment and 2 posts per day, for a decade. (The numbers are as of the time I write this post.)

That’s a lot of hours on a site.

In light of that effort, I’d like to reflect on the good, the bad and the ugly of my years on HN, collecting karma points.

The good:

  • It’s elevated worthwhile posts and sites. I don’t know a single better source of free traffic for technical content. You don’t just get the initial traffic; other sites, online communities and newsletters pick up top ranked HN posts and reshare them, so there’s an echo effect as well that lasts for weeks. It is really fun to find a good article, post it and surprise the author.
  • I’ve learned a lot by reading the comments, especially in fields outside of software engineering. Posts on topics such as economics, physics and careers all receive really insightful comments.
  • I’ve been able to help a few folks get jobs by posting on HN. They have a monthly free jobs board and I know at least two people who have been hired because of one of my posts on the jobs board.
  • While trending on HN doesn’t typically translate directly to sales, it is great for brand awareness. At my current job, quite a few sales processes have been started because an engineer read a post about FusionAuth on HN.
  • For a developer relations position, having an active presence on HN is helpful. You can certainly devrel without being on HN, just like you can devrel without being on Twitter. But in general a public profile is helpful.

The bad:

  • While most folks argue and discuss from a place of goodwill, there are some who are overly pedantic or just not nice. I can ignore them, but I remember a few flushes of shame where I made a mistake in a comment and was called out on it in an unkind, direct manner.
  • Self-promotion is part and parcel of a community where there’s this much traffic (to be transparent I promote my own stuff, but only 1 out of 10 posts at most). Often, I’ve seen over the top self-promotion. If someone only posts their own content, it simply doesn’t work.

The ugly:

  • HN has its share of trolls. I have personally seen fewer ugly posts recently, but any time I mention HN, folks reflect on the ugly, mean comments they’ve encountered on the site. Here’s an example of some people’s feelings.
  • I’ve had people ask me not to post their content without warning them. This is because of the kind of feedback they get from the site; they want to mentally prepare. That they’d even feel the need to do that is icky.

I think finding a community or three is a key part of growing as a developer.

While HN is not perfect, nor is it as welcoming as other communities like the one around the Ruby language, its breadth, volume and diversity have been helpful to me.

So, I plan to keep collecting internet points for the coming years.

Thoughts on managing a devtools forum

I’m one of the team members tasked with managing the FusionAuth community forum, where folks using FusionAuth who don’t have a paid support plan can find help and answers.

Here’s some advice for running such a forum. (I wrote previously about why you should use a forum rather than Slack/Discord/live chat.)

  • First, consider why you are going to run a forum. There are lots of great reasons: ease a support burden, help with SEO, foster community, get product feedback. Get clear on what you are trying to build before committing, because it is a commitment.
  • Choose forum software carefully. Migration will be a pain. Common options include nodebb (what we use), discourse, and vanilla forums.
  • Seed the forum. This means gathering up questions as you see them pop up in other venues (support tickets, GitHub issues, customer calls). I did that religiously for a few months. I learned a lot about the product and the forum posts meant that folks were helped even when it was new. I’d recommend posting the question and then responding in-thread with an answer.
  • Forums will bubble up commonly asked questions. This can tell you where your docs should be improved.
  • You must groom the forum. It won’t be set and forget. You have to pay attention to it, answer questions, respond to responses. A forum full of unanswered questions is worse than no forum at all. Trust me, developers will notice (we’ve had customers mention that they appreciated how active our forum was).
  • Because we sell support, we don’t answer questions immediately or have engineering staff answer them. There are also questions that we can’t answer, such as architecture recommendations. Immediate responses and answers requiring context and research are reserved for paying customers. This hurts my heart sometimes, but we are open about it. This may not be applicable in all cases.
  • Don’t be afraid to ban users. We ban anyone who spams, no questions asked. Delete the content and ban the user. We luckily haven’t had any abuse issues beyond spam.
  • Have a code of conduct. I grabbed GitHub’s (you can see ours here, and here’s GitHub’s), but whatever you pick, have something. We didn’t in the early days, but it’s a good thing to have out of the gate.
  • Don’t expect a lot of community to grow out of it. At least, I haven’t had that experience; most people just want their questions answered. That may be because I’m extremely part time on it and haven’t fostered it, though. Slack/Discord is much more likely to build community in my experience. But know what your users want: Google or Facebook?
  • At a certain point, I had to enable a post queue, where a team member approves every post from a new user. We were getting a lot of spam accounts and then they’d post gambling ads and then direct a ton of traffic (1000s of pageviews) to the ads. I don’t know what the spammer endgame was, but approving each new post has solved the issue. I’d definitely look for that feature.

In general I love forums, and so do devs, but they do take some work.

How to be a good conference talk audience member

I recently attended a conference and was both a speaker and audience member. It was on the smaller side; there were probably a few hundred attendees and the audiences ranged from about twenty to hundreds of attendees for the keynotes.

After one of the talks, a speaker came up and said “you were such a good audience member, thank you!”. I said the same thing to one of the attendees of the talk I gave.

I wanted to share how you can be a good audience member at a conference talk. It’s important to note that this advice is for attending in-person talks where the speaker can see the audience. This is typically when there are up to one hundred people. I’ve spoken in front of 800 people and it’s a different experience. While some of these principles apply, in general individual behavior is less important as audience size grows.

And online talks are an entirely different experience for everyone, both audience and speaker! I don’t have enough experience to give any advice for that scenario.

First, though, why would you care to be a good member of an in-person audience? After all, you are providing your time and money to the conference and the presenter. Isn’t it the speaker’s job to entertain and educate you? Why would you expend any energy to help them do so?

First, I’m a big fan of being respectful of other human beings and helping them succeed. Public speaking is a common fear and being a good audience member can reassure the speaker and reduce that fear. It’s hard up there, whether it’s your first talk or your hundredth.

The second reason is that you can make a talk better for yourself. You can learn more and you can tune their presentation to your needs. They are an expert and you can take advantage of their expertise.

So, here are my tips on how to be a great audience member:

  • First, remember that you don’t owe the speaker your time, but you do owe them respect. If you aren’t interested in the talk, if it isn’t what you thought it would be, or if you have another commitment or pressing issue to address, leave the room. Don’t make a big show of it, but get up, walk quietly to the door, open it carefully, and depart.
  • Since you’ve decided to stay, pay attention. Silence your phone. Turn off your computer. If you want to take notes using your laptop, disable wifi so you won’t be distracted.
  • When you understand and agree with something, nod and smile. This feedback provides the speaker a signal that they are reaching you.
  • If you don’t understand, frown or make a questioning face. No need to harumph, but give the presenter feedback that the topic is confusing or that they haven’t made their point clearly.
  • If you have questions, ask them. Speakers should tell you up front whether they want to be interrupted with questions during the talk, but if they haven’t, a polite hand raise should be acknowledged. If it isn’t, save your questions for the end.
  • When asking questions, realize you may not get a complete, satisfactory answer. If you don’t, I’d avoid a secondary question. Instead approach the speaker after for a more in-depth discussion.
  • If you didn’t like or understand the talk, give that feedback to the speaker afterwards. No need to be rude, but saying something like “I wish you’d given more background on <X>” or “It seemed like you skipped over the complexities with <Y>” will help the speaker improve their talk.
  • If you feel moved to do so, thank the speaker afterwards. This is not required but a talk is a lot of work and any feedback is usually welcomed.

Am I always a good audience member? Nope.

I get distracted sometimes.

But when I follow my suggestions above, I learn more from the expert on the stage.

Ask for no, don’t ask for yes

I think it is important to have a bias for action. Like anything else, this is something you can make a habit of. Moving forward allows you to make progress. I don’t know about you, but I’ve frozen up in the past not knowing what the right path was for me. Moving forward, even the smallest possible step, helped break that stasis.

One habit I like is to ask for no, not yes. Note that this is based on my experience at small companies (< 200 employees), where I’ve spent much of my career. I’m not sure how it’d work in a big company, non-profit, or government.

When you have something you want to do and that you feel is in scope for your position, but you want a bit of reassurance or to let the boss know what you are up to, it’s common to reach out and ask them for permission. Don’t. Don’t ask for a yes. Instead, offer a chance to say no, but with a deadline.

Let’s see how this works.

Suppose I want to set up a new GitHub action that I feel will really improve the quality of our software. This isn’t whimsy, I’ve done some research and tested it locally. I may have even asked a former colleague how they used this GitHub action.

But I’m not quite sure. I want to let my boss know that I’ll be modifying the repository.

I could say “hey, boss, can we install action X? It’ll help with the XYZ problems we’ve been having.”

If you have a busy boss (and most people do), this is going to require a bit of work on their part to say “yes”.

They’ll want to review the XYZ problem, think about how X will solve it and maybe do some thinking or prioritization about how this fits in with other work. Or maybe they’ll want you to share what you know. It may fall off their plate. You will probably have to remind them a few times to get around to saying “yes”. It might be a more pressing issue for you than it is for them.

Now, let’s take the alternative approach. “Hey, boss, I am going to install action X, which should solve the XYZ problems we’ve been having. Will take care of this on Monday unless I hear differently from you.”

Do you see the change in tone?

You are saying (without being explicit) that you “got it” and are going to handle this issue. The boss can still weigh in if they want to, but they don’t have to. If they forget about it or other issues pop up, you still proceed. This lets you keep moving forward and solving problems while keeping the boss informed and allowing them to add their two cents if it is important enough.

You can also use this approach with a group of people.

By the way, the deadline is critical too. Which would you respond to more quickly, if it was Jan 15, all other things being equal and assuming a response was needed?

  • “I’m going to do task X.”
  • “I’m going to do task X on Jan 17.”
  • “I’m going to do task X on Feb 15.”

I would respond to the second one, which has a deadline in the near future. I think that is the way most folks work.

Again, pursue this approach for problems you feel are in the scope of your role but that you want to inform the boss about. It’s great when you want to offer a chance for feedback, but you are confident enough in the course of action that you don’t need feedback.

GitHub actions and workflows

I recently wrote my first real GitHub action workflow at work. It was to publish our website after a merge or push to our main branch.

After this experience, I think these workflows are perfect for simple automation tasks. Things like:

  • Running a linter like rubocop on your code
  • Deploying a simple application (one or a few artifacts).
  • Running unit and integration tests.

I didn’t use self-hosted runners, though they seem like a nice escape valve if you want to run things within your own network or you run over the limits. GitHub publishes the action and workflow limits (storage, runtime) and they are definitely worth reviewing.

You also can easily stand up a couple of different service containers for easy integration testing (the documented examples right now are PostgreSQL and Redis). You can also abstract out your commonly used workflow segments to versioned actions.

It was really a pain to write the workflow, however. I had to push repeatedly to our mainline branch, and there were times I screwed up the YAML or didn’t have my script correct. The feedback loop was slooow. Ouch. There are solutions to run workflows locally, but I haven’t tried them yet.
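
One such local option is the open source `act` tool (nektos/act), which runs workflows in Docker containers on your own machine. The sketch below was not run against this workflow. It assumes `act` and Docker are installed, and both `run_workflow_locally` and the `publish` job id are hypothetical names.

```shell
#!/bin/sh
# Hypothetical wrapper around the act CLI (assumes act and Docker are installed).
# Lists the jobs act can find, then simulates a push event locally.
run_workflow_locally() {
  job="$1"                 # optional: id of a single job to run
  act -l || return 1       # list jobs found under .github/workflows
  if [ -n "$job" ]; then
    act push -j "$job"     # run only the named job for a push event
  else
    act push               # run every job triggered by a push event
  fi
}

# Usage, from the root of the repository that holds the workflow:
#   run_workflow_locally            # all push-triggered jobs
#   run_workflow_locally publish    # "publish" is an assumed job id
```

A local run is not guaranteed to behave exactly like GitHub’s hosted runners, so it is probably best treated as a faster first check for YAML and script mistakes rather than a replacement for a real push.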

Other than that, it was a positive experience. If you are using GitHub and have automation needs, take a look at GitHub actions. I am a big fan of CircleCI and have been for years. GitHub actions covers a lot of the same ground. GitHub actions are less sophisticated, but it seems like a definite “innovator’s dilemma” play. So I expect actions to get more and more sophisticated.

Use Cameo for dev focused marketing

Recently, we used a Cameo for a developer focused announcement. If you are not familiar with this service, it lets you request a short video from an actor. You send the actor your idea, pay them, they send you the video, and you can use it for a limited number of purposes. If you, or someone you know, has a favorite actor, it can make for a real fun birthday message. But it also is fun for marketing messages and can help you stand out from the crowd.

My experience below is based on one business Cameo. We plan to do more, so there may be updates.

Why consider a Cameo

It is still relatively unique. I’ve seen a few celebrity endorsements of technical products via Cameo, but not that many. This means that it stands out in a fun way. Using Cameo also gets you easy access to a famous or semi-famous person. All you have to do is submit a form and pay some dollars. Compare this to any kind of commercial, which may involve a casting director, ad agency and other parties.

It also is relatively cheap. I looked at a few actors and none cost more than $2000 for commercial usage (more about that below). While this isn’t cheap, I also saw actors for a couple hundred bucks. We ended up choosing an actor who worked for $500.

Note that a Cameo is a pure brand marketing play. It is fun for shock or surprise value, rather than a CTA. It’s unlikely you’ll get deep technical analysis either. This playful nature fit with our brand, but make sure it fits with yours.

How it works

You can check out the Cameo site FAQs, but here’s how the process worked for me.

  • Browse actors and come up with a shortlist.
  • Filter out actors who won’t do commercial messages. (Some actors won’t, so check before you get excited.)
  • Decide on a topic to be covered.
  • Review licensing terms for commercial use.
  • Sign up for an account.
  • Put in a credit card.
  • Submit a request on the website. This was limited to 250 characters. (Not 250 words. 250 characters. So the guidance was general.)
  • Install the application to get messaging. (The actor enabled free messaging so he could ask questions.)
  • Go back and forth with the actor and answer clarifying questions, maybe 2 rounds of qs. This had to be done on the Cameo app (boo!).
  • Accept the delivered video.
  • Promote and share it.

Note what wasn’t in there:

  • Any writing talent. I did talk to a number of writers and even selected one. However, after reviewing the constraints, we mutually decided it didn’t make sense. There just isn’t a lot of room for a complex story line or even a funny line or two. That’s probably why Cameo has the limit.
  • A specific story line. I was able to convey one message to the actor, but otherwise it was in his hands.
  • A lot of back and forth or workshopping. I think I talked about this internally for maybe 15 or 30 minutes and definitely had a good idea of what we wanted to cover. But other than some questions, it wasn’t super collaborative. And, to be honest, that was fine. I believe any actor on Cameo is funnier and knows more about speaking to the camera than I do.

I do wonder whether all the actors would have the same devotion to detail. As mentioned above, the actor enabled free messaging and really dug into the topic. Everyone who watched the video was delighted.

After it is delivered

After it is delivered, it’s time to promote. At the time we bought the Cameo, you could put it on one social media platform or your website for 30 days. We chose Twitter. Then I realized that the actor had recorded 5+ minutes. You aren’t allowed to edit the videos, and the maximum length of a Twitter video is 2:20. So we posted it on an unlisted YouTube link and shared that. Check out the current terms (search for Business CAMEO Videos).

I submitted it to a few online communities, shared in social networks and basically did any other kind of promotion you’d do with an interesting video. It was shared to several email lists and slacks as well. We also bought some traffic.

It didn’t go viral, but it got ~10x the retweets and interactions that our normal tweets do. It’s unclear if any business came from it.

What I wish we had done differently

  • Understood my limits earlier. I spent a lot of time talking to writers before I realized that 250 characters meant sending over an idea and trusting in the actor. It would have been less stressful to have known that earlier.
  • Been a bit more familiar with the actor. One of the best parts of the Cameo was made in response to an offhand request from a co-worker who was more familiar with their work than I was. I should have done a bit more research.
  • While I focused on the topic and asked the actor to do it in character, I should have included the following in my pitch:
    • How to pronounce the brand.
    • Whether or not I should be mentioned (I was, but that was unnecessary).
    • The optimal length of time (2:20).

At the end of the day, this is a fun alternative (or complement) to the normal boring press release. If you have a character which is in line with your brand or product usage, do check it out.

You should use forums rather than Slack/Discord to support developer community

Hot take after a year or so of trying to build a developer community. If you can pick only one, use forum software rather than synchronous chat software for community building around a developer platform.

While there are tradeoffs in terms of convenience and closeness, for most developer communities a public, managed forum is better than a private, unsearchable Slack.

There are a few key differences between forum software (which includes packages like nodebb, forem, discourse, and others) and chat software (like Slack, Facebook groups, or Discord).

First, though, it’s important to know what you are trying to accomplish. If you are trying to get immediate feedback from a small set of users, then synchronous solutions are better. You can be super responsive, your users will feel loved, and you’ll get feedback quickly. However, synchronous solutions go beyond chat and include phone and video calls. The general goal of validating user feedback at an early stage is beyond the scope of this post.

But as you scale with a chat solution, major problems for the longevity and value of the community emerge.

Problem #1: the memory hole

This occurs when there’s a great answer to a common question, but it isn’t available or is hard to find. This matters more for community Slacks than other synchronous solutions, since Slack limits free plans to 10,000 messages.

The problem still exists for solutions such as Discord because older messages scroll up. Search isn’t great. As a participant it feels easier to just re-ask the question. If the community is vibrant and willing to help newcomers, such questions get answered. If not, they languish or are ignored, frustrating the new participants. Not good.

You can work around the memory hole as a community member by extracting and reifying interesting chat posts. I have done this by generalizing and publishing the messages as blog posts, to a newsletter, or even to a google sheet. But this is additional work that may not be done regularly, or at all.

Side note: for some communities, such as those discussing current events or just chatting with friends, this is actually a feature, not a bug. Who needs to remember who said what six months ago when conversing with friends?

For developer communities, friendly chat is important, but so is sharing knowledge; the memory hole actively thwarts the latter.

Forums, on the other hand, are optimized for reading. nodebb even suggests related posts as you begin a topic, actively directing people to older posts that may solve their issue without them ever posting.

And if published on the internet, forums are searched via Google.

Problem #2: Google can’t see inside chats

Google is the primary user interface for knowledge gathering among developers.

I hope this statement isn’t controversial. It is based on personal experience and observation, but there’s also some research.

I have seen many, many developers use Google as soon as they are confronted by a problem. YouTube, books, going to a specific site and using that search: these all are far less used alternatives when a developer has a question.

Google has made it so easy to find so much good information that most developers have been trained when they face a problem to open a new tab, type in the search term and trust the first page of results.

Chat systems don’t work well with this common workflow, because all the content is hidden.

This means when you use a chat for a developer community, you don’t get compounding benefits when someone, either a team member or a community member, answers a question well or has an insightful comment that would be worth reading. Very few folks ever benefit in the future.

With chat, people who aren’t present at message posting or soon thereafter never learn from that knowledge.

Problem #3: synchronous communication is synchronous

When you are in a chat system, the information is ephemeral. This means that valuable comments can be lost if there is a flurry of other messages.

People can feel ignored even though the reality is that they just posted at an inopportune moment. This feeling can be intimidating; I’ve definitely felt miffed as a question or comment I posted was ignored or unseen and other people’s questions were answered. “Was it me? My topic? Are other people more welcome here than I am?”

People who like to answer questions may feel the need to do so quickly. This may interfere with time for deep work. The Pavlovian response is real; I’ve felt it myself. It feels “better” to write a response to help someone than it does to write a document that will help many, because the former is so concrete and immediate.

When you pick a chat solution, you are optimizing for this kind of response.

Problem #4: less capable moderation tools

Forums have been around a long time. It’s a well understood problem space. There’s a rich set of functionality in most of them for handling the more frustrating aspects of online community management (see also “A Group is its own worst enemy”).

Chat applications have uneven support for this aspect of community management; I have heard Discord is pretty good, but Slack is not.

Remember that when you are running a community, you will inevitably attract trolls and spammers. Make sure you have the tools to protect the community from abuse.

In addition, make sure you have the time/energy. The community may be able to police itself when it gets to a certain size, but initially and for a long while, with a chat solution you may need to be ready to jump in and moderate.

Forums still require attention, don’t get me wrong, but the tooling and the separation of topics means they aren’t quite as vulnerable. They are a higher value target, because of Google, however.

Problem #5: you’re missing out on long tail content

This issue is related to problem #2, but slightly different. When you are building developer tools, there is a wide surface area of support needed. Questions from developers help define that space. When they go to a chat, someone needs to capture the questions and make them public to help future developers. You can capture this knowledge in formal docs.

When using a forum, the answers are made available to the long tail of searchers without any effort at all. A company I worked for got about 5-6% of their traffic from their forum pages.

That traffic was essentially free because the time to answer the questions was required with either solution. (This assumes the question would have been asked in either a chat or a forum.)

Problem #6: questions can be flippant

When I am talking in person to someone and they ask a question, I don’t expect them to have done a ton of research or thinking about it. It’s a conversation, after all.

The same attitude occurs during real time chat.

For technical questions, this can be frustrating because you want to help immediately (see problem #3) and yet you don’t have all the information you need. In async discussions, because they are async, more context is typically provided by the questioner.

This makes it easier for people who want to help to do so.

Why do people use Slack/Discord/etc?

Wow, so many problems with chat, and so many reasons why forum software is better.

So why do so many folks building developer communities choose solutions like Slack or Discord?

There are two motivations that I can see.

One from the company perspective and one from that of the developer.

From the company side: chat helps build community among members.

I don’t know about you, but I am much more likely to pitch in and help when I have had a conversation with someone than if some rando drops by with a question and leaves.

A Slack can begin to feel like a real community, where you know people. It doesn’t feel as transactional when I see a question in a Slack when I’ve seen the questioner post other messages or share a bit about themselves. This type of interaction can happen in a forum, but seems more common in a Slack. This type of interaction makes the community more sticky and people more likely to help. A minor benefit is that chat can be hosted elsewhere for free so the startup cost and friction are low.

From the developer’s side: when I run into an issue or problem, I want an answer as soon as frickin’ possible. It’s blocking me, otherwise I wouldn’t have asked it.

Sure, I can context switch, but that has its own costs. So there’s tremendous value from a knowledge seeker’s side to pick a synchronous method of asking questions.

If you had a burning question to ask, which would you prefer: hopping on the phone or sending physical mail? That’s the allure of the chat platforms.

I will say that some forum software has built chat in, but that isn’t going to get you an answer immediately.

What’s right for you?

Well, what do you want to emphasize? Long term aggregation of knowledge and a culture of completeness, or community and a culture of immediacy?

As alluded to initially, you can of course use both tools at different times in your community’s evolution. I think the longer you build, the more you’ll move to a forum or other public knowledge sharing solution.

Here’s a tweet survey that I ran a month ago asking how developers wanted to get tech help. (“Something else” turned out to mostly be “well written documentation”, from the thread responses.)

Slack failing to open with NS_ERROR_DOM_ABORT_ERR error

I use Firefox as my primary browser (version 84 as of today). As part of this, I stay logged in to a number of Slack channels. These are some of my favorite communities and I go there to chat with folks, learn about interesting topics and hunt for neat links to resources I didn’t know about. Occasionally I’ll even ask a question or two.

Recently, I saw some weird behavior. The Slack channels weren’t updating. The notice at the bottom of the browser bar was something like “Slack is trying to connect”.

Hmm, I thought, that’s weird. I tried reloading the page. I couldn’t do it. That is, even hitting control-f5 didn’t change anything. However, I could open other websites just fine.

I tried quitting all my browser tabs and restarting. Made sure Firefox was up to date. Checked the Slack status page to see if Slack was down. Opened a new tab and typed the Slack channel url into the address bar.

Nothing.

So, I popped open the dev tools and looked at the network tab. I saw this error when loading app.slack.com:

NS_ERROR_DOM_ABORT_ERR

Searching for that error turned up this bug. From a scan, several other folks were having this issue, with Slack and other sites. This comment shared the solution:

Clearing storage for app.slack.com fixed the issue, and the Slack workspace loads correctly now.

All I had to do was clear the storage for app.slack.com and Slack started working again, magically. Even though I was warned that it might force me to log in again, I didn’t have to do so.

Terraform with multiple workspaces and environments

I was recently setting up a couple of AWS environments for a client. This client had a typical web application which talked to an RDS database. There was DNS, a CDN and other components involved. We wanted to use Terraform to maintain traceability and replicability, and have the same configuration for production and staging, with perhaps small differences like EC2 instance size. We also wanted to separate out the components into their own Terraform workspaces to limit the blast radius (so if one component had changes that caused issues or Terraform corruption, it wouldn’t affect others). Finally, we wanted each environment to have its own Terraform backend, again to separate the environments.

I wasn’t able to complete this project due to external factors (I left the position before testing could be completed), but wanted to share the concepts. Obviously I can’t share the working code, but I set up a simpler example project. That’s the project I’ll be examining in this post. I also want to be clear that while I’ve tested this as much as I could and have validated the ideas with others who have more Terraform experience, this hasn’t been run in production. You have been warned. (Here are the Terraform docs about setting up modules, workspaces and repositories.)

Using a tool like Terraform is great for a number of reasons, but my favorite is that it lets you track changes to cloud infrastructure. More than once I’ve wandered into an AWS account and wondered why certain resources were set up in the way they were, and what might break if I changed them. There are occasionally comments, but it is far better to examine a commit. Even better to review the set of commits and see the customer request or bug tied to it. (Bonus link: learn more about Terraform and other cloudy tools in this podcast episode with the creator of Terraform.)

So this simpler example project has a lambda that writes to an SQS queue. For now, it just writes the date of invocation, but obviously you could have it reach out to an external API, read from a database, or do some kind of calculation. The SQS queue could then be read from by an EC2 instance, which processes the message and perhaps updates a database. You have three components of the system:

  • The lambda function
  • The SQS queue
  • The EC2 instance (implementation of which is left as an exercise for the reader)

The SQS queue is shared infrastructure and needs to be accessed by both of the other systems. However, the SQS system doesn’t need to know about either the lambda or the EC2 instance. Using Terraform, we can create each of these components as their own workspace. Each of the subsidiary systems can evolve or change (for instance, the EC2 instance could be replaced with an autoscaling group) with minimal impact on other systems. They could be managed by different teams as well if that made sense.

To enforce this separation, set up each component as a separate Terraform workspace. (All code is on GitHub here.) I use remote state so that more than one person can manage the Terraform state, and use the S3/DynamoDB backend because we are targeting AWS and want a free scalable solution. This post assumes you know how to set up Terraform using S3/DynamoDB as remote state storage.

Here are the outputs of the SQS system:

output "queue_url" {
  # For aws_sqs_queue, the id attribute is the queue URL.
  value = aws_sqs_queue.myqueue.id
}

output "queue_arn" {
  value = aws_sqs_queue.myqueue.arn
}

I explicitly define the output variables so I can pull them in from the lambda and EC2 workspaces. This is how you can do that.

...
data "terraform_remote_state" "sqs" {
  backend = "s3"
  config = {
    bucket         = var.terraform_bucket
    key            = "sqs/terraform.tfstate"
    encrypt        = true
    dynamodb_table = "terraform-remote-state-locks"
    profile        = var.aws_profile
    region         = "us-east-2"
  }
}
...
resource "aws_lambda_function" "mylambda" {
...
  environment {
    variables = {
      # Inject the queue URL exported by the sqs workspace.
      sqs_url = data.terraform_remote_state.sqs.outputs.queue_url
    }
  }
}

The terraform_remote_state block defines the location of the previously defined sqs workspace’s state, and the data.terraform_remote_state.sqs.outputs.queue_url expression references the queue URL that workspace exports. That value is then injected as an environment variable into the lambda, which reads it and uses the URL to create an SQS client. It can then post whatever message it wants.

You can see how this would work with any number of configuration parameters. If you have a typical three-tier, database-driven application with a separate caching layer, you can create each of these major components and inject the values into either the environment (for lambda) or the userdata (for EC2). I’m not sure I’d use this with a microservices architecture because using a service registry might be more appropriate.

Note that the lambda component has a rudimentary lambda function (you have to define something). It also uses Terraform to deploy the lambda code. That’s fine for the toy example, but for production you will want to use a real CI/CD system to deploy your lambdas.

Now, suppose you want to run production and staging environments, because you are ready to launch. Here are the constraints you’d want:

  • Production and staging run the same config (except when staging is changing, of course)
  • Production and staging may differ in a few details (the size of the EC2 instance, for example)
  • Production and staging execute in different AWS accounts to limit access and issues. You don’t want an error in staging to affect production. This is handled by creating different profiles which have access to different accounts.
  • Production and staging execute in different Terraform backends for the same reason as the separate AWS accounts.

Staging and production can use the same git repository, but when pulled down they are kept in two places on the filesystem. This is because you need to specify the profile and the bucket when using terraform init. So you end up running something like these two commands:

git clone git@github.com:mooreds/terraform-remote-state-example.git # staging
git clone git@github.com:mooreds/terraform-remote-state-example.git production-terraform-remote-state-example # production

I set up the project so that staging can be managed by normal terraform commands (since that will happen more often), and production uses either special incantations or a script. For the initialization of the production Terraform environment, this looks like: terraform init -backend-config="profile=trsproduction" -backend-config="bucket=bucketname". For staging, it’s just terraform init. I didn’t have a lot of luck switching between these two Terraform backends in the same filesystem location, so having two trees was a straightforward workaround.

Any differences between production and staging are each pulled out into a variable, with the staging value as the default. Then each workspace has a script which applies the Terraform configuration to the production environment. The script sets the variables to the correct values for production. Here’s an example for the lambda workspace:
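One way this init behavior can work is through Terraform’s partial backend configuration: values given with -backend-config on the command line override those written in the backend block. Here is a minimal sketch, assuming the staging values live in the block itself. The bucket name and state key below are hypothetical placeholders, not the contents of the example repository:

```hcl
terraform {
  backend "s3" {
    # Staging values, picked up by a plain `terraform init`.
    # The bucket name and key are hypothetical placeholders.
    bucket         = "example-terraform-state-staging"
    key            = "lambda/terraform.tfstate"
    region         = "us-east-2"
    encrypt        = true
    dynamodb_table = "terraform-remote-state-locks"
    profile        = "trsstaging"
  }
}
```

Backend blocks can’t reference variables, which is why the production values are supplied with -backend-config flags rather than -var.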

terraform apply -var aws_profile=trsproduction -var terraform_bucket="mooreds-terraform-remote-state-example-production" -var env_indicator="production" -var lambda_memory_size=256

We pass in the production terraform_bucket in case any references need to be made to the remote state (to pull in the SQS queue URL, for example). We also pass in an increased lambda memory size because, hey, it’s production. Other things that might vary between environments include VPC or subnet IDs, API endpoints, and S3 bucket names.
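The variables on that command line might be declared along these lines. This is a sketch, not the repository’s actual file: the variable names come from the command above, but the default values, types, and descriptions are assumptions.

```hcl
variable "aws_profile" {
  description = "Named AWS profile used for credentials"
  type        = string
  default     = "trsstaging"
}

variable "terraform_bucket" {
  description = "S3 bucket holding remote state for this environment"
  type        = string
  default     = "example-terraform-state-staging" # hypothetical name
}

variable "env_indicator" {
  description = "Label for the environment being managed"
  type        = string
  default     = "staging"
}

variable "lambda_memory_size" {
  description = "Memory for the lambda, in MB"
  type        = number
  default     = 128 # assumed staging size
}
```

The lambda resource can then set memory_size = var.lambda_memory_size, so staging gets the default and production gets whatever the script passes in.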

For simplicity, we just use two profiles for staging and production (in ~/.aws/credentials), but any way of getting credentials that works with Terraform will work:

[trsstaging]
aws_access_key_id = ...
aws_secret_access_key = ...

[trsproduction]
aws_access_key_id = ...
aws_secret_access_key = ...

This lets us separate out who has production access. Some users can have both staging and production profiles (perhaps operations), and others can have only staging profiles (perhaps developers). You can pass region values in via variables as well.
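A provider block could tie these profiles and the region to variables. This is a sketch under assumptions: aws_profile appears in the commands earlier in this post, while aws_region is a hypothetical variable name introduced here for illustration.

```hcl
# Hypothetical variable; the default matches the region used elsewhere in this post.
variable "aws_region" {
  type    = string
  default = "us-east-2"
}

provider "aws" {
  # var.aws_profile is assumed to be declared elsewhere, with the staging profile as its default.
  profile = var.aws_profile
  region  = var.aws_region
}
```

With something like this in place, a production run would add -var aws_region=... to the apply command only if production lives in a different region.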

Using this system, the workflow for a change would be:

  • Check out the terraform git repository
  • Create a feature branch (including an issue identifier)
  • Pull request and approval
  • Run terraform apply to apply to staging
  • Run any additional tests
  • Merge to master
  • Run prodapply.sh

Again, I want to be clear that I’ve implemented this partially, but I didn’t get a chance to run this fully in production. I tested all these concepts with the simple system mentioned above (and you can stand up your own using the code on GitHub). There will be issues that I haven’t experienced. But I hope that this post helps illuminate the complexity of managing multiple workspaces and environments within a single Terraform GitHub repository.