Automating one off deployment tasks

When I am deploying a rails application, there are sometimes one off tasks that I need done at release time, but not before. I’ve handled such tasks in the past by:

  • adding a calendar entry (if I know when the release is happening)
  • add a task or a story for the release and capture the tasks there
  • writing a ‘release checklist’ document that I have to remember to check on release.

Depending on release frequency, application complexity and team size, the checklist may be small or it may have many tasks on it. Some tasks on a checklist can include:

  • sending a communication (an email or slack message) to the team detailing released features
  • restarting an external service to pick up configuration or code changes
  • notifying a customer that a bug they reported has been fixed
  • kicking off an external process via an API call now that a required dependency has been released

What these all have in common is that they:

  • are not regular occurrences; they don’t happen every deploy
  • are affecting entities (users or software) beyond code and database
  • may or may not require human interaction

after_party is a gem helps with these tasks. This gem is similar in functionality to database migrations because each after_party task is run once per environment (this is done by checkpointing the task timestamp in the database). However, after_party tasks are arbitrary rake tasks, rather than the database manipulation DSL of migrations.

You add this call to your deployment scripts (the example is for heroku, feel free to replace heroku run with bundle exec):

heroku run rake after_party:run --app appname

This rake task will will run every time, but if an after_party task has already been run and successfully recorded in the database, it will not be run again.

You can generate these tasks files: rails generate after_party:task my_task --description="desc"

Here’s what an after_party task looks like:

namespace :after_party do
  desc 'Deployment task: notify customer about bug fix'
  task notify_issue_111: :environment do
    puts "Running deploy task 'notify_issue_111'"

    TfcAdminNotesMailer.send_release_update_notification("customer X", "issue 111").deliver_now

    AfterParty::TaskRecord.create version: '20180113164945'
  end  # task :notify_issue_111
end  # namespace :after_party

This particular task sends a release update notification email to the customer service user reminding them that we should notify customer X that issue #111 has been resolved. I couldn’t figure out how to make a rake task send email directly, hence the ActionMailer. In general you will want to push all of your logic from the rake task to POROs or other testable objects.

I could have sent the message directly to the customer rather than to customer service. However, I was worried that if a rollback happened, the customer might be informed incorrectly about the state of the application. I also thought it’d be a nice touchpoint, since issues are typically reported to customer service initially. The big win is that ten of these can be added over a period of weeks, and I could have gone on vacation during the release, and the customer update reminders would still be sent.

All is not puppies and rainbows, however. This gem doesn’t appear to be maintained. The last release was over two years ago, though there are forks that have been updated more recently. It works fine with my rails4 app. The main alternative that I’m aware of is hijacking a database migration and calling a rake task or ruby code in the ‘up’ migration clause. That seems non intuitive to me. I wouldn’t expect a database migration to touch anything outside of, well, the database.

Another thing to be aware of is that there is no rollback/roll forward functionality with after_party. Once a task is run, it won’t get run again (unless you run the rake task manually or modify the task_records database table).

This gem will help you automate one off deployment tasks, and I hope you enjoy it.


What can a senior developer do that a junior developer can’t do?

I saw this post about senior vs junior lawyers a while ago and it sparked some thoughts about the senior/junior divide.  I have also seen companies interested in hiring only senior developers. This is not a post hating on junior developers–the only path to a senior developer is starting as a junior developer.

Anyway, I think that senior developers will help you in a number of ways:

  • Senior developers have made their mistakes on someone else’s dime.  We all make mistakes.  One time I took 4 hours to figure out that the reason I couldn’t connect to an Amazon RDS instance was because the security group was too locked down.  But people integrate their mistakes into their experience and don’t make the same mistake twice.  When you hire a senior developer, you’re getting all their previous experience as well as their current effort.
  • Senior developers can own a larger portion of a project than junior can.  This can include non coding tasks like system setup, estimation, requirements definition and customer interaction.  This gives you leverage.
  • They are able to understand interactions between various pieces of your system.
  • They have an intuitive feel for performance impacts, again, probably because of previous mistakes.
  • They map your current solutions and problems into their existing knowledge: “Oh, this is how we handled caching at my <$JOB – 2>.”
  • They bring best practices from other companies (and can comment on worst practices).
  • They can get up to speed more quickly.
  • They can mentor other developers in areas of expertise.
  • They know the power of automation and knowledge sharing.
  • They have previously been confronted with difficult, complex problems and have an idea of how to attack new, difficult complex problems.
  • They understand that almost all problems are people problems, and that communication is a key part of any project’s success.

Junior developers, in my experience, bring value for the money as well.

  • They have more malleable viewpoints, especially with respect to tooling and technology.
  • They are typically very eager to learn.
  • When you are explaining systems to them, they can’t leverage existing knowledge.  This forces you to speak from first principles and may highlight erroneous assumptions.
  • Because they don’t have the knowledge of existing systems, they question why things are done a certain way.  Systems and processes evolve, but sometimes tasks are done only because they’ve “always been done that way” which isn’t necessarily the most efficient.  It’s hard to see that without the “eyes of a beginner”.
  • They are less cynical.

All good developers in my mind are:

  • Eager to learn.
  • Unwilling to say “that’s not my job”.
  • Accept ownership of the success of the project and the business.

If I had to sum up the differences between junior and senior developers in one word, it is would be “scope”.  A senior developer should be able to handle a wider scope of projects, responsibilities and problems than a junior developer.  How wide that scope is depends on the developer, their experience, and the problem space, but there are certain aspects of scope that are the same across domains (non coding tasks).


Seek The Golden Comment: “zeus exit status 1” fix

This bit me yesterday, so I wanted to get it written down.  zeus is a preloader that makes running tests faster.  It can be a bit finicky about the gems available, even when using rvm.

Yesterday, I tried to upgrade a rails 4.2 app to rails 5.  (I failed.)  When I checked out my source branch to work on a different issue, ran a bundle install, and then a zeus start, I saw this:


zeus rake
zeus runner (alias: r)
zeus console (alias: c)
zeus server (alias: s)
zeus generate (alias: g)
zeus destroy (alias: d)
zeus dbconsole
zeus test (alias: rspec, testrb)
exit status 1

And then zeus exited.  I did some google searches and turned up these two issues: 118 and 237. Lots of folks having similar issues.

After reading carefully through these issue, and trying some of the suggested fixes as I did so, I arrived at the golden comment. I reproduce it here in its beautiful entirety:

There’s a lot of “try this” in here, but no actual debugging steps. Here’s how you can find the actual issue:

zeus –log ZEUS.LOG start then cat ZEUS.LOG

Excellent!  Using this logfile I quickly determined that the root issue was a collision in my json gem versions and was able to get zeus running again.

But that’s not really the point. The point is that this user (thank you Steven!) didn’t just provide an answer, he provided the means for me to diagnose and find my own answer.

I wish github had some way of calling out highly recommended comments, as if I’d seen his comment first, it would have saved me some time.

Just goes to show, you should always read the comments fully and look for the golden one when you are troubleshooting.


Functional Core, OO Shell

This video is about 30 minutes long.  Mike Gehard mentioned it to me, and I enjoyed the heck out of it.  Many takeaways from this.  One is is how using test doubles make sense at the beginning of a software project, but how it will eventually come back to bite you as the method signatures of dependencies change, and your tests don’t fail.  Another is how you can write ‘fauxO’, a nice combination that has the strengths of functional programming and OO programming.  A third is how and where integration tests make sense.

Overall, what he proposes is that the boundary of any object that makes decisions have parameters that are simple values so that that it is always easy to unit test them.  Out of this proposal fall several nice features.

Well worth the time to watch.


The wonders of outsourcing devops

I have maintained a Jenkins server (actually, it was Hudson, the precursor). I’ve run my own database server.  I’ve installed a bug tracking system, and even extended it. I’ve set up web servers (apache and nginx).

And I’ll tell you what, if I never have to do any of these again, I’ll be happy as a clam. There are so many tools out there that let you outsource your infrastructure.  Often they start out free and end up charging when you reach a certain point.

By outsourcing the infrastructure to a service privder, you you let specialists focus on maintaining that infrastructure. They achieve scale that you’d be hard pressed to. They hire experts that you won’t be able to hire. They respond to vulnerabilities like it is their job (which it is).

Using one of these services also lets you punch above your weight. If you want, with AWS or GCP you can run your application in multiple data centers around the globe. With heroku, you can scale out during busy times, and you can scale in during slow times. With circleci or github or many of the other devtool offerings, you can have your ci/cd/source repository environment continually improved, without any effort on your part (besides paying the credit card bill).  Specialization wins.

What is the downside? You lose control–the ability to fine tune your infrastructure in ways that the service provider may not have thought of.  You have to conform to their view of the world.  You also may, depending on the service provider, have performance impacted.

At a certain scale, you may need that control and that performance.  But, you may or may not reach that scale.

It can be frustrating to have to workaround issues that, if you just had the appropriate level of access, you would be able to fix quickly.  It’s also frustrating having to come up to speed on the docs and environment that the service provider makes available.

That said, I try to remember all the other tasks that these services are taking off my plate, and the focus allowed on the unique business differentiators.


Leverage

As a software developer, and especially as a senior (expensive) software developer, you need leverage. Leverage makes you more productive.  I find it also makes the job more fun.

Some forms of leverage:

  • test suite. A suite provides leverage both by serving as a living form of documentation (allowing others to understand the code) and a regression suite so that changes to underlying code can be made with assurances that external code behavior don’t happen.
  • libraries and frameworks, like Rails. By solving common problems, libraries, open source or not, can accelerate the building of your product. Depending on the maturity of the library or framework, they may cover edge cases that you would have to discover via user feedback.
  • iaas solutions, like AWS EC2. By giving you IT infrastructure that you can manipulate via software, you can apply software engineering techniques to ensure validity of your infrastructure and make deployments replicable.
  • paas solutions, like Heroku. These may force your application to conform to certain limitations, but take a whole host of operations tasks off your plate (deployments, patching servers). When a new bug comes out affecting Nginx, you don’t have to spend time checking your servers–your provider does. When you reach a certain scale, thost limitations may come back to bite you, but in the early days of a project or company, having them off your plate allow you to focus on business logic.
  • saas solutions, like Google Apps or Delighted. You can have an entire business solution available for a monthly fee. These can be large in scope, like Google Apps, or small in scope, like Delighted, but either way they solve an entire business problem. You can trade time for money.
  • experience, aka the mistakes you’ve made on someone else’s dime. This allows you leverage by pruning the universe of possibilities for solving problems, based on what’s worked in the past. You don’t spend time doing exploration or spikes. Note that experience may guide you toward or away from any of the points of leverages mentioned above. And that experience needs to be tempered with learning, as the software world changes.
  • team. There’s only so much software you can write yourself. A team can help, both in terms of executing against a software design/architecture and improving it via their own experience.

Leverage allows you to be more productive and the more experienced you get, the more you should seek it.


Machine sympathy vs human constraint

I had beers with an work acquaintance recently. He’s a developer of a large system that helps contact management. Talk turned, as it so often does in these situations, to the automation of development work. We both were of the opinion that it was far far in the future. This was three whole decades of experience talking, right? And of course, we weren’t talking our book–ha ha. I’m sure that artisan weavers in the 1800s were positive that their bespoke designs and craftmanship would mean full employment no matter what kind of looms were developed.

But seriously, we each had an independent reason for thinking that software development would not be fully automated anytime soon.

My reason:

It’s very hard to fully think through all the edge cases of development. This includes failure states, exceptional conditions, and just plain human idiosyncrasies. Yes, this is what every system must do. That’s right. Anything you want handled by an automated system has two options: plan for every detail or bump exceptional cases up to human beings to make judgements. The former requires a lot of planning and exercising the system, while the latter slows the system down and introduces labor costs into the mix.

This system definition is hard to do and hard to automate. I’ve seen at least five new languages/IDEs/software platforms over the years that claimed to allow a normal human being to build such robust automatic systems, but they all seem to fail in the short term. I believe that is because normal human beings just don’t think through edge cases, but those edge cases are a key part of software.

His reason:

When systems reach a certain size, abstractions fail (I commented about this years and years ago). Different size, different failures. But just as an experienced car mechanic knows what kind of system failures are likely under what conditions, experienced software engineers, especially those who understand first principles, have insight into these failures. This intution (he called it “machine sympathy”) is something that can only be acquired by experience, and, by its very nature, can’t be automated. The systems are so complex and the layers so deep that every failure is likely to be unique in some manner.

So, which one is more likely to remain a relevant issue. It depends on the organization and system size. Moore’s law (and all the corollaries for other pieces of software systems) works both for and against machine sympathy. For, because, as hardware gets better, the chances of system breakdown decrease, and against, because as hardware gets better, larger and larger systems get more affordable. Whereas I believe the human constraint is ever present at all sizes of system (though less present in smaller ones where there is less concern about ‘bumping up’ issues to humans, or even just not handling edge cases at all).

What do you think?


Early Product Lessons

fence-238475_640I wanted to jot down some lessons I’ve learned being an early stage technical founder of an unfunded startup, from no product or revenue -> product and revenue. (Of The Food Corridor, if you’re interested in the startup.) I had the luxury of a co-founder who had spent years immersed in the problem space and months researching the niche. If you can find that, it really really helps in product development.

That said, here are some other lessons. For an idea of our timeline, we did a build or buy or both evaluation in March, started building in April, did beta testing in May and launched June 1.

 

Determine features through demand/pull, rather than push

Once you have a product that you can show users, show it to them!

It will be embarrassing.  Record all their feedback and note patterns (we did a month of beta testing, as noted above). Then, let the user requests pull features from you, rather than push features to them. This serves a couple of purposes:

  • people will know that you are hearing them, and will be more forgiving of inevitable issues
  • you will build features that people want to use
  • you’ll develop a sense of users needs
  • you’ll learn to politely say no to requests that are off base/only useful to one user

Everything is broken

Everything is borked, all the time. At an early stage startup you just don’t have time to do everything right (nor should you, because the wrong thing perfectly engineered is a waste). So there will be features that are half done, or edge cases unhandled, or undocumented build systems. Do the best you can, and realize that it gets better, but make your priority getting something out that users can give feedback from. “Usage is like oxygen for ideas.” – Matt Mullenweg

You have to walk a fine line between building something quickly and building something that you can build on later.  Get used to ambiguity and brokenness and apologizing to the customer.  (But not too used to the apologies!)

UX/UI polish is relative

Our app is a number of open source gems smashed together with some scaffolded ruby code.  The underlying framework had a decent look and feel, but there are definitely some UI and UX holes.  I thought I’d have to spend some time working on those, but our customers thought the product was beautiful and useful.  My standards were different than their standards.

That doesn’t mean that the app can look horrible, but a plain old bootstrap theme or one of the other common CSS themes is ok. You need to know your audience–many people are stuck using a mix of software and are used to navigating clunky user interfaces. If your interface is just decent, but still solves the problem, you’ll be OK.  Of course, you’ll want to solve gross UX issues eventually, but a startup is all about balance.  A friend of mine gave me the advice: “don’t allow your users to make any mistakes”.

Favor manual process for complex edge cases

There have been a couple of situations during the build where a lot of work was needed to handle an edge case. For example, prorating monthly plans. Once you start thinking about prorating in depth, it turns out to be a really interesting problem with a lot of edge cases. But guess what?  For your startup, edge cases can be a wild goose chase.

When an edge case rears its head, you should consider the following options (in preferential order).

  • can you outsource the complexity (Stripe handles proration, for example, and I guarantee you they handle edge cases you don’t).
  • can you make it a manual process? If it doesn’t happen that often and/or a real time response is unneeded, you can often get by with a manual solution. This may be partly automated, for example, an SQL query that generates an email to a human who can handle the exceptional situation.
  • if neither of the above apply, can you defer it? Maybe for a few months, maybe for just a few weeks. But sometimes requirements change and you learn things from users that may make this edge case less important.
  • if all of the above don’t apply, you may need to bite the bullet and write code.

Back end and front end development doesn’t have to be synchronized

Most users equate the front end with the complete product. Most developers know that, just like an iceberg, there’s a lot of back end processing hidden in any project. But guess what? When you are getting feedback from users, some of the backend processes need to work, but many don’t. For example, we had a billing system that handled monthly invoices. We didn’t need to build the billing system while we were getting feedback from users on what type of charges they needed to handle. We did, however, need to know that we could build it. So make sure you can build the backend system to support your front end system, perhaps by building one path through, but defer the full build-out until you have to.

What about you?  Any tips for early stage product engineering?



How to call shell scripts from java properly

shell photo

Photo by dd21207

This has caused me some grief over the past few weeks, so here’s some tips on calling bash scripts from java correctly.

Use ProcessBuilder. Please.

Make sure you set the redirectErrorStream to true, or make other provisions for handling stderr.

I’ve found that inheritIO is useful. Not calling this caused a failure in a bash script that called other java programs.

Make sure your shell scripts have are executable, or that you call them with ‘bash scriptname’. Tarballs preserve permissions, zipfiles do not.

Read the output of the shell script, even if you do nothing with it (but you probably want to log it, if nothing else). If you don’t, buffers fill up and the process will hang:

final InputStreamReader isr = new InputStreamReader(p.getInputStream());
final BufferedReader br = new BufferedReader(isr);
String line = null;
StringBuffer sb = new StringBuffer();
while ((line = br.readLine()) != null) {
sb.append(line).append("\n");
}

Check your exit values, and wait for your shell script to finish: (If looking for speed, or more complex process flow, write it in java.)


int exitValue = p.waitFor();
if (exitValue != 0) {
// error log
}



© Moore Consulting, 2003-2017 +