Trust the compiler

I loved this post not because I love reading assembler but because it just illustrates so perfectly how often, when writing software, we can easily stand on the shoulders of giants.

My point is not that we should take what we’ve learnt from the LLVM-generated code and write a new version of our hand-rolled assembly. The point is that optimising compilers are really good. There are very smart people working on them, and computers are really good at this kind of optimisation problem (in the mathematical sense) in a way that humans find quite difficult. It’s the job of language designers to give us the tools we need to tell the optimiser, as best we can, what our true intent is, and larger integer sizes are another step towards that.


Who’s Afraid of Continuous Deployment?

So, who’s afraid of continuous deployment? I am, for one. And I’m not alone. I taught hundreds of people in AWS courses over the past two years. We often discussed continuous delivery and deployment and I asked if this was practiced at their places of work. I’d say about 5-10% of folks said yes. I conducted a very informal survey across two technical slacks as well. Unfortunately I had my terms wrong and asked about continuous delivery:

Wanted to do a quick poll. Can you please give a thumbs up to this message if you or your team does continuous delivery of your software product, and a thumbs down if you don’t. And a :penguin: if it doesn’t apply?

The results were:

  • Did CD: 27
  • Did not do CD: 25
  • Does not apply: 3

In the poll, I defined continuous delivery as “if a change is merged to the mainline branch and passes all the tests, it is deployed to production (or whatever environment your customers see) without human involvement”. This was actually a source of discussion, as some folks were very close to this (they deployed to beta environments where only a few customers saw it, or required one human to push a button to actually release, but everything up to that point was automated). Also, someone shared this link about the difference between continuous delivery and continuous deployment. Turns out I was using the term continuous delivery incorrectly. What I defined as continuous delivery was actually continuous deployment. Whoops!

That said, it was interesting that a large number of folks, almost half, did not deploy code automatically. (Note that I believe the poll was biased, because I asked in the #devops channel of one Slack; in the other Slack, fewer than half did continuous deployment.) I’ve worked at a number of small startups, some without paying customers, and I’ve never worked in a place with continuous deployment. I’ve been in jobs with continuous integration and continuous delivery (and this provides a lot of value), but not continuous deployment. I wanted to talk about some reasons why.

The first reason is that continuous deployment simply doesn’t apply. If you are building software that is deployed to customer sites (on-prem), or is tied to hardware, then it doesn’t make sense to work toward CD because there will always be a manual delivery component. Another reason it might not apply is legal compliance. Folks in the Slacks pointed out that in some regulatory regimes you are legally required to have a human ‘push a button’ to deploy, because more than one person needs to be involved in a code deploy to satisfy the law and the auditors. These are totally legitimate reasons for not doing continuous deployment.

Next, let’s discuss the reasons based on fear or a lack of software hygiene (automated tests or a robust type system). Before I step into this, I want to acknowledge that there may be times in the life of your business when such software hygiene is detrimental to your chances of survival: you need to get an MVP out and test your value in the market, for example. However, in my experience, proper software hygiene is far easier to maintain if adhered to from the beginning. If you don’t, the difficulty of changing the system will eventually grow along with its complexity. You can bolt on testing later, but it is difficult.

I also want to emphasize that I’ve been in all these situations myself. In some ways this blog post is a warning for future me when I try to shirk these practices.

  • If you don’t have automated test coverage, continuous deployment is reckless. This often happens in systems where the testing was bolted on after the system had been developed for a while. The solution is to work towards having enough test coverage to give yourself confidence (it swaddles your code).
  • A system may have configuration deeply tied to a database. Many content management systems are in this boat, which makes it very difficult to roll new configuration forward automatically.
  • Not having an automated rollback strategy. If you are going to continuously deploy, you need a way to roll back with confidence, ideally with one script. If you are on Heroku, Heroku rollbacks help here. If you are running Rails code, you can use db:rollback, but you’ll need to know how many steps to roll back (I couldn’t find anything that rolled all migrations back to a given timestamp), and you’ll want to be careful about losing data. It may make more sense to run migrations in a separate release and always keep the code backward compatible. There’s lots of interesting reading about that strategy in the strong_migrations docs. This solution will vary from application to application.
  • Not having enough users to safely canary. One way to know if your new release has problems is to do a blue/green deployment and send just a fraction of your traffic there (you could use a weighted DNS round robin solution; see the sketch after this list). But if you only have a small number of users, the canary userbase won’t adequately exercise all the code paths.
  • Fear of breaking key user flows. At a recent company we did basic manual regression tests just before deployment. These could have been easily automated via Selenium and would have made sure that at least basic functionality was available. Also see this post from 2013 on smoke testing.
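
To make the canary idea concrete: the bullet above mentions weighted DNS, but here’s a minimal sketch of the same idea at the application layer in Go, routing a small slice of traffic to the new release. The blue/green hostnames and the 5% weight are invented for illustration; this is a sketch, not a production-ready proxy.

```go
// A toy weighted canary: send roughly 5% of traffic to the new ("green")
// release and the rest to the stable ("blue") one. Hostnames and the weight
// are made up for illustration.
package main

import (
	"log"
	"math/rand"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	blue, _ := url.Parse("http://blue.internal:8080")   // current release
	green, _ := url.Parse("http://green.internal:8080") // canary release

	blueProxy := httputil.NewSingleHostReverseProxy(blue)
	greenProxy := httputil.NewSingleHostReverseProxy(green)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if rand.Float64() < 0.05 { // ~5% of requests hit the canary
			greenProxy.ServeHTTP(w, r)
			return
		}
		blueProxy.ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":80", nil))
}
```

The catch, as the bullet above notes, is that with only a handful of users, 5% of traffic is a handful of requests, so the canary never exercises enough code paths to tell you much.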

None of these are really technical issues; they’re prioritization issues. At this point in time most web applications can be continuously deployed. The tooling and the knowledge are out there, given the commitment of the business and technology teams.

However, this in some ways sidesteps the real question. Why is continuous deployment a goal worth prioritizing, especially when the team has to spend time supporting it instead of giving customers more features? CD is extra work to set up, but once it is running you can deliver features at a very rapid pace, and you never have a feature sitting around waiting for other, orthogonal features. So, in a way, it will actually lead to more features and better development. There are also the long-term benefits of software hygiene for the system’s ability to evolve.


The Original Sin Of A Software System

I had coffee with an acquaintance a while back who works in a software company. We were talking about their system and he referred to the “original sin”.

It immediately struck a nerve.

Whenever I (the “I” being a team, as it’s extremely rare to build anything by yourself) build a system, I make assumptions based on a number of factors: what I need to provide functionality immediately, where I expect the system to go in the short term, where I expect the system to go in the long term, what I need to do to make the system maintainable, and what I need to do to make sure I can run the system.

All of these assumptions mix together into an architecture and a design. Similar to the way I imagine one would lay out a subdivision, the early choices (where to put roads and lots) affect later decisions (where trees, parks, and homes go, which homes get the best view).

Often, one or more of these assumptions is incorrect, and it is often one of the core decisions, rather than one of the ancillary ones. (Or, perhaps they both have an equal chance to be incorrect, but if it is a core assumption, I notice it more.) This is the “original sin”.

I can think back to all of the systems that I ran for any length of time in production (that is, with real customers) and name an original sin (at least one, oftentimes more).

What do you do with this mistake?

You can live with it

The mistake is embedded so deeply, or the business is evolving so quickly, or there are so many other features to ship, that it never makes sense to rectify the issue. It just sits in your system, bugging you when you see it, making new features more complicated to build.

You can rip it out

If you have the time and now know what you didn’t know when you first built the system, you can modify the system to remove the original sin. This can take time and effort, because it is embedded and probably touches many pieces of your system. It also may slow down or stop new development. When you are done, you’ll have a system that is better for the world your business is living in now. However, you may add a new original sin.

You can work around it

Instead of living with it (no change to it) or ripping it out (pulling it out at the roots), you can work around it. I’ve done this before by swapping out pieces of an original sin component. This is an iterative approach that can be intermingled with feature development. This is the most pragmatic approach for long-lived software.

You can abandon the system

If the system has a really fundamental original sin, and the technology world has moved on, you may consider abandoning the system, either for a new build or a new buy. Don’t think of the money you spent on the first system as wasted, think of it as tuition.

It is tough to make predictions about the future usage of your system. Building a system is a balance between getting it done and making it flexible enough to meet future needs. However, some decisions are so rooted in your system that they will have ramifications for as long as those systems are used. And some of the decisions you make will be wrong, and you’ll be confronted with that fact as long as you work on that system. It’s OK to make these mistakes, but when you see one, think long and hard about how best to rectify it, balancing development and business pain. Discuss it with your team, including non-technical members, so that everyone understands the pain.


Zen of Code Review Series

I stumbled (thanks John!) on this post about the right way to review code. This is a key skill for working in a team and one that is underappreciated. A solid code review makes sure the code changes can be understood.

If code can’t be understood, it can’t be changed. If code can’t be changed, business processes can’t be changed (code is, after all, just business process made digital). If business processes can’t be changed, the organization can’t adapt to new opportunities or threats. Have you ever seen a tree grow around a fence? The fence constrains the tree, but the tree keeps trying its best to work around the fence. Neither ends up serving its inner purpose. That’s what I think of when I see code that can’t be changed.

Here is a brief excerpt to give you a flavor of the post.

Your goal, then, is clear: question, probe, analyze, poke, and prod to make sure that you, the reviewer, could support the code presented to you for review. From an overall perspective, there are several questions to keep in mind as you begin your task…

It seems like this is a series in which the author discusses code reviews from a variety of perspectives. Recommended.


“Choose boring technology”

This article on choosing boring technology is so good. It’s from 2015, so some of the tech it references has matured since then, but the thesis holds: the technology underlying a business should be well known and boring to implement. It also discusses how important cohesive technology can be (I love the final footnote).

I think this is an interesting contrast to microservices, which essentially solve repeatedly for local optimizations and then rely on the API or message boundary to allow scaling. Microservices discussions, in my experience, often get a bit hand-wavy when it comes to operationalization (“just put it on Kubernetes, it’ll be fine”).

I have definitely dealt with this in previous companies where I had to put the kibosh on new technologies being introduced (“nope, we’re going to do it in java”). This meant that as a small team we had a standard deployment stack and process and could leverage our logging knowledge and debugging tools.

Here are some arguments I hear for using the latest and greatest technologies:

  • “Our developers will get bored and we’ll have a hard time recruiting”
    • This is a great reason to have regular hackfests and/or to support developers working on open source or side projects.
  • “New tech will help us do things faster”
    • It’s definitely worth evaluating new technology periodically and adding it to your stack (especially if it is outsourced, a la RDS, or boring, a la memcached). As the original post mentions, if you get substantial benefits from the new technology, then use it full bore and plan for a migration. Don’t end up in a state where you are half in/half out. Also consider tooling and processes to get things done faster–these may have quicker iteration times with lower operational risk.
  • “Choose the right tool for the job”
    • Most Turing-complete languages can be used for any purpose. And you shouldn’t be optimizing for the right tool for the job, but rather for the right tool for the business, for the job.

Really, you should just go read the post. It’s good.


“Run less software”

This post from the folks at Intercom makes some really good points about the benefits you can get from leveraging other software solutions. It’s a worthwhile read, though note that the author works at a company that offers a SaaS solution to help you help your customers (I’m a fan). You can read some interesting discussion at HN. A quote from the post:

Choosing standard technologies is very similar to this, only in software. You need to constrain yourself to solving problems mostly, but not exclusively, with a small, specific set of standard tools. By doing this over time we become experts in them. Then, we are able to build better and faster solutions from them.

From my experience with small teams, having a standard software stack is a big win. It lets you reuse business logic and skills across your business. (The spread of microservices may mean this matters less in the future, though with a small team and product microservices may be overkill.)

However, I wish Rich had titled it “write less software” because that is really what he is advocating for. When I use AWS or Intercom or Gmail, I’m not really running less software. In fact, I may be running more. Since I don’t have insight into those applications, I really don’t know. But I’m not concerned about writing, reading and maintaining as much software, and that is the true win.


Book Review: Working With Coders

Software is so integral to business processes and relatively inexpensive compared to labor that I believe every company is going to be a custom software company, in the same way that every company is an accounting company or every company uses paper. I happened on an interesting blog post and saw the author had written a book, “Working With Coders”. How non-technical folks interact with coders is a topic of perennial interest to me, so I picked it up after reading the first few pages on Amazon. The book is written for clients, CEOs or project managers who are going to be working with developers to deliver applications that will provide business value.

Frankly, I couldn’t put it down.

The author, Patrick, is an engaging, opinionated writer. He breaks down complicated concepts into easily digestible pieces. Where there’s more to the story, there’s a footnote with a snarky comment or a link to more information. Patrick also provides nuts and bolts examples to show why something that seems simple to change is not (scaling text in a browser, for example). He also covers how big decisions like language, frameworks and library choices at the beginning of a project constrain freedom and choices further down.

Patrick covers what developers do, how they think, and why projects often fail. I thought his explanation of the benefits of agile development was darn good, and his explanation that even agile projects fail more often than they succeed was pretty depressing. He also discusses how the house construction metaphor for building software is just a big fat untruth.

I also enjoyed the section about testing in general, the various types of testing, and where they make sense. There’s also a section on finding coders, including a good explanation of why not to hire them as employees (you might be better off just hiring a development shop, depending on your needs). The chapter on how to deal with common issues (“the team hates each other”, “we’re behind schedule”) was worthwhile. His solutions won’t work for everyone. Maybe you’ll want to deal with these issues differently, but considering them before they happen will only help you prepare.

Of course, I also enjoyed the chapter on how to keep coders happy (continuous learning, quiet, a fast computer). In general the author is careful to avoid stereotypes, but does do a good job of covering common themes. I haven’t met too many developers who love working in bullpen environments.

I am definitely not the target audience. Neither is someone who is an experienced manager of developers. However, I am a subject of the book, so it resonated with me and I definitely found myself nodding along. There aren’t too many books I have wanted to distribute copies of (the two others are “The Hard Thing About Hard Things” and “Climate Wars”), but this is one.

If you work in a consulting practice with inexperienced clients or if you work in a product company with an owner or higher up that isn’t technical, reading this book will give you insights into their questions and thought processes. And if you can find a way to give them this book without being condescending (“hey, I found this book fascinating for helping facilitate conversation, maybe you will too”), both they and you will benefit.


Fun with golang

(Pictured: a prairie dog. Yes, it’s not a gopher, but they are related.)

I’m writing a small piece of work in golang. It’s been fun to learn a new language. Initial thoughts:

  • I love the strict compiler. It saved me from making some dumb mistakes. It’s annoying at times (when you are cleaning up after println debugging you have to remove the log import statement).
  • I love that go fmt is built right into the standard tooling. No more formatting wars.
  • The standard library has some testing support, but it’s pretty primitive (see the sketch after this list).
  • It was weird that there’s no keyword to mark methods as private; you lowercase the name instead.
  • I did a small bit of concurrency work, and Go seems like a great fit for that.
  • The docs are great. I spent a lot of time on the Tour and the API docs.
  • The name is too general, but when googling, I was able to search for ‘golang’ and retrieve good results.
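
For the curious, here’s a rough sketch of what that built-in testing support looks like. Table-driven tests like this are a community convention rather than a framework feature, and the greet function and its cases are invented for illustration. Save it in a file ending in _test.go and run go test.

```go
// greeting_test.go: a toy example of the standard library's testing package.
// The greet function and its cases are made up for illustration.
package greeting

import "testing"

func greet(name string) string {
	if name == "" {
		return "Hello, world"
	}
	return "Hello, " + name
}

func TestGreet(t *testing.T) {
	cases := []struct {
		in, want string
	}{
		{"", "Hello, world"},
		{"Gopher", "Hello, Gopher"},
	}
	for _, c := range cases {
		if got := greet(c.in); got != c.want {
			t.Errorf("greet(%q) = %q, want %q", c.in, got, c.want)
		}
	}
}
```

The “primitive” part is that assertions, mocks, and the like are left to you (or to third-party libraries); the testing package itself mostly gives you t.Errorf and friends.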

I have a few friends that rave about go. I can see why.


The joy of removing code

I am working on a project right now where my main task is to remove code. “What? I thought developers were supposed to add code, not remove it?” Well, in some cases removing code actually can help a project. Some reasons to remove code: if it doesn’t serve any purpose, if it isn’t executed, if it handles an edge case that never happens, or if it has been superseded.

Removing code is easier with automated tests, but I still find myself using a combination of automated tests, manual testing, and the find command. (This codebase is Ruby; if I were using Java, I’d lean on the type system.) It’s painstaking work, but it will be worth it in the end.

My steps for removing code:

  • Start with a plan and an end goal in mind. Otherwise it can get overwhelming if you are dealing with a system of any size.
  • Create a feature branch
  • Identify one piece of code you’re going to either remove or keep
  • Make sure you know where it is called (see the sketch after this list). If it isn’t called anywhere, remove it.
  • Remove any tests and associated functionality (views, helpers, etc).
  • Examine any state managed by the code for removal (database tables, etc)
    • Oftentimes I’ll just note that this should be removed in a few months, and focus on hiding the UX. It’s a lot easier to resurrect UX if you make a mistake than it is to restore a dropped database table.
  • Commit the changes with a good commit message.
  • Fan out from there and see if you need to add anything to your list.
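
To illustrate the “where is it called” step: on a Ruby codebase I reach for find and grep, but here’s the same idea as a small Go sketch that walks a source tree and prints every line mentioning an identifier. The ./app directory and the LegacyInvoiceHelper name are hypothetical.

```go
// Walk a source tree and print every file/line that mentions an identifier,
// as a first pass at answering "where is this called?". The directory and
// identifier below are hypothetical.
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	needle := "LegacyInvoiceHelper" // hypothetical code slated for removal

	filepath.Walk("./app", func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return nil // skip directories and unreadable entries
		}
		f, err := os.Open(path)
		if err != nil {
			return nil
		}
		defer f.Close()

		scanner := bufio.NewScanner(f)
		line := 0
		for scanner.Scan() {
			line++
			if strings.Contains(scanner.Text(), needle) {
				fmt.Printf("%s:%d: %s\n", path, line, scanner.Text())
			}
		}
		return nil
	})
}
```

No hits (once you’ve also checked for metaprogrammed call sites, which a plain text search will miss) is your signal that the code is safe to remove.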

Take your time, don’t rush it.

Code that never runs can’t have any bugs, and is super fast. Think about removing some code today.


CircleCI shutting down version 1.0

I’ve been a happy user of CircleCI at multiple companies. Right now we pay them at The Food Corridor and they handle almost all our deployments. (I still deploy to production manually.)

We just got a note that they are shutting down their 1.0 offering and will not support it as of Aug 2018. The 2.0 offering was announced in 2016 and generally available in 2017. So, there will be about one year of overlap. Not too long.

I understand that desire to move forward. Trust me, I do.

I don’t know how much engineering effort it takes to support the two versions, but my guess is that they’ll see some significant customer loss from this. Why? CI is something that you just want to work. You don’t want to think about it. Which is why a SaaS solution makes so much sense. I am happy to just keep paying them month after month for their excellent product.

But, if I have to take some cycles to move from CircleCI 1.0 to CircleCI 2.0, why wouldn’t I take some time and evaluate other solutions too? I assume they’ve run the numbers and the amount of money it takes to support 1.0 must be more than the amount they will lose via churn.

AWS does a good job of avoiding this problem: they never deprecate anything (you can still set up SimpleDB if you want). They just hide it, make other offerings better, and make older offerings more expensive.

In fact, if I were CircleCI, I might offer a ‘legacy’ CircleCI 1.0 plan, where people with significant investments in the older infrastructure can pay more for access to that old codebase. Depending on the amount of support required, that might be some significant free money.

Relatedly, Amy Hoy has a great post on how to get your customers to pay you more money.



© Moore Consulting, 2003-2017