
Thoughts on Amazon CloudFormation

Photo by eschipul

I recently set up Amazon CloudFormation for a fairly complicated application in AWS.  For those unfamiliar with this service, it allows you to specify a number of AWS resources declaratively in a JSON document, create them all at once (the result is called a ‘stack’), manage them as one entity, and destroy them.  You are billed just as you would be if you created the resources by hand, but the template gives you a versionable, replicable way to create them.
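To make the idea of a declarative template concrete, here is a minimal sketch of a stack containing a single SQS queue, built as a Python dict and serialized to the JSON that CloudFormation reads.  The logical name `WorkQueue` and the timeout value are illustrative assumptions, not taken from the application described here.

```python
import json

# Minimal template sketch: one SQS queue.
# "WorkQueue" and the 120 second timeout are illustrative assumptions.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative stack with a single queue",
    "Resources": {
        "WorkQueue": {
            "Type": "AWS::SQS::Queue",
            "Properties": {"VisibilityTimeout": 120},
        }
    },
}

# This JSON document is what you would hand to CloudFormation.
template_json = json.dumps(template, indent=2)
print(template_json)
```

Everything listed under `Resources` is created, managed and deleted together as one stack.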

The distributed application for which I was creating the stack had the following components:

  • queues (SQS)
  • databases (DynamoDB, including secondary indices)
  • compute (EC2)
  • alarms (CloudWatch)
  • storage (S3)
  • a VPC and Subnets
  • event logging (Kinesis)
  • Hadoop (Elastic MapReduce)

The last four items were not configured by the CloudFormation template I wrote.  I left out S3, the VPC and the subnets because I leveraged existing resources, and Kinesis and EMR because they are not supported by CloudFormation.  (Kinesis has some support, but CloudFormation doesn’t allow you to specify the name of a stream, which makes it pretty useless when you want to post to or read from a specific stream.)  While it would be preferable to have everything specified in CloudFormation, partial stack creation was still useful (I just documented the other requirements in the CloudFormation template) because:

  • resource configuration like queue timeouts, names, read throughput, etc. can be applied uniformly, so consistency is enforced.
  • the infrastructure is defined and documented in one place, allowing a new developer to get up to speed quickly.
  • tags can be applied uniformly.
  • CloudFormation supports parameters, so that you can prefix every resource name with a deployment-environment-specific variable (‘stage’, ‘dan-dev’, etc.), or have different DynamoDB throughput for different deployments.
  • if different configuration needs to be tested, you can stand up a new stack in minutes and test it.
  • the template can be stored in your version control system, allowing someone to see how things changed over time.  Yay, commit logs!
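As a rough sketch of the parameter idea above, the template below declares an `Environment` parameter and uses the `Ref` and `Fn::Join` intrinsic functions to prefix a queue name with it.  The parameter names, the queue name and the `prefixed` helper are hypothetical, and a queue timeout stands in for the DynamoDB throughput mentioned above to keep the example short.

```python
import json


def prefixed(name):
    """Return a CloudFormation expression that prepends the Environment parameter."""
    return {"Fn::Join": ["-", [{"Ref": "Environment"}, name]]}


# Parameter and resource names are illustrative assumptions.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        # Deployment name such as "stage" or "dan-dev".
        "Environment": {"Type": "String", "Default": "stage"},
        # A per-deployment setting; overridden at stack creation time.
        "QueueTimeout": {"Type": "Number", "Default": 120},
    },
    "Resources": {
        "WorkQueue": {
            "Type": "AWS::SQS::Queue",
            "Properties": {
                "QueueName": prefixed("work-queue"),
                "VisibilityTimeout": {"Ref": "QueueTimeout"},
            },
        }
    },
}

print(json.dumps(template, indent=2))
```

Creating the stack twice with different values for `Environment` would give two independent sets of resources from the same versioned template.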

There were some other possible benefits I just didn’t have time to explore fully before the project wound down.

  • autoscaling groups seemed like they’d be extremely useful.  These aren’t a CloudFormation-only tool, but CloudFormation seems an ideal way to define and use them.
  • the ability to create and delete stacks opened up the possibility of creating developer-specific environments for debugging issues.

If you are going to start with CloudFormation, I highly recommend setting up an initial environment by hand, and then running CloudFormer, a small application written by Amazon which reads your existing AWS infrastructure and generates a CloudFormation template.  I used CloudFormer to create a template for everything in our AWS account, and then picked and chose what to pull over to the new template.  There were a few issues with this, though:

  • There was a bug in the CloudFormation documentation for DynamoDB schemas.  You want to use this syntax: "KeySchema": { "HashKeyElement": { "AttributeName": "attrname", "AttributeType": "S" }, ... }.  CloudFormer generated them correctly, however.
  • CloudFormer coerces the names of some resources, including VPCs and subnets, to strings, and I had to back those out when I wanted to use existing resources.
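One way to point a template at existing resources, rather than having the stack create them, is to pass their IDs in as parameters.  This is a sketch of that approach and not necessarily how the template described here did it; the parameter names, the `AppServer` logical name and the instance type are assumptions.

```python
import json

# Sketch: an EC2 instance launched into a subnet that already exists.
# ExistingSubnetId, AmiId and AppServer are hypothetical names.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        # The caller supplies the ID of a subnet created outside this stack.
        "ExistingSubnetId": {"Type": "AWS::EC2::Subnet::Id"},
        "AmiId": {"Type": "AWS::EC2::Image::Id"},
    },
    "Resources": {
        "AppServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "AmiId"},
                "InstanceType": "t2.micro",  # illustrative choice
                "SubnetId": {"Ref": "ExistingSubnetId"},
            },
        }
    },
}

print(json.dumps(template, indent=2))
```

Because the subnet is only referenced, deleting the stack leaves the subnet and its VPC alone.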

Other than not being able to fully define an application (because of dependencies on unsupported AWS tools like Kinesis and EMR), what other downsides does CloudFormation have?

  • it locks you into AWS.  OpenStack Heat is an alternative that works across clouds, or so I read.  And, really, once you decide on AWS, is an infrastructure creation script going to be the one thing that keeps you from moving?
  • it is tied to infrastructure creation (though there is resource-by-resource support for in-place updates).  If you want to modify one queue setting, you have to tear down and recreate the entire stack.  I found this to be relatively quick (15 min or so).
  • you are still writing scripts in the UserData section of the EC2 definition to set up your server environment.
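To illustrate the UserData point, here is a sketch of an EC2 resource whose bootstrap script is embedded in the template with `Fn::Base64` and `Fn::Join`.  The script contents, the AMI ID placeholder and the `AppServer` name are made up for illustration.

```python
import json

# Shell commands run on first boot; the contents are illustrative only.
bootstrap_lines = [
    "#!/bin/bash\n",
    "yum update -y\n",
    "echo deployed-by-cloudformation > /tmp/stack-marker\n",
]

instance = {
    "Type": "AWS::EC2::Instance",
    "Properties": {
        "ImageId": "ami-00000000",  # placeholder, not a real AMI
        "InstanceType": "t2.micro",  # illustrative choice
        # Fn::Join concatenates the lines; Fn::Base64 encodes the result,
        # which is the form the UserData property expects.
        "UserData": {"Fn::Base64": {"Fn::Join": ["", bootstrap_lines]}},
    },
}

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {"AppServer": instance},
}

print(json.dumps(template, indent=2))
```

The template describes the server, but what happens inside it is still an ordinary shell script that you write and maintain yourself.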

After this experience, and reviewing my thoughts above, I believe the sweet spot of CloudFormation is setting up dev and QA environments quickly, and documenting infrastructure choices when you are committed to AWS.