Masterless puppet and CloudFormation

I’ve had some experience with CloudFormation in the past, and recently gained some puppet expertise.  I thought it’d be great to combine the two, working on a new project to set up the ELK stack for a client.

Basically, we are creating an ec2 instance (or a number of them) from a vanilla image using a CloudFormation template, doing a small amount of initialization via the UserData section and then using puppet to configure them further.  However, puppet is used in a masterless context, where the intelligence (of knowing which machine should be configured which way) isn’t in the manifest file, but rather in the code that checks out the modules and manifests. Here’s a great example of a project set up to use masterless puppet.

Before I dive into more details, other solutions I looked at included:

  • doing all the machine setup in UserData
    • This is a bad idea because it forces you to set up and tear down machines each time you want to make a configuration change.  Leads to a longer development cycle, especially at first.  Plus bash is great for small configurations, but when you have dependencies and other complexities, the scripts can get hairy.
  • pulling a bash script from s3/github in UserData
    • puppet is made for configuration management and handles more complexity than a bash script.  I’ll admit, I used puppet with an eye to the future when we had more machines and more types of machines.  I suppose you could do the same with bash, but puppet handles more of typical CM tasks, including setting up cron jobs, making sure services run, and deriving dependencies between services, files and artifacts.
  • using a different CM tool, like ansible or chef
    • I was familiar with puppet.  I imagine the same solution would work with other CM tools.
  • using a puppet master
    • This presentation convinced me to avoid setting up a puppet master.  Cattle not pets.
  • using cloud-init instead of UserData for initial setup
    • I tried.  I couldn’t figure out cloud-init, even with this great post.  It’s been a few months, so I’m afraid I don’t even remember what the issue was, but I remember this solution not working for me.
  • create an instance/AMI with all software installed
    • puppet allows for more flexibility, is quicker to setup, and allows you to manage your configuration in a VCS rather than a pile of different AMIs.
  • use a container instead of AMIs
    • isn’t docker the answer to everything? I didn’t choose this because I was entirely new to containerization and didn’t want to take the risk.

Since I’ve already outlined how the solution works, let’s dive into details.

Here’s the UserData section of the CloudFormation template:


          "Fn::Base64": {
            "Fn::Join": [
              "",
              [
                "#!/bin/bash \n",
                "exec > /tmp/part-001.log 2>&1 \n",
                "date >> /etc/provisioned.date \n",
                "yum install puppet -y \n",
                "yum install git -y \n",
                "aws --region us-west-2 s3 cp s3://s3bucket/auth-files/id_rsa/root/.ssh/id_rsa && chmod 600 /root/.ssh/id_rsa \n",
                "# connect once to github, so we know the host \n",
                "ssh -T -oStrictHostKeyChecking=no git@github.com \n",
                "git clone git@github.com:client/repo.git \n",
                "puppet apply --modulepath repo/infra/puppet/modules pure-spider/infra/puppet/manifests/",
                { "Ref" : "Environment" },
                "/logstash.pp \n",
                "date >> /etc/provisioned.date\n"
              ]
            ]

So, we are using a bash script, but only for a little bit.  The second line (starting with exec) stores output into a logfile for debugging purposes.  We then store off the date and install puppet and git.  The aws command pulls down a private key stored in s3.  This instance has access to s3 because of an IAM setup elsewhere in the CloudFormation template–the access we have is read-only and the private key has already been added to our github repository.  Then we connect to github via ssh to ‘get to know the host’.  Then we clone the repository containing the infrastructure code.  Finally, we apply the manifest, which is partially determined by a parameter to the CloudFormation template.

This bash script will run on creation of the EC2 instance.  Once this script is solid, if you are testing adding additional puppet modules, you only have to do a git pull and puppet apply to add more functionality to the modules.  (Of course, at the end you should stand up and tear down via CloudFormation just to test end to end.)  You can also see how it’d be easy to have the logstash.conf file be a parameter to the CloudFormation template, which would let you store your configuration for web servers, database servers, etc, in puppet as well.

I’m happy with how flexible this solution is.  CloudFormation manages the machine creation as well as any other resources, puppet manages the software installed in those machines, and git allows you to maintain all that configuration in one place.


Making my Twitter feed richer with Zapier and hnrss

twitter photo

Photo by marek.sotak

I read Hacker News, a site for startups and technologies, and occasionally post as well.  A few months back, I realized that the items that I post to HN, I want to tweet as well.  While I could have whipped something up with the HN RSS feed and the Twitter API (would probably be easier than Twitversation), I decided to try to use Zapier (which I’ve loved for a while).  It was dead simple to set up a Zap reading from my HN RSS feed and posting to my Twitter feed.  Probably about 10 minutes of time, and now I doubled my posts to Twitter.

Of course, this misses out on one of the huge benefits of Twitter–the conversational nature of the app.  When my auto posts happen, I don’t have a chance to follow up, or to cc: the authors, etc.

However, the perfect is the enemy of the good, and I figured it was better to engage in Twitter haphazardly and imperfectly than not at all.


Good time had by all at the HN Meetup

Where can you talk about super-capacitors vs batteries, whether you should rewrite your app (hint, you shouldn’t), penetration testing of well known organizations, cost of living, and native vs cross platform mobile apps?  All while enjoying a cold drink and the best fried food the Dark Horse can offer?

At the Boulder Hacker News Meetup, that’s where.  We had our inaugural meetup today and had a good showing.  Developers, startup owners, FTEs, contractors, backend folks, front end devs and penetration testers all showed up, and, as the Meetup page suggests, ate, drank and chatted.

Hope to see you at the next one.



The power of SQL–WP edition

lightning photoI’ve noticed a lot of comment spam lately, and had been dealing with it piecemeal. I have all comments moderated, so it wasn’t affecting the user experience, but I was getting email reminders to moderate lots of comments that were not English and/or advertising something.

WordPress lets you turn off comments for a post, but it is tedious through the GUI. So, I dove into the wordpress database, and ran this SQL query:

update wp_posts set comment_status = 'closed' where post_date < '2014-01-01';

This sets comments to closed for all posts from 2013 or earlier (over 1000 posts).  The power of the command line is the ability to repeat yourself effortlessly.  (The drawback of the command line is its obscurity.)

I should probably write a cron job to do that every year, as posts older than two years old aren’t commented on much. To be honest, the comments section of ‘Dan Moore!’ isn’t used much at all–I know some blogs (AVC) have vibrant comments areas, but I never tried to foster a community.


Using basic authentication and Jetty realms to protect Apache Camel REST routes

So, just finished up looking at authentication mechanisms for a REST service that runs on Camel for a client.  I was using Camel 2.15.2 and the default Jetty component, which is Jetty 8.  I was ready to use Oauth2 until I read this post which makes the very valid point that unless you have difficulty exchanging a shared secret with your users, SSL + basic auth may be enough.  Since these APIs will be used by apps that are built for the same client, and there is no role based access, it seemed like SSL _ basic auth would be good enough.

This example is different from a lot of other Camel security posts, which are all about securing routes within camel.  This example secures the external facing REST routes defined using the REST DSL. I’m pretty sure the spring security example could be modified to protect REST DSL routes as well.

So, easy enough.  I looked at the Camel Jetty security configuration docs, and locking things down with basic auth seemed straightforward.  But, they were out of date.  Here’s an updated version.  (I’ve applied for the ability to edit the Camel website and will update the docs when I can.)

First off, secure the rest configuration in your camel context:

<camelContext ...>
   <restConfiguration component="jetty" ...>
      <endpointProperty key="handlers" value="securityHandler"></endpointProperty>
   </restConfiguration>
...
</camelContext>

Also, make sure you have the camel-jetty module imported your pom.xml (or build.gradle, etc).

Now, create the security context. The below is done in spring XML, but I’m sure it can be converted to Spring annotation configuration easily. This example uses a HashLoginService, which requires a properties file. I didn’t end up using it because this was POC code, but here’s how to configure the HashLoginService.

<!-- Jetty Security handling -->
<bean id="constraint" class="org.eclipse.jetty.util.security.Constraint"> <!-- this class's package changed -->
   <property name="name" value="BASIC"/>
   <property name="roles" value="XXXrolenameXXX"/>
   <property name="authenticate" value="true"/>
</bean>

<bean id="constraintMapping" class="org.eclipse.jetty.security.ConstraintMapping">
   <property name="constraint" ref="constraint"/>
   <property name="pathSpec" value="/*"/>
</bean>

<bean id="securityHandler" class="org.eclipse.jetty.security.ConstraintSecurityHandler">
   <property name="loginService">
      <bean class="org.eclipse.jetty.security.HashLoginService" />
   </property>
   <property name="authenticator">
      <bean class="org.eclipse.jetty.security.authentication.BasicAuthenticator"/>
   </property>
   <property name="constraintMappings">
      <list>
         <ref bean="constraintMapping"/>
      </list>
   </property>
</bean>

As mentioned above, I ended up writing a POC authentication class which had hardcoded values. This post was very helpful to me, as was looking through the jetty8 code on using grepcode.

public class HardcodedLoginService implements LoginService {
  
        // matches what is in the constraint object in the spring config
        private final String[] ACCESS_ROLE = new String[] { "rolename" };
        ...

	@Override
	public UserIdentity login(String username, Object creds) {
		
        UserIdentity user = null;
        
        
        // HERE IS THE HARDCODING
		boolean validUser = "ralph".equals(username) && "s3cr3t".equals(creds);
		if (validUser) {
			Credential credential = (creds instanceof Credential)?(Credential)creds:Credential.getCredential(creds.toString());

		    Principal userPrincipal = new MappedLoginService.KnownUser(username,credential);
		    Subject subject = new Subject();
		    subject.getPrincipals().add(userPrincipal);
		    subject.getPrivateCredentials().add(creds);
		    subject.setReadOnly();
		    user=identityService.newUserIdentity(subject,userPrincipal, ACCESS_ROLE);
		    users.put(user.getUserPrincipal().getName(), true);
		}

	    return (user != null) ? user : null;
	}

	...
}

How, when you retrieve your rest routes, you need to pass your username and password or the request will fail with a 401 error: curl -v --user ralph:s3cr3t http://localhost:8080/restroute

Note that the ACCESS_ROLE variable in the login class must match the roles property of the constraint object, or you’ll get this message:

Problem accessing /restroute. Reason:
!role

You can find a working example on github (with the hardcoded login provider).

Thanks to my client Katasi for letting me publish this work.


Gluecon 2015 takeaways

Is it too early to write a takeaway post before a conference is over? I hope not!

I’m definitely not trying to write an exhaustive overview of Gluecon 2015–for that, check out the agenda. For a flavor of the conversations, check out the twitter stream:


Here are some of my longer term takeaways:

  • Better not to try to attend every session. Make time to chat with random folks in the hallway, and to integrate other knowledge. I attended a bitcoin talk, then tried out the API. (I failed at it, but hey, it was fun to try.)
  • Talks on microservices were plentiful. Lots of challenges there, and the benefits were most clearly espoused by Adrian Cockroft: they make complexity explicit. But they aren’t a silver bullet and require a certain level of organizational and business model maturity before it makes sense.
  • Developer hiring is hard, and it will get worse before it gets better. Some solutions propose starting at the elementary school level with with tools like Scratch. I talked to a number of folks looking to hire, and at least one presenter mentioned that as well at the end of his talk. It’s not quite as bad as 2000 because the standards are still high, but I didn’t talk to anyone who said “we have all the developers we need”. Anecdata, indeed.
  • The Denver Boulder area is a small tech community–I had beers last night with two folks that were friends of friends, and both of them knew and were working with former colleagues of mine. Mind that when thinking of burning that bridge.

To conclude, I’m starting to see repeat folks at Gluecon and that’s exciting. It’s great to have such a thought provoking conference which looks at both the forest and the trees of large scale software development.


Five rules for troubleshooting an unfamiliar system

trouble photo

Photo by Ken and Nyetta

A few weeks ago, I engaged with a client who had a real issue.  They sold a variety of goods via a website (if this was the 90s, they would have been called an ‘e-tailer’), and had been receiving intermittent double orders through their ecommerce system.  Some customers were charged two times for one order.  This led, as you can imagine, to very unhappy customers.  This had been happening for a while and, unfortunately, due to some external obstacles, internal staff were not available to investigate the issue–they had their hands full with an existing higher priority project.

I was called in to see if I could solve this issue.  I had absolutely no familiarity with the system.  But in less than ten hours of time, I was able to find the issue and resolve it.  How I approached the situation can be summed up in five rules:

Number one: define the problem.  Ask questions, and capture the answers.  What is the exact undesired behavior?  When is the undesired behavior happening?  What seems to trigger it?  When did it start?  Were there any changes that happened recently?  Does the client have reproduction steps?

I gathered as much information as I could, but keep it high level.  I asked for architecture and system diagrams.  For the history of the application.  For access to all systems that could possibly be relevant (this will save you time in the future).  For locations of log files, source repositories, configuration files.  For database credentials and credentials for third party systems like CC processors.  It is important at this time to resist the temptation to dive in–at this point the job is to get a high level understanding so I can be efficient in the next steps.

You will get speculation about what the solution is when you are asking about the problem.  Feel free to capture that, but don’t be influenced by it.

Number two–find the finish line.  After getting a clear definition of the problem, I looked in the orders database and find out if the double orders were showing up there.  They were, which was a clue as to which part of the system was malfunctioning, but more importantly let me see the effectiveness of any changes I was making.  It also lets the customer know the objective end goal, which can be important if this is a t&m project, and it let me know the end state to which I was headed–important for morale.  (BTW, don’t do fixed bids for this type of project–overruns will be unpleasant, and there will be overruns.)

I was able to write a SQL script to find double orders over a given time frame.  I ended up writing a script which emailed the results of this query to myself and the client nightly, as an easy way to track progress.  The results of this query were a quantifiable, objective measure of the problem.

Number three–start where you are familiar.  I could have dove in and looked at the codebase, but due to my problem definition, I knew that there had been no changes to the checkout portion of the code base for years.  I also was unfamiliar with the particular software that managed the ecommerce site and could have wasted a lot of time getting up to speed on the control flow.  Instead, once I had the SQL query, I could find users that had been double charged, and look at their sessions in the web server logs.  I’ve been looking at apache http logs for over a decade and was very familiar with this piece of the system.

Number four–follow your nose. I followed a few of the user sessions using grep and noticed some weirdness in the logs.  There were an awful lot of messages that indicated the server had been restarted, and all the double orders I looked at had completed 5-6 seconds after the minute changed.  (It’s hard to define weirdness explicitly, which is why it behooved me to start with a portion of the system that I was experienced with–it made the “weirdness” more obvious.)  From here, I ended up looking at why or how the server was being restarted regularly.  Ended up finding an errant cron job which was restarting the server often enough that the ecommerce system was getting confused and double booking orders–once before the restart and once after.  This was easily fixed by commenting out the cron job.

Number five–know when to stop.  This ecommerce system obviously had a logic flaw–after all, restarting the web server shouldn’t cause an order to be entered twice, whether you restart it every hour or once a year.  I could have dug through the code to find that out.  But instead, I commented out the cron job, let the system run for a week or so and waited for more double orders.  There were none, indicating that the site was low traffic enough that whatever flaw was present didn’t get exercised often, if at all.  I confirmed with the client that this situation met his expectations of completeness, and called it good.

Being thrown into a new system, especially when troubleshooting, is a difficult task.  I am thankful the client was relatively responsive to my questions, and that pressure, while present, wasn’t intense.  These five steps should help you, if you are put in any troubleshooting situation.


Sometimes you just need a technical project manager

stacked blocks photo

Photo by A. Drauglis

As a developer, my skills are applicable across a wide variety of domains.  However,  I re-engaged recently with a prospective client that I wrote about a while ago. They had used another booking solution over the past few months and still had the same pain.  After thinking and doing some research, I can to a conclusion that hiring a developer was not the right answer for them.

This company had an interesting set of constraints:

  • Wedded to a platform (Shopify) because of previous investment and its excellent shopping experience.
  • The platform doesn’t provide all the functionality needed.
  • Users (both internal and customers) are ill served by another system to login and manage.
  • No third party plugins seem to meet the needs, either alone or in combination (at least, no solutions that I could find).  It’s apparently a fairly unique problem set.
  • They were not interested in solving the problem in incremental steps.
  • They had budget limitations (don’t we all).

In this situation, the best solution is to look for someone who can undertake the following steps:

  1. Write up a clear description of the problem. This doesn’t have to be a detailed requirements doc, but should be a clear explication of the issues, needs, timelines (if any), and current systems.
  2. Post the description to the shopify forums and contact the platform vendor directly, to see if anyone has encountered any of the same issues and ask how they solved them. The point of this isn’t to solve the problem, it’s to see if anyone else has solved pieces of the problem. This will help the company identify partners and/or adjust scope of the project. (If Shopify customer support says ‘whoa, we’ve never heard of this’, it’s a different size problem than if they say ‘well, you might want to bolt these three pieces together’.)
  3. After the client has more knowledge, send the requirements to Shopify focused dev shops (and possibly Elance contractors who work with Shopify, but not just Shopify themes). Work with at least two to three of them to see what a solution would cost, either custom or building on their current code. At this time, avoid getting development quotes from anyone who doesn’t have experience with Shopify development (like me!), simply because the integration with the platform is so critical.
  4. Evaluate the results of the RFP process, including following up any avenues that the experts or forums turn up. Consider whether the budget allows for a comprehensive solution or whether it makes more sense to look at point solutions for the high pain areas.
  5. If the results point to a comprehensive solution being within budget, engage the solution provider.  If not, identify the high pain areas and go back to step 1 with the smaller scope.

So, what this client really needs is a technical project manager (TPM). While most freelancers have this skill to some degree, as it is hard to survive without it, making a full time living as a contract project manager is difficult. I know of only one person in 15 years who was making a living as a contract project manager (and she’s not doing it anymore). This particular project doesn’t seem like a full time effort, so the client should be able to get by with a moonlighter, at least until the results of step 4 are known.

Good technical project managers are hard to find. From a friend’s company’s job req (his company who is looking for a TPM, and the job desc captures the description of the skill set well), they have:

balance between hands-on technical knowledge, a ravenous appetite for order, and an understanding of humans and how they work.

Developers (or former developers) who want to project manage and have people skills aren’t quite the unicorn as someone who can both design and develop, but they are almost as rare.  Unless the company can find a contract project manager who has technical chops, they’ll want to either source this internally or find an external contractor with another skill set (developer or designer) who wants to PM this project.

This is a great chance to for an employee to expand their skill set. Based on my experience managing vendors during my time at 8z (a 4 month website relaunch and a longer term, less time intensive data provisioning engagement), I would advise that this employee have technical skills.  Challenging custom solution providers on technical grounds, or at least being able to follow along, ensures the company will get the best solution. This doesn’t work for “take it or leave it” SaaS apps, but this project appears custom enough that the TPM really needs to have both business and technical considerations in mind when managing the development shop. If they are looking for an outside contractor, engaging with a designer or developer who has PM experience would be an option.

As far as stretching budget, proposing a lower price to the development shop in exchange for shared ownership of the code might make sense. The partner could market this code to other clients and recoup some costs, while the client retains a perpetual license.

Sometimes, a generic developer isn’t the right answer for a software development problem. A technical project manager (or someone wearing that hat) can often stretch budget and leverage skill sets of focused development teams.




© Moore Consulting, 2003-2015 +