
Annotation driven Spring configuration vs XML driven Spring configuration

 julius caesar photo
Photo by Internet Archive Book Images

First off, I have to admit that I am not an XML hater.  Sure, it’s not the prettiest structured data language, but it seems to do the job just fine, especially in environments where you aren’t concerned about byte size (like, say, when you are configuring a software system).  Plus, it allows comments, which some alternatives ([cough] JSON) don’t.

But I come to contrast two types of Spring configuration, not to bury XML or @Annotations.  I’ve been working on a project that uses Spring annotations.  I evaluated Guice, but it didn’t have lifecycle management, and we needed that.  I also evaluated PicoContainer (I know, blast from the past, right?) but it didn’t seem functional and the web presence was a mess.  In the past, I’ve used XML based configuration of Spring.  I thought it would be useful for me to capture a few of the differences.

First of all though, they are fairly similar.  Both use the standard Spring workflow of “create a number of components, pass them to other components, pass those to other components, occasionally use some of the extensive Spring library in a way that appears magical at times, create a few more components, and string them all together to build a system that does what you want”, whether “what you want” is responding to web requests, modifying databases, pulling data off of a queue for processing, or something entirely different.

For annotation based configuration, you write the configuration in classes that are tagged with a @Configuration annotation. These configuration classes find components by scanning packages: by default the configuration class’s own package and its subpackages, or whichever packages you name in the @ComponentScan annotation. Because the configuration is plain Java, you get all the benefits of your IDE and compiler:

  • You get code completion.
  • The compiler checks your types so that you don’t have runtime exceptions if you passed a Foo bean to a class that was expecting a Foobar bean.
  • You can do logic in the configuration class to load different components based on anything (a build-time constant, a file, a database, an external service, etc).
  • You can refactor bean definitions and know that they’re changed everywhere.

All this power comes with the risk of complexity. If you read from a database table to know what kind of component implements a certain interface, that can be, shall we say, less than obvious.
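
To make those benefits and risks concrete, here is a minimal sketch of an annotation based configuration class. It assumes spring-context is on the classpath, and every name in it (AppConfig, Foo, Foobar, MessageSink, the sink.kind system property) is hypothetical and invented for illustration, not taken from the project described above.

```java
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

public class AnnotationConfigSketch {

    // Hypothetical components, for illustration only.
    public static class Foo {}

    public static class Foobar {
        private final Foo foo;
        public Foobar(Foo foo) { this.foo = foo; }
    }

    public interface MessageSink { String name(); }

    public static class ConsoleSink implements MessageSink {
        public String name() { return "console"; }
    }

    public static class QueueSink implements MessageSink {
        public String name() { return "queue"; }
    }

    @Configuration
    public static class AppConfig {

        @Bean
        public Foo foo() {
            return new Foo();
        }

        // The compiler checks this wiring: handing Foobar anything
        // other than a Foo fails at build time, not at startup.
        @Bean
        public Foobar foobar(Foo foo) {
            return new Foobar(foo);
        }

        // Ordinary Java logic chooses the implementation. Here it is a
        // system property (an assumed name); swap in a file or database
        // lookup and the choice becomes much less obvious to a reader.
        @Bean
        public MessageSink messageSink() {
            String kind = System.getProperty("sink.kind", "console");
            return "queue".equals(kind) ? new QueueSink() : new ConsoleSink();
        }
    }

    public static void main(String[] args) {
        try (AnnotationConfigApplicationContext ctx =
                 new AnnotationConfigApplicationContext(AppConfig.class)) {
            System.out.println(ctx.getBean(MessageSink.class).name());
        }
    }
}
```

Renaming Foo or Foobar in an IDE updates the bean methods along with every other reference, which is the refactoring benefit mentioned in the list.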

XML based configuration, in my experience, is simpler. You still can have multiple layers of XML files, but you can’t have any logic, at least without using XSLT and generating your Spring configuration files (if you are doing that, annotation based configuration is going to be far simpler!). Components can be anywhere because they are specified by the fully qualified class name.  XML configuration removes the temptation to invoke any Java code as part of your component configuration. On the flip side, you do run the risk of runtime errors from typos, and you must deal with XML.
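
For comparison, here is a rough sketch of similar wiring done in XML. The bean definitions would normally live in their own file; they are inlined as a string here only to keep the example in one piece (this needs Java 15+ for the text block, and spring-context on the classpath). The class and bean names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import org.springframework.beans.factory.xml.XmlBeanDefinitionReader;
import org.springframework.context.support.GenericApplicationContext;
import org.springframework.core.io.ByteArrayResource;

public class XmlConfigSketch {

    // Hypothetical components, for illustration only.
    public static class Foo {}

    public static class Foobar {
        private final Foo foo;
        public Foobar(Foo foo) { this.foo = foo; }
    }

    // Classes are named by fully qualified class name, so they can live
    // anywhere ('$' marks a nested class). There is no logic here, only wiring.
    static final String BEANS_XML = """
        <?xml version="1.0" encoding="UTF-8"?>
        <beans xmlns="http://www.springframework.org/schema/beans"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://www.springframework.org/schema/beans
                   https://www.springframework.org/schema/beans/spring-beans.xsd">
          <bean id="foo" class="XmlConfigSketch$Foo"/>
          <bean id="foobar" class="XmlConfigSketch$Foobar">
            <constructor-arg ref="foo"/>
          </bean>
        </beans>
        """;

    public static void main(String[] args) {
        GenericApplicationContext ctx = new GenericApplicationContext();
        new XmlBeanDefinitionReader(ctx).loadBeanDefinitions(
            new ByteArrayResource(BEANS_XML.getBytes(StandardCharsets.UTF_8)));
        // A typo in a class name or a ref above is not caught by the
        // compiler; it only surfaces when the context starts up.
        ctx.refresh();
        System.out.println(ctx.getBean("foobar").getClass().getSimpleName());
        ctx.close();
    }
}
```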

Both of these configuration options are well supported. I find that there’s more documentation online about the XML based configuration, but they are isomorphic, so I’d recommend picking the option that suits your needs best. If you need complex configuration or want the blanket of type safety, then annotation based configuration is the best option. If you have a simpler project, especially if everything can be contained in one XML file, XML based configuration is the better option.

Why Use an ETL Tool?

transformation photo
Photo by AlicePopkorn

I’m a big fan of ETL tools.  The one with which I am most familiar is Kettle, aka Pentaho Data Integration.  When I was working for 8z, we used it heavily to pull data from other systems, process it, and update our databases.  While ETL systems are not without their flaws, I think their strengths are such that everyone who is moving data around should consider them.  This is more true now than in the past because there is a lot more data flowing everywhere, and there are several viable open source ETL tools, so you don’t have to spend thousands or tens of thousands of dollars to get started.

What are the benefits of ETL tools?

  • There are pre-built components for common data tasks (connecting to a database, parsing a flat file) that have been tested and debugged by many many people.  It’s hard to over emphasize how much time this can save, allowing you to focus on business logic.
  • You operate at a higher level of abstraction.
  • There is support for other performance features like parallel jobs that you can configure.
  • The GUI makes data flow obvious.
  • You can write your own components that leverage existing libraries.

What are the detriments?

  • Possible to version control, impossible to merge.
  • Limits of components mean you sometimes have to contort your data flows, or drop down to write your own component.
  • Some components (at least for Kettle) are not open source.
  • You have to roll your own testing framework.  I did.
  • You have to learn another tool.
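
On the testing point: a home-grown harness does not need to be elaborate. One approach is to run a transformation against fixture data, export the output rows, and compare them with a known-good set. The sketch below shows only the comparison step; RowDiff and its method are hypothetical names, rows are assumed to have been exported as plain strings, and nothing here calls Kettle itself.

```java
import java.util.ArrayList;
import java.util.List;

public class RowDiff {

    /** Returns one message per mismatch; an empty list means the outputs agree. */
    public static List<String> diff(List<String> expected, List<String> actual) {
        List<String> problems = new ArrayList<>();
        int common = Math.min(expected.size(), actual.size());
        // Compare row by row over the part both lists share.
        for (int i = 0; i < common; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                problems.add("row " + (i + 1) + ": expected <" + expected.get(i)
                    + "> but got <" + actual.get(i) + ">");
            }
        }
        // Report missing or extra rows separately.
        if (expected.size() != actual.size()) {
            problems.add("row count: expected " + expected.size()
                + " but got " + actual.size());
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> expected = List.of("1,alice,CO", "2,bob,CO");
        List<String> actual = List.of("1,alice,CO", "2,bob,WY");
        diff(expected, actual).forEach(System.out::println);
    }
}
```

This compares rows in order, so it assumes the transformation's output is sorted deterministically; an unordered comparison would need a different approach.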

Don’t re-invent the wheel!  Your data movement problem may very well be a super special snowflake, but chances are it isn’t.  Every line of code you write is another you have to maintain.  When you are confronted with a data movement problem, take a look at an ETL tool like Kettle and see if you can stand on the shoulders of giants.  Here’s a list of open source ETL tools to evaluate.

Denver Bootstrappers Lunch

boots photo
Photo by liftarn

My friend and former colleague Corey Snipes has been working to get a Bootstrapper’s meetup off the ground in Denver. This is a small group (limited to 12, I believe) of people who are building products (typically software) and self funding. I believe most of the members are in the solopreneur mode (I know Corey is).

I imagine this kind of support group would be fantastic–certainly I had a similar group when I was a consultant in the past, and bouncing ideas off of others in similar situations made the struggle much easier.

I’ve not made this meetup yet because, a) I’m not sure I’m bootstrapping (and you know what, if you aren’t sure you’re bootstrapping, you aren’t bootstrapping!), b) I live in Boulder and Boulderites have a hard time leaving the Boulder Bubble, and c) Wednesdays in general are tough days for me to do anything outside of the house.

If you are a bootstrapper in the Denver area, take a look.

Lessons from curating a link blog

link photo
Photo by StockMonkeys.com

I maintain a link blog about Colorado food and local food in general.  I use Tumblr, but I’m only incidentally interested in Tumblr traffic.  Tumblr hooks up to Facebook and Twitter, and pushes links there.  (I realize that I am missing interaction on Twitter and Facebook by using these networks as broadcast only, but I don’t have time to fully engage, so I thought a limited presence was better than nothing.)

Having maintained this link blog for over two years, I have learned a few things.

  • It is easy to start a project like this, but hard to finish.  There’s always more to do.  I think I’ll stop when it stops being interesting.
  • Deciding to do this is a great way to gain a broad understanding of a field while providing some value (via curating).  As you find more and more sources of links, videos, articles and audio content, you’ll gain a sense of what is happening.  Even if you don’t painstakingly read every article, you’ll still get a sense.
  • Speaking of sources, Google Alerts is your friend.  I get emailed alerts on a variety of searches, and about 25% of the results are worth posting.  Facebook and Twitter are additional great sources of links.
  • An RSS reader can help you if you are really diving in.
  • Giving someone notice that you’ve referenced their article via an ‘@’ mention will get you their attention.
  • Queuing up posts on Tumblr is a life saver.  This lets you stack up posts and portion them out one per day.  I typically have between 15 and 30 posts in my queue.  This makes timely posts more difficult, but frees me up to forget about the link blog for weeks at a time.
  • A link blog like this is a great use of your in between time, especially if you have a smartphone.  In five minutes I can scan and post two or three links, where five minutes is barely enough time to think of a regular blog post.  The Tumblr app is very good.
  • A linkblog is a great resource for other content generation.  I have a newsletter about local food as well, and a key section of that is interesting links.  Those are almost entirely drawn from the Tumblr.

The linkblog approach is very similar to Twitter, but differs in a few crucial ways:

These attributes make a linkblog a fine complement to Twitter.

There are some problems with this model, however.

  • Limited interaction with followers, either on Tumblr, Facebook or Twitter.
  • I’ve found that engaging on Twitter and Facebook directly is far more effective if you want content to be viewed or links to be clicked.
  • A linkblog like this is not truly building my tribe.

So, if you have limited time, want to gain insight into a particular area of interest, and are OK with the drawbacks, create a linkblog.

A Useful, Tested Git Work Flow

knot photo
Photo by fdecomite

I’m working on a project with a number of developers (about 9 checking in code) that is moving rather fast, and we’re using git and GitHub. It’s actually really interesting to me, because most of my experience has been with smaller teams (and centralized VCS) where having everything on HEAD is perfectly fine. I was even able to branch using CVS because the chance of merge conflicts with no one else doing development was small.

I remember having lunch with a friend who worked at Rally and we talked about git. He said that they were heavy users, and “once you use it, dude, you’ll never go back”. At the time I thought–how great can git be? I’d been using it for a small project I was coding by myself, and it seemed nice enough, but not revolutionary.

But now that I’m using it in a fast moving team with a large number of developers touching lots of parts of the system, the branching and merging capabilities of git are starting to shine. The project lead, who has used git before, recommended the Driessen git flow (from 2010), which is more complex than the GitHub flow.

We’ve been using this for a few weeks and I’ve found it to be clear, fairly easy to understand and still flexible enough to let development move forward at a breakneck pace. The supporting branches, along with master (always what is in production) and develop (always works, what is coming down the pike in terms of features), seem to be a nice compromise between the strictures of traditional, centralized VCS and the free-for-all that is possible with git.

Open Source, Consulting and Building SaaS Products

construction photo
Photo by JD Hancock

I was browsing Hacker News the other day, and ran across this article, lamenting how difficult it was to support a company with an open source project and that insomuch as one could, consulting generated far more revenue than selling SaaS services like hosting.  For the record, I’ve never touched LocomotiveCMS.  From a brief glance, it looks nice.

While I feel for them, I think that they have alternatives:

  • Sell premium support.  Right now, it appears the only way to get premium support is to host with them, and it seems that many clients are more interested in self hosted solutions.  Makes sense–if you are a Rails developer (the target market for this CMS) you already have a hosting solution.  But if premium support were offered separately, they could hire someone (possibly part time) less skilled than Didier, the primary developer, to take care of tier 1 support, and still offer a warm fuzzy feeling for harder problems, which would escalate to Didier.  Companies like to pay for that kind of service, even if they don’t always use it.  This strategy would also decrease the amount of revenue needed to hire someone to help Didier (customer service folks are less expensive than developers).
  • Sell an ebook (or a couple).  These are far easier to create and sell than a SaaS product.  (I use leanpub!)  It could be an ‘authoritative guide to LocomotiveCMS’ or just focus on one part.  Since Didier knows which questions he often answers for people who have paid him money, he’s probably got a very good idea of where the pain points are.
  • Someone suggested this in the comments, but a marketplace for plugins to LocomotiveCMS seems like a natural way to go.  Again, I don’t know that community, and marketplaces for CMSes can be hard to kick start, but this is worth evaluating.
  • I’m sure there are others.  Here’s an exhaustive list of business models, courtesy of the AVC community, so if I were them, I’d review and see what was a fit.

In my comment on the HN post, I talk about how products often face a “round peg in an elliptical hole” problem. I meant that products often solve 80% of the problem for 80% of the users.  They also require users to change their processes (more crystallization).  Typically there’s just enough offset that people feel cognitive drag.  (Of course, the same thing usually happens with custom solutions, you just don’t know that until you are done.  Doh!)

Especially in crowded markets, like CMSes, it is far far easier to sell enough hours to make a living customizing a solution than it is to sell enough products to make a living.  Brennan Dunn covers this ground well.  Every consulting company I’ve ever seen or been a part of, and every consultant I’ve ever known (except the ones who were contracting for one client and really were employees with more flexibility), dreams of transitioning from non scalable consulting by the hour to scalable product sales.  One friend even had a name for it–the “von MacIntyre machine”, which would make money while he slept.

But it’s hard.