Performance testing is a bit like visiting your girlfriend’s father. You’re never quite sure what you’re accomplishing, it can be alternately frustrating and satisfying, and you have to do it. Right now I’m in the midst of performance testing a web-based application for my new company. I’ve been in such testing tangles before, though always as a consultant on a fixed-bid project. I’d have to say that performance testing as an employee is less stressful than that.
The reasons why performance testing, especially of web applications, is such a rat’s nest are many:
complexity of platforms
Most modern web applications are built on a lot of code. In our case, it’s a servlet and logging framework on top of Tomcat on top of the JVM on top of the operating system. That’s four levels on the web server, not counting the back end, the load balancer, or any interaction with the browser! And this is a relatively simple system. I’ve seen portal applications that had six or more levels in the web server. Each level of the software stack interacts with the others, sometimes in unforeseen ways, which means that changing parameters can have unpredictable effects. You simply must test every change you make.
lack of production hardware
Unless you’re working with Scrooge McDuck or an application that has yet to be deployed, you’re probably not going to be able to test on production hardware. Very few companies I’ve dealt with are willing to buy a duplicate of their production hardware for testing purposes, so you’ll probably be testing on a scaled-down version of the production system. That means that you’ll have to make assumptions about what the smaller system will tell you about the bigger system. One usually safe conclusion is that the smaller system sets a performance minimum for the larger system.
amount of time required
Each performance test takes a significant amount of time: minutes per run, unlike unit tests, which you want to run quickly. Such slow turnarounds mean that performance testing just can’t be done quickly.
difficulty of understanding real user behavior
The more complicated your application is, the harder it is to understand how people are going to use it. Will they move quickly through the application? Will they leave sessions open for a long time? How many states will they go through? Anyone can come up with a reasonable guess as to the answers for these questions, but the only way to know for sure is to a) user experience test it, or b) unleash the application.
ambiguous or arbitrary goals
Unless you really understand how your userbase is going to use the application, it’s hard to come up with reasonable goals. ‘Make it run faster’ doesn’t cut it. Nor does picking an arbitrary number: ‘we want to service 10,000 hits a second’ may seem like a good goal, but if that number was just plucked from the air, a lot of misery can result, especially if you’re on a fixed-bid project and every hour you spend is eating into your margin. (It’s OK for performance testing to make a tech person miserable, as long as there’s business benefit, and an arbitrary number is likely to under- or overshoot the optimum for the business.)
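One way to avoid plucking a number from the air is to derive the target from usage assumptions you have written down. Here is a minimal sketch of that arithmetic in Python. The function name and every figure in it are hypothetical placeholders, not measurements from any real system, and it only gives an average rate, so a real target would need headroom for peaks.

```python
def required_requests_per_second(concurrent_users, requests_per_session, session_seconds):
    """Average request rate implied by a set of usage assumptions."""
    if session_seconds <= 0:
        raise ValueError("session_seconds must be positive")
    return concurrent_users * requests_per_session / session_seconds


# Hypothetical inputs: 500 concurrent users, each making 20 requests
# over a 10 minute (600 second) session.
target = required_requests_per_second(500, 20, 600)
print(f"average load: {target:.1f} requests/second")
```

The point is less the number than the fact that each input is an assumption you can show to the business folks and revise when real usage data turns up.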
difficulty of reproducing real user behavior
I’ve not had a lot of experience with for-pay tools, but have used a variety of free (as in beer) tools. I’ve written before about my experiences with The Grinder, and am currently using JMeter. I’ve also used ApacheBench. All of these tools were great at hitting URLs repeatedly and rapidly, but it was hard to really reproduce user (and browser) behavior because they’re simple programs. For example, some versions of IE can call a servlet multiple times. You can’t possibly hope to replicate all the quirks of browsers when testing, but sometimes those quirks can have performance impacts.
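To make ‘simple programs’ concrete, here is a rough sketch in Python of what hitting a URL repeatedly and rapidly boils down to. It is not how The Grinder, JMeter, or ApacheBench are implemented. The names `fetch`, `timed_call`, and `hammer` are made up for illustration, and the URL in the comment is a hypothetical local test server.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Fetch one URL and discard the body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()


def timed_call(request, url):
    """Return how long one request took, in seconds."""
    start = time.perf_counter()
    request(url)
    return time.perf_counter() - start


def hammer(url, total_requests, concurrency, request=fetch):
    """Issue total_requests calls across `concurrency` worker threads.

    Returns the per-request latencies in seconds. There is no think time,
    no cookies, and no browser behavior: just repeated hits on one URL.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: timed_call(request, url), range(total_requests)))


# Example (hypothetical URL; point it at a test server you control):
# latencies = hammer("http://localhost:8080/", total_requests=100, concurrency=10)
# print(f"{len(latencies)} requests, slowest {max(latencies):.3f}s")
```

Everything a real user does between requests (pausing, following redirects, loading images, the odd duplicate call from the browser) is missing here, which is exactly the gap described above.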
These dimensions of complexity feed on each other. Because it takes so long to performance test an application, you are tempted to change more than one level of the application at once. Because you think you understand user behavior, you come up with an erroneous performance target.
Is it hopeless? Nope, and it can be a very good exercise: it can turn up areas of real weakness in your application. Just remember to document your assumptions, make the tradeoffs abundantly clear to nontechnical folks, and realize that you’re going to miss something important. Your results won’t be worth as much as you think they will be. Oh yeah, and don’t sign any fixed-bid performance testing contracts unless you know what you’re doing.