I’ve written about wget before, but I just found a very cool use for it. I’m looking at ways to speed up a site, by stripping out whitespace. I found one servlet filter by googling around. However, that code has a license restriction for commercial use (you just have to pay for it).
A bit more looking and I found this fantastic article: Two Servlet Filters Every Web Application Should Have. Give it a read if you do Java web development. The article provides a free set of filters, one of which compresses a servlet response if the browser can handle it. (I’ve touched on using gzip for such purposes before as well.)
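To make the "if the browser can handle it" part concrete, here is a minimal sketch in Python rather than Java. It is not the article's filter; the function name negotiate is made up for illustration, and the header parsing is deliberately simplified (it ignores q-values, so "gzip;q=0" would still count as acceptable).

```python
import gzip


def negotiate(body: bytes, accept_encoding: str = ""):
    """Return (payload, extra_headers) for a response body.

    Hypothetical helper: compresses only when the client's
    Accept-Encoding header lists gzip. Simplification: any
    ";q=..." parameter is dropped, so q=0 is not honored.
    """
    tokens = [part.split(";")[0].strip().lower()
              for part in accept_encoding.split(",")]
    if "gzip" in tokens:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    # Client did not advertise gzip: send the body untouched.
    return body, {}
```

A client that sends no Accept-Encoding header gets the original bytes back, which is the behavior I wanted to confirm for older clients.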
I was seeing some fantastic decreases in page size. (The article states that you can’t tell, but Firefox’s Page Info command [ Tools / Page Info ] seemed to reflect the differences.) Basically, an 85 to 90% decrease in size: 50K to 5K, 130K to 20K. Pretty cool. Note that if your application is CPU bound, adding this filter will not help performance, since compression costs CPU time. But if you’re bandwidth bound, decreasing your average page size will help.
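For the record, here is the arithmetic on those two page sizes, as a quick Python sketch. The helper name percent_reduction is just for illustration.

```python
def percent_reduction(before_kb: float, after_kb: float) -> float:
    """Size saved, as a percentage of the original size."""
    return (before_kb - after_kb) / before_kb * 100


# The two page sizes quoted in the post, in kilobytes.
for before, after in [(50, 5), (130, 20)]:
    saved = percent_reduction(before, after)
    print(f"{before}K -> {after}K: {saved:.1f}% smaller")
# prints:
# 50K -> 5K: 90.0% smaller
# 130K -> 20K: 84.6% smaller
```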
I couldn’t believe those numbers. I also wanted to make sure that clients who didn’t understand gzip could still see the pages. Out comes trusty wget.
wget url pulls down the standard sized file.
wget --header="accept-encoding: gzip" url pulls down a gzipped version that I can even ungzip and verify that nothing has changed.
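The same check can be scripted. This is a rough Python sketch of the two wget calls plus the comparison, using only the standard library. The names fetch and same_content are made up here, and it assumes the server answers the second request with a gzip-encoded body.

```python
import gzip
import urllib.request


def fetch(url: str, accept_gzip: bool = False):
    """Fetch url; return (raw body bytes, Content-Encoding value).

    urllib does not decompress responses on its own, so with
    accept_gzip=True the body comes back still gzipped.
    """
    req = urllib.request.Request(url)
    if accept_gzip:
        req.add_header("Accept-Encoding", "gzip")
    with urllib.request.urlopen(req) as resp:
        return resp.read(), resp.headers.get("Content-Encoding", "")


def same_content(plain_body: bytes, gzipped_body: bytes) -> bool:
    """True if the gzipped body decompresses to exactly the plain body."""
    return gzip.decompress(gzipped_body) == plain_body
```

To use it, call fetch(url) and fetch(url, accept_gzip=True) against your own page, confirm the second call reports gzip as the encoding, then pass both bodies to same_content. A dynamic page that embeds a timestamp or session id will differ between the two requests regardless of compression.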
The only issue I saw was that ‘ c ‘ is apparently rendered as the copyright symbol in uncompressed pages. For compressed pages, you need to use the standard entity, &copy;, instead. Other than that, the compression was transparent. Way cool.
Woot! Thanks a lot for this wget option! Now I understand what is going on via gzip 😉