Skip to content

Useful Tools: wget

I remember writing a spidering program to verify url correctness, about six years ago. I used lwp and wrote threads and all kinds of good stuff. It marked me. Used to be, whenever I want to grab a chunk of html from a server, I scratch out a 30 line perl script. Now I have an alternative. wget (or should it be GNU wget?) is a fantastic way to spider sites. In fact, I just grabbed all the mp3s available here with this command:

wget -r -w 5 --random-wait

The random wait is in there because I didn’t want to overwhelm their servers or get locked out due to repeated, obviously nonhuman resource requests. Pretty cool little tool that can do a lot, as you can see from the options list.