{"id":261,"date":"2005-06-07T12:28:26","date_gmt":"2005-06-07T18:28:26","guid":{"rendered":"http:\/\/www.mooreds.com\/wordpress\/?p=261"},"modified":"2008-04-23T07:15:13","modified_gmt":"2008-04-23T13:15:13","slug":"useful-tools-wget","status":"publish","type":"post","link":"https:\/\/www.mooreds.com\/wordpress\/archives\/261","title":{"rendered":"Useful Tools: wget"},"content":{"rendered":"<p>I remember writing a spidering program to verify URL correctness, about six years ago.  I used <a href=\"http:\/\/lwp.linpro.no\/lwp\/LWP\">lwp<\/a> and wrote threads and all kinds of good stuff.  It marked me.  Used to be, whenever I wanted to grab a chunk of HTML from a server, I scratched out a 30-line Perl script.  Now I have an alternative.  <a href=\"http:\/\/www.gnu.org\/software\/wget\/wget.html\">wget<\/a> (or should it be GNU wget?) is a fantastic way to spider sites.  In fact, I just grabbed all the MP3s available <a href=\"http:\/\/www.turtleserviceslimited.org\/jukebox.htm\">here<\/a> with this command:<br \/>\n<code><br \/>\nwget -r -w 5 --random-wait http:\/\/www.turtleserviceslimited.org\/jukebox.htm<br \/>\n<\/code><br \/>\nThe random wait is in there because I didn&#8217;t want to overwhelm their servers or get locked out due to repeated, obviously nonhuman resource requests.  Pretty cool little tool that can do a lot, as you can see from the <a href=\"http:\/\/www.gnu.org\/software\/wget\/manual\/wget.html#Invoking\">options list<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I remember writing a spidering program to verify URL correctness, about six years ago. I used lwp and wrote threads and all kinds of good stuff. It marked me. 
Used [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,33],"tags":[],"class_list":["post-261","post","type-post","status-publish","format-standard","hentry","category-technology","category-useful-tools"],"_links":{"self":[{"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/posts\/261","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/comments?post=261"}],"version-history":[{"count":0,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/posts\/261\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/media?parent=261"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/categories?post=261"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/tags?post=261"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}