HTTrack website copier is awesome and FREE! I’ve been using it for a few months and I’ve already saved myself and my team members hundreds of hours by downloading full working copies of various websites locally.
Here are a few examples of websites I download.
- A client was having problems with their hosting company who refused to give them access to their source code (developed by a 3rd party) to their CF website. After weeks of trying to solve the problem, they decided to switch hosting companies. During their migration to a hosting company and a new CMS solution, they wanted to keep a copy of the old site online. Regardless of the engine running the website, HTTrack was able to download a working copy of the site to HTML, minus the dynamic features (Search, CMS Admin, etc…) in a few minutes.
- We inherited a old web server at working running Plone 2.0.4. The machine was running on old hardware that was failing (2/3 dead HDD, broken RAID 1) and Zope was randomly crashing. After spending 3 days trying to get the machine stable, I decide to convert the site to HTML and put on one of our newer “stable” web servers. The process was a little mess, since our plone instance pointed to folders paths vs. actual files (/index_html vs /index.hmtl) the final static site had a lot of weird and randomly directed links! The final static HTML site probably took ~8 solid hours of HTML clean-up, but there was also a lot of bad/obsolete content in the existing portal that made this process painful!
- I have a few websites I like to use as samples/demo that show better as interactive! Using HTTrack and controlling the depth of link following/downloading (like first page only), I can download an offline working copy of a sites main page! This is a great way to demo a feature/capability/competitor website…
A few other good things to know, a couple sites I’ve downloaded require you to login first. This isn’t an issue with HTTrack, since you can build your own cookies to apply to your capture. It was a bit tricky getting the cookie format correct (Used Netscape Cookie Format), but after a few minutes I was able to take the cookie data from IE/FireFox and have a perfect working cookie for HTTrack to use!
Once again, this is FREE utility that is handy to have around!