By now you’ve likely heard that Yahoo! intends to acquire Tumblr. While the acquisition is not (at the time of writing) confirmed, and even if it goes through it doesn’t mean the death of Tumblr, you’re likely wondering how to take your stuff and run. Today. Here’s how.
Bad news: there is no official Tumblr backup tool. Some people have cooked up well-intentioned tools, but none of them preserve the look and feel of your Tumblr. For those who have spent countless hours perfecting their own theme, this simply will not do. If you want a backup of your Tumblr that looks right, and that you can easily upload to your own server, I’d recommend using HTTrack.
Head over to the official HTTrack site and download the distribution you need. If you’re on a Mac and use Homebrew, you can just install it from there instead (brew install httrack). The Windows build provides a GUI, but the others do not; the command-line interface, however, is cross-platform. Once it’s installed, copy and paste the line below, replacing the URL with the URL of your Tumblr, and let ’er rip.
httrack -w -n -c8 -N0 -s0 -q -v -I0 -p3 http://example.tumblr.com
This can take quite a while depending on the size of your Tumblr. If you use infinite scroll, this should work regardless, so long as you’ve maintained the “next” and “previous” pagination hyperlink markup in your template. If you haven’t (this would certainly be an edge case, but I’ve seen it with some artists’ themes), I’m sorry, but your site just isn’t crawlable. When all is said and done you’ll be left with flat HTML files, CSS, JS, images, videos, audio, etc., with all hyperlinks to crawled content rewritten to relative paths – meaning it is a backup you can toss on any server. If you’re curious about the options I’ve used in the line above, here are their full descriptions from the documentation. Enjoy.
-w    *mirror web sites (--mirror)
-n    get non-html files 'near' an html file (ex: an image located outside) (--near)
-cN   number of multiple connections (*c8) (--sockets[=N])
-NN   structure type (0 *original structure, 1+: see the HTTrack docs) (--structure[=N]) or user-defined structure (-N "%h%p/%n%q.%t")
-sN   follow robots.txt and meta robots tags (0=never, 1=sometimes, *2=always) (--robots[=N])
-q    no questions - quiet mode (--quiet)
-%v   display on screen filenames downloaded (in realtime) (--display)
-I    *make an index (I0 don't make) (--index)
-pN   priority mode (*p3) (--priority[=N]):
      0  just scan, don't save anything (for checking links)
      1  save only html files
      2  save only non html files
      *3 save all files
      7  get html files before, then treat other files
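Once the crawl finishes, it’s worth sanity-checking that the backup really is self-contained before you upload it anywhere. Here’s a quick sketch – assuming HTTrack wrote the site into a folder named after the host (its default) and that you have python3 handy for a throwaway local server:

```shell
#!/bin/sh
# Assumption: HTTrack's default output folder, named after your Tumblr's host.
MIRROR_DIR="example.tumblr.com"

# How many pages were captured?
echo "HTML files: $(find "$MIRROR_DIR" -name '*.html' | wc -l)"

# Any absolute links left pointing back at the live site? Those resources
# were not localized and would still be fetched from Tumblr.
if grep -rq 'example\.tumblr\.com' "$MIRROR_DIR" 2>/dev/null; then
  echo "Warning: some files still reference the live site"
else
  echo "All links are relative - backup is self-contained"
fi

# To eyeball the look and feel, serve it locally and open http://localhost:8000
# (cd "$MIRROR_DIR" && python3 -m http.server 8000)
```

If the grep turns anything up, it’s usually embedded media or theme assets the crawler skipped; re-running with a higher link depth, or just accepting that those few files load from the live site, are both reasonable calls.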