Recursive download HTTPS / FTP with Wget

Wget can recursively download data or web pages, a key feature that cURL lacks. cURL is a library with a command-line front end, whereas Wget is a command-line tool. Because recursive download requires several Wget options, it is best shown by example.


wget --recursive -np -nc -nH --cut-dirs=4 --random-wait --wait 1 -e robots=off https://site.example/aaa/bbb/ccc/ddd/

This downloads the files into the directory where the command was run. To download recursively over FTP instead, change https:// to ftp:// and give the FTP directory path.
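The --cut-dirs value in the command has to match the number of directories in the URL path, which is easy to miscount for long paths. As a minimal sketch, the hypothetical helper count_dirs below (not part of Wget) counts them in plain POSIX shell, assuming the URL points to a directory rather than a file. It treats https:// and ftp:// URLs the same way.

```shell
# count_dirs is a hypothetical helper, not a Wget feature.
# It prints the number of directory components in a URL path,
# assuming the URL names a directory (not a file).
# The body runs in a subshell so IFS and "set -f" do not leak out.
count_dirs() (
  path=${1#*://}               # strip the scheme, e.g. "https://"
  case $path in
    */*) path=${path#*/} ;;    # strip the host name
    *)   path= ;;              # host only, no path
  esac
  path=${path%/}               # strip one trailing slash
  if [ -z "$path" ]; then
    echo 0
  else
    set -f                     # disable globbing while splitting
    IFS=/                      # split the path on "/"
    set -- $path
    echo "$#"
  fi
)

# Example (commented out so that sourcing this file downloads nothing):
# url="https://site.example/aaa/bbb/ccc/ddd/"
# wget --recursive -np -nH --cut-dirs="$(count_dirs "$url")" "$url"
```

Computing the value from the URL keeps the two in step when the path changes.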

Wget recursive download options

-r, --recursive
download recursively (and recreate the server’s directory tree on your PC)
--recursive --level=1
recurse, but with --level=1 only follow links one level deep from the starting directory (Wget’s default depth is 5)
-Q 1g
the --quota option caps the total download size, for example to stop downloading after 1 GB has been downloaded altogether
-np
never ascend to parent directories (sometimes a site will link upwards)
-nc
no clobber: don’t re-download files you already have
-nd
no directory structure on download (put all files in one directory, set by -P)
-nH
don’t create a vestigial site-name directory on your PC
-A
only accept files matching a suffix or globbed pattern
--cut-dirs=4
don’t create a vestigial hierarchy of directories above the desired directory on your PC. Set the number equal to the number of directories in the server path (here aaa/bbb/ccc/ddd is four)
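-e robots=off
Many sites use robots.txt to ask robots not to mindlessly consume huge amounts of data. Wget obeys robots.txt by default during recursive downloads; this option tells Wget to ignore it. It changes only Wget’s own behavior and sends nothing to the web server.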
--random-wait
To avoid excessive download requests (that can get you auto-banned from downloading) we politely wait a randomized time between file downloads
--wait 1
sets the base wait to 1 second. With --random-wait the actual wait varies between 0.5 and 1.5 times this value, so it averages about 1 second before the next file starts downloading. This helps avoid anti-leeching measures.
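
The list above also describes options that the example command does not use: a flat download directory (-nd with -P), a filename filter (-A), and a depth limit (--level=1). A minimal sketch of how they might combine, to fetch only files matching a pattern into a single directory, follows. The function name fetch_flat, its argument order, and the WGET override variable are assumptions made for illustration, not part of Wget.

```shell
# fetch_flat is a hypothetical wrapper, not part of Wget.
#   $1 = directory URL, $2 = filename glob, $3 = destination directory
# Set WGET=echo to print the arguments instead of downloading.
fetch_flat() {
  ${WGET:-wget} --recursive --level=1 -np -nc -nd \
    -A "$2" -P "$3" --random-wait --wait 1 "$1"
}

# Example (commented out so that sourcing this file downloads nothing):
# fetch_flat "https://site.example/aaa/bbb/ccc/ddd/" "*.pdf" downloads
```

Quoting the glob keeps the shell from expanding it against local files before Wget sees it.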