Wget is a tool many of you are probably familiar with. It is one of the common UNIX applications that come preinstalled with various distributions.
Wget is a non-interactive network downloader. It can retrieve files from a server over HTTP, HTTPS, and FTP, and it can also retrieve through HTTP proxies. Non-interactive means it can work in the background, even after you have logged off (it’s a different story if you shut the machine down).
This article is more of a cheatsheet. It summarizes some useful syntax for downloading content from the web using wget.
This is the basic usage of wget:
wget [options]... [URL]...
Download a File
This is the simplest way to download a file. Suppose we want to download an image file (thefile.jpg) from www.website.net over HTTP:
wget http://www.website.net/thefile.jpg
If the file is in a directory (for example dir):
wget http://www.website.net/dir/thefile.jpg
Download a File with a Quoted URL
If the URL contains “&”, wrap the URL in quotes so the shell does not treat the “&” as its background operator and cut the URL short.
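As a minimal sketch of the quoting, the helper below passes a URL to wget as one quoted argument. The function name and the example URL in the comment are hypothetical and only serve as illustration.

```shell
# Hypothetical helper: download one URL, handing it to wget as a single argument.
download_quoted() {
    # "$1" stays quoted, so characters such as & ? and = reach wget untouched.
    wget "$1"
}

# Single quotes stop the shell from reading & as "run in background".
# Example call (hypothetical URL):
# download_quoted 'http://www.website.net/download.php?id=123&type=zip'
```

Without the quotes, the shell would start wget in the background with only the part of the URL before the “&”.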
Download a File and All Related Files
Download a page together with all the files needed to display it properly (images, stylesheets, and so on).
wget -p http://www.website.net/index.html
Download an Entire Website
Also known as recursive downloading. Wget follows the links from the given address, traversing every file and folder it finds, and downloads them all. By default it recurses up to five levels deep.
wget -r http://www.website.net/
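Mirror an Entire Website
Similar to the previous command, but --mirror also turns on timestamping and removes the recursion depth limit, so the local copy can be kept up to date. Here -p fetches the files each page needs, --convert-links rewrites links for local viewing, and -P sets the directory to save into.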
wget --mirror -p --convert-links -P ./wheretosave http://www.website.net/
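The same mirror command can be wrapped in a small function with one comment per option. This is only a sketch; the function name is hypothetical, and the URL and directory are the ones used above.

```shell
# Hypothetical helper: mirror a site for offline reading.
# $1 = start URL, $2 = directory to save into
mirror_site() {
    # --mirror         recursion with no depth limit, plus timestamping
    # -p               also fetch images, CSS and other page requisites
    # --convert-links  rewrite links so the local copy is browsable
    # -P "$2"          save everything under the given directory
    wget --mirror -p --convert-links -P "$2" "$1"
}

# Example call:
# mirror_site http://www.website.net/ ./wheretosave
```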
Ignore a Specific File Type
With this option, wget skips files of a given type during a recursive download.
wget -r --reject=<extension> http://www.website.net/
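The --reject option also accepts a comma-separated list, so several file types can be skipped at once. A minimal sketch, assuming a recursive download; the function name and the extensions in the example call are hypothetical.

```shell
# Hypothetical helper: recursive download that skips the listed extensions.
# $1 = comma-separated extensions (e.g. "jpg,png,gif"), $2 = start URL
download_without() {
    wget -r --reject="$1" "$2"
}

# Example call: skip common image types.
# download_without jpg,png,gif http://www.website.net/
```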
Resume a Download
Was a download interrupted, or did it fail partway? Continue it from where it stopped:
wget -c http://www.website.net/largefile.rar
Specify the Name of the Downloaded File
Save the download under a name of your choice instead of the name taken from the URL. For example:
wget -O newname.zip "http://www.website.net/downloads/archieved?id=123456"
Mask the User-Agent
Send a different User-Agent header while downloading, so that wget identifies itself to the server as another client. An example:
wget --user-agent="Mozilla/15.0.1 (X11; U; Linux x128; en-US; rv:1.9.1) Gecko/2015092515 Firefox/15.0.3" \
  http://www.website.net/a_file_here.rar
Limit Download Speed
Cap the download speed, here at 30 kilobytes per second:
wget --limit-rate=30k http://www.website.net/a_file_here.png
Stop Download After a Certain Size
This command stops downloading once a given total size, e.g. 10 MB, has been reached. The quota has no effect when you download a single URL: a single file is always downloaded in full, regardless of the quota. The quota applies only to recursive downloads and to URLs read from an input file.
wget -Q10m -r http://www.website.net/
The quota is in bytes if no suffix is given (e.g. -Q10 means 10 bytes). Accepted suffixes: k for kilobytes, m for megabytes.
Use a File Containing Download Addresses
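Suppose we have saved a list of addresses we want to download. The -i option reads the addresses from that file, which should contain one address per line.
wget -r -l 1 -H -t 1 -nd -N -np -A .mp3 -e robots=off -i download-list.txt
Here every address in download-list.txt is followed one level deep (-r -l 1), across hosts (-H), with a single try (-t 1). Only .mp3 files are kept (-A .mp3), all in the current directory (-nd). Files no newer than the local copy are skipped (-N), wget never ascends to the parent directory (-np), and robots.txt is ignored (-e robots=off).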
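A minimal sketch of what the list file looks like and how it is passed to wget with -i alone. The two .mp3 addresses and the function name are hypothetical and only serve as illustration.

```shell
# Write a small URL list, one address per line (hypothetical addresses).
cat > download-list.txt <<'EOF'
http://www.website.net/song-one.mp3
http://www.website.net/song-two.mp3
EOF

# Hypothetical helper: fetch every URL listed in the file given as $1.
download_from_list() {
    wget -i "$1"
}

# Example call:
# download_from_list download-list.txt
```

With -i alone, wget simply downloads each listed address in turn. The recursive options are only needed when the listed pages should also be crawled.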
Increase Download Retries
You can specify how many times wget tries a download before it gives up.
wget --tries=75 http://www.website.net/a_file_here.zip
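Retries tend to be more useful when wget pauses between attempts and picks up the partial file rather than starting over. The sketch below combines --tries with --waitretry and -c. The function name is hypothetical and the 10-second value is an arbitrary assumption.

```shell
# Hypothetical helper: keep retrying a flaky download.
# $1 = URL
download_with_retries() {
    # --tries=75      give up after 75 attempts
    # --waitretry=10  wait between retries, backing off up to 10 seconds
    # -c              continue a partial file instead of starting over
    wget --tries=75 --waitretry=10 -c "$1"
}

# Example call:
# download_with_retries http://www.website.net/a_file_here.zip
```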
Convert Links for Local Viewing
After the download, links in the saved pages are converted so that they work when the pages are viewed locally.
wget -k http://www.website.net/
Use FTP Authentication to Download
Download files via FTP with authentication:
wget -r -l 4 ftp://username:password@<host>/
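The credentials can also be passed as options rather than inside the URL, using wget's --ftp-user and --ftp-password. A minimal sketch; the function name and the host in the example call are hypothetical.

```shell
# Hypothetical helper: recursive FTP download with credentials passed as options.
# $1 = user, $2 = password, $3 = FTP URL
ftp_download() {
    wget -r -l 4 --ftp-user="$1" --ftp-password="$2" "$3"
}

# Example call (hypothetical host):
# ftp_download username password ftp://ftp.example.com/
```

Either way, the password appears on the command line, where other users on the same machine may be able to see it in the process list.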
Download in Background
Run the download in the background. Output is written to a log file (wget-log by default).
wget -b http://www.website.net/a_file_here.ogg
Log Wget Output to a Specific File
Write log messages to a file instead of stderr:
wget -o download.log http://www.website.net/wget
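The -b and -o options combine naturally: the download runs in the background and its progress can be checked by reading the chosen log file. A minimal sketch with hypothetical function names:

```shell
# Hypothetical helper: start a background download that logs to a chosen file.
# $1 = URL, $2 = log file
background_download() {
    wget -b -o "$2" "$1"
}

# Hypothetical helper: show the last lines of the log to check progress.
# $1 = log file
show_progress() {
    tail -n 5 "$1"
}

# Example calls:
# background_download http://www.website.net/a_file_here.ogg download.log
# show_progress download.log
```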