wget: Overview

1 Overview
**********

GNU Wget is a free utility for non-interactive download of files from
the Web.  It supports HTTP, HTTPS, and FTP protocols, as well as
retrieval through HTTP proxies.

   This chapter is a partial overview of Wget’s features.

   • Wget is non-interactive, meaning that it can work in the
     background while the user is not logged on.  This allows you to
     start a retrieval and disconnect from the system, letting Wget
     finish the work.  By contrast, most Web browsers require the
     user’s constant presence, which can be a great hindrance when
     transferring a lot of data.

   • Wget can follow links in HTML, XHTML, and CSS pages to create
     local versions of remote web sites, fully recreating the directory
     structure of the original site.  This is sometimes referred to as
     “recursive downloading.”  While doing that, Wget respects the Robot
     Exclusion Standard (‘/robots.txt’).  Wget can be instructed to
     convert the links in downloaded files to point at the local files,
     for offline viewing.

   • File name wildcard matching and recursive mirroring of directories
     are available when retrieving via FTP.  Wget can read the
     time-stamp information given by both HTTP and FTP servers, and
     store it locally.  Thus Wget can see whether the remote file has
     changed since the last retrieval, and automatically retrieve the
     new version if it has.  This makes Wget suitable for mirroring FTP
     sites, as well as home pages.

   • Wget has been designed for robustness over slow or unstable network
     connections; if a download fails due to a network problem, Wget
     will keep retrying until the whole file has been retrieved.  If the
     server supports regetting, Wget will instruct the server to
     continue the download from where it left off.

   • Wget supports proxy servers, which can lighten the network load,
     speed up retrieval, and provide access from behind firewalls.  Wget
     uses passive FTP downloading by default, with active FTP available
     as an option.

   • Wget supports IP version 6, the next generation of IP.  IPv6 is
     autodetected at compile time, and can be disabled at either build
     or run time.  Binaries built with IPv6 support work well in both
     IPv4-only and dual-family environments.

   • Built-in features offer mechanisms to tune which links you wish to
     follow (⇒ Following Links).

   • The progress of individual downloads is traced using a progress
     gauge.  Interactive downloads are tracked using a
     “thermometer”-style gauge, whereas non-interactive ones are traced
     with dots, each dot representing a fixed amount of data received
     (1KB by default).  Either gauge can be customized to your
     preferences.

   • Most of the features are fully configurable, either through
     command-line options or via the initialization file ‘.wgetrc’ (⇒
     Startup File).  Wget allows you to define “global” startup files
     (‘/etc/wgetrc’ by default) for site settings.  You can also specify
     the location of a startup file with the --config option.  To disable
     the reading of config files, use --no-config.  If both --config and
     --no-config are given, --no-config is ignored.

   • Finally, GNU Wget is free software.  This means that everyone may
     use it, redistribute it and/or modify it under the terms of the GNU
     General Public License, as published by the Free Software
     Foundation (see the file ‘COPYING’ that came with GNU Wget, for
     details).
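
   Several of the features listed above map onto individual
command-line options.  The following is a minimal sketch, not a
recommended setup: the function names and the ‘WGET’ variable are
hypothetical conveniences introduced here for illustration, and the
URLs you pass in are your own.  Only the Wget options themselves come
from Wget.

```shell
# Hypothetical wrappers; WGET defaults to the real binary but can be
# overridden (for example WGET=echo) to preview a command without running it.
WGET="${WGET:-wget}"

# Start a retrieval in the background so you can log out afterwards.
fetch_in_background() {
    "$WGET" --background "$1"
}

# Recursive download, two levels deep, staying below the start
# directory, with links rewritten for offline viewing.
mirror_for_offline() {
    "$WGET" --recursive --level=2 --no-parent --convert-links "$1"
}

# Fetch a file only if the remote copy is newer than the local one.
refresh_if_newer() {
    "$WGET" --timestamping "$1"
}

# Resume a partial download and retry without limit (0 means infinite).
resume_download() {
    "$WGET" --continue --tries=0 "$1"
}

# Use active rather than the default passive FTP.
fetch_active_ftp() {
    "$WGET" --no-passive-ftp "$1"
}

# Force IPv4 and show the dot-style progress gauge.
fetch_ipv4_dots() {
    "$WGET" --inet4-only --progress=dot "$1"
}
```

   Setting ‘WGET=echo’ before calling one of these functions prints
the options that would be passed instead of starting a transfer, which
is a convenient way to check a command line first.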
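
   The same kinds of settings can be kept in a startup file rather than
typed on every command line.  The fragment below is an illustrative
sketch only: the values are assumptions chosen for the example, not
recommended defaults.  It could be saved as ‘.wgetrc’ in your home
directory, or under any name and passed with the --config option.

```text
# Illustrative startup file; every value here is an example.
# Give up after five attempts per file.
tries = 5
# Only re-fetch files whose remote copy is newer than the local one.
timestamping = on
# Keep the default passive FTP mode explicit.
passive_ftp = on
# Use the dot-style progress gauge.
progress = dot
```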