wget: HTTPS (SSL/TLS) Options

1 
1 2.8 HTTPS (SSL/TLS) Options
1 ===========================
1 
1 To support encrypted HTTP (HTTPS) downloads, Wget must be compiled with
1 an external SSL library.  The current default is GnuTLS.  Wget also
1 supports HSTS (HTTP Strict Transport Security).  If Wget is compiled
1 without SSL support, none of these options are available.
1 
1 ‘--secure-protocol=PROTOCOL’
1      Choose the secure protocol to be used.  Legal values are ‘auto’,
1      ‘SSLv2’, ‘SSLv3’, ‘TLSv1’, ‘TLSv1_1’, ‘TLSv1_2’, ‘TLSv1_3’ and
1      ‘PFS’.  If ‘auto’ is used, the SSL library is given the liberty of
1      choosing the appropriate protocol automatically, which is achieved
1      by sending a TLSv1 greeting.  This is the default.
1 
1      Specifying ‘SSLv2’, ‘SSLv3’, ‘TLSv1’, ‘TLSv1_1’, ‘TLSv1_2’ or
1      ‘TLSv1_3’ forces the use of the corresponding protocol.  This is
1      useful when talking to old and buggy SSL server implementations
1      that make it hard for the underlying SSL library to choose the
1      correct protocol version.  Fortunately, such servers are quite
1      rare.
1 
1      Specifying ‘PFS’ enforces the use of the so-called Perfect Forward
1      Secrecy cipher suites.  In short, PFS adds security by creating a
1      one-time key for each SSL connection, at the cost of slightly more
1      CPU time on both client and server.  Wget uses ciphers known to be
1      secure (e.g. no MD4) and the TLS protocol.  This mode also
1      explicitly excludes non-PFS key exchange methods, such as RSA.
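As a minimal sketch of how the legal values above might be checked before Wget is invoked from a script, the following Python helper builds an argument list for ‘--secure-protocol’.  The helper name `wget_secure_protocol_args` and the URL are hypothetical; only the option name and its values come from the text above.

```python
import shlex

# Values listed in this section as legal for --secure-protocol.
LEGAL_PROTOCOLS = {
    "auto", "SSLv2", "SSLv3", "TLSv1", "TLSv1_1", "TLSv1_2", "TLSv1_3", "PFS",
}


def wget_secure_protocol_args(url, protocol="auto"):
    """Return an argv list for wget (hypothetical helper, not part of Wget)."""
    if protocol not in LEGAL_PROTOCOLS:
        raise ValueError(f"unsupported --secure-protocol value: {protocol!r}")
    return ["wget", f"--secure-protocol={protocol}", url]


# Build (but do not run) a command line that forces TLSv1_2.
args = wget_secure_protocol_args("https://example.com/", "TLSv1_2")
print(shlex.join(args))
```

The list can be passed to `subprocess.run(args)` if Wget is installed; whether a given protocol is actually accepted still depends on the SSL library Wget was compiled with.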
1 
1 ‘--https-only’
1      When in recursive mode, only HTTPS links are followed.
1 
1 ‘--ciphers’
1      Set the cipher list string.  Typically this string sets the cipher
1      suites and other SSL/TLS options that the user wishes to be used,
1      in a set order of preference (GnuTLS calls it a ‘priority string’).
1      This string is fed verbatim to the SSL/TLS engine (OpenSSL or
1      GnuTLS), so its format and syntax depend on that engine.  Wget
1      does not process or manipulate it in any way.  Refer to the OpenSSL
1      or GnuTLS documentation for more information.
1 
1 ‘--no-check-certificate’
1      Don’t check the server certificate against the available
1      certificate authorities.  Also don’t require the URL host name to
1      match the common name presented by the certificate.
1 
1      As of Wget 1.10, the default is to verify the server’s certificate
1      against the recognized certificate authorities, breaking the SSL
1      handshake and aborting the download if the verification fails.
1      Although this provides more secure downloads, it does break
1      interoperability with some sites that worked with previous Wget
1      versions, particularly those using self-signed, expired, or
1      otherwise invalid certificates.  This option forces an “insecure”
1      mode of operation that turns the certificate verification errors
1      into warnings and allows you to proceed.
1 
1      If you encounter “certificate verification” errors or ones saying
1      that “common name doesn’t match requested host name”, you can use
1      this option to bypass the verification and proceed with the
1      download.  _Only use this option if you are otherwise convinced of
1      the site’s authenticity, or if you really don’t care about the
1      validity of its certificate._  It is almost always a bad idea not
1      to check the certificates when transmitting confidential or
1      important data.  For self-signed/internal certificates, you should
1      download the certificate and verify against that instead of forcing
1      this insecure mode.  If you are really sure that you do not want
1      any certificate verification, you can specify
1      ‘--check-certificate=quiet’ to tell Wget not to print any warning
1      about invalid certificates, although in most cases this is the
1      wrong thing to do.
1 
1 ‘--certificate=FILE’
1      Use the client certificate stored in FILE.  This is needed for
1      servers that are configured to require certificates from the
1      clients that connect to them.  Normally a certificate is not
1      required and this switch is optional.
1 
1 ‘--certificate-type=TYPE’
1      Specify the type of the client certificate.  Legal values are ‘PEM’
1      (assumed by default) and ‘DER’, also known as ‘ASN1’.
1 
1 ‘--private-key=FILE’
1      Read the private key from FILE.  This allows you to provide the
1      private key in a file separate from the certificate.
1 
1 ‘--private-key-type=TYPE’
1      Specify the type of the private key.  Accepted values are ‘PEM’
1      (the default) and ‘DER’.
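The four client-certificate options above are usually given together.  The sketch below assembles them into one argument list; the helper name `wget_client_cert_args` and the file names are hypothetical, and the check accepts only the ‘PEM’ and ‘DER’ spellings named in the text.

```python
def wget_client_cert_args(url, cert, key=None, cert_type="PEM", key_type="PEM"):
    """Build a wget argv that presents a client certificate (hypothetical helper)."""
    for label, value in (("certificate", cert_type), ("private key", key_type)):
        if value not in ("PEM", "DER"):
            raise ValueError(f"unsupported {label} type: {value!r}")
    args = ["wget", f"--certificate={cert}", f"--certificate-type={cert_type}"]
    if key is not None:
        # The private key lives in a file separate from the certificate.
        args += [f"--private-key={key}", f"--private-key-type={key_type}"]
    args.append(url)
    return args


# Hypothetical file names, for illustration only.
args = wget_client_cert_args(
    "https://example.com/", cert="client.pem", key="client.key"
)
print(args)
```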
1 
1 ‘--ca-certificate=FILE’
1      Use FILE as the file with the bundle of certificate authorities
1      (“CA”) to verify the peers.  The certificates must be in PEM
1      format.
1 
1      Without this option Wget looks for CA certificates at the
1      system-specified locations, chosen at OpenSSL installation time.
1 
1 ‘--ca-directory=DIRECTORY’
1      Specifies the directory containing CA certificates in PEM format.
1      Each file contains one CA certificate, and the file name is based
1      on a hash value derived from the certificate.  This is achieved by
1      processing a certificate directory with the ‘c_rehash’ utility
1      supplied with OpenSSL.  Using ‘--ca-directory’ is more efficient
1      than ‘--ca-certificate’ when many certificates are installed
1      because it allows Wget to fetch certificates on demand.
1 
1      Without this option Wget looks for CA certificates at the
1      system-specified locations, chosen at OpenSSL installation time.
1 
1 ‘--crl-file=FILE’
1      Specifies a CRL (certificate revocation list) file in FILE.  This
1      is needed to reject certificates that have been revoked by the CAs.
1 
1 ‘--pinnedpubkey=file/hashes’
1      Tells Wget to use the specified public key file (or hashes) to
1      verify the peer.  This can be a path to a file which contains a
1      single public key in PEM or DER format, or any number of base64
1      encoded sha256 hashes preceded by “sha256//” and separated by “;”.
1 
1      When negotiating a TLS or SSL connection, the server sends a
1      certificate indicating its identity.  A public key is extracted
1      from this certificate and if it does not exactly match the public
1      key(s) provided to this option, wget will abort the connection
1      before sending or receiving any data.
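The hash form of the option value can be sketched as follows.  This assumes that each hash is the SHA-256 digest of the DER-encoded public key, which the text above does not state explicitly; the function names and the placeholder key bytes are hypothetical.

```python
import base64
import hashlib


def pin_from_der(public_key_der: bytes) -> str:
    """Return one 'sha256//<base64>' pin for a DER-encoded public key.

    Assumption: the pin is the SHA-256 digest of the DER bytes.
    """
    digest = hashlib.sha256(public_key_der).digest()
    return "sha256//" + base64.b64encode(digest).decode("ascii")


def pinnedpubkey_value(pins):
    """Join several pins with ';' as described for --pinnedpubkey."""
    return ";".join(pins)


# Placeholder bytes standing in for real DER-encoded public keys.
primary = pin_from_der(b"placeholder primary key bytes")
backup = pin_from_der(b"placeholder backup key bytes")
print("--pinnedpubkey=" + pinnedpubkey_value([primary, backup]))
```

Listing a backup pin next to the primary one means a planned key rotation does not make Wget abort every connection.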
1 
1 ‘--random-file=FILE’
1      [OpenSSL and LibreSSL only] Use FILE as the source of random data
1      for seeding the pseudo-random number generator on systems without
1      ‘/dev/urandom’.
1 
1      On such systems the SSL library needs an external source of
1      randomness to initialize.  Randomness may be provided by EGD (see
1      ‘--egd-file’ below) or read from an external source specified by
1      the user.  If this option is not specified, Wget looks for random
1      data in ‘$RANDFILE’ or, if that is unset, in ‘$HOME/.rnd’.
1 
1      If you’re getting the “Could not seed OpenSSL PRNG; disabling SSL.”
1      error, you should provide random data using some of the methods
1      described above.
1 
1 ‘--egd-file=FILE’
1      [OpenSSL only] Use FILE as the EGD socket.  EGD stands for “Entropy
1      Gathering Daemon”, a user-space program that collects data from
1      various unpredictable system sources and makes it available to
1      other programs that might need it.  Encryption software, such as
1      the SSL library, needs sources of non-repeating randomness to seed
1      the random number generator used to produce cryptographically
1      strong keys.
1 
1      OpenSSL allows users to specify their own source of entropy using
1      the ‘RANDFILE’ environment variable.  If this variable is unset,
1      or if the specified file does not produce enough randomness,
1      OpenSSL will read random data from the EGD socket specified using
1      this option.
1 
1      If this option is not specified (and the equivalent startup command
1      is not used), EGD is never contacted.  EGD is not needed on modern
1      Unix systems that support ‘/dev/urandom’.
1 
1 ‘--no-hsts’
1      Wget supports HSTS (HTTP Strict Transport Security, RFC 6797) by
1      default.  Use ‘--no-hsts’ to make Wget act as a non-HSTS-compliant
1      UA.  As a consequence, Wget ignores all
1      ‘Strict-Transport-Security’ headers and does not enforce any
1      existing HSTS policy.
1 
1 ‘--hsts-file=FILE’
1      By default, Wget stores its HSTS database in ‘~/.wget-hsts’.  You
1      can use ‘--hsts-file’ to override this.  Wget will use the supplied
1      file as the HSTS database.  This file must conform to the HSTS
1      database format used by Wget.  If Wget cannot parse the provided
1      file, the behaviour is unspecified.
1 
1      Wget’s HSTS database is a plain text file.  Each line contains an
1      HSTS entry (i.e., a site that has issued a
1      ‘Strict-Transport-Security’ header and that therefore has specified
1      a concrete HSTS policy to be applied).  Lines starting with a hash
1      sign (‘#’) are ignored by Wget.  Please note that, in spite of this
1      convenient human-readability, hand-hacking the HSTS database is
1      generally not a good idea.
1 
1      An HSTS entry line consists of several fields separated by one or
1      more whitespace characters:
1 
1      ‘<hostname> SP [<port>] SP <include subdomains> SP <created> SP
1      <max-age>’
1 
1      The HOSTNAME and PORT fields indicate the hostname and port to
1      which the given HSTS policy applies.  The PORT field may be zero,
1      and in most cases it will be.  A zero port means that the port
1      number is not taken into account when deciding whether the HSTS
1      policy should be applied to a given request (only the hostname is
1      evaluated).  When PORT is nonzero, both the target hostname and
1      the port are evaluated, and the HSTS policy is applied only if
1      both of them match.  This feature has been included for
1      testing/development purposes only.  The Wget testsuite (in
1      ‘testenv/’) creates HSTS databases with explicit ports in order to
1      verify Wget’s correct behaviour.  Applying HSTS policies to ports
1      other than the default ones is discouraged by RFC 6797 (see
1      Appendix B "Differences between HSTS Policy and Same-Origin
1      Policy").  Thus, this functionality should not be used in
1      production environments and PORT will typically be zero.
1 
1      The last three fields do what their names suggest.  The
1      INCLUDE_SUBDOMAINS field is either ‘1’ or ‘0’ and signals whether
1      the subdomains of the target domain are covered by the given HSTS
1      policy as well.  The CREATED field holds the timestamp of when the
1      entry was created (first seen by Wget).  The MAX-AGE field holds
1      the HSTS-defined value ‘max-age’, which states how long the HSTS
1      policy should remain active, measured in seconds elapsed since the
1      timestamp stored in CREATED.  Once that time has passed, the HSTS
1      policy is no longer valid and will eventually be removed from the
1      database.
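The entry format described above can be read with a few lines of Python.  This is a sketch for inspecting a database, not Wget's own parser: it assumes every entry carries all five fields (with PORT written as ‘0’ when unused) and that CREATED is a Unix timestamp in seconds.  The names `HstsEntry` and `parse_hsts_line` are hypothetical.

```python
from typing import NamedTuple, Optional


class HstsEntry(NamedTuple):
    hostname: str
    port: int                 # 0 means "ignore the port when matching"
    include_subdomains: bool
    created: int              # assumed: Unix timestamp in seconds
    max_age: int              # seconds counted from `created`

    def expired(self, now: int) -> bool:
        """True once max_age seconds have elapsed since created."""
        return now > self.created + self.max_age


def parse_hsts_line(line: str) -> Optional[HstsEntry]:
    """Parse one database line; return None for blank or '#' lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()  # fields are separated by any whitespace
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    hostname, port, subdomains, created, max_age = fields
    if subdomains not in ("0", "1"):
        raise ValueError(f"include-subdomains must be 0 or 1: {subdomains!r}")
    return HstsEntry(hostname, int(port), subdomains == "1",
                     int(created), int(max_age))


entry = parse_hsts_line("example.com\t0\t1\t1700000000\t31536000")
print(entry)
print(entry.expired(now=1700000000 + 31536000 + 1))
```

Reading the file this way is harmless; writing it back by hand is what the text above advises against.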
1 
1      If you supply your own HSTS database via ‘--hsts-file’, be aware
1      that Wget may modify the provided file if the HSTS policies
1      requested by the remote servers differ from those in the file.
1      When Wget exits, it updates the HSTS database by rewriting the
1      database file with the new entries.
1 
1      If the supplied file does not exist, Wget will create one.  This
1      file will contain the new HSTS entries.  If no HSTS entries were
1      generated (no ‘Strict-Transport-Security’ headers were sent by any
1      of the servers) then no file will be created, not even an empty
1      one.  This behaviour applies to the default database file
1      (‘~/.wget-hsts’) as well: it will not be created until some server
1      enforces an HSTS policy.
1 
1      Care is taken not to overwrite changes made to the HSTS database
1      by other Wget processes running at the same time.  Before writing
1      the updated HSTS entries to the file, Wget re-reads it and merges
1      the changes.
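The re-read-and-merge step can be illustrated with the sketch below.  The merge rule is an assumption made for this example (for each hostname and port, the entry with the newer CREATED timestamp wins); the text above does not say how Wget resolves conflicts, and `merge_hsts` is a hypothetical name.

```python
def merge_hsts(on_disk, in_memory):
    """Merge two HSTS tables keyed by (hostname, port).

    Each value is (include_subdomains, created, max_age).
    Assumed rule: the entry with the newer `created` timestamp wins.
    """
    merged = dict(on_disk)  # start from what another process may have written
    for key, entry in in_memory.items():
        current = merged.get(key)
        if current is None or entry[1] >= current[1]:
            merged[key] = entry
    return merged


# What is in the file now (possibly written by another process).
on_disk = {
    ("example.com", 0): (True, 1700000000, 31536000),
    ("example.org", 0): (False, 1700000500, 86400),
}
# What this process learned during its run.
in_memory = {
    ("example.com", 0): (True, 1700001000, 31536000),
    ("example.net", 0): (False, 1700002000, 3600),
}
merged = merge_hsts(on_disk, in_memory)
print(sorted(merged))
```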
1 
1      Using a custom HSTS database and/or modifying an existing one is
1      discouraged.  For more information about the potential security
1      threats arising from such practice, see section 14 "Security
1      Considerations" of RFC 6797, especially section 14.9 "Creative
1      Manipulation of HSTS Policy Store".
1 
1 ‘--warc-file=FILE’
1      Use FILE as the destination WARC file.
1 
1 ‘--warc-header=STRING’
1      Insert STRING into the warcinfo record.
1 
1 ‘--warc-max-size=SIZE’
1      Set the maximum size of the WARC files to SIZE.
1 
1 ‘--warc-cdx’
1      Write CDX index files.
1 
1 ‘--warc-dedup=FILE’
1      Do not store records listed in this CDX file.
1 
1 ‘--no-warc-compression’
1      Do not compress WARC files with GZIP.
1 
1 ‘--no-warc-digests’
1      Do not calculate SHA1 digests.
1 
1 ‘--no-warc-keep-log’
1      Do not store the log file in a WARC record.
1 
1 ‘--warc-tempdir=DIR’
1      Specify the location for temporary files created by the WARC
1      writer.
1