wget: HTTPS (SSL/TLS) Options
1
1 2.8 HTTPS (SSL/TLS) Options
1 ===========================
1
1 To support encrypted HTTP (HTTPS) downloads, Wget must be compiled with
1 an external SSL library. The current default is GnuTLS. In addition,
1 Wget also supports HSTS (HTTP Strict Transport Security). If Wget is
1 compiled without SSL support, none of these options are available.
1
1 ‘--secure-protocol=PROTOCOL’
1 Choose the secure protocol to be used. Legal values are ‘auto’,
1 ‘SSLv2’, ‘SSLv3’, ‘TLSv1’, ‘TLSv1_1’, ‘TLSv1_2’, ‘TLSv1_3’ and
1 ‘PFS’. If ‘auto’ is used, the SSL library is given the liberty of
1 choosing the appropriate protocol automatically, which is achieved
1 by sending a TLSv1 greeting. This is the default.
1
1 Specifying ‘SSLv2’, ‘SSLv3’, ‘TLSv1’, ‘TLSv1_1’, ‘TLSv1_2’ or
1 ‘TLSv1_3’ forces the use of the corresponding protocol. This is
1 useful when talking to old and buggy SSL server implementations
1 that make it hard for the underlying SSL library to choose the
1 correct protocol version. Fortunately, such servers are quite
1 rare.
1
1 Specifying ‘PFS’ enforces the use of the so-called Perfect Forward
1 Secrecy cipher suites. In short, PFS adds security by creating a
1 one-time key for each SSL connection, at the cost of slightly more
1 CPU work on both client and server. Only ciphers known to be secure
1 are used (e.g. no MD4), together with the TLS protocol. This mode
1 also explicitly excludes non-PFS key exchange methods, such as RSA.
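
For scripts that drive Wget, the accepted values listed above can be checked before the process is started. The following Python sketch only builds the argument list; the helper name `secure_protocol_args` and the example URL are hypothetical, and the check matches the names exactly as they are spelled in this section.

```python
# Values accepted by --secure-protocol, as listed in this section.
SECURE_PROTOCOLS = (
    "auto", "SSLv2", "SSLv3", "TLSv1", "TLSv1_1", "TLSv1_2", "TLSv1_3", "PFS",
)


def secure_protocol_args(url, protocol="auto"):
    """Return an argv list for wget with a validated --secure-protocol value."""
    if protocol not in SECURE_PROTOCOLS:
        raise ValueError(f"unsupported secure protocol: {protocol!r}")
    return ["wget", f"--secure-protocol={protocol}", url]


if __name__ == "__main__":
    # Hypothetical URL; pass the list to subprocess.run() to actually run wget.
    print(secure_protocol_args("https://example.com/file.tar.gz", "TLSv1_2"))
```

Building a list rather than a shell string avoids quoting problems when the URL contains characters that are special to the shell.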
1
1 ‘--https-only’
1 When in recursive mode, only HTTPS links are followed.
1
1 ‘--ciphers’
1 Set the cipher list string. Typically this string sets the cipher
1 suites and other SSL/TLS options that the user wishes to be used,
1 in a set order of preference (GnuTLS calls it a ‘priority string’).
1 This string will be fed verbatim to the SSL/TLS engine (OpenSSL or
1 GnuTLS) and hence its format and syntax depend on that engine. Wget
1 will not process or manipulate it in any way. Refer to the OpenSSL
1 or GnuTLS documentation for more information.
1
1 ‘--no-check-certificate’
1 Don’t check the server certificate against the available
1 certificate authorities. Also don’t require the URL host name to
1 match the common name presented by the certificate.
1
1 As of Wget 1.10, the default is to verify the server’s certificate
1 against the recognized certificate authorities, breaking the SSL
1 handshake and aborting the download if the verification fails.
1 Although this provides more secure downloads, it does break
1 interoperability with some sites that worked with previous Wget
1 versions, particularly those using self-signed, expired, or
1 otherwise invalid certificates. This option forces an “insecure”
1 mode of operation that turns the certificate verification errors
1 into warnings and allows you to proceed.
1
1 If you encounter “certificate verification” errors or ones saying
1 that “common name doesn’t match requested host name”, you can use
1 this option to bypass the verification and proceed with the
1 download. _Only use this option if you are otherwise convinced of
1 the site’s authenticity, or if you really don’t care about the
1 validity of its certificate._ It is almost always a bad idea not
1 to check the certificates when transmitting confidential or
1 important data. For self-signed/internal certificates, you should
1 download the certificate and verify against that instead of forcing
1 this insecure mode. If you are really sure that you do not want any
1 certificate verification, you can specify ‘--check-certificate=quiet’
1 to tell Wget not to print any warning about invalid certificates,
1 although in most cases this is the wrong thing to do.
1
1 ‘--certificate=FILE’
1 Use the client certificate stored in FILE. This is needed for
1 servers that are configured to require certificates from the
1 clients that connect to them. Normally a certificate is not
1 required and this switch is optional.
1
1 ‘--certificate-type=TYPE’
1 Specify the type of the client certificate. Legal values are ‘PEM’
1 (assumed by default) and ‘DER’, also known as ‘ASN1’.
1
1 ‘--private-key=FILE’
1 Read the private key from FILE. This allows you to provide the
1 private key in a file separate from the certificate.
1
1 ‘--private-key-type=TYPE’
1 Specify the type of the private key. Accepted values are ‘PEM’
1 (the default) and ‘DER’.
1
1 ‘--ca-certificate=FILE’
1 Use FILE as the file with the bundle of certificate authorities
1 (“CA”) to verify the peers. The certificates must be in PEM
1 format.
1
1 Without this option Wget looks for CA certificates at the
1 system-specified locations, chosen at OpenSSL installation time.
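
Since the bundle must be in PEM format, a script that passes a self-signed or internal certificate to Wget can perform a cheap sanity check first. The sketch below is only an illustration: the helper names are hypothetical, and the check merely looks for a PEM certificate header, it does not validate the certificate itself.

```python
import os

PEM_CERT_MARKER = "-----BEGIN CERTIFICATE-----"


def looks_like_pem_bundle(path):
    """Cheap check: the file exists and contains a PEM certificate header."""
    if not os.path.isfile(path):
        return False
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return PEM_CERT_MARKER in handle.read()


def ca_certificate_args(url, ca_file):
    """Return a wget argv list that verifies the peer against ca_file."""
    if not looks_like_pem_bundle(ca_file):
        raise ValueError(f"{ca_file!r} does not look like a PEM certificate bundle")
    return ["wget", f"--ca-certificate={ca_file}", url]
```

This keeps certificate verification enabled, which is generally preferable to ‘--no-check-certificate’ for hosts whose certificate you already trust.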
1
1 ‘--ca-directory=DIRECTORY’
1 Specifies the directory containing CA certificates in PEM format. Each
1 file contains one CA certificate, and the file name is based on a
1 hash value derived from the certificate. This is achieved by
1 processing a certificate directory with the ‘c_rehash’ utility
1 supplied with OpenSSL. Using ‘--ca-directory’ is more efficient
1 than ‘--ca-certificate’ when many certificates are installed
1 because it allows Wget to fetch certificates on demand.
1
1 Without this option Wget looks for CA certificates at the
1 system-specified locations, chosen at OpenSSL installation time.
1
1 ‘--crl-file=FILE’
1 Specifies a CRL file in FILE. This is needed for certificates that
1 have been revoked by the CAs.
1
1 ‘--pinnedpubkey=file/hashes’
1 Tells Wget to use the specified public key file (or hashes) to
1 verify the peer. This can be a path to a file which contains a
1 single public key in PEM or DER format, or any number of
1 base64-encoded SHA-256 hashes preceded by “sha256//” and separated
1 by “;”.
1
1 When negotiating a TLS or SSL connection, the server sends a
1 certificate indicating its identity. A public key is extracted
1 from this certificate and if it does not exactly match the public
1 key(s) provided to this option, wget will abort the connection
1 before sending or receiving any data.
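
The hash form of the argument can be derived from a PEM public key file using only the Python standard library. This is a sketch under one assumption that the text above does not spell out: the SHA-256 digest is taken over the DER encoding of the public key, which is what the base64 body of a ‘PUBLIC KEY’ PEM block contains. The function names are hypothetical.

```python
import base64
import hashlib

PEM_BEGIN = "-----BEGIN PUBLIC KEY-----"
PEM_END = "-----END PUBLIC KEY-----"


def pin_from_public_key_pem(pem_text):
    """Return 'sha256//<base64 digest>' for a PEM-encoded public key."""
    start = pem_text.index(PEM_BEGIN) + len(PEM_BEGIN)
    end = pem_text.index(PEM_END, start)
    # The PEM body is the base64 form of the DER-encoded key (assumption:
    # this DER blob is what gets hashed for the pin).
    der = base64.b64decode("".join(pem_text[start:end].split()))
    digest = hashlib.sha256(der).digest()
    return "sha256//" + base64.b64encode(digest).decode("ascii")


def pinnedpubkey_option(pins):
    """Join one or more pins into a single --pinnedpubkey argument."""
    return "--pinnedpubkey=" + ";".join(pins)
```

Listing more than one pin is useful when a server is about to rotate its key: both the current and the next key can be accepted during the transition.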
1
1 ‘--random-file=FILE’
1 [OpenSSL and LibreSSL only] Use FILE as the source of random data
1 for seeding the pseudo-random number generator on systems without
1 ‘/dev/urandom’.
1
1 On such systems the SSL library needs an external source of
1 randomness to initialize. Randomness may be provided by EGD (see
1 ‘--egd-file’ below) or read from an external source specified by
1 the user. If this option is not specified, Wget looks for random
1 data in ‘$RANDFILE’ or, if that is unset, in ‘$HOME/.rnd’.
1
1 If you’re getting the “Could not seed OpenSSL PRNG; disabling SSL.”
1 error, you should provide random data using some of the methods
1 described above.
1
1 ‘--egd-file=FILE’
1 [OpenSSL only] Use FILE as the EGD socket. EGD stands for “Entropy
1 Gathering Daemon”, a user-space program that collects data from
1 various unpredictable system sources and makes it available to
1 other programs that might need it. Encryption software, such as
1 the SSL library, needs sources of non-repeating randomness to seed
1 the random number generator used to produce cryptographically
1 strong keys.
1
1 OpenSSL allows users to specify their own source of entropy using
1 the ‘RANDFILE’ environment variable. If this variable is unset,
1 or if the specified file does not produce enough randomness,
1 OpenSSL will read random data from the EGD socket specified using
1 this option.
1
1 If this option is not specified (and the equivalent startup command
1 is not used), EGD is never contacted. EGD is not needed on modern
1 Unix systems that support ‘/dev/urandom’.
1
1 ‘--no-hsts’
1 Wget supports HSTS (HTTP Strict Transport Security, RFC 6797) by
1 default. Use ‘--no-hsts’ to make Wget act as a non-HSTS-compliant
1 UA. As a consequence, Wget ignores all
1 ‘Strict-Transport-Security’ headers and does not enforce any
1 existing HSTS policy.
1
1 ‘--hsts-file=FILE’
1 By default, Wget stores its HSTS database in ‘~/.wget-hsts’. You
1 can use ‘--hsts-file’ to override this. Wget will use the supplied
1 file as the HSTS database. Such a file must conform to the correct
1 HSTS database format used by Wget. If Wget cannot parse the
1 provided file, the behaviour is unspecified.
1
1 Wget’s HSTS database is a plain text file. Each line contains
1 an HSTS entry (i.e. a site that has issued a
1 ‘Strict-Transport-Security’ header and that therefore has specified
1 a concrete HSTS policy to be applied). Lines starting with a hash
1 (‘#’) are ignored by Wget. Please note that, in spite of this
1 convenient human-readability, hand-hacking the HSTS database is
1 generally not a good idea.
1
1 An HSTS entry line consists of several fields separated by one or
1 more whitespace characters:
1
1 ‘<hostname> SP [<port>] SP <include subdomains> SP <created> SP
1 <max-age>’
1
1 The HOSTNAME and PORT fields indicate the hostname and port to
1 which the given HSTS policy applies. The PORT field may be zero,
1 and in most cases it will be. A zero port means that the port number
1 is not taken into account when deciding whether the HSTS policy
1 should be applied to a given request (only the hostname is
1 evaluated). When PORT is non-zero, both the target hostname and the
1 port are evaluated, and the HSTS policy is applied only if both of
1 them match. This feature has been included for testing/development
1 purposes only. The Wget testsuite (in ‘testenv/’) creates HSTS
1 databases with explicit ports with the purpose of ensuring Wget’s
1 correct behaviour. Applying HSTS policies to ports other than the
1 default ones is discouraged by RFC 6797 (see Appendix B "Differences
1 between HSTS Policy and Same-Origin Policy"). Thus, this
1 functionality should not be used in production environments and
1 PORT will typically be zero.
1
1 The last three fields behave as their names suggest. The field
1 INCLUDE_SUBDOMAINS can be either ‘1’ or ‘0’ and signals whether
1 the subdomains of the target domain should be part of the given
1 HSTS policy as well. The CREATED field holds the timestamp of when
1 the entry was created (first seen by Wget). The MAX-AGE field holds
1 the HSTS-defined value ‘max-age’, which states how long the HSTS
1 policy should remain active, measured in seconds elapsed since the
1 timestamp stored in CREATED. Once that time has passed, the HSTS
1 policy is no longer valid and will eventually be removed from the
1 database.
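
The entry format described above is simple enough to inspect with a few lines of Python. The following read-only sketch assumes that all five fields are present on every entry line (with PORT written as ‘0’ when unused) and that CREATED is a Unix timestamp in seconds; the names `HstsEntry` and `parse_hsts_line` are hypothetical and are not part of Wget.

```python
import time
from typing import NamedTuple


class HstsEntry(NamedTuple):
    hostname: str
    port: int
    include_subdomains: bool
    created: int
    max_age: int

    def expired(self, now=None):
        """True once more than max_age seconds have elapsed since created."""
        now = time.time() if now is None else now
        return now > self.created + self.max_age


def parse_hsts_line(line):
    """Parse one database line; return None for blank and '#' comment lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()  # fields are separated by one or more whitespace
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    host, port, subdomains, created, max_age = fields
    return HstsEntry(host, int(port), subdomains == "1", int(created), int(max_age))
```

The sketch deliberately never writes to the file, since editing the HSTS database by hand is discouraged.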
1
1 If you supply your own HSTS database via ‘--hsts-file’, be aware
1 that Wget may modify the provided file if any change occurs between
1 the HSTS policies requested by the remote servers and those in the
1 file. When Wget exits, it effectively updates the HSTS database
1 by rewriting the database file with the new entries.
1
1 If the supplied file does not exist, Wget will create one. This
1 file will contain the new HSTS entries. If no HSTS entries were
1 generated (no ‘Strict-Transport-Security’ headers were sent by any
1 of the servers) then no file will be created, not even an empty
1 one. This behaviour applies to the default database file
1 (‘~/.wget-hsts’) as well: it will not be created until some server
1 enforces an HSTS policy.
1
1 Care is taken not to overwrite changes made to the HSTS database
1 by other Wget processes running at the same time. Before dumping
1 the updated HSTS entries to the file, Wget will re-read it and
1 merge the changes.
1
1 Using a custom HSTS database and/or modifying an existing one is
1 discouraged. For more information about the potential security
1 threats arising from such practice, see section 14 "Security
1 Considerations" of RFC 6797, especially section 14.9 "Creative
1 Manipulation of HSTS Policy Store".
1
1 ‘--warc-file=FILE’
1 Use FILE as the destination WARC file.
1
1 ‘--warc-header=STRING’
1 Use STRING as the warcinfo record.
1
1 ‘--warc-max-size=SIZE’
1 Set the maximum size of the WARC files to SIZE.
1
1 ‘--warc-cdx’
1 Write CDX index files.
1
1 ‘--warc-dedup=FILE’
1 Do not store records listed in this CDX file.
1
1 ‘--no-warc-compression’
1 Do not compress WARC files with GZIP.
1
1 ‘--no-warc-digests’
1 Do not calculate SHA1 digests.
1
1 ‘--no-warc-keep-log’
1 Do not store the log file in a WARC record.
1
1 ‘--warc-tempdir=DIR’
1 Specify the location for temporary files created by the WARC
1 writer.
1