wget: Very Advanced Usage

7.3 Very Advanced Usage
=======================

   • If you wish Wget to keep a mirror of a page (or FTP
     subdirectories), use ‘--mirror’ (‘-m’), which is shorthand for
     ‘-r -N -l inf --no-remove-listing’.  You can put Wget in the
     crontab file, asking it to recheck a site each Sunday at midnight:

          crontab
          0 0 * * 0 wget --mirror https://www.gnu.org/ -o /home/me/weeklog

   • In addition to the above, you want the links to be converted for
     local viewing.  But, after having read this manual, you know that
     link conversion doesn’t play well with timestamping, so you also
     want Wget to back up the original HTML files before the conversion.
     The Wget invocation would look like this:

          wget --mirror --convert-links --backup-converted \
               https://www.gnu.org/ -o /home/me/weeklog

   • But you’ve also noticed that local viewing doesn’t work all that
     well when HTML files are saved under extensions other than ‘.html’,
     perhaps because they were served as ‘index.cgi’.  So you’d like
     Wget to rename all the files served with content-type ‘text/html’
     or ‘application/xhtml+xml’ to ‘NAME.html’:

          wget --mirror --convert-links --backup-converted \
               --adjust-extension -o /home/me/weeklog \
               https://www.gnu.org/

     Or, with less typing:

          wget -m -k -K -E https://www.gnu.org/ -o /home/me/weeklog
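
Once the command grows this long, one option is to move it into a small
wrapper script and point the crontab entry at that script instead.  The
following is only a sketch: the function name ‘mirror_site’ and the
‘DRY_RUN’ variable are made up for illustration and are not part of
Wget.  It spells ‘-E’ as ‘--adjust-extension’, the name the option has
carried since Wget 1.12 (older releases call it ‘--html-extension’).

```shell
#!/bin/sh
# Sketch of a cron wrapper. mirror_site and DRY_RUN are hypothetical names
# chosen for this example; they are not Wget features.

mirror_site() {
    url=$1
    logfile=$2

    # Same options as 'wget -m -k -K -E', spelled out in long form.
    set -- wget --mirror --convert-links --backup-converted \
        --adjust-extension -o "$logfile" "$url"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command line instead of contacting the server.
        echo "$*"
    else
        "$@"
    fi
}
```

A script built around this function would end with a call such as
‘mirror_site https://www.gnu.org/ /home/me/weeklog’.  Setting
‘DRY_RUN=1’ prints the command without running it, which may be handy
for checking the options before the first scheduled run.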
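
To see which files ‘--backup-converted’ has preserved, you can search
the mirror for the ‘.orig’ suffix that ‘-K’ appends to each backup.  A
minimal sketch, assuming the mirror lives under the directory given as
the first argument; ‘list_backups’ is a name invented for this example:

```shell
# List the pre-conversion backups left behind by --backup-converted (-K).
# list_backups is a hypothetical helper, not a Wget command.
list_backups() {
    mirror_dir=$1
    # -K keeps each original file next to the converted one, with a
    # .orig suffix added to its name.
    find "$mirror_dir" -type f -name '*.orig' | sort
}
```

By default a recursive download is stored in a directory named after
the host, so for the examples above ‘list_backups www.gnu.org’, run
from the download directory, would be the likely invocation.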