wget: Very Advanced Usage

7.3 Very Advanced Usage
=======================

   • If you wish Wget to keep a mirror of a page (or FTP
     subdirectories), use ‘--mirror’ (‘-m’), which is currently
     shorthand for ‘-r -N -l inf --no-remove-listing’.  You can put
     Wget in the crontab file, asking it to recheck a site each Sunday:

          crontab
          0 0 * * 0 wget --mirror https://www.gnu.org/ -o /home/me/weeklog

   • In addition to the above, you want the links to be converted for
     local viewing.  But, after having read this manual, you know that
     link conversion doesn’t play well with timestamping, so you also
     want Wget to back up the original HTML files before the conversion.
     The Wget invocation would look like this:

          wget --mirror --convert-links --backup-converted  \
               https://www.gnu.org/ -o /home/me/weeklog

   • But you’ve also noticed that local viewing doesn’t work all that
     well when HTML files are saved under extensions other than ‘.html’,
     perhaps because they were served as ‘index.cgi’.  So you’d like
     Wget to rename all the files served with content-type ‘text/html’
     or ‘application/xhtml+xml’ to ‘NAME.html’, using
     ‘--adjust-extension’ (‘-E’; formerly ‘--html-extension’):

          wget --mirror --convert-links --backup-converted \
               --adjust-extension -o /home/me/weeklog      \
               https://www.gnu.org/

     Or, with less typing:

          wget -m -k -K -E https://www.gnu.org/ -o /home/me/weeklog
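
The weekly cron job and the final short-form invocation could be combined in a small wrapper script, so that the crontab entry stays short and the command can be inspected before it runs.  What follows is only a minimal sketch: the function name ‘mirror_cmd’, the variables ‘MIRROR_URL’ and ‘MIRROR_LOG’, and the ‘--run’ switch are hypothetical names chosen for illustration and are not part of Wget.

```shell
#!/bin/sh
# Hypothetical wrapper around the mirroring command from this section.
# MIRROR_URL and MIRROR_LOG are illustrative variables, not Wget settings.
MIRROR_URL=${MIRROR_URL:-https://www.gnu.org/}
MIRROR_LOG=${MIRROR_LOG:-/home/me/weeklog}

# Print the command line that would be executed, without running it.
mirror_cmd() {
    printf '%s\n' "wget -m -k -K -E $MIRROR_URL -o $MIRROR_LOG"
}

# With '--run' (a switch of this script, not of Wget), start the mirror;
# with no argument, only show the command so it can be checked first.
if [ "${1:-}" = "--run" ]; then
    wget -m -k -K -E "$MIRROR_URL" -o "$MIRROR_LOG"
else
    mirror_cmd
fi
```

If the script were saved under some path of your choosing and made executable, the crontab line would call it with ‘--run’ in place of the full ‘wget’ command.  Note that ‘-o’ overwrites the log file on every run; use ‘-a’ instead if you would rather append to it.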