wget: HTTP Time-Stamping Internals

5.2 HTTP Time-Stamping Internals
================================

Time-stamping in HTTP is implemented by checking the ‘Last-Modified’
header.  If you wish to retrieve the file ‘foo.html’ through HTTP, Wget
will check whether ‘foo.html’ exists locally.  If it doesn’t, ‘foo.html’
will be retrieved unconditionally.

   If the file does exist locally, Wget will first check its local
time-stamp (similar to the way ‘ls -l’ checks it), and then send a
‘HEAD’ request to the remote server, asking for information about the
remote file.
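
   As a rough illustration of these two steps, the Python sketch below
reads a local modification time and builds (without sending) a ‘HEAD’
request.  It is not Wget's own code; the helper names and the
‘example.com’ URL are assumptions made for the example.

```python
import os
import tempfile
import urllib.request
from datetime import datetime, timezone


def local_timestamp(path):
    """Return the file's modification time as an aware UTC datetime."""
    return datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)


def build_head_request(url):
    """Build (but do not send) a HEAD request for the remote file."""
    return urllib.request.Request(url, method="HEAD")


# Demonstration with a temporary file and a placeholder URL.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
os.utime(demo_path, (0, 86400))  # set mtime to one day after the epoch
print(local_timestamp(demo_path).isoformat())  # 1970-01-02T00:00:00+00:00

request = build_head_request("http://example.com/foo.html")
print(request.get_method())  # HEAD
os.remove(demo_path)
```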

   The ‘Last-Modified’ header in the response is examined to find which
file was modified more recently (which makes it “newer”).  If the remote
file is newer, it will be downloaded; if it is older, Wget will give up
and leave the local file untouched.(1)
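
   The comparison, together with the size check from footnote (1), can
be sketched in Python roughly as follows.  This is a simplified model
under stated assumptions, not Wget's implementation: the function name
and parameters are hypothetical, and equal time-stamps are treated here
as “not newer”.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def should_download(local_mtime, local_size, last_modified, content_length=None):
    """Decide whether the remote file should be fetched.

    local_mtime    -- aware datetime of the local file's modification time
    local_size     -- size of the local file in bytes
    last_modified  -- value of the Last-Modified response header
    content_length -- value of the Content-Length header, if present
    """
    # Size check from the footnote: a mismatch forces a download.
    if content_length is not None and int(content_length) != local_size:
        return True
    # Otherwise download only when the remote copy is strictly newer.
    remote_mtime = parsedate_to_datetime(last_modified)
    return remote_mtime > local_mtime


local = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
print(should_download(local, 100, "Wed, 21 Oct 2015 07:28:01 GMT", "100"))  # True
print(should_download(local, 100, "Wed, 21 Oct 2015 07:27:59 GMT", "100"))  # False
print(should_download(local, 100, "Wed, 21 Oct 2015 07:27:59 GMT", "250"))  # True
```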

   When ‘--backup-converted’ (‘-K’) is specified in conjunction with
‘-N’, server file ‘X’ is compared to local file ‘X.orig’, if it exists,
rather than being compared to local file ‘X’, which will always differ
if it’s been converted by ‘--convert-links’ (‘-k’).
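
   The choice of which local file to compare against can be pictured
with the short Python sketch below.  The function and its
‘backup_converted’ flag are hypothetical stand-ins for the ‘-K’ option,
used only to illustrate the rule described above.

```python
import os


def comparison_file(path, backup_converted=False):
    """Return the local file whose time-stamp and size should be compared.

    With backup_converted set (modelling -K), prefer 'path.orig' when it
    exists, since 'path' itself may have been rewritten by link conversion.
    """
    orig = path + ".orig"
    if backup_converted and os.path.exists(orig):
        return orig
    return path
```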

   Arguably, HTTP time-stamping should be implemented using the
‘If-Modified-Since’ request header.
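
   For comparison, a conditional request of that kind could be built as
in the Python sketch below; a server that honours the header answers
‘304 Not Modified’ when the file has not changed since the given date.
The helper name and URL are assumptions, and the request is only
constructed, never sent.

```python
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, local_mtime_epoch):
    """Build a GET request carrying the local time-stamp in If-Modified-Since."""
    # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    stamp = formatdate(local_mtime_epoch, usegmt=True)
    return urllib.request.Request(url, headers={"If-Modified-Since": stamp})


request = build_conditional_request("http://example.com/foo.html", 0)
# urllib stores header names capitalized as "If-modified-since".
print(request.get_header("If-modified-since"))  # Thu, 01 Jan 1970 00:00:00 GMT
```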

   ---------- Footnotes ----------

   (1) As an additional check, Wget will look at the ‘Content-Length’
header, and compare the sizes; if they are not the same, the remote file
will be downloaded no matter what the time-stamp says.