find: LOCATE02 Database Format

4.2.1 LOCATE02 Database Format
------------------------------

'updatedb' runs a program called 'frcode' to "front-compress" the list
of file names, which reduces the database size by a factor of 4 to 5.
Front-compression (also known as incremental encoding) works as follows.

   The database entries form a sorted list (sorted case-insensitively,
for users' convenience).  Since the list is sorted, each entry is likely
to share a prefix (initial string) with the previous entry.  Each
database entry begins with an offset-differential count byte: the length
of the prefix that the entry shares with the preceding entry, minus the
length of the prefix that the preceding entry shares with its own
predecessor.  (The counts can be negative.)  Following the count is a
null-terminated ASCII remainder, the part of the name that follows the
shared prefix.
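
   As an illustration of the counts described above, here is a minimal
Python sketch that derives the offset-differential count and remainder
for each name in an already sorted list.  The function names and the
sample paths are hypothetical, the dummy first entry is left out, and
the code is not taken from 'frcode'.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def differential_counts(names):
    """Yield (count, remainder) pairs for an already sorted list of names.

    count is the shared-prefix length with the previous name minus the
    shared-prefix length that the previous name had with its predecessor.
    """
    prev_name = ""
    prev_shared = 0
    for name in names:
        shared = common_prefix_len(prev_name, name)
        yield shared - prev_shared, name[shared:]
        prev_name, prev_shared = name, shared


# Hypothetical sample data, already in sorted order.
sample = ["/usr/bin/awk", "/usr/bin/bash", "/usr/lib", "/var"]
for count, remainder in differential_counts(sample):
    print(count, remainder)
```

   For this sample the counts are 0, 9, -4 and -4: the third name shares
only 5 characters with the second, which had shared 9 with the first, so
its count is negative.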

   If the offset-differential count is too large to be stored in a byte
(that is, outside the range +/-127), the byte has the value 0x80 and the
count follows in a 2-byte word, with the high byte first (network byte
order).
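
   The byte layout just described can be sketched in Python as follows.
This is an illustrative sketch, not the code used by 'frcode' or
'locate': the function names are hypothetical, and the sketch assumes
that both the count byte and the 2-byte word are signed two's-complement
values.  It does not treat any entry specially, so every entry in the
input is yielded.

```python
import struct


def encode_count(diff: int) -> bytes:
    """Encode one offset-differential count (assumed signed)."""
    if -127 <= diff <= 127:
        return struct.pack("b", diff)           # fits in the count byte
    return b"\x80" + struct.pack(">h", diff)    # escape byte, then big-endian word


def decode_entries(data: bytes):
    """Yield each full name (as bytes) from front-compressed data."""
    pos = 0
    shared = 0      # prefix length reused from the previous name
    prev = b""
    while pos < len(data):
        (diff,) = struct.unpack_from("b", data, pos)
        pos += 1
        if diff == -128:                        # 0x80: real count is in next 2 bytes
            (diff,) = struct.unpack_from(">h", data, pos)
            pos += 2
        shared += diff
        end = data.index(b"\0", pos)            # remainder is null-terminated
        prev = prev[:shared] + data[pos:end]
        pos = end + 1
        yield prev


# Hypothetical sample: three entries with counts 0, 9 and -4.
blob = (encode_count(0) + b"/usr/bin/awk\0"
        + encode_count(9) + b"bash\0"
        + encode_count(-4) + b"lib\0")
for name in decode_entries(blob):
    print(name.decode("ascii"))
```

   Note that the decoder keeps a running total of the counts; a count on
its own does not say how much of the previous name to reuse.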

   Every database begins with a dummy entry for a file called
'LOCATE02'.  'locate' checks for this entry to ensure that the database
file has the correct format, and ignores it when searching.

   Databases cannot be concatenated, even if the first (dummy) entry is
trimmed from all but the first database.  This is because the
offset-differential count in the first entry of the second and following
databases will be wrong.

   In the output of 'locate --statistics', the new database format is
referred to as 'LOCATE02'.