find: LOCATE02 Database Format
1
1 4.2.1 LOCATE02 Database Format
1 ------------------------------
1
1 'updatedb' runs a program called 'frcode' to "front-compress" the list
1 of file names, which reduces the database size by a factor of 4 to 5.
1 Front-compression (also known as incremental encoding) works as follows.
1
1 The database entries form a list sorted case-insensitively (for
1 users' convenience). Since the list is sorted, each entry is likely to
1 share a prefix (initial string) with the previous entry. Each database
1 entry begins with an offset-differential count byte: the number of
1 prefix characters the entry shares with the preceding entry, minus the
1 number that the preceding entry shares with its own predecessor. (The
1 counts can be negative.) For example, if one entry shares 9 characters
1 with its predecessor and the next entry shares only 5 with it, the
1 second count is -4. Following the count is a null-terminated ASCII
1 remainder: the part of the name that follows the shared prefix.
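
The differential counts described above can be illustrated with a minimal
Python sketch. This is not the 'frcode' source: the names
`common_prefix_len` and `front_compress` are made up for illustration, and
the sketch produces (count, remainder) pairs rather than on-disk bytes.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Return the number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_compress(names):
    """Yield (differential_count, remainder) for an ordered list of names.

    Illustrative only; the caller is assumed to pass the names already sorted.
    """
    prev = b""
    prev_shared = 0
    for name in names:
        shared = common_prefix_len(prev, name)
        # The count is the change in shared-prefix length, so it can be negative.
        yield shared - prev_shared, name[shared:]
        prev, prev_shared = name, shared


names = [b"/usr/bin/awk", b"/usr/bin/bash", b"/usr/lib", b"/var"]
pairs = list(front_compress(names))
```

With these four names, the shared-prefix lengths are 0, 9, 5 and 1, so the
counts come out as 0, 9, -4 and -4.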
1
1 If the offset-differential count does not fit in a single byte (that
1 is, it lies outside the range -127 to +127), the byte has the value
1 0x80 and the count follows in a 2-byte word, with the high byte first
1 (network byte order).
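
A sketch of how a single count might be written and read back. It assumes
that the one-byte form is a signed two's-complement byte and that the
2-byte word is signed as well; the text says counts can be negative but
does not spell out the signedness of the word. `encode_count` and
`decode_count` are hypothetical names.

```python
import struct


def encode_count(diff: int) -> bytes:
    """Encode one offset-differential count (assumed signed)."""
    if -127 <= diff <= 127:
        return struct.pack("b", diff)          # fits in one signed byte
    # Escape byte 0x80, then a 2-byte big-endian (network order) word.
    return b"\x80" + struct.pack(">h", diff)


def decode_count(buf: bytes, pos: int = 0):
    """Return (count, position just past the count)."""
    (diff,) = struct.unpack_from("b", buf, pos)
    pos += 1
    if diff == -128:                           # 0x80 read as a signed byte
        (diff,) = struct.unpack_from(">h", buf, pos)
        pos += 2
    return diff, pos
```

For instance, `encode_count(-4)` gives the single byte 0xFC, while
`encode_count(300)` gives the three bytes 0x80 0x01 0x2C.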
1
1 Every database begins with a dummy entry for a file called
1 'LOCATE02'. 'locate' checks for this entry to ensure that the database
1 file has the correct format, and ignores it when searching.
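
The format check can be sketched as a small reader that expands every
entry and rejects input whose first entry is not 'LOCATE02'. This is an
illustration rather than the 'locate' implementation: it assumes the whole
database is held in memory as bytes and that counts are signed, and the
names `read_entries` and `read_database` are hypothetical.

```python
import struct


def read_entries(data: bytes):
    """Yield every expanded entry, including the leading dummy entry."""
    pos = 0
    shared = 0          # running shared-prefix length
    prev = b""
    while pos < len(data):
        (diff,) = struct.unpack_from("b", data, pos)
        pos += 1
        if diff == -128:                        # 0x80 escape: 2-byte count follows
            (diff,) = struct.unpack_from(">h", data, pos)
            pos += 2
        shared += diff
        end = data.index(b"\0", pos)            # remainder is null-terminated
        prev = prev[:shared] + data[pos:end]
        pos = end + 1
        yield prev


def read_database(data: bytes):
    """Check the dummy entry, then return the real entries."""
    entries = read_entries(data)
    if next(entries, None) != b"LOCATE02":
        raise ValueError("not a LOCATE02 database")
    return list(entries)


sample = (b"\x00LOCATE02\x00"
          b"\x00/usr/bin/awk\x00"
          b"\x09" b"bash\x00"
          b"\xfc" b"lib\x00"
          b"\xfc" b"var\x00")
files = read_database(sample)
```

Here `files` holds the four expanded names; the dummy entry is consumed by
the check and never returned.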
1
1 Databases cannot be concatenated, even if the first (dummy) entry is
1 trimmed from all but the first database. This is because the
1 offset-differential count in the first entry of the second and following
1 databases will be wrong: it was computed relative to that database's
1 own dummy entry, not to the last entry of the database before it.
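
The effect can be reproduced with a toy model that works on
(count, remainder) pairs instead of raw bytes and leaves out the dummy
entry. The helper names `compress` and `expand` are made up for this
sketch.

```python
def compress(names):
    """Yield (differential_count, remainder) pairs for an ordered list."""
    prev, prev_shared = "", 0
    for name in names:
        shared = 0
        while (shared < min(len(prev), len(name))
               and prev[shared] == name[shared]):
            shared += 1
        yield shared - prev_shared, name[shared:]
        prev, prev_shared = name, shared


def expand(pairs):
    """Rebuild the names from (differential_count, remainder) pairs."""
    prev, shared = "", 0
    for diff, rest in pairs:
        shared += diff
        prev = prev[:shared] + rest
        yield prev


first = ["/usr/bin/awk", "/usr/bin/bash"]
second = ["/var/log", "/var/tmp"]

# Naive concatenation of two independently compressed lists.
joined = list(compress(first)) + list(compress(second))
result = list(expand(joined))
```

The first entry of `second` carries a count of 0, but the running prefix
length at the end of `first` is 9, so the reader keeps "/usr/bin/" and
yields "/usr/bin//var/log" and "/usr/bin//var/tmp" for the last two
names. For the join to work in this model, that first count would have to
be -9.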
1
1 In the output of 'locate --statistics', the new database format is
1 referred to as 'LOCATE02'.
1