nettle: Randomness
1
1 6.8 Randomness
1 ==============
1
1 A crucial ingredient in many cryptographic contexts is randomness: Let
1 ‘p’ be a random prime, choose a random initialization vector ‘iv’, a
1 random key ‘k’ and a random exponent ‘e’, etc. In the theories, it is
1 assumed that you have plenty of randomness around. If this assumption
1 is not true in practice, systems that are otherwise perfectly secure,
1 can be broken. Randomness has often turned out to be the weakest link
1 in the chain.
1
1 In non-cryptographic applications, such as games as well as
1 scientific simulation, a good randomness generator usually means a
1 generator that has good statistical properties, and is seeded by some
1 simple function of things like the current time, process id, and host
1 name.
1
1 However, such a generator is inadequate for cryptography, for at
1 least two reasons:
1
1 • It’s too easy for an attacker to guess the initial seed. Even if
1 it will take some 2^32 tries before he guesses right, that’s far
1 too easy. For example, if the process id is 16 bits, the
1 resolution of “current time” is one second, and the attacker knows
1 what day the generator was seeded, there are only about 2^32
1 possibilities to try if all possible values for the process id and
1 time-of-day are tried.
1
1 • The generator output reveals too much. By observing only a small
1 segment of the generator’s output, its internal state can be
1 recovered, and from there, all previous output and all future
1 output can be computed by the attacker.
1
1 A randomness generator that is used for cryptographic purposes must
1 have better properties. Let’s first look at the seeding, as the issues
1 here are mostly independent of the rest of the generator. The initial
1 state of the generator (its seed) must be unguessable by the attacker.
1 So what’s unguessable? It depends on what the attacker already knows.
1 The concept used in information theory to reason about such things is
1 called “entropy”, or “conditional entropy” (not to be confused with the
1 thermodynamic concept with the same name). A reasonable requirement is
1 that the seed contains a conditional entropy of at least some 80-100
1 bits. This property can be explained as follows: Allow the attacker to
1 ask ‘n’ yes-no-questions, of his own choice, about the seed. If the
1 attacker, using this question-and-answer session, as well as any other
1 information he knows about the seeding process, still can’t guess the
1 seed correctly, then the conditional entropy is more than ‘n’ bits.
1
1 Let’s look at an example. Say information about timing of received
1 network packets is used in the seeding process. If there is some random
1 network traffic going on, this will contribute some bits of entropy or
1 “unguessability” to the seed. However, if the attacker can listen in to
1 the local network, or if all but a small number of the packets were
1 transmitted by machines that the attacker can monitor, this additional
1 information makes the seed easier for the attacker to figure out. Even
1 if the information is exactly the same, the conditional entropy, or
1 unguessability, is smaller for an attacker that knows some of it already
1 before the hypothetical question-and-answer session.
1
1 Seeding of good generators is usually based on several sources. The
1 key point here is that the amount of unguessability that each source
1 contributes, depends on who the attacker is. Some sources that have
1 been used are:
1
1 High resolution timing of i/o activities
1 Such as completed blocks from spinning hard disks, network packets,
1 etc. Getting access to such information is quite system dependent,
1 and not all systems include suitable hardware. If available, it’s
1 one of the better randomness source one can find in a digital,
1 mostly predictable, computer.
1
1 User activity
1 Timing and contents of user interaction events is another popular
1 source that is available for interactive programs (even if I
1 suspect that it is sometimes used in order to make the user feel
1 good, not because the quality of the input is needed or used
1 properly). Obviously, not available when a machine is unattended.
1 Also beware of networks: User interaction that happens across a
1 long serial cable, TELNET session, or even SSH session may be
1 visible to an attacker, in full or partially.
1
1 Audio input
1 Any room, or even a microphone input that’s left unconnected, is a
1 source of some random background noise, which can be fed into the
1 seeding process.
1
1 Specialized hardware
1 Hardware devices with the sole purpose of generating random data
1 have been designed. They range from radioactive samples with an
1 attached Geiger counter, to amplification of the inherent noise in
1 electronic components such as diodes and resistors, to
1 low-frequency sampling of chaotic systems. Hashing successive
1 images of a Lava lamp is a spectacular example of the latter type.
1
1 Secret information
1 Secret information, such as user passwords or keys, or private
1 files stored on disk, can provide some unguessability. A problem
1 is that if the information is revealed at a later time, the
1 unguessability vanishes. Another problem is that this kind of
1 information tends to be fairly constant, so if you rely on it and
1 seed your generator regularly, you risk constructing almost similar
1 seeds or even constructing the same seed more than once.
1
1 For all practical sources, it’s difficult but important to provide a
1 reliable lower bound on the amount of unguessability that it provides.
1 Two important points are to make sure that the attacker can’t observe
1 your sources (so if you like the Lava lamp idea, remember that you have
1 to get your own lamp, and not put it by a window or anywhere else where
1 strangers can see it), and that hardware failures are detected. What if
1 the bulb in the Lava lamp, which you keep locked into a cupboard
1 following the above advice, breaks after a few months?
1
1 So let’s assume that we have been able to find an unguessable seed,
1 which contains at least 80 bits of conditional entropy, relative to all
1 attackers that we care about (typically, we must at the very least
1 assume that no attacker has root privileges on our machine).
1
1 How do we generate output from this seed, and how much can we get?
1 Some generators (notably the Linux ‘/dev/random’ generator) tries to
1 estimate available entropy and restrict the amount of output. The goal
1 is that if you read 128 bits from ‘/dev/random’, you should get 128
1 “truly random” bits. This is a property that is useful in some
1 specialized circumstances, for instance when generating key material for
1 a one time pad, or when working with unconditional blinding, but in most
1 cases, it doesn’t matter much. For most application, there’s no limit
1 on the amount of useful “random” data that we can generate from a small
1 seed; what matters is that the seed is unguessable and that the
1 generator has good cryptographic properties.
1
1 At the heart of all generators lies its internal state. Future
1 output is determined by the internal state alone. Let’s call it the
1 generator’s key. The key is initialized from the unguessable seed.
1 Important properties of a generator are:
1
1 “Key-hiding”
1 An attacker observing the output should not be able to recover the
1 generator’s key.
1
1 “Independence of outputs”
1 Observing some of the output should not help the attacker to guess
1 previous or future output.
1
1 “Forward secrecy”
1 Even if an attacker compromises the generator’s key, he should not
1 be able to guess the generator output _before_ the key compromise.
1
1 “Recovery from key compromise”
1 If an attacker compromises the generator’s key, he can compute
1 _all_ future output. This is inevitable if the generator is seeded
1 only once, at startup. However, the generator can provide a
1 reseeding mechanism, to achieve recovery from key compromise. More
1 precisely: If the attacker compromises the key at a particular time
1 ‘t_1’, there is another later time ‘t_2’, such that if the attacker
1 observes all output generated between ‘t_1’ and ‘t_2’, he still
1 can’t guess what output is generated after ‘t_2’.
1
1 Nettle includes one randomness generator that is believed to have all
1 the above properties, and two simpler ones.
1
1 ARCFOUR, like any stream cipher, can be used as a randomness
1 generator. Its output should be of reasonable quality, if the seed is
1 hashed properly before it is used with ‘arcfour_set_key’. There’s no
1 single natural way to reseed it, but if you need reseeding, you should
1 be using Yarrow instead.
1
1 The “lagged Fibonacci” generator in ‘<nettle/knuth-lfib.h>’ is a fast
1 generator with good statistical properties, but is *not* for
1 cryptographic use, and therefore not documented here. It is included
1 mostly because the Nettle test suite needs to generate some test data
1 from a small seed.
1
1 The recommended generator to use is Yarrow, described below.
1
1 6.8.1 Yarrow
1 ------------
1
1 Yarrow is a family of pseudo-randomness generators, designed for
1 cryptographic use, by John Kelsey, Bruce Schneier and Niels Ferguson.
1 Yarrow-160 is described in a paper at
1 <http://www.counterpane.com/yarrow.html>, and it uses SHA1 and
1 triple-DES, and has a 160-bit internal state. Nettle implements
1 Yarrow-256, which is similar, but uses SHA256 and AES to get an internal
1 state of 256 bits.
1
1 Yarrow was an almost finished project, the paper mentioned above is
1 the closest thing to a specification for it, but some smaller details
1 are left out. There is no official reference implementation or test
1 cases. This section includes an overview of Yarrow, but for the details
1 of Yarrow-256, as implemented by Nettle, you have to consult the source
1 code. Maybe a complete specification can be written later.
1
1 Yarrow can use many sources (at least two are needed for proper
1 reseeding), and two randomness “pools”, referred to as the “slow pool”
1 and the “fast pool”. Input from the sources is fed alternatingly into
1 the two pools. When one of the sources has contributed 100 bits of
1 entropy to the fast pool, a “fast reseed” happens and the fast pool is
1 mixed into the internal state. When at least two of the sources have
1 contributed at least 160 bits each to the slow pool, a “slow reseed”
1 takes place. The contents of both pools are mixed into the internal
1 state. These procedures should ensure that the generator will
1 eventually recover after a key compromise.
1
1 The output is generated by using AES to encrypt a counter, using the
1 generator’s current key. After each request for output, another 256
1 bits are generated which replace the key. This ensures forward secrecy.
1
1 Yarrow can also use a “seed file” to save state across restarts.
1 Yarrow is seeded by either feeding it the contents of the previous seed
1 file, or feeding it input from its sources until a slow reseed happens.
1
1 Nettle defines Yarrow-256 in ‘<nettle/yarrow.h>’.
1
1 -- Context struct: struct yarrow256_ctx
1
1 -- Context struct: struct yarrow_source
1 Information about a single source.
1
1 -- Constant: YARROW256_SEED_FILE_SIZE
1 Recommended size of the Yarrow-256 seed file.
1
1 -- Function: void yarrow256_init (struct yarrow256_ctx *CTX, unsigned
1 NSOURCES, struct yarrow_source *SOURCES)
1 Initializes the yarrow context, and its NSOURCES sources. It’s
1 possible to call it with NSOURCES=0 and SOURCES=NULL, if you don’t
1 need the update features.
1
1 -- Function: void yarrow256_seed (struct yarrow256_ctx *CTX, size_t
1 LENGTH, uint8_t *SEED_FILE)
1 Seeds Yarrow-256 from a previous seed file. LENGTH should be at
1 least ‘YARROW256_SEED_FILE_SIZE’, but it can be larger.
1
1 The generator will trust you that the SEED_FILE data really is
1 unguessable. After calling this function, you _must_ overwrite the
1 old seed file with newly generated data from ‘yarrow256_random’.
1 If it’s possible for several processes to read the seed file at
1 about the same time, access must be coordinated using some locking
1 mechanism.
1
1 -- Function: int yarrow256_update (struct yarrow256_ctx *CTX, unsigned
1 SOURCE, unsigned ENTROPY, size_t LENGTH, const uint8_t *DATA)
1 Updates the generator with data from source SOURCE (an index that
1 must be smaller than the number of sources). ENTROPY is your
1 estimated lower bound for the entropy in the data, measured in
1 bits. Calling update with zero ENTROPY is always safe, no matter
1 if the data is random or not.
1
1 Returns 1 if a reseed happened, in which case an application using
1 a seed file may want to generate new seed data with
1 ‘yarrow256_random’ and overwrite the seed file. Otherwise, the
1 function returns 0.
1
1 -- Function: void yarrow256_random (struct yarrow256_ctx *CTX, size_t
1 LENGTH, uint8_t *DST)
1 Generates LENGTH octets of output. The generator must be seeded
1 before you call this function.
1
1 If you don’t need forward secrecy, e.g. if you need non-secret
1 randomness for initialization vectors or padding, you can gain some
1 efficiency by buffering, calling this function for reasonably large
1 blocks of data, say 100-1000 octets at a time.
1
1 -- Function: int yarrow256_is_seeded (struct yarrow256_ctx *CTX)
1 Returns 1 if the generator is seeded and ready to generate output,
1 otherwise 0.
1
1 -- Function: unsigned yarrow256_needed_sources (struct yarrow256_ctx
1 *CTX)
1 Returns the number of sources that must reach the threshold before
1 a slow reseed will happen. Useful primarily when the generator is
1 unseeded.
1
1 -- Function: void yarrow256_fast_reseed (struct yarrow256_ctx *CTX)
1 -- Function: void yarrow256_slow_reseed (struct yarrow256_ctx *CTX)
1 Causes a fast or slow reseed to take place immediately, regardless
1 of the current entropy estimates of the two pools. Use with care.
1
1 Nettle includes an entropy estimator for one kind of input source:
1 User keyboard input.
1
1 -- Context struct: struct yarrow_key_event_ctx
1 Information about recent key events.
1
1 -- Function: void yarrow_key_event_init (struct yarrow_key_event_ctx
1 *CTX)
1 Initializes the context.
1
1 -- Function: unsigned yarrow_key_event_estimate (struct
1 yarrow_key_event_ctx *CTX, unsigned KEY, unsigned TIME)
1 KEY is the id of the key (ASCII value, hardware key code, X keysym,
1 ..., it doesn’t matter), and TIME is the timestamp of the event.
1 The time must be given in units matching the resolution by which
1 you read the clock. If you read the clock with microsecond
1 precision, TIME should be provided in units of microseconds. But
1 if you use ‘gettimeofday’ on a typical Unix system where the clock
1 ticks 10 or so microseconds at a time, TIME should be given in
1 units of 10 microseconds.
1
1 Returns an entropy estimate, in bits, suitable for calling
1 ‘yarrow256_update’. Usually, 0, 1 or 2 bits.
1