nettle: Randomness

1 
1 6.8 Randomness
1 ==============
1 
1 A crucial ingredient in many cryptographic contexts is randomness: Let
1 ‘p’ be a random prime, choose a random initialization vector ‘iv’, a
1 random key ‘k’ and a random exponent ‘e’, etc.  In the theories, it is
1 assumed that you have plenty of randomness around.  If this assumption
1 is not true in practice, systems that are otherwise perfectly secure,
1 can be broken.  Randomness has often turned out to be the weakest link
1 in the chain.
1 
1    In non-cryptographic applications, such as games as well as
1 scientific simulation, a good randomness generator usually means a
1 generator that has good statistical properties, and is seeded by some
1 simple function of things like the current time, process id, and host
1 name.
1 
1    However, such a generator is inadequate for cryptography, for at
1 least two reasons:
1 
1    • It’s too easy for an attacker to guess the initial seed.  Even if
1      it will take some 2^32 tries before he guesses right, that’s far
1      too easy.  For example, if the process id is 16 bits, the
1      resolution of “current time” is one second, and the attacker knows
1      what day the generator was seeded, there are only about 2^32
1      possibilities to try if all possible values for the process id and
1      time-of-day are tried.
1 
1    • The generator output reveals too much.  By observing only a small
1      segment of the generator’s output, its internal state can be
1      recovered, and from there, all previous output and all future
1      output can be computed by the attacker.
1 
1    A randomness generator that is used for cryptographic purposes must
1 have better properties.  Let’s first look at the seeding, as the issues
1 here are mostly independent of the rest of the generator.  The initial
1 state of the generator (its seed) must be unguessable by the attacker.
1 So what’s unguessable?  It depends on what the attacker already knows.
1 The concept used in information theory to reason about such things is
1 called “entropy”, or “conditional entropy” (not to be confused with the
1 thermodynamic concept with the same name).  A reasonable requirement is
1 that the seed contains a conditional entropy of at least some 80-100
1 bits.  This property can be explained as follows: Allow the attacker to
1 ask ‘n’ yes-no-questions, of his own choice, about the seed.  If the
1 attacker, using this question-and-answer session, as well as any other
1 information he knows about the seeding process, still can’t guess the
1 seed correctly, then the conditional entropy is more than ‘n’ bits.
1 
1    Let’s look at an example.  Say information about timing of received
1 network packets is used in the seeding process.  If there is some random
1 network traffic going on, this will contribute some bits of entropy or
1 “unguessability” to the seed.  However, if the attacker can listen in to
1 the local network, or if all but a small number of the packets were
1 transmitted by machines that the attacker can monitor, this additional
1 information makes the seed easier for the attacker to figure out.  Even
1 if the information is exactly the same, the conditional entropy, or
1 unguessability, is smaller for an attacker that knows some of it already
1 before the hypothetical question-and-answer session.
1 
1    Seeding of good generators is usually based on several sources.  The
1 key point here is that the amount of unguessability that each source
1 contributes, depends on who the attacker is.  Some sources that have
1 been used are:
1 
1 High resolution timing of i/o activities
1      Such as completed blocks from spinning hard disks, network packets,
1      etc.  Getting access to such information is quite system dependent,
1      and not all systems include suitable hardware.  If available, it’s
1      one of the better randomness source one can find in a digital,
1      mostly predictable, computer.
1 
1 User activity
1      Timing and contents of user interaction events is another popular
1      source that is available for interactive programs (even if I
1      suspect that it is sometimes used in order to make the user feel
1      good, not because the quality of the input is needed or used
1      properly).  Obviously, not available when a machine is unattended.
1      Also beware of networks: User interaction that happens across a
1      long serial cable, TELNET session, or even SSH session may be
1      visible to an attacker, in full or partially.
1 
1 Audio input
1      Any room, or even a microphone input that’s left unconnected, is a
1      source of some random background noise, which can be fed into the
1      seeding process.
1 
1 Specialized hardware
1      Hardware devices with the sole purpose of generating random data
1      have been designed.  They range from radioactive samples with an
1      attached Geiger counter, to amplification of the inherent noise in
1      electronic components such as diodes and resistors, to
1      low-frequency sampling of chaotic systems.  Hashing successive
1      images of a Lava lamp is a spectacular example of the latter type.
1 
1 Secret information
1      Secret information, such as user passwords or keys, or private
1      files stored on disk, can provide some unguessability.  A problem
1      is that if the information is revealed at a later time, the
1      unguessability vanishes.  Another problem is that this kind of
1      information tends to be fairly constant, so if you rely on it and
1      seed your generator regularly, you risk constructing almost similar
1      seeds or even constructing the same seed more than once.
1 
1    For all practical sources, it’s difficult but important to provide a
1 reliable lower bound on the amount of unguessability that it provides.
1 Two important points are to make sure that the attacker can’t observe
1 your sources (so if you like the Lava lamp idea, remember that you have
1 to get your own lamp, and not put it by a window or anywhere else where
1 strangers can see it), and that hardware failures are detected.  What if
1 the bulb in the Lava lamp, which you keep locked into a cupboard
1 following the above advice, breaks after a few months?
1 
1    So let’s assume that we have been able to find an unguessable seed,
1 which contains at least 80 bits of conditional entropy, relative to all
1 attackers that we care about (typically, we must at the very least
1 assume that no attacker has root privileges on our machine).
1 
1    How do we generate output from this seed, and how much can we get?
1 Some generators (notably the Linux ‘/dev/random’ generator) tries to
1 estimate available entropy and restrict the amount of output.  The goal
1 is that if you read 128 bits from ‘/dev/random’, you should get 128
1 “truly random” bits.  This is a property that is useful in some
1 specialized circumstances, for instance when generating key material for
1 a one time pad, or when working with unconditional blinding, but in most
1 cases, it doesn’t matter much.  For most application, there’s no limit
1 on the amount of useful “random” data that we can generate from a small
1 seed; what matters is that the seed is unguessable and that the
1 generator has good cryptographic properties.
1 
1    At the heart of all generators lies its internal state.  Future
1 output is determined by the internal state alone.  Let’s call it the
1 generator’s key.  The key is initialized from the unguessable seed.
1 Important properties of a generator are:
1 
1 “Key-hiding”
1      An attacker observing the output should not be able to recover the
1      generator’s key.
1 
1 “Independence of outputs”
1      Observing some of the output should not help the attacker to guess
1      previous or future output.
1 
1 “Forward secrecy”
1      Even if an attacker compromises the generator’s key, he should not
1      be able to guess the generator output _before_ the key compromise.
1 
1 “Recovery from key compromise”
1      If an attacker compromises the generator’s key, he can compute
1      _all_ future output.  This is inevitable if the generator is seeded
1      only once, at startup.  However, the generator can provide a
1      reseeding mechanism, to achieve recovery from key compromise.  More
1      precisely: If the attacker compromises the key at a particular time
1      ‘t_1’, there is another later time ‘t_2’, such that if the attacker
1      observes all output generated between ‘t_1’ and ‘t_2’, he still
1      can’t guess what output is generated after ‘t_2’.
1 
1    Nettle includes one randomness generator that is believed to have all
1 the above properties, and two simpler ones.
1 
1    ARCFOUR, like any stream cipher, can be used as a randomness
1 generator.  Its output should be of reasonable quality, if the seed is
1 hashed properly before it is used with ‘arcfour_set_key’.  There’s no
1 single natural way to reseed it, but if you need reseeding, you should
1 be using Yarrow instead.
1 
1    The “lagged Fibonacci” generator in ‘<nettle/knuth-lfib.h>’ is a fast
1 generator with good statistical properties, but is *not* for
1 cryptographic use, and therefore not documented here.  It is included
1 mostly because the Nettle test suite needs to generate some test data
1 from a small seed.
1 
1    The recommended generator to use is Yarrow, described below.
1 
1 6.8.1 Yarrow
1 ------------
1 
1 Yarrow is a family of pseudo-randomness generators, designed for
1 cryptographic use, by John Kelsey, Bruce Schneier and Niels Ferguson.
1 Yarrow-160 is described in a paper at
1 <http://www.counterpane.com/yarrow.html>, and it uses SHA1 and
1 triple-DES, and has a 160-bit internal state.  Nettle implements
1 Yarrow-256, which is similar, but uses SHA256 and AES to get an internal
1 state of 256 bits.
1 
1    Yarrow was an almost finished project, the paper mentioned above is
1 the closest thing to a specification for it, but some smaller details
1 are left out.  There is no official reference implementation or test
1 cases.  This section includes an overview of Yarrow, but for the details
1 of Yarrow-256, as implemented by Nettle, you have to consult the source
1 code.  Maybe a complete specification can be written later.
1 
1    Yarrow can use many sources (at least two are needed for proper
1 reseeding), and two randomness “pools”, referred to as the “slow pool”
1 and the “fast pool”.  Input from the sources is fed alternatingly into
1 the two pools.  When one of the sources has contributed 100 bits of
1 entropy to the fast pool, a “fast reseed” happens and the fast pool is
1 mixed into the internal state.  When at least two of the sources have
1 contributed at least 160 bits each to the slow pool, a “slow reseed”
1 takes place.  The contents of both pools are mixed into the internal
1 state.  These procedures should ensure that the generator will
1 eventually recover after a key compromise.
1 
1    The output is generated by using AES to encrypt a counter, using the
1 generator’s current key.  After each request for output, another 256
1 bits are generated which replace the key.  This ensures forward secrecy.
1 
1    Yarrow can also use a “seed file” to save state across restarts.
1 Yarrow is seeded by either feeding it the contents of the previous seed
1 file, or feeding it input from its sources until a slow reseed happens.
1 
1    Nettle defines Yarrow-256 in ‘<nettle/yarrow.h>’.
1 
1  -- Context struct: struct yarrow256_ctx
1 
1  -- Context struct: struct yarrow_source
1      Information about a single source.
1 
1  -- Constant: YARROW256_SEED_FILE_SIZE
1      Recommended size of the Yarrow-256 seed file.
1 
1  -- Function: void yarrow256_init (struct yarrow256_ctx *CTX, unsigned
1           NSOURCES, struct yarrow_source *SOURCES)
1      Initializes the yarrow context, and its NSOURCES sources.  It’s
1      possible to call it with NSOURCES=0 and SOURCES=NULL, if you don’t
1      need the update features.
1 
1  -- Function: void yarrow256_seed (struct yarrow256_ctx *CTX, size_t
1           LENGTH, uint8_t *SEED_FILE)
1      Seeds Yarrow-256 from a previous seed file.  LENGTH should be at
1      least ‘YARROW256_SEED_FILE_SIZE’, but it can be larger.
1 
1      The generator will trust you that the SEED_FILE data really is
1      unguessable.  After calling this function, you _must_ overwrite the
1      old seed file with newly generated data from ‘yarrow256_random’.
1      If it’s possible for several processes to read the seed file at
1      about the same time, access must be coordinated using some locking
1      mechanism.
1 
1  -- Function: int yarrow256_update (struct yarrow256_ctx *CTX, unsigned
1           SOURCE, unsigned ENTROPY, size_t LENGTH, const uint8_t *DATA)
1      Updates the generator with data from source SOURCE (an index that
1      must be smaller than the number of sources).  ENTROPY is your
1      estimated lower bound for the entropy in the data, measured in
1      bits.  Calling update with zero ENTROPY is always safe, no matter
1      if the data is random or not.
1 
1      Returns 1 if a reseed happened, in which case an application using
1      a seed file may want to generate new seed data with
1      ‘yarrow256_random’ and overwrite the seed file.  Otherwise, the
1      function returns 0.
1 
1  -- Function: void yarrow256_random (struct yarrow256_ctx *CTX, size_t
1           LENGTH, uint8_t *DST)
1      Generates LENGTH octets of output.  The generator must be seeded
1      before you call this function.
1 
1      If you don’t need forward secrecy, e.g.  if you need non-secret
1      randomness for initialization vectors or padding, you can gain some
1      efficiency by buffering, calling this function for reasonably large
1      blocks of data, say 100-1000 octets at a time.
1 
1  -- Function: int yarrow256_is_seeded (struct yarrow256_ctx *CTX)
1      Returns 1 if the generator is seeded and ready to generate output,
1      otherwise 0.
1 
1  -- Function: unsigned yarrow256_needed_sources (struct yarrow256_ctx
1           *CTX)
1      Returns the number of sources that must reach the threshold before
1      a slow reseed will happen.  Useful primarily when the generator is
1      unseeded.
1 
1  -- Function: void yarrow256_fast_reseed (struct yarrow256_ctx *CTX)
1  -- Function: void yarrow256_slow_reseed (struct yarrow256_ctx *CTX)
1      Causes a fast or slow reseed to take place immediately, regardless
1      of the current entropy estimates of the two pools.  Use with care.
1 
1    Nettle includes an entropy estimator for one kind of input source:
1 User keyboard input.
1 
1  -- Context struct: struct yarrow_key_event_ctx
1      Information about recent key events.
1 
1  -- Function: void yarrow_key_event_init (struct yarrow_key_event_ctx
1           *CTX)
1      Initializes the context.
1 
1  -- Function: unsigned yarrow_key_event_estimate (struct
1           yarrow_key_event_ctx *CTX, unsigned KEY, unsigned TIME)
1      KEY is the id of the key (ASCII value, hardware key code, X keysym,
1      ..., it doesn’t matter), and TIME is the timestamp of the event.
1      The time must be given in units matching the resolution by which
1      you read the clock.  If you read the clock with microsecond
1      precision, TIME should be provided in units of microseconds.  But
1      if you use ‘gettimeofday’ on a typical Unix system where the clock
1      ticks 10 or so microseconds at a time, TIME should be given in
1      units of 10 microseconds.
1 
1      Returns an entropy estimate, in bits, suitable for calling
1      ‘yarrow256_update’.  Usually, 0, 1 or 2 bits.
1