Functional Principles and Design Decisions for PRNGD.
=====================================================

PRNGD has been designed to act as a /dev/urandom replacement. It features
an EGD compatible socket interface, so that it can be used instead of EGD,
which is a /dev/random replacement.

In the following I want to explain the design properties of PRNGD, leading
to its strong and weak points.

- PRNGD shall always return random bytes:
  * EGD collects entropy into a pool by calling programs and reading their
    output.
  * Other processes read random bytes from EGD, emptying the pool, and EGD
    refills it by calling more programs. If the random bytes are read
    faster than EGD can refill, EGD will not return random bytes until the
    pool is refilled. This makes EGD unusable if you have a large number
    of processes requiring entropy (e.g. inetd started processes like
    imap/pop daemons).
  * PRNGD uses a _pseudo_ random number generator to generate the random
    bytes. Thus it can never run dry and will always return random bytes.
    On the other hand, the random bytes generated are not truly random
    (actually, those generated by EGD are not truly random either), and
    there is a risk that by drawing lots of random bytes from the daemon
    an attacker might guess the contents of the random pool and break
    your keys.
  * This potential risk cannot be avoided; it is present by design, as
    only a _pseudo_ random number generator can avoid the problem of
    running out of entropy. /dev/urandom faces the same problem. Only a
    hardware RNG using thermal noise or radioactive decay can generate
    truly random bytes. See below on how PRNGD tries to minimize this
    risk.

- PRNGD should be low on resource usage:
  * EGD is written in Perl and hence allows easy porting, but it forces a
    Perl interpreter to be running. In my experience this eats up
    resources.
  * PRNGD is written in C. Most activities are performed using system or
    library calls, and PRNGD tries to avoid spawning external processes.
    It will never spawn more than one process at a time. On a 1996 HP-UX
    box, PRNGD tends to use around 10-20 minutes of CPU time per month,
    depending on the amount of entropy requested (and hence the number of
    operations to be performed inside the PRNG). The memory footprint is
    around 100K.

- PRNGD should be robust against system malfunctions:
  * EGD sometimes tends to run out of entropy and does not refill. I don't
    speak Perl, so I am not completely sure, but it seems that EGD records
    the "failure" of started gathering processes and does not use them any
    longer. If the system runs out of memory or out of processes, no
    gatherers can be started and all gatherers are disabled. (This is just
    my theory; I don't speak Perl, as stated above, and don't have a clue
    how to debug EGD.)
  * PRNGD will always use the same gatherer processes, regardless of
    whether they fail at any time or not. This way a transient resource
    shortage is simply ignored and PRNGD will continue to work.
  * In case any gatherer fails, PRNGD will "kill -9" it after some time so
    as not to leave any processes hanging around.

- PRNGD should provide good random bytes:
  * There is an excellent paper by Peter Gutmann:
      http://www.cryptoengines.com/~peter/06_random.pdf
    Read it!
  * On startup, PRNGD tries to seed its internal pool as well as possible
    by reading back its saved entropy state and calling all gatherers.
    (The entropy state is saved at shutdown time by retrieving random
    bytes from the PRNG, so that it does not reveal information about the
    internal state bits. It is fed back as if coming from an untrusted
    source, like any other input.)
    You can also run without a seed file; PRNGD will then call all
    available gatherers until it has enough entropy available. On my 1996
    HP-UX box this only takes one or two seconds.
    Reading back the seed would be safe even if its contents were known,
    because it is only used to initialize the PRNG, and entropy gathering
    is started immediately afterwards.
    (So: why throw away the old seed? It doesn't hurt to read it back.)
  * Whenever entropy is requested, PRNGD will completely mix the pool,
    retrieve the random bits, then remix, thus yielding 2 properties:
    + All bits retrieved depend on _all_ bits in the pool!
    + When the pool is accessed by any means (poking in memory etc), it
      has already been remixed, so that one cannot get information about
      the state of the pool at the time entropy was retrieved.
    + An entropy count is maintained that is increased when entropy is
      added and decreased when random bytes are retrieved. As soon as the
      entropy count drops below a given threshold (that defaults to 8192
      bits), external gatherer processes are called continuously to add
      new entropy. This is what EGD does, too. I am not sure how good the
      entropy obtained this way really is (ever run
      "tail /var/adm/syslog/syslog.log" 10 times in a row?), but as it is
      always mixed into the large pool it is better than nothing...
  * PRNGD uses the following seeding:
    + Quite often (by default around every 17 seconds), a seed_stat() is
      performed by stat()ing a file or directory like /etc/passwd, /tmp,
      ... which is changed or accessed very frequently. This will only
      give a very small number of bits every time, but every bit helps :-)
    + Less frequently (by default around every 49 seconds), an external
      gathering process is spawned (similar to what EGD does, but in the
      case of PRNGD the frequency is not related to the retrieval of
      random bytes). The output of the process is mixed into the pool.
    + The exact schedule is not fixed, but it depends on the intervals
      given above (default 17 and 49 seconds). When PRNGD is idle, a
      seed_stat() is performed after the shorter interval (here 17
      seconds). The external gatherer is started if more than 49 seconds
      have passed since the last gatherer was started. Since 49 is not a
      multiple of 17, the external gatherer is not spawned exactly every
      49 seconds, but with some uncertainty.
      To further increase this uncertainty, this decision is made after
      the select() call returns, which will also be triggered by external
      processes communicating with PRNGD...
    + Whenever the call to a gatherer process is finished, additional bits
      are mixed into the pool by "internal seeding". Internal seeding uses
      cheap system calls to times(), gettimeofday(), getpid() and
      getrusage() where available. Each of these calls will not provide
      much entropy (only some microsecond values are uncertain with
      respect to granularity etc on an ntp-synchronized host, the resource
      usage will be quite static etc), but every bit helps...
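
The interplay of the 17 and 49 second intervals described above can be
illustrated with a small simulation. This is only a sketch under stated
assumptions, not code from the PRNGD sources: the names struct sched,
after_select() and simulate_idle() are hypothetical, time is counted in
whole seconds, and the daemon is assumed to be completely idle, so that
select() only ever returns on its 17 second timeout.

```c
#include <assert.h>
#include <stddef.h>

#define STAT_INTERVAL   17L  /* default seed_stat() interval from the text */
#define GATHER_INTERVAL 49L  /* default gatherer interval from the text */

/* Hypothetical scheduler state; not taken from the PRNGD sources. */
struct sched {
    long last_stat;    /* time of the last seed_stat() */
    long last_gather;  /* time the last gatherer was started */
};

/*
 * Decision made each time select() returns at time `now` (in seconds).
 * Returns 1 if an external gatherer would be started, 0 otherwise.
 */
int after_select(struct sched *s, long now)
{
    assert(s != NULL);
    if (now - s->last_stat >= STAT_INTERVAL)
        s->last_stat = now;        /* seed_stat() would run here */
    if (now - s->last_gather > GATHER_INTERVAL) {
        s->last_gather = now;      /* gatherer would be spawned here */
        return 1;
    }
    return 0;
}

/*
 * Simulate a completely idle daemon: select() only returns on its
 * STAT_INTERVAL timeout. Stores up to `max` gatherer start times in
 * `starts` and returns how many starts happened up to time `until`.
 */
int simulate_idle(long until, long *starts, int max)
{
    struct sched s = { 0, 0 };
    int n = 0;

    for (long now = STAT_INTERVAL; now <= until; now += STAT_INTERVAL) {
        if (after_select(&s, now)) {
            if (n < max)
                starts[n] = now;
            n++;
        }
    }
    return n;
}
```

Under these assumptions, simulate_idle(110, starts, 4) reports gatherer
starts at 51 and 102 seconds: an idle daemon spawns the gatherer every 51
seconds, the first multiple of 17 above 49. As soon as a client wakes up
select() at some other moment, the check runs earlier and the spacing
shifts, which is the uncertainty the text refers to.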