On Wed, 6 Oct 2004, Bruno Wolff III wrote:

I am going to keep this on general for now, since it seems like other people
might be interested even though it is straying somewhat off topic.

Agreed. Anyone who's not really interested in /dev/[u]random issues should jump right to the last paragraphs of this message, which are more on-topic.

On Wed, Oct 06, 2004 at 18:02:39 +0200,
 Marco Colombo <[EMAIL PROTECTED]> wrote:

It depends. What's wrong with a SQL function taking long to complete? It could be a long computation, maybe days long. As far as

Days-long SQL queries are hardly normal. The common things you might want to generate secure random numbers for aren't queries you want to run for a long time. For example, you might want to generate session ids to store in a cookie handed out for web requests.

I'm just pointing out that:

(1)     select read_from_dev_random();

may take a while if the read() blocks, but it's not that different
from:

(2)     select very_complex_aggregate() from huge_table;

which may take a while as well. I think the backend is already able to
handle a blocking read(), since _any_ read() from disk may block; a read()
from /dev/random is nothing special. For clients, there's no difference.
So I don't see why a blocking read from /dev/random should be a problem.

the client is concerned, there's no difference. Of course you'll have
to use it with care, just as you would do with any potentially long
function (that is, don't lock everything and then sit down waiting).

This might be reasonable if there was a significant benefit to doing so. My argument is that /dev/random does not produce significantly more secure random numbers under normal circumstances.

SHA1 is pretty safe for this purpose. The recent weaknesses of related
hashes aren't going to be of much help in predicting the output of
/dev/urandom. If SHA1 were to be broken that badly /dev/random would
also be broken.

No, because /dev/random returns only after the internal state has been perturbed "enough". An observer may (in theory) get to know all bits of the internal pool right _after_ a read() from /dev/random, but still won't be able to guess the value of all of them right before the _next_ read() returns, exactly because the kernel waits until the return is safe in this regard.

OK, I'll buy that for cases where the estimates of entropy acquired are reasonable. This may still be a problem for some uses though. It may allow an attacker to figure out other data returned by /dev/random that they weren't able to observe.

If the kernel is overestimating the entropy count, it's a kernel bug, and should be fixed.

Now, if the attacker gets the output, and breaks SHA1, he'll know the
internal state again. But if the output goes elsewhere, he won't be
able to guess it. That's why /dev/random is 'safe'. It won't return
until the output is random enough.

You don't necessarily need to break SHA1 to be able to track the internal state.

Well, I'm not an expert. I base my knowledge on what other people say. Quoting the opening comment in random.c from a recent Linux kernel tree:

* When random bytes are desired, they are obtained by taking the SHA
* hash of the contents of the "entropy pool". The SHA hash avoids
* exposing the internal state of the entropy pool. It is believed to
* be computationally infeasible to derive any useful information
* about the input of SHA from its output. [...]
* If this estimate goes to zero, the routine can still generate
* random numbers; however, an attacker may (at least in theory) be
* able to infer the future output of the generator from prior
* outputs. This requires successful cryptanalysis of SHA, which is
* not believed to be feasible, but there is a remote possibility.


The latter refers to the /dev/urandom interface, not /dev/random.

[...]
 * The two other interfaces are two character devices /dev/random and
 * /dev/urandom.  /dev/random is suitable for use when very high
 * quality randomness is desired (for example, for key generation or
 * one-time pads), as it will only return a maximum of the number of
 * bits of randomness (as estimated by the random number generator)
 * contained in the entropy pool.
 *
 * The /dev/urandom device does not have this limit, and will return
 * as many bytes as are requested.  As more and more random bytes are
 * requested without giving time for the entropy pool to recharge,
 * this will result in random numbers that are merely cryptographically
 * strong.  For many applications, however, this is acceptable.

Whether "random numbers that are merely cryptographically strong"
are acceptable for your application is up to you. Your application
blocking might be an even less acceptable event, but I don't buy this
kind of argument. Random bits are just another resource.
If the application needs disk space, and the system is out of it,
the application "blocks" or worse. The same for virtual memory.
Should we stop using hard disks or RAM? I guess no. Just as you
make sure your apps have enough disk and RAM, make sure they have
enough random bits, so that /dev/random never blocks (that is, use
a larger pool and feed it from an external source).

Now, my estimate is 0 entropy bits for my pool, since you drained them all.

"Secure Application X" needs to generate a session key (128bits),
so reads them from /dev/urandom. Let's assume the entropy count is still 0.
I (the kernel) provide it with 128bits that come from my PRNG + SHA1 engine.
Now, you can predict those 128 bits easily, and thus you know the session
key. The attack succeeds.

"Really Secure Application Y" needs 128 bits, too. But this time it
reads from /dev/random (assume the entropy count is still 0).
Now, I can't fulfill the request, so I have Y wait on read().

As time passes, I start collecting entropy bits, from IRQ timings,
or a hardware RNG. These bits change the internal pool in a way you
don't know. Eventually I get the count up to 128 bits.
Now, I run my PRNG (from the now-changed state) + SHA1 engine, and return
128 bits to Y. Can you guess the session key? Sure, in theory: you know
what the internal state was before, so you can "go through all possible
entropy values". How many of them? 2^128. That's, unsurprisingly, the same
as trying to guess the key directly. Now you're telling me that if you had
the key, you could learn the internal state again? Man, if you had the key,
you would have already broken application Y, and the attack would have
already succeeded by other means!

Assuming that all 128 bits were grabbed in one call to /dev/random, you should be safe. However, if the bits were returned a byte at a time, with intervening bytes returned from /dev/urandom to the attacker, the key could be vulnerable.

I'm not sure I get the "intervening bytes" part. Guessing a 128-bit string or a 160-bit one of which 32 bits are known shouldn't make any difference.

[...]
No. SHA1 protects the internal pool from an attacker who knows all the
output. That's easy: just read from /dev/random until it blocks. If you're
fast enough, you can assume no one else read from /dev/random. Now,
_if you can break SHA1_, you have enough cleartext output of the internal
PRNG to guess its state. You may have used /dev/urandom as well; there's
no difference.

I think you misunderstood the context. The attacker is assumed to have initially gotten the state through some means. By watching the output from /dev/urandom it is possible to figure out what the current state is if not too much entropy has been added since the attacker knew the state.

The purpose of /dev/random blocking is not protecting the internal pool
state; that's SHA1's job. /dev/random blocks in order to protect other
users (application Y in my example) from you.

SHA1 only protects you from getting the state solely by viewing the output. It doesn't protect against you getting it through other means.

The only other means I can think of (speaking of Linux) are:

- the CAP_SYS_ADMIN RNDGETPOOL ioctl(), to read the pool contents;
- the CAP_SYS_ADMIN RNDADDENTROPY ioctl(), to 'taint' the pool;
- the CAP_SYS_ADMIN RNDADDTOENTCNT ioctl(), to trick the kernel into
  thinking the pool contains more randomness w/o adding any.


No wonder, these methods are CAP_SYS_ADMIN only. If you have the
CAP_SYS_ADMIN capability, you already own the system.

Ok, maybe you can control the physics of your hard disks, and have them
behave in a known (to you) way. But if you can do that, you are Q, and
already own the universe. :)

I think /dev/random output would be better protected if it didn't share
a pool with /dev/urandom. Sharing the pool is what makes tracking the
internal state possible.

Another issue with using /dev/random instead of /dev/urandom is that it
adds another way to do denial of service attacks. That may or may not
be an issue for a particular use, since there may be more effective ways
of mounting a denial of service attack.

Exactly. Randomness is just a resource. Like many other resources, it's limited, and that opens the door to possible DoS attacks. Nothing new.


I think the whole point is: does PostgreSQL have to care about this? Is providing a source of real random data valuable? Or should the client handle the issue?

Well, in a Web based application, some kind of server-side tier is
likely to be already present: no one is so crazy as to have the DB
directly generate HTML pages, right? (pun intended) :)
So it could take care of /dev/random and friends as well, without relying
on the DB for that. That's because it runs on the server.

But in a real client-server application, where all the logic is either
in the DB or in the (remote) client, placing the burden of collecting
really random data in a safe way on the clients may be infeasible.


The same argument holds for the weak random() we have now.
I can't think of any client-side language that doesn't provide
a random() function. Even bash can do some $RANDOM. A purist may
argue that "providing a random anwser to a query" is not the job of
a RDBMS. But a random() function can be useful sometimes.

I know nothing of PostgreSQL internals and development issues.
I'll play with the idea of making a contrib module.

.TM.
--
      ____/  ____/   /
     /      /       /                   Marco Colombo
    ___/  ___  /   /                  Technical Manager
   /          /   /                      ESI s.r.l.
 _____/ _____/  _/                     [EMAIL PROTECTED]
