Re: [PATCH] (0/4) Entropy accounting fixes

Linus Torvalds (torvalds@transmeta.com)
Sun, 18 Aug 2002 09:59:41 -0700 (PDT)


On Sun, 18 Aug 2002, Oliver Xymoron wrote:
>
> The key word is actually conservative, as in conservative estimate.
> Conservative here means less than or equal to.

My argument is that even with a gigahertz logic analyzer on the network
line, you would still see randomness that is worth considering.

I dare you to actually show perfect correlation from it: the interrupt may
be synchronized to the PCI clock, but the code executed thereafter
certainly will not. And even if the machine is 100% idle, and the whole
working set fits in the L1 cache, the DMA generated by the packet itself
will result in cache invalidations.

In other words, in order for you to actually be able to predict the TSC
from the outside, you'd not only have to have the gigahertz logic analyzer
on the network line, you'd also have to be able to correlate the ethernet
heartbeat to the PCI clock (which you probably could do by looking at the
timing of the reply packets from a ping flood, although it would be
"interesting" to say the least and probably depends on how the network card
generates the ethernet clock), _and_ you'd have to be able to do a cache
eviction analysis (which in turn requires knowing the initial memory
layout of the kernel data structures for networking).
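
[ For concreteness, the quantity being argued about is just the cycle
  counter (TSC) sampled at interrupt time. A minimal user-space sketch
  of reading it and keeping the low, jittery bits - a made-up example,
  not the kernel's actual interrupt path:

        #include <stdint.h>
        #include <stdio.h>

        /* Read the 64-bit time stamp counter (EDX:EAX) on x86. */
        static inline uint64_t read_tsc(void)
        {
                uint32_t lo, hi;
                __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
                return ((uint64_t)hi << 32) | lo;
        }

        int main(void)
        {
                /* The low bits are where cache misses, bus contention
                 * and code-path jitter show up between two "identical"
                 * events - exactly the part an outside observer has to
                 * guess. */
                int i;
                for (i = 0; i < 4; i++)
                        printf("sample %d: tsc low bits 0x%03llx\n", i,
                               (unsigned long long)(read_tsc() & 0xfff));
                return 0;
        }
]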

And your argument that there is zero randomness in the TSC _depends_ on
your ability to perfectly estimate what the TSC is. If you cannot do it,
there is obviously at least one bit of randomness there. So I don't think
your "zero" is a good conservative estimate.

At some point being conservative turns into being useless [ insert
obligatory political joke here ].

[ Side note: the most common source of pseudo-random numbers is the old
linear congruential generator, which really is a sampling of a "beat"
between two frequencies that are supposed to be "close", but prime.

That's a fairly simple and accepted pseudo-random generator _despite_
the fact that the two frequencies are totally known, and there is zero
noise inserted. I'll bet you'll see a _very_ hard-to-predict stream from
something like the PCI clock / CPU clock thing, with noise inserted
thanks to things like cache misses and shared bus interactions. Never
mind the _real_ noise of having a work-load. ]
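
[ To make that concrete: the generator referred to above is nothing more
  than the few lines below (the constants are the well-known Numerical
  Recipes pair, and the code is purely illustrative). Every value is
  completely determined by the seed - and it still looks random:

        #include <stdint.h>
        #include <stdio.h>

        /* Classic linear congruential generator:
         *      x' = a*x + c   (mod 2^32)
         * Both constants are public knowledge - zero noise inserted. */
        static uint32_t lcg_state = 1;          /* the seed */

        static uint32_t lcg_next(void)
        {
                lcg_state = lcg_state * 1664525u + 1013904223u;
                return lcg_state;
        }

        int main(void)
        {
                int i;
                for (i = 0; i < 4; i++)
                        printf("%08x\n", lcg_next());
                return 0;
        }
]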

> No, it says /dev/random is primarily useful for generating large
> (>>160 bit) keys.

Which is exactly what something like sshd would want to use for generating
keys for the machine, right? That is _the_ primary reason to use
/dev/random.

Yet apparently our /dev/random has been too conservative to be actually
useful, because (as you point out somewhere else) even sshd uses
/dev/urandom for the host key generation by default.

That is really sad. That is the _one_ application that is common and that
should really have a reason to maybe care about /dev/random vs urandom.
And that application uses urandom. To me that says that /dev/random has
turned out to be less than useful in real life.

Is there anything that actually uses /dev/random at all (except for
clueless programs that really don't need to)?

Please realize that _this_ is my worry: making /dev/random so useless
that any practical program has no choice but to look elsewhere.

> Actually, half of the point here is in fact to make /dev/urandom safer
> too, by allowing mixing of untrusted data that would otherwise
> compromise /dev/random.

Now this I absolutely agree with. The xor'ing of the buffer data is
clearly a good idea. I agree 100% with this part. You'll see no arguments
against this part at all.
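
[ A sketch of the idea being agreed to here - mix the bytes in, credit
  nothing for them. The names (pool[], entropy_count, mix_untrusted())
  are invented for illustration; this is not the actual
  drivers/char/random.c code:

        #include <stddef.h>

        #define POOL_BYTES 512

        static unsigned char pool[POOL_BYTES];
        static unsigned int entropy_count;      /* bits we claim to have */
        static size_t mix_pos;

        /* XOR untrusted data into the pool: it can only add
         * unpredictability, never remove any, so it is safe even if an
         * attacker chose every byte. Deliberately leaves entropy_count
         * alone. */
        static void mix_untrusted(const unsigned char *buf, size_t len)
        {
                size_t i;

                for (i = 0; i < len; i++) {
                        pool[mix_pos] ^= buf[i];
                        mix_pos = (mix_pos + 1) % POOL_BYTES;
                }
        }
]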

> 99.9% of users aren't using network sampling
> currently, after these patches we can turn it on for everyone and
> still sleep well at night. See?

Oh, that's the _good_ part. Yes.

The bad part is that I think our current /dev/random is close to useless
already, and I'd like to reverse that trend.

> That is an interesting point. A counterpoint is that if we account so much
> as 1 bit of entropy per network interrupt on a typical system, the system
> will basically _always_ feel comfortable (see /proc/interrupts). It will
> practically never block and thus it is again identical to /dev/urandom.

But what's the problem with that? The "/dev/random may block" is not the
intrinsic value of /dev/random - if people want to wait they are much
better off just using "sleep(1)" than trying to read from /dev/random.

My argument is that on a typical system there really _is_ so much
randomness that /dev/random is actually a useful thing. I think that you'd
have to _work_ at finding a system where /dev/random should block under
any normal use (where "normal use" also obviously means that only programs
that really need it would use it, ie ssh-keygen etc).

Linus
