Re: Hashing and directories

Pavel Machek (pavel@suse.cz)
Fri, 2 Mar 2001 10:00:36 +0100


Hi!

> > > I was hoping to point out that in real life, most systems that
> > > need to access large numbers of files are already designed to do
> > > some kind of hashing, or at least to divide-and-conquer by using
> > > multi-level directory structures.
> >
> > Yes -- because they're working around kernel slowness.
>
> Pavel, I'm afraid that you are missing the point. Several, actually:
> * limits of _human_ capability to deal with large unstructured
> sets of objects

Okay, then. Let's take an example where I ran into huge directories.

I'm working with timetables (timetab.sourceforge); at one point I
download 100000 html files, each describing one train. I can deal with
that just fine. After all, there is no structure in the data; if I
wanted structure, I'd have to invent some, and I don't want to. I'm
perfectly happy with 100000 files in one directory: I type a train's
number, and it works.

Due to kernel/NFS slowness with such huge directories, I had to split
them into several directories (each with 10000 entries), roughly as
sketched below. Now it gets ugly: I was perfectly able to deal with
100000 entries in one place, but juggling 10 directories is a real
pain.
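
(Purely for illustration, not my actual script: a minimal C sketch of
that kind of split, assuming the files are named by train number and
train N lands in subdirectory N/10000.)

/* Sketch only: map train number N to bucket N/10000.
 * Assumes files are named "<number>.html"; names are made up. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
	unsigned long n;
	char path[64];

	if (argc < 2)
		return 1;
	n = strtoul(argv[1], NULL, 10);		/* train number */
	snprintf(path, sizeof(path), "%02lu", n / 10000);
	mkdir(path, 0755);			/* one bucket per 10000 trains;
						   EEXIST is ignored here */
	snprintf(path, sizeof(path), "%02lu/%lu.html", n / 10000, n);
	printf("%s\n", path);			/* where to store/find the file */
	return 0;
}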

> The point being: applications and users _do_ know better what structure
> is there. Kernel can try to second-guess them and be real good at

And if there is none? Did you take a look at the Netscape cache? It
does hashing purely to work around kernel slowness. Yes, I think it
would be much nicer if it did not have to work around anything.
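
(Again only a sketch, not Netscape's actual code or layout: hash the
key and use a few bits of the hash to pick one of 16 subdirectories.)

/* Sketch of hash-style bucketing, the kind of thing a browser cache
 * does. The hash function and layout are illustrative only. */
#include <stdio.h>

static unsigned int hash(const char *s)
{
	unsigned int h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h;
}

int main(int argc, char **argv)
{
	if (argc < 2)
		return 1;
	/* prints e.g. "0c/<key>" -- the application invents structure
	   the data itself does not have, just to keep directories small */
	printf("%02x/%s\n", hash(argv[1]) & 15, argv[1]);
	return 0;
}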
Pavel

-- 
The best software in life is free (not shareware)!		Pavel
GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+