Re: [rfc] Near-constant time directory index for Ext2

H. Peter Anvin (hpa@transmeta.com)
Wed, 21 Feb 2001 15:48:57 -0800


Daniel Phillips wrote:
>
> Have you looked at the structure and algorithms I'm using? I would not
> call this a hash table, nor is it a btree. It's a 'hash-keyed
> uniform-depth tree'. It never needs to be rehashed (though it might be
> worthwhile compacting it at some point). It also never needs to be
> rebalanced - it's only two levels deep for up to 50 million files.
>

I'm curious how you do that. It seems each level would have to be 64K
large in order to do that, with a minimum disk space consumption of 128K
for a directory. That seems extremely painful *except* in the case of
hysterically large directories, which tend to be the exception even on
filesystems where they occur.

I think I'd rather take the extra complexity and rebalancing cost of a
B-tree.

> This thing deserves a name of its own. I call it an 'htree'. The
> performance should speak for itself - 150 usec/create across 90,000
> files and still a few optmizations to go.
>
> Random access runs at similar speeds too, it's not just taking advantage
> of a long sequence of insertions into the same directory.
>
> BTW, the discussion in this thread has been very interesting, it just
> isn't entirely relevant to my patch :-)

-hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/