In this round there were two new contenders:
- ReiserFS's R5
- Bob Jenkins' hash
Eirik Fuller pointed me to the latter, the subject of a very interesting
article in Dr. Dobbs, available online here:
http://burtleburtle.net/bob/hash/doobs.html
As before, the runs are for 20,000 creates and I report only the
fullness, because I'm still running these under UML. Suffice to say
that the total running time is roughly related to the average fullness
with a variance of about 15% from the best to the worst. Eventually I
will rerun the entire series of tests natively and provide more detailed
statistics. Here are the results from the second heat of hash races:
Contender Result
========= ======
dentry::hash Average fullness = 2352 (57%)
daniel::hack_hash Average fullness = 2758 (67%)
bob::hash Average fullness = 2539
(61%)
reiserfs::r5 Average fullness = 2064 (50%)
Just looking at R5 I knew it wasn't going to do well in this application
because it's similar to a number of hash functions I tried with the same
idea in mind: to place similar names together in the same leaf block.
That turned out to be not very important compared to achieving a
relatively high fullness of leaf blocks. The problem with R5 when used
with my htree is, it doesn't give very uniform dispersal But according
to Chris Mason (see his post) it does work very well for ReiserFS. This
provides a little more evidence that my htree scheme is a quite
different from other approaches.
u32 r5_hash (const char *msg, int len)
{
u32 a=0;
while(*msg) {
a += *msg << 4;
a += *msg >> 4;
a *= 11;
msg++;
}
return a;
}
I expected more from bob::hash since it's very carefully well-thought
out in terms of dispersal and avoidance of 'funnelling' (the property
that determines the probabililty collision), but it still fell short of
hack_hash's performance. Oh well. Tomorrow I'll try CRC32.
The bottom line: dx_hack_hash is still the reigning champion. OK, come
out and take a bow:
unsigned dx_hack_hash (const char *name, int len)
{
u32 hash0 = 0x12a3fe2d, hash1 = 0x37abe8f9;
while (len--)
{
u32 hash = hash1 + (hash0 ^ (*name++ * 71523));
if (hash < 0) hash -= 0x7fffffff;
hash1 = hash0;
hash0 = hash;
}
return hash0;
}
-- Daniel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/