Re: [Bug 417] New: htree much slower than regular ext3

Theodore Ts'o (tytso@mit.edu)
Mon, 10 Mar 2003 16:25:15 -0500


On Sun, Mar 09, 2003 at 10:08:34AM +0300, Alex Tomas wrote:
> 1) As far as I understand, duplicates are possible even in classic ext2
> w/o sortdir/index. See the diagram:
>
> Process 1 Process 2
>
> getdents(2) returns
> dentry1 (file1 -> Inode1)
> dentry2 (file2 -> Inode2)
>
> context switch -->
> unlink(file1), empty dentry1
> creat(file3), Inode3, use dentry1
> creat(file1), Inode1, use dentry3
>
> context switch -->
>
> getdents(2) returns
> dentry3(file1 -> Inode1)
>
>
> Am I right?

Yup. POSIX allows this. If you're in the middle of readdir() when a
file (or files) is created or deleted, then it is undefined how
readdir() will treat the filename(s) involved. However, you're *not*
allowed to duplicate or omit filenames that were not touched by a file
creation or deletion.

> 2) Why do not use hash order for traversal like ext3_dx_readdir() does?
> Upon reading several dentries within some hash set readdir() sorts them
> in inode order and returns to an user.

Yeah, glibc could do this, although it would make telldir/seekdir more
than a little complicated. (There's an awful lot of pain and agony
caused to filesystem developers caused by the existence of
telldir/seekdir, which inherently assumes a flat directory structure,
and so it makes it hard to do tricks like this.)

At some level this is the same solution as "do this in userspace",
except instead of needing to convince the maintainers of du and find,
we'd need to convince Ulrich. It is *not* clear that it would
necessarily be easier to get this fix into the glibc. In fact, since
du and find probably don't use seekdir/telldir, it's probably easier
to get the change into du and find.

- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/