Re: ext3-2.4-0.9.4

Mike Touloumtzis (miket@bluemug.com)
Sun, 29 Jul 2001 16:19:03 -0700


On Sun, Jul 29, 2001 at 11:28:10AM +0200, Matthias Andree wrote:
>
> How can autoconf figure if you need to fsync() the directory? Apart from
> that, which Unix MTA uses autoconf?

My point was not that they should be using autoconf;
I don't know if they are or not. My point was that
they should use existing published interfaces that are
reasonable, rather than push for guarantees that impose
new requirements on filesystems. And even without
autoconf it's not hard to figure out what system you're
running on.

rename(tmpfile, spoolfile);
#ifdef __linux__
fsync(tmpdirfd);   /* fd from open(tmpdir, O_RDONLY) */
fsync(spooldirfd); /* fd from open(spooldir, O_RDONLY) */
#endif
/* transaction is complete */

>
> Remember, the whole discussion is about getting rid of the need for
> chattr +S and offering the admin the chance to mount or flag a directory
> for synchronous meta data updates.

Right; and I'm arguing that the way to get rid of the need
for chattr +S is to incorporate directory fsync() in the
MTAs, not to cram more features into the filesystems.

Problem: MTA needs to know when rename() has been forced
to disk.

Solution 1: MTA authors use fsync(dirfd) on Linux.

Analysis: This is not the most portable solution, but it
should work on any FS that supports Linux semantics. You
can't expect such semantics on FAT and other filesystems
that are just supported for compatibility reasons. But you
could, say, switch filesystems for performance reasons, and
not have your MTA start mysteriously failing, because you
are using the official, documented API to do what you want
to do (at the very least you would be in a much stronger
position when pushing a bug fix :-).
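
To make Solution 1 concrete, here is a minimal sketch of what an
MTA could do. The helper names fsync_dir() and durable_rename() are
made up for illustration and do not come from any existing MTA. It
assumes the directories can be opened read-only and that the file's
own data was already fsync()ed before the rename.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#ifdef __linux__
/* Hypothetical helper: force a directory's entries to disk.
 * Returns 0 on success, -1 on failure. */
static int fsync_dir(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int rc = fsync(fd);
    close(fd);
    return rc;
}
#endif

/* Hypothetical helper: rename tmpfile to spoolfile, then (on Linux)
 * fsync both containing directories so the rename is on disk before
 * the caller treats the transaction as complete. */
static int durable_rename(const char *tmpdir, const char *tmpfile,
                          const char *spooldir, const char *spoolfile)
{
    if (rename(tmpfile, spoolfile) != 0)
        return -1;
#ifdef __linux__
    if (fsync_dir(tmpdir) != 0 || fsync_dir(spooldir) != 0)
        return -1;
#else
    /* Other systems may need a different strategy; not covered here. */
    (void)tmpdir;
    (void)spooldir;
#endif
    return 0;
}
```

A caller would acknowledge the message only after durable_rename()
returns 0; on -1 it should treat the delivery as not yet committed.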

Solution 2: Linux semantics are changed so that rename()
returns only when the data hits the disk. All filesystems
are expected to implement this change.

Analysis: This sucks. It precludes some filesystem design
choices, prevents users from making a speed/reliability
tradeoff, and makes each filesystem more complex.

Solution 3: Some filesystems implement synchronous
directory updates for renames, using filesystem-specific
feature flags, chattr, etc.

Analysis: I wouldn't want to try to dictate anything to
the FS authors, but this solution seems inferior to me.
Each filesystem would have to implement such a flag to
become "MTA compatible". Why add a complex feature to the
filesystem when it can already be accessed via a userspace
API? It will be more complex for administrators too --
they will have to know which filesystems implement
synchronous directory metadata updates.

There are lots of filesystems out there. Why not use
an interface they should all support rather than ask for
per-filesystem, filesystem-specific improvements?

miket