> Actually, it's not all that simple (you have to find the enclosing
> directories of any files you're modifying, which might require string
> manipulation)
No, you have to find the directories you are modifying. And the
application knows darn well which directories it is modifying.
Don't speculate. Show some sample code, and let's see how hard it
would be to use the "Linux way". I am betting on "not hard at all".
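For the case where an application really has nothing but a path name, here is a rough sketch of what finding and flushing the enclosing directory could look like. The helper name fsync_parent_dir() and the 4096-byte buffer are assumptions made up for this illustration; dirname() is the standard libgen.h routine, and it may modify its argument, hence the copy.

```c
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync() the directory that contains `path`.
 * Returns 0 on success, -1 on failure.
 */
int fsync_parent_dir(const char *path)
{
    char copy[4096];            /* assumed upper bound on path length */
    int n, dfd, rc;

    /* dirname() may modify its argument, so work on a private copy. */
    n = snprintf(copy, sizeof copy, "%s", path);
    if (n < 0 || (size_t)n >= sizeof copy)
        return -1;

    dfd = open(dirname(copy), O_RDONLY);
    if (dfd < 0)
        return -1;

    rc = fsync(dfd);            /* flush the directory entry itself */
    close(dfd);
    return rc;
}
```

An application that already holds the directory name would skip the copy and the dirname() call and open the directory directly.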
> or necessarily all that fast (you're doubling the number of system
> calls and now the application is imposing an ordering on the
> filesystem that didn't exist before).
No, you are not doubling the number of system calls. As I have tried
to point out repeatedly, doing this stuff reliably and portably
already requires a sequence like this:

    write data
    flush data
    write "validity" indicator (e.g., rename() or fchmod())
    flush validity indicator

On Linux, flushing a rename() means calling fsync() on the directory
instead of the file. That's it. Doing that instead of fsync'ing the
file adds at most two system calls (to open and close the directory),
and those can be amortized over many operations on that directory
(think "mail spool"). So the added system call overhead is negligible.
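To make the amortization concrete, here is a minimal sketch in which the directory is opened once and the same descriptor is reused to commit every message. The names spool_open() and spool_commit() are hypothetical, and the sketch assumes the temporary file has already been written and fsync'ed by the caller.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical: open the spool directory once; reuse the fd for every commit. */
int spool_open(const char *dir)
{
    return open(dir, O_RDONLY);
}

/*
 * Hypothetical: make an already-flushed temporary file visible under its
 * final name, then flush the rename by fsync'ing the directory.
 * Both names are relative to the directory behind dfd.
 * Returns 0 on success, -1 on failure.
 */
int spool_commit(int dfd, const char *tmp_name, const char *final_name)
{
    if (renameat(dfd, tmp_name, dfd, final_name) < 0)
        return -1;
    return fsync(dfd);
}
```

Per message this costs one rename and one fsync() on the directory; the open() and close() of the directory happen once per spool, not once per message.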
As for "imposing an ordering on the filesystem that didn't exist
before", that is complete nonsense. This is imposing *precisely* the
ordering required for reliable operation; no more, no less. Relying
on mount options, "chattr +S", or journaling artifacts for your
ordering is the inefficient approach; since they impose extra
ordering, they can never be faster and will usually be slower.
> It's only necessary for ext2. Modern Linux filesystems (such as ext3
> or reiserfs) don't require it.
Only because they take the performance hit of flushing the whole log
to disk on every fsync(). Combine that with "data=ordered" and see
what happens to your performance. (Perhaps "data=ordered" should be
called "fsync=sync".) I would rather get back the performance and
convince application authors to understand what they are doing.
> Finally: ext2 isn't safe even if you do call fsync() on the directory!
Wrong.

    write temp file
    fsync() temp file
    rename() temp file to actual file
    fsync() directory

No matter where this crashes, it is perfectly safe on ext2. (If not,
ext2 is badly broken.) The worst that can happen after a crash is
that the file might exist with both the old name and the new name.
But an application can detect this case on startup and clean it up.
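The four steps above can be sketched in C roughly as follows. This is only an illustration, not code from any real application: replace_file(), write_all(), the ".tmp" suffix, and the 4096-byte path buffers are all made up for the example, and error handling is kept to the bare minimum (a failed call can leave the temporary file behind).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write all of buf to fd, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: replace dir/name with `data` so that a crash leaves
 * either the old contents or the new contents, never a partial file.
 * Returns 0 on success, -1 on failure.
 */
int replace_file(const char *dir, const char *name,
                 const char *data, size_t len)
{
    char tmp[4096], final[4096];    /* assumed path length limit */
    int n, dfd, fd, rc = -1;

    n = snprintf(tmp, sizeof tmp, "%s/%s.tmp", dir, name);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    n = snprintf(final, sizeof final, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof final)
        return -1;

    dfd = open(dir, O_RDONLY);
    if (dfd < 0)
        return -1;

    /* 1. write temp file */
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        goto out;
    /* 2. fsync() temp file */
    if (write_all(fd, data, len) < 0 || fsync(fd) < 0) {
        close(fd);
        goto out;
    }
    if (close(fd) < 0)
        goto out;
    /* 3. rename() temp file to actual file */
    if (rename(tmp, final) < 0)
        goto out;
    /* 4. fsync() directory */
    if (fsync(dfd) < 0)
        goto out;
    rc = 0;
out:
    close(dfd);
    return rc;
}
```

A startup pass that looks for leftover "*.tmp" names in the directory would cover the cleanup case mentioned above; that part is left out of the sketch.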
- Pat