Hm, I believe you are barking up the wrong tree. Either you are omitting
too much information in your statement above or you are contradicting
yourself.
What you are looking for is _exactly_ a particular FS solution! More
precisely, you are looking for a truly distributed file system.
I just get the impression you are not fully aware of what a distributed FS
(call it DFS for short) actually is.
In my understanding a DFS offers exactly what you need: each node has disks
and all disks on all nodes are part of the very same file system. Of course
each node maintains its local disks, i.e. the local part of the file system,
and certain operations require that the node communicate with the "DFS
master node(s)", for example to reserve disk blocks or to create/rename
files (to make sure no duplicate filenames are instantiated). -- Sound
familiar so far? You wanted to do exactly the same things, but at the block
layer and VFS layer instead of the FS layer...
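
To make the master node's role concrete, here is a minimal sketch of that coordination. It is not taken from any existing DFS; the class `MasterNode` and its methods are hypothetical names, and a real implementation would talk over the network and persist its state.

```python
class MasterNode:
    """Hypothetical DFS master: owns the global namespace and block reservations."""

    def __init__(self, blocks_per_node):
        # blocks_per_node: {node_id: number of free blocks on that node's disks}
        self.free_blocks = dict(blocks_per_node)
        self.names = {}  # filename -> node that holds the file

    def create(self, name, node_id):
        # Reject duplicates so no two nodes instantiate the same filename.
        if name in self.names:
            raise FileExistsError(name)
        self.names[name] = node_id

    def rename(self, old, new):
        # The target name must be free; the file stays on the same node.
        if new in self.names:
            raise FileExistsError(new)
        self.names[new] = self.names.pop(old)

    def reserve_blocks(self, node_id, count):
        # Reserve blocks on a node's disks; fail if the node lacks space.
        if self.free_blocks.get(node_id, 0) < count:
            raise OSError("no space on node %s" % node_id)
        self.free_blocks[node_id] -= count
        return count
```

Everything else (reading and writing data in blocks already reserved) can stay local to the node, which is the point of doing this at the FS layer.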
The difference between a DFS and your proposal is that a DFS keeps all
the caching benefits of a normal FS at the local node level, while your
proposal disables caching entirely. That is arguably impossible (data has
to be loaded into RAM to be read, modified, and written back), and certainly
no FS author will accept having their FS driver crippled in such a way. The
performance loss from removing caching completely will make sure you only
dream of those 50GiB/sec. More likely you will be getting a few bytes/sec...
(OK, I exaggerate a bit.) The seek times on the disks together with the
read/write timings are going to completely annihilate performance. A DFS
maintains caching at the local node level, so you can still keep open inodes
in memory, for example (just don't allow any other node to open the same
file at the same time, or you need to do some juggling via the "DFS master
node").
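
The "only one node may hold a file open" rule can be sketched as follows. This is an illustration under simplifying assumptions, not any real DFS protocol: `OpenTable` stands in for the master's bookkeeping, `Node` for a storage node, and the cached "inode" is just a dict.

```python
class OpenTable:
    """Hypothetical master-side table: which node holds each file open."""

    def __init__(self):
        self.owner = {}  # filename -> node_id

    def acquire(self, name, node_id):
        # The first opener becomes the owner; re-opening by the owner is fine.
        return self.owner.setdefault(name, node_id) == node_id

    def release(self, name, node_id):
        if self.owner.get(name) == node_id:
            del self.owner[name]


class Node:
    """Hypothetical storage node that caches inodes while it owns the file."""

    def __init__(self, node_id, table):
        self.node_id = node_id
        self.table = table
        self.inode_cache = {}  # filename -> in-memory inode

    def open(self, name):
        if not self.table.acquire(name, self.node_id):
            raise PermissionError("%s is open on another node" % name)
        # Safe to cache: no other node can change the file behind our back.
        return self.inode_cache.setdefault(name, {"name": name, "size": 0})

    def close(self, name):
        # A real FS would write back dirty state before dropping the inode.
        self.inode_cache.pop(name, None)
        self.table.release(name, self.node_id)
```

While node A holds the file, every read and write on A hits its local cache; the master is only consulted on open and close.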
To give you an analogy, you can think of a DFS like a NUMA machine, where
you have different access speeds to different parts of memory (for a DFS the
"storage device", same thing really) and where decisions on where to store
things are made depending on the resource/time cost involved. Simplest
example: a file created on node A will be allocated/written to a disk (or
multiple disks) located on node A, because accessing the local disks has a
lower time cost than going to a different node over the slower wire.
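
The cost-based placement in that example could look roughly like this. The function `pick_node` and the cost values are made up for illustration; a real DFS would measure or configure them.

```python
def pick_node(creator, nodes, free_blocks, needed, local_cost=1, remote_cost=10):
    """Choose the cheapest node that has room for `needed` blocks.

    The costs are arbitrary illustrative units: local disks are cheap,
    anything reached over the wire is more expensive.
    """
    candidates = [n for n in nodes if free_blocks.get(n, 0) >= needed]
    if not candidates:
        raise OSError("no node has enough free blocks")
    # Prefer the creating node; fall back to a remote node only if it is full.
    return min(candidates, key=lambda n: local_cost if n == creator else remote_cost)
```

For instance, `pick_node("A", ["A", "B"], {"A": 5, "B": 5}, 3)` picks node A, and the same call with only one free block on A falls back to node B.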
Your time would be much better spent creating the _one_ true DFS, or
helping improve one of the existing ones, instead of trying to hack up the
VFS/block layers to pieces. It almost certainly will be a hell of a lot
less work to implement a decent DFS in comparison to changing the block
layer, the VFS, _and_ every single FS driver out there to comply with the
block layer and VFS changes. And at the same time you get exactly the same
features you wanted to have but with hugely boosted performance.
I hope my ramblings made some kind of sense...
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS Maintainer / IRC: #ntfs on irc.openprojects.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
- In reply to: Peter T. Breuer: "Re: [RFC] mount flag "direct" (fwd)"