This makes me a bit suspicious of hardware, probably networking. It
really looks like data is getting corrupted between client and server.
The fact that two different servers behaved differently while both
running the same kernel seems to support the hardware theory.
Maybe if you could get a tcpdump (-s 1500 port 2049) on both the server and the
client, I could have a look at the filehandles and see if I can work out why
they are 'stale', and whether it could be a hardware problem.
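
Once both traces exist, one way to check the corruption theory is to compare
the same filehandle as captured on the server and on the client, byte by byte.
The following is only a rough sketch: it assumes the filehandle bytes have
already been copied out of each trace by hand as hex strings, and the function
name diff_filehandles and the sample values are made up for illustration.

```python
def diff_filehandles(server_hex, client_hex):
    """Compare two filehandles given as hex strings.

    Returns a list of (offset, server_byte, client_byte) tuples, one per
    differing byte. A side that is shorter than the other reports None
    for the bytes it is missing.
    """
    # bytes.fromhex ignores whitespace between byte pairs, so hex pasted
    # from a packet dump can be used without much cleanup.
    server = bytes.fromhex(server_hex)
    client = bytes.fromhex(client_hex)

    diffs = []
    for offset in range(max(len(server), len(client))):
        s = server[offset] if offset < len(server) else None
        c = client[offset] if offset < len(client) else None
        if s != c:
            diffs.append((offset, s, c))
    return diffs


if __name__ == "__main__":
    # Hypothetical filehandle bytes, not taken from any real trace.
    sent_by_server = "01 00 00 01 00 03 00 ff"
    seen_by_client = "01 00 00 01 00 03 00 7f"
    for offset, s, c in diff_filehandles(sent_by_server, seen_by_client):
        print("byte %d: server=%r client=%r" % (offset, s, c))
```

An empty result for a handle that the server still rejects as stale would
point away from corruption on the wire. A handful of flipped bytes between
the two captures would fit the hardware theory.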
NeilBrown
>
> I'm really lost here. What can I try/do to further narrow this down? Is
> there a specific kernel revision I could go back to? Note that 2.5.70
> already triggered it. With 2.4 on the server none of this happens.
> Only thing left is to try booting the server without smp support, but I
> get some 'hde: lost interrupt' messages and it doesn't boot.
> Note that I also tried to export a partition not on dm. Filesystem is
> ext3. I also tried the patches you posted some days ago in another thread.
>
> Thanks for any suggestions,
>
> Jan
>
> # grep NFS .config
> CONFIG_NFS_FS=m
> CONFIG_NFS_V3=y
> CONFIG_NFS_V4=y
> CONFIG_NFS_DIRECTIO=y
> CONFIG_NFSD=m
> CONFIG_NFSD_V3=y
> CONFIG_NFSD_V4=y
> CONFIG_NFSD_TCP=y
>
>
> --
> Linux rubicon 2.5.75-mm1-jd10 #1 SMP Sat Jul 12 19:40:28 CEST 2003 i686
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/