After two weeks of quasi stability the server has crashed.. again... with the
following message... (partial)
------------------
kernel BUG at highmem.c:155
Invalid Operand : 0000
CPU : 1
EIP : 0010:[<c012fcb>]
EFLAGS: 00010286
eax:0000001d ebx: 00000000 esi:c2147ec0
edi: 00000000 ebp: 00001000 esp: df587ebc
Process sshd (pid : 6230, stackpage : df587000)
......
-----------------
This is a standard 2.4.3 kernel with four patches, to correct the
dcache inode non-clearance (ever growing inode and directory cache) as well
as the patch to apply vm pressure to lower these caches (Ed Tomlinsons, I
beleive...).., next patches are for aacraid.. including jens axboes' SCSI
patch as well as the aacraid patch.
The symptons were an ever more sluggish machine over time, memory usage
looked pretty standard with the majority of memory assigned to cache... what
would happen is that at terminal it would go into semi-freeze states of about
5-10 seconds (increasing with time), where no user interaction was possible.
By terminal I mean through a ssh remote terminal.... The load would also
occasionally just increase for no apparent reason to values of 7,8,9...
The server is a dual 933mhz Xeon (PIII) on a ServerWorks Motherboard (HP
NetServer) with 1.2 gig or ram, 6 SCSI III 18.6 GIG drives in HP Netraid and
1 9 gig SCSI as the root FS... Running suse 7.0 and reiserfs...
ANy and ALL advice/patches would be greatly appreciated.... I will probably
be moving back to 2.2 kernel series as a result of my stability problems..
Thanks
MarCin
-- ----------------------------- Marcin Kowalski Linux/Perl Developer ----------------------------- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/