It is marginally possible to increase that far enough
	that you get about 3.8-3.9 GB for a usermode process.
	(I use k=1024, M=k*k, G=k*k*k)
	It is absolutely impossible to get it above the 4.0 GB
	limit.   This hard limit is built into the memory addressing
	and mapping hardware of the i386 (and all of its successors).
	There is a choke-point of 32 address bits along the way,
	which very effectively prevents going above 4.0 GB.
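
	The arithmetic behind that limit can be sketched as below.  This
	is only an illustration using the k/M/G units defined above; the
	function names are made up for this example and are not kernel
	interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Units as used in this message: k = 1024, M = k*k, G = k*k*k. */
#define K UINT64_C(1024)
#define M (K * K)
#define G (K * K * K)

/* Number of distinct byte addresses reachable with `bits` address bits. */
uint64_t addressable_bytes(unsigned bits)
{
    assert(bits < 64);              /* the shift below would overflow */
    return UINT64_C(1) << bits;
}

/* What is left for usermode once `kernel_bytes` of the same
 * address space are reserved for the kernel (hypothetical helper). */
uint64_t user_bytes(unsigned bits, uint64_t kernel_bytes)
{
    uint64_t total = addressable_bytes(bits);
    assert(kernel_bytes <= total);
    return total - kernel_bytes;
}
```

	With 32 address bits, addressable_bytes(32) is exactly 4*G, and
	reserving 1*G for the kernel leaves 3*G for usermode.  Getting
	3.8-3.9 GB for usermode would mean squeezing the kernel's share
	down to roughly 0.1-0.2 GB.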
	With considerable infrastructural work(*) it MIGHT be possible
	to go very near the 4.0 GB limit for userspace, but I am not
	an expert here.   The crux is the memory mapping of the
	supervisor/interrupt mode stack.  As far as I understand, on
	i386 we must have the supervisor stack (and 'struct task_struct')
	mapped into the same address space as the usermode.  Only
	memory protection prevents usermode from accessing that data.
	Parts of the kernel code, plus the parts of kernel data related
	to MMU control, must also be in that address space.
	(*) Supervisor (kernel) mode must have its stack, the
	switch-around code, and some data sets in its address space
	when the transition into kernel space is made (and reversed).
	Accessing userspace from the kernel can then be done via
	kmap()(-like) windows.  Of course this is considerably slower
	than the current method, where each user-space has 1/4 of its
	total address space allocated for kernel internal use.
	To get the most out of your box, you need to run your problem
	as much as possible on separate processors and in separate
	contexts.  That way you will get the most out of your SMP setup.
	(Consider your box as a small Beowulf cluster.)
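
	A minimal sketch of "separate processors, separate contexts",
	assuming the problem can be cut into independent slices: fork one
	worker process per slice and let the scheduler spread them over
	the CPUs.  The names online_cpus(), demo_work() and
	run_in_processes() are hypothetical, and demo_work() is only a
	stand-in for the real computation.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of processors currently online (at least 1). */
long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Hypothetical stand-in for one slice of the real computation. */
int demo_work(long slice)
{
    volatile unsigned long sum = 0;
    for (unsigned long i = 0; i < 1000000UL; i++)
        sum += i ^ (unsigned long)slice;
    return 0;                           /* 0 = success */
}

/*
 * Fork `n` worker processes, each with its own address space, run
 * work(slice) in each, and return how many exited successfully.
 */
long run_in_processes(long n, int (*work)(long))
{
    assert(n > 0 && work != NULL);
    long started = 0, ok = 0;

    for (long i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)                   /* child: do one slice, then exit */
            _exit(work(i) == 0 ? 0 : 1);
        if (pid > 0)
            started++;
    }
    for (long i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

	Calling run_in_processes(online_cpus(), demo_work) starts one
	worker per online CPU.  Each worker gets its own full usermode
	address space, which is the point of splitting the job this way.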
	Of course, for problems where you run e.g. PVM you will need
	fast communication between processes, and nothing would beat
	a single shared memory space.   You might be able to get that
	by using e.g. SHM segments for PVM's IPC.
	Linux doesn't support user semaphores in SHM in the scheduling
	sense, though.  You can, of course, use CPU-burning spin-locks
	to guard access to the shared memory area.  The best would,
	IMO, be a hybrid: use SHM for transferring large amounts of
	data between processes, and something like PF_UNIX sockets
	for signaling that some new data is available.
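
	The hybrid could look roughly like the sketch below.  It is an
	illustration only, not how PVM does it: the function name
	shm_socket_transfer() and the 4096-byte segment size are
	assumptions made for the example.  The bulk data goes through a
	SysV SHM segment, and a PF_UNIX socketpair carries only a small
	"data ready" notice, so the consumer sleeps in read() instead of
	spinning.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SEG_SIZE 4096   /* hypothetical segment size for this sketch */

/*
 * The child (producer) copies `msg` into a SysV SHM segment and then
 * sends its length over a PF_UNIX socket.  The parent (consumer) sleeps
 * in read() until that notice arrives, then copies the data out.
 * Returns the number of bytes received, or -1 on error.
 */
long shm_socket_transfer(const char *msg, char *out, size_t outlen)
{
    size_t len = strlen(msg) + 1;
    assert(len <= SEG_SIZE && len <= outlen);

    int id = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
    if (id < 0)
        return -1;
    char *seg = shmat(id, NULL, 0);
    /* Mark for removal now; the segment lives until the last detach. */
    shmctl(id, IPC_RMID, NULL);
    if (seg == (char *)-1)
        return -1;

    int sv[2];          /* AF_UNIX is the same constant as PF_UNIX */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        shmdt(seg);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        shmdt(seg);
        return -1;
    }
    if (pid == 0) {                         /* producer */
        close(sv[0]);
        memcpy(seg, msg, len);              /* bulk data goes through SHM */
        uint32_t n = (uint32_t)len;
        ssize_t w = write(sv[1], &n, sizeof n);  /* small "ready" notice */
        _exit(w == (ssize_t)sizeof n ? 0 : 1);
    }

    close(sv[1]);                           /* consumer */
    uint32_t n = 0;
    long got = -1;
    /* Blocks in the kernel; no CPU-burning spin on the shared area. */
    if (read(sv[0], &n, sizeof n) == (ssize_t)sizeof n && n <= outlen) {
        memcpy(out, seg, n);
        got = (long)n;
    }
    close(sv[0]);
    waitpid(pid, NULL, 0);
    shmdt(seg);
    return got;
}
```

	For example, shm_socket_transfer("hello", buf, sizeof buf) with a
	64-byte buf should return 6 and leave "hello" in buf.  A real
	setup would keep the segment and the socket open and reuse them
	for many transfers rather than creating them per message.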
	In the _usual_ case you can ignore such details and use your
	favourite clustering library, like PVM.
> Thanks a great deal,
> Ernie
> 
> PS. Please `CC' me the answer!
> Ernest N. Mamikonyan     E-Mail: ernest@newton.physics.drexel.edu
> Philadelphia, PA  19104  Web: www.physics.drexel.edu/research/astro
/Matti Aarnio