Re: Inclusion of zoned inactive/free shortage patch

Marcelo Tosatti (marcelo@conectiva.com.br)
Tue, 17 Jul 2001 22:19:42 -0300 (BRT)


On Tue, 17 Jul 2001, Linus Torvalds wrote:

>
> On Tue, 17 Jul 2001, Marcelo Tosatti wrote:
> >
> > This fixes most of the highmem problems (I'm not able to deadlock a 4GB
> > machine running memory-intensive programs with the patch anymore. I've
> > also received one success report from Dirk Wetter running two 2GB
> > simulations on a 4GB machine).
>
> Do you have any really compelling reasons for adding the zone parameter to
> swap-out?

Avoid the page-faults and unecessary swap space allocation.

> At worst, we get a few more page-faults (not IO).

Don't think its just a "few more" depending on the setup... I've seen
"__get_swap_page()" using 99%CPU time of the system due to DMA specific
inactive shortage while the kernel was aging/unmapping pte's pointing to
normal/highmem pages during quite some time. As soon as the DMA inactive
shortage is gone, the problem goes away.

That is the main reason why I did zone specific pte scanning.

> At best, NOT doing this should generate a more complete picture of the
> VM state.

Indeed. Thats the price we have to pay...

> I'd really prefer the VM scanning to not be zone-aware..

Right, but think about small/big zones on the same machine.

If we have a _specific_ inactive shortage on the DMA zone on a highmem
machine with shitloads of memory, its not worth to potentially unmap
all pte's pointing to all high/normal memory.

Practical example: 4GB machine, running two "fillmem" (2GB each).

The following stats are for DMA specific "swap_out()" calls.

vm_pteskipzone 2534665 <-- Number of pte's skipped because they pointed to
non-DMA zones.
vm_ptescan 13984 <-- Number of pte's pointing to DMA pages scanned.
vm_pteunmap 6320 <-- From "vm_ptescan", how many pte's have been
succesfully unmapped.

Now imagine that on a 16GB machine. Its a big storm of unecessary
softfaults/swap space allocation.

Its a tradeoff: I think the unecessary pte unmap's are a bigger problem
than the "not complete picture" of the VM state.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/