operative system can workaround such a weird mem layout just fine
with discontigmem, there is no problem making such hardware to work.
> Could you please explain how to work around it with discontigmem?
Are you serious? that's what ARM is doing for ages in 2.4, I think this
part was obvious under the whole previous discussions. just put each
discontigous chunk into a separated pgdat and it will work flawlessy
(also make sure to apply all pending vm/numa fixes in -aa first that are
needed for numa anyways). They will all be normal zones provided you
implement a static view of them in the kernel virtual address space, and
you also cover page_address/virt_to_page/pci_map* of course.
Yes, nonlinear would be just a bit faster than discontigmem in the above
scenario (it's non numa so you are not forced to describe the
discontigmem topology to common code and that would save some runtime
bit), but avoiding nonlinear also lefts the common code quite simpler
without adding further mm abstractions.
With hundred of pgdats the "discontigmem workaround" becomes prohibitive,
and so nonlinear become mandatory in a scenario like origin2k. But in
the above scenario, "nonlinear" would be just a minor optimizations that
also leads to additional common code complexity.
> > If
> > as expected they are (indeed similar to numa-q) placed on different numa
> > nodes, then they must go into pgdat regardless, so nonlinear or not
> > cannot make difference with numa. Either ways (both if it's broken
> > hardware workaroundable with discontigmem, or proper numa architecture)
> > there will be no problem at all in coalescing the blocks below 4G into
> > ZONE_NORMAL (and for the blocks above 4G nonlinaer can do nothing).
>
> Why can config_nonlinear do nothing with blocks above 4G physical?
Just to be sure it's clear, with "do nothing", I mean it cannot put them
into zone_normal anyways. Putting the whole thing into zone_normal was
your whole point of the previous email: "By using config_nonlinear, the
kernel can directly address all of that memory...giving you the full
800MB...as zone_normal". I think I just told you why, grep for vmalloc32
and see why it doesn't passes to GFP the __GFP_HIGHMEM flag.
Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/