Re: 7.52 second kernel compile

Cort Dougan (cort@fsmlabs.com)
Mon, 18 Mar 2002 12:42:15 -0700

Next message: Keith Owens: "Re: EXPORT_SYMBOL doesn't work"
Previous message: Jason Li: "Re: EXPORT_SYMBOL doesn't work"

I have a counter-proposal. How about a hardware TLB load (if we must have
one) that does the right thing? I don't think the PPC is a good example of
a well-managed hardware TLB load process. I suspect software loads show up
so well on the PPC because the hardware does some very, very foolish
things. I've had some conversations with Moto engineers who have suggested
that my suspicion is right: the TLB loads are actually cached when the
hardware does them, so we waste cache space on a line that we had better
darn well not be loading again (otherwise we've thrown out our TLB entry
way too early).

I still think there are some clever tricks one could do with the VSIDs to
get a much saner system than the current hash table. It would take some
serious work, I think, but the results could be worthwhile. By carefully
choosing the VSID scatter algorithm and the size of the hash table, I think
one could get a much better access method.
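
To make the interaction between the VSID scatter and the table size
concrete, here is a rough Python sketch, not kernel code. It assumes the
32-bit PowerPC primary hash as I understand it (low 19 bits of the VSID
XORed with the 16-bit page index, masked to the table size, 8 PTEs per
PTEG). The scatter function vsid_for() and its multiplier are purely
hypothetical, and the secondary hash is ignored, so the numbers only show
how many entries would spill out of their primary PTEG.

```python
from collections import Counter

VSID_BITS = 24       # VSID width on 32-bit PowerPC
HASH_BITS = 19       # width of the primary hash
PTES_PER_PTEG = 8    # slots per PTE group


def vsid_for(context, segment, multiplier):
    """Hypothetical scatter: map (context, segment) to a 24-bit VSID."""
    return ((context * 16 + segment) * multiplier) & ((1 << VSID_BITS) - 1)


def primary_pteg(vsid, ea, n_ptegs):
    """Primary PTEG index for an effective address (n_ptegs is a power of 2)."""
    page_index = (ea >> 12) & 0xFFFF                  # EA bits below the segment
    h = (vsid & ((1 << HASH_BITS) - 1)) ^ page_index  # primary hash
    return h & (n_ptegs - 1)                          # table-size mask


def overflow_count(multiplier, n_ptegs, contexts=64, pages=32, segments=(0,)):
    """Count mappings that do not fit in their primary PTEG.

    Every context maps the same `pages` low pages of each segment, which is
    roughly what many processes with similar layouts look like.
    """
    load = Counter()
    for ctx in range(contexts):
        for seg in segments:
            vsid = vsid_for(ctx, seg, multiplier)
            for page in range(pages):
                ea = (seg << 28) | (page << 12)
                load[primary_pteg(vsid, ea, n_ptegs)] += 1
    return sum(max(0, n - PTES_PER_PTEG) for n in load.values())


if __name__ == "__main__":
    for mult in (1, 64):
        for n_ptegs in (1024, 4096):
            print(mult, n_ptegs, overflow_count(mult, n_ptegs))
```

In this toy setup a multiplier of 1 spreads the 2048 mappings evenly over
1024 PTEGs with no spills, while a multiplier of 64 makes every VSID a
multiple of the table size, so all 64 contexts pile into the same 32 PTEGs
and 1792 mappings spill. Growing the table to 4096 PTEGs only brings that
down to 1024 spills, which is the sense in which the scatter and the table
size have to be chosen together.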

} However, one good argument against software TLB loading that I have
} heard (and which you mentioned in another message) is that loading a
} TLB entry in software requires taking an exception, which requires
} synchronizing the pipeline, which is expensive. With hardware TLB
} reload you can just freeze the pipeline while the hardware does a
} couple of fetches from memory. And PPC64 remains the only
} architecture I know of that supports a full 64-bit virtual address
} space _and_ can do hardware TLB reload.
}
} I would be interested to see measurements of how many cache misses on
} average each hardware TLB reload takes; for a hash table I expect it
} would be about 1, for a 3-level tree I expect it would be very
} dependent on access pattern but I wouldn't be surprised if it averaged
} about 1 also on real workloads.
}
} But this is all a bit academic, the real question is how do we deal
} most efficiently with the real hardware that is out there. And if you
} want a 7.5 second kernel compile the only option currently available
} is a machine whose MMU uses a hash table. :)
}
} Paul.
} -
} To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
} the body of a message to majordomo@vger.kernel.org
} More majordomo info at http://vger.kernel.org/majordomo-info.html
} Please read the FAQ at http://www.tux.org/lkml/
