Re: Intel P6 vs P7 system call performance

Linus Torvalds (torvalds@transmeta.com)
Thu, 19 Dec 2002 14:49:50 -0800 (PST)

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: John Bradford: "Re: Dedicated kernel bug database"
Previous message: Adam J. Richter: "Re: RFC: bus_type and device_class merge (or partial merge)"

On Thu, 19 Dec 2002, H. Peter Anvin wrote:
>
> Unfortunately it means taking an indirect call cost for every invocation...

Ehh.. I just tested the "cost" of this on a PIII (comparing a indirect
call with a direct one), and it's exactly one extra cycle.

ONE CYCLE.

On a P4 the difference was 4 cycles. On my test P95 system I didn't see
any difference at all. And I don't have an athlon handy in my office.

That's the difference between

static void *address = &do_nothing;
asm("call *%0" :"m" (address))

and

asm("call do_nothing");

So it's between 0-4 cycles on machines that take 200 - 1000 cycles for
just the system call overhead.

And for that "overhead", you get a binary that trivially works on all
kernels, _and_ doesn't need extra mmap's etc (which are _easily_ thousands
of cycles).

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: John Bradford: "Re: Dedicated kernel bug database"
Previous message: Adam J. Richter: "Re: RFC: bus_type and device_class merge (or partial merge)"