Then the thing is different, I expected SSE3 not to mess the xmm layout.
If you just know SSE3 would break with the xorps the fxrestor way is
better. Anyways the problems I have about the implementation remains
(memset and duplicate efforts with ptrace in creating the empty fpu
state).
If they tell you the xmm registers won't change with SSE3 instead I
still prefer the xorps, that's 3bytes x 8 registers = 24 bytes of
icachce, compared to throwing away 512bytes/32 = 16 dcache cachelines so
it should be significantly faster.
Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/