Alan Cox wrote:
>On Sat, 2002-10-26 at 20:22, Manfred Spraul wrote:
>
>
>>kmalloc spends a large part of the total execution time trying to find
>>the cache for the passed in size.
>>
>>What about the attached patch (against 2.5.44-mm5)?
>>It uses fls to jump over the caches that are definitely too small.
>>
>>
>
>Out of curiosity how does fls compare with finding the right cache by
>using a binary tree walk ? A lot of platforms seem to use generic_fls
>which has a lot of conditions in it and also a lot of references to just
>computed values that look likely to stall
>
>
A binary tree walk means 4 unpredictable branches, and at least i386 can
use bsrl for a fast fls().
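As a rough illustration of the idea (not the kmalloc code itself), here is a
minimal userspace sketch. The size table, the demo_* names and the loop-based
demo_fls() are all hypothetical stand-ins: the table holds only powers of two,
and the real cache list and lookup differ. fls() gives a lower bound on the
index, so the scan starts past every class that is certainly too small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical power-of-two size classes; the real kmalloc table differs. */
static const size_t demo_sizes[] = { 32, 64, 128, 256, 512, 1024, 2048, 4096 };
#define DEMO_NSIZES	((int)(sizeof(demo_sizes) / sizeof(demo_sizes[0])))
#define DEMO_MIN_SHIFT	5	/* log2 of the smallest class (32) */

/* Portable stand-in for fls(): 1-based index of the highest set bit, 0 for 0. */
static int demo_fls(unsigned int x)
{
	int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* Baseline: walk the table from the start until a class is large enough. */
static int demo_index_linear(size_t size)
{
	int i;

	for (i = 0; i < DEMO_NSIZES; i++)
		if (size <= demo_sizes[i])
			return i;
	return -1;
}

/* fls-based: skip every class that is certainly too small, then scan. */
static int demo_index_fls(size_t size)
{
	int i, start;

	if (size > demo_sizes[DEMO_NSIZES - 1])
		return -1;
	/* fls(size - 1) is ceil(log2(size)) for size >= 1. */
	start = size ? demo_fls((unsigned int)(size - 1)) - DEMO_MIN_SHIFT : 0;
	if (start < 0)
		start = 0;
	for (i = start; i < DEMO_NSIZES; i++)
		if (size <= demo_sizes[i])
			return i;
	return -1;
}

/* Both lookups must agree for every request size, including oversized ones. */
static void demo_selfcheck(void)
{
	size_t s;

	for (s = 0; s <= 5000; s++)
		assert(demo_index_fls(s) == demo_index_linear(s));
}
```

With this table, a 33-byte request maps to index 1 (the 64-byte class) without
looking at the 32-byte entry. With a fast fls() the start index costs no
data-dependent branches, which is the point of the comparison with a tree walk.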
Patch is attached.
-- Manfred
--------------040808040707070602020605
Content-Type: text/plain; name="patch-fls"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="patch-fls"

--- 2.5/include/asm-i386/bitops.h	Sun Sep 22 06:25:12 2002
+++ build-2.5/include/asm-i386/bitops.h	Sun Oct 27 11:04:57 2002
@@ -414,11 +414,22 @@
 	return word;
 }
 
-/*
+/**
  * fls: find last bit set.
+ * @x: The word to search
+ *
  */
 
-#define fls(x) generic_fls(x)
+static inline int fls(int x)
+{
+	int r;
+
+	__asm__("bsrl %1,%0\n\t"
+		"jnz 1f\n\t"
+		"movl $-1,%0\n"
+		"1:" : "=r" (r) : "g" (x));
+	return r+1;
+}
 
 #ifdef __KERNEL__
 

--------------040808040707070602020605--
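For readers without an i386 box at hand, the semantics the patch aims for
(1-based position of the highest set bit, 0 for an input of 0) can be sketched
in portable C. The sketch_* names are hypothetical. The first function follows
the branchy binary-search style of a generic fls and assumes a 32-bit unsigned
int; the second assumes a GCC or Clang compiler providing __builtin_clz, which
on i386 is typically compiled to bsr.

```c
#include <assert.h>
#include <limits.h>

/* Binary-search style: five dependent tests; assumes 32-bit unsigned int. */
static int sketch_fls_generic(unsigned int x)
{
	int r = 32;

	if (!x)
		return 0;
	if (!(x & 0xffff0000u)) { x <<= 16; r -= 16; }
	if (!(x & 0xff000000u)) { x <<= 8;  r -= 8; }
	if (!(x & 0xf0000000u)) { x <<= 4;  r -= 4; }
	if (!(x & 0xc0000000u)) { x <<= 2;  r -= 2; }
	if (!(x & 0x80000000u)) { r -= 1; }
	return r;
}

/*
 * Compiler-builtin style. __builtin_clz(0) is undefined, so zero is
 * handled explicitly, mirroring the "movl $-1" fixup plus "r+1" in
 * the patch.
 */
static int sketch_fls_builtin(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}
```

Both give 0 for 0, 1 for 1 and 32 for 0x80000000, so either can serve as a
reference when checking an architecture-specific fls().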