Hi!
> When some objects are allocated by one CPU but freed by another CPU we can
> consume a lot of cycles doing divides in obj_to_index().
>
> (Typical load on a dual processor machine where network interrupts are
> handled by one particular CPU (allocating skbufs), and the other CPU is runni
David Miller wrote:
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Mon, 04 Dec 2006 22:34:29 +0100
On a 200 MHz sparcv9 machine, the division takes 64 cycles instead of 1 cycle
for a multiply.
For UltraSPARC I and II (which is what this 200 MHz guy probably is),
it's 4 cycle latency for a mul
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Mon, 04 Dec 2006 22:34:29 +0100
> On a 200 MHz sparcv9 machine, the division takes 64 cycles instead of 1 cycle
> for a multiply.
For UltraSPARC I and II (which is what this 200 MHz guy probably is),
it's 4 cycle latency for a multiply (32-bit or 64-bit
instead of a divide in obj_to_index()
When some objects are allocated by one CPU but freed by another CPU we can
consume a lot of cycles doing divides in obj_to_index().
(Typical load on a dual processor machine where network interrupts are handled
by one particular CPU (allocating skbufs), and the
This is similar stuff to asm-generic/div64.h right? The divide overhead
depends on the platform? Maybe it would be better to place it in
asm-generic/div.h and then have platform specific functions?
ue with modern CPUs:
> elapsed time for 10^9 loops on Pentium M 1.6 GHz
> 24 s for the version using divides
> 3.8 s for the version using multiplies
>
> [PATCH] SLAB : use a multiply instead of a divide in obj_to_index()
>
> When some objects are allocated by one
use a multiply instead of a divide in obj_to_index()
When some objects are allocated by one CPU but freed by another CPU we can
consume a lot of cycles doing divides in obj_to_index().
(Typical load on a dual processor machine where network interrupts are handled
by one particular CPU (allocati
On Mon, 4 Dec 2006, Eric Dumazet wrote:
> Doing some math, we can use a reciprocal multiplication instead of a divide.
Could you generalize the reciprocal thingy so that the division
can be used from other parts of the kernel as well? It would be useful to
separately get some cycle counts on a
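
For readers following the thread, here is a minimal sketch of the reciprocal
trick being discussed. It is not the code from the patch; the names
recip_value() and recip_divide() are hypothetical. It assumes a 32-bit
divisor d >= 2 that is known in advance (such as a slab's object size), so
that R = ceil(2^32 / d) can be computed once and each later division becomes
a 64-bit multiply and a shift. The result matches a / d only while a * d
does not exceed 2^32, which should hold for offsets inside a single slab.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: precompute R = ceil(2^32 / d) once per divisor.
 * d == 1 would need R = 2^32, which does not fit in 32 bits, so this
 * sketch simply rejects it.
 */
static uint32_t recip_value(uint32_t d)
{
    assert(d >= 2);
    return (uint32_t)(((UINT64_C(1) << 32) + d - 1) / d);
}

/*
 * Hypothetical helper: compute a / d as (a * R) >> 32, where r is the
 * value returned by recip_value(d). Exact as long as a * d <= 2^32;
 * larger dividends can be off by one.
 */
static uint32_t recip_divide(uint32_t a, uint32_t r)
{
    return (uint32_t)(((uint64_t)a * r) >> 32);
}
```

A caller would store recip_value(size) next to the object size at setup time
and then call recip_divide(offset, stored_reciprocal) on the hot path, paying
for the real divide only once.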
When some objects are allocated by one CPU but freed by another CPU we can
consume a lot of cycles doing divides in obj_to_index().
(Typical load on a dual processor machine where network interrupts are handled
by one particular CPU (allocating skbufs), and the other CPU is running the
applicatio