On Tue, 2018-04-10 at 06:34:35 UTC, Christophe Leroy wrote:
> The current implementation of from64to32() generates poor code (nine instructions):
> 
> 0000000000000270 <.from64to32>:
>  270: 38 00 ff ff     li      r0,-1
>  274: 78 69 00 22     rldicl  r9,r3,32,32
>  278: 78 00 00 20     clrldi  r0,r0,32
>  27c: 7c 60 00 38     and     r0,r3,r0
>  280: 7c 09 02 14     add     r0,r9,r0
>  284: 78 09 00 22     rldicl  r9,r0,32,32
>  288: 7c 00 4a 14     add     r0,r0,r9
>  28c: 78 03 00 20     clrldi  r3,r0,32
>  290: 4e 80 00 20     blr
> 
> This patch modifies from64to32() to operate in the same
> spirit as csum_fold().
> 
> It swaps the two 32-bit halves of sum, then adds the result to the
> unswapped sum. If adding the two 32-bit halves produces a carry, it
> propagates from the lower half into the upper half, giving us the
> correct sum in the upper half.
> 
> The resulting code is:
> 
> 0000000000000260 <.from64to32>:
>  260: 78 60 00 02     rotldi  r0,r3,32
>  264: 7c 60 1a 14     add     r3,r0,r3
>  268: 78 63 00 22     rldicl  r3,r3,32,32
>  26c: 4e 80 00 20     blr
> 
> Signed-off-by: Christophe Leroy <christophe.le...@c-s.fr>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/55a0edf083022e402042255a0afb03
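
For readers without the tree at hand, the two approaches described in the quoted message can be sketched in portable C roughly as follows. This is only an illustration, not the kernel source: the names fold_two_step() and fold_rotate() are hypothetical, and the rotate is written out by hand rather than using a kernel helper.

```c
#include <stdint.h>

/* Hypothetical name: mask-and-add fold, done twice so that the carry
 * out of the first addition is added back in (end-around carry). */
static inline uint32_t fold_two_step(uint64_t x)
{
	/* 32 bits + 32 bits gives up to 33 bits */
	x = (x & 0xffffffffULL) + (x >> 32);
	/* fold the possible carry bit back into the low 32 bits */
	x = (x & 0xffffffffULL) + (x >> 32);
	return (uint32_t)x;
}

/* Hypothetical name: rotate-and-add fold. Swapping the halves and
 * adding puts hi + lo in both halves; any carry out of the lower half
 * lands in the upper half, which then holds the folded sum. */
static inline uint32_t fold_rotate(uint64_t x)
{
	uint64_t swapped = (x << 32) | (x >> 32);	/* rotate by 32 */

	return (uint32_t)((x + swapped) >> 32);
}
```

Both functions should return the same value for any input. The second form corresponds to the rotate, add, and shift sequence shown in the shorter disassembly above.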

cheers
