Le 13/02/2022 à 04:01, David Laight a écrit : > From: Christophe Leroy >> Sent: 11 February 2022 10:25 >> >> When building kernel with CONFIG_CC_OPTIMISE_FOR_SIZE, several >> copies of csum_sub() are generated, with the following code: >> >> 00000170 <csum_sub>: >> 170: 7c 84 20 f8 not r4,r4 >> 174: 7c 63 20 14 addc r3,r3,r4 >> 178: 7c 63 01 94 addze r3,r3 >> 17c: 4e 80 00 20 blr >> >> Let's define a PPC32 version with subc/addme, and for it's inlining. >> >> It will return 0 instead of 0xffffffff when subtracting 0x80000000 to itself, >> this is not an issue as 0 and ~0 are equivalent, refer to RFC 1624. > > They are not always equivalent. > In particular in the UDP checksum field one of them is (0?) 'checksum not > calculated'. > > I think all the Linux functions have to return a non-zero value (for non-zero > input). > > If the csum is going to be converted to 16 bit, inverted, and put into a > packet > the code usually has to have a check that changes 0 to 0xffff. > However if the csum functions guarantee never to return zero they can feed > an extra 1 into the first csum_partial() then just invert and add 1 at the > end. > Because (~csum_partion(buffer, 1) + 1) is the same as ~csum_partial(buffer, 0) > except when the buffer's csum is 0xffffffff. > > I did do some experiments and the 64bit value can be reduced directly to > 16bits using '% 0xffff'. > This is different because it returns 0 not 0xffff. > However gcc 'randomly' picks between the fast 'multiply by reciprocal' > and slow divide instruction paths. > The former is (probably) faster than reducing using shifts and adc. > The latter definitely slower. >
Ok, I submitted a patch to force inlining of all checksum helpers in net/checksum.h instead. Christophe