Georg,

I apologize for calling you Geoff below! Just realized my mistake.

Med venlig hilsen / Kind regards,
-Morten Brørup

> -----Original Message-----
> From: Morten Brørup
> Sent: Saturday, 16 October 2021 10.21
> To: 'Georg Sauthoff'
> Cc: 'dev@dpdk.org'; 'Ferruh Yigit'; 'Olivier Matz'; 'Thomas Monjalon';
> 'David Marchand'
> Subject: RE: [dpdk-dev] [PATCH 1/1] net: fix aliasing issue in checksum
> computation
>
> Geoff,
>
> I have given this some more thoughts.
>
> Most bytes transferred in real life are transferred in large packets,
> so faster processing of large packets is a great improvement!
>
> Furthermore, a quick analysis of a recent packet sample from an ISP
> customer of ours shows that less than 8 % of the packets are odd size.
> Would you consider adding an unlikely() to the branch handling the odd
> byte at the end?
>
> -Morten
>
> > -----Original Message-----
> > From: Morten Brørup
> > Sent: Thursday, 14 October 2021 22.22
> >
> > > -----Original Message-----
> > > From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> > > Sent: Thursday, 14 October 2021 19.20
> > >
> > > On 9/18/2021 12:49 PM, Georg Sauthoff wrote:
> > > > That means a superfluous cast is removed and aliasing through a uint8_t
> > > > pointer is eliminated. Note that uint8_t doesn't have the same
> > > > strict-aliasing properties as unsigned char.
> > > >
> > > > Also simplified the loop since a modern C compiler can speed up (i.e.
> > > > auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
> > > > for Haswell using AVX registers while halving the number of instructions
> > > > in the generated code.
> > > >
> > > > Signed-off-by: Georg Sauthoff <m...@gms.tf>
> > >
> > > + Morten. (Because of past reviews on cksum code)
> >
> > Thanks, Ferruh.
> >
> > I have not verified the claimed benefits of the patch, but I have
> > reviewed the code thoroughly, and it looks perfectly good to me.
> >
> > Reviewed-by: Morten Brørup <m...@smartsharesystems.com>
> >
> > BTW: It makes me wonder if other parts of DPDK could benefit from the
> > same treatment. Especially some of the older DPDK code, where we were
> > trying to optimize by hand what a modern compiler can optimize for us
> > today.
> >
> > >
> > > > ---
> > > >  lib/net/rte_ip.h | 27 ++++++++-------------------
> > > >  1 file changed, 8 insertions(+), 19 deletions(-)
> > > >
> > > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > > index 05948b69b7..386db94c85 100644
> > > > --- a/lib/net/rte_ip.h
> > > > +++ b/lib/net/rte_ip.h
> > > > @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
> > > >  static inline uint32_t
> > > >  __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
> > > >  {
> > > > -	/* workaround gcc strict-aliasing warning */
> > > > -	uintptr_t ptr = (uintptr_t)buf;
> > > > +	/* extend strict-aliasing rules */
> > > >  	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> > > > -	const u16_p *u16_buf = (const u16_p *)ptr;
> > > > -
> > > > -	while (len >= (sizeof(*u16_buf) * 4)) {
> > > > -		sum += u16_buf[0];
> > > > -		sum += u16_buf[1];
> > > > -		sum += u16_buf[2];
> > > > -		sum += u16_buf[3];
> > > > -		len -= sizeof(*u16_buf) * 4;
> > > > -		u16_buf += 4;
> > > > -	}
> > > > -	while (len >= sizeof(*u16_buf)) {
> > > > +	const u16_p *u16_buf = (const u16_p *)buf;
> > > > +	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> > > > +
> > > > +	for (; u16_buf != end; ++u16_buf)
> >
> > Personally I would prefer post-incrementing here. It makes no
> > difference, so I don't see any need to revise the patch.
> >
> > > >  		sum += *u16_buf;
> > > > -		len -= sizeof(*u16_buf);
> > > > -		u16_buf += 1;
> > > > -	}
> > > >
> > > > -	/* if length is in odd bytes */
> > > > -	if (len == 1) {
> > > > +	/* if length is odd, keeping it byte order independent */
> > > > +	if (len % 2) {
> >
> > I assume that the compiler already optimizes "% 2" to "& 1".
> >
> > > >  		uint16_t left = 0;
> > > > -		*(uint8_t *)&left = *(const uint8_t *)u16_buf;
> > > > +		*(unsigned char*)&left = *(const unsigned char *)end;
> > > >  		sum += left;
> > > >  	}
> > > >
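
To make the unlikely() suggestion concrete, here is a rough sketch of how the patched function might look with the hint on the odd-byte branch. It is untested against DPDK itself and rests on a few assumptions: sketch_unlikely() is a local stand-in for DPDK's unlikely() so that the snippet builds on its own, the names sketch_raw_cksum(), sketch_raw_cksum_ref() and sketch_selfcheck() are made up for illustration, and the final return statement is assumed since it falls outside the quoted hunk. The memcpy()-based reference is only there to cross-check the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: local stand-in for DPDK's unlikely(), GCC/Clang only. */
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical name; body follows the patched loop quoted above. */
static inline uint32_t
sketch_raw_cksum(const void *buf, size_t len, uint32_t sum)
{
	/* extend strict-aliasing rules */
	typedef uint16_t __attribute__((__may_alias__)) u16_p;
	const u16_p *u16_buf = (const u16_p *)buf;
	const u16_p *end = u16_buf + len / sizeof(*u16_buf);

	for (; u16_buf != end; u16_buf++)
		sum += *u16_buf;

	/* odd trailing byte, hinted as the rare case */
	if (sketch_unlikely(len % 2)) {
		uint16_t left = 0;
		*(unsigned char *)&left = *(const unsigned char *)end;
		sum += left;
	}

	return sum; /* assumed: the return is not part of the quoted hunk */
}

/* Reference for comparison only: loads every word through memcpy(). */
static uint32_t
sketch_raw_cksum_ref(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		uint16_t w;
		memcpy(&w, p + i, sizeof(w));
		sum += w;
	}
	if (len & 1) {
		uint16_t left = 0;
		memcpy(&left, p + i, 1);
		sum += left;
	}

	return sum;
}

/* Compares both variants for every length from 0 to 8 bytes. */
static void
sketch_selfcheck(void)
{
	/* uint16_t storage keeps the buffer 2-byte aligned */
	uint16_t words[4] = { 0x1234, 0xabcd, 0x00ff, 0x8001 };
	size_t len;

	for (len = 0; len <= sizeof(words); len++)
		assert(sketch_raw_cksum(words, len, 0) ==
		       sketch_raw_cksum_ref(words, len, 0));
}
```

Since len is a size_t, "len % 2" and "len & 1" always have the same value, so either spelling should end up as the same single-bit test. Whether the hint changes the generated code at all would have to be checked per compiler and target.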