On Sun, Nov 8, 2020 at 2:03 AM Thomas Monjalon <tho...@monjalon.net> wrote:
>
> 07/11/2020 20:05, Jerin Jacob:
> > On Sun, Nov 8, 2020 at 12:09 AM Thomas Monjalon <tho...@monjalon.net> wrote:
> > > 07/11/2020 18:12, Jerin Jacob:
> > > > On Sat, Nov 7, 2020 at 10:04 PM Thomas Monjalon <tho...@monjalon.net> wrote:
> > > > >
> > > > > The mempool pointer in the mbuf struct is moved
> > > > > from the second to the first half.
> > > > > It should increase performance on most systems having 64-byte cache line,
> > > > > i.e. mbuf is split in two cache lines.
> > > >
> > > > But In any event, Tx needs to touch the pool to freeing back to the
> > > > pool upon Tx completion. Right?
> > > > Not able to understand the motivation for moving it to the first 64B
> > > > cache line?
> > > > The gain varies from driver to driver. For example, a Typical
> > > > ARM-based NPU does not need to
> > > > touch the pool in Rx and its been filled by HW. Whereas it needs to
> > > > touch in Tx if the reference count is implemented.
> >
> > See below.
> >
> > > >
> > > > > Due to this change, tx_offload is moved, so some vector data paths
> > > > > may need to be adjusted. Note: OCTEON TX2 check is removed temporarily!
> > > >
> > > > It will be breaking the Tx path, Please just don't remove the static
> > > > assert without adjusting the code.
> > >
> > > Of course not.
> > > I looked at the vector Tx path of OCTEON TX2,
> > > it's close to be impossible to understand :)
> > > Please help!
> >
> > Off course. Could you check the above section any share the rationale
> > for this change
> > and where it helps and how much it helps?
>
> It has been concluded in the techboard meeting you were part of.
> I don't understand why we restart this discussion again.
> I won't have the energy to restart this process myself.
> If you don't want to apply the techboard decision, then please
> do the necessary to request another quick decision.

Yes. Initially, I thought it was OK since we have 128B cache lines. After
looking into Thomas's change, I realized it is not good for ARM64 NPUs
with 64B cache lines, because:

- A typical ARM-based NPU does not need to touch the pool in Rx, as it
  has been filled by HW, whereas it needs to touch the pool in Tx if the
  reference count is implemented.
- It will also affect existing vector routines.

I request that the techboard reconsider its decision.
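
To make the layout concern concrete, here is a minimal sketch in C. It is not the real rte_mbuf definition: the structs `toy_mbuf_before` and `toy_mbuf_after`, the helper `cache_line_of()`, and all field sizes and offsets are hypothetical stand-ins, assuming a 64-byte cache line. It only illustrates how moving the pool pointer into the first cache line also shifts tx_offload, and how a static assert on the offset would catch that at build time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache line, as on the ARM64 NPUs discussed above. */
#define TOY_CACHE_LINE_SIZE 64

/* Hypothetical "before" layout: pool lives in the second cache line. */
struct toy_mbuf_before {
	uint8_t  first_line[64]; /* stand-in for the fields of the first line */
	uint64_t pool;           /* stand-in for a 64-bit mempool pointer */
	uint64_t tx_offload;
};

/* Hypothetical "after" layout: pool moved to the end of the first line. */
struct toy_mbuf_after {
	uint8_t  first_line[56]; /* first-line fields, minus room for pool */
	uint64_t pool;
	uint64_t tx_offload;
};

/* Index of the cache line that contains a given byte offset. */
static inline size_t cache_line_of(size_t offset)
{
	return offset / TOY_CACHE_LINE_SIZE;
}

/*
 * A vector path that hard-codes a field offset can pin it like this.
 * If the struct layout changes, the build fails here instead of the
 * data path silently reading the wrong bytes.
 */
static_assert(offsetof(struct toy_mbuf_before, tx_offload) == 72,
	      "toy 'before' layout: tx_offload expected at offset 72");
static_assert(offsetof(struct toy_mbuf_after, tx_offload) == 64,
	      "toy 'after' layout: tx_offload expected at offset 64");
```

In this toy layout, pool moves from cache line 1 to cache line 0, and tx_offload moves with it, so any routine that assumed the old offset has to be adjusted together with its assert rather than having the assert removed.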