> -----Original Message-----
> From: Pavan Nikhilesh Bhagavatula <pbhagavat...@marvell.com>
> Sent: Monday, September 14, 2020 11:39 AM
> To: Van Haaren, Harry <harry.van.haa...@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] eal: add new prefetch0_write variant
>
> >> >This commit adds a new rte_prefetch0_write() variant, suggests to the
> >> >compiler to use a prefetch instruction with intention to write. As a
> >> >compiler builtin, the compiler can choose based on compilation target
> >> >what the best implementation for this instruction is.
> >>
> >> Why not have the other variants too i.e. l2/l3/temporal store
> >> prefetches too?
> >
> >Hi Pavan,
> >
> Hi Harry,
> (LTNS)
>
> >Are there architectures that actually implement those? Usually for a WB
> >mem store to complete, the data must be present in L1 cache (on x86 at
> >least), and that's what the patch below with write0 achieves.
>
> ARM64 does support all modes of store prefetch
> "
> <type> is one of:
>   PLD  Prefetch for load, encoded in the "Rt<4:3>" field as 0b00.
>   PLI  Preload instructions, encoded in the "Rt<4:3>" field as 0b01.
>   PST  Prefetch for store, encoded in the "Rt<4:3>" field as 0b10.
> <target> is one of:
>   L1   Level 1 cache, encoded in the "Rt<2:1>" field as 0b00.
>   L2   Level 2 cache, encoded in the "Rt<2:1>" field as 0b01.
>   L3   Level 3 cache, encoded in the "Rt<2:1>" field as 0b10.
> <policy> is one of:
>   KEEP Retained or temporal prefetch, allocated in the cache normally.
>        Encoded in the "Rt<0>" field as 0.
>   STRM Streaming or non-temporal prefetch, for data that is used only once.
>        Encoded in the "Rt<0>" field as 1.
> For more information on these prefetch
> "
>
> >I'm against adding all the variants "just in case", it leads to API bloat,
> >and increases cognitive load on the programmer. My expectation is that in
> >99% of usage the prefetch write instruction should target L1.
>
> There is a use case when cache mode is write through and application is
> pipelining work across cores sharing same L2 cluster.

OK - v2 sent: http://patches.dpdk.org/patch/77632/

APIs matching the existing prefetch APIs:
  rte_prefetch0_write()  L1 and all below
  rte_prefetch1_write()  L2 and all below
  rte_prefetch2_write()  L3

Cheers, -Harry
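
For anyone following the thread, here is a minimal sketch of how three
write-prefetch variants like the ones listed above could be built on the
GCC/Clang __builtin_prefetch(addr, rw, locality) builtin, where rw = 1 asks
for write intent and locality ranges from 0 (no temporal locality) to 3
(keep in all cache levels). The sketch_ names, the locality value picked for
each level, and the fill loop are assumptions for illustration only; this is
not the code from the v2 patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, named sketch_* to avoid confusion with the real
 * DPDK API. The second builtin argument (1) requests write intent; the
 * third is the locality hint and must be a compile-time constant. */

/* Assumed mapping: locality 3 = L1 and all levels below. */
static inline void sketch_prefetch0_write(const void *p)
{
	__builtin_prefetch(p, 1, 3);
}

/* Assumed mapping: locality 2 = L2 and all levels below. */
static inline void sketch_prefetch1_write(const void *p)
{
	__builtin_prefetch(p, 1, 2);
}

/* Assumed mapping: locality 1 = last-level cache only. */
static inline void sketch_prefetch2_write(const void *p)
{
	__builtin_prefetch(p, 1, 1);
}

/* Usage example: fill a buffer while hinting that the element a fixed
 * distance ahead is about to be written. A prefetch is only a hint, so
 * the stored result is the same with or without it. */
static inline void sketch_fill(uint32_t *buf, size_t n, uint32_t value)
{
	/* 16 elements * 4 bytes = 64 bytes ahead; 64-byte cache lines are
	 * an assumption here, not something the thread states. */
	const size_t ahead = 16;

	for (size_t i = 0; i < n; i++) {
		if (i + ahead < n)
			sketch_prefetch0_write(&buf[i + ahead]);
		buf[i] = value;
	}
}
```

Whether the compiler emits a distinct instruction for each locality level
depends on the target; on some architectures several levels may map to the
same instruction or to none at all, which is the reason the original commit
message gives for using a builtin.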