>> >This commit adds a new rte_prefetch0_write() variant, suggests to the
>> >compiler to use a prefetch instruction with intention to write. As a
>> >compiler builtin, the compiler can choose based on compilation target
>> >what the best implementation for this instruction is.
>>
>> Why not have the other variants too i.e. l2/l3/temporal store
>> prefetches too?
>
>Hi Pavan,
>

Hi Harry, (LTNS)

>Are there architectures that actually implement those? Usually for a WB
>mem store to complete, the data must be present in L1 cache (on x86 at
>least), and that's what the patch below with write0 achieves.

ARM64 does support all modes of store prefetch:

"
<type> is one of:
    PLD   Prefetch for load, encoded in the "Rt<4:3>" field as 0b00.
    PLI   Preload instructions, encoded in the "Rt<4:3>" field as 0b01.
    PST   Prefetch for store, encoded in the "Rt<4:3>" field as 0b10.
<target> is one of:
    L1    Level 1 cache, encoded in the "Rt<2:1>" field as 0b00.
    L2    Level 2 cache, encoded in the "Rt<2:1>" field as 0b01.
    L3    Level 3 cache, encoded in the "Rt<2:1>" field as 0b10.
<policy> is one of:
    KEEP  Retained or temporal prefetch, allocated in the cache normally.
          Encoded in the "Rt<0>" field as 0.
    STRM  Streaming or non-temporal prefetch, for data that is used only
          once. Encoded in the "Rt<0>" field as 1.
For more information on these prefetch
"

>
>I'm against adding all the variants "just in case", it leads to API bloat,
>and increases cognitive load on the programmer. My expectation is that in
>99% of usage the prefetch write instruction should target L1.
>

There is a use case when the cache mode is write-through and the application
is pipelining work across cores sharing the same L2 cluster.

>Cheers, -Harry

Regards,
Pavan.

>
>> >Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
>> >
>> >---
>> >
>> >The integer constants passed to the builtin are not available as
>> >a #define value, and doing #defines just for this write variant
>> >does not seems a nice solution to me... particularly for those using
>> >IDEs where any #define value is auto-hinted for code-completion.
>> >
>> >---
>> > lib/librte_eal/include/generic/rte_prefetch.h | 16 ++++++++++++++++
>> > 1 file changed, 16 insertions(+)
>
><snip patch contents>
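
For reference, the store-prefetch variants discussed in this thread could be
sketched on top of the GCC/Clang __builtin_prefetch(addr, rw, locality)
builtin roughly as below. This is only an illustration under assumptions, not
the contents of the patch: the sketch_* names are hypothetical (only
rte_prefetch0_write() is named in this thread), and which instruction each
locality value becomes (for example a PST form of PRFM on ARM64) is left to
the compiler and the compilation target.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper names, for illustration only.
 * __builtin_prefetch(addr, rw, locality): rw = 1 means "intent to write";
 * locality is a compile-time constant from 0 (no temporal locality) to
 * 3 (keep in all cache levels). The result is a hint, never a guarantee.
 */
static inline void sketch_prefetch0_write(const void *p)
{
	__builtin_prefetch(p, 1, 3); /* closest cache level */
}

static inline void sketch_prefetch1_write(const void *p)
{
	__builtin_prefetch(p, 1, 2); /* assumed to target the next level out */
}

static inline void sketch_prefetch2_write(const void *p)
{
	__builtin_prefetch(p, 1, 1); /* assumed to target the outermost level */
}

static inline void sketch_prefetch_non_temporal_write(const void *p)
{
	__builtin_prefetch(p, 1, 0); /* streaming, data used only once */
}

/* Usage sketch: prefetch for store a fixed distance ahead while filling. */
static inline void sketch_fill(uint32_t *dst, size_t n, uint32_t val)
{
	const size_t ahead = 16; /* arbitrary prefetch distance, in elements */

	for (size_t i = 0; i < n; i++) {
		if (i + ahead < n)
			sketch_prefetch0_write(&dst[i + ahead]);
		dst[i] = val;
	}
}
```

The L2 variant is the one that would matter for the write-through,
same-cluster pipelining case mentioned above; whether it helps in practice
would need measuring on the target.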