> From: David Christensen [mailto:d...@linux.vnet.ibm.com]
> Sent: Tuesday, 19 July 2022 20.01
>
> On 7/19/22 8:26 AM, Morten Brørup wrote:
> > This RFC proposes a set of functions optimized for non-temporal
> > memory copy.
> >
> > At this stage, I am asking for feedback on the concept.
> >
> > Applications sometimes copy data to another memory location, which
> > is only used much later.
> > In this case, it is inefficient to pollute the data cache with the
> > copied data.
> >
> > An example use case (originating from a real life application):
> > Copying filtered packets, or the first part of them, into a capture
> > buffer for offline analysis.
> >
> > The purpose of these functions is to achieve a performance gain by
> > not polluting the cache when copying data.
> > Although the throughput may be improved by further optimization, I
> > do not consider throughput optimization relevant initially.
> >
> Assume that fallback to the standard temporal memcpy is an acceptable
> implementation when not supported by the architecture, yes?

Yes, that is exactly what I envisioned.

Furthermore, stores that are unaligned to a degree not supported by the
architecture will also use temporal memcpy, at least for the unaligned
first and last part of the copy. The middle (aligned) part may use
non-temporal copy.

> My internal
> queries on the POWER side indicate that there's no support in P8/P9/P10
> ISA for such functionality.
>
> Dave

Thank you for the quick feedback, Dave!
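
Purely as an illustration of the head/middle/tail split described above, here is a rough sketch in C. It is not the proposed API: the name nt_memcpy_sketch is hypothetical, and the use of x86 SSE2 streaming stores with a 16-byte destination alignment requirement is an assumption made only for this example. Where SSE2 is not available, the sketch simply falls back to the standard temporal memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper for illustration only; not part of the RFC. */
static void nt_memcpy_sketch(void *dst, const void *src, size_t len)
{
#if defined(__SSE2__)
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Head: temporal copy until the destination is 16-byte aligned. */
    size_t head = (size_t)(-(uintptr_t)d & 15);
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head;
    s += head;
    len -= head;

    /* Middle: non-temporal 16-byte stores to the aligned destination.
     * The source may still be unaligned, so use an unaligned load. */
    while (len >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += 16;
        s += 16;
        len -= 16;
    }

    /* Order the streaming stores before any later stores. */
    _mm_sfence();

    /* Tail: temporal copy of the remaining (fewer than 16) bytes. */
    memcpy(d, s, len);
#else
    /* No non-temporal stores assumed here: plain temporal fallback. */
    memcpy(dst, src, len);
#endif
}
```

On an architecture without such instructions (as reported above for POWER P8/P9/P10), only the #else branch would be compiled, so callers would get the same result as with memcpy, just without the cache benefit.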