> From: David Christensen [mailto:d...@linux.vnet.ibm.com]
> Sent: Tuesday, 19 July 2022 20.01
> 
> On 7/19/22 8:26 AM, Morten Brørup wrote:
> > This RFC proposes a set of functions optimized for non-temporal
> > memory copy.
> >
> > At this stage, I am asking for feedback on the concept.
> >
> > Applications sometimes copy data to another memory location, which is only
> > used much later.
> > In this case, it is inefficient to pollute the data cache with the
> > copied data.
> >
> > An example use case (originating from a real life application):
> > Copying filtered packets, or the first part of them, into a capture
> > buffer for offline analysis.
> >
> > The purpose of these functions is to achieve a performance gain by
> > not polluting the cache when copying data.
> > Although the throughput may be improved by further optimization, I do
> > not consider throughput optimization relevant initially.
> >
> Assume that fallback to the standard temporal memcpy is an acceptable
> implementation when not supported by the architecture, yes?

Yes, that is exactly what I envisioned.

Furthermore, stores that are unaligned to a degree not supported by the
architecture will also use temporal memcpy, at least for the unaligned first
and last parts of the copy. The middle (aligned) part may use non-temporal copy.
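
To illustrate the head/middle/tail split, here is a rough sketch. It is only an assumption of how this could look, not the proposed API: the name nt_memcpy_sketch is hypothetical, the non-temporal path assumes x86 with SSE2 (16-byte aligned streaming stores), and any other architecture simply falls back to the standard memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#define NT_ALIGN 16 /* alignment required by the streaming store below */
#endif

/* Hypothetical helper for illustration only; not part of DPDK or the RFC. */
static void nt_memcpy_sketch(void *dst, const void *src, size_t len)
{
#if defined(__SSE2__)
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Head: temporal copy until the destination is 16-byte aligned. */
    size_t head = (size_t)(-(uintptr_t)d & (NT_ALIGN - 1));
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head;
    s += head;
    len -= head;

    /* Middle: non-temporal 16-byte stores to the aligned destination.
     * The source may still be unaligned, so use an unaligned load. */
    while (len >= NT_ALIGN) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += NT_ALIGN;
        s += NT_ALIGN;
        len -= NT_ALIGN;
    }
    /* Order the streaming stores before any later stores. */
    _mm_sfence();

    /* Tail: temporal copy of the remaining bytes (fewer than 16). */
    memcpy(d, s, len);
#else
    /* No non-temporal support assumed: plain temporal memcpy. */
    memcpy(dst, src, len);
#endif
}
```

A real implementation would probably use wider stores where available and might skip the non-temporal path entirely for very small copies.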

> My internal
> queries on the POWER side indicate that there's no support in P8/P9/P10
> ISA for such functionality.
> 
> Dave

Thank you for the quick feedback, Dave!
