snehasish wrote:

> @david-xl It's interesting to know this. How is that going on your end?

Currently we are exploring the design space to accommodate the variety of
platforms and FDO types we use internally. This is a priority for us, though,
so we should have some updates to share externally by the end of the year.

> We've been using similar technique to do memcpy size optimization. 

It's interesting that you are exploring a profile-guided direction for this. 
Chatelet et al. (from Google) published a paper at ISMM'21 on [automatic
generation of memcpy](https://conf.researchr.org/details/ismm-2021/ismm-2021/4/automemcpy-A-framework-for-automatic-generation-of-fundamental-memory-operations),
which uses PMU-based parameter profiling. The technique does not use Intel
DLA; instead, we use precise sampling on call instructions in the process and
filter for the functions of interest. We inspect the RDX register to collect
the parameter value for size. The data in aggregate was used to auto-generate
the memcpy implementation. Section 2.4 has the rationale for not using an FDO
approach. @gchatelet is the primary owner for this work.
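
To make the aggregation step concrete, here is a minimal sketch of how sampled
call records could be reduced to a size histogram. It is only an illustration,
not the automemcpy tooling: `CallSample`, `size_histogram`, and the sample
values are hypothetical names and data, and it assumes each precise sample has
already been resolved to a callee symbol and an RDX value (the third integer
argument register in the System V x86-64 calling convention, i.e. `n` for
`memcpy(dst, src, n)`).

```python
from collections import Counter
from typing import Iterable, NamedTuple


class CallSample(NamedTuple):
    """One hypothetical precise sample taken at a call instruction."""
    callee: str  # symbol the sampled call resolves to (assumed pre-resolved)
    rdx: int     # RDX at the call; the size argument for memcpy-like functions


def size_histogram(samples: Iterable[CallSample],
                   functions_of_interest=frozenset({"memcpy"})) -> Counter:
    """Count how often each size value was observed for the chosen functions."""
    hist = Counter()
    for sample in samples:
        # Drop samples for calls we are not profiling.
        if sample.callee in functions_of_interest:
            hist[sample.rdx] += 1
    return hist


# Made-up samples, for illustration only.
samples = [
    CallSample("memcpy", 16),
    CallSample("memcpy", 16),
    CallSample("strlen", 99),   # filtered out: not a function of interest
    CallSample("memcpy", 128),
]
hist = size_histogram(samples)
print(hist.most_common(1))  # [(16, 2)]
```

A distribution like this, collected across a whole process rather than per
call site, is the kind of aggregate input described above.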

Which memcpy implementation are you trying to optimize? Do you see context
sensitivity or workload specificity as an important dimension to consider?

> But a common problem here could be how to generalize the AutoFDO profile 
> format to incorporate both indirect call targets, callsite parameter values 
> and other types of values. Do you have a plan for that? Maybe we can work 
> together on this.

Extensions to the AutoFDO format to accommodate such hints sound good. Happy
to collaborate on a design that can be leveraged by future work. Perhaps
start a separate issue for discussion on GitHub or a thread on Discourse?




https://github.com/llvm/llvm-project/pull/66825
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
