snehasish wrote:

> @david-xl It's interesting to know this. How is that going on your end?

Currently we are exploring the design space to accommodate the variety of platforms and FDO types we use internally. This is a priority for us, though, so we should have some updates to share externally by the end of the year.

> We've been using similar technique to do memcpy size optimization.

It's interesting that you are exploring a profile-guided direction for this. Chatelet et al. (from Google) published a paper at ISMM'21 on [automatic generation of memcpy](https://conf.researchr.org/details/ismm-2021/ismm-2021/4/automemcpy-A-framework-for-automatic-generation-of-fundamental-memory-operations), which uses PMU-based parameter profiling. The technique does not use Intel DLA; instead, we use precise sampling on call instructions in the process and filter for the functions of interest. We inspect the RDX register to collect the parameter value for size. The aggregated data was used to auto-generate the memcpy implementation. Section 2.4 of the paper has the rationale for not using an FDO approach. @gchatelet is the primary owner for this work.

What is the memcpy implementation you are trying to optimize? Do you see context sensitivity or workload specificity as an important dimension to consider?

> But a common problem here could be how to generalize the AutoFDO profile
> format to incorporate both indirect call targets, callsite parameter values
> and other types of values. Do you have a plan for that? Maybe we can work
> together on this.

Extensions to the AutoFDO format to accommodate such hints sound good. Happy to collaborate on a design that can be leveraged by future work. Perhaps start a separate issue for discussion on GitHub or a thread on Discourse?

https://github.com/llvm/llvm-project/pull/66825
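
For readers following along, here is a rough sketch of the aggregation step described above: take samples captured on call instructions, keep only the functions of interest, and build a distribution of the size argument read from RDX (the third integer argument register in the x86-64 System V calling convention, which is where `memcpy`'s size lands). This is not the tooling from the paper. The `CallSample` record, the function names, and the sample data are all hypothetical, and the sketch assumes the samples have already been collected and symbolized by some other tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSample:
    """Hypothetical record for one precise sample taken on a call instruction."""
    callee: str  # symbolized call target (assumed to be resolved elsewhere)
    rdx: int     # value of RDX at the sample point, i.e. the size for memcpy


def size_histogram(samples, functions_of_interest=frozenset({"memcpy"})):
    """Count how often each size value was seen for the functions we care about."""
    histogram = Counter()
    for sample in samples:
        if sample.callee in functions_of_interest:
            histogram[sample.rdx] += 1
    return histogram


def cumulative_share(histogram):
    """Return (size, fraction of calls with size <= that size) in ascending order."""
    total = sum(histogram.values())
    running = 0
    result = []
    for size in sorted(histogram):
        running += histogram[size]
        result.append((size, running / total))
    return result


# Made-up samples purely for illustration.
samples = [
    CallSample("memcpy", 8),
    CallSample("memcpy", 8),
    CallSample("strlen", 4096),  # filtered out: not a function of interest
    CallSample("memcpy", 16),
    CallSample("memcpy", 256),
]
hist = size_histogram(samples)
shares = cumulative_share(hist)
```

A cumulative view like `shares` is one plausible way to see which size ranges dominate before deciding where an implementation should specialize; how the paper actually turns the data into generated code is described there, not here.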