On Thu, Sep 26, 2013 at 2:54 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> > As for COMDAT merging, i would like to see the patch. I am experimenting
>> > now with a patch to also privatize COMDATs during -fprofile-generate to
>> > avoid problems with lost profiles mentioned above.
>> >
>>
>> Do you mean you privatize every COMDAT function in the profile-generate?
>> We discussed this idea internally and we thought it would not work for
>> large applications (like in google) due to size.
>
> Yes, Martin and I plan to test this on firefox. In a way you already have all
> the COMDAT functions unshared in the object files, so the resulting binary
> should not be completely off the limits. But I do not have any quantitative
> data, yet, since we hit bug in constant folding and devirtualization I fixed
> in meantime but we did not re-run the tests yet.

The linker removes a great number of duplicated copies, esp. for template
functions. We don't have quantitative numbers either, but I'll collect
some soon.

>
>>
>> > As for context sensitivity, one approach would be to have two sets of
>> > counters for every comdat - one merged globally and one counting local
>> > instances. We can then privatize always and at profile read in stage
>> > just clone every comdat and have two instances - one for offline copy
>> > and one for inlining.
>> >
>>
>> In my implementation, I also allow multiple sets of COMDAT profile
>> co-existing in one compilation.
>> Due to the auxiliary modules in LIPO, I actually have more than two.
>
> How does auxiliary modules work?

It pulls in multiple profiles from other compilations, so there might be
multiple inlined profiles.

>>
>> But I'm wondering how do you determine which profile to use for each
>> call-site -- the inline decision may not
>> be the same for profile-generate and profile-use compilation.
>
> My suggestion was to simply use the module local profile for all inline sites
> within the given module and the global profile for the offline copy of the
> function (that one will, in the case it survives linking, be shared across
> all the modules anyway).

For a simple example like:

callsite1 --> comdat_function_foo
callsite2 --> comdat_function_foo

callsite1 is inlined in profile-generate, so it has its own inlined
profile counter. callsite2 is not inlined, and its profile goes to the
offline copy. Let's say callsite1 is cold (0 counter) and callsite2 is
hot. Using the local profile (the cold one) for callsite2 will not be
correct.

>
> I think this may work in the cases where i.e. use of hash templates in one
> module is very different (in average size) from other module.
> I did not really put much effort into it - I currently worry primarily about
> the cases where profile is lost completely since it gets attached to a
> function not surviving final linking (or because we inline something we did
> not inlined at profile time).
>
> As for context sensitivity, we may try to consider developing more consistent
> solution for this. COMDAT functions are definitely not only that may exhibit
> context sensitive behaviour.
> One approach would be to always have multiple counters for each function and
> hash based on cbacktraces collected by indirect call profiling
> instrumentation.
> In a way this is same path profiling, but that would definitely add quite some
> overhead + we will need to think of reasonable way to represent this within
> compiler.
>
> How do you decide what functions you want to have multiple profiles for?

I do the instrumentation after ipa-inline for comdat functions, so I know
whether a callsite is inlined or not. In the profile-use phase, I also need
to provide the context (which module this is from) to pick the right profile.

>
> Honza
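
To make the two-callsite example concrete, here is a toy Python sketch of the
counters involved. It is only an illustration: the class, its fields, the
module name "a.cc" and the count of 1000 are all made up, and none of it
reflects the actual GCC or LIPO data structures.

```python
from dataclasses import dataclass, field


@dataclass
class ComdatProfile:
    """Toy counters for one COMDAT function (hypothetical layout)."""

    offline: int = 0  # counter shared by the out-of-line copy
    inlined: dict = field(default_factory=dict)  # (module, callsite) -> count

    def record(self, module, callsite, was_inlined, count):
        """Add `count` executions seen through one callsite during training."""
        if was_inlined:
            key = (module, callsite)
            self.inlined[key] = self.inlined.get(key, 0) + count
        else:
            self.offline += count

    def module_local(self, module):
        """Sum of the inlined counters that belong to `module`."""
        return sum(n for (m, _), n in self.inlined.items() if m == module)

    def lookup(self, module, callsite, was_inlined):
        """Pick the counter by context: module, callsite, and inline status."""
        if was_inlined:
            return self.inlined.get((module, callsite), 0)
        return self.offline


# Assumed scenario: both callsites live in the hypothetical module "a.cc".
foo = ComdatProfile()
foo.record("a.cc", "callsite1", was_inlined=True, count=0)      # cold, inlined
foo.record("a.cc", "callsite2", was_inlined=False, count=1000)  # hot, offline

local_guess = foo.module_local("a.cc")  # what the module-local profile says
by_context = foo.lookup("a.cc", "callsite2", was_inlined=False)
print(local_guess, by_context)
```

If profile-use decides to inline callsite2, the module-local counter only
reflects the cold callsite1, while the counts that callsite2 actually produced
sit in the offline counter. That is why the lookup needs to know whether the
callsite was inlined at profile-generate time.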