On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>> could be a potential risk because some callee is marked unlikely
>> executed simply because they are inlined and eliminated in the O2
>> binary. But in ipa-inline it will not get inlined because the edge is
>> not hot from cgraph_maybe_hot_edge_p (because callee is
>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>
> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
> It seems that having profile set incorrectly will lead to other problems
> later, too.
> We discussed similar problem with Teresa about the missing profiles for
> comdat, basically one should detect these cases as profile being lost and
> go with guessed profile. (I believe patch for that was posted, too, and so
> far it seems best approach to this issue)

The current AutoFDO implementation treats every function that has no
profile as normally executed and uses the guessed profile for it. This is
like using the profile for truly hot functions and using O2 for all other
functions. This works fine, but it leads to larger code size (approximately
10%~20% larger than FDO).

I'd like to introduce another mode for users who care about both
performance and code size, and who can be sure that the profile is
representative. In this mode, we will mark all functions without samples as
"unlikely executed". However, because AutoFDO uses the debug info of the
optimized code to represent the profile, it's possible that a hot function
(say bar) is inlined into another hot function (say foo) and fully
eliminated. In the profile, bar then looks cold, and because the profile
for foo::bar is eliminated, bar will not be inlined into foo before profile
annotation. After profile annotation, we can infer from the bb counts that
foo->bar is hot, so it should be inlined in the ipa-inline phase. But
because bar itself is marked UNLIKELY_EXECUTED, it will not be inlined.

One possible workaround: during rebuild_cgraph_edges, if we find that an
edge's callee is unlikely executed, add the edge count to the callee's
count and recalculate the callee's frequency.

Dehao

>
> Honza
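
P.S. To make the workaround above concrete, here is a rough sketch in C. It is not GCC code and does not use the real cgraph API: `struct func_node`, `struct call_edge`, `classify_count`, `fixup_unlikely_callees`, and the hotness threshold are all hypothetical stand-ins chosen for illustration. It only shows the idea of folding an edge count into an "unlikely executed" callee and then reclassifying it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for cgraph nodes and edges.
   None of these names are GCC's own.  */
enum freq_sketch { FREQ_UNLIKELY_EXECUTED, FREQ_NORMAL, FREQ_HOT };

struct func_node {
  const char *name;
  long long count;            /* profile count of the function */
  enum freq_sketch frequency; /* coarse hotness class */
};

struct call_edge {
  struct func_node *caller;
  struct func_node *callee;
  long long count;            /* count inferred from the caller's bb */
};

/* Assumed threshold; a real compiler would derive this from the profile.  */
#define HOT_COUNT_THRESHOLD 1000LL

/* Recompute the coarse frequency class from a raw count.  */
enum freq_sketch classify_count(long long count)
{
  if (count <= 0)
    return FREQ_UNLIKELY_EXECUTED;
  if (count >= HOT_COUNT_THRESHOLD)
    return FREQ_HOT;
  return FREQ_NORMAL;
}

/* For every edge whose callee is still marked unlikely executed, add the
   edge count to the callee's count and reclassify the callee.  Returns
   the number of edges that moved a callee out of the unlikely class.
   This follows the description literally: once a callee is no longer
   unlikely executed, later edges to it are left alone.  */
int fixup_unlikely_callees(struct call_edge *edges, size_t n_edges)
{
  int promoted = 0;

  assert(edges != NULL || n_edges == 0);
  for (size_t i = 0; i < n_edges; i++) {
    struct func_node *callee = edges[i].callee;

    if (callee->frequency != FREQ_UNLIKELY_EXECUTED)
      continue;
    callee->count += edges[i].count;
    callee->frequency = classify_count(callee->count);
    if (callee->frequency != FREQ_UNLIKELY_EXECUTED)
      promoted++;
  }
  return promoted;
}
```

With the assumed threshold of 1000, an edge foo->bar carrying a count of 4000 turns an unsampled bar into FREQ_HOT, so a hot-edge check based on the callee's frequency would no longer reject it. A callee that is reached only through zero-count edges stays unlikely executed, which preserves the code size benefit for truly cold functions.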