On Mon, Jan 23, 2012 at 12:52 PM, Nathan Sidwell <nat...@acm.org> wrote: > On 01/20/12 22:41, Xinliang David Li wrote: >> >> Nathan, I just noticed this path. This is a good improvement over the >> existing scheme. >> >> I see one potential problem with the patch -- different instances of >> the same comdat function can potentially have different control flows >> (due to differences in early inline, early optimizations in different >> module contexts) before profile counter allocation -- which means >> making the profile counters 'comdat' can later cause coverage mismatch >> problems during profile-use phase. > > > Right, I have an new patch that reenables the comdat-awareness, however > there is one case I theorized about but couldn't generate a testcase for.
One possible way to generate a test case is like the following -- bar will be inlined into the instance of foo<int> in foo_1.cc, but not in the one in foo_2.cc foo.h --------- extern int bar (); template <typename T> __attribute__((noinline)) T foo (T) { return bar(); } foo_1.cc -------------- #include "foo.h" int g; __attribute__((always_inline)) int bar() { if (g > 10) return 1; else return 0; } int test_1 () { return foo<int>(); } foo_2.cc ------------ #include "foo.h" int test_2 () { return foo<int>(); } > Your email convinced me to try harder, and I now have such a testcase that > needs addressing. Anyway, the tree-based coverage counters treat the > per-function counters as-if they are function-scope static variables. I like this part of the change (splitting the counter array in a module into per-function arrays) by itself -- it simplifies the implementation and is much cleaner. >Hence > inlining Foo into Baz will not add new counters to Baz but increment Foo's > counters. So you can't distinguish between different inlinings of Foo. This is not true for early inlining. If Foo is inlined into Baz in early inline pass, Foo's inline instance will have its own 'private' copy of the profile counters 'embedded' in the caller's counter. For ipa inlining, the old implementation allows all inline instances of 'foo' in the same module to share the same counters -- but they do not share the counters with the out of line copy of 'foo' which is defined in another module (and picked up by the linker). Your original comdat aware implementation forces all inline instances of 'foo' also share counters across modules. This however may not only cause problems in profile-use pass, but also cause runtime problems in profile collection run such as out of bound accesses.. > Of > course optimizing Foo's inline instance in Baz may delete various basic > blocks and so lead to non-increments. However, for the purposes of > profiling, the average behaviour over all inlinings in that translation unit > is used. As I recall the rtl-based counters didn't do this. > > Anyway, with comdat-awareness, the case I need to address is a function Foo > that is inlined in some places but not in others. So there's a comdat Foo > instance emitted. That instance may or may not be selected by the linker. > If it is not selected, we still want to track the counter values for its > inlinings in that TU. If it is selected, we don't want to double count its > counter values. I need to think about this a bit, but have some ideas. See above. I think the main problems is that comdat function Foo in different TUs may have incompatible counter variables (due to cfg difference) -- common incompatible variables via comdat may cause problems. > > >> In this patch, fn data buffer is introduced. What is that for and how >> does it work? > > > The intention is that the fn_info structure is in the same comdat group as > the comdat function to which it refers. Therefore, if the instance of > fn_info in the executable will match the instance of the function to which > it refers. However, each object file needs to generate coverage information > for all the functions in that object file and correctly mark whether a > comdat function in it was in the final link or not. So its array of > pointers to fn_info will name the fn_infos of all its functions. For a > comdat function, the slot will select the fn_info of some other TU. The > gcov_info back-pointer in each fn_info allows the gcov runtime to detect > this case. > Yes I knew the intention after reading the code. See also my follow up email about it --- I think it is necessary to track where the counter variable is coming from (the key), but not buffering is not needed. thanks, David > Does that help? > > nathan