https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432
--- Comment #16 from Teresa Johnson <tejohnson at google dot com> --- On Sat, Oct 4, 2014 at 8:34 AM, Teresa Johnson <tejohn...@google.com> wrote: > On Fri, Oct 3, 2014 at 11:15 PM, Teresa Johnson <tejohn...@google.com> wrote: >> On Fri, Oct 3, 2014 at 3:35 PM, Teresa Johnson <tejohn...@google.com> wrote: >>> Thanks to H.J. for the test case, I have reproduced the issue. It >>> exposed two separate problems. Cc'ing Honza and Jeff for input on the >>> profile count and jump threading issues, respectively. >>> >>> The first is that we are calling my new freqs_to_counts_path in cases >>> where the functions do have non-zero profile counts. I am invoking >>> this when either the profile status is != PROFILE_READ, or when the >>> entry block count is 0. The latter is the case where the read in >>> profile had zero counts for the function, so we kept the guessed >>> frequencies (see counts_to_freqs in predict.c). In some cases I am >>> finding the the profile count on the entry block is 0, but the rest of >>> the blocks (bb 2 and descendants) have non-zero profile counts. I >>> thought the way to check for this case was to look at the entry >>> block's count, and in fact that is what we seem to do in other places. >>> I could also check the entry block successor counts, or I could simply >>> check the blocks along the path (to see if they all have 0 counts but >>> non-0 frequencies). I don't want to check all bbs in the function for >>> every path. Honza, is there a good way to detect this case (function >>> is marked PROFILE_READ but has all-0 bb counts and we kept the >>> estimated frequencies)? > > I am also seeing cases where the entry bb 0 has a non-zero count, but > the rest of the bbs in the function have 0 counts and estimated > frequencies. So I think the safest thing to do here, and what I have > implemented, is if the profile status is PROFILE_READ, then check all > the bbs/edges along the path - if and only if they all have 0 counts > and there is at least one non-zero frequency do we do the > freqs_to_counts. > >>> >>> Even when I skip freqs_to_counts_path in this case, we still get the >>> insane edge probability. I dumped out some more verbose info while >>> jump threading, and I am seeing a jump threading path that I don't >>> understand. Since it violates my assumptions, the new profile update >>> computations go haywire. Here is the path that we are attempting to >>> thread: >>> >>> (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; >>> >>> Notice that the normal edge is not connected to the rest of the path. >>> This path appears to be constructed during jump threading, as block >>> 155 was created by an earlier threading opportunity. In fact, the >>> earlier threadings that created bb 155 removed all predecessors of bb >>> 13. We originally had >>> >>> bb 150 with successors bb 12 and bb 13 >>> bb 12 with successor bb 13. >>> >>> Then we do: >>> Threaded jump 12 --> 13 to 155 >>> Threaded jump 150 --> 13 to 155 >>> and bb 13 has no more predecessors. So I'm not sure what it means to >>> have a jump threading path through 150 and 13? >>> >>> Jeff, is the above type of jump threading path expected and >>> reasonable? If so, I need to redo some of the profile update code to >>> handle it better. >> >> I see what happened - 155 is a duplicate of 13 from the earlier >> threading. So the path that was originally: >> >> (119, 150) incoming edge; (150, 13) joiner; (13, 15) normal; >> >> became >> >> (119, 150) incoming edge; (150, 155) joiner; (13, 15) normal; >> >> This happened naturally when we redirected the incoming edge 150->13 >> to 155 on the earlier jump threading. During the same threading we >> created duplicate bb 154 of bb 15, and redirect bb 155 to bb 154. This >> redirection does not affect the original 13->15 edge, and thus 13->15 >> remains on the above path. The 155->154 edge should be the new edge in >> the path instead of 13->15. But the edges recorded in other paths are >> not explicitly updated, so the normal edge was left alone. >> >> This works out ok from a correctness standpoint, since bb 13 still >> exists and has the same outgoing edges as it did before, which were >> duplicated onto bb 154. However, those outgoing edges have had their >> profile counts updated and it no longer works out to use those profile >> counts to update the counts along this new threading path. We really >> want to look at 155->154 for the profile to update. >> >> Probably we should be updating remaining jump threading paths when we >> perform jump threading. I need to think about the best way to do that. > > Looking at the code, I think it attempts to check for this case and > prevent it but that code does not work in this case because of the > order the paths are identified and subsequently iterated in the paths > vec. In mark_threaded_blocks it looks at paths with joiners and if > there are any edges along the path already part of another path it > will skip that path. But in this case, we registered the paths in this > order: > > Registering jump thread: (119, 150) incoming edge; (150, 13) > joiner; (13, 15) normal; > ... > Registering jump thread: (150, 13) incoming edge; (13, 15) joiner; > (15, 17) normal; > > For the first path, the path is attached to incoming edge 119->150. So > when we walk edges in the 2nd path we don't see the first path. If we > looked at the paths in the reverse order we would have seen the path > attached to incoming edge 150->13 and skipped the 119->150 path. Note > that we end up doing the actual threading in the reverse order - first > we do the threading through 13 (the second path), then later the > threading through 150 (the first path), leading to the issue I am > seeing. > > Jeff, what is intended here - should we not be threading both of these paths? I have a patch to make the mark_threaded_blocks checking of paths work regardless of the ordering of paths in the vec. This fixes the failure. The other approach is whenever we finish threading a path, go through the vec of remaining paths and update the edges for any that have been affected by the threading and that should instead include the duplicated edges. Teresa > > Thanks, > Teresa > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413