On 8/3/23 04:13, Jan Hubicka wrote:
Note most of the profile consistency checks FAIL when testing with -m32 on
x86_64-unknown-linux-gnu ...
For example vect-11.c has
;; basic block 4, loop depth 0, count 719407024 (estimated locally,
freq 0.6700), maybe hot
;; Invalid sum of incoming counts 708669602 (estimated locally, freq
0.6600), should be 719407024 (estimated locally, freq 0.6700)
;; prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
;; pred: 3 [always (guessed)] count:708669602 (estimated
locally, freq 0.6600) (FALSE_VALUE,EXECUTABLE)
__asm__ __volatile__("cpuid
" : "=a" a_44, "=b" b_45, "=c" c_46, "=d" d_47 : "0" 1, "2" 0);
_3 = d_47 & 67108864;
so it looks like it's the check_vect () function that goes wrong
everywhere but only on i?86.
The first dump with the Invalid sum is 095t.fixup_cfg3 already.
Sorry for that; it looks like a missing/undetected noreturn. I will take a look.
The mismatch at fixup_cfg3 is harmless since we repropagate frequencies
later now. The misupdate is caused by jump threading:
vect-11.c.102t.adjust_alignment:;; Invalid sum of incoming counts 354334800
(estimated locally, freq 0.3300), should be 233860966 (estimated locally, freq
0.2178)
vect-11.c.102t.adjust_alignment:;; Invalid sum of incoming counts 354334800
(estimated locally, freq 0.3300), should be 474808634 (estimated locally, freq
0.4422)
vect-11.c.107t.rebuild_frequencies1
vect-11.c.116t.threadfull1:;; Invalid sum of incoming counts 708669600
(estimated locally, freq 0.6600), should be 719407024 (estimated locally, freq
0.6700)
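To make reports like the ones above easier to compare across dumps, here
is a minimal sketch (not part of GCC or its testsuite) that pulls the
"Invalid sum of incoming counts" pairs out of dump text and prints the
difference. The function name find_mismatches and the SAMPLE string are
made up for illustration; the regular expression assumes the message
format shown in the dumps quoted here.

```python
import re

# Matches the message seen in the dumps quoted in this thread, where a
# block's count differs from the sum of its incoming edge counts.
# Assumption: the format is exactly as quoted above.
PATTERN = re.compile(
    r"Invalid sum of incoming counts (\d+)\s*\([^)]*\),\s*should be (\d+)"
)


def find_mismatches(dump_text):
    """Return (incoming_sum, block_count, difference) for each report."""
    # Mail clients wrap long dump lines, so collapse all whitespace first.
    flat = " ".join(dump_text.split())
    return [
        (int(got), int(want), int(got) - int(want))
        for got, want in PATTERN.findall(flat)
    ]


# The three reports quoted above, wrapped the same way as in the mail.
SAMPLE = """\
vect-11.c.102t.adjust_alignment:;; Invalid sum of incoming counts 354334800
(estimated locally, freq 0.3300), should be 233860966 (estimated locally, freq
0.2178)
vect-11.c.102t.adjust_alignment:;; Invalid sum of incoming counts 354334800
(estimated locally, freq 0.3300), should be 474808634 (estimated locally, freq
0.4422)
vect-11.c.116t.threadfull1:;; Invalid sum of incoming counts 708669600
(estimated locally, freq 0.6600), should be 719407024 (estimated locally, freq
0.6700)
"""

if __name__ == "__main__":
    for got, want, diff in find_mismatches(SAMPLE):
        print(f"incoming={got} expected={want} diff={diff:+d}")
```

On the sample, the two adjust_alignment reports come out off by the same
amount in opposite directions (120473834), which is consistent with one
count being split wrongly between two blocks rather than being lost.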
I know that there are problems left in the profile update done by jump
threading. Threading was the main pass disturbing the profile until
GCC 13; the update now works for basic testcases, but not always. I
already spent quite some time trying to figure out what is wrong with
profile threading (PR103680), so at least this is a small testcase.
Jeff, any help would be appreciated here :)
I will try to debug this. One option would be to disable branch
prediction on check_vect for the time being; it is not inlined anyway.
Not a lot of insight. The backwards threader uses a totally different
API for the CFG/SSA updates, and I don't think that API has made any
significant effort to keep the profile up-to-date.
Jeff