> Jan Hubicka <hubi...@ucw.cz> writes: > > Hi, > > this patch implements loop guard heuristics predicting that if one loop is > > nested in another and controlled by a conditional that conditional is likely > > false. This helps to get more realistic predictionsin larger loop nests. > > Just curious: what's the basis for the condition being likely false? > (Sorry if this is a well-known heuristic.)
Well, the reason is that if you have large loop nest there must be some way to trim it down. The "science" behind predictors is that they are driven by real world data. You implement predictor, collect profile feedback and then use analyze_brprob script to tell you how often given predictor match. These values are then fed into predict.def and prodice.c uses Dempster-Shaffer method to combine the hypoteses into likely outcome. Here are data collected by Martin (Liska) on SPEC. Loop guard predicts 1109 branches statically, executed 6531156943 times and the prediction is correct 61.88% cases. Not the most reliable heuristics but it did caused quite interesting improvements this weekend https://gcc.opensuse.org/SPEC/CINT/sb-vangelis-head-64/recent.html (gzip&gcc seems to get off-noise improvements at least) HEURISTICS BRANCHES (REL) HITRATE COVERAGE COVERAGE (REL) loop guard with recursion 3 0.0% 17.17% / 93.91% 7717 7.72K 0.0% no prediction 10690 18.6% 33.65% / 85.08% 163062046405 163.06G 15.6% loop iv compare 40 0.1% 52.06% / 52.15% 8806870 8.81M 0.0% early return (on trees) 6599 11.5% 54.39% / 86.51% 33950850516 33.95G 3.2% loop guard 1109 1.9% 61.88% / 88.38% 6531156943 6.53G 0.6% opcode values positive (on trees) 3368 5.9% 64.55% / 90.39% 17957835504 17.96G 1.7% continue 505 0.9% 66.66% / 82.85% 10086801881 10.09G 1.0% call 11893 20.7% 67.26% / 92.26% 34994677942 34.99G 3.3% opcode values nonequal (on trees) 7118 12.4% 67.63% / 81.38% 74854159732 74.85G 7.2% loop iterations 2761 4.8% 67.99% / 67.99% 408310259008 408.31G 39.1% const return 271 0.5% 69.39% / 87.09% 301566712 301.57M 0.0% pointer (on trees) 5990 10.4% 69.59% / 87.18% 16667684592 16.67G 1.6% combined 57560 100.0% 69.74% / 80.61% 1044875078660 1.04T 100.0% DS theory 29243 50.8% 70.14% / 86.40% 160792553051 160.79G 15.4% loop exit with recursion 40 0.1% 72.17% / 92.33% 454740027 454.74M 0.0% recursive call 65 0.1% 75.19% / 76.33% 189825093 189.83M 0.0% first match 17627 30.6% 77.81% / 78.31% 721020479204 721.02G 69.0% extra loop exit 139 0.2% 82.80% / 88.17% 1696816070 1.70G 0.2% loop exit 2813 4.9% 85.36% / 87.83% 60086533565 60.09G 5.8% null return 393 0.7% 91.47% / 93.08% 3268678197 3.27G 0.3% guessed loop iterations 8084 14.0% 91.73% / 92.49% 242056066564 242.06G 23.2% guess loop iv compare 207 0.4% 97.75% / 97.79% 4319186982 4.32G 0.4% negative return 277 0.5% 97.94% / 99.23% 1062119028 1.06G 0.1% Fortran loop preheader 2536 4.4% 99.81% / 99.88% 6138904177 6.14G 0.6% noreturn call 2468 4.3% 100.00% / 100.00% 8232182915 8.23G 0.8% Fortran fail alloc 393 0.7% 100.00% / 100.00% 393 393.00 0.0% Fortran repeated allocation/deallocation 218 0.4% 100.00% / 100.00% 218 218.00 0.0% Fortran zero-sized array 677 1.2% 100.00% / 100.00% 112723807 112.72M 0.0% Fortran overflow 1282 2.2% 100.00% / 100.00% 175074185 175.07M 0.0% Loop count: 17886 avg. # of iter: 12141.20 median # of iter: 10.00 avg. (1% cutoff) # of iter: 1038.28 avg. (5% cutoff) # of iter: 464.53 avg. (10% cutoff) # of iter: 222.38 avg. (20% cutoff) # of iter: 64.87 avg. (30% cutoff) # of iter: 44.06