On 30 September 2014 20:20, Teresa Johnson <tejohn...@google.com> wrote: > On Mon, Sep 29, 2014 at 9:33 PM, Jeff Law <l...@redhat.com> wrote: >> On 09/29/14 08:19, Teresa Johnson wrote: >>>> >>>> >>>> Just an update - I found some good test cases by compiling the >>>> c-torture tests with profile feedback with and without my patch. But >>>> in the cases I pulled out I saw that there were still a couple profile >>>> or probability insanities introduced by jump threading (albeit far >>>> less than before), so I wanted to investigate before I commit. I ran >>>> out of time this week and will not get to this until I get back from >>>> vacation the week after next. >>> >>> >>> Hi Jeff, >>> >>> I finally had a chance to get back to this and look at the remaining >>> insanities in the new test cases I created. It turns out that there >>> were still a few issues in the case where there were guessed >>> frequencies and no profile counts. The two test cases I created do use >>> FDO, and the insanities in the routines with profile counts went away >>> with my patch. But the outlined copies of routines that were also >>> inlined into the main routine still had estimated frequencies, and >>> these still had a few issues. >>> >>> The problem is that the profile updates are done incrementally as we >>> walk and update the paths in ssa_fix_duplicate_block_edges, including >>> the block and edge counts, the block frequencies and the >>> probabilities. This is very difficult to do when only operating on >>> frequencies since the edge frequencies are derived from the source >>> block frequency and the probability. Therefore, once the source block >>> frequency is updated, the edge frequency is also affected, and it is >>> really difficult to figure out what the update to the edge frequency >>> (essentially the probability) is using the same incremental update >>> approach. I was attempting to handle this with the routine >>> deduce_freq, for example, but this turned out to have issues for >>> certain types of paths. I tried a few other approaches, but they start >>> looking really ugly and I didn't want to add a parallel but different >>> algorithm in the case of no profile counts. >>> >>> So by far the simplest approach was simply to take a snapshot of the >>> existing block and edge frequencies along the path before we start the >>> updates in ssa_fix_duplicate_block_edges, by copying them into the >>> profile count fields of those blocks and edges. Then the existing >>> algorithm operates the same as when we do have counts, and can >>> essentially operate incrementally on the edge frequencies since they >>> live in the count field of the edge and are no longer affected anytime >>> the source block is updated. Since the algorithm does update block >>> frequencies and probabilities as well (based on the count updates >>> performed), we can simply clear out these fake count fields at the end >>> of ssa_fix_duplicate_block_edges. This takes care of the remaining >>> insanities introduced by jump threading from these test cases. During >>> testing I also added in some checking to ensure that the count fields >>> for the whole routine were cleared properly to make sure the new >>> clear_counts_path was not missing anything (checking is a little too >>> heavyweight to add in normally). >>> >>> New patch below (also attached since my mailer sometimes eats spaces). >>> The differences between the old patch and the new one: >>> - removed deduce_freq (which was my least favorite part of the patch >>> anyway!), and its call from recompute_probabilities, since it is no >>> longer necessary. >>> - two new routines freqs_to_counts_path and clear_counts_path, invoked >>> from ssa_fix_duplicate_block_edges. >>> - two new tests >>> >>> Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? >>> >>> Thanks, >>> Teresa >>> >>> gcc: >>> >>> 2014-09-29 Teresa Johnson <tejohn...@google.com> >>> >>> * tree-ssa-threadupdate.c (struct ssa_local_info_t): New >>> duplicate_blocks bitmap. >>> (remove_ctrl_stmt_and_useless_edges): Ditto. >>> (create_block_for_threading): Ditto. >>> (compute_path_counts): New function. >>> (update_profile): Ditto. >>> (recompute_probabilities): Ditto. >>> (update_joiner_offpath_counts): Ditto. >>> (freqs_to_counts_path): Ditto. >>> (clear_counts_path): Ditto. >>> (ssa_fix_duplicate_block_edges): Update profile info. >>> (ssa_create_duplicates): Pass new parameter. >>> (ssa_redirect_edges): Remove old profile update. >>> (thread_block_1): New duplicate_blocks bitmap, >>> remove old profile update. >>> (thread_single_edge): Pass new parameter. >>> >>> gcc/testsuite: >>> >>> 2014-09-29 Teresa Johnson <tejohn...@google.com> >>> >>> * testsuite/gcc.dg/tree-prof/20050826-2.c: New test. >>> * testsuite/gcc.dg/tree-prof/cmpsf-1.c: Ditto. >> >> Given I'd already been through this pretty thoroughly, I just gave this a >> cursory review. >> >> clear_counts_path needs a function comment. It's pretty obvious what it's >> doing, but for completeness let's go ahead and get the obvious comment in >> there. > > Done and committed as r215739. >
Since this commit, I can see all my builds for arm*linux* and aarch64*linux* fail while building glibc: /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/bin/aarch64-none-linux-gnu-gcc iso-2022-cn.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Win line -Wundef -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wstrict-prototypes -fPIC -I../include -I/tmp/3496222_18.tmpdir/aci-gcc-f sf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata -I/tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux -gnu/glibc-1 -I../sysdeps/unix/sysv/linux/aarch64/nptl -I../sysdeps/unix/sysv/linux/aarch64 -I../sysdeps/unix/sysv/linux/generic -I../sysdeps/unix/s ysv/linux/wordsize-64 -I../nptl/sysdeps/unix/sysv/linux -I../nptl/sysdeps/pthread -I../sysdeps/pthread -I../sysdeps/unix/sysv/linux -I../sysdeps/gn u -I../sysdeps/unix/inet -I../nptl/sysdeps/unix/sysv -I../sysdeps/unix/sysv -I../nptl/sysdeps/unix -I../sysdeps/unix -I../sysdeps/posix -I../sysd eps/aarch64/fpu -I../sysdeps/aarch64/nptl -I../sysdeps/aarch64 -I../sysdeps/wordsize-64 -I../sysdeps/ieee754/ldbl-128 -I../sysdeps/ieee754/dbl-64/w ordsize-64 -I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754/flt-32 -I../sysdeps/aarch64/soft-fp -I../sysdeps/ieee754 -I../sysdeps/generic -I../npt l -I.. -I../libio -I. -nostdinc -isystem /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include -i system /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include-fixed -isystem /tmp/3496222_18.tmpdir /aci-gcc-fsf/builds/gcc-fsf-gccsrc/sysroot-aarch64-none-linux-gnu/usr/include -D_LIBC_REENTRANT -include ../include/libc-symbols.h -DPIC -DSHARED -DNOT_IN_libc -o /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os -MD -MP -MF /tmp/3 496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os.dt -MT /tmp/3496222_18.tmpdir/aci-gcc-fsf /builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os In file included from iso-2022-cn.c:407:0: ../iconv/skeleton.c: In function 'gconv': ../iconv/skeleton.c:800:1: internal compiler error: in check_probability, at basic-block.h:959 0xe4e2fb find_many_sub_basic_blocks(simple_bitmap_def*) /tmp/3496222_18.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/basic-block.h:959 0x6623f0 execute /tmp/3496222_18.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5916 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. Can you have a look? Thanks, Christophe. > Thanks, > Teresa > >> >> With that fix, approved for the trunk. Thanks for taking the time to sort >> out all these issues. >> >> jeff >> >> > > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413