On 23/03/18 13:50, Kyrill Tkachov wrote:
On 23/03/18 13:31, Sudakshina Das wrote:On 23/03/18 09:12, Kyrill Tkachov wrote:On 23/03/18 08:47, Christophe Lyon wrote:Hi Sudi, On 22 March 2018 at 18:26, Sudakshina Das <sudi....@arm.com> wrote:Hi On 22/03/18 16:52, Kyrill Tkachov wrote:On 22/03/18 16:20, Sudakshina Das wrote:Hi Kyrill On 22/03/18 16:08, Kyrill Tkachov wrote:Hi Sudi, On 21/03/18 17:44, Sudakshina Das wrote:Hi The ICE in the bug report was happening because the macroUSE_RETURN_INSN (FALSE) was returning different values at different points in the compilation. This was internally occurring because the function arm_compute_static_chain_stack_bytes () which was dependent onarm_r3_live_at_start_p () was giving a different value after thecond_exec instructions were created in ce3 causing the liveness of r3to escape up to the start block. The function arm_compute_static_chain_stack_bytes () should reallyonly compute the value once during epilogue/prologue stage. This passintroduces a new member 'static_chain_stack_bytes' to the target definition of the struct machine_function which gets calculated in expand_prologue and is the value that is returned by arm_compute_static_chain_stack_bytes () beyond that.Testing done: Bootstrapped and regtested on arm-none-linux-gnueabihfand added the reported test to the testsuite. Is this ok for trunk?Thanks for working on this. I agree with the approach, I have a couple of comments inline.Sudi ChangeLog entries: *** gcc/ChangeLog *** 2018-03-21 Sudakshina Das <sudi....@arm.com> PR target/84826 * config/arm/arm.h (machine_function): Add static_chain_stack_bytes.* config/arm/arm.c (arm_compute_static_chain_stack_bytes):Avoid re-computing once computed. (arm_expand_prologue): Compute machine->static_chain_stack_bytes. (arm_init_machine_status): Initialize machine->static_chain_stack_bytes. *** gcc/testsuite/ChangeLog *** 2018-03-21 Sudakshina Das <sudi....@arm.com> PR target/84826 * gcc.target/arm/pr84826.c: New testThe new test fails on arm-none-linux-gnueabi --with-mode thumb --with-cpu cortex-a9 --with-fpu default Dejagnu flags: -march=armv5t Because: /gcc/testsuite/gcc.target/arm/pr84826.c: In function 'a': /gcc/testsuite/gcc.target/arm/pr84826.c:15:1: sorry, unimplemented: -fstack-check=specific for Thumb-1 compiler exited with status 1 FAIL: gcc.target/arm/pr84826.c (test for excess errors) You probably have to add a require-effective-target to skip the test in such cases.Yeah, these tests need a { dg-require-effective-target supports_stack_clash_protection } A patch to add that is pre-approved. Sorry for missing it in review. KyrillHi Christophe and Kyrill How about the attached patch? { dg-require-effective-target supports_stack_clash_protection } is notenabled for any of ARM targets, so this is my work around for that. There isa comment in target-supports.exp which makes me a little hesitant in tinkering with the require effective target code. proc check_effective_target_supports_stack_clash_protection { } { # Temporary until the target bits are fully ACK'd. # if { [istarget aarch*-*-*] } { # return 1 # } if { [istarget x86_64-*-*] || [istarget i?86-*-*] || [istarget powerpc*-*-*] || [istarget rs6000*-*-*] || [istarget s390*-*-*] } { return 1 } return 0 } I have opened a new PR 85005 which mentions this that is meant for GCC 9 as part for a bigger cleanup and redesign of the stack clash protection code on ARM backend.Ok. Thanks for doing this.
Thanks and sorry about this! Committed as r258805. Sudi
Kyrill*** gcc/testsuite/ChangeLog *** 2018-03-23 Sudakshina Das <sudi....@arm.com> PR target/84826 * gcc.target/arm/pr84826.c: Add dg directive. Thanks SudiThanks, Christophediff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index bbf3937..2809112 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -1384,6 +1384,9 @@ typedef struct GTY(()) machine_function machine_mode thumb1_cc_mode; /* Set to 1 after arm_reorg has started. */ int after_arm_reorg;+ /* The number of bytes used to store the static chain register on the+ stack, above the stack frame. */ + int static_chain_stack_bytes; } machine_function; #endif diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index cb6ab81..bc31810 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -19392,6 +19392,11 @@ arm_r3_live_at_start_p (void) static int arm_compute_static_chain_stack_bytes (void) { + /* Once the value is updated from the init value of -1, do not + re-compute. */ + if (cfun->machine->static_chain_stack_bytes != -1) + return cfun->machine->static_chain_stack_bytes; +My concern is that this approach caches the first value that is computedfor static_chain_stack_bytes.I believe the layout frame code is called multiple times during registerallocation as it goes throughthe motions and I think we want the last value it computes during reloadHow about we do something like: if (cfun->machine->static_chain_stack_bytes != -1 &&epilogue_completed) return cfun->machine->static_chain_stack_bytes; ?... /* See the defining assertion in arm_expand_prologue. */ if (IS_NESTED (arm_current_func_type ())&& ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)@@ -21699,6 +21704,11 @@ arm_expand_prologue (void) emit_insn (gen_movsi (stack_pointer_rtx, r1)); }+ /* Let's compute the static_chain_stack_bytes required and store it.Right+ now the value must the -1 as stored by arm_init_machine_status ().*/ ... this comment would need to be tweaked as cfun->machine->static_chain_stack_bytes may hold an intermediate value computed in reload or some other point before epilogue_completed. + cfun->machine->static_chain_stack_bytes + = arm_compute_static_chain_stack_bytes (); +Maybe I did not understand this completely, but my idea was that I am initializing the value of cfun->machine->static_chain_stack_bytes to be -1 in arm_init_machine_status () and then it stays as -1 all throughout reload and hence the function arm_compute_static_chain_stack_bytes () will keep computing the value instead of returning the cached value. Only during expand_prologue (which I assumed occurs much after reload), I overwrite the initial -1 and after that any call to arm_compute_static_chain_stack_bytes() would return this cached value.I did start out writing the patch with a epilogue_completed check but realized that even during this stage arm_compute_static_chain_stack_bytes () was called several times and thus to avoid those re-computations, (again assuming by this stage we already should have a fixed value) I re-wrote itwith the initialization to -1 approach.Right, I had read the patch too quickly, sorry. You perform the caching of cfun->machine->static_chain_stack_bytes in arm_expand_prologue, not arm_compute_static_chain_stack_bytes. That does give you the right semantics I think. The patch is ok then with a small typo fix:Thanks! Committed as r258777. Sudi+ /* Let's compute the static_chain_stack_bytes required and store it.Right+ now the value must the -1 as stored by arm_init_machine_status ().*/ s/the/be/ + cfun->machine->static_chain_stack_bytes + = arm_compute_static_chain_stack_bytes (); + Thanks, Kyrill