Re: [PATCH] Fix -Wduplicated-branches with -g (PR c/85094)
On Wed, 28 Mar 2018, Jakub Jelinek wrote: > Hi! > > With the introduction of DEBUG_BEGIN_STMT/-gstatement-frontiers on by > default, the -Wduplicated-branches warning doesn't work anymore with -g, > because operand_equal_p in OEP_LEXICOGRAPHIC mode doesn't consider > DEBUG_BEGIN_STMTs as equal unless they are pointer equal. For the warning > we either need to ignore them completely (but that would need more changes, > as the hashing doesn't ignore them), or at least what the second hunk does, > consider them equal. The first hunk is just an optimization, if > -Wduplicated-branches is used on extremely deeply nested STATEMENT_LISTs, > then for -fchecking we'd be computing the hashes etc. at the level of every > stmt of every STATEMENT_LIST; we need to do that just once at the toplevel. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Thanks, Richard. > 2018-03-28 Jakub Jelinek > > PR c/85094 > * fold-const.c (operand_equal_p): Handle DEBUG_BEGIN_STMT. > For STATEMENT_LIST, pass down OEP_LEXICOGRAPHIC and maybe > OEP_NO_HASH_CHECK for recursive call, to avoid exponential > checking. > > * c-c++-common/Wduplicated-branches-14.c: New test. > > --- gcc/fold-const.c.jj 2018-03-27 12:54:43.657242009 +0200 > +++ gcc/fold-const.c 2018-03-28 16:20:40.024574216 +0200 > @@ -3479,7 +3479,8 @@ operand_equal_p (const_tree arg0, const_ > if (tsi_end_p (tsi1) && tsi_end_p (tsi2)) > return 1; > if (!operand_equal_p (tsi_stmt (tsi1), tsi_stmt (tsi2), > - OEP_LEXICOGRAPHIC)) > + flags & (OEP_LEXICOGRAPHIC > + | OEP_NO_HASH_CHECK))) > return 0; > } > } > @@ -3492,6 +3493,10 @@ operand_equal_p (const_tree arg0, const_ > if (flags & OEP_LEXICOGRAPHIC) > return OP_SAME_WITH_NULL (0); > return 0; > + case DEBUG_BEGIN_STMT: > + if (flags & OEP_LEXICOGRAPHIC) > + return 1; > + return 0; > default: > return 0; >} > --- gcc/testsuite/c-c++-common/Wduplicated-branches-14.c.jj 2018-03-28 > 16:20:00.966553266 +0200 > +++ gcc/testsuite/c-c++-common/Wduplicated-branches-14.c 2018-03-28 > 16:19:07.264524448 +0200 > @@ -0,0 +1,16 @@ > +/* PR c/85094 */ > +/* { dg-do compile } */ > +/* { dg-options "-O1 -Wduplicated-branches -g" } */ > + > +extern int g; > + > +void > +foo (int r) > +{ > + if (r < 64) > +g -= 48; > + else if (r < 80) /* { dg-warning "this condition has identical branches" > } */ > +g -= 64 - 45; > + else > +g -= 80 - 61; > +} > > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH] Fix _blendm
Hello Jakub, On 28 мар 21:24, Jakub Jelinek wrote: > Hi! > > When looking at PR85090, I've looked into tmp-mddump.md and noticed there > some instructions that can't assemble - vblendmd etc., there is only > vpblendmd or vblendmps. I couldn't make a testcase that would reproduce > this though, seems the masked *mov_internal is used instead. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Patch is obvious. OK. -- Thanks, K
Re: [PATCH, GCC-7, GCC-6][ARM][PR target/84826] Backport Fix ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabihf
Hi Sudi, On 28/03/18 15:04, Sudakshina Das wrote: Hi This patch is a request to backport r258777 and r258805 to gcc-7-branch and gcc-6-branch. The same ICE occurs in both the branches with -fstack-check. Thus the test case directive has been changed. The discussion on the patch that went into trunk is: https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01120.html Testing : Regtested on both the branches with arm-none-linux-gnueabihf Is this ok for gcc-7 and gcc-6? Ok. Thanks, Kyrill Sudi ChangeLog entries: *** gcc/ChangeLog *** 2018-03-28 Sudakshina Das Backport from mainline 2018-03-22 Sudakshina Das PR target/84826 * config/arm/arm.h (machine_function): Add static_chain_stack_bytes. * config/arm/arm.c (arm_compute_static_chain_stack_bytes): Avoid re-computing once computed. (arm_expand_prologue): Compute machine->static_chain_stack_bytes. (arm_init_machine_status): Initialize machine->static_chain_stack_bytes. *** gcc/testsuite/ChangeLog *** 2018-03-28 Sudakshina Das * gcc.target/arm/pr84826.c: Change dg-option to -fstack-check. Backport from mainline 2018-03-23 Sudakshina Das PR target/84826 * gcc.target/arm/pr84826.c: Add dg directive. Backport from mainline 2018-03-22 Sudakshina Das PR target/84826 * gcc.target/arm/pr84826.c: New test.
[PATCH, 0/2] Add scan-ltrans-tree-dump and scan-wpa-ipa-dump
Hi, Consider an lto multi-source test-case main.c and foo.c: .. $ cat main.c extern int foo (void); int main () { return foo () + 1; } $ cat foo.c int __attribute__((noinline, noclone)) foo (void) { return 2; } ... When compiling the test-case like this: ... $ gcc main.c foo.c -O2 -flto -save-temps -flto-partition=1to1 -o a.out ... the following happens: 1. main.s (containing gimple) is produced (and then assembled to main.o) 2. foo.s (containing gimple) is produced (and then assembled to main.o) 3. lto1 is called in wpa mode, generating a.out.ltrans0.o and a.out.ltrans1.o 4. lto1 is called in ltrans mode, generating a.out.ltrans0.s (which is then assembled to a.out.ltrans0.ltrans.o) 5. lto1 is called in ltrans mode, generating a.out.ltrans1.s (which is then assembled to a.out.ltrans1.ltrans.o) 6. a.out is produced from a.out.ltrans0.ltrans.o and a.out.ltrans1.ltrans.o When adding dump flags "-fdump-tree-all -fdump-rtl-all -fdump-ipa-all" to the command line, we get the following dump files. For 1, we generate (ignoring the 000i.* and statistics dumps from here on): - main.c.003t.original - main.c.050t.local-fnsummary2 - main.c.062i.targetclone - main.c.080i.pure-const For 2, we generate: - foo.c.003t.original - foo.c.050t.local-fnsummary2 - foo.c.062i.targetclone - foo.c.080i.pure-const For 3, we generate: - a.out.wpa.046t.profile_estimate - a.out.wpa.071i.whole-program - a.out.wpa.084i.comdats For 4, we generate: - a.out.ltrans0.046t.profile_estimate - a.out.ltrans0.075i.cp - a.out.ltrans0.087i.simdclone - a.out.ltrans0.088t.fixup_cfg4 - a.out.ltrans0.232t.optimized - a.out.ltrans0.234r.expand - a.out.ltrans0.317r.dfinish For 5, we generate: - a.out.ltrans1.046t.profile_estimate - a.out.ltrans1.075i.cp - a.out.ltrans1.087i.simdclone - a.out.ltrans1.088t.fixup_cfg4 - a.out.ltrans1.232t.optimized - a.out.ltrans1.234r.expand - a.out.ltrans1.317r.dfinish With the current set of dg-final commands scan-tree-dump, scan-rtl-dump and scan-ipa-dump, we are able to scan dump files for 1 and 2, but not for 3, 4 and 5. This patch series adds: - scan-wpa-ipa-dump, which allows us to scan the ipa dump files for 3 - scan-ltrans-tree-dump, which: - allows us to scan the tree dump files for 4, and - adds the option -flto-partition=one which forces wpa to combine all functions into a single partition ltrans0 (so that we don't have to worry about ltrans1 in 5). Thanks, - Tom
[PATCH, testsuite, 1/2] Add scan-wpa-ipa-dump
On 03/29/2018 11:11 AM, Tom de Vries wrote: Hi, Consider an lto multi-source test-case main.c and foo.c: .. $ cat main.c extern int foo (void); int main () { return foo () + 1; } $ cat foo.c int __attribute__((noinline, noclone)) foo (void) { return 2; } ... When compiling the test-case like this: ... $ gcc main.c foo.c -O2 -flto -save-temps -flto-partition=1to1 -o a.out ... the following happens: 1. main.s (containing gimple) is produced (and then assembled to main.o) 2. foo.s (containing gimple) is produced (and then assembled to main.o) 3. lto1 is called in wpa mode, generating a.out.ltrans0.o and a.out.ltrans1.o 4. lto1 is called in ltrans mode, generating a.out.ltrans0.s (which is then assembled to a.out.ltrans0.ltrans.o) 5. lto1 is called in ltrans mode, generating a.out.ltrans1.s (which is then assembled to a.out.ltrans1.ltrans.o) 6. a.out is produced from a.out.ltrans0.ltrans.o and a.out.ltrans1.ltrans.o When adding dump flags "-fdump-tree-all -fdump-rtl-all -fdump-ipa-all" to the command line, we get the following dump files. For 1, we generate (ignoring the 000i.* and statistics dumps from here on): - main.c.003t.original - main.c.050t.local-fnsummary2 - main.c.062i.targetclone - main.c.080i.pure-const For 2, we generate: - foo.c.003t.original - foo.c.050t.local-fnsummary2 - foo.c.062i.targetclone - foo.c.080i.pure-const For 3, we generate: - a.out.wpa.046t.profile_estimate - a.out.wpa.071i.whole-program - a.out.wpa.084i.comdats For 4, we generate: - a.out.ltrans0.046t.profile_estimate - a.out.ltrans0.075i.cp - a.out.ltrans0.087i.simdclone - a.out.ltrans0.088t.fixup_cfg4 - a.out.ltrans0.232t.optimized - a.out.ltrans0.234r.expand - a.out.ltrans0.317r.dfinish For 5, we generate: - a.out.ltrans1.046t.profile_estimate - a.out.ltrans1.075i.cp - a.out.ltrans1.087i.simdclone - a.out.ltrans1.088t.fixup_cfg4 - a.out.ltrans1.232t.optimized - a.out.ltrans1.234r.expand - a.out.ltrans1.317r.dfinish With the current set of dg-final commands scan-tree-dump, scan-rtl-dump and scan-ipa-dump, we are able to scan dump files for 1 and 2, but not for 3, 4 and 5. This patch series adds: - scan-wpa-ipa-dump, which allows us to scan the ipa dump files for 3 - scan-ltrans-tree-dump, which: - allows us to scan the tree dump files for 4, and - adds the option -flto-partition=one which forces wpa to combine all functions into a single partition ltrans0 (so that we don't have to worry about ltrans1 in 5). This patch adds scan-wpa-ipa-dump. Bootstrapped and reg-tested on x86_64. OK for stage4/stage1 trunk? Thanks, - Tom [testsuite] Add scan-wpa-ipa-dump 2018-03-28 Tom de Vries PR testsuite/85106 * gcc.dg/ipa/ipa-icf-38.c: New test. * gcc.dg/ipa/ipa-icf-38a.c: New test. * lib/scandump.exp (dump-base): New proc. (scan-dump, scan-dump-times, scan-dump-not, scan-dump-dem) (scan-dump-dem-not): Add and handle parameter for suffix of the dump base. * lib/scanipa.exp: Add "" argument to scan-dump calls. * lib/scanlang.exp: Same. * lib/scanrtl.exp: Same. * lib/scantree.exp: Same. * lib/scanwpaipa.exp: New file. * lib/gcc-dg.exp: Include scanwpaipa.exp. * testsuite/lib/libatomic.exp: Include scanwpaipa.exp. * testsuite/lib/libgomp.exp: Include scanwpaipa.exp. * testsuite/lib/libitm.exp: Include scanwpaipa.exp. * testsuite/lib/libvtv.exp: Include scanwpaipa.exp. * doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization dump files): Add wpa-ipa. --- gcc/doc/sourcebuild.texi | 2 +- gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c | 31 +++ gcc/testsuite/gcc.dg/ipa/ipa-icf-38a.c | 14 gcc/testsuite/lib/gcc-dg.exp | 1 + gcc/testsuite/lib/scandump.exp | 63 +- gcc/testsuite/lib/scanipa.exp | 25 +++--- gcc/testsuite/lib/scanlang.exp | 6 +- gcc/testsuite/lib/scanrtl.exp | 25 +++--- gcc/testsuite/lib/scantree.exp | 25 +++--- gcc/testsuite/lib/scanwpaipa.exp | 147 + libatomic/testsuite/lib/libatomic.exp | 1 + libgomp/testsuite/lib/libgomp.exp | 1 + libitm/testsuite/lib/libitm.exp| 1 + libvtv/testsuite/lib/libvtv.exp| 1 + 14 files changed, 290 insertions(+), 53 deletions(-) diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index b9672ca..db98e9f 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -2596,7 +2596,7 @@ assembly output. @subsubsection Scan optimization dump files These commands are available for @var{kind} of @code{tree}, @code{rtl}, -and @code{ipa}. +@code{ipa}, and @code{wpa-ipa}. @table @code @item scan-@var{kind}-dump @var{regex} @var{suffix} [@{ target/xfail @var{selector} @}] diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c new file mode 100644 index 000..6e7936a --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c @@ -0,0 +1,31 @@ +/* { dg-do link } */ +/
[PATCH, testsuite, 2/2] Add scan-ltrans-tree-dump
On 03/29/2018 11:11 AM, Tom de Vries wrote: Hi, Consider an lto multi-source test-case main.c and foo.c: .. $ cat main.c extern int foo (void); int main () { return foo () + 1; } $ cat foo.c int __attribute__((noinline, noclone)) foo (void) { return 2; } ... When compiling the test-case like this: ... $ gcc main.c foo.c -O2 -flto -save-temps -flto-partition=1to1 -o a.out ... the following happens: 1. main.s (containing gimple) is produced (and then assembled to main.o) 2. foo.s (containing gimple) is produced (and then assembled to main.o) 3. lto1 is called in wpa mode, generating a.out.ltrans0.o and a.out.ltrans1.o 4. lto1 is called in ltrans mode, generating a.out.ltrans0.s (which is then assembled to a.out.ltrans0.ltrans.o) 5. lto1 is called in ltrans mode, generating a.out.ltrans1.s (which is then assembled to a.out.ltrans1.ltrans.o) 6. a.out is produced from a.out.ltrans0.ltrans.o and a.out.ltrans1.ltrans.o When adding dump flags "-fdump-tree-all -fdump-rtl-all -fdump-ipa-all" to the command line, we get the following dump files. For 1, we generate (ignoring the 000i.* and statistics dumps from here on): - main.c.003t.original - main.c.050t.local-fnsummary2 - main.c.062i.targetclone - main.c.080i.pure-const For 2, we generate: - foo.c.003t.original - foo.c.050t.local-fnsummary2 - foo.c.062i.targetclone - foo.c.080i.pure-const For 3, we generate: - a.out.wpa.046t.profile_estimate - a.out.wpa.071i.whole-program - a.out.wpa.084i.comdats For 4, we generate: - a.out.ltrans0.046t.profile_estimate - a.out.ltrans0.075i.cp - a.out.ltrans0.087i.simdclone - a.out.ltrans0.088t.fixup_cfg4 - a.out.ltrans0.232t.optimized - a.out.ltrans0.234r.expand - a.out.ltrans0.317r.dfinish For 5, we generate: - a.out.ltrans1.046t.profile_estimate - a.out.ltrans1.075i.cp - a.out.ltrans1.087i.simdclone - a.out.ltrans1.088t.fixup_cfg4 - a.out.ltrans1.232t.optimized - a.out.ltrans1.234r.expand - a.out.ltrans1.317r.dfinish With the current set of dg-final commands scan-tree-dump, scan-rtl-dump and scan-ipa-dump, we are able to scan dump files for 1 and 2, but not for 3, 4 and 5. This patch series adds: - scan-wpa-ipa-dump, which allows us to scan the ipa dump files for 3 - scan-ltrans-tree-dump, which: - allows us to scan the tree dump files for 4, and - adds the option -flto-partition=one which forces wpa to combine all functions into a single partition ltrans0 (so that we don't have to worry about ltrans1 in 5). This patch adds scan-ltrans-tree-dump. Bootstrapped and reg-tested on x86_64. OK for stage4/stage1 trunk? Thanks, - Tom [testsuite] Add scan-ltrans-tree-dump 2018-03-28 Tom de Vries PR testsuite/85106 * gcc.dg/ipa/ipa-icf-38.c: Use scan-ltrans-tree-dump. * lib/scanltranstree.exp: New file. * lib/target-supports.exp (scan-ltrans-tree-dump_required_options) (scan-ltrans-tree-dump-times_required_options) (scan-ltrans-tree-dump-not_required_options) (scan-ltrans-tree-dump-dem_required_options) (scan-ltrans-tree-dump-dem-not_required_options): New proc. * lib/gcc-dg.exp: Include scanltranstree.exp. * testsuite/lib/libatomic.exp: Include scanltranstree.exp. * testsuite/lib/libgomp.exp: Include scanltranstree.exp. * testsuite/lib/libitm.exp: Include scanltranstree.exp. * testsuite/lib/libvtv.exp: Include scanltranstree.exp. * doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization dump files): Add ltrans-tree. --- gcc/doc/sourcebuild.texi | 4 +- gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c | 4 +- gcc/testsuite/lib/gcc-dg.exp | 1 + gcc/testsuite/lib/scanltranstree.exp | 148 ++ gcc/testsuite/lib/target-supports.exp | 20 + libatomic/testsuite/lib/libatomic.exp | 1 + libgomp/testsuite/lib/libgomp.exp | 1 + libitm/testsuite/lib/libitm.exp | 1 + libvtv/testsuite/lib/libvtv.exp | 1 + 9 files changed, 178 insertions(+), 3 deletions(-) diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index db98e9f..8ef2d0e 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -2595,8 +2595,8 @@ assembly output. @subsubsection Scan optimization dump files -These commands are available for @var{kind} of @code{tree}, @code{rtl}, -@code{ipa}, and @code{wpa-ipa}. +These commands are available for @var{kind} of @code{tree}, @code{ltrans-tree}, +@code{rtl}, @code{ipa}, and @code{wpa-ipa}. @table @code @item scan-@var{kind}-dump @var{regex} @var{suffix} [@{ target/xfail @var{selector} @}] diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c index 6e7936a..85531ab 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c @@ -1,5 +1,5 @@ /* { dg-do link } */ -/* { dg-options "-O2 -fdump-ipa-icf -flto" } */ +/* { dg-options "-O2 -fdump-ipa-icf -flto -fdump-tree-fixup_cfg4" } */ /* { dg-require-effective-target lto } */ /* { dg-additional-sources
Re: [PATCH, GCC-7, GCC-6][ARM][PR target/84826] Backport Fix ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabihf
Hi Kyrill On 29/03/18 09:41, Kyrill Tkachov wrote: Hi Sudi, On 28/03/18 15:04, Sudakshina Das wrote: Hi This patch is a request to backport r258777 and r258805 to gcc-7-branch and gcc-6-branch. The same ICE occurs in both the branches with -fstack-check. Thus the test case directive has been changed. The discussion on the patch that went into trunk is: https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01120.html Testing : Regtested on both the branches with arm-none-linux-gnueabihf Is this ok for gcc-7 and gcc-6? Ok. Thanks, Kyrill Thanks! Committed to gcc-7-branch as r258948 and gcc-6-branch as r258949. Sudi Sudi ChangeLog entries: *** gcc/ChangeLog *** 2018-03-28 Sudakshina Das Backport from mainline 2018-03-22 Sudakshina Das PR target/84826 * config/arm/arm.h (machine_function): Add static_chain_stack_bytes. * config/arm/arm.c (arm_compute_static_chain_stack_bytes): Avoid re-computing once computed. (arm_expand_prologue): Compute machine->static_chain_stack_bytes. (arm_init_machine_status): Initialize machine->static_chain_stack_bytes. *** gcc/testsuite/ChangeLog *** 2018-03-28 Sudakshina Das * gcc.target/arm/pr84826.c: Change dg-option to -fstack-check. Backport from mainline 2018-03-23 Sudakshina Das PR target/84826 * gcc.target/arm/pr84826.c: Add dg directive. Backport from mainline 2018-03-22 Sudakshina Das PR target/84826 * gcc.target/arm/pr84826.c: New test.
[PATCH] Print function attributes in rtl dumps
Hi, when we compile a function with attributes: ... int __attribute__((noinline, noclone)) foo (void) { return 2; } ... like this: ... gcc main.c -fdump-tree-all -fdump-rtl-all ... we find the function attributes starting from foo.c.004t.gimple: ... __attribute__((noclone, noinline)) foo () { int D.1961; D.1961 = 2; return D.1961; } ... to foo.c.232t.optimized. But we don't find the attributes in the rtl dumps: ... $ grep __attribute__ foo.c.*r.* $ ... This patch adds printing of the function attributes in the rtl dump, f.i. foo.c.235r.vregs looks like this : ... ;; Function foo (foo, funcdef_no=0, decl_uid=1958, cgraph_uid=0, symbol_order=0) function foo attributes: __attribute__((noclone, noinline)) (note 1 0 3 NOTE_INSN_DELETED) (note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (note 2 3 5 2 NOTE_INSN_FUNCTION_BEG) (insn 5 2 8 2 (set (reg:SI 87 [ _1 ]) (const_int 2 [0x2])) "foo.c":4 86 {*movsi_internal} (nil)) (insn 8 5 12 2 (set (reg:SI 88 [ ]) (reg:SI 87 [ _1 ])) "foo.c":4 86 {*movsi_internal} (nil)) (insn 12 8 13 2 (set (reg/i:SI 0 ax) (reg:SI 88 [ ])) "foo.c":5 86 {*movsi_internal} (nil)) (insn 13 12 0 2 (use (reg/i:SI 0 ax)) "foo.c":5 -1 (nil)) ... I've added the "function foo attributes" prefix because in other rtl dumps there may be quite a number of lines between the ";; Function foo" header and the first insn. OK for stage1 if bootstrap and reg-test on x86 succeeds? Thanks, - Tom
Re: [PATCH] Print function attributes in rtl dumps
[ Fix ENOPATCH ] On 03/29/2018 12:17 PM, Tom de Vries wrote: Hi, when we compile a function with attributes: ... int __attribute__((noinline, noclone)) foo (void) { return 2; } ... like this: ... gcc main.c -fdump-tree-all -fdump-rtl-all ... we find the function attributes starting from foo.c.004t.gimple: ... __attribute__((noclone, noinline)) foo () { int D.1961; D.1961 = 2; return D.1961; } ... to foo.c.232t.optimized. But we don't find the attributes in the rtl dumps: ... $ grep __attribute__ foo.c.*r.* $ ... This patch adds printing of the function attributes in the rtl dump, f.i. foo.c.235r.vregs looks like this : ... ;; Function foo (foo, funcdef_no=0, decl_uid=1958, cgraph_uid=0, symbol_order=0) function foo attributes: __attribute__((noclone, noinline)) (note 1 0 3 NOTE_INSN_DELETED) (note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (note 2 3 5 2 NOTE_INSN_FUNCTION_BEG) (insn 5 2 8 2 (set (reg:SI 87 [ _1 ]) (const_int 2 [0x2])) "foo.c":4 86 {*movsi_internal} (nil)) (insn 8 5 12 2 (set (reg:SI 88 [ ]) (reg:SI 87 [ _1 ])) "foo.c":4 86 {*movsi_internal} (nil)) (insn 12 8 13 2 (set (reg/i:SI 0 ax) (reg:SI 88 [ ])) "foo.c":5 86 {*movsi_internal} (nil)) (insn 13 12 0 2 (use (reg/i:SI 0 ax)) "foo.c":5 -1 (nil)) ... I've added the "function foo attributes" prefix because in other rtl dumps there may be quite a number of lines between the ";; Function foo" header and the first insn. OK for stage1 if bootstrap and reg-test on x86 succeeds? Thanks, - Tom Print function attributes in rtl dump 2018-03-29 Tom de Vries * passes.c (execute_function_dump): Call dump_function_attributes before print_rtl_with_bb. * tree-cfg.c (dump_function_attributes): New function, factored out of ... (dump_function_to_file): ... here. * tree-cfg.h (dump_function_attributes): Declare. --- gcc/passes.c | 5 - gcc/tree-cfg.c | 68 ++ gcc/tree-cfg.h | 1 + 3 files changed, 45 insertions(+), 29 deletions(-) diff --git a/gcc/passes.c b/gcc/passes.c index ad0a912..91518bd 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -1761,7 +1761,10 @@ execute_function_dump (function *fn, void *data) if (fn->curr_properties & PROP_trees) dump_function_to_file (fn->decl, dump_file, dump_flags); else - print_rtl_with_bb (dump_file, get_insns (), dump_flags); + { + dump_function_attributes (fn->decl, dump_file, true); + print_rtl_with_bb (dump_file, get_insns (), dump_flags); + } /* Flush the file. If verification fails, we won't be able to close the file before aborting. */ diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index 9485f73..d0f8b70 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -7963,6 +7963,45 @@ print_no_sanitize_attr_value (FILE *file, tree value) } } +/* Dump function attributes of function FNDECL to FILE. Print prefix showing + the function name if PRINT_PREFIX. */ + +void +dump_function_attributes (tree fndecl, FILE *file, bool print_prefix) +{ + if (DECL_ATTRIBUTES (fndecl) == NULL_TREE) +return; + + if (print_prefix) +fprintf (file, "function %s attributes: ", fndecl_name (fndecl)); + + fprintf (file, "__attribute__(("); + + bool first = true; + tree chain; + for (chain = DECL_ATTRIBUTES (fndecl); chain; + first = false, chain = TREE_CHAIN (chain)) +{ + if (!first) + fprintf (file, ", "); + + tree name = get_attribute_name (chain); + print_generic_expr (file, name, dump_flags); + if (TREE_VALUE (chain) != NULL_TREE) + { + fprintf (file, " ("); + + if (strstr (IDENTIFIER_POINTER (name), "no_sanitize")) + print_no_sanitize_attr_value (file, TREE_VALUE (chain)); + else + print_generic_expr (file, TREE_VALUE (chain), dump_flags); + fprintf (file, ")"); + } +} + + fprintf (file, "))\n"); +} + /* Dump FUNCTION_DECL FN to file FILE using FLAGS (see TDF_* in dumpfile.h) */ @@ -7978,34 +8017,7 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags) && decl_is_tm_clone (fndecl)); struct function *fun = DECL_STRUCT_FUNCTION (fndecl); - if (DECL_ATTRIBUTES (fndecl) != NULL_TREE) -{ - fprintf (file, "__attribute__(("); - - bool first = true; - tree chain; - for (chain = DECL_ATTRIBUTES (fndecl); chain; - first = false, chain = TREE_CHAIN (chain)) - { - if (!first) - fprintf (file, ", "); - - tree name = get_attribute_name (chain); - print_generic_expr (file, name, dump_flags); - if (TREE_VALUE (chain) != NULL_TREE) - { - fprintf (file, " ("); - - if (strstr (IDENTIFIER_POINTER (name), "no_sanitize")) - print_no_sanitize_attr_value (file, TREE_VALUE (chain)); - else - print_generic_expr (file, TREE_VALUE (chain), dump_flags); - fprintf (file, ")"); - } - } - - fprintf (file, "))\n"); -} + dump_function_attributes (fndecl, file, false); current_function_d
Re: [Aarch64] Fix conditional branches with target far away.
Hi Sudakshina, Thanks for pointing that out. Updated the conditions for attribute length to take care of boundary conditions for offset range. Please find attached the updated patch. I have tested it for gcc testsuite and the failing testcase. Ok for trunk? On 22 March 2018 at 19:06, Sudakshina Das wrote: > Hi Sameera > > On 22/03/18 02:07, Sameera Deshpande wrote: >> >> Hi Sudakshina, >> >> As per the ARMv8 ARM, for the offset range (-1048576 ,1048572), the >> far branch instruction offset is inclusive of both the offsets. Hence, >> I am using <=||=> and not <||>= as it was in previous implementation. > > > I have to admit earlier I was only looking at the patch mechanically and > found a difference with the previous implementation in offset comparison. > After you pointed out, I looked up the ARMv8 ARM and I have a couple of > doubts: > > 1. My understanding is that any offset in [-1048576 ,1048572] both inclusive > qualifies as an 'in range' offset. However, the code for both attribute > length and far_branch has been using [-1048576 ,1048572), that is, ( >= && < > ). If the far_branch was incorrectly calculated, then maybe the length > calculations with similar magic numbers should also be corrected? Of course, > I am not an expert in this and maybe this was a conscience decision so I > would ask Ramana to maybe clarify if he remembers. > > 2. Now to come back to your patch, if my understanding is correct, I think a > far_branch would be anything outside of this range, that is, > (offset < -1048576 || offset > 1048572), anything that can not be > represented in the 21-bit range. > > Thanks > Sudi > > >> >> On 16 March 2018 at 00:51, Sudakshina Das wrote: >>> >>> On 15/03/18 15:27, Sameera Deshpande wrote: Ping! On 28 February 2018 at 16:18, Sameera Deshpande wrote: > > > On 27 February 2018 at 18:25, Ramana Radhakrishnan > wrote: >> >> >> On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande >> wrote: >>> >>> >>> Hi! >>> >>> Please find attached the patch to fix bug in branches with offsets >>> over >>> 1MiB. >>> There has been an attempt to fix this issue in commit >>> 050af05b9761f1979f11c151519e7244d5becd7c >>> >>> However, the far_branch attribute defined in above patch used >>> insn_length - which computes incorrect offset. Hence, eliminated the >>> attribute completely, and computed the offset from insn_addresses >>> instead. >>> >>> Ok for trunk? >>> >>> gcc/Changelog >>> >>> 2018-02-13 Sameera Deshpande >>> * config/aarch64/aarch64.md (far_branch): Remove attribute. >>> Eliminate >>> all the dependencies on the attribute from RTL patterns. >>> >> >> I'm not a maintainer but this looks good to me modulo notes about how >> this was tested. What would be nice is a testcase for the testsuite as >> well as ensuring that the patch has been bootstrapped and regression >> tested. AFAIR, the original patch was put in because match.pd failed >> when bootstrap in another context. >> >> >> regards >> Ramana >> >>> -- >>> - Thanks and regards, >>> Sameera D. > > > > The patch is tested with GCC testsuite and bootstrapping successfully. > Also tested for spec benchmark. > >>> >>> I am not a maintainer either. I noticed that the range check you do for >>> the offset has a (<= || >=). The "far_branch" however did (< || >=) for a >>> positive value. Was that also part of the incorrect offset calculation? >>> >>> @@ -692,7 +675,11 @@ >>> { >>>if (get_attr_length (insn) =3D=3D 8) >>> { >>> - if (get_attr_far_branch (insn) =3D=3D 1) >>> + long long int offset; >>> + offset =3D INSN_ADDRESSES (INSN_UID (XEXP (operands[2], 0))) >>> + - INSN_ADDRESSES (INSN_UID (insn)); >>> + >>> + if (offset <=3D -1048576 || offset >=3D 1048572) >>> return aarch64_gen_far_branch (operands, 2, "Ltb", >>>"\\t%0, %1, "); >>> else >>> @@ -709,12 +696,7 @@ >>> (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int >>> -32768)) >>> (lt (minus (match_dup 2) (pc)) (const_int >>> 32764))) >>> (const_int 4) >>> - (const_int 8))) >>> - (set (attr "far_branch") >>> - (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int >>> -1048576)) >>> - (lt (minus (match_dup 2) (pc)) (const_int >>> 1048572))) >>> - (const_int 0) >>> - (const_int 1)))] >>> + (const_int 8)))] >>> >>>) >>> >>> Thanks >>> Sudi >>> > -- > - Thanks and regards, > Sameera D. >>> >> >> >> > -- - Thanks and regards, Sameera D. Index: gcc/config/aarch64/aarch
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
On 03/28/2018 06:36 PM, Jakub Jelinek wrote: On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote: --- a/gcc/config/linux.c +++ b/gcc/config/linux.c @@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class) return false; } + +/* This hook determines whether a function from libc has a fast implementation + FN is present at the runtime. We override it for i386 and glibc C library + as this combination provides fast implementation of mempcpy function. */ + +enum libc_speed +ix86_linux_libc_func_speed (int fn) Putting a ix86_ function into config/linux.c used by most linux targets is weird. Either we multiple linux targets with mempcpy fast, then name it somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h. And yes, we do care about i?86-linux. Or it is for x86 only, and then it shouldn't be in config/linux.c, but either e.g. static inline in config/i386/linux.h, or we need config/i386/linux.c if we don't have it already. I'm fine with putting the implementation into gcc/config/i386/linux.c. Can you please help me how to conditionally build the file? Martin Jakub
[PATCH] Documentation tweaks.
Hi. I'm sending couple of documentation tweaks that touch LTO, binutils auto-load plugin support and valid options of -march (on x86-64). Martin gcc/ChangeLog: 2018-03-29 Martin Liska PR lto/84995. * doc/invoke.texi: Document how LTO works with debug info. Describe auto-load support of binutils. Mention 'x86-64' as valid option value of -march option. --- gcc/doc/invoke.texi | 11 +++ 1 file changed, 11 insertions(+) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index feacd569ef3..424e426a390 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -9482,6 +9482,9 @@ types of hosts. The bytecode files are versioned and there is a strict version check, so bytecode files generated in one version of GCC do not work with an older or newer version of GCC. +Link-time optimization on ELF/DWARF systems works with generation of debugging +information. Other systems are currently not supported. + Link-time optimization does not work well with generation of debugging information. Combining @option{-flto} with @option{-g} is currently experimental and expected to produce unexpected @@ -9565,6 +9568,11 @@ need to support linker plugins to allow a full-featured build environment @command{gcc-nm}, @command{gcc-ranlib} wrappers to pass the right options to these tools. With non fat LTO makefiles need to be modified to use them. +Note that modern binutils provide plugin auto-load mechanism. +Installing the linker plugin into @file{$libdir/bfd-plugins} has the same +effect as usage of the command wrappers (@command{gcc-ar}, @command{gcc-nm} and +@command{gcc-ranlib}). + The default is @option{-fno-fat-lto-objects} on targets with linker plugin support. @@ -26440,6 +26448,9 @@ the result might not run on different machines). Using @option{-mtune=native} produces code optimized for the local machine under the constraints of the selected instruction set. +@item x86-64 +A generic CPU with 64-bit extensions. + @item i386 Original Intel i386 CPU@.
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
On Thu, Mar 29, 2018 at 4:28 AM, Martin Liška wrote: > On 03/28/2018 06:36 PM, Jakub Jelinek wrote: >> >> On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote: >>> >>> --- a/gcc/config/linux.c >>> +++ b/gcc/config/linux.c >>> @@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class) >>> return false; >>> } >>> + >>> +/* This hook determines whether a function from libc has a fast >>> implementation >>> + FN is present at the runtime. We override it for i386 and glibc C >>> library >>> + as this combination provides fast implementation of mempcpy function. >>> */ >>> + >>> +enum libc_speed >>> +ix86_linux_libc_func_speed (int fn) >> >> >> Putting a ix86_ function into config/linux.c used by most linux targets is >> weird. Either we multiple linux targets with mempcpy fast, then name it >> somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h. >> And yes, we do care about i?86-linux. Or it is for x86 only, and then >> it shouldn't be in config/linux.c, but either e.g. static inline in >> config/i386/linux.h, or we need config/i386/linux.c if we don't have it >> already. > > > I'm fine with putting the implementation into gcc/config/i386/linux.c. Can > you please > help me how to conditionally build the file? config.gcc: extra_objs="${extra_objs} linux.o" config.gcc: extra_objs="$extra_objs powerpcspe-linux.o" config.gcc: extra_objs="$extra_objs rs6000-linux.o" -- H.J.
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
On 03/29/2018 02:25 PM, Jakub Jelinek wrote: On Thu, Mar 29, 2018 at 01:28:13PM +0200, Martin Liška wrote: On 03/28/2018 06:36 PM, Jakub Jelinek wrote: On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote: --- a/gcc/config/linux.c +++ b/gcc/config/linux.c @@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class) return false; } + +/* This hook determines whether a function from libc has a fast implementation + FN is present at the runtime. We override it for i386 and glibc C library + as this combination provides fast implementation of mempcpy function. */ + +enum libc_speed +ix86_linux_libc_func_speed (int fn) Putting a ix86_ function into config/linux.c used by most linux targets is weird. Either we multiple linux targets with mempcpy fast, then name it somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h. And yes, we do care about i?86-linux. Or it is for x86 only, and then it shouldn't be in config/linux.c, but either e.g. static inline in config/i386/linux.h, or we need config/i386/linux.c if we don't have it already. I'm fine with putting the implementation into gcc/config/i386/linux.c. Can you please Can't you just put it into gcc/config/i386/linux-common.h as static inline, so that it is optimized away whenever not needed? I would like to put it there, but: g++ -c -g -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE -I. -Ibuild -I../../gcc -I../../gcc/build -I../../gcc/../include -I../../gcc/../libcpp/include \ -o build/genpreds.o ../../gcc/genpreds.c In file included from ./tm.h:38:0, from ../../gcc/genpreds.c:26: ../../gcc/config/i386/linux-common.h: In function ‘libc_speed ix86_linux_libc_func_speed(int)’: ../../gcc/config/i386/linux-common.h:137:8: error: use of enum ‘built_in_function’ without previous declaration enum built_in_function f = (built_in_function)fn; ^ ../../gcc/config/i386/linux-common.h:137:31: error: ‘built_in_function’ was not declared in this scope enum built_in_function f = (built_in_function)fn; ^ ../../gcc/config/i386/linux-common.h:137:31: note: suggested alternative: ‘machine_function’ enum built_in_function f = (built_in_function)fn; ^ machine_function ../../gcc/config/i386/linux-common.h:144:12: error: ‘BUILT_IN_MEMPCPY’ was not declared in this scope case BUILT_IN_MEMPCPY: ^~~~ Martin If you really want to add a c file, it better not be called linux.c, because linux.o for it would clash with linux.o from gcc/config/linux.c. And, you'd need to add the whatever.o into extra_objs in gcc/config.gcc and add rules for it into gcc/config/i386/t-linux (see linux.o in config.gcc and config/t-linux). help me how to conditionally build the file? Jakub
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
On Thu, Mar 29, 2018 at 01:28:13PM +0200, Martin Liška wrote: > On 03/28/2018 06:36 PM, Jakub Jelinek wrote: > > On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote: > > > --- a/gcc/config/linux.c > > > +++ b/gcc/config/linux.c > > > @@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class) > > > return false; > > > } > > > + > > > +/* This hook determines whether a function from libc has a fast > > > implementation > > > + FN is present at the runtime. We override it for i386 and glibc C > > > library > > > + as this combination provides fast implementation of mempcpy function. > > > */ > > > + > > > +enum libc_speed > > > +ix86_linux_libc_func_speed (int fn) > > > > Putting a ix86_ function into config/linux.c used by most linux targets is > > weird. Either we multiple linux targets with mempcpy fast, then name it > > somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h. > > And yes, we do care about i?86-linux. Or it is for x86 only, and then > > it shouldn't be in config/linux.c, but either e.g. static inline in > > config/i386/linux.h, or we need config/i386/linux.c if we don't have it > > already. > > I'm fine with putting the implementation into gcc/config/i386/linux.c. Can > you please Can't you just put it into gcc/config/i386/linux-common.h as static inline, so that it is optimized away whenever not needed? If you really want to add a c file, it better not be called linux.c, because linux.o for it would clash with linux.o from gcc/config/linux.c. And, you'd need to add the whatever.o into extra_objs in gcc/config.gcc and add rules for it into gcc/config/i386/t-linux (see linux.o in config.gcc and config/t-linux). > help me how to conditionally build the file? Jakub
[PATCH] i386: Enable AVX/AVX512 features only if supported by OSXSAVE
Enable AVX and AVX512 features only if their states are supported by OSXSAVE. OK for trunk and release branches? H.J. --- PR target/85100 * config/i386/cpuinfo.c (XCR_XFEATURE_ENABLED_MASK): New. (XSTATE_FP): Likewise. (XSTATE_SSE): Likewise. (XSTATE_YMM): Likewise. (XSTATE_OPMASK): Likewise. (XSTATE_ZMM): Likewise. (XSTATE_HI_ZMM): Likewise. (XCR_AVX_ENABLED_MASK): Likewise. (XCR_AVX512F_ENABLED_MASK): Likewise. (get_available_features): Enable AVX and AVX512 features only if their states are supported by OSXSAVE. --- libgcc/config/i386/cpuinfo.c | 134 +-- 1 file changed, 90 insertions(+), 44 deletions(-) diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c index 4eb3f5cd944..1dac110a79a 100644 --- a/libgcc/config/i386/cpuinfo.c +++ b/libgcc/config/i386/cpuinfo.c @@ -240,6 +240,40 @@ get_available_features (unsigned int ecx, unsigned int edx, unsigned int features = 0; unsigned int features2 = 0; + /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */ +#define XCR_XFEATURE_ENABLED_MASK 0x0 +#define XSTATE_FP 0x1 +#define XSTATE_SSE 0x2 +#define XSTATE_YMM 0x4 +#define XSTATE_OPMASK 0x20 +#define XSTATE_ZMM 0x40 +#define XSTATE_HI_ZMM 0x80 + +#define XCR_AVX_ENABLED_MASK \ + (XSTATE_SSE | XSTATE_YMM) +#define XCR_AVX512F_ENABLED_MASK \ + (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM) + + /* Check if AVX and AVX512 are usable. */ + int avx_usable = 0; + int avx512_usable = 0; + if ((ecx & bit_OSXSAVE)) +{ + /* Check if XMM, YMM, OPMASK, upper 256 bits of ZMM0-ZMM15 and + ZMM16-ZMM31 states are supported by OSXSAVE. */ + unsigned int xcrlow; + unsigned int xcrhigh; + asm (".byte 0x0f, 0x01, 0xd0" + : "=a" (xcrlow), "=d" (xcrhigh) + : "c" (XCR_XFEATURE_ENABLED_MASK)); + if ((xcrlow & XCR_AVX_ENABLED_MASK) == XCR_AVX_ENABLED_MASK) + { + avx_usable = 1; + avx512_usable = ((xcrlow & XCR_AVX512F_ENABLED_MASK) + == XCR_AVX512F_ENABLED_MASK); + } +} + #define set_feature(f) \ if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32)) @@ -265,10 +299,13 @@ get_available_features (unsigned int ecx, unsigned int edx, set_feature (FEATURE_SSE4_1); if (ecx & bit_SSE4_2) set_feature (FEATURE_SSE4_2); - if (ecx & bit_AVX) -set_feature (FEATURE_AVX); - if (ecx & bit_FMA) -set_feature (FEATURE_FMA); + if (avx_usable) +{ + if (ecx & bit_AVX) + set_feature (FEATURE_AVX); + if (ecx & bit_FMA) + set_feature (FEATURE_FMA); +} /* Get Advanced Features at level 7 (eax = 7, ecx = 0). */ if (max_cpuid_level >= 7) @@ -276,44 +313,50 @@ get_available_features (unsigned int ecx, unsigned int edx, __cpuid_count (7, 0, eax, ebx, ecx, edx); if (ebx & bit_BMI) set_feature (FEATURE_BMI); - if (ebx & bit_AVX2) - set_feature (FEATURE_AVX2); + if (avx_usable) + { + if (ebx & bit_AVX2) + set_feature (FEATURE_AVX2); + } if (ebx & bit_BMI2) set_feature (FEATURE_BMI2); - if (ebx & bit_AVX512F) - set_feature (FEATURE_AVX512F); - if (ebx & bit_AVX512VL) - set_feature (FEATURE_AVX512VL); - if (ebx & bit_AVX512BW) - set_feature (FEATURE_AVX512BW); - if (ebx & bit_AVX512DQ) - set_feature (FEATURE_AVX512DQ); - if (ebx & bit_AVX512CD) - set_feature (FEATURE_AVX512CD); - if (ebx & bit_AVX512PF) - set_feature (FEATURE_AVX512PF); - if (ebx & bit_AVX512ER) - set_feature (FEATURE_AVX512ER); - if (ebx & bit_AVX512IFMA) - set_feature (FEATURE_AVX512IFMA); - if (ecx & bit_AVX512VBMI) - set_feature (FEATURE_AVX512VBMI); - if (ecx & bit_AVX512VBMI2) - set_feature (FEATURE_AVX512VBMI2); - if (ecx & bit_GFNI) - set_feature (FEATURE_GFNI); - if (ecx & bit_VPCLMULQDQ) - set_feature (FEATURE_VPCLMULQDQ); - if (ecx & bit_AVX512VNNI) - set_feature (FEATURE_AVX512VNNI); - if (ecx & bit_AVX512BITALG) - set_feature (FEATURE_AVX512BITALG); - if (ecx & bit_AVX512VPOPCNTDQ) - set_feature (FEATURE_AVX512VPOPCNTDQ); - if (edx & bit_AVX5124VNNIW) - set_feature (FEATURE_AVX5124VNNIW); - if (edx & bit_AVX5124FMAPS) - set_feature (FEATURE_AVX5124FMAPS); + if (avx512_usable) + { + if (ebx & bit_AVX512F) + set_feature (FEATURE_AVX512F); + if (ebx & bit_AVX512VL) + set_feature (FEATURE_AVX512VL); + if (ebx & bit_AVX512BW) + set_feature (FEATURE_AVX512BW); + if (ebx & bit_AVX512DQ) + set_feature (FEATURE_AVX512DQ); + if (
Re: [PATCH] Documentation tweaks.
On Thu, Mar 29, 2018 at 2:14 PM, Martin Liška wrote: > Hi. > > I'm sending couple of documentation tweaks that touch LTO, binutils > auto-load plugin > support and valid options of -march (on x86-64). @@ -9482,6 +9482,9 @@ types of hosts. The bytecode files are versioned and there is a strict version check, so bytecode files generated in one version of GCC do not work with an older or newer version of GCC. +Link-time optimization on ELF/DWARF systems works with generation of debugging +information. Other systems are currently not supported. + Link-time optimization does not work well with generation of debugging information. Combining @option{-flto} with @option{-g} is currently experimental and expected to produce unexpected this just adds to the current warning which doesn't look too useful. Maybe simply Index: doc/invoke.texi === --- doc/invoke.texi (revision 258915) +++ doc/invoke.texi (working copy) @@ -9483,9 +9483,8 @@ strict version check, so bytecode files GCC do not work with an older or newer version of GCC. Link-time optimization does not work well with generation of debugging -information. Combining @option{-flto} with -@option{-g} is currently experimental and expected to produce unexpected -results. +information on systems other than those using a combination of ELF and +DWARF. If you specify the optional @var{n}, the optimization and code generation done at link time is executed in parallel using @var{n} ? The other hunks look OK. Richard. > Martin > > gcc/ChangeLog: > > 2018-03-29 Martin Liska > > PR lto/84995. > * doc/invoke.texi: Document how LTO works with debug info. > Describe auto-load support of binutils. Mention 'x86-64' > as valid option value of -march option. > --- > gcc/doc/invoke.texi | 11 +++ > 1 file changed, 11 insertions(+) > >
Re: [PATCH] Documentation tweaks.
On 03/29/2018 02:56 PM, Richard Biener wrote: On Thu, Mar 29, 2018 at 2:14 PM, Martin Liška wrote: Hi. I'm sending couple of documentation tweaks that touch LTO, binutils auto-load plugin support and valid options of -march (on x86-64). @@ -9482,6 +9482,9 @@ types of hosts. The bytecode files are versioned and there is a strict version check, so bytecode files generated in one version of GCC do not work with an older or newer version of GCC. +Link-time optimization on ELF/DWARF systems works with generation of debugging +information. Other systems are currently not supported. + Link-time optimization does not work well with generation of debugging information. Combining @option{-flto} with @option{-g} is currently experimental and expected to produce unexpected this just adds to the current warning which doesn't look too useful. Maybe simply Index: doc/invoke.texi === --- doc/invoke.texi (revision 258915) +++ doc/invoke.texi (working copy) @@ -9483,9 +9483,8 @@ strict version check, so bytecode files GCC do not work with an older or newer version of GCC. Link-time optimization does not work well with generation of debugging -information. Combining @option{-flto} with -@option{-g} is currently experimental and expected to produce unexpected -results. +information on systems other than those using a combination of ELF and +DWARF. If you specify the optional @var{n}, the optimization and code generation done at link time is executed in parallel using @var{n} ? The other hunks look OK. Richard. Martin gcc/ChangeLog: 2018-03-29 Martin Liska PR lto/84995. * doc/invoke.texi: Document how LTO works with debug info. Describe auto-load support of binutils. Mention 'x86-64' as valid option value of -march option. --- gcc/doc/invoke.texi | 11 +++ 1 file changed, 11 insertions(+) Works for me, installed as r258953. Thanks for the correction. Martin
Re: [PATCH] i386: Enable AVX/AVX512 features only if supported by OSXSAVE
On Thu, Mar 29, 2018 at 2:43 PM, H.J. Lu wrote: > Enable AVX and AVX512 features only if their states are supported by > OSXSAVE. > > OK for trunk and release branches? > > > H.J. > --- > PR target/85100 > * config/i386/cpuinfo.c (XCR_XFEATURE_ENABLED_MASK): New. > (XSTATE_FP): Likewise. > (XSTATE_SSE): Likewise. > (XSTATE_YMM): Likewise. > (XSTATE_OPMASK): Likewise. > (XSTATE_ZMM): Likewise. > (XSTATE_HI_ZMM): Likewise. > (XCR_AVX_ENABLED_MASK): Likewise. > (XCR_AVX512F_ENABLED_MASK): Likewise. > (get_available_features): Enable AVX and AVX512 features only > if their states are supported by OSXSAVE OK for trunk and release branches after a couple of days without problems in trunk. Thanks, Uros. > --- > libgcc/config/i386/cpuinfo.c | 134 > +-- > 1 file changed, 90 insertions(+), 44 deletions(-) > > diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c > index 4eb3f5cd944..1dac110a79a 100644 > --- a/libgcc/config/i386/cpuinfo.c > +++ b/libgcc/config/i386/cpuinfo.c > @@ -240,6 +240,40 @@ get_available_features (unsigned int ecx, unsigned int > edx, >unsigned int features = 0; >unsigned int features2 = 0; > > + /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */ > +#define XCR_XFEATURE_ENABLED_MASK 0x0 > +#define XSTATE_FP 0x1 > +#define XSTATE_SSE 0x2 > +#define XSTATE_YMM 0x4 > +#define XSTATE_OPMASK 0x20 > +#define XSTATE_ZMM 0x40 > +#define XSTATE_HI_ZMM 0x80 > + > +#define XCR_AVX_ENABLED_MASK \ > + (XSTATE_SSE | XSTATE_YMM) > +#define XCR_AVX512F_ENABLED_MASK \ > + (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM) > + > + /* Check if AVX and AVX512 are usable. */ > + int avx_usable = 0; > + int avx512_usable = 0; > + if ((ecx & bit_OSXSAVE)) > +{ > + /* Check if XMM, YMM, OPMASK, upper 256 bits of ZMM0-ZMM15 and > + ZMM16-ZMM31 states are supported by OSXSAVE. */ > + unsigned int xcrlow; > + unsigned int xcrhigh; > + asm (".byte 0x0f, 0x01, 0xd0" > + : "=a" (xcrlow), "=d" (xcrhigh) > + : "c" (XCR_XFEATURE_ENABLED_MASK)); > + if ((xcrlow & XCR_AVX_ENABLED_MASK) == XCR_AVX_ENABLED_MASK) > + { > + avx_usable = 1; > + avx512_usable = ((xcrlow & XCR_AVX512F_ENABLED_MASK) > + == XCR_AVX512F_ENABLED_MASK); > + } > +} > + > #define set_feature(f) \ >if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32)) > > @@ -265,10 +299,13 @@ get_available_features (unsigned int ecx, unsigned int > edx, > set_feature (FEATURE_SSE4_1); >if (ecx & bit_SSE4_2) > set_feature (FEATURE_SSE4_2); > - if (ecx & bit_AVX) > -set_feature (FEATURE_AVX); > - if (ecx & bit_FMA) > -set_feature (FEATURE_FMA); > + if (avx_usable) > +{ > + if (ecx & bit_AVX) > + set_feature (FEATURE_AVX); > + if (ecx & bit_FMA) > + set_feature (FEATURE_FMA); > +} > >/* Get Advanced Features at level 7 (eax = 7, ecx = 0). */ >if (max_cpuid_level >= 7) > @@ -276,44 +313,50 @@ get_available_features (unsigned int ecx, unsigned int > edx, >__cpuid_count (7, 0, eax, ebx, ecx, edx); >if (ebx & bit_BMI) > set_feature (FEATURE_BMI); > - if (ebx & bit_AVX2) > - set_feature (FEATURE_AVX2); > + if (avx_usable) > + { > + if (ebx & bit_AVX2) > + set_feature (FEATURE_AVX2); > + } >if (ebx & bit_BMI2) > set_feature (FEATURE_BMI2); > - if (ebx & bit_AVX512F) > - set_feature (FEATURE_AVX512F); > - if (ebx & bit_AVX512VL) > - set_feature (FEATURE_AVX512VL); > - if (ebx & bit_AVX512BW) > - set_feature (FEATURE_AVX512BW); > - if (ebx & bit_AVX512DQ) > - set_feature (FEATURE_AVX512DQ); > - if (ebx & bit_AVX512CD) > - set_feature (FEATURE_AVX512CD); > - if (ebx & bit_AVX512PF) > - set_feature (FEATURE_AVX512PF); > - if (ebx & bit_AVX512ER) > - set_feature (FEATURE_AVX512ER); > - if (ebx & bit_AVX512IFMA) > - set_feature (FEATURE_AVX512IFMA); > - if (ecx & bit_AVX512VBMI) > - set_feature (FEATURE_AVX512VBMI); > - if (ecx & bit_AVX512VBMI2) > - set_feature (FEATURE_AVX512VBMI2); > - if (ecx & bit_GFNI) > - set_feature (FEATURE_GFNI); > - if (ecx & bit_VPCLMULQDQ) > - set_feature (FEATURE_VPCLMULQDQ); > - if (ecx & bit_AVX512VNNI) > - set_feature (FEATURE_AVX512VNNI); > - if (ecx & bit_AVX512BITALG) > - set_feature (FEATURE_AVX512BITALG); > - if (ecx & bit_AVX512VPOPCNTDQ) > - set_feature (FEATURE_AVX512VPOPCNTDQ); > - if (edx & bit_AVX5124VNNIW) > - set_feature (FEATURE_AVX5124VNNIW); > - if (edx & bit_AVX
Re: [PATCH 1/2] More underlining of bad arguments (PR c++/85110)
OK. On Wed, Mar 28, 2018 at 4:44 PM, David Malcolm wrote: > As of r256448, the C++ frontend underlines many bad arguments in its > diagnostics; those where perform_overload_resolution returns a > non-NULL candidate, but there's a failure in convert_like_real. > > However, for the case where perform_overload_resolution fails, but > there's a single non-viable candidate, the error is diagnosed by > cp_build_function_call_vec, and that currently doesn't underline > the bad argument: > > $ cat test.cc > void callee (int one, const char **two, int three); > > void > caller (const char *fmt) > { > callee (1, fmt, 3); > } > > We emit: > > $ g++ test.cc > test.cc: In function 'void caller(const char*)': > test.cc:6:20: error: cannot convert 'const char*' to 'const char**' for > argument '2' to 'void callee(int, const char**, int)' >callee (1, fmt, 3); > ^ > > It's going through convert_for_assignment, and > implicitly using input_location. > > This patch updates convert_for_assignment for this case, using > an EXPR_LOCATION if there is one, or falling back to input_location > otherwise, underlining the argument in question: > > test.cc: In function 'void caller(const char*)': > test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for > argument '2' to 'void callee(int, const char**, int)' >callee (1, fmt, 3); > ^~~ > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu; adds > 117 PASS results to g++.sum. > > Not a regression as such, but I've been calling out the underlined > arguments as a feature of gcc 8, so would be good to fix. > > OK for trunk? > > gcc/cp/ChangeLog: > PR c++/85110 > * typeck.c (convert_for_assignment): When complaining due to > conversions for an argument, attempt to use the location of the > argument. > > gcc/testsuite/ChangeLog: > PR c++/85110 > * g++.dg/diagnostic/param-type-mismatch-2.C: New test. > --- > gcc/cp/typeck.c| 5 +- > .../g++.dg/diagnostic/param-type-mismatch-2.C | 175 > + > 2 files changed, 178 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C > > diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c > index 0c2ebd1..e733c79 100644 > --- a/gcc/cp/typeck.c > +++ b/gcc/cp/typeck.c > @@ -8781,8 +8781,9 @@ convert_for_assignment (tree type, tree rhs, >parmnum, complain, flags); > } > else if (fndecl) > - error ("cannot convert %qH to %qI for argument %qP to %qD", > - rhstype, type, parmnum, fndecl); > + error_at (EXPR_LOC_OR_LOC (rhs, input_location), > + "cannot convert %qH to %qI for argument %qP to %qD", > + rhstype, type, parmnum, fndecl); > else > switch (errtype) > { > diff --git a/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C > b/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C > new file mode 100644 > index 000..ae84248 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C > @@ -0,0 +1,175 @@ > +// { dg-options "-fdiagnostics-show-caret" } > + > +/* A collection of calls where argument 2 is of the wrong type. */ > + > +/* decl, with argname. */ > + > +extern int callee_1 (int one, const char **two, float three); > + > +int test_1 (int first, const char *second, float third) > +{ > + return callee_1 (first, second, third); // { dg-error "27: cannot convert > 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int > callee_1\\(int, const char\\*\\*, float\\)'" } > + /* { dg-begin-multiline-output "" } > + return callee_1 (first, second, third); > + ^~ > + { dg-end-multiline-output "" } */ > +} > + > +/* decl, without argname. */ > + > +extern int callee_2 (int, const char **, float); > + > +int test_2 (int first, const char *second, float third) > +{ > + return callee_2 (first, second, third); // { dg-error "27: cannot convert > 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int > callee_2\\(int, const char\\*\\*, float\\)'" } > + /* { dg-begin-multiline-output "" } > + return callee_2 (first, second, third); > + ^~ > + { dg-end-multiline-output "" } */ > +} > + > +/* defn, with argname. */ > + > +static int callee_3 (int one, const char **two, float three) > +{ > + return callee_2 (one, two, three); > +} > + > +int test_3 (int first, const char *second, float third) > +{ > + return callee_3 (first, second, third); // { dg-error "27: cannot convert > 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int > callee_3\\(int, const char\\*\\*, float\\)'" } > + /* { dg-begin-multiline-output "" } > + return callee_3 (first, second, third); > +
Re: [PATCH 2/2] Show pertinent parameter (PR c++/85110)
On Wed, Mar 28, 2018 at 4:44 PM, David Malcolm wrote: > This followup patch updates the specific error-handling path > to add a note showing the pertinent parameter decl, taking > the output from: > > test.cc: In function 'void caller(const char*)': > test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for > argument '2' to 'void callee(int, const char**, int)' >callee (1, fmt, 3); > ^~~ > > to: > > test.cc: In function 'void caller(const char*)': > test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for > argument '2' to 'void callee(int, const char**, int)' >callee (1, fmt, 3); > ^~~ > test.cc:1:36: note: initializing argument 2 of 'void callee(int, const > char**, int)' > void callee (int one, const char **two, int three); >~^~~ > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu; adds > a further 18 PASS results to g++.sum. > > Again, not a regression as such, but I've been calling out the underlined > arguments as a feature of gcc 8, so would be good to fix. > > OK for trunk? > > gcc/cp/ChangeLog: > PR c++/85110 > * call.c (get_fndecl_argument_location): Make non-static. > * cp-tree.h (get_fndecl_argument_location): New decl. > * typeck.c (convert_for_assignment): When complaining due to > conversions for an argument, show the location of the parameter > within the decl. > > gcc/testsuite/ChangeLog: > PR c++/85110 > * g++.dg/diagnostic/param-type-mismatch-2.C: Update for the cases > where we now show the pertinent parameter. > --- > gcc/cp/call.c | 2 +- > gcc/cp/cp-tree.h| 2 ++ > gcc/cp/typeck.c | 10 +++--- > .../g++.dg/diagnostic/param-type-mismatch-2.C | 21 > ++--- > 4 files changed, 28 insertions(+), 7 deletions(-) > > diff --git a/gcc/cp/call.c b/gcc/cp/call.c > index 1a87f99..e1a0639 100644 > --- a/gcc/cp/call.c > +++ b/gcc/cp/call.c > @@ -6598,7 +6598,7 @@ maybe_print_user_conv_context (conversion *convs) > ARGNUM is zero based, -1 indicates the `this' argument of a method. > Return the location of the FNDECL itself if there are problems. */ > > -static location_t > +location_t > get_fndecl_argument_location (tree fndecl, int argnum) > { >int i; > diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h > index d5382c2..b45880d 100644 > --- a/gcc/cp/cp-tree.h > +++ b/gcc/cp/cp-tree.h > @@ -5965,6 +5965,8 @@ extern bool can_convert_arg > (tree, tree, tree, int, > tsubst_flags_t); > extern bool can_convert_arg_bad(tree, tree, tree, > int, > tsubst_flags_t); > +extern location_t get_fndecl_argument_location (tree, int); > + > > /* A class for recording information about access failures (e.g. private > fields), so that we can potentially supply a fix-it hint about > diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c > index e733c79..742b2e9 100644 > --- a/gcc/cp/typeck.c > +++ b/gcc/cp/typeck.c > @@ -8781,9 +8781,13 @@ convert_for_assignment (tree type, tree rhs, >parmnum, complain, flags); > } > else if (fndecl) > - error_at (EXPR_LOC_OR_LOC (rhs, input_location), > - "cannot convert %qH to %qI for argument %qP to %qD", > - rhstype, type, parmnum, fndecl); > + { > + error_at (EXPR_LOC_OR_LOC (rhs, input_location), > + "cannot convert %qH to %qI for argument %qP to > %qD", > + rhstype, type, parmnum, fndecl); > + inform (get_fndecl_argument_location (fndecl, parmnum), > + " initializing argument %P of %qD", parmnum, > fndecl); If you're adding the inform, you don't need the %P of %D in the initial error. Jason
[PATCH] PR libgcc/60790: Avoid IFUNC resolver access to uninitialized data
This patch performs lazy initialization of the relevant CPUID feature register value. It will needlessly invoke the CPUID determination code on architectures which lack CPUID support or support for the feature register, but I think it's not worth to avoid the complexity for that. I verified manually that the CMPXCHG16B implementation is still selected for a 128-bit load after the change. I don't know how to write an automated test for that. Thanks, Florian Index: libatomic/ChangeLog === --- libatomic/ChangeLog (revision 258952) +++ libatomic/ChangeLog (working copy) @@ -1,3 +1,18 @@ +2018-03-29 Florian Weimer + + PR libgcc/60790 + x86: Do not assume ELF constructors run before IFUNC resolvers. + * config/x86/host-config.h (libat_feat1_ecx, libat_feat1_edx): + Remove declarations. + (__libat_feat1, __libat_feat1_init): Declare. + (FEAT1_REGISTER): Define. + (load_feat1): New function. + (IFUNC_COND_1): Adjust. + * config/x86/init.c (libat_feat1_ecx, libat_feat1_edx) + (init_cpuid): Remove definitions. + (__libat_feat1): New variable. + (__libat_feat1_init): New function. + 2018-03-09 Andreas Krebbel * config/s390/exch_n.c: New file. Index: libatomic/config/x86/host-config.h === --- libatomic/config/x86/host-config.h (revision 258952) +++ libatomic/config/x86/host-config.h (working copy) @@ -25,13 +25,39 @@ #if HAVE_IFUNC #include -extern unsigned int libat_feat1_ecx HIDDEN; -extern unsigned int libat_feat1_edx HIDDEN; +#ifdef __x86_64__ +# define FEAT1_REGISTER ecx +#else +# define FEAT1_REGISTER edx +#endif +/* Value of the CPUID feature FEAT1_REGISTER for the cmpxchg bit for + IFUNC_COND1 below. */ +extern unsigned int __libat_feat1 HIDDEN; + +/* Initialize libat_feat1 and return its value. */ +unsigned int __libat_feat1_init (void) HIDDEN; + +/* Return the value of the relevant feature register for the relevant + cmpxchg bit, or 0 if there is no CPUID support. */ +static inline unsigned int +__attribute__ ((const)) +load_feat1 (void) +{ + /* Synchronizes with the store in __libat_feat1_init. */ + unsigned int feat1 = __atomic_load_n (&__libat_feat1, __ATOMIC_RELAXED); + if (feat1 == 0) +/* Assume that initialization has not happened yet. This may get + called repeatedly if the CPU does not have any feature bits at + all. */ +feat1 = __libat_feat1_init (); + return feat1; +} + #ifdef __x86_64__ -# define IFUNC_COND_1 (libat_feat1_ecx & bit_CMPXCHG16B) +# define IFUNC_COND_1 (load_feat1 () & bit_CMPXCHG16B) #else -# define IFUNC_COND_1 (libat_feat1_edx & bit_CMPXCHG8B) +# define IFUNC_COND_1 (load_feat1 () & bit_CMPXCHG8B) #endif #ifdef __x86_64__ Index: libatomic/config/x86/init.c === --- libatomic/config/x86/init.c (revision 258952) +++ libatomic/config/x86/init.c (working copy) @@ -26,13 +26,17 @@ #if HAVE_IFUNC -unsigned int libat_feat1_ecx, libat_feat1_edx; +unsigned int __libat_feat1; -static void __attribute__((constructor)) -init_cpuid (void) +unsigned int +__libat_feat1_init (void) { - unsigned int eax, ebx; - __get_cpuid (1, &eax, &ebx, &libat_feat1_ecx, &libat_feat1_edx); + unsigned int eax, ebx, ecx, edx; + FEAT1_REGISTER = 0; + __get_cpuid (1, &eax, &ebx, &ecx, &edx); + /* Synchronizes with the load in load_feat1. */ + __atomic_store_n (&__libat_feat1, FEAT1_REGISTER, __ATOMIC_RELAXED); + return FEAT1_REGISTER; } #endif /* HAVE_IFUNC */
Re: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]
On Thu, Nov 16, 2017 at 03:27:01PM +, Tamar Christina wrote: > --- a/gcc/expr.c > +++ b/gcc/expr.c > @@ -2769,7 +2769,9 @@ copy_blkmode_to_reg (machine_mode mode, tree src) > >n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD; >dst_words = XALLOCAVEC (rtx, n_regs); > - bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > + bitsize = BITS_PER_WORD; > + if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE > (src > +bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > >/* Copy the structure BITSIZE bits at a time. */ >for (bitpos = 0, xbitpos = padding_correction; I believe this patch is wrong. Please revert. See the PR84762 testcase. There are two problems. Firstly, if padding_correction is non-zero, then xbitpos % BITS_PER_WORD is non-zero and in store_bit_field (dst_word, bitsize, xbitpos % BITS_PER_WORD, 0, 0, word_mode, extract_bit_field (src_word, bitsize, bitpos % BITS_PER_WORD, 1, NULL_RTX, word_mode, word_mode, false, NULL), false); the stored bit-field exceeds the destination register size. You could fix that by making bitsize the gcd of bitsize and padding_correction. However, that doesn't fix the second problem which is that the extracted bit-field can exceed the source size. That will result in rubbish being read into a register. -- Alan Modra Australia Development Lab, IBM
[PATCH] use warning_n instead of _at (PR 84818)
The attached patch replaces a couple of uses of warning_at in builtins.c with the more appropriate warning_n calls as suggested in the bug. The change passes regression testing on x86_64-linux. I'll go ahead and commit it as obvious in the next day or so unless there are objections. Thanks Martin PR tree-optimization/84818 - integer_onep must not be used for i18n gcc/ChangeLog: PR tree-optimization/84818 * builtins.c (check_access): Use warning_n. Index: gcc/builtins.c === --- gcc/builtins.c (revision 258941) +++ gcc/builtins.c (working copy) @@ -3172,13 +3172,12 @@ check_access (tree exp, tree, tree, tree dstwrite, exp, func, range[0], dstsize); } else if (tree_int_cst_equal (range[0], range[1])) - warning_at (loc, opt, - (integer_onep (range[0]) - ? G_("%K%qD writing %E byte into a region " - "of size %E overflows the destination") - : G_("%K%qD writing %E bytes into a region " - "of size %E overflows the destination")), - exp, func, range[0], dstsize); + warning_n (loc, opt, tree_to_uhwi (range[0]), + "%K%qD writing %E byte into a region " + "of size %E overflows the destination", + "%K%qD writing %E bytes into a region " + "of size %E overflows the destination", + exp, func, range[0], dstsize); else if (tree_int_cst_sign_bit (range[1])) { /* Avoid printing the upper bound if it's invalid. */ @@ -3273,10 +3272,9 @@ check_access (tree exp, tree, tree, tree dstwrite, location_t loc = tree_nonartificial_location (exp); if (tree_int_cst_equal (range[0], range[1])) - warning_at (loc, opt, - (tree_int_cst_equal (range[0], integer_one_node) - ? G_("%K%qD reading %E byte from a region of size %E") - : G_("%K%qD reading %E bytes from a region of size %E")), + warning_n (loc, opt, tree_to_uhwi (range[0]), + "%K%qD reading %E byte from a region of size %E", + "%K%qD reading %E bytes from a region of size %E", exp, func, range[0], slen); else if (tree_int_cst_sign_bit (range[1])) {
[PATCH, rs6000] Tidy implementation of vec_ldl
During code review, an inconsistency was noticed in some of the prototypes defined for the vec_ldl built-in function. In particular, the vector fetched from an address declare to be long long * was returned as "vector int". In addressing this problem, certain other inconsistencies and omissions were discovered. This patch tidies up the implementation of this function. A separate patch is in preparation to address the documentation for this and all other PowerPC built-in functions. In summary, this patch removes two prototypes: vector int vec_ldl (int, long int *) vector unsigned int vec_ldl (int, unsigned long int *) and adds eight: vector bool char vec_ldl (int, bool char *) vector bool short vec_ldl (int, bool short *) vector bool int vec_ldl (int, bool int *) vector bool long long vec_ldl (int, bool long long *) vector pixel vec_ldl (int, pixel *) vector long long vec_ldl (int, long long *) vector unsigned long long vec_ldl (int, unsigned long long *) This patch has been bootstrapped and tested without regressions on both powerpc64le-unknown-linux (P8) and on powerpc-linux (P7 big-endian, with both -m32 and -m64 target options). Is this ok for trunk? gcc/ChangeLog: 2018-03-29 Kelvin Nilsen * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove erroneous entries for "vector int vec_ldl (int, long int *)", and "vector unsigned int vec_ldl (int, unsigned long int *)". Add comments and entries for "vector bool char vec_ldl (int, bool char *)", "vector bool short vec_ldl (int, bool short *)", "vector bool int vec_ldl (int, bool int *)", "vector bool long long vec_ldl (int, bool long long *)", "vector pixel vec_ldl (int, pixel *)", "vector long long vec_ldl (int, long long *)", "vector unsigned long long vec_ldl (int, unsigned long long *)". * config/rs6000/rs6000.c (rs6000_init_builtins): Initialize new type tree bool_long_long_type_node and correct definition of bool_V2DI_type_node to make reference to this new type tree. (rs6000_mangle_type): Replace erroneous reference to bool_long_type_node with bool_long_long_type_node. * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add comments to emphasize sign distinctions for char and int types and replace RS6000_BTI_bool_long constant with RS6000_BTI_bool_long_long constant. (bool_long_type_node): Remove this macro definition. (bool_long_long_type_node): New macro definition gcc/testsuite/ChangeLog: 2018-03-29 Kelvin Nilsen * gcc.target/powerpc/vec-ldl-1.c: New test. * gcc.dg/vmx/ops-long-1.c: Correct test programs to reflect corrections to ABI implementation. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 258800) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -16947,7 +16947,7 @@ rs6000_init_builtins (void) bool_char_type_node = build_distinct_type_copy (unsigned_intQI_type_node); bool_short_type_node = build_distinct_type_copy (unsigned_intHI_type_node); bool_int_type_node = build_distinct_type_copy (unsigned_intSI_type_node); - bool_long_type_node = build_distinct_type_copy (unsigned_intDI_type_node); + bool_long_long_type_node = build_distinct_type_copy (unsigned_intDI_type_node); pixel_type_node = build_distinct_type_copy (unsigned_intHI_type_node); long_integer_type_internal_node = long_integer_type_node; @@ -17064,7 +17064,7 @@ rs6000_init_builtins (void) bool_V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64 ? "__vector __bool long" : "__vector __bool long long", - bool_long_type_node, 2); + bool_long_long_type_node, 2); pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel", pixel_type_node, 8); @@ -32855,7 +32855,7 @@ rs6000_mangle_type (const_tree type) if (type == bool_short_type_node) return "U6__bools"; if (type == pixel_type_node) return "u7__pixel"; if (type == bool_int_type_node) return "U6__booli"; - if (type == bool_long_type_node) return "U6__booll"; + if (type == bool_long_long_type_node) return "U6__booll"; /* Use a unique name for __float128 rather than trying to use "e" or "g". Use "g" for IBM extended double, no matter whether it is long double (using Index: gcc/config/rs6000/rs6000.h === --- gcc/config/rs6000/rs6000.h (revision 258800) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -2578,7 +2578,7 @@ enum rs6000_builtin_type_index RS6000_BTI_opaque_V2SF, RS6000_BTI_opaque_p_V2SI, RS6000_BTI_opaque_V4SI, - RS6000_BTI_V16QI, + RS6000_BTI_V16QI, /* __vector signed char */ RS6000_BTI_V1TI, RS6000_BTI_V2SI, RS6000_BTI_V2SF, @@ -2588,7 +2588,7 @@ enum rs6000_builtin_type_index RS6000_BTI_
Re: [Aarch64] Fix conditional branches with target far away.
Hi Sameera On 29/03/18 11:44, Sameera Deshpande wrote: Hi Sudakshina, Thanks for pointing that out. Updated the conditions for attribute length to take care of boundary conditions for offset range. Please find attached the updated patch. I have tested it for gcc testsuite and the failing testcase. Ok for trunk? Thank you so much for fixing the length as well along with you patch. You mention a failing testcase? Maybe it would be helpful to add that to the patch for the gcc testsuite. Sudi On 22 March 2018 at 19:06, Sudakshina Das wrote: Hi Sameera On 22/03/18 02:07, Sameera Deshpande wrote: Hi Sudakshina, As per the ARMv8 ARM, for the offset range (-1048576 ,1048572), the far branch instruction offset is inclusive of both the offsets. Hence, I am using <=||=> and not <||>= as it was in previous implementation. I have to admit earlier I was only looking at the patch mechanically and found a difference with the previous implementation in offset comparison. After you pointed out, I looked up the ARMv8 ARM and I have a couple of doubts: 1. My understanding is that any offset in [-1048576 ,1048572] both inclusive qualifies as an 'in range' offset. However, the code for both attribute length and far_branch has been using [-1048576 ,1048572), that is, ( >= && < ). If the far_branch was incorrectly calculated, then maybe the length calculations with similar magic numbers should also be corrected? Of course, I am not an expert in this and maybe this was a conscience decision so I would ask Ramana to maybe clarify if he remembers. 2. Now to come back to your patch, if my understanding is correct, I think a far_branch would be anything outside of this range, that is, (offset < -1048576 || offset > 1048572), anything that can not be represented in the 21-bit range. Thanks Sudi On 16 March 2018 at 00:51, Sudakshina Das wrote: On 15/03/18 15:27, Sameera Deshpande wrote: Ping! On 28 February 2018 at 16:18, Sameera Deshpande wrote: On 27 February 2018 at 18:25, Ramana Radhakrishnan wrote: On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande wrote: Hi! Please find attached the patch to fix bug in branches with offsets over 1MiB. There has been an attempt to fix this issue in commit 050af05b9761f1979f11c151519e7244d5becd7c However, the far_branch attribute defined in above patch used insn_length - which computes incorrect offset. Hence, eliminated the attribute completely, and computed the offset from insn_addresses instead. Ok for trunk? gcc/Changelog 2018-02-13 Sameera Deshpande * config/aarch64/aarch64.md (far_branch): Remove attribute. Eliminate all the dependencies on the attribute from RTL patterns. I'm not a maintainer but this looks good to me modulo notes about how this was tested. What would be nice is a testcase for the testsuite as well as ensuring that the patch has been bootstrapped and regression tested. AFAIR, the original patch was put in because match.pd failed when bootstrap in another context. regards Ramana -- - Thanks and regards, Sameera D. The patch is tested with GCC testsuite and bootstrapping successfully. Also tested for spec benchmark. I am not a maintainer either. I noticed that the range check you do for the offset has a (<= || >=). The "far_branch" however did (< || >=) for a positive value. Was that also part of the incorrect offset calculation? @@ -692,7 +675,11 @@ { if (get_attr_length (insn) =3D=3D 8) { - if (get_attr_far_branch (insn) =3D=3D 1) + long long int offset; + offset =3D INSN_ADDRESSES (INSN_UID (XEXP (operands[2], 0))) + - INSN_ADDRESSES (INSN_UID (insn)); + + if (offset <=3D -1048576 || offset >=3D 1048572) return aarch64_gen_far_branch (operands, 2, "Ltb", "\\t%0, %1, "); else @@ -709,12 +696,7 @@ (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -32768)) (lt (minus (match_dup 2) (pc)) (const_int 32764))) (const_int 4) - (const_int 8))) - (set (attr "far_branch") - (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576)) - (lt (minus (match_dup 2) (pc)) (const_int 1048572))) - (const_int 0) - (const_int 1)))] + (const_int 8)))] ) Thanks Sudi -- - Thanks and regards, Sameera D.
Re: [Aarch64] Fix conditional branches with target far away.
Hi Sudakshina, That testcase cannot be addwd as of now, as it needs approval from client. On Thu 29 Mar, 2018, 9:01 PM Sudakshina Das, wrote: > Hi Sameera > > On 29/03/18 11:44, Sameera Deshpande wrote: > > Hi Sudakshina, > > > > Thanks for pointing that out. Updated the conditions for attribute > > length to take care of boundary conditions for offset range. > > > > Please find attached the updated patch. > > > > I have tested it for gcc testsuite and the failing testcase. Ok for > trunk? > > Thank you so much for fixing the length as well along with you patch. > You mention a failing testcase? Maybe it would be helpful to add that > to the patch for the gcc testsuite. > > Sudi > > > > > On 22 March 2018 at 19:06, Sudakshina Das wrote: > >> Hi Sameera > >> > >> On 22/03/18 02:07, Sameera Deshpande wrote: > >>> > >>> Hi Sudakshina, > >>> > >>> As per the ARMv8 ARM, for the offset range (-1048576 ,1048572), the > >>> far branch instruction offset is inclusive of both the offsets. Hence, > >>> I am using <=||=> and not <||>= as it was in previous implementation. > >> > >> > >> I have to admit earlier I was only looking at the patch mechanically and > >> found a difference with the previous implementation in offset > comparison. > >> After you pointed out, I looked up the ARMv8 ARM and I have a couple of > >> doubts: > >> > >> 1. My understanding is that any offset in [-1048576 ,1048572] both > inclusive > >> qualifies as an 'in range' offset. However, the code for both attribute > >> length and far_branch has been using [-1048576 ,1048572), that is, ( >= > && < > >> ). If the far_branch was incorrectly calculated, then maybe the length > >> calculations with similar magic numbers should also be corrected? Of > course, > >> I am not an expert in this and maybe this was a conscience decision so I > >> would ask Ramana to maybe clarify if he remembers. > >> > >> 2. Now to come back to your patch, if my understanding is correct, I > think a > >> far_branch would be anything outside of this range, that is, > >> (offset < -1048576 || offset > 1048572), anything that can not be > >> represented in the 21-bit range. > >> > >> Thanks > >> Sudi > >> > >> > >>> > >>> On 16 March 2018 at 00:51, Sudakshina Das wrote: > > On 15/03/18 15:27, Sameera Deshpande wrote: > > > > > > Ping! > > > > On 28 February 2018 at 16:18, Sameera Deshpande > > wrote: > >> > >> > >> On 27 February 2018 at 18:25, Ramana Radhakrishnan > >> wrote: > >>> > >>> > >>> On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande > >>> wrote: > > > Hi! > > Please find attached the patch to fix bug in branches with offsets > over > 1MiB. > There has been an attempt to fix this issue in commit > 050af05b9761f1979f11c151519e7244d5becd7c > > However, the far_branch attribute defined in above patch used > insn_length - which computes incorrect offset. Hence, eliminated > the > attribute completely, and computed the offset from insn_addresses > instead. > > Ok for trunk? > > gcc/Changelog > > 2018-02-13 Sameera Deshpande > * config/aarch64/aarch64.md (far_branch): Remove > attribute. > Eliminate > all the dependencies on the attribute from RTL > patterns. > > >>> > >>> I'm not a maintainer but this looks good to me modulo notes about > how > >>> this was tested. What would be nice is a testcase for the > testsuite as > >>> well as ensuring that the patch has been bootstrapped and > regression > >>> tested. AFAIR, the original patch was put in because match.pd > failed > >>> when bootstrap in another context. > >>> > >>> > >>> regards > >>> Ramana > >>> > -- > - Thanks and regards, > Sameera D. > >> > >> > >> > >> The patch is tested with GCC testsuite and bootstrapping > successfully. > >> Also tested for spec benchmark. > >> > > I am not a maintainer either. I noticed that the range check you do > for > the offset has a (<= || >=). The "far_branch" however did (< || >=) > for a > positive value. Was that also part of the incorrect offset > calculation? > > @@ -692,7 +675,11 @@ > { > if (get_attr_length (insn) =3D=3D 8) > { > - if (get_attr_far_branch (insn) =3D=3D 1) > + long long int offset; > + offset =3D INSN_ADDRESSES (INSN_UID (XEXP (operands[2], 0))) > + - INSN_ADDRESSES (INSN_UID (insn)); > + > + if (offset <=3D -1048576 || offset >=3D 1048572) > return aarch64_gen_far_branch (operands, 2, "Ltb", > "\\t%0, %1, "); > else > @@ -7
[committed] Add testcase for already fixed -Wunused-but-set-variable PR c++/85108
Hi! This got fixed with r253266, but I don't see it covered with a testcase. Tested on x86_64-linux and i686-linux, committed to trunk as obvious. 2018-03-29 Jakub Jelinek PR c++/85108 * g++.dg/warn/Wunused-var-31.C: New test. --- gcc/testsuite/g++.dg/warn/Wunused-var-31.C.jj 2018-03-29 13:15:26.433012694 +0200 +++ gcc/testsuite/g++.dg/warn/Wunused-var-31.C 2018-03-29 13:09:48.184877285 +0200 @@ -0,0 +1,11 @@ +// PR c++/85108 +// { dg-do compile { target c++17 } } +// { dg-options "-Wunused-but-set-variable" } + +int +main () +{ + auto constexpr add = [] (auto a, auto b) { return a + b; }; // { dg-bogus "set but not used" } + auto test_lambda = [&] () { return add (2, 2); }; + return test_lambda () - 4; +} Jakub
[x86] Skylake tuning options
Hi, This patch adds 2 tuning options for -mtune=skylake-avx512. Ok for trunk? gcc/ * x86-tune.def (movx, partial_reg_dependency): Enable for m_SKYLAKE_AVX512. Thanks, Julia 0001-ptc.patch Description: 0001-ptc.patch
Re: [Aarch64] Fix conditional branches with target far away.
Hi Sameera, On 29/03/18 11:44, Sameera Deshpande wrote: Hi Sudakshina, Thanks for pointing that out. Updated the conditions for attribute length to take care of boundary conditions for offset range. Please find attached the updated patch. I have tested it for gcc testsuite and the failing testcase. Ok for trunk? On 22 March 2018 at 19:06, Sudakshina Das wrote: > Hi Sameera > > On 22/03/18 02:07, Sameera Deshpande wrote: >> >> Hi Sudakshina, >> >> As per the ARMv8 ARM, for the offset range (-1048576 ,1048572), the >> far branch instruction offset is inclusive of both the offsets. Hence, >> I am using <=||=> and not <||>= as it was in previous implementation. > > > I have to admit earlier I was only looking at the patch mechanically and > found a difference with the previous implementation in offset comparison. > After you pointed out, I looked up the ARMv8 ARM and I have a couple of > doubts: > > 1. My understanding is that any offset in [-1048576 ,1048572] both inclusive > qualifies as an 'in range' offset. However, the code for both attribute > length and far_branch has been using [-1048576 ,1048572), that is, ( >= && < > ). If the far_branch was incorrectly calculated, then maybe the length > calculations with similar magic numbers should also be corrected? Of course, > I am not an expert in this and maybe this was a conscience decision so I > would ask Ramana to maybe clarify if he remembers. > > 2. Now to come back to your patch, if my understanding is correct, I think a > far_branch would be anything outside of this range, that is, > (offset < -1048576 || offset > 1048572), anything that can not be > represented in the 21-bit range. > > Thanks > Sudi > > >> >> On 16 March 2018 at 00:51, Sudakshina Das wrote: >>> >>> On 15/03/18 15:27, Sameera Deshpande wrote: Ping! On 28 February 2018 at 16:18, Sameera Deshpande wrote: > > > On 27 February 2018 at 18:25, Ramana Radhakrishnan > wrote: >> >> >> On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande >> wrote: >>> >>> >>> Hi! >>> >>> Please find attached the patch to fix bug in branches with offsets >>> over >>> 1MiB. >>> There has been an attempt to fix this issue in commit >>> 050af05b9761f1979f11c151519e7244d5becd7c >>> >>> However, the far_branch attribute defined in above patch used >>> insn_length - which computes incorrect offset. Hence, eliminated the >>> attribute completely, and computed the offset from insn_addresses >>> instead. >>> >>> Ok for trunk? >>> >>> gcc/Changelog >>> >>> 2018-02-13 Sameera Deshpande >>> * config/aarch64/aarch64.md (far_branch): Remove attribute. >>> Eliminate >>> all the dependencies on the attribute from RTL patterns. >>> I'd list all the patterns affected by the removal. There's not many of them and it makes checking the impact of the patch simpler. >> >> I'm not a maintainer but this looks good to me modulo notes about how >> this was tested. What would be nice is a testcase for the testsuite as >> well as ensuring that the patch has been bootstrapped and regression >> tested. AFAIR, the original patch was put in because match.pd failed >> when bootstrap in another context. >> >> >> regards >> Ramana >> >>> -- >>> - Thanks and regards, >>> Sameera D. > > > > The patch is tested with GCC testsuite and bootstrapping successfully. > Also tested for spec benchmark. > >>> >>> I am not a maintainer either. I noticed that the range check you do for >>> the offset has a (<= || >=). The "far_branch" however did (< || >=) for a >>> positive value. Was that also part of the incorrect offset calculation? >>> I'm not a maintainer, but this looks like a good improvement to me, and is more readable to boot! Getting some kind of reduced testcase (perhaps with the creduce tool, or derived from base principles) would be very valuable, but I guess it shouldn't be a blocker for this patch. I've got a couple of nits inline. Index: gcc/config/aarch64/aarch64.md === --- gcc/config/aarch64/aarch64.md (revision 258946) +++ gcc/config/aarch64/aarch64.md (working copy) @@ -264,13 +264,6 @@ (const_string "no") ] (const_string "yes"))) -;; Attribute that specifies whether we are dealing with a branch to a -;; label that is far away, i.e. further away than the maximum/minimum -;; representable in a signed 21-bits number. -;; 0 :=: no -;; 1 :=: yes -(define_attr "far_branch" "" (const_int 0)) - ;; Strictly for compatibility with AArch32 in pipeline models, since AArch64 has ;; no predicated insns. (define_attr "predicated" "yes,no" (const_string "no")) @@ -466,14 +459,9 @@ [(set_attr "type" "branch") (set (attr "length") (if_then_else (and (ge (minus (
Re: [x86] Skylake tuning options
On Thu, Mar 29, 2018 at 5:40 PM, Koval, Julia wrote: > Hi, > > This patch adds 2 tuning options for -mtune=skylake-avx512. Ok for trunk? > > gcc/ > * x86-tune.def (movx, partial_reg_dependency): Enable for > m_SKYLAKE_AVX512. OK. Thanks, Uros.
Re: [x86] Skylake tuning options
On Thu, Mar 29, 2018 at 8:54 AM, Uros Bizjak wrote: > On Thu, Mar 29, 2018 at 5:40 PM, Koval, Julia wrote: >> Hi, >> >> This patch adds 2 tuning options for -mtune=skylake-avx512. Ok for trunk? >> >> gcc/ >> * x86-tune.def (movx, partial_reg_dependency): Enable for >> m_SKYLAKE_AVX512. > > OK. This is: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84413 Please add "PR target/84413" in ChangeLog entry. -- H.J.
[patch, fortran] Fix PR 85111, some more zero size stuff
Hello world, the attached patch fixes PR 85111, a regression introduced with my recent simplification patches. Seems like the zero-size saga has yet another chapter :-) Regression-tested. OK for trunk? Regards Thomas 2017-03-29 Thomas Koenig PR fortran/85111 * array.c (gfc_resolve_character_array_constructor): Early exit for zero-size arrays. * simplify.c (simplify_transformation_to_array): Exit early if the result size is zero. (simplify_minmaxloc_to_array): Likewise. 2017-03-29 Thomas Koenig PR fortran/85111 * gfortran.dg/zero_sized_10.f90: New test. Index: array.c === --- array.c (revision 258845) +++ array.c (working copy) @@ -2003,6 +2003,20 @@ gfc_resolve_character_array_constructor (gfc_expr got_charlen: + /* Early exit for zero size arrays. */ + if (expr->shape) +{ + mpz_t size; + HOST_WIDE_INT arraysize; + + gfc_array_size (expr, &size); + arraysize = mpz_get_ui (size); + mpz_clear (size); + + if (arraysize == 0) + return true; +} + found_length = -1; if (expr->ts.u.cl->length == NULL) Index: simplify.c === --- simplify.c (revision 258845) +++ simplify.c (working copy) @@ -627,7 +627,7 @@ simplify_transformation_to_array (gfc_expr *result n += 1; } - done = false; + done = resultsize <= 0; base = arrayvec; dest = resultvec; while (!done) @@ -5304,7 +5304,7 @@ simplify_minmaxloc_to_array (gfc_expr *result, gfc n += 1; } - done = false; + done = resultsize <= 0; base = arrayvec; dest = resultvec; while (!done) ! { dg-do compile } ! { PR 85111 - this used to ICE. } ! Original test case by Gernhard Steinmetz. program p integer, parameter :: a(2,0) = reshape([1,2,3,4], shape(a)) character, parameter :: ac(2,0) = reshape(['a','b','c','d'], shape(ac)) integer, parameter :: b(2) = maxloc(a, dim=1) ! { dg-error "Different shape" } integer, parameter :: c(2) = minloc(a, dim=1) ! { dg-error "Different shape" } character, parameter :: d(2) = maxval(ac, dim=1) ! { dg-error "Different shape" } end program p
patch for PR84985
The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84985 The patch was successfully bootstrapped and tested on x86-64. Committed as rev. 258961. Index: ChangeLog === --- ChangeLog (revision 258960) +++ ChangeLog (working copy) @@ -1,3 +1,9 @@ +2018-03-29 Vladimir Makarov + + PR inline-asm/84985 + * lra-constraints.c (process_alt_operands): Move setting + this_alternative_matches below. + 2018-03-29 Martin Liska PR lto/84995. Index: testsuite/ChangeLog === --- testsuite/ChangeLog (revision 258960) +++ testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2018-03-29 Vladimir Makarov + + PR inline-asm/84985 + * gcc.target/i386/pr84985.c: New. + 2018-03-29 David Malcolm PR c++/85110 Index: lra-constraints.c === --- lra-constraints.c (revision 258820) +++ lra-constraints.c (working copy) @@ -2126,7 +2126,6 @@ process_alt_operands (int only_alternati && curr_operand_mode[m] != curr_operand_mode[nop]) break; - this_alternative_matches = m; m_hregno = get_hard_regno (*curr_id->operand_loc[m], false); /* We are supposed to match a previous operand. If we do, we win if that one did. If we do @@ -2228,6 +2227,7 @@ process_alt_operands (int only_alternati else did_match = true; + this_alternative_matches = m; /* This can be fixed with reloads if the operand we are supposed to match can be fixed with reloads. */ Index: testsuite/gcc.target/i386/pr84985.c === --- testsuite/gcc.target/i386/pr84985.c (nonexistent) +++ testsuite/gcc.target/i386/pr84985.c (working copy) @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-O0" } */ +int main() { + int a; + asm("" : "=d"(a) : "0"(a), "0ae"(&a)); +}
Re: [PATCH 1/4] More #include suggestions (PR c++/84269)
OK. On Thu, Mar 22, 2018 at 7:44 PM, David Malcolm wrote: > PR c++/84269 reports a number of names in the C and C++ standard > libraries for which we don't yet offer #include fix-it hints. > > This patch adds them (up to comment #9). > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > OK for trunk? > > gcc/c-family/ChangeLog: > PR c++/84269 > * known-headers.cc (get_stdlib_header_for_name): Add various names > from , , and ; add more names from > . > > gcc/cp/ChangeLog: > PR c++/84269 > * name-lookup.c (get_std_name_hint): Add names from , > , and . > > gcc/testsuite/ChangeLog: > PR c++/84269 > * g++.dg/lookup/missing-std-include-6.C: New test. > * g++.dg/lookup/missing-std-include.C: Add std::pair and > std::tuple tests. > * g++.dg/spellcheck-reswords.C: Expect a hint about . > * g++.dg/spellcheck-stdlib.C: Add tests for names in , > , , and . > --- > gcc/c-family/known-headers.cc | 33 +- > gcc/cp/name-lookup.c | 14 + > .../g++.dg/lookup/missing-std-include-6.C | 62 ++ > gcc/testsuite/g++.dg/lookup/missing-std-include.C | 8 +++ > gcc/testsuite/g++.dg/spellcheck-reswords.C | 1 + > gcc/testsuite/g++.dg/spellcheck-stdlib.C | 73 > ++ > 6 files changed, 190 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/g++.dg/lookup/missing-std-include-6.C > > diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc > index ef23cbe..5524d21 100644 > --- a/gcc/c-family/known-headers.cc > +++ b/gcc/c-family/known-headers.cc > @@ -57,6 +57,9 @@ get_stdlib_header_for_name (const char *name, enum stdlib > lib) >gcc_assert (lib < NUM_STDLIBS); > >static const stdlib_hint hints[] = { > +/* and . */ > +{"assert", {"", ""} }, > + > /* and . */ > {"errno", {"", ""} }, > > @@ -92,16 +95,44 @@ get_stdlib_header_for_name (const char *name, enum stdlib > lib) > {"size_t", {"", ""} }, > {"wchar_t", {"", NULL /* a keyword in C++ */} }, > > -/* . */ > +/* and . */ > {"BUFSIZ", {"", ""} }, > {"EOF", {"", ""} }, > {"FILE", {"", ""} }, > {"FILENAME_MAX", {"", ""} }, > +{"fopen", {"", ""} }, > {"fpos_t", {"", ""} }, > +{"getchar", {"", ""} }, > +{"printf", {"", ""} }, > +{"snprintf", {"", ""} }, > +{"sprintf", {"", ""} }, > {"stderr", {"", ""} }, > {"stdin", {"", ""} }, > {"stdout", {"", ""} }, > > +/* and . */ > +{"free", {"", ""} }, > +{"malloc", {"", ""} }, > +{"realloc", {"", ""} }, > + > +/* and . */ > +{"memchr", {"", ""} }, > +{"memcmp", {"", ""} }, > +{"memcpy", {"", ""} }, > +{"memmove", {"", ""} }, > +{"memset", {"", ""} }, > +{"strcat", {"", ""} }, > +{"strchr", {"", ""} }, > +{"strcmp", {"", ""} }, > +{"strcpy", {"", ""} }, > +{"strlen", {"", ""} }, > +{"strncat", {"", ""} }, > +{"strncmp", {"", ""} }, > +{"strncpy", {"", ""} }, > +{"strrchr", {"", ""} }, > +{"strspn", {"", ""} }, > +{"strstr", {"", ""} }, > + > /* . */ > {"PTRDIFF_MAX", {"", ""} }, > {"PTRDIFF_MIN", {"", ""} }, > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c > index e193b3b..061729a 100644 > --- a/gcc/cp/name-lookup.c > +++ b/gcc/cp/name-lookup.c > @@ -5453,6 +5453,12 @@ get_std_name_hint (const char *name) > /* . */ > {"map", ""}, > {"multimap", ""}, > +/* . */ > +{"make_shared", ""}, > +{"make_unique", ""}, > +{"shared_ptr", ""}, > +{"unique_ptr", ""}, > +{"weak_ptr", ""}, > /* . */ > {"queue", ""}, > {"priority_queue", ""}, > @@ -5472,6 +5478,9 @@ get_std_name_hint (const char *name) > {"basic_stringstream", ""}, > /* . */ > {"stack", ""}, > +/* . */ > +{"make_tuple", ""}, > +{"tuple", ""}, > /* . */ > {"string", ""}, > {"wstring", ""}, > @@ -5483,6 +5492,11 @@ get_std_name_hint (const char *name) > /* . */ > {"unordered_set", ""}, // C++11 > {"unordered_multiset", ""}, // C++11 > +/* . */ > +{"forward", ""}, > +{"make_pair", ""}, > +{"move", ""}, > +{"pair", ""}, > /* . */ > {"vector", ""}, >}; > diff --git a/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C > b/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C > new file mode 100644 > index 000..100bcc0 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C > @@ -0,0 +1,62 @@ > +// { dg-do compile { target c++11 } } > + > +/* . */ > + > +template > +void test_make_shared () > +{ > + auto p = std::make_shared(); // { dg-error "'make_shared' is not a > member of 'std'" } > + // { dg-message "'#include '" "" { target *-*-* } .-1 } > + // { dg-error "expected primary-expression before '>' token" "" { targe
Re: [PATCH 3/4] C++: suggest missing headers for implicit use of "std" (PR c++/85021)
On Thu, Mar 22, 2018 at 7:44 PM, David Malcolm wrote: > We provide fix-it hints for the most common "std" names when an explicit > "std::" prefix is present, however we don't yet provide fix-it hints for > this implicit case: > > using namespace std; > void f() { cout << "test"; } > > for which we emit: > > t.cc: In function 'void f()': > t.cc:2:13: error: 'cout' was not declared in this scope >void f() { cout << "test"; } >^~~~ > > This patch detects if a "using namespace std;" directive is present > in the current namespace, and if so, offers a suggestion for > unrecognized names that are in our list of common "std" names: > > t.cc: In function 'void f()': > t.cc:2:13: error: 'cout' was not declared in this scope >void f() { cout << "test"; } >^~~~ > t.cc:2:13: note: 'std::cout' is defined in header ''; did you > forget to '#include '? > +#include >using namespace std; >void f() { cout << "test"; } >^~~~ > > Is there a better way to test for the using directive? > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > OK for trunk? > > gcc/cp/ChangeLog: > PR c++/85021 > * name-lookup.c (has_using_namespace_std_directive_p): New > function. > (suggest_alternatives_for): Simplify if/else logic using early > returns. If no candidates were found, and there's a > "using namespace std;" directive, call > maybe_suggest_missing_std_header. > (maybe_suggest_missing_header): Split later part of the function > into.. > (maybe_suggest_missing_std_header): New. > > gcc/testsuite/ChangeLog: > PR c++/85021 > * g++.dg/lookup/missing-std-include-7.C: New test. > --- > gcc/cp/name-lookup.c | 68 > +- > .../g++.dg/lookup/missing-std-include-7.C | 16 + > 2 files changed, 70 insertions(+), 14 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/lookup/missing-std-include-7.C > > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c > index 061729a..4eb980e 100644 > --- a/gcc/cp/name-lookup.c > +++ b/gcc/cp/name-lookup.c > @@ -41,6 +41,7 @@ static cxx_binding *cxx_binding_make (tree value, tree > type); > static cp_binding_level *innermost_nonclass_level (void); > static void set_identifier_type_value_with_scope (tree id, tree decl, > cp_binding_level *b); > +static bool maybe_suggest_missing_std_header (location_t location, tree > name); > > /* Create an overload suitable for recording an artificial TYPE_DECL > and another decl. We use this machanism to implement the struct > @@ -5327,6 +5328,23 @@ qualify_lookup (tree val, int flags) >return true; > } > > +/* Is there a "using namespace std;" directive within the current > + namespace? */ > + > +static bool > +has_using_namespace_std_directive_p () > +{ > + vec *usings = DECL_NAMESPACE_USING (current_namespace); Checking in just the current namespace won't find a "using namespace std" in an inner or outer scope; I think you want to add something to name-lookup.c that iterates over the current enclosing scopes like name_lookup::search_unqualified. Nathan can probably be more specific. Jason > + if (!usings) > +return false; > + > + for (unsigned ix = usings->length (); ix--;) > +if ((*usings)[ix] == std_node) > + return true; > + > + return false; > +} > + > /* Suggest alternatives for NAME, an IDENTIFIER_NODE for which name > lookup failed. Search through all available namespaces and print out > possible candidates. If no exact matches are found, and > @@ -5397,11 +5415,23 @@ suggest_alternatives_for (location_t location, tree > name, > inform (location_of (val), " %qE", val); > } >candidates.release (); > + return; > } > - else if (!suggest_misspellings) > -; > - else if (name_hint hint = lookup_name_fuzzy (name, FUZZY_LOOKUP_NAME, > - location)) > + > + /* No candidates were found in the available namespaces. */ > + > + /* If there's a "using namespace std;" active, and this > + is one of the most common "std::" names, then it's probably a > + missing #include. */ > + if (has_using_namespace_std_directive_p ()) > +if (maybe_suggest_missing_std_header (location, name)) > + return; > + > + /* Otherwise, consider misspellings. */ > + if (!suggest_misspellings) > +return; > + if (name_hint hint = lookup_name_fuzzy (name, FUZZY_LOOKUP_NAME, > + location)) > { >/* Show a spelling correction. */ >gcc_rich_location richloc (location); > @@ -5509,20 +5539,13 @@ get_std_name_hint (const char *name) >return NULL; > } > > -/* If SCOPE is the "std" namespace, then suggest pertinent header > - files for NAME at LOCATION. > +/* Suggest per
Re: [PATCH 4/4] C++: more std header hints; filter on C++ dialect (PR c++/84269)
OK. On Thu, Mar 22, 2018 at 7:44 PM, David Malcolm wrote: > Jonathan added a bunch more suggestions to: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84269#c10 > as I was testing my last patch, some of which need C++14 and C++17, > and some of which use headers that exist in earlier standards. > > For example, exists in C++98, but if the user attempts to > use std::make_shared with -std=c++98, they are suggested to include > , even if they've already included it. > > This patch adds the missing names, and fixes the nonsensical suggestions > by detecting if the name isn't available yet, based on the user's > dialect, and reporting things more intelligently: > > t.cc: In function 'void test_make_shared()': > t.cc:5:8: error: 'make_shared' is not a member of 'std' >std::make_shared(); > ^~~ > t.cc:5:8: note: 'std::make_shared' is only available from C++11 onwards > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > OK for trunk? > > gcc/cp/ChangeLog: > PR c++/84269 > * name-lookup.c (struct std_name_hint): Move out of > get_std_name_hint; add field "min_dialect". */ > (get_std_name_hint): Add min_dialect values to all initializers. > Add , , , , , > , , , , , , > , , , and . > Add fstream, ifstream, and ofstream to . > Add istringstream, ostringstream, and stringstream to . > Add basic_string to . > Add tuple_element and tuple_size to . > Add declval to . > Fix ordering of and . > Return a std_name_hint, rather than a const char *. > (get_cxx_dialect_name): New function. > (maybe_suggest_missing_std_header): Detect names that aren't yet > available in the current dialect, and instead of suggesting a > missing #include, warn about the dialect. > > gcc/testsuite/ChangeLog: > PR c++/84269 > * g++.dg/lookup/missing-std-include-6.C: Move std::array and > std::tuple here since they need C++11. > * g++.dg/lookup/missing-std-include-8.C: New test. > * g++.dg/lookup/missing-std-include.C: Move std::array and > std::tuple test to missing-std-include-6.C to avoid failures > with C++98. > --- > gcc/cp/name-lookup.c | 262 > +++-- > .../g++.dg/lookup/missing-std-include-6.C | 13 + > .../g++.dg/lookup/missing-std-include-8.C | 44 > gcc/testsuite/g++.dg/lookup/missing-std-include.C | 7 - > 4 files changed, 248 insertions(+), 78 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/lookup/missing-std-include-8.C > > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c > index 4eb980e..3d676bb 100644 > --- a/gcc/cp/name-lookup.c > +++ b/gcc/cp/name-lookup.c > @@ -5441,104 +5441,214 @@ suggest_alternatives_for (location_t location, tree > name, > } > } > > +/* A well-known name within the C++ standard library, returned by > + get_std_name_hint. */ > + > +struct std_name_hint > +{ > + /* A name within "std::". */ > + const char *name; > + > + /* The header name defining it within the C++ Standard Library > + (with '<' and '>'). */ > + const char *header; > + > + /* The dialect of C++ in which this was added. */ > + enum cxx_dialect min_dialect; > +}; > + > /* Subroutine of maybe_suggest_missing_header for handling unrecognized names > for some of the most common names within "std::". > - Given non-NULL NAME, a name for lookup within "std::", return the header > - name defining it within the C++ Standard Library (with '<' and '>'), > - or NULL. */ > + Given non-NULL NAME, return the std_name_hint for it, or NULL. */ > > -static const char * > +static const std_name_hint * > get_std_name_hint (const char *name) > { > - struct std_name_hint > - { > -const char *name; > -const char *header; > - }; >static const std_name_hint hints[] = { > +/* . */ > +{"any", "", cxx17}, > +{"any_cast", "", cxx17}, > +{"make_any", "", cxx17}, > /* . */ > -{"array", ""}, // C++11 > +{"array", "", cxx11}, > +/* . */ > +{"atomic", "", cxx11}, > +{"atomic_flag", "", cxx11}, > +/* . */ > +{"bitset", "", cxx11}, > /* . */ > -{"complex", ""}, > -{"complex_literals", ""}, > +{"complex", "", cxx98}, > +{"complex_literals", "", cxx98}, > +/* . */ > +{"condition_variable", "", cxx11}, > +{"condition_variable_any", "", cxx11}, > /* . */ > -{"deque", ""}, > +{"deque", "", cxx98}, > /* . */ > -{"forward_list", ""}, // C++11 > +{"forward_list", "", cxx11}, > /* . */ > -{"basic_filebuf", ""}, > -{"basic_ifstream", ""}, > -{"basic_ofstream", ""}, > -{"basic_fstream", ""}, > +{"basic_filebuf", "", cxx98}, > +{"basic_ifstream", "", cxx98}, > +{"basic_ofstream", "", cxx98}, > +{"basic_fstream", "", cxx98}, > +{"fstream", "", cxx98}, >
[PATCH, rs6000] PR target/83822 fix redundant conditions
I've fixed the redundant conditions in the expressions pointed out by 83822. Bootstrap/regtest passes on ppc64le, ok for trunk? Aaron 2018-03-29 Aaron Sawdey PR target/83822 * config/rs6000/rs6000-string.c (expand_compare_loop): Fix redundant condition. * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Fix redundant condition. -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC ToolchainIndex: gcc/config/rs6000/rs6000-c.c === --- gcc/config/rs6000/rs6000-c.c (revision 258900) +++ gcc/config/rs6000/rs6000-c.c (working copy) @@ -642,8 +642,7 @@ cpp_get_callbacks (pfile)->macro_to_expand = rs6000_macro_to_expand; } } - if (!TARGET_HARD_FLOAT - || (TARGET_HARD_FLOAT && !TARGET_DOUBLE_FLOAT)) + if (!TARGET_HARD_FLOAT || !TARGET_DOUBLE_FLOAT) builtin_define ("_SOFT_DOUBLE"); /* Used by lwarx/stwcx. errata work-around. */ if (rs6000_cpu == PROCESSOR_PPC405) Index: gcc/config/rs6000/rs6000-string.c === --- gcc/config/rs6000/rs6000-string.c (revision 258900) +++ gcc/config/rs6000/rs6000-string.c (working copy) @@ -966,8 +966,7 @@ rtx final_cleanup = gen_label_rtx (); rtx cmp_rem_before = gen_reg_rtx (word_mode); /* Compare one more word_mode chunk if needed. */ - if (!bytes_is_const - || (bytes_is_const && bytes_remaining >= load_mode_size)) + if (!bytes_is_const || bytes_remaining >= load_mode_size) { /* If remainder length < word length, branch to final cleanup compare. */
C++ PATCH for c++/85060, wrong-code calling base member from template
Here, during instantiate_non_dependent_expr, we were getting the wrong answer for any_dependent_bases_p because processing_template_decl was temporarily cleared. Fixed by using uses_template_parms (which doesn't depend on processing_template_decl) on the type we're asking about. Tested x86_64-pc-linux-gnu, applying to trunk. commit 4c894923eba7d2a86836e43cc1093247a3241a7e Author: Jason Merrill Date: Fri Mar 23 11:14:50 2018 -0400 PR c++/85060 - wrong-code with call to base member in template. * search.c (any_dependent_bases_p): Check uses_template_parms rather than processing_template_decl. diff --git a/gcc/cp/search.c b/gcc/cp/search.c index 6bf8b0e70dc..bfeaf2cc819 100644 --- a/gcc/cp/search.c +++ b/gcc/cp/search.c @@ -2619,7 +2619,7 @@ original_binfo (tree binfo, tree here) bool any_dependent_bases_p (tree type) { - if (!type || !CLASS_TYPE_P (type) || !processing_template_decl) + if (!type || !CLASS_TYPE_P (type) || !uses_template_parms (type)) return false; /* If we haven't set TYPE_BINFO yet, we don't know anything about the bases. diff --git a/gcc/testsuite/g++.dg/template/dependent-base3.C b/gcc/testsuite/g++.dg/template/dependent-base3.C new file mode 100644 index 000..e38b968e774 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/dependent-base3.C @@ -0,0 +1,26 @@ +// PR c++/85060 +// { dg-do compile { target c++14 } } + +struct CA { + constexpr int foo() const { return 42; } +}; + +template +struct CB : CA { }; + +template +struct CC : CB { + constexpr int bar() const { +const int m = CA::foo(); +return m; + } + + constexpr int baz() const { +const T m = CA::foo(); +return m; + } +}; + +constexpr CC c; + +static_assert( c.bar() == 42, "" );
Re: [PATCH] Small x86 debug info improvement (PR debug/83157)
Hi! On Thu, Mar 22, 2018 at 10:49:45PM +0100, Jakub Jelinek wrote: I'd like to ping this patch: > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2018-03-22 Jakub Jelinek > > PR debug/83157 > * var-tracking.c (add_stores): Handle STRICT_LOW_PART SET_DEST. > * cselib.c (cselib_record_sets): For STRICT_LOW_PART dest, > lookup if dest in some wider mode is known to be const0_rtx and > if so, record permanent equivalence for it to be ZERO_EXTEND of > the narrower mode destination. Thanks Jakub
Re: [PATCH, rs6000] PR target/83822 fix redundant conditions
On Thu, Mar 29, 2018 at 02:29:53PM -0500, Aaron Sawdey wrote: > I've fixed the redundant conditions in the expressions pointed out by > 83822. Bootstrap/regtest passes on ppc64le, ok for trunk? This is fine, thanks! Segher > 2018-03-29 Aaron Sawdey > > PR target/83822 > * config/rs6000/rs6000-string.c (expand_compare_loop): Fix redundant > condition. > * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Fix redundant > condition.
Re: [PATCH] [PR c++/84979] improve auto handling in explicit tmpl args for concepts
On Mar 28, 2018, Jason Merrill wrote: > On Wed, Mar 28, 2018 at 5:06 AM, Alexandre Oliva wrote: >> On Mar 23, 2018, Jason Merrill wrote: >> >>> On Fri, Mar 23, 2018 at 3:38 PM, Alexandre Oliva wrote: + /* Concepts allows 'auto' in template arguments, even multiple + 'auto's in a single argument. >> >>> I think that's only intended for class templates; >> >> Aah, that explains a lot! Dang, it was so close! >> >> It actually makes sense; was there any discussion about standardizing >> this, with reasons to reject this construct, or is it just that we >> aren't there yet? I wouldn't be surprised if it were to appear in some >> future version of C++. > If what were to appear? auto in explicit template arguments for template functions too. > We only want 'auto' in places where it could be deduced, and there's > no way in [temp.deduct.type] to deduce from explicit template > arguments to a non-type template. Well, it could be standardized in some way, e.g. autos could be deduced from types of function arguments and expected return types, just like function overloads, very much like typename template arguments are currently deduced. Having all explicit autos resolved would be a requirement for a specialization to be viable. Consider: template A foo(B b, C c); ... int x1 = foo("", 0); // ok, all args deduced int x2 = foo("", 0); // ok, A is explicit, B and C are deduced int x3 = foo("", 0); // B could be deduced as in x2 int x4 = foo("", 0); // all args deduced as in x1 auto x5 = foo("", 0); // deduction of A from context fails Cases in which auto appears in the top-level are the trivial ones, that our type deduction machinery could support right away (it has to; it's the same as deducing any typename argument left out). More involved cases, in which auto appears other than at the top level, could still be deduced in general, though we would need some more implementation work to get them right: int *x6 = foo("", 0); // deduced: int const charint -> calls foo I seems to me it would be a natural extension to the existing draft specs; I guess the most common use would be to just tell the compiler to figure out one specific template argument, when you only wish to explicitly specify another after that: foo >> Is this (in the patch below) the best spot to test for it? > This seems like a plausible spot, but is there a reason not to check > in cp_parser_template_id? AFAICT we wouldn't always know, within cp_parser_template_id, whether the id is a type or a function. It seemed to me that we could tentatively parse it as a type, and then reuse the same preparsed id as a function. If I were to check the argument list at that point, assuming one or the other, we might reject something we should accept or vice-versa, and we wouldn't revisit that decision. I'm not saying I tried and observed this, it is just a hunch based on observing how the foo() expression was parsed and reparsed, and on extrapolations from the preparsing of qualified names that I've recently had a close encounter with ;-) -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PATCH] [PR c++/84943] allow folding of array indexing indirect_ref
On Mar 28, 2018, Jason Merrill wrote: > It looks like cp_build_addr_expr_1 already calls mark_used for single > static member functions, it should probably do the same for single > non-member functions. Hmm... That existing call is suspicious, now that you point it out. There's a switch before it that peels off BASELINK and takes OVL_FIRST only, so we would be marking as used something other than what overload resolution would select, and not marking what overload resolution would ultimately select, *if* we marked anything. But since we've peeled off the baseline, the test if (processing_template_decl || TREE_CODE (arg) != COMPONENT_REF) passes because FUNCTION_DECL is not COMPONENT_REF, and we call build_address before we get a chance to mark anything. I guess we could move the mark_used logic ahead in the function, but whether or not it do the job will depend on whether cp_build_addr_expr_1 gets called again after template substitution and overload resolution. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization
On 03/02/2018 05:55 PM, Cesar Philippidis wrote: As a follow up patch will show, the nvptx BE falls back to using vector_length = 32 when a vector loop is nested inside a worker loop. I disabled the fallback, and analyzed the vred2d-128.c illegal memory access execution failure. I minimized that down to this ptx: ... .shared .align 8 .u8 __oacc_bcast[176]; { { .reg .u32 %x; mov.u32 %x,%tid.x; setp.ne.u32 %r86,%x,0; } { .reg .u32 %tidy; .reg .u64 %t_bcast; .reg .u64 %y64; mov.u32 %tidy,%tid.y; cvt.u64.u32 %y64,%tidy; add.u64 %y64,%y64,1; cvta.shared.u64 %t_bcast,__oacc_bcast; mad.lo.u64 %r66,%y64,88,%t_bcast; } @ %r86 bra $L28; st.u32 [%r66+80],0; $L28: ret; } ... The ptx is called with 2 workers and 128 vector_length. So, 2 workers mean %tid.y has values 0 and 1. Then %y64 has values 1 and 2. Then %r66 has values __oacc_bcast + (1 * 88) and __oacc_bcast + (2 * 88). Then the st.u32 accesss __oacc_bcast + (1 * 88) + 80 and __oacc_bcast + (2 * 88) + 80. So we're accessing memory at location 256, while the __oacc_bcast is only 176 bytes big. I formulated this assert that AFAIU detects this situation in the compiler: ... @@ -1125,6 +1125,8 @@ nvptx_init_axis_predicate (FILE *file, int regno, const char *name) fprintf (file, "\t}\n"); } +static int nvptx_mach_max_workers (); + /* Emit code to initialize OpenACC worker broadcast and synchronization registers. */ @@ -1148,6 +1150,7 @@ nvptx_init_oacc_workers (FILE *file) "// vector broadcast offset\n", REGNO (cfun->machine->bcast_partition), oacc_bcast_partition); + gcc_assert (oacc_bcast_partition * (nvptx_mach_max_workers () + 1) <= oacc_bcast_size); } if (cfun->machine->sync_bar) fprintf (file, "\t\tadd.u32\t\t%%r%d, %%tidy, 1; " ... The assert is not triggered when the fallback is used. Thanks, - Tom
Re: [PATCH] [PR c++/84943] allow folding of array indexing indirect_ref
On Mar 29, 2018, Alexandre Oliva wrote: > On Mar 28, 2018, Jason Merrill wrote: >> It looks like cp_build_addr_expr_1 already calls mark_used for single >> static member functions, it should probably do the same for single >> non-member functions. > ultimately select, *if* we marked anything. But since we've peeled off > the baseline, the test ^ err, baselink > if (processing_template_decl || TREE_CODE (arg) != COMPONENT_REF) > passes because FUNCTION_DECL is not COMPONENT_REF, and we call > build_address before we get a chance to mark anything. ... unless we're at the case it's supposed to catch, namely when we call obj.static_memfn() ;-) Here's a patch that should take care of the marking a namespace-scoped or static member function as used when taking its address, thus working around (fixing?) the reported problem. Regstrapping now. Ok to install if it passes? [PR c++/84943] mark function as used when taking its address fn[0]() ICEd because we would fold the INDIRECT_REF used for the array indexing while building the address for the call, after not finding the decl hiding there at first. But the decl would be exposed by the folding, and then lower layers would complain we had the decl, after all, but it wasn't one of the artificial or special functions that could be called without being marked as used. This patch arranges for a FUNCTION_DECL to be marked as used when taking its address, just like we already did when taking the address of a static function to call it as a member function (i.e. using the obj.fn() notation). for gcc/cp/ChangeLog PR c++/84943 * typeck.c (cp_build_addr_expr_1): Mark FUNCTION_DECL as used. for gcc/testsuite/ChangeLog PR c++/84943 * g++.dg/pr84943.C: New. * g++.dg/pr84943-2.C: New. --- gcc/cp/typeck.c |8 - gcc/testsuite/g++.dg/pr84943-2.C | 64 ++ gcc/testsuite/g++.dg/pr84943.C |8 + 3 files changed, 79 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/pr84943-2.C create mode 100644 gcc/testsuite/g++.dg/pr84943.C diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index d3183b5321d3..d6157772c0f3 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -5831,6 +5831,8 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, tsubst_flags_t complain) if (type_unknown_p (arg)) return build1 (ADDR_EXPR, unknown_type_node, arg); + tree maybe_overloaded_arg = arg; + if (TREE_CODE (arg) == OFFSET_REF) /* We want a pointer to member; bypass all the code for actually taking the address of something. */ @@ -5971,6 +5973,9 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, tsubst_flags_t complain) so we can just form an ADDR_EXPR with the correct type. */ if (processing_template_decl || TREE_CODE (arg) != COMPONENT_REF) { + if (TREE_CODE (arg) == FUNCTION_DECL && arg == maybe_overloaded_arg + && !mark_used (arg, complain) && !(complain & tf_error)) + return error_mark_node; val = build_address (arg); if (TREE_CODE (arg) == OFFSET_REF) PTRMEM_OK_P (val) = PTRMEM_OK_P (arg); @@ -5983,7 +5988,8 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, tsubst_flags_t complain) function. */ gcc_assert (TREE_CODE (fn) == FUNCTION_DECL && DECL_STATIC_FUNCTION_P (fn)); - if (!mark_used (fn, complain) && !(complain & tf_error)) + if (arg == maybe_overloaded_arg + && !mark_used (fn, complain) && !(complain & tf_error)) return error_mark_node; val = build_address (fn); if (TREE_SIDE_EFFECTS (TREE_OPERAND (arg, 0))) diff --git a/gcc/testsuite/g++.dg/pr84943-2.C b/gcc/testsuite/g++.dg/pr84943-2.C new file mode 100644 index ..d1ef012b9155 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr84943-2.C @@ -0,0 +1,64 @@ +// { dg-do run } + +// Avoid -pedantic-error default +// { dg-options "" } + +// Make sure the functions referenced by various forms of +// address-taking are marked as used and compiled in. + +static void ac() {} +void a() { + ac[0](); // { dg-warning "arithmetic" } +} + +static void bc() {} +void b() { + (&*&*&*&bc)(); +} + +template U cc() {} +void (*c())() { + return cc; +} + +template +struct x { + void a(int); + template static U a(x*) {} + static void a(long) {} + static void a(void *) {} + static void a() { +void (*p0)(void*) = x().a; +p0(0); +void (*p1)(long) = a; +p1(0); +void (*p2)() = a; +p2(); +void (*p3)(x*) = a; +p3(0); + } +}; + +struct z { + void a(int); + template static U a(z*) {} + static void a(long) {} + static void a(void *) {} + static void a() { +void (*p0)(void*) = z().a; +p0(0); +void (*p1)(long) = a; +p1(0); +void (*p2)() = a; +p2(); +void (*p3)(z*) = a; +p3(0); + } +}; + +int main(int argc, char *argv[]) { + if (argc > 1) { +x().a(); +
Re: [patch, fortran] Fix PR 85111, some more zero size stuff
On 03/29/2018 11:07 AM, Thomas Koenig wrote: Hello world, the attached patch fixes PR 85111, a regression introduced with my recent simplification patches. Seems like the zero-size saga has yet another chapter :-) Regression-tested. OK for trunk? Looks OK, Thanks, Jerry Regards Thomas 2017-03-29 Thomas Koenig PR fortran/85111 * array.c (gfc_resolve_character_array_constructor): Early exit for zero-size arrays. * simplify.c (simplify_transformation_to_array): Exit early if the result size is zero. (simplify_minmaxloc_to_array): Likewise. 2017-03-29 Thomas Koenig PR fortran/85111 * gfortran.dg/zero_sized_10.f90: New test.