Re: [PATCH] Fix -Wduplicated-branches with -g (PR c/85094)

2018-03-29 Thread Richard Biener
On Wed, 28 Mar 2018, Jakub Jelinek wrote:

> Hi!
> 
> With the introduction of DEBUG_BEGIN_STMT/-gstatement-frontiers on by
> default, the -Wduplicated-branches warning doesn't work anymore with -g,
> because operand_equal_p in OEP_LEXICOGRAPHIC mode doesn't consider
> DEBUG_BEGIN_STMTs as equal unless they are pointer equal.  For the warning
> we either need to ignore them completely (but that would need more changes,
> as the hashing doesn't ignore them), or at least what the second hunk does,
> consider them equal.  The first hunk is just an optimization, if
> -Wduplicated-branches is used on extremely deeply nested STATEMENT_LISTs,
> then for -fchecking we'd be computing the hashes etc. at the level of every
> stmt of every STATEMENT_LIST; we need to do that just once at the toplevel.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2018-03-28  Jakub Jelinek  
> 
>   PR c/85094
>   * fold-const.c (operand_equal_p): Handle DEBUG_BEGIN_STMT.
>   For STATEMENT_LIST, pass down OEP_LEXICOGRAPHIC and maybe
>   OEP_NO_HASH_CHECK for recursive call, to avoid exponential
>   checking.
> 
>   * c-c++-common/Wduplicated-branches-14.c: New test.
> 
> --- gcc/fold-const.c.jj   2018-03-27 12:54:43.657242009 +0200
> +++ gcc/fold-const.c  2018-03-28 16:20:40.024574216 +0200
> @@ -3479,7 +3479,8 @@ operand_equal_p (const_tree arg0, const_
> if (tsi_end_p (tsi1) && tsi_end_p (tsi2))
>   return 1;
> if (!operand_equal_p (tsi_stmt (tsi1), tsi_stmt (tsi2),
> - OEP_LEXICOGRAPHIC))
> + flags & (OEP_LEXICOGRAPHIC
> +  | OEP_NO_HASH_CHECK)))
>   return 0;
>   }
>   }
> @@ -3492,6 +3493,10 @@ operand_equal_p (const_tree arg0, const_
> if (flags & OEP_LEXICOGRAPHIC)
>   return OP_SAME_WITH_NULL (0);
> return 0;
> + case DEBUG_BEGIN_STMT:
> +   if (flags & OEP_LEXICOGRAPHIC)
> + return 1;
> +   return 0;
>   default:
> return 0;
>}
> --- gcc/testsuite/c-c++-common/Wduplicated-branches-14.c.jj   2018-03-28 
> 16:20:00.966553266 +0200
> +++ gcc/testsuite/c-c++-common/Wduplicated-branches-14.c  2018-03-28 
> 16:19:07.264524448 +0200
> @@ -0,0 +1,16 @@
> +/* PR c/85094 */
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -Wduplicated-branches -g" } */
> +
> +extern int g;
> +
> +void
> +foo (int r)
> +{
> +  if (r < 64)
> +g -= 48;
> +  else if (r < 80)   /* { dg-warning "this condition has identical branches" 
> } */
> +g -= 64 - 45;
> +  else
> +g -= 80 - 61;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH] Fix _blendm

2018-03-29 Thread Kirill Yukhin
Hello Jakub,
On 28 мар 21:24, Jakub Jelinek wrote:
> Hi!
> 
> When looking at PR85090, I've looked into tmp-mddump.md and noticed there
> some instructions that can't assemble - vblendmd etc., there is only
> vpblendmd or vblendmps.  I couldn't make a testcase that would reproduce
> this though, seems the masked *mov_internal is used instead.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Patch is obvious. OK.

--
Thanks, K


Re: [PATCH, GCC-7, GCC-6][ARM][PR target/84826] Backport Fix ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabihf

2018-03-29 Thread Kyrill Tkachov

Hi Sudi,

On 28/03/18 15:04, Sudakshina Das wrote:

Hi

This patch is a request to backport r258777 and r258805 to gcc-7-branch
and gcc-6-branch. The same ICE occurs in both the branches with
-fstack-check. Thus the test case directive has been changed.

The discussion on the patch that went into trunk is:
https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01120.html

Testing : Regtested on both the branches with arm-none-linux-gnueabihf

Is this ok for gcc-7 and gcc-6?



Ok.
Thanks,
Kyrill


Sudi

ChangeLog entries:

*** gcc/ChangeLog ***

2018-03-28  Sudakshina Das  

Backport from mainline
2018-03-22  Sudakshina Das  

PR target/84826
* config/arm/arm.h (machine_function): Add static_chain_stack_bytes.
* config/arm/arm.c (arm_compute_static_chain_stack_bytes): Avoid
re-computing once computed.
(arm_expand_prologue): Compute machine->static_chain_stack_bytes.
(arm_init_machine_status): Initialize
machine->static_chain_stack_bytes.

*** gcc/testsuite/ChangeLog ***

2018-03-28  Sudakshina Das  

* gcc.target/arm/pr84826.c: Change dg-option to -fstack-check.

Backport from mainline
2018-03-23  Sudakshina Das  

PR target/84826
* gcc.target/arm/pr84826.c: Add dg directive.

Backport from mainline
2018-03-22  Sudakshina Das  

PR target/84826
* gcc.target/arm/pr84826.c: New test.




[PATCH, 0/2] Add scan-ltrans-tree-dump and scan-wpa-ipa-dump

2018-03-29 Thread Tom de Vries

Hi,

Consider an lto multi-source test-case main.c and foo.c:
..
$ cat main.c
extern int foo (void);

int
main ()
{
  return foo () + 1;
}

$ cat foo.c
int __attribute__((noinline, noclone))
foo (void)
{
  return 2;
}
...

When compiling the test-case like this:
...
$ gcc main.c foo.c -O2 -flto -save-temps -flto-partition=1to1 -o a.out
...

the following happens:
1. main.s (containing gimple) is produced
   (and then assembled to main.o)
2. foo.s (containing gimple) is produced
   (and then assembled to main.o)
3. lto1 is called in wpa mode, generating a.out.ltrans0.o and
   a.out.ltrans1.o
4. lto1 is called in ltrans mode, generating a.out.ltrans0.s
   (which is then assembled to a.out.ltrans0.ltrans.o)
5. lto1 is called in ltrans mode, generating a.out.ltrans1.s
   (which is then assembled to a.out.ltrans1.ltrans.o)
6. a.out is produced from a.out.ltrans0.ltrans.o and
   a.out.ltrans1.ltrans.o

When adding dump flags "-fdump-tree-all -fdump-rtl-all -fdump-ipa-all" 
to the command line, we get the following dump files.


For 1, we generate (ignoring the 000i.* and statistics dumps from here on):
- main.c.003t.original - main.c.050t.local-fnsummary2
- main.c.062i.targetclone - main.c.080i.pure-const

For 2, we generate:
- foo.c.003t.original - foo.c.050t.local-fnsummary2
- foo.c.062i.targetclone - foo.c.080i.pure-const

For 3, we generate:
- a.out.wpa.046t.profile_estimate
- a.out.wpa.071i.whole-program - a.out.wpa.084i.comdats

For 4, we generate:
- a.out.ltrans0.046t.profile_estimate
- a.out.ltrans0.075i.cp - a.out.ltrans0.087i.simdclone
- a.out.ltrans0.088t.fixup_cfg4 - a.out.ltrans0.232t.optimized
- a.out.ltrans0.234r.expand - a.out.ltrans0.317r.dfinish

For 5, we generate:
- a.out.ltrans1.046t.profile_estimate
- a.out.ltrans1.075i.cp - a.out.ltrans1.087i.simdclone
- a.out.ltrans1.088t.fixup_cfg4 - a.out.ltrans1.232t.optimized
- a.out.ltrans1.234r.expand - a.out.ltrans1.317r.dfinish

With the current set of dg-final commands scan-tree-dump, scan-rtl-dump 
and scan-ipa-dump, we are able to scan dump files for 1 and 2, but not 
for 3, 4 and 5.



This patch series adds:
- scan-wpa-ipa-dump, which allows us to scan the ipa dump files for 3
- scan-ltrans-tree-dump, which:
  - allows us to scan the tree dump files for 4, and
  - adds the option -flto-partition=one which forces wpa to combine
all functions into a single partition ltrans0 (so that we don't have
to worry about ltrans1 in 5).

Thanks,
- Tom


[PATCH, testsuite, 1/2] Add scan-wpa-ipa-dump

2018-03-29 Thread Tom de Vries

On 03/29/2018 11:11 AM, Tom de Vries wrote:

Hi,

Consider an lto multi-source test-case main.c and foo.c:
..
$ cat main.c
extern int foo (void);

int
main ()
{
   return foo () + 1;
}

$ cat foo.c
int __attribute__((noinline, noclone))
foo (void)
{
   return 2;
}
...

When compiling the test-case like this:
...
$ gcc main.c foo.c -O2 -flto -save-temps -flto-partition=1to1 -o a.out
...

the following happens:
1. main.s (containing gimple) is produced
    (and then assembled to main.o)
2. foo.s (containing gimple) is produced
    (and then assembled to main.o)
3. lto1 is called in wpa mode, generating a.out.ltrans0.o and
    a.out.ltrans1.o
4. lto1 is called in ltrans mode, generating a.out.ltrans0.s
    (which is then assembled to a.out.ltrans0.ltrans.o)
5. lto1 is called in ltrans mode, generating a.out.ltrans1.s
    (which is then assembled to a.out.ltrans1.ltrans.o)
6. a.out is produced from a.out.ltrans0.ltrans.o and
    a.out.ltrans1.ltrans.o

When adding dump flags "-fdump-tree-all -fdump-rtl-all -fdump-ipa-all" 
to the command line, we get the following dump files.


For 1, we generate (ignoring the 000i.* and statistics dumps from here on):
- main.c.003t.original - main.c.050t.local-fnsummary2
- main.c.062i.targetclone - main.c.080i.pure-const

For 2, we generate:
- foo.c.003t.original - foo.c.050t.local-fnsummary2
- foo.c.062i.targetclone - foo.c.080i.pure-const

For 3, we generate:
- a.out.wpa.046t.profile_estimate
- a.out.wpa.071i.whole-program - a.out.wpa.084i.comdats

For 4, we generate:
- a.out.ltrans0.046t.profile_estimate
- a.out.ltrans0.075i.cp - a.out.ltrans0.087i.simdclone
- a.out.ltrans0.088t.fixup_cfg4 - a.out.ltrans0.232t.optimized
- a.out.ltrans0.234r.expand - a.out.ltrans0.317r.dfinish

For 5, we generate:
- a.out.ltrans1.046t.profile_estimate
- a.out.ltrans1.075i.cp - a.out.ltrans1.087i.simdclone
- a.out.ltrans1.088t.fixup_cfg4 - a.out.ltrans1.232t.optimized
- a.out.ltrans1.234r.expand - a.out.ltrans1.317r.dfinish

With the current set of dg-final commands scan-tree-dump, scan-rtl-dump 
and scan-ipa-dump, we are able to scan dump files for 1 and 2, but not 
for 3, 4 and 5.



This patch series adds:
- scan-wpa-ipa-dump, which allows us to scan the ipa dump files for 3
- scan-ltrans-tree-dump, which:
   - allows us to scan the tree dump files for 4, and
   - adds the option -flto-partition=one which forces wpa to combine
     all functions into a single partition ltrans0 (so that we don't have
     to worry about ltrans1 in 5).


This patch adds scan-wpa-ipa-dump.

Bootstrapped and reg-tested on x86_64.

OK for stage4/stage1 trunk?

Thanks,
- Tom
[testsuite] Add scan-wpa-ipa-dump

2018-03-28  Tom de Vries  

	PR testsuite/85106
	* gcc.dg/ipa/ipa-icf-38.c: New test.
	* gcc.dg/ipa/ipa-icf-38a.c: New test.
	* lib/scandump.exp (dump-base): New proc.
	(scan-dump, scan-dump-times, scan-dump-not, scan-dump-dem)
	(scan-dump-dem-not): Add and handle parameter for suffix of the dump
	base.
	* lib/scanipa.exp: Add "" argument to scan-dump calls.
	* lib/scanlang.exp: Same.
	* lib/scanrtl.exp: Same.
	* lib/scantree.exp: Same.
	* lib/scanwpaipa.exp: New file.
	* lib/gcc-dg.exp: Include scanwpaipa.exp.

	* testsuite/lib/libatomic.exp: Include scanwpaipa.exp.

	* testsuite/lib/libgomp.exp: Include scanwpaipa.exp.

	* testsuite/lib/libitm.exp: Include scanwpaipa.exp.

	* testsuite/lib/libvtv.exp: Include scanwpaipa.exp.

	* doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization
	dump files): Add wpa-ipa.

---
 gcc/doc/sourcebuild.texi   |   2 +-
 gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c  |  31 +++
 gcc/testsuite/gcc.dg/ipa/ipa-icf-38a.c |  14 
 gcc/testsuite/lib/gcc-dg.exp   |   1 +
 gcc/testsuite/lib/scandump.exp |  63 +-
 gcc/testsuite/lib/scanipa.exp  |  25 +++---
 gcc/testsuite/lib/scanlang.exp |   6 +-
 gcc/testsuite/lib/scanrtl.exp  |  25 +++---
 gcc/testsuite/lib/scantree.exp |  25 +++---
 gcc/testsuite/lib/scanwpaipa.exp   | 147 +
 libatomic/testsuite/lib/libatomic.exp  |   1 +
 libgomp/testsuite/lib/libgomp.exp  |   1 +
 libitm/testsuite/lib/libitm.exp|   1 +
 libvtv/testsuite/lib/libvtv.exp|   1 +
 14 files changed, 290 insertions(+), 53 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index b9672ca..db98e9f 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2596,7 +2596,7 @@ assembly output.
 @subsubsection Scan optimization dump files
 
 These commands are available for @var{kind} of @code{tree}, @code{rtl},
-and @code{ipa}.
+@code{ipa}, and @code{wpa-ipa}.
 
 @table @code
 @item scan-@var{kind}-dump @var{regex} @var{suffix} [@{ target/xfail @var{selector} @}]
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c
new file mode 100644
index 000..6e7936a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c
@@ -0,0 +1,31 @@
+/* { dg-do link } */
+/

[PATCH, testsuite, 2/2] Add scan-ltrans-tree-dump

2018-03-29 Thread Tom de Vries

On 03/29/2018 11:11 AM, Tom de Vries wrote:

Hi,

Consider an lto multi-source test-case main.c and foo.c:
..
$ cat main.c
extern int foo (void);

int
main ()
{
   return foo () + 1;
}

$ cat foo.c
int __attribute__((noinline, noclone))
foo (void)
{
   return 2;
}
...

When compiling the test-case like this:
...
$ gcc main.c foo.c -O2 -flto -save-temps -flto-partition=1to1 -o a.out
...

the following happens:
1. main.s (containing gimple) is produced
    (and then assembled to main.o)
2. foo.s (containing gimple) is produced
    (and then assembled to main.o)
3. lto1 is called in wpa mode, generating a.out.ltrans0.o and
    a.out.ltrans1.o
4. lto1 is called in ltrans mode, generating a.out.ltrans0.s
    (which is then assembled to a.out.ltrans0.ltrans.o)
5. lto1 is called in ltrans mode, generating a.out.ltrans1.s
    (which is then assembled to a.out.ltrans1.ltrans.o)
6. a.out is produced from a.out.ltrans0.ltrans.o and
    a.out.ltrans1.ltrans.o

When adding dump flags "-fdump-tree-all -fdump-rtl-all -fdump-ipa-all" 
to the command line, we get the following dump files.


For 1, we generate (ignoring the 000i.* and statistics dumps from here on):
- main.c.003t.original - main.c.050t.local-fnsummary2
- main.c.062i.targetclone - main.c.080i.pure-const

For 2, we generate:
- foo.c.003t.original - foo.c.050t.local-fnsummary2
- foo.c.062i.targetclone - foo.c.080i.pure-const

For 3, we generate:
- a.out.wpa.046t.profile_estimate
- a.out.wpa.071i.whole-program - a.out.wpa.084i.comdats

For 4, we generate:
- a.out.ltrans0.046t.profile_estimate
- a.out.ltrans0.075i.cp - a.out.ltrans0.087i.simdclone
- a.out.ltrans0.088t.fixup_cfg4 - a.out.ltrans0.232t.optimized
- a.out.ltrans0.234r.expand - a.out.ltrans0.317r.dfinish

For 5, we generate:
- a.out.ltrans1.046t.profile_estimate
- a.out.ltrans1.075i.cp - a.out.ltrans1.087i.simdclone
- a.out.ltrans1.088t.fixup_cfg4 - a.out.ltrans1.232t.optimized
- a.out.ltrans1.234r.expand - a.out.ltrans1.317r.dfinish

With the current set of dg-final commands scan-tree-dump, scan-rtl-dump 
and scan-ipa-dump, we are able to scan dump files for 1 and 2, but not 
for 3, 4 and 5.



This patch series adds:
- scan-wpa-ipa-dump, which allows us to scan the ipa dump files for 3
- scan-ltrans-tree-dump, which:
   - allows us to scan the tree dump files for 4, and
   - adds the option -flto-partition=one which forces wpa to combine
     all functions into a single partition ltrans0 (so that we don't have
     to worry about ltrans1 in 5).


This patch adds scan-ltrans-tree-dump.

Bootstrapped and reg-tested on x86_64.

OK for stage4/stage1 trunk?

Thanks,
- Tom
[testsuite] Add scan-ltrans-tree-dump

2018-03-28  Tom de Vries  

	PR testsuite/85106
	* gcc.dg/ipa/ipa-icf-38.c: Use scan-ltrans-tree-dump.
	* lib/scanltranstree.exp: New file.
	* lib/target-supports.exp (scan-ltrans-tree-dump_required_options)
	(scan-ltrans-tree-dump-times_required_options)
	(scan-ltrans-tree-dump-not_required_options)
	(scan-ltrans-tree-dump-dem_required_options)
	(scan-ltrans-tree-dump-dem-not_required_options): New proc.
	* lib/gcc-dg.exp: Include scanltranstree.exp.

	* testsuite/lib/libatomic.exp: Include scanltranstree.exp.

	* testsuite/lib/libgomp.exp: Include scanltranstree.exp.

	* testsuite/lib/libitm.exp: Include scanltranstree.exp.

	* testsuite/lib/libvtv.exp: Include scanltranstree.exp.

	* doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization
	dump files): Add ltrans-tree.

---
 gcc/doc/sourcebuild.texi  |   4 +-
 gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c |   4 +-
 gcc/testsuite/lib/gcc-dg.exp  |   1 +
 gcc/testsuite/lib/scanltranstree.exp  | 148 ++
 gcc/testsuite/lib/target-supports.exp |  20 +
 libatomic/testsuite/lib/libatomic.exp |   1 +
 libgomp/testsuite/lib/libgomp.exp |   1 +
 libitm/testsuite/lib/libitm.exp   |   1 +
 libvtv/testsuite/lib/libvtv.exp   |   1 +
 9 files changed, 178 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index db98e9f..8ef2d0e 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2595,8 +2595,8 @@ assembly output.
 
 @subsubsection Scan optimization dump files
 
-These commands are available for @var{kind} of @code{tree}, @code{rtl},
-@code{ipa}, and @code{wpa-ipa}.
+These commands are available for @var{kind} of @code{tree}, @code{ltrans-tree},
+@code{rtl}, @code{ipa}, and @code{wpa-ipa}.
 
 @table @code
 @item scan-@var{kind}-dump @var{regex} @var{suffix} [@{ target/xfail @var{selector} @}]
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c
index 6e7936a..85531ab 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c
@@ -1,5 +1,5 @@
 /* { dg-do link } */
-/* { dg-options "-O2 -fdump-ipa-icf -flto" } */
+/* { dg-options "-O2 -fdump-ipa-icf -flto -fdump-tree-fixup_cfg4" } */
 /* { dg-require-effective-target lto } */
 /* { dg-additional-sources 

Re: [PATCH, GCC-7, GCC-6][ARM][PR target/84826] Backport Fix ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabihf

2018-03-29 Thread Sudakshina Das

Hi Kyrill

On 29/03/18 09:41, Kyrill Tkachov wrote:

Hi Sudi,

On 28/03/18 15:04, Sudakshina Das wrote:

Hi

This patch is a request to backport r258777 and r258805 to gcc-7-branch
and gcc-6-branch. The same ICE occurs in both the branches with
-fstack-check. Thus the test case directive has been changed.

The discussion on the patch that went into trunk is:
https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01120.html

Testing : Regtested on both the branches with arm-none-linux-gnueabihf

Is this ok for gcc-7 and gcc-6?



Ok.
Thanks,
Kyrill


Thanks! Committed to gcc-7-branch as r258948 and gcc-6-branch as r258949.

Sudi




Sudi

ChangeLog entries:

*** gcc/ChangeLog ***

2018-03-28  Sudakshina Das  

    Backport from mainline
    2018-03-22  Sudakshina Das  

    PR target/84826
    * config/arm/arm.h (machine_function): Add 
static_chain_stack_bytes.

    * config/arm/arm.c (arm_compute_static_chain_stack_bytes): Avoid
    re-computing once computed.
    (arm_expand_prologue): Compute machine->static_chain_stack_bytes.
    (arm_init_machine_status): Initialize
    machine->static_chain_stack_bytes.

*** gcc/testsuite/ChangeLog ***

2018-03-28  Sudakshina Das  

    * gcc.target/arm/pr84826.c: Change dg-option to -fstack-check.

    Backport from mainline
    2018-03-23  Sudakshina Das  

    PR target/84826
    * gcc.target/arm/pr84826.c: Add dg directive.

    Backport from mainline
    2018-03-22  Sudakshina Das  

    PR target/84826
    * gcc.target/arm/pr84826.c: New test.






[PATCH] Print function attributes in rtl dumps

2018-03-29 Thread Tom de Vries

Hi,

when we compile a function with attributes:
...
int __attribute__((noinline, noclone))
foo (void)
{
  return 2;
}
...

like this:
...
gcc main.c -fdump-tree-all -fdump-rtl-all
...

we find the function attributes starting from foo.c.004t.gimple:
...
__attribute__((noclone, noinline))
foo ()
{
  int D.1961;

  D.1961 = 2;
  return D.1961;
}
...
to foo.c.232t.optimized.


But we don't find the attributes in the rtl dumps:
...
$ grep __attribute__ foo.c.*r.*
$
...

This patch adds printing of the function attributes in the rtl dump, 
f.i. foo.c.235r.vregs looks like this :

...
;; Function foo (foo, funcdef_no=0, decl_uid=1958, cgraph_uid=0, 
symbol_order=0)


function foo attributes: __attribute__((noclone, noinline))
(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 8 2 (set (reg:SI 87 [ _1 ])
(const_int 2 [0x2])) "foo.c":4 86 {*movsi_internal}
 (nil))
(insn 8 5 12 2 (set (reg:SI 88 [  ])
(reg:SI 87 [ _1 ])) "foo.c":4 86 {*movsi_internal}
 (nil))
(insn 12 8 13 2 (set (reg/i:SI 0 ax)
(reg:SI 88 [  ])) "foo.c":5 86 {*movsi_internal}
 (nil))
(insn 13 12 0 2 (use (reg/i:SI 0 ax)) "foo.c":5 -1
 (nil))
...

I've added the "function foo attributes" prefix because in other rtl 
dumps there may be quite a number of lines between the ";; Function foo" 
header and the first insn.


OK for stage1 if bootstrap and reg-test on x86 succeeds?

Thanks,
- Tom


Re: [PATCH] Print function attributes in rtl dumps

2018-03-29 Thread Tom de Vries

[ Fix ENOPATCH ]

On 03/29/2018 12:17 PM, Tom de Vries wrote:

Hi,

when we compile a function with attributes:
...
int __attribute__((noinline, noclone))
foo (void)
{
   return 2;
}
...

like this:
...
gcc main.c -fdump-tree-all -fdump-rtl-all
...

we find the function attributes starting from foo.c.004t.gimple:
...
__attribute__((noclone, noinline))
foo ()
{
   int D.1961;

   D.1961 = 2;
   return D.1961;
}
...
to foo.c.232t.optimized.


But we don't find the attributes in the rtl dumps:
...
$ grep __attribute__ foo.c.*r.*
$
...

This patch adds printing of the function attributes in the rtl dump, 
f.i. foo.c.235r.vregs looks like this :

...
;; Function foo (foo, funcdef_no=0, decl_uid=1958, cgraph_uid=0, 
symbol_order=0)


function foo attributes: __attribute__((noclone, noinline))
(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 8 2 (set (reg:SI 87 [ _1 ])
     (const_int 2 [0x2])) "foo.c":4 86 {*movsi_internal}
  (nil))
(insn 8 5 12 2 (set (reg:SI 88 [  ])
     (reg:SI 87 [ _1 ])) "foo.c":4 86 {*movsi_internal}
  (nil))
(insn 12 8 13 2 (set (reg/i:SI 0 ax)
     (reg:SI 88 [  ])) "foo.c":5 86 {*movsi_internal}
  (nil))
(insn 13 12 0 2 (use (reg/i:SI 0 ax)) "foo.c":5 -1
  (nil))
...

I've added the "function foo attributes" prefix because in other rtl 
dumps there may be quite a number of lines between the ";; Function foo" 
header and the first insn.


OK for stage1 if bootstrap and reg-test on x86 succeeds?

Thanks,
- Tom


Print function attributes in rtl dump

2018-03-29  Tom de Vries  

	* passes.c (execute_function_dump): Call dump_function_attributes before
	print_rtl_with_bb.
	* tree-cfg.c (dump_function_attributes): New function, factored out of
	...
	(dump_function_to_file): ... here.
	* tree-cfg.h (dump_function_attributes): Declare.

---
 gcc/passes.c   |  5 -
 gcc/tree-cfg.c | 68 ++
 gcc/tree-cfg.h |  1 +
 3 files changed, 45 insertions(+), 29 deletions(-)

diff --git a/gcc/passes.c b/gcc/passes.c
index ad0a912..91518bd 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1761,7 +1761,10 @@ execute_function_dump (function *fn, void *data)
   if (fn->curr_properties & PROP_trees)
 dump_function_to_file (fn->decl, dump_file, dump_flags);
   else
-	print_rtl_with_bb (dump_file, get_insns (), dump_flags);
+	{
+	  dump_function_attributes (fn->decl, dump_file, true);
+	  print_rtl_with_bb (dump_file, get_insns (), dump_flags);
+	}
 
   /* Flush the file.  If verification fails, we won't be able to
 	 close the file before aborting.  */
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 9485f73..d0f8b70 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -7963,6 +7963,45 @@ print_no_sanitize_attr_value (FILE *file, tree value)
 }
 }
 
+/* Dump function attributes of function FNDECL to FILE.  Print prefix showing
+   the function name if PRINT_PREFIX.  */
+
+void
+dump_function_attributes (tree fndecl, FILE *file, bool print_prefix)
+{
+  if (DECL_ATTRIBUTES (fndecl) == NULL_TREE)
+return;
+
+  if (print_prefix)
+fprintf (file, "function %s attributes: ", fndecl_name (fndecl));
+
+  fprintf (file, "__attribute__((");
+
+  bool first = true;
+  tree chain;
+  for (chain = DECL_ATTRIBUTES (fndecl); chain;
+   first = false, chain = TREE_CHAIN (chain))
+{
+  if (!first)
+	fprintf (file, ", ");
+
+  tree name = get_attribute_name (chain);
+  print_generic_expr (file, name, dump_flags);
+  if (TREE_VALUE (chain) != NULL_TREE)
+	{
+	  fprintf (file, " (");
+
+	  if (strstr (IDENTIFIER_POINTER (name), "no_sanitize"))
+	print_no_sanitize_attr_value (file, TREE_VALUE (chain));
+	  else
+	print_generic_expr (file, TREE_VALUE (chain), dump_flags);
+	  fprintf (file, ")");
+	}
+}
+
+  fprintf (file, "))\n");
+}
+
 /* Dump FUNCTION_DECL FN to file FILE using FLAGS (see TDF_* in dumpfile.h)
*/
 
@@ -7978,34 +8017,7 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags)
 		  && decl_is_tm_clone (fndecl));
   struct function *fun = DECL_STRUCT_FUNCTION (fndecl);
 
-  if (DECL_ATTRIBUTES (fndecl) != NULL_TREE)
-{
-  fprintf (file, "__attribute__((");
-
-  bool first = true;
-  tree chain;
-  for (chain = DECL_ATTRIBUTES (fndecl); chain;
-	   first = false, chain = TREE_CHAIN (chain))
-	{
-	  if (!first)
-	fprintf (file, ", ");
-
-	  tree name = get_attribute_name (chain);
-	  print_generic_expr (file, name, dump_flags);
-	  if (TREE_VALUE (chain) != NULL_TREE)
-	{
-	  fprintf (file, " (");
-
-	  if (strstr (IDENTIFIER_POINTER (name), "no_sanitize"))
-		print_no_sanitize_attr_value (file, TREE_VALUE (chain));
-	  else
-		print_generic_expr (file, TREE_VALUE (chain), dump_flags);
-	  fprintf (file, ")");
-	}
-	}
-
-  fprintf (file, "))\n");
-}
+  dump_function_attributes (fndecl, file, false);
 
   current_function_d

Re: [Aarch64] Fix conditional branches with target far away.

2018-03-29 Thread Sameera Deshpande
Hi Sudakshina,

Thanks for pointing that out. Updated the conditions for attribute
length to take care of boundary conditions for offset range.

Please find attached the updated patch.

I have tested it for gcc testsuite and the failing testcase. Ok for trunk?

On 22 March 2018 at 19:06, Sudakshina Das  wrote:
> Hi Sameera
>
> On 22/03/18 02:07, Sameera Deshpande wrote:
>>
>> Hi Sudakshina,
>>
>> As per the ARMv8 ARM, for the offset range (-1048576 ,1048572), the
>> far branch instruction offset is inclusive of both the offsets. Hence,
>> I am using <=||=> and not <||>= as it was in previous implementation.
>
>
> I have to admit earlier I was only looking at the patch mechanically and
> found a difference with the previous implementation in offset comparison.
> After you pointed out, I looked up the ARMv8 ARM and I have a couple of
> doubts:
>
> 1. My understanding is that any offset in [-1048576 ,1048572] both inclusive
> qualifies as an 'in range' offset. However, the code for both attribute
> length and far_branch has been using [-1048576 ,1048572), that is, ( >= && <
> ). If the far_branch was incorrectly calculated, then maybe the length
> calculations with similar magic numbers should also be corrected? Of course,
> I am not an expert in this and maybe this was a conscience decision so I
> would ask Ramana to maybe clarify if he remembers.
>
> 2. Now to come back to your patch, if my understanding is correct, I think a
> far_branch would be anything outside of this range, that is,
> (offset < -1048576 || offset > 1048572), anything that can not be
> represented in the 21-bit range.
>
> Thanks
> Sudi
>
>
>>
>> On 16 March 2018 at 00:51, Sudakshina Das  wrote:
>>>
>>> On 15/03/18 15:27, Sameera Deshpande wrote:


 Ping!

 On 28 February 2018 at 16:18, Sameera Deshpande
  wrote:
>
>
> On 27 February 2018 at 18:25, Ramana Radhakrishnan
>  wrote:
>>
>>
>> On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande
>>  wrote:
>>>
>>>
>>> Hi!
>>>
>>> Please find attached the patch to fix bug in branches with offsets
>>> over
>>> 1MiB.
>>> There has been an attempt to fix this issue in commit
>>> 050af05b9761f1979f11c151519e7244d5becd7c
>>>
>>> However, the far_branch attribute defined in above patch used
>>> insn_length - which computes incorrect offset. Hence, eliminated the
>>> attribute completely, and computed the offset from insn_addresses
>>> instead.
>>>
>>> Ok for trunk?
>>>
>>> gcc/Changelog
>>>
>>> 2018-02-13 Sameera Deshpande 
>>>   * config/aarch64/aarch64.md (far_branch): Remove attribute.
>>> Eliminate
>>>   all the dependencies on the attribute from RTL patterns.
>>>
>>
>> I'm not a maintainer but this looks good to me modulo notes about how
>> this was tested. What would be nice is a testcase for the testsuite as
>> well as ensuring that the patch has been bootstrapped and regression
>> tested. AFAIR, the original patch was put in because match.pd failed
>> when bootstrap in another context.
>>
>>
>> regards
>> Ramana
>>
>>> --
>>> - Thanks and regards,
>>> Sameera D.
>
>
>
> The patch is tested with GCC testsuite and bootstrapping successfully.
> Also tested for spec benchmark.
>
>>>
>>> I am not a maintainer either. I noticed that the range check you do for
>>> the offset has a (<= || >=). The "far_branch" however did (< || >=) for a
>>> positive value. Was that also part of the incorrect offset calculation?
>>>
>>> @@ -692,7 +675,11 @@
>>>  {
>>>if (get_attr_length (insn) =3D=3D 8)
>>>  {
>>> -   if (get_attr_far_branch (insn) =3D=3D 1)
>>> +   long long int offset;
>>> +   offset =3D INSN_ADDRESSES (INSN_UID (XEXP (operands[2], 0)))
>>> + - INSN_ADDRESSES (INSN_UID (insn));
>>> +
>>> +   if (offset <=3D -1048576 || offset >=3D 1048572)
>>> return aarch64_gen_far_branch (operands, 2, "Ltb",
>>>"\\t%0, %1, ");
>>>   else
>>> @@ -709,12 +696,7 @@
>>>   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int
>>> -32768))
>>>  (lt (minus (match_dup 2) (pc)) (const_int
>>> 32764)))
>>> (const_int 4)
>>> - (const_int 8)))
>>> -   (set (attr "far_branch")
>>> -   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int
>>> -1048576))
>>> -  (lt (minus (match_dup 2) (pc)) (const_int
>>> 1048572)))
>>> - (const_int 0)
>>> - (const_int 1)))]
>>> + (const_int 8)))]
>>>
>>>)
>>>
>>> Thanks
>>> Sudi
>>>
> --
> - Thanks and regards,
> Sameera D.





>>>
>>
>>
>>
>



-- 
- Thanks and regards,
  Sameera D.
Index: gcc/config/aarch64/aarch

Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).

2018-03-29 Thread Martin Liška

On 03/28/2018 06:36 PM, Jakub Jelinek wrote:

On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote:

--- a/gcc/config/linux.c
+++ b/gcc/config/linux.c
@@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class)
  
return false;

  }
+
+/* This hook determines whether a function from libc has a fast implementation
+   FN is present at the runtime.  We override it for i386 and glibc C library
+   as this combination provides fast implementation of mempcpy function.  */
+
+enum libc_speed
+ix86_linux_libc_func_speed (int fn)


Putting a ix86_ function into config/linux.c used by most linux targets is
weird.  Either we multiple linux targets with mempcpy fast, then name it
somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h.
And yes, we do care about i?86-linux.  Or it is for x86 only, and then
it shouldn't be in config/linux.c, but either e.g. static inline in
config/i386/linux.h, or we need config/i386/linux.c if we don't have it
already.


I'm fine with putting the implementation into gcc/config/i386/linux.c. Can you 
please
help me how to conditionally build the file?

Martin



Jakub





[PATCH] Documentation tweaks.

2018-03-29 Thread Martin Liška

Hi.

I'm sending couple of documentation tweaks that touch LTO, binutils auto-load 
plugin
support and valid options of -march (on x86-64).

Martin

gcc/ChangeLog:

2018-03-29  Martin Liska  

PR lto/84995.
* doc/invoke.texi: Document how LTO works with debug info.
Describe auto-load support of binutils.  Mention 'x86-64'
as valid option value of -march option.
---
 gcc/doc/invoke.texi | 11 +++
 1 file changed, 11 insertions(+)


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index feacd569ef3..424e426a390 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9482,6 +9482,9 @@ types of hosts.  The bytecode files are versioned and there is a
 strict version check, so bytecode files generated in one version of
 GCC do not work with an older or newer version of GCC.
 
+Link-time optimization on ELF/DWARF systems works with generation of debugging
+information.  Other systems are currently not supported.
+
 Link-time optimization does not work well with generation of debugging
 information.  Combining @option{-flto} with
 @option{-g} is currently experimental and expected to produce unexpected
@@ -9565,6 +9568,11 @@ need to support linker plugins to allow a full-featured build environment
 @command{gcc-nm}, @command{gcc-ranlib} wrappers to pass the right options
 to these tools. With non fat LTO makefiles need to be modified to use them.
 
+Note that modern binutils provide plugin auto-load mechanism.
+Installing the linker plugin into @file{$libdir/bfd-plugins} has the same
+effect as usage of the command wrappers (@command{gcc-ar}, @command{gcc-nm} and
+@command{gcc-ranlib}).
+
 The default is @option{-fno-fat-lto-objects} on targets with linker plugin
 support.
 
@@ -26440,6 +26448,9 @@ the result might not run on different machines).  Using @option{-mtune=native}
 produces code optimized for the local machine under the constraints
 of the selected instruction set.  
 
+@item x86-64
+A generic CPU with 64-bit extensions.
+
 @item i386
 Original Intel i386 CPU@.
 



Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).

2018-03-29 Thread H.J. Lu
On Thu, Mar 29, 2018 at 4:28 AM, Martin Liška  wrote:
> On 03/28/2018 06:36 PM, Jakub Jelinek wrote:
>>
>> On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote:
>>>
>>> --- a/gcc/config/linux.c
>>> +++ b/gcc/config/linux.c
>>> @@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class)
>>>   return false;
>>>   }
>>> +
>>> +/* This hook determines whether a function from libc has a fast
>>> implementation
>>> +   FN is present at the runtime.  We override it for i386 and glibc C
>>> library
>>> +   as this combination provides fast implementation of mempcpy function.
>>> */
>>> +
>>> +enum libc_speed
>>> +ix86_linux_libc_func_speed (int fn)
>>
>>
>> Putting a ix86_ function into config/linux.c used by most linux targets is
>> weird.  Either we multiple linux targets with mempcpy fast, then name it
>> somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h.
>> And yes, we do care about i?86-linux.  Or it is for x86 only, and then
>> it shouldn't be in config/linux.c, but either e.g. static inline in
>> config/i386/linux.h, or we need config/i386/linux.c if we don't have it
>> already.
>
>
> I'm fine with putting the implementation into gcc/config/i386/linux.c. Can
> you please
> help me how to conditionally build the file?

config.gcc:  extra_objs="${extra_objs} linux.o"
config.gcc: extra_objs="$extra_objs powerpcspe-linux.o"
config.gcc: extra_objs="$extra_objs rs6000-linux.o"


-- 
H.J.


Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).

2018-03-29 Thread Martin Liška

On 03/29/2018 02:25 PM, Jakub Jelinek wrote:

On Thu, Mar 29, 2018 at 01:28:13PM +0200, Martin Liška wrote:

On 03/28/2018 06:36 PM, Jakub Jelinek wrote:

On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote:

--- a/gcc/config/linux.c
+++ b/gcc/config/linux.c
@@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class)
 return false;
   }
+
+/* This hook determines whether a function from libc has a fast implementation
+   FN is present at the runtime.  We override it for i386 and glibc C library
+   as this combination provides fast implementation of mempcpy function.  */
+
+enum libc_speed
+ix86_linux_libc_func_speed (int fn)


Putting a ix86_ function into config/linux.c used by most linux targets is
weird.  Either we multiple linux targets with mempcpy fast, then name it
somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h.
And yes, we do care about i?86-linux.  Or it is for x86 only, and then
it shouldn't be in config/linux.c, but either e.g. static inline in
config/i386/linux.h, or we need config/i386/linux.c if we don't have it
already.


I'm fine with putting the implementation into gcc/config/i386/linux.c. Can you 
please


Can't you just put it into gcc/config/i386/linux-common.h as static inline,
so that it is optimized away whenever not needed?


I would like to put it there, but:

g++ -c   -g  -DIN_GCC -fno-exceptions -fno-rtti 
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common  
-DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE -I. -Ibuild -I../../gcc 
-I../../gcc/build -I../../gcc/../include  -I../../gcc/../libcpp/include  \
-o build/genpreds.o ../../gcc/genpreds.c
In file included from ./tm.h:38:0,
 from ../../gcc/genpreds.c:26:
../../gcc/config/i386/linux-common.h: In function ‘libc_speed 
ix86_linux_libc_func_speed(int)’:
../../gcc/config/i386/linux-common.h:137:8: error: use of enum 
‘built_in_function’ without previous declaration
   enum built_in_function f = (built_in_function)fn;
^
../../gcc/config/i386/linux-common.h:137:31: error: ‘built_in_function’ was not 
declared in this scope
   enum built_in_function f = (built_in_function)fn;
   ^
../../gcc/config/i386/linux-common.h:137:31: note: suggested alternative: 
‘machine_function’
   enum built_in_function f = (built_in_function)fn;
   ^
   machine_function
../../gcc/config/i386/linux-common.h:144:12: error: ‘BUILT_IN_MEMPCPY’ was not 
declared in this scope
   case BUILT_IN_MEMPCPY:
^~~~
Martin



If you really want to add a c file, it better not be called linux.c, because
linux.o for it would clash with linux.o from gcc/config/linux.c.  And,
you'd need to add the whatever.o into extra_objs in gcc/config.gcc and add
rules for it into gcc/config/i386/t-linux (see linux.o in config.gcc and
config/t-linux).


help me how to conditionally build the file?


Jakub





Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).

2018-03-29 Thread Jakub Jelinek
On Thu, Mar 29, 2018 at 01:28:13PM +0200, Martin Liška wrote:
> On 03/28/2018 06:36 PM, Jakub Jelinek wrote:
> > On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote:
> > > --- a/gcc/config/linux.c
> > > +++ b/gcc/config/linux.c
> > > @@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class)
> > > return false;
> > >   }
> > > +
> > > +/* This hook determines whether a function from libc has a fast 
> > > implementation
> > > +   FN is present at the runtime.  We override it for i386 and glibc C 
> > > library
> > > +   as this combination provides fast implementation of mempcpy function. 
> > >  */
> > > +
> > > +enum libc_speed
> > > +ix86_linux_libc_func_speed (int fn)
> > 
> > Putting a ix86_ function into config/linux.c used by most linux targets is
> > weird.  Either we multiple linux targets with mempcpy fast, then name it
> > somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h.
> > And yes, we do care about i?86-linux.  Or it is for x86 only, and then
> > it shouldn't be in config/linux.c, but either e.g. static inline in
> > config/i386/linux.h, or we need config/i386/linux.c if we don't have it
> > already.
> 
> I'm fine with putting the implementation into gcc/config/i386/linux.c. Can 
> you please

Can't you just put it into gcc/config/i386/linux-common.h as static inline,
so that it is optimized away whenever not needed?

If you really want to add a c file, it better not be called linux.c, because
linux.o for it would clash with linux.o from gcc/config/linux.c.  And,
you'd need to add the whatever.o into extra_objs in gcc/config.gcc and add
rules for it into gcc/config/i386/t-linux (see linux.o in config.gcc and
config/t-linux).

> help me how to conditionally build the file?

Jakub


[PATCH] i386: Enable AVX/AVX512 features only if supported by OSXSAVE

2018-03-29 Thread H.J. Lu
Enable AVX and AVX512 features only if their states are supported by
OSXSAVE.

OK for trunk and release branches?


H.J.
---
PR target/85100
* config/i386/cpuinfo.c (XCR_XFEATURE_ENABLED_MASK): New.
(XSTATE_FP): Likewise.
(XSTATE_SSE): Likewise.
(XSTATE_YMM): Likewise.
(XSTATE_OPMASK): Likewise.
(XSTATE_ZMM): Likewise.
(XSTATE_HI_ZMM): Likewise.
(XCR_AVX_ENABLED_MASK): Likewise.
(XCR_AVX512F_ENABLED_MASK): Likewise.
(get_available_features): Enable AVX and AVX512 features only
if their states are supported by OSXSAVE.
---
 libgcc/config/i386/cpuinfo.c | 134 +--
 1 file changed, 90 insertions(+), 44 deletions(-)

diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 4eb3f5cd944..1dac110a79a 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -240,6 +240,40 @@ get_available_features (unsigned int ecx, unsigned int edx,
   unsigned int features = 0;
   unsigned int features2 = 0;
 
+  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
+#define XCR_XFEATURE_ENABLED_MASK  0x0
+#define XSTATE_FP  0x1
+#define XSTATE_SSE 0x2
+#define XSTATE_YMM 0x4
+#define XSTATE_OPMASK  0x20
+#define XSTATE_ZMM 0x40
+#define XSTATE_HI_ZMM  0x80
+
+#define XCR_AVX_ENABLED_MASK \
+  (XSTATE_SSE | XSTATE_YMM)
+#define XCR_AVX512F_ENABLED_MASK \
+  (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM)
+
+  /* Check if AVX and AVX512 are usable.  */
+  int avx_usable = 0;
+  int avx512_usable = 0;
+  if ((ecx & bit_OSXSAVE))
+{
+  /* Check if XMM, YMM, OPMASK, upper 256 bits of ZMM0-ZMM15 and
+ ZMM16-ZMM31 states are supported by OSXSAVE.  */
+  unsigned int xcrlow;
+  unsigned int xcrhigh;
+  asm (".byte 0x0f, 0x01, 0xd0"
+  : "=a" (xcrlow), "=d" (xcrhigh)
+  : "c" (XCR_XFEATURE_ENABLED_MASK));
+  if ((xcrlow & XCR_AVX_ENABLED_MASK) == XCR_AVX_ENABLED_MASK)
+   {
+ avx_usable = 1;
+ avx512_usable = ((xcrlow & XCR_AVX512F_ENABLED_MASK)
+  == XCR_AVX512F_ENABLED_MASK);
+   }
+}
+
 #define set_feature(f) \
   if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
 
@@ -265,10 +299,13 @@ get_available_features (unsigned int ecx, unsigned int 
edx,
 set_feature (FEATURE_SSE4_1);
   if (ecx & bit_SSE4_2)
 set_feature (FEATURE_SSE4_2);
-  if (ecx & bit_AVX)
-set_feature (FEATURE_AVX);
-  if (ecx & bit_FMA)
-set_feature (FEATURE_FMA);
+  if (avx_usable)
+{
+  if (ecx & bit_AVX)
+   set_feature (FEATURE_AVX);
+  if (ecx & bit_FMA)
+   set_feature (FEATURE_FMA);
+}
 
   /* Get Advanced Features at level 7 (eax = 7, ecx = 0). */
   if (max_cpuid_level >= 7)
@@ -276,44 +313,50 @@ get_available_features (unsigned int ecx, unsigned int 
edx,
   __cpuid_count (7, 0, eax, ebx, ecx, edx);
   if (ebx & bit_BMI)
set_feature (FEATURE_BMI);
-  if (ebx & bit_AVX2)
-   set_feature (FEATURE_AVX2);
+  if (avx_usable)
+   {
+ if (ebx & bit_AVX2)
+   set_feature (FEATURE_AVX2);
+   }
   if (ebx & bit_BMI2)
set_feature (FEATURE_BMI2);
-  if (ebx & bit_AVX512F)
-   set_feature (FEATURE_AVX512F);
-  if (ebx & bit_AVX512VL)
-   set_feature (FEATURE_AVX512VL);
-  if (ebx & bit_AVX512BW)
-   set_feature (FEATURE_AVX512BW);
-  if (ebx & bit_AVX512DQ)
-   set_feature (FEATURE_AVX512DQ);
-  if (ebx & bit_AVX512CD)
-   set_feature (FEATURE_AVX512CD);
-  if (ebx & bit_AVX512PF)
-   set_feature (FEATURE_AVX512PF);
-  if (ebx & bit_AVX512ER)
-   set_feature (FEATURE_AVX512ER);
-  if (ebx & bit_AVX512IFMA)
-   set_feature (FEATURE_AVX512IFMA);
-  if (ecx & bit_AVX512VBMI)
-   set_feature (FEATURE_AVX512VBMI);
-  if (ecx & bit_AVX512VBMI2)
-   set_feature (FEATURE_AVX512VBMI2);
-  if (ecx & bit_GFNI)
-   set_feature (FEATURE_GFNI);
-  if (ecx & bit_VPCLMULQDQ)
-   set_feature (FEATURE_VPCLMULQDQ);
-  if (ecx & bit_AVX512VNNI)
-   set_feature (FEATURE_AVX512VNNI);
-  if (ecx & bit_AVX512BITALG)
-   set_feature (FEATURE_AVX512BITALG);
-  if (ecx & bit_AVX512VPOPCNTDQ)
-   set_feature (FEATURE_AVX512VPOPCNTDQ);
-  if (edx & bit_AVX5124VNNIW)
-   set_feature (FEATURE_AVX5124VNNIW);
-  if (edx & bit_AVX5124FMAPS)
-   set_feature (FEATURE_AVX5124FMAPS);
+  if (avx512_usable)
+   {
+ if (ebx & bit_AVX512F)
+   set_feature (FEATURE_AVX512F);
+ if (ebx & bit_AVX512VL)
+   set_feature (FEATURE_AVX512VL);
+ if (ebx & bit_AVX512BW)
+   set_feature (FEATURE_AVX512BW);
+ if (ebx & bit_AVX512DQ)
+   set_feature (FEATURE_AVX512DQ);
+ if (

Re: [PATCH] Documentation tweaks.

2018-03-29 Thread Richard Biener
On Thu, Mar 29, 2018 at 2:14 PM, Martin Liška  wrote:
> Hi.
>
> I'm sending couple of documentation tweaks that touch LTO, binutils
> auto-load plugin
> support and valid options of -march (on x86-64).

@@ -9482,6 +9482,9 @@ types of hosts.  The bytecode files are
versioned and there is a
 strict version check, so bytecode files generated in one version of
 GCC do not work with an older or newer version of GCC.

+Link-time optimization on ELF/DWARF systems works with generation of debugging
+information.  Other systems are currently not supported.
+
 Link-time optimization does not work well with generation of debugging
 information.  Combining @option{-flto} with
 @option{-g} is currently experimental and expected to produce unexpected

this just adds to the current warning which doesn't look too useful.

Maybe simply

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 258915)
+++ doc/invoke.texi (working copy)
@@ -9483,9 +9483,8 @@ strict version check, so bytecode files
 GCC do not work with an older or newer version of GCC.

 Link-time optimization does not work well with generation of debugging
-information.  Combining @option{-flto} with
-@option{-g} is currently experimental and expected to produce unexpected
-results.
+information on systems other than those using a combination of ELF and
+DWARF.

 If you specify the optional @var{n}, the optimization and code
 generation done at link time is executed in parallel using @var{n}

?

The other hunks look OK.

Richard.

> Martin
>
> gcc/ChangeLog:
>
> 2018-03-29  Martin Liska  
>
> PR lto/84995.
> * doc/invoke.texi: Document how LTO works with debug info.
> Describe auto-load support of binutils.  Mention 'x86-64'
> as valid option value of -march option.
> ---
>  gcc/doc/invoke.texi | 11 +++
>  1 file changed, 11 insertions(+)
>
>


Re: [PATCH] Documentation tweaks.

2018-03-29 Thread Martin Liška

On 03/29/2018 02:56 PM, Richard Biener wrote:

On Thu, Mar 29, 2018 at 2:14 PM, Martin Liška  wrote:

Hi.

I'm sending couple of documentation tweaks that touch LTO, binutils
auto-load plugin
support and valid options of -march (on x86-64).


@@ -9482,6 +9482,9 @@ types of hosts.  The bytecode files are
versioned and there is a
  strict version check, so bytecode files generated in one version of
  GCC do not work with an older or newer version of GCC.

+Link-time optimization on ELF/DWARF systems works with generation of debugging
+information.  Other systems are currently not supported.
+
  Link-time optimization does not work well with generation of debugging
  information.  Combining @option{-flto} with
  @option{-g} is currently experimental and expected to produce unexpected

this just adds to the current warning which doesn't look too useful.

Maybe simply

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 258915)
+++ doc/invoke.texi (working copy)
@@ -9483,9 +9483,8 @@ strict version check, so bytecode files
  GCC do not work with an older or newer version of GCC.

  Link-time optimization does not work well with generation of debugging
-information.  Combining @option{-flto} with
-@option{-g} is currently experimental and expected to produce unexpected
-results.
+information on systems other than those using a combination of ELF and
+DWARF.

  If you specify the optional @var{n}, the optimization and code
  generation done at link time is executed in parallel using @var{n}

?

The other hunks look OK.

Richard.


Martin

gcc/ChangeLog:

2018-03-29  Martin Liska  

 PR lto/84995.
 * doc/invoke.texi: Document how LTO works with debug info.
 Describe auto-load support of binutils.  Mention 'x86-64'
 as valid option value of -march option.
---
  gcc/doc/invoke.texi | 11 +++
  1 file changed, 11 insertions(+)




Works for me, installed as r258953.

Thanks for the correction.
Martin


Re: [PATCH] i386: Enable AVX/AVX512 features only if supported by OSXSAVE

2018-03-29 Thread Uros Bizjak
On Thu, Mar 29, 2018 at 2:43 PM, H.J. Lu  wrote:
> Enable AVX and AVX512 features only if their states are supported by
> OSXSAVE.
>
> OK for trunk and release branches?
>
>
> H.J.
> ---
> PR target/85100
> * config/i386/cpuinfo.c (XCR_XFEATURE_ENABLED_MASK): New.
> (XSTATE_FP): Likewise.
> (XSTATE_SSE): Likewise.
> (XSTATE_YMM): Likewise.
> (XSTATE_OPMASK): Likewise.
> (XSTATE_ZMM): Likewise.
> (XSTATE_HI_ZMM): Likewise.
> (XCR_AVX_ENABLED_MASK): Likewise.
> (XCR_AVX512F_ENABLED_MASK): Likewise.
> (get_available_features): Enable AVX and AVX512 features only
> if their states are supported by OSXSAVE

OK for trunk and release branches after a couple of days without
problems in trunk.

Thanks,
Uros.

> ---
>  libgcc/config/i386/cpuinfo.c | 134 
> +--
>  1 file changed, 90 insertions(+), 44 deletions(-)
>
> diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
> index 4eb3f5cd944..1dac110a79a 100644
> --- a/libgcc/config/i386/cpuinfo.c
> +++ b/libgcc/config/i386/cpuinfo.c
> @@ -240,6 +240,40 @@ get_available_features (unsigned int ecx, unsigned int 
> edx,
>unsigned int features = 0;
>unsigned int features2 = 0;
>
> +  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
> +#define XCR_XFEATURE_ENABLED_MASK  0x0
> +#define XSTATE_FP  0x1
> +#define XSTATE_SSE 0x2
> +#define XSTATE_YMM 0x4
> +#define XSTATE_OPMASK  0x20
> +#define XSTATE_ZMM 0x40
> +#define XSTATE_HI_ZMM  0x80
> +
> +#define XCR_AVX_ENABLED_MASK \
> +  (XSTATE_SSE | XSTATE_YMM)
> +#define XCR_AVX512F_ENABLED_MASK \
> +  (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM)
> +
> +  /* Check if AVX and AVX512 are usable.  */
> +  int avx_usable = 0;
> +  int avx512_usable = 0;
> +  if ((ecx & bit_OSXSAVE))
> +{
> +  /* Check if XMM, YMM, OPMASK, upper 256 bits of ZMM0-ZMM15 and
> + ZMM16-ZMM31 states are supported by OSXSAVE.  */
> +  unsigned int xcrlow;
> +  unsigned int xcrhigh;
> +  asm (".byte 0x0f, 0x01, 0xd0"
> +  : "=a" (xcrlow), "=d" (xcrhigh)
> +  : "c" (XCR_XFEATURE_ENABLED_MASK));
> +  if ((xcrlow & XCR_AVX_ENABLED_MASK) == XCR_AVX_ENABLED_MASK)
> +   {
> + avx_usable = 1;
> + avx512_usable = ((xcrlow & XCR_AVX512F_ENABLED_MASK)
> +  == XCR_AVX512F_ENABLED_MASK);
> +   }
> +}
> +
>  #define set_feature(f) \
>if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
>
> @@ -265,10 +299,13 @@ get_available_features (unsigned int ecx, unsigned int 
> edx,
>  set_feature (FEATURE_SSE4_1);
>if (ecx & bit_SSE4_2)
>  set_feature (FEATURE_SSE4_2);
> -  if (ecx & bit_AVX)
> -set_feature (FEATURE_AVX);
> -  if (ecx & bit_FMA)
> -set_feature (FEATURE_FMA);
> +  if (avx_usable)
> +{
> +  if (ecx & bit_AVX)
> +   set_feature (FEATURE_AVX);
> +  if (ecx & bit_FMA)
> +   set_feature (FEATURE_FMA);
> +}
>
>/* Get Advanced Features at level 7 (eax = 7, ecx = 0). */
>if (max_cpuid_level >= 7)
> @@ -276,44 +313,50 @@ get_available_features (unsigned int ecx, unsigned int 
> edx,
>__cpuid_count (7, 0, eax, ebx, ecx, edx);
>if (ebx & bit_BMI)
> set_feature (FEATURE_BMI);
> -  if (ebx & bit_AVX2)
> -   set_feature (FEATURE_AVX2);
> +  if (avx_usable)
> +   {
> + if (ebx & bit_AVX2)
> +   set_feature (FEATURE_AVX2);
> +   }
>if (ebx & bit_BMI2)
> set_feature (FEATURE_BMI2);
> -  if (ebx & bit_AVX512F)
> -   set_feature (FEATURE_AVX512F);
> -  if (ebx & bit_AVX512VL)
> -   set_feature (FEATURE_AVX512VL);
> -  if (ebx & bit_AVX512BW)
> -   set_feature (FEATURE_AVX512BW);
> -  if (ebx & bit_AVX512DQ)
> -   set_feature (FEATURE_AVX512DQ);
> -  if (ebx & bit_AVX512CD)
> -   set_feature (FEATURE_AVX512CD);
> -  if (ebx & bit_AVX512PF)
> -   set_feature (FEATURE_AVX512PF);
> -  if (ebx & bit_AVX512ER)
> -   set_feature (FEATURE_AVX512ER);
> -  if (ebx & bit_AVX512IFMA)
> -   set_feature (FEATURE_AVX512IFMA);
> -  if (ecx & bit_AVX512VBMI)
> -   set_feature (FEATURE_AVX512VBMI);
> -  if (ecx & bit_AVX512VBMI2)
> -   set_feature (FEATURE_AVX512VBMI2);
> -  if (ecx & bit_GFNI)
> -   set_feature (FEATURE_GFNI);
> -  if (ecx & bit_VPCLMULQDQ)
> -   set_feature (FEATURE_VPCLMULQDQ);
> -  if (ecx & bit_AVX512VNNI)
> -   set_feature (FEATURE_AVX512VNNI);
> -  if (ecx & bit_AVX512BITALG)
> -   set_feature (FEATURE_AVX512BITALG);
> -  if (ecx & bit_AVX512VPOPCNTDQ)
> -   set_feature (FEATURE_AVX512VPOPCNTDQ);
> -  if (edx & bit_AVX5124VNNIW)
> -   set_feature (FEATURE_AVX5124VNNIW);
> -  if (edx & bit_AVX

Re: [PATCH 1/2] More underlining of bad arguments (PR c++/85110)

2018-03-29 Thread Jason Merrill
OK.

On Wed, Mar 28, 2018 at 4:44 PM, David Malcolm  wrote:
> As of r256448, the C++ frontend underlines many bad arguments in its
> diagnostics; those where perform_overload_resolution returns a
> non-NULL candidate, but there's a failure in convert_like_real.
>
> However, for the case where perform_overload_resolution fails, but
> there's a single non-viable candidate, the error is diagnosed by
> cp_build_function_call_vec, and that currently doesn't underline
> the bad argument:
>
> $ cat test.cc
> void callee (int one, const char **two, int three);
>
> void
> caller (const char *fmt)
> {
>   callee (1, fmt, 3);
> }
>
> We emit:
>
> $ g++ test.cc
> test.cc: In function 'void caller(const char*)':
> test.cc:6:20: error: cannot convert 'const char*' to 'const char**' for 
> argument '2' to 'void callee(int, const char**, int)'
>callee (1, fmt, 3);
> ^
>
> It's going through convert_for_assignment, and
> implicitly using input_location.
>
> This patch updates convert_for_assignment for this case, using
> an EXPR_LOCATION if there is one, or falling back to input_location
> otherwise, underlining the argument in question:
>
> test.cc: In function 'void caller(const char*)':
> test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for 
> argument '2' to 'void callee(int, const char**, int)'
>callee (1, fmt, 3);
>   ^~~
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu; adds
> 117 PASS results to g++.sum.
>
> Not a regression as such, but I've been calling out the underlined
> arguments as a feature of gcc 8, so would be good to fix.
>
> OK for trunk?
>
> gcc/cp/ChangeLog:
> PR c++/85110
> * typeck.c (convert_for_assignment): When complaining due to
> conversions for an argument, attempt to use the location of the
> argument.
>
> gcc/testsuite/ChangeLog:
> PR c++/85110
> * g++.dg/diagnostic/param-type-mismatch-2.C: New test.
> ---
>  gcc/cp/typeck.c|   5 +-
>  .../g++.dg/diagnostic/param-type-mismatch-2.C  | 175 
> +
>  2 files changed, 178 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C
>
> diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
> index 0c2ebd1..e733c79 100644
> --- a/gcc/cp/typeck.c
> +++ b/gcc/cp/typeck.c
> @@ -8781,8 +8781,9 @@ convert_for_assignment (tree type, tree rhs,
>parmnum, complain, flags);
> }
>   else if (fndecl)
> -   error ("cannot convert %qH to %qI for argument %qP to %qD",
> -  rhstype, type, parmnum, fndecl);
> +   error_at (EXPR_LOC_OR_LOC (rhs, input_location),
> + "cannot convert %qH to %qI for argument %qP to %qD",
> + rhstype, type, parmnum, fndecl);
>   else
> switch (errtype)
>   {
> diff --git a/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C 
> b/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C
> new file mode 100644
> index 000..ae84248
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C
> @@ -0,0 +1,175 @@
> +// { dg-options "-fdiagnostics-show-caret" }
> +
> +/* A collection of calls where argument 2 is of the wrong type.  */
> +
> +/* decl, with argname.  */
> +
> +extern int callee_1 (int one, const char **two, float three);
> +
> +int test_1 (int first, const char *second, float third)
> +{
> +  return callee_1 (first, second, third); // { dg-error "27: cannot convert 
> 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int 
> callee_1\\(int, const char\\*\\*, float\\)'" }
> +  /* { dg-begin-multiline-output "" }
> +   return callee_1 (first, second, third);
> +   ^~
> + { dg-end-multiline-output "" } */
> +}
> +
> +/* decl, without argname.  */
> +
> +extern int callee_2 (int, const char **, float);
> +
> +int test_2 (int first, const char *second, float third)
> +{
> +  return callee_2 (first, second, third); // { dg-error "27: cannot convert 
> 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int 
> callee_2\\(int, const char\\*\\*, float\\)'" }
> +  /* { dg-begin-multiline-output "" }
> +   return callee_2 (first, second, third);
> +   ^~
> + { dg-end-multiline-output "" } */
> +}
> +
> +/* defn, with argname.  */
> +
> +static int callee_3 (int one, const char **two, float three)
> +{
> +  return callee_2 (one, two, three);
> +}
> +
> +int test_3 (int first, const char *second, float third)
> +{
> +  return callee_3 (first, second, third); // { dg-error "27: cannot convert 
> 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int 
> callee_3\\(int, const char\\*\\*, float\\)'" }
> +  /* { dg-begin-multiline-output "" }
> +   return callee_3 (first, second, third);
> +  

Re: [PATCH 2/2] Show pertinent parameter (PR c++/85110)

2018-03-29 Thread Jason Merrill
On Wed, Mar 28, 2018 at 4:44 PM, David Malcolm  wrote:
> This followup patch updates the specific error-handling path
> to add a note showing the pertinent parameter decl, taking
> the output from:
>
> test.cc: In function 'void caller(const char*)':
> test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for 
> argument '2' to 'void callee(int, const char**, int)'
>callee (1, fmt, 3);
>   ^~~
>
> to:
>
> test.cc: In function 'void caller(const char*)':
> test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for 
> argument '2' to 'void callee(int, const char**, int)'
>callee (1, fmt, 3);
>   ^~~
> test.cc:1:36: note:   initializing argument 2 of 'void callee(int, const 
> char**, int)'
>  void callee (int one, const char **two, int three);
>~^~~
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu; adds
> a further 18 PASS results to g++.sum.
>
> Again, not a regression as such, but I've been calling out the underlined
> arguments as a feature of gcc 8, so would be good to fix.
>
> OK for trunk?
>
> gcc/cp/ChangeLog:
> PR c++/85110
> * call.c (get_fndecl_argument_location): Make non-static.
> * cp-tree.h (get_fndecl_argument_location): New decl.
> * typeck.c (convert_for_assignment): When complaining due to
> conversions for an argument, show the location of the parameter
> within the decl.
>
> gcc/testsuite/ChangeLog:
> PR c++/85110
> * g++.dg/diagnostic/param-type-mismatch-2.C: Update for the cases
> where we now show the pertinent parameter.
> ---
>  gcc/cp/call.c   |  2 +-
>  gcc/cp/cp-tree.h|  2 ++
>  gcc/cp/typeck.c | 10 +++---
>  .../g++.dg/diagnostic/param-type-mismatch-2.C   | 21 
> ++---
>  4 files changed, 28 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/cp/call.c b/gcc/cp/call.c
> index 1a87f99..e1a0639 100644
> --- a/gcc/cp/call.c
> +++ b/gcc/cp/call.c
> @@ -6598,7 +6598,7 @@ maybe_print_user_conv_context (conversion *convs)
> ARGNUM is zero based, -1 indicates the `this' argument of a method.
> Return the location of the FNDECL itself if there are problems.  */
>
> -static location_t
> +location_t
>  get_fndecl_argument_location (tree fndecl, int argnum)
>  {
>int i;
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index d5382c2..b45880d 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -5965,6 +5965,8 @@ extern bool can_convert_arg   
> (tree, tree, tree, int,
>  tsubst_flags_t);
>  extern bool can_convert_arg_bad(tree, tree, tree, 
> int,
>  tsubst_flags_t);
> +extern location_t get_fndecl_argument_location  (tree, int);
> +
>
>  /* A class for recording information about access failures (e.g. private
> fields), so that we can potentially supply a fix-it hint about
> diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
> index e733c79..742b2e9 100644
> --- a/gcc/cp/typeck.c
> +++ b/gcc/cp/typeck.c
> @@ -8781,9 +8781,13 @@ convert_for_assignment (tree type, tree rhs,
>parmnum, complain, flags);
> }
>   else if (fndecl)
> -   error_at (EXPR_LOC_OR_LOC (rhs, input_location),
> - "cannot convert %qH to %qI for argument %qP to %qD",
> - rhstype, type, parmnum, fndecl);
> +   {
> + error_at (EXPR_LOC_OR_LOC (rhs, input_location),
> +   "cannot convert %qH to %qI for argument %qP to 
> %qD",
> +   rhstype, type, parmnum, fndecl);
> + inform (get_fndecl_argument_location (fndecl, parmnum),
> + "  initializing argument %P of %qD", parmnum, 
> fndecl);

If you're adding the inform, you don't need the %P of %D in the initial error.

Jason


[PATCH] PR libgcc/60790: Avoid IFUNC resolver access to uninitialized data

2018-03-29 Thread Florian Weimer
This patch performs lazy initialization of the relevant CPUID feature 
register value.  It will needlessly invoke the CPUID determination code 
on architectures which lack CPUID support or support for the feature 
register, but I think it's not worth to avoid the complexity for that.


I verified manually that the CMPXCHG16B implementation is still selected 
for a 128-bit load after the change.  I don't know how to write an 
automated test for that.


Thanks,
Florian
Index: libatomic/ChangeLog
===
--- libatomic/ChangeLog	(revision 258952)
+++ libatomic/ChangeLog	(working copy)
@@ -1,3 +1,18 @@
+2018-03-29  Florian Weimer  
+
+	PR libgcc/60790
+	x86: Do not assume ELF constructors run before IFUNC resolvers.
+	* config/x86/host-config.h (libat_feat1_ecx, libat_feat1_edx):
+	Remove declarations.
+	(__libat_feat1, __libat_feat1_init): Declare.
+	(FEAT1_REGISTER): Define.
+	(load_feat1): New function.
+	(IFUNC_COND_1): Adjust.
+	* config/x86/init.c (libat_feat1_ecx, libat_feat1_edx)
+	(init_cpuid): Remove definitions.
+	(__libat_feat1): New variable.
+	(__libat_feat1_init): New function.
+
 2018-03-09  Andreas Krebbel  
 
 	* config/s390/exch_n.c: New file.
Index: libatomic/config/x86/host-config.h
===
--- libatomic/config/x86/host-config.h	(revision 258952)
+++ libatomic/config/x86/host-config.h	(working copy)
@@ -25,13 +25,39 @@
 #if HAVE_IFUNC
 #include 
 
-extern unsigned int libat_feat1_ecx HIDDEN;
-extern unsigned int libat_feat1_edx HIDDEN;
+#ifdef __x86_64__
+# define FEAT1_REGISTER ecx
+#else
+# define FEAT1_REGISTER edx
+#endif
 
+/* Value of the CPUID feature FEAT1_REGISTER for the cmpxchg bit for
+   IFUNC_COND1 below.  */
+extern unsigned int __libat_feat1 HIDDEN;
+
+/* Initialize libat_feat1 and return its value.  */
+unsigned int __libat_feat1_init (void) HIDDEN;
+
+/* Return the value of the relevant feature register for the relevant
+   cmpxchg bit, or 0 if there is no CPUID support.  */
+static inline unsigned int
+__attribute__ ((const))
+load_feat1 (void)
+{
+  /* Synchronizes with the store in __libat_feat1_init.  */
+  unsigned int feat1 = __atomic_load_n (&__libat_feat1, __ATOMIC_RELAXED);
+  if (feat1 == 0)
+/* Assume that initialization has not happened yet.  This may get
+   called repeatedly if the CPU does not have any feature bits at
+   all.  */
+feat1 = __libat_feat1_init ();
+  return feat1;
+}
+
 #ifdef __x86_64__
-# define IFUNC_COND_1	(libat_feat1_ecx & bit_CMPXCHG16B)
+# define IFUNC_COND_1	(load_feat1 () & bit_CMPXCHG16B)
 #else
-# define IFUNC_COND_1	(libat_feat1_edx & bit_CMPXCHG8B)
+# define IFUNC_COND_1	(load_feat1 () & bit_CMPXCHG8B)
 #endif
 
 #ifdef __x86_64__
Index: libatomic/config/x86/init.c
===
--- libatomic/config/x86/init.c	(revision 258952)
+++ libatomic/config/x86/init.c	(working copy)
@@ -26,13 +26,17 @@
 
 #if HAVE_IFUNC
 
-unsigned int libat_feat1_ecx, libat_feat1_edx;
+unsigned int __libat_feat1;
 
-static void __attribute__((constructor))
-init_cpuid (void)
+unsigned int
+__libat_feat1_init (void)
 {
-  unsigned int eax, ebx;
-  __get_cpuid (1, &eax, &ebx, &libat_feat1_ecx, &libat_feat1_edx);
+  unsigned int eax, ebx, ecx, edx;
+  FEAT1_REGISTER = 0;
+  __get_cpuid (1, &eax, &ebx, &ecx, &edx);
+  /* Synchronizes with the load in load_feat1.  */
+  __atomic_store_n (&__libat_feat1, FEAT1_REGISTER, __ATOMIC_RELAXED);
+  return FEAT1_REGISTER;
 }
 
 #endif /* HAVE_IFUNC */


Re: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2018-03-29 Thread Alan Modra
On Thu, Nov 16, 2017 at 03:27:01PM +, Tamar Christina wrote:
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -2769,7 +2769,9 @@ copy_blkmode_to_reg (machine_mode mode, tree src)
>  
>n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
>dst_words = XALLOCAVEC (rtx, n_regs);
> -  bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD);
> +  bitsize = BITS_PER_WORD;
> +  if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE 
> (src
> +bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD);
>  
>/* Copy the structure BITSIZE bits at a time.  */
>for (bitpos = 0, xbitpos = padding_correction;

I believe this patch is wrong.  Please revert.  See the PR84762 testcase.

There are two problems.  Firstly, if padding_correction is non-zero,
then xbitpos % BITS_PER_WORD is non-zero and in

  store_bit_field (dst_word, bitsize, xbitpos % BITS_PER_WORD,
   0, 0, word_mode,
   extract_bit_field (src_word, bitsize,
  bitpos % BITS_PER_WORD, 1,
  NULL_RTX, word_mode, word_mode,
  false, NULL),
   false);

the stored bit-field exceeds the destination register size.  You could
fix that by making bitsize the gcd of bitsize and padding_correction.

However, that doesn't fix the second problem which is that the
extracted bit-field can exceed the source size.  That will result in
rubbish being read into a register.

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH] use warning_n instead of _at (PR 84818)

2018-03-29 Thread Martin Sebor

The attached patch replaces a couple of uses of warning_at
in builtins.c with the more appropriate warning_n calls as
suggested in the bug.

The change passes regression testing on x86_64-linux.  I'll
go ahead and commit it as obvious in the next day or so unless
there are objections.

Thanks
Martin
PR tree-optimization/84818 - integer_onep must not be used for i18n

gcc/ChangeLog:

	PR tree-optimization/84818
	* builtins.c (check_access): Use warning_n.

Index: gcc/builtins.c
===
--- gcc/builtins.c	(revision 258941)
+++ gcc/builtins.c	(working copy)
@@ -3172,13 +3172,12 @@ check_access (tree exp, tree, tree, tree dstwrite,
 			  exp, func, range[0], dstsize);
 	}
 	  else if (tree_int_cst_equal (range[0], range[1]))
-	warning_at (loc, opt,
-			(integer_onep (range[0])
-			 ? G_("%K%qD writing %E byte into a region "
-			  "of size %E overflows the destination")
-			 : G_("%K%qD writing %E bytes into a region "
-			  "of size %E overflows the destination")),
-			exp, func, range[0], dstsize);
+	warning_n (loc, opt, tree_to_uhwi (range[0]),
+		   "%K%qD writing %E byte into a region "
+		   "of size %E overflows the destination",
+		   "%K%qD writing %E bytes into a region "
+		   "of size %E overflows the destination",
+		   exp, func, range[0], dstsize);
 	  else if (tree_int_cst_sign_bit (range[1]))
 	{
 	  /* Avoid printing the upper bound if it's invalid.  */
@@ -3273,10 +3272,9 @@ check_access (tree exp, tree, tree, tree dstwrite,
   location_t loc = tree_nonartificial_location (exp);
 
   if (tree_int_cst_equal (range[0], range[1]))
-	warning_at (loc, opt,
-		(tree_int_cst_equal (range[0], integer_one_node)
-		 ? G_("%K%qD reading %E byte from a region of size %E")
-		 : G_("%K%qD reading %E bytes from a region of size %E")),
+	warning_n (loc, opt, tree_to_uhwi (range[0]),
+		   "%K%qD reading %E byte from a region of size %E",
+		   "%K%qD reading %E bytes from a region of size %E",
 		exp, func, range[0], slen);
   else if (tree_int_cst_sign_bit (range[1]))
 	{


[PATCH, rs6000] Tidy implementation of vec_ldl

2018-03-29 Thread Kelvin Nilsen

During code review, an inconsistency was noticed in some
of the prototypes defined for the vec_ldl built-in function.
In particular, the vector fetched from an address declare
to be long long * was returned as "vector int".  In
addressing this problem, certain other inconsistencies
and omissions were discovered.  This patch tidies up
the implementation of this function.  A separate patch
is in preparation to address the documentation for this
and all other PowerPC built-in functions.

In summary, this patch removes two prototypes:

  vector int vec_ldl (int, long int *)
  vector unsigned int vec_ldl (int, unsigned long int *)

and adds eight:

  vector bool char vec_ldl (int, bool char *)
  vector bool short vec_ldl (int, bool short *)
  vector bool int vec_ldl (int, bool int *)
  vector bool long long vec_ldl (int, bool long long *)
  vector pixel vec_ldl (int, pixel *)
  vector long long vec_ldl (int, long long *)
  vector unsigned long long vec_ldl (int, unsigned long long *)

This patch has been bootstrapped and tested without
regressions on both powerpc64le-unknown-linux (P8)
and on powerpc-linux (P7 big-endian, with both -m32
and -m64 target options).

Is this ok for trunk?

gcc/ChangeLog:

2018-03-29  Kelvin Nilsen 

    * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
    erroneous entries for
    "vector int vec_ldl (int, long int *)", and
    "vector    unsigned int vec_ldl (int, unsigned long int *)".
    Add comments and entries for
    "vector bool char vec_ldl (int, bool char *)",
    "vector bool short vec_ldl (int, bool short *)",
    "vector bool int vec_ldl (int, bool int *)",
    "vector bool long long vec_ldl (int, bool long long *)",
    "vector pixel vec_ldl (int, pixel *)",
    "vector long long vec_ldl (int, long long *)",
    "vector unsigned long long vec_ldl (int, unsigned long long *)".
    * config/rs6000/rs6000.c (rs6000_init_builtins): Initialize new
    type tree bool_long_long_type_node and correct definition of
    bool_V2DI_type_node to make reference to this new type tree.
    (rs6000_mangle_type): Replace erroneous reference to
    bool_long_type_node with bool_long_long_type_node.
    * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add
    comments to emphasize sign distinctions for char and int types and
    replace RS6000_BTI_bool_long constant with
    RS6000_BTI_bool_long_long constant.
    (bool_long_type_node): Remove this macro definition.
    (bool_long_long_type_node): New macro definition

gcc/testsuite/ChangeLog:

2018-03-29  Kelvin Nilsen 

    * gcc.target/powerpc/vec-ldl-1.c: New test.
    * gcc.dg/vmx/ops-long-1.c: Correct test programs to reflect
    corrections to ABI implementation.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c    (revision 258800)
+++ gcc/config/rs6000/rs6000.c    (working copy)
@@ -16947,7 +16947,7 @@ rs6000_init_builtins (void)
   bool_char_type_node = build_distinct_type_copy 
(unsigned_intQI_type_node);
   bool_short_type_node = build_distinct_type_copy 
(unsigned_intHI_type_node);
   bool_int_type_node = build_distinct_type_copy 
(unsigned_intSI_type_node);
-  bool_long_type_node = build_distinct_type_copy 
(unsigned_intDI_type_node);
+  bool_long_long_type_node = build_distinct_type_copy 
(unsigned_intDI_type_node);

   pixel_type_node = build_distinct_type_copy (unsigned_intHI_type_node);

   long_integer_type_internal_node = long_integer_type_node;
@@ -17064,7 +17064,7 @@ rs6000_init_builtins (void)
   bool_V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64
                     ? "__vector __bool long"
                     : "__vector __bool long long",
-                        bool_long_type_node, 2);
+                        bool_long_long_type_node, 2);
   pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel",
                  pixel_type_node, 8);

@@ -32855,7 +32855,7 @@ rs6000_mangle_type (const_tree type)
   if (type == bool_short_type_node) return "U6__bools";
   if (type == pixel_type_node) return "u7__pixel";
   if (type == bool_int_type_node) return "U6__booli";
-  if (type == bool_long_type_node) return "U6__booll";
+  if (type == bool_long_long_type_node) return "U6__booll";

   /* Use a unique name for __float128 rather than trying to use "e" or 
"g". Use
  "g" for IBM extended double, no matter whether it is long double 
(using

Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h    (revision 258800)
+++ gcc/config/rs6000/rs6000.h    (working copy)
@@ -2578,7 +2578,7 @@ enum rs6000_builtin_type_index
   RS6000_BTI_opaque_V2SF,
   RS6000_BTI_opaque_p_V2SI,
   RS6000_BTI_opaque_V4SI,
-  RS6000_BTI_V16QI,
+  RS6000_BTI_V16QI,  /* __vector signed char */
   RS6000_BTI_V1TI,
   RS6000_BTI_V2SI,
   RS6000_BTI_V2SF,
@@ -2588,7 +2588,7 @@ enum rs6000_builtin_type_index
   RS6000_BTI_

Re: [Aarch64] Fix conditional branches with target far away.

2018-03-29 Thread Sudakshina Das

Hi Sameera

On 29/03/18 11:44, Sameera Deshpande wrote:

Hi Sudakshina,

Thanks for pointing that out. Updated the conditions for attribute
length to take care of boundary conditions for offset range.

Please find attached the updated patch.

I have tested it for gcc testsuite and the failing testcase. Ok for trunk?


Thank you so much for fixing the length as well along with you patch.
You mention a failing testcase? Maybe it would be helpful to add that
to the patch for the gcc testsuite.

Sudi



On 22 March 2018 at 19:06, Sudakshina Das  wrote:

Hi Sameera

On 22/03/18 02:07, Sameera Deshpande wrote:


Hi Sudakshina,

As per the ARMv8 ARM, for the offset range (-1048576 ,1048572), the
far branch instruction offset is inclusive of both the offsets. Hence,
I am using <=||=> and not <||>= as it was in previous implementation.



I have to admit earlier I was only looking at the patch mechanically and
found a difference with the previous implementation in offset comparison.
After you pointed out, I looked up the ARMv8 ARM and I have a couple of
doubts:

1. My understanding is that any offset in [-1048576 ,1048572] both inclusive
qualifies as an 'in range' offset. However, the code for both attribute
length and far_branch has been using [-1048576 ,1048572), that is, ( >= && <
). If the far_branch was incorrectly calculated, then maybe the length
calculations with similar magic numbers should also be corrected? Of course,
I am not an expert in this and maybe this was a conscience decision so I
would ask Ramana to maybe clarify if he remembers.

2. Now to come back to your patch, if my understanding is correct, I think a
far_branch would be anything outside of this range, that is,
(offset < -1048576 || offset > 1048572), anything that can not be
represented in the 21-bit range.

Thanks
Sudi




On 16 March 2018 at 00:51, Sudakshina Das  wrote:


On 15/03/18 15:27, Sameera Deshpande wrote:



Ping!

On 28 February 2018 at 16:18, Sameera Deshpande
 wrote:



On 27 February 2018 at 18:25, Ramana Radhakrishnan
 wrote:



On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande
 wrote:



Hi!

Please find attached the patch to fix bug in branches with offsets
over
1MiB.
There has been an attempt to fix this issue in commit
050af05b9761f1979f11c151519e7244d5becd7c

However, the far_branch attribute defined in above patch used
insn_length - which computes incorrect offset. Hence, eliminated the
attribute completely, and computed the offset from insn_addresses
instead.

Ok for trunk?

gcc/Changelog

2018-02-13 Sameera Deshpande 
   * config/aarch64/aarch64.md (far_branch): Remove attribute.
Eliminate
   all the dependencies on the attribute from RTL patterns.



I'm not a maintainer but this looks good to me modulo notes about how
this was tested. What would be nice is a testcase for the testsuite as
well as ensuring that the patch has been bootstrapped and regression
tested. AFAIR, the original patch was put in because match.pd failed
when bootstrap in another context.


regards
Ramana


--
- Thanks and regards,
 Sameera D.




The patch is tested with GCC testsuite and bootstrapping successfully.
Also tested for spec benchmark.



I am not a maintainer either. I noticed that the range check you do for
the offset has a (<= || >=). The "far_branch" however did (< || >=) for a
positive value. Was that also part of the incorrect offset calculation?

@@ -692,7 +675,11 @@
  {
if (get_attr_length (insn) =3D=3D 8)
  {
-   if (get_attr_far_branch (insn) =3D=3D 1)
+   long long int offset;
+   offset =3D INSN_ADDRESSES (INSN_UID (XEXP (operands[2], 0)))
+ - INSN_ADDRESSES (INSN_UID (insn));
+
+   if (offset <=3D -1048576 || offset >=3D 1048572)
 return aarch64_gen_far_branch (operands, 2, "Ltb",
"\\t%0, %1, ");
   else
@@ -709,12 +696,7 @@
   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int
-32768))
  (lt (minus (match_dup 2) (pc)) (const_int
32764)))
 (const_int 4)
- (const_int 8)))
-   (set (attr "far_branch")
-   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int
-1048576))
-  (lt (minus (match_dup 2) (pc)) (const_int
1048572)))
- (const_int 0)
- (const_int 1)))]
+ (const_int 8)))]

)

Thanks
Sudi


--
- Thanks and regards,
 Sameera D.






















Re: [Aarch64] Fix conditional branches with target far away.

2018-03-29 Thread Sameera Deshpande
Hi Sudakshina,

That testcase cannot be addwd as of now, as it needs approval from client.

On Thu 29 Mar, 2018, 9:01 PM Sudakshina Das,  wrote:

> Hi Sameera
>
> On 29/03/18 11:44, Sameera Deshpande wrote:
> > Hi Sudakshina,
> >
> > Thanks for pointing that out. Updated the conditions for attribute
> > length to take care of boundary conditions for offset range.
> >
> > Please find attached the updated patch.
> >
> > I have tested it for gcc testsuite and the failing testcase. Ok for
> trunk?
>
> Thank you so much for fixing the length as well along with you patch.
> You mention a failing testcase? Maybe it would be helpful to add that
> to the patch for the gcc testsuite.
>
> Sudi
>
> >
> > On 22 March 2018 at 19:06, Sudakshina Das  wrote:
> >> Hi Sameera
> >>
> >> On 22/03/18 02:07, Sameera Deshpande wrote:
> >>>
> >>> Hi Sudakshina,
> >>>
> >>> As per the ARMv8 ARM, for the offset range (-1048576 ,1048572), the
> >>> far branch instruction offset is inclusive of both the offsets. Hence,
> >>> I am using <=||=> and not <||>= as it was in previous implementation.
> >>
> >>
> >> I have to admit earlier I was only looking at the patch mechanically and
> >> found a difference with the previous implementation in offset
> comparison.
> >> After you pointed out, I looked up the ARMv8 ARM and I have a couple of
> >> doubts:
> >>
> >> 1. My understanding is that any offset in [-1048576 ,1048572] both
> inclusive
> >> qualifies as an 'in range' offset. However, the code for both attribute
> >> length and far_branch has been using [-1048576 ,1048572), that is, ( >=
> && <
> >> ). If the far_branch was incorrectly calculated, then maybe the length
> >> calculations with similar magic numbers should also be corrected? Of
> course,
> >> I am not an expert in this and maybe this was a conscience decision so I
> >> would ask Ramana to maybe clarify if he remembers.
> >>
> >> 2. Now to come back to your patch, if my understanding is correct, I
> think a
> >> far_branch would be anything outside of this range, that is,
> >> (offset < -1048576 || offset > 1048572), anything that can not be
> >> represented in the 21-bit range.
> >>
> >> Thanks
> >> Sudi
> >>
> >>
> >>>
> >>> On 16 March 2018 at 00:51, Sudakshina Das  wrote:
> 
>  On 15/03/18 15:27, Sameera Deshpande wrote:
> >
> >
> > Ping!
> >
> > On 28 February 2018 at 16:18, Sameera Deshpande
> >  wrote:
> >>
> >>
> >> On 27 February 2018 at 18:25, Ramana Radhakrishnan
> >>  wrote:
> >>>
> >>>
> >>> On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande
> >>>  wrote:
> 
> 
>  Hi!
> 
>  Please find attached the patch to fix bug in branches with offsets
>  over
>  1MiB.
>  There has been an attempt to fix this issue in commit
>  050af05b9761f1979f11c151519e7244d5becd7c
> 
>  However, the far_branch attribute defined in above patch used
>  insn_length - which computes incorrect offset. Hence, eliminated
> the
>  attribute completely, and computed the offset from insn_addresses
>  instead.
> 
>  Ok for trunk?
> 
>  gcc/Changelog
> 
>  2018-02-13 Sameera Deshpande 
> * config/aarch64/aarch64.md (far_branch): Remove
> attribute.
>  Eliminate
> all the dependencies on the attribute from RTL
> patterns.
> 
> >>>
> >>> I'm not a maintainer but this looks good to me modulo notes about
> how
> >>> this was tested. What would be nice is a testcase for the
> testsuite as
> >>> well as ensuring that the patch has been bootstrapped and
> regression
> >>> tested. AFAIR, the original patch was put in because match.pd
> failed
> >>> when bootstrap in another context.
> >>>
> >>>
> >>> regards
> >>> Ramana
> >>>
>  --
>  - Thanks and regards,
>   Sameera D.
> >>
> >>
> >>
> >> The patch is tested with GCC testsuite and bootstrapping
> successfully.
> >> Also tested for spec benchmark.
> >>
> 
>  I am not a maintainer either. I noticed that the range check you do
> for
>  the offset has a (<= || >=). The "far_branch" however did (< || >=)
> for a
>  positive value. Was that also part of the incorrect offset
> calculation?
> 
>  @@ -692,7 +675,11 @@
>    {
>  if (get_attr_length (insn) =3D=3D 8)
>    {
>  -   if (get_attr_far_branch (insn) =3D=3D 1)
>  +   long long int offset;
>  +   offset =3D INSN_ADDRESSES (INSN_UID (XEXP (operands[2], 0)))
>  + - INSN_ADDRESSES (INSN_UID (insn));
>  +
>  +   if (offset <=3D -1048576 || offset >=3D 1048572)
>   return aarch64_gen_far_branch (operands, 2, "Ltb",
>  "\\t%0, %1, ");
> else
>  @@ -7

[committed] Add testcase for already fixed -Wunused-but-set-variable PR c++/85108

2018-03-29 Thread Jakub Jelinek
Hi!

This got fixed with r253266, but I don't see it covered with a testcase.
Tested on x86_64-linux and i686-linux, committed to trunk as obvious.

2018-03-29  Jakub Jelinek  

PR c++/85108
* g++.dg/warn/Wunused-var-31.C: New test.

--- gcc/testsuite/g++.dg/warn/Wunused-var-31.C.jj   2018-03-29 
13:15:26.433012694 +0200
+++ gcc/testsuite/g++.dg/warn/Wunused-var-31.C  2018-03-29 13:09:48.184877285 
+0200
@@ -0,0 +1,11 @@
+// PR c++/85108
+// { dg-do compile { target c++17 } }
+// { dg-options "-Wunused-but-set-variable" }
+
+int
+main ()
+{
+  auto constexpr add = [] (auto a, auto b) { return a + b; };  // { dg-bogus 
"set but not used" }
+  auto test_lambda = [&] () { return add (2, 2); };
+  return test_lambda () - 4;
+}

Jakub


[x86] Skylake tuning options

2018-03-29 Thread Koval, Julia
Hi,

This patch adds 2 tuning options for -mtune=skylake-avx512. Ok for trunk?

gcc/
* x86-tune.def (movx, partial_reg_dependency): Enable for 
m_SKYLAKE_AVX512.

Thanks,
Julia


0001-ptc.patch
Description: 0001-ptc.patch


Re: [Aarch64] Fix conditional branches with target far away.

2018-03-29 Thread Kyrill Tkachov

Hi Sameera,

On 29/03/18 11:44, Sameera Deshpande wrote:

Hi Sudakshina,

Thanks for pointing that out. Updated the conditions for attribute
length to take care of boundary conditions for offset range.

Please find attached the updated patch.

I have tested it for gcc testsuite and the failing testcase. Ok for trunk?

On 22 March 2018 at 19:06, Sudakshina Das  wrote:
> Hi Sameera
>
> On 22/03/18 02:07, Sameera Deshpande wrote:
>>
>> Hi Sudakshina,
>>
>> As per the ARMv8 ARM, for the offset range (-1048576 ,1048572), the
>> far branch instruction offset is inclusive of both the offsets. Hence,
>> I am using <=||=> and not <||>= as it was in previous implementation.
>
>
> I have to admit earlier I was only looking at the patch mechanically and
> found a difference with the previous implementation in offset comparison.
> After you pointed out, I looked up the ARMv8 ARM and I have a couple of
> doubts:
>
> 1. My understanding is that any offset in [-1048576 ,1048572] both inclusive
> qualifies as an 'in range' offset. However, the code for both attribute
> length and far_branch has been using [-1048576 ,1048572), that is, ( >= && <
> ). If the far_branch was incorrectly calculated, then maybe the length
> calculations with similar magic numbers should also be corrected? Of course,
> I am not an expert in this and maybe this was a conscience decision so I
> would ask Ramana to maybe clarify if he remembers.
>
> 2. Now to come back to your patch, if my understanding is correct, I think a
> far_branch would be anything outside of this range, that is,
> (offset < -1048576 || offset > 1048572), anything that can not be
> represented in the 21-bit range.
>
> Thanks
> Sudi
>
>
>>
>> On 16 March 2018 at 00:51, Sudakshina Das  wrote:
>>>
>>> On 15/03/18 15:27, Sameera Deshpande wrote:


 Ping!

 On 28 February 2018 at 16:18, Sameera Deshpande
  wrote:
>
>
> On 27 February 2018 at 18:25, Ramana Radhakrishnan
>  wrote:
>>
>>
>> On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande
>>  wrote:
>>>
>>>
>>> Hi!
>>>
>>> Please find attached the patch to fix bug in branches with offsets
>>> over
>>> 1MiB.
>>> There has been an attempt to fix this issue in commit
>>> 050af05b9761f1979f11c151519e7244d5becd7c
>>>
>>> However, the far_branch attribute defined in above patch used
>>> insn_length - which computes incorrect offset. Hence, eliminated the
>>> attribute completely, and computed the offset from insn_addresses
>>> instead.
>>>
>>> Ok for trunk?
>>>
>>> gcc/Changelog
>>>
>>> 2018-02-13 Sameera Deshpande 
>>>   * config/aarch64/aarch64.md (far_branch): Remove attribute.
>>> Eliminate
>>>   all the dependencies on the attribute from RTL patterns.
>>>


I'd list all the patterns affected by the removal. There's not many of them
and it makes checking the impact of the patch simpler.



>>
>> I'm not a maintainer but this looks good to me modulo notes about how
>> this was tested. What would be nice is a testcase for the testsuite as
>> well as ensuring that the patch has been bootstrapped and regression
>> tested. AFAIR, the original patch was put in because match.pd failed
>> when bootstrap in another context.
>>
>>
>> regards
>> Ramana
>>
>>> --
>>> - Thanks and regards,
>>> Sameera D.
>
>
>
> The patch is tested with GCC testsuite and bootstrapping successfully.
> Also tested for spec benchmark.
>
>>>
>>> I am not a maintainer either. I noticed that the range check you do for
>>> the offset has a (<= || >=). The "far_branch" however did (< || >=) for a
>>> positive value. Was that also part of the incorrect offset calculation?
>>>




I'm not a maintainer, but this looks like a good improvement to me, and is more 
readable to boot!
Getting some kind of reduced testcase (perhaps with the creduce tool, or 
derived from base principles)
would be very valuable, but I guess it shouldn't be a blocker for this patch.
I've got a couple of nits inline.

Index: gcc/config/aarch64/aarch64.md
===
--- gcc/config/aarch64/aarch64.md   (revision 258946)
+++ gcc/config/aarch64/aarch64.md   (working copy)
@@ -264,13 +264,6 @@
(const_string "no")
] (const_string "yes")))
 
-;; Attribute that specifies whether we are dealing with a branch to a

-;; label that is far away, i.e. further away than the maximum/minimum
-;; representable in a signed 21-bits number.
-;; 0 :=: no
-;; 1 :=: yes
-(define_attr "far_branch" "" (const_int 0))
-
 ;; Strictly for compatibility with AArch32 in pipeline models, since AArch64 
has
 ;; no predicated insns.
 (define_attr "predicated" "yes,no" (const_string "no"))
@@ -466,14 +459,9 @@
   [(set_attr "type" "branch")
(set (attr "length")
(if_then_else (and (ge (minus (

Re: [x86] Skylake tuning options

2018-03-29 Thread Uros Bizjak
On Thu, Mar 29, 2018 at 5:40 PM, Koval, Julia  wrote:
> Hi,
>
> This patch adds 2 tuning options for -mtune=skylake-avx512. Ok for trunk?
>
> gcc/
> * x86-tune.def (movx, partial_reg_dependency): Enable for 
> m_SKYLAKE_AVX512.

OK.

Thanks,
Uros.


Re: [x86] Skylake tuning options

2018-03-29 Thread H.J. Lu
On Thu, Mar 29, 2018 at 8:54 AM, Uros Bizjak  wrote:
> On Thu, Mar 29, 2018 at 5:40 PM, Koval, Julia  wrote:
>> Hi,
>>
>> This patch adds 2 tuning options for -mtune=skylake-avx512. Ok for trunk?
>>
>> gcc/
>> * x86-tune.def (movx, partial_reg_dependency): Enable for 
>> m_SKYLAKE_AVX512.
>
> OK.

This is:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84413

Please add "PR target/84413" in ChangeLog entry.

-- 
H.J.


[patch, fortran] Fix PR 85111, some more zero size stuff

2018-03-29 Thread Thomas Koenig

Hello world,

the attached patch fixes PR 85111, a regression introduced with my
recent simplification patches.  Seems like the zero-size saga
has yet another chapter :-)

Regression-tested. OK for trunk?

Regards

Thomas

2017-03-29  Thomas Koenig  

PR fortran/85111
* array.c (gfc_resolve_character_array_constructor): Early
exit for zero-size arrays.
* simplify.c (simplify_transformation_to_array): Exit early
if the result size is zero.
(simplify_minmaxloc_to_array): Likewise.

2017-03-29  Thomas Koenig  

PR fortran/85111
* gfortran.dg/zero_sized_10.f90: New test.
Index: array.c
===
--- array.c	(revision 258845)
+++ array.c	(working copy)
@@ -2003,6 +2003,20 @@ gfc_resolve_character_array_constructor (gfc_expr
 
 got_charlen:
 
+  /* Early exit for zero size arrays. */
+  if (expr->shape)
+{
+  mpz_t size;
+  HOST_WIDE_INT arraysize;
+
+  gfc_array_size (expr, &size);
+  arraysize = mpz_get_ui (size);
+  mpz_clear (size);
+
+  if (arraysize == 0)
+	return true;
+}
+
   found_length = -1;
 
   if (expr->ts.u.cl->length == NULL)
Index: simplify.c
===
--- simplify.c	(revision 258845)
+++ simplify.c	(working copy)
@@ -627,7 +627,7 @@ simplify_transformation_to_array (gfc_expr *result
   n += 1;
 }
 
-  done = false;
+  done = resultsize <= 0;
   base = arrayvec;
   dest = resultvec;
   while (!done)
@@ -5304,7 +5304,7 @@ simplify_minmaxloc_to_array (gfc_expr *result, gfc
   n += 1;
 }
 
-  done = false;
+  done = resultsize <= 0;
   base = arrayvec;
   dest = resultvec;
   while (!done)
! { dg-do compile }
! { PR 85111 - this used to ICE. }
! Original test case by Gernhard Steinmetz.
program p
   integer, parameter :: a(2,0) = reshape([1,2,3,4], shape(a))
   character, parameter :: ac(2,0) = reshape(['a','b','c','d'], shape(ac))
   integer, parameter :: b(2) = maxloc(a, dim=1) ! { dg-error "Different shape" }
   integer, parameter :: c(2) = minloc(a, dim=1) ! { dg-error "Different shape" }
   character, parameter :: d(2) = maxval(ac, dim=1) ! { dg-error "Different shape" }
 end program p


patch for PR84985

2018-03-29 Thread Vladimir Makarov

  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84985

  The patch was successfully bootstrapped and tested on x86-64.

  Committed as rev. 258961.

Index: ChangeLog
===
--- ChangeLog	(revision 258960)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2018-03-29  Vladimir Makarov  
+
+	PR inline-asm/84985
+	* lra-constraints.c (process_alt_operands): Move setting
+	this_alternative_matches below.
+
 2018-03-29  Martin Liska  
 
 PR lto/84995.
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 258960)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2018-03-29  Vladimir Makarov  
+
+	PR inline-asm/84985
+	* gcc.target/i386/pr84985.c: New.
+
 2018-03-29  David Malcolm  
 
 	PR c++/85110
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 258820)
+++ lra-constraints.c	(working copy)
@@ -2126,7 +2126,6 @@ process_alt_operands (int only_alternati
 			&& curr_operand_mode[m] != curr_operand_mode[nop])
 		  break;
 		
-		this_alternative_matches = m;
 		m_hregno = get_hard_regno (*curr_id->operand_loc[m], false);
 		/* We are supposed to match a previous operand.
 		   If we do, we win if that one did.  If we do
@@ -2228,6 +2227,7 @@ process_alt_operands (int only_alternati
 		else
 		  did_match = true;
 
+		this_alternative_matches = m;
 		/* This can be fixed with reloads if the operand
 		   we are supposed to match can be fixed with
 		   reloads. */
Index: testsuite/gcc.target/i386/pr84985.c
===
--- testsuite/gcc.target/i386/pr84985.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr84985.c	(working copy)
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O0" } */
+int main() {
+  int a;
+  asm("" : "=d"(a) : "0"(a), "0ae"(&a));
+}


Re: [PATCH 1/4] More #include suggestions (PR c++/84269)

2018-03-29 Thread Jason Merrill
OK.

On Thu, Mar 22, 2018 at 7:44 PM, David Malcolm  wrote:
> PR c++/84269 reports a number of names in the C and C++ standard
> libraries for which we don't yet offer #include fix-it hints.
>
> This patch adds them (up to comment #9).
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk?
>
> gcc/c-family/ChangeLog:
> PR c++/84269
> * known-headers.cc (get_stdlib_header_for_name): Add various names
> from , , and ; add more names from
> .
>
> gcc/cp/ChangeLog:
> PR c++/84269
> * name-lookup.c (get_std_name_hint): Add names from ,
> , and .
>
> gcc/testsuite/ChangeLog:
> PR c++/84269
> * g++.dg/lookup/missing-std-include-6.C: New test.
> * g++.dg/lookup/missing-std-include.C: Add std::pair and
> std::tuple tests.
> * g++.dg/spellcheck-reswords.C: Expect a hint about .
> * g++.dg/spellcheck-stdlib.C: Add tests for names in ,
> , , and .
> ---
>  gcc/c-family/known-headers.cc  | 33 +-
>  gcc/cp/name-lookup.c   | 14 +
>  .../g++.dg/lookup/missing-std-include-6.C  | 62 ++
>  gcc/testsuite/g++.dg/lookup/missing-std-include.C  |  8 +++
>  gcc/testsuite/g++.dg/spellcheck-reswords.C |  1 +
>  gcc/testsuite/g++.dg/spellcheck-stdlib.C   | 73 
> ++
>  6 files changed, 190 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/lookup/missing-std-include-6.C
>
> diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc
> index ef23cbe..5524d21 100644
> --- a/gcc/c-family/known-headers.cc
> +++ b/gcc/c-family/known-headers.cc
> @@ -57,6 +57,9 @@ get_stdlib_header_for_name (const char *name, enum stdlib 
> lib)
>gcc_assert (lib < NUM_STDLIBS);
>
>static const stdlib_hint hints[] = {
> +/*  and .  */
> +{"assert", {"",  ""} },
> +
>  /*  and .  */
>  {"errno", {"", ""} },
>
> @@ -92,16 +95,44 @@ get_stdlib_header_for_name (const char *name, enum stdlib 
> lib)
>  {"size_t", {"", ""} },
>  {"wchar_t", {"", NULL /* a keyword in C++ */} },
>
> -/* .  */
> +/*  and .  */
>  {"BUFSIZ", {"", ""} },
>  {"EOF", {"", ""} },
>  {"FILE", {"", ""} },
>  {"FILENAME_MAX", {"", ""} },
> +{"fopen", {"", ""} },
>  {"fpos_t", {"", ""} },
> +{"getchar", {"", ""} },
> +{"printf", {"", ""} },
> +{"snprintf", {"", ""} },
> +{"sprintf", {"", ""} },
>  {"stderr", {"", ""} },
>  {"stdin", {"", ""} },
>  {"stdout", {"", ""} },
>
> +/*  and .  */
> +{"free", {"", ""} },
> +{"malloc", {"", ""} },
> +{"realloc", {"", ""} },
> +
> +/*  and .  */
> +{"memchr", {"", ""} },
> +{"memcmp", {"", ""} },
> +{"memcpy", {"", ""} },
> +{"memmove", {"", ""} },
> +{"memset", {"", ""} },
> +{"strcat", {"", ""} },
> +{"strchr", {"", ""} },
> +{"strcmp", {"", ""} },
> +{"strcpy", {"", ""} },
> +{"strlen", {"", ""} },
> +{"strncat", {"", ""} },
> +{"strncmp", {"", ""} },
> +{"strncpy", {"", ""} },
> +{"strrchr", {"", ""} },
> +{"strspn", {"", ""} },
> +{"strstr", {"", ""} },
> +
>  /* .  */
>  {"PTRDIFF_MAX", {"", ""} },
>  {"PTRDIFF_MIN", {"", ""} },
> diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> index e193b3b..061729a 100644
> --- a/gcc/cp/name-lookup.c
> +++ b/gcc/cp/name-lookup.c
> @@ -5453,6 +5453,12 @@ get_std_name_hint (const char *name)
>  /* .  */
>  {"map", ""},
>  {"multimap", ""},
> +/* .  */
> +{"make_shared", ""},
> +{"make_unique", ""},
> +{"shared_ptr", ""},
> +{"unique_ptr", ""},
> +{"weak_ptr", ""},
>  /* .  */
>  {"queue", ""},
>  {"priority_queue", ""},
> @@ -5472,6 +5478,9 @@ get_std_name_hint (const char *name)
>  {"basic_stringstream", ""},
>  /* .  */
>  {"stack", ""},
> +/* .  */
> +{"make_tuple", ""},
> +{"tuple", ""},
>  /* .  */
>  {"string", ""},
>  {"wstring", ""},
> @@ -5483,6 +5492,11 @@ get_std_name_hint (const char *name)
>  /* .  */
>  {"unordered_set", ""}, // C++11
>  {"unordered_multiset", ""}, // C++11
> +/* .  */
> +{"forward", ""},
> +{"make_pair", ""},
> +{"move", ""},
> +{"pair", ""},
>  /* .  */
>  {"vector", ""},
>};
> diff --git a/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C 
> b/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C
> new file mode 100644
> index 000..100bcc0
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C
> @@ -0,0 +1,62 @@
> +// { dg-do compile { target c++11 } }
> +
> +/* .  */
> +
> +template
> +void test_make_shared ()
> +{
> +  auto p = std::make_shared(); // { dg-error "'make_shared' is not a 
> member of 'std'" }
> +  // { dg-message "'#include '" "" { target *-*-* } .-1 }
> +  // { dg-error "expected primary-expression before '>' token" "" { targe

Re: [PATCH 3/4] C++: suggest missing headers for implicit use of "std" (PR c++/85021)

2018-03-29 Thread Jason Merrill
On Thu, Mar 22, 2018 at 7:44 PM, David Malcolm  wrote:
> We provide fix-it hints for the most common "std" names when an explicit
> "std::" prefix is present, however we don't yet provide fix-it hints for
> this implicit case:
>
>   using namespace std;
>   void f() {  cout << "test"; }
>
> for which we emit:
>
>   t.cc: In function 'void f()':
>   t.cc:2:13: error: 'cout' was not declared in this scope
>void f() {  cout << "test"; }
>^~~~
>
> This patch detects if a "using namespace std;" directive is present
> in the current namespace, and if so, offers a suggestion for
> unrecognized names that are in our list of common "std" names:
>
>   t.cc: In function 'void f()':
>   t.cc:2:13: error: 'cout' was not declared in this scope
>void f() {  cout << "test"; }
>^~~~
>   t.cc:2:13: note: 'std::cout' is defined in header ''; did you 
> forget to '#include '?
>   +#include 
>using namespace std;
>void f() {  cout << "test"; }
>^~~~
>
> Is there a better way to test for the using directive?
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk?
>
> gcc/cp/ChangeLog:
> PR c++/85021
> * name-lookup.c (has_using_namespace_std_directive_p): New
> function.
> (suggest_alternatives_for): Simplify if/else logic using early
> returns.  If no candidates were found, and there's a
> "using namespace std;" directive, call
> maybe_suggest_missing_std_header.
> (maybe_suggest_missing_header): Split later part of the function
> into..
> (maybe_suggest_missing_std_header): New.
>
> gcc/testsuite/ChangeLog:
> PR c++/85021
> * g++.dg/lookup/missing-std-include-7.C: New test.
> ---
>  gcc/cp/name-lookup.c   | 68 
> +-
>  .../g++.dg/lookup/missing-std-include-7.C  | 16 +
>  2 files changed, 70 insertions(+), 14 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/lookup/missing-std-include-7.C
>
> diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> index 061729a..4eb980e 100644
> --- a/gcc/cp/name-lookup.c
> +++ b/gcc/cp/name-lookup.c
> @@ -41,6 +41,7 @@ static cxx_binding *cxx_binding_make (tree value, tree 
> type);
>  static cp_binding_level *innermost_nonclass_level (void);
>  static void set_identifier_type_value_with_scope (tree id, tree decl,
>   cp_binding_level *b);
> +static bool maybe_suggest_missing_std_header (location_t location, tree 
> name);
>
>  /* Create an overload suitable for recording an artificial TYPE_DECL
> and another decl.  We use this machanism to implement the struct
> @@ -5327,6 +5328,23 @@ qualify_lookup (tree val, int flags)
>return true;
>  }
>
> +/* Is there a "using namespace std;" directive within the current
> +   namespace?  */
> +
> +static bool
> +has_using_namespace_std_directive_p ()
> +{
> +  vec *usings = DECL_NAMESPACE_USING (current_namespace);

Checking in just the current namespace won't find a "using namespace
std" in an inner or outer scope; I think you want to add something to
name-lookup.c that iterates over the current enclosing scopes like
name_lookup::search_unqualified.  Nathan can probably be more
specific.

Jason

> +  if (!usings)
> +return false;
> +
> +  for (unsigned ix = usings->length (); ix--;)
> +if ((*usings)[ix] == std_node)
> +  return true;
> +
> +  return false;
> +}
> +
>  /* Suggest alternatives for NAME, an IDENTIFIER_NODE for which name
> lookup failed.  Search through all available namespaces and print out
> possible candidates.  If no exact matches are found, and
> @@ -5397,11 +5415,23 @@ suggest_alternatives_for (location_t location, tree 
> name,
>   inform (location_of (val), "  %qE", val);
> }
>candidates.release ();
> +  return;
>  }
> -  else if (!suggest_misspellings)
> -;
> -  else if (name_hint hint = lookup_name_fuzzy (name, FUZZY_LOOKUP_NAME,
> -  location))
> +
> +  /* No candidates were found in the available namespaces.  */
> +
> +  /* If there's a "using namespace std;" active, and this
> + is one of the most common "std::" names, then it's probably a
> + missing #include.  */
> +  if (has_using_namespace_std_directive_p ())
> +if (maybe_suggest_missing_std_header (location, name))
> +  return;
> +
> +  /* Otherwise, consider misspellings.  */
> +  if (!suggest_misspellings)
> +return;
> +  if (name_hint hint = lookup_name_fuzzy (name, FUZZY_LOOKUP_NAME,
> + location))
>  {
>/* Show a spelling correction.  */
>gcc_rich_location richloc (location);
> @@ -5509,20 +5539,13 @@ get_std_name_hint (const char *name)
>return NULL;
>  }
>
> -/* If SCOPE is the "std" namespace, then suggest pertinent header
> -   files for NAME at LOCATION.
> +/* Suggest per

Re: [PATCH 4/4] C++: more std header hints; filter on C++ dialect (PR c++/84269)

2018-03-29 Thread Jason Merrill
OK.

On Thu, Mar 22, 2018 at 7:44 PM, David Malcolm  wrote:
> Jonathan added a bunch more suggestions to:
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84269#c10
> as I was testing my last patch, some of which need C++14 and C++17,
> and some of which use headers that exist in earlier standards.
>
> For example,  exists in C++98, but if the user attempts to
> use std::make_shared with -std=c++98, they are suggested to include
> , even if they've already included it.
>
> This patch adds the missing names, and fixes the nonsensical suggestions
> by detecting if the name isn't available yet, based on the user's
> dialect, and reporting things more intelligently:
>
> t.cc: In function 'void test_make_shared()':
> t.cc:5:8: error: 'make_shared' is not a member of 'std'
>std::make_shared();
> ^~~
> t.cc:5:8: note: 'std::make_shared' is only available from C++11 onwards
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk?
>
> gcc/cp/ChangeLog:
> PR c++/84269
> * name-lookup.c (struct std_name_hint): Move out of
> get_std_name_hint; add field "min_dialect".  */
> (get_std_name_hint): Add min_dialect values to all initializers.
> Add , , , , ,
> , , , , , ,
> , , , and .
> Add fstream, ifstream, and ofstream to .
> Add istringstream, ostringstream, and stringstream to .
> Add basic_string to .
> Add tuple_element and tuple_size to .
> Add declval to .
> Fix ordering of  and .
> Return a std_name_hint, rather than a const char *.
> (get_cxx_dialect_name): New function.
> (maybe_suggest_missing_std_header): Detect names that aren't yet
> available in the current dialect, and instead of suggesting a
> missing #include, warn about the dialect.
>
> gcc/testsuite/ChangeLog:
> PR c++/84269
> * g++.dg/lookup/missing-std-include-6.C: Move std::array and
> std::tuple here since they need C++11.
> * g++.dg/lookup/missing-std-include-8.C: New test.
> * g++.dg/lookup/missing-std-include.C: Move std::array and
> std::tuple test to missing-std-include-6.C to avoid failures
> with C++98.
> ---
>  gcc/cp/name-lookup.c   | 262 
> +++--
>  .../g++.dg/lookup/missing-std-include-6.C  |  13 +
>  .../g++.dg/lookup/missing-std-include-8.C  |  44 
>  gcc/testsuite/g++.dg/lookup/missing-std-include.C  |   7 -
>  4 files changed, 248 insertions(+), 78 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/lookup/missing-std-include-8.C
>
> diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> index 4eb980e..3d676bb 100644
> --- a/gcc/cp/name-lookup.c
> +++ b/gcc/cp/name-lookup.c
> @@ -5441,104 +5441,214 @@ suggest_alternatives_for (location_t location, tree 
> name,
>  }
>  }
>
> +/* A well-known name within the C++ standard library, returned by
> +   get_std_name_hint.  */
> +
> +struct std_name_hint
> +{
> +  /* A name within "std::".  */
> +  const char *name;
> +
> +  /* The header name defining it within the C++ Standard Library
> + (with '<' and '>').  */
> +  const char *header;
> +
> +  /* The dialect of C++ in which this was added.  */
> +  enum cxx_dialect min_dialect;
> +};
> +
>  /* Subroutine of maybe_suggest_missing_header for handling unrecognized names
> for some of the most common names within "std::".
> -   Given non-NULL NAME, a name for lookup within "std::", return the header
> -   name defining it within the C++ Standard Library (with '<' and '>'),
> -   or NULL.  */
> +   Given non-NULL NAME, return the std_name_hint for it, or NULL.  */
>
> -static const char *
> +static const std_name_hint *
>  get_std_name_hint (const char *name)
>  {
> -  struct std_name_hint
> -  {
> -const char *name;
> -const char *header;
> -  };
>static const std_name_hint hints[] = {
> +/* .  */
> +{"any", "", cxx17},
> +{"any_cast", "", cxx17},
> +{"make_any", "", cxx17},
>  /* .  */
> -{"array", ""}, // C++11
> +{"array", "", cxx11},
> +/* .  */
> +{"atomic", "", cxx11},
> +{"atomic_flag", "", cxx11},
> +/* .  */
> +{"bitset", "", cxx11},
>  /* .  */
> -{"complex", ""},
> -{"complex_literals", ""},
> +{"complex", "", cxx98},
> +{"complex_literals", "", cxx98},
> +/* . */
> +{"condition_variable", "", cxx11},
> +{"condition_variable_any", "", cxx11},
>  /* .  */
> -{"deque", ""},
> +{"deque", "", cxx98},
>  /* .  */
> -{"forward_list", ""},  // C++11
> +{"forward_list", "", cxx11},
>  /* .  */
> -{"basic_filebuf", ""},
> -{"basic_ifstream", ""},
> -{"basic_ofstream", ""},
> -{"basic_fstream", ""},
> +{"basic_filebuf", "", cxx98},
> +{"basic_ifstream", "", cxx98},
> +{"basic_ofstream", "", cxx98},
> +{"basic_fstream", "", cxx98},
> +{"fstream", "", cxx98},
> 

[PATCH, rs6000] PR target/83822 fix redundant conditions

2018-03-29 Thread Aaron Sawdey
I've fixed the redundant conditions in the expressions pointed out by
83822. Bootstrap/regtest passes on ppc64le, ok for trunk?

Aaron


2018-03-29  Aaron Sawdey  

PR target/83822
* config/rs6000/rs6000-string.c (expand_compare_loop): Fix redundant
condition.
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Fix redundant
condition.

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC ToolchainIndex: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c	(revision 258900)
+++ gcc/config/rs6000/rs6000-c.c	(working copy)
@@ -642,8 +642,7 @@
 	  cpp_get_callbacks (pfile)->macro_to_expand = rs6000_macro_to_expand;
 	}
 }
-  if (!TARGET_HARD_FLOAT
-  || (TARGET_HARD_FLOAT && !TARGET_DOUBLE_FLOAT))
+  if (!TARGET_HARD_FLOAT || !TARGET_DOUBLE_FLOAT)
 builtin_define ("_SOFT_DOUBLE");
   /* Used by lwarx/stwcx. errata work-around.  */
   if (rs6000_cpu == PROCESSOR_PPC405)
Index: gcc/config/rs6000/rs6000-string.c
===
--- gcc/config/rs6000/rs6000-string.c	(revision 258900)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -966,8 +966,7 @@
   rtx final_cleanup = gen_label_rtx ();
   rtx cmp_rem_before = gen_reg_rtx (word_mode);
   /* Compare one more word_mode chunk if needed.  */
-  if (!bytes_is_const
-	  || (bytes_is_const && bytes_remaining >= load_mode_size))
+  if (!bytes_is_const || bytes_remaining >= load_mode_size)
 	{
 	  /* If remainder length < word length, branch to final
 	 cleanup compare.  */


C++ PATCH for c++/85060, wrong-code calling base member from template

2018-03-29 Thread Jason Merrill
Here, during instantiate_non_dependent_expr, we were getting the wrong
answer for any_dependent_bases_p because processing_template_decl was
temporarily cleared.  Fixed by using uses_template_parms (which
doesn't depend on processing_template_decl) on the type we're asking
about.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 4c894923eba7d2a86836e43cc1093247a3241a7e
Author: Jason Merrill 
Date:   Fri Mar 23 11:14:50 2018 -0400

PR c++/85060 - wrong-code with call to base member in template.

* search.c (any_dependent_bases_p): Check uses_template_parms
rather than processing_template_decl.

diff --git a/gcc/cp/search.c b/gcc/cp/search.c
index 6bf8b0e70dc..bfeaf2cc819 100644
--- a/gcc/cp/search.c
+++ b/gcc/cp/search.c
@@ -2619,7 +2619,7 @@ original_binfo (tree binfo, tree here)
 bool
 any_dependent_bases_p (tree type)
 {
-  if (!type || !CLASS_TYPE_P (type) || !processing_template_decl)
+  if (!type || !CLASS_TYPE_P (type) || !uses_template_parms (type))
 return false;
 
   /* If we haven't set TYPE_BINFO yet, we don't know anything about the bases.
diff --git a/gcc/testsuite/g++.dg/template/dependent-base3.C b/gcc/testsuite/g++.dg/template/dependent-base3.C
new file mode 100644
index 000..e38b968e774
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/dependent-base3.C
@@ -0,0 +1,26 @@
+// PR c++/85060
+// { dg-do compile { target c++14 } }
+
+struct CA {
+  constexpr int foo() const { return 42; }
+};
+
+template 
+struct CB : CA { };
+
+template 
+struct CC : CB {
+  constexpr int bar() const {
+const int m = CA::foo();
+return m;
+  }
+
+  constexpr int baz() const {
+const T m = CA::foo();
+return m;
+  }
+};
+
+constexpr CC c;
+
+static_assert( c.bar() == 42, "" );


Re: [PATCH] Small x86 debug info improvement (PR debug/83157)

2018-03-29 Thread Jakub Jelinek
Hi!

On Thu, Mar 22, 2018 at 10:49:45PM +0100, Jakub Jelinek wrote:

I'd like to ping this patch:

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2018-03-22  Jakub Jelinek  
> 
>   PR debug/83157
>   * var-tracking.c (add_stores): Handle STRICT_LOW_PART SET_DEST.
>   * cselib.c (cselib_record_sets): For STRICT_LOW_PART dest,
>   lookup if dest in some wider mode is known to be const0_rtx and
>   if so, record permanent equivalence for it to be ZERO_EXTEND of
>   the narrower mode destination.

Thanks

Jakub


Re: [PATCH, rs6000] PR target/83822 fix redundant conditions

2018-03-29 Thread Segher Boessenkool
On Thu, Mar 29, 2018 at 02:29:53PM -0500, Aaron Sawdey wrote:
> I've fixed the redundant conditions in the expressions pointed out by
> 83822. Bootstrap/regtest passes on ppc64le, ok for trunk?

This is fine, thanks!


Segher


> 2018-03-29  Aaron Sawdey  
> 
>   PR target/83822
>   * config/rs6000/rs6000-string.c (expand_compare_loop): Fix redundant
>   condition.
>   * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Fix redundant
>   condition.


Re: [PATCH] [PR c++/84979] improve auto handling in explicit tmpl args for concepts

2018-03-29 Thread Alexandre Oliva
On Mar 28, 2018, Jason Merrill  wrote:

> On Wed, Mar 28, 2018 at 5:06 AM, Alexandre Oliva  wrote:
>> On Mar 23, 2018, Jason Merrill  wrote:
>> 
>>> On Fri, Mar 23, 2018 at 3:38 PM, Alexandre Oliva  wrote:
 +  /* Concepts allows 'auto' in template arguments, even multiple
 + 'auto's in a single argument.
>> 
>>> I think that's only intended for class templates;
>> 
>> Aah, that explains a lot!  Dang, it was so close!
>> 
>> It actually makes sense; was there any discussion about standardizing
>> this, with reasons to reject this construct, or is it just that we
>> aren't there yet?  I wouldn't be surprised if it were to appear in some
>> future version of C++.

> If what were to appear?

auto in explicit template arguments for template functions too.

> We only want 'auto' in places where it could be deduced, and there's
> no way in [temp.deduct.type] to deduce from explicit template
> arguments to a non-type template.

Well, it could be standardized in some way, e.g. autos could be deduced
from types of function arguments and expected return types, just like
function overloads, very much like typename template arguments are
currently deduced.  Having all explicit autos resolved would be a
requirement for a specialization to be viable.  Consider:

template  A foo(B b, C c);

...

  int x1 = foo("", 0); // ok, all args deduced
  int x2 = foo("", 0); // ok, A is explicit, B and C are deduced

  int x3 = foo("", 0); // B could be deduced as in x2
  int x4 = foo("", 0); // all args deduced as in x1

  auto x5 = foo("", 0); // deduction of A from context fails

Cases in which auto appears in the top-level are the trivial ones, that
our type deduction machinery could support right away (it has to; it's
the same as deducing any typename argument left out).

More involved cases, in which auto appears other than at the top level,
could still be deduced in general, though we would need some more
implementation work to get them right:

  int *x6 = foo("", 0);
  // deduced:   int const charint -> calls foo


I seems to me it would be a natural extension to the existing draft
specs; I guess the most common use would be to just tell the compiler to
figure out one specific template argument, when you only wish to
explicitly specify another after that: foo


>> Is this (in the patch below) the best spot to test for it?

> This seems like a plausible spot, but is there a reason not to check
> in cp_parser_template_id?

AFAICT we wouldn't always know, within cp_parser_template_id, whether
the id is a type or a function.  It seemed to me that we could
tentatively parse it as a type, and then reuse the same preparsed id as
a function.  If I were to check the argument list at that point,
assuming one or the other, we might reject something we should accept or
vice-versa, and we wouldn't revisit that decision.  I'm not saying I
tried and observed this, it is just a hunch based on observing how the
foo() expression was parsed and reparsed, and on extrapolations
from the preparsing of qualified names that I've recently had a close
encounter with ;-)

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH] [PR c++/84943] allow folding of array indexing indirect_ref

2018-03-29 Thread Alexandre Oliva
On Mar 28, 2018, Jason Merrill  wrote:

> It looks like cp_build_addr_expr_1 already calls mark_used for single
> static member functions, it should probably do the same for single
> non-member functions.

Hmm...  That existing call is suspicious, now that you point it out.

There's a switch before it that peels off BASELINK and takes OVL_FIRST
only, so we would be marking as used something other than what overload
resolution would select, and not marking what overload resolution would
ultimately select, *if* we marked anything.  But since we've peeled off
the baseline, the test

if (processing_template_decl || TREE_CODE (arg) != COMPONENT_REF)
 
passes because FUNCTION_DECL is not COMPONENT_REF, and we call
build_address before we get a chance to mark anything.

I guess we could move the mark_used logic ahead in the function, but
whether or not it do the job will depend on whether cp_build_addr_expr_1
gets called again after template substitution and overload resolution.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-29 Thread Tom de Vries

On 03/02/2018 05:55 PM, Cesar Philippidis wrote:

As a follow up patch will show, the nvptx BE falls back to using
vector_length = 32 when a vector loop is nested inside a worker loop.


I disabled the fallback, and analyzed the vred2d-128.c illegal memory 
access execution failure.


I minimized that down to this ptx:
...
.shared .align 8 .u8 __oacc_bcast[176];

{
  {
.reg .u32 %x;
mov.u32 %x,%tid.x;
setp.ne.u32 %r86,%x,0;
  }

  {
.reg .u32 %tidy;
.reg .u64 %t_bcast;
.reg .u64 %y64;
mov.u32 %tidy,%tid.y;
cvt.u64.u32 %y64,%tidy;
add.u64 %y64,%y64,1;
cvta.shared.u64 %t_bcast,__oacc_bcast;
mad.lo.u64 %r66,%y64,88,%t_bcast;
  }

  @ %r86 bra $L28;
  st.u32 [%r66+80],0;
 $L28:
  ret;
}
...

The ptx is called with 2 workers and 128 vector_length.

So, 2 workers mean %tid.y has values 0 and 1.
Then %y64 has values 1 and 2.
Then %r66 has values __oacc_bcast + (1 * 88) and __oacc_bcast + (2 * 88).
Then the st.u32 accesss __oacc_bcast + (1 * 88) + 80 and __oacc_bcast + 
(2 * 88) + 80.


So we're accessing memory at location 256, while the __oacc_bcast is 
only 176 bytes big.


I formulated this assert that AFAIU detects this situation in the compiler:
...
@@ -1125,6 +1125,8 @@ nvptx_init_axis_predicate (FILE *file, int regno, 
const char *name)

   fprintf (file, "\t}\n");
 }

+static int nvptx_mach_max_workers ();
+
 /* Emit code to initialize OpenACC worker broadcast and synchronization
registers.  */

@@ -1148,6 +1150,7 @@ nvptx_init_oacc_workers (FILE *file)
   "// vector broadcast offset\n",
   REGNO (cfun->machine->bcast_partition),
   oacc_bcast_partition);
+  gcc_assert (oacc_bcast_partition * (nvptx_mach_max_workers () + 
1) <= oacc_bcast_size);

 }
   if (cfun->machine->sync_bar)
 fprintf (file, "\t\tadd.u32\t\t%%r%d, %%tidy, 1; "
...

The assert is not triggered when the fallback is used.

Thanks,
- Tom


Re: [PATCH] [PR c++/84943] allow folding of array indexing indirect_ref

2018-03-29 Thread Alexandre Oliva
On Mar 29, 2018, Alexandre Oliva  wrote:

> On Mar 28, 2018, Jason Merrill  wrote:
>> It looks like cp_build_addr_expr_1 already calls mark_used for single
>> static member functions, it should probably do the same for single
>> non-member functions.

> ultimately select, *if* we marked anything.  But since we've peeled off
> the baseline, the test
 ^ err, baselink

> if (processing_template_decl || TREE_CODE (arg) != COMPONENT_REF)
 
> passes because FUNCTION_DECL is not COMPONENT_REF, and we call
> build_address before we get a chance to mark anything.

... unless we're at the case it's supposed to catch, namely when we call
obj.static_memfn() ;-)


Here's a patch that should take care of the marking a namespace-scoped
or static member function as used when taking its address, thus working
around (fixing?) the reported problem.

Regstrapping now.  Ok to install if it passes?

[PR c++/84943] mark function as used when taking its address

fn[0]() ICEd because we would fold the INDIRECT_REF used for the
array indexing while building the address for the call, after not
finding the decl hiding there at first.  But the decl would be exposed
by the folding, and then lower layers would complain we had the decl,
after all, but it wasn't one of the artificial or special functions
that could be called without being marked as used.

This patch arranges for a FUNCTION_DECL to be marked as used when
taking its address, just like we already did when taking the address
of a static function to call it as a member function (i.e. using the
obj.fn() notation).


for  gcc/cp/ChangeLog

PR c++/84943
* typeck.c (cp_build_addr_expr_1): Mark FUNCTION_DECL as
used.

for  gcc/testsuite/ChangeLog

PR c++/84943
* g++.dg/pr84943.C: New.
* g++.dg/pr84943-2.C: New.
---
 gcc/cp/typeck.c  |8 -
 gcc/testsuite/g++.dg/pr84943-2.C |   64 ++
 gcc/testsuite/g++.dg/pr84943.C   |8 +
 3 files changed, 79 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/pr84943-2.C
 create mode 100644 gcc/testsuite/g++.dg/pr84943.C

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index d3183b5321d3..d6157772c0f3 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -5831,6 +5831,8 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, 
tsubst_flags_t complain)
   if (type_unknown_p (arg))
 return build1 (ADDR_EXPR, unknown_type_node, arg);
 
+  tree maybe_overloaded_arg = arg;
+
   if (TREE_CODE (arg) == OFFSET_REF)
 /* We want a pointer to member; bypass all the code for actually taking
the address of something.  */
@@ -5971,6 +5973,9 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, 
tsubst_flags_t complain)
  so we can just form an ADDR_EXPR with the correct type.  */
   if (processing_template_decl || TREE_CODE (arg) != COMPONENT_REF)
 {
+  if (TREE_CODE (arg) == FUNCTION_DECL && arg == maybe_overloaded_arg
+ && !mark_used (arg, complain) && !(complain & tf_error))
+   return error_mark_node;
   val = build_address (arg);
   if (TREE_CODE (arg) == OFFSET_REF)
PTRMEM_OK_P (val) = PTRMEM_OK_P (arg);
@@ -5983,7 +5988,8 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, 
tsubst_flags_t complain)
 function.  */
   gcc_assert (TREE_CODE (fn) == FUNCTION_DECL
  && DECL_STATIC_FUNCTION_P (fn));
-  if (!mark_used (fn, complain) && !(complain & tf_error))
+  if (arg == maybe_overloaded_arg
+ && !mark_used (fn, complain) && !(complain & tf_error))
return error_mark_node;
   val = build_address (fn);
   if (TREE_SIDE_EFFECTS (TREE_OPERAND (arg, 0)))
diff --git a/gcc/testsuite/g++.dg/pr84943-2.C b/gcc/testsuite/g++.dg/pr84943-2.C
new file mode 100644
index ..d1ef012b9155
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr84943-2.C
@@ -0,0 +1,64 @@
+// { dg-do run }
+
+// Avoid -pedantic-error default
+// { dg-options "" }
+
+// Make sure the functions referenced by various forms of
+// address-taking are marked as used and compiled in.
+
+static void ac() {}
+void a() {
+  ac[0](); // { dg-warning "arithmetic" }
+}
+
+static void bc() {}
+void b() {
+  (&*&*&*&bc)();
+}
+
+template  U cc() {}
+void (*c())() {
+  return cc;
+}
+
+template 
+struct x {
+  void a(int);
+  template  static U a(x*) {}
+  static void a(long) {}
+  static void a(void *) {}
+  static void a() {
+void (*p0)(void*) = x().a;
+p0(0);
+void (*p1)(long) = a;
+p1(0);
+void (*p2)() = a;
+p2();
+void (*p3)(x*) = a;
+p3(0);
+  }
+};
+
+struct z {
+  void a(int);
+  template  static U a(z*) {}
+  static void a(long) {}
+  static void a(void *) {}
+  static void a() {
+void (*p0)(void*) = z().a;
+p0(0);
+void (*p1)(long) = a;
+p1(0);
+void (*p2)() = a;
+p2();
+void (*p3)(z*) = a;
+p3(0);
+  }
+};
+
+int main(int argc, char *argv[]) {
+  if (argc > 1) {
+x().a();
+   

Re: [patch, fortran] Fix PR 85111, some more zero size stuff

2018-03-29 Thread Jerry DeLisle

On 03/29/2018 11:07 AM, Thomas Koenig wrote:

Hello world,

the attached patch fixes PR 85111, a regression introduced with my
recent simplification patches.  Seems like the zero-size saga
has yet another chapter :-)

Regression-tested. OK for trunk?



Looks OK,

Thanks,

Jerry

Regards

 Thomas

2017-03-29  Thomas Koenig  

     PR fortran/85111
     * array.c (gfc_resolve_character_array_constructor): Early
     exit for zero-size arrays.
     * simplify.c (simplify_transformation_to_array): Exit early
     if the result size is zero.
     (simplify_minmaxloc_to_array): Likewise.

2017-03-29  Thomas Koenig  

     PR fortran/85111
     * gfortran.dg/zero_sized_10.f90: New test.