On 02/12/15 18:58, Thomas Schwinge wrote:
Hi!
On Tue, 1 Dec 2015 15:25:42 +0100, Tom de Vries <tom_devr...@mentor.com> wrote:
>Handle BUILT_IN_GOACC_PARALLEL in ipa-pta
> * c-c++-common/goacc/kernels-alias-ipa-pta-2.c: New test.
> * c-c++-common/goacc/kernels-alias-ipa-pta-3.c: New test.
> * c-c++-common/goacc/kernels-alias-ipa-pta.c: New test.
I see:
PASS: c-c++-common/goacc/kernels-alias-ipa-pta-2.c (test for excess errors)
FAIL: c-c++-common/goacc/kernels-alias-ipa-pta-2.c scan-tree-dump-times optimized "(?n)= 0;$" 2
PASS: c-c++-common/goacc/kernels-alias-ipa-pta-2.c scan-tree-dump-times optimized "(?n)= 1;$" 1
FAIL: c-c++-common/goacc/kernels-alias-ipa-pta-2.c scan-tree-dump-times optimized "(?n)= \\*a" 0
PASS: c-c++-common/goacc/kernels-alias-ipa-pta-3.c (test for excess errors)
PASS: c-c++-common/goacc/kernels-alias-ipa-pta-3.c scan-tree-dump-times optimized "(?n)= 0;$" 1
PASS: c-c++-common/goacc/kernels-alias-ipa-pta-3.c scan-tree-dump-times optimized "(?n)= 1;$" 1
PASS: c-c++-common/goacc/kernels-alias-ipa-pta-3.c scan-tree-dump-times optimized "(?n)= \\*a" 1
PASS: c-c++-common/goacc/kernels-alias-ipa-pta.c (test for excess errors)
FAIL: c-c++-common/goacc/kernels-alias-ipa-pta.c scan-tree-dump-times optimized "(?n)= 0;$" 2
PASS: c-c++-common/goacc/kernels-alias-ipa-pta.c scan-tree-dump-times optimized "(?n)= 1;$" 1
FAIL: c-c++-common/goacc/kernels-alias-ipa-pta.c scan-tree-dump-times optimized "(?n)= \\*_[0-9]\\[0\\];$" 0
..., and similar for C++.
Curious, I get all passes for both C and C++ (at r231192).
Looking at c-c++-common/goacc/kernels-alias-ipa-pta.c:
>--- /dev/null
>+++ b/gcc/testsuite/c-c++-common/goacc/kernels-alias-ipa-pta.c
>@@ -0,0 +1,23 @@
>+/* { dg-additional-options "-O2" } */
>+/* { dg-additional-options "-fipa-pta -fdump-tree-optimized" } */
>+
>+#define N 2
>+
>+void
>+foo (void)
>+{
>+ unsigned int a[N];
>+ unsigned int b[N];
>+ unsigned int c[N];
>+
>+#pragma acc kernels pcopyout (a, b, c)
>+ {
>+ a[0] = 0;
>+ b[0] = 1;
>+ c[0] = a[0];
>+ }
>+}
>+
>+/* { dg-final { scan-tree-dump-times "(?n)= 0;$" 2 "optimized" } } */
>+/* { dg-final { scan-tree-dump-times "(?n)= 1;$" 1 "optimized" } } */
>+/* { dg-final { scan-tree-dump-times "(?n)= \\*_\[0-9\]\\\[0\\\];$" 0 "optimized" } } */
..., manually running that one for C, I get:
;; Function foo._omp_fn.0 (foo._omp_fn.0, funcdef_no=1, decl_uid=1874, cgraph_uid=1, symbol_order=1)
__attribute__((oacc function (1, 1, 1), omp target entrypoint))
foo._omp_fn.0 (const struct .omp_data_t.0 & restrict .omp_data_i)
{
unsigned int[2] * _3;
unsigned int[2] * _5;
unsigned int _7;
unsigned int[2] * _8;
<bb 2>:
_3 = *.omp_data_i_2(D).a;
*_3[0] = 0;
_5 = *.omp_data_i_2(D).b;
*_5[0] = 1;
_7 = *_3[0];
_8 = *.omp_data_i_2(D).c;
*_8[0] = _7;
return;
}
Indeed, the optimization hasn't taken place here, so it's correct that
the scan fails.
I've attached the kernels-alias-ipa-pta.c.201t.optimized dump that I get,
and it's clear the optimization is happening there (which explains why I
get all passes).
The question is why the optimization is not happening for you.
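To spell out which optimization is meant, here is a minimal sketch in plain C. It is not the outlined kernel itself: the function names are hypothetical, and `restrict` is only a stand-in for the no-alias information that, presumably, -fipa-pta has to derive for the pointers loaded from .omp_data_i. The point is that the load of a[0] can be replaced by 0 only if the store to b[0] cannot modify a[0].

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel body when nothing is known about
   aliasing: the store to b[0] might overwrite a[0], so the load in the
   last statement has to stay.  */
static void
kernel_may_alias (unsigned int *a, unsigned int *b, unsigned int *c)
{
  a[0] = 0;
  b[0] = 1;
  c[0] = a[0];
}

/* Same body, but the pointers are declared not to alias.  A compiler may
   now forward the stored 0 and emit the equivalent of c[0] = 0, which is
   what the "(?n)= 0;$" 2 scan expects to find in the dump.  */
static void
kernel_no_alias (unsigned int *restrict a, unsigned int *restrict b,
                 unsigned int *restrict c)
{
  a[0] = 0;
  b[0] = 1;
  c[0] = a[0];
}

/* Shows why the forwarding needs a proof: if a and b are the same array,
   the value stored through c is 1, not 0.  */
static void
demo (void)
{
  unsigned int a[2] = { 7, 7 }, b[2] = { 7, 7 }, c[2] = { 7, 7 };

  kernel_may_alias (a, b, c);
  assert (c[0] == 0);

  kernel_may_alias (a, a, c);	/* a and b alias here */
  assert (c[0] == 1);

  kernel_no_alias (a, b, c);	/* distinct arrays, as in the test */
  assert (c[0] == 0);
}
```

In the test case a, b and c are distinct local arrays, so the forwarding is valid there; the open point is only whether the points-to analysis sees that on your side.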
Thanks,
- Tom
;; Function foo._omp_fn.0 (foo._omp_fn.0, funcdef_no=1, decl_uid=1874, cgraph_uid=1, symbol_order=1)
__attribute__((oacc function (1, 1, 1), omp target entrypoint))
foo._omp_fn.0 (const struct .omp_data_t.0 & restrict .omp_data_i)
{
unsigned int[2] * _3;
unsigned int[2] * _5;
unsigned int[2] * _8;
<bb 2>:
_3 = *.omp_data_i_2(D).a;
*_3[0] = 0;
_5 = *.omp_data_i_2(D).b;
*_5[0] = 1;
_8 = *.omp_data_i_2(D).c;
*_8[0] = 0;
return;
}
;; Function foo (foo, funcdef_no=0, decl_uid=1866, cgraph_uid=0, symbol_order=0)
foo ()
{
unsigned int c[2];
unsigned int b[2];
unsigned int a[2];
struct .omp_data_t.0 .omp_data_arr.1;
static long unsigned int .omp_data_sizes.2[3] = {8, 8, 8};
static short unsigned int .omp_data_kinds.3[3] = {514, 514, 514};
<bb 2>:
.omp_data_arr.1.c = &c;
.omp_data_arr.1.b = &b;
.omp_data_arr.1.a = &a;
__builtin_GOACC_parallel_keyed (-1, foo._omp_fn.0, 3, &.omp_data_arr.1,
&.omp_data_sizes.2, &.omp_data_kinds.3, 0);
.omp_data_arr.1 ={v} {CLOBBER};
a ={v} {CLOBBER};
b ={v} {CLOBBER};
c ={v} {CLOBBER};
return;
}
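
For comparing the two foo._omp_fn.0 bodies against the scan counts, a rough sketch in C follows. It is a simplified stand-in for a line-anchored scan such as "(?n)= 0;$", not DejaGnu's implementation: it only counts lines that end with a fixed suffix. The helper name and the two string constants are hypothetical; the strings hold the statements of the two dumps shown in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Count the lines of TEXT that end with SUFFIX.  */
static int
count_lines_ending_with (const char *text, const char *suffix)
{
  size_t slen = strlen (suffix);
  int count = 0;
  const char *line = text;

  while (*line != '\0')
    {
      const char *eol = strchr (line, '\n');
      size_t len = eol ? (size_t) (eol - line) : strlen (line);

      if (len >= slen && memcmp (line + len - slen, suffix, slen) == 0)
	count++;
      if (eol == NULL)
	break;
      line = eol + 1;
    }
  return count;
}

/* Statements of the dump in which the load of a[0] is still present.  */
static const char dump_without_forwarding[] =
  "_3 = *.omp_data_i_2(D).a;\n"
  "*_3[0] = 0;\n"
  "_5 = *.omp_data_i_2(D).b;\n"
  "*_5[0] = 1;\n"
  "_7 = *_3[0];\n"
  "_8 = *.omp_data_i_2(D).c;\n"
  "*_8[0] = _7;\n";

/* Statements of the attached dump, where the stored 0 was forwarded.  */
static const char dump_with_forwarding[] =
  "_3 = *.omp_data_i_2(D).a;\n"
  "*_3[0] = 0;\n"
  "_5 = *.omp_data_i_2(D).b;\n"
  "*_5[0] = 1;\n"
  "_8 = *.omp_data_i_2(D).c;\n"
  "*_8[0] = 0;\n";
```

Counting the "= 0;" suffix gives 1 for the first body and 2 for the second, which lines up with the FAIL and PASS of the "(?n)= 0;$" 2 scan; the "= 1;" suffix gives 1 for both.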