On Tue, Jun 6, 2017 at 3:39 PM, Tom de Vries <tom_devr...@mentor.com> wrote:
> [ was: Re: [nvptx, PATCH, 3/3] Add v2di support ]
>
> On 06/06/2017 03:12 PM, Tom de Vries wrote:
>>
>> diff --git a/libgomp/testsuite/libgomp.oacc-c/vec.c
>> b/libgomp/testsuite/libgomp.oacc-c/vec.c
>> new file mode 100644
>> index 0000000..79c1c17
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-c/vec.c
>> @@ -0,0 +1,48 @@
>> +/* { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
>> +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
>> +/* { dg-additional-options "-std=c99 -ftree-slp-vectorize
>> -foffload=-ftree-slp-vectorize -foffload=-fdump-tree-slp1
>> -foffload=-save-temps -save-temps" } */
>> +
>> +#include <stdio.h>
>> +#include <sys/time.h>
>> +
>> +long long int p[32 *1000] __attribute__((aligned(16)));
>> +long long int p2[32 *1000] __attribute__((aligned(16)));
>> +
>> +int
>> +main (void)
>> +{
>> +#pragma acc parallel num_gangs(1) num_workers(1) vector_length(32)
>> +  {
>> +    if (((unsigned long int)p & (0xfULL)) != 0)
>> +      __builtin_abort ();
>> +    if (((unsigned long int)p2 & (0xfULL)) != 0)
>> +      __builtin_abort ();
>> +
>> +    for (unsigned int k = 0; k < 10000; k += 1)
>> +      {
>> +#pragma acc loop vector
>> +        for (unsigned long long int j = 0; j < 32; j += 1)
>> +          {
>> +            unsigned long long a, b;
>> +            unsigned long long *p3, *p4;
>> +            p3 = (unsigned long long *)((unsigned long long int)p &
>> (~0xfULL));
>> +            p4 = (unsigned long long *)((unsigned long long int)p2 &
>> (~0xfULL));
>> +
>> +            for (unsigned int i = 0; i < 1000; i += 2)
>> +              {
>> +                a = p3[j * 1000 + i];
>> +                b = p3[j * 1000 + i + 1];
>> +
>> +                p4[j * 1000 + i] = a;
>> +                p4[j * 1000 + i + 1] = b;
>> +              }
>> +          }
>> +      }
>> +  }
>> +
>> +  return 0;
>> +}
>> +
>> +/* Todo: make a scan-tree-dump variant that scans vec.o instead. */
>> +/* { dg-final { file copy -force [glob vec.o.*] [regsub \.o\. [glob
>> vec.o.*] \.c\.] } } */
>> +/* { dg-final { scan-tree-dump "vector\\(2\\) long long unsigned int"
>> "slp1" } } */
>
>
> Hi,
>
> we have scan-tree-dump that scans in test.c.* files. But when we run lto1
> for the offloaded region, we produce test.o.* files instead. In the
> test-case above, I work around that by using 'dg-final { file copy }'. What
> is a good way to get rid of this workaround ?
>
> Add scan-o-tree-dump ?
>
> Or make the "slp1" field smarter, and allow f.i. "o.slp1" ?

There is the same issue with regular LTO tests using scan-tree-dump which
end up scanning the "fat" compilation dumpfile.  Maybe add
scan-ltrans-tree-dump and scan-wpa-ipa-dump that look at appropriate
files plus passing appropriate flags to generate dumpfiles in known
locations (I think part of them end up in /tmp).

Richard.

> Thanks,
> - Tom
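
For reference, the renaming that the 'dg-final { file copy }' workaround
performs can be sketched in C. This is only an illustration of the
vec.o.* to vec.c.* mapping done by the regsub in the test-case, not
testsuite code: the function name offload_dump_to_source_dump and the
dump name used below (in particular its pass number) are made up for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy NAME into OUT (capacity OUT_SIZE) with the
   first ".o." rewritten to ".c.", mirroring what a single regsub of
   \.o\. to \.c\. does.  Returns 0 on success, -1 if NAME contains no
   ".o." or OUT is too small to hold the result.  */
int
offload_dump_to_source_dump (const char *name, char *out, size_t out_size)
{
  assert (name != NULL && out != NULL);

  const char *hit = strstr (name, ".o.");
  size_t len = strlen (name);
  if (hit == NULL || len + 1 > out_size)
    return -1;

  /* Same length before and after, so copy and patch one character.  */
  memcpy (out, name, len + 1);
  out[(hit - name) + 1] = 'c';
  return 0;
}
```

With a made-up dump name such as "vec.o.162t.slp1" this produces
"vec.c.162t.slp1", which is the name scan-tree-dump looks for. A
dedicated scan directive would make this step unnecessary.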