Does gcc cilk plus support include offloading to graphics hardware?

2016-04-19 Thread Hal Ashburner
Release notes say:
"Full support for Cilk Plus has been added to the GCC compiler. Cilk
Plus is an extension to the C and C++ languages to support data and
task parallelism."

gcc-5.2 (centos-7, devtoolset-4) says:

g++ -std=c++14 -Wall -O3 -march=native -fcilkplus vec_add.cpp -o vec_add
vec_add.cpp:6:0: warning: ignoring #pragma offload target [-Wunknown-pragmas]
 #pragma offload target(gfx) pin(out, in1, in2 : length(n))

Thanks





#include <cilk/cilk.h>
#include <iostream>

void vec_add(int n, float *out, float *in1, float *in2)
{
#pragma offload target(gfx) pin(out, in1, in2 : length(n))
    cilk_for(int i = 0; i != n; ++i)
    {
        out[i] = in1[i] + in2[i];
    }
}

static int ar_sz = 10;
int main (int argc, char **argv)
{
    float foo[ar_sz];
    float bar[ar_sz];
    float out[ar_sz];
    for(int i = 0; i != ar_sz; ++i)
    {
        foo[i] = i + ar_sz * 10;
        bar[i] = i;
    }
    vec_add(ar_sz, out, foo, bar);

    for(int i = 0; i != ar_sz; i += 100)
    {
        std::cout << "foo[" << i << "] =" << foo[i] << "\t|\tbar[" <<
            i << "] =" <<  bar[i] << std::endl;
    }
}

Compiled with

FLAGS=-std=c++14 -Wall -O3 -march=native -fcilkplus

all: vec_add fib

vec_add: vec_add.cpp
	g++ $(FLAGS) $< -o $@



$gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-4/root/usr/libexec/gcc/x86_64-redhat-linux/5.2.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,fortran,lto
--prefix=/opt/rh/devtoolset-4/root/usr
--mandir=/opt/rh/devtoolset-4/root/usr/share/man
--infodir=/opt/rh/devtoolset-4/root/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --enable-plugin --with-linker-hash-style=gnu
--enable-initfini-array --disable-libgcj
--with-default-libstdcxx-abi=gcc4-compatible
--with-isl=/builddir/build/BUILD/gcc-5.2.1-20150902/obj-x86_64-redhat-linux/isl-install
--enable-libmpx --enable-gnu-indirect-function --with-tune=generic
--with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 5.2.1 20150902 (Red Hat 5.2.1-2) (GCC)


Re: Does gcc cilk plus support include offloading to graphics hardware?

2016-04-20 Thread Hal Ashburner
Thank you Ilya,
I now understand that gcc has full support for only the _language_
extensions of cilk plus. Perhaps the release notes might be updated to
note this.

Intel market #pragma offload(gfx) as a cilk plus feature. For example
slides 12-15 here:
https://meetingcpp.com/tl_files/mcpp/2015/talks/Intel%20Graphics%20Technology%20for%20general%20purpose%20computing.pdf

Thanks again.



On 20 April 2016 at 19:31, Ilya Verbin wrote:
> 2016-04-20 4:01 GMT+03:00 Hal Ashburner:
>> Release notes say:
>> "Full support for Cilk Plus has been added to the GCC compiler. Cilk
>> Plus is an extension to the C and C++ languages to support data and
>> task parallelism."
>>
>> gcc-5.2 (centos-7, devtoolset-4) says:
>>
>> g++ -std=c++14 -Wall -O3 -march=native -fcilkplus vec_add.cpp -o vec_add
>> vec_add.cpp:6:0: warning: ignoring #pragma offload target [-Wunknown-pragmas]
>>  #pragma offload target(gfx) pin(out, in1, in2 : length(n))
>
> "#pragma offload" is not a part of Cilk Plus [1], and it is not
> supported by GCC.
> However, GCC supports similar "#pragma omp target" for offloading to
> Intel Xeon Phi and other GPUs (Intel Graphics is not supported).
>
> [1] www.cilkplus.org/sites/default/files/open_specifications/Intel_Cilk_plus_lang_spec_1.2.htm
>
>   -- Ilya
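
For reference, the "#pragma omp target" route Ilya mentions could look roughly like the sketch below. This is only an illustration, not code from the thread: the name vec_add_omp is made up here, and it assumes an OpenMP 4.0 capable GCC invoked with -fopenmp. If the compiler was not built with an offload target, the region should simply run on the host.

```cpp
#include <cassert>

// Hypothetical OpenMP 4.0 counterpart of vec_add() from the original post.
// The name and the const qualifiers are assumptions made for this sketch.
void vec_add_omp(int n, float *out, const float *in1, const float *in2)
{
    assert(n >= 0);
    // map(to:) copies the inputs to the device; map(from:) copies the
    // result back to the host when the target region ends.
#pragma omp target map(to: in1[0:n], in2[0:n]) map(from: out[0:n])
#pragma omp parallel for
    for (int i = 0; i < n; ++i)
    {
        out[i] = in1[i] + in2[i];
    }
}
```

Unlike pin() in the Intel pragma, the map clauses describe explicit copies, so this is a similar mechanism rather than a drop-in replacement.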


Re: Does gcc cilk plus support include offloading to graphics hardware?

2016-04-20 Thread Hal Ashburner
Another cilk plus question:
Is op_ostream also considered to be outside of cilk plus?
https://www.cilkplus.org/docs/doxygen/include-dir/group___reducers_ostream.html
I am trying to compile the basic "Cilk Plus Tutorial Sources" code as
supplied at http://cilkplus.org/download
reducer-ostream-demo.cpp, reducer-string-demo.cpp and
reducer-wstring-demo.cpp I am unable to get to compile.

Thank you kindly



Makefile:

FLAGS=-std=c++14 -Wall -O3 -march=native
CILK_FLAGS=-fcilkplus

SOURCES=$(shell ls *.cpp)
BINS=$(SOURCES:.cpp=)

all:$(BINS)
%:%.cpp
	g++ $(FLAGS) $(CILK_FLAGS) $< -o $@


Re: Does gcc cilk plus support include offloading to graphics hardware?

2016-04-21 Thread Hal Ashburner
That basic tutorial code was last updated 3 years ago. I think we've
established pretty clearly that gcc does _not_ have full support of
what intel calls cilk plus. Offload not supported, and the 3 year old
basic introductory tutorial code from the cilkplus.org website doesn't
compile. I'm likely to be suspicious of gcc's claims of support of
cilk plus in the future.
Thank you very much for your help, Ilya. I appreciate it. It's good to
find these things out as early as possible.

On 22 April 2016 at 05:31, Ilya Verbin wrote:
> 2016-04-21 7:09 GMT+03:00 Hal Ashburner:
>> Another cilk plus question:
>> Is op_ostream also considered to be outside of cilk plus?
>> https://www.cilkplus.org/docs/doxygen/include-dir/group___reducers_ostream.html
>> I am trying to compile the basic "Cilk Plus Tutorial Sources" code as
>> supplied at http://cilkplus.org/download
>> reducer-ostream-demo.cpp, reducer-string-demo.cpp and
>> reducer-wstring-demo.cpp I am unable to get to compile.
>
> The tutorial samples require the latest Cilk runtime (not in GCC yet).
> The new runtime will be merged into mainline soon.
>
>   -- Ilya


__attribute__((aligned())) address of (&) arithmetic

2014-06-28 Thread Hal Ashburner
//alignment.cpp
#include <iostream>

struct Struct_containing_64aligned_member {
    Struct_containing_64aligned_member()
        : aligned_var(80)
    {}
    uint64_t aligned_var __attribute__((aligned(64)));
};

struct Struct_no_aligned_member {
    Struct_no_aligned_member()
        : var(80)
    {}
    uint64_t var;
};


int main()
{
    Struct_no_aligned_member *s1 = new Struct_no_aligned_member();
    Struct_containing_64aligned_member *s2 =
        new Struct_containing_64aligned_member();

    std::cout << "alignof(Struct_no_aligned_member) : "
              << alignof(Struct_no_aligned_member) << '\n';
    std::cout << "alignof(Struct_containing_64aligned_member) : "
              << alignof(Struct_containing_64aligned_member) << '\n';
    std::cout << "alignof(*s1) : " << alignof(*s1) << '\n';
    std::cout << "alignof(*s2) : " << alignof(*s2) << '\n';

    std::cout << "s1 = " << s1 << "\n";
    std::cout << "s2 = " << s2 << "\n";

    std::cout << "&*s1 = " << &*s1 << "\n";
    std::cout << "&*s2 = " << &*s2 << "\n";

    std::cout << "&*s1 % 0x40 = " << (uint64_t)&*s1 % 0x40 << "\n";
    std::cout << "&*s2 % 0x40= " << (uint64_t)&*s2 % 0x40 << "\n";


    std::cout << "&s1->var = " << &s1->var << "\n";
    std::cout << "&s2->aligned_var = " << &s2->aligned_var << "\n";

    std::cout << "&s1->var % 0x40 = 0x" << std::hex
              << (uint64_t)&s1->var % 0x40 << "\n";
    std::cout << "&s2->aligned_var % 0x40  = 0x" << std::hex
              << (uint64_t)&s2->aligned_var % 0x40 << "\n";
}


g++ -std=c++11 -Wall -g  alignment.cpp -o alignment



alignof(Struct_no_aligned_member) : 8
alignof(Struct_containing_64aligned_member) : 64
alignof(*s1) : 8
alignof(*s2) : 64
s1 = 0x199c010
s2 = 0x199c030
&*s1 = 0x199c010
&*s2 = 0x199c030
&*s1 % 0x40 = 16
&*s2 % 0x40= 48
&s1->var = 0x199c010
&s2->aligned_var = 0x199c030
&s1->var % 0x40 = 0x10
&s2->aligned_var % 0x40  = 0x0

Surprised me that &s2->aligned_var % 0x40 was not calculated as 0x30
given its address.
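
To spell out the arithmetic: taking the two addresses printed above and reducing them modulo 0x40 by hand gives 0x10 and 0x30. The small sketch below does the same at compile time; the helper name misalignment is made up for illustration, and the addresses are just the ones from this particular run.

```cpp
#include <cstdint>

// Remainder of an address modulo an alignment.
// The bit-mask form is only valid when alignment is a power of two.
constexpr std::uint64_t misalignment(std::uint64_t addr, std::uint64_t alignment)
{
    return addr & (alignment - 1);
}

// s1 = 0x199c010 and s2 = 0x199c030 are the addresses printed in the run above.
static_assert(misalignment(0x199c010, 0x40) == 0x10, "s1 sits 16 bytes past a 64-byte boundary");
static_assert(misalignment(0x199c030, 0x40) == 0x30, "s2 sits 48 bytes past a 64-byte boundary");
```

So the 0x0 in the last line of the output cannot have come from the run-time address of s2.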

I suspect something related to this resulted in a segfault for me.
With -O3 -march=corei7-avx, gcc thought 4 consecutive 64-bit integers
were aligned to 256 bits and could be initialised with a vmovdqa
instruction. This was incorrect: the structure was on the heap and
the allocation was not aligned using posix_memalign() or similar, but gcc
seems to have calculated offsets as though it were aligned.
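
For comparison, a minimal sketch of the posix_memalign() route mentioned above: allocate storage with the type's own alignment, then construct the object in place. The names Aligned64, make_aligned64 and destroy_aligned64 are made up for this illustration, and alignas(64) is used as the standard C++11 spelling of the aligned attribute; it assumes a POSIX system.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <stdlib.h>

// Hypothetical stand-in for Struct_containing_64aligned_member.
struct Aligned64 {
    Aligned64() : aligned_var(80) {}
    alignas(64) std::uint64_t aligned_var;
};

// Allocate suitably aligned storage, then construct the object in it.
Aligned64 *make_aligned64()
{
    void *raw = nullptr;
    if (posix_memalign(&raw, alignof(Aligned64), sizeof(Aligned64)) != 0)
        return nullptr;
    // Checked on the raw pointer, before it carries the over-aligned type.
    assert(reinterpret_cast<std::uintptr_t>(raw) % alignof(Aligned64) == 0);
    return new (raw) Aligned64();
}

// Storage from posix_memalign() is released with free(), not delete.
void destroy_aligned64(Aligned64 *p)
{
    if (p)
    {
        p->~Aligned64();
        free(p);
    }
}
```

With storage obtained this way, the compiler's assumption of 64-byte alignment for the member actually holds at run time.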

Not sure if this is the intended behaviour for an object on the heap
with an aligned member and the docs need updating to include the
gotcha or if this is working differently to intended. Interested to
hear people's thinking.


Centos 6.4 with gcc-4.8.2 from Scientific Gnu/Linux CERN 6
(same behaviour evident on ubuntu 14.04 gnu/linux gcc 4.8.2)

 g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/opt/centos/devtoolset-2/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/4.8.2/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/opt/rh/devtoolset-2/root/usr
--mandir=/opt/rh/devtoolset-2/root/usr/share/man
--infodir=/opt/rh/devtoolset-2/root/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap
--enable-shared --enable-threads=posix --enable-checking=release
--with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --enable-languages=c,c++,fortran,lto
--enable-plugin --with-linker-hash-style=gnu --enable-initfini-array
--disable-libgcj
--with-isl=/builddir/build/BUILD/gcc-4.8.2-20140120/obj-x86_64-redhat-linux/isl-install
--with-cloog=/builddir/build/BUILD/gcc-4.8.2-20140120/obj-x86_64-redhat-linux/cloog-install
--with-mpc=/builddir/build/BUILD/gcc-4.8.2-20140120/obj-x86_64-redhat-linux/mpc-install
--with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.2 20140120 (Red Hat 4.8.2-15) (GCC)