[Bug c++/82629] OpenMP 4.5 Target Region mangling problem

2017-10-27 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82629 --- Comment #4 from Thorsten Kurth --- Hello Richard, Was the test case received? Best Regards Thorsten Kurth

[Bug c++/82629] OpenMP 4.5 Target Region mangling problem

2017-10-20 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82629 --- Comment #3 from Thorsten Kurth --- One more thing: in the test case I sent, please change the $(XPPFLAGS) in the main.x target compilation to $(CXXFLAGS), so that -fopenmp is also used at link time. However, that does not solve the problem b

[Bug c++/82629] OpenMP 4.5 Target Region mangling problem

2017-10-20 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82629 --- Comment #2 from Thorsten Kurth --- Created attachment 42420 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42420&action=edit This is the test case demonstrating the problem. Linking this code will produce: -bash-4.2$ make main.x g++ -

[Bug c++/81896] omp target enter data not recognized

2017-10-20 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81896 --- Comment #2 from Thorsten Kurth --- Hello, another data point: when I create a dummy variable, it works: for example alias data to tmp and then use tmp. I think this is not working for the same reason one cannot arbitrarily put class member v

[Bug libgomp/80859] Performance Problems with OpenMP 4.5 support

2017-10-20 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #28 from Thorsten Kurth --- Hello, can someone please give me an update on this bug? Best Regards Thorsten Kurth

[Bug c++/81896] omp target enter data not recognized

2017-10-20 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81896 --- Comment #1 from Thorsten Kurth --- Hello, is this report actually being worked on? It has been in the unconfirmed state for quite a while now. Best Regards Thorsten Kurth

[Bug c++/82629] New: OpenMP 4.5 Target Region mangling problem

2017-10-19 Thread thorstenkurth at me dot com
++ Assignee: unassigned at gcc dot gnu.org Reporter: thorstenkurth at me dot com Target Milestone: --- Dear Sir/Madam, I ran into linking issues with gcc (GCC) 7.1.1 20170718 and OpenMP 4.5 target offloading. I am compiling a mixed Fortran/C++ code where target regions can be in

[Bug c++/81896] New: omp target enter data not recognized

2017-08-18 Thread thorstenkurth at me dot com
++ Assignee: unassigned at gcc dot gnu.org Reporter: thorstenkurth at me dot com Target Milestone: --- Created attachment 42005 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42005&action=edit small test case Dear Sir/Madam, I am not sure if my report got posted the fir

[Bug c++/81850] New: OpenMP target enter data compilation issues

2017-08-14 Thread thorstenkurth at me dot com
++ Assignee: unassigned at gcc dot gnu.org Reporter: thorstenkurth at me dot com Target Milestone: --- Created attachment 41990 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41990&action=edit Test case Dear Sir/Madam, g++ 7.1.1 cannot compile correct OpenMP 4.5

[Bug libgomp/80859] Performance Problems with OpenMP 4.5 support

2017-08-08 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #27 from Thorsten Kurth --- Hello Jakub, I wanted to follow up on this. Is there any progress on this issue? Best Regards Thorsten Kurth

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-26 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #26 from Thorsten Kurth --- Hello Jakub, thanks for the clarification. So a team maps to a CTA which is somewhat equivalent to a block in CUDA language, correct? And it is good to have some categorical equivalency between GPU and CPU

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-26 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #24 from Thorsten Kurth --- Hello Jakub, I know that the section you mean is racy and that getting the wrong number of threads is not right, but I put this in to see whether I get the correct numbers on a CPU (I am not working on a GPU y

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-25 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #22 from Thorsten Kurth --- Hello Jakub, that is stuff for Intel vTune. I have commented it out and added the NUM_TEAMS defines in the GNUmakefile. Please pull the latest changes. Best and thanks Thorsten

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-25 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #20 from Thorsten Kurth --- To compile the code, edit the GNUmakefile to suit your needs (feel free to ask any questions). To run it, execute the generated executable, called something like main3d.XXX... and the XXX tell

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #19 from Thorsten Kurth --- Thank you very much. I am sorry that I do not have a simpler test case. The kernel which is executed is in the same directory as ABecLaplacian and called MG_3D_cpp.cpp. We have seen similar problems with

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #17 from Thorsten Kurth --- The result is correct, though: I verified that both codes generate the correct output.

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #16 from Thorsten Kurth --- FYI, the code is: https://github.com/zronaghi/BoxLib.git in branch cpp_kernels_openmp4dot5 and then in Src/LinearSolvers/C_CellMG the file ABecLaplacian.cpp. For example, lines 542 and 543 can be comme

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #15 from Thorsten Kurth --- The code I care about definitely has optimization enabled. For the Fortran stuff it does (for example): ftn -g -O3 -ffree-line-length-none -fno-range-check -fno-second-underscore -Jo/3d.gnu.MPI.OMP.EXE -

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #13 from Thorsten Kurth --- Hello Jakub, the compiler options are just -fopenmp. I am sure it does not have anything to do with vectorization as I compare the code runtime with and without the target directives and thus vectorization

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #11 from Thorsten Kurth --- Hello Jakub, yes, you are right. I thought that map(tofrom:) is the default mapping but I might be wrong. In any case, teams is always 1. So this code is basically just data streaming so there is no n

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #9 from Thorsten Kurth --- Sorry, in the second run I set the number of threads to 12. I think the code works as expected.

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #8 from Thorsten Kurth --- Here is the output of the get_num_threads section: [tkurth@cori02 omp_3_vs_45_test]$ export OMP_NUM_THREADS=32 [tkurth@cori02 omp_3_vs_45_test]$ ./nested_test_omp_4dot5.x We got 1 teams and 32 threads. and

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #7 from Thorsten Kurth --- Hello Jakub, thanks for your comment but I think the parallel for is not racy. Every thread is working on a block of i-indices so that is fine. The dotprod kernel is actually a kernel from the OpenMP standard
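
For readers without the attachment, the kind of kernel under discussion might look roughly like the sketch below. It is not the code from the bug report: the name `dotprod`, its signature, and the use of a combined construct are assumptions made for illustration. Each iteration reads only its own `i`-th elements, and the shared accumulator is handled by the `reduction` clause, which is why such a loop has no data race. Built without -fopenmp the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dot-product kernel (an illustration, not the reporter's code).
double dotprod(const std::vector<double>& b, const std::vector<double>& c) {
    const double* pb = b.data();
    const double* pc = c.data();
    // Use the shorter length so that every index stays in bounds.
    const std::size_t n = b.size() < c.size() ? b.size() : c.size();
    double sum = 0.0;

    // Inputs are mapped to the device; the scalar accumulator is mapped
    // tofrom explicitly because scalars are otherwise firstprivate on target.
    #pragma omp target teams distribute parallel for reduction(+:sum) \
        map(to: pb[0:n], pc[0:n]) map(tofrom: sum)
    for (std::size_t i = 0; i < n; ++i) {
        sum += pb[i] * pc[i];  // each iteration touches only index i
    }
    return sum;
}
```

Calling `dotprod({1, 2, 3, 4}, {1, 1, 1, 1})` should give 10 whether the loop runs on the host or is offloaded.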

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #5 from Thorsten Kurth --- To clarify the problem: I think that the additional movq, pushq and other instructions generated when using the target directive can cause a big hit on the performance. I understand that these instructions a

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #4 from Thorsten Kurth --- Created attachment 41415 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41415&action=edit Testcase This is the test case. The files ending on .as contain the assembly code with and without target regi

[Bug c++/80859] Performance Problems with OpenMP 4.5 support

2017-05-24 Thread thorstenkurth at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859 --- Comment #3 from Thorsten Kurth --- Created attachment 41414 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41414&action=edit OpenMP 4.5 Testcase This is the source code

[Bug c++/80859] New: Performance Problems with OpenMP 4.5 support

2017-05-22 Thread thorstenkurth at me dot com
++ Assignee: unassigned at gcc dot gnu.org Reporter: thorstenkurth at me dot com Target Milestone: --- Dear Sir/Madam, I am working on the Cori HPC system, a Cray XC-40 with Intel Xeon Phi 7250. I probably found a performance "bug" when using the OpenMP 4.5 target

[Bug c/60101] New: Long compile times when mixed complex floating point datatypes are used in lengthy expressions

2014-02-06 Thread thorstenkurth at me dot com
Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: thorstenkurth at me dot com Created attachment 32071 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32071&action=edit Archive which includes test ca