Re: [OMPI users] runtime errors with openmpi-v2.x-dev-950-g995993b
Siegmar,

thanks for the report about the issue with the Sun compiler and hello world. The root cause is incorrect packaging, and a fix is available at https://github.com/open-mpi/ompi/pull/1285 (note the issue only occurs when building from a tarball). I will have a look at the other issues.

Cheers,

Gilles

On 1/6/2016 9:57 PM, Siegmar Gross wrote:

Hi,

I've successfully built openmpi-v2.x-dev-950-g995993b on my machines (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. Unfortunately I get errors running some small test programs. All programs work as expected using my gcc or cc version of openmpi-v1.10.1-138-g0e3b111. I get similar errors for the master openmpi-dev-3329-ge4bdad0. I used the following commands to build the package for gcc.

mkdir openmpi-v2.x-dev-950-g995993b-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
cd openmpi-v2.x-dev-950-g995993b-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc

../openmpi-v2.x-dev-950-g995993b/configure \
  --prefix=/usr/local/openmpi-2.0.0_64_gcc \
  --libdir=/usr/local/openmpi-2.0.0_64_gcc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
  --with-jdk-headers=/usr/local/jdk1.8.0/include \
  JAVA_HOME=/usr/local/jdk1.8.0 \
  LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
  CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
  CPP="cpp" CXXCPP="cpp" \
  --enable-mpi-cxx \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --enable-heterogeneous \
  --enable-mpi-thread-multiple \
  --with-hwloc=internal \
  --without-verbs \
  --with-wrapper-cflags="-std=c11 -m64" \
  --with-wrapper-cxxflags="-m64" \
  --with-wrapper-fcflags="-m64" \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc

make |& tee log.make.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
rm -r /usr/local/openmpi-2.0.0_64_gcc.old
mv /usr/local/openmpi-2.0.0_64_gcc /usr/local/openmpi-2.0.0_64_gcc.old
make install |& tee log.make-install.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
make check |& tee log.make-check.$SYSTEM_ENV.$MACHINE_ENV.64_gcc

A simple "hello world" or "matrix multiplication" program works with my gcc version but breaks with my cc version, as you can see at the bottom. Spawning processes breaks with both versions.

tyr spawn 128 mpiexec -np 1 --hetero-nodes --host tyr,sunpc1,linpc1,tyr spawn_multiple_master

Parent process 0 running on tyr.informatik.hs-fulda.de
  I create 3 slave processes.
[tyr.informatik.hs-fulda.de:22370] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
[tyr.informatik.hs-fulda.de:22370] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176
[tyr:22378] *** An error occurred in MPI_Comm_spawn_multiple
[tyr:22378] *** reported by process [4047765505,0]
[tyr:22378] *** on communicator MPI_COMM_WORLD
[tyr:22378] *** MPI_ERR_SPAWN: could not spawn processes
[tyr:22378] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[tyr:22378] ***    and potentially your MPI job)
tyr spawn 129

tyr spawn 151 mpiexec -np 1 --hetero-nodes --host sunpc1,linpc1,linpc1 spawn_intra_comm
Parent process 0: I create 2 slave processes

Parent process 0 running on sunpc1
  MPI_COMM_WORLD ntasks:              1
  COMM_CHILD_PROCESSES ntasks_local:  1
  COMM_CHILD_PROCESSES ntasks_remote: 2
  COMM_ALL_PROCESSES ntasks:          3
  mytid in COMM_ALL_PROCESSES:        0

Child process 1 running on linpc1
  MPI_COMM_WORLD ntasks:              2
  COMM_ALL_PROCESSES ntasks:          3
  mytid in COMM_ALL_PROCESSES:        2

Child process 0 running on linpc1
  MPI_COMM_WORLD ntasks:              2
  COMM_ALL_PROCESSES ntasks:          3
  mytid in COMM_ALL_PROCESSES:        1

--
mpiexec noticed that process rank 0 with PID 16203 on node sunpc1 exited on signal 13 (Broken Pipe).
--
tyr spawn 152

I don't see a broken pipe if I change the sequence of sunpc1 and linpc1.

tyr spawn 146 mpiexec -np 1 --hetero-nodes --host linpc1,sunpc1,sunpc1 spawn_intra_comm
Parent process 0: I create 2 slave processes

Child process 1 running on sunpc1
  MPI_COMM_WORLD ntasks:              2
  COMM_ALL_PROCESSES ntasks:          3
  mytid in COMM_ALL_PROCESSES:        2

Child process 0 running on sunpc1
  MPI_COMM_WORLD ntasks:              2
  COMM_ALL_PROCESSES ntasks:          3
  mytid in COMM_ALL_PROCESSES:        1

Parent process 0 running on linpc1
  MPI_COMM_WORLD ntasks:              1
  COMM_CHILD_PROCESSES ntasks_local:  1
  COMM_CHILD_PROCESSES ntasks_remote: 2
  COMM_ALL_PROCESSES ntasks:          3
  mytid in COMM_ALL_PROCESSES:        0

The process doesn't return and uses about 50% cpu time (1 of 2 processors), if I combine a x86_64 proce
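Siegmar's test sources are not attached to this thread, so purely as a point of reference, here is a hypothetical minimal parent that issues the same kind of call; the slave binary name "spawn_slave" and the 1+2 process split are made up and are not necessarily what spawn_multiple_master does.

/* spawn_multiple_parent.c -- hypothetical minimal reproducer; NOT Siegmar's
 * actual spawn_multiple_master, whose source is not part of this thread. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    char     *commands[2] = { "spawn_slave", "spawn_slave" };  /* made-up binary */
    int       maxprocs[2] = { 1, 2 };            /* 3 slave processes in total */
    MPI_Info  infos[2]    = { MPI_INFO_NULL, MPI_INFO_NULL };
    int       errcodes[3];
    MPI_Comm  intercomm;

    MPI_Init(&argc, &argv);
    printf("Parent process 0: I create 3 slave processes.\n");

    /* This is the kind of call that aborts with MPI_ERR_SPAWN in the log above. */
    MPI_Comm_spawn_multiple(2, commands, MPI_ARGVS_NULL, maxprocs, infos,
                            0, MPI_COMM_WORLD, &intercomm, errcodes);

    MPI_Finalize();
    return 0;
}

Run with "mpiexec -np 1", the MPI_Comm_spawn_multiple call is where a failure like the one above would surface.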
[OMPI users] warnings building openmpi-v2.x-dev-950-g995993b
Hi,

perhaps somebody is interested in some warnings which I got building openmpi-v2.x-dev-950-g995993b on my machines (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. I got the same warnings on Solaris Sparc and x86_64. I've produced the attached files with the following command.

grep -i warning \
  openmpi-v2.x-dev-950-g995993b-Linux.x86_64.64_cc/log.make.* | \
  grep -v atomic.h | grep -v relinking | grep -v "not reached" | \
  grep -v "failed to detect system" | grep -v "Optimizer level" | \
  grep -v "seems to be" | grep -v "attempted multiple" | \
  sort | uniq > /tmp/warning_Linux_x86_64_cc.txt

Kind regards

Siegmar

"../../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/hwloc/hwloc/hwloc/src/topology-custom.c", line 88: warning: initializer will be sign-extended: -1
"../../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/hwloc/hwloc/hwloc/src/topology-linux.c", line 2694: warning: initializer will be sign-extended: -1
"../../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/hwloc/hwloc/hwloc/src/topology-synthetic.c", line 851: warning: initializer will be sign-extended: -1
"../../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/hwloc/hwloc/hwloc/src/topology-xml.c", line 1667: warning: initializer will be sign-extended: -1
"../../../../../../openmpi-v2.x-dev-950-g995993b/ompi/mca/io/romio314/romio/adio/common/utils.c", line 97: warning: argument #3 is incompatible with prototype:
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/buffer_ops/internal_functions.c", line 117: warning: identifier redeclared: opal_pmix_pmix112_pmix_bfrop_get_data_type
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/buffer_ops/open_close.c", line 56: warning: initialization type mismatch
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/buffer_ops/open_close.c", line 57: warning: initialization type mismatch
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/buffer_ops/open_close.c", line 58: warning: initialization type mismatch
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/buffer_ops/open_close.c", line 59: warning: initialization type mismatch
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/buffer_ops/open_close.c", line 60: warning: initialization type mismatch
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/buffer_ops/pack.c", line 63: warning: identifier redeclared: opal_pmix_pmix112_pmix_bfrop_pack_buffer
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 299: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_remove_value_uint64
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 365: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_get_value_ptr
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 392: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_set_value_ptr
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 433: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_remove_value_ptr
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 466: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_get_first_key_uint32
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 493: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_get_next_key_uint32
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 538: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_get_first_key_uint64
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 565: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_get_next_key_uint64
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.c", line 73: warning: identifier redeclared: opal_pmix_pmix112_pmix_hash_table_init
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/class/pmix_hash_table.h", line 323: warning: implicit function declaration: __builtin_clz
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/client/pmix_client.c", line 223: warning: identifier redeclared: OPAL_PMIX_PMIX112_PMIx_Init
"../../../../../../openmpi-v2.x-dev-950-g995993b/opal/mca/pmix/pmix112/pmix/src/client/pmix_client.c", line 465: warning: identifier redeclared: OPAL_PMIX_PMIX112_PMIx_Abort
"../../../.
[OMPI users] Singleton process spawns additional thread
Hi!

I have a weird problem with executing a singleton OpenMPI program, where an additional thread causes full load while the master thread performs the actual calculations. In contrast, executing "mpirun -np 1 [executable]" performs the same calculation at the same speed, but the additional thread is idling.

In my understanding, both calculations should behave in the same way (i.e., one working thread) for a program which is simply moving some data around (mainly some MPI_BCAST and MPI_GATHER commands).

I observed this behaviour in OpenMPI 1.10.1 with ifort 16.0.1 and gfortran 5.3.0. I have created a minimal working example, which is appended to this mail.

Am I missing something?

Best regards,
Stefan

- MWE: Compile this with "mpifort main.f90". When executing with "./a.out", there is a thread wasting cycles while the master thread waits for input. When executing with "mpirun -np 1 ./a.out", this thread is idling.

program main
    use mpi_f08
    implicit none

    integer :: ierror, rank

    call MPI_Init(ierror)
    call MPI_Comm_Rank(MPI_Comm_World, rank, ierror)

    ! let master thread wait on [RETURN]-key
    if (rank == 0) then
        read(*,*)
    end if

    write(*,*) rank

    call mpi_barrier(mpi_comm_world, ierror)
end program
Re: [OMPI users] Singleton process spawns additional thread
Stefan,

I don't know if this is related to your issue, but FYI...

> Those are async progress threads - they block unless something requires doing
>
>> On Apr 15, 2015, at 8:36 AM, Sasso, John (GE Power & Water, Non-GE) wrote:
>>
>> I stumbled upon something while using 'ps -eFL' to view threads of processes, and Google searches have failed to answer my question. This question holds for OpenMPI 1.6.x and even OpenMPI 1.4.x.
>>
>> For a program which is pure MPI (built and run using OpenMPI) and does not implement Pthreads or OpenMP, why is it that each MPI task appears as having 3 threads:
>>
>> UID    PID   PPID   LWP  C NLWP     SZ    RSS PSR STIME TTY     TIME     CMD
>> sasso 20512 20493 20512 99    3 187849 582420  14 11:01 ?       00:26:37 /home/sasso/mpi_example.exe
>> sasso 20512 20493 20588  0    3 187849 582420  11 11:01 ?       00:00:00 /home/sasso/mpi_example.exe
>> sasso 20512 20493 20599  0    3 187849 582420  12 11:01 ?       00:00:00 /home/sasso/mpi_example.exe
>>
>> whereas if I compile and run a non-MPI program, 'ps -eFL' shows it running as a single thread?
>>
>> Granted, the CPU utilization (C) for 2 of the 3 threads is zero, but the threads are bound to different processors (11, 12, 14). I am curious as to why this is, and am not complaining that there is a problem. Thanks!
>>
>> --john

-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Au Eelis
Sent: Thursday, January 07, 2016 7:10 AM
To: us...@open-mpi.org
Subject: [OMPI users] Singleton process spawns additional thread
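One way to see these helper threads from inside the program itself is to read the "Threads:" field of /proc/self/status before and after MPI_Init. This is a small Linux-only C sketch, not something from the original thread:

/* threads_after_init.c -- Linux-only sketch: print the kernel's thread count
 * for this process before and after MPI_Init. */
#include <stdio.h>
#include <string.h>
#include <mpi.h>

static void print_thread_count(const char *label)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    if (f == NULL)
        return;
    while (fgets(line, sizeof(line), f) != NULL) {
        if (strncmp(line, "Threads:", 8) == 0) {   /* e.g. "Threads: 3" */
            printf("%s: %s", label, line);
            break;
        }
    }
    fclose(f);
}

int main(int argc, char *argv[])
{
    print_thread_count("before MPI_Init");
    MPI_Init(&argc, &argv);
    print_thread_count("after MPI_Init");
    MPI_Finalize();
    return 0;
}

The number printed after MPI_Init corresponds to the NLWP column that 'ps -eFL' shows.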
Re: [OMPI users] OpenMPI Profiling
I don't know specifically what you want to do, but there is a FAQ section on profiling and tracing:

http://www.open-mpi.org/faq/?category=perftools

On 12/31/2015 9:03 AM, anil maurya wrote:

> I have compiled HPL using OpenMPI and GotoBLAS. I want to do profiling and tracing. I have compiled OpenMPI using -enable-profile-mpi.
>
> Please let me know how to do the profiling. If I want to use PAPI for hardware-based profiling, do I need to compile HPL again with PAPI support?
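One of the techniques that FAQ page covers is the standard MPI profiling (PMPI) interface: you provide your own MPI_* wrappers and forward to the PMPI_* entry points. A minimal sketch (the counters and output format here are made up for illustration) might look like this:

/* timing_wrapper.c -- minimal PMPI interposition sketch: time MPI_Send and
 * report per-rank totals at MPI_Finalize. */
#include <stdio.h>
#include <mpi.h>

static double send_time  = 0.0;
static long   send_calls = 0;

/* Intercept MPI_Send: time it, then forward to the real implementation. */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    double t0 = MPI_Wtime();
    int rc = PMPI_Send(buf, count, datatype, dest, tag, comm);
    send_time  += MPI_Wtime() - t0;
    send_calls += 1;
    return rc;
}

/* Report the totals when the application shuts MPI down. */
int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d: %ld MPI_Send calls, %.6f s total\n",
           rank, send_calls, send_time);
    return PMPI_Finalize();
}

A wrapper file like this only needs to be compiled and linked with the application ahead of the MPI library; it does not require reconfiguring Open MPI.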