Here are the requested files. The archive contains the output of configure, make, and make install, as well as the config.log, the environment in effect when running ring_c, and the output of ompi_info --all.
Just as a reminder: the ring_c example compiled and ran, but produced no output and exited with code 65.
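For reference, a minimal sketch of how the example is typically built and run while capturing the exit status; the source directory, rank count, and output file name below are illustrative assumptions, not necessarily what was used here:

    # assumes the examples/ directory from the Open MPI source tree is at hand
    cd openmpi-1.8.2rc4/examples
    mpicc ring_c.c -o ring_c                  # build with the wrapper compiler from the install under test
    mpirun -np 4 ./ring_c > ring_c.out 2>&1
    echo "exit code: $?"                      # 65 was observed here; a clean run exits 0
    cat ring_c.out                            # a working run prints the token being passed around the ring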
Thanks,
Maxime

On 2014-08-14 15:26, Joshua Ladd wrote:
One more, Maxime: can you please make sure you've covered everything here: http://www.open-mpi.org/community/help/

Josh

On Thu, Aug 14, 2014 at 3:18 PM, Joshua Ladd <jladd.m...@gmail.com> wrote:

And maybe include your LD_LIBRARY_PATH.

Josh

On Thu, Aug 14, 2014 at 3:16 PM, Joshua Ladd <jladd.m...@gmail.com> wrote:

Can you try to run the example code "ring_c" across nodes?

Josh

On Thu, Aug 14, 2014 at 3:14 PM, Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca> wrote:

Yes, everything has been built with GCC 4.8.x, although x might have changed between the OpenMPI 1.8.1 build and the Gromacs build. For OpenMPI 1.8.2rc4, however, it was the exact same compiler for everything.

Maxime

On 2014-08-14 14:57, Joshua Ladd wrote:

Hmmm... weird. Seems like maybe a mismatch between libraries. Did you build OMPI with the same compiler as you did GROMACS/Charm++? I'm stealing this suggestion from an old Gromacs forum thread with essentially the same symptom:

"Did you compile Open MPI and Gromacs with the same compiler (i.e. both gcc and the same version)? You write that you tried different OpenMPI versions and different GCC versions, but it is unclear whether those match. Can you provide more detail on how you compiled (including all options you specified)? Have you tested any other MPI program linked against those Open MPI versions? Please make sure (e.g. with ldd) that the MPI and pthread libraries you compiled against are also the ones used for execution. If you compiled and ran on different hosts, check whether the error still occurs when executing on the build host."

http://redmine.gromacs.org/issues/1025

Josh

On Thu, Aug 14, 2014 at 2:40 PM, Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca> wrote:

I just tried Gromacs with two nodes. It crashes, but with a different error.
I get:

[gpu-k20-13:142156] *** Process received signal ***
[gpu-k20-13:142156] Signal: Segmentation fault (11)
[gpu-k20-13:142156] Signal code: Address not mapped (1)
[gpu-k20-13:142156] Failing at address: 0x8
[gpu-k20-13:142156] [ 0] /lib64/libpthread.so.0(+0xf710)[0x2ac5d070c710]
[gpu-k20-13:142156] [ 1] /usr/lib64/nvidia/libcuda.so.1(+0x263acf)[0x2ac5ddfbcacf]
[gpu-k20-13:142156] [ 2] /usr/lib64/nvidia/libcuda.so.1(+0x229a83)[0x2ac5ddf82a83]
[gpu-k20-13:142156] [ 3] /usr/lib64/nvidia/libcuda.so.1(+0x15b2da)[0x2ac5ddeb42da]
[gpu-k20-13:142156] [ 4] /usr/lib64/nvidia/libcuda.so.1(cuInit+0x43)[0x2ac5ddea0933]
[gpu-k20-13:142156] [ 5] /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15965)[0x2ac5d0930965]
[gpu-k20-13:142156] [ 6] /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a0a)[0x2ac5d0930a0a]
[gpu-k20-13:142156] [ 7] /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a3b)[0x2ac5d0930a3b]
[gpu-k20-13:142156] [ 8] /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(cudaDriverGetVersion+0x4a)[0x2ac5d094602a]
[gpu-k20-13:142156] [ 9] /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_print_version_info_gpu+0x55)[0x2ac5cf9a90b5]
[gpu-k20-13:142156] [10] /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_log_open+0x17e)[0x2ac5cf54b9be]
[gpu-k20-13:142156] [11] mdrunmpi(cmain+0x1cdb)[0x43b4bb]
[gpu-k20-13:142156] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac5d1534d1d]
[gpu-k20-13:142156] [13] mdrunmpi[0x407be1]
[gpu-k20-13:142156] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 142156 on node gpu-k20-13 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

We do not have MPI_THREAD_MULTIPLE enabled in our build, so Charm++ cannot be using this level of threading.

The configure line for OpenMPI was:

./configure --prefix=$PREFIX \
    --with-threads --with-verbs=yes --enable-shared --enable-static \
    --with-io-romio-flags="--with-file-system=nfs+lustre" \
    --without-loadleveler --without-slurm --with-tm \
    --with-cuda=$(dirname $(dirname $(which nvcc)))

Maxime

On 2014-08-14 14:20, Joshua Ladd wrote:

What about between nodes? Since this is coming from the OpenIB BTL, it would be good to check this. Do you know what MPI thread level is set when used with the Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not thread safe.

Josh

On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca> wrote:

Hi,

I ran Gromacs successfully with OpenMPI 1.8.1 and CUDA 6.0.37 on a single node, with 8 ranks and multiple OpenMP threads.

Maxime

On 2014-08-14 14:15, Joshua Ladd wrote:

Hi, Maxime,

Just curious, are you able to run a vanilla MPI program? Can you try one of the example programs in the "examples" subdirectory? Looks like a threading issue to me.

Thanks,
Josh
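Following up on the library-mismatch suggestion quoted above, here is a hedged sketch of the ldd and compiler checks; the mdrunmpi binary name is taken from the backtrace, while everything else (paths, environment setup) is assumed:

    # confirm which Open MPI install the wrapper compiler and launcher come from
    which mpicc mpirun
    mpicc --version                            # should report the same GCC used for GROMACS/Charm++
    echo "$LD_LIBRARY_PATH"                    # include this in the report, as requested above

    # verify which MPI and pthread libraries the GROMACS binary actually resolves at run time
    ldd $(which mdrunmpi) | grep -E 'libmpi|libpthread'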
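And to address the thread-level and across-node questions, a minimal sketch; only gpu-k20-13 appears in the log above, so the second node name and the hostfile layout are hypothetical:

    # check what thread support this build reports (MPI_THREAD_MULTIPLE: yes/no)
    ompi_info | grep -i thread

    # run the ring_c example across two nodes (node names here are placeholders)
    printf "gpu-k20-13 slots=2\ngpu-k20-14 slots=2\n" > hosts.txt
    mpirun -np 4 --hostfile hosts.txt ./ring_c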
--
---------------------------------
Maxime Boissonneault
Computational Analyst - Calcul Québec, Université Laval
Ph.D. in Physics
Attachment: output.tar.bz2 (BZip2 compressed data)