But OMPI 1.8.x does run the ring_c program successfully on your compute
node, right? The error only happens on the front-end login node if I
understood you correctly.
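
A minimal sketch of the kind of check I mean (the paths are placeholders; use the
mpicc/mpirun from the 1.8.x install you are testing):

cd /path/to/openmpi-1.8.x/examples    # wherever the examples tree lives
mpicc ring_c.c -o ring_c
mpirun -np 4 ./ring_c
echo "exit code: $?"

Run it once on the front-end and once inside a job on a compute node; ring_c
normally prints progress messages as the token goes around the ring, so a silent
run that exits with code 65 is the failure we are chasing.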

Josh


On Fri, Aug 15, 2014 at 5:20 PM, Maxime Boissonneault <
maxime.boissonnea...@calculquebec.ca> wrote:

>  Here are the requested files.
>
> In the archive, you will find the output of configure, make, and make install,
> as well as the config.log, the environment when running ring_c, and the output
> of ompi_info --all.
>
> Just as a reminder, the ring_c example compiled and ran, but it produced no
> output and exited with code 65.
>
> Thanks,
>
> Maxime
>
> On 2014-08-14 15:26, Joshua Ladd wrote:
>
>  One more thing, Maxime: can you please make sure you've covered everything
> here:
>
> http://www.open-mpi.org/community/help/
>
>  Josh
>
>
> On Thu, Aug 14, 2014 at 3:18 PM, Joshua Ladd <jladd.m...@gmail.com> wrote:
>
>>  And maybe include your LD_LIBRARY_PATH
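>>
>> For example, something along these lines (a rough sketch; ring_c here is
>> whatever example binary you built):
>>
>> echo $LD_LIBRARY_PATH
>> which mpirun mpicc
>> ldd ./ring_c | grep -Ei 'libmpi|pthread'
>>
>> That would show whether the runtime picks up the same Open MPI and pthread
>> libraries you compiled against.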
>>
>>  Josh
>>
>>
>> On Thu, Aug 14, 2014 at 3:16 PM, Joshua Ladd <jladd.m...@gmail.com>
>> wrote:
>>
>>>  Can you try to run the example code "ring_c" across nodes?
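>>>
>>> A minimal sketch of what I mean (the hostnames are placeholders; with a
>>> Torque-aware Open MPI build, mpirun launched inside a job should also pick up
>>> the allocated nodes on its own):
>>>
>>> cat > hosts <<EOF
>>> node01 slots=1
>>> node02 slots=1
>>> EOF
>>> mpirun -np 2 --hostfile hosts ./ring_c
>>>
>>> If the single-node run works but the cross-node run hangs or fails, that
>>> points at the inter-node transport (e.g. the openib BTL) rather than the
>>> application.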
>>>
>>>  Josh
>>>
>>>
>>>  On Thu, Aug 14, 2014 at 3:14 PM, Maxime Boissonneault <
>>> maxime.boissonnea...@calculquebec.ca> wrote:
>>>
>>>>   Yes,
>>>> everything has been built with GCC 4.8.x, although the x might have changed
>>>> between the Open MPI 1.8.1 build and the Gromacs build. For Open MPI 1.8.2rc4,
>>>> however, it was the exact same compiler for everything.
>>>>
>>>> Maxime
>>>>
>>>> On 2014-08-14 14:57, Joshua Ladd wrote:
>>>>
>>>>   Hmmm...weird. Seems like maybe a mismatch between libraries. Did you
>>>> build OMPI with the same compiler as you did GROMACS/Charm++?
>>>>
>>>> I'm stealing this suggestion from an old Gromacs forum with essentially
>>>> the same symptom:
>>>>
>>>> "Did you compile Open MPI and Gromacs with the same compiler (i.e. both
>>>> gcc and the same version)? You write you tried different OpenMPI versions
>>>> and different GCC versions but it is unclear whether those match. Can you
>>>> provide more detail how you compiled (including all options you specified)?
>>>> Have you tested any other MPI program linked against those Open MPI
>>>> versions? Please make sure (e.g. with ldd) that the MPI and pthread library
>>>> you compiled against is also used for execution. If you compiled and run on
>>>> different hosts, check whether the error still occurs when executing on the
>>>> build host."
>>>>
>>>> http://redmine.gromacs.org/issues/1025
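>>>>
>>>> Concretely, one way to check that last point (a rough sketch, assuming the
>>>> mdrunmpi binary is on your PATH):
>>>>
>>>> ldd $(which mdrunmpi) | grep -Ei 'libmpi|pthread'
>>>>
>>>> The libmpi it resolves should live inside the Open MPI 1.8.x install you are
>>>> testing, not some other MPI that happens to be on LD_LIBRARY_PATH.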
>>>>
>>>>  Josh
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Aug 14, 2014 at 2:40 PM, Maxime Boissonneault <
>>>> maxime.boissonnea...@calculquebec.ca> wrote:
>>>>
>>>>>  I just tried Gromacs with two nodes. It crashes, but with a
>>>>> different error. I get
>>>>> [gpu-k20-13:142156] *** Process received signal ***
>>>>> [gpu-k20-13:142156] Signal: Segmentation fault (11)
>>>>> [gpu-k20-13:142156] Signal code: Address not mapped (1)
>>>>> [gpu-k20-13:142156] Failing at address: 0x8
>>>>> [gpu-k20-13:142156] [ 0]
>>>>> /lib64/libpthread.so.0(+0xf710)[0x2ac5d070c710]
>>>>> [gpu-k20-13:142156] [ 1]
>>>>> /usr/lib64/nvidia/libcuda.so.1(+0x263acf)[0x2ac5ddfbcacf]
>>>>> [gpu-k20-13:142156] [ 2]
>>>>> /usr/lib64/nvidia/libcuda.so.1(+0x229a83)[0x2ac5ddf82a83]
>>>>> [gpu-k20-13:142156] [ 3]
>>>>> /usr/lib64/nvidia/libcuda.so.1(+0x15b2da)[0x2ac5ddeb42da]
>>>>> [gpu-k20-13:142156] [ 4]
>>>>> /usr/lib64/nvidia/libcuda.so.1(cuInit+0x43)[0x2ac5ddea0933]
>>>>> [gpu-k20-13:142156] [ 5]
>>>>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15965)[0x2ac5d0930965]
>>>>> [gpu-k20-13:142156] [ 6]
>>>>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a0a)[0x2ac5d0930a0a]
>>>>> [gpu-k20-13:142156] [ 7]
>>>>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a3b)[0x2ac5d0930a3b]
>>>>> [gpu-k20-13:142156] [ 8]
>>>>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(cudaDriverGetVersion+0x4a)[0x2ac5d094602a]
>>>>> [gpu-k20-13:142156] [ 9]
>>>>> /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_print_version_info_gpu+0x55)[0x2ac5cf9a90b5]
>>>>> [gpu-k20-13:142156] [10]
>>>>> /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_log_open+0x17e)[0x2ac5cf54b9be]
>>>>> [gpu-k20-13:142156] [11] mdrunmpi(cmain+0x1cdb)[0x43b4bb]
>>>>> [gpu-k20-13:142156] [12]
>>>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac5d1534d1d]
>>>>> [gpu-k20-13:142156] [13] mdrunmpi[0x407be1]
>>>>> [gpu-k20-13:142156] *** End of error message ***
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>> mpiexec noticed that process rank 0 with PID 142156 on node gpu-k20-13
>>>>> exited on signal 11 (Segmentation fault).
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>> We do not have MPI_THREAD_MULTIPLE enabled in our build, so Charm++
>>>>> cannot be using this level of threading. The configure line for Open MPI was:
>>>>>
>>>>> ./configure --prefix=$PREFIX \
>>>>>     --with-threads --with-verbs=yes --enable-shared --enable-static \
>>>>>     --with-io-romio-flags="--with-file-system=nfs+lustre" \
>>>>>     --without-loadleveler --without-slurm --with-tm \
>>>>>     --with-cuda=$(dirname $(dirname $(which nvcc)))
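>>>>>
>>>>> As a quick sanity check on that (a rough sketch; the exact wording of the
>>>>> ompi_info output may differ across 1.8.x versions):
>>>>>
>>>>> ompi_info | grep -i thread
>>>>> # expected to report a "Thread support" line with MPI_THREAD_MULTIPLE disabled
>>>>>
>>>>> which should confirm that the installed build was indeed made without
>>>>> MPI_THREAD_MULTIPLE.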
>>>>>
>>>>> Maxime
>>>>>
>>>>>
>>>>> On 2014-08-14 14:20, Joshua Ladd wrote:
>>>>>
>>>>>   What about between nodes? Since this is coming from the OpenIB BTL, it
>>>>> would be good to check this.
>>>>>
>>>>> Do you know what the MPI thread level is set to when used with the
>>>>> Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not thread
>>>>> safe.
>>>>>
>>>>>  Josh
>>>>>
>>>>>
>>>>> On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault <
>>>>> maxime.boissonnea...@calculquebec.ca> wrote:
>>>>>
>>>>>>  Hi,
>>>>>> I ran Gromacs successfully with Open MPI 1.8.1 and CUDA 6.0.37 on a
>>>>>> single node, with 8 ranks and multiple OpenMP threads.
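>>>>>>
>>>>>> For what it is worth, the run looked roughly like this (a sketch; the input
>>>>>> name and thread count are illustrative rather than copied from the job):
>>>>>>
>>>>>> export OMP_NUM_THREADS=2
>>>>>> mpirun -np 8 mdrunmpi -ntomp $OMP_NUM_THREADS -deffnm topol
>>>>>>
>>>>>> i.e. 8 MPI ranks on one node, each with a couple of OpenMP threads, and it
>>>>>> ran to completion.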
>>>>>>
>>>>>> Maxime
>>>>>>
>>>>>>
>>>>>> On 2014-08-14 14:15, Joshua Ladd wrote:
>>>>>>
>>>>>>   Hi, Maxime
>>>>>>
>>>>>>  Just curious, are you able to run a vanilla MPI program? Can you try
>>>>>> one of the example programs in the "examples" subdirectory? It looks like
>>>>>> a threading issue to me.
>>>>>>
>>>>>>  Thanks,
>>>>>>
>>>>>>  Josh
>>>>>>
>>>>>>
>>>>>>
>>>
>>>
>>
>
>
> --
> ---------------------------------
> Maxime Boissonneault
> Computational Analyst - Calcul Québec, Université Laval
> Ph.D. in Physics
>
>