Soon after this, mpirun crashes. Nodes communicate over a semi-dedicated
TCP/IP GigE connection.
Is this a known bug? What is going wrong?
Regards,
Michael Curtis
On 27/01/2011, at 4:51 PM, Michael Curtis wrote:
Some more debugging information:
> Failing case:
> michael@ipc ~ $ salloc -n8 mpirun --display-map ./mpi
> JOB MAP
Backtrace with debugging symbols
#0 0x77bb5c1e in ?? ()
On 28/01/2011, at 8:16 PM, Michael Curtis wrote:
>
> On 27/01/2011, at 4:51 PM, Michael Curtis wrote:
>
> Some more debugging information:
Is anyone able to help with this problem? As far as I can tell, it's a
stock-standard, recently installed SLURM setup.
I can try 1
On 04/02/2011, at 9:35 AM, Samuel K. Gutierrez wrote:
> I just tried to reproduce the problem that you are experiencing and was
> unable to.
>
> SLURM 2.1.15
> Open MPI 1.4.3 configured with:
> --with-platform=./contrib/platform/lanl/tlcc/debug-nopanasas
>
> I'll dig a bit further.
Intere
On 04/02/2011, at 9:35 AM, Samuel K. Gutierrez wrote:
Hi,
> I just tried to reproduce the problem that you are experiencing and was
> unable to.
>
> SLURM 2.1.15
> Open MPI 1.4.3 configured with:
> --with-platform=./contrib/platform/lanl/tlcc/debug-nopanasas
I compiled OpenMPI 1.4.3 (vanilla
On 07/02/2011, at 12:36 PM, Michael Curtis wrote:
>
> On 04/02/2011, at 9:35 AM, Samuel K. Gutierrez wrote:
>
> Hi,
>
>> I just tried to reproduce the problem that you are experiencing and was
>> unable to.
>>
>> SLURM 2.1.15
>> Open MPI 1.4.3 c
On 09/02/2011, at 2:38 AM, Ralph Castain wrote:
> Another possibility to check - are you sure you are getting the same OMPI
> version on the backend nodes? When I see it work on local node, but fail
> multi-node, the most common problem is that you are picking up a different
> OMPI version due
On 09/02/2011, at 2:17 AM, Samuel K. Gutierrez wrote:
> Hi Michael,
>
> You may have tried to send some debug information to the list, but it appears
> to have been blocked. Compressed text output of the backtrace text is
> sufficient.
Odd, I thought I sent it to you directly. In any case,
On 09/02/2011, at 9:16 AM, Ralph Castain wrote:
> See below
>
>
> On Feb 8, 2011, at 2:44 PM, Michael Curtis wrote:
>
>>
>> On 09/02/2011, at 2:17 AM, Samuel K. Gutierrez wrote:
>>
>>> Hi Michael,
>>>
>>> You may have tried to sen