Hi all,
I am using the latest version of OpenMPI (1.5.1) and BLCR (0.8.2).
I found that when running an application that uses MPI_Isend, MPI_Irecv and
MPI_Wait
with C/R enabled, i.e. using "-am ft-enable-cr", the application runtime is much
longer than the normal execution with mpirun (no checkpoint
Hi Michael,
You may have tried to send some debug information to the list, but it
appears to have been blocked. Compressed text output of the backtrace
text is sufficient.
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Feb 7, 2011, at 8:38 AM, Samuel K. Gutierrez wrote:
Another possibility to check - are you sure you are getting the same OMPI
version on the backend nodes? When I see it work on local node, but fail
multi-node, the most common problem is that you are picking up a different OMPI
version due to path differences on the backend nodes.
On Feb 8, 201
There are a few reasons why this might be occurring. Did you build with the
'--enable-ft-thread' option?
If so, it looks like I didn't move over the thread_sleep_wait adjustment from
the trunk - the thread was being a bit too aggressive. Try adding the following
to your command line options, an
On 09/02/2011, at 2:38 AM, Ralph Castain wrote:
> Another possibility to check - are you sure you are getting the same OMPI
> version on the backend nodes? When I see it work on local node, but fail
> multi-node, the most common problem is that you are picking up a different
> OMPI version due
On 09/02/2011, at 2:17 AM, Samuel K. Gutierrez wrote:
> Hi Michael,
>
> You may have tried to send some debug information to the list, but it appears
> to have been blocked. Compressed text output of the backtrace text is
> sufficient.
Odd, I thought I sent it to you directly. In any case,
See below
On Feb 8, 2011, at 2:44 PM, Michael Curtis wrote:
>
> On 09/02/2011, at 2:17 AM, Samuel K. Gutierrez wrote:
>
>> Hi Michael,
>>
>> You may have tried to send some debug information to the list, but it
>> appears to have been blocked. Compressed text output of the backtrace text
>
On 09/02/2011, at 9:16 AM, Ralph Castain wrote:
> See below
>
>
> On Feb 8, 2011, at 2:44 PM, Michael Curtis wrote:
>
>>
>> On 09/02/2011, at 2:17 AM, Samuel K. Gutierrez wrote:
>>
>>> Hi Michael,
>>>
>>> You may have tried to send some debug information to the list, but it
>>> appears to
I would personally suggest not reconfiguring your system simply to support a
particular version of OMPI. The only difference between the 1.4 and 1.5 series
wrt slurm is that we changed a few things to support a more recent version of
slurm. It is relatively easy to backport that code to the 1.4