2016-05-05 9:27 GMT-05:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:

> Out of curiosity, can you try
> mpirun --mca btl self,sm ...
>
Same as before: many MPI_Test calls before the Isend completes.

> and
> mpirun --mca btl self,vader ...
>
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      VirtualBox
Framework: btl
Component: vade
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  mca_bml_base_open() failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[VirtualBox:2188] Local abort before MPI_INIT completed successfully; not
able to aggregate error messages, and not able to guarantee that all other
processes were killed!
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:

  Process name: [[9235,1],0]
  Exit code:    1
--------------------------------------------------------------------------
[VirtualBox:02186] 1 more process has sent help message help-mca-base.txt / find-available:not-valid
[VirtualBox:02186] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages


>
> and see if one performs better than the other ?
>
> Cheers,
>
> Gilles
>
> On Thursday, May 5, 2016, Zhen Wang <tod...@gmail.com> wrote:
>
>> Gilles,
>>
>> Thanks for your reply.
>>
>> Best regards,
>> Zhen
>>
>> On Wed, May 4, 2016 at 8:43 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
>>
>>> Note there is no progress thread in Open MPI 1.10. From a pragmatic
>>> point of view, that means that for "large" messages no data is sent in
>>> MPI_Isend; the data is sent when MPI "progresses", e.g. when you call
>>> MPI_Test, MPI_Probe, MPI_Recv, or some similar subroutine. In your
>>> example, the data is transferred only after the first usleep completes.
>>>
>> I agree.
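>>
>> For reference, since the attachment may not make it through the list, the
>> core of my test is essentially the following sketch of a5.cpp (the five
>> messages match the log below, but the message size and sleep interval
>> here are stand-ins, not necessarily the exact values I used):
>>
>> #include <mpi.h>
>> #include <unistd.h>   // usleep
>> #include <cstdio>
>> #include <ctime>
>> #include <vector>
>>
>> // Print "MPI <rank>: <what> <i> at HH:MM:SS." like the logs in this thread.
>> static void stamp(int rank, const char* what, int i)
>> {
>>     char hms[16];
>>     std::time_t t = std::time(NULL);
>>     std::strftime(hms, sizeof hms, "%H:%M:%S", std::localtime(&t));
>>     std::printf("MPI %d: %s %d at %s.\n", rank, what, i, hms);
>> }
>>
>> int main(int argc, char** argv)
>> {
>>     MPI_Init(&argc, &argv);
>>     int rank;
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>
>>     const int nMsg = 5;
>>     std::vector<char> buf(4 * 1024 * 1024);   // a "large" message
>>
>>     for (int i = 0; i < nMsg; ++i) {
>>         if (rank == 0) {
>>             MPI_Request req;
>>             stamp(rank, "Isend of", i);
>>             // For a large message, little or no data moves here ...
>>             MPI_Isend(buf.data(), (int)buf.size(), MPI_CHAR, 1, i,
>>                       MPI_COMM_WORLD, &req);
>>             int done = 0;
>>             while (!done) {
>>                 usleep(100000);               // simulate computation
>>                 stamp(rank, "MPI_Test of", i);
>>                 // ... the data moves only inside MPI calls like this one.
>>                 MPI_Test(&req, &done, MPI_STATUS_IGNORE);
>>             }
>>         } else if (rank == 1) {
>>             stamp(rank, "Recv of", i);
>>             MPI_Recv(buf.data(), (int)buf.size(), MPI_CHAR, 0, i,
>>                      MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>         }
>>     }
>>
>>     MPI_Finalize();
>>     return 0;
>> }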
>>
>>>
>>> That being said, it takes quite a while, so there could be an issue.
>>> What happens if you use MPI_Send instead?
>>>
>> Works as expected.
>>
>> MPI 1: Recv of 0 started at 08:37:10.
>> MPI 1: Recv of 0 finished at 08:37:10.
>> MPI 0: Send of 0 started at 08:37:10.
>> MPI 0: Send of 0 finished at 08:37:10.
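>>
>> For this test I simply replaced the Isend/MPI_Test loop in the sketch
>> above with a single blocking call, everything else unchanged:
>>
>>     MPI_Send(buf.data(), (int)buf.size(), MPI_CHAR, 1, i, MPI_COMM_WORLD);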
>>
>>
>>> What if you Send/Recv a large message first (to "warm up" the
>>> connections), call MPI_Barrier, and then start your MPI_Isend?
>>>
>> Not working. For what I want to accomplish, is my code the right way
>> to go? Is there an alternative method? Thanks.
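>>
>> What I tried was roughly this, inserted just before the send/receive
>> loop of the sketch above (the warm-up size is a guess at "large"):
>>
>>     // one large blocking exchange to warm up the connection
>>     if (rank == 0)
>>         MPI_Send(buf.data(), (int)buf.size(), MPI_CHAR, 1, 999, MPI_COMM_WORLD);
>>     else if (rank == 1)
>>         MPI_Recv(buf.data(), (int)buf.size(), MPI_CHAR, 0, 999,
>>                  MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>     MPI_Barrier(MPI_COMM_WORLD);
>>     // ... then the Isend / usleep / MPI_Test loop as before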
>>
>> MPI 1: Recv of 0 started at 08:38:46.
>> MPI 0: Isend of 0 started at 08:38:46.
>> MPI 0: Isend of 1 started at 08:38:46.
>> MPI 0: Isend of 2 started at 08:38:46.
>> MPI 0: Isend of 3 started at 08:38:46.
>> MPI 0: Isend of 4 started at 08:38:46.
>> MPI 0: MPI_Test of 0 at 08:38:46.
>> MPI 0: MPI_Test of 0 at 08:38:46.
>> MPI 0: MPI_Test of 0 at 08:38:46.
>> MPI 0: MPI_Test of 0 at 08:38:46.
>> MPI 0: MPI_Test of 0 at 08:38:46.
>> MPI 0: MPI_Test of 0 at 08:38:46.
>> MPI 0: MPI_Test of 0 at 08:38:46.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:47.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:48.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:49.
>> MPI 0: MPI_Test of 0 at 08:38:50.
>> MPI 0: MPI_Test of 0 at 08:38:50.
>> MPI 0: MPI_Test of 0 at 08:38:50.
>> MPI 0: MPI_Test of 0 at 08:38:50.
>> MPI 1: Recv of 0 finished at 08:38:50.
>> MPI 1: Recv of 1 started at 08:38:50.
>> MPI 0: MPI_Test of 0 at 08:38:50.
>> MPI 0: Isend of 0 finished at 08:38:50.
>> MPI 0: MPI_Test of 1 at 08:38:50.
>> MPI 0: MPI_Test of 1 at 08:38:50.
>> MPI 0: MPI_Test of 1 at 08:38:50.
>> MPI 0: MPI_Test of 1 at 08:38:50.
>> MPI 0: MPI_Test of 1 at 08:38:50.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:51.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:52.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:53.
>> MPI 0: MPI_Test of 1 at 08:38:54.
>> MPI 0: MPI_Test of 1 at 08:38:54.
>> MPI 0: MPI_Test of 1 at 08:38:54.
>> MPI 0: MPI_Test of 1 at 08:38:54.
>> MPI 0: MPI_Test of 1 at 08:38:54.
>> MPI 1: Recv of 1 finished at 08:38:54.
>> MPI 1: Recv of 2 started at 08:38:54.
>> MPI 0: MPI_Test of 1 at 08:38:54.
>> MPI 0: Isend of 1 finished at 08:38:54.
>> MPI 0: MPI_Test of 2 at 08:38:54.
>> MPI 0: MPI_Test of 2 at 08:38:54.
>> MPI 0: MPI_Test of 2 at 08:38:54.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:55.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:56.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:57.
>> MPI 0: MPI_Test of 2 at 08:38:58.
>> MPI 0: MPI_Test of 2 at 08:38:58.
>> MPI 0: MPI_Test of 2 at 08:38:58.
>> MPI 0: MPI_Test of 2 at 08:38:58.
>> MPI 0: MPI_Test of 2 at 08:38:58.
>> MPI 0: MPI_Test of 2 at 08:38:58.
>> MPI 0: MPI_Test of 2 at 08:38:58.
>> MPI 1: Recv of 2 finished at 08:38:58.
>> MPI 1: Recv of 3 started at 08:38:58.
>> MPI 0: MPI_Test of 2 at 08:38:58.
>> MPI 0: Isend of 2 finished at 08:38:58.
>> MPI 0: MPI_Test of 3 at 08:38:58.
>> MPI 0: MPI_Test of 3 at 08:38:58.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:38:59.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:00.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:01.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 1: Recv of 3 finished at 08:39:02.
>> MPI 1: Recv of 4 started at 08:39:02.
>> MPI 0: MPI_Test of 3 at 08:39:02.
>> MPI 0: Isend of 3 finished at 08:39:02.
>> MPI 0: MPI_Test of 4 at 08:39:02.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:03.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:04.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:05.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 1: Recv of 4 finished at 08:39:06.
>> MPI 0: MPI_Test of 4 at 08:39:06.
>> MPI 0: Isend of 4 finished at 08:39:06.
>>
>>
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>>
>>> On Thursday, May 5, 2016, Zhen Wang <tod...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm having a problem with Isend, Recv, and Test on Linux Mint 16 Petra.
>>>> The source is attached.
>>>>
>>>> Open MPI 1.10.2 is configured with
>>>> ./configure --enable-debug --prefix=/home/<me>/Tool/openmpi-1.10.2-debug
>>>>
>>>> The source is built with
>>>> ~/Tool/openmpi-1.10.2-debug/bin/mpiCC a5.cpp
>>>>
>>>> and run in one node with
>>>> ~/Tool/openmpi-1.10.2-debug/bin/mpirun -n 2 ./a.out
>>>>
>>>> The output is at the end. What puzzles me is why MPI_Test is called so
>>>> many times, and why it takes so long to send a message. Am I doing
>>>> something wrong? I'm simulating a more complicated program: MPI 0 Isends
>>>> data to MPI 1, computes (usleep here), and calls MPI_Test to check
>>>> whether the data have been sent. MPI 1 Recvs the data, and computes.
>>>>
>>>> Thanks in advance.
>>>>
>>>>
>>>> Best regards,
>>>> Zhen
>>>>
>>>> MPI 0: Isend of 0 started at 20:32:35.
>>>> MPI 1: Recv of 0 started at 20:32:35.
>>>> MPI 0: MPI_Test of 0 at 20:32:35.
>>>> MPI 0: MPI_Test of 0 at 20:32:35.
>>>> MPI 0: MPI_Test of 0 at 20:32:35.
>>>> MPI 0: MPI_Test of 0 at 20:32:35.
>>>> MPI 0: MPI_Test of 0 at 20:32:35.
>>>> MPI 0: MPI_Test of 0 at 20:32:35.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:36.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:37.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:38.
>>>> MPI 0: MPI_Test of 0 at 20:32:39.
>>>> MPI 0: MPI_Test of 0 at 20:32:39.
>>>> MPI 0: MPI_Test of 0 at 20:32:39.
>>>> MPI 0: MPI_Test of 0 at 20:32:39.
>>>> MPI 0: MPI_Test of 0 at 20:32:39.
>>>> MPI 0: MPI_Test of 0 at 20:32:39.
>>>> MPI 1: Recv of 0 finished at 20:32:39.
>>>> MPI 0: MPI_Test of 0 at 20:32:39.
>>>> MPI 0: Isend of 0 finished at 20:32:39.
>>>>
>>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2016/05/29086.php
>>>
>>
>>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/05/29097.php
>
