Additional info:
I am running mpirun on hostA and providing a host list with hostB and hostC.
I expect each application to run on hostB and hostC, but all of them run
on hostA.
dellix7$cat appfile
-np 1 hostname
-np 1 hostname
dellix7$mpirun -np 2 -H witch1,witch2 -app appfile
dellix7
dellix7
Thanks
Lenny.

On Tue, Jul 14, 2009 at 4:59 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Strange - let me have a look at it later today. Probably something simple
> that another pair of eyes might spot.
> On Jul 14, 2009, at 7:43 AM, Lenny Verkhovsky wrote:
>
> This seems to be a related problem:
> I can't use a rankfile with an appfile, even after all those fixes (working
> with trunk 1.4a1r21657).
> This is my case :
>
> $cat rankfile
> rank 0=+n1 slot=0
> rank 1=+n0 slot=0
> $cat appfile
> -np 1 hostname
> -np 1 hostname
> $mpirun -np 2 -H witch1,witch2 -rf rankfile -app appfile
> --------------------------------------------------------------------------
> Rankfile claimed host +n1 by index that is bigger than number of allocated
> hosts.
> --------------------------------------------------------------------------
> [dellix7:13414] [[10851,0],0] ORTE_ERROR_LOG: Bad parameter in file
> ../../../../../orte/mca/rmaps/rank_file/rmaps_rank_file.c at line 422
> [dellix7:13414] [[10851,0],0] ORTE_ERROR_LOG: Bad parameter in file
> ../../../../orte/mca/rmaps/base/rmaps_base_map_job.c at line 85
> [dellix7:13414] [[10851,0],0] ORTE_ERROR_LOG: Bad parameter in file
> ../../../../orte/mca/plm/base/plm_base_launch_support.c at line 103
> [dellix7:13414] [[10851,0],0] ORTE_ERROR_LOG: Bad parameter in file
> ../../../../../orte/mca/plm/rsh/plm_rsh_module.c at line 1001
>
>
> The problem is that the rankfile mapper looks for the matching host in the
> partial (not the full) host list.
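
To illustrate the failure described above, here is a minimal sketch of how a
relative entry such as "+n1" could be resolved against a host list. It is not
Open MPI's code: the function name resolve_relative_host and the one-element
"partial" list are assumptions made for illustration only.

```python
def resolve_relative_host(spec, hosts):
    """Resolve a '+nK' entry to the K-th host (0-based) in `hosts`.

    Plain hostnames are returned unchanged.
    """
    if not spec.startswith("+n"):
        return spec
    index = int(spec[2:])
    if index >= len(hosts):
        raise IndexError(
            f"{spec} refers to index {index}, "
            f"but only {len(hosts)} host(s) are known"
        )
    return hosts[index]


full_hostlist = ["witch1", "witch2"]  # as given by -H witch1,witch2
partial_hostlist = ["witch1"]         # hypothetical per-app subset

# Against the full list, +n1 resolves to the second host.
print(resolve_relative_host("+n1", full_hostlist))

# Against a shorter list, the same entry is out of range.
try:
    resolve_relative_host("+n1", partial_hostlist)
except IndexError as err:
    print("error:", err)
```

If the lookup only ever sees a subset of the hosts, any index that points past
that subset is rejected, which matches the symptom in the error message.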
>
> Any suggestions how to fix it?
>
> Thanks
> Lenny.
>
> On Wed, May 13, 2009 at 1:55 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Okay, I fixed this today too....r21219
>>
>>
>> On May 11, 2009, at 11:27 PM, Anton Starikov wrote:
>>
>> Now there is another problem :)
>>>
>>> You can oversubscribe a node, at least by one task.
>>> If your hostfile and rankfile limit you to N procs, you can ask mpirun
>>> for N+1 and it will not be rejected,
>>> although in reality there will be only N tasks.
>>> So, if your hostfile limit is 4, then "mpirun -np 4" and "mpirun -np 5"
>>> both work, but in both cases there are only 4 tasks. It isn't crucial,
>>> because there is no real oversubscription, but there is still a bug
>>> that could affect something in the future.
>>>
>>> --
>>> Anton Starikov.
>>>
>>> On May 12, 2009, at 1:45 AM, Ralph Castain wrote:
>>>
>>> This is fixed as of r21208.
>>>>
>>>> Thanks for reporting it!
>>>> Ralph
>>>>
>>>>
>>>> On May 11, 2009, at 12:51 PM, Anton Starikov wrote:
>>>>
>>>> Although removing this check solves the problem of having more slots in
>>>>> the rankfile than necessary, there is another problem.
>>>>>
>>>>> If I set rmaps_base_no_oversubscribe=1 then, for example:
>>>>>
>>>>>
>>>>> hostfile:
>>>>>
>>>>> node01
>>>>> node01
>>>>> node02
>>>>> node02
>>>>>
>>>>> rankfile:
>>>>>
>>>>> rank 0=node01 slot=1
>>>>> rank 1=node01 slot=0
>>>>> rank 2=node02 slot=1
>>>>> rank 3=node02 slot=0
>>>>>
>>>>> mpirun -np 4 ./something
>>>>>
>>>>> complains with:
>>>>>
>>>>> "There are not enough slots available in the system to satisfy the 4
>>>>> slots that were requested by the application"
>>>>>
>>>>> but "mpirun -np 3 ./something" works. That is, it works when you ask
>>>>> for one CPU less. The behavior is the same in every case (shared nodes,
>>>>> non-shared nodes, multi-node).
>>>>>
>>>>> If you switch off rmaps_base_no_oversubscribe, then it works and all
>>>>> affinities are set as requested in the rankfile; there is no
>>>>> oversubscription.
>>>>>
>>>>>
>>>>> Anton.
>>>>>
>>>>> On May 5, 2009, at 3:08 PM, Ralph Castain wrote:
>>>>>
>>>>> Ah - thx for catching that, I'll remove that check. It no longer is
>>>>>> required.
>>>>>>
>>>>>> Thx!
>>>>>>
>>>>>> On Tue, May 5, 2009 at 7:04 AM, Lenny Verkhovsky <
>>>>>> lenny.verkhov...@gmail.com> wrote:
>>>>>> According to the code, it does care.
>>>>>>
>>>>>> $vi orte/mca/rmaps/rank_file/rmaps_rank_file.c +572
>>>>>>
>>>>>> ival = orte_rmaps_rank_file_value.ival;
>>>>>> if ( ival > (np-1) ) {
>>>>>>     orte_show_help("help-rmaps_rank_file.txt", "bad-rankfile", true, ival,
>>>>>>                    rankfile);
>>>>>>     rc = ORTE_ERR_BAD_PARAM;
>>>>>>     goto unlock;
>>>>>> }
>>>>>>
>>>>>> If I remember correctly, I used an array to map ranks. Since the
>>>>>> length of the array is NP, the maximum index must be less than NP, so a
>>>>>> rank number >= NP has no place in the array.
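
The bound described above can be sketched in a few lines. This is only an
illustration of the "array of length NP" argument, not the mapper's actual
code; the function name place_ranks and the (rank, host) pair format are
assumptions.

```python
def place_ranks(rankfile_entries, np):
    """Store each rank's host in a table of length np.

    A rank greater than np - 1 has no cell in the table, so it is rejected,
    mirroring the `ival > (np-1)` check quoted above.
    """
    table = [None] * np
    for rank, host in rankfile_entries:
        if rank > np - 1:
            raise ValueError(f"invalid rank ({rank}) for np={np}")
        table[rank] = host
    return table


entries = [(0, "node1"), (1, "node2")]

# With np=2 both ranks fit.
print(place_ranks(entries, 2))

# With np=1 only rank 0 fits, so rank 1 triggers the error.
try:
    place_ranks(entries, 1)
except ValueError as err:
    print("error:", err)
```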
>>>>>>
>>>>>> "Likewise, if you have more procs than the rankfile specifies, we map
>>>>>> the additional procs either byslot (default) or bynode (if you specify
>>>>>> that option). So the rankfile doesn't need to contain an entry for every
>>>>>> proc."
>>>>>>  - Correct point.
>>>>>>
>>>>>>
>>>>>> Lenny.
>>>>>>
>>>>>>
>>>>>> On 5/5/09, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> Sorry Lenny, but that isn't correct. The rankfile mapper doesn't care
>>>>>> if the rankfile contains additional info - it only maps up to the
>>>>>> number of processes, and ignores anything beyond that number. So there
>>>>>> is no need to remove the additional info.
>>>>>>
>>>>>> Likewise, if you have more procs than the rankfile specifies, we map
>>>>>> the additional procs either byslot (default) or bynode (if you specify
>>>>>> that option). So the rankfile doesn't need to contain an entry for every
>>>>>> proc.
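
The intended behaviour described above (ignore rankfile entries beyond the
number of processes, and place any remaining procs byslot or bynode) might be
sketched as follows. This is a simplified illustration, not the mapper's
implementation: the function map_job and its arguments are hypothetical, and
slots already taken by rankfile entries are not subtracted.

```python
from itertools import cycle


def map_job(np, rankfile, hosts, policy="byslot"):
    """Return {rank: host} for ranks 0..np-1.

    rankfile: {rank: host}; entries with rank >= np are ignored.
    hosts:    ordered list of (host, slot_count) pairs.
    policy:   "byslot" fills one host's slots before moving to the next,
              "bynode" assigns round-robin across hosts.
    """
    # Keep only the rankfile entries that refer to existing ranks.
    mapping = {rank: host for rank, host in rankfile.items() if rank < np}
    leftover = [rank for rank in range(np) if rank not in mapping]

    if policy == "byslot":
        order = [host for host, slots in hosts for _ in range(slots)]
    else:
        order = [host for host, _ in hosts]

    # Assign the remaining ranks in the chosen order, wrapping if needed.
    for rank, host in zip(leftover, cycle(order)):
        mapping[rank] = host
    return mapping


hosts = [("node1", 2), ("node2", 2)]

# Extra rankfile entry (rank 1) is ignored when np=1.
print(map_job(1, {0: "node1", 1: "node2"}, hosts))

# Ranks missing from the rankfile are filled in by the default policy.
print(map_job(4, {0: "node2"}, hosts))
```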
>>>>>>
>>>>>> Just don't want to confuse folks.
>>>>>> Ralph
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, May 5, 2009 at 5:59 AM, Lenny Verkhovsky <
>>>>>> lenny.verkhov...@gmail.com> wrote:
>>>>>> Hi,
>>>>>> The maximum rank number must be less than np.
>>>>>> If np=1 then there is only rank 0 in the system, so rank 1 is invalid.
>>>>>> Please remove "rank 1=node2 slot=*" from the rankfile.
>>>>>> Best regards,
>>>>>> Lenny.
>>>>>>
>>>>>> On Mon, May 4, 2009 at 11:14 AM, Geoffroy Pignot <geopig...@gmail.com>
>>>>>> wrote:
>>>>>> Hi ,
>>>>>>
>>>>>> I got the openmpi-1.4a1r21095.tar.gz tarball, but unfortunately my
>>>>>> command doesn't work
>>>>>>
>>>>>> cat rankf:
>>>>>> rank 0=node1 slot=*
>>>>>> rank 1=node2 slot=*
>>>>>>
>>>>>> cat hostf:
>>>>>> node1 slots=2
>>>>>> node2 slots=2
>>>>>>
>>>>>> mpirun  --rankfile rankf --hostfile hostf  --host node1 -n 1 hostname
>>>>>> : --host node2 -n 1 hostname
>>>>>>
>>>>>> Error, invalid rank (1) in the rankfile (rankf)
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>>>> rmaps_rank_file.c at line 403
>>>>>> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>>>> base/rmaps_base_map_job.c at line 86
>>>>>> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>>>> base/plm_base_launch_support.c at line 86
>>>>>> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>>>> plm_rsh_module.c at line 1016
>>>>>>
>>>>>>
>>>>>> Ralph, could you tell me whether my command syntax is correct? If
>>>>>> not, could you give me the expected one?
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Geoffroy
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2009/4/30 Geoffroy Pignot <geopig...@gmail.com>
>>>>>>
>>>>>> Immediately Sir !!! :)
>>>>>>
>>>>>> Thanks again Ralph
>>>>>>
>>>>>> Geoffroy
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>>
>>>>>> Message: 2
>>>>>> Date: Thu, 30 Apr 2009 06:45:39 -0600
>>>>>> From: Ralph Castain <r...@open-mpi.org>
>>>>>> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>>>>> To: Open MPI Users <us...@open-mpi.org>
>>>>>> Message-ID:
>>>>>>    <71d2d8cc0904300545v61a42fe1k50086d2704d0f...@mail.gmail.com>
>>>>>> Content-Type: text/plain; charset="iso-8859-1"
>>>>>>
>>>>>> I believe this is fixed now in our development trunk - you can
>>>>>> download any tarball starting from last night and give it a try, if
>>>>>> you like. Any feedback would be appreciated.
>>>>>>
>>>>>> Ralph
>>>>>>
>>>>>>
>>>>>> On Apr 14, 2009, at 7:57 AM, Ralph Castain wrote:
>>>>>>
>>>>>> Ah now, I didn't say it -worked-, did I? :-)
>>>>>>
>>>>>> Clearly a bug exists in the program. I'll try to take a look at it (if
>>>>>> Lenny doesn't get to it first), but it won't be until later in the week.
>>>>>>
>>>>>> On Apr 14, 2009, at 7:18 AM, Geoffroy Pignot wrote:
>>>>>>
>>>>>> I agree with you, Ralph, and that's what I expect from Open MPI, but my
>>>>>> second example shows that it's not working.
>>>>>>
>>>>>> cat hostfile.0
>>>>>> r011n002 slots=4
>>>>>> r011n003 slots=4
>>>>>>
>>>>>> cat rankfile.0
>>>>>> rank 0=r011n002 slot=0
>>>>>> rank 1=r011n003 slot=1
>>>>>>
>>>>>> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1
>>>>>> hostname
>>>>>> ### CRASHED
>>>>>>
>>>>>> > > Error, invalid rank (1) in the rankfile (rankfile.0)
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
>>>>>> file
>>>>>> > > rmaps_rank_file.c at line 404
>>>>>> > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
>>>>>> file
>>>>>> > > base/rmaps_base_map_job.c at line 87
>>>>>> > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
>>>>>> file
>>>>>> > > base/plm_base_launch_support.c at line 77
>>>>>> > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
>>>>>> file
>>>>>> > > plm_rsh_module.c at line 985
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > A daemon (pid unknown) died unexpectedly on signal 1  while
>>>>>> > attempting to
>>>>>> > > launch so we are aborting.
>>>>>> > >
>>>>>> > > There may be more information reported by the environment (see
>>>>>> > above).
>>>>>> > >
>>>>>> > > This may be because the daemon was unable to find all the needed
>>>>>> > shared
>>>>>> > > libraries on the remote node. You may set your LD_LIBRARY_PATH to
>>>>>> > have the
>>>>>> > > location of the shared libraries on the remote nodes and this will
>>>>>> > > automatically be forwarded to the remote nodes.
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > orterun noticed that the job aborted, but has no info as to the
>>>>>> > process
>>>>>> > > that caused that situation.
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > orterun: clean termination accomplished
>>>>>>
>>>>>>
>>>>>>
>>>>>> Message: 4
>>>>>> Date: Tue, 14 Apr 2009 06:55:58 -0600
>>>>>> From: Ralph Castain <r...@lanl.gov>
>>>>>> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>>>>> To: Open MPI Users <us...@open-mpi.org>
>>>>>> Message-ID: <f6290ada-a196-43f0-a853-cbcb802d8...@lanl.gov>
>>>>>> Content-Type: text/plain; charset="us-ascii"; Format="flowed";
>>>>>>   DelSp="yes"
>>>>>>
>>>>>> The rankfile cuts across the entire job - it isn't applied on an
>>>>>> app_context basis. So the ranks in your rankfile must correspond to
>>>>>> the eventual rank of each process in the cmd line.
>>>>>>
>>>>>> Unfortunately, that means you have to count ranks. In your case, you
>>>>>> only have four, so that makes life easier. Your rankfile would look
>>>>>> something like this:
>>>>>>
>>>>>> rank 0=r001n001 slot=0
>>>>>> rank 1=r001n002 slot=1
>>>>>> rank 2=r001n001 slot=1
>>>>>> rank 3=r001n002 slot=2
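
Counting ranks by hand can be avoided with a small helper. The sketch below
assumes ranks are numbered in command-line order across the app contexts, as
described above. The function rankfile_lines is hypothetical, and the slot
numbers it emits simply count up per host, so they differ from the slots
chosen in the example above.

```python
def rankfile_lines(app_contexts):
    """Build rankfile lines for app contexts given in command-line order.

    app_contexts: list of (nprocs, host) pairs, one per ':'-separated
    section of the mpirun command line.
    """
    lines = []
    next_slot = {}  # next free slot index per host (an assumed policy)
    rank = 0        # ranks run across the whole job, not per app context
    for nprocs, host in app_contexts:
        for _ in range(nprocs):
            slot = next_slot.get(host, 0)
            lines.append(f"rank {rank}={host} slot={slot}")
            next_slot[host] = slot + 1
            rank += 1
    return lines


# The four single-process app contexts from the command being discussed.
apps = [
    (1, "r001n001"),  # master.x options1
    (1, "r001n002"),  # master.x options2
    (1, "r001n001"),  # slave.x options3
    (1, "r001n002"),  # slave.x options4
]
print("\n".join(rankfile_lines(apps)))
```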
>>>>>>
>>>>>> HTH
>>>>>> Ralph
>>>>>>
>>>>>> On Apr 14, 2009, at 12:19 AM, Geoffroy Pignot wrote:
>>>>>>
>>>>>> > Hi,
>>>>>> >
>>>>>> > I agree that my examples are not very clear. What I want to do is
>>>>>> > launch a multi-executable application (masters-slaves) and benefit from
>>>>>> > processor affinity.
>>>>>> > Could you show me how to convert this command to use the -rf option
>>>>>> > (whatever the affinity is)?
>>>>>> >
>>>>>> > mpirun -n 1 -host r001n001 master.x options1 : -n 1 -host r001n002
>>>>>> > master.x options2 : -n 1 -host r001n001 slave.x options3 : -n 1
>>>>>> > -host r001n002 slave.x options4
>>>>>> >
>>>>>> > Thanks for your help
>>>>>> >
>>>>>> > Geoffroy
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Message: 2
>>>>>> > Date: Sun, 12 Apr 2009 18:26:35 +0300
>>>>>> > From: Lenny Verkhovsky <lenny.verkhov...@gmail.com>
>>>>>> > Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>>>>> > To: Open MPI Users <us...@open-mpi.org>
>>>>>> > Message-ID:
>>>>>> >        <453d39990904120826t2e1d1d33l7bb1fe3de65b5...@mail.gmail.com
>>>>>> >
>>>>>> > Content-Type: text/plain; charset="iso-8859-1"
>>>>>> >
>>>>>> > Hi,
>>>>>> >
>>>>>> > The first "crash" is OK, since your rankfile has ranks 0 and 1 defined
>>>>>> > while n=1, which means only rank 0 is present and can be allocated.
>>>>>> >
>>>>>> > NP must be greater than the largest rank in the rankfile.
>>>>>> >
>>>>>> > What exactly are you trying to do ?
>>>>>> >
>>>>>> > I tried to recreate your segv but all I got was
>>>>>> >
>>>>>> > ~/work/svn/ompi/trunk/build_x86-64/install/bin/mpirun --hostfile
>>>>>> > hostfile.0
>>>>>> > -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname
>>>>>> > [witch19:30798] mca: base: component_find: paffinity
>>>>>> > "mca_paffinity_linux"
>>>>>> > uses an MCA interface that is not recognized (component MCA v1.0.0
>>>>>> !=
>>>>>> > supported MCA v2.0.0) -- ignored
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > It looks like opal_init failed for some reason; your parallel
>>>>>> > process is
>>>>>> > likely to abort. There are many reasons that a parallel process can
>>>>>> > fail during opal_init; some of which are due to configuration or
>>>>>> > environment problems. This failure appears to be an internal
>>>>>> failure;
>>>>>> > here's some additional information (which may only be relevant to an
>>>>>> > Open MPI developer):
>>>>>> >
>>>>>> >  opal_carto_base_select failed
>>>>>> >  --> Returned value -13 instead of OPAL_SUCCESS
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > [witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
>>>>>> file
>>>>>> > ../../orte/runtime/orte_init.c at line 78
>>>>>> > [witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
>>>>>> file
>>>>>> > ../../orte/orted/orted_main.c at line 344
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > A daemon (pid 11629) died unexpectedly with status 243 while
>>>>>> > attempting
>>>>>> > to launch so we are aborting.
>>>>>> >
>>>>>> > There may be more information reported by the environment (see
>>>>>> above).
>>>>>> >
>>>>>> > This may be because the daemon was unable to find all the needed
>>>>>> > shared
>>>>>> > libraries on the remote node. You may set your LD_LIBRARY_PATH to
>>>>>> > have the
>>>>>> > location of the shared libraries on the remote nodes and this will
>>>>>> > automatically be forwarded to the remote nodes.
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > mpirun noticed that the job aborted, but has no info as to the
>>>>>> process
>>>>>> > that caused that situation.
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > mpirun: clean termination accomplished
>>>>>> >
>>>>>> >
>>>>>> > Lenny.
>>>>>> >
>>>>>> >
>>>>>> > On 4/10/09, Geoffroy Pignot <geopig...@gmail.com> wrote:
>>>>>> > >
>>>>>> > > Hi ,
>>>>>> > >
>>>>>> > > I am currently testing the process affinity capabilities of
>>>>>> > > Open MPI and I would like to know whether the rankfile behaviour
>>>>>> > > described below is normal.
>>>>>> > >
>>>>>> > > cat hostfile.0
>>>>>> > > r011n002 slots=4
>>>>>> > > r011n003 slots=4
>>>>>> > >
>>>>>> > > cat rankfile.0
>>>>>> > > rank 0=r011n002 slot=0
>>>>>> > > rank 1=r011n003 slot=1
>>>>>> > >
>>>>>> > >
>>>>>> > >
>>>>>> >
>>>>>>
>>>>>> ##################################################################################
>>>>>> > >
>>>>>> > > mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2  hostname ### OK
>>>>>> > > r011n002
>>>>>> > > r011n003
>>>>>> > >
>>>>>> > >
>>>>>> > >
>>>>>> >
>>>>>>
>>>>>> ##################################################################################
>>>>>> > > but
>>>>>> > > mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname
>>>>>> > > ### CRASHED
>>>>>> > > *
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > Error, invalid rank (1) in the rankfile (rankfile.0)
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
>>>>>> file
>>>>>> > > rmaps_rank_file.c at line 404
>>>>>> > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
>>>>>> file
>>>>>> > > base/rmaps_base_map_job.c at line 87
>>>>>> > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
>>>>>> file
>>>>>> > > base/plm_base_launch_support.c at line 77
>>>>>> > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
>>>>>> file
>>>>>> > > plm_rsh_module.c at line 985
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > A daemon (pid unknown) died unexpectedly on signal 1  while
>>>>>> > attempting to
>>>>>> > > launch so we are aborting.
>>>>>> > >
>>>>>> > > There may be more information reported by the environment (see
>>>>>> > above).
>>>>>> > >
>>>>>> > > This may be because the daemon was unable to find all the needed
>>>>>> > shared
>>>>>> > > libraries on the remote node. You may set your LD_LIBRARY_PATH to
>>>>>> > have the
>>>>>> > > location of the shared libraries on the remote nodes and this will
>>>>>> > > automatically be forwarded to the remote nodes.
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > orterun noticed that the job aborted, but has no info as to the
>>>>>> > process
>>>>>> > > that caused that situation.
>>>>>> > >
>>>>>> >
>>>>>> --------------------------------------------------------------------------
>>>>>> > > orterun: clean termination accomplished
>>>>>> > > *
>>>>>> > > It seems that the rankfile option is not propagated to the second
>>>>>> > > command line; there is no global understanding of the ranking
>>>>>> > > inside an mpirun command.
>>>>>> > >
>>>>>> > >
>>>>>> > >
>>>>>> >
>>>>>>
>>>>>> ##################################################################################
>>>>>> > >
>>>>>> > > Assuming that, I tried to provide a rankfile to each command line:
>>>>>> > >
>>>>>> > > cat rankfile.0
>>>>>> > > rank 0=r011n002 slot=0
>>>>>> > >
>>>>>> > > cat rankfile.1
>>>>>> > > rank 0=r011n003 slot=1
>>>>>> > >
>>>>>> > > mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1
>>>>>> > > -n 1 hostname ### CRASHED
>>>>>> > > *[r011n002:28778] *** Process received signal ***
>>>>>> > > [r011n002:28778] Signal: Segmentation fault (11)
>>>>>> > > [r011n002:28778] Signal code: Address not mapped (1)
>>>>>> > > [r011n002:28778] Failing at address: 0x34
>>>>>> > > [r011n002:28778] [ 0] [0xffffe600]
>>>>>> > > [r011n002:28778] [ 1]
>>>>>> > > /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.
>>>>>> > 0(orte_odls_base_default_get_add_procs_data+0x55d)
>>>>>> > > [0x5557decd]
>>>>>> > > [r011n002:28778] [ 2]
>>>>>> > > /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.
>>>>>> > 0(orte_plm_base_launch_apps+0x117)
>>>>>> > > [0x555842a7]
>>>>>> > > [r011n002:28778] [ 3] /tmp/HALMPI/openmpi-1.3.1/lib/openmpi/
>>>>>> > mca_plm_rsh.so
>>>>>> > > [0x556098c0]
>>>>>> > > [r011n002:28778] [ 4] /tmp/HALMPI/openmpi-1.3.1/bin/orterun
>>>>>> > [0x804aa27]
>>>>>> > > [r011n002:28778] [ 5] /tmp/HALMPI/openmpi-1.3.1/bin/orterun
>>>>>> > [0x804a022]
>>>>>> > > [r011n002:28778] [ 6] /lib/libc.so.6(__libc_start_main+0xdc)
>>>>>> > [0x9f1dec]
>>>>>> > > [r011n002:28778] [ 7] /tmp/HALMPI/openmpi-1.3.1/bin/orterun
>>>>>> > [0x8049f71]
>>>>>> > > [r011n002:28778] *** End of error message ***
>>>>>> > > Segmentation fault (core dumped)*
>>>>>> > >
>>>>>> > >
>>>>>> > >
>>>>>> > > I hope that I've found a bug, because it would be very important
>>>>>> > > for me to have this kind of capability:
>>>>>> > > launch a multi-executable mpirun command line and be able to bind my
>>>>>> > > exes and sockets together.
>>>>>> > >
>>>>>> > > Thanks in advance for your help
>>>>>> > >
>>>>>> > > Geoffroy
>>>>>> > _______________________________________________
>>>>>> > users mailing list
>>>>>> > us...@open-mpi.org
>>>>>> > http://www.open-mpi.org/mailman/listinfo.cgi/users