Dear MPI people, 
I want to use the LogGP model with MPI to predict how much time a message of
K bytes will take. For this I need to find the latency L, the overhead o, and
the gap G. Can somebody tell me how I can measure these three parameters of
the underlying network, and how often I should re-measure them so that the
prediction of the time for sending a message of K bytes remains accurate?

regards,
Mudassar
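
A common starting point for measuring these parameters is a two-rank
ping-pong micro-benchmark. The sketch below is only illustrative (the file
name loggp_probe.c, the message sizes and the repetition count are arbitrary
choices), and it yields the combined startup cost L + 2o and the per-byte
gap G rather than L, o and the per-message gap g individually:

  /* loggp_probe.c -- rough two-rank ping-pong; file name, message sizes and
   * repetition count are arbitrary choices.  Build with "mpicc loggp_probe.c
   * -o loggp_probe" and run with "mpirun -np 2 ./loggp_probe".            */
  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
      const int sizes[] = { 1, 1024, 16384, 262144, 1048576 };
      const int nsizes  = (int)(sizeof(sizes) / sizeof(sizes[0]));
      const int reps    = 200;
      int rank, nprocs, s, i;
      char *buf = malloc(1048576);       /* large enough for the biggest size */
      double t0, oneway;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
      if (nprocs != 2) {
          if (rank == 0) fprintf(stderr, "run with exactly 2 ranks\n");
          MPI_Abort(MPI_COMM_WORLD, 1);
      }

      for (s = 0; s < nsizes; s++) {
          int k = sizes[s];
          MPI_Barrier(MPI_COMM_WORLD);           /* start both ranks together */
          t0 = MPI_Wtime();
          for (i = 0; i < reps; i++) {
              if (rank == 0) {                   /* ping */
                  MPI_Send(buf, k, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                  MPI_Recv(buf, k, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);
              } else {                           /* pong */
                  MPI_Recv(buf, k, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);
                  MPI_Send(buf, k, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
              }
          }
          oneway = (MPI_Wtime() - t0) / reps / 2.0;  /* half the round trip */
          if (rank == 0)
              printf("%8d bytes : %g s one-way\n", k, oneway);
      }

      free(buf);
      MPI_Finalize();
      return 0;
  }

Fitting T(k) ~ (L + 2o) + k*G to the printed one-way times gives the K-byte
prediction directly; separating o from L (and measuring the small-message
gap g) needs the classic LogP micro-benchmark technique of issuing bursts of
sends with and without interleaved computation, which dedicated measurement
tools automate. As for how often to re-measure: on a dedicated, lightly
loaded cluster these parameters are normally stable, so re-measuring after
hardware, driver, network or MPI configuration changes (or when the nodes
are shared with other traffic) is usually sufficient.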



________________________________
From: "users-requ...@open-mpi.org" <users-requ...@open-mpi.org>
To: us...@open-mpi.org
Sent: Wednesday, October 26, 2011 6:00 PM
Subject: users Digest, Vol 2052, Issue 1

Send users mailing list submissions to
    us...@open-mpi.org

To subscribe or unsubscribe via the World Wide Web, visit
    http://www.open-mpi.org/mailman/listinfo.cgi/users
or, via email, send a message with subject or body 'help' to
    users-requ...@open-mpi.org

You can reach the person managing the list at
    users-ow...@open-mpi.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of users digest..."


Today's Topics:

   1. Re: Problem-Bug with MPI_Intercomm_create() (Ralph Castain)
   2. Re: Checkpoint from inside MPI program with OpenMPI 1.4.2 ?
      (Josh Hursey)
   3. Subnet routing (1.2.x) not working in 1.4.3 anymore (Mirco Wahab)
   4. Re: mpirun should run with just the localhost    interface on
      win? (MM)
   5. Re: Checkpoint from inside MPI program with OpenMPI 1.4.2 ?
      (Nguyen Toan)
   6. Re: exited on signal 11 (Segmentation fault).
      (Mouhamad Al-Sayed-Ali)
   7. Changing plm_rsh_agent system wide (Patrick Begou)
   8. Re: Checkpoint from inside MPI program with OpenMPI 1.4.2 ?
      (Josh Hursey)
   9. Re: Changing plm_rsh_agent system wide (Ralph Castain)
  10. Re: Changing plm_rsh_agent system wide (TERRY DONTJE)
  11. Re: Changing plm_rsh_agent system wide (TERRY DONTJE)
  12. Re: Changing plm_rsh_agent system wide (Patrick Begou)


----------------------------------------------------------------------

Message: 1
Date: Tue, 25 Oct 2011 10:08:00 -0600
From: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI users] Problem-Bug with MPI_Intercomm_create()
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <30d41149-6683-41c2-ace0-776c64e5c...@open-mpi.org>
Content-Type: text/plain; charset=iso-8859-1

FWIW: I have tracked this problem down. The fix is a little more complicated
than I'd like, so I'm going to have to ping some other folks to ensure we
concur on the approach before doing anything.

On Oct 25, 2011, at 8:20 AM, Ralph Castain wrote:

> I still see it failing the test George provided on the trunk. I'm unaware of 
> anyone looking further into it, though, as the prior discussion seemed to 
> just end.
> 
> On Oct 25, 2011, at 7:01 AM, orel wrote:
> 
>> Dears,
>> 
>> For several days I have been trying to use advanced MPI-2 features in the
>> following scenario:
>> 
>> 1) a master code A (of size NPA) spawns (MPI_Comm_spawn()) two slave
>>    codes B (of size NPB) and C (of size NPC), providing intercomms A-B
>>    and A-C;
>> 2) I create intracomms AB and AC by merging the intercomms;
>> 3) then I create intercomm AB-C by calling MPI_Intercomm_create(), using
>>    AC as the bridge:
>> 
>>    MPI_Comm intercommABC;
>>    A: MPI_Intercomm_create(intracommAB, 0, intracommAC, NPA, TAG, &intercommABC);
>>    B: MPI_Intercomm_create(intracommAB, 0, MPI_COMM_NULL, 0, TAG, &intercommABC);
>>    C: MPI_Intercomm_create(intracommC,  0, intracommAC,  0, TAG, &intercommABC);
>> 
>>    In these calls, A0 and C0 play the role of local leader for AB and C
>>    respectively; C0 and A0 play the role of remote leader in the bridge
>>    intracomm AC.
>> 
>> 4) MPI_Barrier(intercommABC);
>> 5) I merge intercomm AB-C into intracomm ABC;
>> 6) MPI_Barrier(intracommABC);
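
A minimal sketch of this sequence from code A's side (assuming the slave
executables are named "codeB" and "codeC", and that NPA, NPB, NPC and TAG
are defined constants) would be the following; B and C make the matching
calls shown above:

  MPI_Comm interAB, interAC;            /* intercomms returned by the spawns    */
  MPI_Comm intracommAB, intracommAC;    /* merged intracomms (AC is the bridge) */
  MPI_Comm intercommABC, intracommABC;

  /* step 1: spawn the two slave codes */
  MPI_Comm_spawn("codeB", MPI_ARGV_NULL, NPB, MPI_INFO_NULL, 0,
                 MPI_COMM_WORLD, &interAB, MPI_ERRCODES_IGNORE);
  MPI_Comm_spawn("codeC", MPI_ARGV_NULL, NPC, MPI_INFO_NULL, 0,
                 MPI_COMM_WORLD, &interAC, MPI_ERRCODES_IGNORE);

  /* step 2: merge the spawn intercomms into intracomms */
  MPI_Intercomm_merge(interAB, 0, &intracommAB);
  MPI_Intercomm_merge(interAC, 0, &intracommAC);

  /* step 3: bridge AB with C; A0 is the local leader of AB, and C0 is
   * rank NPA in the bridge intracomm AC                               */
  MPI_Intercomm_create(intracommAB, 0, intracommAC, NPA, TAG, &intercommABC);

  MPI_Barrier(intercommABC);                            /* step 4 */
  MPI_Intercomm_merge(intercommABC, 0, &intracommABC);  /* step 5 */
  MPI_Barrier(intracommABC);                            /* step 6: the reported failure */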
>> 
>> My BUG: these calls succeed, but when I try to use intracommABC for a
>> collective communication like MPI_Barrier(),
>> I get the following error:
>> 
>> *** An error occurred in MPI_Barrier
>> *** on communicator
>> *** MPI_ERR_INTERN: internal error
>> *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>> 
>> 
>> I have tried with the Open MPI trunk, 1.5.3, 1.5.4 and MPICH2 1.4.1p1.
>> 
>> My code works perfectly if intracomms A, B and C are obtained by
>> MPI_Comm_split() instead of MPI_Comm_spawn()!
>> 
>> 
>> I found the same problem in a previous thread of the OMPI users mailing list:
>> 
>> => http://www.open-mpi.org/community/lists/users/2011/06/16711.php
>> 
>> Is that bug/problem currently under investigation? :-)
>> 
>> I can provide detailed code, but the example George Bosilca posted in that
>> previous thread produces the same error...
>> 
>> Thank you for your help...
>> 
>> -- 
>> Aurélien Esnard
>> University Bordeaux 1 / LaBRI / INRIA (France)
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 




------------------------------

Message: 2
Date: Tue, 25 Oct 2011 13:25:27 -0500
From: Josh Hursey <jjhur...@open-mpi.org>
Subject: Re: [OMPI users] Checkpoint from inside MPI program with
    OpenMPI 1.4.2 ?
To: Open MPI Users <us...@open-mpi.org>
Message-ID:
    <CAANzjEnOdwva5J4fFBmXtsK6Kj3yGE9j=dKdtaWuZs=whzg...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Open MPI (trunk/1.7 - not 1.4 or 1.5) provides an application level
interface to request a checkpoint of an application. This API is
defined on the following website:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_checkpoint

This will behave the same as if you requested the checkpoint of the
job from the command line.

-- Josh

On Mon, Oct 24, 2011 at 12:37 PM, Nguyen Toan <nguyentoan1...@gmail.com> wrote:
> Dear all,
> I want to checkpoint an MPI program automatically with Open MPI (I'm
> currently using version 1.4.2 with BLCR 0.8.2), rather than by manually
> typing the ompi-checkpoint command from another terminal.
> So I would like to know whether there is a way to call a checkpoint
> function from inside an MPI program with Open MPI, and how to do that.
> Any ideas are much appreciated.
> Regards,
> Nguyen Toan
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


------------------------------

Message: 3
Date: Tue, 25 Oct 2011 22:15:12 +0200
From: Mirco Wahab <mirco.wa...@chemie.tu-freiberg.de>
Subject: [OMPI users] Subnet routing (1.2.x) not working in 1.4.3
    anymore
To: us...@open-mpi.org
Message-ID: <4ea718d0.5060...@chemie.tu-freiberg.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

In the last few years it has been very simple to set up multiple
high-performance (GbE) back-to-back connections between three nodes
(triangular topology) or four nodes (tetrahedral topology).

The only things you had to do were
- use 3 (or 4) cheap compute nodes w/Linux and connect
   each of them via standard GbE router (onboard GbE NIC)
   to a file server,
- put 2 (trigonal topol.) or 3 (tetrahedral topol.)
   $25 PCIe-GbE-NICs into *each* node,
- connect the nodes with 3 (trigonal) or 4 (tetrahedral)
   short crossover Cat5e cables,
- configure the extra NICs into different subnets
   according to their "edge index", e.g.
   for 3 nodes (node10, node11, node12)
     node10
       onboard NIC: 192.168.0.10 on eth0 (to router/server)
       extra NIC: 10.0.1.10 on eth1 (edge 1 to 10.0.1.11)
       extra NIC: 10.0.2.10 on eth2 (edge 2 to 10.0.2.12)
     node11
       onboard NIC: 192.168.0.11 on eth0 (to router/server)
       extra NIC: 10.0.1.11 on eth1 (edge 1 to 10.0.1.10)
       extra NIC: 10.0.3.11 on eth3 (edge 3 to 10.0.3.12)
     node12
       onboard NIC: 192.168.0.12 on eth0 (to router/server)
       extra NIC: 10.0.2.12 on eth2 (edge 2 to 10.0.2.10)
       extra NIC: 10.0.3.12 on eth3 (edge 3 to 10.0.3.11)
- that's it. I mean, that *was* it, with 1.2.x.

OMPI 1.2.x would then ingeniously discover the routable edges and open
communication ports accordingly, without any additional explicit host
routing, e.g. when invoked as

$> mpirun -np 12 --host c10,c11,c12 --mca btl_tcp_if_exclude lo,eth0  my_mpi_app

and (as measured with iftop) saturate the available edges at about
100 MB/s duplex on each of them. It would not stumble over the fact that
some interfaces are not directly reachable from every NIC. This was very
convenient over the years.

With 1.4.3 (which comes out of the box with current Linux distributions),
this no longer works. It hangs and, after a timeout, complains about failed
endpoint connects, e.g.:

[node12][[52378,1],2][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
 connect() to 10.0.1.11 failed: Connection timed out (110)

* Can the intelligent behaviour of 1.2.x be "configured back"?

* What should the topology look like to work with 1.4.x painlessly?

Thanks & regards

M.




------------------------------

Message: 4
Date: Tue, 25 Oct 2011 21:33:54 +0100
From: "MM" <finjulh...@gmail.com>
Subject: Re: [OMPI users] mpirun should run with just the localhost
    interface on win?
To: "'openmpi mailing list'" <us...@open-mpi.org>
Message-ID: <00d601cc9355$6af47290$40dd57b0$@com>
Content-Type: text/plain;    charset="us-ascii"

-----Original Message-----

If the interface is down, should localhost still allow mpirun to run MPI
processes?



------------------------------

Message: 5
Date: Wed, 26 Oct 2011 13:52:17 +0900
From: Nguyen Toan <nguyentoan1...@gmail.com>
Subject: Re: [OMPI users] Checkpoint from inside MPI program with
    OpenMPI 1.4.2 ?
To: Open MPI Users <us...@open-mpi.org>
Message-ID:
    <CAFiEserJ0U9m9euy1-CA8m=_KihMM5s73qaJiii_N=p7f3k...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Dear Josh,

Thank you. I will test the 1.7 trunk as you suggested.
I also want to ask whether this interface can be added to Open MPI 1.4.2,
because my applications mainly rely on that version.

Regards,
Nguyen Toan

On Wed, Oct 26, 2011 at 3:25 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:

> Open MPI (trunk/1.7 - not 1.4 or 1.5) provides an application level
> interface to request a checkpoint of an application. This API is
> defined on the following website:
>  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_checkpoint
>
> This will behave the same as if you requested the checkpoint of the
> job from the command line.
>
> -- Josh
>
> On Mon, Oct 24, 2011 at 12:37 PM, Nguyen Toan <nguyentoan1...@gmail.com>
> wrote:
> > Dear all,
> > I want to automatically checkpoint an MPI program with OpenMPI ( I'm
> > currently using 1.4.2 version with BLCR 0.8.2),
> > not by manually typing ompi-checkpoint command line from another
> terminal.
> > So I would like to know if there is a way to call checkpoint function
> from
> > inside an MPI program
> > with OpenMPI or how to do that.
> > Any ideas are very appreciated.
> > Regards,
> > Nguyen Toan
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
>
>
>
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
-------------- next part --------------
HTML attachment scrubbed and removed

------------------------------

Message: 6
Date: Wed, 26 Oct 2011 09:57:38 +0200
From: Mouhamad Al-Sayed-Ali <mouhamad.al-sayed-...@u-bourgogne.fr>
Subject: Re: [OMPI users] exited on signal 11 (Segmentation fault).
To: Gus Correa <g...@ldeo.columbia.edu>
Cc: Open MPI Users <us...@open-mpi.org>
Message-ID: <20111026095738.119675e8nwvpx...@webmail.u-bourgogne.fr>
Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes";
    format="flowed"

Hi Gus Correa,

  the output of 'ulimit -a' is:


----
file(blocks)         unlimited
coredump(blocks)     2048
data(kbytes)         unlimited
stack(kbytes)        10240
lockedmem(kbytes)    unlimited
memory(kbytes)       unlimited
nofiles(descriptors) 1024
processes            256
--------


Thanks

Mouhamad
Gus Correa <g...@ldeo.columbia.edu> wrote:

> Hi Mouhamad
>
> The locked memory is set to unlimited, but the lines
> about the stack are commented out.
> Have you tried to add this line:
>
> *   -   stack       -1
>
> then run wrf again? [Note no "#" hash character]
>
> Also, if you login to the compute nodes,
> what is the output of 'limit' [csh,tcsh] or 'ulimit -a' [sh,bash]?
> This should tell you what limits are actually set.
>
> I hope this helps,
> Gus Correa
>
> Mouhamad Al-Sayed-Ali wrote:
>> Hi all,
>>
>>   I've checked the "limits.conf" file, and it contains these lines:
>>
>>
>> # Jcb 29.06.2007 : pbs wrf (Siji)
>> #*      hard    stack   1000000
>> #*      soft    stack   1000000
>>
>> # Dr 14.02.2008 : pour voltaire mpi
>> *      hard    memlock unlimited
>> *      soft    memlock unlimited
>>
>>
>>
>> Many thanks for your help
>> Mouhamad
>>
>> Gus Correa <g...@ldeo.columbia.edu> wrote:
>>
>>> Hi Mouhamad, Ralph, Terry
>>>
>>> Very often big programs like wrf crash with segfault because they
>>> can't allocate memory on the stack, and assume the system doesn't
>>> impose any limits for it.  This has nothing to do with MPI.
>>>
>>> Mouhamad:  Check if your stack size is set to unlimited on all compute
>>> nodes.  The easy way to get it done
>>> is to change /etc/security/limits.conf,
>>> where you or your system administrator could add these lines:
>>>
>>> *   -   memlock     -1
>>> *   -   stack       -1
>>> *   -   nofile      4096
>>>
>>> My two cents,
>>> Gus Correa
>>>
>>> Ralph Castain wrote:
>>>> Looks like you are crashing in wrf - have you asked them for help?
>>>>
>>>> On Oct 25, 2011, at 7:53 AM, Mouhamad Al-Sayed-Ali wrote:
>>>>
>>>>> Hi again,
>>>>>
>>>>> This is exactly the error I have:
>>>>>
>>>>> ----
>>>>> taskid: 0 hostname: part034.u-bourgogne.fr
>>>>> [part034:21443] *** Process received signal ***
>>>>> [part034:21443] Signal: Segmentation fault (11)
>>>>> [part034:21443] Signal code: Address not mapped (1)
>>>>> [part034:21443] Failing at address: 0xfffffffe01eeb340
>>>>> [part034:21443] [ 0] /lib64/libpthread.so.0 [0x3612c0de70]
>>>>> [part034:21443] [ 1] wrf.exe(__module_ra_rrtm_MOD_taugb3+0x418)  
>>>>> [0x11cc9d8]
>>>>> [part034:21443] [ 2] wrf.exe(__module_ra_rrtm_MOD_gasabs+0x260)  
>>>>> [0x11cfca0]
>>>>> [part034:21443] [ 3] wrf.exe(__module_ra_rrtm_MOD_rrtm+0xb31) [0x11e6e41]
>>>>> [part034:21443] [ 4]  
>>>>> wrf.exe(__module_ra_rrtm_MOD_rrtmlwrad+0x25ec) [0x11e9bcc]
>>>>> [part034:21443] [ 5]  
>>>>> wrf.exe(__module_radiation_driver_MOD_radiation_driver+0xe573)  
>>>>> [0xcc4ed3]
>>>>> [part034:21443] [ 6]  
>>>>> wrf.exe(__module_first_rk_step_part1_MOD_first_rk_step_part1+0x40c5)  
>>>>> [0xe0e4f5]
>>>>> [part034:21443] [ 7] wrf.exe(solve_em_+0x22e58) [0x9b45c8]
>>>>> [part034:21443] [ 8] wrf.exe(solve_interface_+0x80a) [0x902dda]
>>>>> [part034:21443] [ 9]  
>>>>> wrf.exe(__module_integrate_MOD_integrate+0x236) [0x4b2c4a]
>>>>> [part034:21443] [10] wrf.exe(__module_wrf_top_MOD_wrf_run+0x24)  
>>>>> [0x47a924]
>>>>> [part034:21443] [11] wrf.exe(main+0x41) [0x4794d1]
>>>>> [part034:21443] [12] /lib64/libc.so.6(__libc_start_main+0xf4)  
>>>>> [0x361201d8b4]
>>>>> [part034:21443] [13] wrf.exe [0x4793c9]
>>>>> [part034:21443] *** End of error message ***
>>>>> -------
>>>>>
>>>>> Mouhamad
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>




------------------------------

Message: 7
Date: Wed, 26 Oct 2011 11:11:08 +0200
From: Patrick Begou <patrick.be...@hmg.inpg.fr>
Subject: [OMPI users] Changing plm_rsh_agent system wide
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <4ea7ceac.3080...@hmg.inpg.fr>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed

I need to change, system-wide, how Open MPI launches jobs on the nodes of my
cluster.

Setting:
export OMPI_MCA_plm_rsh_agent=oarsh

works fine, but I would like this configuration to be the default for Open
MPI. I've read several threads (discussions, FAQ) about this, but none of the
provided solutions seems to work.

I have two files:
/usr/lib/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
/usr/lib64/openmpi/1.4-gcc/etc/openmpi-mca-params.conf

In these files I've tried various flavors of the syntax (only one at a time,
and the same in each file, of course!):
test 1) plm_rsh_agent = oarsh
test 2) pls_rsh_agent = oarsh
test 3) orte_rsh_agent = oarsh

But each time I run "ompi_info --param plm rsh" I get:
MCA plm: parameter "plm_rsh_agent" (current value: "ssh : rsh", data source: 
default value, synonyms:
                   pls_rsh_agent)
                   The command used to launch executables on remote nodes 
(typically either "ssh" or "rsh")

With the exported variable it works fine.
Any suggestions?

The rpm package of my linux Rocks Cluster provides:
    Package: Open MPI root@build-x86-64 Distribution
    Open MPI: 1.4.3
    Open MPI SVN revision: r23834
    Open MPI release date: Oct 05, 2010

Thanks

Patrick



  --
===============================================================
|  Equipe M.O.S.T.         | http://most.hmg.inpg.fr          |
|  Patrick BEGOU           |       ------------               |
|  LEGI                    | mailto:patrick.be...@hmg.inpg.fr |
|  BP 53 X                 | Tel 04 76 82 51 35               |
|  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71               |
===============================================================



------------------------------

Message: 8
Date: Wed, 26 Oct 2011 07:20:38 -0500
From: Josh Hursey <jjhur...@open-mpi.org>
Subject: Re: [OMPI users] Checkpoint from inside MPI program with
    OpenMPI 1.4.2 ?
To: Open MPI Users <us...@open-mpi.org>
Message-ID:
    <CAANzjEmx=so_9mtzvm+wiplwfhpsim6uxeosxnpgdd8quzo...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Since this would be a new feature, we cannot add it to 1.4; the 1.4 branch is
for bug fixes only. However, we may be able to add it to 1.5. I filed a
ticket if you want to track that progress:
  https://svn.open-mpi.org/trac/ompi/ticket/2895

-- Josh


On Tue, Oct 25, 2011 at 11:52 PM, Nguyen Toan <nguyentoan1...@gmail.com> wrote:
> Dear Josh,
> Thank you. I will test the 1.7 trunk as you suggested.
> Also I want to ask if we can add this interface to OpenMPI 1.4.2,
> because my applications are mainly involved in this version.
> Regards,
> Nguyen Toan
> On Wed, Oct 26, 2011 at 3:25 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:
>>
>> Open MPI (trunk/1.7 - not 1.4 or 1.5) provides an application level
>> interface to request a checkpoint of an application. This API is
>> defined on the following website:
>> http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_checkpoint
>>
>> This will behave the same as if you requested the checkpoint of the
>> job from the command line.
>>
>> -- Josh
>>
>> On Mon, Oct 24, 2011 at 12:37 PM, Nguyen Toan <nguyentoan1...@gmail.com>
>> wrote:
>> > Dear all,
>> > I want to automatically checkpoint an MPI program with OpenMPI ( I'm
>> > currently using 1.4.2 version with BLCR 0.8.2),
>> > not by manually typing ompi-checkpoint command line from another
>> > terminal.
>> > So I would like to know if there is a way to call checkpoint function
>> > from
>> > inside an MPI program
>> > with OpenMPI or how to do that.
>> > Any ideas are very appreciated.
>> > Regards,
>> > Nguyen Toan
>> > _______________________________________________
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>>
>>
>>
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



------------------------------

Message: 9
Date: Wed, 26 Oct 2011 08:44:45 -0600
From: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI users] Changing plm_rsh_agent system wide
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <f188cf99-9a7a-4327-af9c-51d578cd5...@open-mpi.org>
Content-Type: text/plain; charset=us-ascii

Did the version you are running get installed in /usr? Sounds like you are 
picking up a different version when running a command - i.e., that your PATH is 
finding a different installation than the one in /usr.


On Oct 26, 2011, at 3:11 AM, Patrick Begou wrote:

> I need to change system wide how OpenMPI launch the jobs on the nodes of my 
> cluster.
> 
> Setting:
> export OMPI_MCA_plm_rsh_agent=oarsh
> 
> works fine but I would like this config to be the default with OpenMPI. I've 
> read several threads (discussions, FAQ) about this but none of the provided 
> solutions seams to work.
> 
> I have two files:
> /usr/lib/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
> /usr/lib64/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
> 
> In these files I've set various flavor of the syntax (only one at a time, and 
> the same in each file of course!):
> test 1) plm_rsh_agent = oarsh
> test 2) pls_rsh_agent = oarsh
> test 3) orte_rsh_agent = oarsh
> 
> But each time when I run "ompi_info --param plm rsh" I get:
> MCA plm: parameter "plm_rsh_agent" (current value: "ssh : rsh", data source: 
> default value, synonyms:
>                  pls_rsh_agent)
>                  The command used to launch executables on remote nodes 
>(typically either "ssh" or "rsh")
> 
> With the exported variable it works fine.
> Any suggestion ?
> 
> The rpm package of my linux Rocks Cluster provides:
>   Package: Open MPI root@build-x86-64 Distribution
>   Open MPI: 1.4.3
>   Open MPI SVN revision: r23834
>   Open MPI release date: Oct 05, 2010
> 
> Thanks
> 
> Patrick
> 
> 
> 
> --
> ===============================================================
> |  Equipe M.O.S.T.         | http://most.hmg.inpg.fr          |
> |  Patrick BEGOU           |       ------------               |
> |  LEGI                    | mailto:patrick.be...@hmg.inpg.fr |
> |  BP 53 X                 | Tel 04 76 82 51 35               |
> |  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71               |
> ===============================================================
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




------------------------------

Message: 10
Date: Wed, 26 Oct 2011 10:49:38 -0400
From: TERRY DONTJE <terry.don...@oracle.com>
Subject: Re: [OMPI users] Changing plm_rsh_agent system wide
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <4ea81e02.6080...@oracle.com>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

I am using a prefix configuration, so no, it does not exist in /usr.

--td

On 10/26/2011 10:44 AM, Ralph Castain wrote:
> Did the version you are running get installed in /usr? Sounds like you are 
> picking up a different version when running a command - i.e., that your PATH 
> is finding a different installation than the one in /usr.
>
>
> On Oct 26, 2011, at 3:11 AM, Patrick Begou wrote:
>
>> I need to change system wide how OpenMPI launch the jobs on the nodes of my 
>> cluster.
>>
>> Setting:
>> export OMPI_MCA_plm_rsh_agent=oarsh
>>
>> works fine but I would like this config to be the default with OpenMPI. I've 
>> read several threads (discussions, FAQ) about this but none of the provided 
>> solutions seams to work.
>>
>> I have two files:
>> /usr/lib/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
>> /usr/lib64/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
>>
>> In these files I've set various flavor of the syntax (only one at a time, 
>> and the same in each file of course!):
>> test 1) plm_rsh_agent = oarsh
>> test 2) pls_rsh_agent = oarsh
>> test 3) orte_rsh_agent = oarsh
>>
>> But each time when I run "ompi_info --param plm rsh" I get:
>> MCA plm: parameter "plm_rsh_agent" (current value: "ssh : rsh", data source: 
>> default value, synonyms:
>>                   pls_rsh_agent)
>>                   The command used to launch executables on remote nodes 
>>(typically either "ssh" or "rsh")
>>
>> With the exported variable it works fine.
>> Any suggestion ?
>>
>> The rpm package of my linux Rocks Cluster provides:
>>    Package: Open MPI root@build-x86-64 Distribution
>>    Open MPI: 1.4.3
>>    Open MPI SVN revision: r23834
>>    Open MPI release date: Oct 05, 2010
>>
>> Thanks
>>
>> Patrick
>>
>>
>>
>> --
>> ===============================================================
>> |  Equipe M.O.S.T.         | http://most.hmg.inpg.fr          |
>> |  Patrick BEGOU           |       ------------               |
>> |  LEGI                    | mailto:patrick.be...@hmg.inpg.fr |
>> |  BP 53 X                 | Tel 04 76 82 51 35               |
>> |  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71               |
>> ===============================================================
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>



-------------- next part --------------
HTML attachment scrubbed and removed
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/gif
Size: 2059 bytes
Desc: not available
URL: 
<http://www.open-mpi.org/MailArchives/users/attachments/20111026/2e811d83/attachment.gif>

------------------------------

Message: 11
Date: Wed, 26 Oct 2011 10:51:06 -0400
From: TERRY DONTJE <terry.don...@oracle.com>
Subject: Re: [OMPI users] Changing plm_rsh_agent system wide
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <4ea81e5a.3030...@oracle.com>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

Sorry please disregard my reply to this email.

:-)

--td

On 10/26/2011 10:44 AM, Ralph Castain wrote:
> Did the version you are running get installed in /usr? Sounds like you are 
> picking up a different version when running a command - i.e., that your PATH 
> is finding a different installation than the one in /usr.
>
>
> On Oct 26, 2011, at 3:11 AM, Patrick Begou wrote:
>
>> I need to change system wide how OpenMPI launch the jobs on the nodes of my 
>> cluster.
>>
>> Setting:
>> export OMPI_MCA_plm_rsh_agent=oarsh
>>
>> works fine but I would like this config to be the default with OpenMPI. I've 
>> read several threads (discussions, FAQ) about this but none of the provided 
>> solutions seams to work.
>>
>> I have two files:
>> /usr/lib/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
>> /usr/lib64/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
>>
>> In these files I've set various flavor of the syntax (only one at a time, 
>> and the same in each file of course!):
>> test 1) plm_rsh_agent = oarsh
>> test 2) pls_rsh_agent = oarsh
>> test 3) orte_rsh_agent = oarsh
>>
>> But each time when I run "ompi_info --param plm rsh" I get:
>> MCA plm: parameter "plm_rsh_agent" (current value: "ssh : rsh", data source: 
>> default value, synonyms:
>>                   pls_rsh_agent)
>>                   The command used to launch executables on remote nodes 
>>(typically either "ssh" or "rsh")
>>
>> With the exported variable it works fine.
>> Any suggestion ?
>>
>> The rpm package of my linux Rocks Cluster provides:
>>    Package: Open MPI root@build-x86-64 Distribution
>>    Open MPI: 1.4.3
>>    Open MPI SVN revision: r23834
>>    Open MPI release date: Oct 05, 2010
>>
>> Thanks
>>
>> Patrick
>>
>>
>>
>> --
>> ===============================================================
>> |  Equipe M.O.S.T.         | http://most.hmg.inpg.fr          |
>> |  Patrick BEGOU           |       ------------               |
>> |  LEGI                    | mailto:patrick.be...@hmg.inpg.fr |
>> |  BP 53 X                 | Tel 04 76 82 51 35               |
>> |  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71               |
>> ===============================================================
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>



-------------- next part --------------
HTML attachment scrubbed and removed
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/gif
Size: 2059 bytes
Desc: not available
URL: 
<http://www.open-mpi.org/MailArchives/users/attachments/20111026/5d399085/attachment.gif>

------------------------------

Message: 12
Date: Wed, 26 Oct 2011 17:57:54 +0200
From: Patrick Begou <patrick.be...@hmg.inpg.fr>
Subject: Re: [OMPI users] Changing plm_rsh_agent system wide
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <4ea82e02.9020...@hmg.inpg.fr>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed

Ralph Castain wrote:
> Did the version you are running get installed in /usr? Sounds like you are 
> picking up a different version when running a command - i.e., that your PATH 
> is finding a different installation than the one in /usr.

Right! I'm using Open MPI with the Rocks Cluster distribution. There is an
openmpi-1.4-4.el5 rpm installed with
/usr/lib*/openmpi/1.4-gcc/etc/openmpi-mca-params.conf,

but there is also rocks-openmpi-1.4.3-1 with
/opt/openmpi/etc/openmpi-mca-params.conf.

I had never noticed this duplicate default install of Open MPI in this Linux
distribution.
Thanks a lot for the suggestion; I was fixated on a syntax error in my config...
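
So the fix, presumably, is just to put the parameter into the configuration
file of the installation that is actually found first in PATH, e.g. (assuming
the /opt tree is the one being picked up):

  # /opt/openmpi/etc/openmpi-mca-params.conf
  plm_rsh_agent = oarsh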

Patrick
>
>
> On Oct 26, 2011, at 3:11 AM, Patrick Begou wrote:
>
>> I need to change system wide how OpenMPI launch the jobs on the nodes of my 
>> cluster.
>>
>> Setting:
>> export OMPI_MCA_plm_rsh_agent=oarsh
>>
>> works fine but I would like this config to be the default with OpenMPI. I've 
>> read several threads (discussions, FAQ) about this but none of the provided 
>> solutions seams to work.
>>
>> I have two files:
>> /usr/lib/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
>> /usr/lib64/openmpi/1.4-gcc/etc/openmpi-mca-params.conf
>>
>> In these files I've set various flavor of the syntax (only one at a time, 
>> and the same in each file of course!):
>> test 1) plm_rsh_agent = oarsh
>> test 2) pls_rsh_agent = oarsh
>> test 3) orte_rsh_agent = oarsh
>>
>> But each time when I run "ompi_info --param plm rsh" I get:
>> MCA plm: parameter "plm_rsh_agent" (current value: "ssh : rsh", data source: 
>> default value, synonyms:
>>                   pls_rsh_agent)
>>                   The command used to launch executables on remote nodes 
>>(typically either "ssh" or "rsh")
>>
>> With the exported variable it works fine.
>> Any suggestion ?
>>
>> The rpm package of my linux Rocks Cluster provides:
>>    Package: Open MPI root@build-x86-64 Distribution
>>    Open MPI: 1.4.3
>>    Open MPI SVN revision: r23834
>>    Open MPI release date: Oct 05, 2010
>>
>> Thanks
>>
>> Patrick
>>
>>
>>
>> --
>> ===============================================================
>> |  Equipe M.O.S.T.         | http://most.hmg.inpg.fr          |
>> |  Patrick BEGOU           |       ------------               |
>> |  LEGI                    | mailto:patrick.be...@hmg.inpg.fr |
>> |  BP 53 X                 | Tel 04 76 82 51 35               |
>> |  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71               |
>> ===============================================================
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


-- 
===============================================================
|  Equipe M.O.S.T.         | http://most.hmg.inpg.fr          |
|  Patrick BEGOU           |       ------------               |
|  LEGI                    | mailto:patrick.be...@hmg.inpg.fr |
|  BP 53 X                 | Tel 04 76 82 51 35               |
|  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71               |
===============================================================





------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

End of users Digest, Vol 2052, Issue 1
**************************************
