Re: [OMPI users] Mapping by hwthreads without fully populating sockets

2016-08-16 Thread Ben Menadue
Hi Gilles,

Ah, of course - I forgot about that.

Thanks,
Ben


-Original Message-
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Gilles
Gouaillardet
Sent: Tuesday, 16 August 2016 4:07 PM
To: Open MPI Users 
Subject: Re: [OMPI users] Mapping by hwthreads without fully populating
sockets

Ben,


In my case (two sockets, 6 cores per socket, 2 threads per core), with
Open MPI master:


$ cat rf
rank 0=n0 slot=0
rank 1=n0 slot=12
rank 2=n0 slot=6
rank 3=n0 slot=18

$ mpirun -np 4 --rankfile rf --mca rmaps_rank_file_physical 1 --bind-to 
hwthread --report-bindings true
[n0:38430] MCW rank 0 bound to socket 0[core 0[hwt 0]]: 
[B./../../../../..][../../../../../..]
[n0:38430] MCW rank 1 bound to socket 0[core 0[hwt 1]]: 
[.B/../../../../..][../../../../../..]
[n0:38430] MCW rank 2 bound to socket 1[core 6[hwt 0]]: 
[../../../../../..][B./../../../../..]
[n0:38430] MCW rank 3 bound to socket 1[core 6[hwt 1]]: 
[../../../../../..][.B/../../../../..]
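
The same recipe should carry over to the layout in your original question
(two sockets, 8 cores per socket, 2 hwthreads per core). Assuming the same
physical numbering scheme (second hwthread of a core = core index + total
number of cores; do check with lstopo, since physical numbering is
BIOS/kernel dependent), an untested sketch of the rankfile would be:

rank 0=n0 slot=0
rank 1=n0 slot=16
rank 2=n0 slot=8
rank 3=n0 slot=24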


Cheers,

Gilles

On 8/16/2016 12:40 PM, Ben Menadue wrote:
> Hi,
>
> I'm trying to map by hwthread but only partially populating sockets. For
> example, I'm looking to create arrangements like this:
>
> Rank 0: [B./../../../../../../..][../../../../../../../..]
> Rank 1: [.B/../../../../../../..][../../../../../../../..]
> Rank 2: [../../../../../../../..][B./../../../../../../..]
> Rank 3: [../../../../../../../..][.B/../../../../../../..]
>
> Problem is, I can't work out a --map-by that will give this to me, short of
> binding two to each core and using a wrapper around the binary to further
> bind each to its hwthread. I thought of using a rankfile, but can't work out
> how to specify hwthreads in that.
>
> Any suggestions? Or is such a wrapper the only way to do it for now?
>
> Yes, it's very strange, and almost certainly won't perform very well, but
> I'd like to include it for completeness - I'm comparing several systems,
> including KNL (where SMT is important to get good performance in most
> cases).
>
> Thanks,
> Ben
>
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


[OMPI users] mpiexec runs only in first 3 host names

2016-08-16 Thread Madhuranga Rathnayake
I have a parallel setup of 6 identical machines with Linux Mint 18, ssh, and
Open MPI.

When I execute this:
mpiexec -np 16 --hostfile mpi-hostfile namd2 apoa1.namd > apoa1.log
with the following host file:
localhost slots=4
slave1 slots=4
slave2 slots=4
slave3 slots=4
slave4 slots=4
slave5 slots=4

it gives these errors:
ssh: Could not resolve hostname slave3: Temporary failure in name resolution
ssh: Could not resolve hostname slave4: Temporary failure in name resolution
ssh: Could not resolve hostname slave5: Temporary failure in name resolution

If I comment out slave3, slave4, and slave5 and run mpiexec -np 12, it works fine.

If I change the order, it still runs only on the first 3 host names.

Is there any limitation in Open MPI? Or any idea how to solve this?

-- 
kind regards,
-Madhuranga Rathnayake | මධුරංග රත්නායක-
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpiexec runs only in first 3 host names

2016-08-16 Thread Gilles Gouaillardet
By default, Open MPI spawns orted via ssh in a tree fashion. That
basically requires that all nodes can ssh to each other.


This is likely not your case (for example, slave2 might not be able to
ssh to slave4).



As a workaround, can you try

mpirun --mca plm_rsh_no_tree_spawn 1 ...

and see whether it fixes your problem?


Or you can simply fix name resolution (DNS, /etc/hosts, LDAP, NIS, ...)
on *all* your nodes.
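
For example, an /etc/hosts along these lines on every node should do (the
addresses below are only placeholders, use your cluster's real ones):

192.168.1.10   master
192.168.1.11   slave1
192.168.1.12   slave2
192.168.1.13   slave3
192.168.1.14   slave4
192.168.1.15   slave5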



Cheers,


Gilles


On 8/16/2016 4:45 PM, Madhuranga Rathnayake wrote:
I have a parallel setup of 6 identical machines with Linux mint 18, 
ssh and openmpi.


when i execute this,
mpiexec -np 16 --hostfile mpi-hostfile namd2 apoa1.namd > apoa1.log
with following host file
localhost slots=4
slave1 slots=4
slave2 slots=4
slave3 slots=4
slave4 slots=4
slave5 slots=4

it gives error
ssh: Could not resolve hostname slave3: Temporary failure in name 
resolution
ssh: Could not resolve hostname slave4: Temporary failure in name 
resolution
ssh: Could not resolve hostname slave5: Temporary failure in name 
resolution


and if comment slave3,4,5 and run mpiexec -np 12 it works fine

if I changed the order, then it runs with first 3 host names.

is there any limitation with openmpi? or any idea to solve this?

--
kind regards,
-Madhuranga Rathnayake | මධුරංග රත්නායක-


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpiexec runs only in first 3 host names

2016-08-16 Thread Madhuranga Rathnayake
Hi Gilles Gouaillardet,

Thank you for your kind assistance, and YES, --mca plm_rsh_no_tree_spawn 1
works fine. I think it is supposed to be slower than a normal MPI run, though.

As you mentioned, slave1 can't ssh to the others; only the master can ssh to
all the slaves. I'll fix that and check again.

Thanking you in advance,

kind regards,
PVGM

On Tue, Aug 16, 2016 at 1:44 PM, Gilles Gouaillardet 
wrote:

> By default, Open MPI spawns orted via ssh in a tree fashion. that
> basically requires all nodes can ssh to each other.
>
> this is likely not your case (for example slave2 might not be able to ssh
> slave4)
>
>
> as a workaround, can you try to
>
> mpirun --mca plm_rsh_no_tree_spawn 1 ...
>
> and see whether it fixes your problem ?
>
>
> or you can simply fix name resolution (dns, /etc/hosts, ldap, nis, ...) on
> *all* your nodes
>
>
> Cheers,
>
>
> Gilles
>
> On 8/16/2016 4:45 PM, Madhuranga Rathnayake wrote:
>
> I have a parallel setup of 6 identical machines with Linux mint 18, ssh
> and openmpi.
>
> when i execute this,
> mpiexec -np 16 --hostfile mpi-hostfile namd2 apoa1.namd > apoa1.log
> with following host file
> localhost slots=4
> slave1 slots=4
> slave2 slots=4
> slave3 slots=4
> slave4 slots=4
> slave5 slots=4
>
> it gives error
> ssh: Could not resolve hostname slave3: Temporary failure in name
> resolution
> ssh: Could not resolve hostname slave4: Temporary failure in name
> resolution
> ssh: Could not resolve hostname slave5: Temporary failure in name
> resolution
>
> and if comment slave3,4,5 and run mpiexec -np 12 it works fine
>
> if I changed the order, then it runs with first 3 host names.
>
> is there any limitation with openmpi? or any idea to solve this?
>
> --
> kind regards,
> -Madhuranga Rathnayake | මධුරංග රත්නායක-
>
>
>



-- 
Regards,
-Madhuranga Rathnayake | මධුරංග රත්නායක-
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OPENSHMEM ERROR with 2+ Distributed Machines

2016-08-16 Thread Debendra Das
As far as I understood, I have to wait for version 2.0.1 to fix the issue. So
can you please give any idea of when 2.0.1 will be released? Also, I could
not understand how to use the patch.

Thanking You,
Debendranath Das

On Mon, Aug 15, 2016 at 8:27 AM, Gilles Gouaillardet 
wrote:

> Thanks for both the report and posting the logs in a plain text file.
>
>
> i opened https://github.com/open-mpi/ompi/issues/1966 to track this issue,
>
> it contains a patch that fixes/works around this issue.
>
>
> Cheers,
>
>
> Gilles
>
> On 8/14/2016 7:39 PM, Debendra Das wrote:
>
> I have installed OpenMPI-2.0.0 on 5 systems with IP addresses 172.16.5.29,
> 172.16.5.30, 172.16.5.31, 172.16.5.32, 172.16.5.33. While executing the
> hello_oshmem_c.c program (under the examples directory), correct output is
> coming only when execution is done using 2 distributed machines. But an error
> is coming when 3 or more distributed machines are used. The outputs and the
> host file are attached. Can anybody please help me to sort out this error?
>
> Thanking You.
> Debendranath Das
>
> On Fri, Aug 12, 2016 at 7:06 PM, r...@open-mpi.org 
> wrote:
>
>> Just as a suggestion: most of us are leery of opening Word attachments on
>> mailing lists. I’d suggest sending this to us as plain text if you want us
>> to read it.
>>
>>
>> > On Aug 12, 2016, at 4:03 AM, Debendra Das 
>> wrote:
>> >
>> > I have installed OpenMPI-2.0.0 on 5 systems with IP addresses
>> > 172.16.5.29, 172.16.5.30, 172.16.5.31, 172.16.5.32, 172.16.5.33. While
>> > executing the hello_oshmem_c.c program (under the examples directory),
>> > correct output is coming only when executing is done using 2 distributed
>> > machines. But an error is coming when 3 or more distributed machines are
>> > used. The outputs and the host file are attached. Can anybody please help
>> > me to sort out this error?
>> >
>> > Thanking You.
>> > Debendranath Das
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OPENSHMEM ERROR with 2+ Distributed Machines

2016-08-16 Thread Jeff Squyres (jsquyres)
On Aug 16, 2016, at 6:09 AM, Debendra Das  wrote:
> 
> As far as I understood, I have to wait for version 2.0.1 to fix the issue. So
> can you please give any idea of when 2.0.1 will be released?

We had hoped to release it today, actually.  :-\  But there are still a few 
issues we're working out (including this one).

> Also I could not understand how to use the patch.

I think that's ok because we don't have agreement that that patch is the 
correct fix yet, anyway.  

Sorry for the delay.  :-\

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] OPENSHMEM ERROR with 2+ Distributed Machines

2016-08-16 Thread Gilles Gouaillardet
Assuming you have an InfiniBand network, another option is to install MXM
(a Mellanox proprietary but free library) and rebuild Open MPI.
pml/yalla will then be used instead of ob1, and you should be just fine.
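
A rough sketch of such a rebuild, assuming MXM is installed in the usual
MOFED location /opt/mellanox/mxm (adjust paths and the install prefix to
your setup):

$ ./configure --with-mxm=/opt/mellanox/mxm --prefix=/opt/openmpi-2.0.0
$ make -j 8 && make install
$ ompi_info | grep yalla    # the pml/yalla component should now be listed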

Cheers,

Gilles

On Tuesday, August 16, 2016, Jeff Squyres (jsquyres) 
wrote:

> On Aug 16, 2016, at 6:09 AM, Debendra Das  > wrote:
> >
> > As far as I understood, I have to wait for version 2.0.1 to fix the
> > issue. So can you please give any idea of when 2.0.1 will be released?
>
> We had hoped to release it today, actually.  :-\  But there's still a few
> issues we're working out (including this one).
>
> > Also I could not understand how to use the patch.
>
> I think that's ok because we don't have agreement that that patch is the
> correct fix yet, anyway.
>
> Sorry for the delay.  :-\
>
> --
> Jeff Squyres
> jsquy...@cisco.com 
> For corporate legal information go to: http://www.cisco.com/web/
> about/doing_business/legal/cri/
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] SGE integration broken in 2.0.0

2016-08-16 Thread Jeff Squyres (jsquyres)
On Aug 12, 2016, at 2:15 PM, Reuti  wrote:
> 
> I updated my tools to:
> 
> autoconf-2.69
> automake-1.15
> libtool-2.4.6
> 
> but I face with Open MPI's ./autogen.pl:
> 
> configure.ac:152: error: possibly undefined macro: AC_PROG_LIBTOOL
> 
> I recall seeing it already before; how do I get rid of it? For now I fixed the
> single source file just by hand.

This means your Autotools installation isn't correct.  A common mistake that 
I've seen people do is install Autoconf, Automake, and Libtool in separate 
prefixes (vs. installing all 3 into a single prefix).  Another common mistake 
is accidentally using the wrong autoconf, automake, and/or libtool (e.g., using 
2 out of the 3 from your new/correct install, but accidentally using a 
system-level install for the 3rd).
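
For example, a sketch of building all three into a single prefix and making
sure that prefix wins on the PATH (the prefix path here is just an example):

$ export AT_PREFIX=$HOME/local/autotools
$ (cd autoconf-2.69 && ./configure --prefix=$AT_PREFIX && make install)
$ export PATH=$AT_PREFIX/bin:$PATH
$ (cd automake-1.15 && ./configure --prefix=$AT_PREFIX && make install)
$ (cd libtool-2.4.6 && ./configure --prefix=$AT_PREFIX && make install)
$ which autoconf automake libtool   # all three should resolve under $AT_PREFIX/bin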

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] An equivalent to btl_openib_include_if when MXM over Infiniband ?

2016-08-16 Thread Audet, Martin
Hi Josh,

Thanks for your reply. I did try setting MXM_RDMA_PORTS=mlx4_0:1 for all my MPI
processes, and it did improve performance, but the performance I obtain still
isn't completely satisfying.

When I use the IMB 4.1 pingpong and sendrecv benchmarks between two nodes with
Open MPI 1.10.3, I get:

 without MXM_RDMA_PORTS

   comm     lat_min   bw_max    bw_max
            pingpong  pingpong  sendrecv
            (us)      (MB/s)    (MB/s)
   -------------------------------------
   openib   1.79      5947.07   11534
   mxm      2.51      5166.96   8079.18
   yalla    2.47      5167.29   8278.15


 with MXM_RDMA_PORTS=mlx4_0:1

   comm     lat_min   bw_max    bw_max
            pingpong  pingpong  sendrecv
            (us)      (MB/s)    (MB/s)
   -------------------------------------
   openib   1.79      5827.93   11552.4
   mxm      2.23      5191.77   8201.76
   yalla    2.18      5200.55   8109.48


openib means: pml=ob1        btl=openib,vader,self   btl_openib_include_if=mlx4_0
mxm    means: pml=cm,ob1     mtl=mxm                 btl=vader,self
yalla  means: pml=yalla,ob1  btl=vader,self
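
For reference, the mpirun command lines behind those three configurations
were roughly as follows (the IMB-MPI1 binary name and host names here are
representative, not the exact ones used):

openib:  mpirun -np 2 -H node1,node2 --mca pml ob1 --mca btl openib,vader,self \
             --mca btl_openib_include_if mlx4_0 ./IMB-MPI1 PingPong Sendrecv
mxm:     mpirun -np 2 -H node1,node2 --mca pml cm,ob1 --mca mtl mxm \
             --mca btl vader,self ./IMB-MPI1 PingPong Sendrecv
yalla:   mpirun -np 2 -H node1,node2 --mca pml yalla,ob1 --mca btl vader,self \
             ./IMB-MPI1 PingPong Sendrecv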

lspci reports for our FDR InfiniBand HCA:
  InfiniBand controller: Mellanox Technologies MT27500 Family [ConnectX-3]

and 16 lines like:
  InfiniBand controller: Mellanox Technologies MT27500/MT27520 Family
  [ConnectX-3/ConnectX-3 Pro Virtual Function]

The nodes use two octa-core Xeon E5-2650v2 Ivybridge-EP 2.67 GHz sockets.

ofed_info reports that mxm version is 3.4.3cce223-0.32200

As you can see, the results are not very good. I would expect mxm and yalla to
perform better than openib both in terms of latency and bandwidth (note: the
sendrecv bandwidth is full duplex). In particular, I would expect the yalla
latency to be around 1.1 us, as shown here:
https://www.open-mpi.org/papers/sc-2014/Open-MPI-SC14-BOF.pdf (page 33).

I also ran mxm_perftest (located in /opt/mellanox/bin) and it reports the 
following
latency between two nodes:

   without MXM_RDMA_PORTS              1.92 us
   with    MXM_RDMA_PORTS=mlx4_0:1     1.65 us

Again, I think we can expect better latency with our configuration; 1.65 us is
not a very good result.

Note however that the 0.27 us (1.92 - 1.65 = 0.27) reduction in raw mxm
latency corresponds to the latency reductions observed above with Open MPI for
mxm (2.51 - 2.23 = 0.28) and yalla (2.47 - 2.18 = 0.29).

Another detail: everything is run inside LXC containers. Also SR-IOV is 
probably used.

Does anyone have any idea what's wrong with our cluster?

Martin Audet


> Hi, Martin
>
> The environment variable:
>
> MXM_RDMA_PORTS=device:port
>
> is what you're looking for. You can specify a device/port pair on your OMPI
> command line like:
>
> mpirun -np 2 ... -x MXM_RDMA_PORTS=mlx4_0:1 ...
>
>
> Best,
>
> Josh

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] SGE integration broken in 2.0.0

2016-08-16 Thread Reuti

On 16.08.2016 at 13:26, Jeff Squyres (jsquyres) wrote:

> On Aug 12, 2016, at 2:15 PM, Reuti  wrote:
>> 
>> I updated my tools to:
>> 
>> autoconf-2.69
>> automake-1.15
>> libtool-2.4.6
>> 
>> but I face with Open MPI's ./autogen.pl:
>> 
>> configure.ac:152: error: possibly undefined macro: AC_PROG_LIBTOOL
>> 
>> I recall seeing it already before; how do I get rid of it? For now I fixed the
>> single source file just by hand.
> 
> This means your Autotools installation isn't correct.  A common mistake that 
> I've seen people do is install Autoconf, Automake, and Libtool in separate 
> prefixes (vs. installing all 3 into a single prefix).

Thx a bunch - that was it. Despite searching for a solution I found only hints 
that didn't solve the issue.

-- Reuti


>  Another common mistake is accidentally using the wrong autoconf, automake, 
> and/or libtool (e.g., using 2 out of the 3 from your new/correct install, but 
> accidentally using a system-level install for the 3rd).
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] SGE integration broken in 2.0.0

2016-08-16 Thread Jeff Squyres (jsquyres)
On Aug 16, 2016, at 3:07 PM, Reuti  wrote:
> 
> Thx a bunch - that was it. Despite searching for a solution I found only 
> hints that didn't solve the issue.

FWIW, we talk about this in the HACKING file, but I admit that's not 
necessarily the easiest place to find:

https://github.com/open-mpi/ompi/blob/master/HACKING#L126-L129

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users