[OMPI users] Behavior of `ompi_info`

2017-04-25 Thread Reuti
Hi, If Open MPI is moved to a location other than the one it was originally installed into, one has to export OPAL_PREFIX. While checking for the availability of the GridEngine integration, I exported OPAL_PREFIX, but apparently with a typo, and came to the conclusion that it's not available, as I
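
For context, relocating an Open MPI installation along those lines would look roughly like the sketch below; the install path is hypothetical, and the grep simply checks whether the gridengine component shows up in the ompi_info output.

    # hypothetical new location of the relocated Open MPI tree
    export OPAL_PREFIX=/opt/openmpi-relocated
    export PATH=$OPAL_PREFIX/bin:$PATH
    export LD_LIBRARY_PATH=$OPAL_PREFIX/lib:$LD_LIBRARY_PATH
    # check whether the GridEngine integration was built in
    ompi_info | grep gridengine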

[OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
Hi, just testing the 3.x branch... I launch: mpirun -n 8 echo "hello" and I get: -- There are not enough slots available in the system to satisfy the 8 slots that were requested by the application: echo Either request f

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
What is in your hostfile? > On Apr 25, 2017, at 11:39 AM, Eric Chamberland > wrote: > > Hi, > > just testing the 3.x branch... I launch: > > mpirun -n 8 echo "hello" > > and I get: > > -- > There are not enough slots a

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
Hi, the host file has been constructed automatically by the configuration+installation process and seems to contain only comments and a blank line: (15:53:50) [zorg]:~> cat /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile # # Copyright (c) 2004-2005 The Trustees of Indiana University and I

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
Okay - so effectively you have no hostfile, and no allocation. So this is running just on the one node where mpirun exists? Add “-mca ras_base_verbose 10 --display-allocation” to your cmd line and let’s see what it found > On Apr 25, 2017, at 12:56 PM, Eric Chamberland > wrote: > > Hi, > >

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
Ok, here it is: === first, with -n 8: === mpirun -mca ras_base_verbose 10 --display-allocation -n 8 echo "Hello" [zorg:22429] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL [zorg:22429] plm:base:set_hnp_name: initial bias 22429 nodename hash 810

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread George Bosilca
I confirm a similar issue in a more managed environment. I have a hostfile that has worked for the last few years and that spans a small cluster (30 nodes of 8 cores each). Trying to spawn any number of processes across P nodes fails if the number of processes is larger than P (despite the fac

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread George Bosilca
Just to be clear, the hostfile contains the correct info: dancer00 slots=8 dancer01 slots=8 The output regarding the 2 nodes (dancer00 and dancer01) is clearly wrong. George. On Tue, Apr 25, 2017 at 4:32 PM, George Bosilca wrote: > I confirm a similar issue on a more managed environment.
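
Laid out as a file, the hostfile George quotes would presumably look like the sketch below (the remaining nodes of the cluster elided); the launch line is only an illustration with a placeholder executable.

    # hostfile: one line per node, slots = cores available there
    dancer00 slots=8
    dancer01 slots=8

    # hypothetical launch using it
    mpirun --hostfile myhosts -n 16 ./a.out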

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
Okay, so what’s happening is that we are auto-detecting only 4 cores on that box, and since you didn’t provide any further info, we set the #slots = #cores. If you want to run more than that, you can either tell us a number of slots to use (e.g., -host mybox:32) or add --oversubscribe to the cmd
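
A minimal sketch of the two alternatives mentioned here, with the hostname and the 8-rank job as placeholders:

    # explicitly declare 32 slots on the node
    mpirun -host mybox:32 -n 8 echo "Hello"

    # or allow more ranks than detected cores
    mpirun --oversubscribe -n 8 echo "Hello"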

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
Oh, I forgot something important: since OpenMPI 1.8.x I have been using: export OMPI_MCA_hwloc_base_binding_policy=none Also, I have been exporting this since 1.6.x (I think): export OMPI_MCA_mpi_yield_when_idle=1 Eric On 25/04/17 04:31 PM, Eric Chamberland wrote: Ok, here it is: === first, with -
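
For reference, these environment variables are just MCA parameters, so the same settings could presumably also be passed on the mpirun command line (the executable here is a placeholder):

    mpirun --mca hwloc_base_binding_policy none --mca mpi_yield_when_idle 1 -n 8 ./a.out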

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
Sigh - that _is_ the requested behavior. The -host option defaults to indicating only one slot should be used on that node. > On Apr 25, 2017, at 1:32 PM, George Bosilca wrote: > > I confirm a similar issue on a more managed environment. I have an hostfile > that worked for the last few years,

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
I suspect it read the file just fine - what you are seeing in the output is a reflection of the community’s design decision that only one slot would be allocated for each time a node is listed in -host. This is why they added the :N modifier so you can specify the #slots to use in lieu of writin
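
To illustrate the difference, assuming the dancer nodes from above and a placeholder executable:

    # default: each -host entry contributes exactly one slot (2 slots total)
    mpirun -host dancer00,dancer01 -n 2 ./a.out

    # with the :N modifier each entry contributes N slots (16 slots total)
    mpirun -host dancer00:8,dancer01:8 -n 16 ./a.out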

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
On 25/04/17 04:36 PM, r...@open-mpi.org wrote: add --oversubscribe to the cmd line good, it works! :) Is there an environment variable equivalent to the --oversubscribe argument? I can't find this option in the closely related FAQ entries; should it be added here?: https://www.open-mpi.org/faq/?cat

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread George Bosilca
Thanks Ralph, Indeed, if I add :8 I get back the expected behavior. I can cope with this (I don't usually restrict my runs to a subset of the nodes). George. On Tue, Apr 25, 2017 at 4:53 PM, r...@open-mpi.org wrote: > I suspect it read the file just fine - what you are seeing in the output

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
If it helps, I believe I added the ability to just use ‘:*’ to indicate “take them all” so you don’t have to remember the number. > On Apr 25, 2017, at 2:13 PM, George Bosilca wrote: > > Thanks Ralph, > > Indeed, if I add :8 I get back the expected behavior. I can cope with this (I > don't us
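
If that ':*' form is indeed available in the build at hand (Ralph himself hedges on this), the usage would presumably be:

    # ':*' asks for all slots the node is known to have, no count needed
    mpirun -host "dancer00:*,dancer01:*" -n 16 ./a.out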

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
Sure - there is always an MCA param for everything: OMPI_MCA_rmaps_base_oversubscribe=1 > On Apr 25, 2017, at 2:10 PM, Eric Chamberland > wrote: > > On 25/04/17 04:36 PM, r...@open-mpi.org wrote: >> add --oversubscribe to the cmd line > > good, it works! :) > > Is there an environment varia
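
As a sketch, setting that parameter in the environment before launching should have the same effect as passing --oversubscribe on the command line:

    export OMPI_MCA_rmaps_base_oversubscribe=1
    mpirun -n 8 echo "Hello"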