Hello Ralph,

Thank you for your comments.

My understanding, from reading Jeff's blog on V1.5 processor affinity, is that the
bindings in Example 1 balance the load better than the bindings in Example 2.

Therefore I would like to obtain the bindings in Example 1, but using Open MPI 2.1.1
and application context files.
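
That is, with a two-line appfile like the one in Example 2 below:

    -np 1 afftest01.exe
    -np 1 afftest01.exe

I would like mpirun to report bindings like those in Example 1:

    ...MCW rank 0 bound to socket 0 ... : [B/B/B/B][./././.]
    ...MCW rank 1 bound to socket 1 ... : [./././.][B/B/B/B]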

How can I do this?

Sincerely,

Ted Sussman

On 29 Jun 2017 at 19:09, r...@open-mpi.org wrote:

> 
> It's a difficult call to make as to which is the correct behavior. In Example 1,
> you are executing a single app_context that has two procs in it. In Example 2,
> you are executing two app_contexts, each with a single proc in it.
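> 
> In mpirun terms the two cases look roughly like this (a sketch, reusing the
> executable name from your examples; the colon-separated form should be
> equivalent to the two-line appfile, with each segment forming its own
> app_context):
> 
>     # one app_context containing two procs
>     mpirun --report-bindings -map-by socket -np 2 afftest01.exe
> 
>     # two app_contexts, each containing one proc
>     mpirun --report-bindings -map-by socket -np 1 afftest01.exe : -np 1 afftest01.exe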
> 
> Now some people say that the two should be treated the same, with the second
> app_context in Example 2 being mapped starting from the end of the first
> app_context. In this model, a comm_spawn would also start from the end of the
> earlier app_context, and thus the new proc would not be on the same node (or
> socket, in this case) as its parent.
> 
> Other people argue for the opposite behavior - that each app_context should
> start from the first available slot in the allocation. In that model, a
> comm_spawn would result in the first child occupying the same node (or socket)
> as its parent, assuming an available slot.
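> 
> Concretely, for your 2-socket/4-cores-per-socket case with the two-line appfile,
> the two models give different answers (a sketch, using the binding notation from
> your --report-bindings output below):
> 
>     # first model: second app_context continues where the first left off
>     rank 0 -> socket 0   [B/B/B/B][./././.]
>     rank 1 -> socket 1   [./././.][B/B/B/B]
> 
>     # second model: each app_context restarts from the first available slot
>     rank 0 -> socket 0   [B/B/B/B][./././.]
>     rank 1 -> socket 0   [B/B/B/B][./././.]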
> 
> We've bounced around a bit on the behavior over the years as different groups
> voiced their opinions. OMPI 1.4.3 is _very_ old and fell in the prior camp,
> while 2.1.1 is just released and is in the second camp. I honestly don't recall
> where the change occurred, or even how consistent we have necessarily been over
> the years. It isn't something that people raise very often.
> 
> I've pretty much resolved to leave the default behavior as it currently sits,
> but plan to add an option to support the alternative behavior, as there seems
> to be no clear-cut consensus in the user community for this behavior. Not sure
> when I'll get to it - definitely not for the 2.x series, and maybe not for 3.x,
> since that is about to be released.
> 
>     On Jun 29, 2017, at 11:24 AM, Ted Sussman <ted.suss...@adina.com> wrote:
> 
>     Hello all,
> 
>     Today I have a problem with the --map-by socket feature of Open MPI 2.1.1
>     when used with application context files.
> 
>     In the examples below, I am testing on a 2-socket computer, each socket
>     with 4 cores.
> 
>     ---
> 
>     Example 1:
> 
>     .../openmpi-2.1.1/bin/mpirun --report-bindings \
>                 -map-by socket \
>                 -np 2 \
>                 afftest01.exe
> 
>     returns
> 
>     ...MCW rank 0 bound to socket 0 ... : [B/B/B/B][./././.]
>     ...MCW rank 1 bound to socket 1 ... : [./././.][B/B/B/B]
> 
>     which is what I would expect.
> 
>     ---
> 
>     Example 2:
> 
>     Create appfile as:
> 
>     -np 1 afftest01.exe
>     -np 1 afftest01.exe
> 
>     Then
> 
>     .../openmpi-2.1.1/bin/mpirun --report-bindings \
>                 -map-by socket \
>                 -app appfile
> 
>     returns
> 
>     ...MCW rank 0 bound to socket 0 ... : [B/B/B/B][./././.]
>     ...MCW rank 1 bound to socket 0 ... : [B/B/B/B][./././.]
> 
>     which is not what I expect. I expect the same bindings as in Example 1.
> 
>     ---
> 
>     Example 3:
> 
>     Using the same appfile as in Example 2,
> 
>     .../openmpi-1.4.3/bin/mpirun --report-bindings \
>                 -bysocket --bind-to-core  \
>                 -app appfile
> 
>     returns
> 
>     ... odls:default:fork binding child ... to socket 0 cpus 0002
>     ... odls:default:fork binding child ... to socket 1 cpus 0001
> 
>     which is what I would expect. Here I use --bind-to-core just to get the
>     bindings printed.
> 
>     ---
> 
>     The examples show that the --map-by socket feature does not work as
>     expected when application context files are used. However, the older
>     -bysocket feature worked as expected with application context files in
>     Open MPI 1.4.3.
> 
>     If I am using the wrong syntax in Example 2, please let me know.
> 
>     Sincerely,
> 
>     Ted Sussman
> 



_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
