Guess I disagree - it isn't a question of what the code can handle, but rather 
one of user expectation. If you specify a definite number of cores for each 
process, then we have to bind to core in order to meet that directive. Binding 
to numa won't do it, as the OS will continue to schedule the proc on only one 
core at a time.
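
For illustration (the program name here is just a placeholder):

  mpirun -np 2 -map-by slot:pe=4 -bind-to core -report-bindings ./myprog

gives each rank four dedicated cores, whereas

  mpirun -np 2 -map-by slot:pe=4 -bind-to numa -report-bindings ./myprog

would only confine each rank to a numa domain, leaving the OS free to move it 
among the cores there - so the pe=4 directive wouldn't really be met.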

So I think the current behavior is correct.
Ralph

On Feb 11, 2014, at 7:13 PM, Tetsuya Mishima <tmish...@jcity.maeda.co.jp> wrote:

> Your fix worked for me, thanks.
> 
> By the way, I noticed that "bind-to obj" is forcibly overridden by "bind-to 
> core" when pe=N is specified.
> This is just my opinion, but I think it's too conservative and a kind of 
> regression from openmpi-1.6.5. For example, "-map-by slot:pe=N -bind-to numa" 
> looks acceptable to me. Your round_robin_mapper is now robust enough to handle it. 
> The patch below would be better.  Please give it a try.
> 
> --- orte/mca/rmaps/base/rmaps_base_frame.c.org        2014-02-11 17:34:36.000000000 +0900
> +++ orte/mca/rmaps/base/rmaps_base_frame.c    2014-02-12 11:01:42.000000000 +0900
> @@ -393,13 +393,13 @@
>           * bind to those cpus - any other binding policy is an
>           * error
>           */
> -        if (!(OPAL_BIND_GIVEN & OPAL_GET_BINDING_POLICY(opal_hwloc_binding_policy))) {
> +        if (OPAL_BIND_TO_NONE == OPAL_GET_BINDING_POLICY(opal_hwloc_binding_policy)) {
>              if (opal_hwloc_use_hwthreads_as_cpus) {
>                  OPAL_SET_BINDING_POLICY(opal_hwloc_binding_policy, OPAL_BIND_TO_HWTHREAD);
>              } else {
>                  OPAL_SET_BINDING_POLICY(opal_hwloc_binding_policy, OPAL_BIND_TO_CORE);
>              }
> -        } else {
> +        } else if (OPAL_BIND_TO_L1CACHE < OPAL_GET_BINDING_POLICY(opal_hwloc_binding_policy)) {
>              if (opal_hwloc_use_hwthreads_as_cpus) {
>                  if (OPAL_BIND_TO_HWTHREAD != OPAL_GET_BINDING_POLICY(opal_hwloc_binding_policy)) {
>                      orte_show_help("help-orte-rmaps-base.txt", "mismatch-binding", true,
> Regards,
> Tetsuya Mishima
> 
>> Okay, I fixed it. Keep getting caught by a very, very unfortunate design 
>> flaw in hwloc that forces you to treat caches as a special case that 
>> requires you to call a different function. So you have to constantly 
>> protect function calls into hwloc with "if cache, call this one - else, 
>> call that one". REALLY irritating, and it caught us again here.
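>> 
>> To make that concrete, here's a minimal sketch of the pattern (assuming the 
>> hwloc 1.x API; the helper name is made up and this is not the actual ORTE 
>> code):
>> 
>>     #include <hwloc.h>
>> 
>>     /* a cache level can't be looked up by type alone in hwloc 1.x
>>      * (hwloc_get_type_depth() reports HWLOC_TYPE_DEPTH_MULTIPLE), so it
>>      * needs a dedicated helper; every other object type can be resolved
>>      * directly */
>>     static int get_obj_depth(hwloc_topology_t topo, hwloc_obj_type_t type,
>>                              unsigned cache_level)
>>     {
>>         if (HWLOC_OBJ_CACHE == type) {
>>             /* e.g. cache_level = 3 for L3; casting -1 ignores the cache kind */
>>             return hwloc_get_cache_type_depth(topo, cache_level,
>>                                               (hwloc_obj_cache_type_t) -1);
>>         }
>>         return hwloc_get_type_depth(topo, type);
>>     }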
>> 
>> Should be fixed in trunk now, set to go over to 1.7.5
>> 
>> Thanks
>> Ralph
>> 
>> On Feb 11, 2014, at 4:47 PM, tmish...@jcity.maeda.co.jp wrote:
>> 
>>> 
>>> Hi Ralph,
>>> 
>>> Since ticket #4240 has already been marked as fixed, I'm sending this
>>> email to you. (I don't know whether I can add comments to a fixed ticket.)
>>> 
>>> When I tried to bind the process to l3cache, it didn't work, as shown below:
>>> (the host manage has a normal topology - not inverted)
>>> 
>>> [mishima@manage openmpi-1.7.4]$ mpirun -np 2 -bind-to l3cache
>>> -report-bindings ~/mis/openmpi/demos/myprog
>>> --------------------------------------------------------------------------
>>> No objects of the specified type were found on at least one node:
>>> 
>>> Type: Cache
>>> Node: manage
>>> 
>>> The map cannot be done as specified.
>>> --------------------------------------------------------------------------
>>> 
>>> "-bind-to l1cache/l2cahce" doesn't work as well. At least, I confirmed that
>>> the openmpi-1.7.4 works with "-bind-to l3cache".
>>> 
>>> Regards,
>>> Tetsuya Mishima
>>> 
> 
> ----
> Tetsuya Mishima  tmish...@jcity.maeda.co.jp
