I'm afraid it is too late for 1.7.4, as I have locked that down barring any last-second smoke test failures. I'll give this some thought for 1.7.5, but I'm a little leery of the proposed change. The problem is that ppr comes in through a different MCA param than the "map-by" param, and hence we can indeed get conflicts that we cannot resolve.
This is one of those issues that I need to clean up in general. We've deprecated a number of params due to similar problems; the "ppr" policy is the last one on the list. It needs to somehow be folded into the "map-by" param, though it also influences the number of procs (unlike the other map-by directives).

On Jan 27, 2014, at 7:46 PM, tmish...@jcity.maeda.co.jp wrote:

> Hi Ralph, it seems you are rounding the final turn to release 1.7.4!
> I hope this will be my final request for openmpi-1.7.4 as well.
>
> I mostly use rr_mapper but sometimes use ppr_mapper. I have a simple
> request to ask you to improve its usability. Namely, I propose to
> remove the redefining-policy check in rmaps_ppr_component.c
> (lines 130-138):
>
> 130    if (ORTE_MAPPING_GIVEN & ORTE_GET_MAPPING_DIRECTIVE(orte_rmaps_base.mapping)) {
> 131        /* if a non-default mapping is already specified, then we
> 132         * have an error
> 133         */
> 134        orte_show_help("help-orte-rmaps-base.txt", "redefining-policy", true, "mapping",
> 135                       "PPR", orte_rmaps_base_print_mapping(orte_rmaps_base.mapping));
> 136        ORTE_SET_MAPPING_DIRECTIVE(orte_rmaps_base.mapping, ORTE_MAPPING_CONFLICTED);
> 137        return ORTE_ERR_SILENT;
> 138    }
>
> The reasons are as follows:
>
> 1) The final mapper to be used should be selected by the priority set
> by the system or an MCA param. The ppr_priority is fixed at 90, and
> the rr_priority can be set by an MCA param (default = 10).
>
> 2) If we set "rmaps_base_mapping_policy = something" in
> mca-params.conf, the -ppr option is always refused by this check, as
> below:
>
> [mishima@manage demos]$ mpirun -np 2 -ppr 1:socket ~/mis/openmpi/demos/myprog
> --------------------------------------------------------------------------
> Conflicting directives for mapping policy are causing the policy
> to be redefined:
>
>   New policy:   PPR
>   Prior policy: BYSOCKET
>
> Please check that only one policy is defined.
>
> 3) This fix does not seem to affect any other behavior as far as
> I confirmed.
> Regards,
> Tetsuya Mishima
>
>> Kewl - thanks!
>>
>> On Jan 27, 2014, at 4:08 PM, tmish...@jcity.maeda.co.jp wrote:
>>
>>> Thanks, Ralph. I quickly checked the fix. It worked fine for me.
>>>
>>> Tetsuya Mishima
>>>
>>>> I fixed that in today's final cleanup
>>>>
>>>> On Jan 27, 2014, at 3:17 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>
>>>> As for the NEWS - it is actually already correct. We default to map-by
>>>> core, not slot, as of 1.7.4.
>>>>
>>>> Is that correct? As far as I can see in the source code, map-by slot
>>>> is used if np <= 2:
>>>>
>>>> [mishima@manage openmpi-1.7.4rc2r30425]$ cat -n orte/mca/rmaps/base/rmaps_base_map_job.c
>>>> ...
>>>> 107    /* default based on number of procs */
>>>> 108    if (nprocs <= 2) {
>>>> 109        opal_output_verbose(5, orte_rmaps_base_framework.framework_output,
>>>> 110                            "mca:rmaps mapping not given - using byslot");
>>>> 111        ORTE_SET_MAPPING_POLICY(map->mapping, ORTE_MAPPING_BYSLOT);
>>>> 112    } else {
>>>> 113        opal_output_verbose(5, orte_rmaps_base_framework.framework_output,
>>>> 114                            "mca:rmaps mapping not given - using bysocket");
>>>> 115        ORTE_SET_MAPPING_POLICY(map->mapping, ORTE_MAPPING_BYSOCKET);
>>>> 116    }
>>>>
>>>> Regards,
>>>> Tetsuya Mishima
>>>>
>>>> On Jan 26, 2014, at 3:02 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>
>>>> Hi Ralph,
>>>>
>>>> I tried the latest nightly snapshot, openmpi-1.7.4rc2r30425.tar.gz.
>>>> Almost everything works fine, except that an unexpected output appears
>>>> as below:
>>>>
>>>> [mishima@node04 ~]$ mpirun -cpus-per-proc 4 ~/mis/openmpi/demos/myprog
>>>> App launch reported: 3 (out of 3) daemons - 8 (out of 12) procs
>>>> ...
>>>>
>>>> You dropped the if-statement checking "orte_report_launch_progress" in
>>>> plm_base_receive.c @ r30423, which causes the problem.
>>>>
>>>> --- orte/mca/plm/base/plm_base_receive.c.org	2014-01-25 11:51:59.000000000 +0900
>>>> +++ orte/mca/plm/base/plm_base_receive.c	2014-01-26 12:20:10.000000000 +0900
>>>> @@ -315,9 +315,11 @@
>>>>              /* record that we heard back from a daemon during app launch */
>>>>              if (running && NULL != jdata) {
>>>>                  jdata->num_daemons_reported++;
>>>> -                if (0 == jdata->num_daemons_reported % 100 ||
>>>> -                    jdata->num_daemons_reported == orte_process_info.num_procs) {
>>>> -                    ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_REPORT_PROGRESS);
>>>> +                if (orte_report_launch_progress) {
>>>> +                    if (0 == jdata->num_daemons_reported % 100 ||
>>>> +                        jdata->num_daemons_reported == orte_process_info.num_procs) {
>>>> +                        ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_REPORT_PROGRESS);
>>>> +                    }
>>>>                  }
>>>>              }
>>>>              /* prepare for next job */
>>>>
>>>> Regards,
>>>> Tetsuya Mishima
>>>>
>>>> P.S. It's also better to change line 65 in NEWS:
>>>>
>>>> ...
>>>> 64 * Mapping:
>>>> 65 *   if #procs <= 2, default to map-by core -> map-by slot
>>>>                                   ^^^^^^^^^^^
>>>> 66 *   if #procs > 2, default to map-by socket
>>>> ...
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users