A quick look at the code seems to confirm my feeling. get/set_module() callbacks manipulate arrays of logical indexes, and they do not convert them back to physical indexes before binding.
Here's a quick patch that may help. Only compile tested...

Brice


On 11/04/2012 09:49, Brice Goglin wrote:
> On 11/04/2012 09:06, tmish...@jcity.maeda.co.jp wrote:
>> Hi, Brice.
>>
>> I installed the latest hwloc-1.4.1.
>> Here is the output of lstopo -p.
>>
>> [root@node03 bin]# ./lstopo -p
>> Machine (126GB)
>>   Socket P#0 (32GB)
>>     NUMANode P#0 (16GB) + L3 (5118KB)
>>       L2 (512KB) + L1 (64KB) + Core P#0 + PU P#0
>>       L2 (512KB) + L1 (64KB) + Core P#1 + PU P#4
>>       L2 (512KB) + L1 (64KB) + Core P#2 + PU P#8
>>       L2 (512KB) + L1 (64KB) + Core P#3 + PU P#12
> Ok then the cpuset of this numanode is 1111.
>
>>> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
>>> [[55518,1],0] to cpus 1111
> So openmpi 1.5.4 is correct.
>
>>> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
>>> [[40566,1],0] to cpus 000f
> And openmpi 1.5.5 is indeed wrong.
>
> Random guess: 000f is the bitmask made of hwloc *logical* indexes. hwloc
> cpusets (used for binding) are internally made of hwloc *physical*
> indexes (1111 here).
>
> Jeff, Ralph:
> How does OMPI 1.5.5 build hwloc cpusets for binding? Are you doing
> bitmap operations on hwloc object cpusets?
> If yes, I don't know what's going wrong here.
> If no, are you building hwloc cpusets manually by setting individual
> bits from object indexes? If yes, you must use *physical* indexes to do so.
>
> Brice
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--- opal/mca/paffinity/hwloc/paffinity_hwloc_module.c.old	2012-04-11 10:19:36.766710073 +0200
+++ opal/mca/paffinity/hwloc/paffinity_hwloc_module.c	2012-04-11 10:32:07.398696734 +0200
@@ -167,6 +167,7 @@
     int i, ret = OPAL_SUCCESS;
     hwloc_bitmap_t set;
     hwloc_topology_t *t;
+    hwloc_obj_t pu;
 
     /* bozo check */
     if (NULL == opal_hwloc_topology) {
@@ -178,10 +179,11 @@
     if (NULL == set) {
         return OPAL_ERR_OUT_OF_RESOURCE;
     }
-    hwloc_bitmap_zero(set);
-    for (i = 0; ((unsigned int) i) < OPAL_PAFFINITY_BITMASK_CPU_MAX; ++i) {
+    for (i = 0, pu = hwloc_get_obj_by_type(*t, HWLOC_OBJ_PU, 0);
+         ((unsigned int) i) < OPAL_PAFFINITY_BITMASK_CPU_MAX;
+         ++i, pu = pu->next_cousin) {
         if (OPAL_PAFFINITY_CPU_ISSET(i, mask)) {
-            hwloc_bitmap_set(set, i);
+            hwloc_bitmap_set(set, pu->os_index);
         }
     }
 
@@ -199,6 +201,7 @@
     int i, ret = OPAL_SUCCESS;
     hwloc_bitmap_t set;
     hwloc_topology_t *t;
+    hwloc_obj_t pu;
 
     /* bozo check */
     if (NULL == opal_hwloc_topology) {
@@ -218,8 +221,10 @@
             ret = OPAL_ERR_IN_ERRNO;
         } else {
             OPAL_PAFFINITY_CPU_ZERO(*mask);
-            for (i = 0; ((unsigned int) i) < 8 * sizeof(*mask); i++) {
-                if (hwloc_bitmap_isset(set, i)) {
+            for (i = 0, pu = hwloc_get_obj_by_type(*t, HWLOC_OBJ_PU, 0);
+                 ((unsigned int) i) < 8 * sizeof(*mask);
+                 i++, pu = pu->next_cousin) {
+                if (hwloc_bitmap_isset(set, pu->os_index)) {
                     OPAL_PAFFINITY_CPU_SET(i, *mask);
                 }
             }