Looping in the users mailing list so that Ralph and Oracle can comment...

On Jul 14, 2011, at 2:34 PM, Rayson Ho wrote:

> (CC'ing Jeff from the Open-MPI project...)
>
> On Thu, Jul 14, 2011 at 1:35 PM, Tad Kollar <tad.kol...@gmail.com> wrote:
>> As I thought more about it, I was afraid that might be the case, but hoped
>> sge_shepherd would do some magic for tightly-integrated jobs.
>
> To SGE, if each of the tasks is not started by sge_shepherd, then the
> only option is to set the binding mask to the allocation, which in
> your original case, was the whole system (48 CPUs).
>
>
>> We're running OpenMPI 1.5.3 if that makes a difference. Do you know of
>> anyone using an MVAPICH2 1.6 pe that can handle binding?
>
> I just downloaded Open MPI 1.5.4a and grep'ed the source, looks like
> it is not looking at the SGE_BINDING env variable that is set by SGE.
>
>
>> The serial case worked (its affinity list was '0' instead of '0-47'), so at
>> least we know that's in good shape :-)
>
> Please also submit a few more jobs and see if the new hwloc code is
> able to handle multiple jobs running on your AMD MC server.
>
>
>> My ultimate goal is for affinity support to be enabled and scheduled
>> automatically for all MPI users, i.e. without them having to do any more
>> than they would for a no-affinity job (otherwise I have a feeling most of
>> them would just ignore it). What do you think it will take to get to that
>> point?
>
> That's my goal since 2008...
>
> I started a mail thread, "processor affinity -- OpenMPI / batchsystem
> integration" to the Open MPI list in 2008. And in 2009, the conclusion
> was that Sun was saying that the binding info is set in the
> environment and Open MPI would perform the binding itself (so I
> assumed that was done):
>
> http://www.open-mpi.org/community/lists/users/2009/10/10938.php
>
> Revisiting the presentation (see: job2core.pdf link at the above URL),
> Sun's variable name is $SUNW_MP_BIND, so it is most likely Sun Cluster
> Toolkit implementation specific rather than a feature in Open MPI --
> and looking at the Open MPI code I don't see SUNW_MP_BIND referenced
> anywhere.
>
> I believe it is a matter of integrating the thread binding support
> between the 2 -- both SGE & Open MPI support thread binding. The
> harder part is to handle cross node binding as SGE binds threads
> locally only (not directly controlled by qmaster) -- may be a call to
> "qstat -cb -j <job id>" would do the trick, and the info is parsed and
> passed to mpirun via the "--rankfile" option.
>
> http://www.open-mpi.org/faq/?category=tuning#using-paffinity-v1.4
>
> Rayson
>
>
>
>> Thanks!
>> Tad
>>

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
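
For reference, the rankfile idea Rayson outlines above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name build_rankfile, the host names, and the core numbers are all hypothetical, and the step that would obtain the per-rank bindings from "qstat -cb -j <job id>" is deliberately left out because the format of that output is not shown in this thread. The sketch only covers the last step, turning an already-known list of (host, core) pairs into "rank N=host slot=C" lines of the kind described in the Open MPI FAQ linked above.

```python
def build_rankfile(bindings):
    """Return rankfile text for a list of (host, core) pairs.

    `bindings` is assumed to hold one entry per MPI rank, in rank order.
    How that list is obtained from the batch system is NOT covered here.
    """
    lines = []
    for rank, (host, core) in enumerate(bindings):
        # One "rank N=host slot=C" line per MPI process.
        lines.append(f"rank {rank}={host} slot={core}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical allocation: two ranks on node01, one on node02.
    example = [("node01", 0), ("node01", 1), ("node02", 0)]
    print(build_rankfile(example), end="")
```

The resulting text would be written to a file and handed to mpirun through the "--rankfile" option. How the slot numbers are interpreted may depend on the Open MPI version, so the output should be checked against the FAQ entry for the release in use.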