Yves,

In Open MPI you have very fine-grained control over how ranks are bound to cores. For more information, please refer to the FAQ entry describing rankfiles (in a rankfile you can specify precisely which rank goes on which physical PU). For a more single-shot option, you can use the --slot-list option together with the -nperproc option to specify the order in which your ranks are deployed on physical PUs.
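As a rough sketch of the rankfile approach (the host names, slot numbers, and application name below are made up for illustration; the FAQ has the authoritative syntax for your Open MPI version), each line of a rankfile pins one MPI rank to a host and a slot:

    rank 0=node01 slot=0
    rank 1=node01 slot=1
    rank 2=node02 slot=0
    rank 3=node02 slot=1

The file is then passed to mpirun with the --rankfile (-rf) option:

    mpirun -np 4 --rankfile myrankfile ./myappli myparam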
Dr. Aurelien Bouteiller
Innovative Computing Laboratory, The University of Tennessee

On 1 August 2010 at 01:17, Eugene Loh wrote:

> Yves Caniou wrote:
>> On Wednesday 28 July 2010 at 15:05:28, you wrote:
>>
>>> I am confused. I thought all you wanted to do is report out the binding of
>>> the process - yes? Are you trying to set the affinity bindings yourself?
>>>
>>> If the latter, then your script doesn't do anything that mpirun wouldn't
>>> do, and doesn't do it as well. You would be far better off just adding
>>> --bind-to-core to the mpirun cmd line.
>>>
>> "mpirun -h" says that it is the default, so there is nothing more to do?
>> I don't even have to add "--mca mpi_paffinity_alone 1"?
>>
> Wow. I just tried "mpirun -h" and, yes, it claims that "--bind-to-core" is
> the default. I believe this is wrong... or at least "misleading." :^) You
> should specify --bind-to-core explicitly. It is the successor to paffinity.
> Do add --report-bindings to check what you're getting.
>
>>> On Jul 28, 2010, at 6:37 AM, Yves Caniou wrote:
>>>
>>>> On Wednesday 28 July 2010 at 11:34:13, Ralph Castain wrote:
>>>>
>>>>> On Jul 27, 2010, at 11:18 PM, Yves Caniou wrote:
>>>>>
>>>>>> On Wednesday 28 July 2010 at 06:03:21, Nysal Jan wrote:
>>>>>>
>>>>>>> OMPI_COMM_WORLD_RANK can be used to get the MPI rank.
>>>>>>>
>>>>>> Are processes assigned to nodes sequentially, so that I can get the
>>>>>> NODE number from $OMPI_COMM_WORLD_RANK modulo the number of processes
>>>>>> per node?
>>>>>>
>>>>> By default, yes. However, you can select alternative mapping methods.
>>>>>
>>>> It reports to stderr, so $OMPI_COMM_WORLD_RANK modulo the number of
>>>> processes per node seems more appropriate for what I need, right?
>>>>
>>>> So is the following valid to set memory affinity?
>>>>
>>>> script.sh:
>>>> MYRANK=$OMPI_COMM_WORLD_RANK
>>>> MYVAL=$(expr $MYRANK / 4)
>>>> NODE=$(expr $MYVAL % 4)
>>>> numactl --cpunodebind=$NODE --membind=$NODE $@
>>>>
>>>> mpiexec ./script.sh -n 128 myappli myparam
>>>>
> Another option is to use OMPI_COMM_WORLD_LOCAL_RANK. This environment
> variable directly gives you the value you're looking for, regardless of how
> process ranks are mapped to the nodes.
>
>>>> Which is better: using this option, or the cmd line with numactl (if it
>>>> works)? What is the difference?
>>>>
> I don't know what's "better," but here are some potential issues:
>
> *) Different MPI implementations use different mechanisms for specifying
> binding. So, if you want your solution to be "portable"... well, if you want
> that, you're out of luck. But perhaps some mechanisms (command-line
> arguments, run-time scripts, etc.) might seem easier for you to adapt than
> others.
>
> *) Some mechanisms bind processes at process launch time and some at
> MPI_Init time. The former might be better. Otherwise, a process might place
> some NUMA memory in a location before MPI_Init and then be moved away from
> that memory when MPI_Init is encountered. I believe both the numactl and
> OMPI --bind-to-core mechanisms have this characteristic. (OMPI's older
> paffinity might not, but I don't remember for sure.)
>
> Mostly, if you're going to use just OMPI, the --bind-to-core command-line
> argument might be the simplest.
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
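Tying together the numactl wrapper quoted above and Eugene's OMPI_COMM_WORLD_LOCAL_RANK suggestion, an untested sketch of a corrected wrapper might look like this (it assumes 4 ranks per NUMA node and 4 NUMA nodes per host, as in the quoted example, and that numactl is available on every node):

    #!/bin/sh
    # script.sh -- bind this rank's CPU and memory to one NUMA node.
    # OMPI_COMM_WORLD_LOCAL_RANK is the rank among processes on this host,
    # so no correction for how ranks are mapped across hosts is needed.
    NODE=$(expr $OMPI_COMM_WORLD_LOCAL_RANK / 4)   # 4 ranks per NUMA node (assumption)
    exec numactl --cpunodebind=$NODE --membind=$NODE "$@"

Note that -n has to come before the wrapper on the command line; otherwise mpiexec passes it to the script as an argument rather than interpreting it itself:

    mpiexec -n 128 ./script.sh myappli myparam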