On Aug 20, 2014, at 11:16 AM, Reuti <re...@staff.uni-marburg.de> wrote:
> On 20.08.2014 at 19:05, Ralph Castain wrote:
>
>>> <snip>
>>> Aha, this is quite interesting - how do you do this: scanning
>>> /proc/<pid>/status or the like? What happens if you don't find enough
>>> free cores because they are already used up by other applications?
>>
>> Remember, when you use mpirun to launch, we launch our own daemons using
>> the native launcher (e.g., qsub). So the external RM will bind our daemons
>> to the specified cores on each node. We use hwloc to determine what cores
>> our daemons are bound to, and then bind our own child processes to cores
>> within that range.
>
> Thanks for reminding me of this. Indeed, I mixed up two different aspects
> in this discussion.
>
> a) What will happen in case no binding was done by the RM (hence Open MPI
> could use all cores) and two Open MPI jobs (or something completely
> different besides one Open MPI job) are running on the same node (due to
> the Tight Integration, with two different Open MPI directories in /tmp and
> two `orted`, unique for each job)? Will the second Open MPI job know what
> the first Open MPI job has already used up? Or will both use the same set
> of cores, since "-bind-to none" can't be set in the given `mpiexec` command
> because "-map-by slot:pe=$OMP_NUM_THREADS" was used - which makes "-bind-to
> core" mandatory, and it can't be switched off? I see the same cores being
> used for both jobs.

Yeah, each mpirun executes completely independently of the other, so they
have no idea what the other is doing - so the cores will be overloaded.
Multi-pe requests require bind-to core; otherwise there is no way to
implement the request.

> Altering the machinefile instead: the processes are not bound to any core,
> and the OS takes care of a proper assignment.
>
>> If the cores we are bound to are the same on each node, then we will do
>> this with no further instruction. However, if the cores are different on
>> the individual nodes, then you need to add --hetero-nodes to your command
>> line (as the nodes appear to be heterogeneous to us).
>
> b) Aha, so it's not only about different CPU types, but also about the
> same CPU type with different allocations between the nodes? It's not in
> the `mpiexec` man page of 1.8.1, though. I'll have a look at it.

The man page is probably a little out-of-date in this area - but yes,
--hetero-nodes is required for *any* difference in the way the nodes appear
to us (cpus, slot assignments, etc.). The 1.9 series may remove that
requirement - still looking at it.

>> So it is up to the RM to set the constraint - we just live within it.
>
> Fine.
>
> -- Reuti
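
A couple of sketches to make the above concrete. First, the binding our
daemons inherit from the RM can be inspected by hand - a minimal sketch,
assuming an `orted` is already running on the node and hwloc's command-line
tools are installed (the `pgrep` lookup is just for illustration):

    # Find a running Open MPI daemon on this node (illustrative only)
    pid=$(pgrep -n orted)

    # The kernel's view, via the file mentioned above:
    grep Cpus_allowed_list /proc/$pid/status

    # The same binding as hwloc reports it - mpirun binds its child
    # processes to cores within this range:
    hwloc-bind --get --pid $pid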
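
Second, the overlap in (a) is easy to see with --report-bindings. A
hypothetical session where two independent jobs land on the same node with
no RM binding in place (`./a.out` and `./b.out` are placeholder binaries):

    export OMP_NUM_THREADS=4

    # Each job is submitted separately, so each mpirun starts binding at
    # the first core it is allowed to use - here, core 0 in both cases:
    mpiexec -map-by slot:pe=$OMP_NUM_THREADS --report-bindings ./a.out
    mpiexec -map-by slot:pe=$OMP_NUM_THREADS --report-bindings ./b.out

    # Both jobs report the same mask for their first rank, e.g.
    #   [BB/BB/BB/BB/../../../..]
    # i.e. cores 0-3 get bound twice and end up overloaded.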
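
And for (b), the flag is simply added to the same command line - a sketch
assuming the RM bound our daemon to, say, cores 0-3 on one node but 4-7 on
another:

    # The daemon bindings differ across nodes, so tell 1.8 not to assume
    # the first node's layout applies everywhere:
    mpiexec --hetero-nodes -map-by slot:pe=$OMP_NUM_THREADS ./a.out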