Yes, it never made it to my mailbox.  Strange (wink wink, ahh email).

Thanks for letting me know about it, I have the message now.

As for using a 1.3 prerelease, that is not really an option for us right now. I think we can get by with 1.2 without threads, or do some hacking (ppn=largest number we have, launch with -bynode), until a stable 1.3 is out.

Thanks, the new features for launching look really neat.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Oct 30, 2008, at 10:12 AM, Ralph Castain wrote:

I believe I answered much of this the other day - did it get lost in the email?

As for using TM with a hostfile - this is an unfortunate bug in the 1.2 series. You can't - you'll have to move to 1.3 to do so. When you do, note the changed handling of hostfiles as specified on the wiki:

https://svn.open-mpi.org/trac/ompi/wiki/HostFilePlan

Ralph


I take it this is using OMPI 1.2.x? If so, there really isn't a way to do this in that series.

If they are using 1.3 (in some pre-release form), then there are two options:

1. they could use the sequential mapper by specifying "-mca rmaps seq". This mapper takes a hostfile and maps one process to each entry, in rank order. So they could list each node in the hostfile only half as many times as it has cores, and we would map to only half of the actual number of cores on that node.

2. they could use the rank_file mapper that allows you to specify what cores are to be used by what rank. I am less familiar with this option and there isn't a lot of documentation on how to use it - but you may have to provide a fairly comprehensive map file since your nodes are not all the same.
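A minimal sketch of the hostfile that option 1 describes, assuming the sequential mapper really does place rank i on the i-th hostfile entry as stated above. The host names, core counts, and the helper name `seq_hostfile_entries` are made up for illustration; nothing here is taken from the Open MPI sources.

```python
def seq_hostfile_entries(cores_per_host, threads_per_rank=2):
    """Return one hostfile entry per MPI rank.

    cores_per_host is a list of (hostname, core_count) pairs.
    Each host is repeated core_count // threads_per_rank times, so every
    rank has threads_per_rank cores available on its own node.
    """
    entries = []
    for host, ncores in cores_per_host:
        entries.extend([host] * (ncores // threads_per_rank))
    return entries


# Hypothetical allocation with unequal nodes (names are illustrative).
cores_per_host = [("n1", 8), ("n2", 2)]
entries = seq_hostfile_entries(cores_per_host)

# With the sequential mapper, entry i would receive rank i.
for rank, host in enumerate(entries):
    print("rank %d -> %s" % (rank, host))
```

Writing `entries` to a file, one host per line, and passing that file as the hostfile together with "-mca rmaps seq" should then give one rank per listed entry on a 1.3 build.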

I have been asked by some other folks to provide a mapping option "--stride x" that would cause the default round-robin mapper to step across the specified number of slots. So a stride of 2 would automatically cause byslot mapping to increment by 2 instead of the current stride of 1. I doubt that will be in 1.3.0, but it will show up in later releases.
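To make the proposed behaviour concrete, here is a small simulation of what a stride could mean for byslot mapping. This is only one plausible reading of the description above, not the actual mapper: it assumes the stride steps across a flat list of all slots in node order. The function name and the host names are hypothetical.

```python
def map_byslot_stride(slots, nprocs, stride=1):
    """Simulate byslot mapping that advances `stride` slots per rank.

    slots is a list of (hostname, slot_count) pairs. The result is a list
    of (hostname, slot_index) pairs, one per rank, in rank order.
    """
    # Flatten the allocation into individual slots, node by node.
    flat = [(node, i) for node, count in slots for i in range(count)]
    # Take every stride-th slot and stop after nprocs ranks.
    return flat[::stride][:nprocs]


# Hypothetical allocation: 4 slots on n1, 2 slots on n2.
slots = [("n1", 4), ("n2", 2)]

# Stride 1 is the current behaviour: ranks pack onto the first slots.
print(map_byslot_stride(slots, 3, stride=1))

# Stride 2 leaves one free slot after each rank for its second thread.
print(map_byslot_stride(slots, 3, stride=2))
```

Under this reading, a node with an odd number of slots would shift the pattern on the following node, so a real implementation would need to decide how to handle that case.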

Ralph

On Oct 30, 2008, at 7:46 AM, Brock Palen wrote:

Any thoughts on this?

We are looking at writing a script that parses $PBS_NODEFILE to create a machinefile, and then using -machinefile.

When we do that, though, we have to disable tm to avoid an error (-mca pls ^tm); this is far from preferable.

Any ideas on how to tell mpirun to launch on only half the cpus given to it by PBS, such that each cpu it uses has another cpu adjacent to it in the same node?
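A rough sketch of the kind of script described above, assuming $PBS_NODEFILE lists one hostname per line for each core granted to the job, and that the machinefile may use the "host slots=N" form. The function names and sample host names are invented for illustration.

```python
from collections import OrderedDict


def count_cores(nodefile_lines):
    """Count how many times each host appears, preserving first-seen order."""
    cores = OrderedDict()
    for line in nodefile_lines:
        host = line.strip()
        if host:
            cores[host] = cores.get(host, 0) + 1
    return cores


def machinefile_lines(nodefile_lines, threads_per_rank=2):
    """Build 'host slots=N' entries with N = cores // threads_per_rank.

    Hosts that cannot hold even one full rank are left out.
    """
    entries = []
    for host, ncores in count_cores(nodefile_lines).items():
        ranks = ncores // threads_per_rank
        if ranks > 0:
            entries.append("%s slots=%d" % (host, ranks))
    return entries


def write_machinefile(nodefile_path, machinefile_path, threads_per_rank=2):
    """Read a PBS-style nodefile and write the reduced machinefile."""
    with open(nodefile_path) as src:
        entries = machinefile_lines(src, threads_per_rank)
    with open(machinefile_path, "w") as dst:
        dst.write("\n".join(entries) + "\n")
    return entries


# Demo on made-up nodefile contents: 4 cores on n1, 2 on n2, 1 on n3.
sample = ["n1\n"] * 4 + ["n2\n"] * 2 + ["n3\n"]
print(machinefile_lines(sample))
```

Inside a job, `write_machinefile(os.environ["PBS_NODEFILE"], "machinefile")` would produce the file to hand to -machinefile (with tm disabled as noted above on 1.2). Because each host advertises only half its cores as slots, every rank placed there has a second core on the same node for its thread.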

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Oct 25, 2008, at 5:36 PM, Brock Palen wrote:

We have a user with a code that uses threaded solvers inside each MPI rank. They would like to run two threads per process.

The question is how to launch this. The default -byslot puts all the processes on the first set of cpus, leaving no cpus for the second thread of each process, and half the cpus are wasted.

The -bynode option would work in theory if all our nodes had the same number of cores (they do not).
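The two placements can be illustrated with a simplified model. This is a sketch of the round-robin idea only, not Open MPI's mapper code; the host names and core counts are made up, and each rank is assumed to run two threads.

```python
def map_byslot(slots, nprocs):
    """Fill every slot on a node before moving on to the next node."""
    placement = []
    for node, count in slots:
        for _ in range(count):
            if len(placement) == nprocs:
                return placement
            placement.append(node)
    return placement


def map_bynode(slots, nprocs):
    """Place one rank per node per pass, skipping nodes with no slots left."""
    remaining = dict(slots)
    order = [node for node, _ in slots]
    placement = []
    while len(placement) < nprocs and any(remaining[n] > 0 for n in order):
        for node in order:
            if len(placement) == nprocs:
                break
            if remaining[node] > 0:
                placement.append(node)
                remaining[node] -= 1
    return placement


def threads_per_node(placement, threads_per_rank=2):
    """Total threads that end up on each node for a given placement."""
    load = {}
    for node in placement:
        load[node] = load.get(node, 0) + threads_per_rank
    return load


# Hypothetical unequal nodes: 8 cores on n1, 2 cores on n2, 10 in total.
slots = [("n1", 8), ("n2", 2)]
nprocs = 5  # half the cores, two threads per rank

# byslot packs every rank onto n1 and leaves n2 idle.
print(threads_per_node(map_byslot(slots, nprocs)))

# bynode alternates nodes, so the small node gets more threads than cores.
print(threads_per_node(map_bynode(slots, nprocs)))
```

In this model byslot puts 10 threads on the 8-core node, while bynode puts 4 threads on the 2-core node, which is why neither option fits an allocation with mixed node sizes.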

So right now the user did:

#PBS -l nodes=22:ppn=2
export OMP_NUM_THREADS=2
mpirun -np 22 app

Which made me aware of the problem.

How can I basically tell OMPI that a 'slot' is two cores on the same machine? This needs to work inside our Torque-based queueing system.

Sorry if I was not clear about my goal.


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





