If anyone else is using Xgrid, there is a mechanism to limit the number of
processes per machine: running
sudo defaults write /Library/Preferences/com.apple.xgrid.agent MaximumTaskCount 8
on each of the nodes and then restarting the Xgrid agent tells the controller
to send only 8 processes to that node. For now that is a fine solution for
my needs. I'll try to figure out how to specify hosts via Xgrid and
get back to the list...
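For example, to apply that to all four nodes in one pass, something like this
should work (a sketch; I'm guessing com.apple.xgridagentd is the right launchd
label for restarting the agent):
# limit each agent to 8 simultaneous tasks, then bounce the agent daemon
for node in xserve01 xserve02 xserve03 xserve04; do
  ssh ${node}.local "sudo defaults write /Library/Preferences/com.apple.xgrid.agent MaximumTaskCount 8 && sudo launchctl stop com.apple.xgridagentd && sudo launchctl start com.apple.xgridagentd"
done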
Thanks for everyone's help,
Cheers, Jody
On 11-Jul-09, at 12:42 PM, Ralph Castain wrote:
Looking at the code, you are correct in that the Xgrid launcher is
ignoring hostfiles. I'll have to look at it to determine how to
correct that situation - I didn't write that code, nor do I have a
way to test any changes I might make to it.
For now, though, if you add --bynode to your command line, you
should get the layout you want. I'm not sure you'll get the rank
layout you want, though... or whether that matters for what you are
doing.
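For example, something like this (borrowing the executable name from your
earlier message):
mpirun -np 16 --bynode ../build/mitgcmuv
--bynode maps ranks round-robin across the nodes rather than filling each
node's slots before moving on to the next.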
Ralph
On Jul 11, 2009, at 1:18 PM, Klymak Jody wrote:
Hi Vitorio,
Thanks for getting back to me! My hostfile is
xserve01.local max-slots=8
xserve02.local max-slots=8
xserve03.local max-slots=8
xserve04.local max-slots=8
I've now checked, and this seems to work fine just using ssh, i.e.
if I turn off the Xgrid queue manager I can submit jobs manually to
the appropriate nodes using --host.
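For example, this runs fine over ssh (the same command that failed under
Xgrid in my original message):
/usr/local/openmpi/bin/mpirun -n 8 --hostfile hostfile --host xserve01.local ../build/mitgcmuv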
However, I'd really like to use Xgrid as my queue manager as it is
already set up (though I'll happily take hints on how to set up
other queue managers on an OS X cluster).
So you have 4 nodes, each one with 2 processors, and each processor is
quad-core. So you have capacity for 32 processes in parallel.
The new Xeon chips present two hardware threads per core
(hyperthreading), though at a reduced clock rate. This means that
Xgrid believes I have 16 processors/node. For large jobs I expect
that to be useful, but for my more modest jobs I really only want 8
processes/node.
It appears that by default Xgrid assigns jobs by filling all 16
slots on one node before moving to the next. Open MPI doesn't appear
to look at the hostfile configuration when using Xgrid, which makes
it hard for me to override this behaviour.
Thanks, Jody
I think that using only the hostfile is enough; that is how I use
it. If you want to specify a specific host or a different sequence,
mpirun will obey the host sequence in your hostfile when starting
the processes. Also, can you post how you configured your hostfile?
I'm asking this because you should have something like:
# This is an example hostfile. Comments begin with #
#
# The following node is a single processor machine:
foo.example.com
# The following node is a dual-processor machine:
bar.example.com slots=2
# The following node is a quad-processor machine, and we absolutely
# want to disallow over-subscribing it:
yow.example.com slots=4 max-slots=4
So in your case, like mine, you should have something like:
your.hostname.domain slots=8 max-slots=8 # for each node
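and then run against that hostfile, e.g. (my_hostfile and your_program are
just placeholders):
mpirun -np 32 --hostfile my_hostfile ./your_program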
I hope this will help you.
Regards.
Vitorio.
On 11-Jul-09, at 10:56 AM, Klymak Jody wrote:
Hi all,
Sorry in advance if these are naive questions - I'm not
experienced in running a grid...
I'm using Open MPI on 4 dual quad-core Xeon Xserves. The 8 physical
cores mimic 16 cores and show up in Xgrid as each agent having 16
processors. However, processing speed goes down as the number of
processors in use exceeds 8, so if possible I'd prefer not to have
more than 8 processors working on each machine at a time.
Unfortunately, if I submit a 16-processor job to Xgrid it all
goes to "xserve03". Even worse, the same thing happens if I submit
two separate 8-processor jobs. Is there any way to steer jobs to
less-busy agents?
I tried making a hostfile and then specifying the host, but I get:
/usr/local/openmpi/bin/mpirun -n 8 --hostfile hostfile --host xserve01.local ../build/mitgcmuv
Some of the requested hosts are not included in the current allocation for the application:
    ../build/mitgcmuv
The requested hosts were:
    xserve01.local
so I assume --host doesn't work with xgrid?
Would a reasonable alternative be to simply not use Xgrid and rely
on ssh?
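(E.g., would forcing the ssh launcher with something like the following be
the way to do that? I'm guessing that the plm MCA parameter is the right
knob for this Open MPI version.)
/usr/local/openmpi/bin/mpirun --mca plm rsh -n 16 --hostfile hostfile ../build/mitgcmuv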
Thanks, Jody
--
Jody Klymak
http://web.uvic.ca/~jklymak
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users