This case actually works. We ran into it a few days ago, when we discovered that one of the compute nodes in a cluster didn't get its Myrinet card installed properly ... The performance was horrible, but the application ran to completion.

You will have to use the following flags: --mca pml ob1 --mca btl mx,tcp,self
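For example, something along these lines (the hostfile and executable names are just placeholders):

  mpirun -np 16 --hostfile all_nodes \
      --mca pml ob1 --mca btl mx,tcp,self ./your_app

Forcing the ob1 PML keeps everything on the BTL path, so MX can be used between the myrinet nodes and TCP everywhere else.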

  george.

On Jan 15, 2008, at 8:49 AM, M Jones wrote:

Hi,

  We have a mixed environment in which roughly 2/3 of the nodes
in our cluster have myrinet (mx 1.2.1), while the full cluster has
gigE.  Running open-mpi exclusively on myrinet nodes or exclusively
on non-myrinet nodes is fine, but mixing the two node types
results in a runtime error (PML add procs failed), no matter what --mca
flags I try to use to push the traffic onto tcp (note that
--mca mtl ^mx --mca btl ^mx does appear to use tcp, as long as all
of the nodes have myrinet cards, but not in the mixed case).
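For reference, this is the sort of command line I've been trying (the hostfile and binary names here are just placeholders):

  mpirun -np 16 --hostfile mixed_nodes \
      --mca mtl ^mx --mca btl ^mx ./my_app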

I thought that we would be able to use a single open-mpi build to
support both networks (and users would be able to request mx nodes if
they need them using the batch queuing system, which they are
already accustomed to).  Am I missing something (or just doing
something dumb)? Compiling MPI implementations for each compiler suite
is bad enough; adding separate builds for each network just makes it
worse ...

Matt
