> -----Original Message-----
> From: users-boun...@open-mpi.org
> [mailto:users-boun...@open-mpi.org] On Behalf Of Keith Refson
> Sent: Tuesday, July 18, 2006 6:21 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Openmpi, LSF and GM
>
> > The arguments you want would look like:
> >
> > mpirun -np X -mca btl gm,sm,self -mca btl_base_verbose 1 -mca
> > btl_gm_debug 1 <other arguments>
>
> Aha.  I think I had misunderstood the syntax slightly, which
> explains why I previously saw no debugging information.  I had
> also omitted the "sm" btl - though I'm not sure what that one
> is....

"sm" = "shared memory".  It's used for on-node communication
between processes.

> I am now getting some debugging output
>
> [scarf-cn008.rl.ac.uk:04291] [0,1,0] gm_port 017746B0, board
> 545460846592, global 3712550725 node
> [snipped]
>
> which I hope means that I am using the GM btl.  The run is also
> about 20% quicker than before, which may suggest that I was not
> previously using gm.

It does.  Excellent!

> I have also noticed that if I simply specify --mca btl ^tcp +
> the debugging options, the run works, apparently uses gm, and as
> quickly.

Right.

> It was (and is) the combination
>
>     -mca btl gm,sm,self,^tcp
>
> that fails with
>
>     No available btl components were found!

The syntax only allows you to use the "^" notation if that's *all*
you use.  Check out this FAQ entry (I just expanded its text a bit):

    http://www.open-mpi.org/faq/?category=tuning#selecting-components

More specifically, you cannot mix the "^" and non-"^" notations --
it doesn't make sense.  Here's why.  If you list:

    --mca btl a,b,c

this tells Open MPI to use *only* components a, b, and c.  The
exclusive behavior, thus:

    --mca btl ^d

means "use all components *except* d".  Hence, doing this:

    --mca btl a,b,c,^d

would presumably mean both "use only a, b, and c" and "use all
components *except* d", which doesn't make sense.  Taking a looser
definition of the inclusive and exclusive behaviors, you could
interpret it to mean "use only a, b, and c, and do *not* use d" --
but that would be redundant, because Open MPI is already *not* going
to use d when it is *only* using a, b, and c.  Hence, the inclusive
and exclusive notations are mutually exclusive, and the "^"
character is only recognized as a prefix of the whole value for this
exact reason.

This is why you got the error that you did: when you used
"tcp,sm,^gm", Open MPI looked for a component named "^gm" (since the
"^" was not recognized as the exclusion character) and therefore
didn't find it.  I'll add some detection code such that if we find a
"^" in the string that is not the first character, we emit a
warning.
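To make the rules concrete, here's a quick sketch of the three
forms; the btl list and ./my_app are just placeholders, not anything
specific to your setup:

    # Inclusive: use *only* the gm, sm, and self components.
    mpirun -np 4 --mca btl gm,sm,self ./my_app

    # Exclusive: use every available component *except* tcp.
    mpirun -np 4 --mca btl ^tcp ./my_app

    # Invalid: mixes the two notations.  The "^" is only recognized
    # at the start of the value, so Open MPI looks for a component
    # literally named "^tcp" and fails with "No available btl
    # components were found!"
    mpirun -np 4 --mca btl gm,sm,self,^tcp ./my_app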
"sm" = "shared memory". It's used for on-node communication between processes. > I am now getting some debugging output > > [scarf-cn008.rl.ac.uk:04291] [0,1,0] gm_port 017746B0, board > 545460846592, global 3712550725 node > [snipped] > > which I home means that I am using the GM btl. The run is > also about 20% quicker than > before which may suggest that I was not previously using gm. It does. Excellent! > I have also noticed that if I simply specify --mca btl ^tcp + > the debugging options > the run works apparently uses gm, and as quickly. It was Right. > (and is) the combination > -mca btl gm,sm,self,^tcp > that fails with > No available btl components were found! The syntax only allows you to do the "^" notation if that's *all* you use. Check out this FAQ entry (I just expanded its text a bit): http://www.open-mpi.org/faq/?category=tuning#selecting-components More specifically, you cannot mix the "^" and non-"^" notation -- it doesn't make sense. Here's why -- if you list: --mca btl a,b,c This tells Open MPI to *only* use components a, b, and c. Using the exclusive behavior, thus: --mca btl ^d means "use all components *except* d". Hence, doing this: --mca btl a,b,c,^d would assumedly mean "only use a, b, and c" and "use all components *except* d", which doesn't make sense. Taking a looser definition of the inclusive and exclusive behavior, you could interpret it to mean "use only a, b, and c, and *not* use d" -- but that would be redundant because it's already *not* going to use d because it's *only* using a, b, and c. Hence, the inclusive and exclusive notations are mutually exclusive. Indeed, the ^ character is only recognized as a prefix for the whole value for this exact reason. This is why you got the error that you did -- when you used "tcp,sm,^gm", it was looking for a component named "^gm" since the "^" was not recognized as the exclusion character (and therefore didn't find it). I'll add some detection code such that if we find "^" in the string and it's not the first chacter to emit a warning. > > > LSF. I believe it is on our feature request list, but I > also don't > > > believe we have a timeline for implementation. > > OK. It is actually quite easy to construct a hostfile from the LSF > environment and start the processes using the openmpi mpirun command. > I don't know how this will interact with for larger scale usage, > job termination etc but I plan to experiment. If you use the LSF drop-in replacement for rsh (lsgrun), you should be ok because it will use LSF's native job-launching mechanisms behind the scenes (and therefore can use LSF's native job-termination mechanisms when necessary). > One further question. My run times are still noticably longer than > with mpich_gm. I saw in the mailing list archives that there was > a new implementation of the collective routines in 1.0, > (which my application > depends on rather heavil. Is this the default in openmpi 1.1 or is The new collectives were introduced in 1.1, not 1.0, and yes, they are the default. > it still necessary to specify this manually? And if anyone > has a comparison > of MPI_AlltoallV performance with other MPI implementations > I'd like to > hear the numbers. There is still work to be done in the collectives, however -- there were no optimized "vector" algorithms introduced yet (e.g., MPI_Alltoallv). > Thanks again for all the work. Openmpi looks very promising and it is > definitely the easiest to install and get running of any MPI > implementation > I have tried so far. 
> One further question.  My run times are still noticeably longer
> than with mpich_gm.  I saw in the mailing list archives that there
> was a new implementation of the collective routines in 1.0 (which
> my application depends on rather heavily).  Is this the default in
> openmpi 1.1 or is

The new collectives were introduced in 1.1, not 1.0, and yes, they
are the default.

> it still necessary to specify this manually?  And if anyone has a
> comparison of MPI_AlltoallV performance with other MPI
> implementations, I'd like to hear the numbers.

There is still work to be done in the collectives, however -- no
optimized "vector" algorithms (e.g., MPI_Alltoallv) have been
introduced yet.

> Thanks again for all the work.  Openmpi looks very promising, and
> it is definitely the easiest to install and get running of any MPI
> implementation I have tried so far.

Glad to hear it -- thanks for the feedback!

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems