On Wed, 2009-07-08 at 15:09 -0400, Michael Di Domenico wrote:
> On Wed, Jul 8, 2009 at 12:33 PM, Ashley Pittman<ash...@pittman.co.uk> wrote:
> > Is the machine configured correctly to allow non OpenMPI QsNet programs
> > to run, for example tping?
> >
> > Which resource manager are you running, I think slurm compiled for RMS
> > is essential.
>
> I can ping via TCP/IP using the eip0 ports.
>
> When i run tping i get:
> ELAN_EXCEOPTIOn @ --: 6 (Initialization error)
> elan_init: Can't get capability from environment
>
> I am not using slurm or RMS at all, just trying to get openmpi to run
> between two nodes.

To attach to the Elan, a process must have a "capability": a kernel
attribute describing the size (number of nodes/ranks) of the job. Without
it you'll get errors like the one from tping.

The only way to generate these capabilities is with RMS, Slurm or, I
believe, pdsh: the launcher generates one and pushes it into the kernel
before calling fork() to create the user application.

Ashley,

-- 
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
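
P.S. The ordering described above (install the capability first, then fork() the application so the child inherits it) can be illustrated with a rough Python sketch. This is only an analogy and not the Quadrics API: the real capability is a kernel attribute set up by the launcher, whereas here an environment variable stands in for it. The names DEMO_ELAN_CAPABILITY, child_can_attach and launch are all made up for the illustration.

```python
import os

# Hypothetical stand-in for the capability. The real one is a kernel
# attribute installed by the launcher, not an environment variable.
CAP_KEY = "DEMO_ELAN_CAPABILITY"


def child_can_attach():
    """Pretend application start-up: succeeds only if a capability is visible."""
    return CAP_KEY in os.environ


def launch(install_capability, nodes=2, ranks=2):
    """Fork a child, optionally installing the stand-in capability first.

    Returns True if the child was able to "attach", False otherwise.
    """
    os.environ.pop(CAP_KEY, None)
    if install_capability:
        # This must happen before fork() so that the child inherits it.
        os.environ[CAP_KEY] = f"nodes={nodes},ranks={ranks}"

    pid = os.fork()
    if pid == 0:
        # Child process: exit 0 if it can attach, 1 if it cannot.
        os._exit(0 if child_can_attach() else 1)

    # Parent process: wait for the child and clean up the stand-in.
    _, status = os.waitpid(pid, 0)
    os.environ.pop(CAP_KEY, None)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Calling launch(True) models a job started through a launcher that sets up the capability, while launch(False) models running tping or an MPI binary directly with nothing in place, which is the situation that produces the initialization error quoted above.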