On Tue, 2009-07-07 at 17:18 -0400, Michael Di Domenico wrote:
> So, first run i seem to have run into a bit of an issue. All the
> Quadrics modules are compiled and loaded. I can ping between nodes
> over the quadrics interfaces. But when i try to run one of the hello
> mpi example from openmpi, i get:
>
> first run, the process hung - killed with ctl-c
> though it doesnt seem to actually die and kill -9 doesn't work
>
> second run, the process fails with
> failed elan4_attach <snipped> Device or resource busy
> <snipped>
> elan_allocSleepDesc <snipped> Failed to allocate IRQ cookie 2a: 22
> Invalid argument
> all subsequent runs fail the same way and i have to reboot the box to
> get the processes to go away
>
> I'm not sure if this is a quadrics or openmpi issue at this point, but
> i figured since there are quadrics people on the list its a good place
> to start

Is the machine configured correctly to allow non-OpenMPI QsNet programs
to run, for example tping? Which resource manager are you running? I
think slurm compiled for RMS is essential.

Ashley,

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk