Hello,
I'm having problems running Open MPI jobs under PBS Pro 10.2. I've configured
and built Open MPI 1.4.1 with the Intel 11.1 compiler on Linux using
--with-tm, and the build runs fine. I've also built with static
libraries per the FAQ suggestion since libpbs is static. However,
I'm a tad confused - this trace would appear to indicate that mpirun is
failing, yes? Not your application?
The reason it works for local procs is that tm_init isn't called in that case
- mpirun just fork/execs the procs directly. When remote nodes are required,
mpirun must connect to Torque.
Yes, the failure seems to be in mpirun; it never even gets to my application.
The prototype for tm_init looks like this:
    int tm_init(void *info, struct tm_roots *roots);
where the struct has 6 elements: 2 x tm_task_id + 3 x int + 1 x tm_task_id
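
For what it's worth, here is a rough sketch of how one might sanity-check that six-element layout. Everything in it is a stand-in: sketch_task_id, struct sketch_roots and the field names are hypothetical, and I'm assuming tm_task_id is a plain unsigned int (the real typedef and struct live in tm.h, so check your installed header). The idea is only that the caller and the library have to agree on the struct's size and offsets; if the header used at build time and the library linked in disagree, tm_init would fill in memory the caller laid out differently.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: tm_task_id is an unsigned int; verify against your tm.h. */
typedef unsigned int sketch_task_id;

/* Hypothetical mirror of the layout described above:
   2 x task id, 3 x int, 1 x task id. Field names are invented. */
struct sketch_roots {
    sketch_task_id me;
    sketch_task_id parent;
    int nnodes;
    int ntasks;
    int taskpoolid;
    sketch_task_id last;
};

/* Size the caller reserves for the struct. */
size_t sketch_roots_size(void)
{
    return sizeof(struct sketch_roots);
}

/* Byte offset of the sixth element. */
size_t sketch_roots_last_offset(void)
{
    return offsetof(struct sketch_roots, last);
}

/* Print the layout so it can be compared against the real header. */
void sketch_print_layout(void)
{
    printf("size=%zu last_offset=%zu\n",
           sketch_roots_size(), sketch_roots_last_offset());
}
```

Calling sketch_print_layout() from a small main and comparing the numbers with sizeof(struct tm_roots) from the real tm.h would show whether the two agree. This is not a claim that a layout mismatch is the cause here, just a cheap thing to rule out.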
If the API was different, wouldn't the compiler most li
Afraid compilers don't help when the param is a void*...
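
To illustrate the void* point, here is a minimal sketch in which takes_untyped is a hypothetical stand-in for any function with an untyped first parameter (it is not the TM API). A C compiler converts any object pointer to void* implicitly, so all three calls below compile without a diagnostic even though only the first passes the type the callee might expect.

```c
#include <stddef.h>

struct expected_info { int a; int b; };
struct other_info    { int a; };

/* Hypothetical function with an untyped parameter, like tm_init's
   first argument. It cannot tell what it was really handed. */
int takes_untyped(void *info)
{
    return info != NULL;
}

/* Returns how many of the calls were accepted. */
int demo_void_param(void)
{
    struct expected_info good = {1, 2};
    struct other_info wrong = {3};
    char text[] = "not a struct at all";
    int accepted = 0;

    accepted += takes_untyped(&good);   /* the intended type */
    accepted += takes_untyped(&wrong);  /* wrong struct, no warning */
    accepted += takes_untyped(text);    /* not a struct, no warning */
    return accepted;
}
```

The second parameter of tm_init is typed, so the compiler does check it, but only against the header that was included, not against whatever the linked library actually expects.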
It looks like this is consistent, but I've never tried it under that particular
environment. Did prior versions of OMPI work, or are you trying this for the
first time?
One thing you might check is that you have the correct PATH and LD_LI