Sorry - hit "send" and then saw the version sitting right there in the subject!
Doh...
First, let's try verifying what components are actually getting used. Run this:
mpirun -n 1 -mca ras_base_verbose 10 -mca plm_base_verbose 10 which orted
Then get an allocation and run
mpirun -pernode which
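The two steps above can be sketched end to end; the resource request below (`-l nodes=2:ppn=1`) is only an example, adjust it for your cluster:

```shell
# Step 1: run the component diagnostic from the head node.
# The verbose output shows which ras/plm components get selected
# (look for "tm" if Torque support is built in):
mpirun -n 1 -mca ras_base_verbose 10 -mca plm_base_verbose 10 which orted

# Step 2: get an interactive allocation from Torque, then repeat
# the checks inside it:
qsub -I -l nodes=2:ppn=1
mpirun -pernode which orted
```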
That error has nothing to do with Torque. The cmd line is simply wrong - you
are specifying a btl that doesn't exist.
It should work just fine with
mpirun -n X hellocluster
Nothing else is required. When you run
mpirun --hostfile nodefile hellocluster
OMPI will still use Torque to do the launch.
Ah, and do I have to take care of the MCA ras plugin by my own?
I tried something like
> mpirun --mca ras tm --mca btl ras,plm --mca ras_tm_nodefile_dir
> /var/spool/torque/aux/ hellocluster
but that hasn't helped ([node3:22726] mca: base:
components_open: component pml / c
Hi Ralph and all,
Yes, the OMPI libs and binaries are at the same place on the nodes, I
packed OMPI via checkinstall and installed the deb via pdsh on the nodes.
The LD_LIBRARY_PATH is set; I can run, for example, "mpirun --hostfile
nodefile hellocluster" without problems. But when started via Torque
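A common cause here is that LD_LIBRARY_PATH is set only in the interactive shell profile, which Torque jobs don't source. A minimal job-script sketch, assuming a hypothetical install path of /usr/local/lib and an example resource request:

```shell
#!/bin/sh
#PBS -l nodes=2:ppn=4
#PBS -N hellocluster

# Export the library path inside the job so the remote orted daemons
# can find the OMPI libs (alternatively, pass it with mpirun's -x flag):
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

cd $PBS_O_WORKDIR
# No hostfile needed: under Torque, OMPI picks up the allocation itself.
mpirun hellocluster
```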
All,
Succeeded in overcoming the 'libtool' failure with PGI using
the patched snap (thanks Jeff), but now I am running
into a downstream problem compiling for our IB clusters.
I am using the latest PGI compiler (10.0-1) and the 12-14-09
snapshot of Open MPI 1.4.0.
My configure line looks
On Sat, 19 Dec 2009, Jeff Squyres wrote:
No, sorry -- there are no "buffered" variants of the MPI_FILE_* functions like
there are with point-to-point communications. So when you do MPI_FILE_WRITE
(for example), it'll be directly using the buffer that you pass to it (which is
almost always what you want, anyway -- "buffered" modes of c
Hi Everyone,
I am trying to checkpoint an mpi application running on
multiple nodes. However, I get some error messages when I trigger the
checkpointing process.
Error: expected_component: PID information unavailable!
Error: expected_component: Component Name information unavailable!