Re: [OMPI users] Torque 2.4.3 fails with OpenMPI 1.3.4; no startup at all

2009-12-19 Thread Ralph Castain
Sorry - hit "send" and then saw the version sitting right there in the subject! Doh... First, let's verify which components are actually being used. Run this:
  mpirun -n 1 -mca ras_base_verbose 10 -mca plm_base_verbose 10 which orted
Then get an allocation and run:
  mpirun -pernode which
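The diagnostics suggested here can be reflowed for copy-paste as below. Note the second command is truncated in the archive; completing it with `which orted` is an assumption based on the first command.

```shell
# Verbose output from the resource-allocation (ras) and process-launch (plm)
# frameworks shows which components Open MPI actually selects:
mpirun -n 1 -mca ras_base_verbose 10 -mca plm_base_verbose 10 which orted

# Then, inside a Torque allocation, confirm orted resolves on every node
# ("which orted" here is assumed; the archived message cuts off):
mpirun -pernode which orted
```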

Re: [OMPI users] Torque 2.4.3 fails with OpenMPI 1.3.4; no startup at all

2009-12-19 Thread Ralph Castain
That error has nothing to do with Torque. The command line is simply wrong - you are specifying a btl that doesn't exist. It should work just fine with "mpirun -n X hellocluster"; nothing else is required. When you run "mpirun --hostfile nodefile hellocluster", OMPI will still use Torque to do the laun
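As the reply notes, inside a Torque allocation Open MPI's tm support discovers the allocated nodes on its own, so a plain launch suffices; a sketch (the process count 8 is arbitrary):

```shell
# Inside a Torque job: no hostfile or extra MCA options needed.
# The tm ras/plm components read the node list from the PBS environment.
mpirun -n 8 hellocluster

# A hostfile may still be given; the launch itself still goes through
# Torque's TM interface when running inside a job:
mpirun --hostfile nodefile hellocluster
```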

Re: [OMPI users] Torque 2.4.3 fails with OpenMPI 1.3.4; no startup at all

2009-12-19 Thread Johann Knechtel
Ah, and do I have to take care of the MCA ras plugin on my own? I tried something like
  mpirun --mca ras tm --mca btl ras,plm --mca ras_tm_nodefile_dir /var/spool/torque/aux/ hellocluster
but that has not worked ([node3:22726] mca: base: components_open: component pml / c
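For reference, the failure Ralph points out comes from passing framework names (`ras`, `plm`) as `btl` values. A corrected invocation would name real byte-transfer layers; the specific list `self,sm,tcp` below is an assumption for an Ethernet cluster, not from the thread:

```shell
# 'ras' and 'plm' are MCA frameworks, not BTLs; selecting them as BTLs
# makes mpirun fail with "btl that doesn't exist". Valid btl values name
# actual transports, e.g. loopback + shared memory + TCP:
mpirun --mca ras tm --mca btl self,sm,tcp \
       --mca ras_tm_nodefile_dir /var/spool/torque/aux/ hellocluster
```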

Re: [OMPI users] Torque 2.4.3 fails with OpenMPI 1.3.4; no startup at all

2009-12-19 Thread Johann Knechtel
Hi Ralph and all, Yes, the OMPI libs and binaries are in the same place on the nodes; I packed OMPI via checkinstall and installed the deb via pdsh on the nodes. LD_LIBRARY_PATH is set; I can run, for example, "mpirun --hostfile nodefile hellocluster" without problems. But when started via torqu

[OMPI users] Problem compiling 1.4.0 snap with PGI 10.0-1 and openib flags turned on ...

2009-12-19 Thread Richard Walsh
All, I succeeded in overcoming the 'libtool' failure with PGI using the patched snap (thanks Jeff), but now I am running into a downstream problem compiling for our IB clusters. I am using the latest PGI compiler (10.0-1) and the 12-14-09 snap of OpenMPI version 1.4.0. My configure line looks

Re: [OMPI users] MPI-IO, providing buffers

2009-12-19 Thread Ricardo Reis
On Sat, 19 Dec 2009, Jeff Squyres wrote: No, sorry -- there are no "buffered" variants of the MPI_FILE_* functions like there are with point-to-point communications. So when you do MPI_FILE_WRITE (for example), it'll be directly using the buffer that you pass to it (which is almost always wha

Re: [OMPI users] MPI-IO, providing buffers

2009-12-19 Thread Jeff Squyres
No, sorry -- there are no "buffered" variants of the MPI_FILE_* functions like there are with point-to-point communications. So when you do MPI_FILE_WRITE (for example), it'll be directly using the buffer that you pass to it (which is almost always what you want, anyway -- "buffered" modes of c
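The point Jeff makes - that MPI_FILE_WRITE uses the caller's buffer directly, with no buffered variant - can be sketched in C. This is a minimal illustration, not code from the thread; the file name, offsets, and buffer contents are arbitrary. (Being an MPI program, it must be built with mpicc and launched under mpirun.)

```c
#include <stdio.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* MPI_File_write_at reads from 'buf' directly -- there is no
     * "buffered" mode as with MPI_Bsend, so the buffer must remain
     * valid until the call returns (or, for the nonblocking
     * MPI_File_iwrite_at, until the matching MPI_Wait completes). */
    char buf[16];
    snprintf(buf, sizeof(buf), "rank %d\n", rank);
    MPI_File_write_at(fh, (MPI_Offset)(rank * 16), buf, (int)strlen(buf),
                      MPI_CHAR, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```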

[OMPI users] checkpointing multi node and multi process applications

2009-12-19 Thread Jean Potsam
Hi Everyone,    I am trying to checkpoint an mpi application running on multiple nodes. However, I get some error messages when i trigger the checkpointing process. Error: expected_component: PID information unavailable! Error: expected_component: Component Name information u