Re: [OMPI users] mpirun fails on the host

2009-06-19 Thread Honest Guvnor
On Fri, Jun 19, 2009 at 3:12 AM, Ralph Castain wrote: > Add --debug-devel to your cmd line and you'll get a bunch of diagnostic > info. Did you configure --enable-debug? If so, then additional debug can be > obtained - can let you know how to get it, if necessary. Yes we had run with the -d fla

Re: [OMPI users] vfs_write returned -14

2009-06-19 Thread Josh Hursey
On Jun 18, 2009, at 7:33 PM, Kritiraj Sajadah wrote: Hello Josh, Thank you again for your response. I tried checkpointing a simple C program using BLCR... and got the same error, i.e.: - vfs_write returned -14 - file_header: write returned -14 Checkpoint failed: Bad address So I wo

[OMPI users] Bug in 1.3.2?: sm btl and isend is serializes

2009-06-19 Thread Mark Bolstad
I have a small test code that I've managed to duplicate the results from a larger code. In essence, using the sm btl with ISend, I wind up with the communication being completely serialized, i.e., all the calls from process 1 complete, then all from 2, ... This is version 1.3.2, vanilla compile. I

Re: [OMPI users] Bug in 1.3.2?: sm btl and isend is serializes

2009-06-19 Thread Eugene Loh
Mark Bolstad wrote: I'll post the test code if requested (this email is already long) Yipes, how long is the test code? Short enough to send, yes? Please send.

Re: [OMPI users] Bug in 1.3.2?: sm btl and isend is serializes

2009-06-19 Thread Mark Bolstad
Not that long, 150 lines. Here it is: #include #include #include #include #include #include #define BUFLEN 25000 #define LOOPS 10 #define BUFFERS 4 #define GROUP_SIZE 4 int main(int argc, char *argv[]) { int myid, numprocs, next, namelen; int color, key, newid; char buffer[BUFLE

[OMPI users] Error in mx_init (error MX library incompatible with driver version)

2009-06-19 Thread SLIM H.A.
This is a question I raised before but for OpenMPI over IB. I have built the app with the Portland compiler and OpenMPI 1.2.3 for Myrinet and InfiniBand. Now I wish to run this on some nodes that have no fast interconnect. We use GridEngine; this is the script: #!/bin/csh #$ -cwd ##$ -j y module

Re: [OMPI users] Bug in 1.3.2?: sm btl and isend is serializes

2009-06-19 Thread Eugene Loh
Mark Bolstad wrote: I have a small test code that I've managed to duplicate the results from a larger code. In essence, using the sm btl with ISend, I wind up with the communication being completely serialized, i.e., all the calls from process 1 complete, then all from 2, ... I need to do so

Re: [OMPI users] Bug in 1.3.2?: sm btl and isend is serializes

2009-06-19 Thread Mark Bolstad
Thanks, but that won't help. In the real application the messages are at least 25,000 bytes long, mostly much larger. Thanks, Mark On Fri, Jun 19, 2009 at 1:17 PM, Eugene Loh wrote: > Mark Bolstad wrote: > > I have a small test code that I've managed to duplicate the results from a >> larger

[OMPI users] Linking MPI applications with pgi IPA

2009-06-19 Thread Brock Palen
When linking applications that are compiled and linked with the -Mipa=fast,inline option, the IPA stops with errors like this case with amber: The following function(s) are called, but no IPA information is available: mpi_allgatherv_, mpi_gatherv_, mpi_bcast_, mpi_wait_, mpi_get_count_

[OMPI users] Machinefile option in opempi-1.3.2

2009-06-19 Thread Rajesh Sudarsan
Hi, I tested a simple hello world program on 5 nodes each with dual quad-core processors. I noticed that openmpi does not always follow the order of the processors indicated in the machinefile. Depending upon the number of processors requested, openmpi does some type of sorting to find the best no

Re: [OMPI users] Bug in 1.3.2?: sm btl and isend is serializes

2009-06-19 Thread George Bosilca
Mark, MPI does not impose any global order on the messages. The only requirement is that between two peers on the same communicator the messages (or at least the part required for the matching) are delivered in order. This makes both execution traces you sent with your original email (share

Re: [OMPI users] Bug in 1.3.2?: sm btl and isend is serializes

2009-06-19 Thread Eugene Loh
George Bosilca wrote: MPI does not impose any global order on the messages. The only requirement is that between two peers on the same communicator the messages (or at least the part required for the matching) are delivered in order. This makes both execution traces you sent with your origin

Re: [OMPI users] mpirun fails on the host

2009-06-19 Thread Honest Guvnor
The source of the problem has been determined, but not wholly understood, by fully disabling the firewall on the host to the internal network. Parallel jobs involving the host and nodes launched from a node were successful while those launched on the host were apparently blocked by the firewall. Wo

Re: [OMPI users] Error in mx_init (error MX library incompatible with driver version)

2009-06-19 Thread Scott Atchley
On Jun 19, 2009, at 1:05 PM, SLIM H.A. wrote: Although the mismatch between MX lib version and the kernel version appears to cause the mx_init error this should never be called as there is no mx card on those nodes. Thanks in advance for any advice to solve this Henk Henk, Is MX statical

Re: [OMPI users] Machinefile option in opempi-1.3.2

2009-06-19 Thread Ralph Castain
If you do "man orte_hosts", you'll see a full explanation of how the various machinefile options work. The default mapper doesn't do any type of sorting - it is a round-robin mapper that just works its way through the provided nodes. We don't reorder them in any way. However, it does depend on the
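As a rough illustration of the round-robin placement described in this reply (not Open MPI's actual mapper code), the sketch below walks the nodes in the order they were listed and fills each node's slots before moving on. The function name map_by_slot, the slots array, and the by-slot fill policy are assumptions made for this example; the message itself only says that the mapper works its way through the provided nodes without reordering them.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of Open MPI.
 *
 * slots[n]        number of slots on the n-th node, in machinefile order
 * nnodes          number of nodes listed
 * nranks          number of ranks requested
 * rank_to_node[r] receives the index of the node that rank r is placed on
 *
 * Nodes are visited strictly in the order given; nothing is sorted.
 * Placement stops once every slot is used, so the return value (the number
 * of ranks actually placed) may be smaller than nranks.
 */
size_t map_by_slot(const int *slots, size_t nnodes, size_t nranks,
                   size_t *rank_to_node)
{
    size_t rank = 0;

    for (size_t n = 0; n < nnodes && rank < nranks; n++) {
        /* Fill every slot on this node before moving to the next one. */
        for (int s = 0; s < slots[n] && rank < nranks; s++) {
            rank_to_node[rank++] = n;
        }
    }
    return rank;
}
```

With two nodes of 2 slots each and 3 ranks requested, this sketch places ranks 0 and 1 on the first node and rank 2 on the second, in the order the nodes were listed.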

Re: [OMPI users] mpirun fails on the host

2009-06-19 Thread Ralph Castain
I believe you will find a fairly complete discussion of firewall issues with MPI on the OMPI mailing lists. Bottom line is that the firewall blocks both the ssh port plus the TCP communication ports required to wireup the MPI transports. If you are using the TCP transport, then those ports are also