[OMPI users] MPI_Comm_accept randomly gives errors

2012-10-04 Thread Valentin Clement
Hi everyone, I'm currently implementing MPI-based communication in our parallel language middleware, POP-C++. It was using TCP/IP sockets before, but due to a project to port the language to a supercomputer, I have to use OpenMPI for the communication. I successfully changed the old communication b
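
For readers unfamiliar with the dynamic process calls involved, below is a minimal server-side sketch of the MPI_Open_port / MPI_Comm_accept pattern; the client side would call MPI_Comm_connect with the same port string. This is an illustrative example, not code from POP-C++.

```c
/* Illustrative server-side sketch of MPI_Comm_accept (not from POP-C++). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm client;
    char port_name[MPI_MAX_PORT_NAME];

    MPI_Init(&argc, &argv);

    /* Open a port and wait for a single client to connect. */
    MPI_Open_port(MPI_INFO_NULL, port_name);
    printf("server listening on port: %s\n", port_name);

    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);

    /* ... exchange messages over the 'client' intercommunicator ... */

    MPI_Comm_disconnect(&client);
    MPI_Close_port(port_name);
    MPI_Finalize();
    return 0;
}
```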

[OMPI users] remark on process mapping

2012-10-04 Thread Siegmar Gross
Hi, > tyr fd1026 179 cat host_sunpc0_1 > sunpc0 slots=4 > sunpc1 slots=4 > > > tyr fd1026 180 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \ > -cpus-per-proc 2 -bind-to-core hostname And this will of course not work. In your hostfile, you told us

Re: [OMPI users] unacceptable latency in gathering process

2012-10-04 Thread Iliev, Hristo
Hi, I would suggest (if you haven't done it already) that you trace your program's execution with Vampir or Scalasca. The latter has some pretty nice built-in analysis capabilities and can detect common patterns that would keep your code from scaling, no matter how good the MPI library is. Also

[OMPI users] wrong results in a heterogeneous environment with openmpi-1.6.2

2012-10-04 Thread Siegmar Gross
Hi, I have a small matrix multiplication program that computes wrong results in a heterogeneous environment with a mix of little-endian and big-endian architectures. Every process computes one row (block) of the result matrix. Solaris 10 x86_64 and Linux x86_64: tyr matrix 162 mpiexec -np 4 -
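
A minimal sketch of the row-block distribution described (each process computes one block of rows of C = A*B) might look as follows; the matrix size and initialization here are assumptions, not taken from the original program. Note that Open MPI generally needs to be configured with --enable-heterogeneous for mixed-endian jobs to convert data representations correctly.

```c
/* Illustrative sketch only: row-block matrix multiply, C = A * B.
   Sizes and data are assumptions, not from the original program. */
#include <mpi.h>

#define N 8   /* assume N is divisible by the number of processes */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int rows = N / size;                       /* rows per process      */
    double A[N][N], B[N][N], C[N][N];
    double a_block[rows][N], c_block[rows][N]; /* C99 variable-length arrays */

    if (rank == 0) {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) { A[i][j] = i + j; B[i][j] = (i == j); }
    }

    /* Broadcast B and scatter row blocks of A; the MPI library handles
       data conversion between heterogeneous (mixed-endian) hosts. */
    MPI_Bcast(B, N * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Scatter(A, rows * N, MPI_DOUBLE, a_block, rows * N, MPI_DOUBLE,
                0, MPI_COMM_WORLD);

    for (int i = 0; i < rows; i++)
        for (int j = 0; j < N; j++) {
            c_block[i][j] = 0.0;
            for (int k = 0; k < N; k++)
                c_block[i][j] += a_block[i][k] * B[k][j];
        }

    MPI_Gather(c_block, rows * N, MPI_DOUBLE, C, rows * N, MPI_DOUBLE,
               0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```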

Re: [OMPI users] EXTERNAL: Re: unacceptable latency in gathering process

2012-10-04 Thread Hodge, Gary C
Once I read your comment, Ralph, about this being "orders of magnitude worse than anything we measure", I knew it had to be our problem. We already had some debug code in place to measure when we send and when we receive over MPI. I turned this code on and ran with 12 slaves instead of 4. Our de
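
As an illustration of the kind of send/receive instrumentation described (hypothetical, not the poster's actual debug code), MPI_Wtime can be used to timestamp each call:

```c
/* Illustrative timing wrappers around a send and a receive (hypothetical). */
#include <mpi.h>
#include <stdio.h>

void timed_send(const void *buf, int count, MPI_Datatype type,
                int dest, int tag, MPI_Comm comm)
{
    double t0 = MPI_Wtime();
    MPI_Send(buf, count, type, dest, tag, comm);
    double t1 = MPI_Wtime();
    fprintf(stderr, "send to rank %d took %.6f s\n", dest, t1 - t0);
}

void timed_recv(void *buf, int count, MPI_Datatype type,
                int src, int tag, MPI_Comm comm)
{
    MPI_Status status;
    double t0 = MPI_Wtime();
    MPI_Recv(buf, count, type, src, tag, comm, &status);
    double t1 = MPI_Wtime();
    fprintf(stderr, "recv from rank %d waited %.6f s\n", src, t1 - t0);
}
```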

Re: [OMPI users] Mellanox MLX4_EVENT_TYPE_SRQ_LIMIT kernel messages

2012-10-04 Thread Dave Love
Meanwhile, much later -- you'll sympathize: Did you have any joy with this? You wrote: > These messages appeared when running IMB compiled with openmpi 1.6.1 > across 256 cores (16 nodes, 16 cores per node). The job ran from > 09:56:54 to 10:08:46 and failed with no obvious error messages. I d

[OMPI users] -output-filename 1234 versus --mca orte_output_filename 1234

2012-10-04 Thread Sébastien Boisvert
Hi, Is there any difference in the code path between mpiexec -n 1 -output-filename 1234 ./a.out and mpiexec -n 1 --mca orte_output_filename 1234 ./a.out?

Re: [OMPI users] EXTERNAL: Re: unacceptable latency in gathering process

2012-10-04 Thread Ralph Castain
Sorry for the delayed response - been on the road all day. Usually we use the standard NetPipe, IMB, and other benchmarks to measure latency. IIRC, these are all point-to-point measurements - i.e., they measure the latency for a single process sending to one other process (typically on the order of a
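
For context, the point-to-point latency these benchmarks report is typically obtained from a ping-pong loop between two ranks, roughly like this illustrative sketch (not NetPipe or IMB source):

```c
/* Illustrative ping-pong latency sketch between ranks 0 and 1.
   Run with at least 2 processes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int reps = 1000;
    char byte = 0;
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    for (int i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    double t1 = MPI_Wtime();
    if (rank == 0)
        printf("one-way latency ~ %.3f us\n", 1e6 * (t1 - t0) / (2.0 * reps));

    MPI_Finalize();
    return 0;
}
```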