[OMPI users] SIGV at MPI_Cart_sub

2012-01-10 Thread Anas Al-Trad
Dear all, my application crashes with an integer divide-by-zero fault when calling the MPI_Cart_sub routine. My program works as follows: I have 128 ranks, and I create a new communicator from the first 96 ranks via MPI_Comm_create. Then I create an 8x12 grid by calling MPI_Cart_
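
For readers following the thread, the 8x12 layout can be reasoned about without MPI. The sketch below is not the poster's code. It assumes the row-major rank ordering that MPI Cartesian topologies use, and `grid_coords` is a hypothetical helper name chosen for illustration. It also shows how a zero grid extent would lead to an integer division by zero in this kind of arithmetic, which is why the extents are checked first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an MPI call): compute the row-major coordinates
 * of `rank` in a grid with extents dims[0..ndims-1].
 * Returns 0 on success and -1 on invalid input. */
int grid_coords(int rank, int ndims, const int *dims, int *coords)
{
    int total = 1;

    if (ndims <= 0 || dims == NULL || coords == NULL)
        return -1;

    for (int d = 0; d < ndims; ++d) {
        if (dims[d] <= 0)
            return -1;          /* a zero extent would divide by zero below */
        total *= dims[d];
    }

    if (rank < 0 || rank >= total)
        return -1;              /* rank does not fit in the grid */

    /* Peel off coordinates from the fastest-varying (last) dimension. */
    for (int d = ndims - 1; d >= 0; --d) {
        coords[d] = rank % dims[d];
        rank /= dims[d];
    }

    assert(rank == 0);          /* every part of the rank was consumed */
    return 0;
}
```

With dims = {8, 12}, rank 13 maps to coordinates (1, 1) and rank 95 to (7, 11); rank 96 is rejected because the grid only holds 96 cells.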

Re: [OMPI users] SIGV at MPI_Cart_sub

2012-01-10 Thread Paul Kapinos
A blind guess: did you use the Intel compiler? If so, there is/was an error leading to SIGSEGV _in Open MPI itself_. http://www.open-mpi.org/community/lists/users/2012/01/18091.php If the SIGSEGV arises not in Open MPI but in the application itself, it may be a programming issue. In any case, more precis

Re: [OMPI users] SIGV at MPI_Cart_sub

2012-01-10 Thread Anas Al-Trad
Thanks Paul. Yes, I use Intel 12.1.0, and this error is intermittent: it does not always occur, but it does most of the time. My program is large and consists of many interrelated files, so I don't think a code snippet would help. The program runs parallel matrix multipl

[OMPI users] Strange TCP latency results on Amazon EC2

2012-01-10 Thread Roberto Rey
Hi, I'm running some tests on EC2 cluster instances with 10 Gigabit Ethernet hardware and I'm getting strange latency results with NetPIPE and Open MPI. If I run NetPIPE over Open MPI (NPmpi) I get a network latency of around 60 microseconds for small messages (less than 2 kbytes). However, when I run

Re: [OMPI users] SIGV at MPI_Cart_sub

2012-01-10 Thread Ralph Castain
Have you tried the suggested fix from the email thread Paul cited? Sounds to me like the most likely cause of the problem, assuming it comes from inside OMPI. Have you looked at the backtrace to see if it is indeed inside OMPI vs your code? On Jan 10, 2012, at 6:13 AM, Anas Al-Trad wrote: > >

Re: [OMPI users] SIGV at MPI_Cart_sub

2012-01-10 Thread Anas Al-Trad
Hi Ralph, I changed the Intel icc module from 12.1.0 to 11.1.069, the previous default on the Neolith cluster. I submitted the job and I am still waiting for the result. Here is the message of the segmentation fault: [n764:29867] *** Process received signal *** [n764:29867] Signal: Floating p

Re: [OMPI users] SIGV at MPI_Cart_sub

2012-01-10 Thread Jeff Squyres
This may be a dumb question, but are you 100% sure that the input values are correct? On Jan 10, 2012, at 8:16 AM, Anas Al-Trad wrote: > Hi Ralph, I changed the intel icc module from 12.1.0 to 11.1.069, the > previous default one used at a Neolith Cluster. I submitted the job and I > still wa
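
One way to act on this question is to validate the grid extents before handing them to the Cartesian topology routines. The following is a minimal sketch under assumptions, not code from the thread: `cart_dims_match` is a hypothetical helper, and it applies a stricter rule than MPI itself by requiring the grid to hold exactly as many cells as there are processes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an MPI call): return 1 if the extents in
 * dims[0..ndims-1] are all positive and their product equals nprocs,
 * otherwise return 0. */
int cart_dims_match(int nprocs, int ndims, const int *dims)
{
    long cells = 1;

    if (nprocs <= 0 || ndims <= 0 || dims == NULL)
        return 0;

    for (int d = 0; d < ndims; ++d) {
        if (dims[d] <= 0)
            return 0;           /* zero or negative extent is never valid */
        cells *= dims[d];
        if (cells > nprocs)
            return 0;           /* grid already larger than the process count */
    }

    assert(cells >= 1 && cells <= nprocs);
    return cells == nprocs;
}
```

For the case described in this thread, calling it with 96 processes and extents {8, 12} passes the check, while an extent of 0 or a mismatched process count fails it before any topology call is made.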

Re: [OMPI users] Problem launching application on windows

2012-01-10 Thread Shiqing Fan
Hi Alex, Have you solved the problem? Another user also spotted the same problem but under Cygwin. Did you also see the problem under Cygwin, or in normal Windows command prompt? Actually, there shouldn't be anything wrong with sockets in Open MPI to cause such errors anymore, but they of cou

Re: [OMPI users] SIGV at MPI_Cart_sub

2012-01-10 Thread Anas Al-Trad
It is a good question. I asked it myself at first and concluded that the values should be correct, but I want to confirm that anyway. Here is the code snippet from the program: ... int ranks[size]; for(i=0; i < size; ++i) { ranks[i] = i; } ... for(p=8; p <= (size); p+=4) { MPI_Barrier(MP

Re: [OMPI users] SIGV at MPI_Cart_sub

2012-01-10 Thread Anas Al-Trad
Anyway, after compiling my code with icc/11.1.069, the job runs without hanging and without the SIGV that occurred before with the icc/12.1.0 module. I should also point out that when I was using icc/12.1.0 I was getting strange outputs or hangs, and I solved them by changing the name of parameters

[OMPI users] OMPI C++ Bindings problems

2012-01-10 Thread John Doe
I'm trying to compile some code that uses the Chombo mesh package, which uses Open MPI's C++ bindings, but I keep getting errors like this: AMRLevelX.o: In function `Intracomm': /opt/ompi/gnu/1.4.4/include/openmpi/ompi/mpi/cxx/intracomm.h:25: undefined reference to `MPI::Comm::Comm()' AMRLevelX.o: In function

Re: [OMPI users] OMPI C++ Bindings problems

2012-01-10 Thread Ralph Castain
Did you use OMPI's C++ wrapper compiler to build your code? Looks to me like you are missing the required include paths, which is what the wrapper compiler would provide. On Jan 10, 2012, at 11:50 AM, John Doe wrote: > I'm trying to compile some code that uses the Chombo mesh package which use

[OMPI users] OpenMPI 1.5.4 remote send hang on Windows 2008R2

2012-01-10 Thread Randy Abernethy
Hello, I have run into an issue that appears to be related to sending messages to multiple processes on a single remote host prior to the remote processes sending messages to the origin. I have boiled the issue down to the following: *Test Environment of 3 Identical Hosts:* * Intel i

[OMPI users] Passwordless ssh

2012-01-10 Thread Shaandar Nyamtulga
Hi, I built a Beowulf cluster using Open MPI by following this guide: http://techtinkering.com/2009/12/02/setting-up-a-beowulf-cluster-using-open-mpi-on-linux/ I can ssh to my slave nodes without the slave mpiuser's password before mounting my slaves. But when I mount my slaves and do ssh, the

Re: [OMPI users] Passwordless ssh

2012-01-10 Thread Ralph Castain
You might want to ask that on the Beowulf mailing lists - I suspect it has something to do with the mount procedure, but honestly have no real idea how to resolve it. On Jan 10, 2012, at 8:45 PM, Shaandar Nyamtulga wrote: > Hi > I built Beuwolf cluster using OpenMPI reading the following link.