Dear people,
In my application, I get a segmentation fault (integer divide-by-zero) when
calling the MPI_Cart_sub routine. My program is as follows: I have 128 ranks,
and I make a new communicator out of the first 96 ranks via MPI_Comm_create.
Then I create an 8x12 grid by calling MPI_Cart_
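For reference, a minimal sketch of the setup described above. It assumes the
96-rank communicator is built with the usual group calls and that the 8x12 grid
is later split with MPI_Cart_sub; the variable names and the row-wise split are
illustrative, not taken from the original program.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Communicator containing only the first 96 of the 128 ranks. */
    int nsub = 96;
    int ranks[96];
    for (int i = 0; i < nsub; ++i)
        ranks[i] = i;

    MPI_Group world_group, sub_group;
    MPI_Comm sub_comm;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Group_incl(world_group, nsub, ranks, &sub_group);
    MPI_Comm_create(MPI_COMM_WORLD, sub_group, &sub_comm);

    if (sub_comm != MPI_COMM_NULL) {
        /* 8x12 Cartesian grid over the 96-rank communicator. */
        int dims[2]    = {8, 12};
        int periods[2] = {0, 0};
        MPI_Comm cart_comm, row_comm;
        MPI_Cart_create(sub_comm, 2, dims, periods, 0, &cart_comm);

        /* Keep the second dimension: one sub-communicator per grid row. */
        int remain[2] = {0, 1};
        MPI_Cart_sub(cart_comm, remain, &row_comm);

        MPI_Comm_free(&row_comm);
        MPI_Comm_free(&cart_comm);
        MPI_Comm_free(&sub_comm);
    }

    MPI_Group_free(&sub_group);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}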
A blind guess: did you use the Intel compiler?
If so, there is/was a bug leading to a SIGSEGV _in Open MPI itself_:
http://www.open-mpi.org/community/lists/users/2012/01/18091.php
If the SIGSEGV arises not in Open MPI but in the application itself, it may be
a programming issue. In any case, more precis
Thanks Paul,
Yes, I use Intel 12.1.0, and this error is intermittent: it is not always
produced, but it occurs most of the time.
My program is large and contains many interrelated files, so I don't think
it will help if I take a snippet of the code. The program
runs parallel matrix multipl
Hi,
I'm running some tests on EC2 cluster instances with 10 Gigabit Ethernet
hardware and I'm getting strange latency results with Netpipe and OpenMPI.
If I run Netpipe over OpenMPI (NPmpi), I get a network latency of around 60
microseconds for small messages (less than 2 kbytes). However, when I run
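As a rough cross-check of the NetPIPE numbers, a simple MPI ping-pong loop can
be used; the sketch below is generic (message size and iteration count are
arbitrary choices, not the original test) and reports the average one-way
latency between ranks 0 and 1.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int iters = 1000;
    char buf[1024] = {0};              /* small (1 kB) message */

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    /* Run with at least 2 ranks, one per host, e.g. "mpirun -np 2 ...". */
    for (int i = 0; i < iters; ++i) {
        if (rank == 0) {
            MPI_Send(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    double t1 = MPI_Wtime();
    if (rank == 0)
        printf("average one-way latency ~ %.2f us\n",
               (t1 - t0) / (2.0 * iters) * 1e6);

    MPI_Finalize();
    return 0;
}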
Have you tried the suggested fix from the email thread Paul cited? Sounds to me
like the most likely cause of the problem, assuming it comes from inside OMPI.
Have you looked at the backtrace to see if it is indeed inside OMPI vs your
code?
On Jan 10, 2012, at 6:13 AM, Anas Al-Trad wrote:
>
>
Hi Ralph, I changed the Intel icc module from 12.1.0 to 11.1.069, the
previous default one used on the Neolith cluster. I submitted the job and I am
still waiting for the result. Here is the message of the segmentation fault:
[n764:29867] *** Process received signal ***
[n764:29867] Signal: Floating p
This may be a dumb question, but are you 100% sure that the input values are
correct?
On Jan 10, 2012, at 8:16 AM, Anas Al-Trad wrote:
> Hi Ralph, I changed the Intel icc module from 12.1.0 to 11.1.069, the
> previous default one used on the Neolith cluster. I submitted the job and I am
> still wa
Hi Alex,
Have you solved the problem?
Another user also spotted the same problem, but under Cygwin. Did you
also see the problem under Cygwin, or in a normal Windows command prompt?
Actually, there shouldn't be anything wrong with sockets in Open MPI that
would cause such errors anymore, but they of cou
It is a good question; I asked it myself at first, but then I decided it
should be correct. Anyway, I want to confirm it.
Here is the code snippet of the program:
...
int ranks[size];
for (i = 0; i < size; ++i)
{
    ranks[i] = i;
}
...
for (p = 8; p <= size; p += 4)
{
    MPI_Barrier(MP
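On the earlier question of whether the input values are correct: one cheap
guard before the topology calls is to check the grid dimensions against the
size of the communicator being used, since a zero or mismatched dimension can
end in exactly this kind of divide-by-zero. The helper below is purely
illustrative (check_grid is a hypothetical name, not part of the original
program or of MPI).

#include <stdio.h>
#include <mpi.h>

/* Abort early if an ndims-dimensional grid cannot be laid over the
 * given communicator. */
static void check_grid(MPI_Comm comm, int ndims, const int dims[])
{
    int comm_size, cells = 1;
    MPI_Comm_size(comm, &comm_size);

    for (int d = 0; d < ndims; ++d)
        cells *= dims[d];

    if (cells != comm_size) {
        fprintf(stderr, "grid of %d cells does not match %d ranks\n",
                cells, comm_size);
        MPI_Abort(comm, 1);
    }
}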
Anyway, after compiling my code with icc/11.1.069, the job is running
without getting stuck or hitting the SIGSEGV that occurred before with the
icc/12.1.0 module.
I also have to point out that when I was using icc/12.1.0 I was getting strange
outputs or hangs, and I solved them by changing the name of parameters
I'm trying to compile some code that uses the Chombo mesh package, which
uses Open MPI's C++ bindings, but I keep getting errors like this:
AMRLevelX.o: In function `Intracomm':
/opt/ompi/gnu/1.4.4/include/openmpi/ompi/mpi/cxx/intracomm.h:25:
undefined reference to `MPI::Comm::Comm()'
AMRLevelX.o: In function
Did you use OMPI's C++ wrapper compiler to build your code? Looks to me like
you are missing the required include paths, which is what the wrapper compiler
would provide.
On Jan 10, 2012, at 11:50 AM, John Doe wrote:
> I'm trying to compile some code that uses the Chombo mesh package which use
Hello,
I have run into an issue that appears to be related to sending messages to
multiple processes on a single remote host prior to the remote processes
sending messages to the origin. I have boiled the issue down to the
following:
*Test Environment of 3 Identical Hosts:*
* Intel i
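The pattern described above (one origin rank sending to several ranks that
live on the same remote host before any of them replies) reduces to a few
lines of MPI; the sketch below is a generic reconstruction of that pattern,
not the actual reproducer from the report.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size, msg = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* Origin: send to every remote rank first ... */
        for (int dst = 1; dst < size; ++dst)
            MPI_Send(&msg, 1, MPI_INT, dst, 0, MPI_COMM_WORLD);
        /* ... then wait for their replies. */
        for (int src = 1; src < size; ++src)
            MPI_Recv(&msg, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        printf("all replies received\n");
    } else {
        /* Remote ranks (placed on the same host): receive, then reply. */
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}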
Hi
I built a Beowulf cluster using Open MPI, following the link below.
http://techtinkering.com/2009/12/02/setting-up-a-beowulf-cluster-using-open-mpi-on-linux/
I can ssh to my slave nodes without the slave mpiuser's password before
mounting my slaves.
But when I mount my slaves and do ssh, the
You might want to ask that on the Beowulf mailing lists - I suspect it has
something to do with the mount procedure, but I honestly have no real idea how to
resolve it.
On Jan 10, 2012, at 8:45 PM, Shaandar Nyamtulga wrote:
> Hi
> I built a Beowulf cluster using Open MPI, following the link below.