I recompiled MPI with -g, but it didn't solve the problem. Two things
have changed: buf in PMPI_Recv no longer has the value 0, and the backtrace
in gdb shows more functions (e.g. mca_pml_ob1_recv_frag_callback_put as #1).
As you recommended, I will try to walk up the stack, but it's not so easy
I am not sure this has anything to do with your problem, but looking
at the stack entry for PMPI_Recv, I noticed that buf has a value of 0.
Shouldn't that be an address?
Does your code fail if the MPI library is built with -g? If it does
fail the same way, the next step I would do would be
Some update on this issue. I've attached gdb to the crashing
application and I got:
Program received signal SIGSEGV, Segmentation fault.
mca_pml_ob1_send_request_put (sendreq=0x130c480, btl=0xc49850,
hdr=0xd10e60) at pml_ob1_sendreq.c:1231
1231    pml_ob1_sendreq.c: No such file or directory
Hi,
I'm using MKL ScaLAPACK in my project. Recently, I was trying to run
my application on a new set of nodes. Unfortunately, when I try to
execute more than about 20 processes, I get a segmentation fault.
[compn7:03552] *** Process received signal ***
[compn7:03552] Signal: Segmentation fault (11)
[c