On 15.01.2009 at 16:20, Jeff Dusenberry wrote:
I'm trying to launch multiple xterms under OpenMPI 1.2.8 and the
SGE job scheduler for purposes of running a serial debugger. I'm
experiencing file-locking problems on the .Xauthority file.
I tried to fix this by asking for a delay between su
Dear all,
1. I have not run it with a debugger; could you tell me how to do it?
2. How can I check whether or not it is killing my job?
Sorry if my questions seem weird, but I have to solve the problem immediately.
Thanks for helping me
Jeff Squyres wrote:
On Jan 7, 2009, at 6:28 PM, Biagio Lucini wrote:
[[5963,1],13][btl_openib_component.c:2893:handle_wc] from node24 to:
node11 error polling LP CQ with status RECEIVER NOT READY RETRY
EXCEEDED ERROR status number 13 for wr_id 37779456 opcode 0 qp_idx 0
Ah! If we're dealing a
I'm trying to launch multiple xterms under OpenMPI 1.2.8 and the SGE job
scheduler for purposes of running a serial debugger. I'm experiencing
file-locking problems on the .Xauthority file.
I tried to fix this by asking for a delay between successive launches,
to reduce the chances of content
Dear OpenMPI developers,
I'm running my MPI application over an InfiniBand network on 128
processors. During the execution of my application, I get a strange
timeout error:
checkPAMRESActionTab: action 63 connecting to RES on host timed
out after 200 seconds
Is it a net problem or an applicatio
Have you checked to ensure that the job manager is not killing your
job? As I mentioned yesterday, SIGTERM is usually when some external
agent kills your job.
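
To make the SIGTERM remark concrete: the "signal 15" that mpirun reports is SIGTERM, the signal an external agent (a batch scheduler, or a plain `kill`) sends by default. The following is only a minimal sketch, assuming Python 3 on a POSIX system; the helper names `describe_signal` and `exit_signal_of_terminated_child` are hypothetical and not part of Open MPI or any job manager. It maps a signal number to its name and shows how a parent process sees a child that was terminated from outside.

```python
import signal
import subprocess
import sys


def describe_signal(signum: int) -> str:
    """Return the symbolic name of a signal number on this platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return f"unknown signal {signum}"


def exit_signal_of_terminated_child() -> int:
    """Start a sleeping child, terminate it externally, return the signal seen."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.terminate()  # sends SIGTERM to the child on POSIX
    returncode = child.wait()
    # On POSIX a negative return code -N means "killed by signal N".
    return -returncode


if __name__ == "__main__":
    print(describe_signal(15))
    print(exit_signal_of_terminated_child())
```

If a job exits on signal 15 without anyone having sent it by hand, the job manager's own logs are the place to look for which agent sent it; this sketch only illustrates what the number means.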
On Jan 15, 2009, at 3:39 AM, Hana Milani wrote:
please tell me how to get rid of the message and how to run the
parallel job?
I
Without any details it's difficult to make a diagnosis,
but it looks like one of your processes crashes, perhaps from a
segmentation fault.
Have you run it with a debugger?
Jody
On Thu, Jan 15, 2009 at 9:39 AM, Hana Milani wrote:
> please tell me how to get rid of the message and how to run th
please tell me how to get rid of the message and how to run the parallel job?
I have another code that runs directly under mpirun without a problem, but this one,
which needs BLACS and ScaLAPACK, is playing with me.
Please, if there is any solution, let me have it.
Regards,
hana
Hello Simon,
For running the program in parallel, I write: mpirun -np 4 ~/program
output
It takes a second before I receive the message: mpirun noticed that job rank
0 with PID 9477 on node linux-4pel exited on signal 15 (Terminated).
and at the end of the output file, I receive: "3 additiona