> What I need is the backtrace of the process that generated the
> segfault. Second, in order to understand the backtrace, it's
> better to have run a debug version of Open MPI. Without the
> debug version we only see the address where the fault occurs,
> without having access to the line number ...
On Jan 8, 2007, at 9:34 PM, Reese Faucette wrote:
Right, that's the maximum number of open MX channels, i.e. processes
that can run on the node using MX. With MX (1.2.0c I think), I get
weird messages if I run a second mpirun quickly after the first one
failed. The Myrinet guys, I'm quite sure, can explain why and how.
Somehow, when an application
On Jan 8, 2007, at 9:11 PM, Reese Faucette wrote:
> Second thing. From one of your previous emails, I see that MX
> is configured with 4 instances per node. You're running with
> exactly 4 processes on the first 2 nodes. Weird things might
> happen ...
4 processes per node will be just fine. This is not like GM, where the 4
includes some "reserved" ports
Not really. This is the backtrace of the process that got killed because
mpirun detected that the other one died ... What I need is the backtrace
of the process that generated the segfault. Second, in order to understand
the backtrace, it's better to have run a debug version of Open MPI. Without
the debug version we only see the address where the fault occurs, without
having access to the line number ...
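As an aside not taken from the thread: one way to get at least a rough trace
out of the very process that segfaults is a SIGSEGV handler built on glibc's
backtrace facility. The sketch below is illustrative only (the handler name,
buffer size, and the demo raise() are assumptions); a debug build plus gdb, as
suggested here, gives far better information.

    #include <execinfo.h>   /* backtrace(), backtrace_symbols_fd() */
    #include <signal.h>
    #include <unistd.h>     /* _exit(), STDERR_FILENO */

    static void segv_handler(int sig)
    {
        void *frames[64];
        int n;
        (void) sig;
        n = backtrace(frames, 64);
        /* Write the raw frame list to stderr; compile with -g and link
           with -rdynamic so the addresses are easier to resolve. */
        backtrace_symbols_fd(frames, n, STDERR_FILENO);
        _exit(1);
    }

    int main(void)
    {
        signal(SIGSEGV, segv_handler);  /* install before any code that might fault */
        /* ... the MPI application would run here ... */
        raise(SIGSEGV);                 /* demo only: force the handler to fire */
        return 0;
    }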
> >> PS: Is there any way you can attach to the processes with gdb? I
> >> would like to see the backtrace as shown by gdb in order to be able
> >> to figure out what's wrong there.
> >
I found out that all processes on the 2nd node crash, so I just put a
30-second wait before MPI_Init in order
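The sentence above is cut off by the archive; presumably the wait is there to
leave time to attach gdb before MPI_Init. A minimal sketch of that kind of
delayed startup, assuming a plain C MPI program (the printed message is an
assumption; the 30-second figure is from the mail):

    #include <stdio.h>
    #include <unistd.h>     /* sleep(), getpid() */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        /* Give yourself time to attach a debugger to each process. */
        printf("PID %d waiting 30 seconds before MPI_Init...\n", (int) getpid());
        fflush(stdout);
        sleep(30);

        MPI_Init(&argc, &argv);
        /* ... rest of the application ... */
        MPI_Finalize();
        return 0;
    }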
On Mon, Jan 08, 2007 at 03:07:57PM -0500, Jeff Squyres wrote:
> if you're running in an ssh environment, you generally have 2 choices to
> attach serial debuggers:
>
> 1. Put a loop in your app that pauses until you can attach a
> debugger. Perhaps something like this:
>
> { int i = 0; prin
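The code in the quote is cut off by the archive; a sketch of the kind of
attach loop being described might look like the block below (the printf text
and the volatile flag are my assumptions, not necessarily what was originally
posted). Attach gdb to the printed PID and then set the flag, e.g.
"set var i = 1", to let the process continue.

    /* Needs <stdio.h> and <unistd.h>; paste near the top of main(). */
    {
        volatile int i = 0;
        printf("PID %d ready for gdb attach\n", (int) getpid());
        fflush(stdout);
        while (i == 0)
            sleep(5);   /* spin until a debugger sets i to non-zero */
    }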
> > When I can get more detailed dbg, I'll send. Though I'm not clear on
> > what executable is being
On Jan 8, 2007, at 2:52 PM, Grobe, Gary L. ((JSC-EV))[ESCG] wrote:
I was wondering if someone could send me the HACKING file so I can do a
bit more with debugging on the snapshots. Our web proxy has WebDAV
methods turned off (the request methods fail), so I can't get to the
latest of the svn repos.
This is just an FYI of the Jan 5th snapshot.
I'll send a backtrace of the processes as soon as I get a b3 running.
Between my filtered webdav svn access problems and the latest nightly
snapshots, my builds are currently failing where the same config lines
worked on previous snapshots ...
$./confi
9 additional processes aborted (not shown)
-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-bounces@open-mpi.org] On Behalf Of Brian W. Barrett
Sent: Tuesday, January 02, 2007 4:11 PM
To: Open MPI Users
Subject: Re: [OMPI users] Ompi failing on mx only
Sorry to jump into the discussion late. The mx btl does not support
communication between processes on the same node by itself, so you have to
$ mpirun --prefix /usr/local/openmpi-1.2b2 --hostfile ./h1-3 -np 1 --mca btl mx,sm,self ./cpi
[node-1:09704] mca: base: component_find: unable to open mtl mx: file not found (ignored)
[node-1:09704] mca: base: component_find: unable to open btl mx: file not found (ignored)
This in particular is
3.1415926544231341, Error is 0.08333410
wall clock time = 0.000331
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Reese Faucette
Sent: Tuesday, January 02, 2007 4:08 PM
To: Open MPI Users
Subject: Re: [OMPI users] Ompi failing on mx only
> I've attached the ompi_info from node-1 and node-2.
thanks, but i need "mx_info", not "ompi_info" ;-)
> As for the MTL, there is a bug in the MX
> MTL for v1.2 that has been fixed, but after 1.2b2 ...
oops, i was stupidly assuming he already had that fix. yes, this is an
important fix...
-reese
> But now that you mention mapper, I take it that's what SEGV_MAPERR might
> be referring to.
this is an ompi red herring; it has nothing to do with Myrinet mapping, even
3 un-ACKed alerts
Mapping is complete, last map generated by node-20
Database generation not yet complete.
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Reese Faucette
Sent: Tuesday, January 02, 2007 2:52 PM
To: Open MPI Users
Subject: Re: [OMPI users] Ompi failing on mx only
Hi, Gary-
This looks like a config problem, and not a code problem yet. Could you send
the output of mx_info from node-1 and from node-2? Also, forgive me
counter-asking a possibly dumb OMPI question, but is "-x LD_LIBRARY_PATH"
really what you want, as opposed to "-x LD
I was initially using 1.1.2 and moved to 1.2b2 because of a hang in
MPI_Bcast(), which 1.2b2 reportedly fixes, and it seems to have done so. My
compute nodes are 2 dual-core Xeons on Myrinet with MX. The problem is
trying to get ompi running on mx only. My machine file is as follows ...
node-1 slots=4