Evening everyone,
I'm running a CFD code on IB and I've hit an error I'm not sure about,
so I'm looking for some guidance on where to start. Here's the error:
mlx4: local QP operation err (QPN 260092, WQE index 9a9e, vendor syndrome
6f, opcode = 5e)
[0,1,6][btl_openib_compon
Terry,
Is there a libnuma.a on your system? If not, the -static flag to ifort
won't do anything, because there isn't a static library for it to link
against.
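A quick way to check (the paths below are just the usual locations; adjust for your distro):

  ls /usr/lib64/libnuma.* /usr/lib/libnuma.* 2>/dev/null
  find /usr -name 'libnuma.*' 2>/dev/null

If that only turns up libnuma.so and no libnuma.a, there is no static library for -static to pick up.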
Doug Reeder
On Mar 4, 2009, at 6:06 PM, Terry Frankcombe wrote:
Thanks to everyone who contributed.
I no longer think this is Open MPI's problem. This system is just
stupid. Everything's 64 bit (which various probes with file confirm).
There's no icc, so I can't test with that. gcc finds libnuma without
-L. (Though a simple gcc -lnuma -Wl,-t reports that
On Wed, Mar 04, 2009 at 04:34:49PM -0500, Jeff Squyres wrote:
> On Mar 4, 2009, at 4:16 PM, Jan Lindheim wrote:
>
> >On Wed, Mar 04, 2009 at 04:02:06PM -0500, Jeff Squyres wrote:
> >> This *usually* indicates a physical / layer 0 problem in your IB
> >> fabric. You should do a diagnostic on your
On Mar 4, 2009, at 4:16 PM, Jan Lindheim wrote:
On Wed, Mar 04, 2009 at 04:02:06PM -0500, Jeff Squyres wrote:
> This *usually* indicates a physical / layer 0 problem in your IB
> fabric. You should do a diagnostic on your HCAs, cables, and
> switches.
>
> Increasing the timeout value should on
On Wed, Mar 04, 2009 at 04:02:06PM -0500, Jeff Squyres wrote:
> This *usually* indicates a physical / layer 0 problem in your IB
> fabric. You should do a diagnostic on your HCAs, cables, and switches.
>
> Increasing the timeout value should only be necessary on very large IB
> fabrics and/or
This *usually* indicates a physical / layer 0 problem in your IB
fabric. You should do a diagnostic on your HCAs, cables, and switches.
Increasing the timeout value should only be necessary on very large IB
fabrics and/or very congested networks.
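For example, assuming the standard OFED diagnostic tools are installed on your nodes, something like

  ibstat
  ibdiagnet
  ibcheckerrors

run from a node on the fabric will usually flag bad ports, cables, or links with high error counters. Exactly which tools you have depends on your OFED installation.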
On Mar 4, 2009, at 3:28 PM, Jan Lindheim w
Sorry for the delay; a bunch of higher priority stuff got in the way
of finishing this thread. Anyhoo...
On Feb 24, 2009, at 4:24 AM, Olaf Lenz wrote:
I think it would be also sufficient to place a short text and link
to the Trac page, so that the developers that want to use the "Bug
Tra
Sorry for the delay in replying -- INBOX deluge makes me miss emails
on the users list sometimes.
I'm unfortunately not familiar with gamess -- have you checked with
their support lists or documentation?
Note that Open MPI's IB progression engine will spin hard to make
progress for messag
I found several reports on the openmpi users mailing list from users
who need to bump up the default value for btl_openib_ib_timeout.
We also have some applications on our cluster that have problems
unless we set this value from the default 10 to 15:
[24426,1],122][btl_openib_component.c:2905:
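For reference, bumping the timeout is just an MCA parameter, either on the command line:

  mpirun --mca btl_openib_ib_timeout 15 -np 16 ./your_app

or system-wide in $prefix/etc/openmpi-mca-params.conf:

  btl_openib_ib_timeout = 15

(-np 16 and ./your_app are placeholders, of course.)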
I suppose one initial question is: what version of Open MPI are you
running? OMPI 1.3 should not be attempting to ssh a daemon on a local
job like this - OMPI 1.2 -will-, so it is important to know which one
we are talking about.
Just do "mpirun --version" and it should tell you.
Ralph
O
Sorry for the delay in replying; the usual INBOX deluge keeps me from
being timely in replying to all mails... More below.
On Feb 24, 2009, at 6:52 AM, Jovana Knezevic wrote:
I'm new to MPI, so I'm going to explain my problem in detail
I'm trying to compile a simple application using mpicc (
On Feb 27, 2009, at 1:56 PM, Mahmoud Payami wrote:
I am using intel lc_prof-11 (and its own mkl) and have built
openmpi-1.3.1 with configure options: "FC=ifort F77=ifort CC=icc
CXX=icpc". Then I have built my application.
The Linux box is a 2x AMD64 quad-core. In the middle of running my
applic
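For reference, the full configure invocation would look roughly like this (the install prefix is only an example):

  ./configure FC=ifort F77=ifort CC=icc CXX=icpc --prefix=/opt/openmpi-1.3.1
  make all install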
On Mar 1, 2009, at 7:24 PM, Brett Pemberton wrote:
I'd appreciate some advice on whether I'm using OFED correctly.
I'm running OFED 1.4, however not the kernel modules, just userland.
Is this a bad idea?
I believe so. I'm not a kernel guy, but I've always used the userland
bits matched with th
Terry Frankcombe wrote:
Having just downloaded and installed Open MPI 1.3 with ifort and gcc, I
merrily went off to compile my application.
In my final link with mpif90 I get the error:
/usr/bin/ld: cannot find -lnuma
Adding --showme reveals that
-I/home/terry/bin/Local/include -pthread -I/
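In case it helps anyone hitting the same error: the full underlying link line can be inspected with

  mpif90 --showme:link

and if libnuma lives in a directory the linker is not searching (often /usr/lib64 on x86_64 systems, but check yours), adding it explicitly works around the problem:

  mpif90 -L/usr/lib64 -o myapp myapp.f90

(myapp.f90 is just a placeholder name.)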
Jeff Squyres wrote:
...
In general, you need both OMPI and your application compiled natively
for each platform. One easy way to do this is to install Open MPI
locally on each node in the same filesystem location (e.g.,
/opt/openmpi-). You also want exactly the same version of
Open MPI on a
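Roughly (the prefix below is only a placeholder; use whatever path you standardize on, but the same one everywhere):

  ./configure --prefix=/opt/openmpi-<version>
  make all install   # repeat on every node, same path, same version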
On Mar 4, 2009, at 11:38 AM, Yury Tarasievich wrote:
I'm not quite sure what an MP-MPICH meta host is.
Open MPI allows you to specify multiple hosts in a hostfile and run
a single MPI job across all of them, assuming they're connected by
at least some common TCP network.
What I need is one
Problem is that some systems install both 32 and 64 bit support, and
build OMPI both ways. So we really can't just figure it out without
some help.
At our location, we simply take care to specify the -L flag to point
to the correct version so we avoid any confusion.
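A quick sanity check is to run file on the candidate libraries, e.g. (the libnuma paths are only an example):

  file /usr/lib/libnuma.so* /usr/lib64/libnuma.so*

which reports whether each one is a 32-bit or a 64-bit ELF object, so you know which directory your -L should point at.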
On Mar 4, 2009, at 8:
Jeff Squyres wrote:
I'm not quite sure what an MP-MPICH meta host is.
Open MPI allows you to specify multiple hosts in a hostfile and run a
single MPI job across all of them, assuming they're connected by at
least some common TCP network.
What I need is one MPI job put for distributed
compu
It would also help to have some idea how you installed and ran this -
e.g., did you set mpi_paffinity_alone so that the processes would bind to
their processors? That could explain the CPU vs. elapsed time difference, since
it keeps the processes from being swapped out as much.
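If not, it is just an MCA parameter, e.g. (process count and executable are placeholders):

  mpirun --mca mpi_paffinity_alone 1 -np 24 ./your_app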
Ralph
Your Intel processors are, I assume, not the new Nehalem/i7 ones? The older
quad-core ones are seriously memory-bandwidth limited when running a
memory-intensive application. That might explain why using all 8 cores per node
slows down your calculation.
Why do you get such a difference between cp
Jeff,
See my reply to Dr. Frankcombe's original e-mail. I've experienced this
same problem with the PGI compilers, so this isn't limited to just the
Intel compilers. I provided a fix, but I think OpenMPI should be able to
figure out and add the correct linker flags during the
configuration/build s
Terry Frankcombe wrote:
> Having just downloaded and installed Open MPI 1.3 with ifort and gcc, I
> merrily went off to compile my application.
>
> In my final link with mpif90 I get the error:
>
> /usr/bin/ld: cannot find -lnuma
>
> Adding --showme reveals that
>
> -I/home/terry/bin/Local/incl
On Mar 2, 2009, at 10:17 AM, Tiago Silva wrote:
Has anyone had success building openmpi with the 64 bit Lahey
fortran compiler? I have seen a previous thread about the problems
with 1.2.6 and am wondering if any progress has been made.
I can build individual libraries by removing -rpath and
No, it is not obvious, unfortunately. Can you send all the
information listed here:
http://www.open-mpi.org/community/help/
On Mar 3, 2009, at 5:22 AM, Ondrej Marsalek wrote:
Dear everyone,
I have a calculation (the CP2K program) using MPI over Infiniband and
it is stuck. All processe
Unfortunately, we don't have a whole lot of insight into how the
internals of the IO support work -- we mainly bundle the ROMIO package
from MPICH2 into Open MPI. Our latest integration was the ROMIO from
MPICH2 v1.0.7.
Do you see the same behavior if you run your application under MPICH2
Hmm; that's odd.
Is icc / icpc able to find libnuma with no -L, but ifort is unable to
find it without a -L?
On Mar 3, 2009, at 10:00 PM, Terry Frankcombe wrote:
Having just downloaded and installed Open MPI 1.3 with ifort and
gcc, I
merrily went off to compile my application.
In my fina
I'm not quite sure what an MP-MPICH meta host is.
Open MPI allows you to specify multiple hosts in a hostfile and run a
single MPI job across all of them, assuming they're connected by at
least some common TCP network.
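For example, with a hostfile such as (hostnames and slot counts are yours to fill in):

  node01 slots=4
  node02 slots=4

you would run

  mpirun --hostfile myhosts -np 8 ./your_app

and the 8 processes get spread across both machines over TCP.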
On Mar 4, 2009, at 4:42 AM, Yury Tarasievich wrote:
Can't find this i
Can't find this in FAQ... Can I create the metahost in OpenMPI (a la
MP-MPICH), to execute the MPI application simultaneously on several
physically different machines connected by TCP/IP?
--
Hi all,
Now LAM-MPI is also installed, and I have tested the Fortran application by
running it with LAM-MPI.
But LAM-MPI is still performing worse than Open MPI.
No. of nodes: 3, cores per node: 8, total cores: 3*8 = 24
CPU TIME :1 HOURS 51 MINUTES 23.49 SECONDS
ELAPSED TIME :7 HOURS 28 MINUTES