(s) to
eventually come up on the same node?
many thanks,
Jan
hi Gabriele,
You can check some of the available options here -
https://www.ibm.com/support/knowledgecenter/en/SSZTET_10.1.0/smpi02/smpi02_interconnect.html
The "-pami_noib" option might be of help in this scenario. Alternatively,
on a single node, the vader BTL can also be used.
Regards
--Nysal
If this is Power 8 in LE mode, it's most likely a libtool issue. You need
libtool >= 2.4.3, which has the LE patches, and need to run autogen.pl
again. I have an issue open for this -
https://github.com/open-mpi/ompi/issues/396
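A minimal sketch of checking the libtool version before re-running autogen.pl. The helper name version_at_least is made up for illustration, and the comparison assumes a sort that supports -V (GNU coreutils does):

```shell
# Succeed if version $1 is at least version $2 (relies on "sort -V").
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical usage from the top of the source tree (not executed here):
#   have=$(libtool --version | head -n 1 | awk '{print $NF}')
#   version_at_least "$have" 2.4.3 && ./autogen.pl
```

The usage lines assume that the first line of "libtool --version" ends with the version number, which may not hold on every platform.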
Regards
--Nysal
On Sat, Mar 28, 2015 at 12:41 AM, Hammond, Simon David
Collecting MPI Profile information might help narrow down the issue. You
could use some of the tools mentioned here -
http://www.open-mpi.org/faq/?category=perftools
--Nysal
On Wed, Dec 1, 2010 at 11:59 PM, Price, Brian M (N-KCI) <
brian.m.pr...@lmco.com> wrote:
> OpenMPI version: 1.4.3
>
> Pla
n.
>
>
>
> Brian
>
>
>
>
>
> *From:* users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] *On
> Behalf Of *Nysal Jan
> *Sent:* Wednesday, November 10, 2010 12:19 AM
> *To:* Open MPI Users
> *Subject:* EXTERNAL: Re: [OMPI users] Creating 64-bit obj
the same issue occur in OMPI 1.5?
>
> Should we put in a local patch for OMPI 1.4.x and/or OMPI 1.5? (we've done
> this before while waiting for upstream Libtool patches to be released, etc.)
>
>
>
> On Nov 10, 2010, at 2:19 AM, Nysal Jan wrote:
>
> > Hi Brian,
>
Hi Brian,
This problem was first reported by Paul H. Hargrove in the developer mailing
list. It is a bug in libtool and has been fixed in the latest release
(2.2.8). More details are available here -
http://www.open-mpi.org/community/lists/devel/2010/10/8606.php
Regards
--Nysal
On Wed, Nov 10, 20
>>>>>>>>>> messages size are
> > >>>>>>>>>>>
> > >>>>>>>>>>>> 10k but can be very much larger.
> > >>>>>>>>>>>
> > >>>>>>>>>
Hi Eloi,
Sorry for the delay in response. I haven't read the entire email thread, but
do you have a test case which can reproduce this error? Without that it will
be difficult to nail down the cause. Just to clarify, I do not work for an
iwarp vendor. I can certainly try to reproduce it on an IB sy
Hi Gilbert,
Checksums are turned off by default. If you need checksums to be activated
add "-mca pml csum" to the mpirun command line.
Checksums are enabled only for inter-node communication. Intra-node
communication is typically over shared memory and hence checksum is disabled
for this case.
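As a sketch of the option above, a small wrapper can add the csum PML to every launch. The function name run_with_csum and the MPIRUN override variable are assumptions made for illustration, not part of Open MPI; setting MPIRUN=echo prints the arguments instead of launching anything:

```shell
# Launch an MPI job with the checksum PML enabled.
# MPIRUN is a hypothetical override; it defaults to the real mpirun.
run_with_csum() {
    "${MPIRUN:-mpirun}" -mca pml csum "$@"
}

# Example (not executed here):
#   run_with_csum -np 4 ./my_app
```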
If y
te an invalid
> memory access allowing to understand why reg->cbfunc / hdr->tag are null.
>
> Do you think that a thread race condition could explain the hdr->tag value
> ?
>
> Thanks for your help,
> Eloi
>
> On Monday 16 August 2010 20:46:39 Nysal Jan wrote:
>
The value of hdr->tag seems wrong.
In ompi/mca/pml/ob1/pml_ob1_hdr.h
#define MCA_PML_OB1_HDR_TYPE_MATCH (MCA_BTL_TAG_PML + 1)
#define MCA_PML_OB1_HDR_TYPE_RNDV (MCA_BTL_TAG_PML + 2)
#define MCA_PML_OB1_HDR_TYPE_RGET (MCA_BTL_TAG_PML + 3)
#define MCA_PML_OB1_HDR_TYPE_ACK (MCA_BT
ibc instead of openmpi one, but does
> it have an effect on performance or something else ?
>
> Nicolas
>
> 2010/8/8 Nysal Jan
>
> What interconnect are you using? Infiniband? Use
>> "--without-memory-manager" option while building ompi in order to disable
>
Thanks for reporting this Matthew. Fixed in r23576 (
https://svn.open-mpi.org/trac/ompi/changeset/23576)
Regards
--Nysal
On Fri, Aug 6, 2010 at 10:38 PM, Matthew Clark wrote:
> I was looking in my copy of openmpi-1.4.1 opal/asm/base/POWERPC32.asm
> and saw the following:
>
> START_FUNC(opal_sys_
What interconnect are you using? Infiniband? Use "--without-memory-manager"
option while building ompi in order to disable ptmalloc.
Regards
--Nysal
On Sun, Aug 8, 2010 at 7:49 PM, Nicolas Deladerriere <
nicolas.deladerri...@gmail.com> wrote:
> Yes, I'm using a 24G machine running a 64-bit Linux OS.
>
You can find the template for a BTL in ompi/mca/btl/template (You will find
this on the subversion trunk). Copy and rename the folder/files. Use this as
a starting point.
For details on creating a new component (such as a new BTL) look here -
https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComp
OMPI_COMM_WORLD_RANK can be used to get the MPI rank. For other environment
variables -
http://www.open-mpi.org/faq/?category=running#mpi-environmental-variables
For processor affinity see this FAQ entry -
http://www.open-mpi.org/faq/?category=all#using-paffinity
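A minimal sketch of reading the rank from a wrapper script started by mpirun. The function name rank_label and the fallback text "unknown" are illustrative choices, not Open MPI conventions:

```shell
# Print a per-process label from the rank that Open MPI exports.
# Falls back to "unknown" when the variable is not set.
rank_label() {
    echo "rank ${OMPI_COMM_WORLD_RANK:-unknown}"
}

# Hypothetical usage inside a wrapper script (not executed here):
#   ./my_app > "out.$(rank_label | tr ' ' '_').log"
```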
--Nysal
On Wed, Jul 28, 2010 at 9
__cxa_get_exception_ptr should be defined in libstdc++ shared library.
--Nysal
On Mon, Jun 14, 2010 at 5:51 AM, HeeJin Kim wrote:
> Dear all,
>
> I had built openmpi-1.4.2 with:
> configure CC=icc CXX=icpc F77=ifort FC=ifort
> --prefix=/home/biduri/program/openmpi --enable-mpi-threads --enable-
This probably got fixed in https://svn.open-mpi.org/trac/ompi/ticket/2386
Can you try 1.4.2? The fix should be in there.
Regards
--Nysal
On Thu, May 20, 2010 at 2:02 PM, Olivier Riff wrote:
> Hello,
>
> I assume this question has been already discussed many times, but I can not
> find on Intern
Thanks Jeff. We finally solved this problem. We downloaded the newest
OFED-1.4.1-rc6.tgz and reinstalled the InfiniBand drivers and utilities on
all nodes. Everything looks good, and I have my own coffee time now. Thanks
again.
Best Regards,
Gloria Jan
Wavelink Technology Inc
I don't think th
:0
active_width: 4X (2)
active_speed: 5.0 Gbps (2)
phys_state: LINK_UP (5)
GID[ 0]:
fe80::::0018:8b90:97fe:73ce
Best Regards,
Gloria Jan
Wavelink
Thank you Jeff. I have passed the mail to the IB vendor, Dell (the
blade was ordered from Dell Taiwan), but he told me that he didn't
understand "layer 0 diagnostics". Could you help us get more
information about "layer 0 diagnostics"? Thanks again.
Rega
][[3175,1],0][btl_openib_component.c:3029:poll_device] error polling
HP CQ with -2 errno says Success
=
Is this problem unsolvable?
Best Regards,
Gloria Jan
Wavelink Technology Inc
I can confirm that I have exactly the same
nectX product?
Thank you again.
Best Regards,
Gloria Jan
Wavelink Technology Inc
I can confirm that I have exactly the same problem, also on Dell
system, even with the latest openmpi.
Our system is:
Dell M905
OpenSUSE 11.1
kernel: 2.6.27.21-0.1-default
ofed-1.4-21.12 from SUSE repositories.
Op
configuration again, but
found that the problem still occurs periodically, i.e. two successes, then two
failures, two successes, then two failures, and so on. Do you have any
suggestions for this issue?
Thank you again.
Best Regards,
Gloria Jan
Wavelink Technology Inc.
Per http://www.open-mpi.org/community/lists
:21 AM, jan wrote:
Dear Sir,
I'm running a cluster with OpenMPI.
$mpirun --mca mpi_show_mpi_alloc_mem_leaks 8 --mca
mpi_show_handle_leaks 1 $HOME/test/cpi
The job failed with the following error message:
Process 15 on node2
Process 6 on node1
Process 14 on node2
...
Process 0 on node1
Process 10 on
ed on Dell PowerEdge M600 Blade Server. The InfiniBand mezzanine cards are
Mellanox ConnectX QDR & DDR, and the InfiniBand switch module is a
Mellanox M2401G. The OS is CentOS 5.3, kernel 2.6.18-128.1.6.el5, with the
PGI V7.2-5 compiler. It's running the OpenSM subnet manager.
Best Regards,
Gloria Jan
Wavel
You could try statically linking the Intel-provided libraries. Use
LDFLAGS=-static-intel
--Nysal
On Wed, 2009-04-15 at 21:03 +0200, Francesco Pietra wrote:
> On Wed, Apr 15, 2009 at 8:39 PM, Prentice Bisbal wrote:
> > Francesco Pietra wrote:
> >> I used --with-libnuma=/usr since Prentice Bisbal
Can you try adding --disable-dlopen to the configure command line?
--Nysal
On Tue, 2009-04-14 at 10:19 +0200, Jean-Michel Beuken wrote:
> Hello,
>
> I'm trying to build 1.3.1 under IBM Power5 + SLES 9.1 + XLF 9.1...
>
> after some searches on FAQ and Google, my configure :
>
> export CC="/opt/
fs1 is selecting the "cm" PML whereas other nodes are selecting the
"ob1" PML component. You can force ob1 to be used via "--mca pml ob1"
What kind of hardware/NIC does fs1 have?
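The "--mca pml ob1" override can also be supplied through the environment, since Open MPI reads MCA parameters from variables named OMPI_MCA_<parameter>. A minimal sketch, assuming the variable is exported in the shell that later runs mpirun:

```shell
# Equivalent to passing "--mca pml ob1" on each mpirun command line
# started from this shell.
export OMPI_MCA_pml=ob1
```

A value given explicitly on the mpirun command line still takes precedence over the environment setting.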
--Nysal
On Wed, 2009-03-18 at 17:17 -0400, Gary Draving wrote:
> Hi all,
>
> anyone ever seen an error like this? Se
ts checked, 103 ports have errors beyond threshold
I wonder if this is something that needs to be tuned in the Infiniband
switch or if there is something in OpenMPI/OpenIB that can be tuned.
Thanks,
Jan Lindheim
On Wed, Mar 04, 2009 at 04:34:49PM -0500, Jeff Squyres wrote:
> On Mar 4, 2009, at 4:16 PM, Jan Lindheim wrote:
>
> >On Wed, Mar 04, 2009 at 04:02:06PM -0500, Jeff Squyres wrote:
> >> This *usually* indicates a physical / layer 0 problem in your IB
> >> fabric. You
> fabrics and/or very congested networks.
Thanks Jeff!
What is considered to be very large IB fabrics?
I assume that with just over 180 compute nodes,
our cluster does not fall into this category.
Jan
>
>
> On Mar 4, 2009, at 3:28 PM, Jan Lindheim wrote:
>
> >I found sev
parameters that need to be looked at too?
Thanks for any insight on this!
Regards,
Jan Lindheim
On Tue, 2009-02-24 at 13:30 -0500, Jeff Squyres wrote:
> - Get Python to give you the possibility of opening dependent
> libraries in the global scope. This may be somewhat controversial;
> there are good reasons to open plugins in private scopes. But I have
> to imagine that OMPI is not th
mp_Stealth-OMPI < in.testbench_small
> LAMMPS (22 Jan 2008)
>
> Interestingly, I downloaded Open MPI 1.2.8, built it with the same
> configure options I had used with 1.3, and it worked.
>
> I'm getting by fine with 1.2.8. I just wanted to file a possible bug
> rep
I implementations to validate
the prediction model I constructed in my local cluster.
Regards,
Jan Ploski
users-boun...@open-mpi.org schrieb am 08/06/2008 07:44:03 PM:
> On Aug 6, 2008, at 12:37 PM, Jan Ploski wrote:
>
> >> I'm using the latest of Open MPI compiled with debug turned on, and
> >> valgrind 3.3.0. From your trace it looks like there is a conflict
> >
George Bosilca wrote:
Jan,
I'm using the latest of Open MPI compiled with debug turned on, and
valgrind 3.3.0. From your trace it looks like there is a conflict
between two memory managers. I'm not having the same problem as I
disable the Open MPI memory manager on my builds
users-boun...@open-mpi.org schrieb am 08/05/2008 05:51:51 PM:
> Jan,
>
> I'm using valgrind with Open MPI on a [very] regular basis and I never
> had any problems. I usually want to know the execution path on the MPI
> applications. For this I use:
> mpirun -np XX valgr
tation and recompiling)?
Best regards,
Jan Ploski
--
Dipl.-Inform. (FH) Jan Ploski
OFFIS
FuE Bereich Energie | R&D Division Energy
Escherweg 2 - 26121 Oldenburg - Germany
Phone/Fax: +49 441 9722 - 184 / 202
E-Mail: jan.plo...@offis.de
URL: http://www.offis.de
Gabriele,
Can you try with Open MPI 1.2.6? It has a parameter that controls early
completion; set it to zero to disable it (-mca pml_ob1_use_early_completion 0).
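Instead of passing the parameter on every command line, it can be kept in a per-user MCA parameter file, which Open MPI reads from $HOME/.openmpi/mca-params.conf. A minimal sketch; the helper name set_mca_param is hypothetical:

```shell
# Append "name = value" to an MCA parameter file unless the name is
# already set there.  $1 = file, $2 = parameter name, $3 = value
set_mca_param() {
    mkdir -p "$(dirname "$1")"
    touch "$1"
    grep -q "^$2[[:space:]]*=" "$1" || printf '%s = %s\n' "$2" "$3" >> "$1"
}

# Hypothetical usage (not executed here):
#   set_mca_param "$HOME/.openmpi/mca-params.conf" pml_ob1_use_early_completion 0
```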
--Nysal
On Wed, May 7, 2008 at 9:29 PM, Gabriele FATIGATI
wrote:
> I have attached informations requested about Infiniband net and OpenMPi
> enviromen
Jeff,
Ok, this solved the problem with the Pathscale compiler.
Thanks
-- Jan
Message: 2
Date: Thu, 1 Jun 2006 17:37:36 -0400
From: "Jeff Squyres \(jsquyres\)"
Subject: Re: [OMPI users] openmpi-1.1a9r10157 Fails to build with Nag
f95Compiler
To: "Open MPI Use
Hi,
openmpi-1.1a9r10157's Fortran bindings also fail to build with the
Pathscale 2.1 pathf90 compiler. The failure is at the same spot but with
different error messages (see below), which perhaps helps to clarify things.
Any help would be greatly appreciated as well.
Best regards,
Jan De
46 matches