Hi Matt,
There seem to be two different issues here:
a) The warning message comes from the openib btl. Given that Omni-Path has a
verbs API and you have the necessary libraries on your system, the openib btl finds
itself as a potential transport and prints the warning during its init (openib
btl
Hi Matt,
A few comments/questions:
- If your cluster has Omni-Path, you won’t need UCX. Instead you can
run using PSM2, or alternatively OFI (a.k.a. Libfabric)
- With the command you shared below (4 ranks on the local node) (I
think) a shared mem transport is being selected (va
FI provider:
/tmp> mpirun -np 2 -mca mtl ofi -mca pml cm -mca mtl_ofi_provider_include psm2
./a
Hello World from proccess 0 out of 2
This is process 0 reporting::
Hello World from proccess 1 out of 2
Process 1 received number 10 from process 0
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Cabral,
Matias A
Sent: Friday, January 11, 2019 3:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] Open MPI 4.0.0 - error with MPI_Send
Hi Eduardo,
The OFI MTL got some new features during 2018 that went into v4.0.0 but are not
backported to older OMPI versions.
What version of libfabric are you using and where are you installing it from?
I will try to reproduce your error. I'm running some quick tests and I see it
working:
Hey Jeff,
I will help with the OFI part.
Thanks,
_MAC
-----Original Message-----
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Jeff Squyres
(jsquyres) via users
Sent: Thursday, June 14, 2018 12:50 PM
To: Open MPI User's List
Cc: Jeff Squyres (jsquyres)
Subject: Re: [OMP
Hi Charles,
What version of libfabric do you have installed? To run OMPI using the verbs
provider you need to pair it with the ofi_rxm provider. fi_info should list it
like:
…
provider: verbs;ofi_rxm
…
So in your command line you have to specify:
mpirun -mca pml cm -mca mtl ofi -mca mtl_ofi_pro
Hi William,
Couple other questions:
- Please share what your ompi configure line looks like.
- Please clarify which is/are the compat libraries you refer to. There are
some that are actually for the opposite case: Making TS app/libs run on
Omnipath.
- As Gilles mentions, moving to a newer m
Hi Esthela,
As George mentions, this is indeed libpsm2 printing this error. Opcode=0xCC is
a disconnect retry. There are a few scenarios that could be happening, but to
simplify, it is a message for an already disconnected endpoint arriving
late. What version of Intel Omni-Path Software or
Hi Jingchao,
The log shows the psm mtl is being selected.
…
[c1725.crane.hcc.unl.edu:187002] select: init returned priority 20
[c1725.crane.hcc.unl.edu:187002] selected cm best priority 30
[c1725.crane.hcc.unl.edu:187002] select: component ob1 not selected / finalized
[c1725.crane.hcc.unl.edu:1870
Hi Hristo,
As you mention, I have seen that the sm btl shows better performance for smaller
messages than the PSM2 shm device does, by running some OSU benchmarks (especially
BW for msg<256B). I even suspect that the difference would be more notable if
you test the vader btl. However, the piece tha
Hi Thyago,
psm is the user library to run with Intel TrueScale cards.
psm2 is for Intel OmniPath.
There is a current problem in the libraries with OMPI java bindings:
https://www.open-mpi.org/faq/?category=java#java_limitations
thanks,
_MAC
From: users [mailto:users-boun...@lists.open-mpi.org]
port_lmc: 0x00
link_layer: InfiniBand
Regards.
2017-01-31 17:55 GMT+01:00 Cabral, Matias A
<matias.a.cab...@intel.com>:
Hi Wodel,
As Howard mentioned, this is probably because many ranks are sending to a
single one and exhausting the receive requests MQ. You can individually enlarge
the receive/send requests queues with the specific variables
(PSM_MQ_RECVREQS_MAX/ PSM_MQ_SENDREQS_MAX) or increase both with
PSM_
>Anyway, /dev/hfi1_0 doesn't exist.
Make sure you have the hfi1 module/driver loaded.
In addition, please confirm the links are in active state on all the nodes
with `opainfo`.
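
As a rough illustration of the first check (driver loaded, device node present), here is a minimal Python sketch. It assumes a Linux node where loaded modules are listed in /proc/modules and the device node is /dev/hfi1_0 as quoted above; the function names are made up for this example, and it does not replace checking link state with `opainfo`.

```python
import os


def hfi1_module_loaded(proc_modules_text):
    """Return True if a module named 'hfi1' appears in /proc/modules-style text.

    Each line of /proc/modules starts with the module name.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "hfi1":
            return True
    return False


def check_node(modules_path="/proc/modules", device="/dev/hfi1_0"):
    """Hypothetical helper: report driver and device-node status for one node."""
    try:
        with open(modules_path) as f:
            loaded = hfi1_module_loaded(f.read())
    except OSError:
        # File missing or unreadable: treat the driver as not loaded.
        loaded = False
    return {"module_loaded": loaded, "device_present": os.path.exists(device)}


if __name__ == "__main__":
    print(check_node())
```

Running it on every node (for example through your batch system) gives a quick view of which nodes are missing the driver or the device node.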
_MAC
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Howard
Pritchard
Sent: Thursday, December 08, 2016
build --with-psm2 failed on CentOS 7.2
Thank you very much, MAC!
Limin
On Tue, Oct 11, 2016 at 10:15 PM, Cabral, Matias A
<matias.a.cab...@intel.com> wrote:
Building psm2 should not be complicated (in case you cannot find a newer
binary):
https://github.com/01org/opa-psm2
/lib64/libpsm2.so.2: no symbols
[root@uranus ~]#
Thanks!
Limin
On Tue, Oct 11, 2016 at 7:00 PM, Cabral, Matias A
<matias.a.cab...@intel.com> wrote:
Hi Limin,
psm2_mq_irecv2 should be in libpsm2.so. I’m not quite sure how CentOS packs it
so I would like a little more info about the version being used. Some things to
share:
> rpm -qi libpsm2-0.7-4.el7.x86_64
> objdump -p /usr/lib64/libpsm2.so | grep SONAME
> nm /usr/lib64/libpsm2.so | grep psm
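
If it is easier than reading the nm output, the specific symbol can also be probed directly. The following is only a sketch in Python using ctypes; the function name is made up for this example, and the library path is the one from the commands above (adjust it to wherever your distribution installs libpsm2). Note that this actually loads the library, so it also tells you whether the library can be opened at all.

```python
import ctypes


def exports_symbol(lib_path, symbol):
    """Return True or False depending on whether the library exports `symbol`.

    Returns None if the library itself cannot be loaded.
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return None
    # ctypes resolves the symbol lazily on attribute access;
    # hasattr() is False when the lookup fails.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    print(exports_symbol("/usr/lib64/libpsm2.so", "psm2_mq_irecv2"))
```

A result of False would suggest the installed libpsm2 is too old to provide psm2_mq_irecv2, while None points at a missing or unloadable library instead.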
Hi Gilles et al.,
You are right, ptl.c is in PSM2 code. As Ralph mentions, dynamic process
support was/is not working in OMPI when using PSM2 because of an issue related
to the transport keys. This was fixed in PR #1602
(https://github.com/open-mpi/ompi/pull/1602) and should be included in v2.0.
when the receiver is not ready to receive?
On Wed, Aug 10, 2016 at 11:48 PM, Cabral, Matias A
<matias.a.cab...@intel.com> wrote:
To remain in eager mode you need to increase the size of
PSM2_MQ_RNDV_HFI_THRESH.
PSM2_MQ_EAGER_SDMA_SZ is the threshold at which PSM changes from P
e on
"Just in case PSM2_MQ_EAGER_SDMA_SZ changes PIO to SDMA, always in eager mode."
Thanks!
Michael
On Wed, Aug 10, 2016 at 3:59 PM, Cabral, Matias A
<matias.a.cab...@intel.com> wrote:
Hi Michael,
When Open MPI runs on Omni-Path it will choose the PSM2 MTL by default, to use
libpsm2.so. Strictly speaking, it is also able to run using the openib
BTL. However, performance is so significantly impacted that this is not only
discouraged, but no tuning would make sense. Reg
Hi Durga,
Here is a short summary:
PSM: is intended for Intel TrueScale InfiniBand product series. It is also
known as PSM gen 1, uses libpsm_infinipath.so
PSM2: is intended for Intel’s next generation fabric called OmniPath. PSM gen2,
uses libpsm2.so. I didn’t know about the owner.txt missing.
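
To make the summary concrete, here is a small Python sketch that maps each fabric to its library and MTL and builds the matching mpirun arguments. The dictionary keys and the function name are made up for this example. The "-mca pml cm -mca mtl psm" form is the one used on this list for TrueScale/QLogic cards, and psm2 is the Open MPI MTL component that sits on top of libpsm2.so.

```python
# Hypothetical lookup table; the keys are labels chosen for this example.
FABRICS = {
    "truescale": {"library": "libpsm_infinipath.so", "mtl": "psm"},
    "omnipath": {"library": "libpsm2.so", "mtl": "psm2"},
}


def mpirun_args(fabric, nprocs, program):
    """Build an mpirun argument list that selects the cm PML and the fabric's MTL."""
    mtl = FABRICS[fabric]["mtl"]
    return ["mpirun", "-np", str(nprocs),
            "-mca", "pml", "cm",
            "-mca", "mtl", mtl,
            program]


if __name__ == "__main__":
    print(" ".join(mpirun_args("omnipath", 2, "./a")))
```

Selecting the MTL explicitly like this is mostly useful for debugging; on a correctly installed system Open MPI should pick the right one by default.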
016 5:52 AM
To: Open MPI Users
Subject: Re: [OMPI users] locked memory and queue pairs
On Wed, Mar 16, 2016 at 4:49 PM, Cabral, Matias A
wrote:
> I didn't go into the code to see who is actually calling this error message,
> but I suspect this may be a generic error for "out
age-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Michael Di Domenico
Sent: Wednesday, March 16, 2016 1:25 PM
To: Open MPI Users
Subject: Re: [OMPI users] locked memory and queue pairs
On Wed, Mar 16, 2016 at 3:37 PM, Cabral, Matias A
wrote:
> Hi Michael,
>
> I may be mis
Hi Michael,
I may be missing some context, but if you are using the QLogic cards you will
always want to use the psm mtl (-mca pml cm -mca mtl psm) and not the openib btl.
As Tom suggests, confirm the limits are set up on every node: could it be the
alltoall is reaching a node that "others" are not? Plea