Hi Gilles,
Thanks very much for the information.
I was looking for the best pml + btl combination for a standalone intra-node run
with a high task count (>= 192) and no HPC-class networking installed.
I just now realized that I can't use pml ucx for such cases, as it is unable to find
IB and fa
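(As a point of reference for the question above: for a purely intra-node run, the usual
non-UCX combination is the ob1 PML with the shared-memory BTL. The command below is
only my sketch, not taken from this thread; `./perf` and the task count are assumptions,
and the shared-memory BTL is named `vader` in Open MPI 4.x and `sm` in 5.x:)
# hypothetical intra-node run using ob1 + the shared-memory BTL (Open MPI 4.x naming)
$ mpirun -np 192 --map-by core --bind-to core --mca pml ob1 --mca btl vader,self ./perf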
Hi Gilles,
Yes, I am using xpmem, but I am running into the issue below.
https://github.com/open-mpi/ompi/issues/11463
--Arun
From: Gilles Gouaillardet
Sent: Monday, March 6, 2023 2:08 PM
To: Chandran, Arun
Subject: Re: [OMPI users] What is the best choice of pml and
If this run was on a single node, then UCX probably disabled itself since it
wouldn't be using InfiniBand or RoCE to communicate between peers.
Also, I'm not sure your command line was correct:
perf_benchmark $ mpirun -np 32 --map-by core --bind-to core ./perf --mca pml
ucx
You probably need
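(Jeff's suggestion is cut off above; presumably the fix is to place the MCA option before
the executable rather than after it. A sketch, reusing the `./perf` binary from the quoted
command:)
# MCA options must come before the executable, otherwise they are passed to ./perf itself
$ mpirun -np 32 --map-by core --bind-to core --mca pml ucx ./perf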
The ucx PML should work just fine even in a single-node scenario. As Jeff
indicated, you need to move the MCA param `--mca pml ucx` before your
command.
George.
On Mon, Mar 6, 2023 at 9:48 AM Jeff Squyres (jsquyres) via users <
users@lists.open-mpi.org> wrote:
> If this run was on a single node, t
[Public]
Hi,
Yes, it is run on a single node; there is no IB or RoCE attached to it.
Pasting the complete output (I might have mistakenly copy-pasted the command in
the previous mail):
#
perf_benchmark $ mpirun -np 2 --map-by core --bind-to core --mca pml ucx --mca
pml_base_verbose 10 -
Per George's comments, I stand corrected: UCX does work fine in single-node
cases -- he confirmed to me that he tested it on his laptop, and it worked for
him.
That being said, you're passing "--mca pml ucx" in the correct place now, and
you're therefore telling Open MPI "_only_ use the UCX PML"
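(If forcing the ucx PML this way makes the job abort, one quick sanity check -- my
suggestion, not from this thread -- is to confirm the component is built into the
installation at all:)
# list the PML components known to this Open MPI build; 'ucx' should appear among them
$ ompi_info | grep "MCA pml"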
UCX will disqualify itself unless it finds CUDA, ROCm, or an InfiniBand network to
use. To allow UCX to run on a regular shared-memory job without GPUs or IB, you
have to set the UCX_TLS environment variable explicitly to allow UCX to run with shm,
e.g.:
mpirun -x UCX_T
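(Edgar's command is truncated above; a minimal sketch of what such an invocation could
look like, assuming the `shm` and `self` transports are what should be allowed and
reusing `./perf` as the binary:)
# export UCX_TLS to the ranks so UCX accepts a shared-memory-only run
$ mpirun -np 2 -x UCX_TLS=shm,self --mca pml ucx ./perf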
Edgar is right, UCX_TLS has some role in the selection. You can see the
current selection by running `ucx_info -c`. In my case, UCX_TLS is set to
`all` somehow, and I had either a not-connected IB device or a GPU.
However, I did not set UCX_TLS manually, and I can't see it anywhere in my
system con
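(For example, to see what UCX_TLS resolves to on a given machine without setting it
yourself:)
# print the effective UCX configuration and filter for the transport selection
$ ucx_info -c | grep -i tls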
I think some of the mails in this thread got delivered out of order. Edgar's
and George's comments about how/when the UCX PML is selected
I was able to run with ucx using the below command (Ref:
https://www.mail-archive.com/users@lists.open-mpi.org/msg34585.html)
$ mpirun -np 2 --map-by core --bind-to core --mca pml ucx --mca
pml_base_verbose 10 --mca mtl_base_verbose 10 -x OMPI_MCA_pml_ucx_verbose=10 -x
UCX_LO
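(The working command above is cut off; once the verbose options are dropped, a
stripped-down variant would presumably look like the following -- the UCX_TLS value is
an assumption, matching Edgar's shm suggestion earlier in the thread:)
# same run without the debug/verbose options
$ mpirun -np 2 --map-by core --bind-to core --mca pml ucx -x UCX_TLS=shm,self ./perf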
Hi,
I went through this code
(https://github.com/open-mpi/ompi/blob/main/opal/mca/common/ucx/common_ucx.c#L216)
last week, and the logic can be summarized as:
* ask for the available transports on a context (see the ucx_info example below)
* check if one of the transports specified by opal_common_ucx_tls (or
mca_pml_ucx_tls or
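(A quick way to see from the shell which transports UCX itself reports as available on
a node -- essentially the answer to that first query -- is, for example:)
# list UCX devices and the transport each one provides
$ ucx_info -d | grep -i transport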