Per George's comments, I stand corrected: UCX does work fine in single-node cases -- he confirmed to me that he tested it on his laptop, and it worked for him.
I think some of the mails in this thread got delivered out of order. Edgar's and George's comments about how/when the UCX PML is selected make my above comment moot. Sorry for any confusion!

________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Jeff Squyres (jsquyres) via users <users@lists.open-mpi.org>
Sent: Monday, March 6, 2023 10:40 AM
To: Chandran, Arun <arun.chand...@amd.com>; Open MPI Users <users@lists.open-mpi.org>
Cc: Jeff Squyres (jsquyres) <jsquy...@cisco.com>
Subject: Re: [OMPI users] What is the best choice of pml and btl for intranode communication

Per George's comments, I stand corrected: UCX does work fine in single-node cases -- he confirmed to me that he tested it on his laptop, and it worked for him.

That being said, you're passing "--mca pml ucx" in the correct place now, and you're therefore telling Open MPI "_only_ use the UCX PML". Hence, if the UCX PML can't be used, it's an aborting type of error.

The question is: why is the UCX PML not usable on your node? Your output clearly shows that UCX chooses to disable itself -- is that because there are no IB / RoCE interfaces at all? (This is an open question to George / the UCX team.)

________________________________
From: Chandran, Arun <arun.chand...@amd.com>
Sent: Monday, March 6, 2023 10:31 AM
To: Jeff Squyres (jsquyres) <jsquy...@cisco.com>; Open MPI Users <users@lists.open-mpi.org>
Subject: RE: [OMPI users] What is the best choice of pml and btl for intranode communication

Hi,

Yes, it is run on a single node; there is no IB or RoCE attached to it.

Pasting the complete output (I might have mistakenly copy-pasted the command in the previous mail):

#########
perf_benchmark $ mpirun -np 2 --map-by core --bind-to core --mca pml ucx --mca pml_base_verbose 10 --mca mtl_base_verbose 10 -x OMPI_MCA_pml_ucx_verbose=10 -x UCX_LOG_LEVEL=info -x UCX_PROTO_ENABLE=y -x UCX_PROTO_INFO=y ./perf
[1678115882.908665] [lib-ssp-04:759377:0] ucp_context.c:1849 UCX INFO Version 1.13.1 (loaded from /home/arun/openmpi_work/ucx-1.13.1/install/lib/libucp.so.0)
[lib-ssp-04:759377] mca: base: components_register: registering framework pml components
[lib-ssp-04:759377] mca: base: components_register: found loaded component ucx
[lib-ssp-04:759377] mca: base: components_register: component ucx register function successful
[lib-ssp-04:759377] mca: base: components_open: opening pml components
[lib-ssp-04:759377] mca: base: components_open: found loaded component ucx
[lib-ssp-04:759377] common_ucx.c:174 using OPAL memory hooks as external events
[lib-ssp-04:759377] pml_ucx.c:197 mca_pml_ucx_open: UCX version 1.13.1
[lib-ssp-04:759377] mca: base: components_open: component ucx open function successful
[lib-ssp-04:759377] select: initializing pml component ucx
[lib-ssp-04:759377] common_ucx.c:333 self/memory0: did not match transport list
[lib-ssp-04:759377] common_ucx.c:333 tcp/lo: did not match transport list
[lib-ssp-04:759377] common_ucx.c:333 tcp/enp33s0: did not match transport list
[lib-ssp-04:759377] common_ucx.c:333 sysv/memory: did not match transport list
[lib-ssp-04:759377] common_ucx.c:333 posix/memory: did not match transport list
[lib-ssp-04:759377] common_ucx.c:333 cma/memory: did not match transport list
[lib-ssp-04:759377] common_ucx.c:333 xpmem/memory: did not match transport list
[lib-ssp-04:759377] common_ucx.c:337 support level is none
[lib-ssp-04:759377] select: init returned failure for component ucx
--------------------------------------------------------------------------
No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed components can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

  Host:      lib-ssp-04
  Framework: pml
--------------------------------------------------------------------------
[lib-ssp-04:759377] PML ucx cannot be selected
[lib-ssp-04:759376] mca: base: components_register: registering framework pml components
[lib-ssp-04:759376] mca: base: components_register: found loaded component ucx
[lib-ssp-04:759376] mca: base: components_register: component ucx register function successful
[lib-ssp-04:759376] mca: base: components_open: opening pml components
[lib-ssp-04:759376] mca: base: components_open: found loaded component ucx
[lib-ssp-04:759376] common_ucx.c:174 using OPAL memory hooks as external events
[lib-ssp-04:759376] pml_ucx.c:197 mca_pml_ucx_open: UCX version 1.13.1
[1678115882.913551] [lib-ssp-04:759376:0] ucp_context.c:1849 UCX INFO Version 1.13.1 (loaded from /home/arun/openmpi_work/ucx-1.13.1/install/lib/libucp.so.0)
##########

So pml/ucx disables itself by default when no compatible networking equipment is found.

--Arun
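A side note that is not from the original mails: a quick way to see for yourself which devices and transports UCX detects on a node is the ucx_info utility that ships with UCX. A minimal sketch, assuming the UCX 1.13.1 install shown above is first in your PATH:

# Show the UCX version and build configuration
$ ucx_info -v

# List the transports and devices UCX can use on this node
$ ucx_info -d | grep -E 'Transport|Device'

If memory serves, the "did not match transport list" messages above come from the UCX PML's device/transport allowlist, which is controlled by the pml_ucx_tls and pml_ucx_devices MCA parameters, so something like "--mca pml_ucx_tls any --mca pml_ucx_devices any" may let pml/ucx initialize over shared memory even without IB/RoCE. Treat that as an assumption to verify, e.g. with "ompi_info --param pml ucx --level 9".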
________________________________
From: Jeff Squyres (jsquyres) <jsquy...@cisco.com>
Sent: Monday, March 6, 2023 8:13 PM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Chandran, Arun <arun.chand...@amd.com>
Subject: Re: [OMPI users] What is the best choice of pml and btl for intranode communication

If this run was on a single node, then UCX probably disabled itself, since it wouldn't be using InfiniBand or RoCE to communicate between peers.

Also, I'm not sure your command line was correct:

perf_benchmark $ mpirun -np 32 --map-by core --bind-to core ./perf --mca pml ucx

You probably need to list all of mpirun's CLI options before you list the ./perf executable. In its left-to-right traversal of the command line, once mpirun hits a CLI option it does not recognize (e.g., "./perf"), it assumes that it is the user's executable name, and does not process the CLI options to the right of that. Hence, the output you show must have forced the UCX PML another way -- perhaps you set an environment variable or something?
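To make the ordering point concrete (my illustration, not from the original mails): everything after the executable name is handed to the application as its own arguments, so the two invocations below are not equivalent. Assuming the same ./perf binary:

# Not what was intended: "--mca pml ucx" becomes an argument of ./perf
$ mpirun -np 32 --map-by core --bind-to core ./perf --mca pml ucx

# Intended: all mpirun options appear before the executable
$ mpirun -np 32 --map-by core --bind-to core --mca pml ucx ./perf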
________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Chandran, Arun via users <users@lists.open-mpi.org>
Sent: Monday, March 6, 2023 3:33 AM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Chandran, Arun <arun.chand...@amd.com>
Subject: Re: [OMPI users] What is the best choice of pml and btl for intranode communication

Hi Gilles,

Thanks very much for the information.

I was looking for the best pml + btl combination for a standalone node with a high task count (>= 192) and no HPC-class networking installed. I just now realized that I can't use pml/ucx for such cases, as it is unable to find IB and fails.

perf_benchmark $ mpirun -np 32 --map-by core --bind-to core ./perf --mca pml ucx
--------------------------------------------------------------------------
No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed components can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

  Host:      lib-ssp-04
  Framework: pml
--------------------------------------------------------------------------
[lib-ssp-04:753542] PML ucx cannot be selected
[lib-ssp-04:753531] PML ucx cannot be selected
[lib-ssp-04:753541] PML ucx cannot be selected
[lib-ssp-04:753539] PML ucx cannot be selected
[lib-ssp-04:753545] PML ucx cannot be selected
[lib-ssp-04:753547] PML ucx cannot be selected
[lib-ssp-04:753572] PML ucx cannot be selected
[lib-ssp-04:753538] PML ucx cannot be selected
[lib-ssp-04:753530] PML ucx cannot be selected
[lib-ssp-04:753537] PML ucx cannot be selected
[lib-ssp-04:753546] PML ucx cannot be selected
[lib-ssp-04:753544] PML ucx cannot be selected
[lib-ssp-04:753570] PML ucx cannot be selected
[lib-ssp-04:753567] PML ucx cannot be selected
[lib-ssp-04:753534] PML ucx cannot be selected
[lib-ssp-04:753592] PML ucx cannot be selected
[lib-ssp-04:753529] PML ucx cannot be selected
<snip>

That means my only choice is pml/ob1 + btl/vader.

--Arun

________________________________
From: users <users-boun...@lists.open-mpi.org> On Behalf Of Gilles Gouaillardet via users
Sent: Monday, March 6, 2023 12:56 PM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
Subject: Re: [OMPI users] What is the best choice of pml and btl for intranode communication

Arun,

First, Open MPI selects a pml for **all** the MPI tasks (for example, pml/ucx or pml/ob1).

Then, if pml/ob1 ends up being selected, a btl component (e.g. btl/uct, btl/vader) is used for each pair of MPI tasks: tasks on the same node will use btl/vader, and tasks on different nodes will use btl/uct.

Note that if UCX is available, pml/ucx takes the highest priority, so no btl is involved (in your case, it means intra-node communications will be handled by UCX and not btl/vader).

You can force ob1 and try different combinations of btl with

mpirun --mca pml ob1 --mca btl self,<btl1>,<btl2> ...

I expect pml/ucx to be faster than pml/ob1 with btl/uct for inter-node communications. I have not benchmarked Open MPI for a while, and it is possible that btl/vader outperforms pml/ucx for intra-node communications, so if you run on a small number of InfiniBand-interconnected nodes with a large number of tasks per node, you might be able to get the best performance by forcing pml/ob1.

Bottom line, I think it is best for you to benchmark your application and pick the combination that leads to the best performance -- and you are more than welcome to share your conclusions.

Cheers,

Gilles
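Filling in Gilles's "--mca btl self,<btl1>,<btl2>" template for the single-node case, as a sketch on my part rather than a confirmed recipe (the task count and the XPMEM option are assumptions to adapt to your build):

# Force ob1 with the shared-memory BTL, which is called "vader" in the 4.1.x series
$ mpirun -np 32 --map-by core --bind-to core --mca pml ob1 --mca btl self,vader ./perf

# Optionally pin vader's single-copy mechanism (e.g. cma, xpmem, knem, or none),
# here XPMEM if your build has it; check with: ompi_info --param btl vader --level 9
$ mpirun -np 32 --map-by core --bind-to core --mca pml ob1 --mca btl self,vader \
         --mca btl_vader_single_copy_mechanism xpmem ./perf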
On Mon, Mar 6, 2023 at 3:12 PM Chandran, Arun via users <users@lists.open-mpi.org> wrote:

Hi Folks,

I can run benchmarks and find the pml + btl combination (ob1, ucx, uct, vader, etc.) that gives the best performance, but I wanted to hear from the community what is generally used in high-core-count intra-node cases before jumping to conclusions. As a newcomer to Open MPI, I don't want to end up using a combination only because it fared better in a benchmark (overfitting?).

Or is the choice of pml + btl for the intra-node case not so important, because Open MPI is mainly used inter-node and the networking equipment decides the pml + btl (UCX for IB)?

--Arun

-----Original Message-----
From: users <users-boun...@lists.open-mpi.org> On Behalf Of Chandran, Arun via users
Sent: Thursday, March 2, 2023 4:01 PM
To: users@lists.open-mpi.org
Cc: Chandran, Arun <arun.chand...@amd.com>
Subject: [OMPI users] What is the best choice of pml and btl for intranode communication

Hi Folks,

As the number of cores in a socket keeps increasing, the right pml/btl choice (ucx, ob1, uct, vader, etc.) for the best "intra-node" performance becomes important.

For openmpi-4.1.4, which pml + btl combination is the best for intra-node communication in high core-count scenarios (point-to-point as well as collectives), and why?

Does the answer to the above question hold for the upcoming Open MPI 5 release?

--Arun
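Not from the original mails, but one generic way to answer the "which combination am I actually getting" part of this question on a given system is to ask Open MPI itself at run time; a minimal sketch, assuming the same ./perf benchmark used elsewhere in this thread:

# Report which PML and BTL components are opened and selected for this run
$ mpirun -np 2 --mca pml_base_verbose 10 --mca btl_base_verbose 10 ./perf

# List the PML and BTL components available in this installation
$ ompi_info | grep -E ' pml| btl'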