Dear Gilles and Nathan,

Thank you for your suggestions.
I have experimented with them and have a few questions for you. My application is the open-source FFTW library and its MPI test bench, which makes use of the distributed/MPI global transpose for a given MPI input problem (a minimal sketch of the call pattern is included after my questions below). I ran the program on 5 different combinations of OS and openMPI version/flags. Please see below a few sample cases for which performance varies a lot across these 5 combinations.

Single-node, 128 MPI ranks, GFLOPS comparison (absolute numbers not provided). The problem is a double-precision complex 1-D array of the sizes listed below. The five configurations compared are:

(A) Ubuntu 19.04 + openMPI 3.1.1
(B) Ubuntu 20.04 + openMPI 4.1.0
(C) Ubuntu 20.04 + openMPI 4.1.0, run with --mca pml ob1 --mca btl vader,self
(D) Ubuntu 20.04 + openMPI 4.1.0 + xpmem
(E) Ubuntu 20.04 + openMPI 4.1.0 + xpmem, run with --mca pml ob1 --mca btl vader,self

Array size   (A)        (B)        (C)        (D)        (E)        Comment
390625       Best       Not-best   Not-best   Not-best   Not-best   (A) is best
2097152      Not-best   Not-best   Not-best   Not-best   Best       (E) is best
4194304      Not-best   Not-best   Not-best   Best       Not-best   (D) is best
64000000     Not-best   Best       Not-best   Not-best   Not-best   (B) is best

My questions are:

1. I was using openMPI 3.1.1 on Ubuntu 19.04 without xpmem and without the runtime vader MCA options, so why is plain/stock openMPI 4.1.0 on Ubuntu 20.04 not giving the best performance?

2. In most of the cases, using the xpmem library gives the best performance, but for a few cases "Ubuntu 19.04 + openMPI 3.1.1" is best. How do I decide which version to use universally?

3. I am getting the following runtime warning for xpmem:

   WARNING: Could not generate an xpmem segment id for this process' address space.
   The vader shared memory BTL will fall back on another single-copy mechanism if one is available. This may result in lower performance.

   How can I resolve this issue?
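For reference, the call pattern the test bench follows for the distributed 1-D complex DFT is roughly as sketched below; FFTW performs the MPI global transpose internally while executing this plan. This is only an illustrative sketch, not the exact bench code: the size N (taken from the table above), the input initialization, and the absence of timing/error handling are simplifications.

#include <stddef.h>
#include <mpi.h>
#include <fftw3-mpi.h>

int main(int argc, char **argv)
{
    const ptrdiff_t N = 2097152;   /* one of the array sizes from the table above */
    ptrdiff_t alloc_local, local_ni, local_i_start, local_no, local_o_start;

    MPI_Init(&argc, &argv);
    fftw_mpi_init();

    /* Ask FFTW how much of the distributed 1-D array this rank owns
       (input and output distributions can differ for 1-D MPI transforms). */
    alloc_local = fftw_mpi_local_size_1d(N, MPI_COMM_WORLD,
                                         FFTW_FORWARD, FFTW_ESTIMATE,
                                         &local_ni, &local_i_start,
                                         &local_no, &local_o_start);

    fftw_complex *in  = fftw_alloc_complex(alloc_local);
    fftw_complex *out = fftw_alloc_complex(alloc_local);

    /* Distributed 1-D complex DFT; the MPI global transpose happens
       inside the execution of this plan. */
    fftw_plan plan = fftw_mpi_plan_dft_1d(N, in, out, MPI_COMM_WORLD,
                                          FFTW_FORWARD, FFTW_ESTIMATE);

    /* Fill this rank's local slice of the input with some data. */
    for (ptrdiff_t i = 0; i < local_ni; ++i) {
        in[i][0] = 1.0;   /* real part      */
        in[i][1] = 0.0;   /* imaginary part */
    }

    fftw_execute(plan);   /* the timed region in the real bench */

    fftw_destroy_plan(plan);
    fftw_free(in);
    fftw_free(out);
    fftw_mpi_cleanup();
    MPI_Finalize();
    return 0;
}

A sketch like this would be built with mpicc and linked against -lfftw3_mpi -lfftw3 -lm, and run with mpirun -np 128, adding --mca pml ob1 --mca btl vader,self for configurations (C) and (E) above.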
With Regards,
S. Biplab Raut

-----Original Message-----
From: users <users-boun...@lists.open-mpi.org> On Behalf Of Gilles Gouaillardet via users
Sent: Friday, March 5, 2021 5:58 AM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?

On top of XPMEM, try to also force btl/vader with

mpirun --mca pml ob1 --mca btl vader,self, ...

On Fri, Mar 5, 2021 at 8:37 AM Nathan Hjelm via users <users@lists.open-mpi.org> wrote:
>
> I would run the v4.x series and install xpmem if you can
> (http://github.com/hjelmn/xpmem).
> You will need to build with --with-xpmem=/path/to/xpmem to use xpmem,
> otherwise vader will default to using CMA. This will provide the best
> possible performance.
>
> -Nathan
>
> On Mar 4, 2021, at 5:55 AM, Raut, S Biplab via users <users@lists.open-mpi.org> wrote:
>
> It is a single node execution, so it should be using shared memory (vader).
>
> With Regards,
> S. Biplab Raut
>
> From: Heinz, Michael William <michael.william.he...@cornelisnetworks.com>
> Sent: Thursday, March 4, 2021 5:17 PM
> To: Open MPI Users <users@lists.open-mpi.org>
> Cc: Raut, S Biplab <biplab.r...@amd.com>
> Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
>
> What interconnect are you using at run time? That is, are you using Ethernet or InfiniBand or Omnipath?
>
> Sent from my iPad
>
> On Mar 4, 2021, at 5:05 AM, Raut, S Biplab via users <users@lists.open-mpi.org> wrote:
>
> After downloading a particular openMPI version, let's say v3.1.1 from
> https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.1.tar.gz, I follow the below steps.
>
> ./configure --prefix="$INSTALL_DIR" --enable-mpi-fortran --enable-mpi-cxx --enable-shared=yes --enable-static=yes --enable-mpi1-compatibility
> make -j
> make install
> export PATH=$INSTALL_DIR/bin:$PATH
> export LD_LIBRARY_PATH=$INSTALL_DIR/lib:$LD_LIBRARY_PATH
>
> Additionally, I also install libnuma-dev on the machine.
>
> For all the machines having Ubuntu 18.04 and 19.04, it works correctly and results in the expected performance/GFLOPS.
> But when the OS is changed to Ubuntu 20.04, I start getting the issues mentioned in my original/previous mail below.
>
> With Regards,
> S. Biplab Raut
>
> From: users <users-boun...@lists.open-mpi.org> On Behalf Of John Hearns via users
> Sent: Thursday, March 4, 2021 1:53 PM
> To: Open MPI Users <users@lists.open-mpi.org>
> Cc: John Hearns <hear...@gmail.com>
> Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
>
> How are you installing the OpenMPI versions? Are you using packages which are distributed by the OS?
> It might be worth looking at using Easybuild or Spack:
> https://docs.easybuild.io/en/latest/Introduction.html
> https://spack.readthedocs.io/en/latest/
>
> On Thu, 4 Mar 2021 at 07:35, Raut, S Biplab via users <users@lists.open-mpi.org> wrote:
>
> Dear Experts,
> Until recently, I was using openMPI 3.1.1 to run a single-node, 128-rank MPI application on Ubuntu 18.04 and Ubuntu 19.04.
> But now the OS on these machines has been upgraded to Ubuntu 20.04, and I have been observing program hangs with the openMPI 3.1.1 version.
> So, I tried the openMPI 4.0.5 version - the program ran properly without any issues, but there is a performance regression in my application.
>
> Can I know the stable openMPI version recommended for Ubuntu 20.04 that has no known regression compared to v3.1.1?
>
> With Regards,
> S. Biplab Raut