Dear Gilles, I am using GCC (v9.3.0) as provided by Ubuntu 20.04. As I said, only a few test cases were in regression with the earlier patch, but this new patch resolves all of them.
With Regards,
S. Biplab Raut

-----Original Message-----
From: gil...@rist.or.jp <gil...@rist.or.jp>
Sent: Friday, April 9, 2021 5:45 AM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>; Raut, S Biplab <biplab.r...@amd.com>
Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?

[CAUTION: External Email]

Are you using the gcc provided by Ubuntu 20.04? If not, which compiler (vendor and version) are you using?

My (light) understanding is that this patch should not impact performance, so I am not sure whether the performance being back is something I do not understand, or the side effect of a compiler bug.

Anyway, I issued https://github.com/open-mpi/ompi/pull/8789 and asked for a review.

Cheers,

Gilles

----- Original Message -----
> Dear Gilles,
> As per your suggestion, I tried the inline patch as discussed in https://github.com/open-mpi/ompi/pull/8622#issuecomment-800776864 .
> This has completely fixed the regression for the remaining test cases in the FFTW MPI in-built test bench - a regression which had persisted even after applying the git patch https://patch-diff.githubusercontent.com/raw/open-mpi/ompi/pull/8623.patch as merged by you.
> So, it seems there is a performance difference between asm volatile("": : :"memory"); and __atomic_thread_fence(__ATOMIC_ACQUIRE) on x86_64.
>
> I would request you to please make this change and merge it to the respective openMPI branches - and, if possible, please notify me whenever that takes place.
> I also request you to plan for an early 4.1.1rc2 release, at least by June 2021.
>
> With Regards,
> S. Biplab Raut
>
> -----Original Message-----
> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> Sent: Thursday, April 1, 2021 8:31 AM
> To: Raut, S Biplab <biplab.r...@amd.com>
> Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
>
> [CAUTION: External Email]
>
> I really had no time to investigate this.
>
> A quick test is to apply the patch in the inline comment at https://github.com/open-mpi/ompi/pull/8622#issuecomment-800776864 and see whether it helps.
> If not, I would recommend you try Open MPI 3.1.6 (after manually applying https://github.com/open-mpi/ompi/pull/8624.patch) and see whether there is a performance regression between 3.1.1 and (patched) 3.1.6
>
> Cheers,
>
> Gilles
>
> On Thu, Apr 1, 2021 at 11:25 AM Raut, S Biplab <biplab.r...@amd.com> wrote:
> >
> > Dear Gilles,
> > Did you get a chance to look into my below mail content?
> > I find the regression is not completely fixed.
> >
> > With Regards,
> > S. Biplab Raut
> >
> > -----Original Message-----
> > From: Raut, S Biplab
> > Sent: Wednesday, March 24, 2021 11:32 PM
> > To: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> > Subject: RE: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
> >
> > Dear Gilles,
> > After applying the below patch, I thoroughly tested various test cases of FFTW using its in-built benchmark test program.
> > Many of the test cases that previously showed regression as compared to openMPI3.1.1 have now improved, with positive gains.
> > However, there are still a few test cases where the performance is lower than openMPI3.1.1.
> > Are there more performance issues in openMPI4.x that need to be discovered?
> >
> > Please check the below details.
> >
> > 1) For problem size 1024x1024x512 :-
> > $ mpirun --map-by core --rank-by core --bind-to core ./fftw/mpi/mpi-bench -opatient -r500 -s dcif1024x1024x512
> > openMPI3.1.1_stock performance -> 147 MFLOPS
> > openMPI4.1.0_stock performance -> 137 MFLOPS
> > openMPI4.1.0_patch performance -> 137 MFLOPS
> > 2) For problem size 512x512x512 :-
> > $ mpirun --map-by core --rank-by core --bind-to core ./fftw/mpi/mpi-bench -opatient -r500 -s dcif512x512x512
> > openMPI3.1.1_stock performance -> 153 MFLOPS
> > openMPI4.1.0_stock performance -> 144 MFLOPS
> > openMPI4.1.0_patch performance -> 147 MFLOPS
> >
> > With Regards,
> > S. Biplab Raut
> >
> > -----Original Message-----
> > From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> > Sent: Wednesday, March 17, 2021 11:14 AM
> > To: Raut, S Biplab <biplab.r...@amd.com>
> > Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
> >
> > [CAUTION: External Email]
> >
> > The patch has been merged into the v4.1.x release branch, but 4.1.1rc2 has not been released yet.
> > Your best bet is to download and apply the patch at https://github.com/open-mpi/ompi/pull/8623.patch (since this does not involve any configury stuff, the process should be painless)
> >
> > Cheers,
> >
> > Gilles
> >
> > On Wed, Mar 17, 2021 at 2:31 PM Raut, S Biplab <biplab.r...@amd.com> wrote:
> > >
> > > Dear Gilles,
> > > Thank you for your support and quick fix for this issue.
> > > Could you tell me if the fix is finally merged, and how do I get the RC version of this code (v4.1)?
> > > Please point me to the exact link; it will be helpful (since it will be used on production servers).
> > >
> > > With Regards,
> > > S. Biplab Raut
> > >
> > > -----Original Message-----
> > > From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> > > Sent: Sunday, March 14, 2021 4:18 PM
> > > To: Raut, S Biplab <biplab.r...@amd.com>
> > > Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
> > >
> > > [CAUTION: External Email]
> > >
> > > This is something you can/have to do by yourself:
> > > log into github, open the issue and click the Subscribe button in Notifications
> > >
> > > Cheers,
> > >
> > > Gilles
> > >
> > > On Sun, Mar 14, 2021 at 7:30 PM Raut, S Biplab <Biplab.Raut@amd.com> wrote:
> > > >
> > > > Thank you very much for your support.
> > > > Can you please add me to this issue/ticket as a watcher/stakeholder?
> > > >
> > > > With Regards,
> > > > S. Biplab Raut
> > > >
> > > > -----Original Message-----
> > > > From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> > > > Sent: Sunday, March 14, 2021 3:23 PM
> > > > To: Raut, S Biplab <biplab.r...@amd.com>
> > > > Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
> > > >
> > > > [CAUTION: External Email]
> > > >
> > > > Glad we are finally on the same page!
> > > >
> > > > I filed an issue at https://github.com/open-mpi/ompi/issues/8603, let's follow up there from now on
> > > >
> > > > Cheers,
> > > >
> > > > Gilles
> > > >
> > > > On Sun, Mar 14, 2021 at 5:38 PM Raut, S Biplab <Biplab.Raut@amd.com> wrote:
> > > > >
> > > > > Dear Gilles,
> > > > > Reposting your comments along with my replies on the mailing list for everybody to view/react.
> > > > >
> > > > > I am seeing some important performance degradation between Open MPI 3.1.1 and the top of the v3.1.x branch
> > > > > when running on a large number of cores.
> > > > > Same performance between 4.1.0 and the top of v3.1.x
> > > > > I am now running git bisect to find out when this started happening.
> > > > >
> > > > > I am finally feeling relieved and happy that you could reproduce and acknowledge this regression!!
> > > > > Do I need to file any bug officially anywhere?
> > > > >
> > > > > IIRC, I noted an xpmem error in your logs (that means xpmem is not used).
> > > > > The root cause could be that the xpmem kernel module is not loaded, or the permissions on the device are incorrect.
> > > > > As Nathan pointed out, xpmem is likely to get the best performance, so while I am running git bisect, I do invite you to fix your xpmem issue and see how this impacts performance.
> > > > >
> > > > > Sure, I will try to fix the xpmem error and check the impact on the performance.
> > > > >
> > > > > With Regards,
> > > > > S. Biplab Raut
> > > > >
> > > > > -----Original Message-----
> > > > > From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> > > > > Sent: Sunday, March 14, 2021 8:45 AM
> > > > > To: Raut, S Biplab <biplab.r...@amd.com>
> > > > > Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
> > > > >
> > > > > [CAUTION: External Email]
> > > > >
> > > > > I am seeing some important performance degradation between Open MPI 3.1.1 and the top of the v3.1.x branch
> > > > > when running on a large number of cores.
> > > > > Same performance between 4.1.0 and the top of v3.1.x
> > > > >
> > > > > I am now running git bisect to find out when this started happening.
> > > > >
> > > > > IIRC, I noted an xpmem error in your logs (that means xpmem is not used).
> > > > > The root cause could be that the xpmem kernel module is not loaded, or the permissions on the device are incorrect.
> > > > > As Nathan pointed out, xpmem is likely to get the best performance, so while I am running git bisect, I do invite you to fix your xpmem issue and see how this impacts performance.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Gilles
> > > > >
> > > > > On Sat, Mar 13, 2021 at 12:08 AM Raut, S Biplab <Biplab.Raut@amd.com> wrote:
> > > > > >
> > > > > > Dear Gilles,
> > > > > > Please check my replies inline.
> > > > > >
> > > > > > >>> Can you please post the output of
> > > > > > >>> ompi_info --param btl vader --level 3
> > > > > > >>> with both Open MPI 3.1 and 4.1?
> > > > > >
> > > > > > openMPI3.1.1
> > > > > > ------------------
> > > > > > $ ompi_info --param btl vader --level 3
> > > > > >     MCA btl: vader (MCA v2.1.0, API v3.0.0, Component v3.1.1)
> > > > > >     MCA btl vader: ---------------------------------------------------
> > > > > >     MCA btl vader: parameter "btl_vader_single_copy_mechanism" (current value: "cma", data source: default, level: 3 user/all, type: int)
> > > > > >         Single copy mechanism to use (defaults to best available)
> > > > > >         Valid values: 1:"cma", 3:"none"
> > > > > >
> > > > > > openMPI4.1.0
> > > > > > ------------------
> > > > > > $ ompi_info --param btl vader --level 3
> > > > > >     MCA btl: vader (MCA v2.1.0, API v3.1.0, Component v4.1.0)
> > > > > >     MCA btl vader: ---------------------------------------------------
> > > > > >     MCA btl vader: parameter "btl_vader_single_copy_mechanism" (current value: "cma", data source: default, level: 3 user/all, type: int)
> > > > > >         Single copy mechanism to use (defaults to best available)
> > > > > >         Valid values: 1:"cma", 4:"emulated", 3:"none"
> > > > > >     MCA btl vader: parameter "btl_vader_backing_directory" (current value: "/dev/shm", data source: default, level: 3 user/all, type: string)
> > > > > >         Directory to place backing files for shared memory communication. This directory should be on a local filesystem such as /tmp or /dev/shm (default: (linux) /dev/shm, (others) session directory)
> > > > > >
> > > > > > >>> What if you run with only 2 MPI ranks?
> > > > > > >>> do you observe similar performance differences between Open MPI 3.1 and 4.1?
> > > > > >
> > > > > > When I run only 2 MPI ranks, the performance regression is not significant.
> > > > > > openMPI3.1.1 gives MFLOPS: 11122
> > > > > > openMPI4.1.0 gives MFLOPS: 11041
> > > > > >
> > > > > > With Regards,
> > > > > > S. Biplab Raut
> > > > > >
> > > > > > From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> > > > > > Sent: Friday, March 12, 2021 7:07 PM
> > > > > > To: Raut, S Biplab <biplab.r...@amd.com>
> > > > > > Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
> > > > > >
> > > > > > [CAUTION: External Email]
> > > > > >
> > > > > > Can you please post the output of
> > > > > > ompi_info --param btl vader --level 3
> > > > > > with both Open MPI 3.1 and 4.1?
> > > > > >
> > > > > > What if you run with only 2 MPI ranks?
> > > > > > do you observe similar performance differences between Open MPI 3.1 and 4.1?
> > > > > >
> > > > > > Cheers,
> > > > > > Gilles
> > > > > >
> > > > > > On Fri, Mar 12, 2021 at 6:31 PM Raut, S Biplab <Biplab.Raut@amd.com> wrote:
> > > > > > Dear Gilles,
> > > > > > Thank you for the reply.
> > > > > >
> > > > > > >>> when running
> > > > > > >>> mpirun --map-by core -rank-by core --bind-to core --mca pml ob1 --mca btl vader,self ./mpi-bench ic1000000
> > > > > > >>> I got similar flops with Open MPI 3.1.1, 3.1.6, 4.1.0 and 4.1.1rc1 on my system
> > > > > > >>> If you are using a different command line, please let me know and I will give it a try
> > > > > >
> > > > > > Although the command line that I use is different, I also ran with the above command line as used by you.
> > > > > > I still find that openMPI4.1.0 is poor as compared to openMPI3.1.1. Please check the details below. I have also provided my system details in case they matter.
> > > > > >
> > > > > > openMPI3.1.1
> > > > > > -------------------
> > > > > > $ mpirun --map-by core -rank-by core --bind-to core --mca pml ob1 --mca btl vader,self ./mpi-bench ic1000000
> > > > > > Problem: ic1000000, setup: 552.20 ms, time: 1.33 ms, ``mflops'': 75143
> > > > > > $ ompi_info --all|grep 'command line'
> > > > > >   Configure command line: '--prefix=/home/server/ompi3/gcc' '--enable-mpi-fortran' '--enable-mpi-cxx' '--enable-shared=yes' '--enable-static=yes' '--enable-mpi1-compatibility'
> > > > > >   User-specified command line parameters passed to ROMIO's configure script
> > > > > >   Complete set of command line parameters passed to ROMIO's configure script
> > > > > >
> > > > > > openMPI4.1.0
> > > > > > -------------------
> > > > > > $ mpirun --map-by core -rank-by core --bind-to core --mca pml ob1 --mca btl vader,self ./mpi-bench ic1000000
> > > > > > Problem: ic1000000, setup: 557.12 ms, time: 1.75 ms, ``mflops'': 57029
> > > > > > $ ompi_info --all|grep 'command line'
> > > > > >   Configure command line: '--prefix=/home/server/ompi4_plain' '--enable-mpi-fortran' '--enable-mpi-cxx' '--enable-shared=yes' '--enable-static=yes' '--enable-mpi1-compatibility'
> > > > > >   User-specified command line parameters passed to ROMIO's configure script
> > > > > >   Complete set of command line parameters passed to ROMIO's configure script
> > > > > >
> > > > > > openMPI4.1.0 + xpmem
> > > > > > --------------------------------
> > > > > > $ mpirun --map-by core -rank-by core --bind-to core --mca pml ob1 --mca btl vader,self ./mpi-bench ic1000000
> > > > > > --------------------------------------------------------------------------
> > > > > > WARNING: Could not generate an xpmem segment id for this process'
> > > > > > address space.
> > > > > > The vader shared memory BTL will fall back on another single-copy
> > > > > > mechanism if one is available. This may result in lower performance.
> > > > > > Local host: lib-daytonax-03
> > > > > > Error code: 2 (No such file or directory)
> > > > > > --------------------------------------------------------------------------
> > > > > > Problem: ic1000000, setup: 559.55 ms, time: 1.77 ms, ``mflops'': 56280
> > > > > > $ ompi_info --all|grep 'command line'
> > > > > >   Configure command line: '--prefix=/home/server/ompi4_xmem' '--with-xpmem=/opt/xpmm' '--enable-mpi-fortran' '--enable-mpi-cxx' '--enable-shared=yes' '--enable-static=yes' '--enable-mpi1-compatibility'
> > > > > >   User-specified command line parameters passed to ROMIO's configure script
> > > > > >   Complete set of command line parameters passed to ROMIO's configure script
> > > > > >
> > > > > > Other System Config
> > > > > > -----------------------------
> > > > > > $ cat /etc/os-release
> > > > > > NAME="Ubuntu"
> > > > > > VERSION="20.04 LTS (Focal Fossa)"
> > > > > > $ gcc -v
> > > > > > gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
> > > > > > DRAM:- 1TB DDR4-3200 MT/s RDIMM memory
> > > > > >
> > > > > > The recommended command line to run would be as below:-
> > > > > > mpirun --map-by core --rank-by core --bind-to core --mca pml ob1 --mca btl vader,self ./mpi-bench -owisdom -opatient -r1000 -s icf1000000
> > > > > > (Here, -opatient allows the use of the best kernel/algorithm plan,
> > > > > >  -r1000 runs the test for 1000 iterations to avoid run-to-run variations,
> > > > > >  -owisdom removes the first-time setup overhead/time when executing the "mpirun command line" the next time)
> > > > > >
> > > > > > Please let me know if any other details are needed for you to analyze this performance regression.
> > > > > >
> > > > > > With Regards,
> > > > > > S. Biplab Raut
> > > > > >
> > > > > > From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> > > > > > Sent: Friday, March 12, 2021 12:46 PM
> > > > > > To: Raut, S Biplab <biplab.r...@amd.com>
> > > > > > Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?
> > > > > >
> > > > > > [CAUTION: External Email]
> > > > > >
> > > > > > when running
> > > > > > mpirun --map-by core -rank-by core --bind-to core --mca pml ob1 --mca btl vader,self ./mpi-bench ic1000000
> > > > > > I got similar flops with Open MPI 3.1.1, 3.1.6, 4.1.0 and 4.1.1rc1 on my system
> > > > > >
> > > > > > If you are using a different command line, please let me know and I will give it a try
> > > > > >
> > > > > > Cheers,
> > > > > > Gilles
> > > > > >
> > > > > > On Fri, Mar 12, 2021 at 3:20 PM Raut, S Biplab <Biplab.Raut@amd.com> wrote:
> > > > > > Reposting here without the logs - it seems there is a message size limit of 150KB, so I could not attach the logs.
> > > > > > (Request the moderator to approve the original mail that has the attachment of compressed logs)
> > > > > >
> > > > > > My main concern in moving from ompi3.1.1 to ompi4.1.0 - why does ompi4.1.0 perform poorly as compared to ompi3.1.1 for some test sizes???
> > > > > >
> > > > > > I ran the "FFTW MPI bench binary" in verbose mode "10" (as suggested by Gilles) for the below three cases and confirmed that btl/vader is used by default.
> > > > > >
> > > > > > The FFTW MPI test for a 1D problem size (1000000) is run on a single node as below:-
> > > > > > mpirun --map-by core --rank-by core --bind-to core -np 128 <fftw/mpi/bench program binary> <program binary options for problem size 1000000
> > > > > >
> > > > > > The three test cases are described below; the test run with openMPI3.1.1 performs best.
> > > > > > Test run on Ubuntu20.04 and stock openMPI3.1.1 : gives mflops: 76978
> > > > > > Test run on Ubuntu20.04 and stock openMPI4.1.1 : gives mflops: 56205
> > > > > > Test run on Ubuntu20.04 and openMPI4.1.1 configured with xpmem : gives mflops: 56411
> > > > > >
> > > > > > Please check more details in the below mail chain.
> > > > > >
> > > > > > P.S:
> > > > > > The FFTW MPI bench test binary can be compiled from sources at https://github.com/amd/amd-fftw or https://github.com/FFTW/fftw3 .
> > > > > >
> > > > > > With Regards,
> > > > > > S. Biplab Raut