Hi Ralph,

Well, this one gets chalked up to user error - the default AMI images ship without the NUMA development libraries, so Open MPI didn't get built with NUMA support (and in my haste, I hadn't checked). Oops. Things seem to be working correctly now.
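For anyone else who hits this, a rough sketch of what we did (package names vary by distro, and the configure prefix below is just a placeholder for our local build settings):

  # RHEL/Amazon Linux-style AMIs; on Debian/Ubuntu the package is libnuma-dev
  sudo yum install -y numactl-devel

  # Rebuild Open MPI so its hwloc component picks up memory-binding support
  ./configure --prefix=$HOME/opt/openmpi && make -j && make install

  # Sanity check (needs the hwloc command-line tools installed):
  # the membind flags should now show up as 1
  hwloc-info --support | grep membind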
Thanks again for your help,
  - Brian

On Fri, Dec 22, 2017 at 2:14 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:
> I honestly don’t know - will have to defer to Brian, who is likely out for
> at least the extended weekend. I’ll point this one to him when he returns.
>
> On Dec 22, 2017, at 1:08 PM, Brian Dobbins <bdobb...@gmail.com> wrote:
>
> Hi Ralph,
>
> OK, that certainly makes sense - so the next question is, what prevents
> binding memory to be local to particular cores? Is this possible in a
> virtualized environment like AWS HVM instances?
>
> And does this apply only to dynamic allocations within an instance, or
> static as well? I'm pretty unfamiliar with how the hypervisor (KVM-based,
> I believe) maps out 'real' hardware, including memory, to particular
> instances. We've seen *some* parts of the code (bandwidth heavy) run
> ~10x faster on bare-metal hardware, though, *presumably* from memory
> locality, so it certainly has a big impact.
>
> Thanks again, and merry Christmas!
>   - Brian
>
> On Fri, Dec 22, 2017 at 1:53 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:
>
>> Actually, that message is telling you that binding to core is available,
>> but that we cannot bind memory to be local to that core. You can verify the
>> binding pattern by adding --report-bindings to your cmd line.
>>
>> On Dec 22, 2017, at 11:58 AM, Brian Dobbins <bdobb...@gmail.com> wrote:
>>
>> Hi all,
>>
>> We're testing a model on AWS using C4/C5 nodes and some of our timers,
>> in a part of the code with no communication, show really poor performance
>> compared to native runs. We think this is because we're not binding to a
>> core properly and thus not caching, and a quick 'mpirun --bind-to core
>> hostname' does suggest issues with this on AWS:
>>
>> [bdobbins@head run]$ mpirun --bind-to core hostname
>> --------------------------------------------------------------------------
>> WARNING: a request was made to bind a process. While the system
>> supports binding the process itself, at least one node does NOT
>> support binding memory to the process location.
>>
>>   Node:  head
>>
>> Open MPI uses the "hwloc" library to perform process and memory
>> binding. This error message means that hwloc has indicated that
>> processor binding support is not available on this machine.
>>
>> (It also happens on compute nodes, and with real executables.)
>>
>> Does anyone know how to enforce binding to cores on AWS instances? Any
>> insight would be great.
>>
>> Thanks,
>>   - Brian
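(For the archives: with the rebuilt Open MPI, the binding can be double-checked the way Ralph suggested above; the rank count and executable name here are just placeholders:

  mpirun -np 4 --bind-to core --report-bindings ./model.exe

Each rank's binding should then be reported on stderr, i.e. which core(s) it was bound to on which node.)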
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users