Hi Gilles,
You're right, we no longer get warnings... and the performance disparity
still exists, though to be clear it's only in select parts of the code -
others run as we'd expect. This is probably why I initially guessed it was
a process/memory affinity issue, based on the one timer I looked at.
Brian,
I have no doubt this was enough to get rid of the warning messages.
Out of curiosity, are you now seeing performance close to that of native
runs?
If I understand correctly, the Linux kernel allocates memory on the closest
NUMA domain (e.g. the socket, if I oversimplify), and since MPI tasks are
bound to a core, the memory they touch should end up on the local NUMA
domain anyway.
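If you want to verify where the memory actually lands, here is a minimal
sketch - the file name, buffer size, and 4 KB page size are just assumptions
on my part - that relies on first-touch allocation and then asks the kernel,
via the move_pages(2) wrapper from libnuma, which NUMA node a page was
placed on:

/* numa_check.c - minimal sketch: each rank touches a buffer, then asks
 * the kernel which NUMA node the first page landed on.
 * Build: mpicc numa_check.c -lnuma -o numa_check   (needs numaif.h/libnuma)
 */
#include <mpi.h>
#include <numaif.h>    /* move_pages(2) wrapper from libnuma */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    size_t len = 64UL * 1024 * 1024;            /* arbitrary 64 MB buffer */
    void *buf = NULL;
    if (posix_memalign(&buf, 4096, len) != 0)   /* assumes 4 KB pages */
        MPI_Abort(MPI_COMM_WORLD, 1);
    memset(buf, 0, len);               /* first touch places the pages */

    int status = -1;                   /* nodes == NULL means "query only" */
    if (move_pages(0, 1, &buf, NULL, &status, 0) == 0)
        printf("rank %d: first page is on NUMA node %d\n", rank, status);
    else
        perror("move_pages");

    free(buf);
    MPI_Finalize();
    return 0;
}

Run it under mpirun --bind-to core; if binding and first-touch placement are
working, each rank should report the NUMA node of the socket it is bound to.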
Hi Ralph,
Well, this gets chalked up to user error - the default AMI images come
without the NUMA development libraries, so Open MPI didn't get built with
NUMA support (and in my haste, I hadn't checked). Oops. Things seem to be
working correctly now.
Thanks again for your help,
- Brian
I honestly don't know - will have to defer to Brian, who is likely out for at
least the extended weekend. I'll point him to this one when he returns.
> On Dec 22, 2017, at 1:08 PM, Brian Dobbins wrote:
>
>
> Hi Ralph,
>
> OK, that certainly makes sense - so the next question is, what prevents
> binding memory to be local to particular cores? [...]
Hi Ralph,
OK, that certainly makes sense - so the next question is, what prevents
binding memory to be local to particular cores? Is this possible in a
virtualized environment like AWS HVM instances?
And does this apply only to dynamic allocations within an instance, or to
static allocations as well?
Actually, that message is telling you that binding to core is available, but
that we cannot bind memory to be local to that core. You can verify the binding
pattern by adding --report-bindings to your command line.
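A quick sanity check from inside the application itself is to have each rank
print the CPU it is running on - a minimal sketch, assuming glibc's
sched_getcpu() is available (the file name is just a placeholder):

/* where_am_i.c - minimal sketch: report which CPU each rank runs on.
 * Build: mpicc where_am_i.c -o where_am_i
 */
#define _GNU_SOURCE    /* for sched_getcpu() */
#include <mpi.h>
#include <sched.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* under --bind-to core, a rank should report the same CPU on every call */
    printf("rank %d running on cpu %d\n", rank, sched_getcpu());
    MPI_Finalize();
    return 0;
}

For example: mpirun --bind-to core --report-bindings -np 4 ./where_am_i
(the rank count is arbitrary).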
> On Dec 22, 2017, at 11:58 AM, Brian Dobbins wrote:
>
>
> Hi all,
>
> We're testing a model on AWS using C4/C5 nodes, and some of our timers,
> in a part of the code with no communication, show really poor performance
> compared to native runs. [...]
Hi all,
We're testing a model on AWS using C4/C5 nodes, and some of our timers,
in a part of the code with no communication, show really poor performance
compared to native runs. We think this is because we're not binding to a
core properly and thus not caching effectively, and a quick
'mpirun --bind-to core' run produces a warning that memory cannot be bound
locally to each core.