Squyres (jsquyres)
> Sent: Friday, June 07, 2013 2:54 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Sandy Bridge performance question
On Jun 7, 2013, at 5:28 AM, "Blosch, Edwin L" wrote:
> Regarding VTune, we have a code that doesn't scale well so that's a good tip.
> I have access to VTune, I've used it. But I only remember looking at
> OpenMP, I didn't know it could handle MPI runs. That would be great.
You might have
pen-mpi.org] on behalf of Jeff
Squyres (jsquyres) [jsquy...@cisco.com]
Sent: Friday, June 07, 2013 6:00 AM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Sandy Bridge performance question
+1
Depending on how much you care, you might also want to look at some performance
analysis tools to look and see what is happening under the covers. The Intel
VTune suite is the gold standard -- it shows all the counters and statistics
from the CPUs themselves (be aware that there's a bit of
It depends on the application you are using. Some are "balanced" - i.e., they
run faster if the number of processes is a power of two. You'll see that n8 is
faster than n7, so this is likely the situation.
On Jun 6, 2013, at 4:10 PM, "Blosch, Edwin L" wrote:
I am running single-node Sandy Bridge cases with OpenMPI and looking at scaling.
I'm using -bind-to-core without any other options (default is -bycore I
believe).
These numbers indicate number of cores first, then the second digit is the run
number (except for n=1, all runs repeated 3 times).
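For runs labeled this way, one possible way to summarize scaling is to average the repeated runs per core count and compare against the single-core time. The sketch below is only an illustration: the function name `scaling_table`, the `(cores, run)` key layout, and the timing values are all hypothetical placeholders, not measurements from this thread.

```python
from collections import defaultdict
from statistics import mean


def scaling_table(timings):
    """Summarize scaling from repeated runs.

    timings: dict mapping (cores, run_number) -> wall time in seconds.
    Returns a dict mapping cores -> (mean_time, speedup, efficiency),
    where speedup is relative to the mean single-core time.
    """
    by_cores = defaultdict(list)
    for (cores, _run), seconds in timings.items():
        by_cores[cores].append(seconds)

    base = mean(by_cores[1])  # single-core reference time
    table = {}
    for cores in sorted(by_cores):
        t = mean(by_cores[cores])
        speedup = base / t
        table[cores] = (t, speedup, speedup / cores)
    return table


if __name__ == "__main__":
    # Placeholder timings, invented purely to show the call shape.
    example = {
        (1, 1): 100.0,
        (2, 1): 50.0, (2, 2): 50.0, (2, 3): 50.0,
        (4, 1): 40.0, (4, 2): 40.0, (4, 3): 40.0,
    }
    for cores, (t, s, e) in scaling_table(example).items():
        print(f"n{cores}: mean={t:.1f}s speedup={s:.2f} efficiency={e:.2f}")
```

Averaging the repeats before computing speedup keeps a single noisy run from dominating the comparison between, say, n7 and n8.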