On 3/30/2011 10:08 AM, Eugene Loh wrote:
Michele Marena wrote:
I've launched my app with mpiP both when the two processes are on
different nodes and when they are on the same node.
Process 0 is the manager (it only gathers the results); processes 1
and 2 are workers (compute).
This is the
Michele Marena wrote:
I've launched my app with mpiP both when the two processes are
on different nodes and when they are on the same node.
Process 0 is the manager (it only gathers the results);
processes 1 and 2 are workers (compute).
This is the case processes 1 and 2 a
Hi Jeff,
Thank you for your help.
I've launched my app with mpiP both when the two processes are on different
nodes and when they are on the same node.
Process 0 is the manager (it only gathers the results); processes 1 and 2
are workers (compute).
This is the case processes 1 and 2 are o
How many messages are you sending, and how large are they? I.e., if your
message passing is tiny, then the network transport may not be the bottleneck
here.
On Mar 28, 2011, at 9:41 AM, Michele Marena wrote:
> I run ompi_info --param btl sm and this is the output
>
> MCA btl
I run ompi_info --param btl sm and this is the output
MCA btl: parameter "btl_base_debug" (current value: "0")
If btl_base_debug is 1 standard debug is output,
if > 1 verbose debug is output
MCA btl: parameter "btl" (current value: )
The fact that this exactly matches the time you measured with shared memory is
suspicious. My guess is that you aren't actually using shared memory at all.
Does your "ompi_info" output show shared memory as being available? Jeff or
others may be able to give you some params that would let you ch
What happens with 2 processes on the same node with tcp?
With --mca btl self,tcp my app runs in 23s.
2011/3/28 Jeff Squyres (jsquyres)
> Ah, I didn't catch before that there were more variables than just tcp vs.
> shmem.
>
> What happens with 2 processes on the same node with tcp?
>
> Eg, when b
On 3/28/2011 3:29 AM, Michele Marena wrote:
Each node has two processors (no dual-core).
which seems to imply that the 2 processors share memory space and a
single memory bus, and the question is not about what I originally guessed.
--
Tim Prince
On 3/28/2011 3:44 AM, Jeff Squyres (jsquyres) wrote:
Ah, I didn't catch before that there were more variables than just tcp
vs. shmem.
What happens with 2 processes on the same node with tcp?
Eg, when both procs are on the same node, are you thrashing caches or
memory?
In fact, I made the gues
Ah, I didn't catch before that there were more variables than just tcp vs.
shmem.
What happens with 2 processes on the same node with tcp?
Eg, when both procs are on the same node, are you thrashing caches or memory?
Sent from my phone. No type good.
On Mar 28, 2011, at 6:27 AM, "Michele Mar
Each node has two processors (no dual-core).
2011/3/28 Michele Marena
> However, I thank you Tim, Ralph and Jeff.
> My sequential application runs in 24s (wall clock time).
> My parallel application runs in 13s with two processes on different nodes.
> With shared memory, when two processes are o
However, I thank you Tim, Ralph and Jeff.
My sequential application runs in 24s (wall clock time).
My parallel application runs in 13s with two processes on different nodes.
With shared memory, when two processes are on the same node, my app runs in
23s.
I don't understand why.
2011/3/28 Jeff Squyr
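The timings reported in this thread can be put side by side as speedup and parallel efficiency. The following is only a minimal sketch: the numbers are the wall-clock times quoted in the messages, while the function names and the choice to count 2 workers (the manager only gathers results) are assumptions made for illustration.

```python
# Wall-clock timings in seconds, as reported in the thread.
timings = {
    "sequential": 24.0,
    "two workers, different nodes": 13.0,
    "two workers, same node, sm": 23.0,
    "two workers, same node, tcp": 23.0,
}

# Assumption: only the 2 worker processes compute; the manager just gathers.
WORKERS = 2


def speedup(t_seq, t_par):
    """Ratio of sequential time to parallel time."""
    return t_seq / t_par


def efficiency(t_seq, t_par, workers):
    """Speedup divided by the number of computing processes."""
    return speedup(t_seq, t_par) / workers


t_seq = timings["sequential"]
for label, t_par in timings.items():
    if label == "sequential":
        continue
    s = speedup(t_seq, t_par)
    e = efficiency(t_seq, t_par, WORKERS)
    print(f"{label}: speedup {s:.2f}, efficiency {e:.2f}")
```

With both workers on one node the run time is close to the sequential time whether sm or tcp is used, which is consistent with the suggestion in the thread that the transport may not be the bottleneck.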
If your program runs faster across 3 processes, 2 of which are local to each
other, with --mca btl tcp,self compared to --mca btl tcp,sm,self, then
something is very, very strange.
Tim cites all kinds of things that can cause slowdowns, but it's still very,
very odd that simply enabling using t
On Mar 27, 2011, at 7:37 AM, Tim Prince wrote:
> On 3/27/2011 2:26 AM, Michele Marena wrote:
>> Hi,
>> My application performs well without shared memory, but with
>> shared memory I get worse performance than without it.
>> Am I making a mistake? Is there something I'm not paying attention to?
On 3/27/2011 2:26 AM, Michele Marena wrote:
Hi,
My application performs well without shared memory, but with
shared memory I get worse performance than without it.
Am I making a mistake? Is there something I'm not paying attention to?
I know OpenMPI uses the /tmp directory to allocate shared memory
This is my machinefile
node-1-16 slots=2
node-1-17 slots=2
node-1-18 slots=2
node-1-19 slots=2
node-1-20 slots=2
node-1-21 slots=2
node-1-22 slots=2
node-1-23 slots=2
Each cluster node has 2 processors. I launch my application with 3
processes, one on node-1-16 (manager) and two on node-1-17(worke
Hi,
My application performs well without shared memory, but with
shared memory I get worse performance than without it.
Am I making a mistake? Is there something I'm not paying attention to?
I know OpenMPI uses the /tmp directory to allocate shared memory and it is in
the local filesystem.
I thank yo