Richard Treumann wrote:
Guess I should have kept quiet a bit longer. As I recall, we had
already seen a counterexample to Jeff's stronger statement, and that
motivated my narrower one.
If there are no wildcard receives, every
MPI_Barrier call is semantically irrelevant.
Gus,
You hit the nail on the head. CPMD and VASP are both fine-grained parallel
quantum mechanics molecular dynamics codes. I believe CPMD has implemented
the domain decomposition methodology found in gromacs (a classical fine-grained
molecular dynamics code), which significantly diminishes the s
Guess I should have kept quiet a bit longer. As I recall, we had already
seen a counterexample to Jeff's stronger statement, and that motivated my
narrower one.
If there are no wildcard receives, every MPI_Barrier call is
semantically irrelevant.
Do you have a counterexample?
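A minimal sketch of why the wildcard restriction matters (my illustration, not code from the thread; the ranks, tag, and payload are assumptions): with MPI_ANY_SOURCE receives, the barrier below is not semantically irrelevant, because it forces the first receive at rank 2 to match rank 0's message.

/* Illustrative only: run with exactly 3 ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, msg;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        msg = 0;
        MPI_Send(&msg, 1, MPI_INT, 2, 0, MPI_COMM_WORLD);   /* sends BEFORE the barrier */
        MPI_Barrier(MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Barrier(MPI_COMM_WORLD);
        msg = 1;
        MPI_Send(&msg, 1, MPI_INT, 2, 0, MPI_COMM_WORLD);   /* sends AFTER the barrier */
    } else if (rank == 2) {
        MPI_Recv(&msg, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
        /* The barrier cannot complete until rank 2 reaches it, and rank 1
         * does not send until the barrier completes, so this first receive
         * must have matched rank 0.  Remove the three barrier calls and
         * either sender may be matched first. */
        MPI_Barrier(MPI_COMM_WORLD);
        printf("first message came from rank %d\n", st.MPI_SOURCE);
        MPI_Recv(&msg, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
    }

    MPI_Finalize();
    return 0;
}

Without any wildcard receive, by contrast, each receive names its source, so the matching is fixed whether or not the barrier is there, which is the narrower claim above.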
On Aug 24, 2009, at 4:23 PM, Eugene Loh wrote:
Meanwhile, the last process, P2, is waiting on a receive before it
enters the barrier.
Right-o -- I missed that key point. So yes, P0's send will definitely
match that first recv (before the barrier). If the barrier was not
there and the P0
Jeff Squyres wrote:
On Aug 24, 2009, at 1:03 PM, Eugene Loh wrote:
E.g., let's say P0 and P1 each send a message to P2, both using the
same tag and communicator. Let's say P2 does two receives on that
communicator and tag, using a wildcard source. So, the messages
could be received in either order.
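Rendered as code (a sketch under my own naming; the thread gives only the prose description), the scenario looks like this, and either match order is legal:

/* Two senders, one receiver with wildcard receives; the match order is
 * simply whichever envelope arrives at P2 first.  Run with exactly 3 ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, msg, i;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0 || rank == 1) {
        msg = rank;
        MPI_Send(&msg, 1, MPI_INT, 2, 0, MPI_COMM_WORLD);   /* same tag, same communicator */
    } else if (rank == 2) {
        for (i = 0; i < 2; i++) {
            MPI_Recv(&msg, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
            printf("receive %d matched rank %d\n", i, st.MPI_SOURCE);
        }
        /* 0-then-1 and 1-then-0 are both legal outputs; an MPI_Barrier
         * does not order the P0->P2 path against the P1->P2 path. */
    }

    MPI_Finalize();
    return 0;
}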
Hello again,
As you requested:
node64-test ~>salloc -n7
salloc: Granted job allocation 827
node64-test ~>srun hostname
node64-17....
node64-17....
node64-20....
node64-18....
node64-19....
node64-18....
As far as I can see, Jeff's analysis is dead on. The matching order at P2
is based on the order in which the envelopes from P0 and P1 show up at P2.
The Barrier does not force an order between the communication paths P0->P2
vs. P1->P2.
The MPI standard does not even say what "show up" means unle
Very interesting! I see the problem: we have never encountered
SLURM_TASKS_PER_NODE in that format, while SLURM_JOB_CPUS_PER_NODE
indicates that we have indeed been allocated two processors on each of the
nodes! So when you just do mpirun without specifying the number of
processes, we will
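For comparison on other systems, a tiny diagnostic along these lines (the two variable names come from this thread; the program itself is just my sketch) can be launched inside the allocation, e.g. with srun, to show what each task actually inherits:

/* Print the two SLURM variables discussed above, if set. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *vars[] = { "SLURM_TASKS_PER_NODE", "SLURM_JOB_CPUS_PER_NODE" };
    int i;

    for (i = 0; i < 2; i++) {
        const char *val = getenv(vars[i]);
        printf("%s=%s\n", vars[i], val ? val : "(unset)");
    }
    return 0;
}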
Hello,
Hopefully the below information will be helpful.
SLURM Version: 1.3.15
node64-test ~>salloc -n3
salloc: Granted job allocation 826
node64-test ~>srun hostname
node64-24....
node64-25....
node64-24....
node64-test ~>printenv | grep SLURM
SL
On Aug 24, 2009, at 1:03 PM, Eugene Loh wrote:
E.g., let's say P0 and P1 each send a message to P2, both using the
same tag and communicator. Let's say P2 does two receives on that
communicator and tag, using a wildcard source. So, the messages
could be received in either order. One coul
Haven't seen that before on any of our machines.
Could you do "printenv | grep SLURM" after the salloc and send the results?
What version of SLURM is this?
Please run "mpirun --display-allocation hostname" and send the results.
Thanks
Ralph
On Mon, Aug 24, 2009 at 11:30 AM, wrote:
> Hello,
>
I neglected to include some pertinent information:
I'm using Open MPI 1.3.2. Here's a backtrace:
#0 0x002a95e6890c in epoll_wait () from /lib64/tls/libc.so.6
#1 0x002a9623a39c in epoll_dispatch ()
from /home/sjackman/arch/xhost/lib/libopen-pal.so.0
#2 0x002a96238f10 in opal_even
Hi,
I'm seeing MPI_Send block in mca_pml_ob1_send. The packet is shorter
than the eager transmit limit for shared memory (3300 bytes < 4096
bytes). I'm trying to determine if MPI_Send is blocking due to a
deadlock. Will MPI_Send block even when sending a packet eagerly?
Thanks,
Shaun
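One way to probe the question empirically (a sketch only, assuming the 4096-byte shared-memory eager limit mentioned above and a receiver that is deliberately slow; it says nothing about ob1 internals): if rank 0's MPI_Send returns well before the 5-second delay, the message was buffered eagerly without a matching receive. The MPI standard still allows MPI_Send to block until the message is buffered or matched, so a quick return is typical behaviour, not a guarantee.

/* Run with 2 ranks on the same node. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    char buf[3300] = { 0 };        /* below the assumed 4096-byte eager limit */
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double t0 = MPI_Wtime();
        MPI_Send(buf, (int)sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        printf("rank 0: MPI_Send returned after %.2f s\n", MPI_Wtime() - t0);
    } else if (rank == 1) {
        sleep(5);                  /* delay posting the receive on purpose */
        MPI_Recv(buf, (int)sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}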
Hello,
I seem to have run into an interesting problem with Open MPI. After
allocating 3 processors and confirming that the 3 processors are
allocated, mpirun on a simple mpitest program seems to run on 4
processors. We have 2 processors per node. I can repeat this case with any
odd number of nodes, o
Going back to this thread from earlier this calendar year...
Ganesh wrote:
Hi Dick,
Jeff paraphrased an unnamed source as suggesting that "any
MPI program that relies on a barrier for correctness is an incorrect
MPI application." That is probably too strong.
How about thi
Lee Amy wrote:
Hi,
I run some programs by using OpenMPI 1.3.3 and when I execute the
command I encountered such following error messages.
sh: orted: command not found
--
A daemon (pid 6797) died unexpectedly with status 127
Lee Amy wrote:
> Hi,
>
> I run some programs by using OpenMPI 1.3.3 and when I execute the
> command I encountered such following error messages.
>
> sh: orted: command not found
> --
> A daemon (pid 6797) died unexpectedly with status 127