I am wondering whether this is really due to the usage of
File_write_all. We have had a bug in the 1.3 series so far (which will be
fixed in 1.3.4) where we lost message segments and thus had a deadlock
in Comm_dup if there was communication occurring *right after* the
Comm_dup. File_open executes a Comm_dup internally.
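For illustration only, a rough sketch of that pattern in C (not code from the 1.3 series or from the attached test case):

/* MPI_Comm_dup followed immediately by communication on the parent
 * communicator -- the situation in which the 1.3.x bug described
 * above could lose message segments and hang inside the dup. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, left, token = 0;
    MPI_Comm dup;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Comm_dup is collective and exchanges messages internally. */
    MPI_Comm_dup(MPI_COMM_WORLD, &dup);

    /* Communication occurring *right after* the dup: pass a token
     * around a ring on the parent communicator. */
    left = (rank + size - 1) % size;
    MPI_Sendrecv(&rank, 1, MPI_INT, (rank + 1) % size, 0,
                 &token, 1, MPI_INT, left, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    MPI_Comm_free(&dup);
    MPI_Finalize();
    return 0;
}

In the reported case, the internal Comm_dup comes from File_open and the communication right after it from the collective File_write_all.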
I'm afraid you're right... I was testing it with Open MPI on my laptop, but
later on the cluster I had some problems... Probably a colleague has
installed MPICH...
But I thought the behavior I see might be "implementation-independent".
Probably sounds stupid... :)
Thanks anyway :)
2009/10/12
Dear list,
the attached program deadlocks in MPI_File_write_all when run with 16
processes on two 8-core nodes of an InfiniBand cluster. It runs fine when I
a) use tcp
or
b) replace MPI_File_write_all by MPI_File_write
I'm using Open MPI v1.3.2 (but I checked that the problem also
occurs
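For reference, a minimal sketch in C of the kind of collective write described above (the file name and block size are arbitrary; this is not the actual attached program):

/* Each rank writes one contiguous block of ints with the collective
 * MPI_File_write_all -- the call that reportedly hangs over
 * InfiniBand.  Workaround b) above swaps it for MPI_File_write. */
#include <mpi.h>

#define N 1024

int main(int argc, char **argv)
{
    int rank, i, buf[N];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < N; i++)
        buf[i] = rank;

    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Place each rank's block at its own offset in the shared file. */
    MPI_File_set_view(fh, (MPI_Offset)rank * N * sizeof(int),
                      MPI_INT, MPI_INT, "native", MPI_INFO_NULL);

    /* Collective write: replacing this call with MPI_File_write is
     * the workaround b) mentioned above. */
    MPI_File_write_all(fh, buf, N, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}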
Hate to say this, but you don't appear to be using Open MPI.
"mpdtrace" is an MPICH command, last I checked.
You might try their mailing list, or check which mpiexec you are using
and contact them.
On Oct 12, 2009, at 9:01 AM, Jovana Knezevic wrote:
Hello everyone!
I am trying to run 11 instances of my program on 6 dual-core Opterons
(it is not a time-consuming application anyway; it takes 10 seconds on a
single-core laptop :)).
So, when I type:
mpiexec -machinefile hostfile -n 11 ./program
nothing happens!
The output of the
"mpdtrace -l" command (f
Any hints regarding the previous mail?
Does Open MPI 1.3.3 support only a limited set of OFED versions?
Or is any version OK?
On Sun, Oct 11, 2009 at 3:55 PM, Sangamesh B wrote:
> Hi,
>
> A Fortran application is installed with Intel Fortran 10.1, MKL-10 and
> Open MPI 1.3.3 on a Rocks-5.1 HPC Linux cluster