Hi,
I am trying to run a simple MPI program on the TACC Stampede system, but I
get the following error:
c445-203$ ../../ompi/install/bin/mpirun -np 2 -hostfile hosts ./simp
srun: cluster configuration lacks support for cpu binding
My config flag is:
./configure --prefix=$PWD/install --e
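One workaround worth trying when the Slurm back-end rejects binding requests is to tell Open MPI not to bind processes at all; this is a guess based on the error text, not a confirmed fix for Stampede:

    ../../ompi/install/bin/mpirun --bind-to none -np 2 -hostfile hosts ./simp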
George & other list members,
I think I may have a race condition in this example that is masked by
the print_matrix statement.
For example, let's say rank one has a large sleep before reaching the
local transpose: will the other ranks have completed the Alltoall, and
when rank one reaches the local
The Alltoall should only return when all data is sent and received on
the current rank, so there shouldn't be any race condition.
Cheers,
Matthieu
2014-05-08 15:53 GMT+02:00 Spenser Gilliland:
> George & other list members,
>
> I think I may have a race condition in this example that is masked
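To make the blocking guarantee concrete, here is a minimal self-contained sketch (the file name and values are illustrative, not Spenser's actual code): once MPI_Alltoall returns on a rank, that rank's receive buffer is completely filled, regardless of how the other ranks were scheduled beforehand.

    /* alltoall_demo.c - hypothetical example; compile with mpicc -std=c99 */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int *sendbuf = malloc(size * sizeof(int));
        int *recvbuf = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++)
            sendbuf[i] = rank * 100 + i;   /* element destined for rank i */

        /* Blocking collective: when this returns, recvbuf holds one element
           from every rank (including this one), even if some peers entered
           the call much later. */
        MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

        for (int i = 0; i < size; i++)
            printf("rank %d received %d from rank %d\n", rank, recvbuf[i], i);

        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }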
On 7 May 2014 17:48, Rob Latham wrote:
>
> Looks like I fixed that late last year. A slew of ">31 bit transfers"
> fixes went into the MPICH-3.1 release. Slurping those changes, which are
> individually small (using some _x versions of type-inquiry routines here,
> some MPI_Count promotions the
I think the issue is with the way you define the send and receive
buffers in the MPI_Alltoall. You have to keep in mind that the
all-to-all pattern will overwrite the entire receive
buffer. Thus, starting from a relative displacement in the data (in
this case matrix[wrank*wrows]), begs f
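The sizing rule behind this, as a hedged sketch (P, blk, and the buffer names are assumptions, not Spenser's code): MPI_Alltoall deposits one block per peer into the receive buffer, so the buffer must hold P blocks, and pointing it at an offset such as matrix[wrank*wrows] leaves no room for the later blocks.

    /* With P ranks and blk floats exchanged per pair, the receive side
       gets P * blk floats in total - one chunk per peer, self included.
       A recvbuf sized (or offset) for only one chunk is overrun. */
    float *sendbuf = malloc(P * blk * sizeof(float));
    float *recvbuf = malloc(P * blk * sizeof(float));   /* P chunks, not 1 */
    MPI_Alltoall(sendbuf, blk, MPI_FLOAT,
                 recvbuf, blk, MPI_FLOAT, MPI_COMM_WORLD);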
On 7 May 2014 16:25, Jeff Squyres (jsquyres) wrote:
>
> "Periodically".
>
> Hopefully, the fix will be small and we can just pull that one fix down to
> OMPI.
Okay, thanks for letting me know, Jeff.
Richard
On 05/07/2014 11:36 AM, Rob Latham wrote:
On 05/05/2014 09:20 PM, Richard Shaw wrote:
Hello,
I think I've come across a bug when using ROMIO to read in a 2D
distributed array. I've attached a test case to this email.
Thanks for the bug report and the test case.
I've opened MPICH bug (bec
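The attachment is not reproduced in the archive; for readers following along, a test of this kind typically looks like the following hedged sketch (array size, file name, and element type are assumptions, not Richard's actual code): build a darray filetype describing the 2D block distribution, set it as the file view, and do a collective read.

    /* darray_read.c - hypothetical ROMIO test sketch */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* 2D block-block distribution of a 1024x1024 array of doubles */
        int gsizes[2]   = {1024, 1024};
        int distribs[2] = {MPI_DISTRIBUTE_BLOCK, MPI_DISTRIBUTE_BLOCK};
        int dargs[2]    = {MPI_DISTRIBUTE_DFLT_DARG, MPI_DISTRIBUTE_DFLT_DARG};
        int psizes[2]   = {0, 0};
        MPI_Dims_create(size, 2, psizes);       /* choose a process grid */

        MPI_Datatype filetype;
        MPI_Type_create_darray(size, rank, 2, gsizes, distribs, dargs,
                               psizes, MPI_ORDER_C, MPI_DOUBLE, &filetype);
        MPI_Type_commit(&filetype);

        int tsize;
        MPI_Type_size(filetype, &tsize);           /* bytes this rank owns */
        int lcount = tsize / (int)sizeof(double);  /* local element count  */
        double *local = malloc((size_t)tsize);

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "array.dat", MPI_MODE_RDONLY,
                      MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native",
                          MPI_INFO_NULL);
        MPI_File_read_all(fh, local, lcount, MPI_DOUBLE, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        MPI_Type_free(&filetype);
        free(local);
        MPI_Finalize();
        return 0;
    }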
George & Matthieu,
> The Alltoall should only return when all data is sent and received on
> the current rank, so there shouldn't be any race condition.
You're right, this is MPI, not pthreads. That should never happen. Duh!
> I think the issue is with the way you define the send and receive
> buff
The segfault indicates that you are writing outside the allocated memory (and
this conflicts with the ptmalloc library). I’m quite certain that you write outside
the allocated array …
George.
On May 8, 2014, at 15:16, Spenser Gilliland wrote:
> George & Matthieu,
>
>> The Alltoall should only
A simple test would be to run it with valgrind, so that out of bound
reads and writes will be obvious.
Cheers,
Matthieu
2014-05-08 21:16 GMT+02:00 Spenser Gilliland:
> George & Matthieu,
>
>> The Alltoall should only return when all data is sent and received on
>> the current rank, so there sho
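A typical invocation for an MPI program puts valgrind between the launcher and the binary (the process count and binary name here are assumptions):

    mpirun -np 4 valgrind --leak-check=full --track-origins=yes ./transpose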
Matthieu & George,
Thank you both for helping me. I really appreciate it.
> A simple test would be to run it with valgrind, so that out of bound
> reads and writes will be obvious.
I ran it through valgrind (I left the command line I used in there so
you can verify the methods)
I am getting er
Spenser,
Here is basically what is happening. On the top left, I depicted the datatype
resulting from the vector type. The two arrows point to the lower bound and
upper bound (thus the extent) of the datatype. On the top right, the resized
datatype, where the ub is now moved 2 elements after th
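In code form, the construction George describes would look roughly like this (variable names follow the thread; the exact extent chosen is an assumption based on his description of the moved upper bound):

    /* A wrows x wrows block out of an N x N row-major float matrix.
       The raw vector type's extent runs to the end of its last block;
       resizing shrinks it to wrows floats so that consecutive blocks
       used in a collective start wrows elements apart. */
    MPI_Datatype block, block_resized;
    MPI_Type_vector(wrows, wrows, N, MPI_FLOAT, &block);
    MPI_Type_create_resized(block, 0, (MPI_Aint)(wrows * sizeof(float)),
                            &block_resized);
    MPI_Type_commit(&block_resized);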
On 05/05/2014 09:20 PM, Richard Shaw wrote:
Hello,
I think I've come across a bug when using ROMIO to read in a 2D
distributed array. I've attached a test case to this email.
Richard: may I add this test case to ROMIO's test suite? I'm always on
the hunt for small self-contained tests.
I
I read the MPICH trac ticket you pointed to and your analysis seems pertinent.
The impact of my patch for “count = 0” has a similar outcome to yours: it removed
all references to the datatype if the count was zero, without looking for the
special markers.
Let me try to come up with a fix.
Thanks,
George,
> Here is basically what is happening. On the top left, I depicted the datatype
> resulting from the vector type. The two arrows point to the lower bound and
> upper bound (thus the extent) of the datatype. On the top right, the resized
> datatype, where the ub is now moved 2 elements a
On 8 May 2014 16:59, Rob Latham wrote:
>
> Richard: may I add this test case to ROMIO's test suite? I'm always on
> the hunt for small self-contained tests.
>
Please do. I'm glad it's being so useful - it seems to be hitting a
surprising number of bugs of different origins.
It might be an idea
The alltoall exchanges data from all nodes to all nodes, including the
local participant. So every participant will write the same amount of
data.
George.
On Thu, May 8, 2014 at 6:16 PM, Spenser Gilliland
wrote:
> George,
>
>> Here is basically what is happening. On the top left, I depicted t
George,
> The alltoall exchanges data from all nodes to all nodes, including the
> local participant. So every participant will write the same amount of
> data.
Yes, I believe that is what my code is doing. However, I'm not sure
why the out-of-bounds access is occurring. Can you be more specific? I
r
George,
I figured it out. The defined type was
MPI_Type_vector(N, wrows, N, MPI_FLOAT, &mpi_all_unaligned_t);
Where it should have been
MPI_Type_vector(wrows, wrows, N, MPI_FLOAT, &mpi_all_unaligned_t);
This clears up all the errors.
Thanks,
Spenser
On Thu, May 8, 2014 at 5:43 PM, S
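For readers following along, the signature is MPI_Type_vector(count, blocklength, stride, oldtype, newtype), so the buggy call described N blocks - a type spanning the full matrix height - and every rank's contribution then ran past its own wrows-row strip; the fix describes only the local block:

    MPI_Type_vector(wrows,   /* count: number of blocks (was N - too many) */
                    wrows,   /* blocklength: floats per block               */
                    N,       /* stride: floats between block starts (a row) */
                    MPI_FLOAT, &mpi_all_unaligned_t);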