At 16:19 09/05/2012, you wrote:
> In your code, the only point where it could fail is if one of the
> precalculated message sizes is computed wrongly and a Receive executes
> where it shouldn't.
Yes, but after the sizes are calculated they don't change, and that's
why I find it weird to
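A quick way to test that hypothesis is to swap the precalculated sizes
between every pair of ranks before any real data moves. This is only a
sketch, assuming nTasks and myRank variables alongside the
sendSize()/recvSize() helpers and the myw communicator from the code
below:

  for (int p = 0; p < nTasks; p++) {
      if (p == myRank) continue;
      int mySend = sendSize(p), theirSend = 0;
      /* one int is a small control message, so the eager protocol
         makes this naive rank ordering safe in practice */
      MPI_Sendrecv(&mySend, 1, MPI_INT, p, 99,
                   &theirSend, 1, MPI_INT, p, 99,
                   myw, MPI_STATUS_IGNORE);
      if (theirSend != recvSize(p))  /* a mismatch here is the bad Recv */
          fprintf(stderr, "rank %d: size mismatch with %d (%d vs %d)\n",
                  myRank, p, theirSend, recvSize(p));
  }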
At 15:59 08/05/2012, you wrote:
Yep, you are correct. I did the same and it worked. When I have more
than 3 MPI tasks there is a lot of overhead on the GPU, but for the CPU
there is no overhead. All three machines have 4 quad-core processors
with 3.8 GB RAM.
Just wondering why there is no degradation
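One plausible explanation, not something the thread confirms: with more
ranks than GPUs, several ranks time-share one device and their kernels
serialize, while 4 quad-core processors per machine leave enough spare
cores that the extra ranks barely cost anything on the CPU side. A
sketch that at least makes the sharing explicit (myRank is the MPI
rank; the modulo mapping assumes a round-robin rank layout):

  int nDevices = 0;
  cudaGetDeviceCount(&nDevices);     /* GPUs visible on this node */
  cudaSetDevice(myRank % nDevices);  /* ranks beyond nDevices share a GPU */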
Sorry for the delay, and sorry again because in my last mail I had the
wrong impression that it was some kind of homework problem.
At 17:41 04/05/2012, you wrote:
> The send/recv logic looks OK. Now, on lines 5 and 7, what values do
> the recvSize(p2) and recvSize(p1) functions return?
All the sendSizes and recvSizes
At 15:20 04/05/2012, you wrote:
Oops, I edited the code to make it easier to understand, but I forgot
to change two of the p2's, sorry ^^'.
I hope this one is completely right:
1: for(int p1=0; p1<nTasks; p1++){
2:   if(p1==myRank){
3:     for(int p2=0; p2<nTasks; p2++){
4:       if(sendSize(p2)) MPI_Ssend(sendBuffer[p2],sendSize(p2),MPI_FLOAT,p2,0,myw); //processor p1 sends data to p2
5:       if(recvSize(p2)) MPI_Recv(recvBuffer[p2],recvSize(p2),MPI_FLOAT,p2,0,myw,&stat);
6:   }}else{
7:     if(recvSize(p1)) MPI_Recv(recvBuffer[p1],recvSize(p1),MPI_FLOAT,p1,0,myw,&stat);
8:     if(sendSize(p1)) MPI_Ssend(sendBuffer[p1],sendSize(p1),MPI_FLOAT,p1,0,myw);
9: }}
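For what it is worth, the same exchange can be written with
MPI_Sendrecv, which posts the send and the receive together; a size
mismatch then surfaces as a truncation error instead of a hang. A
sketch using the names above plus an assumed nTasks/myRank:

  for (int p = 0; p < nTasks; p++) {
      if (p == myRank) continue;
      /* zero-length messages are legal, so no if(size) guards:
         both sides always post matching operations */
      MPI_Sendrecv(sendBuffer[p], sendSize(p), MPI_FLOAT, p, 0,
                   recvBuffer[p], recvSize(p), MPI_FLOAT, p, 0,
                   myw, MPI_STATUS_IGNORE);
  }

Visiting partners in plain rank order serializes the exchange a little,
but it cannot deadlock, which is convenient while debugging.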
At 11:52 04/05/2012, you wrote:
Hi all,
I have a program that executes a communication loop similar to this one:
1: for(int p1=0; p1<nTasks; p1++){
2:   if(p1==myRank){
3:     for(int p2=0; p2<nTasks; p2++){
4:       if(sendSize(p2)) MPI_Ssend(sendBuffer[p2],sendSize(p2),MPI_FLOAT,p2,0,myw);
5:       if(recvSize(p2)) MPI_Recv(recvBuffer[p2],recvSize(p2),MPI_FLOAT,p2,0,myw,&stat);
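When chasing a mismatch it also helps to confirm what actually arrived.
A sketch of line 5's receive with the delivered count checked (stat and
got are illustrative names):

  MPI_Status stat;
  MPI_Recv(recvBuffer[p2], recvSize(p2), MPI_FLOAT, p2, 0, myw, &stat);
  int got = 0;
  MPI_Get_count(&stat, MPI_FLOAT, &got);  /* floats actually delivered */
  if (got != recvSize(p2))
      fprintf(stderr, "rank %d: expected %d floats from %d, got %d\n",
              myRank, recvSize(p2), p2, got);

If the sender ships more than recvSize(p2) elements, MPI_Recv reports a
truncation error instead, so both directions of a mismatch become
visible.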
At 12:51 03/05/2012, you wrote:
Thanks for the reply.
When I modify the code, it still fails with a segmentation fault.
Do you run it on different servers, or does it run on the same server?
If you are testing on one server, perhaps your GPU is out of memory.
Check your cudaMalloc calls; perhaps memory is
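Checking the return code is the way to see that; a sketch with
illustrative names (dbuf, nelements):

  float *dbuf = NULL;
  cudaError_t err = cudaMalloc((void **)&dbuf, nelements * sizeof(float));
  if (err != cudaSuccess) {
      size_t freeB = 0, totalB = 0;
      cudaMemGetInfo(&freeB, &totalB);  /* device memory still available */
      fprintf(stderr, "cudaMalloc: %s (%zu of %zu bytes free)\n",
              cudaGetErrorString(err), freeB, totalB);
  }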
At 08:51 02/05/2012, you wrote:
Hi,
I am trying to execute the following code on a cluster.
run_kernel is a CUDA call with the signature int run_kernel(int
array[], int nelements, int taskid, char hostname[]);
... deleted code
mysum = run_kernel(&onearray[2000], chunksize, taskid, myname);
...
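One thing worth ruling out before blaming the GPU: if onearray holds
fewer than 2000 + chunksize elements, the pointer passed to run_kernel
already walks past the end of the host array. A sketch of a guard,
where nOnearray is a hypothetical name for the allocated length of
onearray:

  #include <assert.h>
  /* nOnearray: hypothetical, the number of ints onearray was allocated with */
  assert(2000 + chunksize <= nOnearray);  /* rule out host out-of-bounds */
  mysum = run_kernel(&onearray[2000], chunksize, taskid, myname);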