Hi all,

I'm using the SGE system on my school network, and would like to know whether the errors I received below mean there's something wrong with my MPI_Recv function.

[0,1,3][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=104
[0,1,2][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=104

In my code, I have

    /* executed by P1 to P(p-1) */
    for (row = 1 ; row <= size[0] ; row++)
        MPI_Send(&(cell[row][1]), length, stable, 0, rank, MPI_COMM_WORLD);

    <some other computations>

    /* P0 receive from P1 to P(p-2) */
    for (source = 1 ; source < (p-1) ; source++)
        for (r = 1 ; r <= size[0] ; r++)
            MPI_Recv(&(cell[r][1])+(source-1)*mlength, mlength, stable,
                     source, source, MPI_COMM_WORLD, &status);

    /* P0 receive from P(p-1) */
    for (r = 1 ; r <= size[0] ; r++)
        MPI_Recv(&(cell[r][1]) + (p-2)*mlength, size[k]-(p-2)*mlength,
                 stable, p-1, p-1, MPI_COMM_WORLD, &status);

When I used some printf statements to see when the errors occur, they usually occur in the middle of the first MPI_Recv function, usually when source is 2 and the value of r usually differs, i.e. the error does not seem to occur at the same exact row.

Basically what I'm trying to do is:

Say there are a total of 4 processors (p=4), P0 - P3. P1 and P2 each have a (size[0]+1)-by-(mlength+1) matrix "cell", and P3 has a (size[0]+1)-by-(length+1) matrix "cell". For P1 to P2, length = mlength.

    size[k] = (p-2)*mlength + length(in P3)

I'm trying to send the matrix "cell" in P1, P2 and P3 to P0, then have P0 combine them into one (size[0]+1)-by-(size[k]+1) matrix "cell". I'm sending the matrix row-by-row.

In short, say the matrix in P1, P2 and P3 are

    ----    ----    -----
    -###    -ooo    -@@@@
    -###    -ooo    -@@@@
    -###    -ooo    -@@@@

respectively. size[0] = 3, size[k] = 10, length = 3 for P1 and P2, length = 4 for P3 and mlength = 3.

I now need to combine them into 1 table in P0:

    -----------
    -###ooo@@@@
    -###ooo@@@@
    -###ooo@@@@

What is strange is I do this combination of matrices more than once in my DNA sequence alignment program, and the error occurs only when it tries to combine matrices from one or two particular sequences, but not the others.

Please help.

Thank you.

Regards,
Rayne