Hi folks,

I have been seeing some nasty behaviour in MPI_Send/MPI_Recv with a large dataset (8 MB) when OpenMP and Open MPI are used together over the IB interconnect. A reproducer program is attached.

The code first calls MPI_Init_thread(), followed by the OpenMP thread-creation API. The program works fine if we do single-sided communication [thread 0 of process 0 sending some data to any thread of process 1], but it hangs if both sides try to send data (8 MB) over the IB interconnect. Interestingly, the program works fine if we send short data (1 MB or below).

I see this with:
openmpi-1.2 or openmpi-1.2.4 (compiled with --enable-mpi-threads)
OFED 1.2
kernel 2.6.9-42.4sp.XCsmp
icc (Intel Compiler)

Compiled as:
mpicc -O3 -openmp temp.c

Run as:
mpirun -np 2 -hostfile nodelist a.out

The error I am getting is:
------------------------------------------------------------------------
[0,1,1][btl_openib_component.c:1199:btl_openib_component_progress] from n129 to: n115 error polling LP CQ with status LOCAL PROTOCOL ERROR status number 4 for wr_id 6391728 opcode 0
[0,1,1][btl_openib_component.c:1199:btl_openib_component_progress] from n129 to: n115 error polling LP CQ with status WORK REQUEST FLUSHED ERROR status number 5 for wr_id 7058304 opcode 128
[0,1,0][btl_openib_component.c:1199:btl_openib_component_progress] from n115 to: n129 error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for wr_id 6920112 opcode 0
[0,1,0][btl_openib_component.c:1199:btl_openib_component_progress] from n115 to: n129 error polling LP CQ with status WORK REQUEST FLUSHED ERROR status number 5 for wr_id 6854256 opcode 128
------------------------------------------------------------------------
Is anyone else seeing something similar? Any ideas for workarounds?

As a point of reference, the program works fine if we force Open MPI to select the TCP interconnect using --mca btl tcp,self.

-Neeraj
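
P.S. For the TCP comparison, the run line is just the same mpirun command with the btl forced, i.e. something like:

mpirun --mca btl tcp,self -np 2 -hostfile nodelist a.out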
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <omp.h>

#define MAX 1000000	/* 1,000,000 doubles = 8 MB per message */

int main(int argc, char *argv[])
{
	int		required = MPI_THREAD_MULTIPLE;
	int		provided;
	int		rank;
	int		size;
	int		id;
	MPI_Status	status;
	double		*buff1, *buff2;

	MPI_Init_thread(&argc, &argv, required, &provided);
	MPI_Comm_rank(MPI_COMM_WORLD, &rank);
	MPI_Comm_size(MPI_COMM_WORLD, &size);

	buff1 = (double *)malloc(sizeof(double) * MAX);
	buff2 = (double *)malloc(sizeof(double) * MAX);

	omp_set_num_threads(2);

	/* Two OpenMP threads per rank: thread 0 carries one direction of the
	 * exchange and thread 1 the other, so both ranks send 8 MB at the
	 * same time. */
	#pragma omp parallel private(id)
	{
		id = omp_get_thread_num();
		if (rank == 0)
		{
			if (id == 0)
				MPI_Send(buff1, MAX, MPI_DOUBLE, 1, rank, MPI_COMM_WORLD);
			else
				MPI_Recv(buff2, MAX, MPI_DOUBLE, 1, 1234, MPI_COMM_WORLD, &status);
		}
		if (rank == 1)
		{
			if (id == 0)
				MPI_Recv(buff1, MAX, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
			else
				MPI_Send(buff2, MAX, MPI_DOUBLE, 0, 1234, MPI_COMM_WORLD);
		}
	}
	printf("rank = %d, provided thread level = %d\n", rank, provided);
	free(buff1);
	free(buff2);
	MPI_Barrier(MPI_COMM_WORLD);
	MPI_Finalize();
	return 0;
}
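
(Side note: a minimal guard one could add right after MPI_Init_thread() in the reproducer, just to rule out the library silently downgrading the thread level, would look something like the following; it is not part of the attached program.)

	/* Hypothetical guard: concurrent MPI_Send/MPI_Recv from OpenMP threads
	 * is only defined when the library grants MPI_THREAD_MULTIPLE. */
	if (provided < MPI_THREAD_MULTIPLE) {
		fprintf(stderr, "rank %d: asked for MPI_THREAD_MULTIPLE, got level %d\n",
		        rank, provided);
		MPI_Abort(MPI_COMM_WORLD, 1);
	}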
