Hi all,

I am a research assistant (RA) in the High Performance Scientific Computing Lab at NUST, Pakistan. I am working on a parallel implementation of the Finite-Difference Time-Domain (FDTD) method using MPI. I am using Open MPI on a cluster of 4 Sun Fire V890 machines connected through Myrinet. My problem is that when I run my code with, let's say, 4 processes, process 0 takes about 3 times longer than the other three processes to execute one for loop, and this loop is the main cause of load imbalance in my code. The code that is causing the problem is given below. It is run by all the processes simultaneously and independently, and I have timed it separately from the other segments of the code.

    start = gethrtime();
    for (m = 1; m < my_x; m++) {
        for (n = 1; n < size_y-1; n++) {
            Ez(m,n) = Ez(m,n) + cezh*((Hy(m,n) - Hy(m-1,n)) - (Hx(m,n) - Hx(m,n-1)));
        }
    }
    stop = gethrtime();
    time = (stop-start);

In my implementation I use 1-D arrays to represent the 2-D arrays. I use the following macros to access the array elements:

    #define Hx(I,J) hx[(I)*(size_y) + (J)]
    #define Hy(I,J) hy[(I)*(size_y) + (J)]
    #define Ez(I,J) ez[(I)*(size_y) + (J)]

Can anyone tell me what I am doing wrong here? Are the macros creating the problem, or could it be related to an OS issue? I am looking forward to any help, because this problem has stopped my progress for the last two weeks.

Regards,
Aftab Hussain
RA, High Performance Scientific Computing Lab
NUST Institute of Information Technology
National University of Sciences and Technology, Pakistan
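
P.S. In case it helps anyone reproduce the measurement without MPI, here is a minimal standalone sketch of the same loop. It is only a sketch under some assumptions: the function names now_ns and update_ez are made up for this example, the arrays are passed as arguments instead of being globals, and I have swapped the Solaris-specific gethrtime() for POSIX clock_gettime() so it compiles on other systems too.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() under strict -std modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Same 1-D indexing as in my code; the macros pick up the local
   variables hx, hy, ez and size_y of the enclosing function. */
#define Hx(I,J) hx[(I)*(size_y) + (J)]
#define Hy(I,J) hy[(I)*(size_y) + (J)]
#define Ez(I,J) ez[(I)*(size_y) + (J)]

/* Assumption: portable stand-in for gethrtime(), monotonic time in ns. */
static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: one Ez update sweep over a my_x by size_y
   sub-domain. Returns the elapsed time of the sweep in nanoseconds. */
static long long update_ez(double *ez, const double *hx, const double *hy,
                           int my_x, int size_y, double cezh)
{
    int m, n;
    long long start;

    assert(ez != NULL && hx != NULL && hy != NULL);

    start = now_ns();
    for (m = 1; m < my_x; m++) {
        for (n = 1; n < size_y - 1; n++) {
            Ez(m,n) = Ez(m,n) + cezh * ((Hy(m,n) - Hy(m-1,n))
                                      - (Hx(m,n) - Hx(m,n-1)));
        }
    }
    return now_ns() - start;
}
```

To use it, allocate three arrays of my_x*size_y doubles, fill them, call update_ez(ez, hx, hy, my_x, size_y, cezh) and print the returned value. Running this with identical sizes and data on each node separately would at least show whether the loop itself behaves differently from machine to machine.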