If you only have one thread making MPI calls, then SINGLE and FUNNELED are 
indeed the same. Since this only happens after long run times, I'd suspect 
resource exhaustion. You might check your memory footprint to see whether you 
are running into leak issues (the leak could be in our library as well as in 
your app). When you eventually deadlock, do you get any error output? If you 
are using InfiniBand and run out of queue pairs (QPs), you should at least get 
a message saying so.
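
For what it's worth, here is a minimal sketch of the kind of check I have in 
mind, assuming the Fortran "use mpi" bindings; the program name and the 
log_vmrss helper are just illustrative, not anything from your code. It asks 
for FUNNELED, verifies the level the library actually granted, and logs the 
resident set size from /proc/self/status (Linux-specific) so a slow leak 
shows up long before the hang:

    program thread_check
      use mpi
      implicit none
      integer :: provided, rank, ierr

      ! Ask for FUNNELED and verify what the library actually granted.
      call MPI_INIT_THREAD(MPI_THREAD_FUNNELED, provided, ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      if (provided < MPI_THREAD_FUNNELED) then
         print *, 'rank', rank, ': MPI provides only thread level', provided
      end if

      ! ... main loop: call log_vmrss every few thousand time steps ...
      call log_vmrss

      call MPI_FINALIZE(ierr)

    contains

      ! Print this rank's resident set size. Linux-specific: reads
      ! /proc/self/status, which should be fine on your cluster.
      subroutine log_vmrss
        character(len=128) :: line
        integer :: u, ios
        open (newunit=u, file='/proc/self/status', action='read', iostat=ios)
        if (ios /= 0) return
        do
           read (u, '(a)', iostat=ios) line
           if (ios /= 0) exit
           if (line(1:6) == 'VmRSS:') then
              print *, 'rank', rank, ': ', trim(line)
              exit
           end if
        end do
        close (u)
      end subroutine log_vmrss

    end program thread_check

If the VmRSS number climbs steadily over days on one or more ranks, that 
points at a leak rather than a messaging bug.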


On Oct 15, 2014, at 8:22 AM, McGrattan, Kevin B. Dr. <kevin.mcgrat...@nist.gov> 
wrote:

> I am using Open MPI 1.8.3 on a Linux cluster to run fairly long CFD 
> (computational fluid dynamics) simulations using 16 MPI processes. The 
> calculations last several days and typically involve millions of MPI 
> exchanges. I use the Intel Fortran compiler, and when I compile with the 
> -openmp option and run with only one OpenMP thread per MPI process, I tend to 
> get deadlocks after several days of computing. These deadlocks occur in only 
> about 1 out of 10 calculations, and only after running for several days. I 
> have eliminated things like network glitches, power spikes, etc., as 
> possibilities. The only thing left is the inclusion of the OpenMP option, 
> even though I am running with just one OpenMP thread per MPI process. I have 
> read about the issues with MPI_INIT_THREAD, and I have reduced the REQUIRED 
> level of support to MPI_THREAD_FUNNELED, down from MPI_THREAD_SERIALIZED. The 
> latter was not necessary for my application, and I think the reduction in 
> level of support has reduced, but not completely eliminated, the deadlocking. 
> Of course, there is always the possibility that I have coded my MPI calls 
> improperly, even though the code runs for days on end. Maybe there's a 
> one-in-a-million possibility that rank x gets to a point in the code so far 
> ahead of all the other ranks that a deadlock occurs. Placing MPI_BARRIER 
> calls has not helped me find any such situation.
>  
> So I have two questions. First, has anyone experienced something similar, 
> where including OpenMP in an MPI code has caused deadlocks? Second, is it 
> possible that reducing the REQUIRED level of support to MPI_THREAD_SINGLE 
> will cause the code to behave differently than with FUNNELED? I have read in 
> another post that SINGLE and FUNNELED are essentially the same thing. I have 
> even noted that I can still spawn OpenMP threads when I use SINGLE.
>  
> Thanks
>  
> Kevin McGrattan
> National Institute of Standards and Technology
> 100 Bureau Drive, Mail Stop 8664
> Gaithersburg, Maryland 20899
>  
> 301 975 2712
>  
