Re: [OMPI users] Hang in MPI_Comm_split in 2 RHEL Linux nodes with INTEL MIC cards

2015-05-05 Thread Nathan Hjelm
If any communication will be between two MICs on the same node, or between a MIC and its host, I suggest using the scif btl instead of tcp. You will see a factor of 10 or more improvement in latency by using the scif interface. -Nathan On Tue, May 05, 2015 at 10:39:47AM +0530, Manumachu Reddy wrot

Re: [OMPI users] Hang in MPI_Comm_split in 2 RHEL Linux nodes with INTEL MIC cards

2015-05-05 Thread Manumachu Reddy
Hi George, Sorry for the delay in writing to you. Your latest suggestion has worked admirably well. Thanks a lot for your help. On Sun, Apr 26, 2015 at 9:32 PM, George Bosilca wrote: > With the arguments I sent you the error about connection refused should > have disappeared. Let's try to fo

Re: [OMPI users] Hang in MPI_Comm_split in 2 RHEL Linux nodes with INTEL MIC cards

2015-04-26 Thread George Bosilca
With the arguments I sent you, the error about connection refused should have disappeared. Let's try to force all traffic over the first TCP interface, eth3. Try adding the following flags to your mpirun command: --mca pml ob1 --mca btl tcp,sm,self --mca btl_tcp_if_include eth3 George. On Sun, Apr 26, 2015 at
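
The flags George lists could be assembled into a full command line roughly as follows. This is only a sketch: `appfile` is borrowed from the command quoted elsewhere in the thread, `eth3` is assumed to be the interface name on both nodes, and the `IFACE` and `MCA_FLAGS` variable names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: eth3 and appfile are taken from the thread; adjust for your cluster.
IFACE="eth3"
MCA_FLAGS=(
  --mca pml ob1                      # use the ob1 point-to-point layer
  --mca btl tcp,sm,self              # TCP, shared-memory, and self (loopback) BTLs
  --mca btl_tcp_if_include "$IFACE"  # restrict TCP traffic to this one interface
)
# Print the command for inspection; drop the echo to actually launch the job.
echo mpirun "${MCA_FLAGS[@]}" -app appfile
```

Note that btl_tcp_if_include and btl_tcp_if_exclude are mutually exclusive, so this form replaces any exclude list rather than adding to it.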

Re: [OMPI users] Hang in MPI_Comm_split in 2 RHEL Linux nodes with INTEL MIC cards

2015-04-26 Thread Manumachu Reddy
Hi George, I am afraid the suggestion to use btl_tcp_if_exclude has not worked. I executed the following command: shell$ mpirun --mca btl_tcp_if_exclude mic0,mic1 -app appfile Please let me know if there are options to mpirun (apart from -v) to get verbose output to understand what is happen

Re: [OMPI users] Hang in MPI_Comm_split in 2 RHEL Linux nodes with INTEL MIC cards

2015-04-24 Thread George Bosilca
Manumachu, Both nodes have the same IP for their Phi (mic0 and mic1). This is OK as long as they don't try to connect to each other using these addresses. A simple fix is to prevent OMPI from using the supposedly local mic0 and mic1 IPs. Add --mca btl_tcp_if_exclude mic0,mic1 to your mpirun comm
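
A hedged sketch of the exclude approach follows. As far as I know, setting btl_tcp_if_exclude replaces Open MPI's default exclude list, which normally covers the loopback interface, so loopback is usually listed again explicitly. The loopback name `lo` and the variable names are assumptions; `mic0`, `mic1`, and `appfile` come from the thread.

```shell
#!/usr/bin/env bash
# Sketch only: mic0 and mic1 come from the thread; "lo" is an assumed loopback name.
EXCLUDE_IFACES=(lo mic0 mic1)
# Join the interface names with commas, as the MCA parameter expects.
EXCLUDE_LIST="$(IFS=,; echo "${EXCLUDE_IFACES[*]}")"
# Print the command for inspection; drop the echo to actually launch the job.
echo mpirun --mca btl_tcp_if_exclude "$EXCLUDE_LIST" -app appfile
```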

[OMPI users] Hang in MPI_Comm_split in 2 RHEL Linux nodes with INTEL MIC cards

2015-04-24 Thread Manumachu Reddy
Dear OpenMPI Users, I request your help to resolve a hang in my OpenMPI application. My OpenMPI application hangs in the MPI_Comm_split() operation. The code for this simple application is at the end of this email. Broadcast works fine. My experimental setup consists of two RHEL 6.4 Linux nodes. Eac