[OMPI users] jobs are hanging with btl_openib_component error

2013-06-17 Thread Singh, Bharati (GE Global Research, consultant)
Hi Team, Our users' jobs are hanging, and we see the errors below. [[61410,1],65][btl_openib_component.c:3238:handle_wc] from bng1aviationdc22 to: bng1aviationdc26 error polling LP CQ with status RETRY EXCEEDED ERROR status number 12 for wr_id 774739584 opcode 1 vendor error 129 qp_idx 0

[OMPI users] lsb_launch failed: 0

2013-06-17 Thread Singh, Bharati (GE Global Research, consultant)
Hi Team, Our users' jobs are exiting with the error below on random nodes. Could you please help us resolve this issue? [root@bng1grcdc200 output.228472]# cat user_script.stderr [bng1grcdc181:08381] [[54933,0],0] ORTE_ERROR_LOG: The specified application failed to start in file plm_lsf_modu

Re: [OMPI users] jobs are hanging with btl_openib_component error

2013-06-17 Thread Jeff Squyres (jsquyres)
That sounds like there's a problem with your InfiniBand fabric. You should run a complete level-0 diagnostic on your IB network. On Jun 17, 2013, at 5:23 AM, "Singh, Bharati (GE Global Research, consultant)" wrote: > Hi Team, > > Our users jobs are hanging and we notice below errors. >

Re: [OMPI users] lsb_launch failed: 0

2013-06-17 Thread Jeff Squyres (jsquyres)
I'm not an LSF expert, but this usually means that the Open MPI helper executable named "orted" was not able to be found on the remote nodes. Is your PATH set properly, both locally and remotely, such that the Open MPI executables can be found? On Jun 17, 2013, at 7:01 AM, "Singh, Bharati (GE
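The PATH check suggested above can be sketched as a small shell helper. This is only an illustration, not something from the original thread: the function names `check_orted` and `check_orted_remote` are hypothetical, and the remote variant assumes passwordless ssh to the compute nodes. Note that a non-interactive ssh session may see a different PATH than a login shell, which is roughly the situation a remotely launched `orted` is in.

```shell
#!/bin/sh
# Hypothetical helper: report whether the Open MPI helper "orted"
# can be found through PATH on the local node.
check_orted() {
    if orted_path=$(command -v orted 2>/dev/null); then
        echo "orted found: $orted_path"
        return 0
    else
        echo "orted NOT found in PATH=$PATH"
        return 1
    fi
}

# Hypothetical helper: run the same lookup on each host given as an
# argument. Assumes passwordless ssh; BatchMode makes ssh fail instead
# of prompting for a password.
check_orted_remote() {
    for host in "$@"; do
        printf '%s: ' "$host"
        ssh -o BatchMode=yes "$host" 'command -v orted || echo "orted NOT found"'
    done
}

# Local check; "|| true" keeps the script going if orted is missing.
check_orted || true
```

To check the nodes a job would run on, one might call `check_orted_remote` with their host names, for example the nodes named in the failing job's output.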

[OMPI users] Troubles Building OpenMPI on MinGW-w64 (GCC 4.8.0)

2013-06-17 Thread Haroogan
Hello, I'm trying to build OpenMPI with CMake under MinGW-w64 based on GCC 4.8.0 (POSIX Threads), and here is what I get: In file included from ../opal/threads/mutex_windows.h:36:0, from ../opal/threads/mutex.h:121, from ../opal/event/event.h:161,

Re: [OMPI users] jobs are hanging with btl_openib_component error

2013-06-17 Thread Shamis, Pavel
You can use a tool such as ibdiagnet (http://linux.die.net/man/1/ibdiagnet) to debug your IB network problems. Most likely, you have a bad cable or connector somewhere in the network. The tool should be able to pinpoint the problem. Pavel (Pasha) Shamis --- Computer Science Research Group Computer Scien

Re: [OMPI users] MPI_Init_thread hangs in OpenMPI 1.7.1 when using --enable-mpi-thread-multiple

2013-06-17 Thread Ralph Castain
Hmmm...well, your code runs fine for me: Ralphs-iMac:mpi rhc$ mpirun -n 2 ./thread_init Calling MPI_Init_thread... Calling MPI_Init_thread... MPI_Init_thread returned, provided = 3 MPI_Init_thread returned, provided = 3 Ralphs-iMac:mpi rhc$ I think the key, however, is that you also have to conf

Re: [OMPI users] MPI_Init_thread(..., MPI_THREAD_SERIALIZED) hangs under OSX 10.8.4 if compiled with OpenMPI 1.7.1

2013-06-17 Thread Ralph Castain
For 1.7, you also have to configure with --enable-opal-multi-thread. I suspect MacPorts doesn't do that, so you might want to configure and build your own version. On Jun 16, 2013, at 5:27 AM, Hans Ekkehard Plesser wrote: > > On 15. juni 2013, at 01.31, Ralph Castain wrote: > >> I have no

Re: [OMPI users] Troubles Building OpenMPI on MinGW-w64 (GCC 4.8.0)

2013-06-17 Thread Ralph Castain
What version of OMPI are you using? On Jun 17, 2013, at 10:50 AM, Haroogan wrote: > Hello, > > I'm trying to build OpenMPI with CMake under MinGW-w64 based on GCC 4.8.0 > (POSIX Threads), and here is what I get: > > In file included from ../opal/threads/mutex_windows.h:36:0, >

Re: [OMPI users] Troubles Building OpenMPI on MinGW-w64 (GCC 4.8.0)

2013-06-17 Thread Haroogan
> What version of OMPI are you using? The latest stable release: 1.6.4.

Re: [OMPI users] Troubles Building OpenMPI on MinGW-w64 (GCC 4.8.0)

2013-06-17 Thread Ralph Castain
You should just be able to pull a native Win version down from our web site - building it yourself can be an adventure on Windows. I'm afraid our Windows supporter has moved on to other pastures, so our ability to help with such builds is pretty minimal. I don't think he did anything with MinGW