Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-12 Thread François Trahay
The machines have 4 cores. THREADS_DEFAULT is an upper limit: the program spawns threads one at a time. So at the beginning, only one thread performs the ping pong test, then a thread is created and the two threads run the ping pong test, then a thread is created and 3 threads run the

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-12 Thread Scott Atchley
Francois, How many cores do your machines have? The file specifies THREADS_DEFAULT 16. Does this spawn 16 threads per MPI rank? I see crashes when I run this with MX (BTL with mx,sm,self and MTL). If I change THREADS_DEFAULT to 4, I see crashes with TCP (BTL with tcp,sm,self) as well.

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-12 Thread François Trahay
Here's the program. It should print something like this: [1 communicating threads] [0] 1 2.484936 0.402 0.384 [0] 2 2.478036 0.807 0.770 [0] 4 2.501503 1.599 1.525 [0] 8 2.497516 3.203

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread George Bosilca
I will take a look at the BTL problem. Can you provide a copy of the benchmarks, please? Thanks, george. On Jun 11, 2009, at 16:05, François Trahay wrote: concurrent_ping

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread François Trahay
Oops. Here's the trace using the BTL. Francois Scott Atchley wrote: By specifying --mca pml cm, both traces are using the MTL. To use the BTL, try: $ mpiexec --mca btl mx,sm,self -machinefile ./joe -np 2 ./concurrent_ping or simply: $ mpiexec -machinefile ./joe -np 2 ./concurrent_ping Scot

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread Scott Atchley
On Jun 11, 2009, at 2:20 PM, François Trahay wrote: The stack trace is from the MX MTL (I attach the backtraces I get with both MX MTL and MX BTL). Here is the program that I use. It is quite simple. It runs ping pongs concurrently (with one thread per node, then with two threads per node, e

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread George Bosilca
Based on the stack trace, at one point (depth 4) we are in the MX MTL and then we call free. It might happen that two threads call free simultaneously ... It is a guess, as there is not enough information to corroborate this. george. On Jun 11, 2009, at 13:17, Scott Atchley wrote: Br

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread Brian Barrett
Almost assuredly, the MTL is not thread safe, and such support is unlikely to happen in the short term. You might be better off concentrating on the BTL, as George has done significant work on that front. Brian On Jun 11, 2009, at 12:20 PM, François Trahay wrote: The stack trace is from

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread François Trahay
The stack trace is from the MX MTL (I attach the backtraces I get with both MX MTL and MX BTL). Here is the program that I use. It is quite simple. It runs ping pongs concurrently (with one thread per node, then with two threads per node, etc.). The error occurs when two threads run concurrently.
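
The ramp-up described above (one communicating thread, then two, and so on up to a limit) can be sketched roughly as follows. This is not the attached benchmark: the MPI calls are left out so the sketch builds without an MPI installation, each round re-creates its threads instead of keeping them alive, and the names run_ramp, ping_pong_worker, worker_arg and ITERATIONS are hypothetical. THREADS_DEFAULT is the name used in the thread; the value 4 is only the one tried in the replies.

```c
#include <pthread.h>
#include <stdatomic.h>

#define THREADS_DEFAULT 4   /* upper limit on concurrent threads (assumed value) */
#define ITERATIONS 1000     /* ping pongs per thread per round (assumed value) */

static atomic_long completed;   /* total ping pong iterations finished */

struct worker_arg {
    int tag;   /* thread #k uses tag #k */
};

/* Hypothetical stand-in for one thread's ping pong loop. */
static void *ping_pong_worker(void *p)
{
    struct worker_arg *arg = p;
    for (int i = 0; i < ITERATIONS; i++) {
        /* In the real benchmark, a send and the matching receive
         * using arg->tag would go here. */
        (void)arg->tag;
        atomic_fetch_add(&completed, 1);
    }
    return NULL;
}

/* Run rounds with 1, 2, ..., max_threads concurrent workers.
 * Returns the total number of iterations completed, or -1 if
 * max_threads is out of range or a thread could not be created. */
long run_ramp(int max_threads)
{
    pthread_t tid[THREADS_DEFAULT];
    struct worker_arg args[THREADS_DEFAULT];

    if (max_threads < 1 || max_threads > THREADS_DEFAULT)
        return -1;

    atomic_store(&completed, 0);
    for (int n = 1; n <= max_threads; n++) {
        int started = 0;
        for (int t = 0; t < n; t++) {
            args[t].tag = t + 1;
            if (pthread_create(&tid[t], NULL, ping_pong_worker, &args[t]) != 0)
                break;
            started++;
        }
        /* Wait for every worker of this round before starting the next. */
        for (int t = 0; t < started; t++)
            pthread_join(tid[t], NULL);
        if (started != n)
            return -1;
    }
    return atomic_load(&completed);
}
```

Calling run_ramp(THREADS_DEFAULT) from main walks through the 1-, 2-, 3- and 4-thread rounds. In an MPI version, the failure reported here would be expected to show up from the second round on, when two threads are inside the library at the same time.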

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread Scott Atchley
Brian and George, I do not know if the stack trace is complete, but I do not see any mx_* functions called which would indicate a crash inside MX due to multiple threads trying to complete the same request. It does show an assert failed. Francois, is the stack trace from the MX MTL or BTL

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread Brian Barrett
Neither the CM PML nor the MX MTL has been looked at for thread safety. There's not much code to cause problems in the CM PML. The MX MTL would likely need some work to ensure the restrictions Scott mentioned are met (currently, there's no such guarantee in the MX MTL). Brian On Jun 11, 2

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread George Bosilca
The comment on the FAQ (and on the other thread) is only true for some BTLs (TCP, SM and MX). I don't have resources to test the other BTLs; it is their developers' responsibility to do the required modifications to make them thread safe. In addition, I have to confess that I never teste

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread Scott Atchley
Francois, For threads, the FAQ has: http://www.open-mpi.org/faq/?category=supported-systems#thread-support It mentions that thread support is designed in, but lightly tested. It is also possible that the FAQ is out of date and MPI_THREAD_MULTIPLE is fully supported. The stack trace below

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-11 Thread François Trahay
Well, according to George Bosilca (http://www.open-mpi.org/community/lists/users/2005/02/0005.php), threads are supported in OpenMPI. The program I try to run works with the TCP stack, and the MX driver is thread-safe, so I guess the problem comes from the MX BTL or MTL. Francois Scott Atchley wr

Re: [OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-09 Thread Scott Atchley
Hi Francois, I am not familiar with the internals of the OMPI code. Are you sure, however, that threads are fully supported yet? I was under the impression that thread support was still partial. Can anyone else comment? Scott On Jun 8, 2009, at 8:43 AM, François Trahay wrote: Hi, I'm en

[OMPI users] Problem with OpenMPI (MX btl and mtl) and threads

2009-06-08 Thread François Trahay
Hi, I'm encountering some issues when running a multithreaded program with OpenMPI (trunk rev. 21380, configured with --enable-mpi-threads). My program (included in the tar.bz2) uses several pthreads that perform ping pongs concurrently (thread #1 uses tag #1, thread #2 uses tag #2, etc.). This prog