Re: [OMPI users] OpenMPI-1.10.0 bind-to core error

2015-09-16 Thread Patrick Begou
Thanks all for your answers, I've added some details about the tests I have run. See below. Ralph Castain wrote: Not precisely correct. It depends on the environment. If there is a resource manager allocating nodes, or you provide a hostfile that specifies the number of slots on the nodes,

Re: [OMPI users] runtime MCA parameters

2015-09-16 Thread marcin.krotkiewski
Thanks a lot, that looks right! Looks like I have some reading to do. Do you know if in the OpenMPI implementation the MPI_T-interfaced MCA settings are thread-local, or rank-local? Thanks! Marcin On 09/15/2015 07:58 PM, Nathan Hjelm wrote: You can use MPI_T to set any MCA variable before MPI_In

[OMPI users] bug in MPI_Comm_accept?

2015-09-16 Thread marcin.krotkiewski
I have run into a freeze / potential bug when using MPI_Comm_accept in a simple client / server implementation. I have attached the two simplest programs I could produce: 1. mpi-receiver.c opens a port using MPI_Open_port, saves the port name to a file 2. mpi-receiver enters an infinite loop and

Re: [OMPI users] bug in MPI_Comm_accept?

2015-09-16 Thread Jalel Chergui
Can you check with an MPI_Finalize in the receiver? Jalel On 16/09/2015 16:06, marcin.krotkiewski wrote: I have run into a freeze / potential bug when using MPI_Comm_accept in a simple client / server implementation. I have attached the two simplest programs I could produce: 1. mpi-receiver.

Re: [OMPI users] bug in MPI_Comm_accept?

2015-09-16 Thread Marcin Krotkiewski
But where would I put it? If I put it in the while(1), then MPI_Comm_accept cannot be called a second time. If I put it outside of the loop it will never be called. On 09/16/2015 04:18 PM, Jalel Chergui wrote: Can you check with an MPI_Finalize in the receiver? Jalel On 16/09/2015 16:

Re: [OMPI users] bug in MPI_Comm_accept?

2015-09-16 Thread Jalel Chergui
Right, anyway Finalize is necessary at the end of the receiver. The other issue is the Barrier, which is probably invoked when the sender has already exited, hence changing the size of the intercommunicator. Can you comment out that line in both files? Jalel On 16/09/2015 16:22, Marcin Krotkiewski wrote: But where woul

Re: [OMPI users] bug in MPI_Comm_accept?

2015-09-16 Thread marcin.krotkiewski
I have removed the MPI_Barrier, to no avail. The same thing happens. With added verbosity, I get the following message before the receiver hangs: [node2:03928] mca: bml: Using openib btl to [[12620,1],0] on node node3 So it is somewhere in the openib btl module. Marcin On 09/16/2015 04:34 PM, Jalel

Re: [OMPI users] OpenMPI-1.10.0 bind-to core error

2015-09-16 Thread Ralph Castain
As I said, if you don’t provide an explicit slot count in your hostfile, we default to allowing oversubscription. We don’t have OAR integration in OMPI, and so mpirun isn’t recognizing that you are running under a resource manager - it thinks this is just being controlled by a hostfile. If you

Re: [OMPI users] runtime MCA parameters

2015-09-16 Thread Jeff Squyres (jsquyres)
On Sep 16, 2015, at 8:22 AM, marcin.krotkiewski wrote: > > Thanks a lot, that looks right! Looks like some reading to do.. > > Do you know if in the OpenMPI implementation the MPI_T-interfaced MCA > settings are thread-local, or rank-local? By "rank local", I assume you mean "process local"

Re: [OMPI users] bug in MPI_Comm_accept?

2015-09-16 Thread Jalel Chergui
With openmpi-1.7.5, the sender segfaults. Sorry, I cannot see the problem in the code. Perhaps people out there can help. Jalel On 16/09/2015 16:40, marcin.krotkiewski wrote: I have removed the MPI_Barrier, to no avail. Same thing happens. Adding verbosity, before the receiver hangs I

[OMPI users] open mpi gcc

2015-09-16 Thread Kumar, Sudhir
Hi, We are currently using openmpi 1.8.5 and gcc 4.4.7. We would like to change the associated gcc to gcc 4.1.2 for our openmpi 1.8.5 installation. Is this possible? If so, how can it be done? Thanks, Sudhir Kumar

Re: [OMPI users] bug in MPI_Comm_accept? (UNCLASSIFIED)

2015-09-16 Thread Burns, Andrew J CTR USARMY RDECOM ARL (US)
CLASSIFICATION: UNCLASSIFIED Have you attempted using 2 cores per process? I have noticed that MPI_Comm_accept sometimes behaves strangely on single core variations. I have a program that makes use of Comm_accept/connect and I also call MPI_Comm_merge. So, you may want to look into that call as

Re: [OMPI users] open mpi gcc

2015-09-16 Thread Ralph Castain
Have you tried just rebuilding OMPI after setting gcc 4.1.2 at the front of your PATH and LD_LIBRARY_PATH? > On Sep 16, 2015, at 8:58 AM, Kumar, Sudhir wrote: > > Hi > We are currently using openmpi 1.8.5 and gcc 4.4.7, we would like to change > the associated gcc to gcc 4.1.2 for our openmpi

Re: [OMPI users] open mpi gcc

2015-09-16 Thread Kumar, Sudhir
Haven't tried that. Will try that approach. Thanks Sudhir Kumar -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: Wednesday, September 16, 2015 11:05 AM To: Open MPI Users Subject: Re: [OMPI users] open mpi gcc Have you tried just rebuil

Re: [OMPI users] bug in MPI_Comm_accept? (UNCLASSIFIED)

2015-09-16 Thread marcin.krotkiewski
Thank you all for your replies. I have now tested the code with various setups and versions. First of all, the tcp btl seems to work fine (I had the patience to check ~10 runs); the openib btl is the problem. I have also compiled using the Intel compiler and the story is the same as when using gcc.

[OMPI users] XHPL question

2015-09-16 Thread Mark Moorcroft
I found the thread below from May. I’m setting up a new cluster and using openmpi 1.10. I have a GNU build and an Intel build. Neither has libmpi.so.1. I created a symlink and it’s working. My question is whether I should try to rebuild LAPACK, and whether it is wise to add that link. For me it’s just burn-in a

[OMPI users] Contact?

2015-09-16 Thread Mark Moorcroft
It's worth a mention that I made several attempts to add a NASA email address to this list. Nothing ever happened. Nothing bounced. I never got a validation email. Nothing appeared in our spam filter. I emailed webmaster@openmpi and got no reply. There seems to be no contact conduit but these lists

Re: [OMPI users] Contact?

2015-09-16 Thread Jeff Squyres (jsquyres)
Sorry for the trouble. FWIW, there actually is a real, live sysadmin at Indiana University who actually receives the webmaster emails; he usually forwards such emails to me. Ping me off-list and we can dig into why your NASA email address didn't work (i.e., I can ask the IU sysadmins to look i

Re: [OMPI users] XHPL question

2015-09-16 Thread Ralph Castain
Jeff will undoubtedly start typing before he reads my response, so I'll spare you from reading all the ugly details twice :-) There was an unintentional ABI break in the 1.8 series that necessitated a version numbering change to libmpi. It involves the code that handles the connection between a pr

Re: [OMPI users] XHPL question

2015-09-16 Thread Mark Moorcroft
Hmm, I'm pretty sure my xhpl binary is/was dynamically linked. Here is my env: LD_LIBRARY_PATH=/share/apps/openmpi-1.10-intel-x86_64/lib:/share/apps/Intel/composer_xe_2015.2.164/compiler/lib/intel64:/share/apps/Intel/composer_xe_2015.2.164/mkl/lib/intel64:/opt/python/lib The binary fails without a sy

Re: [OMPI users] XHPL question

2015-09-16 Thread Ralph Castain
Looks like you are trying to link it against the 1.10 series? You could probably get away with the symlink, but unless there is some reason to avoid it, I'd just recompile to be safe. On Wed, Sep 16, 2015 at 7:36 PM, Mark Moorcroft wrote: > > > Hmm, I'm pretty sure my xhpl binary is/was dynamic