Re: [OMPI users] possible GATS bug in osc/sm

2015-09-21 Thread Nathan Hjelm
I maintain the osc/sm component but did not write the pscw synchronization. I agree that a counter is not sufficient. I have a fix in mind and will probably create a PR for it later this week. The fix will need to be applied to 1.10, 2.x, and master. -Nathan On Fri, Sep 18, 2015 at 10:33:18AM +0
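For readers not familiar with the terminology: PSCW refers to MPI's post/start/complete/wait synchronization for one-sided communication. A minimal sketch of the pattern (illustrative only, not code from this thread; it assumes at least two ranks) looks like this:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, buf = 0;
        MPI_Win win;
        MPI_Group world_group, peer_group;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_group(MPI_COMM_WORLD, &world_group);

        MPI_Win_create(&buf, sizeof(int), sizeof(int), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win);

        if (rank == 0) {
            /* Exposure epoch: rank 0 lets rank 1 access its window. */
            int peer = 1;
            MPI_Group_incl(world_group, 1, &peer, &peer_group);
            MPI_Win_post(peer_group, 0, win);
            MPI_Win_wait(win);   /* blocks until rank 1 calls MPI_Win_complete */
            printf("rank 0 received %d\n", buf);
            MPI_Group_free(&peer_group);
        } else if (rank == 1) {
            /* Access epoch: rank 1 puts a value into rank 0's window. */
            int peer = 0, value = 42;
            MPI_Group_incl(world_group, 1, &peer, &peer_group);
            MPI_Win_start(peer_group, 0, win);
            MPI_Put(&value, 1, MPI_INT, 0, 0, 1, MPI_INT, win);
            MPI_Win_complete(win);
            MPI_Group_free(&peer_group);
        }

        MPI_Group_free(&world_group);
        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

Per the MPI standard, the accesses in rank 1's epoch must not reach the window before rank 0 has posted it, and MPI_Win_wait must not return before the matching MPI_Win_complete; these are the guarantees the synchronization code in osc/sm has to provide.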

Re: [OMPI users] OpenMPI-1.10.0 bind-to core error

2015-09-21 Thread Gilles Gouaillardet
Patrick, thanks for the report. Can you confirm what happened was: you defined OMPI_MCA_plm_rsh_agent=oarshmost, oarshmost was not in the $PATH, and mpirun silently ignored the remote nodes? If that is correct, then I think mpirun should have reported an error (oarshmost not found, or cannot remot

Re: [OMPI users] OpenMPI-1.10.0 bind-to core error

2015-09-21 Thread Patrick Begou
Hi Gilles, I've made a big mistake! Compiling the patched version of OpenMPI and creating a new module, I forgot to add the path to the oarshmost command while OMPI_MCA_plm_rsh_agent=oarshmost was set. OpenMPI was silently ignoring the oarshmost command as it was unable to find it, and so only

Re: [OMPI users] send() to socket 9 failed with the 1.10.0 version but not with 1.8.7 one.

2015-09-21 Thread Jorge D'Elia
- Original Message - > From: "Ralph Castain" > To: "Open MPI Users" > Sent: Monday, September 21, 2015 1:42:08 > Subject: Re: [OMPI users] send() to socket 9 failed with the 1.10.0 version > but not with 1.8.7 one. > > Okay, let’s try doing this: > > mpirun -mca oob_tcp_if_include

Re: [OMPI users] Problem with using MPI in a Python extension

2015-09-21 Thread Joel Hermanns
The project I’m working on is a larger toolchain (written in Python) to run regression tests. The part that does the data comparison is fairly small. Speed is not crucial, but doing the data comparison in Python was incredibly slow, so we went with a C++ extension. For everything else Python work
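For what it's worth, the shape of such an extension is roughly the following (a minimal C sketch, not the poster's actual C++ code; the module name fastcompare and the function compare_rank are made up for illustration):

    #include <Python.h>
    #include <mpi.h>

    /* Stand-in for the real comparison routine: just report the MPI rank. */
    static PyObject *compare_rank(PyObject *self, PyObject *args)
    {
        int initialized, rank = 0;
        MPI_Initialized(&initialized);   /* MPI may already be running, e.g. via mpi4py */
        if (!initialized)
            MPI_Init(NULL, NULL);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        return PyLong_FromLong(rank);
    }

    static PyMethodDef methods[] = {
        {"compare_rank", compare_rank, METH_NOARGS, "Return the calling process's MPI rank."},
        {NULL, NULL, 0, NULL}
    };

    static struct PyModuleDef moduledef = {
        PyModuleDef_HEAD_INIT, "fastcompare", NULL, -1, methods
    };

    PyMODINIT_FUNC PyInit_fastcompare(void)
    {
        return PyModule_Create(&moduledef);
    }

Compiled with mpicc against the Python headers, the module can then be imported from the Python toolchain like any other extension.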

Re: [OMPI users] send() to socket 9 failed with the 1.10.0 version but not with 1.8.7 one.

2015-09-21 Thread Ralph Castain
Okay, let’s try doing this: mpirun -mca oob_tcp_if_include br0 … This will restrict us to the br0 interface that is common to the two nodes. I note that your “node1” has two interfaces on the same subnet (192.168.1), which is usually a “no-no” that can cause trouble. Let’s see if removing that