Hi Gilles, all,

Using `OMPI_MCA_ess_singleton_isolated=true ./program` achieves the desired result of establishing no TCP connections for a singleton execution.
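For the archives, here is roughly how I verified it. This is only a sketch of my check: `./program` stands in for a trivial MPI_Init/MPI_Finalize test binary that sleeps a few seconds before exiting, and the socket check assumes Linux with iproute2's `ss`:

    # Launch the singleton in isolated mode, in the background.
    # (Assumes ./program is a trivial MPI test binary that sleeps
    # long enough for the check below to run.)
    OMPI_MCA_ess_singleton_isolated=true ./program &
    pid=$!

    # While it runs, list any TCP sockets owned by that process.
    # Empty output means no TCP connections were established.
    ss -tnp | grep "pid=$pid,"

    wait "$pid"

For anyone hitting the same problem: "ss -tan state time-wait | wc -l" is also a quick way to watch for the TIME_WAIT pileup described below.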
Thank you for the suggestion!

Best regards,
-Dan

On Wed, Apr 17, 2019 at 5:35 PM Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> Daniel,
>
> If your MPI singleton will never MPI_Comm_spawn(), then you can use the
> isolated mode like this:
>
>     OMPI_MCA_ess_singleton_isolated=true ./program
>
> You can also save some ports by blacklisting the btl/tcp component:
>
>     OMPI_MCA_ess_singleton_isolated=true OMPI_MCA_pml=ob1 \
>         OMPI_MCA_btl=vader,self ./program
>
> Cheers,
>
> Gilles
>
> On 4/18/2019 3:51 AM, Daniel Hemberger wrote:
> > Hi everyone,
> >
> > I've been trying to track down the source of TCP connections when
> > running MPI singletons, with the goal of avoiding all TCP communication
> > to free up ports for other processes. I have a local apt install of
> > OpenMPI 2.1.1 on Ubuntu 18.04 which does not establish any TCP
> > connections by default, whether run as "mpirun -np 1 ./program" or as
> > "./program". But it has non-TCP alternatives for both the BTL (vader,
> > self, etc.) and OOB (ud and usock) frameworks, so I was not surprised
> > by this result.
> >
> > On a remote machine, I'm running the same test with an assortment of
> > OpenMPI versions (1.6.4, 1.8.6, 4.0.0, and 4.0.1 on RHEL6, and 1.10.7
> > on RHEL7). In all but 1.8.6 and 1.10.7, a TCP connection is always
> > established, even if I disable the TCP BTL on the command line (e.g.
> > "mpirun --mca btl ^tcp"). I therefore assumed this was because `tcp`
> > was the only OOB interface available in these installations. This TCP
> > connection is established both for "mpirun -np 1 ./program" and for
> > "./program".
> >
> > The confusing part is that the 1.8.6 and 1.10.7 installations only
> > appear to establish a TCP connection when invoked with "mpirun -np 1
> > ./program", but _not_ with "./program", even though their only OOB
> > interface was also `tcp`. This result was not consistent with my
> > understanding, so now I am confused about when I should expect TCP
> > communication to occur.
> >
> > Is there a known explanation for what I am seeing? Is there actually a
> > way to get singletons to forgo all TCP communication, even if TCP is
> > the only OOB available, or is there something else at play here? I'd be
> > happy to provide any config.log files or ompi_info output if it would
> > help.
> >
> > For more context, the underlying issue I'm trying to resolve is that we
> > are (unfortunately) running many short instances of mpirun, and the TCP
> > connections are piling up in the TIME_WAIT state because they aren't
> > cleaned up faster than we create them.
> >
> > Any advice or pointers would be greatly appreciated!
> >
> > Thanks,
> > -Dan