Re: [OMPI users] tcp_peer_send_blocking: send() to socket 9 failed: Broken pipe (32)

2018-02-09 Thread George Bosilca
What are the settings of the firewall on your 2 nodes? George. On Fri, Feb 9, 2018 at 3:08 PM, William Mitchell wrote: > When I try to run an MPI program on a network with a shared file system > and connected by ethernet, I get the error message "tcp_peer_send_blocking: > send() to socket

[OMPI users] tcp_peer_send_blocking: send() to socket 9 failed: Broken pipe (32)

2018-02-09 Thread William Mitchell
When I try to run an MPI program on a network with a shared file system and connected by ethernet, I get the error message "tcp_peer_send_blocking: send() to socket 9 failed: Broken pipe (32)" followed by some suggestions of what could cause it, none of which are my problem. I have searched the FA

Re: [OMPI users] Output redirection: missing output from all but one node

2018-02-09 Thread Christoph Niethammer
Hi Joseph, Thanks for reporting! Regarding your second point about the missing output files, there seems to be a problem with the current working directory detection on the remote nodes: while on the first node - on which mpirun is executed - the output folder is created in the current working

[OMPI users] Output redirection: missing output from all but one node

2018-02-09 Thread Joseph Schuchart
All, I am trying to debug my MPI application using good ol' printf and I am running into an issue with Open MPI's output redirection (using --output-filename). The system I'm running on is an IB cluster with the home directory mounted through NFS. 1) Sometimes I get the following error mes

Re: [OMPI users] OpenMPI with Portals4 transport

2018-02-09 Thread Todd Kordenbrock
Hi Brian, I'm tracking two different problems here. The first is that mtl-portals4 is segfaulting in PtlPut(). The second is a btl-portals4 problem you described here: > Not specifying CM gets an earlier segfault (defaults to ob1) and looks to be a progress thread initialization problem. I h