On Wed, 16 Jan 2013 07:46:41 -0800 Ralph Castain <r...@open-mpi.org> wrote:
> This one means that a backend node lost its connection to mpirun. We use a
> TCP socket between the daemon on a node and mpirun to launch the processes
> and to detect if/when that node fails for some reason.

Hm. And what would be the reasons for this? Too much load on the node where
mpirun is run?

-- 
Jure Pečar
http://jure.pecar.org
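
For illustration only, here is a rough sketch of how a process can notice that the peer on a TCP-style socket has gone away, which is the general mechanism Ralph describes. This is not Open MPI's code: the function name connection_lost and the use of socket.socketpair() to stand in for the daemon-to-mpirun link are assumptions made for the example.

```python
import socket


def connection_lost(sock: socket.socket, timeout: float = 1.0) -> bool:
    """Return True if the peer has closed or reset the connection.

    Hypothetical helper for illustration; not part of Open MPI.
    """
    sock.settimeout(timeout)
    try:
        # MSG_PEEK looks at pending data without consuming it.
        data = sock.recv(1, socket.MSG_PEEK)
    except socket.timeout:
        return False  # nothing to read, but the link is still open
    except OSError:
        return True  # reset or another socket-level failure
    return data == b""  # an empty read means the peer closed its end


# Stand-in for the daemon <-> mpirun link (assumption: a local socket pair).
daemon_end, mpirun_end = socket.socketpair()

alive_before = not connection_lost(daemon_end, timeout=0.1)

# Simulate mpirun's end of the link going away.
mpirun_end.close()
lost_after = connection_lost(daemon_end, timeout=0.1)

daemon_end.close()
```

Note that a check like this only reports that the socket was closed or reset. It does not say why, so the cause (network trouble, a killed process, a node going down) still has to be found by other means.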