I may have accidentally deleted any responses to this message. In any case, we appear to have fixed the problem by installing a more recent version of Open MPI.

On Thu, Feb 14, 2013 at 2:27 PM, Erik Nelson <nelsoner...@gmail.com> wrote:
>
> I'm encountering an error using qsub that none of us can figure out. MPI
> C++ programs seem to run fine when executed from the command line, but for
> some reason when I submit them through the queue I get a strange error
> message:
>
> [compute-3-12.local][[58672,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 2002:8170:6c2f:b:21d:9ff:fefd:7d94 failed: Permission denied (13)
>
> The compute node 3-12 doesn't matter (the error can originate from any of
> the nodes, and I'm guessing that 3-12 is the parent node here).
>
> To check whether there was some problem with my own code, I created a simple
> 'hello world' program (see attached files).
>
> Again, the program runs fine from the command line but fails in qsub with
> the same sort of error message.
>
> I have included (i) the code, (ii) the job script for qsub, and (iii) the
> ".o" file from qsub for the "hello world" program.
>
> These don't look like MPI errors, but rather some conflict with, maybe,
> secure communication across nodes.
>
> Is there something simple I can do to fix this?
>
> Thanks, Erik
>
> --
> Erik Nelson
>
> Howard Hughes Medical Institute
> 6001 Forest Park Blvd., Room ND10.124
> Dallas, Texas 75235-9050
>
> p : 214 645 5981
> f : 214 645 5948

--
Erik Nelson

Howard Hughes Medical Institute
6001 Forest Park Blvd., Room ND10.124
Dallas, Texas 75235-9050

p : 214 645 5981
f : 214 645 5948