You can eliminate the "[n17:30019] odls_bproc: openpty failed, using
pipes instead" message by configuring OMPI with the
--disable-pty-support flag, as there is a bug in BProc that causes
that to happen.
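For example, a reconfigure/rebuild along these lines (the install
prefix is just a placeholder; add whatever other configure options
you normally use):

  # rebuild Open MPI without pty support
  ./configure --prefix=/opt/openmpi-1.2.1 --disable-pty-support
  make all install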
-david
--
David Gunter
HPC-4: HPC Environments: Parallel Tools Team
Los Alamos National Laboratory
On Apr 26, 2007, at 2:06 PM, Daniel Gruner wrote:
Hi
I have been testing Open MPI 1.2, and now 1.2.1, on several BProc-
based clusters, and I have found some problems/issues. All my
clusters have standard Ethernet interconnects, either 100Base-T or
Gigabit, on standard switches.
The clusters are all running Clustermatic 5 (BProc 4.x), and range
from 32-bit Athlon to 32-bit Xeon to 64-bit Opteron. In all cases
the same problems occur, identically. I attach the results of
"ompi_info --all" and the config.log for my latest build, on an
Opteron cluster using the Pathscale compilers. I had exactly the
same problems when using the vanilla GNU compilers.
Now for a description of the problem:
When running an MPI code (cpi.c, from the standard MPI examples,
also attached; a sketch of it appears below), using the mpirun
defaults (e.g. -byslot), with a single process:
sonoma:dgruner{134}> mpirun -n 1 ./cpip
[n17:30019] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 0.000199
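For reference, the attached cpi.c is the standard pi-by-integration
example; a minimal sketch along those lines (not necessarily
identical to the attached file) is:

/* Sketch of the standard cpi example: each rank integrates part of
 * 4/(1+x^2) over [0,1] and rank 0 reduces and prints the result. */
#include <math.h>
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int n = 10000, myid, numprocs, i, namelen;
    double PI25DT = 3.141592653589793238462643;
    double mypi, pi, h, sum, x, startwtime = 0.0, endwtime;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Get_processor_name(processor_name, &namelen);

    printf("Process %d on %s\n", myid, processor_name);

    if (myid == 0)
        startwtime = MPI_Wtime();
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* midpoint rule; each rank takes every numprocs-th interval */
    h = 1.0 / (double) n;
    sum = 0.0;
    for (i = myid + 1; i <= n; i += numprocs) {
        x = h * ((double) i - 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;

    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (myid == 0) {
        endwtime = MPI_Wtime();
        printf("pi is approximately %.16f, Error is %.16f\n",
               pi, fabs(pi - PI25DT));
        printf("wall clock time = %f\n", endwtime - startwtime);
    }

    MPI_Finalize();
    return 0;
}

The "cpip" binary used in the runs here is just this example built
with the Open MPI wrapper compiler, e.g. "mpicc -o cpip cpi.c".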
However, if one tries to run more than one process, this bombs:
sonoma:dgruner{134}> mpirun -n 2 ./cpip
.
.
.
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
.
. ad infinitum
If one uses the option "-bynode", things work:
sonoma:dgruner{145}> mpirun -bynode -n 2 ./cpip
[n17:30055] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
Process 1 on n21
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.010375
Note that there is always the message about "openpty failed, using
pipes instead".
If I run more processes (on my 3-node cluster, with 2 CPUs per
node), the openpty message appears repeatedly for the first node:
sonoma:dgruner{146}> mpirun -bynode -n 6 ./cpip
[n17:30061] odls_bproc: openpty failed, using pipes instead
[n17:30061] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
Process 2 on n49
Process 1 on n21
Process 5 on n49
Process 3 on n17
Process 4 on n21
pi is approximately 3.1415926544231239, Error is 0.0000000008333307
wall clock time = 0.050332
Should I worry about the openpty failure? I suspect that
communications may be slower this way. Using the -byslot option
always fails, so this is a bug. The same occurs for all the codes
that I have tried, both simple and complex.
Thanks for your attention to this.
Regards,
Daniel
--
Dr. Daniel Gruner dgru...@chem.utoronto.ca
Dept. of Chemistry daniel.gru...@utoronto.ca
University of Toronto phone: (416)-978-8689
80 St. George Street fax: (416)-978-5325
Toronto, ON M5S 3H6, Canada finger for PGP public key
<cpi.c.gz>
<config.log.gz>
<ompiinfo.gz>