as usual -- forgot the attachment

On Tue, 2006-11-14 at 14:50 -0700, Susan Coulter wrote:
> We are investigating a problem that occurs when running a particular
> code on more than 120 nodes. That number, 120, was arrived at purely
> from empirical testing. We have tried various versions of openmpi
> including 1.0.2, 1.1, and 1.1.2. They all fail the same way. The
> archives indicate this was possibly a problem with 1.0.2 that was
> resolved in later versions - but we get the same error with later
> versions.
>
> This is an LNXI 64bit bproc cluster w/ IB interconnect.
>
> Attached is tgz file containing a snippet of stderr output, the output
> from /opt/OpenMPI/openmpi-1.1/ib/bin/ompi_info, and
> /usr/share/doc/openmpi-ib-1.1/config.log.
>
> Please let me know what other info you may want. Any feedback will be
> appreciated.
>
> --
=============================================
Susan Coulter
Scientific Computing Resources
HPC-3 High Performance Computing
Los Alamos National Laboratory
505-667-8425 - voice
505-665-7793 - fax
=============================================
Increase the Peace ...
An eye for an eye makes the whole world blind
mpierr.tgz
Description: application/compressed-tar