Hmmm... that certainly sounds like a bug. It should pick up that the node is local. It checks the hostname (as returned by gethostname), but it also checks whether the host resolves to a local address. I'm assuming that the offending host has some other address besides just 127.0.1.1, as otherwise it couldn't connect to anything.
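To make the two checks concrete, here is a rough shell sketch of the logic as described above. It is an illustration only, not the actual Open MPI code: the function name `node_is_local` and its four arguments are made up for this example, and the caller supplies the inputs so the logic can be tried without touching the resolver.

```sh
#!/bin/sh
# Illustrative sketch only; NOT Open MPI's implementation.
#
# node_is_local HOST MYNAME RESOLVED_ADDR LOCAL_ADDRS   (hypothetical helper)
#   HOST          name taken from the hostfile
#   MYNAME        name this machine reports for itself (gethostname)
#   RESOLVED_ADDR address that HOST resolves to
#   LOCAL_ADDRS   space-separated addresses configured on this machine
# Returns 0 if HOST looks local, 1 otherwise.
node_is_local() {
    host=$1
    myname=$2
    resolved=$3
    local_addrs=$4

    # Check 1: the hostfile entry matches our own hostname.
    if [ "$host" = "$myname" ]; then
        return 0
    fi

    # Check 2: the entry resolves to an address on a local interface.
    # $local_addrs is deliberately unquoted so it splits on whitespace.
    for addr in $local_addrs; do
        if [ "$addr" = "$resolved" ]; then
            return 0
        fi
    done

    return 1
}
```

On a Linux box the inputs could be gathered with `hostname` for MYNAME, `getent hosts "$host"` (first field) for RESOLVED_ADDR, and `hostname -I` for LOCAL_ADDRS; those commands are an assumption about the platform. Note that `hostname -I` leaves out loopback addresses, so in this sketch a name that resolves only to 127.0.1.1 would fail the second check.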
I'm heading out the door for a couple of weeks, but can try to look at it when I return.

On Jun 19, 2013, at 10:43 AM, Riccardo Murri <riccardo.mu...@uzh.ch> wrote:

> On 19 June 2013 16:01, Ralph Castain <r...@open-mpi.org> wrote:
>> How is OMPI picking up this hostfile? It isn't being specified on the cmd
>> line - are you running under some resource manager?
>
> Via the environment variable `OMPI_MCA_orte_default_hostfile`.
>
> We're running under SGE, but disable the OMPI/SGE integration (rather
> old version of SGE, does not coordinate well with OpenMPI); here's the
> relevant snippet from our startup script:
>
>     # the OMPI/SGE integration does not seem to work with
>     # our SGE version; so use the `mpi` PE and direct OMPI
>     # to look for a "plain old" machine file
>     unset PE_HOSTFILE
>     if [ -r "${TMPDIR}/machines" ]; then
>         OMPI_MCA_orte_default_hostfile="${TMPDIR}/machines"
>         export OMPI_MCA_orte_default_hostfile
>     fi
>     GMSCOMMAND="$openmpi_root/bin/mpiexec -n $NCPUS --nooversubscribe $gamess $INPUT -scr $(pwd)"
>
> The `$TMPDIR/machines` hostfile is created from SGE's $PE_HOSTFILE by
> extracting the host names, and repeating each one for the given number
> of slots (unmodified code that comes with SGE):
>
>     PeHostfile2MachineFile()
>     {
>        cat $1 | while read line; do
>           # echo $line
>           host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
>           nslots=`echo $line|cut -f2 -d" "`
>           i=1
>           while [ $i -le $nslots ]; do
>              echo $host
>              i=`expr $i + 1`
>           done
>        done
>     }
>
> Thanks,
> Riccardo