So what I currently do to get my xterms running: on my workstation I call

xhost + <hostname>

for every machine in my hostfile, to allow them to use X on my workstation. Then I set my DISPLAY variable to point to my workstation:

export DISPLAY=<mymachine>:0.0
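With many hosts a small loop saves typing; a minimal sketch (assuming one hostname per line in the hostfile, with extra fields such as slot counts and '#' comments skipped):

#!/bin/sh
# grant every host listed in the hostfile access to the local X server;
# the first word of each line is taken as the hostname
while read host _; do
  case "$host" in ""|\#*) continue ;; esac
  xhost +"$host"
done < myfiles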
Finally, I call mpirun with the -x option (to export the DISPLAY variable to all nodes):

mpirun -np 4 -hostfile myfiles -x DISPLAY run_xterm.sh MyApplication arg1 arg2
Here run_xterm.sh is a shell script which creates a useful title for the xterm window and calls the application with all its arguments (-hold leaves the xterm open after the program terminates):
#!/bin/sh -f
# feedback for the command line
echo "Running on node `hostname`"
# version 1.3 sets the documented OMPI_COMM_WORLD_RANK; version 1.2
# only has the undocumented OMPI_MCA_ns_nds_vpid, so fall back to it
export ID=$OMPI_COMM_WORLD_RANK
if [ "X$ID" = "X" ]; then
  export ID=$OMPI_MCA_ns_nds_vpid
fi
export TITLE="node #$ID"
# start the terminal; "$@" (unlike $*) keeps arguments with spaces intact
xterm -T "$TITLE" -hold -e "$@"
exit 0
(I have similar scripts to run gdb or valgrind in xterm windows.)
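For example, the gdb variant only needs a different last line - a sketch, relying on gdb's --args option to treat everything after it as the program and its arguments:

# run the application under gdb inside the xterm instead
xterm -T "$TITLE" -hold -e gdb --args "$@"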
I know that the 'xhost +' is a horror for certain sysadmins, but I feel quite safe, because the machines listed in my hostfile are not accessible from outside our department. I haven't found any other alternative to get nice xterms when I can't use 'ssh -X'.
To come back to the '--xterm' option: I just ran my xterm script after doing the above xhost+ and DISPLAY things, and it worked - all local and remote processes created their xterm windows. (In other words, the environment was set up to have my remote nodes use xterms on my workstation.)
Immediately thereafter I called the same application with

mpirun -np 8 -hostfile testhosts --xterm 2,3,4,5! -x DISPLAY ./MPITest

but still, only the local process (#2) created an xterm.
Do you think it would be possible to have Open MPI make its ssh connections with '-X', or are there technical or security-related objections?
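Untested sketch: I suppose one could already point the rsh launcher at 'ssh -X' through its agent parameter (plm_rsh_agent in the 1.3 series; the 1.2 series called it pls_rsh_agent), though I don't know whether the forwarded DISPLAY would actually reach the application processes:

mpirun --mca plm_rsh_agent "ssh -X" -np 8 -hostfile testhosts ./MPITest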
Regards
Jody
On Mon, Feb 2, 2009 at 4:47 PM, Ralph Castain <r...@lanl.gov> wrote:
On Feb 2, 2009, at 2:55 AM, jody wrote:
Hi Ralph

The new options are great stuff! Following your suggestion, I downloaded and installed
http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r20392.tar.gz
and tested the new options. (I have a simple cluster of 8 machines over TCP.) Not everything worked as specified, though:
* timestamp-output : works. Good!
* xterm : doesn't work completely. With a comma-separated rank list, only the local processes get an xterm; the processes on remote machines just write to the stdout of the calling window. (Just to be sure, I started my own script for opening separate xterms - that did work for the remote ones, too.)
This is a problem we wrestled with for some time. The issue is that we really aren't comfortable modifying the DISPLAY envar on the remote nodes like you do in your script. It is fine for a user to do whatever they want, but for OMPI to do it...that's another matter. We can't even know for sure what to do because of the wide range of scenarios that might occur (e.g., is mpirun local to you, or on a remote node connected to you via xterm, or...?).

What you (the user) need to do is ensure that X11 is set up properly so that an X window opened on the remote host is displayed on your screen. In this case, I believe you have to enable X forwarding - I'm not an xterm expert, so I can't advise you on how to do this. I suspect you may already know - in which case, can you please pass it along and I'll add it to our docs? :-)
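(One common way, assuming the nodes are reached via OpenSSH, is to request X11 forwarding per host in ~/.ssh/config - a sketch with placeholder node names:

# ~/.ssh/config: ask for X11 forwarding on these hosts
# (node01, node02 stand in for the real machine names)
Host node01 node02
    ForwardX11 yes

or simply to use 'ssh -X <host>' for a single session.)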
If a '-1' is given instead of a list of ranks, it fails (locally and with remotes):

[jody@localhost neander]$ mpirun -np 4 --xterm -1 ./MPITest
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
orte-odls-base:xterm-rank-out-of-bounds
from the file:
help-odls-base.txt
But I couldn't find any file matching that name. Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it
encountered an error
on node localhost. More information may be available above.
--------------------------------------------------------------------------
Fixed as of r20398 - this was a bug, had an if statement out of sequence.
* output-filename : doesn't work here:

[jody@localhost neander]$ mpirun -np 4 --output-filename gnagna ./MPITest
[jody@localhost neander]$ ls -l gna*
-rw-r--r-- 1 jody morpho 549 2009-02-02 09:07 gnagna.%10lu

There is output from the processes on remote machines on stdout, but none from the local ones.
Fixed as of r20400 - had a format statement syntax that was okay in some compilers, but not others.
A question about installing: I installed the usual way (configure, make all install), but the new man files apparently weren't copied to their destination: if I do 'man mpirun', I am shown the contents of an old man file (without the new options). I had to do

less /opt//openmpi-1.4a1r20394/share/man/man1/mpirun.1

to see them.
Strange - the install should put them in the right place, but I wonder if you updated your MANPATH to point at it?
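(Something like the following in your shell startup should do it - a sketch, using the install prefix from your 'less' command above:)

# prepend the new install's man directory to the search path
export MANPATH=/opt/openmpi-1.4a1r20394/share/man:$MANPATH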
About the xterm option: when the application ends, all xterms are closed immediately. (When doing things 'by hand' I used the -hold option for xterm.) Would it be possible to add this feature to your xterm option? Perhaps by adding a '!' at the end of the rank list?
Done! A "!" at the end of the list will activate -hold as of r20398.
About orte-iof: with the new version it works, but no matter which rank I specify, it only prints out rank 0's output:

[jody@localhost ~]$ orte-iof --pid 31049 --rank 4 --stdout
[localhost]I am #0/9 before the barrier
The problem here is that the option name changed from "rank" to "ranks", since you can now specify any number of ranks as comma-separated ranges. I have updated orte-iof so it will gracefully fail if you provide an unrecognized cmd line option and output the "help" detailing the accepted options.
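So the call above would become, for example (same pid, just the renamed option):

orte-iof --pid 31049 --ranks 4 --stdout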
Thanks
Jody
On Sun, Feb 1, 2009 at 10:49 PM, Ralph Castain <r...@lanl.gov> wrote:

I'm afraid we discovered a bug in optimized builds with r20392. Please use any tarball with r20394 or above.

Sorry for the confusion
Ralph
On Feb 1, 2009, at 5:27 AM, Jeff Squyres wrote:
On Jan 31, 2009, at 11:39 AM, Ralph Castain wrote:
For anyone following this thread:

I have completed the IOF options discussed below. Specifically, I have added the following:

* a new "timestamp-output" option that timestamps each line of output
* a new "output-filename" option that redirects each proc's output to a separate rank-named file
* a new "xterm" option that redirects the output of the specified ranks to a separate xterm window

You can obtain a copy of the updated code at:
http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r20392.tar.gz
Sweet stuff. :-)
Note that the URL/tarball that Ralph cites is a nightly snapshot and will expire after a while -- we only keep the 5 most recent nightly tarballs available. You can find Ralph's new IOF stuff in any 1.4a1 nightly tarball after the one he cited above. Note that the last part of the tarball name refers to the subversion commit number (which increases monotonically); any 1.4 nightly snapshot tarball beyond "r20392" will contain this new IOF stuff. Here's where to get our nightly snapshot tarballs:

http://www.open-mpi.org/nightly/trunk/
Don't read anything into the "1.4" version number -- we've just bumped the version number internally to be different than the current stable series (1.3). We haven't yet branched for the v1.4 series; hence, "1.4a1" currently refers to our development trunk.
--
Jeff Squyres
Cisco Systems
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users