I’m afraid that would be a rather significant job, as that code
plays a central role in the ssh startup procedure. We have plans
to revamp that portion of the code, but without someone who knows
exactly what is going on and where, you are more likely to break
it than fix it.
If you can live with it as-is for now, I would strongly suggest
doing so until we get back to that area.
Just my $0.02.
Ralph
On 12/1/06 4:51 PM, "Dave Grote" <dpgr...@lbl.gov> wrote:
Is there a place where I can hack the openmpi code to force it to
keep the ssh sessions open without the -d option? I looked through
some of the code, including orterun.c and a few other places, but
don't have the familiarity with the code to find the place.
Thanks!
Dave
Galen Shipman wrote:
-d leaves the ssh session open
Try using:
mpirun -d -host boxtop2 -mca pls_rsh_agent "ssh -X -n" xterm -e cat
Note the "ssh -X -n", this will tell ssh not to open stdin..
You should then be able to type characters in the resulting xterm
and have them echo'd back correctly.
- Galen
On Dec 1, 2006, at 11:48 AM, Dave Grote wrote:
Thanks for the suggestion, but it doesn't fix my problem. I did
the same thing you did and was able to get xterms open when using
the -d option. But when I run my code, the -d option seems to play
havoc with stdin. My code normally reads stdin on one processor
and broadcasts it to the others. This failed when using the -d
option, and the code wouldn't take input commands properly.
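For reference, the stdin handling I'm describing is roughly the
pattern below. This is only a minimal sketch (the buffer size,
fgets, and the printf are illustrative assumptions, not my actual
code):

#include <mpi.h>
#include <stdio.h>

/* Minimal sketch: rank 0 reads a command from stdin and broadcasts
 * it to the other ranks. mpirun normally directs stdin to rank 0,
 * which is why anything that interferes with stdin breaks this. */
int main(int argc, char **argv)
{
    char cmd[1024];
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Only rank 0 touches stdin. */
        if (fgets(cmd, sizeof(cmd), stdin) == NULL)
            cmd[0] = '\0';
    }

    /* All ranks receive the command from rank 0. */
    MPI_Bcast(cmd, sizeof(cmd), MPI_CHAR, 0, MPI_COMM_WORLD);
    printf("rank %d got: %s", rank, cmd);

    MPI_Finalize();
    return 0;
}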
But, since -d did get the X windows working, it must be doing
something differently. What is it about the -d option that allows
the windows to open? If I knew that, it would be the fix to my
problem.
Dave
Galen Shipman wrote:
I think this might be as simple as adding "-d" to the mpirun
command line....
If I run:
mpirun -np 2 -d -mca pls_rsh_agent "ssh -X" xterm -e gdb ./mpi-ping
All is well; I get the xterms up.
If I run:
mpirun -np 2 -mca pls_rsh_agent "ssh -X" xterm -e gdb ./mpi-ping
I get the following:
/usr/bin/xauth: error in locking authority file /home/gshipman/.Xauthority
xterm Xt error: Can't open display: localhost:10.0
Have you tried adding "-d"?
Thanks,
Galen
On Nov 30, 2006, at 2:42 PM, Dave Grote wrote:
I don't think that is the problem. As far as I can tell, the
DISPLAY environment variable is being set properly on the slave
(it will sometimes have a different value than in the shell where
mpirun was executed).
Dave
Ralph H Castain wrote:
Actually, I believe at least some of this may be a bug on our
part. We currently pick up the local environment and forward it on
to the remote nodes as the environment for use by the backend
processes. I have seen quite a few environment variables in that
list, including DISPLAY, which would create the problem you are
seeing.
I’ll have to chat with folks here to understand what part of the
environment we absolutely need to carry forward, and what parts we
need to “cleanse” before passing it along.
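In the meantime, a quick way to confirm what the backend processes
actually inherit is to have each rank print its DISPLAY. This is
only an illustrative check, not anything in Open MPI itself:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative check: print the DISPLAY each rank inherits, to see
 * whether the frontend value was forwarded to the remote nodes. */
int main(int argc, char **argv)
{
    int rank, len;
    char host[MPI_MAX_PROCESSOR_NAME];
    const char *disp;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    disp = getenv("DISPLAY");
    printf("rank %d on %s: DISPLAY=%s\n", rank, host,
           disp ? disp : "(unset)");

    MPI_Finalize();
    return 0;
}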
Ralph
On 11/30/06 10:50 AM, "Dave Grote" <dpgr...@lbl.gov>
<mailto:dpgr...@lbl.gov> wrote:
I'm using caos linux (developed at LBL), which has the wrapper
wwmpirun around mpirun, so my command is something like
wwmpirun -np 8 -- -x PYTHONPATH --mca pls_rsh_agent '"ssh -X"' /usr/local/bin/pyMPI
This is essentially the same as
mpirun -np 8 -x PYTHONPATH --mca pls_rsh_agent '"ssh -X"' /usr/local/bin/pyMPI
but wwmpirun does the scheduling, for example looking for idle
nodes and creating the host file.
My system is set up with a master/login node running a full
version of linux and slave nodes running a reduced linux (which
includes access to the X libraries). wwmpirun always picks the
slave nodes to run on. I've also tried "ssh -Y" and it doesn't
help. I've set xhost for the slave nodes in my login shell on the
master and that didn't work. X forwarding is enabled on all of the
nodes, so that's not the problem.
I am able to get it to work by having wwmpirun run the command
"ssh -X nodennnn xclock" before starting the parallel program on
that same node, but this only works for the first person who logs
into the master and gets DISPLAY=localhost:10. When someone else
tries to run a parallel job, it seems that DISPLAY is still set to
localhost:10 on the slaves, so the job tries to forward through
the other person's login with the same display number, and the
connection is refused because of wrong authentication. This seems
like very odd behavior. I'm aware that this may be an issue with
the X server (xorg) or with the version of linux, so I am also
seeking help from the person who maintains caos linux. If it
matters, the machine uses myrinet for the interconnects.
Thanks!
Dave
Galen Shipman wrote:
what does your command line look like?
- Galen
On Nov 29, 2006, at 7:53 PM, Dave Grote wrote:
I cannot get X11 forwarding to work using mpirun. I've tried all
of the standard methods, such as setting pls_rsh_agent = ssh -X,
using xhost, and a few other things, but nothing works in general.
In the FAQ, http://www.open-mpi.org/faq/?category=running#mpirun-gui,
a reference is made to other methods, but "they involve
sophisticated X forwarding through mpirun", and no further
explanation is given. Can someone tell me what these other methods
are or point me to where I can find info on them? I've done lots
of google searching and haven't found anything useful. This is a
major issue since my parallel code heavily depends on having the
ability to open X windows on the remote machine. Any and all help
would be appreciated!
Thanks!
Dave
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users