Actually, I believe at least some of this may be a bug on our part. We currently pick up the local environment and forward it on to the remote nodes as the environment for use by the backend processes. I have seen quite a few environment variables in that list, including DISPLAY, which would create the problem you are seeing.
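As a rough illustration only (this is not Open MPI's launcher code), filtering an environment before it is forwarded could look like the Python sketch below. The function name cleanse_environment and the blocklist are hypothetical. Only DISPLAY comes from this thread; the other entries are assumptions about variables that are meaningful only on the launching host.

```python
import os

# Hypothetical blocklist. DISPLAY is the variable named in this thread; the
# other two are assumptions about values tied to the local login session.
SESSION_LOCAL_VARS = frozenset({"DISPLAY", "XAUTHORITY", "SSH_AUTH_SOCK"})


def cleanse_environment(env, blocked=SESSION_LOCAL_VARS):
    """Return a copy of env without the variables named in blocked."""
    return {name: value for name, value in env.items() if name not in blocked}


if __name__ == "__main__":
    local_env = dict(os.environ)
    forwarded = cleanse_environment(local_env)
    # Names that would be withheld from the remote nodes.
    dropped = sorted(set(local_env) - set(forwarded))
    print("would not forward:", dropped)
```

With DISPLAY withheld, a remote process would use whatever value its own "ssh -X" session assigned instead of inheriting localhost:10 from the login node.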
I'll have to chat with folks here to understand what part of the environment we absolutely need to carry forward, and what parts we need to "cleanse" before passing it along.

Ralph

On 11/30/06 10:50 AM, "Dave Grote" <dpgr...@lbl.gov> wrote:

>
> I'm using caos linux (developed at LBL), which has the wrapper wwmpirun around
> mpirun, so my command is something like
> wwmpirun -np 8 -- -x PYTHONPATH --mca pls_rsh_agent '"ssh -X"'
> /usr/local/bin/pyMPI
> This is essentially the same as
> mpirun -np 8 -x PYTHONPATH --mca pls_rsh_agent '"ssh -X"' /usr/local/bin/pyMPI
> but wwmpirun does the scheduling, for example looking for idle nodes and
> creating the host file.
> My system is set up with a master/login node which is running a full version of
> linux and slave nodes that run a reduced linux (that includes access to the X
> libraries). wwmpirun always picks the slave nodes to run on. I've also tried
> "ssh -Y" and it doesn't help. I've set xhost for the slave nodes in my login
> shell on the master and that didn't work. X forwarding is enabled on all of the
> nodes, so that's not the problem.
>
> I am able to get it to work by having wwmpirun do the command "ssh -X nodennnn
> xclock" before starting the parallel program on that same node, but this only
> works for the first person who logs into the master and gets
> DISPLAY=localhost:10. When someone else tries to run a parallel job, it seems
> that DISPLAY is set to localhost:10 on the slaves and tries to forward through
> that other person's login with the same display number and the connection is
> refused because of wrong authentication. This seems like very odd behavior.
> I'm aware that this may be an issue with the X server (xorg) or with the
> version of linux, so I am also seeking help from the person who maintains caos
> linux. If it matters, the machine uses myrinet for the interconnects.
> Thanks!
> Dave
>
> Galen Shipman wrote:
>>
>> what does your command line look like?
>>
>> - Galen
>>
>> On Nov 29, 2006, at 7:53 PM, Dave Grote wrote:
>>
>>>
>>> I cannot get X11 forwarding to work using mpirun. I've tried all of the
>>> standard methods, such as setting pls_rsh_agent = ssh -X, using xhost,
>>> and a few other things, but nothing works in general. In the FAQ,
>>> http://www.open-mpi.org/faq/?category=running#mpirun-gui, a reference is
>>> made to other methods, but "they involve sophisticated X forwarding
>>> through mpirun", and no further explanation is given. Can someone tell
>>> me what these other methods are or point me to where I can find info on
>>> them? I've done lots of google searching and haven't found anything
>>> useful. This is a major issue since my parallel code heavily depends on
>>> having the ability to open X windows on the remote machine. Any and all
>>> help would be appreciated!
>>> Thanks!
>>> Dave
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users