Ya, I just tested -x as well, and it does indeed set the value of DISPLAY correctly for every process, every time I run it. Unfortunately, the displays are still not behaving as desired: sometimes they open, and sometimes they don't.
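For runs like this, one way to see exactly what each rank receives is to forward the variable and echo it from every process. This is only a sketch: the `--tag-output` flag and the `OMPI_COMM_WORLD_RANK` envar are assumptions that may vary across Open MPI versions; the hostfile path is the one used later in the thread.

```shell
# Sketch: print the DISPLAY each rank actually sees, one tagged line
# per process (flag/envar names may differ across Open MPI versions).
export DISPLAY=:0.0
mpirun --tag-output -x DISPLAY -np 4 -hostfile ~/openmpi.hosts \
    sh -c 'echo "rank ${OMPI_COMM_WORLD_RANK:-?}: DISPLAY=${DISPLAY:-unset}"'
```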
I'm currently using openmpi-1.4.1 over infiniband on a Rocks cluster. Any ideas on how to debug this would be greatly appreciated.

Thanks,
Brad

On Tue, Dec 7, 2010 at 1:42 PM, Ralph Castain <r...@open-mpi.org> wrote:

> FWIW: I just tested the -x option on a multi-node system and had no problem
> getting the value of DISPLAY to propagate. I was able to define it on the
> cmd line, saw it set correctly on every process, etc.
>
> This was with our devel trunk - not sure what version you are using.
>
>
> On Dec 7, 2010, at 12:12 PM, brad baker wrote:
>
> Thanks for your responses! I'm at home today, so I can't actually run any
> tests to 'see' if anything works. But I logged in remotely and did as Ralph
> suggested, running env as my app. No process returned a value for DISPLAY.
> Then I made a small program that calls getenv("DISPLAY") to run with mpi,
> and each process returns NULL.
>
> I did some googling and found this in the mpirun man page
> <http://linux.die.net/man/1/mpirun>:
>
> "Exported Environment Variables
> The -x option to mpirun can be used to export specific environment
> variables to the new processes. While the syntax of the -x option allows
> the definition of new variables, note that the parser for this option is
> currently not very sophisticated - it does not even understand quoted
> values. Users are advised to set variables in the environment and use
> -x to export them; not to define them."
>
> So it looks like I need to set them manually, possibly as Jeff suggested.
> I'll do some more research on this and get back after I've tried a few
> things in the lab.
>
> Thanks again!
> Brad
>
>
> On Tue, Dec 7, 2010 at 10:26 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
>
>> Are you using ssh to launch OMPI between your nodes?
>> (i.e., is mpirun using ssh under the covers to launch on remote nodes?)
>>
>> If so, you might want to just set OMPI to use "ssh -X", which sets up
>> SSH-tunneled X forwarding, and therefore sets DISPLAY for you properly on
>> all the remote nodes automatically. But it does have the disadvantage of
>> being a bit slow, since it's coming through ssh.
>>
>> Alternatively, you can xhost +<source_host>, where <source_host> is the
>> host where your X app is running. Then set your DISPLAY variable manually
>> to <source_host>:display and it'll just go in an unencrypted fashion. This
>> is normal X forwarding stuff -- you can probably google around for more
>> info on this.
>>
>> NOTE: IIRC, xauth is better than xhost these days. I stopped using X for
>> most things many years ago, so my xhost/xauth information is probably a
>> little dated. Google around for the most recent / best ways to do this
>> stuff.
>>
>>
>> On Dec 6, 2010, at 10:11 PM, Ralph Castain wrote:
>>
>> > BTW: you might check to see if the DISPLAY envar is being correctly set
>> > on all procs. Two ways to do it:
>> >
>> > 1. launch "env" as your app to print out the envars - can be messy on
>> > the output end, though you could use the mpirun options to tag and/or
>> > split the output from the procs
>> >
>> > 2. in your app, just do a getenv and print the display envar
>> >
>> > Would help tell us if there is an OMPI problem, or just a problem in
>> > how you set up X11
>> >
>> >
>> > On Dec 6, 2010, at 9:18 PM, Ralph Castain wrote:
>> >
>> >> Hmmm...yes, the code does seem to handle that '=' being in there.
>> >> Forgot it was there.
>> >>
>> >> Depending on the version you are using, mpirun could just open the
>> >> display for you. There is an mpirun option that tells us to please
>> >> start each app in its own xterm.
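Jeff's two alternatives can be sketched as commands. Hedged heavily: the MCA parameter name `plm_rsh_agent` and the host name `viewhost` are assumptions for illustration (the rsh-agent parameter has been renamed across Open MPI releases), and which machine runs `xhost` depends on where the X server that should show the windows lives.

```shell
# Option 1 (sketch): have the ssh launcher use "ssh -X", so SSH sets up
# tunneled X forwarding and DISPLAY on every remote node (but slower).
mpirun --mca plm_rsh_agent "ssh -X" -np 4 -hostfile ~/openmpi.hosts ./myapp

# Option 2 (sketch): unencrypted forwarding. On the machine whose X server
# should display the windows ("viewhost" is a placeholder), allow clients
# from the compute nodes, then point DISPLAY back at it when launching:
#   xhost +compute-node        # run on viewhost, once per compute node
export DISPLAY=viewhost:0
mpirun -x DISPLAY -np 4 -hostfile ~/openmpi.hosts ./myapp
```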
>> >> You shouldn't need forwarding if you are going to see it on a local
>> >> display (i.e., one physically attached to the node), assuming you are
>> >> logged into those nodes (otherwise you don't own the display).
>> >>
>> >> If you are trying to view it on your own local display, then you do
>> >> need forwarding set up.
>> >>
>> >>
>> >> On Dec 6, 2010, at 8:36 PM, brad baker wrote:
>> >>
>> >>> Without including -x DISPLAY, glut doesn't know what display to
>> >>> open. For instance, without the -x DISPLAY parameter, glut returns
>> >>> an error from each process stating that it could not find display ""
>> >>> (empty string). This strategy is briefly described in the openmpi
>> >>> FAQs for launching gui applications with openmpi.
>> >>>
>> >>> I'm assuming that by setting the DISPLAY envar to :0.0, each process
>> >>> will render to its local display, which is my intention, and as I
>> >>> previously stated works for up to 2 processes. So I believe it to be
>> >>> necessary.
>> >>>
>> >>> But I'm thinking I may have to configure some kind of X11
>> >>> forwarding. I'm not sure...
>> >>>
>> >>> Thanks for your reply! Any more ideas?
>> >>> Brad
>> >>>
>> >>>
>> >>> On Mon, Dec 6, 2010 at 6:31 PM, Ralph Castain <r...@open-mpi.org>
>> >>> wrote:
>> >>> Guess I'm not entirely sure I understand how this is supposed to
>> >>> work. All the -x does is tell us to pick up an envar of the given
>> >>> name and forward its value to the remote apps. You can't set the
>> >>> envar's value on the cmd line. So you told mpirun to pick up the
>> >>> value of an envar called "DISPLAY=:0.0".
>> >>>
>> >>> So yes - I would expect this to behave strangely.
>> >>>
>> >>> If you tell us -x DISPLAY, we'll pick up the local value of DISPLAY
>> >>> and forward it. What that will cause your app to do is, I suppose,
>> >>> up to it.
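Ralph's point that `-x DISPLAY` picks up the *local* value is the same one the man page makes: the variable has to actually be set (exported) in the launching environment before mpirun can forward it. A minimal single-node sketch, runnable without mpirun, of the inheritance behavior involved:

```shell
# A child process only inherits DISPLAY if the parent has exported it.
unset DISPLAY
DISPLAY=:0.0                                  # assigned, but not exported
sh -c 'echo "unexported: ${DISPLAY:-unset}"'  # prints "unexported: unset"
export DISPLAY                                # now in the environment
sh -c 'echo "exported: ${DISPLAY:-unset}"'    # prints "exported: :0.0"
```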
>> >>>
>> >>>
>> >>> On Dec 6, 2010, at 12:42 PM, brad baker wrote:
>> >>>
>> >>> > Hello,
>> >>> >
>> >>> > I'm working on an mpi application that opens a glut display on
>> >>> > each node of a small cluster for opengl rendering (each node has
>> >>> > its own display). My current implementation scales great with
>> >>> > mpich2, but I'd like to use openmpi over infiniband, which is
>> >>> > giving me trouble.
>> >>> >
>> >>> > I've had some success with the -x DISPLAY=:0.0 parameter to
>> >>> > mpirun, which will open the display on up to 2 of my nodes... any
>> >>> > 2. But when I attempt to run the application on 4 nodes, the
>> >>> > display behavior is non-deterministic. If any open at all, process
>> >>> > 0 definitely will, and sometimes process 3 along with it. I
>> >>> > haven't observed much behavior from process 1 or 2.
>> >>> >
>> >>> > My command:
>> >>> >
>> >>> > mpirun -x DISPLAY=:0.0 -np 4 -hostfile ~/openmpi.hosts ./myapp
>> >>> >
>> >>> > I've tried adding the -d option with no success.
>> >>> >
>> >>> > Does anyone have any suggestions or comments? I'll certainly
>> >>> > provide more information if required.
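The man-page advice quoted earlier in the thread, applied to this command, would look roughly like the following sketch (whether `:0.0` is the right value on every node depends on each node's local X setup):

```shell
# Sketch: set DISPLAY in the launching environment, then forward it by
# name with -x, rather than defining "DISPLAY=:0.0" as the -x argument.
export DISPLAY=:0.0
mpirun -x DISPLAY -np 4 -hostfile ~/openmpi.hosts ./myapp
```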
>> >>> >
>> >>> > Thanks,
>> >>> > Brad
>> >>> > _______________________________________________
>> >>> > users mailing list
>> >>> > us...@open-mpi.org
>> >>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/