Re: [OMPI users] Tracing the library using gdb and xterm
Hi Rolf,

Thanks for that. There is still one minor problem, though. The X window is getting spawned on the remote machine and not on my local machine. The command now looks like:

mpirun --prefix /usr/local -hostfile machines -x DISPLAY -x PATH -np 2 xterm -e gdb peruse_ex1

Please let me know what I can do to have it displayed on my machine. I have the DISPLAY variable set to 0.0 on both the machines and I am ssh-ing into the other machine by using the -X switch.

Thanks,
Krishna Chaitanya

On 1/2/08, Rolf Vandevaart wrote:
>
> Krishna Chaitanya wrote:
> > Hi,
> > I have been tracing the interactions between the PERUSE
> > and MPI library, on one machine. I have been using gdb along with xterm
> > to have two windows open at the same time as I step through the code. I
> > wish to get a better glimpse of the working of the point to point calls,
> > by launching the job on two machines and by tracing the flow in a
> > similar manner. This is where I stand as of now:
> >
> > mpirun --prefix /usr/local -hostfile machines -np 2 xterm -e gdb peruse_ex1
> > xterm Xt error: Can't open display:
> > xterm: DISPLAY is not set
> >
> > I tried using the display option for xterm and setting
> > the value as 0.0, that was not of much help.
> > If someone can guide me as to where the DISPLAY parameter
> > has to be set to allow the remote machine to open the xterm window, it
> > will be of great help.
> >
> > Thanks,
> > Krishna
>
> I also do the following:
>
> -x DISPLAY -x PATH
>
> In this way, both your DISPLAY and PATH settings make it to the remote
> node.
>
> Rolf
> --
> rolf.vandeva...@sun.com
> 781-442-3043

--
In the middle of difficulty, lies opportunity
Re: [OMPI users] Tracing the library using gdb and xterm
Krishna --

Did you not see my post yesterday?

http://www.open-mpi.org/community/lists/users/2008/01/4774.php

--
Jeff Squyres
Cisco Systems

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] Tracing the library using gdb and xterm
Krishna,

Review the ssh and sshd man pages. When you use ssh -X, ssh takes care of defining DISPLAY and sending the X11 images to your screen. Defining DISPLAY directly generally won't work (that is how you do it with rlogin, but not with ssh).

Doug Reeder
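Doug's point about ssh -X defining DISPLAY can be checked on the remote machine. The following is a minimal sketch, assuming sshd there uses its default X11DisplayOffset of 10 and default X11UseLocalhost setting, in which case a forwarded DISPLAY typically looks like localhost:10.0 rather than :0.0. The function name is_forwarded_display is hypothetical and the pattern is only a heuristic.

```shell
# Heuristic check (hypothetical helper, not part of ssh or Open MPI):
# succeed if the given DISPLAY value looks like one set up by ssh X11
# forwarding, i.e. "localhost:N" with N >= 10 under default sshd settings.
is_forwarded_display() {
    case "$1" in
        localhost:[1-9][0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the remote machine, after logging in with "ssh -X":
#   is_forwarded_display "$DISPLAY" && echo "X11 forwarding looks active"
```

If the check fails after an ssh -X login, a manually exported DISPLAY has probably overridden the value that ssh set.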
Re: [OMPI users] Tracing the library using gdb and xterm
Per my previous mail, Open MPI (by default) closes its ssh sessions after the remote processes are launched, so X forwarding through ssh will not work. If it is possible (and I think it is, based on your subsequent replies), you might be best served with unencrypted X forwarding.

--
Jeff Squyres
Cisco Systems
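A minimal sketch of the unencrypted approach Jeff describes, assuming the local X server accepts TCP connections, is display 0, and that the remote nodes can resolve the local host name. The helper names local_display and debug_cmd are hypothetical, as are the host names in the comments; xhost and mpirun's -x NAME=value form are existing commands and options. The functions only print the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: build a DISPLAY value naming the local X server.
# Assumes display 0; the first argument is the local host name
# (defaults to the output of hostname).
local_display() {
    printf '%s:0.0\n' "${1:-$(hostname)}"
}

# Hypothetical helper: print the mpirun command from this thread with
# DISPLAY set explicitly, instead of relying on ssh X forwarding.
debug_cmd() {
    printf 'mpirun --prefix /usr/local -hostfile machines -x DISPLAY=%s -x PATH -np 2 xterm -e gdb peruse_ex1\n' \
        "$(local_display "$1")"
}

# On the local machine, first allow the remote node to open windows, e.g.:
#   xhost +remote-node      (remote-node is a placeholder host name)
# then review and run the printed command:
#   debug_cmd my-desktop    (my-desktop is a placeholder host name)
```

Note that xhost opens the local X server to the named host without encryption, so this is only reasonable on a trusted network.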
Re: [OMPI users] multi-compiler builds of OpenMPI (RPM)
Thanks for the detailed responses! I've included some stuff inline below:

On Jan 2, 2008 1:56 PM, Jeff Squyres wrote:
> On Dec 31, 2007, at 12:50 AM, Jim Kusznir wrote:
> > The rpm build errored out near the end with a missing file. It was
> > trying to find /opt/openmpi-gcc/1.2.4/opt/share/openmpi-gcc (IIRC),
> > but the last part was actually openmpi on disk. I ended up
> > correcting it by changing line 182 (configuration logic) to:
> >
> > %define _datadir /opt/%{name}/%{version}/share/%{name}
> >
> > (I changed _pkgdatadir to _datadir). Your later directive if
> > _pkgdatadir is undefined took care of _pkgdatadir. I must admit, I
> > still don't fully understand where rpm was getting the idea to look
> > for that file...I tried manually configuring _pkgdatadir to the path
> > that existed, but that changed nothing. If I didn't rename the
> > package, it all worked fine.
>
> Hmm. This is actually symptomatic of a larger problem -- Open MPI's
> configure/build process is apparently not getting the _pkgdatadir
> value, probably because there's no way to pass it on the configure
> command line (i.e., there's no standard AC --pkgdatadir option).
> Instead, the "$datadir/openmpi" location is hard-coded in the Open MPI
> code base (in opal/mca/installdirs/config, if you care). As such,
> when you re-defined %{_name}, the specfile didn't agree with where
> OMPI actually installed the files, resulting in the error you saw.
> Yuck.
>
> Well, there are other reasons you can't have multiple OMPI
> installations share a single installation tree (e.g., they'll all try
> to install their own "mpirun" executable -- per a prior thread, the
> --program-prefix/suffix stuff also doesn't work; see
> https://svn.open-mpi.org/trac/ompi/ticket/1168
> for details). So this isn't making OMPI any worse than it already
> is. :-\
>
> So I think the best solution for the moment is to just fix the
> specfile's %_pkgdatadir to use the hard-coded name "openmpi" instead
> of %{name}.

I actually tried this first, but it failed to accomplish anything (got the same error). However, now with defining %_datadir, it works with the name directive just fine.

> I committed these changes (and some other small fixes for things I
> found while testing the _name and multi-package stuff) to the OMPI SVN
> trunk in r17036 (see https://svn.open-mpi.org/trac/ompi/changeset/17036)
> -- could you give it a whirl and see if it works for you?
>
> And another from an off-list mail:
>
> > In the preamble for the separate rpm files, the -devel and -docs
> > reference openmpi-runtime statically rather than using
> > %{name}-runtime, which breaks dependencies if you build under a
> > different name as I am.
>
> Doh. I tried replacing the Requires: with %{_name}-runtime, but then
> rpmbuild complained:
>
> error: line 300: Dependency tokens must begin with alpha-numeric, '_'
> or '/': Requires: %{_name}-runtime

Huh, this is strange. Here's the chunk from my spec file and rpm version. I've now built 3 sets of multi-rpm openmpi, each with a different name, and it's worked flawlessly:

[root@aeolus ~]# rpmbuild --version
RPM version 4.3.3
[root@aeolus ~]# grep Requires /usr/src/redhat/SPECS/openmpi.spec
Requires: %{modules_rpm_name}
Requires: %{mpi_selector_rpm_name}
Requires: %{modules_rpm_name}
Requires: %{name}-runtime
Requires: %{name}-runtime

Perhaps it's the difference between _name and name.

> So it looks like Requires: will only take a hard-coded name, not a
> variable (I have no comments in the specfile about this issue, but
> perhaps that's why Greg/I hard-coded it in the first place...?).
> Yuck. :-(
>
> This error occurred with rpmbuild v4.3.3 (the default on RHEL4U4), so
> I tried manually upgrading to v4.4.2.2 from rpm.org to see if this
> constraint had been relaxed, but I couldn't [easily] get it to build.
> I guess it wouldn't be attractive to use something that would only
> work with the newest version RPM, anyway.
>
> We'll unfortunately have to do something different, then. :-(
> Obvious but icky solutions include:
>
> - remove the Requires statements
> - protect the Requires statements to only be used when %{_name} is
>   "openmpi"
>
> Got any better ideas?
>
> > 3) Will the resulting -runtime .rpms (for the different compiler
> > versions) coexist peacefully without any special environment munging
> > on the compute nodes, or do I need modules, etc. on all the compute
> > nodes as well?
>
> They can co-exist peacefully out on the nodes because you should
> choose different --prefix values for each installation (e.g.,
> /opt/openmpi_gcc3.4.0/ or whatever naming convention you choose to use).
> That being said, you should ensure that whatever version of OMPI you
> use is consistent across an entire job. E.g., if job X was compiled
> with the openmpi-gcc installation, then it should use the openmpi-gcc
> installation on all the nodes on which it runs.

I currently have them all installed across the cluster in the
Re: [OMPI users] Tracing the library using gdb and xterm
Krishna,

Would it work to launch the gdb/ddd process separately on the remote machine and then attach to the running MPI job from within gdb/ddd? Something like:

ssh -X [hostname|ip address] [ddd|gdb]

Doug Reeder
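One way Doug's suggestion might look in practice, as a sketch only: start the job normally with mpirun, log in to the node with ssh -X, and attach gdb to the running process by PID. The helper name attach_cmd is hypothetical; pgrep -n and gdb -p are existing options. It assumes the executable is named peruse_ex1, as elsewhere in this thread, and that you own the process.

```shell
# Hypothetical helper: validate a PID and print the gdb command that
# attaches to it. Prints a usage message and fails on a non-numeric PID.
attach_cmd() {
    case "$1" in
        ''|*[!0-9]*) echo "usage: attach_cmd PID" >&2; return 1 ;;
    esac
    printf 'gdb -p %s\n' "$1"
}

# On the remote node (after "ssh -X <node>"), find the newest process
# with a matching name and attach to it:
#   pid=$(pgrep -n peruse_ex1)
#   attach_cmd "$pid"       # prints e.g. "gdb -p <pid>"; run that command
```

By the time gdb attaches, the process may already have run past the calls of interest, so this works best when the program waits or blocks early on.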