Hi, I'm starting to think that the main problem lies in the 32-bit server not being able to execute the program compiled by the 64-bit PC, and vice versa. I just noticed that the sizes of the executables created by the PC and server are different - the one created by the 64-bit PC is 3204 bytes larger. Does that imply that the programs are somewhat different?
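A size difference alone does not prove much, but the ELF header does: the fifth byte of an ELF file (EI_CLASS) is 1 for a 32-bit binary and 2 for a 64-bit one. Below is a minimal Python sketch for checking this, assuming 32.out and 64.out are ordinary ELF executables. The function name `elf_class` is only illustrative and is not part of Open MPI.

```python
import sys

# Magic number and EI_CLASS values as defined by the ELF format
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "ELFCLASS32", 2: "ELFCLASS64"}


def elf_class(path):
    """Return 'ELFCLASS32' or 'ELFCLASS64' for an ELF file, or None otherwise."""
    with open(path, "rb") as f:
        header = f.read(5)  # 4 magic bytes followed by the EI_CLASS byte
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None  # not an ELF file, or too short to tell
    return ELF_CLASSES.get(header[4])


if __name__ == "__main__":
    # Usage: python elfclass.py 32.out 64.out
    for name in sys.argv[1:]:
        print(name, elf_class(name))
```

The standard `file` command (for example `file 32.out`) reports the same information without any code.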
I compiled the program using the server and created 32.out, then I compiled the program using the PC and created 64.out. When I used the PC to run 32.out locally, I got the error:

"error while loading shared libraries: libmpi.so.0: wrong ELF class: ELFCLASS64"

But when I used the PC to run 64.out locally, it ran fine. I got the following error message when I ran 64.out on the server locally:

"Could not execute the executable "./a.out": Exec format error

This could mean that your PATH or executable name is wrong, or that you do not have the necessary permissions. Please ensure that your executable is able to be found and executed."

However, when I used the server to run 32.out locally, it ran fine.

Does this mean that the errors I got are not due to incorrect setup of the "MPI network", but to the incompatibility of programs compiled on 32-bit and 64-bit machines?

Thank you.

Regards,
Rayne

--- On Wed, 13/8/08, jody <jody....@gmail.com> wrote:

> From: jody <jody....@gmail.com>
> Subject: Re: [OMPI users] Setting up Open MPI to run on multiple servers
> To: lancer6...@yahoo.com, "Open MPI Users" <us...@open-mpi.org>
> Date: Wednesday, 13 August, 2008, 2:56 PM
>
> Hi Rayne
>
> SSH is used to start processes on the other machines - that's why you must configure ssh to work without passwords.
>
> As to your 64/32 bit problem: a program compiled for 32 bits usually works on a 64 bit machine, but not vice versa. There are methods to start MPI such that different executables are started on different machines, but I guess the easiest way to get things going would be to use 32 bit versions of your program on all your machines.
>
> Jody
>
> On Wed, Aug 13, 2008 at 4:52 AM, Rayne <lancer6...@yahoo.com> wrote:
> > Thank you for all the replies.
> >
> > Here's what I have now.
> >
> > I modified my .bash_profile on my server to include the path of my executables, and now mpiexec and mpicc both point to the correct ones.
> > I tried setting the LD_LIBRARY_PATH too, but it didn't seem to work, as it kept telling me it couldn't find the shared library libmpi.so.0, although 'which libmpi.so.0' gave me the correct location. Then I modified the /etc/ld.so.conf file to include the directory of the libraries, and now the MPI programs work correctly on the server.
> >
> > Now, my problem is that I have trouble running the program using my PC and remotely on my server. I have the IP address of my server in the openmpi-default-hostfile in my PC, and I have set up a password-less ssh between them (though I have set it up such that it asks for a passphrase). All my programs and executables are in the shared folder. However, when I tried to run the program on my PC using 'mpiexec -n 2 ./a.out', I get the following error:
> >
> > "Failed to find or execute the following executable:
> >
> > Host: (Name of my server)
> > Executable: ./a.out
> >
> > Cannot continue."
> >
> > If I try to compile then execute the program locally, on both my PC and server, they run fine. It's only when I try to get both PC and server to run the program concurrently (which is the purpose of using MPI) that I get the error. I have checked and the a.out file is exactly the same on the PC and server, in terms of the size and date/time modified.
> >
> > One thing is that my PC is a 64-bit machine, and my server is a 32-bit machine. Could this be a factor, that a program compiled on a 64-bit machine cannot run on a 32-bit machine?
> >
> > Also, I don't quite understand the mechanism of how MPI allows one machine to communicate with another. For example, after compiling a program, an executable is created and stored on that machine and also on the remote nodes, through the use of a shared system. So when I run the program, how does the machine I'm at get the remote nodes to run the program too?
> >
> > Thank you, and sorry for the long email.
> >
> > Regards,
> > Rayne