Re: [OMPI users] Please help "orte_init failed error" running simple MPI code
Hi Jeff,

Thanks for your advice, I have solved my error and I should have read the FAQ. The two commands are really helpful :)

  mpiCC --showme:compile
  mpiCC --showme:link

But configuring KDevelop for the shown compile+link options is a bit confusing for newbies like me :) Well, after trial and error I think I got the answer, and I hope it will be helpful for KDevelop+OpenMPI newcomers (KDevelop 3.4.1):

(1) Project->Project Options->Configure Options->General->Linker flags: copy & paste the output of --showme:link
(2) Environment variables: add LD_LIBRARY_PATH [/usr/local/lib] and PATH [/usr/local/bin], where [/xx/xx/xx] is your OpenMPI install directory
(3) Project->Project Options->Configure Options->C++->Compiler flags: copy & paste the output of --showme:compile
(4) Automake Manager->Target Options->Libraries->Link libraries outside project: add all the .a or .so files installed in your OpenMPI directory

Now I can compile and run without problems! :)

But I still have one unsolved question: when I press Shift+F9 (the "Gear icon"), Konsole runs its default -e /bin/sh -c, and I cannot run with "mpirun -np 4" on a multicore Xeon PC. I have checked Project->Project Options->Run Options, but it seems I can only pass arguments, not a customized execute command. How can I tell Konsole to run "mpirun -np 4" when I press the "Gear icon"?

Thanks!
Rolly

-----Original Message-----
From: Jeff Squyres [mailto:jsquy...@cisco.com]
Sent: March 15, 2008 7:45
To: 50008...@student.cityu.edu.hk; Open MPI Users
Subject: Re: [OMPI users] Please help "orte_init failed error" running simple MPI code

Two things:

- You might want to run "mpicc --showme" to get the list of compiler/linker flags to put into KDevelop. Better yet, if you can get KDevelop to use mpicc (etc.) instead of gcc, that would avoid you needing to explicitly list anything in terms of libraries, header locations, etc.

- Check that your system didn't come with Open MPI already installed. If so, you may accidentally be mixing and matching two different versions (e.g., the 1.2.5 that you installed and the version that came pre-installed). We do not [yet] guarantee binary compatibility between different versions of Open MPI, so if you're mixing them, Bad Things can happen.

On Mar 13, 2008, at 11:33 AM, Rolly Ng wrote:

> Hello all,
>
> I am new to Open MPI programming and I have a strange error while
> running my simple code:
>
> My platform is an IBM T42 notebook with just a single-core processor,
> and I just installed OpenSuSE 10.3 with KDevelop as my IDE. I
> downloaded openmpi-1.2.5.tar.gz and installed it using the commands:
>
> shell$ gunzip -c openmpi-1.2.5.tar.gz | tar xf -
> shell$ cd openmpi-1.2.5
> shell$ ./configure --prefix=/usr/local
> <...lots of output...>
> shell$ make all install
>
> Then I added the -lmpi, -lmpi_cxx, -lopen-pal, -lopen-rte and
> -lmca_common_sm options to the link libraries outside project (LDADD)
> in the Automake Manager inside KDevelop. I have also added PATH
> /usr/local/bin and LD_LIBRARY_PATH /usr/local/lib in the Environment
> variables in the Run options of Project Options. I can compile my
> code with no error.
>
> Here are my codes,
>
> #ifdef HAVE_CONFIG_H
> #include <config.h>
> #endif
>
> #include <iostream>
> #include <mpi.h>
> #include
> //#include
> //#include
>
> using namespace std;
>
> int main(int argc, char ** argv)
> {
>   int mynode, totalnodes;
>
>   MPI_Init(&argc, &argv);
>   MPI_Comm_size(MPI_COMM_WORLD, &totalnodes);
>   MPI_Comm_rank(MPI_COMM_WORLD, &mynode);
>
>   cout << "Hello world from processor " << mynode << " of "
>        << totalnodes << endl;
>
>   MPI_Finalize();
> }
>
> I am expecting the output as: Hello world from processor 0 of 1.
> But it does not work and MPI failed to initialize. The output is
> strange:
>
> [rollyopensuse:24924] [0,0,0] ORTE_ERROR_LOG: Error in file
> runtime/orte_init_stage1.c at line 312
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel
> process is likely to abort. There are many reasons that a parallel
> process can fail during orte_init; some of which are due to
> configuration or environment problems. This failure appears to be
> an internal failure; here's some additional information (which may
> only be relevant to an Open MPI developer):
>
>   orte_pls_base_select failed
>   --> Returned value -1 instead of ORTE_SUCCESS
>
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process
> is likely to abort. There are many reasons that a parallel process
> can fail during MPI_INIT; some of which are due to configuration or
> environment problems. This failure appears to be an internal
> failure; here's some additional information (which may only be
> relevant to an Open MPI developer):
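As a quick follow-up to Jeff's second point: one way to check whether a second, pre-installed Open MPI is being picked up ahead of the build in /usr/local is to ask the tools themselves. This is only a sketch; the exact output wording differs between Open MPI versions.

  shell$ which mpiCC mpirun             # both should resolve under /usr/local/bin
  shell$ ompi_info | grep "Open MPI:"   # version of the Open MPI found first in PATH
  shell$ ompi_info | grep -i prefix     # install prefix that build was configured with

If the paths or versions reported here do not match the openmpi-1.2.5 installed under /usr/local, two installations are probably being mixed.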
Re: [OMPI users] openmpi-1.2.5 and globus-4.0.5
My apologies - I should have read the note more closely. Still trying to leaf through the umpteen emails that arrived during my vacation. ;-)

I take it that "globus-job-run" is executing some script that eventually calls our mpirun? Your script must be doing some command-line parsing as I do not recognize some of those options - it would help to see the eventual command line being given to mpirun.

The problem here is that mpirun looks at all its available launchers to see what can work. In this case, it duly noted that launchers for the managed environments (e.g., Torque and slurm) will not work. This leaves only the rsh launcher. The rsh launcher looks for "ssh" or (if that isn't found) "rsh" to be in the path by default. In your case, it clearly didn't find either one, so the rsh launcher indicated that it also would not work. As a result, mpirun aborts because no mechanism to launch processes could be found.

Your choices remain the same as what I previously described, however. IIRC, launching on the grid requires communication with the Globus daemons on each of the target machines, possibly interaction with the Globus security manager, etc. ORTE doesn't know how to do any of these things, so you will either have to tell it how to do so, or use the "standalone" launch method.

Alternatively, if you believe you can use some ssh-like variant, then you can provide that command to ORTE in place of the default "ssh". The parameter would be -mca pls_rsh_agent my_ssh_replacement. Be sure this replacement command is in your path, or provide the absolute pathname of it. Note that the replacement command -must- accept command line options similar to those of ssh - ORTE will replace "ssh" with whatever you give it, but the rest of the command line will be built as if the command was "ssh".

FWIW, there are people working on integrating a Globus-aware launcher into ORTE. I'm not entirely sure when that will be completed (it will not be back-ported to the 1.2.x series), nor if/when that code would become part of the OMPI distribution.

Hope that helps.
Ralph

On 3/14/08 9:01 PM, "Ralph Castain" wrote:

> The problem here is that you are attempting to start the application
> processes without using our mpirun. We call this a "standalone" launch.
>
> Unfortunately, OMPI doesn't currently understand how to do a standalone
> launch - ORTE will get confused and abort, as you experienced. There are two
> ways to fix this:
>
> 1. someone could write a Globus launcher for ORTE. I don't think this would
> be terribly hard. You would then use our mpirun to start the job after
> getting an allocation via some grid-compatible resource manager.
>
> 2. once we get standalone operations working, you could do what you tried.
> You will likely have to write an ESS component for Globus so the processes
> can figure out their rank.
>
> I have done some prototyping for standalone launch, and expect to have at
> least one working example in our development trunk later this month.
> However, we currently don't plan to release standalone support until
> probably 1.3.2, which likely won't come out for a few months.
>
> Hope that helps
> Ralph
>
>
> On 3/14/08 5:40 PM, "Jeff Squyres" wrote:
>
>> I don't know if anyone has tried to run Open MPI with globus before.
>>
>> One requirement that Open MPI currently has is that all nodes must be
>> reachable to each other via TCP. Is that true in your globus
>> environment?
>>
>>
>>
>> On Mar 10, 2008, at 11:01 AM, Christoph Spielmann wrote:
>>
>>> Hi everybody!
>>>
>>> I try to get OpenMPI and Globus to cooperate. These are the steps I
>>> executed in order to get OpenMPI working:
>>>
>>>   export PATH=/opt/openmpi/bin/:$PATH
>>>   /opt/globus/setup/globus/setup-globus-job-manager-fork
>>>     checking for mpiexec... /opt/openmpi/bin//mpiexec
>>>     checking for mpirun... /opt/openmpi/bin//mpirun
>>>     find-fork-tools: creating ./config.status
>>>     config.status: creating fork.pm
>>>   restart VDT (includes GRAM, WSGRAM, mysql, rls...)
>>>
>>> As you can see, the necessary OpenMPI executables are recognized
>>> correctly by setup-globus-job-manager-fork. But when I actually try
>>> to execute a simple MPI program using globus-job-run I get this:
>>>
>>>   globus-job-run localhost -x '(jobType=mpi)' -np 2 -s ./hypercube 0
>>> [hydra:10168] [0,0,0] ORTE_ERROR_LOG: Error in file
>>> runtime/orte_init_stage1.c at line 312
>>> --------------------------------------------------------------------------
>>> It looks like orte_init failed for some reason; your parallel
>>> process is likely to abort. There are many reasons that a parallel
>>> process can fail during orte_init; some of which are due to
>>> configuration or environment problems. This failure appears to be
>>> an internal failure; here's some additional information (which may
>>> only be relevant to an Open MPI developer):
>>>
>>>   orte_pls_base_select failed
>>>   --> Returned value -1 instead of ORTE_SUCCESS
>>>
>>> -
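To make the pls_rsh_agent suggestion above concrete: if an ssh-compatible remote shell is available on the grid nodes (for example a GSI-enabled ssh such as gsissh; whether one is installed is an assumption about the local Globus setup), it can be handed to ORTE like this (a sketch, not a tested Globus recipe):

  shell$ mpirun -np 2 -mca pls_rsh_agent gsissh --hostfile my_hosts ./hypercube 0

Here gsissh stands in for whatever ssh-like command the site actually provides, and my_hosts is a hypothetical file listing the target machines. As noted above, the agent must accept ssh-style command-line arguments and must either be in the PATH or be given as an absolute pathname.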
Re: [OMPI users] Please help "orte_init failed error" running simple MPI code
On Mar 15, 2008, at 1:31 AM, Rolly Ng wrote:

> But I still have one unsolved question: when I press Shift+F9 (the
> "Gear icon"), Konsole runs its default -e /bin/sh -c, and I cannot
> run with "mpirun -np 4" on a multicore Xeon PC. I have checked
> Project->Project Options->Run Options, but it seems I can only pass
> arguments, not a customized execute command.

I'm afraid I know nothing about KDevelop, so I can't help you. I usually run manually in a terminal window.

-- 
Jeff Squyres
Cisco Systems
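For completeness, the manual route amounts to compiling with the wrapper compiler and launching with mpirun from a terminal. A minimal sketch, assuming the source file is named hello.cpp (a hypothetical name):

  shell$ mpiCC -o hello hello.cpp
  shell$ mpirun -np 4 ./hello

On a single multicore machine no host file is needed; mpirun starts four copies of the program on the local node.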