[OMPI users] intel compiler linking issue and issue of environment variable on remote node, with open mpi 1.4.3
Hi,

I am trying to compile our codes with Open MPI 1.4.3, using Intel compilers 8.1.

(1) For the Open MPI 1.4.3 installation on a Linux Beowulf cluster, I use:

./configure --prefix=/home/yiguang/dmp-setup/openmpi-1.4.3 CC=icc CXX=icpc F77=ifort FC=ifort --enable-static LDFLAGS="-i-static -static-libcxa" --with-wrapper-ldflags="-i-static -static-libcxa" 2>&1 | tee config.log

and

make all install 2>&1 | tee install.log

The issue is that I am trying to build Open MPI 1.4.3 with the Intel compiler libraries statically linked into it, so that when we run mpirun/orterun it does not need to load any Intel libraries dynamically. But what I get is that mpirun always asks for some Intel library (e.g. libsvml.so) unless I put the Intel library path on the library search path ($LD_LIBRARY_PATH). I checked the Open MPI user archive; one kind user mentioned using "-i-static" (in my case) or "-static-intel" in LDFLAGS. That is what I did, but it does not seem to work, and I found no confirmation in the archive that it works for anyone else. Could anyone help me with this? Thanks!

(2) After compiling and linking our in-house codes with Open MPI 1.4.3, we want to assemble a minimal set of executables for our codes, plus a few from the Open MPI 1.4.3 installation, without any dependence on external settings such as environment variables. I organize my directory as follows:

parent/
  package/
  bin/
  lib/
  tools/

The package/ directory holds executables from our codes. bin/ has mpirun and orted, copied from the Open MPI installation. lib/ includes the Open MPI libraries and the Intel libraries. tools/ holds some C-shell scripts that launch MPI jobs using the mpirun in bin/.

The parent/ directory is on an NFS share mounted on all nodes of the cluster. In ~/.bashrc (also shared by all nodes), I clear PATH and LD_LIBRARY_PATH so that they do not point to any directory of the Open MPI 1.4.3 installation.

First, if I add the bin/ directory above to PATH and lib/ to LD_LIBRARY_PATH in ~/.bashrc, our parallel codes (started by the C-shell script in tools/) run AS EXPECTED without any problem, so I know the rest of the setup is correct. Then, to avoid modifying ~/.bashrc or ~/.profile, I instead set bin/ on PATH and lib/ on LD_LIBRARY_PATH in the C-shell script under tools/:

setenv PATH /path/to/bin:$PATH
setenv LD_LIBRARY_PATH /path/to/lib:$LD_LIBRARY_PATH

When I then start our codes from the C-shell script in tools/, I get the message "orted: command not found" from the slave nodes, even though orted is in /path/to/bin. So I guess the $PATH variable, or more generally the environment variables set in the script, are not passed to the slave nodes by mpirun (I use an absolute path for mpirun in the script). After checking the Open MPI FAQ, I tried adding "--prefix /path/to/parent" to the mpirun command in the C-shell script; it still does not work. Does anyone have any hints? Thanks!

I have tried my best to describe the issues; if anything is unclear, please let me know.

Thanks a lot for your help!

Sincerely,
Yiguang
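For concreteness, here is a minimal sketch of the kind of launch script described above. The node names, process count, and solver binary name are placeholders, not the actual setup:

#!/bin/csh
# Sketch of a tools/ launch script -- all paths and hosts are placeholders.
setenv PATH /path/to/bin:$PATH
setenv LD_LIBRARY_PATH /path/to/lib:$LD_LIBRARY_PATH
# Absolute path to mpirun; --prefix names the tree whose bin/ and lib/
# should be prepended to PATH and LD_LIBRARY_PATH on the remote nodes.
/path/to/bin/mpirun --prefix /path/to/parent -np 4 --host node1,node2 /path/to/package/solver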
Re: [OMPI users] intel compiler linking issue and issue of environment variable on remote node, with open mpi 1.4.3 (Tim Prince)
Thank you very much for the comments and hints. I will try to upgrade our Intel compiler collection. As for my second issue: with Open MPI, is there any way to propagate the environment variables of the current process on the master node to the slave nodes, so that the orted daemon can run on the slave nodes too?

Thanks,
Yiguang

> On 3/21/2011 5:21 AM, ya...@adina.com wrote:
> >
> > I am trying to compile our codes with open mpi 1.4.3, by intel
> > compilers 8.1.
> >
> > (1) For open mpi 1.4.3 installation on linux beowulf cluster, I use:
> >
> > ./configure --prefix=/home/yiguang/dmp-setup/openmpi-1.4.3 CC=icc
> > CXX=icpc F77=ifort FC=ifort --enable-static LDFLAGS="-i-static
> > -static-libcxa" --with-wrapper-ldflags="-i-static -static-libcxa"
> > 2>&1 | tee config.log
> >
> > and
> >
> > make all install 2>&1 | tee install.log
> >
> > The issue is that I am trying to build open mpi 1.4.3 with intel
> > compiler libraries statically linked to it, so that when we run
> > mpirun/orterun, it does not need to dynamically load any intel
> > libraries. But what I got is mpirun always asks for some intel
> > library (e.g. libsvml.so) if I do not put intel library path on
> > library search path ($LD_LIBRARY_PATH). I checked the open mpi user
> > archive, it seems only some kind user mentioned to use
> > "-i-static" (in my case) or "-static-intel" in ldflags, this is what
> > I did, but it seems not working, and I did not get any confirmation
> > whether or not this works for anyone else from the user archive.
> > could anyone help me on this? thanks!
>
> If you are going to use such an ancient compiler (apparently a 32-bit
> one), you must read the docs which come with it, rather than relying
> on comments about a more recent version. libsvml isn't included
> automatically at link time by that 32-bit compiler, unless you specify
> an SSE option, such as -xW. It's likely that no one has verified
> OpenMPI with a compiler of that vintage. We never used the 32-bit
> compiler for MPI, and we encountered run-time library bugs for the
> ifort x86_64 which weren't fixed until later versions.
>
> --
> Tim Prince
Re: [OMPI users] intel compiler linking issue and issue of environment variable on remote node, with open mpi 1.4.3
Thanks for your information. For my Open MPI installation, the executables such as mpirun and orted do depend on the dynamic Intel libraries: when I run ldd on mpirun, several dynamic libraries show up. I am trying to get these Open MPI executables statically linked against the Intel libraries, but I have made no progress, even when I use "--with-gnu-ld" with the specific static Intel libraries set in LIBS when configuring the Open MPI 1.4.3 installation. It seems there is something in the Open MPI 1.4.3 build process that I do not control, or I have just missed something. I will try different things and will report back here once I have a definitive conclusion. In the meantime, any hints on how to make the Open MPI executables statically linked against the Intel libs with the Intel compilers are very welcome. Thanks!

As for the issue that environment variables set in a script do not propagate to the remote slave nodes: I use an rsh connection for simplicity. If I set PATH and LD_LIBRARY_PATH in ~/.bashrc (which is shared by all nodes, master and slave), my MPI application works as expected, which confirms Ralph's suggestions. The thing is that I want to avoid setting the environment variables in the .bashrc or .profile file, and instead set them in the script, and have these environment variables propagate to the slave nodes when I run mpirun, as I can do with MPICH. I also tried using the prefix path with mpirun, as suggested by Jeff; that does not work either. Any hints to solve this issue?

Thanks,
Yiguang

On 23 Mar 2011, at 12:00, users-requ...@open-mpi.org wrote:

> On Mar 21, 2011, at 8:21 AM, ya...@adina.com wrote:
>
> > The issue is that I am trying to build open mpi 1.4.3 with intel
> > compiler libraries statically linked to it, so that when we run
> > mpirun/orterun, it does not need to dynamically load any intel
> > libraries. But what I got is mpirun always asks for some intel
> > library (e.g. libsvml.so) if I do not put intel library path on
> > library search path ($LD_LIBRARY_PATH). I checked the open mpi user
> > archive, it seems only some kind user mentioned to use
> > "-i-static" (in my case) or "-static-intel" in ldflags, this is what
> > I did, but it seems not working, and I did not get any confirmation
> > whether or not this works for anyone else from the user archive.
> > could anyone help me on this? thanks!
>
> Is it Open MPI's executables that require the intel shared libraries
> at run time, or your application? Keep in mind the difference:
>
> 1. Compile/link flags that you specify to OMPI's configure script are
> used to compile/link Open MPI itself (including executables such as
> mpirun).
>
> 2. mpicc (and friends) use a similar-but-different set of flags to
> compile and link MPI applications. Specifically, we try to use the
> minimal set of flags necessary to compile/link, and let the user
> choose to add more flags if they want to. See this FAQ entry for more
> details:
>
> http://www.open-mpi.org/faq/?category=mpi-apps#override-wrappers-after-v1.0
>
> > (2) After compiling and linking our in-house codes with open mpi
> > 1.4.3, we want to make a minimal list of executables for our codes
> > with some from open mpi 1.4.3 installation, without any dependence
> > on external settings such as environment variables, etc.
> >
> > I organize my directory as follows:
> >
> > parent/
> >   package/
> >   bin/
> >   lib/
> >   tools/
> >
> > In package/ directory are executables from our codes. bin/ has
> > mpirun and orted, copied from openmpi installation. lib/ includes
> > open mpi libraries, and intel libraries. tools/ includes some
> > c-shell scripts to launch mpi jobs, which uses mpirun in bin/.
>
> FWIW, you can use the following OMPI options to configure to eliminate
> all the OMPI plugins (i.e., locate all that code up in libmpi and
> friends, vs. being standalone DSOs):
>
> --disable-shared --enable-static
>
> This will make libmpi.a (vs. libmpi.so and a bunch of plugins) which
> your application can statically link against. But it does make a
> larger executable. Alternatively, you can use:
>
> --disable-dlopen
>
> (instead of disable-shared/enable-static) which will make a giant
> libmpi.so (vs. libmpi.so and all the plugin DSOs). So your MPI app
> will still dynamically link against libmpi, but all the plugins will
> be physically located in libmpi.so vs. being dlopen'ed at run time.
>
> > The parent/ directory is on a NFS shared by all nodes of the
> > cluster. In ~/.bashrc (shared by all nodes too), I clear PATH and
> > LD_LIBRARY_PATH without pointing to any directory of the open mpi
> > 1.4.3 installation.
> >
> > First, if I set above bin/ directory to PATH and lib/ to
> > LD_LIBRARY_PATH in ~/.bashrc, our parallel codes (starting by the C
> > shell script in tools/) run AS EXPECTED without [...]
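For reference, hedged configure sketches of the two approaches Jeff describes above; the install prefix is a placeholder and the compiler variables follow the earlier post:

# Approach 1: static libmpi.a, no plugin DSOs:
./configure --prefix=/path/to/install CC=icc CXX=icpc F77=ifort FC=ifort --disable-shared --enable-static

# Approach 2: keep a shared libmpi.so, but fold all plugins into it:
./configure --prefix=/path/to/install CC=icc CXX=icpc F77=ifort FC=ifort --disable-dlopen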
Re: [OMPI users] intel compiler linking issue and issue of environment variable on remote node, with open mpi 1.4.3
Open MPI 1.4.3 + Intel Compilers V8.1 summary (in case someone wants to refer to it later):

(1) To make all Open MPI executables statically linked and independent of any dynamic libraries, the "--disable-shared" and "--enable-static" options should BOTH be forwarded to configure, and the "-i-static" option should be specified for the Intel compilers as well.

(2) It is confirmed that environment variables such as $PATH and $LD_LIBRARY_PATH can be forwarded to the slave nodes by specifying options to mpirun. However, mpirun invokes the orted daemon on the master and slave nodes, and the environment variables passed to the slave nodes via mpirun options do not take effect until after orted has started. So if the orted daemon itself needs these environment variables to run, the only way is to set them in a shared .bashrc or .profile file, visible to both master and slave nodes, say on a shared NFS partition. There seems to be no other way to resolve this kind of dependence.

Regards,
Yiguang
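For the record, a hedged sketch of the full configure line implied by point (1); the install prefix is a placeholder:

./configure --prefix=/path/to/install CC=icc CXX=icpc F77=ifort FC=ifort --disable-shared --enable-static LDFLAGS="-i-static" --with-wrapper-ldflags="-i-static"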
[OMPI users] mpirun does not propagate environment from master node to slave nodes
Hello All,

I installed Open MPI 1.4.3 on our new HPC blades with Infiniband interconnect. My system environment is:

1) uname -a output:
Linux gulftown 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

2) /home is mounted on all nodes, and mpirun is started under /home/...

Open MPI and the application codes are compiled with Intel(R) compilers V11. The Infiniband stack is Mellanox OFED 1.5.2.

I have two questions about mpirun:

a) How can I find out which network interconnect protocol is actually used by the MPI application? I specify "--mca btl openib,self,sm,tcp" to mpirun, but I want to make sure it really uses the Infiniband interconnect.

b) When I run mpirun, I get the following message:

== Quote begin

bash: orted: command not found
bash: orted: command not found
bash: orted: command not found
--
A daemon (pid 15120) died unexpectedly with status 127 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--
--
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.
--
--
mpirun was unable to cleanly terminate the daemons on the nodes shown below. Additional manual cleanup may be required - please refer to the "orte-clean" tool for assistance.
--
ibnode001 - daemon did not report back when launched
ibnode002 - daemon did not report back when launched
ibnode003 - daemon did not report back when launched

== Quote end

It seems orted is not found on the slave nodes. If I set PATH and LD_LIBRARY_PATH through the --prefix, --path, or -x options to mpirun, to make orted and the related dynamic libs available on the slave nodes, it does not work as expected from the mpirun manual page. The only working case is when I set PATH and LD_LIBRARY_PATH in ~/.bashrc for mpirun, with this .bashrc also sourced by the slave nodes for the login shell. I do not want to set PATH and LD_LIBRARY_PATH in ~/.bashrc; I want to pass options to mpirun directly.

Thanks,
Yiguang
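For reference, a hedged sketch of the invocation being attempted; the install path and the application binary are placeholders. Adding "--mca btl_base_verbose 30" is one way to get BTL selection information printed, assuming this build honors that verbosity parameter:

/opt/openmpi-1.4.3/bin/mpirun --prefix /opt/openmpi-1.4.3 --mca btl openib,self,sm --mca btl_base_verbose 30 -np 8 --host ibnode001,ibnode002,ibnode003 ./mpi_app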
Re: [OMPI users] mpirun does not propagate environment from master node to slave nodes
Thanks, Ralph!

a) Yes, I know I can restrict the run to IB with "--mca btl openib", but I just want to make sure I am using the IB interfaces. I am looking for an mpirun option that prints the actual interconnect protocol, like --prot for mpirun in MPICH2.

b) Yes, my default shell is bash, but I run a C-shell script from a bash terminal, and mpirun is invoked inside this C-shell script. I am using the rsh launcher, exactly as you guessed. I have tried different mpirun commands in the C-shell script; one of them is

/path/to/bin/mpirun --mca btl openib --app appfile

where mpirun and orted are under /path/to/bin and the necessary libs are under /path/to/lib. I tried -x, --prefix, and --path; none of them works as expected to propagate PATH and LD_LIBRARY_PATH, since orted is not found on the slave nodes, although it should be, as it is on the shared NFS partition.

Thanks,
Yiguang

On Jun 28, 2011, at 9:05 AM, yanyg_at_[hidden] wrote:

> Hello All,
>
> I installed Open MPI 1.4.3 on our new HPC blades, with Infiniband
> interconnection.
>
> My system environments are as:
>
> 1) uname -a output:
> Linux gulftown 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT
> 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> 2) /home is mounted over all nodes, and mpirun is started under
> /home/...
>
> Open MPI and application codes are compiled with intel(R)
> compilers V11. Infiniband stack is Mellanox OFED 1.5.2.
>
> I have two questions about mpirun:
>
> a) how could I get to know what is the network interconnect
> protocol used by the MPI application?
>
> I specify "--mca btl openib,self,sm,tcp" to mpirun, but I want to
> make sure it really uses infiniband interconnect.

Why specify tcp if you don't want it used? Just leave that off and it will have no choice but to use IB.

> b) when I run mpirun, I get the following message:
> It seems orted is not found on slave nodes. If I set the PATH and
> LD_LIBRARY_PATH through --prefix to mpirun, or --path, or -x
> options to mpirun, to make the orted and related dynamic libs
> available on slave nodes, it does not work as expected from mpirun
> manual page. The only working case is that I set PATH and
> LD_LIBRARY_PATH in ~/.bashrc for mpirun, and this .bashrc is
> invoked by slave nodes too for login shell. I do not want to set PATH
> and LD_LIBRARY_PATH in ~/.bashrc, but instead to set options to
> mpirun directly.

Should work with either prefix or -x options, assuming the right syntax with the latter. I take it your default shell is bash, and that you are using the rsh launcher (as opposed to something like torque)? Are you launching from your default shell, or did you perhaps change shell? Can you send the actual mpirun command you typed?
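For reference, a hedged sketch of the -x syntax Ralph refers to, applied to the command above; paths are placeholders:

/path/to/bin/mpirun -x PATH -x LD_LIBRARY_PATH --mca btl openib --app appfile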
Re: [OMPI users] mpirun does not propagate environment from master node to slave nodes
Thanks, Ralph. Your information is very deep and detailed.

I tried your suggestion to set "-mca plm_rsh_assume_same_shell 0", but it still does not work. My situation is that we start a C-shell script from a bash shell, which in turn invokes mpirun onto the slave nodes. These slave nodes have bash as their default login shell, and mpirun executes another C-shell script on each node. Could this mess things up a little and be related to the missing-orted message?

Thanks again,
Yiguang

On Jun 28, 2011, at 3:52 PM, yanyg_at_[hidden] wrote:

I looked a little deeper into this. I keep forgetting that we changed our default settings a few years ago. In the dim past, OMPI would always probe the remote node to find out what shell it was using, and then use the proper command syntax for that shell. However, people complained about the extra time during launch, and very, very few people actually used mismatched shells. So we changed the setting the other way, to default to assuming the remote shell is the same as the local one.

For those like yourself who actually do have a mismatch, we left a parameter you can set to override that assumption. Just add "-mca plm_rsh_assume_same_shell 0" to your mpirun cmd line and it should resolve the problem.
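For reference, a hedged sketch of the earlier command line with Ralph's override added; paths are placeholders:

/path/to/bin/mpirun -mca plm_rsh_assume_same_shell 0 --mca btl openib --app appfile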
[OMPI users] MPI_Reduce error over Infiniband or TCP
Dear all,

We are testing Open MPI over Infiniband, and we get an MPI_Reduce error message when we run our codes over either the TCP or the Infiniband interface, as follows:

---
[gulftown:25487] *** An error occurred in MPI_Reduce
[gulftown:25487] *** on communicator MPI COMMUNICATOR 3 CREATE FROM 0
[gulftown:25487] *** MPI_ERR_ARG: invalid argument of some other kind
[gulftown:25487] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
Elapsed time: 6:33.78
--
mpirun has exited due to process rank 0 with PID 25428 on node gulftown exiting without calling "finalize". This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).
--
---

Any hints?

Thanks,
Yiguang
Re: [OMPI users] mpirun does not propagate environment from master node to slave nodes
Thanks, Ralph.

*** quote begin ***

Let me get this straight. You are executing mpirun from inside a C-shell script, launching onto nodes where you will by default be running bash. The param I gave you should support that mode - it basically tells OMPI to probe the remote node to discover what shell it will run under there, and then formats the orted cmd line accordingly. If that isn't working (and it almost never gets used, so may have bit-rotted), then your only option is to convert the C-shell to bash.

However, you are saying that the app you are asking us to run is a C-shell script??? Have you included the #!/bin/csh directive at the top of that file so the system will automatically exec it using csh?

Note that the orted comes alive and runs prior to your "app" being executed, so the fact that your "app" is a C-shell script is irrelevant.

*** quote end ***

You described my case exactly, and I agree with you that the app being a C-shell script should not matter here. I checked that I do have the #!/bin/csh at the head of the C-shell scripts. I guess I will have to rewrite the C-shell script in bash to solve this issue completely, although that is not easy.

Thanks again,
Yiguang
[OMPI users] Error-Open MPI over Infiniband: polling LP CQ with status LOCAL LENGTH ERROR
Hi all,

The message says:

[[17549,1],0][btl_openib_component.c:3224:handle_wc] from gulftown to: gulftown error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for wr_id 492359816 opcode 32767 vendor error 105 qp_idx 3

This is very arcane to me. The same test ran fine with only one MPI process on each node, but when we switch to two MPI processes on each node, this error message comes up. Is there anything I can do? Is anything related to the Infiniband configuration, as guessed from the string "vendor error 105 qp_idx 3"?

Thanks,
Yiguang
Re: [OMPI users] Error-Open MPI over Infiniband: polling LP CQ with status LOCAL LENGTH ERROR
Hi Yevgeny,

Thanks. Here is the output of /usr/bin/ibv_devinfo:

hca_id: mlx4_0
        transport:              InfiniBand (0)
        fw_ver:                 2.8.000
        node_guid:              0002:c903:0010:a85a
        sys_image_guid:         0002:c903:0010:a85d
        vendor_id:              0x02c9
        vendor_part_id:         26428
        hw_ver:                 0xB0
        board_id:               HP_016009
        phys_port_cnt:          2
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         1
                        port_lid:       1
                        port_lmc:       0x00
                        link_layer:     IB
                port:   2
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         1
                        port_lid:       6
                        port_lmc:       0x00
                        link_layer:     IB

Each node has an HCA card with two active ports. The network controller is an MT26428 (09:00.0).

I am running Open MPI 1.4.3; the command line is:

/path/to/mpirun -mca btl_openib_warn_default_gid_prefix 0 --mca btl openib,self -app appfile

Thanks again,
Yiguang

On 10 Jul 2011, at 9:55, Yevgeny Kliteynik wrote:

> Hi Yiguang,
>
> On 08-Jul-11 4:38 PM, ya...@adina.com wrote:
> > Hi all,
> >
> > The message says :
> >
> > [[17549,1],0][btl_openib_component.c:3224:handle_wc] from
> > gulftown to: gulftown error polling LP CQ with status LOCAL
> > LENGTH ERROR status number 1 for wr_id 492359816 opcode
> > 32767 vendor error 105 qp_idx 3
> >
> > This is very arcane to me, the same test ran when only one MPI
> > process on each node, but when we switch to two MPI processes
> > on each node, then this error message comes up. Anything I could do?
> > Anything related to infiniband configuration, as guessed from the
> > string "vendor error 105 qp_idx 3"?
>
> What OMPI version are you using and what kind of HCAs do you have? You
> can get details about HCA with "ibv_devinfo" command. Also, can you
> post here all the OMPI command line parameters that you use when you
> run your test?
>
> Thanks.
>
> -- YK
>
> > Thanks,
> > Yiguang
[OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
Hi all,

Good morning! I am having trouble communicating through the sm btl in Open MPI; please check the attached file for my system information. I am using Open MPI 1.4.3 and Intel compilers V11.1, on Linux RHEL 5.4 with kernel 2.6. The tests are the following:

(1) If I specify the btl to mpirun with "--mca btl self,sm,openib" and do not list any of my computing nodes twice or more in the node list, my job runs fine. However, if I list any of the computing nodes twice or more in the node list, the job hangs forever.

(2) If I do not specify the sm btl to mpirun, as in "--mca btl self,openib", my job runs smoothly whether or not any of the computing nodes appears twice or more in the node list.

From the above two tests, apparently something is wrong with the sm btl interface on my system. As I checked the user archive, an sm btl issue has been encountered due to comm_spawned parent/child processes. But that seems not to be the case here: even if I do not use any of my MPI-based solvers, with only the MPI initialization and finalization procedures called, the issue still occurs.

Any comments?

Thanks,
Yiguang

[Attachment: ompiinfo-config-uname-output.tgz, 9 Feb 2012, 126316 bytes]
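For reference, hedged sketches of the two cases being compared; host names and the test binary are placeholders:

# Case (1): hangs when a host appears more than once in the node list:
mpirun --mca btl self,sm,openib -np 4 --host node1,node1,node2,node2 ./mpi_app

# Case (2): works with or without repeated hosts:
mpirun --mca btl self,openib -np 4 --host node1,node1,node2,node2 ./mpi_app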
Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
Hi Jeff,

Thank you very much for your help! I tried to run the same ring_c test from the standard examples in the Open MPI 1.4.3 distribution. If I run it as you described, from the command line, it works without any problem with the sm btl included (with --mca btl self,sm,openib). However, if I use the sm btl (with --mca btl self,sm,openib) and run ring_c from an in-house script, it shows the same issue as I described in my previous email: it hangs at the MPI_Init(...) call. I think this issue is related to some environment setting in the script. Do you have any hints? Is there any prerequisite system environment configuration for the sm btl layer in Open MPI to work?

Thanks again,
Yiguang
Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
Hi Jeff,

The command "env | grep OMPI" outputs nothing but a blank line from my script. Is there anything I should set for mpirun?

On the other hand, you may recall that I found you discussed a similar issue with Jonathan Dursi. The difference is that when I tried "--mca btl_sm_num_fifos #(np-1)", it did not work for me. I did find the files that sm mmaped in the tmp directory (shared_mem_pool.ibnode001, etc.), but for some mysterious reason it hangs at MPI_Init. Are these files created when we call MPI_Init?

Thanks,
Yiguang
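For reference, a hedged sketch of the FIFO workaround mentioned above, for np=4 (so np-1=3); hosts and the binary are placeholders:

mpirun --mca btl self,sm,openib --mca btl_sm_num_fifos 3 -np 4 --host node1,node1,node2,node2 ./mpi_app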
Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
Hi Ralph,

Could you please tell me which OMPI envars are broken, or which OMPI envars should be present for OMPI to work properly? Although I start my C-shell script from a bash command line (not sure if this matters), I only add the Open MPI executable and lib paths to $PATH and $LD_LIBRARY_PATH; as far as I can tell, no other OMPI environment variables are set on my system (in bash or csh).

Thanks,
Yiguang
Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
Yes, in short, I start a C-shell script from a bash command line, in which I mpirun another C-shell script that starts the computing processes. The only OMPI-related envars are PATH and LD_LIBRARY_PATH. Are there any other OMPI envars I should set?
Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
> No, there are no others you need to set. Ralph's referring to the fact
> that we set OMPI environment variables in the processes that are
> started on the remote nodes.
>
> I was asking to ensure you hadn't set any MCA parameters in the
> environment that could be creating a problem. Do you have any set in
> files, perchance?
>
> And can you run "env | grep OMPI" from the script that you invoked via
> mpirun?
>
> So just to be clear on the exact problem you're seeing:
>
> - you mpirun on a single node and all works fine
> - you mpirun on multiple nodes and all works fine (e.g., mpirun --host
> a,b,c your_executable)
> - you mpirun on multiple nodes and list a host more than once and it
> hangs (e.g., mpirun --host a,a,b,c your_executable)
>
> Is that correct?
>
> If so, can you attach a debugger to one of the hung processes and see
> exactly where it's hung? (i.e., get the stack traces)
>
> Per a question from your prior mail: yes, Open MPI does create mmapped
> files in /tmp for use with shared memory communication. They *should*
> get cleaned up when you exit, however, unless something disastrous
> happens.

Thank you very much! Now I am clearer about what Ralph asked. Yes, what you described is exactly what happens with the sm btl layer. As I double-checked, the problem is that when I use the sm btl for MPI communication on the same host (as --mca btl openib,sm,self), the issue comes up as you described: everything runs well on a single node, everything runs well on multiple distinct nodes, but it hangs at the MPI_Init() call if I run on multiple nodes and list a host more than once. However, if I instead use the tcp or openib btl without the sm layer (as --mca btl openib,self), all three cases run just fine. I do set the MCA parameters "plm_rsh_agent" to "rsh:ssh" and "btl_openib_warn_default_gid_prefix" to 0 in all cases, with or without the sm btl layer.
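For reference, one hedged way to collect the stack traces Jeff asks about above, assuming gdb is available on the node; <pid> stands for the process ID of a hung rank:

gdb -p <pid>
(gdb) thread apply all bt
(gdb) detach
(gdb) quit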
The OMPI environment variables set for each process are quoted below (as output by "env | grep OMPI" in my script invoked by mpirun):

--
//process #0:
OMPI_MCA_plm_rsh_agent=rsh:ssh
OMPI_MCA_btl_openib_warn_default_gid_prefix=0
OMPI_MCA_btl=openib,sm,self
OMPI_MCA_orte_precondition_transports=3a07553f5dca58b5-21784eac1fc85294
OMPI_MCA_orte_local_daemon_uri=195559424.0;tcp://198.177.146.70:53997;tcp://10.10.10.4:53997;tcp://172.23.10.1:53997;tcp://172.33.10.1:53997
OMPI_MCA_orte_hnp_uri=195559424.0;tcp://198.177.146.70:53997;tcp://10.10.10.4:53997;tcp://172.23.10.1:53997;tcp://172.33.10.1:53997
OMPI_MCA_mpi_yield_when_idle=0
OMPI_MCA_orte_app_num=0
OMPI_UNIVERSE_SIZE=4
OMPI_MCA_ess=env
OMPI_MCA_orte_ess_num_procs=4
OMPI_COMM_WORLD_SIZE=4
OMPI_COMM_WORLD_LOCAL_SIZE=2
OMPI_MCA_orte_ess_jobid=195559425
OMPI_MCA_orte_ess_vpid=0
OMPI_COMM_WORLD_RANK=0
OMPI_COMM_WORLD_LOCAL_RANK=0

//process #1:
OMPI_MCA_plm_rsh_agent=rsh:ssh
OMPI_MCA_btl_openib_warn_default_gid_prefix=0
OMPI_MCA_btl=openib,sm,self
OMPI_MCA_orte_precondition_transports=3a07553f5dca58b5-21784eac1fc85294
OMPI_MCA_orte_local_daemon_uri=195559424.0;tcp://198.177.146.70:53997;tcp://10.10.10.4:53997;tcp://172.23.10.1:53997;tcp://172.33.10.1:53997
OMPI_MCA_orte_hnp_uri=195559424.0;tcp://198.177.146.70:53997;tcp://10.10.10.4:53997;tcp://172.23.10.1:53997;tcp://172.33.10.1:53997
OMPI_MCA_mpi_yield_when_idle=0
OMPI_MCA_orte_app_num=1
OMPI_UNIVERSE_SIZE=4
OMPI_MCA_ess=env
OMPI_MCA_orte_ess_num_procs=4
OMPI_COMM_WORLD_SIZE=4
OMPI_COMM_WORLD_LOCAL_SIZE=2
OMPI_MCA_orte_ess_jobid=195559425
OMPI_MCA_orte_ess_vpid=1
OMPI_COMM_WORLD_RANK=1
OMPI_COMM_WORLD_LOCAL_RANK=1

//process #3:
OMPI_MCA_plm_rsh_agent=rsh:ssh
OMPI_MCA_btl_openib_warn_default_gid_prefix=0
OMPI_MCA_btl=openib,sm,self
OMPI_MCA_orte_precondition_transports=3a07553f5dca58b5-21784eac1fc85294
OMPI_MCA_orte_daemonize=1
OMPI_MCA_orte_hnp_uri=195559424.0;tcp://198.177.146.70:53997;tcp://10.10.10.4:53997;tcp://172.23.10.1:53997;tcp://172.33.10.1:53997
OMPI_MCA_ess=env
OMPI_MCA_orte_ess_jobid=195559425
OMPI_MCA_orte_ess_vpid=3
OMPI_MCA_orte_ess_num_procs=4
OMPI_MCA_orte_local_daemon_uri=195559424.1;tcp://198.177.146.71:53290;tcp://10.10.10.1:53290;tcp://172.23.10.2:53290;tcp://172.33.10.2:53290
OMPI_MCA_mpi_yield_when_idle=0
OMPI_MCA_orte_app_num=3
OMPI_UNIVERSE_SIZE=4
OMPI_COMM_WORLD_SIZE=4
OMPI_COMM_WORLD_LOCAL_SIZE=2
OMPI_COMM_WORLD_RANK=3
OMPI_COMM_WORLD_LOCAL_RANK=1

//process #2:
OMPI_MCA_plm_rsh_agent=rsh:ssh
OMPI_MCA_btl_openib_warn_default_gid_prefix=0
OMPI_MCA_btl=openib,sm,self
OMPI_MCA_orte_precondition_transports=3a07553f5dca58b5-21784eac1fc85294
OMPI_MCA_orte_daemonize=1
OMPI_MCA_orte_hnp_uri=195559424.0;tcp://198.177.146.70:53997;tcp://10.10.10.4:53997;tcp://172.23.10.1:53997;tcp://172.33.10.1:53997
OMPI_MCA_ess=env
OMPI_MCA_orte_ess_jobid=195559425
OMPI_MCA_orte_ess_vpid=2
OMPI_MCA_orte_ess_num_procs=4
OMPI_MCA_orte_local_daemon_uri=195559424.1;tcp://198.177.146.71:53290;tcp://10.10.10.1:53290;tcp://172.23.10.2:53290;tcp://172.33.10.2:53290
OMPI_MCA_mpi_yield_when_idle=0
OMPI_MCA_orte_app_[...]
Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
> So the real issue is: the sm BTL is not working for you.

Yes.

> What version of Open MPI are you using?

It is 1.4.3 I am using.

> Can you rm -rf any Open MPI directories that may be left over in /tmp?

Yes, I have tried that. The cleanup does not help to make the sm btl work.
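For reference, a hedged sketch of the /tmp cleanup referred to above; the session-directory name pattern is an assumption about this installation, so check what is actually in /tmp first:

# Leftover Open MPI session directories are assumed to match this pattern:
rm -rf /tmp/openmpi-sessions-*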
Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
OK, with Jeff's kind help, I solved this issue in a very simple way. I would now like to report back the reason for this issue and the solution.

(1) The scenario under which this issue happened:

In my OMPI environment, the $TMPDIR envar is set to a different scratch directory for each MPI process, even for MPI processes running on the same host. This causes no trouble if we use the openib, self, or tcp btl layers for communication. However, if we use the sm btl layer, then, as Jeff said:

"""
Open MPI creates its shared memory files in $TMPDIR. It implicitly expects all shared memory files to be found under the same $TMPDIR for all procs on a single machine.

More specifically, Open MPI creates what we call a "session directory" under $TMPDIR that is an implicit rendezvous point for all processes on the same machine. Some meta data is put in there, to include the shared memory mmap files.

So if the different processes have a different idea of where the rendezvous session directory exists, they'll end up blocking waiting for others to show up at their (individual) rendezvous points... but that will never happen, because each process is waiting at their own rendezvous point.
"""

So in this case, the MPI processes sharing data through shared memory block and wait on each other, the wait is never released, and hence the hang at the MPI_Init call.

(2) Solution to this issue:

You may set $TMPDIR to the same directory on the same host if possible; or you can setenv OMPI_PREFIX_ENV to a common directory for the MPI processes on the same host while keeping your $TMPDIR setting. Either way is verified and works fine for me!

Thanks,
Yiguang
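For reference, a hedged csh sketch of option (2); the common directory name is a placeholder:

# Keep the per-process $TMPDIR, but give all ranks on a host a common
# root for Open MPI's rendezvous session directory:
setenv OMPI_PREFIX_ENV /tmp/ompi-common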
[OMPI users] orted daemon no found! --- environment not passed to slave nodes?
Greetings!

I have tried to run the ring_c example test from a bash script. In this bash script, I set up PATH and LD_LIBRARY_PATH (I do not want to disturb ~/.bashrc, etc.), then use the full path of mpirun to invoke the MPI processes; mpirun and orted are both on the PATH. However, according to the Open MPI message, orted was not found; to me, it was not found only on the slave nodes. I then tried setting --prefix, or -x PATH -x LD_LIBRARY_PATH, hoping these envars would be passed to the slave nodes, but it turned out they are not forwarded.

On the other hand, if I set the same PATH and LD_LIBRARY_PATH in a ~/.bashrc shared by all nodes, mpirun from the bash script runs fine and orted is found. This is easy to understand, but I really do not want to change ~/.bashrc. It seems the non-interactive bash shell does not pass the envars to the slave nodes.

Any comments and solutions?

Thanks,
Yiguang
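Pulling together the remedies discussed earlier in this thread, a hedged sketch that combines them; the install path, hosts, and process count are placeholders for the actual setup:

/path/to/openmpi-1.4.3/bin/mpirun --prefix /path/to/openmpi-1.4.3 -x PATH -x LD_LIBRARY_PATH -mca plm_rsh_assume_same_shell 0 -np 4 --host node1,node2 ./ring_c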