Re: [OMPI users] MPIRUN + Environment Variable
On 09/29/11 20:54, Xin Tong wrote:
> I need to set up some environment variables before I run my application (appA). I am currently using "mpirun -np 1 -host socrates appA" (socrates is another machine). Before appA runs, it expects some environment variables to be set up. How do I do that?

% man mpirun
...
To manage files and runtime environment:
...
   -x      Export the specified environment variables to the remote nodes
           before executing the program. Only one environment variable can be
           specified per -x option. Existing environment variables can be
           specified, or new variable names specified with corresponding
           values. For example:
               % mpirun -x DISPLAY -x OFILE=/tmp/out ...

           The parser for the -x option is not very sophisticated; it does
           not even understand quoted values. Users are advised to set
           variables in the environment, and then use -x to export (not
           define) them.
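(Not part of the original thread: a minimal C sketch of how a variable exported with -x becomes visible to the launched process on the remote node. The variable name MY_APP_CONFIG is hypothetical, chosen only for illustration.)

/* Hypothetical sketch: appA reads a variable that was exported via
 *   mpirun -np 1 -host socrates -x MY_APP_CONFIG appA
 * The variable name MY_APP_CONFIG is made up for this example. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* visible in the remote process because -x exported it */
    const char *cfg = getenv("MY_APP_CONFIG");
    printf("MY_APP_CONFIG = %s\n", cfg ? cfg : "(not set)");

    MPI_Finalize();
    return 0;
}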
Re: [OMPI users] VampirTrace integration with VT_GNU_NMFILE environment variable
Hello,

first, please note that the VampirTrace versions integrated in Open MPI v1.5.x and v1.4.x differ, and so do the names of the environment variable for specifying a pre-created symbol list:

Open MPI v1.4.x: VT_NMFILE
Open MPI v1.5.x: VT_GNU_NMFILE

Furthermore, make sure that the environment variable is exported to *all* MPI tasks. To do so, add the '-x' option to your mpirun command:

mpirun -x VT_GNU_NMFILE ...

Regards,
Matthias

On Monday 26 September 2011 3:19:21 you wrote:
> According to the VampirTrace documentation, it is possible to create a symbol list file in advance and set the name of the file in the environment variable VT_GNU_NMFILE. For example, you might do this:
>
> $ nm hello > hello.nm
> $ export VT_GNU_NMFILE="hello.nm"
>
> I have set up a symbol list file as above (with the full path name, of course), but when I run my VT-instrumented program (via mpirun) it appears to ignore the VT_GNU_NMFILE environment variable and runs "nm" automatically on startup (the default behavior). This can be a time-consuming process, so I would prefer to use the pre-created symbol list file.
>
> Can anyone confirm whether the VT_GNU_NMFILE environment variable is supported with the Open MPI integration?
>
> Thanks,
> Rocky
Re: [OMPI users] Role of ethernet interfaces in startup of an Open MPI job using IB
Thanks for the prompt reply!

On Sep 27, 2011, at 6:35 AM, Salvatore Podda wrote:
> We would like to know if the ethernet interfaces play any role in the startup phase of an Open MPI job using InfiniBand. In this case, where can we find some literature on this topic?

Unfortunately, there's not a lot of documentation about this other than people asking questions on this list.

> For the above reason, does anyone on the list know the order/ranking by which the ethernet interfaces will be queried in the case of multiple ones? And which are the rules?
>
> Regards
> Salvatore Podda

IP is used by default during Open MPI startup. Specifically, it is used as our "out of band" communication channel for things like stdin/stdout/stderr redirection, launch command relaying, process control, etc. The OOB channel is also used by default for bootstrapping IB queue pairs. To clarify, note that these are two different things:

1. the out-of-band (OOB) channel used for process control, std* routing, etc.
2. bootstrapping IB queue pairs

You can change the IB QP bootstrapping to use the OpenFabrics RDMA communications manager (vs. our OOB channel) with the following:

mpirun --mca btl_openib_cpc_include rdmacm ...

See if that helps (although the OF RDMA CM has its own scalability issues, also associated with ARP). If your cluster is large, you might want to check out the section of our FAQ about large clusters:

http://www.open-mpi.org/faq/?category=large-clusters

I don't think there's an entry there yet about this, but it may also be worthwhile to try enabling the "radix" support, a more scalable version of our OOB channel (i.e., the tree across all the support daemons has a much larger radix and is therefore much flatter). Los Alamos recently committed an IB UD OOB channel plugin to our development trunk and is comparing its performance to the radix tree to see if it's worthwhile.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] Role of ethernet interfaces in startup of an Open MPI job using IB
On Sep 30, 2011, at 6:29 AM, Salvatore Podda wrote:
> For the above reason, does anyone on the list know the order/ranking by which the ethernet interfaces will be queried in the case of multiple ones? And which are the rules?

They're all used equally.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
On Sep 28, 2011, at 5:02 PM, Blosch, Edwin L wrote:
> ./configure --prefix=/release/cfd/openmpi-intel --without-tm --without-sge --without-lsf --without-psm --without-portals --without-elan --without-slurm --without-loadleveler --without-libnuma --enable-mpirun-prefix-by-default --enable-contrib-no-build=vt --enable-mca-no-build=maffinity --disable-per-user-config-files --disable-io-romio --enable-static --disable-shared --without-openib CXX=/appserv/intel/cce/10.1.021/bin/icpc CC=/appserv/intel/cce/10.1.021/bin/icc 'CFLAGS= -O2' 'CXXFLAGS= -O2' F77=/appserv/intel/fce/10.1.021/bin/ifort 'FFLAGS=-D_GNU_SOURCE -traceback -O2' FC=/appserv/intel/fce/10.1.021/bin/ifort 'FCFLAGS=-D_GNU_SOURCE -traceback -O2' 'LDFLAGS= -static-intel'

The weird thing here is that I am unable to replicate this issue. :-\

I thought that if I tried essentially the same configure line as above, I should see the same issue, because I have libnuma.so and no libnuma.a. But it worked fine (i.e., OMPI built and installed fine, and I'm able to compile/link MPI applications just fine). Huh.

> The error messages upon linking the application are unchanged:
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_alloc_membind':
>> topology-linux.c:(.text+0x1da): undefined reference to `mbind'
>
> Re: NUMA: It appears there is a /usr/lib64/libnuma.so but no static version. There is /usr/include/numa.h and /usr/include/numaif.h.
>
> I don't understand about make V=1. What tree? Somewhere in the OpenMPI build, or in the application compilation itself? Is "V=1" something in the OpenMPI makefile structure?

Sorry, "make V=1" is part of OMPI's build system. If you "make V=1" in the v1.5 (and later) OMPI, it'll show you the whole compile line instead of the abbreviated output.

> Thanks,
>
> Ed
>
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
> Sent: Wednesday, September 28, 2011 11:05 AM
> To: Open MPI Users
> Subject: EXTERNAL: Re: [OMPI users] Unresolved reference 'mbind' and 'get_mempolicy'
>
> Yowza; that sounds like a configury bug. :-(
>
> What line were you using to configure Open MPI? Do you have libnuma installed? If so, do you have the .h and .so files? Do you have the .a file?
>
> Can you send the last few lines of output from a failed "make V=1" in that tree? (it'll show us the exact commands used to compile/link, etc.)
>
> On Sep 28, 2011, at 11:55 AM, Blosch, Edwin L wrote:
>
>> I am getting some undefined references in building OpenMPI 1.5.4 and I would like to know how to work around it.
>>
>> The errors look like this:
>>
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_alloc_membind':
>> topology-linux.c:(.text+0x1da): undefined reference to `mbind'
>> topology-linux.c:(.text+0x213): undefined reference to `mbind'
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_set_area_membind':
>> topology-linux.c:(.text+0x414): undefined reference to `mbind'
>> topology-linux.c:(.text+0x46c): undefined reference to `mbind'
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_get_thisthread_membind':
>> topology-linux.c:(.text+0x4ff): undefined reference to `get_mempolicy'
>> topology-linux.c:(.text+0x5ff): undefined reference to `get_mempolicy'
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_set_thisthread_membind':
>> topology-linux.c:(.text+0x7b5): undefined reference to `migrate_pages'
>> topology-linux.c:(.text+0x7e9): undefined reference to `set_mempolicy'
>> topology-linux.c:(.text+0x831): undefined reference to `set_mempolicy'
>> make: *** [main] Error 1
>>
>> Some configure output that is probably relevant:
>>
>> checking numaif.h usability... yes
>> checking numaif.h presence... yes
>> checking for numaif.h... yes
>> checking for set_mempolicy in -lnuma... yes
>> checking for mbind in -lnuma... yes
>> checking for migrate_pages in -lnuma... yes
>>
>> The FAQ says that I should have to give --with-libnuma explicitly, but I did not do that. Is there a problem with configure? Or the FAQ? Or perhaps the system has a configuration peculiarity?
>>
>> On another system, the configure output is different, and there are no unresolved references:
>>
>> checking numaif.h usability... no
>> checking numaif.h presence... no
>> checking for numaif.h... no
>>
>> What is the configure option that will make the unresolved references go away?
>>
>> Thanks,
>>
>> Ed

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
On Sep 29, 2011, at 12:45 PM, Blosch, Edwin L wrote:
> If I add --without-hwloc in addition to --without-libnuma, then it builds. Is that a reasonable thing to do? Is there a better workaround? This 'hwloc' module looks like it might be important.

As a note of explanation: hwloc is effectively our replacement for libnuma. You might want to check out hwloc (the standalone software package) -- it has a CLI and is quite useful for administering servers, even outside of an HPC environment:

http://www.open-mpi.org/projects/hwloc/

hwloc may use libnuma under the covers; that's where this issue is coming from (i.e., OMPI may still use libnuma -- it's just now doing so indirectly instead of directly).

> For what it's worth, if there's something wrong with my configure line, let me know what to improve. Otherwise, as weird as "--enable-mca-no-build=maffinity --disable-io-romio --enable-static --disable-shared" may look, I am not trying to build fully static binaries. I have an unavoidable need to build OpenMPI on certain machines and then transfer the executables to other machines that are compatible but not identical, and over the years these have proven to be the minimal set of configure flags necessary to make that possible. I may revisit these choices at some point, but if they are supposed to work, then I'd rather just keep using them.

Your configure line looks fine to me.

FWIW/heads up: in the 1.7 series, we're going to be ignoring the $F77 and $FFLAGS variables; we'll *only* be using $FC and $FCFLAGS. There's still plenty of time before this hits mainstream, but I figured I'd let you know it's coming. :-)

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
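(Not from the original thread: alongside the CLI tools mentioned above, the standalone hwloc package also exposes a C API. A minimal sketch, assuming an hwloc 1.x installation is available and the program is linked with -lhwloc:)

/* Minimal sketch of the standalone hwloc C API (hwloc 1.x assumed),
 * shown for illustration only; the same library underlies Open MPI's
 * processor/memory affinity support. */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;

    hwloc_topology_init(&topology);   /* allocate a topology object   */
    hwloc_topology_load(topology);    /* discover the current machine */

    int cores = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);
    int pus   = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);
    printf("cores: %d, hardware threads: %d\n", cores, pus);

    hwloc_topology_destroy(topology);
    return 0;
}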
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
I think the issue here is that it's linking the *MPI application* that is causing the problem. Is that right?

If so, can you send your exact application compile line, and the output of that compile line with "--showme" at the end?

On Sep 29, 2011, at 4:24 PM, Brice Goglin wrote:
> On 28/09/2011 23:02, Blosch, Edwin L wrote:
>> Jeff,
>>
>> I've tried it now adding --without-libnuma. Actually that did NOT fix the problem, so I can send you the full output from configure if you want, to understand why this "hwloc" function is trying to use a function which appears to be unavailable.
>
> This function is likely available... in the dynamic version of libnuma (that's why configure is happy), but make is probably trying to link with the static version, which isn't available on your machine. That's my guess, at least.
>
>> I don't understand about make V=1. What tree? Somewhere in the OpenMPI build, or in the application compilation itself? Is "V=1" something in the OpenMPI makefile structure?
>
> Instead of doing
>   ./configure ...
>   make
> do
>   ./configure ...
>   make V=1
>
> It will make the output more verbose. Once you get the failure, please send the last 15 lines or so. We will look at these verbose lines to understand how things are being compiled (which linker flags, which libraries, ...).
>
> Brice

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
Thank you for all this information. Your diagnosis is totally right. I actually sent e-mail yesterday but apparently it never got through :<

It IS the MPI application that is failing to link, not OpenMPI itself; my e-mail was not well written; sorry Brice.

The situation is this: I am trying to compile using an OpenMPI 1.5.4 that was built to be rooted in /release, but it is not placed there yet (testing); it is currently under /builds/release. I have set OPAL_PREFIX in the environment, with the intention of helping the compiler wrappers work right. Under /release, I currently have OpenMPI 1.4.3, whereas the OpenMPI under /builds/release is 1.5.4.

What I am getting is this: the mpif90 wrapper (under /builds/release/openmpi/bin) puts -I/release instead of -I/builds/release, but it includes -L/builds/release. So I'm getting the headers from 1.4.3 when compiling, but the libmpi from 1.5.4 when linking.

I did a quick "move 1.4.3 out of the way and put 1.5.4 over to /release where it belongs" test, and my application did link without errors, so I think that confirms the nature of the problem.

Is it a bug that mpif90 didn't pay attention to OPAL_PREFIX in the -I but did use it in the -L?

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Friday, September 30, 2011 7:04 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'

I think the issue here is that it's linking the *MPI application* that is causing the problem. Is that right?

If so, can you send your exact application compile line, and the output of that compile line with "--showme" at the end?

On Sep 29, 2011, at 4:24 PM, Brice Goglin wrote:
> On 28/09/2011 23:02, Blosch, Edwin L wrote:
>> Jeff,
>>
>> I've tried it now adding --without-libnuma. Actually that did NOT fix the problem, so I can send you the full output from configure if you want, to understand why this "hwloc" function is trying to use a function which appears to be unavailable.
>
> This function is likely available... in the dynamic version of libnuma (that's why configure is happy), but make is probably trying to link with the static version, which isn't available on your machine. That's my guess, at least.
>
>> I don't understand about make V=1. What tree? Somewhere in the OpenMPI build, or in the application compilation itself? Is "V=1" something in the OpenMPI makefile structure?
>
> Instead of doing
>   ./configure ...
>   make
> do
>   ./configure ...
>   make V=1
>
> It will make the output more verbose. Once you get the failure, please send the last 15 lines or so. We will look at these verbose lines to understand how things are being compiled (which linker flags, which libraries, ...).
>
> Brice

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI users] Open MPI process cannot do send-receive message correctly on a distributed memory cluster
Hi,

I have an Open MPI program which works well on a Linux shared-memory multicore (2 x 6 cores) machine.

But it does not work well on a distributed cluster with Linux Open MPI. I found that a process sends out some messages to other processes, which cannot receive them.

What is the possible reason? I did not change anything in the program.

Any help is really appreciated.

Thanks
Re: [OMPI users] Open MPI process cannot do send-receive message correctly on a distributed memory cluster
You can use a debugger (just gdb will do, no TotalView needed) to find out which MPI send & receive calls are hanging the code on the distributed cluster, and see if the send & receive pair is hitting a problem described at:

Deadlock avoidance in your MPI programs:
http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html

Rayson

=================================
Grid Engine / Open Grid Scheduler
http://gridscheduler.sourceforge.net

Wikipedia Commons
http://commons.wikimedia.org/wiki/User:Raysonho


On Fri, Sep 30, 2011 at 11:06 AM, Jack Bryan wrote:
> Hi,
>
> I have an Open MPI program which works well on a Linux shared-memory multicore (2 x 6 cores) machine.
>
> But it does not work well on a distributed cluster with Linux Open MPI. I found that a process sends out some messages to other processes, which cannot receive them.
>
> What is the possible reason? I did not change anything in the program.
>
> Any help is really appreciated.
>
> Thanks

==========================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
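(Not from the original thread: a minimal C sketch of the classic deadlock pattern the linked page describes, and one safe alternative. It assumes exactly two ranks and is for illustration only.)

/* Classic unsafe exchange: if both ranks call a blocking MPI_Send first,
 * each may wait for the other's MPI_Recv and the program hangs once the
 * message is too large for eager buffering.  MPI_Sendrecv avoids this by
 * letting the library pair the send and receive. */
#include <mpi.h>
#include <stdio.h>

#define N (1 << 20)
static double sendbuf[N], recvbuf[N];

int main(int argc, char **argv)
{
    int rank, peer;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = 1 - rank;                       /* assumes exactly 2 ranks */

    /* Unsafe ordering (may deadlock on large messages):
     *   MPI_Send(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
     *   MPI_Recv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
     *
     * Safe alternative: */
    MPI_Sendrecv(sendbuf, N, MPI_DOUBLE, peer, 0,
                 recvbuf, N, MPI_DOUBLE, peer, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d: exchange complete\n", rank);
    MPI_Finalize();
    return 0;
}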
Re: [OMPI users] Open MPI process cannot do send-receive message correctly on a distributed memory cluster
Thanks,

I am using non-blocking MPI_Isend to send out messages and blocking MPI_Recv to get them. Each MPI_Isend uses a distinct buffer to hold its message, and the buffer is not changed until the message is received. Then the sender process waits for the MPI_Isend to finish.

Before this message is sent out, a header message (describing how much data and what data will be sent in the following MPI_Isend) is sent out in the same way; the header messages are received fine. Why can the following message (which has a larger size) not be received?

Any help is really appreciated.

> Date: Fri, 30 Sep 2011 11:33:16 -0400
> From: raysonlo...@gmail.com
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Open MPI process cannot do send-receive message correctly on a distributed memory cluster
>
> You can use a debugger (just gdb will do, no TotalView needed) to find out which MPI send & receive calls are hanging the code on the distributed cluster, and see if the send & receive pair is hitting a problem described at:
>
> Deadlock avoidance in your MPI programs:
> http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html
>
> Rayson
>
> =================================
> Grid Engine / Open Grid Scheduler
> http://gridscheduler.sourceforge.net
>
> Wikipedia Commons
> http://commons.wikimedia.org/wiki/User:Raysonho
>
>
> On Fri, Sep 30, 2011 at 11:06 AM, Jack Bryan wrote:
> > Hi,
> >
> > I have an Open MPI program which works well on a Linux shared-memory multicore (2 x 6 cores) machine.
> >
> > But it does not work well on a distributed cluster with Linux Open MPI. I found that a process sends out some messages to other processes, which cannot receive them.
> >
> > What is the possible reason? I did not change anything in the program.
> >
> > Any help is really appreciated.
> >
> > Thanks
>
> ==========================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
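(Not the original poster's code: a hedged C sketch of the pattern described above, i.e. a header message followed by the payload via MPI_Isend, received with blocking MPI_Recv. The buffer sizes and tags are made up. The key points are that every MPI_Isend request must eventually be completed with MPI_Wait/MPI_Waitall, and the send buffers must stay untouched until then.)

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                                   /* sender */
        int header = 1000;                             /* payload element count */
        double *payload = calloc(header, sizeof(double));
        MPI_Request reqs[2];

        MPI_Isend(&header, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(payload, header, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD, &reqs[1]);
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);     /* buffers reusable only after this */
        free(payload);
    } else if (rank == 1) {                            /* receiver */
        int header;
        MPI_Recv(&header, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        double *payload = malloc(header * sizeof(double));
        MPI_Recv(payload, header, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        free(payload);
    }

    MPI_Finalize();
    return 0;
}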
[OMPI users] problem running with RoCE over 10GbE
Encountered a problem when trying to run Open MPI 1.5.4 with RoCE over a 10GbE fabric. Got this run-time error:

An invalid CPC name was specified via the btl_openib_cpc_include MCA parameter.
  Local host: atl3-14
  btl_openib_cpc_include value: rdmacm
  Invalid name: rdmacm
  All possible valid names: oob,xoob
--------------------------------------------------------------------------
[atl3-14:07184] mca: base: components_open: component btl / openib open function failed
[atl3-12:09178] mca: base: components_open: component btl / openib open function failed

Used these options to mpirun: "--mca btl openib,self,sm --mca btl_openib_cpc_include rdmacm -mca btl_openib_if_include mlx4_0:2"

We have a Mellanox LOM with two ports; the first is an IB port, the second is a 10GbE port. Running over the IB port and TCP over the 10GbE port both work fine.

Built Open MPI with the option "--enable-openib-rdmacm". Our system has OFED 1.5.2 with librdmacm-1.0.13-1.

I noticed this output from the configure script:

checking rdma/rdma_cma.h usability... no
checking rdma/rdma_cma.h presence... no
checking for rdma/rdma_cma.h... no
checking whether IBV_LINK_LAYER_ETHERNET is declared... yes
checking if RDMAoE support is enabled... yes
checking for infiniband/driver.h... yes
checking if ConnectX XRC support is enabled... yes
checking if dynamic SL is enabled... no
checking if OpenFabrics RDMACM support is enabled... no

Are we missing a build option or a piece of software? Config.log and output from "ompi_info --all" attached.

% ibv_devinfo
hca_id: mlx4_0
        transport:        InfiniBand (0)
        fw_ver:           2.9.1000
        node_guid:        78e7:d103:0021:4464
        sys_image_guid:   78e7:d103:0021:4467
        vendor_id:        0x02c9
        vendor_part_id:   26438
        hw_ver:           0xB0
        board_id:         HP_020003
        phys_port_cnt:    2
        port: 1
                state:        PORT_ACTIVE (4)
                max_mtu:      2048 (4)
                active_mtu:   2048 (4)
                sm_lid:       34
                port_lid:     11
                port_lmc:     0x00
                link_layer:   IB
        port: 2
                state:        PORT_ACTIVE (4)
                max_mtu:      2048 (4)
                active_mtu:   1024 (3)
                sm_lid:       0
                port_lid:     0
                port_lmc:     0x00
                link_layer:   Ethernet

% /sbin/ifconfig
eth0    Link encap:Ethernet  HWaddr 78:E7:D1:21:44:60
        inet addr:16.113.180.147  Bcast:16.113.183.255  Mask:255.255.252.0
        inet6 addr: fe80::7ae7:d1ff:fe21:4460/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
        RX packets:1861763 errors:0 dropped:0 overruns:0 frame:0
        TX packets:1776402 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:1000
        RX bytes:712448939 (679.4 MiB)  TX bytes:994111004 (948.0 MiB)
        Memory:fb9e-fba0

eth2    Link encap:Ethernet  HWaddr 78:E7:D1:21:44:65
        inet addr:10.10.0.147  Bcast:10.10.0.255  Mask:255.255.255.0
        inet6 addr: fe80::78e7:d100:121:4465/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
        RX packets:8519814 errors:0 dropped:0 overruns:0 frame:0
        TX packets:8555715 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:1000
        RX bytes:12370127778 (11.5 GiB)  TX bytes:12372246315 (11.5 GiB)

ib0     Link encap:InfiniBand  HWaddr 80:00:00:4D:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
        inet addr:192.168.0.147  Bcast:192.168.0.255  Mask:255.255.255.0
        inet6 addr: fe80::7ae7:d103:21:4465/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:16384  Metric:1
        RX packets:1989 errors:0 dropped:0 overruns:0 frame:0
        TX packets:208 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:256
        RX bytes:275196 (268.7 KiB)  TX bytes:19202 (18.7 KiB)

lo      Link encap:Local Loopback
        inet addr:127.0.0.1  Mask:255.0.0.0
        inet6 addr: ::1/128 Scope:Host
        UP LOOPBACK RUNNING  MTU:16436  Metric:1
        RX packets:42224 errors:0 dropped:0 overruns:0 frame:0
        TX packets:42224 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:0
        RX bytes:3115668 (2.9 MiB)  TX bytes:3115668 (2.9 MiB)
Thanks,
-Jeff

Jeff Konz  jeffrey.k...
Re: [OMPI users] Proper way to stop MPI process
SIGTERM should work - what version are you using?

Ralph

Sent from my iPad

On Sep 28, 2011, at 1:40 PM, Xin Tong wrote:
> I am wondering what the proper way is to stop an mpirun process and the child processes it created. I tried sending SIGTERM, but it does not respond to it. What kind of signal should I be sending to it?
>
> Thanks
>
> Xin
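(Not from the original thread: when mpirun receives SIGTERM it is expected to terminate the job, relaying the termination to the ranks it launched. If the goal is for the application itself to shut down cleanly at that point, a handler along these lines can be used - a minimal sketch, assuming the real work happens in a loop that can poll a flag.)

#include <mpi.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t stop_requested = 0;

/* Only async-signal-safe work (setting a flag) is done in the handler. */
static void handle_term(int sig)
{
    (void)sig;
    stop_requested = 1;          /* checked by the main loop below */
}

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    signal(SIGTERM, handle_term);

    while (!stop_requested) {    /* stand-in for the real work loop */
        sleep(1);
    }

    printf("rank %d: SIGTERM received, shutting down cleanly\n", rank);
    MPI_Finalize();
    return 0;
}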