[OMPI users] how to get mpirun to scale from 16 to 64 cores
Dear All:

I bought a 64-core workstation and installed NASA FUN3D with Open MPI 1.6.5. I then test-ran FUN3D on 16, 32, 48, and 60 cores, but the performance is poor. The run command (shown here for 32 cores) is:

    mpiexec -np 32 --bysocket --bind-to-socket ~ysun/Codes/NASA/fun3d-12.3-66687/Mpi/FUN3D_90/nodet_mpi --time_timestep_loop --animation_freq -1 > screen.dump_bs30

The timings are:

    CPUs   time   iterations   time/it
    60     678s   30it         22.61s
    48     702s   30it         23.40s
    32     734s   30it         24.50s
    16     894s   30it         29.80s

With 60 cores, 30 iterations complete in 678 seconds, roughly 22.61 seconds per iteration. With 16 cores, 30 iterations complete in 894 seconds, roughly 29.8 seconds per iteration. The data show that FUN3D launched with mpirun hardly scales at all! I used to run FUN3D with mpirun on an 8-core workstation, and it scaled well. The same job also scales well on a Linux cluster.

Could you give me some advice on reducing the performance loss as I use more cores, or on which mpirun options to use to get linear scaling when going from 16 to 32 to 48 cores?

Thank you.
Yuping
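To put a number on "does not scale", the reported totals can be turned into speedup and parallel efficiency relative to the 16-core run. The sketch below only does arithmetic on the four timings quoted above; the function name `scaling_table` is made up for illustration.

```shell
#!/bin/sh
# Speedup and parallel efficiency relative to the first (16-core) row,
# computed from the wall-clock totals reported in the message.
# Output columns: cores, speedup, efficiency.
scaling_table() {
  awk 'NR == 1 { base_n = $1; base_t = $2 }
       {
         speedup = base_t / $2            # how much faster than the baseline
         eff = speedup / ($1 / base_n)    # speedup divided by the core ratio
         printf "%d %.2f %.2f\n", $1, speedup, eff
       }' <<'EOF'
16 894
32 734
48 702
60 678
EOF
}

scaling_table
```

An efficiency of 1.00 would mean perfect scaling from the baseline; values well below that say the added cores are mostly idle or waiting.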
Re: [OMPI users] how to get mpirun to scale from 16 to 64 cores
Hi Ralph:

Is the following the correct command?

    mpirun -np 32 --bysocket --bycore ~ysun/Codes/NASA/fun3d-12.3-66687/Mpi/FUN3D_90/nodet_mpi --time_timestep_loop --animation_freq -1

I ran the command above, and the performance still does not improve. Would you give me a detailed command with options?

Thank you.
Best regards,
Yuping

On Tue, 6/17/14, Ralph Castain wrote:
Subject: Re: [OMPI users] how to get mpirun to scale from 16 to 64 cores
To: "Yuping Sun", "Open MPI Users"
Date: Tuesday, June 17, 2014, 1:59 AM

> Well, for one, there is never any guarantee of linear scaling with the number of procs - that is very application dependent. You can actually see performance decrease with the number of procs if the application doesn't know how to exploit them.
>
> One thing that stands out is your mapping and binding options. Mapping bysocket means that you are putting neighboring ranks (i.e., ranks that differ by 1) on different sockets, which usually means different NUMA regions. This makes shared memory between those procs run poorly. If the application does a lot of messaging between ranks that differ by 1, then you would see poor scaling.
>
> So one thing you could do is change --bysocket to --bycore. Then, if your application isn't threaded, you could --bind-to-core for better performance.
>
> On Jun 16, 2014, at 3:19 PM, Yuping Sun wrote:
>> Dear All: [...]
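Ralph's suggestion replaces `--bysocket` with `--bycore` rather than adding `--bycore` next to it. A dry-run sketch of the resulting command line follows; it only prints the line, `NP` and `APP` are placeholder variables, and `--report-bindings` is added here (not part of Ralph's reply) so that the actual rank placement can be inspected.

```shell
#!/bin/sh
# Dry-run sketch: print the mpirun line instead of executing it.
# APP defaults to a placeholder, not the real FUN3D install path.
NP="${NP:-32}"
APP="${APP:-./nodet_mpi}"

launch_cmd() {
  # --bycore           map consecutive ranks to consecutive cores
  # --bind-to-core     pin each rank to one core (for non-threaded codes)
  # --report-bindings  print where every rank was bound at startup
  printf 'mpirun -np %s --bycore --bind-to-core --report-bindings %s --time_timestep_loop --animation_freq -1\n' "$NP" "$APP"
}

launch_cmd
```

Whether this helps depends on how FUN3D's ranks communicate; as the reply notes, linear scaling is never guaranteed.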
[OMPI users] errors testing openmpi1.6.5 ----
Dear All:

I downloaded Open MPI 1.6.5 and installed it on my Linux workstation with the help of a NASA engineer. When I tried to test the installation, I got the following error messages:

    [ysun@ysunrh mpi]$ which mpiexec
    /usr/local/openmpi1.6.5/bin/mpiexec
    [ysun@ysunrh mpi]$ mpiexec utils/MPIcheck/mpicheck
    [ysunrh.fdwt.com:24905] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_paffinity_hwloc: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24905] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_carto_auto_detect: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24905] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_carto_file: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24905] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_shmem_mmap: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24905] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_shmem_posix: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24905] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_shmem_sysv: lt_dlerror() returned NULL! (ignored)
    --
    It looks like opal_init failed for some reason; your parallel process is
    likely to abort. There are many reasons that a parallel process can
    fail during opal_init; some of which are due to configuration or
    environment problems. This failure appears to be an internal failure;
    here's some additional information (which may only be relevant to an
    Open MPI developer):

      opal_shmem_base_select failed
      --> Returned value -1 instead of OPAL_SUCCESS
    --
    [ysunrh.fdwt.com:24905] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
    [ysunrh.fdwt.com:24905] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c at line 694

I also tried ompi_info to get more information. It gave me a lot of error messages; some of them are listed below:

    [ysunrh.fdwt.com:24920] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_osc_pt2pt: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24920] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_osc_rdma: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24920] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_btl_self: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24920] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_btl_sm: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24920] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_btl_tcp: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24920] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_topo_unity: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24920] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_pubsub_orte: lt_dlerror() returned NULL! (ignored)
    [ysunrh.fdwt.com:24920] mca: base: component_find: unable to open /usr/local/openmpi1.6.5/lib/openmpi/mca_dpm_orte: lt_dlerror() returned NULL! (ignored)
    MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.6.5)
    MCA memory: linux (MCA v2.0, API v2.0, Component v1.6.5)
    MCA timer: linux (MCA v2.0, API v2.0, Component v1.6.5)
    MCA installdirs: env (MCA v2.0, API v2.0, Component v1.6.5)
    MCA installdirs: config (MCA v2.0, API v2.0, Component v1.6.5)
    MCA hwloc: hwloc132 (MCA v2.0, API v2.0, Component v1.6.5)

Could any of you give me some help and tell me what I need to do to install Open MPI correctly, or give me some instructions to make it work?

Thank you.

Yuping Sun
FloDesign Wind Turbine Corp
Re: [OMPI users] errors testing openmpi1.6.5 ----
Hi Ralph:

Here is what I got:

    [root@ysunrh usr]# ls /usr/local/openmpi1.6.5/lib/
    libmca_common_sm.a   libmpi.a       libmpi_cxx.la  libmpi_f77.la  libmpi_f90.la
    libompitrace.a       libopen-pal.a  libopen-rte.a  libopen-trace-format.a
    libotfaux.a          libvt.a        libvt-hyb.la   libvt-mpi.a    libvt-mpi-unify.a
    libvt-mt.a           libvt-pomp.a   mpi.mod        pkgconfig
    libmca_common_sm.la  libmpi_cxx.a   libmpi_f77.a   libmpi_f90.a   libmpi.la
    libompitrace.la      libopen-pal.la libopen-rte.la libopen-trace-format.la
    libotfaux.la         libvt-hyb.a    libvt.la       libvt-mpi.la   libvt-mpi-unify.la
    libvt-mt.la          libvt-pomp.la  openmpi
    [root@ysunrh usr]#

Do you think I have all the referenced libraries? If yes, what do I need to do next?

I also tried the following to see my library path:

    [root@ysunrh usr]# echo $LD_LIBRARY_PATH

    [root@ysunrh usr]#

So I have not set up the library path. What would you suggest I do? I do, however, have /usr/local/openmpi1.6.5/bin in my PATH.

By the way, I also see that there are rpm packages for openmpi when I issue the command below:

    >yum whatprovides openmpi
    openmpi-1.4-4.el5.i386 : Open Message Passing Interface
    Repo: rhel-x86_64-server-5
    Matched from:

    openmpi-1.4-4.el5.x86_64 : Open Message Passing Interface
    Repo: rhel-x86_64-server-5
    Matched from:

    openmpi-1.4-7.el5.i386 : Open Message Passing Interface
    Repo: rhel-x86_64-server-5
    Matched from:

    openmpi-1.4-7.el5.x86_64 : Open Message Passing Interface
    Repo: rhel-x86_64-server-5
    Matched from:

Would you tell me how to install openmpi using yum or rpm, and how to get an rpm package for openmpi-1.6.5? I installed openmpi-1.6.5 from the tar file downloaded from the openmpi.org website.

Thank you.

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Wednesday, July 24, 2013 8:35 PM
To: Open MPI Users
Subject: Re: [OMPI users] errors testing openmpi1.6.5

> If you list the /usr/local/openmpi1.6.5/lib directory, do you see the referenced libraries?
>
> On Jul 24, 2013, at 5:09 PM, Yuping Sun wrote:
>> Dear All: [...]
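Ralph's question can be answered more directly by looking inside the `openmpi` subdirectory, since that is where the error messages point. The sketch below reports, for each component name, which files are present. It assumes that dynamically loaded components are installed as `NAME.so` alongside a libtool `NAME.la` file; the function name `check_components` is hypothetical and the two component names are taken from the log above.

```shell
#!/bin/sh
# Report which files exist for each MCA component name in a component directory.
# Usage: check_components DIR NAME...
# Assumption: a loadable component is a NAME.so file, possibly with a NAME.la next to it.
check_components() {
  dir=$1
  shift
  for name in "$@"; do
    if [ -e "$dir/$name.so" ]; then
      echo "$name: shared object present"
    elif [ -e "$dir/$name.la" ]; then
      echo "$name: only libtool .la file present"
    else
      echo "$name: not found"
    fi
  done
}

# Example call with the directory and two component names from the error log.
check_components /usr/local/openmpi1.6.5/lib/openmpi mca_shmem_mmap mca_btl_sm
```

If only `.la` files or nothing at all turn up, that would be worth reporting back to the list together with the configure options used for the build.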
Re: [OMPI users] errors testing openmpi1.6.5 ----
Hi Gus:

I went back and set PATH and LD_LIBRARY_PATH following the FAQ answers. However, it did not change anything. What else can you suggest?

Thank you.
Yuping

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Wednesday, July 24, 2013 8:28 PM
To: Open MPI Users
Subject: Re: [OMPI users] errors testing openmpi1.6.5

Hi Yuping

Did you set your PATH and LD_LIBRARY_PATH? Please see these FAQ entries:
http://www.open-mpi.org/faq/?category=running#run-prereqs
http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path

I hope this helps,
Gus Correa

On 07/24/2013 08:09 PM, Yuping Sun wrote:
> Dear All: [...]
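Before changing anything, it can help to confirm what the shell that launches mpiexec actually sees. A minimal sketch, assuming the install lives under /usr/local/openmpi1.6.5 as in the `which mpiexec` output; `path_has_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if DIR is one of the colon-separated entries of LIST.
# Usage: path_has_dir LIST DIR
path_has_dir() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check both variables against the directories used in this thread.
for pair in "PATH=/usr/local/openmpi1.6.5/bin" "LD_LIBRARY_PATH=/usr/local/openmpi1.6.5/lib"; do
  var=${pair%%=*}
  dir=${pair#*=}
  eval "value=\${$var:-}"   # read the variable named by $var; empty if unset
  if path_has_dir "$value" "$dir"; then
    echo "$var contains $dir"
  else
    echo "$var is missing $dir"
  fi
done
```

Running it in the same terminal, and as the same user, that runs mpiexec matters: settings made for root or in another session do not carry over.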
Re: [OMPI users] errors testing openmpi1.6.5 ----
Hi Gus:

Thank you. I did this already: I use .bash_profile to add the PATH and LD_LIBRARY_PATH entries, but it did not help. Is there anything else you can think of?

Best regards,
Yuping

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Thursday, July 25, 2013 4:51 PM
To: Open MPI Users
Subject: Re: [OMPI users] errors testing openmpi1.6.5

Hi Yuping

A simple way to do it is to put the settings in your initialization files, which are hidden ("dot files") in your home directory.

It depends on the shell you use (do 'echo $SHELL' to see).

If bash, in .bashrc:

    export PATH=/usr/local/openmpi1.6.5/bin:${PATH}
    export LD_LIBRARY_PATH=/usr/local/openmpi1.6.5/lib:${LD_LIBRARY_PATH}

If csh/tcsh, in .cshrc/.tcshrc:

    setenv PATH /usr/local/openmpi1.6.5/bin:${PATH}
    setenv LD_LIBRARY_PATH /usr/local/openmpi1.6.5/lib:${LD_LIBRARY_PATH}

Are you running the programs on a single machine, or is it a cluster or some farm of networked machines? In the second case, you need to make sure your home directory is shared across the machines, or, if it is not shared, you need to make the modifications above on each machine.

I hope this helps.
Gus Correa

On 07/25/2013 03:11 PM, Yuping Sun wrote:
> Hi Gus: [...]
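One detail of the bash lines above: when LD_LIBRARY_PATH starts out empty, `dir:${LD_LIBRARY_PATH}` leaves a trailing colon, and the dynamic loader treats an empty entry as the current directory. A small variation that avoids this, and also skips directories that are already listed, might look like the sketch below; `prepend_path` and `OMPI_PREFIX` are names made up for illustration.

```shell
#!/bin/sh
# Prepend DIR to the colon-separated list held in the variable named VAR,
# unless it is already listed. An empty variable becomes just DIR.
# Usage: prepend_path VAR DIR
prepend_path() {
  var=$1
  dir=$2
  eval "current=\${$var:-}"
  case ":$current:" in
    *":$dir:"*) ;;                                  # already present: nothing to do
    *) eval "$var=\$dir\${current:+:\$current}" ;;  # add ':' only if there was a value
  esac
  export "$var"
}

# Intended for ~/.bashrc; the prefix is the one used throughout this thread.
OMPI_PREFIX=/usr/local/openmpi1.6.5
prepend_path PATH "$OMPI_PREFIX/bin"
prepend_path LD_LIBRARY_PATH "$OMPI_PREFIX/lib"
```

Because the function is idempotent, sourcing the file repeatedly does not keep growing the variables.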
Re: [OMPI users] errors testing openmpi1.6.5 ----
Hi Gus:

I used yum install to install openmpi 1.4.7, and it went through. Then I tested a small hello world code, and it worked. Now I have two questions for you:

1. Is there an openmpi 1.6.5 rpm package, so that I can use rpm or yum to install it?
2. Do you know how to specify the install directory for openmpi when using yum install?

Thank you.
Best regards,
Yuping

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Thursday, July 25, 2013 7:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] errors testing openmpi1.6.5

Hi Yuping

Something seems to be broken in the way you set your environment variables. We use .bashrc/.tcshrc for this. For what it is worth, the bash man page says:

***
"When bash is invoked as an interactive login shell, or as a non-interactive shell with the --login option, it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable. The --noprofile option may be used when the shell is started to inhibit this behavior.

When a login shell exits, bash reads and executes commands from the file ~/.bash_logout, if it exists.

When an interactive shell that is not a login shell is started, bash reads and executes commands from ~/.bashrc, if that file exists. This may be inhibited by using the --norc option. The --rcfile file option will force bash to read and execute commands from file instead of ~/.bashrc."
***

Notice the difference between an interactive login shell and an interactive shell that is not a login shell.

However, it is hard to guess what is going on. We (the OMPI list) don't even know if you are running the job on a single machine, on a cluster, or on a few networked workstations. It would help if you told us more.

Anyway, you could try to pass the environment variables along on the mpiexec command line. Something like this:

***
    export PATH=... (whatever you use)
    export LD_LIBRARY_PATH=... (whatever you use)
    mpiexec -x PATH -x LD_LIBRARY_PATH -np 4 ./my_program
***

See 'man mpiexec' for details.

In any case, in my humble opinion, the best approach is to fix the problems with your environment for good. They will haunt you sooner or later.

I hope this helps,
Gus Correa

On 07/25/2013 05:58 PM, Yuping Sun wrote:
> Hi Gus: [...]
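The search order quoted from the man page can be checked against an actual home directory. The sketch below prints the per-user file a bash login shell would pick, following only the order given in the quote; `login_startup_file` is a hypothetical helper name, and it does not source anything.

```shell
#!/bin/sh
# Print the first per-user startup file a bash login shell would read,
# using the order from the man page: ~/.bash_profile, ~/.bash_login, ~/.profile.
# Usage: login_startup_file HOME_DIR
login_startup_file() {
  home=$1
  for f in .bash_profile .bash_login .profile; do
    if [ -r "$home/$f" ]; then   # "exists and is readable"
      echo "$home/$f"
      return 0
    fi
  done
  return 1
}

login_startup_file "$HOME" || echo "no per-user login startup file found"
echo "interactive non-login shells read: $HOME/.bashrc"
```

If the exports sit in ~/.bash_profile but the terminal used for mpiexec is a non-login shell, they would not be picked up there, which is one way the symptoms in this thread could arise.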
Re: [OMPI users] errors testing openmpi1.6.5 ----
Hi Gus:

Can you review the procedure I went through to install Open MPI and let me know where it went wrong?

Procedure:

Step 1: Download

Open the link http://www.open-mpi.org/software/ompi/v1.6/, click openmpi-1.6.5.tar.gz, and download the tar file.

Step 2: Prepare to install

    $ cd ~ysun/Codes
    $ mkdir OpenMpi
    $ cd ~ysun/Codes/OpenMpi
    $ mv ../openmpi-1.6.5.tar.gz .
    $ gunzip openmpi-1.6.5.tar.gz
    $ tar -xvf openmpi-1.6.5.tar; gzip *.tar
    $ ./configure --prefix=/usr/local/openmpi165 FC=`which ifort ` F77=`which ifort` --disable-shared

# This is the usual process: configure produces a localized Makefile. Here we have to use --disable-shared!
# --prefix specifies the install location.
# FC gives the path of the Fortran compiler; here ifort is used.
# F77 gives the path of the Fortran 77 compiler; here ifort is used.

    $ make -j8

# Here we run make using the 8 processors we have on this machine.

Then we need to use root to install:

    $ su root
    $ cd ~ysun/Codes/OpenMpi/openmpi-1.6.5
    $ make install

If possible, would you list the steps you went through to install openmpi-1.6.5?

Thank you. Have a good night.
Yuping

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Thursday, July 25, 2013 9:40 PM
To: Open MPI Users
Subject: Re: [OMPI users] errors testing openmpi1.6.5

Hi Yuping

Hmmm... I think the last release in the 1.4.X series was 1.4.5. I don't remember any 1.4.7. Is it a typo in your email?

In any case, I always install OpenMPI from source code. It works like a charm, and is trouble-free. I am not familiar with the OpenMPI packages from the various Linux distributions, so I am afraid I can't help you with this.

Gus Correa

On 07/25/2013 08:15 PM, Yuping Sun wrote:
> Hi Gus: [...]
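One thing that stands out when comparing this write-up with the first message: the configure line uses the prefix /usr/local/openmpi165, while `which mpiexec` reported /usr/local/openmpi1.6.5/bin/mpiexec. The thread does not say whether that is a typo in the write-up or a real difference, so the sketch below only shows how the two could be compared; `prefix_of` is a hypothetical helper and both paths are copied from the messages.

```shell
#!/bin/sh
# Derive the install prefix implied by an executable path: PREFIX/bin/NAME -> PREFIX
prefix_of() {
  dirname "$(dirname "$1")"
}

CONFIGURED_PREFIX=/usr/local/openmpi165             # from the configure line in the procedure
FOUND_MPIEXEC=/usr/local/openmpi1.6.5/bin/mpiexec   # from the earlier `which mpiexec` output

found_prefix=$(prefix_of "$FOUND_MPIEXEC")
if [ "$found_prefix" = "$CONFIGURED_PREFIX" ]; then
  echo "mpiexec comes from the configured prefix"
else
  echo "mpiexec is under $found_prefix, but configure used $CONFIGURED_PREFIX"
fi
```

On the real machine, `FOUND_MPIEXEC` would be set from `command -v mpiexec` instead of being hard-coded.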
Re: [OMPI users] errors testing openmpi1.6.5 ----
Dear All:

I was able to configure and build openmpi-1.6.5 using gfortran. However, I was not able to build it using Intel ifort. Would anyone tell me how to modify the configure options to use ifort? Please give me instructions if you can.

Here is what I used:

    $ ./configure --prefix=/usr/local/openmpi165 FC=`which ifort ` F77=`which ifort` --disable-shared
    $ make -j8 all
    $ make install

It does not work.

Thank you.
Yuping
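A possible first check, since the message does not include the actual error: if ifort is not on PATH in the shell that runs configure, the backtick expression `which ifort` expands to nothing and configure receives an empty `FC=`. The sketch below resolves the compiler first and prints the configure line only when that succeeds; `require_compiler` is a hypothetical helper, and the prefix and options are copied from the message.

```shell
#!/bin/sh
# Resolve a compiler name to its location, or fail with a message if it is not on PATH.
# Usage: require_compiler NAME
require_compiler() {
  resolved=$(command -v "$1" 2>/dev/null) || {
    echo "error: $1 not found on PATH" >&2
    return 1
  }
  echo "$resolved"
}

# Dry run: print the configure line rather than executing it.
if FC_PATH=$(require_compiler ifort); then
  echo "./configure --prefix=/usr/local/openmpi165 FC=$FC_PATH F77=$FC_PATH --disable-shared"
else
  echo "ifort is not visible in this shell; set up the Intel compiler environment first"
fi
```

If ifort does resolve and the build still fails, the relevant part of config.log or the make output would be the next thing to send to the list.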