[OMPI users] Error building openmpi-dev-1883-g7cce015 on Linux
Hi,

today I tried to build openmpi-dev-1883-g7cce015 on my machines
(Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64)
with gcc-5.1.0 and Sun C 5.13/5.12. I got the following error for
gcc-5.1.0 and Sun C 5.12 on Linux and I didn't get any errors on my
Solaris machines for gcc-5.1.0 and Sun C 5.13.
I used the following command to configure the package for gcc.

../openmpi-dev-1883-g7cce015/configure --prefix=/usr/local/openmpi-master_64_gcc \
  --libdir=/usr/local/openmpi-master_64_gcc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
  --with-jdk-headers=/usr/local/jdk1.8.0/include \
  JAVA_HOME=/usr/local/jdk1.8.0 \
  LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
  CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
  CPP="cpp" CXXCPP="cpp" \
  CPPFLAGS="" CXXCPPFLAGS="" \
  --enable-mpi-cxx \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --enable-heterogeneous \
  --enable-mpi-thread-multiple \
  --with-hwloc=internal \
  --without-verbs \
  --with-wrapper-cflags="-std=c11 -m64" \
  --with-wrapper-cxxflags="-m64" \
  --with-wrapper-fcflags="-m64" \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc

...
make install-exec-hook
make[3]: Entering directory `/export2/src/openmpi-master/openmpi-dev-1883-g7cce015-Linux.x86_64.64_gcc'
WARNING!  Common symbols found:
           keyval_lex.o: 0004 C opal_util_keyval_yyleng
           keyval_lex.o: 0008 C opal_util_keyval_yytext
        show_help_lex.o: 0004 C opal_show_help_yyleng
        show_help_lex.o: 0008 C opal_show_help_yytext
  rmaps_rank_file_lex.o: 0004 C orte_rmaps_rank_file_leng
  rmaps_rank_file_lex.o: 0008 C orte_rmaps_rank_file_text
         hostfile_lex.o: 0004 C orte_util_hostfile_leng
         hostfile_lex.o: 0008 C orte_util_hostfile_text
        mpi-f08-types.o: 0004 C mpi_fortran_argv_null
        mpi-f08-types.o: 0004 C mpi_fortran_argvs_null
[...] skipping remaining symbols. To see all symbols, run:
  ../openmpi-dev-1883-g7cce015/config/find_common_syms --top_builddir=. --top_srcdir=../openmpi-dev-1883-g7cce015 --objext=o
make[3]: [install-exec-hook] Error 1 (ignored)
make[3]: Leaving directory `/export2/src/openmpi-master/openmpi-dev-1883-g7cce015-Linux.x86_64.64_gcc'
make[2]: Nothing to be done for `install-data-am'.
make[2]: Leaving directory `/export2/src/openmpi-master/openmpi-dev-1883-g7cce015-Linux.x86_64.64_gcc'
make[1]: Leaving directory `/export2/src/openmpi-master/openmpi-dev-1883-g7cce015-Linux.x86_64.64_gcc'
linpc1 openmpi-dev-1883-g7cce015-Linux.x86_64.64_gcc 130

linpc1 openmpi-dev-1883-g7cce015-Linux.x86_64.64_gcc 130 ../openmpi-dev-1883-g7cce015/config/find_common_syms --top_builddir=. --top_srcdir=../openmpi-dev-1883-g7cce015 --objext=o
WARNING!  Common symbols found:
           keyval_lex.o: 0004 C opal_util_keyval_yyleng
           keyval_lex.o: 0008 C opal_util_keyval_yytext
        show_help_lex.o: 0004 C opal_show_help_yyleng
        show_help_lex.o: 0008 C opal_show_help_yytext
  rmaps_rank_file_lex.o: 0004 C orte_rmaps_rank_file_leng
  rmaps_rank_file_lex.o: 0008 C orte_rmaps_rank_file_text
         hostfile_lex.o: 0004 C orte_util_hostfile_leng
         hostfile_lex.o: 0008 C orte_util_hostfile_text
        mpi-f08-types.o: 0004 C mpi_fortran_argv_null
        mpi-f08-types.o: 0004 C mpi_fortran_argvs_null
        mpi-f08-types.o: 0004 C mpi_fortran_bottom
        mpi-f08-types.o: 0004 C mpi_fortran_errcodes_ignore
        mpi-f08-types.o: 0004 C mpi_fortran_in_place
        mpi-f08-types.o: 0018 C mpi_fortran_status_ignore
        mpi-f08-types.o: 0018 C mpi_fortran_statuses_ignore
        mpi-f08-types.o: 0004 C mpi_fortran_unweighted
        mpi-f08-types.o: 0004 C mpi_fortran_weights_empty
        mpi-f08-types.o: 0004 C ompi_f08_mpi_2complex
        mpi-f08-types.o: 0004 C ompi_f08_mpi_2double_complex
        mpi-f08-types.o: 0004 C ompi_f08_mpi_2double_precision
        mpi-f08-types.o: 0004 C ompi_f08_mpi_2integer
        mpi-f08-types.o: 0004 C ompi_f08_mpi_2real
        mpi-f08-types.o: 0004 C ompi_f08_mpi_band
        mpi-f08-types.o: 0004 C ompi_f08_mpi_bor
        mpi-f08-types.o: 0004 C ompi_f08_mpi_bxor
        mpi-f08-types.o: 0004 C ompi_f08_mpi_byte
        mpi-f08-types.o: 0004 C ompi_f08_mpi_character
        mpi-f08-types.o: 0004 C ompi_f08_mpi_comm_null
        mpi-f08-types.o: 0004 C ompi_f08_mpi_comm_self
        mpi-f08-types.o: 0004 C ompi_f08_mpi_comm_world
[...]
Re: [OMPI users] Error building openmpi-dev-1883-g7cce015 on Linux
Siegmar,

these are just warnings, you can safely ignore them

Cheers,

Gilles

On Tuesday, June 16, 2015, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
> Hi,
>
> today I tried to build openmpi-dev-1883-g7cce015 on my machines
> (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1
> x86_64) with gcc-5.1.0 and Sun C 5.13/5.12. I got the following
> error for gcc-5.1.0 and Sun C 5.12 on Linux and I didn't get any
> errors on my Solaris machines for gcc-5.1.0 and Sun C 5.13.
> [...]
Re: [OMPI users] Fwd[2]: OMPI yalla vs impi
Hello, Alina!

If I use --map-by node I will get only intranode communications on osu_mbw_mr. I use --map-by core instead.

I have 2 nodes; each node has 2 sockets with 8 cores per socket.

When I run osu_mbw_mr on 2 nodes with 32 MPI procs (see the commands below), I expect to see the unidirectional bandwidth of a 4xFDR link as the result of this test.

With IntelMPI I get 6367 MB/s.
With ompi_yalla I get about 3744 MB/s (problem: it is half of the impi result).
With openmpi without mxm (ompi_clear) I get 6321 MB/s.

How can I increase the yalla results?

IntelMPI cmd:
/opt/software/intel/impi/4.1.0.030/intel64/bin/mpiexec.hydra -machinefile machines.pYAvuK -n 32 -binding domain=core ../osu_impi/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr -v -r=0

ompi_yalla cmd:
/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/ompi-mellanox-fca-v1.8.5/bin/mpirun -report-bindings -display-map -mca coll_hcoll_enable 1 -x HCOLL_MAIN_IB=mlx4_0:1 -x MXM_IB_PORTS=mlx4_0:1 -x MXM_SHM_KCOPY_MODE=off --mca pml yalla --map-by core --bind-to core --hostfile hostlist ../osu_ompi_hcoll/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr -v -r=0

ompi_clear cmd:
/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/ompi-clear-v1.8.5/bin/mpirun -report-bindings -display-map --hostfile hostlist --map-by core --bind-to core ../osu_ompi_clear/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr -v -r=0

I have attached output files to this letter:
ompi_clear.out, ompi_clear.err - contain the ompi_clear results
ompi_yalla.out, ompi_yalla.err - contain the ompi_yalla results
impi.out, impi.err - contain the Intel MPI results

Best regards,
Timur

Sunday, June 7, 2015, 16:11 +03:00, from Alina Sklarevich:
>Hi Timur,
>
>After running the osu_mbw_mr benchmark in our lab, we observed that the
>binding policy made a difference on the performance.
>Can you please rerun your ompi tests with the following added to your
>command line? (one of them in each run)
>
>1. --map-by node --bind-to socket
>2. --map-by node --bind-to core
>
>Please attach your results.
>
>Thank you,
>Alina.
>
>On Thu, Jun 4, 2015 at 6:53 PM, Timur Ismagilov <tismagi...@mail.ru> wrote:
>>Hello, Alina.
>>
>>1. Here is my ompi_yalla command line:
>>$HPCX_MPI_DIR/bin/mpirun -mca coll_hcoll_enable 1 -x HCOLL_MAIN_IB=mlx4_0:1 -x MXM_IB_PORTS=mlx4_0:1 -x MXM_SHM_KCOPY_MODE=off --mca pml yalla --hostfile hostlist $@
>>echo $HPCX_MPI_DIR
>>/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/ompi-mellanox-fca-v1.8.5
>>This mpi was configured with: --with-mxm=/path/to/mxm --with-hcoll=/path/to/hcoll --with-platform=contrib/platform/mellanox/optimized --prefix=/path/to/ompi-mellanox-fca-v1.8.5
>>
>>ompi_clear command line:
>>$HPCX_MPI_DIR/bin/mpirun --hostfile hostlist $@
>>echo $HPCX_MPI_DIR
>>/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/ompi-clear-v1.8.5
>>This mpi was configured with: --with-platform=contrib/platform/mellanox/optimized --prefix=/path/to/ompi-clear-v1.8.5
>>
>>2. When I run osu_mbw_mr with the key "-x MXM_TLS=self,shm,rc", it fails with a segmentation fault:
>>stdout log is in the attached file osu_mbr_mr_n-2_ppn-16.out;
>>stderr log is in the attached file osu_mbr_mr_n-2_ppn-16.err;
>>cmd line:
>>$HPCX_MPI_DIR/bin/mpirun -mca coll_hcoll_enable 1 -x HCOLL_MAIN_IB=mlx4_0:1 -x MXM_IB_PORTS=mlx4_0:1 -x MXM_SHM_KCOPY_MODE=off --mca pml yalla -x MXM_TLS=self,shm,rc --hostfile hostlist osu_mbw_mr -v -r=0
>>
>>osu_mbw_mr.c
>>I have changed WINDOW_SIZES in osu_mbw_mr.c:
>>#define WINDOW_SIZES {8, 16, 32, 64, 128, 256, 512, 1024}
>>
>>3. I added the results of running osu_mbw_mr with yalla and without hcoll on 32 and 64 nodes (512 and 1024 MPI procs) to mvs10p_mpi.xls, sheet osu_mbr_mr.
>>The results are about 20 percent lower than the old results (with hcoll).
>>
>>Wednesday, June 3, 2015, 10:29 +03:00, from Alina Sklarevich <ali...@dev.mellanox.co.il>:
>>>Hello Timur,
>>>
>>>I will review your results and try to reproduce them in our lab.
>>>
>>>You are using an old OFED - OFED-1.5.4.1 - and we suspect that this may be causing the performance issues you are seeing.
>>>
>>>In the meantime, could you please:
>>>
>>>1. send us the exact command lines that you were running when you got these results?
>>>
>>>2. add the following to the command line that you are running with 'pml yalla' and attach the results?
>>>"-x MXM_TLS=self,shm,rc"
>>>
>>>3. run your command line with yalla and without hcoll?
>>>
>>>Thanks,
>>>Alina.
>>>
>>>On Tue, Jun 2, 2015 at 4:56 PM, Timur Ismagilov <tismagi...@mail.ru> wrote:
>>>>Hi, Mike!
>>>>I have impi v 4.1.2 (- impi)
>>>>I build ompi 1.8.5 with MXM and hcoll (- ompi_yalla)
>>>>I build ompi 1.8.5 without MXM and hcoll (- ompi_clear)
>>>>I start the osu p2p osu_mbr_mr test with these MPIs.
>>>>You can find the result of the benchmark in the attached f
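For context on what osu_mbw_mr measures: each rank in the first half of MPI_COMM_WORLD streams a window of non-blocking sends to a partner rank in the second half, and the reported number is the aggregate bytes per second summed over all pairs. The following is only a simplified sketch of that measurement loop in C (the constants and variable names are illustrative; this is not the OSU source), to show why the mapping/binding policy decides whether the pairs talk over shared memory or over the IB link.

/* Simplified sketch of an osu_mbw_mr-style multi-pair bandwidth loop.
 * Ranks 0..P/2-1 each post a window of non-blocking sends to partner
 * rank + P/2; aggregate MB/s = total bytes sent / elapsed seconds.   */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE 4096     /* bytes per message (illustrative)  */
#define WINDOW   64       /* messages in flight per pair       */
#define ITERS    100      /* measured windows per pair         */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);      /* assumes an even number of ranks */

    int pairs = size / 2;
    char *buf = malloc((size_t)MSG_SIZE * WINDOW);
    MPI_Request req[WINDOW];
    double t0 = 0.0, t1 = 0.0;

    MPI_Barrier(MPI_COMM_WORLD);
    if (rank < pairs) {                        /* sender half: ranks 0 .. pairs-1 */
        int peer = rank + pairs;
        t0 = MPI_Wtime();
        for (int i = 0; i < ITERS; i++) {
            for (int w = 0; w < WINDOW; w++)
                MPI_Isend(buf + (size_t)w * MSG_SIZE, MSG_SIZE, MPI_CHAR,
                          peer, 0, MPI_COMM_WORLD, &req[w]);
            MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
            /* 1-byte ack so the whole window is drained before the next one */
            MPI_Recv(buf, 1, MPI_CHAR, peer, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
        t1 = MPI_Wtime();
    } else {                                   /* receiver half: ranks pairs .. size-1 */
        int peer = rank - pairs;
        for (int i = 0; i < ITERS; i++) {
            for (int w = 0; w < WINDOW; w++)
                MPI_Irecv(buf + (size_t)w * MSG_SIZE, MSG_SIZE, MPI_CHAR,
                          peer, 0, MPI_COMM_WORLD, &req[w]);
            MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
            MPI_Send(buf, 1, MPI_CHAR, peer, 1, MPI_COMM_WORLD);
        }
    }

    /* per-sender MB/s, summed over all senders on rank 0 */
    double local = (rank < pairs)
                 ? ((double)MSG_SIZE * WINDOW * ITERS) / (t1 - t0) / 1.0e6
                 : 0.0;
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("aggregate bandwidth: %.2f MB/s over %d pairs\n", total, pairs);

    free(buf);
    MPI_Finalize();
    return 0;
}

With 16 ranks per node on two nodes, --map-by core places ranks 0-15 on the first node and 16-31 on the second, so every pair (i, i+16) crosses the IB link; with --map-by node, rank i and its partner i+16 land on the same node, which is why that mapping only exercises intranode traffic, as Timur notes above.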
Re: [OMPI users] Error building openmpi-dev-1883-g7cce015 on Linux
Hi Gilles,

> these are just warnings, you can safely ignore them

Good to know. Nevertheless, I thought that you may be interested to
know about the warnings, because they are new.

Kind regards

Siegmar

> Cheers,
>
> Gilles
>
> On Tuesday, June 16, 2015, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
>
> > Hi,
> >
> > today I tried to build openmpi-dev-1883-g7cce015 on my machines
> > (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1
> > x86_64) with gcc-5.1.0 and Sun C 5.13/5.12.
> > [...]
Re: [OMPI users] Fwd[2]: OMPI yalla vs impi
With '--bind-to socket' I get the same results as with '--bind-to core': 3813 MB/s.
I have attached the ompi_yalla_socket.out and ompi_yalla_socket.err files to this letter.

Tuesday, June 16, 2015, 18:15 +03:00, from Alina Sklarevich:
>Hi Timur,
>
>Can you please try running your ompi_yalla cmd with '--bind-to socket'
>(instead of binding to core) and check if it affects the results?
>We saw that it made a difference on the performance in our lab, so that's
>why I asked you to try the same.
>
>Thanks,
>Alina.
>
>On Tue, Jun 16, 2015 at 5:53 PM, Timur Ismagilov <tismagi...@mail.ru> wrote:
>>Hello, Alina!
>>
>>If I use --map-by node I will get only intranode communications on
>>osu_mbw_mr. I use --map-by core instead.
>>
>>[...]
[OMPI users] IB to some nodes but TCP for others
Hi All,

We have a set of nodes which are all connected via InfiniBand, but not all of them are mutually connected. For example, nodes 1-32 are connected to IB switch A and 33-64 are connected to switch B, but there is no IB connection between switches A and B. However, all nodes are mutually routable over TCP.

What we'd like to do is tell OpenMPI to use IB when communicating amongst nodes 1-32 or 33-64, but to use TCP whenever a node in the set 1-32 needs to talk to another node in the set 33-64 or vice versa. We've written our application in such a way that we can confine most of the bandwidth- and latency-sensitive operations to within groups of 32 nodes, but members of the two groups do have to communicate infrequently via MPI.

Is there any way to tell OpenMPI to use IB within an IB-connected group and TCP for inter-group communications? Obviously, we would need to tell OpenMPI the membership of the two groups. If there's no such functionality, would it be a difficult thing to hack in? (I'd be glad to give it a try myself, but I'm not that familiar with the codebase, so a couple of pointers would be helpful, or a note saying I'm crazy for trying.)

Thanks,
Tim
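On the application side, the "keep the heavy traffic inside a group of 32 nodes" structure described above is usually expressed with a communicator split. A minimal sketch follows (the processes-per-node value and the assumption that ranks are numbered contiguously within each group are illustrative, not from the original post); the transport-selection question itself, IB inside a group and TCP between groups, is what the replies below address.

/* Sketch: give each IB island its own communicator so bandwidth- and
 * latency-sensitive traffic stays inside a group, and use MPI_COMM_WORLD
 * only for the infrequent inter-group exchanges.                        */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int world_rank;
    int ppn = 16;                           /* assumed processes per node      */
    int ranks_per_group = 32 * ppn;         /* 32 IB-connected nodes per group */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* assumes ranks are laid out contiguously: group 0 = switch A, group 1 = switch B */
    int color = world_rank / ranks_per_group;

    MPI_Comm group_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &group_comm);

    int group_rank;
    MPI_Comm_rank(group_comm, &group_rank);
    printf("world rank %d -> group %d, group rank %d\n", world_rank, color, group_rank);

    /* ... heavy collectives / point-to-point on group_comm ...
       ... occasional inter-group traffic on MPI_COMM_WORLD ... */

    MPI_Comm_free(&group_comm);
    MPI_Finalize();
    return 0;
}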
Re: [OMPI users] Error building openmpi-dev-1883-g7cce015 on Linux
We just recently started showing these common symbol warnings -- they're really motivations to ourselves to reduce the number of common symbols. :-)

> On Jun 16, 2015, at 11:17 AM, Siegmar Gross wrote:
>
> Hi Gilles,
>
>> these are just warnings, you can safely ignore them
>
> Good to know. Nevertheless, I thought that you may be interested to
> know about the warnings, because they are new.
>
> Kind regards
>
> Siegmar
>
> [...]
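For anyone curious what the checker is flagging: a "common" symbol is what nm marks with 'C' -- a file-scope C variable declared with no initializer (a tentative definition), which compilers of that era merged across object files by default (-fcommon). A minimal sketch with hypothetical variable names (not Open MPI code):

/* commonsym.c -- why nm reports some globals as common ('C') symbols.
 * With a compiler that defaults to -fcommon (as gcc-5 did),
 *   gcc -c commonsym.c && nm commonsym.o
 * should show tentative_counter with a 'C' and its size as the value
 * (e.g. "0000000000000004 C tentative_counter"), while
 * initialized_counter is an ordinary .bss definition ('B').          */

int tentative_counter;          /* tentative definition -> common symbol       */
int initialized_counter = 0;    /* explicit definition  -> not a common symbol */

int main(void)
{
    tentative_counter = initialized_counter;
    return tentative_counter;
}

Giving such variables an explicit initializer, or internal linkage with static, is the usual way to make the 'C' entries disappear, which is the kind of cleanup the warning is nudging toward.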
Re: [OMPI users] IB to some nodes but TCP for others
I’m surprised that it doesn’t already “just work” - once we exchange endpoint info, each process should look at the endpoint of every other process to determine which transport can reach it. It then picks the “best” one on a per-process basis. So it should automatically be selecting IB for procs within the same group, and TCP for all others. Is it not doing so?

> On Jun 16, 2015, at 10:25 AM, Tim Miller wrote:
>
> Hi All,
>
> We have a set of nodes which are all connected via InfiniBand, but not all
> of them are mutually connected.
> [...]
Re: [OMPI users] IB to some nodes but TCP for others
Do you have different IB subnet IDs? That would be the only way for Open MPI to tell the two IB subnets apart.

> On Jun 16, 2015, at 1:25 PM, Tim Miller wrote:
>
> Hi All,
>
> We have a set of nodes which are all connected via InfiniBand, but not all
> of them are mutually connected.
> [...]

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/