[OMPI users] problem with rankfile in openmpi-1.7.4rc2r30323
Hi,

yesterday I installed openmpi-1.7.4rc2r30323 on our machines ("Solaris 10
x86_64", "Solaris 10 Sparc", and "openSUSE Linux 12.1 x86_64" with Sun C 5.12).
My rankfile "rf_linpc_sunpc_tyr" contains the following lines.

rank 0=linpc0 slot=0:0-1;1:0-1
rank 1=linpc1 slot=0:0-1
rank 2=sunpc1 slot=1:0
rank 3=tyr slot=1:0

I get no output when I run the following command.

mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname

"dbx" reports the following problem.

/opt/solstudio12.3/bin/sparcv9/dbx \
  /usr/local/openmpi-1.7.4_64_cc/bin/mpiexec
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.9' in your .dbxrc
Reading mpiexec
Reading ld.so.1
...
Reading libmd.so.1
(dbx) run -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
Running: mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
(process id 22337)
Reading libc_psr.so.1
...
Reading mca_dfs_test.so

execution completed, exit code is 1
(dbx) check -all
access checking - ON
memuse checking - ON
(dbx) run -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
Running: mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
(process id 22344)
Reading rtcapihook.so
...
RTC: Running program...
Read from uninitialized (rui) on thread 1:
Attempting to read 1 byte at address 0x7fffbf8b
  which is 459 bytes above the current stack pointer
Variable is 'cwd'
t@1 (l@1) stopped in opal_getcwd at line 65 in file "opal_getcwd.c"
   65       if (0 != strcmp(pwd, cwd)) {
(dbx) quit


Rankfiles work "fine" on x86_64 architectures. Contents of my rankfile.

rank 0=linpc1 slot=0:0-1;1:0-1
rank 1=sunpc1 slot=0:0-1
rank 2=sunpc1 slot=1:0
rank 3=sunpc1 slot=1:1

mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc hostname
[sunpc1:13489] MCW rank 1 bound to socket 0[core 0[hwt 0]],
  socket 0[core 1[hwt 0]]: [B/B][./.]
[sunpc1:13489] MCW rank 2 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
[sunpc1:13489] MCW rank 3 bound to socket 1[core 3[hwt 0]]: [./.][./B]
sunpc1
sunpc1
sunpc1
[linpc1:29997] MCW rank 0 is not bound (or bound to all available processors)
linpc1

Unfortunately, "dbx" nevertheless reports a problem.

/opt/solstudio12.3/bin/amd64/dbx \
  /usr/local/openmpi-1.7.4_64_cc/bin/mpiexec
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.9' in your .dbxrc
Reading mpiexec
Reading ld.so.1
...
Reading libmd.so.1
(dbx) run -report-bindings -np 4 -rf rf_linpc_sunpc hostname
Running: mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc hostname
(process id 18330)
Reading mca_shmem_mmap.so
...
Reading mca_dfs_test.so
[sunpc1:18330] MCW rank 1 bound to socket 0[core 0[hwt 0]],
  socket 0[core 1[hwt 0]]: [B/B][./.]
[sunpc1:18330] MCW rank 2 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
[sunpc1:18330] MCW rank 3 bound to socket 1[core 3[hwt 0]]: [./.][./B]
sunpc1
sunpc1
sunpc1
[linpc1:30148] MCW rank 0 is not bound (or bound to all available processors)
linpc1

execution completed, exit code is 0
(dbx) check -all
access checking - ON
memuse checking - ON
(dbx) run -report-bindings -np 4 -rf rf_linpc_sunpc hostname
Running: mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc hostname
(process id 18340)
Reading rtcapihook.so
...
RTC: Running program...
Reading disasm.so
Read from uninitialized (rui) on thread 1:
Attempting to read 1 byte at address 0x436d57
  which is 15 bytes into a heap block of size 16 bytes at 0x436d48
This block was allocated from:
  [1] vasprintf() at 0xfd7fdc9b335a
  [2] asprintf() at 0xfd7fdc9b3452
  [3] opal_output_init() at line 184 in "output.c"
  [4] do_open() at line 548 in "output.c"
  [5] opal_output_open() at line 219 in "output.c"
  [6] opal_malloc_init() at line 68 in "malloc.c"
  [7] opal_init_util() at line 250 in "opal_init.c"
  [8] orterun() at line 658 in "orterun.c"

t@1 (l@1) stopped in do_open at line 638 in file "output.c"
  638       info[i].ldi_prefix = strdup(lds->lds_prefix);
(dbx)

I can also manually bind threads on our Sun M4000 server (two quad-core
Sparc VII processors with two hwthreads each).

mpiexec --report-bindings -np 4 --bind-to hwthread hostname
[rs0.informatik.hs-fulda.de:09531] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
  [../B./../..][../../../..]
[rs0.informatik.hs-fulda.de:09531] MCW rank 2 bound to socket 1[core 4[hwt 0]]:
  [../../../..][B./../../..]
[rs0.informatik.hs-fulda.de:09531] MCW rank 3 bound to socket 1[core 5[hwt 0]]:
  [../../../..][../B./../..]
[rs0.informatik.hs-fulda.de:09531] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
  [B./../../..][../../../..]
rs0.informatik.hs-fulda.de
rs0.informatik.hs-fulda.de
rs0.informatik.hs-fulda.de
rs0.informatik.hs-fulda.de

It doesn't work with cores. I know that it wasn't possible last summer, and it
seems that it is still not possible now.

mpiexec --report-bindings -np 4 --bind-to core hostname
--
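For reference, each "slot=" entry in a rankfile is a semicolon-separated list
of socket:core-range pairs. A commented sketch of how the first rankfile above
is meant to be read (the '#' annotations are only explanatory, assuming the
usual socket:core interpretation):

rank 0=linpc0 slot=0:0-1;1:0-1   # linpc0: socket 0, cores 0-1 and socket 1, cores 0-1
rank 1=linpc1 slot=0:0-1         # linpc1: socket 0, cores 0-1
rank 2=sunpc1 slot=1:0           # sunpc1: socket 1, core 0
rank 3=tyr    slot=1:0           # tyr:    socket 1, core 0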
Re: [OMPI users] simple test problem hangs on mpi_finalize and consumes all system resources
Well, this is a little strange. The hanging behavior is gone, but I'm getting a
segfault now. The outputs of "hello_c.c" and "ring_c.c" are attached. I'm also
getting a segfault with the Fortran test.

I'm afraid I may have polluted the experiment by removing the target
openmpi-1.6.5 installation directory yesterday. To produce the attached
outputs, I just went back and did "make install" in the openmpi-1.6.5 build
directory. I've re-set the environment variables as they were a few days ago
by sourcing the same bash script. Perhaps I forgot something, or something on
the system changed? Regardless, LD_LIBRARY_PATH and PATH are set correctly,
and the aberrant behavior persists.

The reason for deleting the openmpi-1.6.5 installation was that I went back
and installed openmpi-1.4.3, and the problem (mostly) went away.
openmpi-1.4.3 can run the simple tests without issue, but on my "real"
program I'm getting symbol lookup errors:

mca_paffinity_linux.so: undefined symbol: mca_base_param_reg_int

Perhaps that's a separate thread.

>-----Original Message-----
>From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff
>Squyres (jsquyres)
>Sent: Tuesday, January 21, 2014 3:57 PM
>To: Open MPI Users
>Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize and
>consumes all system resources
>
>Just for giggles, can you repeat the same test but with hello_c.c and ring_c.c?
>I.e., let's get the Fortran out of the way and use just the base C bindings,
>and see what happens.
>
>
>On Jan 19, 2014, at 6:18 PM, "Fischer, Greg A." wrote:
>
>> I just tried running "hello_f90.f90" and see the same behavior: 100% CPU
>> usage, gradually increasing memory consumption, and failure to get past
>> mpi_finalize. LD_LIBRARY_PATH is set as:
>>
>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5/lib
>>
>> The installation target for this version of OpenMPI is:
>>
>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5
>>
>> 1045 fischega@lxlogin2[/data/fischega/petsc_configure/mpi_test/simple]>
>> which mpirun
>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5/bin/mpirun
>>
>> Perhaps something strange is happening with GCC? I've tried simple hello
>> world C and Fortran programs, and they work normally.
>>
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
>> Sent: Sunday, January 19, 2014 11:36 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize
>> and consumes all system resources
>>
>> The OFED warning about registration is something OMPI added at one point
>> when we isolated the cause of jobs occasionally hanging, so you won't see
>> that warning from other MPIs or earlier versions of OMPI (I forget exactly
>> when we added it).
>>
>> The problem you describe doesn't sound like an OMPI issue - it sounds like
>> you've got a memory corruption problem in the code. Have you tried running
>> the examples in our example directory to confirm that the installation is
>> good?
>>
>> Also, check to ensure that your LD_LIBRARY_PATH is correctly set to pick up
>> the OMPI libs you installed - most Linux distros come with an older version,
>> and that can cause problems if you inadvertently pick them up.
>>
>> On Jan 19, 2014, at 5:51 AM, Fischer, Greg A. wrote:
>>
>> Hello,
>>
>> I have a simple, 1-process test case that gets stuck on the mpi_finalize
>> call. The test case is a dead-simple calculation of pi - 50 lines of
>> Fortran. The process gradually consumes more and more memory until the
>> system becomes unresponsive and needs to be rebooted, unless the job is
>> killed first.
>>
>> In the output, attached, I see the warning message about OpenFabrics being
>> configured to only allow registering part of physical memory. I've tried to
>> chase this down with my administrator to no avail yet. (I am aware of the
>> relevant FAQ entry.) A different installation of MPI on the same system,
>> made with a different compiler, does not produce the OpenFabrics memory
>> registration warning - which seems strange because I thought it was a
>> system configuration issue independent of MPI. Also curious in the output
>> is that LSF seems to think there are 7 processes and 11 threads associated
>> with this job.
>>
>> The particulars of my configuration are attached and detailed below. Does
>> anyone see anything potentially problematic?
>>
>> Thanks,
>> Greg
>>
>> OpenMPI Version: 1.6.5
>> Compiler: GCC 4.6.1
>> OS: SuSE Linux Enterprise Server 10, Patchlevel 2
>>
>> uname -a : Linux lxlogin2 2.6.16.60-0.21-smp #1 SMP Tue May 6 12:41:02
>> UTC 2008 x86_64 x86_64 x86_64 GNU/Linux
>>
>> LD_LIBRARY_PATH=/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5/lib:
>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/gcc-4.6.1/lib64:
>> /tools/lsf/7.0.6.EC/7.0/linux2.6-glibc2.3-x86_64/lib
>>
>> PATH=/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/python-2.7.6/bin:
>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi
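A minimal C test along the lines Jeff suggests (a sketch in the spirit of the
hello_c.c example shipped in Open MPI's examples directory, not the exact
file) looks like this:

/*
 * Minimal MPI C test: initialize, print the rank, finalize.
 * Enough to see whether MPI_Finalize hangs independently of Fortran.
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world, I am %d of %d\n", rank, size);
    MPI_Finalize();   /* the hang reported in this thread occurred here */
    return 0;
}

Built with "mpicc hello.c -o hello" and run with "mpirun -np 1 ./hello", it
should print one line and exit; if MPI_Finalize still hangs here, the Fortran
bindings can be ruled out.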
Re: [OMPI users] problem with rankfile in openmpi-1.7.4rc2r30323
Hard to know how to address all that, Siegmar, but I'll give it a shot. See below.

On Jan 22, 2014, at 5:34 AM, Siegmar Gross wrote:

> Hi,
>
> yesterday I installed openmpi-1.7.4rc2r30323 on our machines
> ("Solaris 10 x86_64", "Solaris 10 Sparc", and "openSUSE Linux
> 12.1 x86_64" with Sun C 5.12). My rankfile "rf_linpc_sunpc_tyr"
> contains the following lines.
>
> rank 0=linpc0 slot=0:0-1;1:0-1
> rank 1=linpc1 slot=0:0-1
> rank 2=sunpc1 slot=1:0
> rank 3=tyr slot=1:0
>
> I get no output when I run the following command.
>
> mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
>
> "dbx" reports the following problem.
>
> /opt/solstudio12.3/bin/sparcv9/dbx \
>   /usr/local/openmpi-1.7.4_64_cc/bin/mpiexec
> For information about new features see `help changes'
> To remove this message, put `dbxenv suppress_startup_message
> 7.9' in your .dbxrc
> Reading mpiexec
> Reading ld.so.1
> ...
> Reading libmd.so.1
> (dbx) run -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
> Running: mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
> (process id 22337)
> Reading libc_psr.so.1
> ...
> Reading mca_dfs_test.so
>
> execution completed, exit code is 1
> (dbx) check -all
> access checking - ON
> memuse checking - ON
> (dbx) run -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
> Running: mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc_tyr hostname
> (process id 22344)
> Reading rtcapihook.so
> ...
> RTC: Running program...
> Read from uninitialized (rui) on thread 1:
> Attempting to read 1 byte at address 0x7fffbf8b
>   which is 459 bytes above the current stack pointer
> Variable is 'cwd'
> t@1 (l@1) stopped in opal_getcwd at line 65 in file "opal_getcwd.c"
>    65       if (0 != strcmp(pwd, cwd)) {
> (dbx) quit
>

This looks like a bogus issue to me. Are you able to run something *without*
a rankfile? In other words, is it rankfile operation that is causing a
problem, or are you unable to run anything on Sparc?

>
> Rankfiles work "fine" on x86_64 architectures. Contents of my rankfile.
>
> rank 0=linpc1 slot=0:0-1;1:0-1
> rank 1=sunpc1 slot=0:0-1
> rank 2=sunpc1 slot=1:0
> rank 3=sunpc1 slot=1:1
>
> mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc hostname
> [sunpc1:13489] MCW rank 1 bound to socket 0[core 0[hwt 0]],
>   socket 0[core 1[hwt 0]]: [B/B][./.]
> [sunpc1:13489] MCW rank 2 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
> [sunpc1:13489] MCW rank 3 bound to socket 1[core 3[hwt 0]]: [./.][./B]
> sunpc1
> sunpc1
> sunpc1
> [linpc1:29997] MCW rank 0 is not bound (or bound to all available
> processors)
> linpc1
>
> Unfortunately, "dbx" nevertheless reports a problem.
>
> /opt/solstudio12.3/bin/amd64/dbx \
>   /usr/local/openmpi-1.7.4_64_cc/bin/mpiexec
> For information about new features see `help changes'
> To remove this message, put `dbxenv suppress_startup_message 7.9'
> in your .dbxrc
> Reading mpiexec
> Reading ld.so.1
> ...
> Reading libmd.so.1
> (dbx) run -report-bindings -np 4 -rf rf_linpc_sunpc hostname
> Running: mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc hostname
> (process id 18330)
> Reading mca_shmem_mmap.so
> ...
> Reading mca_dfs_test.so
> [sunpc1:18330] MCW rank 1 bound to socket 0[core 0[hwt 0]],
>   socket 0[core 1[hwt 0]]: [B/B][./.]
> [sunpc1:18330] MCW rank 2 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
> [sunpc1:18330] MCW rank 3 bound to socket 1[core 3[hwt 0]]: [./.][./B]
> sunpc1
> sunpc1
> sunpc1
> [linpc1:30148] MCW rank 0 is not bound (or bound to all available
> processors)
> linpc1
>
> execution completed, exit code is 0
> (dbx) check -all
> access checking - ON
> memuse checking - ON
> (dbx) run -report-bindings -np 4 -rf rf_linpc_sunpc hostname
> Running: mpiexec -report-bindings -np 4 -rf rf_linpc_sunpc hostname
> (process id 18340)
> Reading rtcapihook.so
> ...
>
> RTC: Running program...
> Reading disasm.so
> Read from uninitialized (rui) on thread 1:
> Attempting to read 1 byte at address 0x436d57
>   which is 15 bytes into a heap block of size 16 bytes at 0x436d48
> This block was allocated from:
>   [1] vasprintf() at 0xfd7fdc9b335a
>   [2] asprintf() at 0xfd7fdc9b3452
>   [3] opal_output_init() at line 184 in "output.c"
>   [4] do_open() at line 548 in "output.c"
>   [5] opal_output_open() at line 219 in "output.c"
>   [6] opal_malloc_init() at line 68 in "malloc.c"
>   [7] opal_init_util() at line 250 in "opal_init.c"
>   [8] orterun() at line 658 in "orterun.c"
>
> t@1 (l@1) stopped in do_open at line 638 in file "output.c"
>   638       info[i].ldi_prefix = strdup(lds->lds_prefix);
> (dbx)
>

Again, I think dbx is just getting lost.

>
> I can also manually bind threads on our Sun M4000 server (two quad-core
> Sparc VII processors with two hwthreads each).
>
> mpiexec --report-bindings -np 4 --bind-to hwthread hostname
> [rs0.informatik.hs-fulda.de:09531] MCW rank 1 bound to
> socket 0[core 1[hwt 0]]: [../B./
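A quick sanity check along the lines Ralph asks about, leaving the rankfile out
entirely and just launching on two of the hosts (the host list here is only
illustrative), would be:

mpiexec -report-bindings -np 2 -host sunpc1,tyr hostname

If that also produces no output on Sparc, the problem is not rankfile-specific.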
[OMPI users] default num_procs of round_robin_mapper with cpus-per-proc option
Hi Ralph,

I want to ask you one more thing about the default setting of num_procs when
we don't specify the -np option and we set cpus-per-proc > 1.

In this case, the round_robin_mapper sets num_procs = num_slots, as below:

rmaps_rr.c:
130    if (0 == app->num_procs) {
131        /* set the num_procs to equal the number of slots on these mapped nodes */
132        app->num_procs = num_slots;
133    }

However, because cpus_per_rank > 1, this num_procs will be refused at line 61
in rmaps_rr_mappers.c, as below, unless we switch on the oversubscribe
directive.

rmaps_rr_mappers.c:
61    if (num_slots < ((int)app->num_procs * orte_rmaps_base.cpus_per_rank)) {
62        if (ORTE_MAPPING_NO_OVERSUBSCRIBE & ORTE_GET_MAPPING_DIRECTIVE(jdata->map->mapping)) {
63            orte_show_help("help-orte-rmaps-base.txt", "orte-rmaps-base:alloc-error",
64                           true, app->num_procs, app->app);
65            return ORTE_ERR_SILENT;
66        }
67    }

Therefore, I think the default num_procs should be equal to num_slots divided
by cpus_per_rank:

app->num_procs = num_slots / orte_rmaps_base.cpus_per_rank;

This would be more convenient for most people who want to use the
-cpus-per-proc option. I already confirmed it works well. Please consider
applying this fix to 1.7.4.

Regards,
Tetsuya Mishima
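A sketch of how the proposed default could look in rmaps_rr.c (a hypothetical
patch based only on the code quoted above, with an extra guard so num_procs
cannot become zero when cpus-per-proc exceeds the slot count):

    if (0 == app->num_procs) {
        /* set num_procs to the number of slots on the mapped nodes, divided
         * by the cpus assigned to each rank, so the default launch does not
         * trip the oversubscription check in the rr mappers */
        app->num_procs = num_slots / orte_rmaps_base.cpus_per_rank;
        if (0 == app->num_procs) {
            app->num_procs = 1;
        }
    }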
Re: [OMPI users] default num_procs of round_robin_mapper with cpus-per-proc option
Seems like a reasonable, minimal-risk request - will do.

On Jan 22, 2014, at 4:28 PM, tmish...@jcity.maeda.co.jp wrote:

> Hi Ralph,
>
> I want to ask you one more thing about the default setting of num_procs when
> we don't specify the -np option and we set cpus-per-proc > 1.
>
> In this case, the round_robin_mapper sets num_procs = num_slots, as below:
>
> rmaps_rr.c:
> 130    if (0 == app->num_procs) {
> 131        /* set the num_procs to equal the number of slots on these mapped nodes */
> 132        app->num_procs = num_slots;
> 133    }
>
> However, because cpus_per_rank > 1, this num_procs will be refused at line 61
> in rmaps_rr_mappers.c, as below, unless we switch on the oversubscribe
> directive.
>
> rmaps_rr_mappers.c:
> 61    if (num_slots < ((int)app->num_procs * orte_rmaps_base.cpus_per_rank)) {
> 62        if (ORTE_MAPPING_NO_OVERSUBSCRIBE & ORTE_GET_MAPPING_DIRECTIVE(jdata->map->mapping)) {
> 63            orte_show_help("help-orte-rmaps-base.txt", "orte-rmaps-base:alloc-error",
> 64                           true, app->num_procs, app->app);
> 65            return ORTE_ERR_SILENT;
> 66        }
> 67    }
>
> Therefore, I think the default num_procs should be equal to num_slots divided
> by cpus_per_rank:
>
> app->num_procs = num_slots / orte_rmaps_base.cpus_per_rank;
>
> This would be more convenient for most people who want to use the
> -cpus-per-proc option. I already confirmed it works well. Please consider
> applying this fix to 1.7.4.
>
> Regards,
> Tetsuya Mishima
Re: [OMPI users] default num_procs of round_robin_mapper with cpus-per-proc option
Thanks, Ralph. I have one more question. I'm sorry to ask you so many things ...

Could you tell me the difference between "map-by slot" and "map-by core"?
From my understanding, slot is a synonym of core, but their behaviors in
openmpi-1.7.4rc2 with the cpus-per-proc option are quite different, as shown
below. I tried to browse the source code, but I could not make it clear so far.

Regards,
Tetsuya Mishima

[un-managed environment] (node05 and node06 have 8 cores each)

[mishima@manage work]$ cat pbs_hosts
node05
node05
node05
node05
node05
node05
node05
node05
node06
node06
node06
node06
node06
node06
node06
node06

[mishima@manage work]$ mpirun -np 4 -hostfile pbs_hosts -report-bindings \
    -cpus-per-proc 4 -map-by slot ~/mis/openmpi/demos/myprog
[node05.cluster:23949] MCW rank 1 bound to socket 1[core 4[hwt 0]],
  socket 1[core 5[hwt 0]], socket 1[core 6[hwt 0]],
  socket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
[node05.cluster:23949] MCW rank 0 bound to socket 0[core 0[hwt 0]],
  socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]],
  socket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
[node06.cluster:22139] MCW rank 3 bound to socket 1[core 4[hwt 0]],
  socket 1[core 5[hwt 0]], socket 1[core 6[hwt 0]],
  socket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
[node06.cluster:22139] MCW rank 2 bound to socket 0[core 0[hwt 0]],
  socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]],
  socket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
Hello world from process 0 of 4
Hello world from process 1 of 4
Hello world from process 3 of 4
Hello world from process 2 of 4

[mishima@manage work]$ mpirun -np 4 -hostfile pbs_hosts -report-bindings \
    -cpus-per-proc 4 -map-by core ~/mis/openmpi/demos/myprog
[node05.cluster:23985] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./.][./././.]
[node05.cluster:23985] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.][./././.]
[node06.cluster:22175] MCW rank 3 bound to socket 0[core 1[hwt 0]]: [./B/./.][./././.]
[node06.cluster:22175] MCW rank 2 bound to socket 0[core 0[hwt 0]]: [B/././.][./././.]
Hello world from process 2 of 4
Hello world from process 3 of 4
Hello world from process 0 of 4
Hello world from process 1 of 4

(Note) I see the same behavior in a managed environment under Torque.

> Seems like a reasonable, minimal-risk request - will do.
>
> On Jan 22, 2014, at 4:28 PM, tmish...@jcity.maeda.co.jp wrote:
>
> > Hi Ralph,
> >
> > I want to ask you one more thing about the default setting of num_procs
> > when we don't specify the -np option and we set cpus-per-proc > 1.
> >
> > In this case, the round_robin_mapper sets num_procs = num_slots, as below:
> >
> > rmaps_rr.c:
> > 130    if (0 == app->num_procs) {
> > 131        /* set the num_procs to equal the number of slots on these mapped nodes */
> > 132        app->num_procs = num_slots;
> > 133    }
> >
> > However, because cpus_per_rank > 1, this num_procs will be refused at
> > line 61 in rmaps_rr_mappers.c, as below, unless we switch on the
> > oversubscribe directive.
> >
> > rmaps_rr_mappers.c:
> > 61    if (num_slots < ((int)app->num_procs * orte_rmaps_base.cpus_per_rank)) {
> > 62        if (ORTE_MAPPING_NO_OVERSUBSCRIBE & ORTE_GET_MAPPING_DIRECTIVE(jdata->map->mapping)) {
> > 63            orte_show_help("help-orte-rmaps-base.txt", "orte-rmaps-base:alloc-error",
> > 64                           true, app->num_procs, app->app);
> > 65            return ORTE_ERR_SILENT;
> > 66        }
> > 67    }
> >
> > Therefore, I think the default num_procs should be equal to num_slots
> > divided by cpus_per_rank:
> >
> > app->num_procs = num_slots / orte_rmaps_base.cpus_per_rank;
> >
> > This would be more convenient for most people who want to use the
> > -cpus-per-proc option. I already confirmed it works well. Please consider
> > applying this fix to 1.7.4.
> >
> > Regards,
> > Tetsuya Mishima
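One way to compare the two policies above before the binding step is to add
--display-map to the otherwise identical command lines, e.g. (reusing the
hostfile and program from the runs quoted at the top of this message):

mpirun -np 4 -hostfile pbs_hosts -display-map -report-bindings \
    -cpus-per-proc 4 -map-by core ~/mis/openmpi/demos/myprog

The printed job map shows how many procs each node receives under each
mapping policy, independently of how they are then bound.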