Re: [OMPI users] segmentation fault for slot-list and openmpi-1.10.3rc2
Hi Ralph and Gilles,

it's strange that the program works with "--host" and "--slot-list" in your environment and not in mine. I get the following output if I run the program in gdb without a breakpoint.

loki spawn 142 gdb /usr/local/openmpi-1.10.3_64_gcc/bin/mpiexec
GNU gdb (GDB; SUSE Linux Enterprise 12) 7.9.1
...
(gdb) set args -np 1 --host loki --slot-list 0:0-1,1:0-1 simple_spawn
(gdb) run
Starting program: /usr/local/openmpi-1.10.3_64_gcc/bin/mpiexec -np 1 --host loki --slot-list 0:0-1,1:0-1 simple_spawn
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Detaching after fork from child process 18031.
[pid 18031] starting up!
0 completed MPI_Init
Parent [pid 18031] about to spawn!
Detaching after fork from child process 18033.
Detaching after fork from child process 18034.
[pid 18033] starting up!
[pid 18034] starting up!
[loki:18034] *** Process received signal ***
[loki:18034] Signal: Segmentation fault (11)
...

I get a different output if I run the program in gdb with a breakpoint.

gdb /usr/local/openmpi-1.10.3_64_gcc/bin/mpiexec
(gdb) set args -np 1 --host loki --slot-list 0:0-1,1:0-1 simple_spawn
(gdb) set follow-fork-mode child
(gdb) break ompi_proc_self
(gdb) run
(gdb) next

Repeating "next" very often results in the following output.

...
Starting program: /home/fd1026/work/skripte/master/parallel/prog/mpi/spawn/simple_spawn
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[pid 13277] starting up!
[New Thread 0x742ef700 (LWP 13289)]

Breakpoint 1, ompi_proc_self (size=0x7fffc060)
    at ../../openmpi-1.10.3rc3/ompi/proc/proc.c:413
413         ompi_proc_t **procs = (ompi_proc_t**) malloc(sizeof(ompi_proc_t*));
(gdb) n
414         if (NULL == procs) {
(gdb)
423         OBJ_RETAIN(ompi_proc_local_proc);
(gdb)
424         *procs = ompi_proc_local_proc;
(gdb)
425         *size = 1;
(gdb)
426         return procs;
(gdb)
427     }
(gdb)
ompi_comm_init () at ../../openmpi-1.10.3rc3/ompi/communicator/comm_init.c:138
138         group->grp_my_rank = 0;
(gdb)
139         group->grp_proc_count= (int)size;
...
193         ompi_comm_reg_init();
(gdb)
196         ompi_comm_request_init ();
(gdb)
198         return OMPI_SUCCESS;
(gdb)
199     }
(gdb)
ompi_mpi_init (argc=0, argv=0x0, requested=0, provided=0x7fffc21c)
    at ../../openmpi-1.10.3rc3/ompi/runtime/ompi_mpi_init.c:738
738         if (OMPI_SUCCESS != (ret = ompi_file_init())) {
(gdb)
744         if (OMPI_SUCCESS != (ret = ompi_win_init())) {
(gdb)
750         if (OMPI_SUCCESS != (ret = ompi_attr_init())) {
...
988         ompi_mpi_initialized = true;
(gdb)
991         if (ompi_enable_timing && 0 == OMPI_PROC_MY_NAME->vpid) {
(gdb)
999         return MPI_SUCCESS;
(gdb)
1000    }
(gdb)
PMPI_Init (argc=0x0, argv=0x0) at pinit.c:94
94          if (MPI_SUCCESS != err) {
(gdb)
104         return MPI_SUCCESS;
(gdb)
105     }
(gdb)
0x00400d0c in main ()
(gdb)
Single stepping until exit from function main,
which has no line number information.
0 completed MPI_Init
Parent [pid 13277] about to spawn!
[New process 13472]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
process 13472 is executing new program: /usr/local/openmpi-1.10.3_64_gcc/bin/orted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New process 13474]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
process 13474 is executing new program: /home/fd1026/work/skripte/master/parallel/prog/mpi/spawn/simple_spawn
[pid 13475] starting up!
[pid 13476] starting up!
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[pid 13474] starting up!
[New Thread 0x7491b700 (LWP 13480)]
[Switching to Thread 0x77ff1740 (LWP 13474)]

Breakpoint 1, ompi_proc_self (size=0x7fffba30)
    at ../../openmpi-1.10.3rc3/ompi/proc/proc.c:413
413         ompi_proc_t **procs = (ompi_proc_t**) malloc(sizeof(ompi_proc_t*));
(gdb)
414         if (NULL == procs) {
...
426         return procs;
(gdb)
427     }
(gdb)
ompi_comm_init () at ../../openmpi-1.10.3rc3/ompi/communicator/comm_init.c:138
138         group->grp_my_rank = 0;
(gdb)
139         group->grp_proc_count= (int)size;
(gdb)
140         OMPI_GROUP_SET_INTRINSIC (group);
...
193         ompi_comm_reg_init();
(gdb)
196         ompi_comm_request_init ();
(gdb)
198         return OMPI_SUCCESS;
(gdb)
199     }
(gdb)
ompi_mpi_init (argc=0, argv=0x0, requested=0, provided=0x7fffbbec)
    at ../../openmpi-1.10.3rc3/ompi/runtime/ompi_mpi_init.c:738
738         if (OMPI_SUCCESS != (ret = ompi_file_init())) {
(gdb)
744         if (OMPI_SUCCESS != (ret = ompi_win_init())) {
(gdb)
750         if (OMPI_SUCCESS != (ret = ompi_attr_init())) {
...
863         if (OMPI_SUCCESS != (ret = ompi_
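For readers who don't have the test program at hand: simple_spawn is a small MPI_Comm_spawn test. A self-contained sketch along the same lines — an approximation for illustration, not the exact Open MPI test source; the printed messages merely mimic the output above — looks like this:

/* spawn_sketch.c: minimal MPI_Comm_spawn test (a hypothetical stand-in
 * for the simple_spawn test discussed in this thread).
 * Build: mpicc spawn_sketch.c -o spawn_sketch */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    MPI_Comm parent, intercomm;
    int rank;

    printf("[pid %ld] starting up!\n", (long) getpid());
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("%d completed MPI_Init\n", rank);

    MPI_Comm_get_parent(&parent);
    if (MPI_COMM_NULL == parent) {
        /* No parent communicator: we are the original process,
         * so spawn two more copies of ourselves. */
        printf("Parent [pid %ld] about to spawn!\n", (long) getpid());
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
                       MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);
    }
    /* Spawned children just initialize and finalize. */
    MPI_Finalize();
    return 0;
}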
Re: [OMPI users] segmentation fault for slot-list and openmpi-1.10.3rc2
I’m afraid I honestly can’t make any sense of it. It seems you at least have a simple workaround (use a hostfile instead of -host), yes?

> On May 26, 2016, at 5:48 AM, Siegmar Gross wrote:
> [...]
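For concreteness, the hostfile workaround would look something like the following; the slot count for loki is an assumption (use the node's real core count):

$ cat myhostfile
loki slots=4
$ mpiexec -np 1 --hostfile myhostfile --slot-list 0:0-1,1:0-1 simple_spawn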
Re: [OMPI users] users Digest, Vol 3510, Issue 2
Thank you all for your suggestions!

I found an answer to a similar case in the Open MPI FAQ (Question 15, "Running MPI jobs" on www.open-mpi.org), which suggests using mpirun's prefix command line option or using the mpirun wrapper.

I modified my command to the following:

mpirun --prefix /opt/openfoam30/platforms/linux64GccDPInt32Opt/lib/Openmpi-system -np 1 pimpleDyMFoam -case OF

But I got an error (see attached picture). Is the syntax correct? How can I solve the problem? That first method seems to be easier than using the mpirun wrapper.

Otherwise, how can I use the mpirun wrapper?

Regards,
islem

On Wednesday, May 25, 2016 at 16:40, Dave Love wrote:

I wrote:

> You could wrap one (set of) program(s) in a script to set the
> appropriate environment before invoking the real program.

I realize I should have said something like "program invocations", i.e. if you have no control over something invoking mpirun for programs using different MPIs, then an mpirun wrapper needs to check what it's being asked to run.
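A minimal sketch of the wrapper idea Dave describes, assuming the desired Open MPI installation lives under /opt/openmpi (that path is a placeholder, not taken from this thread):

#!/bin/sh
# mpirun wrapper: put the desired MPI implementation's tools and
# libraries first in the environment, then exec the real mpirun.
MPI_HOME=/opt/openmpi   # placeholder: adjust to your installation
export PATH="$MPI_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$MPI_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
exec "$MPI_HOME/bin/mpirun" "$@"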
Re: [OMPI users] users Digest, Vol 3510, Issue 2
You're still intermingling your Open MPI and MPICH installations. You need to ensure that you use the wrapper compilers and mpirun/mpiexec from the same MPI implementation. For example, if you use mpicc/mpifort from Open MPI to build your program, then you must use Open MPI's mpirun/mpiexec.

If you absolutely need to have both MPI implementations in your PATH / LD_LIBRARY_PATH, you might want to use absolute path names for mpicc/mpifort/mpirun/mpiexec.

> On May 26, 2016, at 3:46 PM, Megdich Islem wrote:
> [...]

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
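Concretely, the advice above amounts to something like this (the installation prefixes are placeholders):

# Build and run with the same implementation, via absolute paths:
/opt/openmpi-1.10/bin/mpicc my_prog.c -o my_prog
/opt/openmpi-1.10/bin/mpirun -np 4 ./my_prog

# Or, consistently, with MPICH:
/opt/mpich/bin/mpicc my_prog.c -o my_prog
/opt/mpich/bin/mpirun -np 4 ./my_prog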