Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/ openmpi-1.2.4
On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote:
> Hello,
>
> After compiling ompi-1.2.4 with the intel compiler suite 10.1.008, I get
>
> ->mpicxx --showme
> Segmentation fault
>
> ->ompi_info
> Segmentation fault
>
> The 10.1.008 is the only one I know that officially supports the linux
> kernel 2.6 and glibc-2.6 that I have on my system.
>
> config.log file attached.
>
> Any help appreciated.

Run nm on opal/mca/memory/ptmalloc2/.libs/malloc.o and check whether malloc is defined in there.

This looks like the problem I have when compiling with PathScale: it removes the malloc (public_mALLOc) function from the object file but leaves the free (public_fREe) in there, resulting in a malloc/free mismatch.
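A minimal form of that check, run from the top of the openmpi-1.2.4 build tree (a sketch; the object path assumes the standard 1.2.x source layout named above):

  # List the symbols in the ptmalloc2 object and look for malloc and free.
  # The first command should show a defined malloc (type "T"); if only the
  # second returns a defined free, the compiler has stripped the allocator
  # and you get the malloc/free mismatch described above.
  nm opal/mca/memory/ptmalloc2/.libs/malloc.o | grep -iw malloc
  nm opal/mca/memory/ptmalloc2/.libs/malloc.o | grep -iw free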
Re: [OMPI users] large number of processes
> Hi:
> I managed to run a 256 process job on a single node. I ran a simple test
> in which all processes send a message to all others.
> This was using Sun's Binary Distribution of Open MPI on Solaris, which is
> based on r16572 of the 1.2 branch. The machine had 8 cores.
>
> burl-ct-v40z-0 49 =>/opt/SUNWhpc/HPC7.1/bin/mpirun --mca
> mpool_sm_max_size 2147483647 -np 256 connectivity_c
> Connectivity test on 256 processes PASSED.
> burl-ct-v40z-0 50 =>
> burl-ct-v40z-0 50 =>/opt/SUNWhpc/HPC7.1/bin/mpirun --mca
> mpool_sm_max_size 2147483647 -np 300 connectivity_c -v
> Connectivity test on 300 processes PASSED.
>
> burl-ct-v40z-0 54 =>limit
> cputime      unlimited
> filesize     unlimited
> datasize     unlimited
> stacksize    10240 kbytes
> coredumpsize 0 kbytes
> vmemoryuse   unlimited
> descriptors  65536
> burl-ct-v40z-0 55 =>

Thank you for the Solaris results. I compared your environment with the user limits on our cluster. The limit on open files seemed too small even for executing 256 processes. Once we increased the limit, I was able to execute 256 processes per node.

SUSUKITA, Ryutaro
Peta-scale System Interconnect Project
Fukuoka Industry, Science & Technology Foundation
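For reference, a sketch of how the per-process open-file limit can be checked and raised before launching a large per-node job (shell-specific; the exact value, and whether administrator changes are needed to raise the hard limit, depend on the cluster):

  # bash/sh: show and raise the descriptor limit for the current shell.
  # Each local MPI process needs several descriptors for its connections,
  # so 256 processes per node can exceed a small default limit.
  ulimit -n
  ulimit -n 65536

  # csh/tcsh equivalent (the resource name matches the "limit" output above)
  limit descriptors
  limit descriptors 65536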
Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/ openmpi-1.2.4
On Tue, 2007-12-04 at 09:33 +0100, Åke Sandgren wrote:
> On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote:
> > Hello,
> >
> > After compiling ompi-1.2.4 with the intel compiler suite 10.1.008, I get
> >
> > ->mpicxx --showme
> > Segmentation fault
> >
> > ->ompi_info
> > Segmentation fault
> >
> > The 10.1.008 is the only one I know that officially supports the linux
> > kernel 2.6 and glibc-2.6 that I have on my system.
> >
> > config.log file attached.
> >
> > Any help appreciated.
>
> Run nm on opal/mca/memory/ptmalloc2/.libs/malloc.o and check whether
> malloc is defined in there.
>
> This looks like the problem I have when compiling with PathScale: it
> removes the malloc (public_mALLOc) function from the object file but
> leaves the free (public_fREe) in there, resulting in a malloc/free
> mismatch.

For PathScale the solution for me was to add -fno-builtin. Now ompi_info doesn't segfault anymore.

Check whether Intel 10 has something similar.

--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134   Fax: +46 90 7866126
Mobile: +46 70 7716134   WWW: http://www.hpc2n.umu.se
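A sketch of how such a flag could be passed when rebuilding Open MPI with the Intel compilers; -fno-builtin is the spelling that worked for PathScale, and whether icc accepts the same option or needs a different one has to be checked against the Intel documentation. The install prefix below is only an example.

  ./configure CC=icc CXX=icpc F77=ifort FC=ifort \
              CFLAGS=-fno-builtin CXXFLAGS=-fno-builtin \
              --prefix=/opt/openmpi-1.2.4
  make all install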
[OMPI users] arch question: long running app
Hi,

Although I did my due diligence in searching for this question, I apologise if it is a repeat.

From an architectural point of view, does it make sense to use MPI in the following scenario (for the purposes of resilience as much as parallelization)?

Each process is a long-running process (it runs uninterrupted for weeks, months or even years) that collects and crunches some streaming data, for example temperature readings, and the data is replicated to R nodes. Because this is a departure from the normal modus operandi (i.e. all data is immediately available), are there any obvious MPI issues that I am not considering in designing such an application?

Here is a more detailed description of the app: a master receives the data and dispatches it according to some function such that each tuple is replicated R times to R of the N nodes (with R<=N). Suppose that there are K regions from which temperature readings stream in in the form of (K, T), where K is the region id and T is the temperature reading. The master sends (K, T) to R of the N nodes. These nodes maintain a long-term state of, say, the min/max readings. If R=N=2, the system is basically duplicated, and if one of the two nodes dies inadvertently, the other one has still accounted for all the data.

Here is some pseudo-code:

  int main(int argc, char** argv)
  {
    const int N = 10, R = 3, K = 200;
    MPI::Init(argc, argv);
    int rank = MPI::COMM_WORLD.Get_rank();
    if (rank == 0) {                      // master: read the stream, replicate tuples
      int lastnode = 0;
      while (read_tuple_from_socket(&msg))
        for (int i = 0; i < R; ++i)       // send each tuple to R distinct workers
          MPI::COMM_WORLD.Send(&msg, 1, tuple, 1 + (lastnode++ % (N - 1)), tag);
    } else {                              // workers: accumulate the long-term state
      while (true) {
        MPI::COMM_WORLD.Recv(&msg, 1, tuple, MPI::ANY_SOURCE, tag, status);
        process_message(msg);
      }
    }
    MPI::Finalize();
  }

Many thanks for your time!

Regards
Dok
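As a usage sketch, a job like the one described above might be launched under Open MPI as follows (the hostfile and program names are placeholders); --bynode places consecutive ranks on different machines, so the R copies of each tuple are held on distinct hosts rather than on a single box:

  # hosts.txt lists one machine per line; rank 0 becomes the master,
  # ranks 1..N-1 become the replica workers.
  mpirun --hostfile hosts.txt --bynode -np 10 ./stream_collector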
Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/openmpi-1.2.4
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>
> On Tue, 2007-12-04 at 09:33 +0100, Åke Sandgren wrote:
> > On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote:
> >
> > Run nm on opal/mca/memory/ptmalloc2/.libs/malloc.o and check whether
> > malloc is defined in there.
> >
> > This looks like the problem I have when compiling with PathScale: it
> > removes the malloc (public_mALLOc) function from the object file but
> > leaves the free (public_fREe) in there, resulting in a malloc/free
> > mismatch.
>
> For PathScale the solution for me was to add -fno-builtin.
> Now ompi_info doesn't segfault anymore.
>
> Check whether Intel 10 has something similar.

Below is the nm output. The -fno-builtin compiler option you mentioned above seems to belong to gcc. I have compiled openmpi-1.2.4 with the gcc-4.1.2 suite without problems.

Thanks,

--
Valmor

->nm ./opal/mca/memory/ptmalloc2/.libs/malloc.o | grep -i "malloc"
3faa T __malloc_check_init
0004 D __malloc_hook
08b8 B __malloc_initialize_hook
0010 D __malloc_initialized
3944 t _int_icomalloc
0c36 T _int_malloc
08b0 b disallow_malloc_check
3742 T independent_comalloc
06a8 T malloc
     t malloc_atfork
2224 t malloc_check
1184 t malloc_consolidate
2c00 T malloc_get_state
0410 t malloc_hook_ini
44f2 t malloc_init_state
2d84 T malloc_set_state
2b5e t malloc_starter
3ad4 T malloc_trim
3b5e T malloc_usable_size
     U opal_mem_free_ptmalloc2_mmap
     U opal_mem_free_ptmalloc2_munmap
17d6 t opal_mem_free_ptmalloc2_sbrk
4032 t ptmalloc_init
44b0 t ptmalloc_init_minimal
01dc t ptmalloc_lock_all
02f6 t ptmalloc_unlock_all
036a t ptmalloc_unlock_all2
129e t sYSMALLOc
0870 b save_malloc_hook
08ac b using_malloc_checking
Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/openmpi-1.2.4
On Tue, 2007-12-04 at 15:28 -0500, de Almeida, Valmor F. wrote:
> > -----Original Message-----
> > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> >
> > On Tue, 2007-12-04 at 09:33 +0100, Åke Sandgren wrote:
> > > On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote:
> > >
> > > Run nm on opal/mca/memory/ptmalloc2/.libs/malloc.o and check whether
> > > malloc is defined in there.
> > >
> > > This looks like the problem I have when compiling with PathScale: it
> > > removes the malloc (public_mALLOc) function from the object file but
> > > leaves the free (public_fREe) in there, resulting in a malloc/free
> > > mismatch.
> >
> > For PathScale the solution for me was to add -fno-builtin.
> > Now ompi_info doesn't segfault anymore.
> >
> > Check whether Intel 10 has something similar.
>
> Below is the nm output. The -fno-builtin compiler option you mentioned
> above seems to belong to gcc. I have compiled openmpi-1.2.4 with the
> gcc-4.1.2 suite without problems.

Ok, it was a long shot anyway.

--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134   Fax: +46 90 7866126
Mobile: +46 70 7716134   WWW: http://www.hpc2n.umu.se
[OMPI users] suggested intel compiler version for openmpi-1.2.4
Hello,

What is the suggested Intel compiler version for compiling openmpi-1.2.4? I tried versions 10.1.008 and 9.1.052 and had no luck getting a working library. In both cases I get:

->mpic++ --showme
Segmentation fault

->ompi_info
Segmentation fault

Thanks for your help.

--
Valmor de Almeida
Re: [OMPI users] suggested intel compiler version for openmpi-1.2.4
I have compiled Open MPI with Intel 10.0 and 9.1 with no problems on RHEL4U4.

Can you send all the info that you can (obviously, ompi_info won't run) from http://www.open-mpi.org/community/help/ ?

On Dec 4, 2007, at 4:26 PM, de Almeida, Valmor F. wrote:

> Hello,
>
> What is the suggested Intel compiler version for compiling openmpi-1.2.4?
> I tried versions 10.1.008 and 9.1.052 and had no luck getting a working
> library. In both cases I get:
>
> ->mpic++ --showme
> Segmentation fault
>
> ->ompi_info
> Segmentation fault
>
> Thanks for your help.
>
> --
> Valmor de Almeida

--
Jeff Squyres
Cisco Systems