Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/ openmpi-1.2.4

2007-12-04 Thread Åke Sandgren
On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote:
> Hello,
> 
> After compiling ompi-1.2.4 with the intel compiler suite 10.1.008, I get
> 
> ->mpicxx --showme
> Segmentation fault
> 
> ->ompi_info
> Segmentation fault
> 
> The 10.1.008 is the only one I know that officially supports the linux
> kernel 2.6 and glibc-2.6 that I have on my system.
> 
> config.log file attached.
> 
> Any help appreciated.

Run an nm on opal/mca/memory/ptmalloc2/.libs/malloc.o and check if
malloc is defined in there.

This seems to be the problem I have when compiling with pathscale.
It removes the malloc (public_mALLOc) function from the object file but
leaves the free (public_fREe) in there, resulting in malloc/free
mismatch.



Re: [OMPI users] large number of processes

2007-12-04 Thread hbtcx243
> Hi:
> I managed to run a 256 process job on a single node. I ran a simple test
> in which all processes send a message to all others.
> This was using Sun's Binary Distribution of Open MPI on Solaris which is
> based on r16572 of the 1.2 branch. The machine had 8 cores.
>
> burl-ct-v40z-0 49 =>/opt/SUNWhpc/HPC7.1/bin/mpirun --mca
> mpool_sm_max_size 2147483647 -np 256 connectivity_c
> Connectivity test on 256 processes PASSED.
> burl-ct-v40z-0 50 =>
> burl-ct-v40z-0 50 =>/opt/SUNWhpc/HPC7.1/bin/mpirun --mca
> mpool_sm_max_size 2147483647 -np 300 connectivity_c -v
> Connectivity test on 300 processes PASSED.
>
> burl-ct-v40z-0 54 =>limit
> cputime unlimited
> filesize unlimited
> datasize unlimited
> stacksize 10240 kbytes
> coredumpsize 0 kbytes
> vmemoryuse unlimited
> descriptors 65536
> burl-ct-v40z-0 55 =>

Thank you for the Solaris results.
I compared your environment with the user limits on our cluster.
Our limit on open files appears to have been too small even for running 256
processes. After we increased the limit,
I was able to run 256 processes per node.

SUSUKITA, Ryutaro
Peta-scale System Interconnect Project
Fukuoka Industry, Science & Technology Foundation
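
A minimal sketch, assuming a POSIX system, of how a program could inspect its open-file limit (the "descriptors" line in the limit output quoted above) and raise the soft limit up to the hard limit. The function names are made up for illustration. The thread does not say which process ran out of descriptors or how the limit was actually raised on the cluster.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_NOFILE (POSIX)

// Return the current soft limit on open file descriptors, or 0 on error.
// (Hypothetical helper name, not part of Open MPI.)
inline unsigned long long soft_fd_limit() {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) return 0;
    return static_cast<unsigned long long>(rl.rlim_cur);
}

// Try to raise the soft limit to the hard limit for this process.
// Returns true on success. Child processes inherit the new limit.
inline bool raise_fd_limit_to_hard() {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) return false;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &rl) == 0;
}
```

Raising the hard limit itself normally needs administrator privileges, so this only helps when the hard limit is already large enough.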


Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/ openmpi-1.2.4

2007-12-04 Thread Åke Sandgren
On Tue, 2007-12-04 at 09:33 +0100, Åke Sandgren wrote:
> On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote:
> > Hello,
> > 
> > After compiling ompi-1.2.4 with the intel compiler suite 10.1.008, I get
> > 
> > ->mpicxx --showme
> > Segmentation fault
> > 
> > ->ompi_info
> > Segmentation fault
> > 
> > The 10.1.008 is the only one I know that officially supports the linux
> > kernel 2.6 and glibc-2.6 that I have on my system.
> > 
> > config.log file attached.
> > 
> > Any help appreciated.
> 
> Run an nm on opal/mca/memory/ptmalloc2/.libs/malloc.o and check if
> malloc is defined in there.
> 
> This seems to be the problem I have when compiling with pathscale.
> It removes the malloc (public_mALLOc) function from the object file but
> leaves the free (public_fREe) in there, resulting in malloc/free
> mismatch.

For pathscale the solution for me was to add -fno-builtin.
Now ompi_info doesn't segfault anymore.

Check if the intel 10 has something similar.

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



[OMPI users] arch question: long running app

2007-12-04 Thread doktora v
Hi, although I did my due diligence on searching for this question,
I apologise if this is a repeat.
From an architectural point of view, does it make sense to use MPI in the
following scenario (for the purposes of resilience as much as
parallelization):

Each process is a long-running process (runs non-interrupted for weeks,
months or even years) that collects and crunches some streaming data, for
example temperature readings, and the data is replicated to R nodes.

Because this is a departure from the normal modus operandi (i.e. all data is
immediately available), are there any obvious MPI issues that I am not
considering in designing such an application?

Here is a more detailed description of the app:

A master receives the data and dispatches it according to some function such
that each tuple is replicated R times to R of the N nodes (with R<=N).
Suppose that there are K regions from which temperature readings stream in
in the form of <K,T>, where K is the region id and T is the temperature
reading. The master sends <K,T> to R of the N nodes. These nodes maintain a
long-term state of, say, the min/max readings. If R=N=2, the system is
basically duplicated and if one of the two nodes dies inadvertently, the
other one still has accounted for all the data.

Here is some pseudo-code:

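int main(argc, argv)
{
  int N=10, R=3, K=200;

  Init(argc,argv);

  int rank=COMM_WORLD.Get_rank();
  if(rank==0) {
    // master: ranks 1..N-1 are the workers, so never send to rank 0
    int lastnode = 0;
    while(read <K,T> from socket)
      for(i in 0:R) COMM_WORLD.Send(<K,T>,1,tuple,1+(lastnode++%(N-1)),tag);
  } else {
    // worker: keep receiving for the lifetime of the job
    while(true) {
      COMM_WORLD.Recv(<K,T>,1,tuple,any,tag,Info);
      process_message();
    }
  }
}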

Many thanks for your time!
Regards
Dok
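
The dispatch rule described above can be sketched in plain C++ without any MPI calls, assuming round-robin placement. Reading, MinMax, pick_replicas, dispatch and merge_region are hypothetical names, and the vector of worker states stands in for the N receiving ranks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <vector>

struct Reading { int region; double temp; };  // the <K,T> tuple

struct MinMax {
    double lo = std::numeric_limits<double>::infinity();
    double hi = -std::numeric_limits<double>::infinity();
    void update(double t) { lo = std::min(lo, t); hi = std::max(hi, t); }
};

// Long-term state held by one worker: min/max per region id.
using WorkerState = std::map<int, MinMax>;

// Pick R distinct workers out of N (indices 0..N-1), starting at `next`.
// Advances `next` so successive tuples start at different workers.
inline std::vector<int> pick_replicas(int N, int R, int& next) {
    assert(R >= 1 && R <= N);
    std::vector<int> out;
    for (int i = 0; i < R; ++i) out.push_back((next + i) % N);
    next = (next + 1) % N;
    return out;
}

// Deliver one reading to R of the N workers (stands in for R sends).
inline void dispatch(std::vector<WorkerState>& workers, int R,
                     const Reading& r, int& next) {
    const int N = static_cast<int>(workers.size());
    for (int w : pick_replicas(N, R, next)) workers[w][r.region].update(r.temp);
}

// Combine the min/max of one region over the workers that are still alive.
// Returns +inf/-inf if no surviving worker has seen that region.
inline MinMax merge_region(const std::vector<WorkerState>& workers,
                           const std::vector<bool>& alive, int region) {
    MinMax m;
    for (std::size_t w = 0; w < workers.size(); ++w) {
        if (!alive[w]) continue;
        auto it = workers[w].find(region);
        if (it != workers[w].end()) {
            m.update(it->second.lo);
            m.update(it->second.hi);
        }
    }
    return m;
}
```

Because each reading goes to R distinct workers, the min and max merged over the survivors still cover every reading as long as fewer than R workers are lost. With R=N every worker holds the full state, which is the duplicated case described in the message.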


Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/openmpi-1.2.4

2007-12-04 Thread de Almeida, Valmor F.
> On Tue, 2007-12-04 at 09:33 +0100, Åke Sandgren wrote:
> > On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote:
> >
> > Run an nm on opal/mca/memory/ptmalloc2/.libs/malloc.o and check if
> > malloc is defined in there.
> >
> > This seems to be the problem I have when compiling with pathscale.
> > It removes the malloc (public_mALLOc) function from the object file but
> > leaves the free (public_fREe) in there, resulting in malloc/free
> > mismatch.
> 
> For pathscale the solution for me was to add -fno-builtin.
> Now ompi_info doesn't segfault anymore.
> 
> Check if the intel 10 has something similar.

Below is the nm output. The -fno-builtin option you mentioned above
seems to belong to gcc. I have compiled openmpi-1.2.4 with the gcc-4.1.2 suite
without problems.

Thanks,

--
Valmor

->nm ./opal/mca/memory/ptmalloc2/.libs/malloc.o | grep -i "malloc"
3faa T __malloc_check_init
0004 D __malloc_hook
08b8 B __malloc_initialize_hook
0010 D __malloc_initialized
3944 t _int_icomalloc
0c36 T _int_malloc
08b0 b disallow_malloc_check
3742 T independent_comalloc
06a8 T malloc
 t malloc_atfork
2224 t malloc_check
1184 t malloc_consolidate
2c00 T malloc_get_state
0410 t malloc_hook_ini
44f2 t malloc_init_state
2d84 T malloc_set_state
2b5e t malloc_starter
3ad4 T malloc_trim
3b5e T malloc_usable_size
 U opal_mem_free_ptmalloc2_mmap
 U opal_mem_free_ptmalloc2_munmap
17d6 t opal_mem_free_ptmalloc2_sbrk
4032 t ptmalloc_init
44b0 t ptmalloc_init_minimal
01dc t ptmalloc_lock_all
02f6 t ptmalloc_unlock_all
036a t ptmalloc_unlock_all2
129e t sYSMALLOc
0870 b save_malloc_hook
08ac b using_malloc_checking
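
The listing above was filtered with grep for "malloc", so it shows that malloc is defined (type T) but says nothing about free. Below is a small sketch of the same check done programmatically over unfiltered nm output. nm_defines is a hypothetical helper, and it assumes the default nm layout of address, type letter and name, with U marking undefined symbols.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Return true if `nm_output` has a line that defines `symbol`, i.e. the last
// field is exactly `symbol` and the type letter before it is not "U".
// Taking the last two fields also copes with lines whose address is missing.
inline bool nm_defines(const std::string& nm_output, const std::string& symbol) {
    assert(!symbol.empty());
    std::istringstream lines(nm_output);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::vector<std::string> tok;
        for (std::string f; fields >> f;) tok.push_back(f);
        if (tok.size() < 2) continue;  // blank or malformed line
        const std::string& name = tok.back();
        const std::string& type = tok[tok.size() - 2];
        if (name == symbol && type != "U") return true;
    }
    return false;
}
```

Calling it once for malloc and once for free on the full nm output of malloc.o would show whether one of the pair was dropped, which is the mismatch described for pathscale earlier in the thread.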




Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/openmpi-1.2.4

2007-12-04 Thread Åke Sandgren
On Tue, 2007-12-04 at 15:28 -0500, de Almeida, Valmor F. wrote:
> > On Tue, 2007-12-04 at 09:33 +0100, Åke Sandgren wrote:
> > > On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote:
> > >
> > > Run an nm on opal/mca/memory/ptmalloc2/.libs/malloc.o and check if
> > > malloc is defined in there.
> > >
> > > This seems to be the problem I have when compiling with pathscale.
> > > It removes the malloc (public_mALLOc) function from the object file but
> > > leaves the free (public_fREe) in there, resulting in malloc/free
> > > mismatch.
> > 
> > For pathscale the solution for me was to add -fno-builtin.
> > Now ompi_info doesn't segfault anymore.
> > 
> > Check if the intel 10 has something similar.
> 
> Below is the nm output. The no builtin compiler option you mentioned above 
> seems to belong to gcc. I have compiled openmpi-1.2.4 with the gcc-4.1.2 
> suite without problems.

OK, it was a long shot anyway.

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



[OMPI users] suggested intel compiler version for openmpi-1.2.4

2007-12-04 Thread de Almeida, Valmor F.

Hello,

What is the suggested intel compiler version to compile openmpi-1.2.4?

I tried versions 10.1.008 and 9.1.052 and had no luck getting a working
library. In both cases I get:

->mpic++ --showme
Segmentation fault

->ompi_info 
Segmentation fault

Thanks for your help.

--
Valmor de Almeida







Re: [OMPI users] suggested intel compiler version for openmpi-1.2.4

2007-12-04 Thread Jeff Squyres
I have compiled Open MPI with Intel 10.0 and 9.1 with no problems on
RHEL4U4.

Can you send all the information listed at
http://www.open-mpi.org/community/help/ that you can collect (obviously,
ompi_info won't run)?




On Dec 4, 2007, at 4:26 PM, de Almeida, Valmor F. wrote:



> Hello,
>
> What is the suggested intel compiler version to compile openmpi-1.2.4?
>
> I tried versions 10.1.008 and 9.1.052 and no luck in getting a working
> library. In both cases I get:
>
> ->mpic++ --showme
> Segmentation fault
>
> ->ompi_info
> Segmentation fault
>
> Thanks for your help.
>
> --
> Valmor de Almeida







--
Jeff Squyres
Cisco Systems