Re: [OMPI users] MPI_Bcast issue

2010-08-10 Thread Terry Frankcombe
On Tue, 2010-08-10 at 19:09 -0700, Randolph Pullen wrote: > Jeff, thanks for the clarification. > What I am trying to do is run N concurrent copies of a 1 to N data > movement program to effect an N to N solution. I'm no MPI guru, nor do I completely understand what you are doing, but isn't this an

Re: [OMPI users] MPI_Bcast issue

2010-08-10 Thread Randolph Pullen
Jeff, thanks for the clarification. What I am trying to do is run N concurrent copies of a 1 to N data movement program to effect an N to N solution. The actual mechanism I am using is to spawn N copies of mpirun from PVM across the cluster. So each 1 to N MPI application starts at the same time

Re: [OMPI users] MPI_Allreduce on local machine

2010-08-10 Thread Gus Correa
Hi Jeff, Thank you for opening a ticket and taking care of this. Jeff Squyres wrote: On Jul 28, 2010, at 5:07 PM, Gus Correa wrote: Still, the alignment under Intel may or may not be right. And this may or may not explain the errors that Hugo has got. FYI, the ompi_info from my OpenMPI 1.3.2

Re: [OMPI users] openMPI shared with NFS, but says different version

2010-08-10 Thread Gus Correa
Thank you, Cristobal. That is good news. Gus Correa Cristobal Navarro wrote: I have good news. After updating to a newer kernel on the Ubuntu server nodes, sm is not a problem anymore for the Nehalem CPUs!!! My older kernel was Linux 2.6.32-22-server #36-Ubuntu SMP Thu Jun 3 20:38:33 UTC 2010 x

Re: [OMPI users] Hybrid OpenMPI / OpenMP run pins OpenMP threads to a single core

2010-08-10 Thread David Akin
Solved: the process-to-core pinning was due to affinity being set at the PSM layer, so I added -x IPATH_NO_CPUAFFINITY=1 to the mpirun command. Dave On Wed, Aug 4, 2010 at 12:13 PM, Eugene Loh wrote: > > David Akin wrote: > >> All, >> I'm trying to get the OpenMP portion of the code below to ru
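
For reference, a minimal hybrid MPI + OpenMP sketch (an illustration only, not the poster's code) that prints where each rank's OpenMP threads run; launched with something like mpirun -x IPATH_NO_CPUAFFINITY=1 -np 2 ./hybrid, the threads should no longer all be pinned to a single core:

#include <mpi.h>
#include <omp.h>
#include <cstdio>

int main(int argc, char **argv)
{
    int provided, rank;

    // Request funneled threading: only the main thread makes MPI calls.
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        // With PSM affinity active, all of these threads land on one core;
        // with IPATH_NO_CPUAFFINITY=1 exported they are free to spread out.
        printf("rank %d: OpenMP thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}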

Re: [OMPI users] MPI_Bcast issue

2010-08-10 Thread Jeff Squyres
+1 on Eugene's comment that I don't fully understand what you are trying to do. Can you send a short example code? Some random points: - Edgar already chimed in about how MPI-2 allows the use of intercommunicators with bcast. Open MPI is MPI-2.1 compliant, so you can use intercommunicators wi
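
For reference, a minimal sketch (an assumed illustration, not code from the thread) of MPI_Bcast over an intercommunicator: the world is split into two groups, an intercommunicator is built between them, the sending group's root passes MPI_ROOT, and the receiving group passes the root's rank within the remote group:

#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Group 0: world rank 0 alone; group 1: everyone else.
    int color = (world_rank == 0) ? 0 : 1;
    MPI_Comm local;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &local);

    // Build an intercommunicator between the two groups.
    // The remote leader is world rank 1 for group 0, world rank 0 for group 1.
    MPI_Comm inter;
    int remote_leader = (color == 0) ? 1 : 0;
    MPI_Intercomm_create(local, 0, MPI_COMM_WORLD, remote_leader, 42, &inter);

    int value = (world_rank == 0) ? 1234 : -1;
    if (color == 0) {
        // The root of the sending group passes MPI_ROOT; any other member
        // of that group would pass MPI_PROC_NULL.
        MPI_Bcast(&value, 1, MPI_INT, MPI_ROOT, inter);
    } else {
        // Receivers pass the rank of the root within the remote group (0 here).
        MPI_Bcast(&value, 1, MPI_INT, 0, inter);
        printf("world rank %d received %d\n", world_rank, value);
    }

    MPI_Comm_free(&inter);
    MPI_Comm_free(&local);
    MPI_Finalize();
    return 0;
}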

Re: [OMPI users] MPI Template Datatype?

2010-08-10 Thread Riccardo Murri
On Tue, Aug 10, 2010 at 9:49 PM, Alexandru Blidaru wrote: > Are the Boost.MPI send and recv functions as fast as the standard ones when > using Open-MPI? Boost.MPI is layered on top of plain MPI; it basically provides a mapping from complex and user-defined C++ data types to MPI datatypes. The ad
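A minimal Boost.MPI sketch (assumed here for illustration) of the send/recv calls in question: for a plain type such as double, Boost.MPI maps directly onto the corresponding MPI datatype, while containers and user-defined types go through Boost.Serialization before being sent:

#include <boost/mpi.hpp>
#include <boost/serialization/vector.hpp>
#include <iostream>
#include <vector>

namespace mpi = boost::mpi;

int main(int argc, char *argv[])
{
    mpi::environment env(argc, argv);   // wraps MPI_Init / MPI_Finalize
    mpi::communicator world;            // wraps MPI_COMM_WORLD

    if (world.rank() == 0) {
        double x = 3.14;
        std::vector<int> v(100, 7);
        world.send(1, 0, x);   // plain type: mapped to MPI_DOUBLE directly
        world.send(1, 1, v);   // container: serialized, then sent
    } else if (world.rank() == 1) {
        double x;
        std::vector<int> v;
        world.recv(0, 0, x);
        world.recv(0, 1, v);
        std::cout << "got " << x << " and " << v.size() << " ints\n";
    }
    return 0;
}
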

Re: [OMPI users] MPI Template Datatype?

2010-08-10 Thread Alexandru Blidaru
Hi Riccardo, Are the Boost.MPI send and recv functions as fast as the standard ones when using Open-MPI? Best regards, Alexandru Blidaru University of Waterloo - Electrical Engineering '15 University email: asbli...@uwaterloo.ca Twitter handle: @G_raph Blog: http://alexblidaru.wordpress.com/

Re: [OMPI users] Checkpointing mpi4py program

2010-08-10 Thread ananda.mudar
Josh, Please find attached the Python program that reproduces the hang that I described. The initial part of this file describes the prerequisite modules and the steps to reproduce the problem. Please let me know if you have any questions about reproducing the hang. Please note that, if I add the foll

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-10 Thread Eloi Gaudry
Hi, sorry, I just forgot to add the values of the function parameters: (gdb) print reg->cbdata $1 = (void *) 0x0 (gdb) print openib_btl->super $2 = {btl_component = 0x2b341edd7380, btl_eager_limit = 12288, btl_rndv_eager_limit = 12288, btl_max_send_size = 65536, btl_rdma_pipeline_send_length = 1

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-10 Thread Eloi Gaudry
Hi, here is the backtrace from a core file generated during a segmentation fault observed during a collective call (using openib): #0 0x in ?? () (gdb) where #0 0x in ?? () #1 0x2aedbc4e05f4 in btl_openib_handle_incoming (openib_btl=0x1902f9b0, ep=0x1908a1c0, f

Re: [OMPI users] openib issues

2010-08-10 Thread Eloi Gaudry
Hi Mike, The HCA card is a Mellanox Technologies MT25418 (ConnectX IB DDR, PCIe 2.0 2.5GT/s, rev a0). I cannot post code/instructions on how to reproduce these errors, as they appeared randomly during some tests I performed to locate the origin of a segmentation fault during an MPI collective ca

Re: [OMPI users] openib issues

2010-08-10 Thread Mike Dubman
Hey Eloi, What HCA card do you have? Can you post code/instructions on how to reproduce it? 10x Mike On Mon, Aug 9, 2010 at 5:22 PM, Eloi Gaudry wrote: > Hi, > > Could someone have a look on these two different error messages ? I'd like > to know the reason(s) why they were displayed and their act

[OMPI users] SGE integration when getting slots from different queues on one and the same host mismatch

2010-08-10 Thread Reuti
Hi, I just stumbled upon the following behavior of Open MPI 1.4.2. Used jobscript: *** #!/bin/sh export PATH=~/local/openmpi-1.4.2/bin:$PATH cat $PE_HOSTFILE mpiexec ./dummy.sh *** with dummy.sh: *** #!/bin/sh env | grep TMPDIR sleep 30 *** === Situation 1: getting 4 slots in total from 2 queues