Re: [OMPI users] Finalize() does not return

2013-08-21 Thread Eloi Gaudry
>> Is there any other information I could provide that might be useful? > You might want to audit the code and ensure that you have no pending > communications that haven't finished -- check all your sends and receives, not > just in the code, but at run-time (e.g., use an MPI profiling tool to mat
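
A minimal sketch (not from the thread) of the kind of audit suggested above: every non-blocking send and receive is completed with MPI_Waitall before MPI_Finalize is called, so no communication is left pending at shutdown. The ring exchange and tag below are arbitrary.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* One non-blocking exchange with the neighbouring ranks (ring pattern). */
    int next = (rank + 1) % size;
    int prev = (rank + size - 1) % size;
    int sendbuf = rank, recvbuf = -1;
    MPI_Request reqs[2];

    MPI_Isend(&sendbuf, 1, MPI_INT, next, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(&recvbuf, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, &reqs[1]);

    /* The point of the audit: every request is completed before finalizing.
       A forgotten MPI_Wait/MPI_Waitall here is a classic reason for
       MPI_Finalize blocking on one or more ranks. */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    printf("rank %d received %d\n", rank, recvbuf);
    MPI_Finalize();
    return 0;
}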

Re: [OMPI users] sge tight integration leads to bad allocation

2012-04-20 Thread Eloi Gaudry
pi.org] On Behalf Of Reuti Sent: vendredi 20 avril 2012 15:20 To: Open MPI Users Subject: Re: [OMPI users] sge tight integration leads to bad allocation Am 20.04.2012 um 15:04 schrieb Eloi Gaudry: > > Hi Ralph, Reuti, > > I've just observed the same issue without specifying -np.

Re: [OMPI users] sge tight integration leads to bad allocation

2012-04-20 Thread Eloi Gaudry
Hi Ralph, Reuti, I've just observed the same issue without specifying -np. Please find attached the ps -elfax output from the computing nodes and some sge related information. Regards, Eloi -Original message- From: Ralph Castain Sent: Wed 04-11-2012 02:25 pm Subject: Re: [OMPI

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-10 Thread Eloi Gaudry
> This might be of interest to Reuti and you : it seems that we cannot > reproduce the problem anymore if we don't provide the "-np N" option on the > orterun command line. Of course, we need to launch a few other runs to be > really sure because the allocation error was not always observable. A

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-10 Thread Eloi Gaudry
f Of Ralph Castain Sent: mardi 10 avril 2012 16:43 To: Open MPI Users Subject: Re: [OMPI users] sge tight intregration leads to bad allocation Could well be a bug in OMPI - I can take a look, though it may be awhile before I get to it. Have you tried one of the 1.5 series releases? On Apr 10, 201

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-10 Thread Eloi Gaudry
Thx. This is the allocation which is also confirmed by the Open MPI output. [eg: ] exactly, but not the one used afterwards by openmpi - The application was compiled with the same version of Open MPI? [eg: ] yes, version 1.4.4 for all - Does the application start something on its own besides the

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-06 Thread Eloi Gaudry
> - Can you please post while it's running the relevant lines from: > ps -e f --cols=500 > (f w/o -) from both machines. > It's allocated between the nodes more like in a round-robin fashion. > [eg: ] I'll try to do this tomorrow, as soon as some slots become free. > Thanks for your feedback Reuti

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-05 Thread Eloi Gaudry
-Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Reuti Sent: jeudi 5 avril 2012 18:41 To: Open MPI Users Subject: Re: [OMPI users] sge tight intregration leads to bad allocation Am 05.04.2012 um 17:55 schrieb Eloi Gaudry: > > &

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-05 Thread Eloi Gaudry
>> Here are the allocation info retrieved from `qstat -g t` for the related job: > > For me the output of `qstat -g t` shows MASTER and SLAVE entries but no > variables. Is there any wrapper defined for `qstat` to reformat the output > (or a ~/.sge_qstat defined)? > > [eg: ] sorry, i forgot abo

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-03 Thread Eloi Gaudry
-Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Reuti Sent: mardi 3 avril 2012 17:13 To: Open MPI Users Subject: Re: [OMPI users] sge tight intregration leads to bad allocation Am 03.04.2012 um 16:59 schrieb Eloi Gaudry: > Hi Re

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-03 Thread Eloi Gaudry
Behalf Of Reuti Sent: mardi 3 avril 2012 16:24 To: Open MPI Users Subject: Re: [OMPI users] sge tight intregration leads to bad allocation Hi, Am 03.04.2012 um 16:12 schrieb Eloi Gaudry: > Thanks for your feedback. > No, this is the other way around, the "reserved" slots on all nod

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-03 Thread Eloi Gaudry
so the two slots on charlie are in error? Sent from my iPad On Apr 3, 2012, at 6:23 AM, "Eloi Gaudry" <eloi.gau...@fft.be> wrote: Hi, I've observed a strange behavior during rank allocation on a distributed run scheduled and submitted using Sge (Son of Grid Engine 8.0.0d) a

Re: [OMPI users] sge tight intregration leads to bad allocation

2012-04-03 Thread Eloi Gaudry
http://www.open-mpi.org/community/lists/users/2012/02/18399.php In my case, the workaround was just to launch the app with mpiexec, and the allocation is handled correctly. ---Tom On 4/3/12 9:23 AM, "Eloi Gaudry" wrote: Hi, I've obs

[OMPI users] sge tight intregration leads to bad allocation

2012-04-03 Thread Eloi Gaudry
Hi, I've observed a strange behavior during rank allocation on a distributed run scheduled and submitted using Sge (Son of Grid Engine 8.0.0d) and OpenMPI-1.4.4. Briefly, there is a one-slot difference between the slots allocated by Sge and the ranks placed by OpenMPI. The issue here is that one node becomes over

Re: [OMPI users] [openib] segfault when using openib btl

2012-01-31 Thread Eloi Gaudry
ions, apart from checking the driver and firmware levels. The consensus was that it would be better if you could take this up directly with your IB vendor. Regards --Nysal On Mon, Sep 27, 2010 at 8:14 PM, Eloi Gaudry <e...@fft.be> wrote: Terry, Please find enclosed the re

Re: [OMPI users] huge VmRSS on rank 0 after MPI_Init when using "btl_openib_receive_queues" option

2011-05-26 Thread Eloi Gaudry
hi, does anyone have a clue here ? éloi On 22/04/2011 08:52, Eloi Gaudry wrote: it varies with the receive_queues specification *and* with the number of mpi processes: memory_consumed = nb_mpi_process * nb_buffers * (buffer_size + low_buffer_count_watermark + credit_window_size ) éloi On
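
For concreteness, a small stand-alone sketch (not from the thread) that evaluates the empirical fit quoted above for the receive_queues value discussed here, "P,65536,256,192,128" (reading the per-peer fields as buffer size, buffer count, low watermark and credit window), together with the 128-process count used in the reported runs.

#include <stdio.h>

/* Evaluate the empirical estimate quoted in the thread:
   memory ~= nb_mpi_process * nb_buffers *
             (buffer_size + low_buffer_count_watermark + credit_window_size)
   The values below correspond to "P,65536,256,192,128" and a 128-process run. */
int main(void)
{
    const long long nb_mpi_process = 128;
    const long long nb_buffers = 256;
    const long long buffer_size = 65536;
    const long long low_buffer_count_watermark = 192;
    const long long credit_window_size = 128;

    long long bytes = nb_mpi_process * nb_buffers *
                      (buffer_size + low_buffer_count_watermark + credit_window_size);

    printf("estimated contribution to rank 0 VmRSS: %lld bytes (~%.2f GiB)\n",
           bytes, (double)bytes / (1024.0 * 1024.0 * 1024.0));
    return 0;
}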

Re: [OMPI users] huge VmRSS on rank 0 after MPI_Init when using "btl_openib_receive_queues" option

2011-04-22 Thread Eloi Gaudry
receive_queues specification? On Apr 19, 2011, at 9:03 AM, Eloi Gaudry wrote: hello, i would like to get your input on this: when launching a parallel computation on 128 nodes using openib and the "-mca btl_openib_receive_queues P,65536,256,192,128" option, i observe a rather large

[OMPI users] huge VmRSS on rank 0 after MPI_Init when using "btl_openib_receive_queues" option

2011-04-19 Thread Eloi Gaudry
PI-1.4.2, built with gcc-4.3.4 and '--enable-cxx-exceptions --with-pic --with-threads=posix' options. thanks for your help, éloi -- Eloi Gaudry Senior Product Development Engineer Free Field Technologies Company Website: http://www.fft.be Direct Phone Number: +32 10 495 147

Re: [OMPI users] memory consumption on rank 0 and btl_openib_receive_queues use

2011-01-24 Thread Eloi Gaudry
7 AM, Eloi Gaudry wrote: hi, i'd like to know if someone had a chance to look at the issue I reported. thanks and happy new year ! éloi On 12/21/2010 10:58 AM, Eloi Gaudry wrote: hi, when launching a parallel computation on 128 nodes using openib and the "-mca btl_ope

Re: [OMPI users] memory consumption on rank 0 and btl_openib_receive_queues use

2011-01-03 Thread Eloi Gaudry
hi, i'd like to know if someone had a chance to look at the issue I reported. thanks and happy new year ! éloi On 12/21/2010 10:58 AM, Eloi Gaudry wrote: hi, when launching a parallel computation on 128 nodes using openib and the "-mca btl_openib_receive_queues P,65536,256,192,1

[OMPI users] memory consumption on rank 0 and btl_openib_receive_queues use

2010-12-21 Thread Eloi Gaudry
't use that amount of memory - all other processes (i.e. located on any other nodes) don't either. i'm using OpenMPI-1.4.2, built with gcc-4.3.4 and '--enable-cxx-exceptions --with-pic --with-threads=posix' options. the cluster is based on eight-core nodes using mellanox hca.

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-29 Thread Eloi Gaudry
e to disable eager rdma. Regards, Pasha On Sep 29, 2010, at 1:04 PM, Terry Dontje wrote: Pasha, do you by any chance know who at Mellanox might be responsible for OMPI working? --td Eloi Gaudry wrote: Hi Nysal, Terry, Thanks for your input on this issue. I'll follow your advice. D

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-29 Thread Eloi Gaudry
Hi Nysal, Terry, Thanks for your input on this issue. I'll follow your advice. Do you know any Mellanox developer I may discuss with, preferably someone who has spent some time inside the openib btl ? Regards, Eloi On 29/09/2010 06:01, Nysal Jan wrote: Hi Eloi, We discussed this issue durin

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-27 Thread Eloi Gaudry
ely from your last email I think it will still all have > non-zero values. > If that ends up being the case then there must be something odd with the > descriptor pointer to the fragment. > > --td > > Eloi Gaudry wrote: > > Terry, > > > > Please

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-27 Thread Eloi Gaudry
btl/openib/btl_openib_endpoint.h#548 > > --td > > Eloi Gaudry wrote: > > Hi Terry, > > > > Do you have any patch that I could apply to be able to do so ? I'm > > remotely working on a cluster (with a terminal) and I cannot use any > > parallel debugg

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-27 Thread Eloi Gaudry
e coalescing is not your issue and that the problem has > something to do with the queue sizes. It would be helpful if we could > detect the hdr->tag == 0 issue on the sending side and get at least a > stack trace. There is something really odd going on here. > > --td > > El

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-25 Thread Eloi Gaudry
I've already tried to write something but I haven't yet succeeded in reproducing the hdr->tag=0 issue with it. Eloi On 24/09/2010 18:37, Terry Dontje wrote: Eloi Gaudry wrote: Terry, You were right, the error indeed seems to come from the message coalescing feature. If I tu
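
For reference, a sketch of the sort of stand-alone reproducer being attempted here, assuming (as discussed elsewhere in the thread) that the failing call is an MPI_Bcast over the openib btl; the message sizes and iteration count are arbitrary guesses, not values taken from the thread.

#include <mpi.h>
#include <stdlib.h>
#include <string.h>

/* Repeatedly broadcast buffers of varying sizes from rank 0, trying to
   trigger the hdr->tag == 0 failure seen with the openib btl.  Sizes and
   iteration count are arbitrary. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const size_t sizes[] = { 64, 4096, 65536, 1 << 20 };
    char *buf = malloc(1 << 20);

    for (int iter = 0; iter < 1000; ++iter) {
        for (size_t s = 0; s < sizeof(sizes) / sizeof(sizes[0]); ++s) {
            if (rank == 0)
                memset(buf, (int)(iter & 0xff), sizes[s]);
            MPI_Bcast(buf, (int)sizes[s], MPI_CHAR, 0, MPI_COMM_WORLD);
        }
    }

    free(buf);
    MPI_Finalize();
    return 0;
}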

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-24 Thread Eloi Gaudry
lar error (https://svn.open-mpi.org/trac/ompi/search?q=coalescing) but they are all closed (except https://svn.open-mpi.org/trac/ompi/ticket/2352 that might be related), aren't they ? What would you suggest Terry ? Eloi On Friday 24 September 2010 16:00:26 Terry Dontje wrote: > Eloi Gau

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-24 Thread Eloi Gaudry
s other than the default and the one you mention. > > I wonder if you did a combination of the two receive queues causes a > failure or not. Something like > > P,128,256,192,128:P,65536,256,192,128 > > I am wondering if it is the first queuing definition causing the issue or

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-24 Thread Eloi Gaudry
job > it is? Does it always fail on the same bcast, or same process? > > Eloi Gaudry wrote: > > Hi Nysal, > > > > Thanks for your suggestions. > > > > I'm now able to get the checksum computed and redirected to stdout, > > thanks (I forgo

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-22 Thread Eloi Gaudry
ver called because the hdr->tag is invalid. So > enabling checksum tracing also might not be of much use. Is it the first > Bcast that fails or the nth Bcast and what is the message size? I'm not > sure what could be the problem at this moment. I'm afraid you will have to > de

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-17 Thread Eloi Gaudry
an try using it to see if it is able to > catch anything. > > Regards > --Nysal > > On Thu, Sep 16, 2010 at 3:48 PM, Eloi Gaudry wrote: > > Hi Nysal, > > > > I'm sorry to interrupt, but I was wondering if you had a chance to look at > > this error.

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-15 Thread Eloi Gaudry
Hi, I was wondering if anybody got a chance to have a look at this issue. Regards, Eloi On Wednesday 18 August 2010 09:16:26 Eloi Gaudry wrote: > Hi Jeff, > > Please find enclosed the output (valgrind.out.gz) from > /opt/openmpi-debug-1.4.2/bin/orterun -np 2 --host pbn11,pbn

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-20 Thread Eloi Gaudry
Hi Jeff, here is the valgrind output when using OpenMPI -1.5rc5, just in case. Thanks, Eloi On Wednesday 18 August 2010 23:01:49 Jeff Squyres wrote: > On Aug 17, 2010, at 12:32 AM, Eloi Gaudry wrote: > > would it help if i use the upcoming 1.5 version of openmpi ? i read that > >

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-18 Thread Eloi Gaudry
--suppressions=/opt/openmpi-debug-1.4.2/share/openmpi/openmpi- valgrind.supp --suppressions=./suppressions.python.supp /opt/actran/bin/actranpy_mp ... Thanks, Eloi On Tuesday 17 August 2010 09:32:53 Eloi Gaudry wrote: > On Monday 16 August 2010 19:14:47 Jeff Squyres wrote: > > On Aug

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-17 Thread Eloi Gaudry
our application? The > openib BTL is not yet thread safe in the 1.4 release series. There have > been improvements to openib BTL thread safety in 1.5, but it is still not > officially supported. > > --Nysal > > On Tue, Aug 17, 2010 at 1:06 PM, Eloi Gaudry wrote: > &g

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-17 Thread Eloi Gaudry
> So hdr->tag should be a value >= 65 > Since the tag is incorrect you are not getting the proper callback function > pointer and hence the SEGV. > I'm not sure at this point as to why you are getting an invalid/corrupt > message header ? > > --Nysal > > On Tu

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-17 Thread Eloi Gaudry
On Monday 16 August 2010 19:14:47 Jeff Squyres wrote: > On Aug 16, 2010, at 10:05 AM, Eloi Gaudry wrote: > > I did run our application through valgrind but it couldn't find any > > "Invalid write": there is a bunch of "Invalid read" (I'm using

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-16 Thread Eloi Gaudry
ack trace looks like you're calling through python, but can you run this application through valgrind, or some other memory-checking debugger? On Aug 10, 2010, at 7:15 AM, Eloi Gaudry wrote: Hi, sorry, i just forgot to add the values of the function parameters: (gdb) print reg->

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-10 Thread Eloi Gaudry
3f4110, btl_register_error = 0x2b341eb90565 , btl_ft_event = 0x2b341eb952e7 } (gdb) print hdr->tag $3 = 0 '\0' (gdb) print des $4 = (mca_btl_base_descriptor_t *) 0xf4a6700 (gdb) print reg->cbfunc $5 = (mca_btl_base_module_recv_cb_fn_t) 0 Eloi On Tuesday 10 August 2010 16:04:08 E

Re: [OMPI users] [openib] segfault when using openib btl

2010-08-10 Thread Eloi Gaudry
tag, des, reg->cbdata ); 2882 if(MCA_BTL_OPENIB_RDMA_FRAG(frag)) { 2883 cqp = (hdr->credits >> 11) & 0x0f; 2884 hdr->credits &= 0x87ff; 2885 } else { Regards, Eloi On Friday 16 July 2010 16:01:02 Eloi Gaudry wrote: > Hi Edgar
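
To illustrate why hdr->tag == 0 ends in a crash at the call site quoted above: the receive path looks the tag up in a table of registered callbacks, and a tag that was never registered yields a NULL function pointer (the reg->cbfunc == 0 seen in gdb). The sketch below is a simplified illustration of that dispatch pattern, not the actual Open MPI code; the type and table names are invented.

#include <stdio.h>
#include <stdlib.h>

/* Simplified, invented illustration of tag-based callback dispatch.
   Valid tags start at some minimum (the thread mentions tags >= 65);
   tag 0 has no registered callback, so the looked-up function pointer
   is NULL and calling it would segfault. */
typedef void (*recv_cb_fn)(const void *payload, void *cbdata);

struct reg_entry { recv_cb_fn cbfunc; void *cbdata; };
static struct reg_entry registry[256];   /* zero-initialised: nothing registered */

static void handle_fragment(unsigned char tag, const void *payload)
{
    struct reg_entry *reg = &registry[tag];
    if (reg->cbfunc == NULL) {
        fprintf(stderr, "no callback registered for tag %u (corrupt header?)\n",
                (unsigned)tag);
        abort();                /* the code quoted above dereferences NULL instead */
    }
    reg->cbfunc(payload, reg->cbdata);
}

static void print_cb(const void *payload, void *cbdata)
{
    (void)cbdata;
    printf("dispatched payload at %p\n", payload);
}

int main(void)
{
    registry[65].cbfunc = print_cb;   /* register a valid tag */
    int data = 42;
    handle_fragment(65, &data);       /* dispatches normally */
    handle_fragment(0, &data);        /* aborts: nothing registered for tag 0 */
    return 0;
}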

Re: [OMPI users] openib issues

2010-08-10 Thread Eloi Gaudry
> On Mon, Aug 9, 2010 at 5:22 PM, Eloi Gaudry wrote: > > Hi, > > > > Could someone have a look at these two different error messages? I'd > > like to know the reason(s) why they were displayed and their actual > > meaning. > > > > Thanks, &

Re: [OMPI users] openib issues

2010-08-09 Thread Eloi Gaudry
Hi, Could someone have a look at these two different error messages? I'd like to know the reason(s) why they were displayed and their actual meaning. Thanks, Eloi On Monday 19 July 2010 16:38:57 Eloi Gaudry wrote: > Hi, > > I've been working on a random segmentation fault

[OMPI users] openib issues

2010-07-19 Thread Eloi Gaudry
QP_ACCESS_ERR) This error may indicate connectivity problems within the fabric; please contact your system administrator. -- I'd like to know what these two errors mean and where they come from. Thanks for your help

Re: [OMPI users] [openib] segfault when using openib btl

2010-07-16 Thread Eloi Gaudry
ue is not somehow limited to the tuned collective routines. Thanks, Eloi On Thursday 15 July 2010 17:24:24 Edgar Gabriel wrote: > On 7/15/2010 10:18 AM, Eloi Gaudry wrote: > > hi edgar, > > > > thanks for the tips, I'm gonna try this option as well. the segmentati

Re: [OMPI users] [openib] segfault when using openib btl

2010-07-15 Thread Eloi Gaudry
oblem in the openib btl triggered from the tuned > collective component, in cases where the ofed libraries were installed > but no HCA was found on a node. It used to work however with the basic > component. > > Thanks > Edgar > > On 7/15/2010 3:08 AM, Eloi Gaudry wrote: &

Re: [OMPI users] [openib] segfault when using openib btl

2010-07-15 Thread Eloi Gaudry
ferent algorithms that can > > be selected for the various collectives. > > Therefore, you need this: > > > > --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 1 > > > > Rolf > > > > On 07/13/10 11:28, Eloi Gaudry wrote: > > > Hi, > &

Re: [OMPI users] [openib] segfault when using openib btl

2010-07-14 Thread Eloi Gaudry
rules 1 --mca coll_tuned_bcast_algorithm 1 > > Rolf > > On 07/13/10 11:28, Eloi Gaudry wrote: > > Hi, > > > > I've found that "--mca coll_tuned_bcast_algorithm 1" allowed to switch to > > the basic linear algorithm. Anyway whatever the alg

Re: [OMPI users] [openib] segfault when using openib btl

2010-07-13 Thread Eloi Gaudry
nday 12 July 2010 10:53:58 Eloi Gaudry wrote: > Hi, > > I'm focusing on the MPI_Bcast routine that seems to randomly segfault when > using the openib btl. I'd like to know if there is any way to make OpenMPI > switch to a different algorithm than the default one being s

Re: [OMPI users] [openib] segfault when using openib btl

2010-07-12 Thread Eloi Gaudry
2010 11:06:52 Eloi Gaudry wrote: > Hi, > > I'm observing a random segmentation fault during an internode parallel > computation involving the openib btl and OpenMPI-1.4.2 (the same issue > can be observed with OpenMPI-1.3.3). >mpirun (Open MPI) 1.4.2 >Report bugs

[OMPI users] [openib] segfault when using openib btl

2010-07-02 Thread Eloi Gaudry
.list --mca btl self,sm,tcp --display-map --verbose --version --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...] Thanks, Eloi -- Eloi Gaudry Free Field Technologies Axis Park Louvain-la-Neuve Rue Emile Francqui, 1 B-1435 Mont-Saint Guibert BELGIUM Company Phone: +32 10 487 959 C

Re: [OMPI users] Problems with memchecker in version 1.4.2

2010-06-22 Thread Eloi Gaudry
: > valgrind is installed, and worked with Open MPI 1.4.1. > > 2010/6/22 Eloi Gaudry : > > Hi Michele, > > > > You may actually need to have gdb/valgrind installed before configuring > > and building OpenMPI with the --enable-memchecker option. > > > > Regards,

Re: [OMPI users] Problems with memchecker in version 1.4.2

2010-06-22 Thread Eloi Gaudry
ERROR_LOG: Not found > in file ../../../../orte/tools/orterun/orterun.c at line 543 > > > It seems that the memchecker does not work, because after > reconfiguring without "--enable-memchecker" and rebuilding, I don't > receive the same error anymore. > > May any

Re: [OMPI users] [sge::tight-integration] slot scheduling and resources handling

2010-06-07 Thread Eloi Gaudry
Hi Reuti, I've been unable to reproduce the issue so far. Sorry for the inconvenience, Eloi On Tuesday 25 May 2010 11:32:44 Reuti wrote: > Hi, > > Am 25.05.2010 um 09:14 schrieb Eloi Gaudry: > > I do not reset any environment variable during job submission or job > > h

Re: [OMPI users] [sge::tight-integration] slot scheduling and resources handling

2010-05-25 Thread Eloi Gaudry
May 2010 17:35:24 Reuti wrote: > Hi, > > Am 21.05.2010 um 17:19 schrieb Eloi Gaudry: > > Hi Reuti, > > > > Yes, the openmpi binaries used were build after having used the > > --with-sge during configure, and we only use those binaries on our > > cluster.

Re: [OMPI users] [sge::tight-integration] slot scheduling and resources handling

2010-05-21 Thread Eloi Gaudry
v2.0, API v2.0, Component v1.3.3) MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3.3) Regards, Eloi On Friday 21 May 2010 16:01:54 Reuti wrote: > Hi, > > Am 21.05.2010 um 14:11 schrieb Eloi Gaudry: > > Hi there, > > > > I'm observing something s

[OMPI users] [sge::tight-integration] slot scheduling and resources handling

2010-05-21 Thread Eloi Gaudry
the different command line options. Any help would be appreciated, Thanks, Eloi -- Eloi Gaudry Free Field Technologies Axis Park Louvain-la-Neuve Rue Emile Francqui, 1 B-1435 Mont-Saint Guibert BELGIUM Company Phone: +32 10 487 959 Company Fax: +32 10 454 626

Re: [OMPI users] segfault when combining OpenMPI and GotoBLAS2

2010-01-20 Thread Eloi Gaudry
Hi, FYI, this issue is solved with the latest version of the library (v2-1.11), at least on my side. Eloi Gus Correa wrote: Hi Dorian Dorian Krause wrote: Hi, @Gus I don't use any flags for the installed OpenMPI version. In fact for this mail I used an OpenMPI version just installed with t

Re: [OMPI users] segfault when combining OpenMPI and GotoBLAS2

2010-01-19 Thread Eloi Gaudry
I hope this helps. Gus Correa - Gustavo Correa Lamont-Doherty Earth Observatory - Columbia University Palisades, NY, 10964-8000 - USA ----- Eloi Gaudry wrote: Dorian Krause wrote: Hi Eloi, Does the seg

Re: [OMPI users] segfault when combining OpenMPI and GotoBLAS2

2010-01-19 Thread Eloi Gaudry
des, NY, 10964-8000 - USA --------- Eloi Gaudry wrote: Dorian Krause wrote: Hi Eloi, Does the segmentation faults you're facing also happen in a sequential environment (i.e. not linked against openmpi libraries) ? No, without MPI everything works fine.

Re: [OMPI users] segfault when combining OpenMPI and GotoBLAS2

2010-01-18 Thread Eloi Gaudry
Dorian Krause wrote: Hi Eloi, Does the segmentation faults you're facing also happen in a sequential environment (i.e. not linked against openmpi libraries) ? No, without MPI everything works fine. Also, linking against mvapich doesn't give any errors. I think there is a problem with GotoBL

Re: [OMPI users] segfault when combining OpenMPI and GotoBLAS2

2010-01-18 Thread Eloi Gaudry
Dorian Krause wrote: Hi, has any one successfully combined OpenMPI and GotoBLAS2? I'm facing segfaults in any program which combines the two libraries (as shared libs). The segmentation fault seems to occur in MPI_Init(). The gdb backtrace is Program received signal SIGSEGV, Segmentation fa

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
This is what I did (create by hand /opt/sge/tmp/test on an execution host log as a regular cluster user). Eloi On 11/11/2009 00:26, Reuti wrote: To avoid misunderstandings: Am 11.11.2009 um 00:19 schrieb Eloi Gaudry: On any execution node, creating a subdirectory of /opt/sge/tmp (i.e

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
sge got nobody/nogroup as owner. Eloi On 11/11/2009 00:14, Reuti wrote: Am 11.11.2009 um 00:03 schrieb Eloi Gaudry: The user/group used to generate the temporary directories was nobody/nogroup, when using a shared $tmpdir. Now that I'm using a local $tmpdir (one for each node

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
nMPI could fail when using such a configuration (i.e. with a shared "tmpdir"). Eloi On 10/11/2009 19:17, Eloi Gaudry wrote: Reuti, The acls here were just added when I tried to force the /opt/sge/tmp subdirectories to be 777 (which I did when I first encountered the error of sub

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
stead of a shared one for "tmpdir". But as this issue seems somehow related to permissions, I don't know if this would eventually be the right solution. Thanks for your help, Eloi Reuti wrote: Hi, Am 10.11.2009 um 19:01 schrieb Eloi Gaudry: Reuti, I'm using "tmpdi

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
gs/bin/true stop_proc_args /bin/true allocation_rule $round_robin control_slaves TRUE job_is_first_task FALSE urgency_slots min accounting_summary FALSE Thanks for your help, Eloi Reuti wrote: Am 10.11.2009 um 18:20 schrieb Eloi Gaudry: Thanks for your help Reuti, I'm

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
inside (as OpenMPI won't use nobody:nogroup credentials). As Ralph suggested, I checked the SGE configuration, but I haven't found anything related to nobody:nogroup configuration so far. Eloi Reuti wrote: Hi, Am 10.11.2009 um 17:55 schrieb Eloi Gaudry: Thanks for your help Ralp

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
e - check "mpirun -h", or ompi_info for the required option. But I would first check your SGE config as that just doesn't sound right. On Nov 10, 2009, at 9:40 AM, Eloi Gaudry wrote: Hi there, I'm experiencing some issues using GE6.2U4 and OpenMPI-1.3.3 (with gridengin

[OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
ny solution was found. Thanks for your help, Eloi -- Eloi Gaudry Free Field Technologies Axis Park Louvain-la-Neuve Rue Emile Francqui, 1 B-1435 Mont-Saint Guibert BELGIUM Company Phone: +32 10 487 959 Company Fax: +32 10 454 626

[OMPI users] [1.2.x] --enable--mpi-threads

2009-06-04 Thread Eloi Gaudry
ption when compiling (configuring) OpenMPI to prevent such issues. Is there any extensive doc. about this specific option ? Should I be using something else when building OpenMPI ? Thanks for your help, Eloi -- Eloi Gaudry Free Field Technologies Axis Park Louvain-la-Neuve Rue Emile Francqui,

Re: [OMPI users] openmpi on linux-ia64

2008-07-23 Thread Eloi Gaudry
nary called MPI_Init (assuming it was the method redefined in the fake_mpi library), it was actually calling the MPI_Init method from the openmpi library. Thanks for your quick reply, Jeff. Eloi Jeff Squyres wrote: On Jul 23, 2008, at 8:33 AM, Eloi Gaudry wrote: I've been encountering

[OMPI users] openmpi on linux-ia64

2008-07-23 Thread Eloi Gaudry
Hi there, I've been encountering some issues with openmpi on a linux-ia64 platform (centos-4.6 with gcc-4.3.1) within a call to MPI_Query_thread (in a fake single process run): An error occurred in MPI_Query_thread *** before MPI was initialized *** MPI_ERRORS_ARE_FATAL (goodbye) I'd like to
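
The abort above comes from querying the thread level before the library has been initialized. A minimal sketch (not from the thread) of the expected call ordering: initialize first, then query.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    /* MPI_Query_thread is only valid between initialization and
       MPI_Finalize; calling it earlier produces the
       "before MPI was initialized ... MPI_ERRORS_ARE_FATAL" abort above. */
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_SINGLE, &provided);

    int level;
    MPI_Query_thread(&level);
    printf("thread support level: %d (provided at init: %d)\n", level, provided);

    MPI_Finalize();
    return 0;
}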