Re: [OMPI users] www.open-mpi.org certificate error?

2016-07-30 Thread Gilles Gouaillardet
Jeff, if my understanding is correct, https requires that open-mpi.org is the only (httpd) domain served on port 443 for a given IP (e.g. no shared hosting); a certificate is based on the host name (e.g. www.open-mpi.org) and can contain wildcards (e.g. *.open-mpi.org), so if the first condition is met, th

Re: [OMPI users] www.open-mpi.org certificate error?

2016-07-31 Thread Gilles Gouaillardet
wrote: > Is the web server's private key, used to generate the CSR, also > needed? If so, perhaps IU cannot share that. > > > > On Sat, Jul 30, 2016 at 11:09 PM, Gilles Gouaillardet > > wrote: > > Jeff, > > > > if my understanding is correct, htt

Re: [OMPI users] open-mpi: all-recursive error when compiling

2016-08-04 Thread Gilles Gouaillardet
The error message is related to a permission issue (which is very puzzling in itself ...) can you manually check the permissions ? cd /home/pi/Downloads/openmpi-2.0.0/opal/asm ls -l .deps/atomic-asm.Tpo atomic-asm.S then you can make clean make V=1 atomic-asm.lo and post the output mea

Re: [OMPI users] open-mpi: all-recursive error when compiling

2016-08-05 Thread Gilles Gouaillardet
] Error 1 make[2]: Leaving directory '/home/pi/Downloads/TEST/openmpi-2.0.0/opal/asm' make[1]: *** [Makefile:2301: all-recursive] Error 1 make[1]: Leaving directory '/home/pi/Downloads/TEST/openmpi-2.0.0/opal' make: *** [Makefile:1800: all-recursive] Error 1 * 3 - Hypothese

Re: [OMPI users] open-mpi: all-recursive error when compiling

2016-08-05 Thread Gilles Gouaillardet
rectory '/home/pi/Downloads/TEST/openmpi-2.0.0/opal' make: *** [Makefile:1800: all-recursive] Error 1 * 3 - Hypotheses ==* *Solaris* I'm not an expert, but searching for "__curbrk" I discovered it belongs to glibc http://stackoverflow.com/questions/6210685/explanation-for-t

Re: [OMPI users] Multiple connections to MySQL vi MPI

2016-08-08 Thread Gilles Gouaillardet
What if you run mpirun -np 1 mysqlconnect on your frontend (aka compilation and/or submission) host ? does it work as expected ? if yes, then this likely indicates a MySQL permission/configuration issue. for example, it accepts connections from 'someUser' only from one node, or maybe mysqld

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-24 Thread Gilles Gouaillardet
Siegmar, how did you configure openmpi ? which java version did you use ? i just found a regression and you currently have to explicitly add CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT to your configure command line if you want to debug this issue (i cannot reproduce it on a solaris 11 x86 virtual

Re: [OMPI users] OMPI users] low CPU utilization with OpenMPI

2014-10-24 Thread Gilles Gouaillardet
Can you also check there is no cpu binding issue (several mpi tasks and/or OpenMP threads, if any, bound to the same core and doing time sharing)? A simple way to check that is to log into a compute node, run top and then press 1 f j. If some cores have higher usage than others, you are likely doin

Re: [OMPI users] OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-25 Thread Gilles Gouaillardet
Hi Siegmar, You might need to configure with --enable-debug and add -g -O0 to your CFLAGS and LDFLAGS Then once you attach with gdb, you have to find the thread that is polling : thread 1 bt thread 2 bt and so on until you find the good thread If _dbg is a local variable, you need to select the

Re: [OMPI users] OMPI users] OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-26 Thread Gilles Gouaillardet
It looks like we faced a similar issue : opal_process_name_t is 64 bits aligned whereas orte_process_name_t is 32 bits aligned. If you run on an alignment-sensitive cpu such as sparc and you are not lucky (so to speak) you can run into this issue. i will make a patch for this shortly Ralph Castain
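A minimal C sketch of the alignment point above; the struct definitions are illustrative stand-ins, not the real opal_process_name_t / orte_process_name_t:

#include <stdio.h>
#include <stdalign.h>
#include <stdint.h>

/* Illustrative stand-ins, not the actual Open MPI definitions: a name made
 * of two 32-bit fields is 4-byte aligned, while a single 64-bit field
 * forces 8-byte alignment. */
typedef struct { uint32_t jobid; uint32_t vpid; } name32_t;
typedef struct { uint64_t opaque; } name64_t;

int main(void)
{
    printf("alignof(name32_t) = %zu\n", alignof(name32_t)); /* typically 4 */
    printf("alignof(name64_t) = %zu\n", alignof(name64_t)); /* typically 8 */
    /* Accessing a 64-bit-aligned type through a pointer that is only
     * 4-byte aligned is undefined behaviour; x86 usually tolerates it,
     * sparc raises SIGBUS. */
    return 0;
}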

Re: [OMPI users] OMPI users] OMPI users] OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-26 Thread Gilles Gouaillardet
variable declaration only. Any thought ? Ralph Castain wrote: >Will PR#249 solve it? If so, we should just go with it as I suspect that is >the long-term solution. > >> On Oct 26, 2014, at 4:25 PM, Gilles Gouaillardet >> wrote: >> >> It lo

Re: [OMPI users] OMPI users] OMPI users] OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
; If you add changes to your branch, I can pass you a patch with my suggested > alterations. > >> On Oct 26, 2014, at 5:55 PM, Gilles Gouaillardet >> wrote: >> >> No :-( >> I need some extra work to stop declaring orte_process_name_t and >> ompi_process_name_

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
>>>>>> while >>> (_dbg) poll(NULL, 0, 1); >>>>>> tyr java 400 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i _dbg >>>>>> tyr java 401 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i >>>>>> JNI_OnLoad &g

Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread Gilles Gouaillardet
Hi, i tested on a RedHat 6 like linux server and could not observe any memory leak. BTW, are you running 32 or 64 bits cygwin ? and what is your configure command line ? Thanks, Gilles On 2014/10/27 18:26, Marco Atzeri wrote: > On 10/27/2014 8:30 AM, maxinator333 wrote: >> Hello, >> >> I notic

Re: [OMPI users] OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread Gilles Gouaillardet
Thanks Marco, I could reproduce the issue even with one node sending/receiving to itself. I will investigate this tomorrow Cheers, Gilles Marco Atzeri wrote: > > >On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote: >> Hi, >> >> i tested on a RedHat 6 like linux s

Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Gilles Gouaillardet
Michael, Could you please run mpirun -np 1 df -h mpirun -np 1 df -hi on both compute and login nodes Thanks Gilles michael.rach...@dlr.de wrote: >Dear developers of OPENMPI, > >We have now installed and tested the bugfixed OPENMPI Nightly Tarball of >2014-10-24 (openmpi-dev-176-g9334abc.tar.

Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Gilles Gouaillardet
Michael, The available space must be greater than the requested size + 5% From the logs, the error message makes sense to me : there is not enough space in /tmp Since the compute nodes have a lot of memory, you might want to try using /dev/shm instead of /tmp for the backing files Cheers, Gil

Re: [OMPI users] OMPI users] OMPI users] OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
Ralph, On 2014/10/28 0:46, Ralph Castain wrote: > Actually, I propose to also remove that issue. Simple enough to use a > hash_table_32 to handle the jobids, and let that point to a > hash_table_32 of vpids. Since we rarely have more than one jobid > anyway, the memory overhead actually decreases

Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-28 Thread Gilles Gouaillardet
Marco, here is attached a patch that fixes the issue /* i could not find yet why this does not occur on Linux ... */ could you please give it a try ? Cheers, Gilles On 2014/10/27 18:45, Marco Atzeri wrote: > > > On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote: >> Hi, >

Re: [OMPI users] SIGBUS in openmpi-dev-178-ga16c1e4 on Solaris 10 Sparc

2014-10-28 Thread Gilles Gouaillardet
Hi Siegmar, From the jvm logs, there is an alignment error in native_get_attr but i could not find it by reading the source code. Could you please do ulimit -c unlimited mpiexec ... and then gdb /bin/java core And run bt on all threads until you get a line number in native_get_attr Thanks Gill

Re: [OMPI users] OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-28 Thread Gilles Gouaillardet
Thanks Marco, pthread_mutex_init calls calloc under cygwin but does not allocate memory under linux, so not invoking pthread_mutex_destroy causes a memory leak only under cygwin. Gilles Marco Atzeri wrote: >On 10/28/2014 12:04 PM, Gilles Gouaillardet wrote: >> Marco, >> >&g
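A small sketch of that init/destroy pairing in plain POSIX threads (nothing here is Open MPI specific):

#include <pthread.h>
#include <stdio.h>

int main(void)
{
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_t m;
        if (pthread_mutex_init(&m, NULL) != 0) {
            perror("pthread_mutex_init");
            return 1;
        }
        pthread_mutex_lock(&m);
        pthread_mutex_unlock(&m);
        /* without this call the loop leaks memory on cygwin (init calls
         * calloc there), while on glibc/Linux nothing is allocated */
        pthread_mutex_destroy(&m);
    }
    puts("done");
    return 0;
}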

Re: [OMPI users] OMPI users] OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-28 Thread Gilles Gouaillardet
Yep, will do today Ralph Castain wrote: >Gilles: will you be committing this to trunk and PR to 1.8? > > >> On Oct 28, 2014, at 11:05 AM, Marco Atzeri wrote: >> >> On 10/28/2014 4:41 PM, Gilles Gouaillardet wrote: >>> Thanks Marco, >>> >>>

Re: [OMPI users] SIGBUS in openmpi-dev-178-ga16c1e4 on Solaris 10 Sparc

2014-10-29 Thread Gilles Gouaillardet
ead can be found to >>> satisfy query >>> (gdb) bt >>> #0 0x7f6173d0 in rtld_db_dlactivity () from >>> /usr/lib/sparcv9/ld.so.1 >>> #1 0xffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1 >>> #2 0x7f618950 in lm

Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-11-05 Thread Gilles Gouaillardet
Michael, could you please share your test program so we can investigate it ? Cheers, Gilles On 2014/10/31 18:53, michael.rach...@dlr.de wrote: > Dear developers of OPENMPI, > > There remains a hanging observed in MPI_WIN_ALLOCATE_SHARED. > > But first: > Thank you for your advices to employ

Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-11-05 Thread Gilles Gouaillardet
ved with our large CFD-code. > > Are OPENMPI-developers nevertheless interested in that test program? > > Greetings > Michael > > > > > > > -Original Message- > From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles > Gouaillar

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-05 Thread Gilles Gouaillardet
Michael, the root cause is that openmpi was not compiled with the intel compilers but with the gnu compiler. Fortran modules are not binary compatible, so openmpi and your application must be compiled with the same compiler. Cheers, Gilles On 2014/11/05 18:25, michael.rach...@dlr.de wrote: > Dear OPENMPI

Re: [OMPI users] OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-05 Thread Gilles Gouaillardet
ding an >mpi.mod file, because the User can look inside the module >and can directly see, if something is missing or possibly wrongly coded. > >Greetings > Michael Rachner > > >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf

Re: [OMPI users] OMPI users] How OMPI picks ethernet interfaces

2014-11-07 Thread Gilles Gouaillardet
Brock, Is your post related to ib0/eoib0 being used at all, or being used with load balancing ? let me clarify this : --mca btl ^openib disables the openib btl aka *native* infiniband. This does not disable ib0 and eoib0 that are handled by the tcp btl. As you already figured out, btl_tcp_if_inc

Re: [OMPI users] OMPI users] How OMPI picks ethernet interfaces

2014-11-07 Thread Gilles Gouaillardet
Ralph, IIRC there is load balancing across all the btls, for example between vader and scif. So load balancing between ib0 and eoib0 is just a particular case that might not necessarily be handled by the btl tcp. Cheers, Gilles Ralph Castain wrote: >OMPI discovers all active interfaces and aut

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-10 Thread Gilles Gouaillardet
Hi, IIRC there were some bug fixes between 1.8.1 and 1.8.2 in order to really use all the published interfaces. by any chance, are you running a firewall on your head node ? one possible explanation is the compute node tries to access the public interface of the head node, and packets get dropped

Re: [OMPI users] OMPI users] How OMPI picks ethernet interfaces

2014-11-12 Thread Gilles Gouaillardet
Could you please send the output of netstat -nr on both head and compute node ? no problem obfuscating the ip of the head node, i am only interested in netmasks and routes. Ralph Castain wrote: > >> On Nov 12, 2014, at 2:45 PM, Reuti wrote: >> >> Am 12.11.2014 um 17:27 schrieb Reuti: >> >>> A

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
Hi, it seems you messed up the command line could you try $ mpirun --mca btl ^openib --host compute-01-01,compute-01-06 ring_c can you also try to run mpirun from a compute node instead of the head node ? Cheers, Gilles On 2014/11/13 16:07, Syed Ahsan Ali wrote: > Here is what I see when di

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
9 > [compute-01-01.private.dns.zone][[11064,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect] > connect() to 192.168.108.10 failed: No route to host (113) > > > On Thu, Nov 13, 2014 at 12:11 PM, Gilles Gouaillardet > wrote: >> Hi, >> >> it seems you me

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
.0 b) TX bytes:0 (0.0 b) > > > > So the point is why mpirun is following the ib path while I it has > been disabled. Possible solutions? > > On Thu, Nov 13, 2014 at 12:32 PM, Gilles Gouaillardet > wrote: >> mpirun complains about the 192.168.108.10 ip address, bu

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
ddr >>> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 >>> inet addr:192.168.108.14 Bcast:192.168.108.255 >>> Mask:255.255.255.0 >>> UP BROADCAST MULTICAST MTU:65520 Metric:1 >>> RX packets:0 errors:0 dropped

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
0.0.0.0 255.0.0.0 U 0 0 0 eth0 > 0.0.0.0 10.0.0.10.0.0.0 UG0 0 0 eth0 > [pmdtest@compute-01-06 ~]$ > > > On Thu, Nov 13, 2014 at 12:56 PM, Gilles Gouaillardet > wrote: >> This is really weird ? >

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-13 Thread Gilles Gouaillardet
My 0.02 US$ : first, the root cause of the problem was that a default gateway was configured on the node, but this gateway was unreachable. imho, this is an incorrect system setting that can lead to unpredictable results : - openmpi 1.8.1 works (you are lucky, good for you) - openmpi 1.8.3 fails (no luck th

Re: [OMPI users] OMPI users] error building openmpi-dev-274-g2177f9e with gcc-4.9.2

2014-11-16 Thread Gilles Gouaillardet
Siegmar, This is correct, --enable-heterogenous is now fixed in the trunk. Please also note that -D_REENTRANT is now automatically set on solaris Cheers Gilles Siegmar Gross wrote: >Hi Jeff, hi Ralph, > >> This issue should now be fixed, too. > >Yes, it is. Thank you very much for your help.

Re: [OMPI users] Fortran and OpenMPI 1.8.3 compiled with Intel-15 does nothing silently

2014-11-17 Thread Gilles Gouaillardet
Hi John, do you call MPI_Init() or MPI_Init_thread(MPI_THREAD_MULTIPLE) ? does your program call MPI anywhere from an OpenMP region ? does your program call MPI only within an !$OMP MASTER section ? does your program not invoke MPI at all from any OpenMP region ? can you reproduce this
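A minimal C sketch of the check being asked about (the original code is Fortran+OpenMP; the structure here is an assumption): request MPI_THREAD_MULTIPLE and verify what the library actually provides before calling MPI from concurrent threads.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        /* only one thread at a time may call MPI, e.g. restrict MPI calls
         * to an !$OMP MASTER / #pragma omp master section */
        printf("thread support level is %d, not MPI_THREAD_MULTIPLE\n", provided);
    }
    MPI_Finalize();
    return 0;
}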

Re: [OMPI users] collective algorithms

2014-11-17 Thread Gilles Gouaillardet
Daniel, you can run $ ompi_info --parseable --all | grep _algorithm: | grep enumerator that will give you the list of supported algo for the collectives, here is a sample output : mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:0:ignore mca:coll:tuned:param:coll_tuned_allred

Re: [OMPI users] MPI_Neighbor_alltoallw fails with mpi-1.8.3

2014-11-21 Thread Gilles Gouaillardet
Hi Ghislain, that sounds like a bug in MPI_Dist_graph_create :-( you can use MPI_Dist_graph_create_adjacent instead : MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD, degrees, &targets[0], &weights[0], degrees, &targets[0], &weights[0], info, rankReordering, &commGraph); it
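A self-contained C sketch of that workaround; the ring neighbourhood below is an assumption of this example, not of Ghislain's code:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* each rank lists its own neighbours explicitly (here: a simple ring) */
    int degrees = 2;
    int targets[2] = { (rank + size - 1) % size, (rank + 1) % size };
    int weights[2] = { 1, 1 };
    MPI_Comm commGraph;

    MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                                   degrees, targets, weights,   /* in-edges  */
                                   degrees, targets, weights,   /* out-edges */
                                   MPI_INFO_NULL, 0 /* no reordering */,
                                   &commGraph);

    int indeg, outdeg, weighted;
    MPI_Dist_graph_neighbors_count(commGraph, &indeg, &outdeg, &weighted);
    printf("rank %d: indegree=%d outdegree=%d\n", rank, indeg, outdeg);

    MPI_Comm_free(&commGraph);
    MPI_Finalize();
    return 0;
}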

Re: [OMPI users] MPI_Neighbor_alltoallw fails with mpi-1.8.3

2014-11-21 Thread Gilles Gouaillardet
t reagrds, > Ghislain > > 2014-11-21 7:23 GMT+01:00 Gilles Gouaillardet > : >> Hi Ghislain, >> >> that sound like a but in MPI_Dist_graph_create :-( >> >> you can use MPI_Dist_graph_create_adjacent instead : >> >> MPI_Dis

Re: [OMPI users] MPI_Neighbor_alltoallw fails with mpi-1.8.3

2014-11-25 Thread Gilles Gouaillardet
xpect based on prior knowledge. > > George. > > > On Fri, Nov 21, 2014 at 3:48 AM, Gilles Gouaillardet < > gilles.gouaillar...@iferc.org> wrote: > >> Ghislain, >> >> i can confirm there is a bug in mca_topo_base_dist_graph_distribute >> >>

Re: [OMPI users] mpi_wtime implementation

2014-11-27 Thread Gilles Gouaillardet
Folks, one drawback of retrieving time with rdtsc is that this value is core specific : if a task is not bound to a core, then the value returned by MPI_Wtime() might go backward. if i run the following program with taskset -c 1 ./time and then move it across cores (taskset -cp 0 ; tas
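A minimal sketch of that kind of experiment: sample MPI_Wtime() in a tight loop and flag any negative delta, which can show up if the timer is based on rdtsc and the process migrates between cores with unsynchronized TSCs.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    double start = MPI_Wtime();
    double prev = start;
    long backwards = 0;
    while (prev - start < 10.0) {              /* sample for ~10 seconds */
        double now = MPI_Wtime();
        if (now < prev) {                      /* clock stepped backwards */
            backwards++;
            printf("time went backward by %g seconds\n", prev - now);
        }
        prev = now;
    }
    printf("observed %ld backward steps\n", backwards);
    MPI_Finalize();
    return 0;
}

Run it pinned with taskset -c 1, then re-pin the running process with taskset -cp 0 <pid> as described above and watch for backward steps.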

Re: [OMPI users] "default-only MCA variable"?

2014-11-27 Thread Gilles Gouaillardet
It could be because configure did not find the knem headers, hence knem is not supported and this mca parameter is read-only. My 0.2 us$ ... Dave Love wrote: >Why can't I set parameters like this (not the only one) with 1.8.3? > > WARNING: A user-supplied value attempted to override th

Re: [OMPI users] Warning about not enough registerable memory on SL6.6

2014-12-08 Thread Gilles Gouaillardet
Folks, FWIW, i observe a similar behaviour on my system. imho, the root cause is that OFED has been upgraded from a (quite) older version to the latest 3.12 version. here is the relevant part of the code (btl_openib.c from the master) : static uint64_t calculate_max_reg (void) { if (0 == stat("/sys/modu
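A rough sketch of what such a check can look like for mlx4 hardware; the paths and the formula are assumptions based on the snippet above, not the actual btl_openib.c source:

#include <stdio.h>
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>

/* read a single integer module parameter from sysfs, or -1 on failure */
static long read_param(const char *path)
{
    FILE *f = fopen(path, "r");
    long v = -1;
    if (f != NULL) {
        if (fscanf(f, "%ld", &v) != 1) v = -1;
        fclose(f);
    }
    return v;
}

int main(void)
{
    struct stat st;
    if (0 == stat("/sys/module/mlx4_core", &st)) {
        long log_num_mtt = read_param("/sys/module/mlx4_core/parameters/log_num_mtt");
        long log_mtts_per_seg = read_param("/sys/module/mlx4_core/parameters/log_mtts_per_seg");
        if (log_num_mtt >= 0 && log_mtts_per_seg >= 0) {
            /* assumed formula: number of MTT entries * entries per segment * page size */
            uint64_t max_reg = (1ULL << log_num_mtt) * (1ULL << log_mtts_per_seg)
                               * (uint64_t)sysconf(_SC_PAGESIZE);
            printf("estimated max registerable memory: %llu bytes\n",
                   (unsigned long long)max_reg);
        }
    }
    return 0;
}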

Re: [OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-10 Thread Gilles Gouaillardet
Luca, your email mentions openmpi 1.6.5 but gdb output points to openmpi 1.8.1. could the root cause be a mix of versions that does not occur with root account ? which openmpi version are you expecting ? you can run pmap when your binary is running and/or under gdb to confirm the openmpi libra

Re: [OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-11 Thread Gilles Gouaillardet
size should be >> unlimited. >> Check /etc/security/limits.conf and "ulimit -a". >> >> I hope this helps, >> Gus Correa >> >> On 12/10/2014 08:28 AM, Gilles Gouaillardet wrote: >>> Luca, >>> >>> your email mentions ope

Re: [OMPI users] MPI inside MPI (still)

2014-12-11 Thread Gilles Gouaillardet
Alex, can you try something like call system(sh -c 'env -i /.../mpirun -np 2 /.../app_name') ? -i starts with an empty environment. that being said, you might need to set a few environment variables manually : env -i PATH=/bin ... and that being also said, this "trick" could be just a bad idea : you

Re: [OMPI users] MPI inside MPI (still)

2014-12-11 Thread Gilles Gouaillardet
gt; I realize > getting passed over a job scheduler with this approach might not work at > all... > > I have looked at the MPI_Comm_spawn call but I failed to understand how it > could help here. For instance, can I use it to launch an mpi app with the > option "-n 5" ?
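A hedged C sketch of the MPI_Comm_spawn alternative discussed in this thread: the maxprocs argument plays the role of mpirun's "-n 5", and "./hello_world" is a placeholder executable name.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Comm intercomm;
    int errcodes[5];

    MPI_Init(&argc, &argv);
    /* spawn 5 copies of the child program, like "mpirun -n 5 ./hello_world" */
    MPI_Comm_spawn("./hello_world", MPI_ARGV_NULL, 5,
                   MPI_INFO_NULL, 0 /* root rank */, MPI_COMM_SELF,
                   &intercomm, errcodes);

    /* the parent can talk to the children over the intercommunicator,
     * and should disconnect when it is done with them */
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}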

Re: [OMPI users] MPI inside MPI (still)

2014-12-11 Thread Gilles Gouaillardet
,MPI_INFO_NULL,rank,MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status) > enddo > > I do get 15 instances of the 'hello_world' app running: 5 for each parent > rank 1, 2 and 3. > > Thanks a lot, Gilles. > > Best regargs, > > Alex > > > > > 2014-1

Re: [OMPI users] OMPI users] MPI inside MPI (still)

2014-12-12 Thread Gilles Gouaillardet
ront end to use those, but since we have a lot of data to process > >it also benefits from a parallel environment. > > >Alex > >  > > >2014-12-12 2:30 GMT-02:00 Gilles Gouaillardet : > >Alex, > >just to make sure ... >this is the behavior you expe

Re: [OMPI users] OMPI users] OMPI users] MPI inside MPI (still)

2014-12-13 Thread Gilles Gouaillardet
would I track each one for their completion? > >Alex > > >2014-12-12 22:35 GMT-02:00 Gilles Gouaillardet : > >Alex, > >You need MPI_Comm_disconnect at least. >I am not sure if this is 100% correct nor working. > >If you are using third party apps, why dont you do

Re: [OMPI users] OMPI users] OMPI users] OMPI users] MPI inside MPI (still)

2014-12-13 Thread Gilles Gouaillardet
to say that >we could do a lot better if they could be executed in parallel. > >I am not familiar with DMRAA but it seems to be the right choice to deal with >job schedulers as it covers the ones I am interested in (pbs/torque and >loadlever). > >Alex > > >2014-12-13

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Gilles Gouaillardet
Eric, can you make your test case (source + input file + howto) available so i can try to reproduce and fix this ? Based on the stack trace, i assume this is a complete end user application. have you tried/been able to reproduce the same kind of crash with a trimmed test program ? BTW, what kind

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Gilles Gouaillardet
Eric, i checked the source code (v1.8) and the limit for the shared_fp_fname is 256 (hard coded). i am now checking if the overflow is correctly detected (that could explain the one byte overflow reported by valgrind) Cheers, Gilles On 2014/12/15 11:52, Eric Chamberland wrote: > Hi again, > >

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Gilles Gouaillardet
Eric, here is a patch for the v1.8 series, it fixes a one byte overflow. valgrind should stop complaining, and assuming this is the root cause of the memory corruption, that could also fix your program. that being said, shared_fp_fname is limited to 255 characters (this is hard coded) so even if

Re: [OMPI users] ERROR: C_FUNLOC function

2014-12-15 Thread Gilles Gouaillardet
Hi Siegmar, a similar issue was reported in mpich with xlf compilers : http://trac.mpich.org/projects/mpich/ticket/2144 They concluded this is a compiler issue (e.g. the compiler does not implement TS 29113 subclause 8.1) Jeff, i made PR 315 https://github.com/open-mpi/ompi/pull/315 f08 binding

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-15 Thread Gilles Gouaillardet
Eric, thanks for the simple test program. i think i see what is going wrong and i will make some changes to avoid the memory overflow. that being said, there is a hard coded limit of 256 characters, and your path is bigger than 300 characters. bottom line, and even if there is no more memory ove

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-15 Thread Gilles Gouaillardet
. Cheers, Gilles On 2014/12/16 12:43, Gilles Gouaillardet wrote: > Eric, > > thanks for the simple test program. > > i think i see what is going wrong and i will make some changes to avoid > the memory overflow. > > that being said, there is a hard coded limit of 256 charac

Re: [OMPI users] OMPI users] OMPI users] OMPI users] OMPI users] MPI inside MPI (still)

2014-12-17 Thread Gilles Gouaillardet
future release. > > >However, siesta is launched only by specifying input/output files with i/o >redirection like > >mpirun -n   siesta < infile > outfile > > >So far, I could not find anything about how to set an stdin file for an >spawnee process. >

Re: [OMPI users] OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-17 Thread Gilles Gouaillardet
ere are some limitations but it works very well for our >uses... and until a "real" fix is proposed... > >Thanks for helping! > >Eric > > >On 12/15/2014 11:42 PM, Gilles Gouaillardet wrote: >> Eric and all, >> >> That is clearly a limitation in rom

Re: [OMPI users] OMPI users] ERROR: C_FUNLOC function

2014-12-18 Thread Gilles Gouaillardet
FWIW I faced a similar issue on my linux virtualbox. My shared folder is a vboxfs filesystem, but statfs returns the nfs magic id. That causes some mess and the test fails. At this stage i cannot tell whether i should blame the glibc, the kernel, a virtualbox driver or myself Cheers, Gilles Mike

Re: [OMPI users] processes hang with openmpi-dev-602-g82c02b4

2014-12-24 Thread Gilles Gouaillardet
Siegmar, could you please give a try to the attached patch ? /* and keep in mind this is just a workaround that happen to work */ Cheers, Gilles On 2014/12/22 22:48, Siegmar Gross wrote: > Hi, > > today I installed openmpi-dev-602-g82c02b4 on my machines (Solaris 10 Sparc, > Solaris 10 x86_64,

Re: [OMPI users] processes hang with openmpi-dev-602-g82c02b4

2014-12-24 Thread Gilles Gouaillardet
Kawashima-san, i'd rather consider this as a bug in the README (!) heterogeneous support has been broken for some time, but it was eventually fixed. the truth is there are *very* limited resources (both human and hardware) maintaining heterogeneous support, but that does not mean heterogeneous suppo

Re: [OMPI users] OMPI users] What could cause a segfault in OpenMPI?

2014-12-28 Thread Gilles Gouaillardet
Where does the error occur ? MPI_Init ? MPI_Finalize ? In between ? In the first case, the bug is likely a mishandled error case, which means OpenMPI is unlikely to be the root cause of the crash. Did you check infiniband is up and running on your cluster ? Cheers, Gilles Saliya Ekanayake wrote: >

Re: [OMPI users] OMPI users] OMPI users] What could cause a segfault in OpenMPI?

2014-12-29 Thread Gilles Gouaillardet
lat and ib_read_bw >that measures latency and bandwith between two nodes. They are part of >the "perftest" repo package." > >On Dec 28, 2014 10:20 AM, "Saliya Ekanayake" wrote: > >This happens at MPI_Init. I've attached the full error message. >

Re: [OMPI users] OMPI users] Icreasing OFED registerable memory

2014-12-30 Thread Gilles Gouaillardet
FWIW ompi does not yet support XRC with OFED 3.12. Cheers, Gilles Deva wrote: >Hi Waleed, > > >It is highly recommended to upgrade to latest OFED.  Meanwhile, Can you try >latest OMPI release (v1.8.4), where this warning is ignored on older OFEDs > > >-Devendar  > > >On Sun, Dec 28, 2014 at 6:

Re: [OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-02 Thread Gilles Gouaillardet
Diego, First, i recommend you redefine tParticle and add a padding integer so everything is aligned. Before invoking MPI_Type_create_struct, you need to call MPI_Get_address(dummy, base, MPI%err) displacements = displacements - base MPI_Type_create_resized might be unnecessary if tParticle is
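A C sketch of that recipe (the thread itself is Fortran, and the tParticle layout below is an assumption for illustration): take the address of each member, subtract the base address, and build the datatype from the relative displacements.

#include <mpi.h>
#include <stdio.h>

/* illustrative layout: three reals plus an integer and a padding integer */
typedef struct {
    double x, y, z;
    int    ip;
    int    pad;      /* padding member so the struct stays naturally aligned */
} tParticle;

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    tParticle dummy;
    MPI_Aint base, disp[2];
    int blocklen[2] = { 3, 2 };
    MPI_Datatype types[2] = { MPI_DOUBLE, MPI_INT };
    MPI_Datatype tparticle_type;

    MPI_Get_address(&dummy, &base);
    MPI_Get_address(&dummy.x, &disp[0]);
    MPI_Get_address(&dummy.ip, &disp[1]);
    disp[0] -= base;                 /* displacements = displacements - base */
    disp[1] -= base;

    MPI_Type_create_struct(2, blocklen, disp, types, &tparticle_type);
    MPI_Type_commit(&tparticle_type);
    /* with the padding member the default extent already matches
     * sizeof(tParticle), so MPI_Type_create_resized is not needed here */

    printf("sizeof(tParticle) = %zu bytes\n", sizeof(tParticle));

    MPI_Type_free(&tparticle_type);
    MPI_Finalize();
    return 0;
}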

Re: [OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-02 Thread Gilles Gouaillardet
t; > >What do you think? > >George, Did i miss something? > > >Thanks a lot > > > > >Diego > > >On 2 January 2015 at 12:51, Gilles Gouaillardet > wrote: > >Diego, > >First, i recommend you redefine tParticle and add a p

Re: [OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-04 Thread Gilles Gouaillardet
gt; What do you meam "remove mpi_get_address(dummy) from all displacements". > > Thanks for all your help > > Diego > > > > Diego > > > On 3 January 2015 at 00:45, Gilles Gouaillardet < > gilles.gouaillar...@gmail.com> wrote: > >> Die

Re: [OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-04 Thread Gilles Gouaillardet
MENTS* > * ENDIF* > > and the results is: > >*139835891001320 -139835852218120 -139835852213832* > * -139835852195016 8030673735967299609* > > I am not able to understand it. > > Thanks a lot. > > In the attachment you can find the program > > > >

Re: [OMPI users] OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-05 Thread Gilles Gouaillardet
ces in displacements(2), I have only an integer in >dummy%ip? > >Why do you use dummy(1) and dummy(2)? > > >Thanks a lot     > > > >Diego > > >On 5 January 2015 at 02:44, Gilles Gouaillardet > wrote: > >Diego, > >MPI_Get_address was invoked wi

Re: [OMPI users] OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-07 Thread Gilles Gouaillardet
Diego, my bad, i should have passed displacements(1) to MPI_Type_create_struct. here is an updated version (note you have to use a REQUEST integer for MPI_Isend and MPI_Irecv, and you also have to call MPI_Wait to ensure the requests complete) Cheers, Gilles On 2015/01/08 8:23, Diego Avesani w
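A minimal C sketch of that point about nonblocking calls (the actual thread code is Fortran): every MPI_Isend / MPI_Irecv returns a request that must be completed with MPI_Wait before the buffer is reused.

#include <mpi.h>
#include <stdio.h>

/* run with at least 2 ranks, e.g. mpirun -np 2 ./a.out */
int main(int argc, char *argv[])
{
    int rank, value = 0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* do not reuse value before this */
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* value is only valid after the wait */
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}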

Re: [OMPI users] difference of behaviour for MPI_Publish_name between openmpi-1.4.5 and openmpi-1.8.4

2015-01-07 Thread Gilles Gouaillardet
Well, per the source code, this is not a bug but a feature : from publish function from ompi/mca/pubsub/orte/pubsub_orte.c ompi_info_get_bool(info, "ompi_unique", &unique, &flag); if (0 == flag) { /* uniqueness not specified - overwrite by default */ unique = false; }
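A hedged sketch of how a caller would request the old behaviour through that info key: setting "ompi_unique" to "true" should make a second publish of the same name fail instead of overwriting it ("my_service" is a placeholder name).

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Info info;

    MPI_Init(&argc, &argv);
    MPI_Open_port(MPI_INFO_NULL, port);

    MPI_Info_create(&info);
    MPI_Info_set(info, "ompi_unique", "true");   /* default is false: overwrite */
    MPI_Publish_name("my_service", info, port);

    /* ... accept connections on the published port here ... */

    MPI_Unpublish_name("my_service", info, port);
    MPI_Info_free(&info);
    MPI_Close_port(port);
    MPI_Finalize();
    return 0;
}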

Re: [OMPI users] difference of behaviour for MPI_Publish_name between openmpi-1.4.5 and openmpi-1.8.4

2015-01-07 Thread Gilles Gouaillardet
ust as > reasonable as the alternative (I believe we flipped a coin) > > >> On Jan 7, 2015, at 6:47 PM, Gilles Gouaillardet >> wrote: >> >> Well, per the source code, this is not a bug but a feature : >> >> >> from publish function from ompi/mca/p

Re: [OMPI users] OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-08 Thread Gilles Gouaillardet
the program run in your case? > > Thanks again > > > > Diego > > > On 8 January 2015 at 03:02, Gilles Gouaillardet < > gilles.gouaillar...@iferc.org> wrote: > >> Diego, >> >> my bad, i should have passed displacements(1) to MPI_Type_create_stru

Re: [OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-12 Thread Gilles Gouaillardet
> Attached is my copy of your program with fixes for the above-mentioned issues. > > BTW, I missed the beginning of this thread -- I assume that this is an > artificial use of mpi_type_create_resized for the purposes of a small > example. The specific use of it in this program ap

Re: [OMPI users] error building openmpi-dev-685-g881b1dc on Soalris 10

2015-01-13 Thread Gilles Gouaillardet
Hi Siegmar, could you please try again after adding '-D_STDC_C99' to your CFLAGS ? Thanks and regards, Gilles On 2015/01/12 20:54, Siegmar Gross wrote: > Hi, > > today I tried to build openmpi-dev-685-g881b1dc on my machines > (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 > x86_6

Re: [OMPI users] Problems compiling OpenMPI 1.8.4 with GCC 4.9.2

2015-01-14 Thread Gilles Gouaillardet
Ryan, this issue has already been reported. please refer to http://www.open-mpi.org/community/lists/users/2015/01/26134.php for a workaround Cheers, Gilles On 2015/01/14 16:35, Novosielski, Ryan wrote: > OpenMPI 1.8.4 does not appear to be buildable with GCC 4.9.2. The output, as > requested

Re: [OMPI users] Segfault in mpi-java

2015-01-22 Thread Gilles Gouaillardet
Alexander, i was able to reproduce this behaviour. basically, bad things happen when the garbage collector is invoked ... i was even able to reproduce some crashes (though they happen at random stages) very early in the code by manually inserting calls to the garbage collector (e.g. System.gc();) C

Re: [OMPI users] using multiple IB connections between hosts

2015-02-01 Thread Gilles Gouaillardet
Dave, the QDR Infiniband uses the openib btl (by default : btl_openib_exclusivity=1024) i assume the RoCE 10Gbps card is using the tcp btl (by default : btl_tcp_exclusivity=100) that means that by default, when both openib and tcp btl could be used, the tcp btl is discarded. could you give a try

Re: [OMPI users] cross-compiling openmpi-1.8.4 with static linking

2015-02-09 Thread Gilles Gouaillardet
Simona, On 2015/02/08 20:45, simona bellavista wrote: > I have two systems A (aka Host) and B (aka Target). On A a compiler suite > is installed (intel 14.0.2), on B there is no compiler. I want to compile > openmpi on A for running it on system B (in particular, I want to use > mpirun and mpif90)

Re: [OMPI users] Open MPI collectives algorithm selection

2015-03-10 Thread Gilles Gouaillardet
Khalid, i am not aware of such a mechanism. /* there might be a way to use MPI_T_* mechanisms to force the algorithm, and i will let other folks comment on that */ you definitely cannot directly invoke ompi_coll_tuned_bcast_intra_binomial (abstraction violation, non portable, and you miss the som

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Gilles Gouaillardet
min or if you know what you are doing, you can try mpirun -mca sec basic) on blue waters, that would mean ompi does not run out of the box, but fails with an understandable message. that would be less user friendly, but more secure any thoughts ? Cheers, Gilles [gouaillardet@node0

Re: [OMPI users] open mpi on blue waters

2015-03-26 Thread Gilles Gouaillardet
On 2015/03/26 13:00, Ralph Castain wrote: > Well, I did some digging around, and this PR looks like the right solution. ok then :-) following stuff is not directly related to ompi, but you might want to comment on that anyway ... > Second, the running of munge on the IO nodes is not only okay but

Re: [OMPI users] open mpi on blue waters

2015-03-26 Thread Gilles Gouaillardet
see Munge is/can be used by both SLURM and > TORQUE. > (http://docs.adaptivecomputing.com/torque/4-0-2/Content/topics/1-installConfig/serverConfig.htm#usingMUNGEAuth) > > If I misunderstood the drift, please ignore ;-) > > Mark > > >> On 26 Mar 2015, at 5:38 , Gilles

Re: [OMPI users] Isend, Recv and Test

2016-05-06 Thread Gilles Gouaillardet
per the error message, you likely misspelled vader (e.g. missed the "r") Jeff, the behavior was initially reported on a single node, so the tcp btl is unlikely to be used Cheers, Gilles On Friday, May 6, 2016, Zhen Wang wrote: > > > 2016-05-05 9:27 GMT-05:00 Gilles Gouaillardet &

Re: [OMPI users] SLOAVx alltoallv

2016-05-06 Thread Gilles Gouaillardet
Dave, I briefly read the papers and they suggest the SLOAVx algorithm is implemented by the ml collective module. this module had some issues and was judged not good for production. it is disabled by default in the v1.10 series, and has been simply removed from the v2.x branch. you can either use (

Re: [OMPI users] Error building openmpi-dev-4010-g6c9d65c on Linux with Sun C

2016-05-06 Thread Gilles Gouaillardet
Siegmar, at first glance, this looks like a crash of the compiler. so I guess the root cause is not openmpi (that being said, a workaround could be implemented in openmpi) Cheers, Gilles On Saturday, May 7, 2016, Siegmar Gross < siegmar.gr...@informatik.hs-fulda.de> wrote: > Hi, > > today I tr

Re: [OMPI users] warning message for process binding with openmpi-dev-4010-g6c9d65c

2016-05-07 Thread Gilles Gouaillardet
Siegmar, did you upgrade your os recently ? or change hyper threading settings ? this error message typically appears when the numactl-devel rpm is not installed (numactl-devel on redhat, the package name might differ on sles) if not, would you mind retesting from scratch a previous tarball that

Re: [OMPI users] problem with Sun C 5.14 beta

2016-05-07 Thread Gilles Gouaillardet
Siegmar, per the config.log, you need to update your CXXFLAGS="-m64 -library=stlport4 -std=sun03" or just CXXFLAGS="-m64" Cheers, Gilles On Saturday, May 7, 2016, Siegmar Gross < siegmar.gr...@informatik.hs-fulda.de> wrote: > Hi, > > today I tried to install openmpi-v1.10.2-176-g9d45e07 on my

Re: [OMPI users] Incorrect function call in simple C program

2016-05-09 Thread Gilles Gouaillardet
Devon, send() is a libc function that is used internally by Open MPI, and it uses your user function instead of the libc one. simply rename your function mysend() or something else that is not used by libc, and your issue will likely be fixed Cheers, Gilles On Tuesday, May 10, 2016, Devon Hollow
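A minimal sketch of the suggested fix (run with at least 2 ranks; the body of the renamed function is only illustrative):

#include <mpi.h>
#include <stdio.h>

/* renamed from send() so it no longer shadows the libc send(2) that
 * Open MPI's TCP code calls internally */
static void mysend(const int *buf, int dest)
{
    MPI_Send(buf, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
}

int main(int argc, char *argv[])
{
    int rank, data = 7;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        mysend(&data, 1);
    } else if (rank == 1) {
        MPI_Recv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", data);
    }
    MPI_Finalize();
    return 0;
}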

Re: [OMPI users] 'AINT' undeclared

2016-05-09 Thread Gilles Gouaillardet
Hi, i was able to build openmpi 1.10.2 with the same configure command line (after i quoted the LDFLAGS parameters) can you please run grep SIZEOF_PTRDIFF_T config.status it should be 4 or 8, but it seems different in your environment (!) are you running 32 or 64 bit kernel ? on which p

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-10 Thread Gilles Gouaillardet
you can direct OpenMPI to only use a specific range of ports (that should be open in your firewall configuration) mpirun --mca oob_tcp_static_ipv4_ports - ... if you use the tcp btl, you can (also) use mpirun --mca btl_tcp_port_min_v4 --mca btl_tcp_port_range_v4 ... Cheers, Gilles On

Re: [OMPI users] problem with ld for Sun C 5.14 beta and openmpi-dev-4010-g6c9d65c

2016-05-10 Thread Gilles Gouaillardet
Siegmar, this issue was previously reported at http://www.open-mpi.org/community/lists/devel/2016/05/18923.php i just pushed the patch Cheers, Gilles On 5/10/2016 2:27 PM, Siegmar Gross wrote: Hi, I tried to install openmpi-dev-4010-g6c9d65c on my "SUSE Linux Enterprise Server 12 (x86

Re: [OMPI users] Incorrect function call in simple C program

2016-05-10 Thread Gilles Gouaillardet
; That worked perfectly. Thank you. I'm surprised that clang didn't emit a > warning about this! > > -Devon > > On Mon, May 9, 2016 at 3:42 PM, Gilles Gouaillardet < > gilles.gouaillar...@gmail.com > > wrote: > >> Devon, >> >> send() is a libc

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-10 Thread Gilles Gouaillardet
I was basically suggesting you open a few ports to anyone (e.g. any IP address), and Jeff suggests you open all ports to a few trusted IP addresses. btw, how many network ports do you have ? if you have two ports (e.g. eth0 for external access and eth1 for private network) and MPI should only use

Re: [OMPI users] 'AINT' undeclared

2016-05-10 Thread Gilles Gouaillardet
Ilias, at first glance, you are using the PGI preprocessor (!) can you re-run configure with CPP=cpp, or after removing all PGI related environment variables, and see if it helps ? Cheers, Gilles On Wednesday, May 11, 2016, Ilias Miroslav wrote: > https://www.open-mpi.org/community/lists/use

Re: [OMPI users] Question about mpirun mca_oob_tcp_recv_handler error.

2016-05-11 Thread Gilles Gouaillardet
Hi, Where did you get the openmpi package from ? fc20 ships openmpi 1.7.3 ... does it work as expected if you do not use mpirun (e.g. ./hello_c) if yes, then you can try ldd hello_c which mpirun ldd mpirun mpirun -np 1 ldd hello_c and confirm both mpirun and hello_c use the same mpi
