Re: [OMPI users] fatal error: ac_nonexistent.h: No such file or directory (openmpi-4.0.0)

2019-04-20 Thread Gilles Gouaillardet via users
The root cause is configure cannot run a simple Fortran program (see the relevant log below) I suggest you export LD_LIBRARY_PATH=/share/apps/gcc-5.4.0/lib64:$LD_LIBRARY_PATH and then try again. Cheers, Gilles configure:44254: checking Fortran value of selected_int_kind(4) configure:44281: /sh

Re: [OMPI users] 3.0.4, 4.0.1 build failure on OSX Mojave with LLVM

2019-04-24 Thread Gilles Gouaillardet via users
John, what if you move some parameters to CPPFLAGS and CXXCPPFLAGS (see the new configure command line below) Cheers, Gilles '/Users/cary/projects/ulixesall-llvm/builds/openmpi-4.0.1/nodl/../configure' \ --prefix=/Volumes/GordianStorage/opt/contrib-llvm7_appleclang/openmpi-4.0.1-nodl \ CC='/Vol

Re: [OMPI users] error running mpirun command

2019-05-03 Thread Gilles Gouaillardet via users
Eric, which version of Open MPI are you using ? how many hosts in your hostsfile ? The error message suggests this could be a bug within Open MPI, and a potential workaround for you would be to try mpirun -np 84 --hostfile hostsfile --mca routed direct ./openmpi_hello.c You might also want to

Re: [OMPI users] undefined reference error related to ucx

2019-06-25 Thread Gilles Gouaillardet via users
Passant, UCX 1.6.0 is not yet officially released, and it seems Open MPI (4.0.1) does not support it yet, and some porting is needed. Cheers, Gilles On Tue, Jun 25, 2019 at 5:13 PM Passant A. Hafez via users wrote: > > Hello, > > > I'm trying to build ompi 4.0.1 with external ucx 1.6.0 but I'm

Re: [OMPI users] undefined reference error related to ucx

2019-06-25 Thread Gilles Gouaillardet via users
d here https://github.com/openucx/ucx/issues/3336 that the UCX 1.6 might solve this issue, so I tried the pre-release version to just check if it will. All the best, -- Passant From: users on behalf of Gilles Gouaillardet via users Sent: Tuesday, June 2

Re: [OMPI users] Possible bugs in MPI_Neighbor_alltoallv()

2019-06-27 Thread Gilles Gouaillardet via users
Thanks Junchao, I issued https://github.com/open-mpi/ompi/pull/6782 in order to fix this (and the alltoallw variant as well) Meanwhile, you can manually download and apply the patch at https://github.com/open-mpi/ompi/pull/6782.patch Cheers, Gilles On 6/28/2019 1:10 PM, Zhang, Junchao

Re: [OMPI users] Problems with MPI_Comm_spawn

2019-07-02 Thread Gilles Gouaillardet via users
Thanks for the report, this is indeed a bug I fixed at https://github.com/open-mpi/ompi/pull/6790 meanwhile, you can manually download and apply the patch at https://github.com/open-mpi/ompi/pull/6790.patch Cheers, Gilles On 7/3/2019 1:30 AM, Gyevi-Nagy László via users wrote: Hi, I

Re: [OMPI users] Naming scheme of PSM2 and Vader shared memory segments

2019-07-07 Thread Gilles Gouaillardet via users
Sebastian, the PSM2 shared memory segment name is set by the PSM2 library and my understanding is that Open MPI has no control over it. If you believe the root cause of the crash is related to non unique PSM2 shared memory segment name, I guess you should report this at https://github.com

Re: [OMPI users] How is the rank determined (Open MPI and Podman)

2019-07-11 Thread Gilles Gouaillardet via users
Adrian, the MPI application relies on some environment variables (they typically start with OMPI_ and PMIX_). The MPI application internally uses a PMIx client that must be able to contact a PMIx server (that is included in mpirun and the orted daemon(s) spawned on the remote hosts). lo

Re: [OMPI users] How is the rank determined (Open MPI and Podman)

2019-07-12 Thread Gilles Gouaillardet via users
procs total) >>--> Process # 0 of 2 is alive. ->test1 >>--> Process # 1 of 2 is alive. ->test2 >> >> I need to tell Podman to mount /tmp from the host into the container, as >> I am running rootless I also need to tell Podman to us

Re: [OMPI users] How is the rank determined (Open MPI and Podman)

2019-07-22 Thread Gilles Gouaillardet via users
ugh. Not sure yet if this related to the fact that Podman is running rootless. I will continue to investigate, but now I know where to look. Thanks! Adrian On Fri, Jul 12, 2019 at 06:48:59PM +0900, Gilles Gouaillardet via users wrote: Adrian, Can you try mpirun --mca btl_vader_copy_me

Re: [OMPI users] When is it safe to free the buffer after MPI_Isend?

2019-07-27 Thread Gilles Gouaillardet via users
Carlos, MPI_Isend() does not automatically free the buffer after it sends the message (it simply cannot do so, since the buffer might be pointing to a global variable or to the stack). Can you please extract a reproducer from your program ? Out of curiosity, what if you insert a (useless) MPI_
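
A minimal sketch of the point above (in C; two ranks assumed, buffer size arbitrary): the send buffer may only be reused or freed once the request has completed via MPI_Wait() or a successful MPI_Test(), not as soon as MPI_Isend() returns.

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Request req;
        double *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        buf = malloc(1000 * sizeof(double));

        if (rank == 0) {
            MPI_Isend(buf, 1000, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
            /* NOT safe to free(buf) here: the send may still be in flight */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            free(buf);   /* safe: the request has completed */
        } else if (rank == 1) {
            MPI_Recv(buf, 1000, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            free(buf);
        }
        MPI_Finalize();
        return 0;
    }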

Re: [OMPI users] OpenMPI 2.1.1 bug on Ubuntu 18.04.2 LTS

2019-08-01 Thread Gilles Gouaillardet via users
Junchao, Is the issue related to https://github.com/open-mpi/ompi/pull/4501 ? Jeff, you might have to configure with --enable-heterogeneous to evidence the issue Cheers, Gilles On 8/2/2019 4:06 AM, Jeff Squyres (jsquyres) via users wrote: I am able to replicate the issue on a stock

Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-08 Thread Gilles GOUAILLARDET via users
Hi, You need to configure --with-pmi ... Cheers, Gilles On August 8, 2019, at 11:28 PM, Jing Gong via users wrote: Hi, Recently our Slurm system has been upgraded to 19.0.5. I tried to recompile openmpi v3.0 due to the bug reported in https://bugs.schedmd.com/show_bug.cgi?id=6993

Re: [OMPI users] Error with OpenMPI: Could not resolve generic procedure mpi_irecv

2019-08-19 Thread Gilles Gouaillardet via users
Hi, Can you please post a full but minimal example that evidences the issue? Also please post your Open MPI configure command line. Cheers, Gilles Sent from my iPod > On Aug 19, 2019, at 18:13, Sangam B via users > wrote: > > Hi, > > I get following error if the application is compiled

Re: [OMPI users] Error with OpenMPI: Could not resolve generic procedure mpi_irecv

2019-08-19 Thread Gilles Gouaillardet via users
Thanks, and your reproducer is ? Cheers, Gilles On Mon, Aug 19, 2019 at 6:42 PM Sangam B via users wrote: > > Hi, > > OpenMPI is configured as follows: > > export CC=`which clang` > export CXX=`which clang++` > export FC=`which flang` > export F90=`which flang` > > ../configure --prefix=/sw/op

Re: [OMPI users] Error with OpenMPI: Could not resolve generic procedure mpi_irecv

2019-08-19 Thread Gilles Gouaillardet via users
One more thing ... Your initial message mentioned a failure with gcc 8.2.0, but your follow-up message mentions LLVM compiler. So which compiler did you use to build Open MPI that fails to build your test ? Cheers, Gilles On Mon, Aug 19, 2019 at 6:49 PM Gilles Gouaillardet wrote: > > Thanks,

Re: [OMPI users] Error with OpenMPI: Could not resolve generic procedure mpi_irecv

2019-08-19 Thread Gilles Gouaillardet via users
both case. > > -- > > > On Mon, Aug 19, 2019 at 3:25 PM Gilles Gouaillardet via users > wrote: >> >> One more thing ... >> >> Your initial message mentioned a failure with gcc 8.2.0, but your >> follow-up message mentions LLVM compiler. >>

Re: [OMPI users] Error with OpenMPI: Could not resolve generic procedure mpi_irecv

2019-08-19 Thread Gilles Gouaillardet via users
> size = this%size_dim(this%gi)*this%size_dim(this%gj)*cs3 > if(this%is_exchange_off) then > call this%update_stats(size) > this%bf(:,:,1:cs3) = cmplx(0.,0.) > else > call MPI_Irecv(this%bf(:,:,1:cs3),size,MPI_COMPLEX_TYPE,& > this%nrank,t

Re: [OMPI users] Parameters at run time

2019-10-20 Thread Gilles Gouaillardet via users
Raymond, In the case of UCX, you can mpirun --mca pml_base_verbose 10 ... If the pml/ucx component is used, then your app will run over UCX. If the pml/ob1 component is used, then you can mpirun --mca btl_base_verbose 10 ... btl/self should be used for communications to itself. if btl/uct

Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Gilles Gouaillardet via users
Orion, thanks for the report. I can confirm this is indeed an Open MPI bug. FWIW, a workaround is to disable the fcoll/vulcan component. That can be achieved by mpirun --mca fcoll ^vulcan ... or OMPI_MCA_fcoll=^vulcan mpirun ... I also noted the tst_parallel3 program crashes with the RO

Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-28 Thread Gilles Gouaillardet via users
Charles, unless you expect yes or no answers, can you please post a simple program that evidences the issue you are facing ? Cheers, Gilles On 10/29/2019 6:37 AM, Garrett, Charles via users wrote: Does anyone have any idea why this is happening?  Has anyone seen this problem before?

Re: [OMPI users] mpirun --output-filename behavior

2019-10-31 Thread Gilles Gouaillardet via users
Joseph, you can achieve this via an agent (and it works with DDT too) For example, the nostderr script below redirects each MPI task's stderr to /dev/null (so it is not forwarded to mpirun) $ cat nostderr #!/bin/sh exec 2> /dev/null exec "$@" and then you can simply $ mpirun --mca or

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Gilles GOUAILLARDET via users
via users wrote: Gilles, Thanks for your suggestions! I just tried both of them, see below: On 11/1/19 1:15 AM, Gilles Gouaillardet via users wrote: > Joseph, > > > you can achieve this via an agent (and it works with DDT too) > > > For example, the nostderr script

Re: [OMPI users] MPI_Iallreduce with multidimensional Fortran array

2019-11-13 Thread Gilles Gouaillardet via users
Camille, your program is only valid with an MPI library that features MPI_SUBARRAYS_SUPPORTED and this is not (yet) the case in Open MPI. A possible fix is to use an intermediate contiguous buffer:   integer, allocatable, dimension(:,:,:,:) :: tmp   allocate( tmp(N,N,N,N) ) and then repla

Re: [OMPI users] speed of model is slow with openmpi

2019-11-27 Thread Gilles Gouaillardet via users
Your gfortran command line strongly suggests your program is serial and does not use MPI at all. Consequently, mpirun will simply spawn 8 identical instances of the very same program, and no speed up should be expected (but you can expect some slow down and/or file corruption). If you obser

Re: [OMPI users] mca_oob_tcp_recv_handler: invalid message type: 15

2019-12-11 Thread Gilles Gouaillardet via users
Guido, This error message is from MPICH and not Open MPI. Make sure your environment is correct and the shared filesystem is mounted on the compute nodes. Cheers, Gilles Sent from my iPod > On Dec 12, 2019, at 1:44, Guido granda muñoz via users > wrote: > > Hi, > after following the ins

Re: [OMPI users] Optimized and portable Open MPI packaging in Guix

2019-12-20 Thread Gilles Gouaillardet via users
Ludovic, in order to figure out which interconnect is used, you can mpirun --mca pml_base_verbose 10 --mca mtl_base_verbose 10 --mca btl_base_verbose 10 ... the output might be a bit verbose, so here are a few tips on how to get it step by step first, mpirun --mca pml_base_verbose 10 ... in ord

Re: [OMPI users] HELP: openmpi is not using the specified infiniband interface !!

2020-01-14 Thread Gilles Gouaillardet via users
Soporte, The error message is from MPICH! If you intend to use Open MPI, fix your environment first Cheers, Gilles Sent from my iPod > On Jan 15, 2020, at 7:53, SOPORTE MODEMAT via users > wrote: > > Hello everyone. > > I would like somebody help me to figure out how can I make that the

Re: [OMPI users] OpenMPI 4.0.2 with PGI 19.10, will not build with hcoll

2020-01-25 Thread Gilles Gouaillardet via users
Thanks Jeff for the information and sharing the pointer. FWIW, this issue typically occurs when libtool pulls the -pthread flag from libhcoll.la that was compiled with a GNU compiler. The simplest workaround is to remove libhcoll.la (so libtool simply links with libhcoll.so and does not pull any c

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Gilles Gouaillardet via users
Hi, The log filenames suggests you are always running on a single node, is that correct ? Do you create the input file on the tmpfs once for all? before each run? Can you please post your mpirun command lines? If you did not bind the tasks, can you try again mpirun --bind-to core ... Ch

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Gilles Gouaillardet via users
Also, in mpi_just_read.c, what if you add MPI_Barrier(MPI_COMM_WORLD); right before invoking MPI_Finalize(); can you observe a similar performance degradation when moving from 32 to 64 tasks ? Cheers, Gilles - Original Message - Hi, The log filenames suggests you are al

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Gilles Gouaillardet via users
s on a resource: Bind to:CORE Node: compute-0 #processes: 2 #cpus: 1 You can override this protection by adding the "overload-allowed" option to your binding directive. — I will solve this and get back to you soon. Best regards, Al

Re: [OMPI users] How to prevent linking in GPFS when it is present

2020-03-29 Thread Gilles Gouaillardet via users
Jonathon, GPFS is used by both the ROMIO component (that comes from MPICH) and the fs/gpfs component that is used by ompio (native Open MPI MPI-IO so to speak). you should be able to disable both by running ac_cv_header_gpfs_h=no configure --without-gpfs ... Note that Open MPI is modular

Re: [OMPI users] Slow collective MPI File IO

2020-04-06 Thread Gilles GOUAILLARDET via users
Collin, Do you have any data to back up your claim? As long as MPI-IO is used to perform file I/O, the Fortran bindings overhead should be hardly noticeable. Cheers, Gilles On April 6, 2020, at 23:22, Collin Strassburger via users wrote: Hello,   Just a quick comment on this; is your

Re: [OMPI users] Slow collective MPI File IO

2020-04-06 Thread Gilles Gouaillardet via users
David, I suggest you rely on well established benchmarks such as IOR or iozone. As already pointed by Edgar, you first need to make sure you are not benchmarking your (memory) cache by comparing the bandwidth you measure vs the performance you can expect from your hardware. As a side note, unl

Re: [OMPI users] file/process write speed is not scalable

2020-04-09 Thread Gilles Gouaillardet via users
Note there could be some NUMA-IO effect, so I suggest you compare running every MPI tasks on socket 0, to running every MPI tasks on socket 1 and so on, and then compared to running one MPI task per socket. Also, what performance do you measure? - Is this something in line with the filesystem/netw

Re: [OMPI users] Hwlock library problem

2020-04-14 Thread Gilles Gouaillardet via users
Paul, this issue is likely the one already been reported at https://github.com/open-mpi/ompi/issues/7615 Several workarounds are documented, feel free to try some of them and report back (either on GitHub or this mailing list) Cheers, Gilles On Tue, Apr 14, 2020 at 11:18 PM フォンスポール J via users

Re: [OMPI users] Hwlock library problem

2020-04-15 Thread Gilles Gouaillardet via users
suggested that the > "-Wl," before "-force_load” might be a workable solution, but I don’t > understand what to change in order to accomplish this. Might you have some > suggestions. > > > > > > On Apr 15, 2020, at 0:06, Gilles Gouaillardet via users > w

Re: [OMPI users] Hwlock library problem

2020-04-15 Thread Gilles Gouaillardet via users
g. > link_static_flag="" > > After running “make clean”, I ran make again and found the same error. Any > ideas? > > FCLD libmpi_usempif08.la > ld: library not found for -lhwloc > make[2]: *** [libmpi_usempif08.la] Error 1 > make[1]: *** [all-recursive] Error

Re: [OMPI users] Hwlock library problem

2020-04-15 Thread Gilles Gouaillardet via users
Sorry for your trouble. > > > > > On Apr 16, 2020, at 11:49, Gilles Gouaillardet via users > > wrote: > > > > Paul, > > > > My ifort eval license on OSX has expired so I cannot test myself, > > sorry about that. > > > > It has been rep

Re: [OMPI users] OMPI v2.1.5 with Slurm

2020-04-21 Thread Gilles Gouaillardet via users
Levi, as a workaround, have you tried using mpirun instead of direct launch (e.g. srun) ? Note you are using pmix 1.2.5, so you likely want to srun --mpi=pmix_v1 Also, as reported by the logs 1. [nodeA:12838] OPAL ERROR: Error in file pmix3x_client.c at line 112 there is something fishy

Re: [OMPI users] Preloading the libraries "--preload-files" Effect

2020-04-21 Thread Gilles Gouaillardet via users
Kihang Youn, The --preload-files is used to preload files in a sense that files are copied to the remote hosts before the MPI application is started. If you are using the released versions of Open MPI, you can try the fork_agent. Basically, mpirun --mca orte_fork_agent /.../wrapper a.out will hav

Re: [OMPI users] Warnings

2020-05-04 Thread Gilles Gouaillardet via users
Hi, how many MPI tasks are you running? are you running from a terminal? from two different jobs? two mpirun within the same job? what happens next? hang? abort? crash? app runs just fine? fwiw, the message says that rank 3 received an unexpected connection from rank 4 Cheers, Gilles On Tue, M

Re: [OMPI users] ubuntu linux getting cores, not hyperthreading slots

2020-05-06 Thread Gilles Gouaillardet via users
Jim, You can mpirun --use-hwthread-cpus ... to have one slot = one hyperthread (default is slot = core) Note you always have the opportunity to mpirun --oversubscribe ... Cheers, Gilles - Original Message - I've just compiled my own version of ompi on Ubuntu 20.04 linux fo

Re: [OMPI users] Memchecker and MPI_Comm_spawn

2020-05-09 Thread Gilles Gouaillardet via users
Kurt, the error is "valgrind myApp" is not an executable (but this is a command a shell can interpret) so you have several options: - use a wrapper (e.g. myApp.valgrind) that forks & execs valgrind myApp - MPI_Comm_spawn("valgrind", argv, ...) after you inserted "myApp" at the beginning of argv -
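
A rough sketch of the second option (in C, one spawned task; "myApp" stands for the actual executable name): the argv array passed to MPI_Comm_spawn() does not include the command itself, so the application name becomes the first argument handed to valgrind.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Comm intercomm;
        /* "myApp" is inserted at the beginning of the spawned argv;
           the array must be NULL-terminated */
        char *spawn_argv[] = { "myApp", NULL };

        MPI_Init(&argc, &argv);
        MPI_Comm_spawn("valgrind", spawn_argv, 1, MPI_INFO_NULL, 0,
                       MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
        /* ... interact with the spawned task via intercomm ... */
        MPI_Finalize();
        return 0;
    }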

Re: [OMPI users] OpenMPI 4.3 without ucx

2020-05-10 Thread Gilles Gouaillardet via users
Patrick, what is the error when no environment variable is set? can you double check you do not have an old mca_pml_ucx.so in your //GCC9.3/openmpi/4.0.3/lib/openmpi directory? Cheers, Gilles On Sun, May 10, 2020 at 4:43 PM Patrick Bégou via users wrote: > > Hi all, > > I've built Ope

Re: [OMPI users] I can't build openmpi 4.0.X using PMIx 3.1.5 to use with Slurm

2020-05-11 Thread Gilles Gouaillardet via users
Leandro, First you must make sure SLURM has been built with PMIx (preferably PMIx 3.1.5) and the pmix plugin was built. From the Open MPI point of view, you do not need the --with-ompi-pmix-rte option. If you want to use srun, just make sure it uses pmix. you can srun --mpi=list If you want t

Re: [OMPI users] Do I need C++ bindings for Open MPI mpicc

2020-05-12 Thread Gilles Gouaillardet via users
Hi, no you do not. FWIW, MPI C++ bindings were removed from the standard a decade ago. mpicc is the wrapper for the C compiler, and the wrappers for the C++ compilers are mpic++, mpiCC and mpicxx. If your C++ application is only using the MPI C bindings, then you do not need --enable-mpi-cxx for t

Re: [OMPI users] Error with MPI_GET_ADDRESS and MPI_TYPE_CREATE_RESIZED?

2020-05-17 Thread Gilles Gouaillardet via users
Diego, Did you change your compiler options? Cheers, Gilles - Original Message - > Dear all, > > I would like to share with what I have done in oder to create my own MPI > data type. The strange thing is that it worked until some day ago and then > it stopped working. This because pr

Re: [OMPI users] Running mpirun with grid

2020-05-29 Thread Gilles Gouaillardet via users
John, Most of these questions are irrelevant with respect to the resolution of this problem. Please use this mailing list only for Open MPI related topics. Cheers, Gilles On Sat, May 30, 2020 at 3:24 PM John Hearns via users wrote: > > Good morning Vipul. I would like to ask some higher leve

Re: [OMPI users] Running mpirun with grid

2020-06-02 Thread Gilles Gouaillardet via users
Vipul, You can also use the launch_agent to debug that. Long story short mpirun --mca orte_launch_agent /.../agent.sh a.out will qrsh ... /.../agent.sh instead of qrsh ... orted at first, you can write a trivial agent that simply dumps the command line. you might also want to dump the environm

Re: [OMPI users] error while running mpirun

2020-07-12 Thread Gilles Gouaillardet via users
Srijan, The logs suggest you explicitly request the btl/sm component, and this typically occurs via a openmpi-mca-params.conf (that contains a line such as btl = sm,openib,self), or the OMPI_MCA_btl environment variable Cheers, Gilles On Mon, Jul 13, 2020 at 1:50 AM Srijan Chatterjee via users

Re: [OMPI users] include/mpi.h:201:32: error: two or more data types in declaration specifiers

2020-07-14 Thread Gilles Gouaillardet via users
Steve, For some unknown reason, this macro declaration seems to cause the crash ... when it is expanded, and the logs you posted do not show that. Could you please post a larger chunk of the logs? please run 'make -j 1' so no logs are interleaved and hence hard to decipher. Cheers, Gilles On Tu

Re: [OMPI users] MTU Size and Open-MPI/HPL Benchmark

2020-07-15 Thread Gilles GOUAILLARDET via users
John, On a small cluster, HPL is not communication intensive, so you are unlikely to see some improvements by tweaking the network. Instead, I'd rather suggest you run MPI benchmarks such as IMB (from Intel) or the OSU suite (from Ohio State University). Cheers, Gilles On July 15, 2020, at 2

Re: [OMPI users] include/mpi.h:201:32: error: two or more data types in declaration specifiers

2020-07-23 Thread Gilles Gouaillardet via users
In https://github.com/NCAR/WRFV3/blob/master/external/RSL_LITE/rsl_lite.h #ifndef MPI2_SUPPORT typedef int MPI_Fint; # define MPI_Comm_c2f(comm) (MPI_Fint)(comm) # define MPI_Comm_f2c(comm) (MPI_Comm)(comm) #endif so I guess the MPI2_SUPPORT macro is not defined, and it makes Open MPI a sad panda

Re: [OMPI users] Silent hangs with MPI_Ssend and MPI_Irecv

2020-07-25 Thread Gilles Gouaillardet via users
Sean, you might also want to confirm openib is (part of) the issue by running your app on TCP only. mpirun --mca pml ob1 --mca btl tcp,self ... Cheers, Gilles - Original Message - > Hi Sean, > > Thanks for the report! I have a few questions/suggestions: > > 1) What version of Open

Re: [OMPI users] segfault in libibverbs.so

2020-07-27 Thread Gilles Gouaillardet via users
Prentice, ibverbs might be used by UCX (either pml/ucx or btl/uct), so to be 100% sure, you should mpirun --mca pml ob1 --mca btl ^openib,uct ... in order to force btl/tcp, you need to ensure pml/ob1 is used, and then you always need the btl/self component mpirun --mca pml ob1 --mca btl tcp,se

Re: [OMPI users] Books/resources to learn (open)MPI from

2020-08-05 Thread Gilles Gouaillardet via users
Assuming you want to learn about MPI (and not the Open MPI internals), the books by Bill Gropp et al. are the reference : https://www.mcs.anl.gov/research/projects/mpi/usingmpi/ (Using MPI 3rd edition is affordable on amazon) Cheers, Gilles

Re: [OMPI users] Books/resources to learn (open)MPI from

2020-08-06 Thread Gilles Gouaillardet via users
the developers via github and/or the devel mailing list Cheers, Gilles On Thu, Aug 6, 2020 at 5:47 PM Oddo Da via users wrote: > > On Wed, Aug 5, 2020 at 11:06 PM Gilles Gouaillardet via users > wrote: >> >> Assuming you want to learn about MPI (and not the Open MPI interna

Re: [OMPI users] MPI is still dominant paradigm?

2020-08-07 Thread Gilles Gouaillardet via users
The goal of Open MPI is to provide a high quality implementation of the MPI standard, and the goal of this mailing list is to discuss Open MPI (and not the MPI standard) The Java bindings support "recent" JDK, and if you face an issue, please report a bug (either here or on github) Cheers, Gilles --

Re: [OMPI users] ORTE HNP Daemon Error - Generated by Tweaking MTU

2020-08-09 Thread Gilles Gouaillardet via users
John, I am not sure you will get much help here with a kernel crash caused by a tweaked driver. About HPL, you are more likely to get better performance with P and Q closer (e.g. 4x8 is likely better than 2x16 or 1x32). Also, HPL might have better performance with one MPI task per node and a mult

Re: [OMPI users] Issue with shared memory arrays in Fortran

2020-08-24 Thread Gilles Gouaillardet via users
Patrick, Thanks for the report and the reproducer. I was able to confirm the issue with python and Fortran, but - I can only reproduce it with pml/ucx (read --mca pml ob1 --mca btl tcp,self works fine) - I can only reproduce it with bcast algorithm 8 and 9 As a workaround, you can keep using u

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Gilles Gouaillardet via users
Hi Jorge, If a firewall is running on your nodes, I suggest you disable it and try again Cheers, Gilles On Wed, Oct 21, 2020 at 5:50 AM Jorge SILVA via users wrote: > > Hello, > > I installed kubuntu20.4.1 with openmpi 4.0.3-0ubuntu in two different > computers in the standard way. Compiling w

Re: [OMPI users] Anyone try building openmpi 4.0.5 w/ llvm 11

2020-10-22 Thread Gilles Gouaillardet via users
Alan, thanks for the report, I addressed this issue in https://github.com/open-mpi/ompi/pull/8116 As a temporary workaround, you can apply the attached patch. FWIW, f18 (shipped with LLVM 11.0.0) is still in development and uses gfortran under the hood. Cheers, Gilles On Wed, Oct 21, 2020 at

Re: [OMPI users] ompe support for filesystems

2020-10-31 Thread Gilles Gouaillardet via users
Hi Ognen, MPI-IO is implemented by two components: - ROMIO (from MPICH) - ompio ("native" Open MPI MPI-IO, default component unless running on Lustre) Assuming you want to add support for a new filesystem in ompio, first step is to implement a new component in the fs framework the framework is

Re: [OMPI users] 4.0.5 on Linux Pop!_OS

2020-11-07 Thread Gilles Gouaillardet via users
Paul, a "slot" is explicitly defined in the error message you copy/pasted: "If none of a hostfile, the --host command line parameter, or an RM is present, Open MPI defaults to the number of processor cores" The error message also lists 4 ways on how you can move forward, but you should first ask

Re: [OMPI users] Unable to run complicated MPI Program

2020-11-28 Thread Gilles Gouaillardet via users
Dean, That typically occurs when some nodes have multiple interfaces, and several nodes have a similar IP on a private/unused interface. I suggest you explicitly restrict the interface Open MPI should be using. For example, you can mpirun --mca btl_tcp_if_include eth0 ... Cheers, Gilles On Fr

Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Gilles Gouaillardet via users
Patrick, In recent Open MPI releases, the default component for MPI-IO is ompio (and no more romio) unless the file is on a Lustre filesystem. You can force romio with mpirun --mca io ^ompio ... Cheers, Gilles On 12/3/2020 4:20 PM, Patrick Bégou via users wrote: Hi, I'm using an old

Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Gilles Gouaillardet via users
> I was tracking this problem for several weeks but not looking in the > right direction (testing NFS server I/O, network bandwidth.) > > I think we will now move definitively to modern OpenMPI implementations. > > Patrick > > Le 03/12/2020 à 09:06, Gilles Gouaillardet vi

Re: [OMPI users] MPI_type_free question

2020-12-03 Thread Gilles Gouaillardet via users
Patrick, based on George's idea, a simpler check is to retrieve the Fortran index via the (standard) MPI_Type_c2f() function after you create a derived datatype. If the index keeps growing forever even after you MPI_Type_free(), then this clearly indicates a leak. Unfortunately, this simp
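
A rough sketch of that check (in C; the datatype and loop count are arbitrary): if the Fortran index returned by MPI_Type_c2f() keeps increasing across create/free cycles, datatype handles are leaking.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int i;
        MPI_Init(&argc, &argv);
        for (i = 0; i < 5; i++) {
            MPI_Datatype t;
            MPI_Type_contiguous(4, MPI_DOUBLE, &t);
            MPI_Type_commit(&t);
            /* a leak-free implementation recycles the index after MPI_Type_free() */
            printf("iteration %d: Fortran index = %d\n",
                   i, (int)MPI_Type_c2f(t));
            MPI_Type_free(&t);
        }
        MPI_Finalize();
        return 0;
    }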

Re: [OMPI users] MPI_type_free question

2020-12-04 Thread Gilles Gouaillardet via users
r Omnipath based. I will have to investigate too but not sure it is the same problem. Patrick Le 04/12/2020 à 01:34, Gilles Gouaillardet via users a écrit : Patrick, based on George's idea, a simpler check is to retrieve the Fortran index via the (standard) MPI_Type_c2f() function aft

Re: [OMPI users] MPI_type_free question

2020-12-10 Thread Gilles Gouaillardet via users
same binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5). The code is in attachment. I'll try to check type deallocation as soon as possible. Patrick Le 04/12/2020 à 01:34, Gilles Gouaillardet via users a écrit : Patrick, based on George's idea, a simpler check

Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Gilles Gouaillardet via users
the memory used by rank 0 before (blue) and after (red) the patch. > > Thanks > > Patrick > > > Le 10/12/2020 à 10:15, Gilles Gouaillardet via users a écrit : > > Patrick, > > > First, thank you very much for sharing the reproducer. > > > Yes, please open

Re: [OMPI users] MPMD hostfile: executables on same hosts

2020-12-21 Thread Gilles Gouaillardet via users
Vineet, probably *not* what you expect, but I guess you can try $ cat host-file host1 slots=3 host2 slots=3 host3 slots=3 $ mpirun -hostfile host-file -np 2 ./EXE1 : -np 1 ./EXE2 : -np 2 ./EXE1 : -np 1 ./EXE2 : -np 2 ./EXE1 : -np 1 ./EXE2 Cheers, Gilles On Mon, Dec 21, 2020 at 10:26 PM Vinee

Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-08 Thread Gilles Gouaillardet via users
Daniel, Can you please post the full error message and share a reproducer for this issue? Cheers, Gilles On Fri, Jan 8, 2021 at 10:25 PM Daniel Torres via users wrote: > > Hi all. > > Actually I'm implementing an algorithm that creates a process grid and > divides it into row and column commu

Re: [OMPI users] Confusing behaviour of compiler wrappers

2021-01-09 Thread Gilles Gouaillardet via users
Sajid, I believe this is a Spack issue and Open MPI cannot do anything about it (long story short, `module load openmpi-xyz` does not set the environment for the (spack) external `xpmem` library). I updated the spack issue with some potential workarounds you might want to give a try. Cheers, Gi

Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-11 Thread Gilles Gouaillardet via users
r TCP, where we use the socket timeout to prevent deadlocks. As you > already did quite a few communicator duplications and other collective > communications before you see the timeout, we need more info about this. As > Gilles indicated, having the complete output might help. What is

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-18 Thread Gilles Gouaillardet via users
Dave, On 1/19/2021 2:13 AM, Dave Love via users wrote: Generally it's not surprising if there's a shortage of effort when outside contributions seem unwelcome. I've tried to contribute several times. The final attempt wasted two or three days, after being encouraged to get the port of curren

Re: [OMPI users] Error with building OMPI with PGI

2021-01-19 Thread Gilles Gouaillardet via users
Passant, unless this is a copy paste error, the last error message reads plus zero three (-03), which is clearly an unknown switch (plus uppercase o three, -O3, is the known one). At the end of the configure, make sure Fortran bindings are generated. If the link error persists, you can ldd /.../libmpi_mpifh.so

Re: [OMPI users] Debugging a crash

2021-01-29 Thread Gilles Gouaillardet via users
Diego, the mpirun command line starts 2 MPI task, but the error log mentions rank 56, so unless there is a copy/paste error, this is highly suspicious. I invite you to check the filesystem usage on this node, and make sure there is a similar amount of available space in /tmp and /dev/shm (or othe

Re: [OMPI users] OMPI 4.1 in Cygwin packages?

2021-02-04 Thread Gilles Gouaillardet via users
Martin, this is a connectivity issue reported by the btl/tcp component. You can try restricting the IP interface to a subnet known to work (and with no firewall) between both hosts mpirun --mca btl_tcp_if_include 192.168.0.0/24 ... If the error persists, you can mpirun --mca btl_tcp_base_verbo

Re: [OMPI users] OMPI 4.1 in Cygwin packages?

2021-02-04 Thread Gilles Gouaillardet via users
ygwin packages? > > > > Do we know if this was definitely fixed in v4.1.x? > > > > On Feb 4, 2021, at 7:46 AM, Gilles Gouaillardet via users > > wrote: > > > > Martin, > > > > this is a connectivity issue reported by the btl/tcp component. >

Re: [OMPI users] OpenMPI 4.1.0 misidentifies x86 capabilities

2021-02-10 Thread Gilles Gouaillardet via users
Max, at configure time, Open MPI detects the *compiler* capabilities. In your case, your compiler can emit AVX512 code. (and fwiw, the tests are only compiled and never executed) Then at *runtime*, Open MPI detects the *CPU* capabilities. In your case, it should not invoke the functions containin
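
An illustrative sketch of that distinction (plain C, not the actual Open MPI detection code; assumes GCC or clang for __builtin_cpu_supports()): the preprocessor test reflects what the compiler was allowed to emit, while the runtime test reflects what the CPU can actually execute.

    #include <stdio.h>

    int main(void)
    {
    #if defined(__AVX512F__)
        /* compile-time view: AVX-512F code generation was enabled */
        puts("compiler: AVX-512F enabled");
    #else
        puts("compiler: AVX-512F not enabled");
    #endif
        /* runtime view: does this CPU actually support AVX-512F? */
        if (__builtin_cpu_supports("avx512f"))
            puts("cpu: AVX-512F supported");
        else
            puts("cpu: AVX-512F not supported");
        return 0;
    }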

Re: [OMPI users] GROMACS with openmpi

2021-02-11 Thread Gilles Gouaillardet via users
This is not an Open MPI question, and hence not a fit for this mailing list. But here we go: first, try cmake -DGMX_MPI=ON ... if it fails, try cmake -DGMX_MPI=ON -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx ... Cheers, Gilles - Original Message - Hi, MPI develope

Re: [OMPI users] weird mpi error report: Type mismatch between arguments

2021-02-17 Thread Gilles Gouaillardet via users
Diego, IIRC, you now have to build your gfortran 10 apps with -fallow-argument-mismatch Cheers, Gilles - Original Message - Dear OPENMPI users, i'd like to notify you a strange issue that arised right after installing a new up-to-date version of Linux (Kubuntu 20.10, with gcc-1

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-24 Thread Gilles Gouaillardet via users
Can you run ifconfig or ip addr in both Termux and ArchLinux for Termux? On 2/25/2021 2:00 PM, LINUS FERNANDES via users wrote: Why do I see the following error messages when executing mpirun on ArchLinux for Termux? The same program executes on Termux without any glitches. @loca

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
Is SELinux running on ArchLinux under Termux? On 2/25/2021 4:36 PM, LINUS FERNANDES via users wrote: Yes, I did not receive this in my inbox since I set to receive digest. ifconfig output: dummy0: flags=195 mtu 1500         inet6 fe80::38a0:1bff:fe81:d4f5 prefixlen 64 scopeid 0x20    

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
a wrappers which I > obviously can't on Termux since it doesn't support OpenJDK. > > On Thu, 25 Feb 2021, 13:37 Gilles Gouaillardet via users, > wrote: >> >> Is SELinux running on ArchLinux under Termux? >> >> On 2/25/2021 4:36 PM, LINUS FERNANDES via u

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
4/64 scope link >>>>valid_lft forever preferred_lft forever >>>> >>>> Errno==13 is EACCESS, which generically translates to "permission denied". >>>> Since you're running as root, this suggests that something outside of >>&g

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
yes, you need to (re)build Open MPI from source in order to try this trick. On 2/26/2021 3:55 PM, LINUS FERNANDES via users wrote: No change. What do you mean by running configure? Are you expecting me to build OpenMPI from source? On Fri, 26 Feb 2021, 11:16 Gilles Gouaillardet via users

Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?

2021-03-04 Thread Gilles Gouaillardet via users
On top of XPMEM, try to also force btl/vader with mpirun --mca pml ob1 --mca btl vader,self ... On Fri, Mar 5, 2021 at 8:37 AM Nathan Hjelm via users wrote: > > I would run the v4.x series and install xpmem if you can > (http://github.com/hjelmn/xpmem). You will need to build with > --with-xpme

Re: [OMPI users] config: gfortran: "could not run a simple Fortran program"

2021-03-07 Thread Gilles Gouaillardet via users
Anthony, Did you make sure you can compile a simple fortran program with gfortran? and gcc? Please compress and attach both openmpi-config.out and config.log, so we can diagnose the issue. Cheers, Gilles On Mon, Mar 8, 2021 at 6:48 AM Anthony Rollett via users wrote: > > I am trying to config

Re: [OMPI users] [External] Help with MPI and macOS Firewall

2021-03-18 Thread Gilles Gouaillardet via users
Matt, you can either mpirun --mca btl self,vader ... or export OMPI_MCA_btl=self,vader mpirun ... you may also add btl = self,vader in your /etc/openmpi-mca-params.conf and then simply mpirun ... Cheers, Gilles On Fri, Mar 19, 2021 at 5:44 AM Matt Thompson via users wrote: > > Prentice, >

Re: [OMPI users] HWLOC icc error

2021-03-23 Thread Gilles Gouaillardet via users
Luis, this file is never compiled when an external hwloc is used. Please open a github issue and include all the required information Cheers, Gilles On Tue, Mar 23, 2021 at 5:44 PM Luis Cebamanos via users wrote: > > Hello, > > Compiling OpenMPI 4.0.5 with Intel 2020 I came across this error

Re: [OMPI users] Building Open-MPI with Intel C

2021-04-06 Thread Gilles Gouaillardet via users
Michael, orted is able to find its dependencies to the Intel runtime on the host where you sourced the environment. However, it is unlikely able to do it on a remote host. For example, ssh ... ldd `which orted` will likely fail. An option is to use -rpath (and add the path to the Intel runtime). II

Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04?

2021-04-08 Thread Gilles Gouaillardet via users
Are you using gcc provided by Ubuntu 20.04? if not which compiler (vendor and version) are you using? My (light) understanding is that this patch should not impact performances, so I am not sure whether the performance being back is something I do not understand, or the side effect of a compiler

Re: [OMPI users] [EXTERNAL] Linker errors in Fedora 34 Docker container

2021-05-25 Thread Gilles Gouaillardet via users
Howard, I have a recollection of a similar issue that only occurs with the latest flex (that requires its own library to be passed to the linker). I cannot remember if this was a flex packaging issue, or if we ended up recommending to downgrade flex to a known to work version. The issue s

Re: [OMPI users] (Fedora 34, x86_64-pc-linux-gnu, openmpi-4.1.1.tar.gz): PML ucx cannot be selected

2021-05-28 Thread Gilles Gouaillardet via users
Jorge, pml/ucx used to be selected when no fast interconnect was detected (since ucx provides drivers for both TCP and shared memory). These providers are now disabled by default, so unless your machine has a supported fast interconnect (such as Infiniband), pml/ucx cannot be used out of the box a

Re: [OMPI users] how to suppress "libibverbs: Warning: couldn't load driver ..." messages?

2021-06-23 Thread Gilles Gouaillardet via users
Hi Jeff, Assuming you did **not** explicitly configure Open MPI with --disable-dlopen, you can try mpirun --mca pml ob1 --mca btl vader,self ... Cheers, Gilles On Thu, Jun 24, 2021 at 5:08 AM Jeff Hammond via users < users@lists.open-mpi.org> wrote: > I am running on a single node and do not n
