Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
I don't know how openmpi does it, but I've definitely seen packages where "make clean" wipes the ".o" files but not the results of the configure process. Sometimes there's a "make distclean" which tries to get back closer to the as-untarred state. Noam On Jul 18, 2023, at 12:51 PM, Jeffrey Layton ...
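
A minimal sketch of the distinction described above, assuming a standard autotools-style source tree such as Open MPI's (exact target availability varies by package):

    make clean        # removes the .o files and built libraries, keeps configure results
    make distclean    # also removes config.status, config.log and generated Makefiles
    ./configure ...   # after a distclean, configure must be re-run before building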

Re: [OMPI users] mpi via python mpi4py can't use ucx pml

2023-07-07 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Looks like the issue is that mpi4py by default uses THREAD_MULTIPLE, which ucx does not support. It would be nice if the OpenMPI pml selection code provided information on what exactly caused ucx initialization to fail, but at least I know how to work around my problem now.
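
A hedged sketch of the workaround, assuming mpi4py >= 3.1 (which reads MPI4PY_RC_* environment variables) and that the application does not actually need MPI_THREAD_MULTIPLE; my_script.py is a placeholder. The same setting can be made in Python with mpi4py.rc.thread_level = "serialized" before the first "from mpi4py import MPI".

    # ask mpi4py to initialize MPI below THREAD_MULTIPLE so the ucx pml can be selected
    export MPI4PY_RC_THREAD_LEVEL=serialized
    mpirun -np 4 --mca pml ucx --mca osc ucx python my_script.py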

[OMPI users] mpi via python mpi4py can't use ucx pml

2023-07-06 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
I've been happily using OpenMPI 4.1.4 for a while, but I've run into a weird new problem. I mainly use it with ucx, typically running with the mpirun flags --bind-to core --report-bindings --mca pml ucx --mca osc ucx --mca btl ^vader,tcp,openib, and with our compiled Fortran codes it seems to work ...
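
For reference, a sketch of the same invocation with pml selection made more verbose, which can show why a given pml is accepted or rejected; my_mpi_program is a placeholder:

    mpirun --bind-to core --report-bindings \
           --mca pml ucx --mca osc ucx --mca btl ^vader,tcp,openib \
           --mca pml_base_verbose 10 --mca pml_ucx_verbose 10 \
           ./my_mpi_program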

Re: [OMPI users] Multiple mpirun instances crash

2023-02-08 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
I don't think it's inherently true that multiple mpiruns interfere with each other - we do that routinely, I think. Any chance that your jobs are doing something like writing to a common directory (like /tmp), and then interfering with each other? As an aside, you should consider the mpirun ...
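
One way to test the shared-directory theory is to give each concurrent job its own scratch area, sketched below; the paths and the Slurm job-ID variable are placeholders for whatever the local scheduler provides:

    export TMPDIR=/scratch/$USER/job_$SLURM_JOB_ID   # any per-job unique path
    mkdir -p "$TMPDIR"
    cd "$TMPDIR"
    mpirun -np 16 ./my_mpi_program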

Re: [OMPI users] Printing in a fortran MPI/OpenMP environment

2023-01-31 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Stdout from every process is gathered by mpirun and shown in the stdout of the shell where mpirun was started. There's a command-line option for mpirun to label lines by the MPI task, "--tag-output" I think. There's some OpenMP function you can use to determine the current OpenMP thread number, which ...
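
A small sketch of the labeling option mentioned above ("--tag-output" does exist in Open MPI's mpirun); the executable name is a placeholder, and inside the code omp_get_thread_num() is the standard OpenMP call that returns the current thread number:

    # prefix each output line with the job id and MPI rank that produced it
    mpirun -np 4 --tag-output ./my_hybrid_program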

Re: [OMPI users] ucx problems

2022-08-25 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Yeah, that appears to have been the issue - IB is entirely dead (it's a new machine, so maybe no subnet manager, or maybe a bad cable). I'll track that down, and follow up here if there's still an issue once the low-level IB problem is fixed. However, given that ucx says it supports shared memory ...
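
A hedged sketch of the low-level checks implied above, assuming the usual InfiniBand userspace tools are installed and that opensm is packaged as a service on the node expected to run the subnet manager:

    ibstat                    # port State should be Active and Physical state LinkUp
    systemctl status opensm   # if there is no managed switch, one node must run opensm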

Re: [OMPI users] ucx problems

2022-08-24 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Here is more information with higher verbosity: > mpirun -np 2 --mca pml ucx --mca osc ucx --bind-to core --map-by core --rank-by core --mca pml_ucx_verbose 100 --mca osc_ucx_verbose 100 --mca bml_base_verbose 100 mpi_executable [tin2:1137672] mca: base: components_register: registering framework ...
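
Independently of the Open MPI verbosity flags above, UCX's own ucx_info tool can list which transports it detects on the node (a sketch, assuming the UCX command-line utilities are installed):

    ucx_info -d | grep Transport   # e.g. rc_verbs, ud_verbs, posix/sysv shared memory, tcp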

[OMPI users] ucx problems

2022-08-24 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Hi all - I'm trying to get openmpi with ucx working on a new Rocky Linux 8 + OpenHPC machine. I'm used to running with mpirun --mca pml ucx --mca osc ucx --mca btl ^vader,tcp,openib --bind-to core --map-by core --rank-by core. However, now it complains that it can't start the pml, with the message ...
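
One way to get more detail from the UCX side when the pml refuses to start, sketched here assuming UCX honors its usual UCX_LOG_LEVEL environment variable; the executable name is a placeholder:

    UCX_LOG_LEVEL=info mpirun --mca pml ucx --mca osc ucx \
        --mca btl ^vader,tcp,openib --bind-to core --map-by core --rank-by core \
        ./my_mpi_program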

[OMPI users] mixed OpenMP/MPI

2022-03-15 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Hi - I'm trying to run multi-node mixed OpenMP/MPI with each MPI task bound to a set of cores. I thought this would be relatively straightforward with "--map-by slot:PE=$OMP_NUM_THREADS --bind-to core", but I can't get it to work. I couldn't figure out if it was a bug or just something missing ...
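
A sketch of the intended hybrid launch, using the flags quoted above and assuming 4 OpenMP threads per MPI task; --report-bindings is added to confirm the layout, and the executable name is a placeholder:

    export OMP_NUM_THREADS=4
    mpirun -np 8 --map-by slot:PE=$OMP_NUM_THREADS --bind-to core \
           --report-bindings ./my_hybrid_program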