Re: [OMPI users] mpi program gets stuck

2022-12-07 Thread Jeff Squyres (jsquyres) via users
To tie up this issue for the web mail archives... There were a bunch more off-list emails exchanged on this thread. It was determined that something is going wrong down in the IB networking stack. It looks like it may be a problem in the environment itself, not Open MPI. The user is continui

Re: [OMPI users] mpi program gets stuck

2022-12-01 Thread Jeff Squyres (jsquyres) via users
OK, running ring_c produces the same type of output as your Python MPI app -- good. Using a C MPI program for testing just eliminates some possible variables / issues. Now let's try running again, but with some more command line parameters: mpirun -n 2 --machinefile hosts --mca plm_base_

Re: [OMPI users] mpi program gets stuck

2022-11-29 Thread Jeff Squyres (jsquyres) via users
(we've conversed a bit off-list; bringing this back to the list with a good subject to differentiate it from other digest threads) I'm glad the tarball I provided (that included the PMIx fix) resolved running "uptime" for you. Can you try running a plain C MPI program instead of a Python MPI pr

[OMPI users] mpi program gets stuck

2022-11-29 Thread timesir via users
see also: https://pastebin.com/s5tjaUkF

(py3.9) ➜ /share cat hosts
192.168.180.48 slots=1
192.168.60.203 slots=1

1. This command now runs correctly using your openmpi-gitclone-pr11096.tar.bz2
(py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca plm_base_verbose 100 --mca rmaps_base_verbose