Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-23 Thread Howard Pritchard via users
Hi All, I opened a new issue to track the coll_perf failure in case it's not related to the HDF5 problem reported earlier. https://github.com/open-mpi/ompi/issues/8246 Howard On Mon, 23 Nov 2020 at 12:14, Dave Love via users <users@lists.open-mpi.org> wrote: > Mark Dixon via users wri

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-15 Thread Howard Pritchard via users
Hi Martin, Thanks, this is helpful. Are you getting this timeout when you're running the spawner process as a singleton? Howard On Fri, 14 Aug 2020 at 17:44, Martín Morales <martineduardomora...@hotmail.com> wrote: > Howard, > > > > I pasted below, the error message after a while of the

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-14 Thread Howard Pritchard via users
Hi Martin, I opened an issue on Open MPI's GitHub to track this: https://github.com/open-mpi/ompi/issues/8005 You may be seeing another problem if you removed master from the host file. Could you add the --debug-daemons option to the mpirun command and post the output? Howard On Tue, 11 Aug 2020 at 1

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-13 Thread Howard Pritchard via users
Hi Ralph, I've not yet determined whether this is actually a PMIx issue or a problem with the way the dpm code in OMPI handles PMIx namespaces. Howard On Tue, 11 Aug 2020 at 19:34, Ralph Castain via users <users@lists.open-mpi.org> wrote: > Howard - if there is a problem in PMIx that is causing

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-10 Thread Howard Pritchard via users
Hi Martin, I was able to reproduce this with the 4.0.x branch. I'll open an issue. If you really want to use 4.0.4, then what you'll need to do is build an external PMIx 3.1.2 (the PMIx that was embedded in Open MPI 4.0.1), and then build Open MPI using --with-pmix=<where your pmix is installed> Y

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-10 Thread Howard Pritchard via users
Hello Martin, Between Open MPI 4.0.1 and Open MPI 4.0.4 we upgraded the internal PMIx version, which introduced a problem with spawn in the 4.0.2-4.0.4 versions. This is supposed to be fixed in the 4.0.5 release. Could you try the 4.0.5rc1 tarball and see if that addresses the problem you're seein

Re: [OMPI users] Differences 4.0.3 -> 4.0.4 (Regression?)

2020-08-08 Thread Howard Pritchard via users
Hello Michael, Not sure what could be causing this in terms of the delta between v4.0.3 and v4.0.4. Two things to try: - add --debug-daemons and --mca pmix_base_verbose 100 to the mpirun line and compare output from the v4.0.3 and v4.0.4 installs - perhaps try using the --enable-mpirun-prefix-by-defau

Re: [OMPI users] OMPI returns error 63 on AMD 7742 when utilizing 100+ processors per node

2020-01-29 Thread Howard Pritchard via users
Collin, A couple of things to try. First, could you just configure without using the Mellanox platform file and see if you can run the app with 100 or more processes? Another thing to try is to keep using the Mellanox platform file, but run the app with mpirun --mca pml ob1 -np 100 bin/xhpcg an

Re: [OMPI users] OMPI returns error 63 on AMD 7742 when utilizing 100+ processors per node

2020-01-27 Thread Howard Pritchard via users
Hello Collin, Could you provide more information about the error? Is there any output from either Open MPI or, maybe, UCX, that could provide more information about the problem you are hitting? Howard On Mon, 27 Jan 2020 at 08:38, Collin Strassburger via users <users@lists.open-m