[OMPI users] Error handling

2023-07-18 Thread Alexander Stadik via users
Hey everyone, I have been working with CUDA-aware Open MPI for quite some time now, and a while back I developed a small exception-handling framework covering MPI and CUDA exceptions. Currently I am using MPI_Abort with custom error numbers to terminate everything cleanly, which works well, by just

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Jeffrey Layton via users
Jeff, Thanks for the tip - it started me thinking a bit. I was using a directory in my /home account with 4.1.5 that I had previously built using GCC 9.4 (Ubuntu 20.04). I rebuilt the system with Ubuntu-22.04 but I did a backup of /home. Then I copied the 4.1.5 directory to /home again. I checke

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Jeff Squyres (jsquyres) via users
There were probably quite a few differences from the output of "configure" between GCC 9.4 and GCC 11.3. For example, your original post cited "/usr/lib/gcc/x86_64-linux-gnu/9/include/float.h", which, I assume, does not exist on your new GCC 11.3-based system. Meaning: if you had run make clea

Re: [OMPI users] Error handling

2023-07-18 Thread George Bosilca via users
Alex, How are your values "random" if you provide correct values? Even for negative values you could use MIN to pick one value and return it. What is the problem with `MPI_Abort`? It does seem to do what you want. George. On Tue, Jul 18, 2023 at 4:38 AM Alexander Stadik via users < users@li

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Jeffrey Layton via users
As soon as you pointed out /usr/lib/gcc/x86_64-linux-gnu/9/include/float.h that made me think of the previous build. I did "make clean" a _bunch_ of times before running configure and it didn't cure it. Strange. But, nuking the source tree from orbit, just to be sure, and then configure/rebuild w

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Jeff Squyres (jsquyres) via users
The GNU-generated Makefile dependencies may not be removed during "make clean" -- they may only be removed during "make distclean" (which is kinda equivalent to rm -rf'ing the tree and extracting a fresh tarball). From: Jeffrey Layton Sent: Tuesday, July 18, 2023

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
I don't know how openmpi does it, but I've definitely seen packages where "make clean" wipes the ".o" files but not the results of the configure process. Sometimes there's a "make distclean" which tries to get back closer to as-untarred state. Noam On Jul 18, 2023, at 12:51 PM, Jeffrey Layton

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Tom Kacvinsky via users
On Jul 18, 2023, at 16:05, Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users wrote: I don't know how openmpi does it, but I've definitely seen packages where "make clean" wipes the ".o" files but not the results of the configure process.  Sometimes there's a "make distclean" wh

Re: [OMPI users] [EXT] Re: Error handling

2023-07-18 Thread Alexander Stadik via users
Hey George, I said random only because I do not see the method behind it. It is exactly as you describe: when I do an allreduce with MIN and return a negative number, I usually get 248, 253, 11 or 6, meaning the number comes purely from the MPI side. The problem with MPI_Abort is it shows the correct number
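
For readers following the numbers above: on POSIX systems the parent process sees only the low 8 bits of a child's exit status, so a negative return value wraps into the range 0..255. The sketch below is only an illustration of that arithmetic. The helper name `exit_status_byte` is hypothetical and is not part of MPI or Open MPI, and the thread does not establish that this wrapping is what actually produces the values Alexander reports.

```c
/* Minimal sketch, assuming a POSIX system where the parent process
 * observes only the low 8 bits of a child's exit status.
 * exit_status_byte is a hypothetical helper for illustration only;
 * it is not an MPI or Open MPI function. */
static int exit_status_byte(int errorcode)
{
    /* Converting a negative int to unsigned is well defined in C
     * (reduction modulo 2^N); masking then keeps the low 8 bits. */
    return (int)((unsigned int)errorcode & 0xFFu);
}
```

Under that assumption, a return value of -8 would be reported as 248 and -3 as 253, while small positive values such as 11 and 6 would pass through unchanged.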