Hi Eli,

I agree with you: keep the checks enabled. Users who want them off can disable them through our MCA parameters (on the command line or in ${HOME}/.openmpi/mca-params.conf).
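
As a sketch of both routes, assuming the relevant runtime parameter is `mpi_param_check` and that the installation was built with parameter checking left selectable at run time (a build configured with --with-mpi-param-check=no compiles the checks out, so the parameter would have no effect there). The application name `./my_app` is a placeholder.

```
# Per-user default in ${HOME}/.openmpi/mca-params.conf
# (0 = skip MPI parameter checks, 1 = perform them)
mpi_param_check = 0

# Or for a single run, on the command line:
#   mpirun --mca mpi_param_check 0 -np 4 ./my_app
```

`ompi_info --param mpi all` should list the parameter and its current value if the build supports it.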

I don't think it is ever effective to try to save a few branches in MPI functions that usually cost over a microsecond, and lose all protection against incorrect usage of the 600-something functions in the MPI standard. This is even more true in a setup meant for shared use between experienced and novice MPI programmers.

George.

On Thu, Jul 18, 2024 at 5:16 PM Eli Heady via users <users@lists.open-mpi.org> wrote:

> Hello,
>
> While working in an HPC support role, I was asked to resolve an apparent
> discrepancy between OpenMPI 'mpi_cart_rank' behavior and the MPI spec [1,
> 2] that says "[out]-of-range coordinates are erroneous for non-periodic
> dimensions." The observed behavior in our environment [3] was that
> mpi_cart_rank on a topology with non-periodic dimensions was returning an
> implicitly shifted value for a lookup of an invalid coordinate ('-1' for
> example). This behavior was caused by the compile-time flag
> "--with-mpi-param-check=no" as included in the
> contrib/platform/mellanox/optimized file [4], which ultimately seems to
> disable the coordinate bounds checking happening at
> ompi/mpi/c/cart_rank.c#L85-L91 [5]. We initially thought this could be a
> bug, especially after reading 'MPI_Cart_rank: Out-of-range coordinates are
> erroneous for non-periodic dimensions' [6], but the realization that our
> build was disabling all parameter checking makes me a bit reluctant to call
> this a 'bug'.
>
> I'm relatively new to the MPI world and have searched this list's archives
> for answers but found nothing really specific to my question. This is a
> general question for other OpenMPI users and cluster admins regarding the
> build optimization --with-mpi-param-check=no.
> I'm looking for opinions
> based on experience supporting diverse user code in shared OpenMPI
> installations:
>
> In the context of a shared cluster deployment in a high performance
> environment, are there good arguments for permanently disabling MPI
> parameter checking (--with-mpi-param-check=no)? To eliminate some runtime
> overhead in the functions that conditionally skip parameter validation? Is
> that overhead substantial? I haven't found any recommendations to use the
> configure flag '--with-mpi-param-check=no', apart from indirectly by
> incorporating the Mellanox platform optimized [4] file. Are any other
> site installers here intentionally (permanently) disabling parameter
> checking in shared installations? Anyone disabling parameter checking at
> runtime as a default? Are there other considerations?
>
> My impression is it would be safe to compile out parameter checking if you
> know your MPI code passes only legal parameter values to all MPI functions,
> otherwise it would be prudent to leave parameter checking enabled (or
> runtime disable-able).
>
>
> 1. MPI 4, 8.5.5, p406
> 2. MPI 3.1, 7.5.5, p305
> 3. OpenMPI 4.1.5 and 4.0.3 configured with
> "--with-platform=contrib/platform/mellanox/optimized", as found in
> https://linux.mellanox.com/public/repo/mlnx_ofed/5.8-3.0.7.0/rhel9.2/x86_64/openmpi-4.1.5a1-1.58307.x86_64.rpm
> (/usr/mpi/gcc/openmpi-4.1.5a1/bin/ompi_info | grep "Configure command")
> 4.
> https://github.com/open-mpi/ompi/blob/42b829b3b3190dd1987d113fd8c2810eb8584007/contrib/platform/mellanox/optimized#L55
> 5.
> https://github.com/open-mpi/ompi/blob/42b829b3b3190dd1987d113fd8c2810eb8584007/ompi/mpi/c/cart_rank.c#L85-L91
> 6. https://www.mail-archive.com/users@lists.open-mpi.org/msg07705.html
>
>
> Eli
>
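
To make the behavior in the quoted message concrete, here is a minimal, self-contained C sketch of a coordinate-to-rank lookup with and without the bounds check. It is a simplified model, not the Open MPI source: the function name `cart_rank_sketch`, the `check_params` switch, and the status codes are hypothetical, and the wrap-around on the unchecked path is an assumption chosen to match the "implicitly shifted value" that Eli reports.

```c
/* Hypothetical status codes for this sketch; these are not MPI's constants. */
enum { SKETCH_OK = 0, SKETCH_ERR_ARG = 1 };

/*
 * Model of a Cartesian coordinate-to-rank lookup (row-major: the last
 * dimension varies fastest).
 *
 * check_params != 0: an out-of-range coordinate in a non-periodic
 *   dimension is rejected, since the MPI spec calls it erroneous.
 * check_params == 0: no validation is done, and every out-of-range
 *   coordinate is wrapped into range whether or not the dimension is
 *   periodic (assumed model of the unchecked path).
 */
static int cart_rank_sketch(int ndims, const int *dims, const int *periods,
                            const int *coords, int check_params, int *rank)
{
    if (check_params) {
        for (int i = 0; i < ndims; ++i) {
            if (!periods[i] && (coords[i] < 0 || coords[i] >= dims[i])) {
                return SKETCH_ERR_ARG; /* *rank is left untouched */
            }
        }
    }

    int factor = 1;
    int r = 0;
    for (int i = ndims - 1; i >= 0; --i) {
        /* Normalize the coordinate into [0, dims[i]). */
        int ord = coords[i] % dims[i];
        if (ord < 0) {
            ord += dims[i];
        }
        r += factor * ord;
        factor *= dims[i];
    }
    *rank = r;
    return SKETCH_OK;
}
```

For a non-periodic 4 x 3 grid and the coordinate (-1, 0), the checked call returns `SKETCH_ERR_ARG`, while the unchecked call succeeds and reports rank 9, the rank of (3, 0). The caller gets a plausible-looking rank with no sign that the input was invalid, which is the protection lost when the checks are compiled out.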