Hi Eli,

I agree with you: keep the checks enabled. Users who want them off can
disable them at run time through our MCA parameters (mpi_param_check, set
on the command line or in ${HOME}/.openmpi/mca-params.conf).

I don't think it is ever worthwhile to save a few branches in MPI
functions that usually cost over a microsecond, at the price of losing all
protection against incorrect usage of the 600-something functions in the
MPI standard.
This is even more true in a setup meant for shared use between
experienced and novice MPI programmers.
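
To make the trade-off concrete, here is a minimal sketch of what such a
check amounts to for a Cartesian rank lookup. It is a simplified model, not
Open MPI's source: the function name cart_rank_sketch, its signature, and
the param_check flag are hypothetical, and the wrap-around used when
checking is off is an assumption chosen to reproduce the "implicitly
shifted value" described in the quoted message below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of a Cartesian rank lookup.
 * NOT Open MPI's implementation. Row-major order: the last
 * dimension varies fastest. Returns 0 on success, -1 on error. */
int cart_rank_sketch(int ndims, const int *dims, const bool *periodic,
                     const int *coords, bool param_check, int *rank)
{
    assert(ndims > 0);

    if (param_check) {
        /* The "few branches": one range test per dimension. */
        for (int i = 0; i < ndims; ++i) {
            if (!periodic[i] && (coords[i] < 0 || coords[i] >= dims[i])) {
                return -1;  /* out of range on a non-periodic dimension */
            }
        }
    }

    /* Assumed behavior without the check: every coordinate is wrapped
     * into [0, dims[i]), whether or not the dimension is periodic. */
    int factor = 1;
    int result = 0;
    for (int i = ndims - 1; i >= 0; --i) {
        int ord = coords[i] % dims[i];
        if (ord < 0) {
            ord += dims[i];
        }
        result += factor * ord;
        factor *= dims[i];
    }

    *rank = result;
    return 0;
}
```

On a 4 x 3 grid with both dimensions non-periodic, the coordinate (-1, 0)
is rejected when param_check is true. With param_check false, the same call
succeeds and silently returns rank 9, which is the rank of (3, 0).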

George.


On Thu, Jul 18, 2024 at 5:16 PM Eli Heady via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
> While working in an HPC support role, I was asked to resolve an apparent
> discrepancy between OpenMPI 'mpi_cart_rank' behavior and the MPI spec [1,
> 2] that says "[out]-of-range coordinates are erroneous for non-periodic
> dimensions." The observed behavior in our environment [3] was that
> mpi_cart_rank on a topology with non-periodic dimensions was returning an
> implicitly shifted value for a lookup of an invalid coordinate ('-1' for
> example). This behavior was caused by the compile-time flag
> "--with-mpi-param-check=no" as included in the
> contrib/platform/mellanox/optimized file [4], which ultimately seems to
> disable the coordinate bounds checking happening at
> ompi/mpi/c/cart_rank.c#L85-L91 [5]. We initially thought this could be a
> bug, especially after reading 'MPI_Cart_rank: Out-of-range coordinates are
> erroneous for non-periodic dimensions' [6], but the realization that our
> build was disabling all parameter checking makes me a bit reluctant to call
> this a 'bug'.
>
> I'm relatively new to the MPI world and have searched this list's archives
> for answers but found nothing really specific to my question. This is a
> general question for other OpenMPI users and cluster admins regarding the
> build optimization --with-mpi-param-check=no. I'm looking for opinions
> based on experience supporting diverse user code in shared OpenMPI
> installations:
>
> In the context of a shared cluster deployment in a high performance
> environment, are there good arguments for permanently disabling MPI
> parameter checking (--with-mpi-param-check=no)? To eliminate some runtime
> overhead in the functions that conditionally skip parameter validation? Is
> that overhead substantial? I haven't found any recommendations to use the
> configure flag '--with-mpi-param-check=no', apart from indirectly by
> incorporating the Mellanox platform optimized [4] file. Are any other
> site installers here intentionally (permanently) disabling parameter
> checking in shared installations? Anyone disabling parameter checking at
> runtime as a default? Are there other considerations?
>
> My impression is that it would be safe to compile out parameter checking if
> you know your MPI code passes only legal parameter values to all MPI
> functions; otherwise it would be prudent to leave parameter checking enabled
> (or runtime disable-able).
>
>
> 1. MPI 4, 8.5.5, p406
> 2. MPI 3.1, 7.5.5, p305
> 3. OpenMPI 4.1.5 and 4.0.3 configured with
> "--with-platform=contrib/platform/mellanox/optimized", as found in
> https://linux.mellanox.com/public/repo/mlnx_ofed/5.8-3.0.7.0/rhel9.2/x86_64/openmpi-4.1.5a1-1.58307.x86_64.rpm
> (/usr/mpi/gcc/openmpi-4.1.5a1/bin/ompi_info | grep "Configure command")
> 4.
> https://github.com/open-mpi/ompi/blob/42b829b3b3190dd1987d113fd8c2810eb8584007/contrib/platform/mellanox/optimized#L55
> 5.
> https://github.com/open-mpi/ompi/blob/42b829b3b3190dd1987d113fd8c2810eb8584007/ompi/mpi/c/cart_rank.c#L85-L91
> 6. https://www.mail-archive.com/users@lists.open-mpi.org/msg07705.html
>
>
> Eli
>
