Slurm 19.05 removed support for `--cpu_bind`, the option that every released version of OpenMPI passes when it calls `srun`. The fix landed in [OpenMPI's git repo][1] only 24 days ago.

This means all OpenMPI programs that end up calling `srun` on Slurm 19.05 will fail.
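
As a stopgap, the failing option could in principle be rewritten before it reaches `srun`. The following is only a minimal sketch, assuming the hyphenated `--cpu-bind` is the spelling 19.05 still accepts; the function name `rewrite_cpu_bind` and the idea of a wrapper around `srun` are hypothetical and are not something Slurm or OpenMPI provides.

```python
def rewrite_cpu_bind(argv):
    """Return a copy of argv with `--cpu_bind` respelled as `--cpu-bind`.

    Assumption: the hyphenated form is the accepted replacement.
    Handles both `--cpu_bind=VALUE` and a bare `--cpu_bind` token.
    """
    old, new = "--cpu_bind", "--cpu-bind"
    fixed = []
    for arg in argv:
        # Only touch the exact option, not unrelated arguments that
        # merely start with the same characters.
        if arg == old or arg.startswith(old + "="):
            arg = new + arg[len(old):]
        fixed.append(arg)
    return fixed


if __name__ == "__main__":
    # Hypothetical argument list, for illustration only.
    print(rewrite_cpu_bind(["srun", "--cpu_bind=none", "--nodes=2", "orted"]))
```

A wrapper script placed ahead of the real `srun` in `PATH` could apply this to its arguments before exec'ing the real binary, but that is a site-local hack and not a substitute for fixed releases.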

This is an enormous amount of breakage for such a minor "gain", and it seems unwise. I think this [change][2] should be backed out and converted to a warning message, to allow time for the OpenMPI changes to be backported, released, and adopted. In theory OpenMPI was given notice with the 17.11 release (I think?), but since the fix has only just landed, no released version has it yet.

Levi Morrison
Brigham Young University

  [1]: https://github.com/open-mpi/ompi/commit/7dad74032e30259506da7fa582dd8c4351e6e0a1
  [2]: https://github.com/SchedMD/slurm/commit/d78af893e4a60e933a2319b0c36a0e40c7dd1b02
