Current Slurm official repository branches contain only a limited PMIx 
integration, primarily constrained to providing support for the traditional 
put-get exchange of key-value pairs. Support is also restricted to PMIx v3.x
and earlier releases, although this restriction stems from a limitation in the
configure logic rather than from any incompatibility with the PMIx library.

Given the emergence of new features in programming libraries such as MPI and
OSHMEM that rely on more recent PMIx releases, and the sundowning of support
for the older PMIx series, organizations may find that they need (or at least
want) support for PMIx v4.x and later releases. The optimal solution would be
for this support to be available from the official repository and associated
releases, and we are continuing to work towards that goal.

In the meantime, several projects that involve (to one degree or another) the
use of PMIx within Slurm have started. As part of their overall effort, these
projects will extend the current PMIx integration to embrace the full range of
PMIx operations.
Much of this work will remain private pending publication, but some of the 
basic PMIx integration can be made available to interested organizations as it 
is completed.

Until these capabilities can be upstreamed, several organizations are teaming
up to provide two paths forward.

First, we offer patches that can be applied to official Slurm releases to
upgrade their PMIx support. The patches
(https://github.com/slurm-pmix/slurm/wiki/Patches) are based on the head of the
Slurm master branch and should apply cleanly to recent releases. Problems with
the patches should be reported on that repository's "issues" page
(https://github.com/slurm-pmix/slurm/issues). We will maintain a list of
patches (each marked with the date and hash on which it was based) as work
continues on adding support for a broader range of PMIx features.

Second, we remind users that they can use the PMIx Reference RunTime
Environment (PRRTE, https://github.com/openpmix/prrte) to resolve this issue. 
Once a user has obtained an allocation, simply execute prte to instantiate the 
persistent Distributed Virtual Machine (DVM). The PRRTE DVM contains support 
for the full range of PMIx features, thereby providing a complete environment 
for advanced features such as MPI Sessions and dynamic operations, 
multi-application workflows, and novel programming models such as the "sea of 
MPI" (to be described soon on the PRRTE site).
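
As a rough illustration of that workflow, the sketch below starts a DVM inside
an existing allocation, launches one job with "prun", and then shuts the DVM
down with "pterm". It assumes prte, prun, and pterm are on the PATH; the
"run" and "dvm_session" helpers, the DRY_RUN variable, and "./my_app" are
hypothetical names introduced only for this example. Check the options against
the documentation for the PRRTE version you have installed.

```shell
#!/bin/sh
# Sketch only: drive a PRRTE DVM from inside a Slurm allocation.
# Assumption: prte, prun, and pterm are installed and on the PATH.
# DRY_RUN=1 prints the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: dvm_session NPROCS APP [ARGS...]
dvm_session() {
    nprocs=$1
    shift
    # Start the DVM in the background across the allocated nodes.
    run prte --daemonize || return 1
    # Launch the application inside the running DVM.
    run prun -n "$nprocs" "$@"
    status=$?
    # Shut the DVM down once the job has finished.
    run pterm
    return "$status"
}
```

From a shell obtained with, for example, "salloc -N 2", this could be invoked
as "dvm_session 4 ./my_app". Because the DVM is persistent, additional "prun"
commands can be issued before "pterm" is called.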

For ease of use in transitioning to PRRTE, the PMIx community is working on an
"srun" personality for that environment. The base launcher for PRRTE is "prun",
whose command line is similar but not identical to that of "srun". However,
PRRTE supports customized command lines, and we are using that capability to
create an "srun"-compatible wrapper. When complete, the "srun" command provided
by PRRTE will behave the same as the native Slurm version of the command, but
will execute the specified application using PRRTE.

It is our hope that organizations will find one (or both) of these options
helpful in meeting their needs until a longer-term solution is achieved.

Ralph
