Hi,
Yes, I submitted a fix for exactly that at the end of 2014:
https://github.com/SchedMD/slurm/commit/7fff5eed6b8fe97347a832149966ed11f5805f99
You need to check whether it is included in your version.
2015-08-21 17:31 GMT-04:00 Aaron Knister :
> Hi Artem,
>
> Do you know if a fix for this
Hi Artem,
Do you know if a fix for this was ever committed? We ran into this with a
code base that builds non-MPI apps with mpicc and then attempts to run them
multiple times from within a single SLURM task.
-Aaron
On Wed, May 21, 2014 at 9:12 AM, Artem Polyakov wrote:
> 2014-05-21 19:28 GMT+0
2014-05-21 19:28 GMT+07:00 Hongjia Cao :
>
> Your debugging and analysis are correct.
>
> PMI2_init() initializes PMI in two steps. First, a PMI 1.1 init command is
> sent to the server and the version is negotiated with the server. After
> that a PMI 2.0 fullinit command is sent. Everything goes well
2014-05-21 10:50 GMT+07:00 Artem Polyakov :
> Here are the exact examples:
>
> 1. "appnum = -1" problem:
> Program pmi_appnum.c (attached) is run using the batch script
> pmi_appnum.job (attached) and produces the following results:
>
> PMI2_Init(0, 16, 0, -1)
> PMI2_Init(0, 16, 1, -1)
> PMI2_Init(0,
Your debugging and analysis are correct.
PMI2_init() initializes PMI in two steps. First, a PMI 1.1 init command is
sent to the server and the version is negotiated with the server. After
that a PMI 2.0 fullinit command is sent. Everything goes well so far.
But since the version number is decided, th
2014-05-21 12:18 GMT+07:00 Artem Polyakov :
> Hello, Hongjia.
>
> 2014-05-21 12:11 GMT+07:00 Hongjia Cao :
>
>
>> On Tue, 2014-05-20 at 17:46 -0700, Artem Polyakov wrote:
>> >
>> >
>> > On Wednesday, 21 May 2014, David Bigagli wrote:
>> >
>> > The srun --mpi=pmi2 option has to be specified
2014-05-21 12:18 GMT+07:00 Hongjia Cao :
>
> I'd like to mention that the mpi/pmi2 plugin of SLURM also supports
> PMI1.1. If no --mpi=pmi2 option is given, the PMI implementation in SLURM
> will be used, which supports PMI1.1 only.
>
That is correct. That is exactly how we understand it.
>
> On 2014-0
Hello, Hongjia.
2014-05-21 12:11 GMT+07:00 Hongjia Cao :
>
> On Tue, 2014-05-20 at 17:46 -0700, Artem Polyakov wrote:
> >
> >
> > On Wednesday, 21 May 2014, David Bigagli wrote:
> >
> > The srun --mpi=pmi2 option has to be specified if openmpi was
> > built with the --with-pmi opti
I'd like to mention that the mpi/pmi2 plugin of SLURM also supports
PMI1.1. If no --mpi=pmi2 option is given, the PMI implementation in SLURM
will be used, which supports PMI1.1 only.
On Tue, 2014-05-20 at 22:04 -0700, Artem Polyakov wrote:
> Thank you, Chris!
>
> Currently we have a prototype that selects PMI
I will check the double init hang problem.
On Tue, 2014-05-20 at 20:52 -0700, Artem Polyakov wrote:
> Here are the exact examples:
>
>
> 1. "appnum = -1" problem:
> Program pmi_appnum.c (attached) is run using the batch script
> pmi_appnum.job (attached) and produces the following results:
>
>
> PMI2_Init(0,
On Tue, 2014-05-20 at 17:46 -0700, Artem Polyakov wrote:
>
>
> On Wednesday, 21 May 2014, David Bigagli wrote:
>
> The srun --mpi=pmi2 option has to be specified if Open MPI was
> built with the --with-pmi option; otherwise Slurm will not
> load the pmi2 plugins and
Thank you, Chris!
Currently we have a prototype that selects PMI version based on:
(a) user preference: in this case, if PMI2 returns an error, we give up and
MPI_Init fails.
(b) automatic: here, if we see that PMI2 wasn't enabled (no --mpi=pmi2
option), we roll back to PMI1. This case also includes
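[Editor's sketch] The two selection modes described above could look roughly like the C below. This is only an illustration, not the Open MPI prototype: every name in it (pmi_pref_t, pmi_sel_t, select_pmi, stub_ok, stub_fail) is hypothetical, the init routines are passed in as callbacks so that no real PMI library is needed, and falling back to PMI1 when a PMI2 init attempt fails in automatic mode is an assumption of the sketch, since the message is cut off at that point.

```c
#include <stdbool.h>

/* All names here are hypothetical; none come from Open MPI or Slurm. */
typedef enum { PMI_PREF_AUTO, PMI_PREF_PMI2 } pmi_pref_t;
typedef enum { PMI_SEL_FAIL, PMI_SEL_PMI1, PMI_SEL_PMI2 } pmi_sel_t;

/* An init callback returns 0 on success and non-zero on error. */
typedef int (*pmi_init_fn)(void);

/* Stand-ins for real init routines, for demonstration only. */
int stub_ok(void)   { return 0; }
int stub_fail(void) { return -1; }

/*
 * pref         - what the user asked for
 * pmi2_enabled - whether PMI2 appears to be enabled for this job
 *                (for example, srun was started with --mpi=pmi2)
 */
pmi_sel_t select_pmi(pmi_pref_t pref, bool pmi2_enabled,
                     pmi_init_fn pmi1_init, pmi_init_fn pmi2_init)
{
    if (pref == PMI_PREF_PMI2) {
        /* (a) explicit user preference: an error is final, no fallback */
        return pmi2_init() == 0 ? PMI_SEL_PMI2 : PMI_SEL_FAIL;
    }

    /* (b) automatic: use PMI2 only when it looks enabled.
     * Assumption of this sketch: a failed PMI2 init also falls back. */
    if (pmi2_enabled && pmi2_init() == 0) {
        return PMI_SEL_PMI2;
    }
    return pmi1_init() == 0 ? PMI_SEL_PMI1 : PMI_SEL_FAIL;
}
```

For example, select_pmi(PMI_PREF_AUTO, false, stub_ok, stub_ok) yields PMI_SEL_PMI1, while select_pmi(PMI_PREF_PMI2, false, stub_ok, stub_fail) yields PMI_SEL_FAIL, which corresponds to MPI_Init failing in case (a).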
On 21/05/14 10:47, Artem Polyakov wrote:
> I need to add that I am working on PMI support in Open MPI, so I
> go slightly deeper than a regular user.
It's also worth noting that because of problem reports about PMI2 with
Slurm to the OMPI developers t
Here are the exact examples:
1. "appnum = -1" problem:
Program pmi_appnum.c (attached) is run using the batch script
pmi_appnum.job (attached) and produces the following results:
PMI2_Init(0, 16, 0, -1)
PMI2_Init(0, 16, 1, -1)
PMI2_Init(0, 16, 2, -1)
PMI2_Init(0, 16, 3, -1)
PMI2_Init(0, 16, 5, -1)
PM
On Wednesday, 21 May 2014, Artem Polyakov wrote:
>
>
> On Wednesday, 21 May 2014, David Bigagli wrote:
>
>>
>> The srun --mpi=pmi2 option has to be specified if Open MPI was built with
>> the --with-pmi option; otherwise Slurm will not load the pmi2 plugins and
>> the mpi job wi
On Wednesday, 21 May 2014, David Bigagli wrote:
>
> The srun --mpi=pmi2 option has to be specified if Open MPI was built with
> the --with-pmi option; otherwise Slurm will not load the pmi2 plugins and
> the MPI job will fail in MPI_Init().
Thank you.
I need to add that I am working on
The srun --mpi=pmi2 option has to be specified if Open MPI was built with
the --with-pmi option; otherwise Slurm will not load the pmi2 plugins
and the MPI job will fail in MPI_Init().
On 05/17/2014 07:50 PM, Artem Polyakov wrote:
Hello,
Here are some related notes that I found during further