I start an interactive allocation, and I just noticed that the problem
happens when I join this allocation from another shell.

Here is how I join:

srun --pty --x11 --jobid=$(squeue -u $USER -o %A | tail -n 1) bash
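
This assumes I only have a single job in the queue. A variant that also skips
the squeue header line (using -h/--noheader) would be, for example:

srun --pty --x11 --jobid=$(squeue -h -u $USER -o %A | tail -n 1) bash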

And here is how I create the allocation:

srun --pty --nodes 8 --ntasks-per-node 24 --mem 50G --time=3:00:00 --partition=haswell --x11 bash
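
For reference, here is how I check, inside the joined shell, what SLURM
reports for that step. If I understand correctly, mpirun derives its slot
count from the step's environment (variables such as SLURM_TASKS_PER_NODE or
SLURM_JOB_CPUS_PER_NODE), so this is only a diagnostic sketch:

# run inside the shell started by the second srun
echo $SLURM_JOB_ID
echo $SLURM_NTASKS            # tasks in this job step
echo $SLURM_TASKS_PER_NODE    # per-node task layout of the step
echo $SLURM_JOB_CPUS_PER_NODE
scontrol show job $SLURM_JOB_ID | grep -E 'NumNodes|NumCPUs|NumTasks'

Running mpirun with --display-allocation, as suggested further down in the
thread, should then show whether Open MPI sees the same numbers.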


On 09/08/2017 09:58 AM, Gilles Gouaillardet wrote:
> Maksym,
> 
> 
> can you please post your sbatch script ?
> 
> fwiw, i am unable to reproduce the issue with the latest v2.x from github.
> 
> 
> by any chance, would you be able to test the latest openmpi 2.1.2rc3 ?
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> 
> On 9/8/2017 4:19 PM, Maksym Planeta wrote:
>> Indeed mpirun shows slots=1 per node, but I create the allocation with 
>> --ntasks-per-node 24, so I do have all cores of the node allocated.
>>
>> When I use srun I can get all the cores.
>>
>> On 09/07/2017 02:12 PM, r...@open-mpi.org wrote:
>>> My best guess is that SLURM has only allocated 2 slots, and we 
>>> respect the RM regardless of what you say in the hostfile. You can 
>>> check this by adding --display-allocation to your cmd line. You 
>>> probably need to tell slurm to allocate more cpus/node.
>>>
>>>
>>>> On Sep 7, 2017, at 3:33 AM, Maksym Planeta 
>>>> <mplan...@os.inf.tu-dresden.de> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'm trying to tell OpenMPI how many processes per node I want to 
>>>> use, but mpirun seems to ignore the configuration I provide.
>>>>
>>>> I create the following hostfile:
>>>>
>>>> $ cat hostfile.16
>>>> taurusi6344 slots=16
>>>> taurusi6348 slots=16
>>>>
>>>> And then start the app as follows:
>>>>
>>>> $ mpirun --display-map   -machinefile hostfile.16 -np 2 hostname
>>>> Data for JOB [42099,1] offset 0
>>>>
>>>> ========================   JOB MAP   ========================
>>>>
>>>> Data for node: taurusi6344     Num slots: 1    Max slots: 0    Num 
>>>> procs: 1
>>>>          Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound: 
>>>> socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 
>>>> 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 
>>>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], 
>>>> socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core 
>>>> 10[hwt 0]], socket 0[core 11[hwt 
>>>> 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>>>>
>>>> Data for node: taurusi6348     Num slots: 1    Max slots: 0    Num 
>>>> procs: 1
>>>>          Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound: 
>>>> socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 
>>>> 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 
>>>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], 
>>>> socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core 
>>>> 10[hwt 0]], socket 0[core 11[hwt 
>>>> 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>>>>
>>>> =============================================================
>>>> taurusi6344
>>>> taurusi6348
>>>>
>>>> If I pass anything more than 2 to "-np", I get the following error 
>>>> message:
>>>>
>>>> $ mpirun --display-map   -machinefile hostfile.16 -np 4 hostname
>>>> -------------------------------------------------------------------------- 
>>>>
>>>> There are not enough slots available in the system to satisfy the 4 
>>>> slots
>>>> that were requested by the application:
>>>>    hostname
>>>>
>>>> Either request fewer slots for your application, or make more slots 
>>>> available
>>>> for use.
>>>> -------------------------------------------------------------------------- 
>>>>
>>>>
>>>> The Open MPI version is "mpirun (Open MPI) 2.1.0".
>>>>
>>>> SLURM is also installed, version "slurm 16.05.7-Bull.1.1-20170512-1252".
>>>>
>>>> Could you help me make Open MPI respect the slots parameter?
>>>> -- 
>>>> Regards,
>>>> Maksym Planeta
>>>>

-- 
Regards,
Maksym Planeta
