I start an interactive allocation, and I just noticed that the problem happens when I join this allocation from another shell.
Here is how I join:

srun --pty --x11 --jobid=$(squeue -u $USER -o %A | tail -n 1) bash

And here is how I create the allocation:

srun --pty --nodes 8 --ntasks-per-node 24 --mem 50G --time=3:00:00 --partition=haswell --x11 bash

(Two short sketches, a more defensive job-id lookup and a check of what the allocation actually provides, follow the quoted thread below.)

On 09/08/2017 09:58 AM, Gilles Gouaillardet wrote:
> Maxsym,
>
> can you please post your sbatch script ?
>
> fwiw, i am unable to reproduce the issue with the latest v2.x from github.
>
> by any chance, would you be able to test the latest openmpi 2.1.2rc3 ?
>
> Cheers,
>
> Gilles
>
> On 9/8/2017 4:19 PM, Maksym Planeta wrote:
>> Indeed mpirun shows slots=1 per node, but I create allocation with
>> --ntasks-per-node 24, so I do have all cores of the node allocated.
>>
>> When I use srun I can get all the cores.
>>
>> On 09/07/2017 02:12 PM, r...@open-mpi.org wrote:
>>> My best guess is that SLURM has only allocated 2 slots, and we
>>> respect the RM regardless of what you say in the hostfile. You can
>>> check this by adding --display-allocation to your cmd line. You
>>> probably need to tell slurm to allocate more cpus/node.
>>>
>>>> On Sep 7, 2017, at 3:33 AM, Maksym Planeta
>>>> <mplan...@os.inf.tu-dresden.de> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'm trying to tell OpenMPI how many processes per node I want to
>>>> use, but mpirun seems to ignore the configuration I provide.
>>>>
>>>> I create following hostfile:
>>>>
>>>> $ cat hostfile.16
>>>> taurusi6344 slots=16
>>>> taurusi6348 slots=16
>>>>
>>>> And then start the app as follows:
>>>>
>>>> $ mpirun --display-map -machinefile hostfile.16 -np 2 hostname
>>>> Data for JOB [42099,1] offset 0
>>>>
>>>> ======================== JOB MAP ========================
>>>>
>>>> Data for node: taurusi6344  Num slots: 1  Max slots: 0  Num procs: 1
>>>>     Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound:
>>>> socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]],
>>>> socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]],
>>>> socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], socket 0[core 8[hwt 0]],
>>>> socket 0[core 9[hwt 0]], socket 0[core 10[hwt 0]], socket 0[core 11[hwt 0]]:
>>>> [B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>>>>
>>>> Data for node: taurusi6348  Num slots: 1  Max slots: 0  Num procs: 1
>>>>     Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound:
>>>> socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]],
>>>> socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]],
>>>> socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], socket 0[core 8[hwt 0]],
>>>> socket 0[core 9[hwt 0]], socket 0[core 10[hwt 0]], socket 0[core 11[hwt 0]]:
>>>> [B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>>>>
>>>> =============================================================
>>>> taurusi6344
>>>> taurusi6348
>>>>
>>>> If I put anything more than 2 in "-np 2", I get following error
>>>> message:
>>>>
>>>> $ mpirun --display-map -machinefile hostfile.16 -np 4 hostname
>>>> --------------------------------------------------------------------------
>>>> There are not enough slots available in the system to satisfy the 4 slots
>>>> that were requested by the application:
>>>>   hostname
>>>>
>>>> Either request fewer slots for your application, or make more slots
>>>> available for use.
>>>> --------------------------------------------------------------------------
>>>>
>>>> The OpenMPI version is "mpirun (Open MPI) 2.1.0"
>>>>
>>>> Also there is SLURM installed with version "slurm 16.05.7-Bull.1.1-20170512-1252"
>>>>
>>>> Could you help me to enforce OpenMPI to respect slots paremeter?
>>>> --
>>>> Regards,
>>>> Maksym Planeta
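On the join side, one small thing worth checking: squeue prints a header line by default, and tail -n 1 simply takes the last job listed, which may not be the interactive allocation if other jobs are queued. A minimal sketch of a slightly more defensive lookup, assuming only one interactive job is running (-h and -t RUNNING are standard squeue options):

# -h drops the squeue header line; -t RUNNING skips jobs that are still pending
JOBID=$(squeue -h -u $USER -t RUNNING -o %A | tail -n 1)
srun --pty --x11 --jobid=$JOBID bash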
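And a quick way to compare what SLURM actually granted with what mpirun believes it has, using the --display-allocation flag suggested earlier in the thread. A sketch only, assumed to be run from inside the joined allocation shell, with hostfile.16 being the file from the quoted message:

# SLURM's view of the job: node count, node list, and CPUs granted
scontrol show job $SLURM_JOB_ID | grep -E 'NumNodes|NumCPUs|NodeList'
# Open MPI's view of the same allocation
mpirun --display-allocation -machinefile hostfile.16 -np 2 hostname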
--
Regards,
Maksym Planeta

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users