the program probably says 32 threads because it's just looking at the
box, not at what the slurm cgroups allow for cpu (assuming you're using
them)
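
a rough way to check this, as a sketch only: run the snippet below once over ssh and once from inside the job script, then compare. the function names are made up for illustration; nproc and /proc/self/status are standard on linux. i'm assuming the slurm restriction shows up in the process's allowed cpu list, which depends on how the cluster is configured.

```shell
#!/bin/bash
# compare what the node has with what this process may actually use.

# cpus installed on the node, ignoring any affinity mask
node_cpus() {
    nproc --all
}

# cpus this process is allowed to run on (honors the affinity mask).
# note: gnu nproc also honors OMP_NUM_THREADS, so leave it unset here.
usable_cpus() {
    nproc
}

# the kernel's own view of the allowed cpu list for this process
allowed_list() {
    awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status
}

echo "node cpus:    $(node_cpus)"
echo "usable cpus:  $(usable_cpus)"
echo "allowed list: $(allowed_list)"
```

if "usable cpus" is much smaller inside the job than over ssh, the job is being confined to fewer cores than the program thinks it has.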

i think for an openmp program (not openmpi) you definitely want the
first command, the one with --ntasks=1 --cpus-per-task=32
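
as a sketch of what that could look like in the batch script: slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given, and the script can pass that on to openmp. the omp_threads helper and the fallback to nproc are my own assumptions, not something from your setup.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32

# use the cpu count slurm granted to the task; fall back to whatever
# nproc reports when the variable is unset (e.g. run by hand over ssh)
omp_threads() {
    echo "${SLURM_CPUS_PER_TASK:-$(nproc)}"
}

export OMP_NUM_THREADS="$(omp_threads)"
echo "running with OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# then cd to the bin directory and launch ./bt.C.x as in the original script
```

with the #SBATCH lines in the script, the submit command stays a plain sbatch of the script, and the thread count follows whatever --cpus-per-task is changed to.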

are you measuring the runtime inside the program or outside it?  if
the latter, the extra 10 sec or so could be the slurm setup/node
allocation
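
one way to separate the two, sketched under the assumption that gnu date is available on the node: time the command from inside the job script, so queue wait and job setup are not counted. time_cmd is a hypothetical helper, and sleep stands in for the real binary.

```shell
#!/bin/bash
# run a command and print its wall-clock time on stderr
time_cmd() {
    local start end
    start=$(date +%s.%N)
    "$@"
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" \
        'BEGIN {printf "elapsed: %.2f s\n", e - s}' >&2
}

# stand-in for the real binary; in the job script this would be ./bt.C.x
time_cmd sleep 1
```

if the number printed here matches the ssh run but the job as a whole still takes longer, the difference is coming from outside the program.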

On Wed, Apr 23, 2025 at 2:41 PM Jeffrey Layton <layto...@gmail.com> wrote:
>
> I tried using ntasks and cpus-per-task to get all 32 cores. So I added
> --ntasks=# --cpus-per-task=N to the sbatch command so that it now looks like:
>
> sbatch --nodes=1 --ntasks=1 --cpus-per-task=32 <script>
>
> It now takes 28 seconds (I ran it a few times).
>
> If I change the command to
>
> sbatch --nodes=1 --ntasks=32 --cpus-per-task=1 <script>
>
> It now takes about 30 seconds.
>
> Outside of Slurm it was only taking about 19.6 seconds. So either way it 
> takes longer.
>
> Interestingly, the output from bt gives the Total Threads and Avail
> Threads. In all cases the answer is 32. If the code was only using 1 thread,
> I'm wondering why it would say Avail Threads is 32.
>
> I'm still not sure why it takes longer when Slurm is being used, but I'm 
> reading as much as I can.
>
> Thanks!
>
> Jeff
>
>
> On Wed, Apr 23, 2025 at 2:15 PM Jeffrey Layton <layto...@gmail.com> wrote:
>>
>> Roger. I didn't configure Slurm so let me look at slurm.conf and gres.conf 
>> to see if they restrict a job to a single CPU.
>>
>> Thanks
>>
>> On Wed, Apr 23, 2025 at 1:48 PM Michael DiDomenico via slurm-users 
>> <slurm-users@lists.schedmd.com> wrote:
>>>
>>> without knowing anything about your environment, it's reasonable to
>>> suspect that maybe your openmp program is multi-threaded, but slurm is
>>> constraining your job to a single core.  evidence of this should show
>>> up when running top on the node, watching the cpu% used for the
>>> program
>>>
>>> On Wed, Apr 23, 2025 at 1:28 PM Jeffrey Layton via slurm-users
>>> <slurm-users@lists.schedmd.com> wrote:
>>> >
>>> > Good morning,
>>> >
>>> > I'm running an NPB test, bt.C that is OpenMP and built using NV HPC SDK 
>>> > (version 25.1). I run it on a compute node by ssh-ing to the node. It 
>>> > runs in about 19.6 seconds.
>>> >
>>> > Then I run the code using a simple job:
>>> >
>>> > Command to submit job: sbatch --nodes=1 run-npb-omp
>>> >
>>> > The script run-npb-omp is the following:
>>> >
>>> > #!/bin/bash
>>> >
>>> > cd /home/.../NPB3.4-OMP/bin
>>> >
>>> > ./bt.C.x
>>> >
>>> >
>>> > When I use Slurm, the job takes 482 seconds.
>>> >
>>> > Nothing really appears in the logs. It doesn't do any IO. No data is 
>>> > copied anywhere. I'm kind of at a loss to figure out why. Any suggestions
>>> > of where to look?
>>> >
>>> > Thanks!
>>> >
>>> > Jeff
>>> >
>>> >
>>> >
>>> > --
>>> > slurm-users mailing list -- slurm-users@lists.schedmd.com
>>> > To unsubscribe send an email to slurm-users-le...@lists.schedmd.com