Bill, would this allow allocating all the remaining harts when the
node is initially half full? How would the parameters be set up for
that? The cluster has 14 machines with 56 harts and 128 GB of RAM and
12 machines with 104 harts and 256 GB of RAM.
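
To make the question concrete, this is roughly how I picture the
per-node-type submission you described. The --constraint feature names
below are placeholders I made up for illustration; they are not
defined in our slurm.conf:

     # 56-hart / 128 GB nodes (feature name is a placeholder)
     sbatch -N 1 --ntasks-per-node=1 --cpus-per-task=56 --constraint=56hart pz-train.batch

     # 104-hart / 256 GB nodes (feature name is a placeholder)
     sbatch -N 1 --ntasks-per-node=1 --cpus-per-task=104 --constraint=104hart pz-train.batch

Is that the kind of setup you meant, or is there a way to avoid
splitting the submission by node type?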

Some of the algorithms have hot loops that scale up to, or beyond, the
number of harts, so it is always beneficial to use every available
hart in an opportunistic, best-effort way. The algorithms train
photometric galaxy redshift estimators (galaxy distance calculators).
Training will be repeated regularly because of the large number of
physical parameters involved. The memory required right now seems to
be below 10 GB, but I can't say that for all the algorithms that will
be used (at least 6 different ones), nor for the different parameter
sets that will be needed.
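
Inside the batch script my plan is to size the thread pool from
whatever Slurm actually grants, along these lines. OMP_NUM_THREADS and
the training command are placeholders; the real code may read a
different variable:

     #!/bin/bash
     #SBATCH -N 1
     #SBATCH --ntasks-per-node=1

     # Use however many harts the allocation ended up with;
     # SLURM_CPUS_ON_NODE is set by Slurm to the CPU count on the
     # allocated node.
     export OMP_NUM_THREADS="${SLURM_CPUS_ON_NODE:-1}"

     # Placeholder for the actual training step.
     srun python pz_train.py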

On Thu, Aug 1, 2024 at 4:27 PM Bill via slurm-users
<slurm-users@lists.schedmd.com> wrote:
>
> Either allocate the whole node's cores or the whole node's memory?  Both
> will allocate the node exclusively for you.
>
> So you'll need to know what a node looks like.  For a homogeneous
> cluster, this is straightforward.  For a heterogeneous cluster, you may
> also need to specify a nodelist for say those 28 core nodes and then
> those 64 core nodes.
>
> But going back to the original answer, --exclusive, is the answer here.
> You DO know how many cores you need right?  (Scaling study should give
> you that).  And you DO know the memory footprint by past jobs with
> similar inputs I hope.
>
> Bill
>
> On 8/1/24 3:17 PM, Henrique Almeida via slurm-users wrote:
> >   Hello, maybe rephrase the question to fill a whole node ?
> >
> > On Thu, Aug 1, 2024 at 3:08 PM Jason Simms <jsim...@swarthmore.edu> wrote:
> >>
> >> On the one hand, you say you want "to allocate a whole node for a single 
> >> multi-threaded process," but on the other you say you want to allow it to 
> >> "share nodes with other running jobs." Those seem like mutually exclusive 
> >> requirements.
> >>
> >> Jason
> >>
> >> On Thu, Aug 1, 2024 at 1:32 PM Henrique Almeida via slurm-users 
> >> <slurm-users@lists.schedmd.com> wrote:
> >>>
> >>>   Hello, I'm testing it right now and it's working pretty well in a
> >>> normal situation, but that's not exactly what I want. --exclusive
> >>> documentation says that the job allocation cannot share nodes with
> >>> other running jobs, but I want to allow it to do so, if that's
> >>> unavoidable. Are there other ways to configure it ?
> >>>
> >>>   The current parameters I'm testing:
> >>>
> >>>      sbatch -N 1 --exclusive --ntasks-per-node=1 --mem=0 pz-train.batch
> >>>
> >>> On Thu, Aug 1, 2024 at 12:29 PM Davide DelVento
> >>> <davide.quan...@gmail.com> wrote:
> >>>>
> >>>> In part, it depends on how it's been configured, but have you tried 
> >>>> --exclusive?
> >>>>
> >>>> On Thu, Aug 1, 2024 at 7:39 AM Henrique Almeida via slurm-users 
> >>>> <slurm-users@lists.schedmd.com> wrote:
> >>>>>
> >>>>>   Hello, everyone, with slurm, how to allocate a whole node for a
> >>>>> single multi-threaded process?
> >>>>>
> >>>>> https://stackoverflow.com/questions/78818547/with-slurm-how-to-allocate-a-whole-node-for-a-single-multi-threaded-process
> >>>>>
> >>>>>
> >>>>> --
> >>>>>   Henrique Dante de Almeida
> >>>>>   hda...@gmail.com
> >>>>>
> >>>>> --
> >>>>> slurm-users mailing list -- slurm-users@lists.schedmd.com
> >>>>> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
> >>>
> >>>
> >>>
> >>> --
> >>>   Henrique Dante de Almeida
> >>>   hda...@gmail.com
> >>>
> >>> --
> >>> slurm-users mailing list -- slurm-users@lists.schedmd.com
> >>> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
> >>
> >>
> >>
> >> --
> >> Jason L. Simms, Ph.D., M.P.H.
> >> Manager of Research Computing
> >> Swarthmore College
> >> Information Technology Services
> >> (610) 328-8102
> >> Schedule a meeting: https://calendly.com/jlsimms
> >
> >
> >
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com



-- 
 Henrique Dante de Almeida
 hda...@gmail.com

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
