Hello,

I am in a situation where estimating the precise memory consumption of jobs 
beforehand is pretty challenging. So I would like to create a “trust” system, 
meaning that the memory requested for a job is taken into account for 
scheduling, but no action is taken if the job actually breaches that limit 
once it is running on the node.
I tried to use NoOverMemoryKill, but it seems to work only for sbatch, not srun.
So I ended up declaring memory as a non-consumable resource in the slurm.conf 
on the compute nodes, but not on the master. This seems to work, but it looks 
rather hackish (and Slurm complains about the discrepancy in configuration).
Is this a supported practice? Can it bite me later on? Is there a cleaner 
solution?
