Hi,
The command "sshare -l" is crashing. I have isolated the problem to a single
account. The cause seems to be an extremely large LevelFS, on the order of
4.8x10e16, which I can see if I add the "-p" option. Is there a way to fix
the account?
Below are the results of the 2
Hi,
I would like to do the equivalent of:
sacctmgr -i add user namef account=grpa
sacctmgr -i add user nameg account=grpa
...
sacctmgr -i add user namez account=grpa
but with a single "sacctmgr -i load filename", in which filename contains the
grpa account with its list of users. The documentation mentions the "lo
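
As an alternative to the load file (a different technique from the one asked about), the repeated commands above can be collapsed into one call, assuming sacctmgr accepts a comma-separated list of user names. The sketch below only prints the command so it can be reviewed first; the function name print_add_users_cmd and the file users.txt (one user name per line) are hypothetical.

```shell
# Print (do not run) one sacctmgr command that covers every user in a file.
# $1: hypothetical text file with one user name per line
# $2: account the users should be added to
print_add_users_cmd() {
    users_file=$1
    account=$2
    # paste -s joins all lines into one line; -d, uses a comma as separator
    user_list=$(paste -s -d, "$users_file")
    printf 'sacctmgr -i add user %s account=%s\n' "$user_list" "$account"
}
```

For example, "print_add_users_cmd users.txt grpa" prints the command; once it looks right, the output can be piped to "sh" to execute it.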
I would suggest using GNU Parallel (https://www.gnu.org/software/parallel/).
Also, if you run that many "srun" commands in a row on a very large cluster
where slurmctld is heavily loaded, some of them might time out and not run.
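
One way to soften the time-out problem is to retry a failed command a few times. This is only a minimal sketch: the retry function, its argument order, and the ./task.sh script in the comments are hypothetical names chosen for illustration.

```shell
# retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_TRIES attempts have been made.
retry() {
    local max_tries
    local delay
    local attempt
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Hypothetical usage with GNU Parallel, at most 8 commands at a time:
#   parallel -j 8 srun -n1 ./task.sh {} ::: input1 input2 input3
# In bash, after "export -f retry", the wrapper should be usable in front
# of srun:
#   parallel -j 8 retry 3 10 srun -n1 ./task.sh {} ::: input1 input2 input3
```

Limiting the number of concurrent commands with "-j" also reduces the load on the controller, so fewer retries should be needed in the first place.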
Richard
On Fri, 5 Nov 2021 at 05:45, Marcus Pedersén wrote:
> Hi
We have MIG defined and in use, but billing according to which MIG profile
is used doesn't seem to work.
In the partition definitions in slurm.conf I have something like the
following for TRESBillingWeights:
TRESBillingWeights=CPU=1,Mem=1G,GRES/gpu:3g.20gb=0.375,GRES/gpu:4g.20gb=0.5,GRES/gpu=1.0
Yet, when I do sacct -j I d
I'm having problems with Autodetect=nvml in gres.conf.
I get the following in the controller log:
error: _check_core_range_matches_sock: gres/gpu GRES autodetected core
affinity 16-31 on node node001 doesn't match socket boundaries. (Socket 0
is cores 0-31). Consider setting SlurmdParameters=l3ca