The user account with the anomalously high usage hasn't run any jobs in the last
year (the period I'm interested in), so I deleted all jobs by that user.
That covers all jobs matching that id_user, that id_group, and any of the user's
associations:
mysql> select user,partition,acct,id_assoc from <cluster>_assoc_table where user="JoeUser";
+---------+-----------+-----------+----------+
| user    | partition | acct      | id_assoc |
+---------+-----------+-----------+----------+
| JoeUser | high      | avgrp     |       91 |
| JoeUser | lo        | avgrp     |       89 |
| JoeUser | low       | avgrp     |      271 |
| JoeUser | med       | avgrp     |       90 |
+---------+-----------+-----------+----------+
I also found some jobs with id_user=0 or id_group=0. We don't run jobs as root,
so I deleted those too.
Then I set the last_ran table to January 1st, 2015:
update <cluster>_last_ran_table set hourly_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00'),
  daily_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00'),
  monthly_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00');
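As a sanity check on the value being written, here is a minimal Python sketch of the epoch that UNIX_TIMESTAMP('2015-01-01 00:00:00') should produce. It assumes the MySQL session time zone is UTC; with any other time zone the stored value shifts by that zone's offset. The function name rollup_epoch is made up for illustration.

```python
from datetime import datetime, timezone


def rollup_epoch(stamp):
    """Epoch seconds for a 'YYYY-MM-DD HH:MM:SS' string, read as UTC (an assumption)."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


# Value to compare against what ends up in the last_ran table.
print(rollup_epoch("2015-01-01 00:00:00"))  # 1420070400 when read as UTC
```

If the three rollup columns hold a different number after the update, the session time zone is the first thing to check.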
Nothing happened, so I restarted the slurmdbd daemon, which ran at 100% CPU for
an hour or so rebuilding the tables.
Unfortunately, sreport still shows extremely high numbers for root and the user
in question, even for time periods within the last year.
# sreport cluster AccountUtilizationByUser Start=2017-01-01 End=2018-01-01 -t percent
Cluster/Account/User Utilization 2017-01-01T00:00:00 - 2017-10-30T16:59:59 (26150400 secs)
Use reported in Percentage of Total
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name          Used   Energy
--------- --------------- --------- --------------- ------------- --------
MyCluster            root                                3762.30%    0.00%
MyCluster            root      root            root         0.00%    0.00%
MyCluster           avgrp                                3643.77%    0.00%
MyCluster           avgrp   JoeUser        Joe User      3388.96%    0.00%
MyCluster           avgrp   JoeUser        Joe User       254.76%    0.00%
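To make the anomaly easy to re-check after each rebuild attempt, here is a rough Python sketch that flags rows whose Used share exceeds 100% of the cluster total, which should not be possible. It assumes the report is re-run in sreport's pipe-delimited parsable2 mode and that the header names match the columns above; the function name over_100_percent and the sample text (the rows above, retyped by hand) are illustrative only.

```python
def over_100_percent(parsable_text):
    """Return (account, login, used_percent) for rows with Used above 100%."""
    lines = [ln for ln in parsable_text.strip().splitlines() if ln.strip()]
    header = lines[0].split("|")
    flagged = []
    for line in lines[1:]:
        record = dict(zip(header, line.split("|")))
        used = float(record["Used"].rstrip("%"))
        if used > 100.0:
            flagged.append((record["Account"], record["Login"], used))
    return flagged


# Assumed pipe-delimited form of the report rows shown above.
sample = """\
Cluster|Account|Login|Proper Name|Used|Energy
MyCluster|root|||3762.30%|0.00%
MyCluster|root|root|root|0.00%|0.00%
MyCluster|avgrp|||3643.77%|0.00%
MyCluster|avgrp|JoeUser|Joe User|3388.96%|0.00%
MyCluster|avgrp|JoeUser|Joe User|254.76%|0.00%
"""

for account, login, used in over_100_percent(sample):
    print(account, login or "(account total)", used)
```

An empty login marks an account-level total, so both the root and avgrp totals and the two JoeUser rows are flagged here.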
Any idea what to look for, or any other way to rebuild the accounting data for
the last year?
I also ran lost.pl, and it found nothing either.