Okay, so perhaps this is another bug. At each reconfigure, users lose
access to the jobs they submitted before the reconfigure itself and start
with a "clean slate". Newly submitted jobs can be queried normally. The Slurm
administrator can query everything at all times, so the data is not
lost, but this i
And weirdly enough it has now stopped working again, after I did the
experimentation for power save described in the other thread.
That is really strange. At the highest verbosity level the logs just say
slurmdbd: debug: REQUEST_PERSIST_INIT: CLUSTER:cluster VERSION:9984
UID:1457 IP:192.168.2.254
For others potentially finding this through a mailing list search: yes, I
needed that, which of course required creating an account to charge, which I
wasn't using. So I ran
sacctmgr add account default_account
sacctmgr add -i user $user Accounts=default_account
with an appropriate looping around for $user a
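The loop mentioned above might look like the following minimal sketch. The account name comes from the commands in this message; the usernames `alice` and `bob` and the dry-run switch are hypothetical, added only so the commands can be inspected before anything touches the accounting database. The `-i` (immediate) option is placed before the subcommand here, as in the sacctmgr synopsis.

```shell
# Sketch only: print (dry run) or execute one sacctmgr call per user.
# ACCOUNT matches the account created above; the user list is an assumption.
ACCOUNT="default_account"

add_users() {
    # $1: "run" to execute, anything else to only print the commands
    mode="$1"
    shift
    for user in "$@"; do
        if [ "$mode" = "run" ]; then
            sacctmgr -i add user "$user" Accounts="$ACCOUNT"
        else
            echo "sacctmgr -i add user $user Accounts=$ACCOUNT"
        fi
    done
}

# Dry run with two hypothetical usernames; nothing is modified.
add_users dry alice bob
```

Replacing `dry` with `run` would execute the commands, assuming the account already exists and the caller has the rights to modify associations.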
You will probably need to.
The way we handle it is that we add users when they first submit a job
via the job_submit.lua script. This way the database autopopulates with
active users.
-Paul Edmon-
On 10/3/23 9:01 AM, Davide DelVento wrote:
By increasing the slurmdbd verbosity level, I got additional information,
namely the following:
slurmdbd: error: couldn't get information for this user (null)(xx)
slurmdbd: debug: accounting_storage/as_mysql:
as_mysql_jobacct_process_get_jobs: User xx has no associations, and
is not admi
Thanks Paul, this helps.
I don't have any PrivateData line in either config file. According to the
docs, "By default, all information is visible to all users" so this should
not be an issue. I tried to add a line with "PrivateData=jobs" to the conf
files, just in case, but that didn't change the b
At least in our setup, users can see their own scripts by doing sacct -B
-j JOBID
I would make sure that the scripts are being stored, and check how you have
PrivateData set.
-Paul Edmon-
On 10/2/2023 10:57 AM, Davide DelVento wrote:
I deployed the job_script archival and it is working, however it can be
queried only by root.
A regular user can run sacct -lj against any job (even those by other
users, and that's okay in our setup) with no problem. However if they run
sacct -j job_id --batch-script even against a job they own
Fantastic, this is really helpful, thanks!
On Thu, Sep 28, 2023 at 12:05 PM Paul Edmon wrote:
Yes it was later than that. If you are 23.02 you are good. We've been
running with storing job_scripts on for years at this point and that
part of the database only uses up 8.4G. Our entire database takes up
29G on disk. So it's about 1/3 of the database. We also have database
compression whic
No, all the archiving does is remove the pointer. What Slurm does right
now is create a hash of the job_script/job_env and check whether that
hash matches one on record. If it does not, it adds the content to the
record; if it does match, it adds a pointer to the appropriate
record.
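To illustrate the idea of hash-based deduplication described above, here is a small sketch. It is not Slurm's implementation: the on-disk store, the index file, the job IDs, and the choice of SHA-256 (via `sha256sum`) are all assumptions made for the example.

```shell
# Sketch of hash-based deduplication; not how slurmdbd stores data internally.
STORE="$(mktemp -d)"     # hypothetical content store, one file per unique hash
INDEX="$STORE/index"     # hypothetical table of "job_id hash" pointers

store_script() {
    # $1: job id, $2: path to the job script
    job_id="$1"
    script="$2"
    hash="$(sha256sum "$script" | cut -d' ' -f1)"
    # Store the content only if no record with this hash exists yet.
    if [ ! -e "$STORE/$hash" ]; then
        cp "$script" "$STORE/$hash"
    fi
    # Always record a pointer from the job to the stored content.
    echo "$job_id $hash" >> "$INDEX"
}

# Two hypothetical jobs submitting the identical script.
tmp_script="$(mktemp)"
printf '#!/bin/bash\nsrun hostname\n' > "$tmp_script"
store_script 1001 "$tmp_script"
store_script 1002 "$tmp_script"
```

After the two calls the index holds two pointers, but the store holds only one copy of the script. Removing a line from the index corresponds to what archiving is said to do: the pointer goes away while the shared content stays.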
Sorry for the duplicate e-mail in a short time: do you (or anyone) know when
the hashing was added? Was planning to enable this on 21.08, but we then had to
delay our upgrade to it. I’m assuming later than that, as I believe that’s when
the feature was added.
On Sep 28, 2023, at 13:55, Ryan Nov
Thank you; we’ll put in a feature request for improvements in that area, and
also thanks for the warning? I thought of that in passing, but the real world
experience is really useful. I could easily see wanting that stuff to be
retained less often than the main records, which is what I’d ask for
Slurm should take care of it when you add it.
As far as horror stories go, under previous versions our database size
ballooned to be so massive that it actually prevented us from upgrading
and we had to drop the columns containing the job_script and job_env.
This was back before slurm started ha
In my current Slurm installation (recently upgraded to Slurm v23.02.3), I
only have
AccountingStoreFlags=job_comment
I now intend to add both
AccountingStoreFlags=job_script
AccountingStoreFlags=job_env
leaving the default 4MB value for max_script_size
Do I need to do anything on the DB mysel
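Since AccountingStoreFlags is documented as a comma-separated list, the intended change would presumably be written as a single line rather than as three separate ones. A sketch, assuming the existing job_comment flag is to be kept:

```
# slurm.conf (sketch)
AccountingStoreFlags=job_comment,job_script,job_env
```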