[slurm-users] cannot find auth plugin for auth/munge

2018-06-15 Thread ~Stack~
Greetings, I've got Slurm 17.11.7 running on a Scientific Linux 6 system. Things are working great. I have a Scientific Linux 7 system that I just want to be able to run sinfo/squeue/sacct on. I installed 17.11.7 from the OpenHPC repo (it's what we have running on the other SL7 cluster). The munge.key

Re: [slurm-users] How to check if there's a reservation

2018-06-15 Thread Prentice Bisbal
I agree. I brought it up with SchedMD after I spent almost an entire day trying to figure out why jobs were queued up but not running. I figured the reason column would say "reservation" if that was the issue. Instead, it provided some completely useless message, making me think the problem was

Re: [slurm-users] Job Resource Utilization Summary Email

2018-06-15 Thread Hanby, Mike
Thanks, Ole, that's perfect. Mike Hanby mhanby @ uab.edu Systems Analyst II - Enterprise IT Research Computing Services The University of Alabama at Birmingham On 6/13/18, 4:22 AM, "slurm-users on behalf of Ole Holm Nielsen" wrote: On 06/12/2018 06:06 PM, Hanby, Mik

Re: [slurm-users] How to check if there's a reservation

2018-06-15 Thread Ryan Novosielski
That’s great news; this is a vFAQ at our site. > On Jun 13, 2018, at 1:37 PM, Prentice Bisbal wrote: > > Just to revisit this: jobs that are queued, but prevented from running, > will have a more useful reason in 18.08, which will address one of my issues > with reservation collisions.

Re: [slurm-users] Generating OPA topology.conf

2018-06-15 Thread Jeffrey Frey
> Jeffrey: It would be very nice if you could document in detail how to > configure opa2slurm and list all prerequisite RPMs in your README.md. Added to the README.md: build info and usage info. :: Jeffrey T. Frey, Ph.D. Systems Programmer V

[slurm-users] Stripped binaries and parallel debugging

2018-06-15 Thread Pär Lindfors
Hi, Slurm's spec file disables RPM's normal behaviour where symbols are extracted and shipped in a separate debuginfo RPM. There is a comment that this is done to avoid breaking parallel debugging. Does anybody know what parallel debugging use case this refers to? I did a small test and stripped

Re: [slurm-users] cluster not registered

2018-06-15 Thread UGI
I have changed the StateSaveLocation, and now the errors are gone. It works OK. 2018-06-15 17:21 GMT+08:00 UGI : > > When I use slurmdbd, it output the following errors. > > I have run "sacctmgr add clustr myslurm". > > [2018-06-15T17:11:54.685] slurmdbd version 17.11.7 started > > [2018-06-15T17:12

Re: [slurm-users] sreport reports blank information

2018-06-15 Thread Buckley, Ronan
I should have added that I just setup this cluster. I'm guessing that it won't provide sreport data until tomorrow as the default range of the command seems to be yesterday (Cluster Utilization 2018-06-14T00:00:00 - 2018-06-14T23:59:59) Can anyone confirm this? From: slurm-users [mailto:slurm-us

Re: [slurm-users] When I start slurmctld, there are some errors in log.

2018-06-15 Thread UGI
I have changed the StateSaveLocation, and now the errors are gone. It works OK. 2018-06-15 17:42 GMT+08:00 John Hearns : > Please do three things for the list: > > a) cat /etc/*elease* > > b) give details on how Slurm was installed on the master node and the > compute nodes > > c) How was your slurm.

Re: [slurm-users] When I start slurmctld, there are some errors in log.

2018-06-15 Thread John Hearns
Please do three things for the list: a) cat /etc/*elease* b) give details on how Slurm was installed on the master node and the compute nodes c) How was your slurm.conf file created? Is this file identical on master node and compute nodes? On 15 June 2018 at 11:26, UGI wrote: > I didn't hav

Re: [slurm-users] additional variable to the struct job_desc_msg_t

2018-06-15 Thread Pär Lindfors
Dear Rajiv, On 06/14/2018 09:38 AM, Rajiv Nishtala wrote: > I started working with SLURM some days ago, and one of the first things > I aim to do is adding an additional variable to the job_script file via > the structure job_desc_msg_t. > Specifically, via the function slurm_submit_batch_job in s

Re: [slurm-users] When I start slurmctld, there are some errors in log.

2018-06-15 Thread UGI
I didn't have the directory /var/spool/slurmctld/. I then created the directory and ran "chown slurm:slurm /var/spool/slurmctld", but the errors still occur. 2018-06-15 16:00 GMT+08:00 John Hearns : > And your permissions on the directory /var/spool/slurmctld/ are > > On 15 June 2018 at 0

[slurm-users] cluster not registered

2018-06-15 Thread UGI
When I use slurmdbd, it outputs the following errors. I have run "sacctmgr add clustr myslurm". [2018-06-15T17:11:54.685] slurmdbd version 17.11.7 started [2018-06-15T17:12:05.377] DBD_JOB_COMPLETE: cluster not registered [2018-06-15T17:12:05.379] DBD_STEP_START: cluster not registered [2018-06

[slurm-users] sreport reports blank information

2018-06-15 Thread Buckley, Ronan
Hi all, Slurm accounting commands like sstat and sacct report information but sreport always reports no information, even though by default it works on my VM. What am I missing? Rgds Ronan

Re: [slurm-users] When I start slurmctld, there are some errors in log.

2018-06-15 Thread John Hearns
And your permissions on the directory /var/spool/slurmctld/ are On 15 June 2018 at 09:11, UGI wrote: > When I start slurmctld, there are some errors in log. And the job running > information doesn't store to mysql via slurmdbd. > > I set > > AccountingStoragePass=/usr/local/munge-munge-0.5

[slurm-users] When I start slurmctld, there are some errors in log.

2018-06-15 Thread UGI
When I start slurmctld, there are some errors in the log, and information about running jobs is not stored in MySQL via slurmdbd. I set AccountingStoragePass=/usr/local/munge-munge-0.5.13/var/run/munge/munge.socket.2 AccountingStorageType=accounting_storage/slurmdbd JobAcctGatherType=jobacct_gather/li