Re: [slurm-users] Accounting Information from slurmdbd does not reach slurmctld

2020-03-24 Thread Pascal Klink
Hi Sean, Hi Marcus, Changing from localhost to the actual IP seems to have solved the problem. Is that because not only the slurmctld process on the control node but also the slurmd processes on the compute nodes need to have access to the accounting information? Because although slurmdbd and

Re: [slurm-users] Accounting Information from slurmdbd does not reach slurmctld

2020-03-23 Thread Sean Crosby
What happens if you change AccountingStorageHost=localhost to AccountingStorageHost=192.168.1.1, i.e. the same IP address as your ctl, and restart the ctld? Sean -- Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead Research Computing Services | Business Services The University of Melbourne,
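For reference, the suggested change would look roughly like the fragment below in slurm.conf. This is only a sketch: 192.168.1.1 is the control node address mentioned in the message above, and the AccountingStorageType line is an assumption about the setup, not something quoted in the thread.

```
# slurm.conf (fragment)
# Assumption: accounting is routed through slurmdbd.
AccountingStorageType=accounting_storage/slurmdbd
# Previously: AccountingStorageHost=localhost
AccountingStorageHost=192.168.1.1
```

As the message says, slurmctld needs a restart after the edit for the new host to be picked up.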

Re: [slurm-users] Accounting Information from slurmdbd does not reach slurmctld

2020-03-23 Thread Marcus Wagner
Hi Pascal, are the slurmdbd and slurmctld running on the same host? Best Marcus On 20.03.2020 at 18:12, Pascal Klink wrote: Hi Chris, Thanks for the quick answer! I tried the 'sacctmgr show clusters' command, which gave Cluster ControlHost ControlPort RPC Share ... QOS

Re: [slurm-users] Accounting Information from slurmdbd does not reach slurmctld

2020-03-20 Thread Pascal Klink
Hi Chris, Thanks for the quick answer! I tried the 'sacctmgr show clusters' command, which gave Cluster ControlHost ControlPort RPC Share ... QOS Def QOS -- --- - - ... - iascluster

Re: [slurm-users] Accounting Information from slurmdbd does not reach slurmctld

2020-03-19 Thread Christopher Samuel
On 3/19/20 4:05 AM, Pascal Klink wrote: However, there was no real answer given why this happened. So we thought that maybe this time someone may have an idea. To me it sounds like either your slurmctld is not correctly registering with slurmdbd, or if it has then slurmdbd cannot connect ba

[slurm-users] Accounting Information from slurmdbd does not reach slurmctld

2020-03-19 Thread Pascal Klink
Hi everyone, we currently have a problem with our SLURM setup for a small cluster of 7 machines. The problem is that the accounted core usage is not correctly used for the share computation. I have set up a minimal (not) working example. In this example, we have one cluster to which we have add