Hi,

Thank you! That finally tells me where the real billing minutes (and also the 
used CPU minutes that I was billing on in my older version of Slurm) are 
stored. I’m not sure how I was supposed to know that GrpTRESRaw even exists... 
I see no mention of it unless I specifically query it or look in the docs. 
That’s a great start.

# sshare -A test -l -o GrpTRESRaw%70
                                                            GrpTRESRaw
                 -----------------------------------------------------
                            cpu=360,mem=0,energy=0,node=360,billing=60

Now I have something to query to present to the users in the future.
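
For presenting this to users, the GrpTRESRaw string can be split into 
per-TRES values and compared against the GrpTRESMins limit. The following is 
only a minimal sketch: the helper name parse_tres is made up for illustration, 
the two strings are copied from this thread rather than read from sshare, and 
it assumes every value is a plain integer.

```python
def parse_tres(tres: str) -> dict[str, int]:
    """Turn 'cpu=360,mem=0,...' into {'cpu': 360, 'mem': 0, ...}.

    Hypothetical helper; assumes every value is a plain integer.
    """
    pairs = (item.split("=", 1) for item in tres.strip().split(",") if item)
    return {name: int(value) for name, value in pairs}


# Values copied from this thread; in practice they would come from sshare.
used = parse_tres("cpu=360,mem=0,energy=0,node=360,billing=60")
limit = parse_tres("billing=120")

# Remaining billing minutes = limit (GrpTRESMins) - usage (GrpTRESRaw).
remaining = limit["billing"] - used["billing"]
print(f"billing minutes used: {used['billing']}, remaining: {remaining}")
```

With the numbers above this reports 60 minutes used and 60 remaining, which 
matches the "60 are still available" figure in the slurmctld log quoted 
further down.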
So the remaining issue is why I can’t use decimal weights to bill for time…

Thanks!
-John
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of "Thomas 
M. Payerle" <paye...@umd.edu>
Reply-To: Slurm User Community List <slurm-users@lists.schedmd.com>
Date: Friday, April 27, 2018 at 11:39 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] New Billing TRES Issue

I have not had a chance to play with the newest Slurm, but I would suggest 
looking at GrpTRESRaw, which is supposed to gather the usage by TRES (in 
TRES-minutes).
So if there is a billing TRES in GrpTRESRaw, that might be what you want.

On Fri, Apr 27, 2018 at 11:21 AM, Roberts, John E. 
<jerobe...@anl.gov> wrote:
Hi,

I'm testing the newest version of Slurm and I'm seeing an issue when using the 
newer billing TRES to charge for cpu time on a partition. I've seen that 
billing should be used now instead of cpu in order to properly use the 
"TRESBillingWeights" option on a partition.

In my test case, I gave an account 2 hours of billing time. I used 1 hour of 
this while setting the partition to TRESBillingWeights="CPU=1.0". It seemed to 
have billed properly.
Next, I set TRESBillingWeights="CPU=0.5" on the same partition. I ran several 
jobs, but the billing never seemed to increase. RawUsage, however, did 
increment correctly.

Here's an example of sshare reporting no billing run minutes when CPU=0.5 and 
I start a job with a walltime of 1 hour. Even though the RawUsage is well past 
2 hours, a job can still run when it shouldn't.

# sshare -A test -l -o RawUsage,GrpTRESMins,TRESRunMins%60
   RawUsage                    GrpTRESMins                                                  TRESRunMins
----------- ------------------------------        -----------------------------------------------------
      11068                    billing=120                      cpu=60,mem=0,energy=0,node=60,billing=0

If I set CPU=1.0 and start, say, a 2-hour job, I get this in the logs:
debug2: Job 32 being held, the job is at or exceeds assoc 
239(test/(null)/(null)) group max tres(billing) minutes of 120 of which 60 are 
still available but request is for 120 (plus 0 already in use) tres minutes 
(request tres count 1)

This makes sense because I previously ran a job at the weight of 1.0 for an 
hour so it "billed" for 1 hour at that time. How can I query the "available" 
billing hours if it's not RawUsage?
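
The arithmetic behind that hold message can be reproduced with the numbers it 
prints. This is only a sketch of the comparison the message describes, not 
Slurm's actual code: the function name fits_limit is made up, and the 
behaviour when a request exactly equals the available minutes is an 
assumption that the log does not settle.

```python
def fits_limit(limit_min: int, used_min: int,
               running_min: int, request_min: int) -> bool:
    """Return True if a request fits in what is left of the limit.

    Hypothetical illustration of the numbers in the debug2 message:
    limit_min   -> "group max tres(billing) minutes of 120"
    used_min    -> minutes already billed (120 - 60 available = 60)
    running_min -> "plus 0 already in use"
    request_min -> "request is for 120"
    """
    available = limit_min - used_min
    # Boundary handling (<= vs <) is an assumption, not taken from Slurm.
    return request_min + running_min <= available


# Job 32 from the log: 60 minutes available, 120 requested -> held.
print(fits_limit(limit_min=120, used_min=60, running_min=0, request_min=120))
# A 30-minute request would fit in the same 60 available minutes.
print(fits_limit(limit_min=120, used_min=60, running_min=0, request_min=30))
```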

Going back to setting billing CPU weight to 0.5, the logs seem to be 
inconsistent too. In this first line, it shows the right thing:
debug:  TRES Weight: cpu = 1.000000 * 0.500000 = 0.500000

but not a few lines down:
debug2: acct_policy_job_begin: after adding job 45, assoc 
239(test/(null)/(null)) grp_used_tres_run_secs(billing) is 0

Again, RawUsage increases correctly, but Slurm is using some other field for 
billing to determine if a job can run.
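
One possible explanation, which nothing in this thread confirms, is that the 
weighted value is computed as a fraction but stored as a whole number, so a 
single CPU at weight 0.5 would be recorded as 0. The sketch below only 
illustrates that hypothesis; the function name billed_tres and the truncation 
step are assumptions, not Slurm's documented behaviour.

```python
def billed_tres(cpus: int, cpu_weight: float) -> int:
    """Hypothetical model: weight the CPU count, then truncate to an integer."""
    weighted = cpus * cpu_weight   # e.g. 1 * 0.5 = 0.5, as in the debug line
    return int(weighted)           # ASSUMED truncation; drops the fraction


for cpus, weight in [(1, 1.0), (1, 0.5), (2, 0.5)]:
    print(f"cpus={cpus} weight={weight} -> billing={billed_tres(cpus, weight)}")
```

If this is what is happening, a 2-CPU job at CPU=0.5 should show a non-zero 
billing value in TRESRunMins, which would be a quick way to test the idea.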

My questions are: How can I set CPU billing to be less than 1 and how can I 
make sure jobs don't run if they are out of time in this case? What is Slurm 
using for billing, because it's clearly not RawUsage? Am I simply 
misunderstanding the billing and/or weights fields?

Thanks for any help...



--
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads        paye...@umd.edu
5825 University Research Park               (301) 405-6135
University of Maryland
College Park, MD 20740-3831
