Ah, I see your point. Thanks for the detail explanation. -Zhihui
On Mon, Jan 23, 2017 at 6:10 AM, Thomas Gleixner <t...@linutronix.de> wrote: > On Sat, 21 Jan 2017, Zhihui Zhang wrote: > >> Sure, I believe that comments should always match the code. In this > > That's fine. > >> case, using either LVL_SIZE - 1 or LVL_SIZE is fine based on my >> understanding about 20 days ago. But I could be wrong and miss some >> subtle details. Anyway, my point is about readability. > > Well, readability is one thing, but correctness is more important, right? > > Let's assume we have 4 buckets per level and base->clk is 0. So Level 0 > has the following expiry times: > > Bucket 0: base->clk + 0 > Bucket 1: base->clk + 1 > Bucket 2: base->clk + 2 > Bucket 3: base->clk + 3 > > So we can accomodate 4 timers here, but there is a nifty detail. We > guarantee that expiries are not short, so a timer armed for base->clk > will expire at base->clk + 1. > > The reason for this is that we have no distinction between absolute and > relative timeouts. But for relative timeouts we have to guarantee that the > timeout does not expire before the number of jiffies has elapsed. > > Now a timer armed with 1 jiffie relativ to now (jiffies) cannot be queued > to bucket 0 because jiffies can be incremented immediately after queueing > the timer which would expire it early. So it's queued to bucket 1 and > that's why we need to have LVL_SIZE - 1 and not LVL_SIZE. See also > calc_index(). > > Your change completely breaks the wheel. Let's assume the above and a > timer expiring at base->clk + 3. > > With your change the timer would fall into Level 0. So no calc_index() > does: > > expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl); > return LVL_OFFS(lvl) + (expires & LVL_MASK); > > Let's substitute that for the expires = base->clk + 3 case: > > expires = (base->clk + 3 + 1) >> 0; > > ---> expires = 4; > > return 0 + (4 & 0x03); > > ---> index = 0 > > So the timer gets queued into bucket 0 and expires 4 jiffies too early. > > So using either LVL_SIZE - 1 or LVL_SIZE is _NOT_ fine. > > Thanks, > > tglx > > > > > >