ptimer: Fix issues caused by artificially limited timer timeout

Dmitry Osipenko Thu, 29 Oct 2015 07:39:54 -0700

29.10.2015 04:39, Peter Crosthwaite пишет:



On Tue, Oct 27, 2015 at 2:26 PM, Dmitry Osipenko <dig...@gmail.com
<mailto:dig...@gmail.com>> wrote:

    25.10.2015 20:39, Peter Crosthwaite пишет:

        On Sun, Oct 25, 2015 at 6:23 AM, Dmitry Osipenko <dig...@gmail.com
        <mailto:dig...@gmail.com>> wrote:

            25.10.2015 02:55, Peter Crosthwaite пишет:

                On Sat, Oct 24, 2015 at 3:22 PM, Dmitry Osipenko
                <dig...@gmail.com <mailto:dig...@gmail.com>> wrote:


                    24.10.2015 22:45, Peter Crosthwaite пишет:




                        This looks like a give-up without trying to get the
                        correct value. If
                        the calculated value (using the normal-path logic below)
                        is sane, you
                        should just use it. If it comes out bad then you should
                        clamp to 1.

                        I am wondering whether this clamping policy (as in
                        original code as
                        well) is correct at all though. The value of a 
free-running
                        short-interval periodic timer (poor mans random number
                        generator)
                        without any actual interrupt generation will be affected
                        by QEMUs
                        asynchronous handling of timer events. So if I set limit
                        to 100, then
                        sample this timer every user keyboard stroke, I should
                        get a uniform
                        distribution on [0,100]. Instead in am going to get lots
                        of 1s. This




                    Right, that's a good example. What about to scale ptimer
                    period to match
                    adjusted timer_mod interval?


                Thats just as incorrect as changing the limit IMO. The guest
                could get
                confused with a timer running at the wrong frequency.

                        is more broken in the original code (as you state), as I
                        will get >
                        100, but I think we have traded broken for slightly less
                        broken. I
                        think the correct semantic is to completely ignoring
                        rate limitin
                        except for the scheduling on event callbacks. That is,
                        the timer




                    I'm missing you here. What event callbacks?


                when timer_mod() hits, and it turn triggers some device specific
                event
                (usually an interrupt).

                There are two basic interactions for any QEMU timer. There are
                synchronous events, i.e. the guest reading (polling) the counter
                which
                is what this patch tries to fix. The second is the common case 
of
                periodic interrupt generation. My proposal is that rate limiting
                does
                not affect synchronous operation, only asynchronous (so my 
keystroke
                RNG case works). In the current code, if ptimer_get_count() is
                called
                when the event has passed it returns 0 under the assumption 
that the
                timer_mod callback is about to happen. With rate-limiting that
                may be
                well in the future.


            ptimer_tick() would happen on the next QEMU loop cycle, so it might
            be more
            reasonable to return counter = 1 here, wouldn't it?


                        interval is not rate limited, instead the timer_mod 
interval
                        (next_event -last_event) just has a 10us clamp.

                        The means the original code semantic of returning
                        counter = 0 for an
                        already triggered timer is wrong. It should handle
                        in-the-past
                        wrap-arounds as wraparounds by triggering the timer and
                        redoing the
                        math with the new interval values. So instead the logic
                        would be
                        something like:

                        timer_val = -1;

                        for(;;) {

                             if (!enabled) {
                                return delta;
                             }

                             timer_val = (next-event - now) *  scaling();
                             if (timer_val >=  0) {
                                 return timer_val;
                             }
                             /* Timer has actually expired but we missed it,
                        reload it and try
                        again
                        */
                             ptimer_tick();
                        }




                    Why do you think that ptimer_get_count() == 0 in case of the
                    running
                    periodic timer that was expired while QEMU was "on the way"
                    to ptimer
                    code
                    is bad and wrong?



                Because you may have gone past the time when it was actually
                zero and
                reloaded and started counting again. It should return the real
                (reloaded and resumed) value. This is made worse by rate 
limiting as
                the timer will spend a long time at the clamp value waiting for 
the
                rate-limited tick to fix it.

                Following on from before, we don't want the async stuff to 
affect
                sync. As the async callbacks are heavily affected by rate
                limiting we
                don't want the polled timer having to rely on the callbacks 
(async
                path) at all for correct operation. That's what the current
                implementation of ptimer_get_count assumes with that 0-clamp.


            Alright, that make sense now. Thanks for clarifying.

                       From the guest point of view it's okay (no?), do we 
really
                    need to overengineer that corner case?


                Depends on your use case. Your bug report is correct in that the
                timer
                should never be outside the bounds of the limit. But you are 
fixing
                that very specifically with a minimal change rather than 
correcting
                the larger ptimer_get_count() logic which I think is more broken
                than
                you suggest it is.


                        ptimer_reload() then needs to be patched to make sure it
                        always
                        timer_mod()s in the future, otherwise this loop could
                        iterate a large
                        number of times.

                        This means that when the qemu_timer() actually ticks, a
                        large number
                        or cycles may have occured, but we can justify that in
                        that callback
                        event latency (usually interrupts) is undefined anyway
                        and coalescing
                        of multiples may have happened as part of that. This
                        usually matches
                        expectations of real guests where interrupt latency is
                        ultimately
                        undefined.




                    ptimer_tick() is re-arm'ing the qemu_timer() in case of
                    periodic mode.
                    Hope
                    I haven't missed your point here.


                Yes. But it is also capable of doing the heavy lifting for our
                already
                expired case. Basic idea is, if the timer is in a bad state 
(should
                have hit) but hasn't, do the hit to put the timer into a good 
state
                (by calling ptimer_tick) then just do the right thing. That's
                what the
                loop does. It should also work for an in-the-past one-shot.


            Summarizing what we have now:

            There are two issues with ptimer:

            1) ptimer_get_count() return wrong values with adjusted .limit

            Patch V7 doesn't solve that issue, but makes it slightly better by
            clamping
            returned value to [0, 1]. That's not what we want, we need to return
            counter
            value in it's valid range [0, limit].

            You are rejecting variant of scaling ptimer period, saying that it 
might
            affect software behavior inside the guest. But by adjusting the
            timer, we
            might already causing same misbehavior in case of blazing fast host
            machine.


        It is a different misbehaviour. We are modelling the polled timer
        perfectly but limiting the frequency of callbacks (interrupts). I
        think this is the lesser of two evils.

            I'll scratch my head a bit more on it. If you have any better idea,
            please
            share.

            2) ptimer_get_count() return fake 0 value in case of the expired
            qemu_timer() without triggering ptimer_tick()

            You're suggesting to solve it by running ptimer_tick(). So if 
emulated
            device uses ptimer tick event (scheduled qemu bh) to raise 
interrupt, it
            would do it by one QEMU loop cycle earlier.


        Yes, this is ok, as even in a rate limited scenario there is no reason
        to absolutely force the rate limit. If a poll happens it should just
        flush the waiting interrupt.

            My question here: is it always legal for the guest software to be
            able to
            get counter = 0 on poll while CPU interrupt on timer expire hasn't
            happened
            yet (would happen after one QEMU cycle).


        Yes. And I am going a step further by saying it is ok for the guest
        software to see the timer value wrapped around before the expire too.


    Let's imagine a hardware with a such restriction: timer interrupt has
    highest priority and CPU immediately switches to the interrupt handler in a
    such way that it won't ever could see counter = 0 / wraparound (with
    interrupt enabled) before entering the handler.

    Is it unrealistic?


Yes. And if it is possible in real HW, I don't think this is valid for QEMU
outside of icount mode.

Okay, fair enough. So QEMU doesn't guarantee proper behavior outside of icountmode. That's not what I expected :(

    For instance (on QEMU):

                  CPU                  |               Timer
    ---------------------------------------------------------------------
       start_periodic_timer            |       timer starts ticking
                                     .....
       QEMU starts to execute          |
       translated block                |
                                       |       QEMU timer expires
                                       |
       CPU reads the timer register,   |       ptimer_get_count() return
       ptimer_get_count() called       |       wrapped around value
                                     .....
       CPU interrupt handler kicks in  |       timer continue ticking, so
                                       |       any value is valid here
       CPU stops the timer and sets    |
       counter to 0, returns from the  |
       handler                         |
                                     .....
       Now, for some reason, software  |
       sees that timer is stopped      |
       and do something using read     |
       value                           |

    Program code sketch:

    timer_interrupt_handler()
    {
             write32(1, TIMER_STOP);
             write32(0, TIMER_COUNTER);
             write32(TIMER_IRQ_CLEAR, TIMER_STATUS);

             return IRQ_HANDLED;
    }

    program()
    {
             .....

             ..... <--- timer expired here
             ..... <--- interrupt handler executed here on real HW

             var1 = read32(TIMER_COUNTER); <--- Emulated got wrapped,
                                                real got 0

             ..... <--- interrupt handler executed here on QEMU

             if (read32(TIMER_STATUS) & TIMER_RUNNING) {
                     .....
             } else {
                     .....

                     write(var1 >> 16, SOME_DEV_REGISTER);
             }

             .....
    }

    Might emulated program behave differently from the real HW after it? 
Probably.

    I want to mention that not only beefy generic CPU's are the ptimer users.

    However, it seems that no one of the current ptimer users has a such
    restriction since it would already return 0 on expire and ptimer_tick()
    would happen after it. We can agree on keeping ptimer less universal in 
favor of
    the expire optimization, so somebody may improve it later if it would be 
needed.


I think removing the rate limiter's and clamping-affect on the read value makes
it more universal if anything.

    Do we agree?


I'm not sure, what are you referring to as the "expire optimizsation"?

By calling ptimer_tick() from ptimer_get_count() and returning wrapped aroundvalue (see the "future"), we would cause qemu_bh schedule happen earlier andmight break expected IRQ ordering. That's what I meant by "expire optimization".

            I guess it might cause software
            misbehavior if it assumes that the real hardware has CPU and timer
            running
            in the same clock domain, i.e. such situation might be not possible.


        Assumptions about the CPU clocking only make sense in icount mode,
        where the rate limiter is disable anyway.


    Timer limiter has nothing to do with a returned value for the expired timer.
    Clock cycle accurate execution isn't relevant to upstream QEMU, I meant
    clocking in general. Emulated behaviour shouldn't diverge from the real HW.


But when the rate limiter is on for a short interval, it massively distorts 
this.


Two corner cases for the limited polled periodic ptimer:

1) QEMU timer expires during poll:

Like you are suggesting, in that case we are calculating wrapped around countervalue, trigger ptimer_tick() and return wrapped value. Good.

2) Host machine is blazing fast, so poor RNG might constantly face "counter >limit":


How should we handle it?

- Period scaling is rejected, okay.

- Clamp counter to limit and pretend that it won't happen?

- Disable limiting for that host machine?

- Tune timer limit?

- Scale counter when we hit "counter > limit" in ptimer_get_count()? But nowwe'll have to book fact that we have hit that case and should scale counter onevery next poll till timer expires to avoid jumping into the "past", whencounter become less or equal the limit.


Regards,
Peter

            So I'm
            suggesting to return counter = 1 and allow ptimer_tick() happen on
            it's own.


        My alternate suggestion is, if you detect that the tick should have
        already happened, just make it happen. I don't see the need to rate
        limit a polled timer.


    Yes, I got your idea and it is absolutely correct if we agree on the above
    tradeoff (if that tradeoff exists).


--
Dmitry

Re: [Qemu-devel] [PATCH v7 1/2] hw/ptimer: Fix issues caused by artificially limited timer timeout

Reply via email to