Funky new vblank counter regressions in Linux 4.4-rc1

Mario Kleiner Mon, 23 Nov 2015 18:58:34 +0100


On 11/23/2015 04:51 PM, Ville SyrjÃ¤lÃ¤ wrote:
> On Mon, Nov 23, 2015 at 04:23:21PM +0100, Mario Kleiner wrote:
>> On 11/20/2015 04:34 PM, Ville SyrjÃ¤lÃ¤ wrote:
>>> On Fri, Nov 20, 2015 at 04:24:50PM +0100, Mario Kleiner wrote:
>>
>> ...
>> Ok, but why would that be a bad thing? I think we want it to think it is
>> in the previous frame if it is called outside the vblank irq context.
>> The only reason we fudge it to the next frames vblank if i vblank irq is
>> because we know the vblank irq handler we are executing atm. was meant
>> to execute within the upcoming vblank for the next frame, so we fudge
>> the scanout positions and thereby timestamp to correspond to that new
>> frame. But if something called outside irq context it should get a
>> scanout position/timestamp that corresponds to "reality".
>
> It would be a bad thing since it would cause the timestamp to jump
> backwards, and that would also cause the frame count guesstimate to go
> backwards.
>


But only if we don't use the dev->driver->get_vblank_counter() method, 
which we try to use on AMD. Also dev->vblank_time_lock should protect us 
from concurrent execution of the vblank counting/timestamping in irq 
context and non-irq context? At least that was one of the purposes of 
that lock in the past?

-mario

>>
>> It would maybe be a problem to get different answers for a query at the
>> same scanout position if we used .get_scanout_position() for something
>> else than for calculating vblank timestamps, but we don't do that atm.
>>
>> Maybe i'm overlooking something here related to the somewhat rewritten
>> code from the last year or so? But in the original design this would be
>> exactly what i intended?
>>
>> ...
>>
>>>> So it's good enough for typical desktop
>>>> applications/compositors/games/media players, and a nice improvement
>>>> over the previous state, but not quite sufficient for applications that
>>>> need long time consistent vblank counts for long waits of multiple
>>>> seconds or even minutes. So it's still good to have hw counters if we
>>>> can get some that are reliable enough.
>>>
>>> Ah, I didn't realize you care about small errors in the counter for
>>> such long periods of vblank off.
>>>
>>
>> Actually, you are right, i was stupid - not enough sleep last friday. I
>> do care about such small errors, but the long vblank off periods don't
>> matter at all the way my software works. I query the current count and
>> timestamp (glXGetSyncValuesOML), calculate a target count based on those
>> and then schedule a swap for that target count via glXSwapBuffersMscOML.
>> That swapbuffers call will keep vblank irqs on until the kms pageflip is
>> queued. So i only care about vblank counts/timestamps being consistent
>> for short amounts of time, typically 1 video refresh from vblank query
>> to queueing a vblank event, and then from reception of that
>> event/queuing the pageflip to pageflip completion event. So even if the
>> system would be heavily loaded and my code would have big preemption
>> delays i think counts that are consistent over a few seconds would be
>> enough to keep things working. Otherwise it wouldn't work now either
>> with a vblank off after 5 seconds and nouveau not having vblank hw counters.
>>
>> -mario
>>
>>
>>>>
>>>> -mario
>>>>
>>>>
>>>>>>
>>>>>> -mario
>>>>>>
>>>>>>>>
>>>>>>>> It almost sort of works on the rs600 code path, but i need a bit of 
>>>>>>>> info
>>>>>>>> from you:
>>>>>>>>
>>>>>>>> 1. There's this register from the old specs for m76.pdf, which is not
>>>>>>>> part of the current register defines for radeon-kms:
>>>>>>>>
>>>>>>>> "D1CRTC_STATUS_VF_COUNT - RW - 32 bits - [GpuF0MMReg:0x60A8]"
>>>>>>>>
>>>>>>>> It contains the lower 16 bits of framecounter and the 13 bits of
>>>>>>>> vertical scanout position. It seems to give the same readings as the 24
>>>>>>>> bit R_0060A4_D1CRTC_STATUS_FRAME_COUNT we use for the hw counter. This
>>>>>>>> would come handy.
>>>>>>>>
>>>>>>>> Does Evergreen and later have a same/similar register and where is it?
>>>>>>>>
>>>>>>>> 2. The hw framecounter seems to increment when the vertical scanout
>>>>>>>> position wraps back from (VTOTAL - 1) to 0, at least on the one DCE-3
>>>>>>>> gpu i tested so far. Is this so on all asics? And is the hw counter
>>>>>>>> increment happening exactly at the moment that vertical scanout 
>>>>>>>> position
>>>>>>>> jumps back to zero, ie. both events are driven by the same signal? Or 
>>>>>>>> is
>>>>>>>> the framecounter increment just happening somewhere inside either
>>>>>>>> scanline VTOTAL-1 or scanline 0?
>>>>>>>>
>>>>>>>>
>>>>>>>> If we can fix this and get it into rc2 or rc3 then we could avoid a bad
>>>>>>>> regression and with a bit of luck at the same time improve by being 
>>>>>>>> able
>>>>>>>> to set dev->vblank_disable_immediate = true then and allow vblank irqs
>>>>>>>> to get turned off more aggressively for a bit of extra power saving.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>> -mario
>>>>>>>
>>>>>>>> -- a/drivers/gpu/drm/drm_irq.c
>>>>>>>> +++ b/drivers/gpu/drm/drm_irq.c
>>>>>>>> @@ -172,9 +172,11 @@ static void drm_update_vblank_count(struct 
>>>>>>>> drm_device *dev, unsigned int pipe,
>>>>>>>>                                         unsigned long flags)
>>>>>>>>      {
>>>>>>>>             struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
>>>>>>>> -       u32 cur_vblank, diff;
>>>>>>>> +       u32 cur_vblank, diff = 0;
>>>>>>>>             bool rc;
>>>>>>>>             struct timeval t_vblank;
>>>>>>>> +       const struct timeval *t_old;
>>>>>>>> +       u64 diff_ns;
>>>>>>>>             int count = DRM_TIMESTAMP_MAXRETRIES;
>>>>>>>>             int framedur_ns = vblank->framedur_ns;
>>>>>>>>
>>>>>>>> @@ -195,13 +197,15 @@ static void drm_update_vblank_count(struct 
>>>>>>>> drm_device *dev, unsigned int pipe,
>>>>>>>>                     rc = drm_get_last_vbltimestamp(dev, pipe, 
>>>>>>>> &t_vblank, flags);
>>>>>>>>             } while (cur_vblank != 
>>>>>>>> dev->driver->get_vblank_counter(dev, pipe) && --count > 0);
>>>>>>>>
>>>>>>>> -       if (dev->max_vblank_count != 0) {
>>>>>>>> -               /* trust the hw counter when it's around */
>>>>>>>> -               diff = (cur_vblank - vblank->last) & 
>>>>>>>> dev->max_vblank_count;
>>>>>>>> -       } else if (rc && framedur_ns) {
>>>>>>>> -               const struct timeval *t_old;
>>>>>>>> -               u64 diff_ns;
>>>>>>>> -
>>>>>>>> +       /*
>>>>>>>> +        * Always use vblank timestamping based method if supported to 
>>>>>>>> reject
>>>>>>>> +        * redundant vblank irqs. E.g., AMD hardware needs this to not 
>>>>>>>> screw up
>>>>>>>> +        * due to some irq handling quirk.
>>>>>>>
>>>>>>> Hmm. I'm thinking it would be better to simply not claim that there's a 
>>>>>>> hw
>>>>>>> counter if it isn't reliable.
>>>>>>>
>>>>>>>> +        *
>>>>>>>> +        * This also sets the diff value for use as fallback below in 
>>>>>>>> case the
>>>>>>>> +        * hw does not support a suitable hw vblank counter.
>>>>>>>> +        */
>>>>>>>> +       if (rc && framedur_ns) {
>>>>>>>>                     t_old = &vblanktimestamp(dev, pipe, vblank->count);
>>>>>>>>                     diff_ns = timeval_to_ns(&t_vblank) - 
>>>>>>>> timeval_to_ns(t_old);
>>>>>>>>
>>>>>>>> @@ -212,11 +216,28 @@ static void drm_update_vblank_count(struct 
>>>>>>>> drm_device *dev, unsigned int pipe,
>>>>>>>>                      */
>>>>>>>>                     diff = DIV_ROUND_CLOSEST_ULL(diff_ns, framedur_ns);
>>>>>>>>
>>>>>>>> -               if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ)
>>>>>>>> -                       DRM_DEBUG_VBL("crtc %u: Redundant vblirq 
>>>>>>>> ignored."
>>>>>>>> -                                     " diff_ns = %lld, framedur_ns = 
>>>>>>>> %d)\n",
>>>>>>>> -                                     pipe, (long long) diff_ns, 
>>>>>>>> framedur_ns);
>>>>>>>> -       } else {
>>>>>>>> +               if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) {
>>>>>>>> +                   DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored."
>>>>>>>> +                   " diff_ns = %lld, framedur_ns = %d)\n",
>>>>>>>> +                   pipe, (long long) diff_ns, framedur_ns);
>>>>>>>> +
>>>>>>>> +                   /* Treat this redundant invocation as no-op. */
>>>>>>>> +                   WARN_ON_ONCE(cur_vblank != vblank->last);
>>>>>>>> +                   return;
>>>>>>>> +               }
>>>>>>>> +       }
>>>>>>>> +
>>>>>>>> +       /*
>>>>>>>> +        * hw counters, if marked as supported via max_vblank_count != 
>>>>>>>> 0,
>>>>>>>> +        * *must* increment at leading edge of vblank and in sync with
>>>>>>>> +        * the firing of the hw vblank irq, otherwise we have a race 
>>>>>>>> here
>>>>>>>> +        * between gpu and us, where small variations in our execution
>>>>>>>> +        * timing can cause off-by-one miscounting of vblanks.
>>>>>>>> +        */
>>>>>>>> +       if (dev->max_vblank_count != 0) {
>>>>>>>> +               /* trust the hw counter when it's around */
>>>>>>>> +               diff = (cur_vblank - vblank->last) & 
>>>>>>>> dev->max_vblank_count;
>>>>>>>> +       } else if (!(rc && framedur_ns)) {
>>>>>>>>                     /* some kind of default for drivers w/o accurate 
>>>>>>>> vbl timestamping */
>>>>>>>>                     diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
>>>>>>>>             }
>>>>>>>
>>>>>>>
>>>>>
>>>
>

Funky new vblank counter regressions in Linux 4.4-rc1

Reply via email to