On 02/09/2016 03:29 PM, Daniel Vetter wrote:
> On Tue, Feb 09, 2016 at 02:39:44PM +0100, Mario Kleiner wrote:
>> On 02/09/2016 11:23 AM, Daniel Vetter wrote:
>>> On Tue, Feb 09, 2016 at 12:07:27PM +0200, Ville Syrjälä wrote:
>>>> On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote:
>>>>> On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote:
>>>>>> This fixes a regression introduced by the new drm_update_vblank_count()
>>>>>> implementation in Linux 4.4:
>>>>>>
>>>>>> Restrict the bump of the software vblank counter in drm_update_vblank_count()
>>>>>> to a safe maximum value of +1 whenever there is the possibility that
>>>>>> concurrent readers of vblank timestamps could be active at the moment,
>>>>>> as the current implementation of the timestamp caching and updating is
>>>>>> not safe against concurrent readers for calls to store_vblank() with a
>>>>>> bump of anything but +1. A bump != 1 would very likely return corrupted
>>>>>> timestamps to userspace, because the same slot in the cache could
>>>>>> be concurrently written by store_vblank() and read by one of those
>>>>>> readers in a non-atomic fashion and without the read-retry logic
>>>>>> detecting this collision.
>>>>>>
>>>>>> Concurrent readers can exist while drm_update_vblank_count() is called
>>>>>> from the drm_vblank_off() or drm_vblank_on() functions or other
>>>>>> non-vblank-irq callers. However, all those calls are happening with the
>>>>>> vbl_lock locked, thereby preventing a drm_vblank_get(), so the vblank
>>>>>> refcount can't increase while drm_update_vblank_count() is executing.
>>>>>> Therefore a zero vblank refcount during execution of that function
>>>>>> signals that it is safe for arbitrary counter bumps if called from
>>>>>> outside vblank irq, whereas a non-zero count is not safe.
>>>>>>
>>>>>> Whenever the function is called from vblank irq, we have to assume
>>>>>> concurrent readers could show up any time during its execution, even if
>>>>>> the refcount is currently zero, as vblank irqs are usually only enabled
>>>>>> due to the presence of readers, and because when it is called from
>>>>>> vblank irq it can't hold the vbl_lock to protect it from sudden bumps in
>>>>>> vblank refcount. Therefore also restrict bumps to +1 when the function
>>>>>> is called from vblank irq.
>>>>>>
>>>>>> Such bumps of more than +1 can happen at other times than reenabling
>>>>>> vblank irqs, e.g., when regular vblank interrupts get delayed by more
>>>>>> than 1 frame due to long held locks, long irq off periods, realtime
>>>>>> preemption on RT kernels, or system management interrupts.
>>>>>>
>>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de at gmail.com>
>>>>>> Cc: <stable at vger.kernel.org> # 4.4+
>>>>>> Cc: michel at daenzer.net
>>>>>> Cc: vbabka at suse.cz
>>>>>> Cc: ville.syrjala at linux.intel.com
>>>>>> Cc: daniel.vetter at ffwll.ch
>>>>>> Cc: dri-devel at lists.freedesktop.org
>>>>>> Cc: alexander.deucher at amd.com
>>>>>> Cc: christian.koenig at amd.com
>>>>>
>>>>> Imo this is duct-tape. If we want to fix this up properly I think we
>>>>> should just use a full-blown seqlock instead of our hand-rolled one. And
>>>>> that could handle any increment at all.
>>>>
>>>> And I even fixed this [1] almost a half a year ago when I sent the
>>>> original series, but that part got held hostage to the same seqlock
>>>> argument. Perfect is the enemy of good.
>>>>
>>>> [1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html
>>>
>>> Hm yeah, that does suffer from reinventing seqlocks. But I'd prefer your
>>> patch over Mario's hack here tbh. Your patch with seqlock would be even
>>> more awesome.
>>> -Daniel
>>>
>>
>> I agree that my hack is only duct-tape. That's why the long code comment to
>> let people know under which condition they could remove it.
>>
>> Using seqlocks would be the robust long term solution. But as this is
>> supposed to be a fix for both 4.4 and 4.5 i thought that such a rewrite
>> would be too intrusive as a change, compared to this one-liner?
>>
>> The original "roll our own" seqlock look alike implementation was meant to
>> avoid/minimize taking locks, esp. with _irqsave that are taken by both
>> userspace and timing sensitive vblank irq handling code. There were various
>> locking changes since then and that advantage might have been lost already
>> quite a long time ago, so maybe switching to full seqlocks wouldn't pose
>> some new performance problems there, but i haven't looked into this.
>
> Last time I checked, we had already reinvented seqlocks completely,
> except ours is buggy, since it can't take an increment > 1. I don't expect
> you'll be able to measure anything if we switch.
>
> Agree that it might be better to delay this for 4.6. So if you add a big
> "FIXME: Need to replace this hack with proper seqlocks." at the top of your
> big comment (or just as a replacement for it), then
>
> Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
>
> But currently it looks like this is a proper long-term solution, which it
> imo isn't.
> -Daniel
>

Ok, will do.
-mario
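
For reference, the seqlock scheme discussed above can be sketched in
userspace C11 roughly as follows. This is only an illustration under
stated assumptions, not the kernel code: struct vbl_cache, vbl_store()
and vbl_read() are made-up names, writers are assumed to be serialized
by some outer lock, and an in-kernel fix would presumably use the
primitives from <linux/seqlock.h> instead of open-coded atomics. The
point is that the reader retries on any change of the sequence number,
so the size of the counter bump no longer matters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-crtc vblank state (not the DRM struct). */
struct vbl_cache {
	atomic_uint seq;               /* even: stable, odd: write in progress */
	_Atomic uint32_t count;        /* software vblank counter */
	_Atomic int64_t timestamp_ns;  /* timestamp that belongs to count */
};

/*
 * Writer side. Assumption: callers serialize writers among themselves
 * (e.g. with a spinlock), so only readers can run concurrently.
 * Any diff is fine here, not just +1.
 */
static void vbl_store(struct vbl_cache *c, uint32_t diff, int64_t ts_ns)
{
	unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);
	uint32_t old;

	assert((s & 1) == 0);	/* no other write may be in progress */

	/* Mark the write as in progress (odd sequence number). */
	atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	old = atomic_load_explicit(&c->count, memory_order_relaxed);
	atomic_store_explicit(&c->count, old + diff, memory_order_relaxed);
	atomic_store_explicit(&c->timestamp_ns, ts_ns, memory_order_relaxed);

	/* Publish: the sequence number is even again and has advanced by 2. */
	atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/*
 * Reader side. Lock-free: retries until it has seen the same even
 * sequence number before and after copying count and timestamp.
 */
static uint32_t vbl_read(struct vbl_cache *c, int64_t *ts_ns)
{
	unsigned int s0, s1;
	uint32_t count;
	int64_t ts;

	do {
		s0 = atomic_load_explicit(&c->seq, memory_order_acquire);
		count = atomic_load_explicit(&c->count, memory_order_relaxed);
		ts = atomic_load_explicit(&c->timestamp_ns,
					  memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&c->seq, memory_order_relaxed);
	} while ((s0 & 1) || s0 != s1);

	*ts_ns = ts;
	return count;
}
```

A caller would do vbl_store(&cache, diff, now_ns) from the update path
and vbl_read(&cache, &ts) from the query path. Unlike the current
per-slot cache, a reader here can never pair a count with a timestamp
from a different update, at the cost of retrying while a write is in
flight.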

>
>>
>> -mario
>>
>>>>
>>>>> -Daniel
>>>>>
>>>>>> ---
>>>>>>   drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>>>>>   1 file changed, 41 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>>>>>> index bcb8528..aa2c74b 100644
>>>>>> --- a/drivers/gpu/drm/drm_irq.c
>>>>>> +++ b/drivers/gpu/drm/drm_irq.c
>>>>>> @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
>>>>>>                  diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
>>>>>>          }
>>>>>>
>>>>>> +        /*
>>>>>> +         * Restrict the bump of the software vblank counter to a safe maximum
>>>>>> +         * value of +1 whenever there is the possibility that concurrent readers
>>>>>> +         * of vblank timestamps could be active at the moment, as the current
>>>>>> +         * implementation of the timestamp caching and updating is not safe
>>>>>> +         * against concurrent readers for calls to store_vblank() with a bump
>>>>>> +         * of anything but +1. A bump != 1 would very likely return corrupted
>>>>>> +         * timestamps to userspace, because the same slot in the cache could
>>>>>> +         * be concurrently written by store_vblank() and read by one of those
>>>>>> +         * readers without the read-retry logic detecting the collision.
>>>>>> +         *
>>>>>> +         * Concurrent readers can exist when we are called from the
>>>>>> +         * drm_vblank_off() or drm_vblank_on() functions and other non-vblank-
>>>>>> +         * irq callers. However, all those calls to us are happening with the
>>>>>> +         * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount
>>>>>> +         * can't increase while we are executing. Therefore a zero refcount at
>>>>>> +         * this point is safe for arbitrary counter bumps if we are called
>>>>>> +         * outside vblank irq, a non-zero count is not 100% safe. Unfortunately
>>>>>> +         * we must also accept a refcount of 1, as whenever we are called from
>>>>>> +         * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and
>>>>>> +         * we must let that one pass through in order to not lose vblank counts
>>>>>> +         * during vblank irq off - which would completely defeat the whole
>>>>>> +         * point of this routine.
>>>>>> +         *
>>>>>> +         * Whenever we are called from vblank irq, we have to assume concurrent
>>>>>> +         * readers exist or can show up any time during our execution, even if
>>>>>> +         * the refcount is currently zero, as vblank irqs are usually only
>>>>>> +         * enabled due to the presence of readers, and because when we are called
>>>>>> +         * from vblank irq we can't hold the vbl_lock to protect us from sudden
>>>>>> +         * bumps in vblank refcount. Therefore also restrict bumps to +1 when
>>>>>> +         * called from vblank irq.
>>>>>> +         */
>>>>>> +        if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 ||
>>>>>> +            (flags & DRM_CALLED_FROM_VBLIRQ))) {
>>>>>> +                DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u "
>>>>>> +                              "refcount %u, vblirq %u\n", pipe, diff,
>>>>>> +                              atomic_read(&vblank->refcount),
>>>>>> +                              (flags & DRM_CALLED_FROM_VBLIRQ) != 0);
>>>>>> +                diff = 1;
>>>>>> +        }
>>>>>> +
>>>>>>          DRM_DEBUG_VBL("updating vblank count on crtc %u:"
>>>>>>                        " current=%u, diff=%u, hw=%u hw_last=%u\n",
>>>>>>                        pipe, vblank->count, diff, cur_vblank, vblank->last);
>>>>>> --
>>>>>> 1.9.1
>>>>>>
>>>>>
>>>>> --
>>>>> Daniel Vetter
>>>>> Software Engineer, Intel Corporation
>>>>> http://blog.ffwll.ch
>>>>
>>>> --
>>>> Ville Syrjälä
>>>> Intel OTC
>>>
>
