Am 18.08.2014 um 17:28 schrieb Maarten Lankhorst:
> Op 18-08-14 om 17:12 schreef Christian K?nig:
>> Am 18.08.2014 um 16:45 schrieb Maarten Lankhorst:
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>>> ---
>>> V1 had a nasty bug breaking gpu lockup recovery. The fix is not
>>> allowing radeon_fence_driver_check_lockup to take exclusive_lock,
>>> and kill it during lockup recovery instead.
>>> V2 used delayed work that ran during lockup recovery, but required
>>> read lock. I've fixed this by downgrading the write, and retrying if
>>> recovery fails.
>>> ---
>>>    drivers/gpu/drm/radeon/radeon.h       |   2 +
>>>    drivers/gpu/drm/radeon/radeon_fence.c | 115 
>>> +++++++++++++++++-----------------
>>>    2 files changed, 61 insertions(+), 56 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/radeon.h 
>>> b/drivers/gpu/drm/radeon/radeon.h
>>> index 1d806983ec7b..29504efe8971 100644
>>> --- a/drivers/gpu/drm/radeon/radeon.h
>>> +++ b/drivers/gpu/drm/radeon/radeon.h
>>> @@ -355,6 +355,8 @@ struct radeon_fence_driver {
>>>        uint64_t            sync_seq[RADEON_NUM_RINGS];
>>>        atomic64_t            last_seq;
>>>        bool                initialized;
>>> +    struct delayed_work        fence_check_work;
>>> +    struct radeon_device        *rdev;
>> Put the reference to the device as the first field in the structure.
>>
>>>    };
>>>      struct radeon_fence {
>>> diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
>>> b/drivers/gpu/drm/radeon/radeon_fence.c
>>> index 913787085dfa..94eca53d99f8 100644
>>> --- a/drivers/gpu/drm/radeon/radeon_fence.c
>>> +++ b/drivers/gpu/drm/radeon/radeon_fence.c
>>> @@ -125,16 +125,7 @@ int radeon_fence_emit(struct radeon_device *rdev,
>>>        return 0;
>>>    }
>>>    -/**
>>> - * radeon_fence_process - process a fence
>>> - *
>>> - * @rdev: radeon_device pointer
>>> - * @ring: ring index the fence is associated with
>>> - *
>>> - * Checks the current fence value and wakes the fence queue
>>> - * if the sequence number has increased (all asics).
>>> - */
>>> -void radeon_fence_process(struct radeon_device *rdev, int ring)
>>> +static bool __radeon_fence_process(struct radeon_device *rdev, int ring)
>> Don't use "__" for internal radeon function names and especially don't 
>> remove the function documentation.
> I've moved the documentation to the new radeon_fence_process function that 
> calls the __radeon_fence_process one,
> which is an internal function that is only used by the driver. I guess I can 
> rename it to __radeon_fence_process_nowake or something instead,
> but documentation doesn't make sense since it's not used outside of 
> radeon_fence.c

We try to document even static functions in doxygen style even if you 
can't see them outside the C file.

>>>    {
>>>        uint64_t seq, last_seq, last_emitted;
>>>        unsigned count_loop = 0;
>>> @@ -190,7 +181,53 @@ void radeon_fence_process(struct radeon_device *rdev, 
>>> int ring)
>>>            }
>>>        } while (atomic64_xchg(&rdev->fence_drv[ring].last_seq, seq) > seq);
>>>    -    if (wake)
>>> +    if (seq < last_emitted && !rdev->in_reset)
>>> +        mod_delayed_work(system_power_efficient_wq,
>>> +                 &rdev->fence_drv[ring].fence_check_work,
>>> +                 RADEON_FENCE_JIFFIES_TIMEOUT);
>> Am I wrong or do you queue the work only after radeon_fence_process is 
>> called for the first time?
>>
>> Might be a good idea to have an explicit queue_delayed_work in 
>> radeon_fence_emit as well.
> Yeah might as well, with these changes it makes sense to run it as soon as 
> possible.
>
> But if there are no waiters, there's no real need. It's a tree falling in a 
> forest with no-one around to hear it. ;-)

That's why I suggested to schedule the work item only when the IRQ is 
enabled for waiting, otherwise it indeed doesn't make much sense.

Christian.

Reply via email to