RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Liu, Monk
Liu, Monk ; amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote: > Yeah, that's what I meant with that the start of processing a job is a > bit swa

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Daniel Vetter
> > > scheduled to the ring doesn't represent it's being processed > > > > > by hw. > > > > > > > > > > Thanks > > > > > > > > > > -- > > > >

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Andrey Grodzovsky
ovsky, Andrey Sent: Friday, August 27, 2021 4:14 AM To: Liu, Monk ; amd-gfx@lists.freedesktop.org; Koenig, Christian Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) Attached quick patch for per job TTL calculation to make more prec

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Christian König
amd-gfx@lists.freedesktop.org; Koenig, Christian Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) Attached quick patch for per-job TTL calculation to make the next timer expiration more precise. It's on top of the patch in this thre

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Andrey Grodzovsky
- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Grodzovsky, Andrey Sent: Friday, August 27, 2021 4:14 AM To: Liu, Monk ; amd-gfx@lists.freedesktop.org; Koenig, Christian Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sc

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Christian König
27, 2021 2:12 PM To: Grodzovsky, Andrey ; Liu, Monk ; amd-gfx@lists.freedesktop.org; Koenig, Christian Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) I don't think that this will be necessary or desired. See, the job should

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Christian König
freedesktop.org; Koenig, Christian Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) Attached quick patch for per-job TTL calculation to make the next timer expiration more precise. It's on top of the patch in this thread. Let me know i

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Andrey Grodzovsky
ian Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) I don't think that this will be necessary or desired. See, the job should be cleaned up as soon as possible after it is finished; otherwise we won't cancel the timeout quickly enough either. C

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Andrey Grodzovsky
eedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) Attached quick patch for per-job TTL calculation to make the next timer expiration more precise. It's on top of the patch in this thread. Let me know if this makes sense. Andrey On 2021-08-26 10:03 a.m., Andrey

RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Liu, Monk
From: Christian König Sent: Thursday, August 26, 2021 8:38 PM To: Liu, Monk ; amd-gfx@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) On 26.08.21 at 13:55, Liu, Monk wrote: > [AMD Official Use Only] >

RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Liu, Monk
riginal Message- From: Christian König Sent: Friday, August 27, 2021 2:12 PM To: Grodzovsky, Andrey ; Liu, Monk ; amd-gfx@lists.freedesktop.org; Koenig, Christian Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) I don't thi

RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Liu, Monk
Cloud-GPU Core team -- -Original Message- From: Grodzovsky, Andrey Sent: Friday, August 27, 2021 4:14 AM To: Liu, Monk ; amd-gfx@lists.freedesktop.org; Koenig, Christian Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of tim

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Christian König
I don't think that this will be necessary or desired. See, the job should be cleaned up as soon as possible after it is finished; otherwise we won't cancel the timeout quickly enough either. Christian. On 26.08.21 at 22:14, Andrey Grodzovsky wrote: Attached quick patch for per-job TTL calcul

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Andrey Grodzovsky
Attached quick patch for per-job TTL calculation to make the next timer expiration more precise. It's on top of the patch in this thread. Let me know if this makes sense. Andrey On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote: On 2021-08-26 12:55 a.m., Monk Liu wrote: issue: in cleanup_job t

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Andrey Grodzovsky
On 2021-08-26 12:55 a.m., Monk Liu wrote: issue: in cleanup_job the cancel_delayed_work will cancel a TO timer even if its corresponding job is still running. fix: do not cancel the timer in cleanup_job, instead do the cancelling only when the heading job is signaled, and if there is a "next"

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Daniel Vetter
Monk Liu | Cloud-GPU Core team > > -- > > > > -----Original Message----- > > From: Christian König > > Sent: Thursday, August 26, 2021 6:09 PM > > To: Liu, Monk ; amd-gfx@lists.freedesktop.org > > Cc: dri-de...@li

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Christian König
eedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) On 26.08.21 at 06:55, Monk Liu wrote: issue: in cleanup_job the cancel_delayed_work will cancel a TO timer even if its corresponding job is still running. Yeah, that makes a lot more sense. fix: do not cancel

RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Liu, Monk
-- -Original Message- From: Christian König Sent: Thursday, August 26, 2021 6:09 PM To: Liu, Monk ; amd-gfx@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) On 26.08.21 at 06:5

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Christian König
On 26.08.21 at 06:55, Monk Liu wrote: issue: in cleanup_job the cancel_delayed_work will cancel a TO timer even if its corresponding job is still running. Yeah, that makes a lot more sense. fix: do not cancel the timer in cleanup_job, instead do the cancelling only when the heading job is

Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Andrey Grodzovsky
ginal Message- From: Grodzovsky, Andrey Sent: Thursday, August 26, 2021 11:05 AM To: Liu, Monk ; Christian König ; amd-gfx@lists.freedesktop.org; dri-devel Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) On 2021-08-25 10:31 p.m., Liu, Monk wrote: [AMD Official Use On

RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Liu, Monk
--- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Grodzovsky, Andrey Sent: Thursday, August 26, 2021 11:05 AM To: Liu, Monk ; Christian König ; amd-gfx@lists.freedesktop.org; dri-devel Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v

Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Andrey Grodzovsky
Thursday, August 26, 2021 2:20 AM To: Christian König ; Liu, Monk ; amd-gfx@lists.freedesktop.org; dri-devel Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) On 2021-08-25 8:11 a.m., Christian König wrote: No, this would break that logic here. See drm_sched_start_timeo

RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Liu, Monk
M To: Christian König ; Liu, Monk ; amd-gfx@lists.freedesktop.org; dri-devel Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) On 2021-08-25 8:11 a.m., Christian König wrote: > No, this would break that logic here. > > See drm_sched_start_timeout() can be called mult

RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Liu, Monk
Liu | Cloud-GPU Core team -- -----Original Message----- From: Christian König Sent: Wednesday, August 25, 2021 8:11 PM To: Liu, Monk ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) No, th

Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Andrey Grodzovsky
-- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Christian König Sent: Wednesday, August 25, 2021 2:32 PM To: Liu, Monk ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) Well NAK to that approach. Fir

Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Alex Deucher
Please cc dri-devel on all scheduler patches. It's core functionality. Alex On Wed, Aug 25, 2021 at 12:14 AM Monk Liu wrote: > > the original logic is wrong in that the timeout will not be retriggered > after the previous job signaled, and that leads to the scenario that all > jobs in the same sched

Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Christian König
Core team -- -Original Message- From: Liu, Monk Sent: Wednesday, August 25, 2021 7:55 PM To: 'Christian König' ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH] drm/sched: fix the bug of time out calculation(v2) [AMD Official Use Only] The timeout started by que

RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Liu, Monk
y Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Liu, Monk Sent: Wednesday, August 25, 2021 7:55 PM To: 'Christian König' ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH] drm/sched: fix the bug of time out calculatio

RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Liu, Monk
-- -Original Message- From: Christian König Sent: Wednesday, August 25, 2021 2:32 PM To: Liu, Monk ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) Well NAK to that approach. First of all your bug analysis

Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-24 Thread Christian König
Well NAK to that approach. First of all your bug analysis is incorrect. The timeout started by queue_delayed_work() in drm_sched_start_timeout() is paired with the cancel_delayed_work() in drm_sched_get_cleanup_job(). So you must have something else going on here. Then please don't use mod_de

RE: [PATCH] drm/sched: fix the bug of time out calculation

2021-08-24 Thread Liu, Monk
u | Cloud-GPU Core team -- -Original Message- From: Grodzovsky, Andrey Sent: Tuesday, August 24, 2021 11:02 PM To: Liu, Monk ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation On 2021-08-24 10:46 a.m.,

Re: [PATCH] drm/sched: fix the bug of time out calculation

2021-08-24 Thread Andrey Grodzovsky
On 2021-08-24 5:51 a.m., Monk Liu wrote: the original logic is wrong in that the timeout will not be retriggered after the previous job signaled, and that leads to the scenario that all jobs in the same scheduler share the same timeout timer from the very beginning job in this scheduler which is wro

Re: [PATCH] drm/sched: fix the bug of time out calculation

2021-08-24 Thread Andrey Grodzovsky
On 2021-08-24 10:46 a.m., Andrey Grodzovsky wrote: On 2021-08-24 5:51 a.m., Monk Liu wrote: the original logic is wrong in that the timeout will not be retriggered after the previous job signaled, and that leads to the scenario that all jobs in the same scheduler share the same timeout timer from