Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jani Nikula
On Mon, 16 Aug 2021, Jackie Liu  wrote:
> From: Jackie Liu 
>
> When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
> compilation will fail. we need make that dependency explicit.

What's the failure mode? Using select here is a bad idea.

BR,
Jani.

>
> Reported-by: k2ci 
> Signed-off-by: Jackie Liu 
> ---
>  drivers/gpu/drm/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 7ff89690a976..346a518b5119 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>  config DRM_FBDEV_EMULATION
>   bool "Enable legacy fbdev support for your modesetting driver"
>   depends on DRM
> - depends on FB
> + select FB
>   select DRM_KMS_HELPER
>   select FB_CFB_FILLRECT
>   select FB_CFB_COPYAREA

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH v2 0/3] drm/nouveau: fix a use-after-free in postclose()

2021-08-16 Thread Salvatore Bonaccorso
Hi,

On Fri, Mar 26, 2021 at 06:00:51PM -0400, Lyude Paul wrote:
> This patch series is:
> 
> Reviewed-by: Lyude Paul 
> 
> Btw - in the future if you need to send a respin of multiple patches, you need
> to send it as its own separate series instead of replying to the previous one
> (one-off respins can just be posted as replies though), otherwise patchwork
> won't pick it up

Did this patch series somehow fall through the cracks or get lost?

Regards,
Salvatore


Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jackie Liu

Hi Jani.

My CI reported a randconfig build failure. The errors are:

drm_fb_helper.c:(.text+0x302): undefined reference to `fb_set_suspend'
drm_fb_helper.c:(.text+0xaea): undefined reference to `register_framebuffer'
drm_fb_helper.c:(.text+0x1dcc): undefined reference to `framebuffer_alloc'
ld: drm_fb_helper.c:(.text+0x1dea): undefined reference to `fb_alloc_cmap'
ld: drm_fb_helper.c:(.text+0x1e2f): undefined reference to `fb_dealloc_cmap'
ld: drm_fb_helper.c:(.text+0x1e5b): undefined reference to `framebuffer_release'
drm_fb_helper.c:(.text+0x1e85): undefined reference to `unregister_framebuffer'
drm_fb_helper.c:(.text+0x1ee9): undefined reference to `fb_dealloc_cmap'
ld: drm_fb_helper.c:(.text+0x1ef0): undefined reference to `framebuffer_release'
drm_fb_helper.c:(.text+0x1f96): undefined reference to `fb_deferred_io_cleanup'
drm_fb_helper.c:(.text+0x203b): undefined reference to `fb_sys_read'
drm_fb_helper.c:(.text+0x2051): undefined reference to `fb_sys_write'
drm_fb_helper.c:(.text+0x208d): undefined reference to `sys_fillrect'
drm_fb_helper.c:(.text+0x20bb): undefined reference to `sys_copyarea'
drm_fb_helper.c:(.text+0x20e9): undefined reference to `sys_imageblit'
drm_fb_helper.c:(.text+0x2117): undefined reference to `cfb_fillrect'
drm_fb_helper.c:(.text+0x2172): undefined reference to `cfb_copyarea'
drm_fb_helper.c:(.text+0x21cd): undefined reference to `cfb_imageblit'
drm_fb_helper.c:(.text+0x2233): undefined reference to `fb_set_suspend'
drm_fb_helper.c:(.text+0x22b0): undefined reference to `fb_set_suspend'
drm_fb_helper.c:(.text+0x250f): undefined reference to `fb_deferred_io_init'

The main reason is that DRM_FBDEV_EMULATION is built-in, while
CONFIG_FB is compiled as a module.

--
Jackie Liu





Re: [PATCH] drm/radeon: Add break to switch statement in radeonfb_create_pinned_object()

2021-08-16 Thread Christian König

On 15.08.21 at 21:29, Nathan Chancellor wrote:
> Clang + -Wimplicit-fallthrough warns:
>
> drivers/gpu/drm/radeon/radeon_fb.c:170:2: warning: unannotated
> fall-through between switch labels [-Wimplicit-fallthrough]
>  default:
>  ^
> drivers/gpu/drm/radeon/radeon_fb.c:170:2: note: insert 'break;' to avoid
> fall-through
>  default:
>  ^
>  break;
> 1 warning generated.
>
> Clang's version of this warning is a little bit more pedantic than
> GCC's. Add the missing break to satisfy it to match what has been done
> all over the kernel tree.
>
> Signed-off-by: Nathan Chancellor 

Reviewed-by: Christian König 

> ---
>  drivers/gpu/drm/radeon/radeon_fb.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_fb.c b/drivers/gpu/drm/radeon/radeon_fb.c
> index 0b206b052972..c8b545181497 100644
> --- a/drivers/gpu/drm/radeon/radeon_fb.c
> +++ b/drivers/gpu/drm/radeon/radeon_fb.c
> @@ -167,6 +167,7 @@ static int radeonfb_create_pinned_object(struct radeon_fbdev *rfbdev,
>  		break;
>  	case 2:
>  		tiling_flags |= RADEON_TILING_SWAP_16BIT;
> +		break;
>  	default:
>  		break;
>  	}
>
> base-commit: ba31f97d43be41ca99ab72a6131d7c226306865f
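
For readers unfamiliar with the warning, here is a minimal, self-contained
sketch of the pattern the patch addresses. The function name swap_flags and
the SWAP_* values are hypothetical and chosen only for illustration; they are
not the radeon definitions, and the "case 4" arm is assumed rather than taken
from the quoted hunk.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only (not the radeon ones). */
#define SWAP_32BIT 0x1u
#define SWAP_16BIT 0x2u

/* Map a bytes-per-pixel value to a set of swap flags. */
unsigned int swap_flags(int bytes_per_pixel)
{
	unsigned int flags = 0;

	assert(bytes_per_pixel > 0);

	switch (bytes_per_pixel) {
	case 4:
		flags |= SWAP_32BIT;
		break;
	case 2:
		flags |= SWAP_16BIT;
		break;	/* explicit: do not fall through into default */
	default:
		break;
	}

	return flags;
}
```

In this sketch the default label only breaks, so the extra break after
"case 2" does not change what the function returns. It only makes the end of
that case explicit, which is what Clang's -Wimplicit-fallthrough asks for.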




Re: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks

2021-08-16 Thread Christian König

On 12.08.21 at 10:11, Michel Dänzer wrote:
> On 2021-08-12 7:55 a.m., Koenig, Christian wrote:
>> Hi James,
>>
>> Evan seems to have understood how this all works together.
>>
>> See, while any begin/end use critical section is active, the work should
>> not be active.
>>
>> When you handle only one ring you can just call cancel in begin use and
>> schedule in end use. But when you have more than one ring you need a lock
>> or counter to prevent concurrent work items from being started.
>>
>> Michel's idea to use mod_delayed_work is a bad one because it assumes that
>> the delayed work is still running.
>
> It merely assumes that the work may already have been scheduled before.
>
> Admittedly, I missed the cancel_delayed_work_sync calls for patch 2. While I
> think it can still have some effect when there's a single work item for
> multiple rings, as described by James, it's probably negligible, since
> presumably the time intervals between ring_begin_use and ring_end_use are
> normally much shorter than a second.
>
> So, while patch 2 is at worst a no-op (since mod_delayed_work is the same as
> schedule_delayed_work if the work hasn't been scheduled yet), I'm fine with
> dropping it.

Yeah, I think that would be much better.

>> Something similar applies to the first patch I think,
>
> There are no cancel work calls in that case, so the commit log is accurate
> TTBOMK.
>
> I noticed this because current mutter Git main wasn't able to sustain 60 fps on
> Navi 14 with a simple glxgears -fullscreen. mutter was dropping frames because
> its CPU work for a frame update occasionally took up to 3 ms, instead of the
> normal 2-300 microseconds. sysprof showed a lot of cycles spent in the
> functions which enable/disable GFXOFF in the HW.
>
>> so when this makes a difference it is actually a bug.
>
> There was certainly a bug though, which patch 1 fixes. :)

Agreed, I just wanted to note that this is most likely not the right
solution, since Alex was already picking it up.


Going to reply separately on the new patch as well.

Regards,
Christian.
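
The difference between schedule_delayed_work and mod_delayed_work that this
exchange turns on can be sketched in plain userspace C. This is only a model
under stated assumptions: struct fake_delayed_work and the fake_* helpers are
hypothetical stand-ins that track a pending flag and an expiry time. They are
not the kernel workqueue API, and they ignore locking, actual execution of the
work, and the _sync variants.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a delayed work item; not the kernel structure. */
struct fake_delayed_work {
	bool pending;		/* queued and not yet run */
	unsigned long expires;	/* time (ms) at which the work would run */
};

/* Models schedule_delayed_work(): a no-op if the work is already pending. */
bool fake_schedule(struct fake_delayed_work *w, unsigned long now,
		   unsigned long delay)
{
	assert(w);
	if (w->pending)
		return false;	/* expiry is NOT pushed back */
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/* Models mod_delayed_work(): always re-arms; returns true if it was pending. */
bool fake_mod(struct fake_delayed_work *w, unsigned long now,
	      unsigned long delay)
{
	bool was_pending;

	assert(w);
	was_pending = w->pending;
	w->pending = true;
	w->expires = now + delay;
	return was_pending;
}

/* Models cancel_delayed_work(): returns true if a pending item was dropped. */
bool fake_cancel(struct fake_delayed_work *w)
{
	bool was_pending;

	assert(w);
	was_pending = w->pending;
	w->pending = false;
	return was_pending;
}
```

With this model, scheduling at t=0 with a 100 ms delay and scheduling again at
t=50 leaves the expiry at 100, whereas fake_mod at t=50 moves it to 150.
Cancelling first and then scheduling also yields a fresh expiry.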


Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Christian König

On 13.08.21 at 12:29, Michel Dänzer wrote:
> From: Michel Dänzer 
>
> schedule_delayed_work does not push back the work if it was already
> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
> was disabled and re-enabled again during those 100 ms.
>
> This resulted in frame drops / stutter with the upcoming mutter 41
> release on Navi 14, due to constantly enabling GFXOFF in the HW and
> disabling it again (for getting the GPU clock counter).
>
> To fix this, call cancel_delayed_work_sync when GFXOFF transitions from
> enabled to disabled. This makes sure the delayed work will be scheduled
> as intended in the reverse case.
>
> In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs
> to use mutex_trylock instead of mutex_lock.
>
> v2:
> * Use cancel_delayed_work_sync & mutex_trylock instead of
>   mod_delayed_work.

While this may work, it still smells a little bit fishy.

In general you have two common locking orders around work items, either
lock->work or work->lock. If you mix these as lock->work->lock, like here,
trouble is usually imminent.

I think what we should do instead is to double check whether taking the lock
inside the work item is necessary, and instead make sure that the work
is canceled synchronously when we don't want it to run. In other words,
fully switch to the lock->work approach.

But please note that these are just high-level design thoughts, I don't
really know the details of the gfx_off code at all. It could even be that
we need two locks, one outside and one inside of the work item.

Regards,
Christian.

> Signed-off-by: Michel Dänzer 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 13 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h    |  3 +++
>  3 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f3fd5ec710b6..8b025f70706c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2777,7 +2777,16 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>  	struct amdgpu_device *adev =
>  		container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work);
>  
> -	mutex_lock(&adev->gfx.gfx_off_mutex);
> +	/* mutex_lock could deadlock with cancel_delayed_work_sync in amdgpu_gfx_off_ctrl. */
> +	if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) {
> +		/* If there's a bug which causes amdgpu_gfx_off_ctrl to be called with enable=true
> +		 * when adev->gfx.gfx_off_req_count is already 0, we might race with that.
> +		 * Re-schedule to make sure gfx off will be re-enabled in the HW eventually.
> +		 */
> +		schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE);
> +		return;
> +	}
> +
>  	if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>  		if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
>  			adev->gfx.gfx_off_state = true;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..da4c46db3093 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -28,9 +28,6 @@
>  #include "amdgpu_rlc.h"
>  #include "amdgpu_ras.h"
>  
> -/* delay 0.1 second to enable gfx off feature */
> -#define GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100)
> -
>  /*
>   * GPU GFX IP block helpers function.
>   */
> @@ -569,9 +566,13 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
>  		adev->gfx.gfx_off_req_count--;
>  
>  	if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
> -		schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
> -	} else if (!enable && adev->gfx.gfx_off_state) {
> -		if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
> +		schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE);
> +	} else if (!enable) {
> +		if (adev->gfx.gfx_off_req_count == 1 && !adev->gfx.gfx_off_state)
> +			cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
> +
> +		if (adev->gfx.gfx_off_state &&
> +		    !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
>  			adev->gfx.gfx_off_state = false;
>  
>  			if (adev->gfx.funcs->init_spm_golden) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> index d43fe2ed8116..dcdb505bb7f4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> @@ -32,6 +32,9 @@
>  #include "amdgpu_rlc.h"
>  #include "soc15.h"
>  
> +/* delay 0.1 second t

Re: [PATCH 1/1] drm: ttm: Don't bail from ttm_global_init if debugfs_create_dir fails

2021-08-16 Thread Christian König

On 10.08.21 at 21:59, Dan Moulding wrote:
> In 69de4421bb4c ("drm/ttm: Initialize debugfs from
> ttm_global_init()"), ttm_global_init was changed so that if creation
> of the debugfs global root directory fails, ttm_global_init will bail
> out early and return an error, leading to initialization failure of
> DRM drivers. However, not every system will be using debugfs. On such
> a system, debugfs directory creation can be expected to fail, but DRM
> drivers must still be usable. This changes it so that if creation of
> TTM's debugfs root directory fails, then no biggie: keep calm and
> carry on.
>
> Fixes: 69de4421bb4c ("drm/ttm: Initialize debugfs from ttm_global_init()")
> Signed-off-by: Dan Moulding 

Good point, patch is Reviewed-by: Christian König.

Going to pick that up later today.

Regards,
Christian.

> ---
>  drivers/gpu/drm/ttm/ttm_device.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index 74e3b460132b..2df59b3c2ea1 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -78,9 +78,7 @@ static int ttm_global_init(void)
>  
>  	ttm_debugfs_root = debugfs_create_dir("ttm", NULL);
>  	if (IS_ERR(ttm_debugfs_root)) {
> -		ret = PTR_ERR(ttm_debugfs_root);
>  		ttm_debugfs_root = NULL;
> -		goto out;
>  	}
>  
>  	/* Limit the number of pages in the pool to about 50% of the total




Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jackie Liu

Commit f611b1e7624c changed "select FB" to "depends on FB".

How about this:

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 7ff89690a976..cd129d96e649 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
 config DRM_FBDEV_EMULATION
bool "Enable legacy fbdev support for your modesetting driver"
depends on DRM
-   depends on FB
+   depends on FB && FB != m
select DRM_KMS_HELPER
select FB_CFB_FILLRECT
select FB_CFB_COPYAREA

--
Jackie Liu





Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jani Nikula
On Mon, 16 Aug 2021, Jackie Liu  wrote:
> Hi Jani.
>
> My CI report an randconfigs build failed. there are:
>
> drm_fb_helper.c:(.text+0x302): undefined reference to `fb_set_suspend'
> drm_fb_helper.c:(.text+0xaea): undefined reference to `register_framebuffer'
> drm_fb_helper.c:(.text+0x1dcc): undefined reference to `framebuffer_alloc'
> ld: drm_fb_helper.c:(.text+0x1dea): undefined reference to `fb_alloc_cmap'
> ld: drm_fb_helper.c:(.text+0x1e2f): undefined reference to `fb_dealloc_cmap'
> ld: drm_fb_helper.c:(.text+0x1e5b): undefined reference to 
> `framebuffer_release'
> drm_fb_helper.c:(.text+0x1e85): undefined reference to 
> `unregister_framebuffer'
> drm_fb_helper.c:(.text+0x1ee9): undefined reference to `fb_dealloc_cmap'
> ld: drm_fb_helper.c:(.text+0x1ef0): undefined reference to 
> `framebuffer_release'
> drm_fb_helper.c:(.text+0x1f96): undefined reference to 
> `fb_deferred_io_cleanup'
> drm_fb_helper.c:(.text+0x203b): undefined reference to `fb_sys_read'
> drm_fb_helper.c:(.text+0x2051): undefined reference to `fb_sys_write'
> drm_fb_helper.c:(.text+0x208d): undefined reference to `sys_fillrect'
> drm_fb_helper.c:(.text+0x20bb): undefined reference to `sys_copyarea'
> drm_fb_helper.c:(.text+0x20e9): undefined reference to `sys_imageblit'
> drm_fb_helper.c:(.text+0x2117): undefined reference to `cfb_fillrect'
> drm_fb_helper.c:(.text+0x2172): undefined reference to `cfb_copyarea'
> drm_fb_helper.c:(.text+0x21cd): undefined reference to `cfb_imageblit'
> drm_fb_helper.c:(.text+0x2233): undefined reference to `fb_set_suspend'
> drm_fb_helper.c:(.text+0x22b0): undefined reference to `fb_set_suspend'
> drm_fb_helper.c:(.text+0x250f): undefined reference to `fb_deferred_io_init'
>
> The main reason is because DRM_FBDEV_EMULATION is built-in, and
> CONFIG_FB is compiled as a module.

DRM_FBDEV_EMULATION is not a module, it's just a config
knob. drm_fb_helper.ko is the module, enabled via DRM_KMS_HELPER, and it
has an implicit dependency on FB, and DRM_FBDEV_EMULATION selects
DRM_KMS_HELPER. Select just breaks dependencies in all kinds of ways.

This might help in config DRM_KMS_HELPER, and it might help the reader
because it's factual:

depends on FB if DRM_FBDEV_EMULATION


BR,
Jani.





>
> --
> Jackie Liu
>
> On 2021/8/16 at 3:01 PM, Jani Nikula wrote:
>> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>>> From: Jackie Liu 
>>>
>>> When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
>>> compilation will fail. we need make that dependency explicit.
>> 
>> What's the failure mode? Using select here is a bad idea.
>> 
>> BR,
>> Jani.
>> 
>>>
>>> Reported-by: k2ci 
>>> Signed-off-by: Jackie Liu 
>>> ---
>>>   drivers/gpu/drm/Kconfig | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>> index 7ff89690a976..346a518b5119 100644
>>> --- a/drivers/gpu/drm/Kconfig
>>> +++ b/drivers/gpu/drm/Kconfig
>>> @@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>>>   config DRM_FBDEV_EMULATION
>>> bool "Enable legacy fbdev support for your modesetting driver"
>>> depends on DRM
>>> -   depends on FB
>>> +   select FB
>>> select DRM_KMS_HELPER
>>> select FB_CFB_FILLRECT
>>> select FB_CFB_COPYAREA
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jani Nikula
On Mon, 16 Aug 2021, Jackie Liu  wrote:
> After commit f611b1e7624c, we change select FB
> to depends on FB.

And obviously you should cite the commit in the original patch and Cc
the author!

BR,
Jani.

>
> How about this:
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 7ff89690a976..cd129d96e649 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>   config DRM_FBDEV_EMULATION
>  bool "Enable legacy fbdev support for your modesetting driver"
>  depends on DRM
> -   depends on FB
> +   depends on FB && FB != m
>  select DRM_KMS_HELPER
>  select FB_CFB_FILLRECT
>  select FB_CFB_COPYAREA
>
> --
> Jackie Liu
>
> On 2021/8/16 at 3:01 PM, Jani Nikula wrote:
>> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>>> From: Jackie Liu 
>>>
>>> When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
>>> compilation will fail. we need make that dependency explicit.
>> 
>> What's the failure mode? Using select here is a bad idea.
>> 
>> BR,
>> Jani.
>> 
>>>
>>> Reported-by: k2ci 
>>> Signed-off-by: Jackie Liu 
>>> ---
>>>   drivers/gpu/drm/Kconfig | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>> index 7ff89690a976..346a518b5119 100644
>>> --- a/drivers/gpu/drm/Kconfig
>>> +++ b/drivers/gpu/drm/Kconfig
>>> @@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>>>   config DRM_FBDEV_EMULATION
>>> bool "Enable legacy fbdev support for your modesetting driver"
>>> depends on DRM
>>> -   depends on FB
>>> +   select FB
>>> select DRM_KMS_HELPER
>>> select FB_CFB_FILLRECT
>>> select FB_CFB_COPYAREA
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jackie Liu

Hi, Jani.

Thanks, I will send a v2 right away and Cc the author.

--
Jackie Liu







Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jackie Liu

Hi Jani.

Is this what you are suggesting?

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 7ff89690a976..ba179a539497 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -77,6 +77,7 @@ config DRM_DEBUG_SELFTEST
 config DRM_KMS_HELPER
tristate
depends on DRM
+   depends on FB if DRM_FBDEV_EMULATION
help
  CRTC helpers for KMS drivers.


But it has a syntax error.

--
Thanks, BR, Jackie Liu







[PATCH] drm/i915: Ditch the i915_gem_ww_ctx loop member

2021-08-16 Thread Thomas Hellström
It's only used by the for_i915_gem_ww() macro and we can use
the (typically) on-stack _err variable in its place.

We initially set the _err variable to -EDEADLK to enter the loop, but
clear it with fetch_and_zero() before actually entering, so that empty
loops, or code that never sets the _err variable, do not run forever.

Suggested-by: Maarten Lankhorst 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/i915_gem_ww.h | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h b/drivers/gpu/drm/i915/i915_gem_ww.h
index f6b1a796667b..98348b1e6182 100644
--- a/drivers/gpu/drm/i915/i915_gem_ww.h
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -7,12 +7,13 @@
 
 #include 
 
+#include "i915_utils.h"
+
 struct i915_gem_ww_ctx {
struct ww_acquire_ctx ctx;
struct list_head obj_list;
struct drm_i915_gem_object *contended;
-   unsigned short intr;
-   unsigned short loop;
+   bool intr;
 };
 
 void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
@@ -23,28 +24,20 @@ void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
 /* Internal functions used by the inlines! Don't use. */
 static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
 {
-   ww->loop = 0;
if (err == -EDEADLK) {
err = i915_gem_ww_ctx_backoff(ww);
if (!err)
-   ww->loop = 1;
+   err = -EDEADLK;
}
 
-   if (!ww->loop)
+   if (err != -EDEADLK)
i915_gem_ww_ctx_fini(ww);
 
return err;
 }
 
-static inline void
-__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr)
-{
-   i915_gem_ww_ctx_init(ww, intr);
-   ww->loop = 1;
-}
-
-#define for_i915_gem_ww(_ww, _err, _intr)  \
-   for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;   \
+#define for_i915_gem_ww(_ww, _err, _intr)\
+   for (i915_gem_ww_ctx_init(_ww, _intr), (_err) = -EDEADLK; \
+fetch_and_zero(&_err) == -EDEADLK;   \
 _err = __i915_gem_ww_fini(_ww, _err))
-
 #endif
-- 
2.31.1



[PATCH v2] drm/fb: Fix randconfig builds

2021-08-16 Thread Jackie Liu
From: Jackie Liu 

When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
compilation will fail. We need to make that dependency explicit.

Fixes: f611b1e7624c ("drm: Avoid circular dependencies for CONFIG_FB")
Reported-by: k2ci 
Signed-off-by: Jackie Liu 
---
 drivers/gpu/drm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 7ff89690a976..cd129d96e649 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
 config DRM_FBDEV_EMULATION
bool "Enable legacy fbdev support for your modesetting driver"
depends on DRM
-   depends on FB
+   depends on FB && FB != m
select DRM_KMS_HELPER
select FB_CFB_FILLRECT
select FB_CFB_COPYAREA
-- 
2.25.1



Re: [PATCH v2] drm: avoid races with modesetting rights

2021-08-16 Thread Desmond Cheong Zhi Xi

On 16/8/21 2:47 am, kernel test robot wrote:
> Hi Desmond,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on next-20210813]
> [also build test ERROR on v5.14-rc5]
> [cannot apply to linus/master v5.14-rc5 v5.14-rc4 v5.14-rc3]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:    https://github.com/0day-ci/linux/commits/Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
> base:   4b358aabb93a2c654cd1dcab1a25a589f6e2b153
> config: i386-randconfig-a004-20210815 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> reproduce (this is a W=1 build):
>         # https://github.com/0day-ci/linux/commit/cf6d8354b7d7953cd866fad004cbb189adfa074f
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
>         git checkout cf6d8354b7d7953cd866fad004cbb189adfa074f
>         # save the attached .config to linux build tree
>         make W=1 ARCH=i386
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
> All errors (new ones prefixed by >>, old ones prefixed by <<):
>
> ERROR: modpost: "task_work_add" [drivers/gpu/drm/drm.ko] undefined!

I'm a bit uncertain about this. Looking into the .config used, this
error seems to happen because task_work_add isn't an exported symbol,
but DRM is being compiled as a loadable kernel module (CONFIG_DRM=m).

One way to deal with this is to export the symbol, but there was a
proposed patch to do this a few months back that wasn't picked up [1],
so I'm not sure what to make of this.

I'll export the symbol as part of a v3 series, and check in with the
task-work maintainers.

Link: https://lore.kernel.org/lkml/20210127150029.13766-3-josh...@samsung.com/ [1]

> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org





Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jani Nikula
On Mon, 16 Aug 2021, Jackie Liu  wrote:
> Hi Jani.
>
> Your suggestion is that?
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 7ff89690a976..ba179a539497 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -77,6 +77,7 @@ config DRM_DEBUG_SELFTEST
>   config DRM_KMS_HELPER
>  tristate
>  depends on DRM
> +   depends on FB if DRM_FBDEV_EMULATION
>  help
>CRTC helpers for KMS drivers.
>
>
> But it has a syntax error.

Ah, try this then:

depends on FB || FB=n
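
To make the dependency expressions in this thread easier to compare, here is a
small userspace C sketch of how Kconfig evaluates tristate expressions: with
n < m < y, "&&" is the minimum, "||" is the maximum, and "=" / "!=" yield y or
n. All names below (enum tristate, tri_*, dep_*, bool_may_be_y) are
hypothetical helpers written for this illustration, not part of any kernel
tool. The sketch only evaluates "depends on" expressions; it does not model
"select", which can force a symbol on regardless of its dependencies.

```c
#include <assert.h>

/* Kconfig tristate values, ordered so that n < m < y. */
enum tristate { N = 0, M = 1, Y = 2 };

static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;	/* "&&" is the minimum */
}

static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;	/* "||" is the maximum */
}

static enum tristate tri_eq(enum tristate a, enum tristate b)
{
	return a == b ? Y : N;	/* "=" yields y or n */
}

static enum tristate tri_ne(enum tristate a, enum tristate b)
{
	return a != b ? Y : N;	/* "!=" yields y or n */
}

/* "depends on FB": the dependency value is simply FB. */
enum tristate dep_fb(enum tristate fb)
{
	assert(fb >= N && fb <= Y);
	return fb;
}

/* "depends on FB && FB != m" (the v2 proposal for DRM_FBDEV_EMULATION). */
enum tristate dep_fb_not_module(enum tristate fb)
{
	assert(fb >= N && fb <= Y);
	return tri_and(fb, tri_ne(fb, M));
}

/* "depends on FB || FB=n" (the suggestion for DRM_KMS_HELPER). */
enum tristate dep_fb_or_disabled(enum tristate fb)
{
	assert(fb >= N && fb <= Y);
	return tri_or(fb, tri_eq(fb, N));
}

/* A bool symbol may be set to y whenever its dependency is not n. */
int bool_may_be_y(enum tristate dep)
{
	return dep != N;
}
```

Under this model, plain "depends on FB" with FB=m still lets a bool symbol
such as DRM_FBDEV_EMULATION be y, which matches the reported failure.
"FB && FB != m" evaluates to n for FB=m, and "FB || FB=n" evaluates to m for
FB=m, y for FB=y, and y for FB=n, so a tristate symbol with that dependency is
capped at m only when FB itself is a module.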

>
> --
> Thanks, BR, Jackie Liu
>
> On 2021/8/16 at 4:33 PM, Jani Nikula wrote:
>> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>>> Hi Jani.
>>>
>>> My CI report an randconfigs build failed. there are:
>>>
>>> drm_fb_helper.c:(.text+0x302): undefined reference to `fb_set_suspend'
>>> drm_fb_helper.c:(.text+0xaea): undefined reference to `register_framebuffer'
>>> drm_fb_helper.c:(.text+0x1dcc): undefined reference to `framebuffer_alloc'
>>> ld: drm_fb_helper.c:(.text+0x1dea): undefined reference to `fb_alloc_cmap'
>>> ld: drm_fb_helper.c:(.text+0x1e2f): undefined reference to `fb_dealloc_cmap'
>>> ld: drm_fb_helper.c:(.text+0x1e5b): undefined reference to
>>> `framebuffer_release'
>>> drm_fb_helper.c:(.text+0x1e85): undefined reference to
>>> `unregister_framebuffer'
>>> drm_fb_helper.c:(.text+0x1ee9): undefined reference to `fb_dealloc_cmap'
>>> ld: drm_fb_helper.c:(.text+0x1ef0): undefined reference to
>>> `framebuffer_release'
>>> drm_fb_helper.c:(.text+0x1f96): undefined reference to
>>> `fb_deferred_io_cleanup'
>>> drm_fb_helper.c:(.text+0x203b): undefined reference to `fb_sys_read'
>>> drm_fb_helper.c:(.text+0x2051): undefined reference to `fb_sys_write'
>>> drm_fb_helper.c:(.text+0x208d): undefined reference to `sys_fillrect'
>>> drm_fb_helper.c:(.text+0x20bb): undefined reference to `sys_copyarea'
>>> drm_fb_helper.c:(.text+0x20e9): undefined reference to `sys_imageblit'
>>> drm_fb_helper.c:(.text+0x2117): undefined reference to `cfb_fillrect'
>>> drm_fb_helper.c:(.text+0x2172): undefined reference to `cfb_copyarea'
>>> drm_fb_helper.c:(.text+0x21cd): undefined reference to `cfb_imageblit'
>>> drm_fb_helper.c:(.text+0x2233): undefined reference to `fb_set_suspend'
>>> drm_fb_helper.c:(.text+0x22b0): undefined reference to `fb_set_suspend'
>>> drm_fb_helper.c:(.text+0x250f): undefined reference to `fb_deferred_io_init'
>>>
>>> The main reason is because DRM_FBDEV_EMULATION is built-in, and
>>> CONFIG_FB is compiled as a module.
>> 
>> DRM_FBDEV_EMULATION is not a module, it's just a config
>> knob. drm_fb_helper.ko is the module, enabled via DRM_KMS_HELPER, and it
>> has an implicit dependency on FB, and DRM_FBDEV_EMULATION selects
>> DRM_KMS_HELPER. Select just breaks dependencies in all kinds of ways.
>> 
>> This might help in config DRM_KMS_HELPER, and it might help the reader
>> because it's factual:
>> 
>>  depends on FB if DRM_FBDEV_EMULATION
>> 
>> 
>> BR,
>> Jani.
>> 
>> 
>> 
>> 
>> 
>>>
>>> --
>>> Jackie Liu
>>>
>>> On 2021/8/16 at 3:01 PM, Jani Nikula wrote:
>>>> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>>>>> From: Jackie Liu 
>>>>>
>>>>> When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
>>>>> compilation will fail. we need make that dependency explicit.
>>>>
>>>> What's the failure mode? Using select here is a bad idea.
>>>>
>>>> BR,
>>>> Jani.
>>>>
>>>>>
>>>>> Reported-by: k2ci 
>>>>> Signed-off-by: Jackie Liu 
>>>>> ---
>>>>>  drivers/gpu/drm/Kconfig | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>>>> index 7ff89690a976..346a518b5119 100644
>>>>> --- a/drivers/gpu/drm/Kconfig
>>>>> +++ b/drivers/gpu/drm/Kconfig
>>>>> @@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>>>>>  config DRM_FBDEV_EMULATION
>>>>>   bool "Enable legacy fbdev support for your modesetting driver"
>>>>>   depends on DRM
>>>>> - depends on FB
>>>>> + select FB
>>>>>   select DRM_KMS_HELPER
>>>>>   select FB_CFB_FILLRECT
>>>>>   select FB_CFB_COPYAREA

>> 

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jani Nikula
On Mon, 16 Aug 2021, Jani Nikula  wrote:
> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>> Hi Jani.
>>
>> Your suggestion is that?
>>
>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>> index 7ff89690a976..ba179a539497 100644
>> --- a/drivers/gpu/drm/Kconfig
>> +++ b/drivers/gpu/drm/Kconfig
>> @@ -77,6 +77,7 @@ config DRM_DEBUG_SELFTEST
>>   config DRM_KMS_HELPER
>>  tristate
>>  depends on DRM
>> +   depends on FB if DRM_FBDEV_EMULATION
>>  help
>>CRTC helpers for KMS drivers.
>>
>>
>> But it has a syntax error.
>
> Ah, try this then:
>
>   depends on FB || FB=n

Or this monster:

depends on FB || DRM_FBDEV_EMULATION=n


>
>>
>> --
>> Thanks, BR, Jackie Liu
>>
>> On 2021/8/16 at 4:33 PM, Jani Nikula wrote:
>>> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>>>> Hi Jani.
>>>>
>>>> My CI report an randconfigs build failed. there are:
>>>>
>>>> drm_fb_helper.c:(.text+0x302): undefined reference to `fb_set_suspend'
>>>> drm_fb_helper.c:(.text+0xaea): undefined reference to
>>>> `register_framebuffer'
>>>> drm_fb_helper.c:(.text+0x1dcc): undefined reference to `framebuffer_alloc'
>>>> ld: drm_fb_helper.c:(.text+0x1dea): undefined reference to `fb_alloc_cmap'
>>>> ld: drm_fb_helper.c:(.text+0x1e2f): undefined reference to
>>>> `fb_dealloc_cmap'
>>>> ld: drm_fb_helper.c:(.text+0x1e5b): undefined reference to
>>>> `framebuffer_release'
>>>> drm_fb_helper.c:(.text+0x1e85): undefined reference to
>>>> `unregister_framebuffer'
>>>> drm_fb_helper.c:(.text+0x1ee9): undefined reference to `fb_dealloc_cmap'
>>>> ld: drm_fb_helper.c:(.text+0x1ef0): undefined reference to
>>>> `framebuffer_release'
>>>> drm_fb_helper.c:(.text+0x1f96): undefined reference to
>>>> `fb_deferred_io_cleanup'
>>>> drm_fb_helper.c:(.text+0x203b): undefined reference to `fb_sys_read'
>>>> drm_fb_helper.c:(.text+0x2051): undefined reference to `fb_sys_write'
>>>> drm_fb_helper.c:(.text+0x208d): undefined reference to `sys_fillrect'
>>>> drm_fb_helper.c:(.text+0x20bb): undefined reference to `sys_copyarea'
>>>> drm_fb_helper.c:(.text+0x20e9): undefined reference to `sys_imageblit'
>>>> drm_fb_helper.c:(.text+0x2117): undefined reference to `cfb_fillrect'
>>>> drm_fb_helper.c:(.text+0x2172): undefined reference to `cfb_copyarea'
>>>> drm_fb_helper.c:(.text+0x21cd): undefined reference to `cfb_imageblit'
>>>> drm_fb_helper.c:(.text+0x2233): undefined reference to `fb_set_suspend'
>>>> drm_fb_helper.c:(.text+0x22b0): undefined reference to `fb_set_suspend'
>>>> drm_fb_helper.c:(.text+0x250f): undefined reference to
>>>> `fb_deferred_io_init'
>>>>
>>>> The main reason is because DRM_FBDEV_EMULATION is built-in, and
>>>> CONFIG_FB is compiled as a module.
>>> 
>>> DRM_FBDEV_EMULATION is not a module, it's just a config
>>> knob. drm_fb_helper.ko is the module, enabled via DRM_KMS_HELPER, and it
>>> has an implicit dependency on FB, and DRM_FBDEV_EMULATION selects
>>> DRM_KMS_HELPER. Select just breaks dependencies in all kinds of ways.
>>> 
>>> This might help in config DRM_KMS_HELPER, and it might help the reader
>>> because it's factual:
>>> 
>>> depends on FB if DRM_FBDEV_EMULATION
>>> 
>>> 
>>> BR,
>>> Jani.
>>> 
>>> 
>>> 
>>> 
>>> 

>>>> --
>>>> Jackie Liu
>>>>
>>>> On 2021/8/16 at 3:01 PM, Jani Nikula wrote:
>>>>> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>>>>>> From: Jackie Liu 
>>>>>>
>>>>>> When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
>>>>>> compilation will fail. we need make that dependency explicit.
>>>>>
>>>>> What's the failure mode? Using select here is a bad idea.
>>>>>
>>>>> BR,
>>>>> Jani.
>>>>>
>>>>>>
>>>>>> Reported-by: k2ci 
>>>>>> Signed-off-by: Jackie Liu 
>>>>>> ---
>>>>>>  drivers/gpu/drm/Kconfig | 2 +-
>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>>>>> index 7ff89690a976..346a518b5119 100644
>>>>>> --- a/drivers/gpu/drm/Kconfig
>>>>>> +++ b/drivers/gpu/drm/Kconfig
>>>>>> @@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>>>>>>  config DRM_FBDEV_EMULATION
>>>>>>   bool "Enable legacy fbdev support for your modesetting driver"
>>>>>>   depends on DRM
>>>>>> - depends on FB
>>>>>> + select FB
>>>>>>   select DRM_KMS_HELPER
>>>>>>   select FB_CFB_FILLRECT
>>>>>>   select FB_CFB_COPYAREA
>>>>>
>>> 

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH v2] drm/fb: Fix randconfig builds

2021-08-16 Thread Jani Nikula
On Mon, 16 Aug 2021, Jackie Liu  wrote:
> From: Jackie Liu 
>
> When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
> compilation will fail. we need make that dependency explicit.
>
> Fixes: f611b1e7624c ("drm: Avoid circular dependencies for CONFIG_FB")
> Reported-by: k2ci 
> Signed-off-by: Jackie Liu 

Why do you send this while the discussion is still ongoing?

Now this *requires* FB=y even if it *could* be FB=m when
DRM_KMS_HELPER=m.
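
For readers less familiar with Kconfig's tristate logic, the difference
between the v2 expression and the earlier "FB || FB=n" suggestion can be
sketched in plain C. This is only a model, assuming the documented
evaluation rules (n < m < y, '&&' is the minimum, '||' is the maximum,
'=' and '!=' yield y or n). The enum and function names are made up for
illustration and are not kernel code.

```c
#include <stdio.h>

/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* '&&' is the minimum of both operands, '||' the maximum. */
static enum tristate tri_and(enum tristate a, enum tristate b) { return a < b ? a : b; }
static enum tristate tri_or(enum tristate a, enum tristate b) { return a > b ? a : b; }

/* '=' and '!=' compare symbol values and yield only y or n. */
static enum tristate tri_eq(enum tristate a, enum tristate b) { return a == b ? TRI_Y : TRI_N; }
static enum tristate tri_ne(enum tristate a, enum tristate b) { return a != b ? TRI_Y : TRI_N; }

/* Model of "depends on FB && FB != m" from the v2 patch. */
enum tristate dep_fb_and_not_m(enum tristate fb)
{
	return tri_and(fb, tri_ne(fb, TRI_M));
}

/* Model of "depends on FB || FB=n". */
enum tristate dep_fb_or_fb_n(enum tristate fb)
{
	return tri_or(fb, tri_eq(fb, TRI_N));
}

/* Print the upper limit each expression yields for every FB value. */
void print_dependency_table(void)
{
	static const char names[] = { 'n', 'm', 'y' };
	int fb;

	for (fb = TRI_N; fb <= TRI_Y; fb++)
		printf("FB=%c: FB && FB != m -> %c, FB || FB=n -> %c\n",
		       names[fb],
		       names[dep_fb_and_not_m((enum tristate)fb)],
		       names[dep_fb_or_fb_n((enum tristate)fb)]);
}
```

In this model FB=m makes the v2 expression evaluate to n, so the option is
only reachable with FB=y, while "FB || FB=n" evaluates to m and still
permits a modular build.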

BR,
Jani.

> ---
>  drivers/gpu/drm/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 7ff89690a976..cd129d96e649 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>  config DRM_FBDEV_EMULATION
>   bool "Enable legacy fbdev support for your modesetting driver"
>   depends on DRM
> - depends on FB
> + depends on FB && FB != m
>   select DRM_KMS_HELPER
>   select FB_CFB_FILLRECT
>   select FB_CFB_COPYAREA

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH v2] drm/fb: Fix randconfig builds

2021-08-16 Thread Jackie Liu




On 2021/8/16 at 4:58 PM, Jani Nikula wrote:
> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>> From: Jackie Liu 
>>
>> When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
>> compilation will fail. we need make that dependency explicit.
>>
>> Fixes: f611b1e7624c ("drm: Avoid circular dependencies for CONFIG_FB")
>> Reported-by: k2ci 
>> Signed-off-by: Jackie Liu 
>
> Why do you send this while the discussion is still ongoing?
>
> Now this *requires* FB=y even if it *could* be FB=m when
> DRM_KMS_HELPER=m.

Yes, you are right.

BR, Jackie

> BR,
> Jani.
>
>> ---
>>  drivers/gpu/drm/Kconfig | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>> index 7ff89690a976..cd129d96e649 100644
>> --- a/drivers/gpu/drm/Kconfig
>> +++ b/drivers/gpu/drm/Kconfig
>> @@ -98,7 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>>  config DRM_FBDEV_EMULATION
>>   bool "Enable legacy fbdev support for your modesetting driver"
>>   depends on DRM
>> - depends on FB
>> + depends on FB && FB != m
>>   select DRM_KMS_HELPER
>>   select FB_CFB_FILLRECT
>>   select FB_CFB_COPYAREA




Re: [PATCH v2] drm: avoid races with modesetting rights

2021-08-16 Thread Daniel Vetter
On Mon, Aug 16, 2021 at 10:53 AM Desmond Cheong Zhi Xi
 wrote:
> On 16/8/21 2:47 am, kernel test robot wrote:
> > Hi Desmond,
> >
> > Thank you for the patch! Yet something to improve:
> >
> > [auto build test ERROR on next-20210813]
> > [also build test ERROR on v5.14-rc5]
> > [cannot apply to linus/master v5.14-rc5 v5.14-rc4 v5.14-rc3]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch]
> >
> > url:
> > https://github.com/0day-ci/linux/commits/Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
> > base:4b358aabb93a2c654cd1dcab1a25a589f6e2b153
> > config: i386-randconfig-a004-20210815 (attached as .config)
> > compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> > reproduce (this is a W=1 build):
> >  # 
> > https://github.com/0day-ci/linux/commit/cf6d8354b7d7953cd866fad004cbb189adfa074f
> >  git remote add linux-review https://github.com/0day-ci/linux
> >  git fetch --no-tags linux-review 
> > Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
> >  git checkout cf6d8354b7d7953cd866fad004cbb189adfa074f
> >  # save the attached .config to linux build tree
> >  make W=1 ARCH=i386
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot 
> >
> > All errors (new ones prefixed by >>, old ones prefixed by <<):
> >
> >>> ERROR: modpost: "task_work_add" [drivers/gpu/drm/drm.ko] undefined!
> >
>
> I'm a bit uncertain about this. Looking into the .config used, this
> error seems to happen because task_work_add isn't an exported symbol,
> but DRM is being compiled as a loadable kernel module (CONFIG_DRM=m).
>
> One way to deal with this is to export the symbol, but there was a
> proposed patch to do this a few months back that wasn't picked up [1],
> so I'm not sure what to make of this.
>
> I'll export the symbol as part of a v3 series, and check in with the
> task-work maintainers.
>
> Link:
> https://lore.kernel.org/lkml/20210127150029.13766-3-josh...@samsung.com/ [1]

Yeah that sounds best. I have two more thoughts on the patch:
- drm_master_flush isn't used by any modules outside of drm.ko, so we
can unexport it and drop the kerneldoc (the comment is still good).
These kinds of internal functions have their declarations in
drm_internal.h - there are already a few there from drm_auth.c

- We now have 3 locks for master state, which feels a bit like
overkill. The spinlock I think we need to keep due to lock inversions,
but the master_mutex and master_rwsem look like we should be able to
merge them? I.e. anywhere we currently grab the master_mutex we could
instead grab the rwsem in either write mode (when we change stuff) or
read mode (when we just check, like in master_internal_acquire).

Thoughts?
-Daniel

>
> > ---
> > 0-DAY CI Kernel Test Service, Intel Corporation
> > https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [REGRESSION][BISECTED] 5.14.0-rc4 thru rc6 69de4421bb broke

2021-08-16 Thread Duncan
Duncan posted on Mon, 16 Aug 2021 07:58:37 + as excerpted:

> Mikael Pettersson posted on Tue, 03 Aug 2021 08:54:18 +0200 as
> excerpted:
>> On Mon, Aug 2, 2021 at 8:29 PM Duncan  wrote:
>>> Mikael Pettersson  wrote...
>>> > Booting 5.14.0-rc4 on my box with Radeon graphics breaks with
>>> >
>>> > [drm:radeon_ttm_init [radeon]] *ERROR* failed initializing buffer
>>> > object driver(-19).
>>> > radeon :01:00.0: Fatal error during GPU init
>>>
>>> Seeing this here too.  amdgpu on polaris-11, on an old amd-fx6100
>>> system.
>>>
>>> > after which the screen goes black for the rest of kernel boot and
>>> > early user-space init.
>>>
>>> *NOT* seeing that.  However, I have boot messages turned on by
>>> default and I see them as usual, only it stays in vga-console mode
>>> instead of switching to framebuffer after early-boot. I'm guessing
>>> MP has a high-res boot-splash which doesn't work in vga mode, thus
>>> the black-screen until the login shows up.
>> 
>> Yes, I have the Fedora boot splash enabled.
>> 
>>> > Once the console login shows up the screen is in some legacy
>>> > low-res mode and Xorg can't be started.
>>> >
>>> > A git bisect between v5.14-rc3 (good) and v5.14-rc4 (bad)
>>> > identified
>>> >
>>> > # first bad commit: [69de4421bb4c103ef42a32bafc596e23918c106f]
>>> > drm/ttm: Initialize debugfs from ttm_global_init()
>>> >
>>> > Reverting that from 5.14.0-rc4 gives me a working kernel again.
>>> >
>>> > Note that I have # CONFIG_DEBUG_FS is not set
>>>
>>> That all matches here, including the unset CONFIG_DEBUG_FS and
>>> confirming the revert on 5.14.0-rc4 works.
>> 
>> Thanks for the confirmation.
> 
> 69de4421bb introduced a regression in rc4, reported to the list on
> August 2, that's still there in rc6.  It's also reported on kernel
> bugzilla as bug #214000.  No maintainer response either on-list or to
> the bug.  The commit was general ttm and the original post went to
> dri-devel and kernel,
> Jason E. and Daniel V., but all three user reports I've seen so far
> (two on-list and the bug reporter) are on amdgpu or radeon, so in an
> effort to at least get a response and hopefully a fix before release,
> I'm adding the amdgpu/radeon list and maintainers.
> 
> The bugzilla report confirmed that CONFIG_DEBUG_FS=y AND
> CONFIG_DEBUG_FS_ALLOW_ALL=y were *both* required to get a working
> kernel after that commit.  I and I believe the on-list reporter just
> reverted the commit in question, and kept our CONFIG_DEBUG_FS=n.

Trying again. I apologize if anyone gets this twice but I don't think
the first one made it at all (buggy client).

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


Re: IIO, dmabuf, io_uring

2021-08-16 Thread Paul Cercueil

Hi Christoph,

On Sat, Aug 14 2021 at 09:30:19 +0200, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 01:41:26PM +0200, Paul Cercueil wrote:
>> Hi,
>>
>> A few months ago we (ADI) tried to upstream the interface we use
>> with our high-speed ADCs and DACs. It is a system with custom ioctls
>> on the iio device node to dequeue and enqueue buffers (allocated with
>> dma_alloc_coherent), that can then be mmap'd by userspace
>> applications. Anyway, it was ultimately denied entry [1]; this API
>> was okay in ~2014 when it was designed but it feels like re-inventing
>> the wheel in 2021.
>>
>> Back to the drawing board, and we'd like to design something that we
>> can actually upstream. This high-speed interface looks awfully
>> similar to DMABUF, so we may try to implement a DMABUF interface for
>> IIO, unless someone has a better idea.
>
> To me this does sound a lot like a dma buf use case.  The interesting
> question to me is how to signal arrival of new data, or readiness to
> consume more data.  I suspect that people that are actually using
> dmabuf heavily at the moment (dri/media folks) might be able to chime
> in a little more on that.


Thanks for the feedback.

I haven't looked too much into how dmabuf works; but IIO device nodes 
right now have a regular stdio interface, so I believe poll() flags can 
be used to signal arrival of new data.


>> Our first usecase is, we want userspace applications to be able to
>> dequeue buffers of samples (from ADCs), and/or enqueue buffers of
>> samples (for DACs), and to be able to manipulate them (mmapped
>> buffers). With a DMABUF interface, I guess the userspace application
>> would dequeue a dma buffer from the driver, mmap it, read/write the
>> data, unmap it, then enqueue it to the IIO driver again so that it
>> can be disposed of. Does that sound sane?
>>
>> Our second usecase is - and that's where things get tricky - to be
>> able to stream the samples to another computer for processing, over
>> Ethernet or USB. Our typical setup is a high-speed ADC/DAC on a dev
>> board with a FPGA and a weak soft-core or low-power CPU; processing
>> the data in-situ is not an option. Copying the data from one buffer
>> to another is not an option either (way too slow), so we absolutely
>> want zero-copy.
>>
>> Usual userspace zero-copy techniques (vmsplice+splice, MSG_ZEROCOPY
>> etc) don't really work with mmapped kernel buffers allocated for DMA
>> [2] and/or have a huge overhead, so the way I see it, we would also
>> need DMABUF support in both the Ethernet stack and USB (functionfs)
>> stack. However, as far as I understood, DMABUF is mostly a DRM/V4L2
>> thing, so I am really not sure we have the right idea here.
>>
>> And finally, there is the new kid in town, io_uring. I am not very
>> literate about the topic, but it does not seem to be able to handle
>> DMA buffers (yet?). The idea that we could dequeue a buffer of
>> samples from the IIO device and send it over the network in one
>> single syscall is appealing, though.
>
> Think of io_uring really just as an async syscall layer.  It doesn't
> replace DMA buffers, but can be used as a different and for some
> workloads more efficient way to dispatch syscalls.


That was my thought, yes. Thanks.

Cheers,
-Paul




Re: [PATCH] drm/fb: Fix randconfig builds

2021-08-16 Thread Jackie Liu




On 2021/8/16 at 4:56 PM, Jani Nikula wrote:
> On Mon, 16 Aug 2021, Jani Nikula  wrote:
>> On Mon, 16 Aug 2021, Jackie Liu  wrote:
>>> Hi Jani.
>>>
>>> Your suggestion is that?
>>>
>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>> index 7ff89690a976..ba179a539497 100644
>>> --- a/drivers/gpu/drm/Kconfig
>>> +++ b/drivers/gpu/drm/Kconfig
>>> @@ -77,6 +77,7 @@ config DRM_DEBUG_SELFTEST
>>>   config DRM_KMS_HELPER
>>>  tristate
>>>  depends on DRM
>>> +   depends on FB if DRM_FBDEV_EMULATION
>>>  help
>>>CRTC helpers for KMS drivers.
>>>
>>>
>>> But it has a syntax error.
>>
>> Ah, try this then:
>>
>>   depends on FB || FB=n
>
> Or this monster:
>
> depends on FB || DRM_FBDEV_EMULATION=n


Hi Jani,

   depends on FB || DRM_FBDEV_EMULATION=n

will cause the following warning:

WARNING: unmet direct dependencies detected for DRM_KMS_HELPER
  Depends on [m]: HAS_IOMEM [=y] && DRM [=y] && (FB [=m] || !DRM_FBDEV_EMULATION [=y])
  Selected by [y]:
  - DRM_DEBUG_SELFTEST [=y] && HAS_IOMEM [=y] && DRM [=y] && DEBUG_KERNEL [=y]
  - DRM_VKMS [=y] && HAS_IOMEM [=y] && DRM [=y]
  - TINYDRM_ILI9341 [=y] && HAS_IOMEM [=y] && DRM [=y] && SPI [=y]
  - TINYDRM_MI0283QT [=y] && HAS_IOMEM [=y] && DRM [=y] && SPI [=y]
  - TINYDRM_ST7586 [=y] && HAS_IOMEM [=y] && DRM [=y] && SPI [=y]
  - TINYDRM_ST7735R [=y] && HAS_IOMEM [=y] && DRM [=y] && SPI [=y]
  - DRM_ANALOGIX_ANX78XX [=y] && HAS_IOMEM [=y] && DRM [=y] && DRM_BRIDGE [=y]
  Selected by [m]:
  - DRM_FBDEV_EMULATION [=y] && HAS_IOMEM [=y] && DRM [=y] && FB [=m]
  - DRM_SIMPLEDRM [=m] && HAS_IOMEM [=y] && DRM [=y]
  - TINYDRM_HX8357D [=m] && HAS_IOMEM [=y] && DRM [=y] && SPI [=y]
  - TINYDRM_REPAPER [=m] && HAS_IOMEM [=y] && DRM [=y] && SPI [=y]
configuration written to .config

How about this?

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 7ff89690a976..797eeea9cbbe 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -98,8 +98,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
 config DRM_FBDEV_EMULATION
bool "Enable legacy fbdev support for your modesetting driver"
depends on DRM
-   depends on FB
-   select DRM_KMS_HELPER
+   depends on (FB=y && DRM_KMS_HELPER) || (FB=m && DRM_KMS_HELPER=m)
select FB_CFB_FILLRECT
select FB_CFB_COPYAREA
select FB_CFB_IMAGEBLIT


--
BR, Jackie Liu








--
Thanks, BR, Jackie Liu

On 2021/8/16 at 4:33 PM, Jani Nikula wrote:

On Mon, 16 Aug 2021, Jackie Liu  wrote:

Hi Jani.

My CI report an randconfigs build failed. there are:

drm_fb_helper.c:(.text+0x302): undefined reference to `fb_set_suspend'
drm_fb_helper.c:(.text+0xaea): undefined reference to `register_framebuffer'
drm_fb_helper.c:(.text+0x1dcc): undefined reference to `framebuffer_alloc'
ld: drm_fb_helper.c:(.text+0x1dea): undefined reference to `fb_alloc_cmap'
ld: drm_fb_helper.c:(.text+0x1e2f): undefined reference to `fb_dealloc_cmap'
ld: drm_fb_helper.c:(.text+0x1e5b): undefined reference to
`framebuffer_release'
drm_fb_helper.c:(.text+0x1e85): undefined reference to
`unregister_framebuffer'
drm_fb_helper.c:(.text+0x1ee9): undefined reference to `fb_dealloc_cmap'
ld: drm_fb_helper.c:(.text+0x1ef0): undefined reference to
`framebuffer_release'
drm_fb_helper.c:(.text+0x1f96): undefined reference to
`fb_deferred_io_cleanup'
drm_fb_helper.c:(.text+0x203b): undefined reference to `fb_sys_read'
drm_fb_helper.c:(.text+0x2051): undefined reference to `fb_sys_write'
drm_fb_helper.c:(.text+0x208d): undefined reference to `sys_fillrect'
drm_fb_helper.c:(.text+0x20bb): undefined reference to `sys_copyarea'
drm_fb_helper.c:(.text+0x20e9): undefined reference to `sys_imageblit'
drm_fb_helper.c:(.text+0x2117): undefined reference to `cfb_fillrect'
drm_fb_helper.c:(.text+0x2172): undefined reference to `cfb_copyarea'
drm_fb_helper.c:(.text+0x21cd): undefined reference to `cfb_imageblit'
drm_fb_helper.c:(.text+0x2233): undefined reference to `fb_set_suspend'
drm_fb_helper.c:(.text+0x22b0): undefined reference to `fb_set_suspend'
drm_fb_helper.c:(.text+0x250f): undefined reference to `fb_deferred_io_init'

The main reason is because DRM_FBDEV_EMULATION is built-in, and
CONFIG_FB is compiled as a module.


DRM_FBDEV_EMULATION is not a module, it's just a config
knob. drm_fb_helper.ko is the module, enabled via DRM_KMS_HELPER, and it
has an implicit dependency on FB, and DRM_FBDEV_EMULATION selects
DRM_KMS_HELPER. Select just breaks dependencies in all kinds of ways.

This might help in config DRM_KMS_HELPER, and it might help the reader
because it's factual:

depends on FB if DRM_FBDEV_EMULATION


BR,
Jani.







--
Jackie Liu

On 2021/8/16 at 3:01 PM, Jani Nikula wrote:

On Mon, 16 Aug 2021, Jackie Liu  wrote:

From: Jackie Liu 

When CONFIG_DRM_FBDEV_EMULATION is compiled to y and CONFIG_FB is m, the
compilation will fail. we need make that dependency explicit.


What's the failure mode? Using select here is a bad idea.

BR,
Jani.



Reported-by: k2ci 
Signed-off-by: Jackie Liu 
---
drivers/gpu/drm/Kconfig | 2 +-

Re: [PATCH v2 4/5] drm/scheduler: Add fence deadline support

2021-08-16 Thread Christian König

On 07.08.21 at 20:37, Rob Clark wrote:

From: Rob Clark 

As the finished fence is the one that is exposed to userspace, and
therefore the one that other operations, like atomic update, would
block on, we need to propagate the deadline from the finished
fence to the actual hw fence.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/scheduler/sched_fence.c | 25 +
  drivers/gpu/drm/scheduler/sched_main.c  |  3 +++
  include/drm/gpu_scheduler.h |  6 ++
  3 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_fence.c 
b/drivers/gpu/drm/scheduler/sched_fence.c
index 69de2c76731f..f389dca44185 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct 
dma_fence *f)
dma_fence_put(&fence->scheduled);
  }
  
+static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,

+ ktime_t deadline)
+{
+   struct drm_sched_fence *fence = to_drm_sched_fence(f);
+   unsigned long flags;
+
+   spin_lock_irqsave(&fence->lock, flags);
+
+   /* If we already have an earlier deadline, keep it: */
+   if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
+   ktime_before(fence->deadline, deadline)) {
+   spin_unlock_irqrestore(&fence->lock, flags);
+   return;
+   }
+
+   fence->deadline = deadline;
+   set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
+
+   spin_unlock_irqrestore(&fence->lock, flags);
+
+   if (fence->parent)
+   dma_fence_set_deadline(fence->parent, deadline);
+}
+
  static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
@@ -138,6 +162,7 @@ static const struct dma_fence_ops 
drm_sched_fence_ops_finished = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
.release = drm_sched_fence_release_finished,
+   .set_deadline = drm_sched_fence_set_deadline_finished,
  };
  
  struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a953693b45..3ab0900d3596 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -818,6 +818,9 @@ static int drm_sched_main(void *param)
  
  		if (!IS_ERR_OR_NULL(fence)) {

s_fence->parent = dma_fence_get(fence);
+   if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
+&s_fence->finished.flags))
+   dma_fence_set_deadline(fence, 
s_fence->deadline);


Maybe move this into a dma_sched_fence_set_parent() function.

Apart from that looks good to me.
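
To illustrate what I mean, here is a small userspace model of such a
helper. It is only a sketch: the structs are simplified stand-ins for
drm_sched_fence and the hw fence, deadlines are plain int64_t nanoseconds
instead of ktime_t, locking is left out, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the hw fence returned by run_job(). */
struct hw_fence {
	bool has_deadline;
	int64_t deadline_ns;
};

/* Simplified stand-in for the scheduler fence. */
struct sched_fence {
	bool has_deadline;
	int64_t deadline_ns;
	struct hw_fence *parent;	/* NULL until the job reaches the hw */
};

static void hw_fence_set_deadline(struct hw_fence *f, int64_t deadline_ns)
{
	f->has_deadline = true;
	f->deadline_ns = deadline_ns;
}

/* Record a deadline, keeping an earlier one if already set. */
void sched_fence_set_deadline(struct sched_fence *f, int64_t deadline_ns)
{
	if (f->has_deadline && f->deadline_ns < deadline_ns)
		return;

	f->deadline_ns = deadline_ns;
	f->has_deadline = true;

	if (f->parent)
		hw_fence_set_deadline(f->parent, deadline_ns);
}

/* Attach the hw fence and hand over any deadline recorded so far. */
void sched_fence_set_parent(struct sched_fence *f, struct hw_fence *parent)
{
	f->parent = parent;
	if (f->has_deadline)
		hw_fence_set_deadline(parent, f->deadline_ns);
}
```

Having the hand-over in one helper would keep the flag test and the
propagation next to the code that records the deadline, instead of open
coding it in the scheduler main loop.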

Regards,
Christian.


r = dma_fence_add_callback(fence, &sched_job->cb,
   drm_sched_job_done_cb);
if (r == -ENOENT)
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index d18af49fd009..0f08ade614ae 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -144,6 +144,12 @@ struct drm_sched_fence {
   */
struct dma_fencefinished;
  
+	/**

+* @deadline: deadline set on &drm_sched_fence.finished which
+* potentially needs to be propagated to &drm_sched_fence.parent
+*/
+   ktime_t deadline;
+
  /**
   * @parent: the fence returned by &drm_sched_backend_ops.run_job
   * when scheduling the job on hardware. We signal the




Re: [PATCH v2 1/5] dma-fence: Add deadline awareness

2021-08-16 Thread Christian König

On 07.08.21 at 20:37, Rob Clark wrote:

From: Rob Clark 

Add a way to hint to the fence signaler of an upcoming deadline, such as
vblank, which the fence waiter would prefer not to miss.  This is to aid
the fence signaler in making power management decisions, like boosting
frequency as the deadline approaches and awareness of missing deadlines
so that can be factored in to the frequency scaling.

v2: Drop dma_fence::deadline and related logic to filter duplicate
 deadlines, to avoid increasing dma_fence size.  The fence-context
 implementation will need similar logic to track deadlines of all
 the fences on the same timeline.  [ckoenig]

Signed-off-by: Rob Clark 


Reviewed-by: Christian König 


---
  drivers/dma-buf/dma-fence.c | 20 
  include/linux/dma-fence.h   | 16 
  2 files changed, 36 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index ce0f5eff575d..1f444863b94d 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -910,6 +910,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, 
uint32_t count,
  }
  EXPORT_SYMBOL(dma_fence_wait_any_timeout);
  
+

+/**
+ * dma_fence_set_deadline - set desired fence-wait deadline
+ * @fence:the fence that is to be waited on
+ * @deadline: the time by which the waiter hopes for the fence to be
+ *signaled
+ *
+ * Inform the fence signaler of an upcoming deadline, such as vblank, by
+ * which point the waiter would prefer the fence to be signaled by.  This
+ * is intended to give feedback to the fence signaler to aid in power
+ * management decisions, such as boosting GPU frequency if a periodic
+ * vblank deadline is approaching.
+ */
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+   if (fence->ops->set_deadline && !dma_fence_is_signaled(fence))
+   fence->ops->set_deadline(fence, deadline);
+}
+EXPORT_SYMBOL(dma_fence_set_deadline);
+
  /**
   * dma_fence_init - Initialize a custom fence.
   * @fence: the fence to initialize
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 6ffb4b2c6371..9c809f0d5d0a 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -99,6 +99,7 @@ enum dma_fence_flag_bits {
DMA_FENCE_FLAG_SIGNALED_BIT,
DMA_FENCE_FLAG_TIMESTAMP_BIT,
DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+   DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
DMA_FENCE_FLAG_USER_BITS, /* must always be last member */
  };
  
@@ -261,6 +262,19 @@ struct dma_fence_ops {

 */
void (*timeline_value_str)(struct dma_fence *fence,
   char *str, int size);
+
+   /**
+* @set_deadline:
+*
+* Callback to allow a fence waiter to inform the fence signaler of an
+* upcoming deadline, such as vblank, by which point the waiter would
+* prefer the fence to be signaled by.  This is intended to give 
feedback
+* to the fence signaler to aid in power management decisions, such as
+* boosting GPU frequency.
+*
+* This callback is optional.
+*/
+   void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
  };
  
  void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,

@@ -586,6 +600,8 @@ static inline signed long dma_fence_wait(struct dma_fence 
*fence, bool intr)
return ret < 0 ? ret : 0;
  }
  
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);

+
  struct dma_fence *dma_fence_get_stub(void);
  struct dma_fence *dma_fence_allocate_private_stub(void);
  u64 dma_fence_context_alloc(unsigned num);




Re: [PATCH v2 0/5] dma-fence: Deadline awareness

2021-08-16 Thread Christian König

The general approach seems to make sense now I think.

One minor thing which I'm missing is adding support for this to the 
dma_fence_array and dma_fence_chain containers.
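
Roughly what I have in mind for the array case, as a userspace model
only: the container's set_deadline callback forwards the deadline to each
member fence. The types below are simplified stand-ins rather than the
real dma_fence structures, and every name in the sketch is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct fence;

struct fence_ops {
	/* Optional callback, mirroring the proposed set_deadline op. */
	void (*set_deadline)(struct fence *f, int64_t deadline_ns);
};

struct fence {
	const struct fence_ops *ops;
	bool signaled;
	int64_t deadline_ns;	/* last deadline seen by a leaf fence */
};

/* Same shape as the proposed helper: skip signaled fences. */
void fence_set_deadline(struct fence *f, int64_t deadline_ns)
{
	if (f->ops && f->ops->set_deadline && !f->signaled)
		f->ops->set_deadline(f, deadline_ns);
}

/* Container fence; base must stay the first member for the cast below. */
struct fence_array {
	struct fence base;
	struct fence **fences;
	size_t num_fences;
};

/* Forward the deadline to every member of the array. */
static void fence_array_set_deadline(struct fence *f, int64_t deadline_ns)
{
	struct fence_array *array = (struct fence_array *)f;
	size_t i;

	for (i = 0; i < array->num_fences; i++)
		fence_set_deadline(array->fences[i], deadline_ns);
}

const struct fence_ops fence_array_ops = {
	.set_deadline = fence_array_set_deadline,
};

/* Trivial leaf implementation that just records the deadline. */
static void leaf_set_deadline(struct fence *f, int64_t deadline_ns)
{
	f->deadline_ns = deadline_ns;
}

const struct fence_ops leaf_ops = {
	.set_deadline = leaf_set_deadline,
};
```

A chain container would presumably do the same while walking its links.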


Regards,
Christian.

On 07.08.21 at 20:37, Rob Clark wrote:

From: Rob Clark 

Based on discussion from a previous series[1] to add a "boost" mechanism
when, for example, vblank deadlines are missed.  Instead of a boost
callback, this approach adds a way to set a deadline on the fence, by
which the waiter would like to see the fence signalled.

I've not yet had a chance to re-work the drm/msm part of this, but
wanted to send this out as an RFC in case I don't have a chance to
finish the drm/msm part this week.

Original description:

In some cases, like double-buffered rendering, missing vblanks can
trick the GPU into running at a lower frequency, when really we
want to be running at a higher frequency to not miss the vblanks
in the first place.

This is partially inspired by a trick i915 does, but implemented
via dma-fence for a couple of reasons:

1) To continue to be able to use the atomic helpers
2) To support cases where display and gpu are different drivers

[1] https://patchwork.freedesktop.org/series/90331/

v1: https://patchwork.freedesktop.org/series/93035/
v2: Move filtering out of later deadlines to fence implementation
 to avoid increasing the size of dma_fence

Rob Clark (5):
   dma-fence: Add deadline awareness
   drm/vblank: Add helper to get next vblank time
   drm/atomic-helper: Set fence deadline for vblank
   drm/scheduler: Add fence deadline support
   drm/msm: Add deadline based boost support

  drivers/dma-buf/dma-fence.c | 20 +++
  drivers/gpu/drm/drm_atomic_helper.c | 36 
  drivers/gpu/drm/drm_vblank.c| 31 ++
  drivers/gpu/drm/msm/msm_fence.c | 76 +
  drivers/gpu/drm/msm/msm_fence.h | 20 +++
  drivers/gpu/drm/msm/msm_gpu.h   |  1 +
  drivers/gpu/drm/msm/msm_gpu_devfreq.c   | 20 +++
  drivers/gpu/drm/scheduler/sched_fence.c | 25 
  drivers/gpu/drm/scheduler/sched_main.c  |  3 +
  include/drm/drm_vblank.h|  1 +
  include/drm/gpu_scheduler.h |  6 ++
  include/linux/dma-fence.h   | 16 ++
  12 files changed, 255 insertions(+)
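
The v2 changelog entry above (filtering out later deadlines in the fence implementation rather than in dma_fence itself) can be sketched roughly as follows. This is a userspace model under stated assumptions, not code from the series: struct sketch_fence and its fields are hypothetical, and the "boost" counter merely stands in for whatever a driver does when an earlier deadline arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical driver-private fence state; not the kernel's struct dma_fence. */
struct sketch_fence {
	bool has_deadline;   /* true once any deadline was requested */
	int64_t deadline_ns; /* earliest deadline requested so far */
	unsigned int boosts; /* how often the "hardware" was re-programmed */
};

/*
 * Models a driver's set_deadline callback. Later (or equal) deadlines are
 * dropped here, so the generic fence structure needs no extra field.
 */
static void sketch_fence_set_deadline(struct sketch_fence *f, int64_t deadline_ns)
{
	if (f->has_deadline && deadline_ns >= f->deadline_ns)
		return; /* an earlier deadline is already in effect */

	f->has_deadline = true;
	f->deadline_ns = deadline_ns;
	f->boosts++; /* stand-in for e.g. raising the GPU frequency */
}
```

With this shape, a waiter asking for a later deadline than one already recorded is a no-op, which is the behaviour the changelog line describes.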





RE: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Quan, Evan
[AMD Official Use Only]

Hi Michel,

The patch seems reasonable to me (especially the cancel_delayed_work_sync()
part).
However, can you explain more about the code below?
What's the race issue here exactly?

+   /* mutex_lock could deadlock with cancel_delayed_work_sync in 
amdgpu_gfx_off_ctrl. */
+   if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) {
+   /* If there's a bug which causes amdgpu_gfx_off_ctrl to be 
called with enable=true
+* when adev->gfx.gfx_off_req_count is already 0, we might race 
with that.
+* Re-schedule to make sure gfx off will be re-enabled in the 
HW eventually.
+*/
+   schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
AMDGPU_GFX_OFF_DELAY_ENABLE);
+   return;
+   }

BR
Evan
> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Friday, August 13, 2021 6:29 PM
> To: Deucher, Alexander ; Koenig, Christian
> 
> Cc: Liu, Leo ; Zhu, James ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is
> disabled
> 
> From: Michel Dänzer 
> 
> schedule_delayed_work does not push back the work if it was already
> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
> was disabled and re-enabled again during those 100 ms.
> 
> This resulted in frame drops / stutter with the upcoming mutter 41
> release on Navi 14, due to constantly enabling GFXOFF in the HW and
> disabling it again (for getting the GPU clock counter).
> 
> To fix this, call cancel_delayed_work_sync when GFXOFF transitions from
> enabled to disabled. This makes sure the delayed work will be scheduled
> as intended in the reverse case.
> 
> In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs
> to use mutex_trylock instead of mutex_lock.
> 
> v2:
> * Use cancel_delayed_work_sync & mutex_trylock instead of
>   mod_delayed_work.
> 
> Signed-off-by: Michel Dänzer 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 13 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h|  3 +++
>  3 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f3fd5ec710b6..8b025f70706c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2777,7 +2777,16 @@ static void
> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>   struct amdgpu_device *adev =
>   container_of(work, struct amdgpu_device,
> gfx.gfx_off_delay_work.work);
> 
> - mutex_lock(&adev->gfx.gfx_off_mutex);
> + /* mutex_lock could deadlock with cancel_delayed_work_sync in
> amdgpu_gfx_off_ctrl. */
> + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) {
> + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be
> called with enable=true
> +  * when adev->gfx.gfx_off_req_count is already 0, we might
> race with that.
> +  * Re-schedule to make sure gfx off will be re-enabled in the
> HW eventually.
> +  */
> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> AMDGPU_GFX_OFF_DELAY_ENABLE);
> + return;
> + }
> +
>   if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>   if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
>   adev->gfx.gfx_off_state = true;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..da4c46db3093 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -28,9 +28,6 @@
>  #include "amdgpu_rlc.h"
>  #include "amdgpu_ras.h"
> 
> -/* delay 0.1 second to enable gfx off feature */
> -#define GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100)
> -
>  /*
>   * GPU GFX IP block helpers function.
>   */
> @@ -569,9 +566,13 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
>   adev->gfx.gfx_off_req_count--;
> 
>   if (enable && !adev->gfx.gfx_off_state && !adev-
> >gfx.gfx_off_req_count) {
> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> GFX_OFF_DELAY_ENABLE);
> - } else if (!enable && adev->gfx.gfx_off_state) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> AMDGPU_GFX_OFF_DELAY_ENABLE);
> + } else if (!enable) {
> + if (adev->gfx.gfx_off_req_count == 1 && !adev-
> >gfx.gfx_off_state)
> + cancel_delayed_work_sync(&adev-
> >gfx.gfx_off_delay_work);
> +
> + if (adev->gfx.gfx_off_state &&
> +

Re: [PATCH v2] drm: avoid races with modesetting rights

2021-08-16 Thread Desmond Cheong Zhi Xi

On 16/8/21 5:04 pm, Daniel Vetter wrote:

On Mon, Aug 16, 2021 at 10:53 AM Desmond Cheong Zhi Xi
 wrote:

On 16/8/21 2:47 am, kernel test robot wrote:

Hi Desmond,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20210813]
[also build test ERROR on v5.14-rc5]
[cannot apply to linus/master v5.14-rc5 v5.14-rc4 v5.14-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
base:4b358aabb93a2c654cd1dcab1a25a589f6e2b153
config: i386-randconfig-a004-20210815 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
  # 
https://github.com/0day-ci/linux/commit/cf6d8354b7d7953cd866fad004cbb189adfa074f
  git remote add linux-review https://github.com/0day-ci/linux
  git fetch --no-tags linux-review 
Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
  git checkout cf6d8354b7d7953cd866fad004cbb189adfa074f
  # save the attached .config to linux build tree
  make W=1 ARCH=i386

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>, old ones prefixed by <<):


ERROR: modpost: "task_work_add" [drivers/gpu/drm/drm.ko] undefined!




I'm a bit uncertain about this. Looking into the .config used, this
error seems to happen because task_work_add isn't an exported symbol,
but DRM is being compiled as a loadable kernel module (CONFIG_DRM=m).

One way to deal with this is to export the symbol, but there was a
proposed patch to do this a few months back that wasn't picked up [1],
so I'm not sure what to make of this.

I'll export the symbol as part of a v3 series, and check in with the
task-work maintainers.

Link:
https://lore.kernel.org/lkml/20210127150029.13766-3-josh...@samsung.com/ [1]


Yeah that sounds best. I have two more thoughts on the patch:
- drm_master_flush isn't used by any modules outside of drm.ko, so we
can unexport it and drop the kerneldoc (the comment is still good).
These kinds of internal functions have their declarations in
drm_internal.h - there are already a few there from drm_auth.c



Sounds good, I'll do that and move the declaration from drm_auth.h to 
drm_internal.h.



- We now have 3 locks for master state, which feels a bit like
overkill. The spinlock I think we need to keep due to lock inversions,
but the master_mutex and master_rwsem look like we should be able to
merge them? I.e. anywhere we currently grab the master_mutex we could
instead grab the rwsem in either write mode (when we change stuff) or
read mode (when we just check, like in master_internal_acquire).

Thoughts?
-Daniel



Using rwsem in the places where we currently hold the mutex seems pretty 
doable.


There are some tricky bits once we add rwsem read locks to the ioctl 
handler. Some ioctl functions like drm_authmagic need a write lock.


In this particular case, it might make sense to break master_mutex down 
into finer-grained locks, since the function doesn't change master 
permissions. It just needs to prevent concurrent writes to the 
drm_master.magic_map idr.


For other ioctls, I'll take a closer look on a case-by-case basis.




---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org










[PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Michel Dänzer
From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 22 +-
 2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct 
work_struct *work)
struct amdgpu_device *adev =
container_of(work, struct amdgpu_device, 
gfx.gfx_off_delay_work.work);
 
-   mutex_lock(&adev->gfx.gfx_off_mutex);
-   if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, true))
-   adev->gfx.gfx_off_state = true;
-   }
-   mutex_unlock(&adev->gfx.gfx_off_mutex);
+   WARN_ON_ONCE(adev->gfx.gfx_off_state);
+   WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+   if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
true))
+   adev->gfx.gfx_off_state = true;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..ca91aafcb32b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,15 +563,26 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
 
mutex_lock(&adev->gfx.gfx_off_mutex);
 
-   if (!enable)
-   adev->gfx.gfx_off_req_count++;
-   else if (adev->gfx.gfx_off_req_count > 0)
+   if (enable) {
+   /* If the count is already 0, it means there's an imbalance bug 
somewhere.
+* Note that the bug may be in a different caller than the one 
which triggers the
+* WARN_ON_ONCE.
+*/
+   if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+   goto unlock;
+
adev->gfx.gfx_off_req_count--;
+   } else {
+   adev->gfx.gfx_off_req_count++;
+   }
 
if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) 
{
schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-   } else if (!enable && adev->gfx.gfx_off_state) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {
+   } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
+   cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+   if (adev->gfx.gfx_off_state &&
+   !amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {
adev->gfx.gfx_off_state = false;
 
if (adev->gfx.funcs->init_spm_golden) {
@@ -581,6 +592,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
}
}
 
+unlock:
mutex_unlock(&adev->gfx.gfx_off_mutex);
 }
 
-- 
2.32.0



Re: [PATCH 1/3] drm/fourcc: Add macros to determine the modifier vendor

2021-08-16 Thread Thierry Reding
On Thu, Jun 10, 2021 at 01:12:34PM +0200, Thierry Reding wrote:
> From: Thierry Reding 
> 
> When working with framebuffer modifiers, it can be useful to extract the
> vendor identifier or check a modifier against a given vendor identifier.
> Add one macro that extracts the vendor identifier and a helper to check
> a modifier against a given vendor identifier.
> 
> Reviewed-by: Daniel Vetter 
> Acked-by: Daniel Stone 
> Signed-off-by: Thierry Reding 
> ---
>  include/uapi/drm/drm_fourcc.h | 6 ++
>  1 file changed, 6 insertions(+)

Sorry for this taking so long, I've finally applied this to
drm-misc-next.

Thierry
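
For readers without the patch at hand, the idea can be sketched like this. The SKETCH_* names are hypothetical and the vendor values are illustrative; the only assumption carried over from drm_fourcc.h is that fourcc_mod_code() places the vendor identifier in the top 8 bits of the 64-bit modifier:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative vendor identifiers (hypothetical names for this sketch). */
#define SKETCH_MOD_VENDOR_INTEL  0x01
#define SKETCH_MOD_VENDOR_NVIDIA 0x03

/* Pack a vendor ID into the top 8 bits and a value into the low 56 bits. */
#define SKETCH_MOD_CODE(vendor, val) \
	(((uint64_t)(vendor) << 56) | ((uint64_t)(val) & 0x00ffffffffffffffULL))

/* Extract the vendor identifier from a modifier. */
#define SKETCH_MOD_GET_VENDOR(modifier) \
	(((uint64_t)(modifier) >> 56) & 0xff)

/* Check a modifier against a given vendor identifier. */
#define SKETCH_MOD_IS_VENDOR(modifier, vendor) \
	(SKETCH_MOD_GET_VENDOR(modifier) == (uint64_t)(vendor))
```

A driver could use the check to reject modifiers from other vendors early, before decoding any vendor-specific bits.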




Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Michel Dänzer
On 2021-08-16 9:38 a.m., Christian König wrote:
> On 13.08.21 at 12:29, Michel Dänzer wrote:
>> From: Michel Dänzer 
>>
>> schedule_delayed_work does not push back the work if it was already
>> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
>> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
>> was disabled and re-enabled again during those 100 ms.
>>
>> This resulted in frame drops / stutter with the upcoming mutter 41
>> release on Navi 14, due to constantly enabling GFXOFF in the HW and
>> disabling it again (for getting the GPU clock counter).
>>
>> To fix this, call cancel_delayed_work_sync when GFXOFF transitions from
>> enabled to disabled. This makes sure the delayed work will be scheduled
>> as intended in the reverse case.
>>
>> In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs
>> to use mutex_trylock instead of mutex_lock.
>>
>> v2:
>> * Use cancel_delayed_work_sync & mutex_trylock instead of
>>    mod_delayed_work.
> 
> While this may work it still smells a little bit fishy.
> 
> In general you have two common locking orders around work items, either 
> lock->work or work->lock. If you mix this as lock->work->lock like here 
> trouble is usually imminent.
> 
> I think what we should do instead is to double check if taking the lock 
> inside the work item is necessary, and instead make sure that the work is 
> sync canceled when we don't want it to run. In other words, fully switch to 
> the lock->work approach.

Done in v3, thanks for the suggestion!


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Michel Dänzer
On 2021-08-16 12:20 p.m., Quan, Evan wrote:
> [AMD Official Use Only]
> 
> Hi Michel,
> 
> The patch seems reasonable to me(especially the cancel_delayed_work_sync() 
> part).
> However, can you explain more about the code below?
> What's the race issue here exactly?
> 
> + /* mutex_lock could deadlock with cancel_delayed_work_sync in 
> amdgpu_gfx_off_ctrl. */
> + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) {
> + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be 
> called with enable=true
> +  * when adev->gfx.gfx_off_req_count is already 0, we might race 
> with that.
> +  * Re-schedule to make sure gfx off will be re-enabled in the 
> HW eventually.
> +  */
> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
> AMDGPU_GFX_OFF_DELAY_ENABLE);
> + return;
> + }

If amdgpu_gfx_off_ctrl was called with enable=true when 
adev->gfx.gfx_off_req_count == 0 already, it could have prevented 
amdgpu_device_delay_enable_gfx_off from locking the mutex.

v3 solves this by only scheduling the work when adev->gfx.gfx_off_req_count 
transitions from 1 to 0, which means it no longer needs to lock the mutex.
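
As a rough illustration of the v3 counting scheme, here is a single-threaded userspace model. It is only a sketch under stated assumptions: struct sketch_gfx is hypothetical (its fields loosely mirror adev->gfx.gfx_off_req_count and adev->gfx.gfx_off_state), the work_pending flag stands in for the delayed work item, and locking and the SMU call are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the GFXOFF bookkeeping; not the driver's structures. */
struct sketch_gfx {
	unsigned int req_count; /* outstanding "disable GFXOFF" requests */
	bool off_state;         /* true while GFXOFF is enabled in the "HW" */
	bool work_pending;      /* stand-in for the queued delayed work */
};

static void sketch_gfx_off_ctrl(struct sketch_gfx *g, bool enable)
{
	if (enable) {
		if (g->req_count == 0)
			return; /* imbalance: v3 warns once here and bails out */

		g->req_count--;
		/* Only the 1 -> 0 transition schedules the delayed work. */
		if (g->req_count == 0 && !g->off_state)
			g->work_pending = true;
	} else {
		g->req_count++;
		/* Only the 0 -> 1 transition cancels it and leaves GFXOFF. */
		if (g->req_count == 1) {
			g->work_pending = false;
			g->off_state = false;
		}
	}
}

/* Models the delayed work: it can only run while req_count is 0. */
static void sketch_delayed_work(struct sketch_gfx *g)
{
	if (!g->work_pending)
		return;

	g->work_pending = false;
	g->off_state = true;
}
```

Because the work is only ever pending while the count is 0, and is cancelled before the count leaves 0, the work function in this model never needs to look at the count under a lock.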


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Michel Dänzer
On 2021-08-16 6:13 a.m., Lazar, Lijo wrote:
> On 8/13/2021 9:30 PM, Michel Dänzer wrote:
>> On 2021-08-13 5:07 p.m., Lazar, Lijo wrote:
>>> On 8/13/2021 8:10 PM, Michel Dänzer wrote:
 On 2021-08-13 4:14 p.m., Lazar, Lijo wrote:
> On 8/13/2021 7:04 PM, Michel Dänzer wrote:
>> On 2021-08-13 1:50 p.m., Lazar, Lijo wrote:
>>> On 8/13/2021 3:59 PM, Michel Dänzer wrote:
 From: Michel Dänzer 

 schedule_delayed_work does not push back the work if it was already
 scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
 after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
 was disabled and re-enabled again during those 100 ms.

 This resulted in frame drops / stutter with the upcoming mutter 41
 release on Navi 14, due to constantly enabling GFXOFF in the HW and
 disabling it again (for getting the GPU clock counter).

 To fix this, call cancel_delayed_work_sync when GFXOFF transitions from
 enabled to disabled. This makes sure the delayed work will be scheduled
 as intended in the reverse case.

 In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs
 to use mutex_trylock instead of mutex_lock.

 v2:
 * Use cancel_delayed_work_sync & mutex_trylock instead of
   mod_delayed_work.

 Signed-off-by: Michel Dänzer 
 ---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 13 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h    |  3 +++
  3 files changed, 20 insertions(+), 7 deletions(-)

 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
 b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
 index f3fd5ec710b6..8b025f70706c 100644
 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
 +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
 @@ -2777,7 +2777,16 @@ static void 
 amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
  struct amdgpu_device *adev =
  container_of(work, struct amdgpu_device, 
 gfx.gfx_off_delay_work.work);
  -    mutex_lock(&adev->gfx.gfx_off_mutex);
 +    /* mutex_lock could deadlock with cancel_delayed_work_sync in 
 amdgpu_gfx_off_ctrl. */
 +    if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) {
 +    /* If there's a bug which causes amdgpu_gfx_off_ctrl to be 
 called with enable=true
 + * when adev->gfx.gfx_off_req_count is already 0, we might 
 race with that.
 + * Re-schedule to make sure gfx off will be re-enabled in the 
 HW eventually.
 + */
 +    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
 AMDGPU_GFX_OFF_DELAY_ENABLE);
 +    return;
>>>
>>> This is not needed and is just creating another thread to contend for 
>>> mutex.
>>
>> Still not sure what you mean by that. What other thread?
>
> Sorry, I meant it schedules another workitem and delays GFXOFF enablement 
> further. For ex: if it was another function like gfx_off_status holding 
> the lock at the time of check.
>
>>
>>> The checks below take care of enabling gfxoff correctly. If it's 
>>> already in gfx_off state, it doesn't do anything. So I don't see why 
>>> this change is needed.
>>
>> mutex_trylock is needed to prevent the deadlock discussed before and 
>> below.
>>
>> schedule_delayed_work is needed due to this scenario hinted at by the 
>> comment:
>>
>> 1. amdgpu_gfx_off_ctrl locks mutex, calls schedule_delayed_work
>> 2. amdgpu_device_delay_enable_gfx_off runs, calls mutex_trylock, which 
>> fails
>>
>> GFXOFF would never get re-enabled in HW in this case (until 
>> amdgpu_gfx_off_ctrl calls schedule_delayed_work again).
>>
>> (cancel_delayed_work_sync guarantees there's no pending delayed work 
>> when it returns, even if amdgpu_device_delay_enable_gfx_off calls 
>> schedule_delayed_work)
>>
>
> I think we need to explain based on the original code before. There is an 
> assumption here that the only other contention of this mutex is with the 
> gfx_off_ctrl function.

 Not really.


> As far as I understand if the work has already started running when 
> schedule_delayed_work is called, it will insert another in the work queue 
> after delay. Based on that understanding I didn't find a problem with the 
> original code.

 Original code as in without this patch or the mod_delayed_work patch? If 
 so, the problem is not when the work has already started running. It's 
 that when it hasn't started running yet,

Re: [RFC PATCH v3 1/6] drm/doc: Color Management and HDR10 RFC

2021-08-16 Thread Brian Starkey
On Fri, Aug 13, 2021 at 10:42:12AM +0530, Sharma, Shashank wrote:
> Hello Brian,
> (+Uma in cc)
> 
> Thanks for your comments, Let me try to fill-in for Harry to keep the design
> discussion going. Please find my comments inline.
> 
> On 8/2/2021 10:00 PM, Brian Starkey wrote:
> >

-- snip --

> > 
> > Android doesn't blend in linear space, so any API shouldn't be built
> > around an assumption of linear blending.
> > 
> 
> If I am not wrong, we still need linear buffers for accurate gamut
> transformation (sRGB -> BT2020 or the other way around), don't we?

Yeah, you need to transform the buffer to linear for color gamut
conversions, but then back to non-linear (probably sRGB or gamma 2.2)
for actual blending.

This is why I'd like to have the per-plane "OETF/GAMMA" separate
from tone-mapping, so that the composition transfer function is
independent.

> 

...

> > > +
> > > +Tonemapping in this case could be a simple nits value or `EDR`_ to 
> > > describe
> > > +how to scale the :ref:`SDR luminance`.
> > > +
> > > +Tonemapping could also include the ability to use a 3D LUT which might be
> > > +accompanied by a 1D shaper LUT. The shaper LUT is required in order to
> > > +ensure a 3D LUT with limited entries (e.g. 9x9x9, or 17x17x17) operates
> > > +in perceptual (non-linear) space, so as to spread the limited
> > > +entries evenly across the perceived space.
> > 
> > Some terminology care may be needed here - up until this point, I
> > think you've been talking about "tonemapping" being luminance
> > adjustment, whereas I'd expect 3D LUTs to be used for gamut
> > adjustment.
> > 
> 
> IMO, what Harry wants to say here is that which HW block gets picked and
> how tone mapping is achieved can be a very driver/HW-specific thing, where
> one driver can use a 1D/fixed-function block, whereas another one can choose
> more complex HW like a 3D LUT for the same.
> 
> DRM layer needs to define only the property to hook the API with core
> driver, and the driver can decide which HW to pick and configure for the
> activity. So when we have a tonemapping property, we might not have a
> separate 3D-LUT property, or the driver may fail the atomic_check() if both
> of them are programmed for different usages.

I still think that directly exposing the HW blocks and their
capabilities is the right approach, rather than a "magic" tonemapping
property.

Yes, userspace would need to have a good understanding of how to use
that hardware, but if the pipeline model is standardised that's the
kind of thing a cross-vendor library could handle.

It would definitely be good to get some compositor opinions here.

Cheers,
-Brian


[PATCH v1] drm/bridge: anx7625: Don't store unread return value

2021-08-16 Thread Robert Foss
The return value of sp_tx_rst_aux() is stored, but never read.
This happens in the context of EDID communication already failing,
which means that this additional failure doesn't necessarily
convey any additional information.

This means that we can safely avoid storing the value.

Fixes: 8bdfc5dae4e3 ("drm/bridge: anx7625: Add anx7625 MIPI DSI/DPI to DP")

Reported-by: kernel test robot 
Signed-off-by: Robert Foss 
---
 drivers/gpu/drm/bridge/analogix/anx7625.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c 
b/drivers/gpu/drm/bridge/analogix/anx7625.c
index 14d73fb1dd15b..3471785915c45 100644
--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
+++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
@@ -771,7 +771,7 @@ static int segments_edid_read(struct anx7625_data *ctx,
ret = sp_tx_aux_rd(ctx, 0xf1);
 
if (ret) {
-   ret = sp_tx_rst_aux(ctx);
+   sp_tx_rst_aux(ctx);
DRM_DEV_ERROR(dev, "segment read fail, reset!\n");
} else {
ret = anx7625_reg_block_read(ctx, ctx->i2c.rx_p0_client,
-- 
2.30.2



Re: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Lazar, Lijo




On 8/16/2021 4:05 PM, Michel Dänzer wrote:

From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
   mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 22 +-
  2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct 
work_struct *work)
struct amdgpu_device *adev =
container_of(work, struct amdgpu_device, 
gfx.gfx_off_delay_work.work);
  
-	mutex_lock(&adev->gfx.gfx_off_mutex);

-   if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, true))
-   adev->gfx.gfx_off_state = true;
-   }
-   mutex_unlock(&adev->gfx.gfx_off_mutex);
+   WARN_ON_ONCE(adev->gfx.gfx_off_state);


Don't see any case for this. It's not expected to be scheduled in this 
case, right?



+   WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+


Thinking about ON_ONCE here - this may happen more than once if it's 
completed as part of cancel_ call. Is the warning needed?


Anyway,
Reviewed-by: Lijo Lazar 


+   if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
true))
+   adev->gfx.gfx_off_state = true;
  }
  
  /**

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..ca91aafcb32b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,15 +563,26 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
  
  	mutex_lock(&adev->gfx.gfx_off_mutex);
  
-	if (!enable)

-   adev->gfx.gfx_off_req_count++;
-   else if (adev->gfx.gfx_off_req_count > 0)
+   if (enable) {
+   /* If the count is already 0, it means there's an imbalance bug 
somewhere.
+* Note that the bug may be in a different caller than the one 
which triggers the
+* WARN_ON_ONCE.
+*/
+   if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+   goto unlock;
+
adev->gfx.gfx_off_req_count--;
+   } else {
+   adev->gfx.gfx_off_req_count++;
+   }
  
  	if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {

schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-   } else if (!enable && adev->gfx.gfx_off_state) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {
+   } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
+   cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+   if (adev->gfx.gfx_off_state &&
+   !amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {
adev->gfx.gfx_off_state = false;
  
  			if (adev->gfx.funcs->init_spm_golden) {

@@ -581,6 +592,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
}
}
  
+unlock:

mutex_unlock(&adev->gfx.gfx_off_mutex);
  }
  



Re: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Christian König

On 16.08.21 at 13:33, Lazar, Lijo wrote:

On 8/16/2021 4:05 PM, Michel Dänzer wrote:

From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
   mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 22 +-
  2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void 
amdgpu_device_delay_enable_gfx_off(struct work_struct *work)

  struct amdgpu_device *adev =
  container_of(work, struct amdgpu_device, 
gfx.gfx_off_delay_work.work);

  -    mutex_lock(&adev->gfx.gfx_off_mutex);
-    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-    if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, true))

-    adev->gfx.gfx_off_state = true;
-    }
-    mutex_unlock(&adev->gfx.gfx_off_mutex);
+    WARN_ON_ONCE(adev->gfx.gfx_off_state);


Don't see any case for this. It's not expected to be scheduled in this 
case, right?



+ WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+


Thinking about ON_ONCE here - this may happen more than once if it's 
completed as part of cancel_ call. Is the warning needed?


WARN_ON_ONCE() is usually used to prevent spamming the system log with 
warnings. E.g. the warning is only printed once indicating a driver bug 
and that's it.
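
A minimal userspace sketch of that behaviour, for illustration only: the sketch_* names are hypothetical, and the real macro keeps its "already warned" flag per call site, which is modelled here with a function-local static. The condition is still evaluated and returned on every call; only the message is limited to once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many warnings were actually printed (for demonstration only). */
static unsigned int sketch_warnings_printed;

/* Print at most once per flag, but always report whether cond was true. */
static bool sketch_warn_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		sketch_warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* One "call site": the static flag plays the role of the per-site state. */
static bool sketch_check_count(unsigned int req_count)
{
	static bool warned;

	return sketch_warn_once(req_count == 0, &warned,
				"request count is already 0");
}
```

So a caller can still branch on the result every time, as the patch does with its goto unlock, while the log only sees the first occurrence.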




Anyway,
Reviewed-by: Lijo Lazar 


Acked-by: Christian König 

Regards,
Christian.



+    if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, true))

+    adev->gfx.gfx_off_state = true;
  }
    /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c

index a0be0772c8b3..ca91aafcb32b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,15 +563,26 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device 
*adev, bool enable)

    mutex_lock(&adev->gfx.gfx_off_mutex);
  -    if (!enable)
-    adev->gfx.gfx_off_req_count++;
-    else if (adev->gfx.gfx_off_req_count > 0)
+    if (enable) {
+    /* If the count is already 0, it means there's an imbalance 
bug somewhere.
+ * Note that the bug may be in a different caller than the 
one which triggers the

+ * WARN_ON_ONCE.
+ */
+    if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+    goto unlock;
+
  adev->gfx.gfx_off_req_count--;
+    } else {
+    adev->gfx.gfx_off_req_count++;
+    }
    if (enable && !adev->gfx.gfx_off_state && 
!adev->gfx.gfx_off_req_count) {
schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);

-    } else if (!enable && adev->gfx.gfx_off_state) {
-    if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {

+    } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
+ cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+    if (adev->gfx.gfx_off_state &&
+    !amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {

  adev->gfx.gfx_off_state = false;
    if (adev->gfx.funcs->init_spm_golden) {
@@ -581,6 +592,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device 
*adev, bool enable)

  }
  }
  +unlock:
  mutex_unlock(&adev->gfx.gfx_off_mutex);
  }





Re: [PATCH 0/1] Fix DRM driver initialization failure in kernel v5.14

2021-08-16 Thread Christian König




Am 16.08.21 um 08:20 schrieb Huang Rui:

On Sat, Aug 14, 2021 at 12:50:14AM +0800, Dan Moulding wrote:

Just a friendly reminder that this fix for a regression needs
review. It should be a quick review.

It would probably be good to ensure this gets in before the final 5.14
release, otherwise this is going to be a very visible regression for
anyone that uses DRM and does not use debugfs.


Just took a look at your patch, it's ok for me. Alex/Christian, could you
help to apply this fix if you have no concern?


Sorry for the delay, just came back from vacation today.

Patch is pushed to drm-misc-fixes and will hopefully still get into 5.14.

Thanks,
Christian.



Best Regards,
Ray




[PATCH -next] drm/i915: pass correct pointer to PTR_ERR()

2021-08-16 Thread Yang Yingliang
'obj' is the pointer tested by IS_ERR(), so it, not 'dmabuf', should be
passed to PTR_ERR().

Signed-off-by: Yang Yingliang 
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
index ffae7df5e4d7..88531d06811e 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
@@ -102,7 +102,7 @@ static int igt_dmabuf_import_same_driver_lmem(void *arg)
obj = __i915_gem_object_create_user(i915, PAGE_SIZE, &lmem, 1);
if (IS_ERR(obj)) {
pr_err("__i915_gem_object_create_user failed with err=%ld\n",
-  PTR_ERR(dmabuf));
+  PTR_ERR(obj));
err = PTR_ERR(obj);
goto out_ret;
}
@@ -156,7 +156,7 @@ static int igt_dmabuf_import_same_driver(struct 
drm_i915_private *i915,
regions, num_regions);
if (IS_ERR(obj)) {
pr_err("__i915_gem_object_create_user failed with err=%ld\n",
-  PTR_ERR(dmabuf));
+  PTR_ERR(obj));
err = PTR_ERR(obj);
goto out_ret;
}
-- 
2.25.1
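
For readers unfamiliar with the error-pointer convention, here is a userspace sketch of why the argument matters. It mimics the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers under lowercase names; the helpers and create_object() are illustrative stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO_SKETCH);
	return (void *)(intptr_t)error;
}

/* Recover the numeric value stored in a pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer falls in the top MAX_ERRNO_SKETCH addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical constructor that can fail with -ENOMEM. */
static void *create_object(int fail)
{
	static int object;

	return fail ? err_ptr(-ENOMEM) : (void *)&object;
}
```

Decoding the pointer that was tested gives the real error; decoding an unrelated pointer (NULL in the test below) gives a number that has nothing to do with the failure.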



Re: [RFC PATCH v3 1/6] drm/doc: Color Management and HDR10 RFC

2021-08-16 Thread Harry Wentland



On 2021-08-16 7:10 a.m., Brian Starkey wrote:
> On Fri, Aug 13, 2021 at 10:42:12AM +0530, Sharma, Shashank wrote:
>> Hello Brian,
>> (+Uma in cc)
>>
>> Thanks for your comments, Let me try to fill-in for Harry to keep the design
>> discussion going. Please find my comments inline.
>>

Thanks, Shashank. I'm back at work now. Had to cut my trip short
due to rising Covid cases and concern for my kids.

>> On 8/2/2021 10:00 PM, Brian Starkey wrote:
>>>
> 
> -- snip --
> 
>>>
>>> Android doesn't blend in linear space, so any API shouldn't be built
>>> around an assumption of linear blending.
>>>

This seems incorrect but I guess ultimately the OS is in control of
this. If we want to allow blending in non-linear space with the new
API we would either need to describe the blending space or the
pre/post-blending gamma/de-gamma.

Any idea if this blending behavior in Android might get changed in
the future?

>>
>> If I am not wrong, we still need linear buffers for accurate Gamut
>> transformation (SRGB -> BT2020 or other way around) isn't it ?
> 
> Yeah, you need to transform the buffer to linear for color gamut
> conversions, but then back to non-linear (probably sRGB or gamma 2.2)
> for actual blending.
> 
> This is why I'd like to have the per-plane "OETF/GAMMA" separate
> from tone-mapping, so that the composition transfer function is
> independent.
> 
>>
> 
> ...
> 
 +
 +Tonemapping in this case could be a simple nits value or `EDR`_ to 
 describe
 +how to scale the :ref:`SDR luminance`.
 +
 +Tonemapping could also include the ability to use a 3D LUT which might be
 +accompanied by a 1D shaper LUT. The shaper LUT is required in order to
 +ensure a 3D LUT with limited entries (e.g. 9x9x9, or 17x17x17) operates
 +in perceptual (non-linear) space, so as to evenly spread the limited
 +entries evenly across the perceived space.
>>>
>>> Some terminology care may be needed here - up until this point, I
>>> think you've been talking about "tonemapping" being luminance
>>> adjustment, whereas I'd expect 3D LUTs to be used for gamut
>>> adjustment.
>>>
>>
>> IMO, what harry wants to say here is that, which HW block gets picked and
>> how tone mapping is achieved can be a very driver/HW specific thing, where
>> one driver can use a 1D/Fixed function block, whereas another one can choose
>> more complex HW like a 3D LUT for the same.
>>
>> DRM layer needs to define only the property to hook the API with core
>> driver, and the driver can decide which HW to pick and configure for the
>> activity. So when we have a tonemapping property, we might not have a
>> separate 3D-LUT property, or the driver may fail the atomic_check() if both
>> of them are programmed for different usages.
> 
> I still think that directly exposing the HW blocks and their
> capabilities is the right approach, rather than a "magic" tonemapping
> property.
> 
> Yes, userspace would need to have a good understanding of how to use
> that hardware, but if the pipeline model is standardised that's the
> kind of thing a cross-vendor library could handle.
> 

One problem with cross-vendor libraries is that they might struggle
to really be cross-vendor when it comes to unique HW behavior. Or
they might pick sub-optimal configurations as they're not aware of
the power impact of a configuration. What's an optimal configuration
might differ greatly between different HW.

We're seeing this problem with "universal" planes as well.

> It would definitely be good to get some compositor opinions here.
> 

For this we'll probably have to wait for Pekka's input when he's
back from his vacation.

> Cheers,
> -Brian
> 



Re: [PATCH 1/2] drm/ttm: ttm_bo_device is now ttm_device

2021-08-16 Thread Christian König

Reviewed and pushed to drm-misc-next-fixes.

Thanks,
Christian.

Am 12.08.21 um 22:34 schrieb Jason Ekstrand:

These names were changed in

commit 8af8a109b34fa88b8b91f25d11485b37d37549c3
Author: Christian König 
Date:   Thu Oct 1 14:51:40 2020 +0200

 drm/ttm: device naming cleanup

But he missed a couple of them.

Signed-off-by: Jason Ekstrand 
Cc: Christian König 
Fixes: 8af8a109b34f ("drm/ttm: device naming cleanup")
---
  Documentation/gpu/drm-mm.rst | 2 +-
  include/drm/ttm/ttm_tt.h | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index d5a73fa2c9ef..8126beadc7df 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -37,7 +37,7 @@ TTM initialization
  This section is outdated.
  
  Drivers wishing to support TTM must pass a filled :c:type:`ttm_bo_driver

-` structure to ttm_bo_device_init, together with an
+` structure to ttm_device_init, together with an
  initialized global reference to the memory manager.  The ttm_bo_driver
  structure contains several fields with function pointers for
  initializing the TTM, allocating and freeing memory, waiting for command
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 818680c6a8ed..0d97967bf955 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -31,7 +31,7 @@
  #include 
  #include 
  
-struct ttm_bo_device;

+struct ttm_device;
  struct ttm_tt;
  struct ttm_resource;
  struct ttm_buffer_object;




Re: [PATCH v2] drm/virtio: support mapping exported vram

2021-08-16 Thread Gerd Hoffmann
On Fri, Aug 13, 2021 at 09:54:41AM +0900, David Stevens wrote:
> Implement virtgpu specific map_dma_buf callback to support mapping
> exported vram object dma-bufs. The dma-buf callback is used directly, as
> vram objects don't have backing pages and thus can't implement the
> drm_gem_object_funcs.get_sg_table callback.
> 
> Signed-off-by: David Stevens 

Pushed to drm-misc-next.

thanks,
  Gerd



Re: [PATCH] drm: radeon: r600_dma: Replace cpu_to_le32() by lower_32_bits()

2021-08-16 Thread Christian König




Am 13.08.21 um 17:03 schrieb Michel Dänzer:

On 2021-08-13 10:54 a.m., zhaoxiao wrote:

This patch fixes the following sparse errors:
drivers/gpu/drm/radeon/r600_dma.c:247:30: warning: incorrect type in assignment 
(different base types)
drivers/gpu/drm/radeon/r600_dma.c:247:30:expected unsigned int volatile 
[usertype]
drivers/gpu/drm/radeon/r600_dma.c:247:30:got restricted __le32 [usertype]

Signed-off-by: zhaoxiao 
---
  drivers/gpu/drm/radeon/r600_dma.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/r600_dma.c 
b/drivers/gpu/drm/radeon/r600_dma.c
index fb65e6fb5c4f..a2d0b1edcd22 100644
--- a/drivers/gpu/drm/radeon/r600_dma.c
+++ b/drivers/gpu/drm/radeon/r600_dma.c
@@ -244,7 +244,7 @@ int r600_dma_ring_test(struct radeon_device *rdev,
gpu_addr = rdev->wb.gpu_addr + index;
  
  	tmp = 0xCAFEDEAD;

-   rdev->wb.wb[index/4] = cpu_to_le32(tmp);
+   rdev->wb.wb[index/4] = lower_32_bits(tmp);
  
  	r = radeon_ring_lock(rdev, ring, 4);

if (r) {


Seems better to mark rdev->wb.wb as little endian instead. It's read with 
le32_to_cpu (with some exceptions which look like bugs), which would result in 
0xADDEFECA like this.


Yeah, that patch doesn't look correct at all and most likely breaks ring 
test on big endian systems.


Christian.
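
A small sketch of the big-endian concern raised above. It simulates a big-endian host, where cpu_to_le32() and le32_to_cpu() both amount to a byte swap; the be_* helpers and lower_32_bits_sketch() are illustrative names, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Unconditional 32-bit byte swap. */
static uint32_t swab32_sketch(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* On a (simulated) big-endian CPU both conversions swap bytes. */
static uint32_t be_cpu_to_le32(uint32_t v) { return swab32_sketch(v); }
static uint32_t be_le32_to_cpu(uint32_t v) { return swab32_sketch(v); }

/* Truncation only: no byte-order conversion takes place. */
static uint32_t lower_32_bits_sketch(uint64_t v) { return (uint32_t)v; }

/* Sanity helper: a matched write/read conversion pair is lossless. */
static void check_roundtrip(uint32_t v)
{
	assert(be_le32_to_cpu(be_cpu_to_le32(v)) == v);
}
```

With a matched pair the reader sees 0xCAFEDEAD again. If the write side only truncates and the read side still converts, the reader sees 0xADDEFECA, so a comparison against the expected value would fail on such a host.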


[PATCH] drm/imx: ipuv3-plane: fix accidental partial revert of 8 pixel alignment fix

2021-08-16 Thread Philipp Zabel
This fixes an accidental partial revert of commit 94dfec48fca7
("drm/imx: Add 8 pixel alignment fix") during a rebase of
commit fc1e985b67f9 ("drm/imx: ipuv3-plane: add color encoding and range
properties").

Fixes: fc1e985b67f9 ("drm/imx: ipuv3-plane: add color encoding and range 
properties")
Signed-off-by: Philipp Zabel 
---
 drivers/gpu/drm/imx/ipuv3-plane.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c 
b/drivers/gpu/drm/imx/ipuv3-plane.c
index 8710f55d2579..bd1f9f0366d3 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -683,7 +683,7 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
break;
}
 
-   ipu_dmfc_config_wait4eot(ipu_plane->dmfc, drm_rect_width(dst));
+   ipu_dmfc_config_wait4eot(ipu_plane->dmfc, ALIGN(drm_rect_width(dst), 
8));
 
width = ipu_src_rect_width(new_state);
height = drm_rect_height(&new_state->src) >> 16;
-- 
2.30.2
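
For reference, a sketch of the rounding that ALIGN(width, 8) performs in the restored line. align_up() and eot_width() are hypothetical helpers using the usual power-of-two round-up arithmetic; why the DMFC needs a multiple of 8 is not covered here.

```c
#include <assert.h>

/* Round x up to the next multiple of a, where a is a power of two. */
static unsigned int align_up(unsigned int x, unsigned int a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Width as it would be handed on after the 8 pixel alignment. */
static unsigned int eot_width(unsigned int dst_width)
{
	return align_up(dst_width, 8);
}
```

A width that is already a multiple of 8 is unchanged; anything else is bumped to the next multiple, e.g. 1366 becomes 1368.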



Re: [PATCH] drm/i915: Ditch the i915_gem_ww_ctx loop member

2021-08-16 Thread Matthew Auld
On Mon, 16 Aug 2021 at 09:49, Thomas Hellström
 wrote:
>
> It's only used by the for_i915_gem_ww() macro and we can use
> the (typically) on-stack _err variable in its place.
>
> While initially setting the _err variable to -EDEADLK to enter the
> loop, we clear it before actually entering using fetch_and_zero() to
> avoid empty loops or code not setting the _err variable running forever.
>
> Suggested-by: Maarten Lankhorst 
> Signed-off-by: Thomas Hellström 
> ---
>  drivers/gpu/drm/i915/i915_gem_ww.h | 23 ---
>  1 file changed, 8 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h 
> b/drivers/gpu/drm/i915/i915_gem_ww.h
> index f6b1a796667b..98348b1e6182 100644
> --- a/drivers/gpu/drm/i915/i915_gem_ww.h
> +++ b/drivers/gpu/drm/i915/i915_gem_ww.h
> @@ -7,12 +7,13 @@
>
>  #include 
>
> +#include "i915_utils.h"
> +
>  struct i915_gem_ww_ctx {
> struct ww_acquire_ctx ctx;
> struct list_head obj_list;
> struct drm_i915_gem_object *contended;
> -   unsigned short intr;
> -   unsigned short loop;
> +   bool intr;
>  };
>
>  void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
> @@ -23,28 +24,20 @@ void i915_gem_ww_unlock_single(struct drm_i915_gem_object 
> *obj);
>  /* Internal functions used by the inlines! Don't use. */
>  static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
>  {
> -   ww->loop = 0;
> if (err == -EDEADLK) {
> err = i915_gem_ww_ctx_backoff(ww);
> if (!err)
> -   ww->loop = 1;
> +   err = -EDEADLK;
> }
>
> -   if (!ww->loop)
> +   if (err != -EDEADLK)
> i915_gem_ww_ctx_fini(ww);
>
> return err;
>  }
>
> -static inline void
> -__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr)
> -{
> -   i915_gem_ww_ctx_init(ww, intr);
> -   ww->loop = 1;
> -}
> -
> -#define for_i915_gem_ww(_ww, _err, _intr)  \
> -   for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;   \
> +#define for_i915_gem_ww(_ww, _err, _intr)\
> +   for (i915_gem_ww_ctx_init(_ww, _intr), (_err) = -EDEADLK; \
> +fetch_and_zero(&_err) == -EDEADLK;   \

Doesn't this now hide "normal" errors, like say get_pages() returning
-ENOSPC or so?

>  _err = __i915_gem_ww_fini(_ww, _err))
> -
>  #endif
> --
> 2.31.1
>


Re: [PATCH] drm/i915: Ditch the i915_gem_ww_ctx loop member

2021-08-16 Thread Thomas Hellström



On 8/16/21 3:25 PM, Matthew Auld wrote:

On Mon, 16 Aug 2021 at 09:49, Thomas Hellström
 wrote:

It's only used by the for_i915_gem_ww() macro and we can use
the (typically) on-stack _err variable in its place.

While initially setting the _err variable to -EDEADLK to enter the
loop, we clear it before actually entering using fetch_and_zero() to
avoid empty loops or code not setting the _err variable running forever.

Suggested-by: Maarten Lankhorst 
Signed-off-by: Thomas Hellström 
---
  drivers/gpu/drm/i915/i915_gem_ww.h | 23 ---
  1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h 
b/drivers/gpu/drm/i915/i915_gem_ww.h
index f6b1a796667b..98348b1e6182 100644
--- a/drivers/gpu/drm/i915/i915_gem_ww.h
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -7,12 +7,13 @@

  #include 

+#include "i915_utils.h"
+
  struct i915_gem_ww_ctx {
 struct ww_acquire_ctx ctx;
 struct list_head obj_list;
 struct drm_i915_gem_object *contended;
-   unsigned short intr;
-   unsigned short loop;
+   bool intr;
  };

  void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
@@ -23,28 +24,20 @@ void i915_gem_ww_unlock_single(struct drm_i915_gem_object 
*obj);
  /* Internal functions used by the inlines! Don't use. */
  static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
  {
-   ww->loop = 0;
 if (err == -EDEADLK) {
 err = i915_gem_ww_ctx_backoff(ww);
 if (!err)
-   ww->loop = 1;
+   err = -EDEADLK;
 }

-   if (!ww->loop)
+   if (err != -EDEADLK)
 i915_gem_ww_ctx_fini(ww);

 return err;
  }

-static inline void
-__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr)
-{
-   i915_gem_ww_ctx_init(ww, intr);
-   ww->loop = 1;
-}
-
-#define for_i915_gem_ww(_ww, _err, _intr)  \
-   for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;   \
+#define for_i915_gem_ww(_ww, _err, _intr)\
+   for (i915_gem_ww_ctx_init(_ww, _intr), (_err) = -EDEADLK; \
+fetch_and_zero(&_err) == -EDEADLK;   \

Doesn't this now hide "normal" errors, like say get_pages() returning
-ENOSPC or so?


Yes, good catch. We should either just clear the -EDEADLK case, or not 
clear the error at all..


/Thomas
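
A userspace sketch of the problem Matthew points out, with the locking stripped away. The loop shape follows the proposed for_i915_gem_ww() macro; fetch_and_zero_int(), ww_fini_sketch() and the run_loop_*() helpers are illustrative stand-ins, and the backoff is assumed to always succeed.

```c
#include <assert.h>
#include <errno.h>

/* Return the old value of *p and set *p to 0. */
static int fetch_and_zero_int(int *p)
{
	int old = *p;

	*p = 0;
	return old;
}

/* Stand-in for __i915_gem_ww_fini() with a backoff that never fails. */
static int ww_fini_sketch(int err)
{
	return err;
}

/* Loop condition clears err: what the patch as posted does. */
static int run_loop_clearing(const int *body_errs, int n)
{
	int err, i = 0;

	assert(n >= 1);
	for (err = -EDEADLK; fetch_and_zero_int(&err) == -EDEADLK;
	     err = ww_fini_sketch(err)) {
		err = body_errs[i < n - 1 ? i : n - 1];
		i++;
	}
	return err;
}

/* Loop condition only tests err and leaves it alone. */
static int run_loop_keeping(const int *body_errs, int n)
{
	int err, i = 0;

	assert(n >= 1);
	for (err = -EDEADLK; err == -EDEADLK; err = ww_fini_sketch(err)) {
		err = body_errs[i < n - 1 ? i : n - 1];
		i++;
	}
	return err;
}
```

With a body that first hits -EDEADLK and then -ENOSPC, the clearing variant leaves 0 in err after the loop, while the non-clearing variant leaves -ENOSPC. The non-clearing variant relies on the body always assigning err, otherwise it never terminates.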





Re: [PATCH] drm/i915: Ditch the i915_gem_ww_ctx loop member

2021-08-16 Thread Maarten Lankhorst
Op 16-08-2021 om 15:30 schreef Thomas Hellström:
>
> On 8/16/21 3:25 PM, Matthew Auld wrote:
>> On Mon, 16 Aug 2021 at 09:49, Thomas Hellström
>>  wrote:
>>> It's only used by the for_i915_gem_ww() macro and we can use
>>> the (typically) on-stack _err variable in its place.
>>>
>>> While initially setting the _err variable to -EDEADLK to enter the
>>> loop, we clear it before actually entering using fetch_and_zero() to
>>> avoid empty loops or code not setting the _err variable running forever.
>>>
>>> Suggested-by: Maarten Lankhorst 
>>> Signed-off-by: Thomas Hellström 
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem_ww.h | 23 ---
>>>   1 file changed, 8 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h 
>>> b/drivers/gpu/drm/i915/i915_gem_ww.h
>>> index f6b1a796667b..98348b1e6182 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_ww.h
>>> +++ b/drivers/gpu/drm/i915/i915_gem_ww.h
>>> @@ -7,12 +7,13 @@
>>>
>>>   #include 
>>>
>>> +#include "i915_utils.h"
>>> +
>>>   struct i915_gem_ww_ctx {
>>>  struct ww_acquire_ctx ctx;
>>>  struct list_head obj_list;
>>>  struct drm_i915_gem_object *contended;
>>> -   unsigned short intr;
>>> -   unsigned short loop;
>>> +   bool intr;
>>>   };
>>>
>>>   void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
>>> @@ -23,28 +24,20 @@ void i915_gem_ww_unlock_single(struct 
>>> drm_i915_gem_object *obj);
>>>   /* Internal functions used by the inlines! Don't use. */
>>>   static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
>>>   {
>>> -   ww->loop = 0;
>>>  if (err == -EDEADLK) {
>>>  err = i915_gem_ww_ctx_backoff(ww);
>>>  if (!err)
>>> -   ww->loop = 1;
>>> +   err = -EDEADLK;
>>>  }
>>>
>>> -   if (!ww->loop)
>>> +   if (err != -EDEADLK)
>>>  i915_gem_ww_ctx_fini(ww);
>>>
>>>  return err;
>>>   }
>>>
>>> -static inline void
>>> -__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr)
>>> -{
>>> -   i915_gem_ww_ctx_init(ww, intr);
>>> -   ww->loop = 1;
>>> -}
>>> -
>>> -#define for_i915_gem_ww(_ww, _err, _intr)  \
>>> -   for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;   \
>>> +#define for_i915_gem_ww(_ww, _err, _intr)    \
>>> +   for (i915_gem_ww_ctx_init(_ww, _intr), (_err) = -EDEADLK; \
>>> +    fetch_and_zero(&_err) == -EDEADLK;   \
>> Doesn't this now hide "normal" errors, like say get_pages() returning
>> -ENOSPC or so?
>
> Yes, good catch. We should either just clear the -EDEADLK case, or not clear 
> the error at all..
>
> /Thomas

I believe not setting _err is a bug anyway. Why would you do such a loop 
without at least one err = ww_mutex_lock(&ww); ?

Infinite loop would catch that at first test.

~Maarten



Re: [RFC PATCH v3 1/6] drm/doc: Color Management and HDR10 RFC

2021-08-16 Thread sebastian

On 2021-08-16 14:40, Harry Wentland wrote:

On 2021-08-16 7:10 a.m., Brian Starkey wrote:

On Fri, Aug 13, 2021 at 10:42:12AM +0530, Sharma, Shashank wrote:

Hello Brian,
(+Uma in cc)

Thanks for your comments, Let me try to fill-in for Harry to keep the 
design

discussion going. Please find my comments inline.



Thanks, Shashank. I'm back at work now. Had to cut my trip short
due to rising Covid cases and concern for my kids.


On 8/2/2021 10:00 PM, Brian Starkey wrote:




-- snip --



Android doesn't blend in linear space, so any API shouldn't be built
around an assumption of linear blending.



This seems incorrect but I guess ultimately the OS is in control of
this. If we want to allow blending in non-linear space with the new
API we would either need to describe the blending space or the
pre/post-blending gamma/de-gamma.

Any idea if this blending behavior in Android might get changed in
the future?


There is lots of software which blends in sRGB space, and designers have
adjusted to the incorrect blending so that the result looks right.
Blending that content in linear space would produce images that look wrong.
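
To make the difference concrete, here is a small numeric sketch. It assumes a pure power-law transfer function with gamma 2.0 instead of the piecewise sRGB curve, purely so the example needs no math library; all function names are hypothetical.

```c
#include <assert.h>

/* Newton iteration for the square root of x in [0, 1]. */
static double sqrt_newton(double x)
{
	double r = 1.0;
	int i;

	if (x <= 0.0)
		return 0.0;
	for (i = 0; i < 40; i++)
		r = 0.5 * (r + x / r);
	return r;
}

/* Assumed gamma 2.0: encoded value -> linear light and back. */
static double decode(double v) { return v * v; }
static double encode(double l) { return sqrt_newton(l); }

/* Blend the encoded values directly (non-linear blending). */
static double blend_nonlinear(double a, double b, double alpha)
{
	assert(alpha >= 0.0 && alpha <= 1.0);
	return a * alpha + b * (1.0 - alpha);
}

/* Decode, blend in linear light, then re-encode. */
static double blend_linear(double a, double b, double alpha)
{
	assert(alpha >= 0.0 && alpha <= 1.0);
	return encode(decode(a) * alpha + decode(b) * (1.0 - alpha));
}
```

A 50% blend of white over black gives an encoded 0.5 when blended non-linearly and about 0.707 when blended in linear light, so content tuned for one blending space looks visibly different in the other.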



If I am not wrong, we still need linear buffers for accurate Gamut
transformation (SRGB -> BT2020 or other way around) isn't it ?


Yeah, you need to transform the buffer to linear for color gamut
conversions, but then back to non-linear (probably sRGB or gamma 2.2)
for actual blending.

This is why I'd like to have the per-plane "OETF/GAMMA" separate
from tone-mapping, so that the composition transfer function is
independent.





...


+
+Tonemapping in this case could be a simple nits value or `EDR`_ to 
describe

+how to scale the :ref:`SDR luminance`.
+
+Tonemapping could also include the ability to use a 3D LUT which 
might be
+accompanied by a 1D shaper LUT. The shaper LUT is required in 
order to
+ensure a 3D LUT with limited entries (e.g. 9x9x9, or 17x17x17) 
operates
+in perceptual (non-linear) space, so as to evenly spread the 
limited

+entries evenly across the perceived space.


Some terminology care may be needed here - up until this point, I
think you've been talking about "tonemapping" being luminance
adjustment, whereas I'd expect 3D LUTs to be used for gamut
adjustment.



IMO, what harry wants to say here is that, which HW block gets picked 
and
how tone mapping is achieved can be a very driver/HW specific thing, 
where
one driver can use a 1D/Fixed function block, whereas another one can 
choose

more complex HW like a 3D LUT for the same.

DRM layer needs to define only the property to hook the API with core
driver, and the driver can decide which HW to pick and configure for 
the

activity. So when we have a tonemapping property, we might not have a
separate 3D-LUT property, or the driver may fail the atomic_check() 
if both

of them are programmed for different usages.


I still think that directly exposing the HW blocks and their
capabilities is the right approach, rather than a "magic" tonemapping
property.

Yes, userspace would need to have a good understanding of how to use
that hardware, but if the pipeline model is standardised that's the
kind of thing a cross-vendor library could handle.



One problem with cross-vendor libraries is that they might struggle
to really be cross-vendor when it comes to unique HW behavior. Or
they might pick sub-optimal configurations as they're not aware of
the power impact of a configuration. What's an optimal configuration
might differ greatly between different HW.

We're seeing this problem with "universal" planes as well.


I'm repeating what has been said before but apparently it has to be said
again: if a property can't be replicated exactly in a shader the
property is useless. If your hardware is so unique that it can't give us
the exact formula we expect you cannot expose the property.

Maybe my view on power consumption is simplistic but I would expect enum
< 1d lut < 3d lut < shader. Is there more to it?

Either way if the fixed KMS pixel pipeline is not sufficient to expose
the intricacies of real hardware the right move would be to make the KMS
pixel pipeline more dynamic, expose more hardware specifics and create a
hardware specific user space like mesa. Moving the whole compositing
with all its policies and decision making into the kernel is exactly the
wrong way to go.

Laurent Pinchart put this very well:
https://lists.freedesktop.org/archives/dri-devel/2021-June/311689.html


It would definitely be good to get some compositor opinions here.



For this we'll probably have to wait for Pekka's input when he's
back from his vacation.


Cheers,
-Brian



Re: [PATCH] drm/i915: Ditch the i915_gem_ww_ctx loop member

2021-08-16 Thread Thomas Hellström



On 8/16/21 3:34 PM, Maarten Lankhorst wrote:

Op 16-08-2021 om 15:30 schreef Thomas Hellström:

On 8/16/21 3:25 PM, Matthew Auld wrote:

On Mon, 16 Aug 2021 at 09:49, Thomas Hellström
 wrote:

It's only used by the for_i915_gem_ww() macro and we can use
the (typically) on-stack _err variable in its place.

While initially setting the _err variable to -EDEADLK to enter the
loop, we clear it before actually entering using fetch_and_zero() to
avoid empty loops or code not setting the _err variable running forever.

Suggested-by: Maarten Lankhorst 
Signed-off-by: Thomas Hellström 
---
   drivers/gpu/drm/i915/i915_gem_ww.h | 23 ---
   1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h 
b/drivers/gpu/drm/i915/i915_gem_ww.h
index f6b1a796667b..98348b1e6182 100644
--- a/drivers/gpu/drm/i915/i915_gem_ww.h
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -7,12 +7,13 @@

   #include 

+#include "i915_utils.h"
+
   struct i915_gem_ww_ctx {
  struct ww_acquire_ctx ctx;
  struct list_head obj_list;
  struct drm_i915_gem_object *contended;
-   unsigned short intr;
-   unsigned short loop;
+   bool intr;
   };

   void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
@@ -23,28 +24,20 @@ void i915_gem_ww_unlock_single(struct drm_i915_gem_object 
*obj);
   /* Internal functions used by the inlines! Don't use. */
   static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
   {
-   ww->loop = 0;
  if (err == -EDEADLK) {
  err = i915_gem_ww_ctx_backoff(ww);
  if (!err)
-   ww->loop = 1;
+   err = -EDEADLK;
  }

-   if (!ww->loop)
+   if (err != -EDEADLK)
  i915_gem_ww_ctx_fini(ww);

  return err;
   }

-static inline void
-__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr)
-{
-   i915_gem_ww_ctx_init(ww, intr);
-   ww->loop = 1;
-}
-
-#define for_i915_gem_ww(_ww, _err, _intr)  \
-   for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;   \
+#define for_i915_gem_ww(_ww, _err, _intr)    \
+   for (i915_gem_ww_ctx_init(_ww, _intr), (_err) = -EDEADLK; \
+    fetch_and_zero(&_err) == -EDEADLK;   \

Doesn't this now hide "normal" errors, like say get_pages() returning
-ENOSPC or so?

Yes, good catch. We should either just clear the -EDEADLK case, or not clear 
the error at all..

/Thomas

I believe not setting _err is a bug anyway. Why would you do such a loop without at 
least one err = ww_mutex_lock(&ww); ?

Infinite loop would catch that at first test.


OK, I'll skip the clearing then.

/Thomas




~Maarten



[PATCH 01/22] drm/i915/guc: Fix blocked context accounting

2021-08-16 Thread Matthew Brost
Prior to this patch the blocked context counter was cleared on
init_sched_state (used during registering a context & resets) which is
incorrect. This state needs to be persistent or the counter can read the
incorrect value resulting in scheduling never getting enabled again.

Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
Signed-off-by: Matthew Brost 
Reviewed-by: Daniel Vetter 
Cc: 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 87d8dc8f51b9..69faa39da178 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -152,7 +152,7 @@ static inline void init_sched_state(struct intel_context 
*ce)
 {
/* Only should be called from guc_lrc_desc_pin() */
atomic_set(&ce->guc_sched_state_no_lock, 0);
-   ce->guc_state.sched_state = 0;
+   ce->guc_state.sched_state &= SCHED_STATE_BLOCKED_MASK;
 }
 
 static inline bool
-- 
2.32.0
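
A sketch of the idea behind the one-line change: re-initialising the state word with a mask keeps the blocked counter and drops everything else. The bit layout below is hypothetical and is not the actual i915 definition of SCHED_STATE_BLOCKED_MASK.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: two flag bits, then a 12-bit blocked counter. */
#define SKETCH_FLAG_ENABLED	(1u << 0)
#define SKETCH_FLAG_PENDING	(1u << 1)
#define SKETCH_BLOCKED_SHIFT	4
#define SKETCH_BLOCKED_MASK	(0xfffu << SKETCH_BLOCKED_SHIFT)

/* Build a state word from flag bits and a blocked count. */
static uint32_t make_state(uint32_t flags, uint32_t blocked)
{
	assert(blocked <= 0xfffu);
	return flags | (blocked << SKETCH_BLOCKED_SHIFT);
}

/* Old behaviour: wipe the whole word, counter included. */
static uint32_t init_state_clearing(uint32_t state)
{
	(void)state;
	return 0;
}

/* New behaviour: clear the flags, keep the counter bits. */
static uint32_t init_state_preserving(uint32_t state)
{
	return state & SKETCH_BLOCKED_MASK;
}

/* Extract the blocked counter from a state word. */
static uint32_t blocked_count(uint32_t state)
{
	return (state & SKETCH_BLOCKED_MASK) >> SKETCH_BLOCKED_SHIFT;
}
```

After re-initialisation the masked variant still reports the blocked count that was there before, while the flag bits are reset in both variants.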



[PATCH 00/22] Clean up GuC CI failures, simplify locking, and kernel DOC

2021-08-16 Thread Matthew Brost
Daniel Vetter pointed out that locking in the GuC submission code was
overly complicated, let's clean this up a bit before introducing more
features in the GuC submission backend.

Also fix some CI failures, port fixes from our internal tree, and add a
few more selftests for coverage.

Lastly, add some kernel DOC explaining how the GuC submission backend
works.

v2: Fix logic error in 'Workaround reset G2H is received after schedule
done G2H', don't propagate errors to dependent fences in execlists
submission, resolve checkpatch issues, resend to correct lists

Signed-off-by: Matthew Brost 

Matthew Brost (22):
  drm/i915/guc: Fix blocked context accounting
  drm/i915/guc: Fix outstanding G2H accounting
  drm/i915/guc: Unwind context requests in reverse order
  drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context
  drm/i915/guc: Workaround reset G2H is received after schedule done G2H
  drm/i915/execlists: Do not propagate errors to dependent fences
  drm/i915/selftests: Add a cancel request selftest that triggers a
reset
  drm/i915/guc: Don't enable scheduling on a banned context, guc_id
invalid, not registered
  drm/i915/selftests: Fix memory corruption in live_lrc_isolation
  drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
  drm/i915/guc: Take context ref when cancelling request
  drm/i915/guc: Don't touch guc_state.sched_state without a lock
  drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
  drm/i915: Allocate error capture in atomic context
  drm/i915/guc: Flush G2H work queue during reset
  drm/i915/guc: Release submit fence from an IRQ
  drm/i915/guc: Move guc_blocked fence to struct guc_state
  drm/i915/guc: Rework and simplify locking
  drm/i915/guc: Proper xarray usage for contexts_lookup
  drm/i915/guc: Drop pin count check trick between sched_disable and
re-pin
  drm/i915/guc: Move GuC priority fields in context under guc_active
  drm/i915/guc: Add GuC kernel doc

 drivers/gpu/drm/i915/gt/intel_context.c   |   5 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  68 +-
 .../drm/i915/gt/intel_execlists_submission.c  |   4 -
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  29 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  19 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 690 +++---
 drivers/gpu/drm/i915/gt/uc/selftest_guc.c | 126 
 drivers/gpu/drm/i915/i915_gpu_error.c |  37 +-
 drivers/gpu/drm/i915/i915_request.h   |  23 +-
 drivers/gpu/drm/i915/i915_trace.h |   8 +-
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 drivers/gpu/drm/i915/selftests/i915_request.c | 100 +++
 .../i915/selftests/intel_scheduler_helpers.c  |  12 +
 .../i915/selftests/intel_scheduler_helpers.h  |   2 +
 14 files changed, 813 insertions(+), 311 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc.c

-- 
2.32.0



[PATCH 02/22] drm/i915/guc: Fix outstanding G2H accounting

2021-08-16 Thread Matthew Brost
Fix a small race that could result in incorrect accounting of the number
of outstanding G2H. Prior to this patch we did not increment the number
of outstanding G2H if we encountered a GT reset while sending a H2G.
This was incorrect, as the context state had already been updated to
anticipate a G2H response, so the counter should be incremented.

Fixes: f4eb1f3fe946 ("drm/i915/guc: Ensure G2H response has space in buffer")
Signed-off-by: Matthew Brost 
Cc: 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 69faa39da178..b5d3972ae164 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -360,11 +360,13 @@ static int guc_submission_send_busy_loop(struct intel_guc 
*guc,
 {
int err;
 
-   err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
-
-   if (!err && g2h_len_dw)
+   if (g2h_len_dw)
atomic_inc(&guc->outstanding_submission_g2h);
 
+   err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
+   if (err == -EBUSY && g2h_len_dw)
+   atomic_dec(&guc->outstanding_submission_g2h);
+
return err;
 }
 
-- 
2.32.0
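
The reordering in this patch can be sketched in userspace as follows. The counter is bumped before the send and rolled back only for -EBUSY, mirroring the diff. send_sketch() is a hypothetical transport that returns whatever result it is told to, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Number of G2H responses we still expect. */
static atomic_int outstanding_g2h;

/* Hypothetical transport: returns the result it is given. */
static int send_sketch(int result)
{
	return result;
}

/*
 * Account for the expected response first, send, and undo the
 * accounting only when the send reports -EBUSY.
 */
static int submission_send(int send_result, unsigned int g2h_len_dw)
{
	int err;

	if (g2h_len_dw)
		atomic_fetch_add(&outstanding_g2h, 1);

	err = send_sketch(send_result);
	if (err == -EBUSY && g2h_len_dw)
		atomic_fetch_sub(&outstanding_g2h, 1);

	assert(atomic_load(&outstanding_g2h) >= 0);
	return err;
}
```

Any error other than -EBUSY (-EIO is used as an arbitrary example in the test) leaves the counter incremented, which matches the patch keeping the increment for every result except -EBUSY.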



[PATCH 03/22] drm/i915/guc: Unwind context requests in reverse order

2021-08-16 Thread Matthew Brost
When unwinding requests on a reset context, if other requests in the
context are in the priority list the requests could be resubmitted out
of seqno order. Traverse the list of active requests in reverse and
append to the head of the priority list to fix this.

Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
Signed-off-by: Matthew Brost 
Cc: 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b5d3972ae164..bc51caba50d0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -799,9 +799,9 @@ __unwind_incomplete_requests(struct intel_context *ce)
 
spin_lock_irqsave(&sched_engine->lock, flags);
spin_lock(&ce->guc_active.lock);
-   list_for_each_entry_safe(rq, rn,
-&ce->guc_active.requests,
-sched.link) {
+   list_for_each_entry_safe_reverse(rq, rn,
+&ce->guc_active.requests,
+sched.link) {
if (i915_request_completed(rq))
continue;
 
@@ -818,7 +818,7 @@ __unwind_incomplete_requests(struct intel_context *ce)
}
GEM_BUG_ON(i915_sched_engine_is_empty(sched_engine));
 
-   list_add_tail(&rq->sched.link, pl);
+   list_add(&rq->sched.link, pl);
set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 
spin_lock(&ce->guc_active.lock);
-- 
2.32.0
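
To see why a reverse walk combined with insertion at the head keeps seqno
order, here is a small self-contained sketch. It is a simplified model under
stated assumptions: a single priority list, a hand-rolled doubly linked list
instead of the kernel's list.h, and hypothetical names (struct req,
unwind_reverse, demo_unwind).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a request linked on a list, ordered by seqno. */
struct req {
	unsigned int seqno;
	struct req *prev, *next;
};

static void list_init(struct req *head)
{
	head->prev = head->next = head;
}

static void list_unlink(struct req *r)
{
	r->prev->next = r->next;
	r->next->prev = r->prev;
	r->prev = r->next = r;
}

/* Insert r directly after head, i.e. at the front (like list_add()). */
static void list_push_front(struct req *r, struct req *head)
{
	r->next = head->next;
	r->prev = head;
	head->next->prev = r;
	head->next = r;
}

/* Insert r directly before head, i.e. at the back (like list_add_tail()). */
static void list_push_back(struct req *r, struct req *head)
{
	r->prev = head->prev;
	r->next = head;
	head->prev->next = r;
	head->prev = r;
}

/* Walk 'active' from newest to oldest, moving each entry to the front of 'pl'. */
static void unwind_reverse(struct req *active, struct req *pl)
{
	struct req *r = active->prev;

	while (r != active) {
		struct req *prev = r->prev; /* saved before unlinking, as a _safe iterator would */

		list_unlink(r);
		list_push_front(r, pl);
		r = prev;
	}
}

/* Returns 1 if seqnos on the list are strictly ascending, 0 otherwise. */
static int list_is_ascending(const struct req *head)
{
	const struct req *r;

	for (r = head->next; r != head && r->next != head; r = r->next)
		if (r->seqno >= r->next->seqno)
			return 0;
	return 1;
}

/* Seqnos 1..3 are active, seqno 4 is already queued on the priority list. */
static int demo_unwind(void)
{
	struct req active, pl, r[4];
	unsigned int i;

	list_init(&active);
	list_init(&pl);
	for (i = 0; i < 4; i++)
		r[i].seqno = i + 1;

	list_push_back(&r[0], &active);
	list_push_back(&r[1], &active);
	list_push_back(&r[2], &active);
	list_push_back(&r[3], &pl);

	unwind_reverse(&active, &pl);

	assert(active.next == &active); /* active list fully drained */
	return list_is_ascending(&pl) && pl.next == &r[0];
}
```

With a forward walk and tail insertion, the same starting state would leave
seqno 4 ahead of seqnos 1 to 3 on the priority list, which is the
out-of-order resubmission the commit message describes.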



[PATCH 11/22] drm/i915/guc: Take context ref when cancelling request

2021-08-16 Thread Matthew Brost
A context can get destroyed after cancelling a request, so take a
reference to the context when cancelling a request.

Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index bffd0199dc15..89126be26786 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1613,8 +1613,10 @@ static void guc_context_cancel_request(struct 
intel_context *ce,
   struct i915_request *rq)
 {
if (i915_sw_fence_signaled(&rq->submit)) {
-   struct i915_sw_fence *fence = guc_context_block(ce);
+   struct i915_sw_fence *fence;
 
+   intel_context_get(ce);
+   fence = guc_context_block(ce);
i915_sw_fence_wait(fence);
if (!i915_request_completed(rq)) {
__i915_request_skip(rq);
@@ -1629,6 +1631,7 @@ static void guc_context_cancel_request(struct 
intel_context *ce,
flush_work(&ce_to_guc(ce)->ct.requests.worker);
 
guc_context_unblock(ce);
+   intel_context_put(ce);
}
 }
 
-- 
2.32.0



[PATCH 06/22] drm/i915/execlists: Do not propagate errors to dependent fences

2021-08-16 Thread Matthew Brost
Propagating errors to dependent fences is wrong, don't do it. The selftest
in the following patch exposes this bug.

Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to children on unhold")
Signed-off-by: Matthew Brost 
Cc: 
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index de5f9c86b9a4..cafb0608ffb4 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq)
if (p->flags & I915_DEPENDENCY_WEAK)
continue;
 
-   /* Propagate any change in error status */
-   if (rq->fence.error)
-   i915_request_set_error_once(w, rq->fence.error);
-
if (w->engine != rq->engine)
continue;
 
-- 
2.32.0



[PATCH 09/22] drm/i915/selftests: Fix memory corruption in live_lrc_isolation

2021-08-16 Thread Matthew Brost
GuC submission has exposed an existing memory corruption in
live_lrc_isolation. We believe that some writes to the watchdog offsets
in the LRC (0x178 & 0x17c) can result in trashing of portions of the
address space. With GuC submission there are additional objects which
can move the context redzone into the space that is trashed. To
work around this, avoid poisoning the watchdog.

v2:
 (Daniel Vetter)
  - Add VLK ref in code to workaround

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 29 +-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index b0977a3b699b..cdc6ae48a1e1 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -1074,6 +1074,32 @@ record_registers(struct intel_context *ce,
goto err_after;
 }
 
+static u32 safe_offset(u32 offset, u32 reg)
+{
+   /* XXX skip testing of watchdog - VLK-22772 */
+   if (offset == 0x178 || offset == 0x17c)
+   reg = 0;
+
+   return reg;
+}
+
+static int get_offset_mask(struct intel_engine_cs *engine)
+{
+   if (GRAPHICS_VER(engine->i915) < 12)
+   return 0xfff;
+
+   switch (engine->class) {
+   default:
+   case RENDER_CLASS:
+   return 0x07ff;
+   case COPY_ENGINE_CLASS:
+   return 0x0fff;
+   case VIDEO_DECODE_CLASS:
+   case VIDEO_ENHANCEMENT_CLASS:
+   return 0x3fff;
+   }
+}
+
 static struct i915_vma *load_context(struct intel_context *ce, u32 poison)
 {
struct i915_vma *batch;
@@ -1117,7 +1143,8 @@ static struct i915_vma *load_context(struct intel_context 
*ce, u32 poison)
len = (len + 1) / 2;
*cs++ = MI_LOAD_REGISTER_IMM(len);
while (len--) {
-   *cs++ = hw[dw];
+   *cs++ = safe_offset(hw[dw] & 
get_offset_mask(ce->engine),
+   hw[dw]);
*cs++ = poison;
dw += 2;
}
-- 
2.32.0
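
The masking and filtering step added to load_context() can be tried in
isolation. The sketch below copies the safe_offset() logic from the patch
and wraps it in a hypothetical helper, poison_target(); the register
addresses used in the usage note are made-up values chosen only so that the
masked offsets land on and off the watchdog offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Same logic as the patch: watchdog offsets are replaced with register 0. */
static uint32_t safe_offset(uint32_t offset, uint32_t reg)
{
	if (offset == 0x178 || offset == 0x17c)
		reg = 0;

	return reg;
}

/*
 * Hypothetical helper: given a register address taken from the context
 * image and the per-engine offset mask, return the register address that
 * would actually be emitted for poisoning.
 */
static uint32_t poison_target(uint32_t reg, uint32_t offset_mask)
{
	return safe_offset(reg & offset_mask, reg);
}
```

For example, with the 0x07ff mask an address of 0x2178 masks down to 0x178
and is replaced by 0, while 0x2244 masks down to 0x244 and is passed through
unchanged.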



[PATCH 12/22] drm/i915/guc: Don't touch guc_state.sched_state without a lock

2021-08-16 Thread Matthew Brost
Previously we used some clever tricks to avoid taking the lock when
touching guc_state.sched_state in certain cases. Don't do that; enforce
the use of the lock.

Part of this is removing a dead code path from guc_lrc_desc_pin where a
context could be deregistered when the aforementioned function was
called from the submission path. Remove this dead code and add a
GEM_BUG_ON if this path is ever attempted to be used.

Signed-off-by: Matthew Brost 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 57 ++-
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 89126be26786..8d45585773f3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -150,11 +150,22 @@ static inline void clr_context_registered(struct 
intel_context *ce)
 #define SCHED_STATE_BLOCKED_MASK   (0xfff << SCHED_STATE_BLOCKED_SHIFT)
 static inline void init_sched_state(struct intel_context *ce)
 {
-   /* Only should be called from guc_lrc_desc_pin() */
+   lockdep_assert_held(&ce->guc_state.lock);
atomic_set(&ce->guc_sched_state_no_lock, 0);
ce->guc_state.sched_state &= SCHED_STATE_BLOCKED_MASK;
 }
 
+static inline bool sched_state_is_init(struct intel_context *ce)
+{
+   /*
+* XXX: Kernel contexts can have SCHED_STATE_NO_LOCK_REGISTERED after
+* suspend.
+*/
+   return !(atomic_read(&ce->guc_sched_state_no_lock) &
+~SCHED_STATE_NO_LOCK_REGISTERED) &&
+   !(ce->guc_state.sched_state & ~SCHED_STATE_BLOCKED_MASK);
+}
+
 static inline bool
 context_wait_for_deregister_to_register(struct intel_context *ce)
 {
@@ -165,7 +176,7 @@ context_wait_for_deregister_to_register(struct 
intel_context *ce)
 static inline void
 set_context_wait_for_deregister_to_register(struct intel_context *ce)
 {
-   /* Only should be called from guc_lrc_desc_pin() without lock */
+   lockdep_assert_held(&ce->guc_state.lock);
ce->guc_state.sched_state |=
SCHED_STATE_WAIT_FOR_DEREGISTER_TO_REGISTER;
 }
@@ -599,9 +610,7 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
intel_guc *guc)
bool pending_disable, pending_enable, deregister, destroyed, banned;
 
xa_for_each(&guc->context_lookup, index, ce) {
-   /* Flush context */
spin_lock_irqsave(&ce->guc_state.lock, flags);
-   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
 
/*
 * Once we are at this point submission_disabled() is guaranteed
@@ -617,6 +626,8 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
intel_guc *guc)
banned = context_banned(ce);
init_sched_state(ce);
 
+   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
if (pending_enable || destroyed || deregister) {
atomic_dec(&guc->outstanding_submission_g2h);
if (deregister)
@@ -1318,6 +1329,7 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
bool loop)
int ret = 0;
 
GEM_BUG_ON(!engine->mask);
+   GEM_BUG_ON(!sched_state_is_init(ce));
 
/*
 * Ensure LRC + CT vmas are is same region as write barrier is done
@@ -1346,7 +1358,6 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
bool loop)
desc->priority = ce->guc_prio;
desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD;
guc_context_policy_init(engine, desc);
-   init_sched_state(ce);
 
/*
 * The context_lookup xarray is used to determine if the hardware
@@ -1357,26 +1368,23 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
bool loop)
 * registering this context.
 */
if (context_registered) {
+   bool disabled;
+   unsigned long flags;
+
trace_intel_context_steal_guc_id(ce);
-   if (!loop) {
+   GEM_BUG_ON(!loop);
+
+   /* Seal race with Reset */
+   spin_lock_irqsave(&ce->guc_state.lock, flags);
+   disabled = submission_disabled(guc);
+   if (likely(!disabled)) {
set_context_wait_for_deregister_to_register(ce);
intel_context_get(ce);
-   } else {
-   bool disabled;
-   unsigned long flags;
-
-   /* Seal race with Reset */
-   spin_lock_irqsave(&ce->guc_state.lock, flags);
-   disabled = submission_disabled(guc);
-   if (likely(!disabled)) {
-   set_context_wait_for_deregister_to_register(ce);
-   intel_context_get(ce);
-   }
-   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
-   

[PATCH 16/22] drm/i915/guc: Release submit fence from an IRQ

2021-08-16 Thread Matthew Brost
A subsequent patch will flip the locking hierarchy from
ce->guc_state.lock -> sched_engine->lock to sched_engine->lock ->
ce->guc_state.lock. As such we need to release the submit fence for a
request from an IRQ to break a lock inversion - i.e. the fence must be
released when holding ce->guc_state.lock and the releasing of the fence
can acquire sched_engine->lock.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 ++-
 drivers/gpu/drm/i915/i915_request.h   |  5 +
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 8c560ed14976..9ae4633aa7cb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2017,6 +2017,14 @@ static const struct intel_context_ops guc_context_ops = {
.create_virtual = guc_create_virtual,
 };
 
+static void submit_work_cb(struct irq_work *wrk)
+{
+   struct i915_request *rq = container_of(wrk, typeof(*rq), submit_work);
+
+   might_lock(&rq->engine->sched_engine->lock);
+   i915_sw_fence_complete(&rq->submit);
+}
+
 static void __guc_signal_context_fence(struct intel_context *ce)
 {
struct i915_request *rq;
@@ -2026,8 +2034,12 @@ static void __guc_signal_context_fence(struct 
intel_context *ce)
if (!list_empty(&ce->guc_state.fences))
trace_intel_context_fence_release(ce);
 
+   /*
+* Use an IRQ to ensure locking order of sched_engine->lock ->
+* ce->guc_state.lock is preserved.
+*/
list_for_each_entry(rq, &ce->guc_state.fences, guc_fence_link)
-   i915_sw_fence_complete(&rq->submit);
+   irq_work_queue(&rq->submit_work);
 
INIT_LIST_HEAD(&ce->guc_state.fences);
 }
@@ -2137,6 +2149,7 @@ static int guc_request_alloc(struct i915_request *rq)
spin_lock_irqsave(&ce->guc_state.lock, flags);
if (context_wait_for_deregister_to_register(ce) ||
context_pending_disable(ce)) {
+   init_irq_work(&rq->submit_work, submit_work_cb);
i915_sw_fence_await(&rq->submit);
 
list_add_tail(&rq->guc_fence_link, &ce->guc_state.fences);
diff --git a/drivers/gpu/drm/i915/i915_request.h 
b/drivers/gpu/drm/i915/i915_request.h
index 1bc1349ba3c2..d818cfbfc41d 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -218,6 +218,11 @@ struct i915_request {
};
struct llist_head execute_cb;
struct i915_sw_fence semaphore;
+   /**
+* @submit_work: complete submit fence from an IRQ if needed for
+* locking hierarchy reasons.
+*/
+   struct irq_work submit_work;
 
/*
 * A list of everyone we wait upon, and everyone who waits upon us.
-- 
2.32.0
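
The deferral idea in this patch (queue the completion while one lock is
held, run it later when that lock is no longer held) can be illustrated
with a single-threaded userspace model. This is only a sketch under stated
assumptions: there is no real IRQ context or spinlock here, a plain flag
stands in for ce->guc_state.lock, and every name (struct work, work_queue,
work_run_all, signal_fences) is hypothetical rather than the kernel's
irq_work API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 8

/* Hypothetical deferred-work item, loosely analogous to struct irq_work. */
struct work {
	void (*fn)(struct work *w);
};

static struct work *pending[MAX_WORK];
static size_t npending;

static int state_lock_held; /* models holding ce->guc_state.lock */
static int completed;       /* number of fences completed so far */

/* Queue a work item for later; returns -1 if the queue is full. */
static int work_queue(struct work *w)
{
	if (npending == MAX_WORK)
		return -1;
	pending[npending++] = w;
	return 0;
}

/* Run all queued items; models the deferred context, outside the state lock. */
static void work_run_all(void)
{
	size_t i;

	assert(!state_lock_held);
	for (i = 0; i < npending; i++)
		pending[i]->fn(pending[i]);
	npending = 0;
}

/*
 * Completion callback. In the real driver this is where the other lock
 * (sched_engine->lock) may be taken, so the state lock must not be held.
 */
static void complete_cb(struct work *w)
{
	(void)w;
	assert(!state_lock_held);
	completed++;
}

/* Models signalling the fences: only queue work while the state lock is held. */
static void signal_fences(struct work *works, size_t n)
{
	size_t i;

	state_lock_held = 1;
	for (i = 0; i < n; i++)
		work_queue(&works[i]);
	state_lock_held = 0;
}
```

In this model nothing completes inside signal_fences(); the callbacks run
only from work_run_all(), after the state lock flag has been cleared, which
is the ordering property the patch is after.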



[PATCH 08/22] drm/i915/guc: Don't enable scheduling on a banned context, guc_id invalid, not registered

2021-08-16 Thread Matthew Brost
When unblocking a context, do not enable scheduling if the context is
banned, has an invalid guc_id, or is not registered.

Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
Signed-off-by: Matthew Brost 
Cc: 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index c3b7bf7319dd..353899634fa8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1579,6 +1579,9 @@ static void guc_context_unblock(struct intel_context *ce)
spin_lock_irqsave(&ce->guc_state.lock, flags);
 
if (unlikely(submission_disabled(guc) ||
+intel_context_is_banned(ce) ||
+context_guc_id_invalid(ce) ||
+!lrc_desc_registered(guc, ce->guc_id) ||
 !intel_context_is_pinned(ce) ||
 context_pending_disable(ce) ||
 context_blocked(ce) > 1)) {
-- 
2.32.0



[PATCH 07/22] drm/i915/selftests: Add a cancel request selftest that triggers a reset

2021-08-16 Thread Matthew Brost
Add a cancel request selftest that results in an engine reset to cancel
the request as it is non-preemptable. Also insert a NOP request after
the cancelled request and confirm that it completes successfully.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/selftests/i915_request.c | 100 ++
 1 file changed, 100 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c 
b/drivers/gpu/drm/i915/selftests/i915_request.c
index d67710d10615..e2c5db77f087 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -772,6 +772,98 @@ static int __cancel_completed(struct intel_engine_cs 
*engine)
return err;
 }
 
+static int __cancel_reset(struct intel_engine_cs *engine)
+{
+   struct intel_context *ce;
+   struct igt_spinner spin;
+   struct i915_request *rq, *nop;
+   unsigned long preempt_timeout_ms;
+   int err = 0;
+
+   preempt_timeout_ms = engine->props.preempt_timeout_ms;
+   engine->props.preempt_timeout_ms = 100;
+
+   if (igt_spinner_init(&spin, engine->gt))
+   goto out_restore;
+
+   ce = intel_context_create(engine);
+   if (IS_ERR(ce)) {
+   err = PTR_ERR(ce);
+   goto out_spin;
+   }
+
+   rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
+   if (IS_ERR(rq)) {
+   err = PTR_ERR(rq);
+   goto out_ce;
+   }
+
+   pr_debug("%s: Cancelling active request\n", engine->name);
+   i915_request_get(rq);
+   i915_request_add(rq);
+   if (!igt_wait_for_spinner(&spin, rq)) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("Failed to start spinner on %s\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_rq;
+   }
+
+   nop = intel_context_create_request(ce);
+   if (IS_ERR(nop))
+   goto out_nop;
+   i915_request_get(nop);
+   i915_request_add(nop);
+
+   i915_request_cancel(rq, -EINTR);
+
+   if (i915_request_wait(rq, 0, HZ) < 0) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("%s: Failed to cancel hung request\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_nop;
+   }
+
+   if (rq->fence.error != -EINTR) {
+   pr_err("%s: fence not cancelled (%u)\n",
+  engine->name, rq->fence.error);
+   err = -EINVAL;
+   goto out_nop;
+   }
+
+   if (i915_request_wait(nop, 0, HZ) < 0) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("%s: Failed to complete nop request\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_nop;
+   }
+
+   if (nop->fence.error != 0) {
+   pr_err("%s: Nop request errored (%u)\n",
+  engine->name, nop->fence.error);
+   err = -EINVAL;
+   }
+
+out_nop:
+   i915_request_put(nop);
+out_rq:
+   i915_request_put(rq);
+out_ce:
+   intel_context_put(ce);
+out_spin:
+   igt_spinner_fini(&spin);
+out_restore:
+   engine->props.preempt_timeout_ms = preempt_timeout_ms;
+   if (err)
+   pr_err("%s: %s error %d\n", __func__, engine->name, err);
+   return err;
+}
+
 static int live_cancel_request(void *arg)
 {
struct drm_i915_private *i915 = arg;
@@ -804,6 +896,14 @@ static int live_cancel_request(void *arg)
return err;
if (err2)
return err2;
+
+   /* Expects reset so call outside of igt_live_test_* */
+   err = __cancel_reset(engine);
+   if (err)
+   return err;
+
+   if (igt_flush_test(i915))
+   return -EIO;
}
 
return 0;
-- 
2.32.0



[PATCH 10/22] drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H

2021-08-16 Thread Matthew Brost
While debugging an issue with full GT resets I went down a rabbit hole
thinking the scrubbing of lost G2H wasn't working correctly. This proved
to be incorrect as this was working just fine but this chase inspired me
to write a selftest to prove that this works. This simple selftest
injects errors dropping various G2H and then issues a full GT reset
proving that the scrubbing of these G2H doesn't blow up.

v2:
 (Daniel Vetter)
  - Use ifdef instead of macros for selftests
v3:
 (Checkpatch)
  - A space after 'switch' statement

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  18 +++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  25 
 drivers/gpu/drm/i915/gt/uc/selftest_guc.c | 126 ++
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 .../i915/selftests/intel_scheduler_helpers.c  |  12 ++
 .../i915/selftests/intel_scheduler_helpers.h  |   2 +
 6 files changed, 184 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc.c

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index e54351a170e2..3a73f3117873 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -198,6 +198,24 @@ struct intel_context {
 */
u8 guc_prio;
u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
+
+#ifdef CONFIG_DRM_I915_SELFTEST
+   /**
+* @drop_schedule_enable: Force drop of schedule enable G2H for selftest
+*/
+   bool drop_schedule_enable;
+
+   /**
+* @drop_schedule_disable: Force drop of schedule disable G2H for
+* selftest
+*/
+   bool drop_schedule_disable;
+
+   /**
+* @drop_deregister: Force drop of deregister G2H for selftest
+*/
+   bool drop_deregister;
+#endif
 };
 
 #endif /* __INTEL_CONTEXT_TYPES__ */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 353899634fa8..bffd0199dc15 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2634,6 +2634,13 @@ int intel_guc_deregister_done_process_msg(struct 
intel_guc *guc,
 
trace_intel_context_deregister_done(ce);
 
+#ifdef CONFIG_DRM_I915_SELFTEST
+   if (unlikely(ce->drop_deregister)) {
+   ce->drop_deregister = false;
+   return 0;
+   }
+#endif
+
if (context_wait_for_deregister_to_register(ce)) {
struct intel_runtime_pm *runtime_pm =
&ce->engine->gt->i915->runtime_pm;
@@ -2688,10 +2695,24 @@ int intel_guc_sched_done_process_msg(struct intel_guc 
*guc,
trace_intel_context_sched_done(ce);
 
if (context_pending_enable(ce)) {
+#ifdef CONFIG_DRM_I915_SELFTEST
+   if (unlikely(ce->drop_schedule_enable)) {
+   ce->drop_schedule_enable = false;
+   return 0;
+   }
+#endif
+
clr_context_pending_enable(ce);
} else if (context_pending_disable(ce)) {
bool banned;
 
+#ifdef CONFIG_DRM_I915_SELFTEST
+   if (unlikely(ce->drop_schedule_disable)) {
+   ce->drop_schedule_disable = false;
+   return 0;
+   }
+#endif
+
/*
 * Unpin must be done before __guc_signal_context_fence,
 * otherwise a race exists between the requests getting
@@ -3068,3 +3089,7 @@ bool intel_guc_virtual_engine_has_heartbeat(const struct 
intel_engine_cs *ve)
 
return false;
 }
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftest_guc.c"
+#endif
diff --git a/drivers/gpu/drm/i915/gt/uc/selftest_guc.c 
b/drivers/gpu/drm/i915/gt/uc/selftest_guc.c
new file mode 100644
index ..264e2f705c17
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/selftest_guc.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include "selftests/intel_scheduler_helpers.h"
+
+static struct i915_request *nop_user_request(struct intel_context *ce,
+struct i915_request *from)
+{
+   struct i915_request *rq;
+   int ret;
+
+   rq = intel_context_create_request(ce);
+   if (IS_ERR(rq))
+   return rq;
+
+   if (from) {
+   ret = i915_sw_fence_await_dma_fence(&rq->submit,
+   &from->fence, 0,
+   I915_FENCE_GFP);
+   if (ret < 0) {
+   i915_request_put(rq);
+   return ERR_PTR(ret);
+   }
+   }
+
+   i915_request_get(rq);
+   i915_request_add(rq);
+
+   return rq;
+}
+
+static int intel_guc_scrub_ctbs(void *arg)
+{
+   struct intel_gt *gt = arg;
+   int ret = 0;
+   i

[PATCH 20/22] drm/i915/guc: Drop pin count check trick between sched_disable and re-pin

2021-08-16 Thread Matthew Brost
Drop the pin count check trick between a sched_disable and re-pin; instead
rely on the lock and a counter of the number of committed requests to
determine if scheduling should be disabled on the context.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  2 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 49 ---
 2 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index d5d643b04d54..524a35a78bf4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -169,6 +169,8 @@ struct intel_context {
struct list_head fences;
/* GuC context blocked fence */
struct i915_sw_fence blocked_fence;
+   /* GuC committed requests */
+   int number_committed_requests;
} guc_state;
 
struct {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 2ecb2f002bed..c6ae6b4417c2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -247,6 +247,25 @@ static inline void decr_context_blocked(struct 
intel_context *ce)
ce->guc_state.sched_state -= SCHED_STATE_BLOCKED;
 }
 
+static inline bool context_has_committed_requests(struct intel_context *ce)
+{
+   return !!ce->guc_state.number_committed_requests;
+}
+
+static inline void incr_context_committed_requests(struct intel_context *ce)
+{
+   lockdep_assert_held(&ce->guc_state.lock);
+   ++ce->guc_state.number_committed_requests;
+   GEM_BUG_ON(ce->guc_state.number_committed_requests < 0);
+}
+
+static inline void decr_context_committed_requests(struct intel_context *ce)
+{
+   lockdep_assert_held(&ce->guc_state.lock);
+   --ce->guc_state.number_committed_requests;
+   GEM_BUG_ON(ce->guc_state.number_committed_requests < 0);
+}
+
 static inline bool context_guc_id_invalid(struct intel_context *ce)
 {
return ce->guc_id == GUC_INVALID_LRC_ID;
@@ -1736,14 +1755,11 @@ static void guc_context_sched_disable(struct 
intel_context *ce)
spin_lock_irqsave(&ce->guc_state.lock, flags);
 
/*
-* We have to check if the context has been disabled by another thread.
-* We also have to check if the context has been pinned again as another
-* pin operation is allowed to pass this function. Checking the pin
-* count, within ce->guc_state.lock, synchronizes this function with
-* guc_request_alloc ensuring a request doesn't slip through the
-* 'context_pending_disable' fence. Checking within the spin lock (can't
-* sleep) ensures another process doesn't pin this context and generate
-* a request before we set the 'context_pending_disable' flag here.
+* We have to check if the context has been disabled by another thread,
+* check if submission has been disabled to seal a race with reset and
+* finally check if any more requests have been committed to the
+* context ensuring that a request doesn't slip through the
+* 'context_pending_disable' fence.
 */
enabled = context_enabled(ce);
if (unlikely(!enabled || submission_disabled(guc))) {
@@ -1752,7 +1768,8 @@ static void guc_context_sched_disable(struct 
intel_context *ce)
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
goto unpin;
}
-   if (unlikely(atomic_add_unless(&ce->pin_count, -2, 2))) {
+   if (unlikely(context_has_committed_requests(ce))) {
+   intel_context_sched_disable_unpin(ce);
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
return;
}
@@ -1785,6 +1802,7 @@ static void __guc_context_destroy(struct intel_context 
*ce)
   ce->guc_prio_count[GUC_CLIENT_PRIORITY_HIGH] ||
   ce->guc_prio_count[GUC_CLIENT_PRIORITY_KMD_NORMAL] ||
   ce->guc_prio_count[GUC_CLIENT_PRIORITY_NORMAL]);
+   GEM_BUG_ON(ce->guc_state.number_committed_requests);
 
lrc_fini(ce);
intel_context_fini(ce);
@@ -2015,6 +2033,10 @@ static void remove_from_context(struct i915_request *rq)
 
spin_unlock_irq(&ce->guc_active.lock);
 
+   spin_lock_irq(&ce->guc_state.lock);
+   decr_context_committed_requests(ce);
+   spin_unlock_irq(&ce->guc_state.lock);
+
atomic_dec(&ce->guc_id_ref);
i915_request_notify_execute_cb_imm(rq);
 }
@@ -2162,15 +2184,7 @@ static int guc_request_alloc(struct i915_request *rq)
 * schedule enable or context registration if either G2H is pending
 * respectfully. Once a G2H returns, the fence is released that is
 * blocking these requests (see guc_signal_context_fence).
-*
-* We can safely check the below fields outs

[PATCH 22/22] drm/i915/guc: Add GuC kernel doc

2021-08-16 Thread Matthew Brost
Add GuC kernel doc for all structures added thus far for GuC submission
and update the main GuC submission section with the new interface
details.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  42 +---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  19 +++-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 101 ++
 drivers/gpu/drm/i915/i915_request.h   |  18 ++--
 4 files changed, 131 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index f6989e6807f7..75d609a1bc33 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -156,44 +156,56 @@ struct intel_context {
u8 wa_bb_page; /* if set, page num reserved for context workarounds */
 
struct {
-   /** lock: protects everything in guc_state */
+   /** @lock: protects everything in guc_state */
spinlock_t lock;
/**
-* sched_state: scheduling state of this context using GuC
+* @sched_state: scheduling state of this context using GuC
 * submission
 */
u32 sched_state;
/*
-* fences: maintains of list of requests that have a submit
-* fence related to GuC submission
+* @fences: maintains a list of requests are currently being
+* fenced until a GuC operation completes
 */
struct list_head fences;
-   /* GuC context blocked fence */
+   /**
+* @blocked_fence: fence used to signal when the blocking of a
+* contexts submissions is complete.
+*/
struct i915_sw_fence blocked_fence;
-   /* GuC committed requests */
+   /** @number_committed_requests: number of committed requests */
int number_committed_requests;
} guc_state;
 
struct {
-   /** lock: protects everything in guc_active */
+   /** @lock: protects everything in guc_active */
spinlock_t lock;
-   /** requests: active requests on this context */
+   /** @requests: list of active requests on this context */
struct list_head requests;
-   /*
-* GuC priority management
-*/
+   /** @guc_prio: the contexts current guc priority */
u8 guc_prio;
+   /**
+* @guc_prio_count: a counter of the number requests inflight in
+* each priority bucket
+*/
u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
} guc_active;
 
-   /* GuC LRC descriptor ID */
+   /**
+* @guc_id: unique handle which is used to communicate information with
+* the GuC about this context, protected by guc->contexts_lock
+*/
u16 guc_id;
 
-   /* GuC LRC descriptor reference count */
+   /**
+* @guc_id_ref: the number of references to the guc_id, when
+* transitioning in and out of zero protected by guc->contexts_lock
+*/
atomic_t guc_id_ref;
 
-   /*
-* GuC ID link - in list when unpinned but guc_id still valid in GuC
+   /**
+* @guc_id_link: in guc->guc_id_list when the guc_id has no refs but is
+* still valid, protected by guc->contexts_lock
 */
struct list_head guc_id_link;
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 2e27fe59786b..c0b3fdb601f0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -41,6 +41,10 @@ struct intel_guc {
spinlock_t irq_lock;
unsigned int msg_enabled_mask;
 
+   /**
+* @outstanding_submission_g2h: number of outstanding G2H related to GuC
+* submission, used to determine if the GT is idle
+*/
atomic_t outstanding_submission_g2h;
 
struct {
@@ -49,12 +53,16 @@ struct intel_guc {
void (*disable)(struct intel_guc *guc);
} interrupts;
 
-   /*
-* contexts_lock protects the pool of free guc ids and a linked list of
-* guc ids available to be stolen
+   /**
+* @contexts_lock: protects guc_ids, guc_id_list, ce->guc_id, and
+* ce->guc_id_ref when transitioning in and out of zero
 */
spinlock_t contexts_lock;
+   /** @guc_ids: used to allocate new guc_ids */
struct ida guc_ids;
+   /**
+* @guc_id_list: list of intel_context with valid guc_ids but no refs
+*/
struct list_head guc_id_list;
 
bool submission_supported;
@@ -70,7 +78,10 @@ struct intel_guc {
struct i915_vma *lrc_desc_pool

[PATCH 17/22] drm/i915/guc: Move guc_blocked fence to struct guc_state

2021-08-16 Thread Matthew Brost
Move guc_blocked fence to struct guc_state as the lock which protects
the fence lives there.

s/ce->guc_blocked/ce->guc_state.blocked_fence/g

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_context.c|  5 +++--
 drivers/gpu/drm/i915/gt/intel_context_types.h  |  5 ++---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 +-
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 745e84c72c90..0e48939ec85f 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -405,8 +405,9 @@ intel_context_init(struct intel_context *ce, struct 
intel_engine_cs *engine)
 * Initialize fence to be complete as this is expected to be complete
 * unless there is a pending schedule disable outstanding.
 */
-   i915_sw_fence_init(&ce->guc_blocked, sw_fence_dummy_notify);
-   i915_sw_fence_commit(&ce->guc_blocked);
+   i915_sw_fence_init(&ce->guc_state.blocked_fence,
+  sw_fence_dummy_notify);
+   i915_sw_fence_commit(&ce->guc_state.blocked_fence);
 
i915_active_init(&ce->active,
 __intel_context_active, __intel_context_retire, 0);
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 3a73f3117873..c06171ee8792 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -167,6 +167,8 @@ struct intel_context {
 * fence related to GuC submission
 */
struct list_head fences;
+   /* GuC context blocked fence */
+   struct i915_sw_fence blocked_fence;
} guc_state;
 
struct {
@@ -190,9 +192,6 @@ struct intel_context {
 */
struct list_head guc_id_link;
 
-   /* GuC context blocked fence */
-   struct i915_sw_fence guc_blocked;
-
/*
 * GuC priority management
 */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 9ae4633aa7cb..7aa16371908a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1482,24 +1482,24 @@ static void guc_blocked_fence_complete(struct intel_context *ce)
 {
lockdep_assert_held(&ce->guc_state.lock);
 
-   if (!i915_sw_fence_done(&ce->guc_blocked))
-   i915_sw_fence_complete(&ce->guc_blocked);
+   if (!i915_sw_fence_done(&ce->guc_state.blocked_fence))
+   i915_sw_fence_complete(&ce->guc_state.blocked_fence);
 }
 
 static void guc_blocked_fence_reinit(struct intel_context *ce)
 {
lockdep_assert_held(&ce->guc_state.lock);
-   GEM_BUG_ON(!i915_sw_fence_done(&ce->guc_blocked));
+   GEM_BUG_ON(!i915_sw_fence_done(&ce->guc_state.blocked_fence));
 
/*
 * This fence is always complete unless a pending schedule disable is
 * outstanding. We arm the fence here and complete it when we receive
 * the pending schedule disable complete message.
 */
-   i915_sw_fence_fini(&ce->guc_blocked);
-   i915_sw_fence_reinit(&ce->guc_blocked);
-   i915_sw_fence_await(&ce->guc_blocked);
-   i915_sw_fence_commit(&ce->guc_blocked);
+   i915_sw_fence_fini(&ce->guc_state.blocked_fence);
+   i915_sw_fence_reinit(&ce->guc_state.blocked_fence);
+   i915_sw_fence_await(&ce->guc_state.blocked_fence);
+   i915_sw_fence_commit(&ce->guc_state.blocked_fence);
 }
 
 static u16 prep_context_pending_disable(struct intel_context *ce)
@@ -1539,7 +1539,7 @@ static struct i915_sw_fence *guc_context_block(struct intel_context *ce)
if (enabled)
clr_context_enabled(ce);
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
-   return &ce->guc_blocked;
+   return &ce->guc_state.blocked_fence;
}
 
/*
@@ -1555,7 +1555,7 @@ static struct i915_sw_fence *guc_context_block(struct intel_context *ce)
with_intel_runtime_pm(runtime_pm, wakeref)
__guc_context_sched_disable(guc, ce, guc_id);
 
-   return &ce->guc_blocked;
+   return &ce->guc_state.blocked_fence;
 }
 
 static void guc_context_unblock(struct intel_context *ce)
-- 
2.32.0



[PATCH 05/22] drm/i915/guc: Workaround reset G2H is received after schedule done G2H

2021-08-16 Thread Matthew Brost
If the context is reset as a result of request cancellation, the context
reset G2H is received after the schedule disable done G2H, which is
likely the wrong order. The schedule disable done G2H releases the
waiting request cancellation code, which resubmits the context. This races
with the context reset G2H, which also wants to resubmit the context, but
in this case it really should be a NOP as the request cancellation code
owns the resubmit. Check the context state (pending enable, blocked) to
seal this race until the GuC firmware is fixed, if it ever is.

v2:
 (Checkpatch)
  - Fix typos

Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
Signed-off-by: Matthew Brost 
Cc: 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 43 ---
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 3cd2da6f5c03..c3b7bf7319dd 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -826,17 +826,35 @@ __unwind_incomplete_requests(struct intel_context *ce)
 static void __guc_reset_context(struct intel_context *ce, bool stalled)
 {
struct i915_request *rq;
+   unsigned long flags;
u32 head;
+   bool skip = false;
 
intel_context_get(ce);
 
/*
-* GuC will implicitly mark the context as non-schedulable
-* when it sends the reset notification. Make sure our state
-* reflects this change. The context will be marked enabled
-* on resubmission.
+* GuC will implicitly mark the context as non-schedulable when it sends
+* the reset notification. Make sure our state reflects this change. The
+* context will be marked enabled on resubmission.
+*
+* XXX: If the context is reset as a result of the request cancellation
+* this G2H is received after the schedule disable complete G2H which is
+* likely wrong as this creates a race between the request cancellation
+* code re-submitting the context and this G2H handler. This likely
+* should be fixed in the GuC but until if / when that gets fixed we
+* need to workaround this. Convert this function to a NOP if a pending
+* enable is in flight as this indicates that a request cancellation has
+* occurred.
 */
-   clr_context_enabled(ce);
+   spin_lock_irqsave(&ce->guc_state.lock, flags);
+   if (likely(!context_pending_enable(ce))) {
+   clr_context_enabled(ce);
+   } else {
+   skip = true;
+   }
+   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+   if (unlikely(skip))
+   goto out_put;
 
rq = intel_context_find_active_request(ce);
if (!rq) {
@@ -855,6 +873,7 @@ static void __guc_reset_context(struct intel_context *ce, bool stalled)
 out_replay:
guc_reset_state(ce, head, stalled);
__unwind_incomplete_requests(ce);
+out_put:
intel_context_put(ce);
 }
 
@@ -1599,6 +1618,13 @@ static void guc_context_cancel_request(struct intel_context *ce,
guc_reset_state(ce, intel_ring_wrap(ce->ring, rq->head),
true);
}
+
+   /*
+* XXX: Racey if context is reset, see comment in
+* __guc_reset_context().
+*/
+   flush_work(&ce_to_guc(ce)->ct.requests.worker);
+
guc_context_unblock(ce);
}
 }
@@ -2719,7 +2745,12 @@ static void guc_handle_context_reset(struct intel_guc *guc,
 {
trace_intel_context_reset(ce);
 
-   if (likely(!intel_context_is_banned(ce))) {
+   /*
+* XXX: Racey if request cancellation has occurred, see comment in
+* __guc_reset_context().
+*/
+   if (likely(!intel_context_is_banned(ce) &&
+  !context_blocked(ce))) {
capture_error_state(guc, ce);
guc_context_replay(ce);
}
-- 
2.32.0
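
For readers following the workaround above, the check added to
__guc_reset_context() can be sketched in userspace roughly as follows. This
is only an illustration under stated assumptions, not the driver code:
toy_ctx, toy_lock(), toy_unlock() and toy_reset_should_proceed() are
hypothetical names, the flag values are arbitrary, and a C11 atomic_flag
spinlock stands in for ce->guc_state.lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; the values are illustrative, not the driver's. */
#define TOY_ENABLED        (1u << 0)
#define TOY_PENDING_ENABLE (1u << 1)

/* Hypothetical stand-in for the parts of struct intel_context used here. */
struct toy_ctx {
	atomic_flag lock;    /* stands in for ce->guc_state.lock */
	unsigned int state;  /* scheduling state bits, protected by lock */
};

static void toy_lock(struct toy_ctx *c)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void toy_unlock(struct toy_ctx *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Returns true if the reset handler should go on to replay the context,
 * false if a pending enable shows that request cancellation already owns
 * the resubmit and the handler should be a NOP.
 */
static bool toy_reset_should_proceed(struct toy_ctx *c)
{
	bool skip = false;

	assert(c);

	toy_lock(c);
	if (!(c->state & TOY_PENDING_ENABLE))
		c->state &= ~TOY_ENABLED;  /* mirror GuC marking it non-schedulable */
	else
		skip = true;               /* leave the state untouched */
	toy_unlock(c);

	return !skip;
}
```

The point of the sketch is that the decision and the state update happen
under the same lock, so the handler either clears the enabled bit or skips
entirely; it never acts on a half-observed state.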



[PATCH 19/22] drm/i915/guc: Proper xarray usage for contexts_lookup

2021-08-16 Thread Matthew Brost
Lock the xarray and take a reference to the context where needed.

v2:
 (Checkpatch)
  - Add new line after declaration

Signed-off-by: Matthew Brost 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 84 ---
 1 file changed, 73 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index ba19b99173fc..2ecb2f002bed 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -599,8 +599,18 @@ static void scrub_guc_desc_for_outstanding_g2h(struct intel_guc *guc)
unsigned long index, flags;
bool pending_disable, pending_enable, deregister, destroyed, banned;
 
+   xa_lock_irqsave(&guc->context_lookup, flags);
xa_for_each(&guc->context_lookup, index, ce) {
-   spin_lock_irqsave(&ce->guc_state.lock, flags);
+   /*
+* Corner case where the ref count on the object is zero but and
+* deregister G2H was lost. In this case we don't touch the ref
+* count and finish the destroy of the context.
+*/
+   bool do_put = kref_get_unless_zero(&ce->ref);
+
+   xa_unlock(&guc->context_lookup);
+
+   spin_lock(&ce->guc_state.lock);
 
/*
 * Once we are at this point submission_disabled() is guaranteed
@@ -616,7 +626,9 @@ static void scrub_guc_desc_for_outstanding_g2h(struct intel_guc *guc)
banned = context_banned(ce);
init_sched_state(ce);
 
-   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+   spin_unlock(&ce->guc_state.lock);
+
+   GEM_BUG_ON(!do_put && !destroyed);
 
if (pending_enable || destroyed || deregister) {
atomic_dec(&guc->outstanding_submission_g2h);
@@ -645,7 +657,12 @@ static void scrub_guc_desc_for_outstanding_g2h(struct intel_guc *guc)
 
intel_context_put(ce);
}
+
+   if (do_put)
+   intel_context_put(ce);
+   xa_lock(&guc->context_lookup);
}
+   xa_unlock_irqrestore(&guc->context_lookup, flags);
 }
 
 static inline bool
@@ -866,16 +883,26 @@ void intel_guc_submission_reset(struct intel_guc *guc, bool stalled)
 {
struct intel_context *ce;
unsigned long index;
+   unsigned long flags;
 
if (unlikely(!guc_submission_initialized(guc))) {
/* Reset called during driver load? GuC not yet initialised! */
return;
}
 
-   xa_for_each(&guc->context_lookup, index, ce)
+   xa_lock_irqsave(&guc->context_lookup, flags);
+   xa_for_each(&guc->context_lookup, index, ce) {
+   intel_context_get(ce);
+   xa_unlock(&guc->context_lookup);
+
if (intel_context_is_pinned(ce))
__guc_reset_context(ce, stalled);
 
+   intel_context_put(ce);
+   xa_lock(&guc->context_lookup);
+   }
+   xa_unlock_irqrestore(&guc->context_lookup, flags);
+
/* GuC is blown away, drop all references to contexts */
xa_destroy(&guc->context_lookup);
 }
@@ -950,11 +977,21 @@ void intel_guc_submission_cancel_requests(struct intel_guc *guc)
 {
struct intel_context *ce;
unsigned long index;
+   unsigned long flags;
+
+   xa_lock_irqsave(&guc->context_lookup, flags);
+   xa_for_each(&guc->context_lookup, index, ce) {
+   intel_context_get(ce);
+   xa_unlock(&guc->context_lookup);
 
-   xa_for_each(&guc->context_lookup, index, ce)
if (intel_context_is_pinned(ce))
guc_cancel_context_requests(ce);
 
+   intel_context_put(ce);
+   xa_lock(&guc->context_lookup);
+   }
+   xa_unlock_irqrestore(&guc->context_lookup, flags);
+
guc_cancel_sched_engine_requests(guc->sched_engine);
 
/* GuC is blown away, drop all references to contexts */
@@ -2848,21 +2885,26 @@ void intel_guc_find_hung_context(struct intel_engine_cs *engine)
struct intel_context *ce;
struct i915_request *rq;
unsigned long index;
+   unsigned long flags;
 
/* Reset called during driver load? GuC not yet initialised! */
if (unlikely(!guc_submission_initialized(guc)))
return;
 
+   xa_lock_irqsave(&guc->context_lookup, flags);
xa_for_each(&guc->context_lookup, index, ce) {
+   intel_context_get(ce);
+   xa_unlock(&guc->context_lookup);
+
if (!intel_context_is_pinned(ce))
-   continue;
+   goto next;
 
if (intel_engine_is_virtual(ce->engine)) {
if (!(ce->engine->mask & engine->mask))
-   continue;
+
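
The loops in this patch all follow one pattern: hold the lookup lock while
walking, pin the entry with a reference, drop the lock for the real work,
then drop the reference and retake the lock. A minimal userspace sketch of
that pattern, assuming a fixed-size table instead of an xarray, might look
like the code below. Every name here (toy_obj, toy_table, toy_for_each() and
friends) is hypothetical, and a C11 atomic_flag spinlock stands in for the
xarray lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 4

/* Hypothetical refcounted object standing in for struct intel_context. */
struct toy_obj {
	atomic_int refcount;
	int visits;
};

/* Hypothetical lookup table standing in for guc->context_lookup. */
struct toy_table {
	atomic_flag lock;                  /* stands in for the xarray lock */
	struct toy_obj *slots[TOY_SLOTS];  /* NULL means empty slot */
};

static void toy_table_lock(struct toy_table *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;
}

static void toy_table_unlock(struct toy_table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Take a reference unless the count has already dropped to zero. */
static bool toy_get_unless_zero(struct toy_obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* A real put would free the object at zero; the sketch only decrements. */
static void toy_put(struct toy_obj *o)
{
	atomic_fetch_sub(&o->refcount, 1);
}

/* Example callback: just records that the object was visited. */
static void toy_mark_visited(struct toy_obj *o)
{
	o->visits++;
}

/* Returns the number of objects the callback ran on. */
static int toy_for_each(struct toy_table *t, void (*fn)(struct toy_obj *))
{
	int visited = 0;

	assert(t && fn);

	toy_table_lock(t);
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		struct toy_obj *o = t->slots[i];

		/* Simplification: objects already at zero refs are skipped. */
		if (!o || !toy_get_unless_zero(o))
			continue;

		toy_table_unlock(t);   /* fn() may take other locks */
		fn(o);
		visited++;
		toy_put(o);
		toy_table_lock(t);     /* retake before touching the table again */
	}
	toy_table_unlock(t);

	return visited;
}
```

Note that the sketch skips entries whose count is already zero, whereas
scrub_guc_desc_for_outstanding_g2h() in the patch still processes such a
context and simply avoids touching its reference count.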

[PATCH 21/22] drm/i915/guc: Move GuC priority fields in context under guc_active

2021-08-16 Thread Matthew Brost
Move GuC priority management fields in context under guc_active struct as
this is where the lock that protects these fields lives. Also only set the
guc_prio field once during context init.

Fixes: ee242ca704d3 ("drm/i915/guc: Implement GuC priority management")
Signed-off-by: Matthew Brost 
Cc: 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h | 12 ++--
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 68 +++
 drivers/gpu/drm/i915/i915_trace.h |  2 +-
 3 files changed, 45 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 524a35a78bf4..f6989e6807f7 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -112,6 +112,7 @@ struct intel_context {
 #define CONTEXT_FORCE_SINGLE_SUBMISSION7
 #define CONTEXT_NOPREEMPT  8
 #define CONTEXT_LRCA_DIRTY 9
+#define CONTEXT_GUC_INIT   10
 
struct {
u64 timeout_us;
@@ -178,6 +179,11 @@ struct intel_context {
spinlock_t lock;
/** requests: active requests on this context */
struct list_head requests;
+   /*
+* GuC priority management
+*/
+   u8 guc_prio;
+   u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
} guc_active;
 
/* GuC LRC descriptor ID */
@@ -191,12 +197,6 @@ struct intel_context {
 */
struct list_head guc_id_link;
 
-   /*
-* GuC priority management
-*/
-   u8 guc_prio;
-   u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
-
 #ifdef CONFIG_DRM_I915_SELFTEST
/**
 * @drop_schedule_enable: Force drop of schedule enable G2H for selftest
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index c6ae6b4417c2..eb06a4c7534e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1354,8 +1354,6 @@ static void guc_context_policy_init(struct intel_engine_cs *engine,
desc->preemption_timeout = engine->props.preempt_timeout_ms * 1000;
 }
 
-static inline u8 map_i915_prio_to_guc_prio(int prio);
-
 static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)
 {
struct intel_engine_cs *engine = ce->engine;
@@ -1363,8 +1361,6 @@ static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)
struct intel_guc *guc = &engine->gt->uc.guc;
u32 desc_idx = ce->guc_id;
struct guc_lrc_desc *desc;
-   const struct i915_gem_context *ctx;
-   int prio = I915_CONTEXT_DEFAULT_PRIORITY;
bool context_registered;
intel_wakeref_t wakeref;
int ret = 0;
@@ -1381,12 +1377,6 @@ static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)
 
context_registered = lrc_desc_registered(guc, desc_idx);
 
-   rcu_read_lock();
-   ctx = rcu_dereference(ce->gem_context);
-   if (ctx)
-   prio = ctx->sched.priority;
-   rcu_read_unlock();
-
reset_lrc_desc(guc, desc_idx);
set_lrc_desc_registered(guc, desc_idx, ce);
 
@@ -1395,8 +1385,7 @@ static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)
desc->engine_submit_mask = adjust_engine_mask(engine->class,
  engine->mask);
desc->hw_context_desc = ce->lrc.lrca;
-   ce->guc_prio = map_i915_prio_to_guc_prio(prio);
-   desc->priority = ce->guc_prio;
+   desc->priority = ce->guc_active.guc_prio;
desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD;
guc_context_policy_init(engine, desc);
 
@@ -1798,10 +1787,10 @@ static inline void guc_lrc_desc_unpin(struct intel_context *ce)
 
 static void __guc_context_destroy(struct intel_context *ce)
 {
-   GEM_BUG_ON(ce->guc_prio_count[GUC_CLIENT_PRIORITY_KMD_HIGH] ||
-  ce->guc_prio_count[GUC_CLIENT_PRIORITY_HIGH] ||
-  ce->guc_prio_count[GUC_CLIENT_PRIORITY_KMD_NORMAL] ||
-  ce->guc_prio_count[GUC_CLIENT_PRIORITY_NORMAL]);
+   GEM_BUG_ON(ce->guc_active.guc_prio_count[GUC_CLIENT_PRIORITY_KMD_HIGH] ||
+  ce->guc_active.guc_prio_count[GUC_CLIENT_PRIORITY_HIGH] ||
+  ce->guc_active.guc_prio_count[GUC_CLIENT_PRIORITY_KMD_NORMAL] ||
+  ce->guc_active.guc_prio_count[GUC_CLIENT_PRIORITY_NORMAL]);
GEM_BUG_ON(ce->guc_state.number_committed_requests);
 
lrc_fini(ce);
@@ -1911,14 +1900,17 @@ static void guc_context_set_prio(struct intel_guc *guc,
 
GEM_BUG_ON(prio < GUC_CLIENT_PRIORITY_KMD_HIGH ||
   prio > GUC_CLIENT_PRIORITY_NORMAL);
+   lockdep_assert_held(&ce->guc_active.lock);
 
-   if (ce->guc_prio == prio || submission_disabled(guc) ||
-   !context_registered(ce))
+   if (ce->guc_active.guc_

[PATCH 18/22] drm/i915/guc: Rework and simplify locking

2021-08-16 Thread Matthew Brost
Rework and simplify the locking with GuC submission. Drop
sched_state_no_lock, move all fields under guc_state.sched_state, and
protect all these fields with guc_state.lock. This requires changing the
locking hierarchy from guc_state.lock -> sched_engine.lock to
sched_engine.lock -> guc_state.lock.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |   5 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 186 --
 drivers/gpu/drm/i915/i915_trace.h |   6 +-
 3 files changed, 89 insertions(+), 108 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index c06171ee8792..d5d643b04d54 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -161,7 +161,7 @@ struct intel_context {
 * sched_state: scheduling state of this context using GuC
 * submission
 */
-   u16 sched_state;
+   u32 sched_state;
/*
 * fences: maintains of list of requests that have a submit
 * fence related to GuC submission
@@ -178,9 +178,6 @@ struct intel_context {
struct list_head requests;
} guc_active;
 
-   /* GuC scheduling state flags that do not require a lock. */
-   atomic_t guc_sched_state_no_lock;
-
/* GuC LRC descriptor ID */
u16 guc_id;
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7aa16371908a..ba19b99173fc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -72,86 +72,23 @@ guc_create_virtual(struct intel_engine_cs **siblings, unsigned int count);
 
 #define GUC_REQUEST_SIZE 64 /* bytes */
 
-/*
- * Below is a set of functions which control the GuC scheduling state which do
- * not require a lock as all state transitions are mutually exclusive. i.e. It
- * is not possible for the context pinning code and submission, for the same
- * context, to be executing simultaneously. We still need an atomic as it is
- * possible for some of the bits to changing at the same time though.
- */
-#define SCHED_STATE_NO_LOCK_ENABLEDBIT(0)
-#define SCHED_STATE_NO_LOCK_PENDING_ENABLE BIT(1)
-#define SCHED_STATE_NO_LOCK_REGISTERED BIT(2)
-static inline bool context_enabled(struct intel_context *ce)
-{
-   return (atomic_read(&ce->guc_sched_state_no_lock) &
-   SCHED_STATE_NO_LOCK_ENABLED);
-}
-
-static inline void set_context_enabled(struct intel_context *ce)
-{
-   atomic_or(SCHED_STATE_NO_LOCK_ENABLED, &ce->guc_sched_state_no_lock);
-}
-
-static inline void clr_context_enabled(struct intel_context *ce)
-{
-   atomic_and((u32)~SCHED_STATE_NO_LOCK_ENABLED,
-  &ce->guc_sched_state_no_lock);
-}
-
-static inline bool context_pending_enable(struct intel_context *ce)
-{
-   return (atomic_read(&ce->guc_sched_state_no_lock) &
-   SCHED_STATE_NO_LOCK_PENDING_ENABLE);
-}
-
-static inline void set_context_pending_enable(struct intel_context *ce)
-{
-   atomic_or(SCHED_STATE_NO_LOCK_PENDING_ENABLE,
- &ce->guc_sched_state_no_lock);
-}
-
-static inline void clr_context_pending_enable(struct intel_context *ce)
-{
-   atomic_and((u32)~SCHED_STATE_NO_LOCK_PENDING_ENABLE,
-  &ce->guc_sched_state_no_lock);
-}
-
-static inline bool context_registered(struct intel_context *ce)
-{
-   return (atomic_read(&ce->guc_sched_state_no_lock) &
-   SCHED_STATE_NO_LOCK_REGISTERED);
-}
-
-static inline void set_context_registered(struct intel_context *ce)
-{
-   atomic_or(SCHED_STATE_NO_LOCK_REGISTERED,
- &ce->guc_sched_state_no_lock);
-}
-
-static inline void clr_context_registered(struct intel_context *ce)
-{
-   atomic_and((u32)~SCHED_STATE_NO_LOCK_REGISTERED,
-  &ce->guc_sched_state_no_lock);
-}
-
 /*
  * Below is a set of functions which control the GuC scheduling state which
- * require a lock, aside from the special case where the functions are called
- * from guc_lrc_desc_pin(). In that case it isn't possible for any other code
- * path to be executing on the context.
+ * require a lock.
  */
 #define SCHED_STATE_WAIT_FOR_DEREGISTER_TO_REGISTERBIT(0)
 #define SCHED_STATE_DESTROYED  BIT(1)
 #define SCHED_STATE_PENDING_DISABLEBIT(2)
 #define SCHED_STATE_BANNED BIT(3)
-#define SCHED_STATE_BLOCKED_SHIFT  4
+#define SCHED_STATE_ENABLEDBIT(4)
+#define SCHED_STATE_PENDING_ENABLE BIT(5)
+#define SCHED_STATE_REGISTERED BIT(6)
+#define SCHED_STATE_BLOCKED_SHIFT  7
 #define SCHED_STATE_BLOCKED 

[PATCH 15/22] drm/i915/guc: Flush G2H work queue during reset

2021-08-16 Thread Matthew Brost
It isn't safe to scrub for missing G2H or continue with the reset until
all G2H processing is complete. Flush the G2H work queue during reset to
ensure it is done running.

Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
Signed-off-by: Matthew Brost 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 ++
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 3a01743e09ea..8c560ed14976 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -707,8 +707,6 @@ static void guc_flush_submissions(struct intel_guc *guc)
 
 void intel_guc_submission_reset_prepare(struct intel_guc *guc)
 {
-   int i;
-
if (unlikely(!guc_submission_initialized(guc))) {
/* Reset called during driver load? GuC not yet initialised! */
return;
@@ -724,20 +722,8 @@ void intel_guc_submission_reset_prepare(struct intel_guc *guc)
 
guc_flush_submissions(guc);
 
-   /*
-* Handle any outstanding G2Hs before reset. Call IRQ handler directly
-* each pass as interrupt have been disabled. We always scrub for
-* outstanding G2H as it is possible for outstanding_submission_g2h to
-* be incremented after the context state update.
-*/
-   for (i = 0; i < 4 && atomic_read(&guc->outstanding_submission_g2h); ++i) {
-   intel_guc_to_host_event_handler(guc);
-#define wait_for_reset(guc, wait_var) \
-   intel_guc_wait_for_pending_msg(guc, wait_var, false, (HZ / 20))
-   do {
-   wait_for_reset(guc, &guc->outstanding_submission_g2h);
-   } while (!list_empty(&guc->ct.requests.incoming));
-   }
+   flush_work(&guc->ct.requests.worker);
+
scrub_guc_desc_for_outstanding_g2h(guc);
 }
 
-- 
2.32.0



[PATCH 13/22] drm/i915/guc: Reset LRC descriptor if register returns -ENODEV

2021-08-16 Thread Matthew Brost
Reset the LRC descriptor if context registration returns -ENODEV, as this
means we are mid-reset.

Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 8d45585773f3..3a01743e09ea 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1399,10 +1399,12 @@ static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)
} else {
with_intel_runtime_pm(runtime_pm, wakeref)
ret = register_context(ce, loop);
-   if (unlikely(ret == -EBUSY))
+   if (unlikely(ret == -EBUSY)) {
+   reset_lrc_desc(guc, desc_idx);
+   } else if (unlikely(ret == -ENODEV)) {
reset_lrc_desc(guc, desc_idx);
-   else if (unlikely(ret == -ENODEV))
ret = 0;/* Will get registered later */
+   }
}
 
return ret;
-- 
2.32.0
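
The resulting error handling in guc_lrc_desc_pin() can be illustrated with a
small standalone sketch. It assumes a hypothetical toy_desc with a single
registered flag in place of the real LRC descriptor, and
toy_finish_register() is an invented helper; only the -EBUSY / -ENODEV
branching is taken from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an LRC descriptor slot. */
struct toy_desc {
	bool registered;
};

/*
 * Post-process the result of a (hypothetical) register call:
 *  -EBUSY:  undo the descriptor setup and report the error to the caller.
 *  -ENODEV: undo the descriptor setup as well, but report success, since
 *           a reset is in progress and registration happens again later.
 */
static int toy_finish_register(int ret, struct toy_desc *desc)
{
	assert(desc);

	if (ret == -EBUSY) {
		desc->registered = false;
	} else if (ret == -ENODEV) {
		desc->registered = false;
		ret = 0;	/* will get registered later */
	}

	return ret;
}
```

Before the patch only the -EBUSY branch reset the descriptor, so a context
whose registration hit -ENODEV mid-reset kept a stale descriptor entry.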



[PATCH 04/22] drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context

2021-08-16 Thread Matthew Brost
Don't drop ce->guc_active.lock when unwinding a context after reset.
At one point we had to drop this because of a lock inversion but that is
no longer the case. It is much safer to hold the lock so let's do that.

Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
Signed-off-by: Matthew Brost 
Cc: 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index bc51caba50d0..3cd2da6f5c03 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -806,8 +806,6 @@ __unwind_incomplete_requests(struct intel_context *ce)
continue;
 
list_del_init(&rq->sched.link);
-   spin_unlock(&ce->guc_active.lock);
-
__i915_request_unsubmit(rq);
 
/* Push the request back into the queue for later resubmission. */
@@ -820,8 +818,6 @@ __unwind_incomplete_requests(struct intel_context *ce)
 
list_add(&rq->sched.link, pl);
set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
-
-   spin_lock(&ce->guc_active.lock);
}
spin_unlock(&ce->guc_active.lock);
spin_unlock_irqrestore(&sched_engine->lock, flags);
-- 
2.32.0



[PATCH 14/22] drm/i915: Allocate error capture in atomic context

2021-08-16 Thread Matthew Brost
Error captures can now be done in a work queue processing G2H messages.
Processing of these messages must be fully complete in the reset path to
avoid races in the missing G2H cleanup, which creates a dependency on
memory allocations and dma fences (i915_requests). Requests depend on
resets, so we now have a circular dependency. To work around this,
allocate the error capture in an atomic context.

Fixes: dc0dad365c5e ("Fix for error capture after full GPU reset with GuC")
Fixes: 573ba126aef3 ("Capture error state on context reset")
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 37 +--
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0f08bcfbe964..453376aa6d9f 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -49,7 +49,6 @@
 #include "i915_memcpy.h"
 #include "i915_scatterlist.h"
 
-#define ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
 #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
 
 static void __sg_set_buf(struct scatterlist *sg,
@@ -79,7 +78,7 @@ static bool __i915_error_grow(struct drm_i915_error_state_buf *e, size_t len)
if (e->cur == e->end) {
struct scatterlist *sgl;
 
-   sgl = (typeof(sgl))__get_free_page(ALLOW_FAIL);
+   sgl = (typeof(sgl))__get_free_page(ATOMIC_MAYFAIL);
if (!sgl) {
e->err = -ENOMEM;
return false;
@@ -99,10 +98,10 @@ static bool __i915_error_grow(struct drm_i915_error_state_buf *e, size_t len)
}
 
e->size = ALIGN(len + 1, SZ_64K);
-   e->buf = kmalloc(e->size, ALLOW_FAIL);
+   e->buf = kmalloc(e->size, ATOMIC_MAYFAIL);
if (!e->buf) {
e->size = PAGE_ALIGN(len + 1);
-   e->buf = kmalloc(e->size, GFP_KERNEL);
+   e->buf = kmalloc(e->size, ATOMIC_MAYFAIL);
}
if (!e->buf) {
e->err = -ENOMEM;
@@ -243,12 +242,12 @@ static bool compress_init(struct i915_vma_compress *c)
 {
struct z_stream_s *zstream = &c->zstream;
 
-   if (pool_init(&c->pool, ALLOW_FAIL))
+   if (pool_init(&c->pool, ATOMIC_MAYFAIL))
return false;
 
zstream->workspace =
kmalloc(zlib_deflate_workspacesize(MAX_WBITS, MAX_MEM_LEVEL),
-   ALLOW_FAIL);
+   ATOMIC_MAYFAIL);
if (!zstream->workspace) {
pool_fini(&c->pool);
return false;
@@ -256,7 +255,7 @@ static bool compress_init(struct i915_vma_compress *c)
 
c->tmp = NULL;
if (i915_has_memcpy_from_wc())
-   c->tmp = pool_alloc(&c->pool, ALLOW_FAIL);
+   c->tmp = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
 
return true;
 }
@@ -280,7 +279,7 @@ static void *compress_next_page(struct i915_vma_compress *c,
if (dst->page_count >= dst->num_pages)
return ERR_PTR(-ENOSPC);
 
-   page = pool_alloc(&c->pool, ALLOW_FAIL);
+   page = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
if (!page)
return ERR_PTR(-ENOMEM);
 
@@ -376,7 +375,7 @@ struct i915_vma_compress {
 
 static bool compress_init(struct i915_vma_compress *c)
 {
-   return pool_init(&c->pool, ALLOW_FAIL) == 0;
+   return pool_init(&c->pool, ATOMIC_MAYFAIL) == 0;
 }
 
 static bool compress_start(struct i915_vma_compress *c)
@@ -391,7 +390,7 @@ static int compress_page(struct i915_vma_compress *c,
 {
void *ptr;
 
-   ptr = pool_alloc(&c->pool, ALLOW_FAIL);
+   ptr = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
if (!ptr)
return -ENOMEM;
 
@@ -997,7 +996,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 
num_pages = min_t(u64, vma->size, vma->obj->base.size) >> PAGE_SHIFT;
num_pages = DIV_ROUND_UP(10 * num_pages, 8); /* worstcase zlib growth */
-   dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ALLOW_FAIL);
+   dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ATOMIC_MAYFAIL);
if (!dst)
return NULL;
 
@@ -1433,7 +1432,7 @@ capture_engine(struct intel_engine_cs *engine,
struct i915_request *rq = NULL;
unsigned long flags;
 
-   ee = intel_engine_coredump_alloc(engine, GFP_KERNEL);
+   ee = intel_engine_coredump_alloc(engine, ATOMIC_MAYFAIL);
if (!ee)
return NULL;
 
@@ -1481,7 +1480,7 @@ gt_record_engines(struct intel_gt_coredump *gt,
struct intel_engine_coredump *ee;
 
/* Refill our page pool before entering atomic section */
-   pool_refill(&compress->pool, ALLOW_FAIL);
+   pool_refill(&compress->pool, ATOMIC_MAYFAIL);
 
ee = capture_engine(engine, compress);
if (!ee)
@@ -1507,7 +1506,7 @@ gt_record_uc(struct intel_gt_coredum

Re: [PATCH v2] drm: avoid races with modesetting rights

2021-08-16 Thread Daniel Vetter
On Mon, Aug 16, 2021 at 12:31 PM Desmond Cheong Zhi Xi
 wrote:
>
> On 16/8/21 5:04 pm, Daniel Vetter wrote:
> > On Mon, Aug 16, 2021 at 10:53 AM Desmond Cheong Zhi Xi
> >  wrote:
> >> On 16/8/21 2:47 am, kernel test robot wrote:
> >>> Hi Desmond,
> >>>
> >>> Thank you for the patch! Yet something to improve:
> >>>
> >>> [auto build test ERROR on next-20210813]
> >>> [also build test ERROR on v5.14-rc5]
> >>> [cannot apply to linus/master v5.14-rc5 v5.14-rc4 v5.14-rc3]
> >>> [If your patch is applied to the wrong git tree, kindly drop us a note.
> >>> And when submitting patch, we suggest to use '--base' as documented in
> >>> https://git-scm.com/docs/git-format-patch]
> >>>
> >>> url: https://github.com/0day-ci/linux/commits/Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
> >>> base:4b358aabb93a2c654cd1dcab1a25a589f6e2b153
> >>> config: i386-randconfig-a004-20210815 (attached as .config)
> >>> compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> >>> reproduce (this is a W=1 build):
> >>>   # https://github.com/0day-ci/linux/commit/cf6d8354b7d7953cd866fad004cbb189adfa074f
> >>>   git remote add linux-review https://github.com/0day-ci/linux
> >>>   git fetch --no-tags linux-review Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
> >>>   git checkout cf6d8354b7d7953cd866fad004cbb189adfa074f
> >>>   # save the attached .config to linux build tree
> >>>   make W=1 ARCH=i386
> >>>
> >>> If you fix the issue, kindly add following tag as appropriate
> >>> Reported-by: kernel test robot 
> >>>
> >>> All errors (new ones prefixed by >>, old ones prefixed by <<):
> >>>
> > ERROR: modpost: "task_work_add" [drivers/gpu/drm/drm.ko] undefined!
> >>>
> >>
> >> I'm a bit uncertain about this. Looking into the .config used, this
> >> error seems to happen because task_work_add isn't an exported symbol,
> >> but DRM is being compiled as a loadable kernel module (CONFIG_DRM=m).
> >>
> >> One way to deal with this is to export the symbol, but there was a
> >> proposed patch to do this a few months back that wasn't picked up [1],
> >> so I'm not sure what to make of this.
> >>
> >> I'll export the symbol as part of a v3 series, and check in with the
> >> task-work maintainers.
> >>
> >> Link: https://lore.kernel.org/lkml/20210127150029.13766-3-josh...@samsung.com/ [1]
> >
> > Yeah that sounds best. I have two more thoughts on the patch:
> > - drm_master_flush isn't used by any modules outside of drm.ko, so we
> > can unexport it and drop the kerneldoc (the comment is still good).
> > These kinds of internal functions have their declarations in
> > drm_internal.h - there are already a few there from drm_auth.c
> >
>
> Sounds good, I'll do that and move the declaration from drm_auth.h to
> drm_internal.h.
>
> > - We now have 3 locks for master state, that feels a bit like
> > overkill. The spinlock I think we need to keep due to lock inversions,
> > but the master_mutex and master_rwsem look like we should be able to
> > merge them? I.e. anywhere we currently grab the master_mutex we could
> > instead grab the rwsem in either write mode (when we change stuff) or
> > read mode (when we just check, like in master_internal_acquire).
> >
> > Thoughts?
> > -Daniel
> >
>
> Using rwsem in the places where we currently hold the mutex seems pretty
> doable.
>
> There are some tricky bits once we add rwsem read locks to the ioctl
> handler. Some ioctl functions like drm_authmagic need a write lock.

Ah yes, I only looked at the dropmaster/setmaster ioctl, and those
don't have the DRM_MASTER bit set.

> In this particular case, it might make sense to break master_mutex down
> into finer-grained locks, since the function doesn't change master
> permissions. It just needs to prevent concurrent writes to the
> drm_master.magic_map idr.

Yeah for authmagic we could perhaps just reuse the spinlock to protect
->magic_map?

> For other ioctls, I'll take a closer look on a case-by-case basis.

If it's too much shuffling then I think totally fine to leave things
as-is. Just feels a bit silly to have 3 locks, one of which is an
rwlock itself, for this fairly small amount of state.
-Daniel

>
> >>
> >>> ---
> >>> 0-DAY CI Kernel Test Service, Intel Corporation
> >>> https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
> >>>
> >>
> >
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/1] drm: ttm: Don't bail from ttm_global_init if debugfs_create_dir fails

2021-08-16 Thread Jason Ekstrand
Makes sense

Reviewed-by: Jason Ekstrand 

On Mon, Aug 16, 2021 at 2:40 AM Christian König
 wrote:
>
> Am 10.08.21 um 21:59 schrieb Dan Moulding:
> > In 69de4421bb4c ("drm/ttm: Initialize debugfs from
> > ttm_global_init()"), ttm_global_init was changed so that if creation
> > of the debugfs global root directory fails, ttm_global_init will bail
> > out early and return an error, leading to initialization failure of
> > DRM drivers. However, not every system will be using debugfs. On such
> > a system, debugfs directory creation can be expected to fail, but DRM
> > drivers must still be usable. This changes it so that if creation of
> > TTM's debugfs root directory fails, then no biggie: keep calm and
> > carry on.
> >
> > Fixes: 69de4421bb4c ("drm/ttm: Initialize debugfs from ttm_global_init()")
> > Signed-off-by: Dan Moulding 
>
> Good point, patch is Reviewed-by: Christian König.
>
> Going to pick that up later today.
>
> Regards,
> Christian.
>
> > ---
> >   drivers/gpu/drm/ttm/ttm_device.c | 2 --
> >   1 file changed, 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_device.c 
> > b/drivers/gpu/drm/ttm/ttm_device.c
> > index 74e3b460132b..2df59b3c2ea1 100644
> > --- a/drivers/gpu/drm/ttm/ttm_device.c
> > +++ b/drivers/gpu/drm/ttm/ttm_device.c
> > @@ -78,9 +78,7 @@ static int ttm_global_init(void)
> >
> >   ttm_debugfs_root = debugfs_create_dir("ttm", NULL);
> >   if (IS_ERR(ttm_debugfs_root)) {
> > - ret = PTR_ERR(ttm_debugfs_root);
> >   ttm_debugfs_root = NULL;
> > - goto out;
> >   }
> >
> >   /* Limit the number of pages in the pool to about 50% of the total
>
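
The idea behind this patch, treating a debugging facility as optional during initialization, can be sketched in plain userspace C. This is only an illustration under stated assumptions: fake_debugfs_create_dir(), global_init() and have_debug_root() are hypothetical stand-ins, not the TTM or debugfs API, and the sketch signals failure with NULL rather than an error pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an optional debug directory (not the debugfs API). */
struct debug_dir {
    const char *name;
};

static struct debug_dir ttm_dir = { "ttm" };
static struct debug_dir *debug_root; /* NULL means "no debug directory" */

/* Assumption of this sketch: creation fails with NULL when unavailable. */
static struct debug_dir *fake_debugfs_create_dir(bool available)
{
    return available ? &ttm_dir : NULL;
}

/* Init succeeds whether or not the optional debug directory exists. */
static int global_init(bool debugfs_available)
{
    debug_root = fake_debugfs_create_dir(debugfs_available);

    /* A missing debug directory is not an error: carry on without it. */
    return 0;
}

/* Later users must check for the directory before touching it. */
static bool have_debug_root(void)
{
    return debug_root != NULL;
}
```

The design choice mirrored here is that only the debug output is lost when the directory cannot be created; everything that depends on initialization succeeding keeps working.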


Re: [Linaro-mm-sig] IIO, dmabuf, io_uring

2021-08-16 Thread Daniel Vetter
On Sat, Aug 14, 2021 at 09:30:19AM +0200, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 01:41:26PM +0200, Paul Cercueil wrote:
> > Hi,
> >
> > A few months ago we (ADI) tried to upstream the interface we use with our 
> > high-speed ADCs and DACs. It is a system with custom ioctls on the iio 
> > device node to dequeue and enqueue buffers (allocated with 
> > dma_alloc_coherent), that can then be mmap'd by userspace applications. 
> > Anyway, it was ultimately denied entry [1]; this API was okay in ~2014 when 
> > it was designed but it feels like re-inventing the wheel in 2021.
> >
> > Back to the drawing table, and we'd like to design something that we can 
> > actually upstream. This high-speed interface looks awfully similar to 
> > DMABUF, so we may try to implement a DMABUF interface for IIO, unless 
> > someone has a better idea.
> 
> To me this does sound a lot like a dma buf use case.  The interesting
> question to me is how to signal arrival of new data, or readiness to
> consume more data.  I suspect that people that are actually using
> dmabuf heavily at the moment (dri/media folks) might be able to chime
> in a little more on that.

One option is to just block in userspace (on poll, or an ioctl, or
whatever) and then latch the next stage in the pipeline. That's what media
does right now (because the dma-fence proposal never got anywhere).

In drm we use dma_fences to tie up the stages, and the current
recommendation for uapi is to use the drm_syncobj container (not the
sync_file container, that was a bit of an awkward iteration on that problem).
With that you can tie together all the pipeline stages within the kernel
(and at least sometimes directly in hw).

The downside is (well imo it's not a downside, but some people see it as
that) that once you use dma-fence dri-devel folks really consider your
stuff a gpu driver and expect all the gpu driver review/merge criteria to
be fulfilled. Specifically about the userspace side too:

https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements

At least one driver is trying to play some very clever games here and
that's not a solid way to make friends ...
-Daniel

> 
> > Our first usecase is, we want userspace applications to be able to dequeue 
> > buffers of samples (from ADCs), and/or enqueue buffers of samples (for 
> > DACs), and to be able to manipulate them (mmapped buffers). With a DMABUF 
> > interface, I guess the userspace application would dequeue a dma buffer 
> > from the driver, mmap it, read/write the data, unmap it, then enqueue it to 
> > the IIO driver again so that it can be disposed of. Does that sound sane?
> >
> > Our second usecase is - and that's where things get tricky - to be able to 
> > stream the samples to another computer for processing, over Ethernet or 
> > USB. Our typical setup is a high-speed ADC/DAC on a dev board with a FPGA 
> > and a weak soft-core or low-power CPU; processing the data in-situ is not 
> > an option. Copying the data from one buffer to another is not an option 
> > either (way too slow), so we absolutely want zero-copy.
> >
> > Usual userspace zero-copy techniques (vmsplice+splice, MSG_ZEROCOPY etc) 
> > don't really work with mmapped kernel buffers allocated for DMA [2] and/or 
> > have a huge overhead, so the way I see it, we would also need DMABUF 
> > support in both the Ethernet stack and USB (functionfs) stack. However, as 
> > far as I understood, DMABUF is mostly a DRM/V4L2 thing, so I am really not 
> > sure we have the right idea here.
> >
> > And finally, there is the new kid in town, io_uring. I am not very literate 
> > about the topic, but it does not seem to be able to handle DMA buffers 
> > (yet?). The idea that we could dequeue a buffer of samples from the IIO 
> > device and send it over the network in one single syscall is appealing, 
> > though.
> 
> Think of io_uring really just as an async syscall layer.  It doesn't
> replace DMA buffers, but can be used as a different and for some
> workloads more efficient way to dispatch syscalls.
> ___
> Linaro-mm-sig mailing list
> linaro-mm-...@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-mm-sig

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
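
The first option mentioned above, blocking in userspace until the next stage can be latched, can be sketched with poll(). This is a minimal illustration only: a pipe stands in for whatever file descriptor a driver would signal on, and wait_for_data() and demo_stage_handoff() are hypothetical names, not part of any IIO, dma-buf or media API.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Block until fd is readable or timeout_ms elapses.
 * Returns 1 if data is ready, 0 on timeout, -1 on error.
 */
static int wait_for_data(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/*
 * A pipe stands in for the "buffer ready" notification between two
 * pipeline stages. Returns 1 if the consumer saw the notification.
 */
static int demo_stage_handoff(void)
{
    int fds[2];
    int ready;

    if (pipe(fds) != 0)
        return -1;

    /* Nothing produced yet: a zero-timeout poll reports "not ready". */
    assert(wait_for_data(fds[0], 0) == 0);

    /* Producer signals that a buffer is ready. */
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Consumer blocks (up to one second) until the notification arrives. */
    ready = wait_for_data(fds[0], 1000);

    close(fds[0]);
    close(fds[1]);
    return ready;
}
```

With this approach every hand-off goes through userspace, which is the cost the fence-based approach avoids by tying the stages together inside the kernel.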


Re: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Michel Dänzer
On 2021-08-16 2:06 p.m., Christian König wrote:
> Am 16.08.21 um 13:33 schrieb Lazar, Lijo:
>> On 8/16/2021 4:05 PM, Michel Dänzer wrote:
>>> From: Michel Dänzer 
>>>
>>> schedule_delayed_work does not push back the work if it was already
>>> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
>>> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
>>> was disabled and re-enabled again during those 100 ms.
>>>
>>> This resulted in frame drops / stutter with the upcoming mutter 41
>>> release on Navi 14, due to constantly enabling GFXOFF in the HW and
>>> disabling it again (for getting the GPU clock counter).
>>>
>>> To fix this, call cancel_delayed_work_sync when the disable count
>>> transitions from 0 to 1, and only schedule the delayed work on the
>>> reverse transition, not if the disable count was already 0. This makes
>>> sure the delayed work doesn't run at unexpected times, and allows it to
>>> be lock-free.
>>>
>>> v2:
>>> * Use cancel_delayed_work_sync & mutex_trylock instead of
>>>    mod_delayed_work.
>>> v3:
>>> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
>>>
>>> Cc: sta...@vger.kernel.org
>>> Signed-off-by: Michel Dänzer 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 22 +-
>>>   2 files changed, 22 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index f3fd5ec710b6..f944ed858f3e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2777,12 +2777,11 @@ static void 
>>> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>>>   struct amdgpu_device *adev =
>>>   container_of(work, struct amdgpu_device, 
>>> gfx.gfx_off_delay_work.work);
>>>   -    mutex_lock(&adev->gfx.gfx_off_mutex);
>>> -    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>>> -    if (!amdgpu_dpm_set_powergating_by_smu(adev, 
>>> AMD_IP_BLOCK_TYPE_GFX, true))
>>> -    adev->gfx.gfx_off_state = true;
>>> -    }
>>> -    mutex_unlock(&adev->gfx.gfx_off_mutex);
>>> +    WARN_ON_ONCE(adev->gfx.gfx_off_state);
>>
>> Don't see any case for this. It's not expected to be scheduled in this case, 
>> right?
>>
>>> + WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
>>> +
>>
>> Thinking about ON_ONCE here - this may happen more than once if it's 
>> completed as part of cancel_ call. Is the warning needed?
> 
> WARN_ON_ONCE() is usually used to prevent spamming the system log with 
> warnings. E.g. the warning is only printed once indicating a driver bug and 
> that's it.

Right, these WARN_ONs are like assert()s in user-space code, documenting the 
pre-conditions and checking them at runtime. And I use _ONCE so that if a 
pre-condition is ever violated for some reason, dmesg isn't spammed with 
multiple warnings.
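
To make the assert() comparison concrete, here is a userspace imitation of a warn-once check. It is a sketch only: MY_WARN_ON_ONCE, worker() and demo_warn_once() are made-up names, and the macro is not the kernel's implementation, which among other things also returns the condition value.

```c
#include <stdbool.h>
#include <stdio.h>

static int warnings_printed; /* how many warnings were actually emitted */

/*
 * Each call site gets its own static flag, so the message is printed at
 * most once per call site while the condition is still evaluated every time.
 */
#define MY_WARN_ON_ONCE(cond)                                   \
    do {                                                        \
        static bool warned_;                                    \
        if ((cond) && !warned_) {                               \
            warned_ = true;                                     \
            warnings_printed++;                                 \
            fprintf(stderr, "warning: %s (%s:%d)\n",            \
                    #cond, __FILE__, __LINE__);                 \
        }                                                       \
    } while (0)

/* Precondition check in the style of the quoted worker function. */
static void worker(bool gfx_off_state)
{
    MY_WARN_ON_ONCE(gfx_off_state);
}

/* Violate the precondition twice; only the first violation is reported. */
static int demo_warn_once(void)
{
    worker(true);  /* precondition violated: prints */
    worker(true);  /* violated again: silent */
    worker(false); /* precondition holds */
    return warnings_printed;
}
```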


>> Anyway,
>> Reviewed-by: Lijo Lazar 
> 
> Acked-by: Christian König 

Thanks guys!


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer
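
The counting scheme from the commit message (cancel the delayed work on the 0 to 1 transition of the disable count, schedule it only on the 1 to 0 transition) can be modelled in userspace as below. This is a simplified sketch, not the amdgpu code: struct gfx_off_model and its helpers are hypothetical, there is no locking, and a flag plus delayed_work_fire() stand in for the real delayed work and its timer.

```c
#include <stdbool.h>

struct gfx_off_model {
    unsigned int req_count; /* outstanding "keep GFXOFF disabled" requests */
    bool work_pending;      /* delayed "enable GFXOFF" work is queued */
    bool gfx_off_state;     /* GFXOFF currently enabled in the pretend HW */
    unsigned int cancels;   /* times pending work was cancelled */
    unsigned int schedules; /* times delayed work was scheduled */
};

/* A caller needs GFXOFF disabled. */
static void gfx_off_disable(struct gfx_off_model *m)
{
    if (m->req_count++ == 0) {
        /* 0 -> 1: make sure already-queued work can no longer run. */
        m->work_pending = false;
        m->cancels++;
        m->gfx_off_state = false;
    }
}

/* A caller no longer needs GFXOFF disabled. */
static void gfx_off_enable(struct gfx_off_model *m)
{
    if (m->req_count == 0)
        return; /* count was already 0: do not schedule again */

    if (--m->req_count == 0) {
        /* 1 -> 0: only now schedule the delayed enable. */
        m->work_pending = true;
        m->schedules++;
    }
}

/* The delay has elapsed; runs only if the work is still queued. */
static void delayed_work_fire(struct gfx_off_model *m)
{
    if (!m->work_pending)
        return;
    m->work_pending = false;
    m->gfx_off_state = true;
}
```

In this model a disable that arrives during the delay removes the queued work, so the enable cannot fire at an unexpected time, which is the stutter scenario described above.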


Re: [Intel-gfx] [PATCH v6 10/15] drm/i915/pxp: interfaces for using protected objects

2021-08-16 Thread Daniel Vetter
On Fri, Aug 13, 2021 at 08:18:02AM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 8/13/2021 7:37 AM, Daniel Vetter wrote:
> > On Wed, Jul 28, 2021 at 07:01:01PM -0700, Daniele Ceraolo Spurio wrote:
> > > This api allows user mode to create protected buffers and to mark
> > > contexts as making use of such objects. Only when using contexts
> > > marked in such a way is the execution guaranteed to work as expected.
> > > 
> > > Contexts can only be marked as using protected content at creation time
> > > (i.e. the parameter is immutable) and they must be both bannable and not
> > > recoverable.
> > > 
> > > All protected objects and contexts that have backing storage will be
> > > considered invalid when the PXP session is destroyed and all new
> > > submissions using them will be rejected. All intel contexts within the
> > > invalidated gem contexts will be marked banned. A new flag has been
> > > added to the RESET_STATS ioctl to report the context invalidation to
> > > userspace.
> > > 
> > > This patch was previously sent as 2 separate patches, which have been
> > > squashed following a request to have all the uapi in a single patch.
> > > I've retained the s-o-b from both.
> > > 
> > > v5: squash patches, rebase on proto_ctx, update kerneldoc
> > > 
> > > v6: rebase on obj create_ext changes
> > > 
> > > Signed-off-by: Daniele Ceraolo Spurio 
> > > Signed-off-by: Bommu Krishnaiah 
> > > Cc: Rodrigo Vivi 
> > > Cc: Chris Wilson 
> > > Cc: Lionel Landwerlin 
> > > Cc: Jason Ekstrand 
> > > Cc: Daniel Vetter 
> > > Reviewed-by: Rodrigo Vivi  #v5
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 68 --
> > >   drivers/gpu/drm/i915/gem/i915_gem_context.h   | 18 
> > >   .../gpu/drm/i915/gem/i915_gem_context_types.h |  2 +
> > >   drivers/gpu/drm/i915/gem/i915_gem_create.c| 75 
> > >   .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 40 -
> > >   drivers/gpu/drm/i915/gem/i915_gem_object.c|  6 ++
> > >   drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++
> > >   .../gpu/drm/i915/gem/i915_gem_object_types.h  |  9 ++
> > >   drivers/gpu/drm/i915/pxp/intel_pxp.c  | 89 +++
> > >   drivers/gpu/drm/i915/pxp/intel_pxp.h  | 15 
> > >   drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  3 +
> > >   drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  5 ++
> > >   include/uapi/drm/i915_drm.h   | 55 +++-
> > >   13 files changed, 371 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index cff72679ad7c..0cd3e2d06188 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -77,6 +77,8 @@
> > >   #include "gt/intel_gpu_commands.h"
> > >   #include "gt/intel_ring.h"
> > > +#include "pxp/intel_pxp.h"
> > > +
> > >   #include "i915_gem_context.h"
> > >   #include "i915_trace.h"
> > >   #include "i915_user_extensions.h"
> > > @@ -241,6 +243,25 @@ static int proto_context_set_persistence(struct 
> > > drm_i915_private *i915,
> > >   return 0;
> > >   }
> > > +static int proto_context_set_protected(struct drm_i915_private *i915,
> > > +struct i915_gem_proto_context *pc,
> > > +bool protected)
> > > +{
> > > + int ret = 0;
> > > +
> > > + if (!intel_pxp_is_enabled(&i915->gt.pxp))
> > > + ret = -ENODEV;
> > > + else if (!protected)
> > > + pc->user_flags &= ~BIT(UCONTEXT_PROTECTED);
> > > + else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
> > > +  !(pc->user_flags & BIT(UCONTEXT_BANNABLE)))
> > > + ret = -EPERM;
> > > + else
> > > + pc->user_flags |= BIT(UCONTEXT_PROTECTED);
> > > +
> > > + return ret;
> > > +}
> > > +
> > >   static struct i915_gem_proto_context *
> > >   proto_context_create(struct drm_i915_private *i915, unsigned int flags)
> > >   {
> > > @@ -686,6 +707,8 @@ static int set_proto_ctx_param(struct 
> > > drm_i915_file_private *fpriv,
> > >   ret = -EPERM;
> > >   else if (args->value)
> > >   pc->user_flags |= BIT(UCONTEXT_BANNABLE);
> > > + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED))
> > > + ret = -EPERM;
> > >   else
> > >   pc->user_flags &= ~BIT(UCONTEXT_BANNABLE);
> > >   break;
> > > @@ -693,10 +716,12 @@ static int set_proto_ctx_param(struct 
> > > drm_i915_file_private *fpriv,
> > >   case I915_CONTEXT_PARAM_RECOVERABLE:
> > >   if (args->size)
> > >   ret = -EINVAL;
> > > - else if (args->value)
> > > - pc->user_flags |= BIT(UCONTEXT_RECOVERABLE);
> > > - else
> > > + else if (!args->value)
> > >   pc->user_flags &= ~BIT(UCONTEXT_REC

Re: [Intel-gfx] [PATCH v6 10/15] drm/i915/pxp: interfaces for using protected objects

2021-08-16 Thread Daniel Vetter
On Fri, Aug 13, 2021 at 08:24:44AM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 8/13/2021 7:42 AM, Daniel Vetter wrote:
> > On Fri, Aug 13, 2021 at 04:37:53PM +0200, Daniel Vetter wrote:
> > > On Wed, Jul 28, 2021 at 07:01:01PM -0700, Daniele Ceraolo Spurio wrote:
> > > > This api allows user mode to create protected buffers and to mark
> > > > contexts as making use of such objects. Only when using contexts
> > > > marked in such a way is the execution guaranteed to work as expected.
> > > > 
> > > > Contexts can only be marked as using protected content at creation time
> > > > (i.e. the parameter is immutable) and they must be both bannable and not
> > > > recoverable.
> > > > 
> > > > All protected objects and contexts that have backing storage will be
> > > > considered invalid when the PXP session is destroyed and all new
> > > > submissions using them will be rejected. All intel contexts within the
> > > > invalidated gem contexts will be marked banned. A new flag has been
> > > > added to the RESET_STATS ioctl to report the context invalidation to
> > > > userspace.
> > > > 
> > > > This patch was previously sent as 2 separate patches, which have been
> > > > squashed following a request to have all the uapi in a single patch.
> > > > I've retained the s-o-b from both.
> > > > 
> > > > v5: squash patches, rebase on proto_ctx, update kerneldoc
> > > > 
> > > > v6: rebase on obj create_ext changes
> > > > 
> > > > Signed-off-by: Daniele Ceraolo Spurio 
> > > > Signed-off-by: Bommu Krishnaiah 
> > > > Cc: Rodrigo Vivi 
> > > > Cc: Chris Wilson 
> > > > Cc: Lionel Landwerlin 
> > > > Cc: Jason Ekstrand 
> > > > Cc: Daniel Vetter 
> > > > Reviewed-by: Rodrigo Vivi  #v5
> > > > ---
> > > >   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 68 --
> > > >   drivers/gpu/drm/i915/gem/i915_gem_context.h   | 18 
> > > >   .../gpu/drm/i915/gem/i915_gem_context_types.h |  2 +
> > > >   drivers/gpu/drm/i915/gem/i915_gem_create.c| 75 
> > > >   .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 40 -
> > > >   drivers/gpu/drm/i915/gem/i915_gem_object.c|  6 ++
> > > >   drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++
> > > >   .../gpu/drm/i915/gem/i915_gem_object_types.h  |  9 ++
> > > >   drivers/gpu/drm/i915/pxp/intel_pxp.c  | 89 +++
> > > >   drivers/gpu/drm/i915/pxp/intel_pxp.h  | 15 
> > > >   drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  3 +
> > > >   drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  5 ++
> > > >   include/uapi/drm/i915_drm.h   | 55 +++-
> > > >   13 files changed, 371 insertions(+), 26 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > index cff72679ad7c..0cd3e2d06188 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > @@ -77,6 +77,8 @@
> > > >   #include "gt/intel_gpu_commands.h"
> > > >   #include "gt/intel_ring.h"
> > > > +#include "pxp/intel_pxp.h"
> > > > +
> > > >   #include "i915_gem_context.h"
> > > >   #include "i915_trace.h"
> > > >   #include "i915_user_extensions.h"
> > > > @@ -241,6 +243,25 @@ static int proto_context_set_persistence(struct 
> > > > drm_i915_private *i915,
> > > > return 0;
> > > >   }
> > > > +static int proto_context_set_protected(struct drm_i915_private *i915,
> > > > +  struct i915_gem_proto_context 
> > > > *pc,
> > > > +  bool protected)
> > > > +{
> > > > +   int ret = 0;
> > > > +
> > > > +   if (!intel_pxp_is_enabled(&i915->gt.pxp))
> > > > +   ret = -ENODEV;
> > > > +   else if (!protected)
> > > > +   pc->user_flags &= ~BIT(UCONTEXT_PROTECTED);
> > > > +   else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
> > > > +!(pc->user_flags & BIT(UCONTEXT_BANNABLE)))
> > > > +   ret = -EPERM;
> > > > +   else
> > > > +   pc->user_flags |= BIT(UCONTEXT_PROTECTED);
> > > > +
> > > > +   return ret;
> > > > +}
> > > > +
> > > >   static struct i915_gem_proto_context *
> > > >   proto_context_create(struct drm_i915_private *i915, unsigned int 
> > > > flags)
> > > >   {
> > > > @@ -686,6 +707,8 @@ static int set_proto_ctx_param(struct 
> > > > drm_i915_file_private *fpriv,
> > > > ret = -EPERM;
> > > > else if (args->value)
> > > > pc->user_flags |= BIT(UCONTEXT_BANNABLE);
> > > > +   else if (pc->user_flags & BIT(UCONTEXT_PROTECTED))
> > > > +   ret = -EPERM;
> > > > else
> > > > pc->user_flags &= ~BIT(UCONTEXT_BANNABLE);
> > > > break;
> > > > @@ -693,10 +716,12 @@ static int set_proto_ctx_param(struct 
> > > > drm_i915_file_private *fpriv,
> > > >

Re: [PATCH v2 3/5] drm/atomic-helper: Set fence deadline for vblank

2021-08-16 Thread Daniel Vetter
On Sat, Aug 07, 2021 at 11:37:57AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> For an atomic commit updating a single CRTC (ie. a pageflip) calculate
> the next vblank time, and inform the fence(s) of that deadline.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/drm_atomic_helper.c | 36 +
>  1 file changed, 36 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> b/drivers/gpu/drm/drm_atomic_helper.c
> index bc3487964fb5..7caa2c3cc304 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -1406,6 +1406,40 @@ void drm_atomic_helper_commit_modeset_enables(struct 
> drm_device *dev,
>  }
>  EXPORT_SYMBOL(drm_atomic_helper_commit_modeset_enables);
>  
> +/*
> + * For atomic updates which touch just a single CRTC, calculate the time of 
> the
> + * next vblank, and inform all the fences of the of the deadline.

s/of the//

Otherwise lgtm, Reviewed-by: Daniel Vetter 


> + */
> +static void set_fence_deadline(struct drm_device *dev,
> +struct drm_atomic_state *state)
> +{
> + struct drm_crtc *crtc, *wait_crtc = NULL;
> + struct drm_crtc_state *new_crtc_state;
> + struct drm_plane *plane;
> + struct drm_plane_state *new_plane_state;
> + ktime_t vbltime;
> + int i;
> +
> + for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) {
> + if (wait_crtc)
> + return;
> + wait_crtc = crtc;
> + }
> +
> + /* If no CRTCs updated, then nothing to do: */
> + if (!wait_crtc)
> + return;
> +
> + if (drm_crtc_next_vblank_time(wait_crtc, &vbltime))
> + return;
> +
> + for_each_new_plane_in_state (state, plane, new_plane_state, i) {
> + if (!new_plane_state->fence)
> + continue;
> + dma_fence_set_deadline(new_plane_state->fence, vbltime);
> + }
> +}
> +
>  /**
>   * drm_atomic_helper_wait_for_fences - wait for fences stashed in plane state
>   * @dev: DRM device
> @@ -1435,6 +1469,8 @@ int drm_atomic_helper_wait_for_fences(struct drm_device 
> *dev,
>   struct drm_plane_state *new_plane_state;
>   int i, ret;
>  
> + set_fence_deadline(dev, state);
> +
>   for_each_new_plane_in_state(state, plane, new_plane_state, i) {
>   if (!new_plane_state->fence)
>   continue;
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
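
The control flow of the quoted set_fence_deadline() can be modelled outside the kernel as follows. This is a rough sketch, not the DRM helper: struct plane_fence_model and set_fence_deadline_model() are hypothetical, the number of updated CRTCs and the next vblank time are passed in directly, and the return value (count of fences touched) exists only to make the behaviour observable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a fence attached to a plane state. */
struct plane_fence_model {
    int has_deadline;
    int64_t deadline_ns;
};

/*
 * Apply the next vblank time as deadline to every plane fence, but only
 * when exactly one CRTC is updated. Returns the number of fences updated.
 */
static size_t set_fence_deadline_model(size_t num_crtcs_updated,
                                       int64_t next_vblank_ns,
                                       struct plane_fence_model **plane_fences,
                                       size_t num_planes)
{
    size_t i, updated = 0;

    /* Zero or several CRTCs: no single vblank to aim for. */
    if (num_crtcs_updated != 1)
        return 0;

    for (i = 0; i < num_planes; i++) {
        if (!plane_fences[i])
            continue; /* this plane has no fence to wait on */

        plane_fences[i]->deadline_ns = next_vblank_ns;
        plane_fences[i]->has_deadline = 1;
        updated++;
    }

    return updated;
}
```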


Re: [PATCH v2 4/5] drm/scheduler: Add fence deadline support

2021-08-16 Thread Daniel Vetter
On Mon, Aug 16, 2021 at 12:14:35PM +0200, Christian König wrote:
> Am 07.08.21 um 20:37 schrieb Rob Clark:
> > From: Rob Clark 
> > 
> > As the finished fence is the one that is exposed to userspace, and
> > therefore the one that other operations, like atomic update, would
> > > block on, we need to propagate the deadline from the finished
> > fence to the actual hw fence.
> > 
> > Signed-off-by: Rob Clark 

I guess you're already letting the compositor run at a higher gpu priority
so that your deadline'd drm_sched_job isn't stuck behind the app rendering
the next frame?

I'm not sure whether you wire that one up as part of the conversion to
drm/sched. Without that I think we might need to ponder how we can do a
prio-boost for these, e.g. within a scheduling class we pick the jobs with
the nearest deadline first, before we pick others.
-Daniel
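
Purely to illustrate the "nearest deadline first" idea floated above, a selection routine within one scheduling class could look like the sketch below. Nothing here exists in drm/sched: struct job_model and pick_next_job() are hypothetical, and falling back to the first queued job when no deadline is set is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical job record within a single scheduling class. */
struct job_model {
    int id;
    int has_deadline;
    int64_t deadline_ns;
};

/*
 * Return the index of the job to run next: the job with the nearest
 * deadline if any job has one, otherwise the first queued job.
 * Returns -1 if the queue is empty.
 */
static int pick_next_job(const struct job_model *jobs, size_t n)
{
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!jobs[i].has_deadline)
            continue;
        if (best < 0 || jobs[i].deadline_ns < jobs[best].deadline_ns)
            best = (int)i;
    }

    if (best >= 0)
        return best;

    return n ? 0 : -1;
}
```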

> > ---
> >   drivers/gpu/drm/scheduler/sched_fence.c | 25 +
> >   drivers/gpu/drm/scheduler/sched_main.c  |  3 +++
> >   include/drm/gpu_scheduler.h |  6 ++
> >   3 files changed, 34 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c 
> > b/drivers/gpu/drm/scheduler/sched_fence.c
> > index 69de2c76731f..f389dca44185 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct 
> > dma_fence *f)
> > dma_fence_put(&fence->scheduled);
> >   }
> > +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> > + ktime_t deadline)
> > +{
> > +   struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > +   unsigned long flags;
> > +
> > +   spin_lock_irqsave(&fence->lock, flags);
> > +
> > +   /* If we already have an earlier deadline, keep it: */
> > +   if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> > +   ktime_before(fence->deadline, deadline)) {
> > +   spin_unlock_irqrestore(&fence->lock, flags);
> > +   return;
> > +   }
> > +
> > +   fence->deadline = deadline;
> > +   set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> > +
> > +   spin_unlock_irqrestore(&fence->lock, flags);
> > +
> > +   if (fence->parent)
> > +   dma_fence_set_deadline(fence->parent, deadline);
> > +}
> > +
> >   static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> > .get_driver_name = drm_sched_fence_get_driver_name,
> > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > @@ -138,6 +162,7 @@ static const struct dma_fence_ops 
> > drm_sched_fence_ops_finished = {
> > .get_driver_name = drm_sched_fence_get_driver_name,
> > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > .release = drm_sched_fence_release_finished,
> > +   .set_deadline = drm_sched_fence_set_deadline_finished,
> >   };
> >   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index a2a953693b45..3ab0900d3596 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -818,6 +818,9 @@ static int drm_sched_main(void *param)
> > if (!IS_ERR_OR_NULL(fence)) {
> > s_fence->parent = dma_fence_get(fence);
> > +   if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> > +&s_fence->finished.flags))
> > +   dma_fence_set_deadline(fence, 
> > s_fence->deadline);
> 
> Maybe move this into a dma_sched_fence_set_parent() function.
> 
> Apart from that looks good to me.
> 
> Regards,
> Christian.
> 
> > r = dma_fence_add_callback(fence, &sched_job->cb,
> >drm_sched_job_done_cb);
> > if (r == -ENOENT)
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index d18af49fd009..0f08ade614ae 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -144,6 +144,12 @@ struct drm_sched_fence {
> >*/
> > struct dma_fencefinished;
> > +   /**
> > +* @deadline: deadline set on &drm_sched_fence.finished which
> > +* potentially needs to be propagated to &drm_sched_fence.parent
> > +*/
> > +   ktime_t deadline;
> > +
> >   /**
> >* @parent: the fence returned by &drm_sched_backend_ops.run_job
> >* when scheduling the job on hardware. We signal the
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
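
The "keep the earlier deadline, then propagate to the hw fence" logic from the quoted patch, together with the suggested set-parent helper, can be sketched in userspace like this. It is a simplified model, not the scheduler code: struct sched_fence_model and both functions are hypothetical, the same struct is reused for the finished fence and the hw fence, and there is no locking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sched_fence_model {
    bool has_deadline;
    int64_t deadline_ns;
    struct sched_fence_model *parent; /* hw fence, NULL until the job runs */
};

/* Record a deadline unless an earlier one is already set, then pass it on. */
static void model_set_deadline(struct sched_fence_model *f, int64_t deadline_ns)
{
    /* If we already have an earlier deadline, keep it. */
    if (f->has_deadline && f->deadline_ns < deadline_ns)
        return;

    f->deadline_ns = deadline_ns;
    f->has_deadline = true;

    if (f->parent)
        model_set_deadline(f->parent, deadline_ns);
}

/* When the hw fence appears, hand over any deadline recorded so far. */
static void model_set_parent(struct sched_fence_model *f,
                             struct sched_fence_model *hw)
{
    f->parent = hw;
    if (f->has_deadline)
        model_set_deadline(hw, f->deadline_ns);
}
```

Putting the hand-over into one helper keeps both directions (deadline set before the job runs, deadline set after) in a single place, which is the point of the review comment above.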


Re: [PATCH 5/9] drm/i915/guc: Flush the work queue for GuC generated G2H

2021-08-16 Thread Daniel Vetter
On Fri, Aug 13, 2021 at 07:02:55PM +, Matthew Brost wrote:
> On Fri, Aug 13, 2021 at 05:11:59PM +0200, Daniel Vetter wrote:
> > On Thu, Aug 12, 2021 at 10:38:18PM +, Matthew Brost wrote:
> > > On Thu, Aug 12, 2021 at 09:47:23PM +0200, Daniel Vetter wrote:
> > > > On Thu, Aug 12, 2021 at 03:23:30PM +, Matthew Brost wrote:
> > > > > On Thu, Aug 12, 2021 at 04:11:28PM +0200, Daniel Vetter wrote:
> > > > > > On Wed, Aug 11, 2021 at 01:16:18AM +, Matthew Brost wrote:
> > > > > > > Flush the work queue for GuC generated G2H messages during a GT reset.
> > > > > > > This is accomplished by spinning on the list of outstanding G2H to
> > > > > > > go empty.
> > > > > > > 
> > > > > > > Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new 
> > > > > > > GuC interface")
> > > > > > > Signed-off-by: Matthew Brost 
> > > > > > > Cc: 
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +
> > > > > > >  1 file changed, 5 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > > > > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > > > > index 3cd2da6f5c03..e5eb2df11b4a 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > > > > @@ -727,6 +727,11 @@ void 
> > > > > > > intel_guc_submission_reset_prepare(struct intel_guc *guc)
> > > > > > >   wait_for_reset(guc, 
> > > > > > > &guc->outstanding_submission_g2h);
> > > > > > >   } while (!list_empty(&guc->ct.requests.incoming));
> > > > > > >   }
> > > > > > > +
> > > > > > > + /* Flush any GuC generated G2H */
> > > > > > > + while (!list_empty(&guc->ct.requests.incoming))
> > > > > > > + msleep(20);
> > > > > > 
> > > > > > flush_work or flush_workqueue, because that comes with lockdep
> > > > > > annotations. Don't hand-roll stuff like this if at all possible.
> > > > > 
> > > > > lockdep puked when I used that.
> > > > 
> > > > Lockdep tends to be right ... So we definitely want that, but maybe a
> > > > different flavour, or there's something wrong with the workqueue setup.
> > > > 
> > > 
> > > Here is a dependency chain that lockdep doesn't like.
> > > 
> > > fs_reclaim_acquire -> &gt->reset.mutex (shrinker)
> > > workqueue -> fs_reclaim_acquire (error capture in workqueue)
> > > &gt->reset.mutex -> workqueue (reset)
> > > 
> > > In practice I don't think we could ever hit this, but lockdep does
> > > look right here. Trying to work out how to fix this. We really need
> > > all G2H to be done being processed before we proceed during a reset or we
> > > have races. Have a few ideas of how to handle this but can't convince
> > > myself any of them are fully safe.
> > 
> > It might be false sharing due to a single workqueue, or a single-threaded
> > workqueue.
> > 
> > Essentially the lockdep annotations for work_struct track two things:
> > - dependencies against the specific work item
> > - dependencies against anything queued on that work queue, if you flush
> >   the entire queue, or if you flush a work item that's on a
> >   single-threaded queue.
> > 
> > Because if guc/host communication is inverted like this here, you have a
> > much bigger problem.
> > 
> > Note that if you pick a different workqueue for your guc work stuff then
> > you need to make sure that's all properly flushed on suspend and driver
> > unload.
> > 
> > It might also be that the reset work is on the wrong workqueue.
> > 
> > Either way, this must be fixed, because I've seen too many of these "it
> > never happens in practice" blow up, plus if your locking scheme is
> > engineered with quicksand forget about anyone ever understanding it.
> 
> The solution is to allocate memory for the error capture in an atomic
> context if the error capture is being done from the G2H work queue. That
> means this can possibly fail if the system is under memory pressure but
> that is better than a lockdep splat.

Ah yeah if this is for error capture then GFP_ATOMIC is the right option.
-Daniel
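
The three-edge dependency chain quoted in this thread can be checked for a cycle with a few lines of userspace C. This is an illustration only and has nothing to do with how lockdep is implemented: the node names, has_cycle() and demo_chain() are made up, and the sketch assumes that an atomic allocation in the worker removes the "workqueue -> fs_reclaim" edge.

```c
#include <stdbool.h>
#include <string.h>

enum { FS_RECLAIM, RESET_MUTEX, G2H_WORK, NUM_NODES };

/* Depth-first search: can we get from 'from' back to 'target'? */
static bool reaches(bool edge[NUM_NODES][NUM_NODES], int from, int target,
                    bool visited[NUM_NODES])
{
    int next;

    for (next = 0; next < NUM_NODES; next++) {
        if (!edge[from][next])
            continue;
        if (next == target)
            return true;
        if (visited[next])
            continue;
        visited[next] = true;
        if (reaches(edge, next, target, visited))
            return true;
    }
    return false;
}

/* A cycle exists if any node can reach itself. */
static bool has_cycle(bool edge[NUM_NODES][NUM_NODES])
{
    int start;

    for (start = 0; start < NUM_NODES; start++) {
        bool visited[NUM_NODES] = { false };

        if (reaches(edge, start, start, visited))
            return true;
    }
    return false;
}

/* Build the chain from the thread, with or without the allocation edge. */
static bool demo_chain(bool atomic_alloc_in_worker)
{
    bool edge[NUM_NODES][NUM_NODES];

    memset(edge, 0, sizeof(edge));
    edge[FS_RECLAIM][RESET_MUTEX] = true;  /* shrinker */
    edge[RESET_MUTEX][G2H_WORK] = true;    /* reset flushes the worker */
    if (!atomic_alloc_in_worker)
        edge[G2H_WORK][FS_RECLAIM] = true; /* error capture allocates */

    return has_cycle(edge);
}
```

In this model demo_chain(false) reports the circular dependency and demo_chain(true) does not, which matches the reasoning that the atomic allocation breaks the chain at the error-capture step.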

> 
> Matt
> 
> > -Daniel
> > 
> > > 
> > > Splat below:
> > > 
> > > [  154.625989] ==
> > > [  154.632195] WARNING: possible circular locking dependency detected
> > > [  154.638393] 5.14.0-rc5-guc+ #50 Tainted: G U
> > > [  154.643991] --
> > > [  154.650196] i915_selftest/1673 is trying to acquire lock:
> > > [  154.655621] 8881079cb918 
> > > ((work_completion)(&ct->requests.worker)){+.+.}-{0:0}, at: 
> > > __flush_work+0x350/0x4d0
> > > [  154.665826]
> > >but task is already holding lock:
> > > [  154.671682] 8881079cbfb8 (&gt->reset.mutex){+.+.}-{3:3}, at: 
> > > intel_gt_reset+0xf0/0x300 [i915]
> > > [  154.680659]
> > >which lock already depends on the new lock.
> > > 
> > 

Re: [PATCH v1] drm/bridge: anx7625: Don't store unread return value

2021-08-16 Thread Sam Ravnborg
Hi Robert,

On Mon, Aug 16, 2021 at 01:14:51PM +0200, Robert Foss wrote:
> The return value of sp_tx_rst_aux() is stored, but never read.
> This happens in the context EDID communication already failing,
> which means that this additional failure doesn't necessarily
> convey any additional inforamation.
> 
> This means that we can safely avoid storing the value.
> 
> Fixes: 8bdfc5dae4e3 ("drm/bridge: anx7625: Add anx7625 MIPI DSI/DPI to DP")
> 
> Reported-by: kernel test robot 
> Signed-off-by: Robert Foss 
> ---
>  drivers/gpu/drm/bridge/analogix/anx7625.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c 
> b/drivers/gpu/drm/bridge/analogix/anx7625.c
> index 14d73fb1dd15b..3471785915c45 100644
> --- a/drivers/gpu/drm/bridge/analogix/anx7625.c
> +++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
> @@ -771,7 +771,7 @@ static int segments_edid_read(struct anx7625_data *ctx,
>   ret = sp_tx_aux_rd(ctx, 0xf1);
>  
>   if (ret) {
> - ret = sp_tx_rst_aux(ctx);
> + sp_tx_rst_aux(ctx);
>   DRM_DEV_ERROR(dev, "segment read fail, reset!\n");
>   } else {
>   ret = anx7625_reg_block_read(ctx, ctx->i2c.rx_p0_client,

From a quick look this seems to be the wrong fix.
Replacing return 0; with return ret; as the last line in this function
looks like the correct fix to me, together with a careful audit that the
error handling in said function is OK.

Sam
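
To make the difference between the two fixes concrete, here is a minimal
user-space sketch. It is not the anx7625 code: struct fake_dev,
fake_aux_read(), fake_aux_reset() and both edid_read_*() functions are
hypothetical stand-ins. It only shows that a function ending in "return 0;"
hides a read failure from its caller, while "return ret;" propagates it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device state, used only for this illustration. */
struct fake_dev {
	int aux_broken;	/* non-zero: every AUX read fails */
	int resets;	/* number of AUX resets issued */
};

/* Stand-in for an AUX read that may fail. */
static int fake_aux_read(const struct fake_dev *dev)
{
	return dev->aux_broken ? -EIO : 0;
}

/* Stand-in for the AUX reset; its own result is deliberately not tracked. */
static void fake_aux_reset(struct fake_dev *dev)
{
	dev->resets++;
}

/* Ends with "return 0;": the caller never learns that the read failed. */
int edid_read_swallow(struct fake_dev *dev)
{
	int ret;

	assert(dev);
	ret = fake_aux_read(dev);
	if (ret)
		fake_aux_reset(dev);

	return 0;
}

/* Ends with "return ret;": the original read error reaches the caller. */
int edid_read_propagate(struct fake_dev *dev)
{
	int ret;

	assert(dev);
	ret = fake_aux_read(dev);
	if (ret)
		fake_aux_reset(dev);

	return ret;
}
```

In both variants the reset result is dropped, as in the patch; only the
second one lets the caller react to the failed read.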


Re: [Intel-gfx] [PATCH v6 10/15] drm/i915/pxp: interfaces for using protected objects

2021-08-16 Thread Daniele Ceraolo Spurio




On 8/16/2021 8:15 AM, Daniel Vetter wrote:

On Fri, Aug 13, 2021 at 08:18:02AM -0700, Daniele Ceraolo Spurio wrote:


On 8/13/2021 7:37 AM, Daniel Vetter wrote:

On Wed, Jul 28, 2021 at 07:01:01PM -0700, Daniele Ceraolo Spurio wrote:

This API allows user mode to create protected buffers and to mark
contexts as making use of such objects. Only when using contexts
marked in such a way is the execution guaranteed to work as expected.

Contexts can only be marked as using protected content at creation time
(i.e. the parameter is immutable) and they must be both bannable and not
recoverable.

All protected objects and contexts that have backing storage will be
considered invalid when the PXP session is destroyed and all new
submissions using them will be rejected. All intel contexts within the
invalidated gem contexts will be marked banned. A new flag has been
added to the RESET_STATS ioctl to report the context invalidation to
userspace.

This patch was previously sent as 2 separate patches, which have been
squashed following a request to have all the uapi in a single patch.
I've retained the s-o-b from both.

v5: squash patches, rebase on proto_ctx, update kerneldoc

v6: rebase on obj create_ext changes

Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Bommu Krishnaiah 
Cc: Rodrigo Vivi 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
Reviewed-by: Rodrigo Vivi  #v5
---
   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 68 --
   drivers/gpu/drm/i915/gem/i915_gem_context.h   | 18 
   .../gpu/drm/i915/gem/i915_gem_context_types.h |  2 +
   drivers/gpu/drm/i915/gem/i915_gem_create.c| 75 
   .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 40 -
   drivers/gpu/drm/i915/gem/i915_gem_object.c|  6 ++
   drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++
   .../gpu/drm/i915/gem/i915_gem_object_types.h  |  9 ++
   drivers/gpu/drm/i915/pxp/intel_pxp.c  | 89 +++
   drivers/gpu/drm/i915/pxp/intel_pxp.h  | 15 
   drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  3 +
   drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  5 ++
   include/uapi/drm/i915_drm.h   | 55 +++-
   13 files changed, 371 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index cff72679ad7c..0cd3e2d06188 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -77,6 +77,8 @@
   #include "gt/intel_gpu_commands.h"
   #include "gt/intel_ring.h"
+#include "pxp/intel_pxp.h"
+
   #include "i915_gem_context.h"
   #include "i915_trace.h"
   #include "i915_user_extensions.h"
@@ -241,6 +243,25 @@ static int proto_context_set_persistence(struct 
drm_i915_private *i915,
return 0;
   }
+static int proto_context_set_protected(struct drm_i915_private *i915,
+  struct i915_gem_proto_context *pc,
+  bool protected)
+{
+   int ret = 0;
+
+   if (!intel_pxp_is_enabled(&i915->gt.pxp))
+   ret = -ENODEV;
+   else if (!protected)
+   pc->user_flags &= ~BIT(UCONTEXT_PROTECTED);
+   else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
+!(pc->user_flags & BIT(UCONTEXT_BANNABLE)))
+   ret = -EPERM;
+   else
+   pc->user_flags |= BIT(UCONTEXT_PROTECTED);
+
+   return ret;
+}
+
   static struct i915_gem_proto_context *
   proto_context_create(struct drm_i915_private *i915, unsigned int flags)
   {
@@ -686,6 +707,8 @@ static int set_proto_ctx_param(struct drm_i915_file_private 
*fpriv,
ret = -EPERM;
else if (args->value)
pc->user_flags |= BIT(UCONTEXT_BANNABLE);
+   else if (pc->user_flags & BIT(UCONTEXT_PROTECTED))
+   ret = -EPERM;
else
pc->user_flags &= ~BIT(UCONTEXT_BANNABLE);
break;
@@ -693,10 +716,12 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,
case I915_CONTEXT_PARAM_RECOVERABLE:
if (args->size)
ret = -EINVAL;
-   else if (args->value)
-   pc->user_flags |= BIT(UCONTEXT_RECOVERABLE);
-   else
+   else if (!args->value)
pc->user_flags &= ~BIT(UCONTEXT_RECOVERABLE);
+   else if (pc->user_flags & BIT(UCONTEXT_PROTECTED))
+   ret = -EPERM;
+   else
+   pc->user_flags |= BIT(UCONTEXT_RECOVERABLE);
break;
case I915_CONTEXT_PARAM_PRIORITY:
@@ -724,6 +749,11 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,
args->value);
break;
+   case I915_CONTEXT_PARAM

Re: [PATCH] drm/i915/dp: Use max params for older panels

2021-08-16 Thread Ville Syrjälä
On Wed, Aug 04, 2021 at 11:24:02PM +0800, Kai-Heng Feng wrote:
> Users reported that after commit 2bbd6dba84d4 ("drm/i915: Try to use
> fast+narrow link on eDP again and fall back to the old max strategy on
> failure"), the screen starts to have a wobbly effect.
> 
> Commit a5c936add6a2 ("drm/i915/dp: Use slow and wide link training for
> everything") doesn't help either, which means the affected panels only
> work with max params.

Unfortunate that the link training apparently passes with the bad
params and thus the automagic use_max_params fallback doesn't kick in
:(

> 
> The panels are all DP 1.1 ones, so apply max params to them to resolve
> the issue.
> 
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3714
> Fixes: 2bbd6dba84d4 ("drm/i915: Try to use fast+narrow link on eDP again and 
> fall back to the old max strategy on failure")
> Fixes: a5c936add6a2 ("drm/i915/dp: Use slow and wide link training for 
> everything")
> Signed-off-by: Kai-Heng Feng 
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> b/drivers/gpu/drm/i915/display/intel_dp.c
> index 75d4ebc669411..e64bab4b016e1 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -1330,14 +1330,16 @@ intel_dp_compute_link_config(struct intel_encoder 
> *encoder,
>   limits.min_bpp = intel_dp_min_bpp(pipe_config->output_format);
>   limits.max_bpp = intel_dp_max_bpp(intel_dp, pipe_config);
>  
> - if (intel_dp->use_max_params) {
> + if (intel_dp->use_max_params ||
> + intel_dp->dpcd[DP_DPCD_REV] <= DP_DPCD_REV_11) {

IIRC Windows uses the optimal link rate only for EDP_REV >= 1.4.
We should probably do the same to minimize future headaches.

>   /*
>* Use the maximum clock and number of lanes the eDP panel
>* advertizes being capable of in case the initial fast
> -  * optimal params failed us. The panels are generally
> -  * designed to support only a single clock and lane
> -  * configuration, and typically on older panels these
> -  * values correspond to the native resolution of the panel.
> +  * optimal params failed us or the panel is DP 1.1 or earlier.
> +  * The panels are generally designed to support only a single
> +  * clock and lane configuration, and typically on older panels
> +  * these values correspond to the native resolution of the
> +  * panel.
>*/
>   limits.min_lane_count = limits.max_lane_count;
>   limits.min_clock = limits.max_clock;
> -- 
> 2.31.1

-- 
Ville Syrjälä
Intel
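
The condition suggested above could be sketched roughly as follows. This is
a stand-alone illustration, not i915 code: want_max_params() and the
edp_rev_stub enum are hypothetical, and the numeric revision codes are an
assumption made for the example (the real values live in the DRM DisplayPort
helper header).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative eDP revision codes. The values are assumed for this sketch
 * only; they are not taken from the patch under discussion.
 */
enum edp_rev_stub {
	EDP_REV_STUB_11 = 0,
	EDP_REV_STUB_12,
	EDP_REV_STUB_13,
	EDP_REV_STUB_14,
};

/*
 * Returns true when the link should use the maximum clock and lane count:
 * either the fast+narrow attempt already failed (use_max_params), or the
 * panel reports an eDP revision older than the assumed 1.4 threshold.
 */
bool want_max_params(bool use_max_params, uint8_t edp_rev)
{
	return use_max_params || edp_rev < EDP_REV_STUB_14;
}
```

Keying the decision on the eDP revision instead of the DPCD revision is the
only difference from the check in the quoted hunk.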


[PATCH v2] drm/i915: Ditch the i915_gem_ww_ctx loop member

2021-08-16 Thread Thomas Hellström
It's only used by the for_i915_gem_ww() macro and we can use
the (typically) on-stack _err variable in its place.

v2:
- Don't clear the _err variable when entering the loop
  (Matthew Auld, Maarten Lankhorst).
- Use parentheses around the _err macro argument.
- Fix up comment.

Cc: Matthew Auld 
Suggested-by: Maarten Lankhorst 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/i915_gem_ww.h | 25 -
 1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h 
b/drivers/gpu/drm/i915/i915_gem_ww.h
index f6b1a796667b..86f0fe343de6 100644
--- a/drivers/gpu/drm/i915/i915_gem_ww.h
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -11,8 +11,7 @@ struct i915_gem_ww_ctx {
struct ww_acquire_ctx ctx;
struct list_head obj_list;
struct drm_i915_gem_object *contended;
-   unsigned short intr;
-   unsigned short loop;
+   bool intr;
 };
 
 void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
@@ -20,31 +19,23 @@ void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx);
 int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx);
 void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
 
-/* Internal functions used by the inlines! Don't use. */
+/* Internal function used by the inlines! Don't use. */
 static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
 {
-   ww->loop = 0;
if (err == -EDEADLK) {
err = i915_gem_ww_ctx_backoff(ww);
if (!err)
-   ww->loop = 1;
+   err = -EDEADLK;
}
 
-   if (!ww->loop)
+   if (err != -EDEADLK)
i915_gem_ww_ctx_fini(ww);
 
return err;
 }
 
-static inline void
-__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr)
-{
-   i915_gem_ww_ctx_init(ww, intr);
-   ww->loop = 1;
-}
-
-#define for_i915_gem_ww(_ww, _err, _intr)  \
-   for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;   \
-_err = __i915_gem_ww_fini(_ww, _err))
-
+#define for_i915_gem_ww(_ww, _err, _intr)\
+   for (i915_gem_ww_ctx_init(_ww, _intr), (_err) = -EDEADLK; \
+(_err) == -EDEADLK;  \
+(_err) = __i915_gem_ww_fini(_ww, _err))
 #endif
-- 
2.31.1
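
For readers who want to see the loop shape in isolation, here is a
user-space sketch of the same idea: the caller's error variable doubles as
the loop state, so no separate loop member is needed. Everything suffixed
with _stub is hypothetical and only counts calls; it is not the i915
implementation.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical context that only counts how often each step ran. */
struct ww_ctx_stub {
	int inits;
	int backoffs;
	int finis;
};

static void ww_ctx_init_stub(struct ww_ctx_stub *ww)
{
	ww->inits++;
}

/* Pretend backoff always succeeds. */
static int ww_ctx_backoff_stub(struct ww_ctx_stub *ww)
{
	ww->backoffs++;
	return 0;
}

static void ww_ctx_fini_stub(struct ww_ctx_stub *ww)
{
	ww->finis++;
}

/* Mirrors the patched helper: -EDEADLK after a good backoff means "retry". */
static int ww_fini_stub(struct ww_ctx_stub *ww, int err)
{
	if (err == -EDEADLK) {
		err = ww_ctx_backoff_stub(ww);
		if (!err)
			err = -EDEADLK;
	}

	if (err != -EDEADLK)
		ww_ctx_fini_stub(ww);

	return err;
}

/* The error variable itself drives the loop; it starts out as -EDEADLK. */
#define for_ww_stub(_ww, _err)					\
	for (ww_ctx_init_stub(_ww), (_err) = -EDEADLK;		\
	     (_err) == -EDEADLK;				\
	     (_err) = ww_fini_stub(_ww, _err))

/* Simulates a transaction that hits contention `contended` times. */
int run_transaction_stub(struct ww_ctx_stub *ww, int contended)
{
	int attempt = 0;
	int err;

	assert(ww);
	for_ww_stub(ww, err) {
		/* The loop body must assign err on every pass. */
		err = (attempt++ < contended) ? -EDEADLK : 0;
	}

	return err;
}
```

With two simulated contentions the body runs three times, backoff runs
twice, and init and fini each run exactly once.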



Re: [PATCH v1 1/2] drm/msm/dp: Add support for SC7280 eDP

2021-08-16 Thread Matthias Kaehlcke
On Thu, Aug 12, 2021 at 05:38:01AM +0530, Sankeerth Billakanti wrote:
> The eDP controller on SC7280 is similar to the eDP/DP controllers
> supported by the current driver implementation.
> 
> SC7280 supports one eDP and one DP controller which can operate
> concurrently.
> 
> The following are some required changes for the sc7280 sink:
> 1. Additional gpio configuration for backlight and pwm via pmic.
> 2. ASSR support programming on the sink.
> 3. SSC support programming on the sink.
> 
> Signed-off-by: Sankeerth Billakanti 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c |  4 ++--
>  drivers/gpu/drm/msm/dp/dp_ctrl.c   | 19 +++
>  drivers/gpu/drm/msm/dp/dp_display.c| 32 
> --
>  drivers/gpu/drm/msm/dp/dp_parser.c | 31 +
>  drivers/gpu/drm/msm/dp/dp_parser.h |  5 
>  5 files changed, 87 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index b131fd37..1096c44 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -856,9 +856,9 @@ static const struct dpu_intf_cfg sm8150_intf[] = {
>  };
>  
>  static const struct dpu_intf_cfg sc7280_intf[] = {
> - INTF_BLK("intf_0", INTF_0, 0x34000, INTF_DP, 0, 24, INTF_SC7280_MASK, 
> MDP_SSPP_TOP0_INTR, 24, 25),
> + INTF_BLK("intf_0", INTF_0, 0x34000, INTF_DP, 1, 24, INTF_SC7280_MASK, 
> MDP_SSPP_TOP0_INTR, 24, 25),
>   INTF_BLK("intf_1", INTF_1, 0x35000, INTF_DSI, 0, 24, INTF_SC7280_MASK, 
> MDP_SSPP_TOP0_INTR, 26, 27),
> - INTF_BLK("intf_5", INTF_5, 0x39000, INTF_EDP, 0, 24, INTF_SC7280_MASK, 
> MDP_SSPP_TOP0_INTR, 22, 23),
> + INTF_BLK("intf_5", INTF_5, 0x39000, INTF_DP, 0, 24, INTF_SC7280_MASK, 
> MDP_SSPP_TOP0_INTR, 22, 23),
>  };
>  
>  /*
> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c 
> b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> index d2569da..06d5a2d 100644
> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> @@ -1244,7 +1244,9 @@ static int dp_ctrl_link_train(struct dp_ctrl_private 
> *ctrl,
>   struct dp_cr_status *cr, int *training_step)
>  {
>   int ret = 0;
> + u8 *dpcd = ctrl->panel->dpcd;
>   u8 encoding = DP_SET_ANSI_8B10B;
> + u8 ssc = 0, assr = 0;
>   struct dp_link_info link_info = {0};
>  
>   dp_ctrl_config_ctrl(ctrl);
> @@ -1254,9 +1256,21 @@ static int dp_ctrl_link_train(struct dp_ctrl_private 
> *ctrl,
>   link_info.capabilities = DP_LINK_CAP_ENHANCED_FRAMING;
>  
>   dp_aux_link_configure(ctrl->aux, &link_info);
> +
> + if (dpcd[DP_MAX_DOWNSPREAD] & DP_MAX_DOWNSPREAD_0_5) {
> + ssc = DP_SPREAD_AMP_0_5;
> + drm_dp_dpcd_write(ctrl->aux, DP_DOWNSPREAD_CTRL, &ssc, 1);
> + }
> +
>   drm_dp_dpcd_write(ctrl->aux, DP_MAIN_LINK_CHANNEL_CODING_SET,
>   &encoding, 1);
>  
> + if (dpcd[DP_EDP_CONFIGURATION_CAP] & DP_ALTERNATE_SCRAMBLER_RESET_CAP) {
> + assr = DP_ALTERNATE_SCRAMBLER_RESET_ENABLE;
> + drm_dp_dpcd_write(ctrl->aux, DP_EDP_CONFIGURATION_SET,
> + &assr, 1);
> + }
> +
>   ret = dp_ctrl_link_train_1(ctrl, cr, training_step);
>   if (ret) {
>   DRM_ERROR("link training #1 failed. ret=%d\n", ret);
> @@ -1328,9 +1342,11 @@ static int dp_ctrl_enable_mainlink_clocks(struct 
> dp_ctrl_private *ctrl)
>   struct dp_io *dp_io = &ctrl->parser->io;
>   struct phy *phy = dp_io->phy;
>   struct phy_configure_opts_dp *opts_dp = &dp_io->phy_opts.dp;
> + u8 *dpcd = ctrl->panel->dpcd;
>  
>   opts_dp->lanes = ctrl->link->link_params.num_lanes;
>   opts_dp->link_rate = ctrl->link->link_params.rate / 100;
> + opts_dp->ssc = dpcd[DP_MAX_DOWNSPREAD] & DP_MAX_DOWNSPREAD_0_5;
>   dp_ctrl_set_clock_rate(ctrl, DP_CTRL_PM, "ctrl_link",
>   ctrl->link->link_params.rate * 1000);
>  
> @@ -1760,6 +1776,9 @@ int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl)
>   ctrl->link->link_params.num_lanes = ctrl->panel->link_info.num_lanes;
>   ctrl->dp_ctrl.pixel_rate = ctrl->panel->dp_mode.drm_mode.clock;
>  
> + if (ctrl->dp_ctrl.pixel_rate == 0)
> + return -EINVAL;
> +
>   DRM_DEBUG_DP("rate=%d, num_lanes=%d, pixel_rate=%d\n",
>   ctrl->link->link_params.rate,
>   ctrl->link->link_params.num_lanes, ctrl->dp_ctrl.pixel_rate);
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
> b/drivers/gpu/drm/msm/dp/dp_display.c
> index ee5bf64..a772290 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -117,8 +117,36 @@ struct dp_display_private {
>   struct dp_audio *audio;
>  };
>  
> +struct msm_dp_config {
> + phys_addr_t io_start[3];
> + size_t num_

Re: [PATCH v1 2/2] dt-bindings: Add SC7280 compatible string

2021-08-16 Thread Matthias Kaehlcke
On Thu, Aug 12, 2021 at 05:38:02AM +0530, Sankeerth Billakanti wrote:
> The Qualcomm SC7280 platform supports an eDP controller, add
> compatible string for it to msm/binding.
> 
> Signed-off-by: Sankeerth Billakanti 
> ---
>  Documentation/devicetree/bindings/display/msm/dp-controller.yaml | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/display/msm/dp-controller.yaml 
> b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> index 64d8d9e..23b78ac 100644
> --- a/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> +++ b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> @@ -17,6 +17,9 @@ properties:
>compatible:
>  enum:
>- qcom,sc7180-dp
> +  - qcom,sc8180x-dp
> +  - qcom,sc8180x-edp
> +  - qcom,sc7280-edp

This adds compatible strings for sc8180x and sc7280 (e)DP, however the
commit message only mentions sc7280. So either the commit message needs
an update or the sc8180x compatibles should be removed.

The driver change 
(https://patchwork.kernel.org/project/linux-arm-msm/patch/1628726882-27841-2-git-send-email-sbill...@codeaurora.org/)
adds some (currently unused) 'io_start' addresses which are hardcoded,
I wonder if these should be in the device tree instead (and 'num_dp'
too?), if they are needed at all.


Re: [PATCH] drm/radeon: Add break to switch statement in radeonfb_create_pinned_object()

2021-08-16 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, Aug 16, 2021 at 3:23 AM Christian König
 wrote:
>
> Am 15.08.21 um 21:29 schrieb Nathan Chancellor:
> > Clang + -Wimplicit-fallthrough warns:
> >
> > drivers/gpu/drm/radeon/radeon_fb.c:170:2: warning: unannotated
> > fall-through between switch labels [-Wimplicit-fallthrough]
> >  default:
> >  ^
> > drivers/gpu/drm/radeon/radeon_fb.c:170:2: note: insert 'break;' to avoid
> > fall-through
> >  default:
> >  ^
> >  break;
> > 1 warning generated.
> >
> > Clang's version of this warning is a little bit more pedantic than
> > GCC's. Add the missing break to satisfy it to match what has been done
> > all over the kernel tree.
> >
> > Signed-off-by: Nathan Chancellor 
>
> Reviewed-by: Christian König 
>
> > ---
> >   drivers/gpu/drm/radeon/radeon_fb.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/radeon/radeon_fb.c 
> > b/drivers/gpu/drm/radeon/radeon_fb.c
> > index 0b206b052972..c8b545181497 100644
> > --- a/drivers/gpu/drm/radeon/radeon_fb.c
> > +++ b/drivers/gpu/drm/radeon/radeon_fb.c
> > @@ -167,6 +167,7 @@ static int radeonfb_create_pinned_object(struct 
> > radeon_fbdev *rfbdev,
> >   break;
> >   case 2:
> >   tiling_flags |= RADEON_TILING_SWAP_16BIT;
> > + break;
> >   default:
> >   break;
> >   }
> >
> > base-commit: ba31f97d43be41ca99ab72a6131d7c226306865f
>


Re: [PATCH v6 04/13] drm/amdkfd: add SPM support for SVM

2021-08-16 Thread Felix Kuehling
Am 2021-08-15 um 5:10 a.m. schrieb Christoph Hellwig:
>> @@ -880,17 +881,22 @@ int svm_migrate_init(struct amdgpu_device *adev)
>>   * should remove reserved size
>>   */
>>  size = ALIGN(adev->gmc.real_vram_size, 2ULL << 20);
>> -res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
>> +if (xgmi_connected_to_cpu)
>> +res = lookup_resource(&iomem_resource, adev->gmc.aper_base);
>> +else
>> +res = devm_request_free_mem_region(adev->dev, &iomem_resource, 
>> size);
>> +
> Can you explain what the point of the lookup_resource is here? res->start
> is obviously identical to the start value you pass in.  So this is used
> as a way to query the length, but I'm pretty sure the driver must
> already know that as it inserted the resource itself, right?

I think you're right. We only need the start and end address from
lookup_resource and we already know that anyway. It means we can drop
patch 3 from the series.

Just to be sure, we'll confirm that the end address determined by our
driver matches the one from lookup_resource (coming from the system
address map in the system BIOS). If there were a mismatch, it would
probably be a bug (in the driver or the BIOS) that we'd need to fix anyway.


>
> On a slightly higher level comment svm_migrate_init is a bit of a mess
> with all the if/else already, and with the above addressed will become
> a bit more.  I think splitting it into a device private and device
> generic case would probably help people finding it to understand the code
> much better later on.  Even more so with a useful comment.

I don't really see the "mess" you're talking about. Including the above,
there are only 3 conditional statements in that function that are not
error-handling related:

/* Page migration works on Vega10 or newer */
if (kfddev->device_info->asic_family < CHIP_VEGA10)
return -EINVAL;
...
if (xgmi_connected_to_cpu)
res = lookup_resource(&iomem_resource, adev->gmc.aper_base);
else
res = devm_request_free_mem_region(adev->dev, &iomem_resource, 
size);
...
pgmap->type = xgmi_connected_to_cpu ?
MEMORY_DEVICE_GENERIC : MEMORY_DEVICE_PRIVATE;


Regards,
  Felix




Re: [PATCH v6 02/13] mm: remove extra ZONE_DEVICE struct page refcount

2021-08-16 Thread Felix Kuehling
Am 2021-08-15 um 4:40 p.m. schrieb John Hubbard:
> On 8/15/21 8:37 AM, Christoph Hellwig wrote:
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 8ae31622deef..d48a1f0889d1 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -1218,7 +1218,7 @@ __maybe_unused struct page
>>> *try_grab_compound_head(struct page *page, int refs,
>>>   static inline __must_check bool try_get_page(struct page *page)
>>>   {
>>>   page = compound_head(page);
>>> -    if (WARN_ON_ONCE(page_ref_count(page) <= 0))
>>> +    if (WARN_ON_ONCE(page_ref_count(page) <
>>> (int)!is_zone_device_page(page)))
>>
>> Please avoid the overly long line.  In fact I'd be tempted to just not
>> bother here and keep the old, looser check.  Especially given that
>> John has a patch ready that removes try_get_page entirely.
>>
>
> Yes. Andrew has accepted it into mmotm.
>
> Ralph's patch here was written well before my cleanup that removed
> try_grab_page() [1]. But now that we're here, if you drop this hunk then
> it will make merging easier, I think.
>
>
> [1]
> https://lore.kernel.org/r/20210813044133.1536842-4-jhubb...@nvidia.com

Hi John,

Thanks for the pointer. We'll drop this hunk and add a statement to our
patch description to highlight the dependency on your patch.

Regards,
  Felix


>
> thanks,
> -- 
> John Hubbard
> NVIDIA
>


Re: [PATCH v6 08/13] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

2021-08-16 Thread Felix Kuehling


Am 2021-08-15 um 11:40 a.m. schrieb Christoph Hellwig:
> On Fri, Aug 13, 2021 at 01:31:45AM -0500, Alex Sierra wrote:
>> Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback.
>> Device generic type memory case is now able to free its pages properly.
> How is this going to work for the two existing MEMORY_DEVICE_GENERIC
> that now change behavior?  And which don't have a ->page_free callback
> at all?

That's a good catch. Existing drivers shouldn't need a page_free
callback if they didn't have one before. That means we need to add a
NULL-pointer check in free_device_page.

Regards,
  Felix
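
A minimal sketch of such a check, in user-space C. The *_stub types and
functions are hypothetical stand-ins for the dev_pagemap structures, not
the mm/memremap.c code; the point is only that a driver without a
page_free callback is skipped instead of dereferencing a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page: only records whether the driver callback ran. */
struct page_stub {
	int freed;
};

/* Hypothetical ops table; page_free is optional and may be NULL. */
struct pagemap_ops_stub {
	void (*page_free)(struct page_stub *page);
};

struct pagemap_stub {
	const struct pagemap_ops_stub *ops;
};

/* Example driver callback. */
void mark_freed_stub(struct page_stub *page)
{
	page->freed = 1;
}

/*
 * Calls the driver's page_free callback only if one was provided.
 * Returns 1 if the callback ran, 0 if there was none.
 */
int free_device_page_stub(const struct pagemap_stub *pgmap,
			  struct page_stub *page)
{
	assert(pgmap && page);

	if (pgmap->ops && pgmap->ops->page_free) {
		pgmap->ops->page_free(page);
		return 1;
	}

	return 0;
}
```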


>
>> Signed-off-by: Alex Sierra 
>> ---
>>  mm/memremap.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/memremap.c b/mm/memremap.c
>> index 5aa8163fd948..5773e15b6ac9 100644
>> --- a/mm/memremap.c
>> +++ b/mm/memremap.c
>> @@ -459,7 +459,7 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
>>  EXPORT_SYMBOL_GPL(get_dev_pagemap);
>>  
>>  #ifdef CONFIG_DEV_PAGEMAP_OPS
>> -static void free_device_private_page(struct page *page)
>> +static void free_device_page(struct page *page)
>>  {
>>  
>>  __ClearPageWaiters(page);
>> @@ -498,7 +498,8 @@ void free_zone_device_page(struct page *page)
>>  wake_up_var(&page->_refcount);
>>  return;
>>  case MEMORY_DEVICE_PRIVATE:
>> -free_device_private_page(page);
>> +case MEMORY_DEVICE_GENERIC:
>> +free_device_page(page);
>>  return;
>>  default:
>>  return;
>> -- 
>> 2.32.0
> ---end quoted text---
>


Re: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, Aug 16, 2021 at 11:07 AM Michel Dänzer  wrote:
>
> On 2021-08-16 2:06 p.m., Christian König wrote:
> > Am 16.08.21 um 13:33 schrieb Lazar, Lijo:
> >> On 8/16/2021 4:05 PM, Michel Dänzer wrote:
> >>> From: Michel Dänzer 
> >>>
> >>> schedule_delayed_work does not push back the work if it was already
> >>> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> >>> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
> >>> was disabled and re-enabled again during those 100 ms.
> >>>
> >>> This resulted in frame drops / stutter with the upcoming mutter 41
> >>> release on Navi 14, due to constantly enabling GFXOFF in the HW and
> >>> disabling it again (for getting the GPU clock counter).
> >>>
> >>> To fix this, call cancel_delayed_work_sync when the disable count
> >>> transitions from 0 to 1, and only schedule the delayed work on the
> >>> reverse transition, not if the disable count was already 0. This makes
> >>> sure the delayed work doesn't run at unexpected times, and allows it to
> >>> be lock-free.
> >>>
> >>> v2:
> >>> * Use cancel_delayed_work_sync & mutex_trylock instead of
> >>>mod_delayed_work.
> >>> v3:
> >>> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
> >>>
> >>> Cc: sta...@vger.kernel.org
> >>> Signed-off-by: Michel Dänzer 
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 22 +-
> >>>   2 files changed, 22 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> index f3fd5ec710b6..f944ed858f3e 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> @@ -2777,12 +2777,11 @@ static void 
> >>> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
> >>>   struct amdgpu_device *adev =
> >>>   container_of(work, struct amdgpu_device, 
> >>> gfx.gfx_off_delay_work.work);
> >>>   -mutex_lock(&adev->gfx.gfx_off_mutex);
> >>> -if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
> >>> -if (!amdgpu_dpm_set_powergating_by_smu(adev, 
> >>> AMD_IP_BLOCK_TYPE_GFX, true))
> >>> -adev->gfx.gfx_off_state = true;
> >>> -}
> >>> -mutex_unlock(&adev->gfx.gfx_off_mutex);
> >>> +WARN_ON_ONCE(adev->gfx.gfx_off_state);
> >>
> >> Don't see any case for this. It's not expected to be scheduled in this 
> >> case, right?
> >>
> >>> + WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
> >>> +
> >>
> >> Thinking about ON_ONCE here - this may happen more than once if it's 
> >> completed as part of cancel_ call. Is the warning needed?
> >
> > WARN_ON_ONCE() is usually used to prevent spamming the system log with 
> > warnings. E.g. the warning is only printed once indicating a driver bug and 
> > that's it.
>
> Right, these WARN_ONs are like assert()s in user-space code, documenting the 
> pre-conditions and checking them at runtime. And I use _ONCE so that if a 
> pre-condition is ever violated for some reason, dmesg isn't spammed with 
> multiple warnings.
>
>
> >> Anyway,
> >> Reviewed-by: Lijo Lazar 
> >
> > Acked-by: Christian König 
>
> Thanks guys!
>
>
> --
> Earthling Michel Dänzer   |   https://redhat.com
> Libre software enthusiast | Mesa and X developer
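
The counting scheme described in the commit message can be sketched in
plain C as below. This is an illustration only: struct gfx_off_stub and
gfx_off_ctrl_stub() are hypothetical, the delayed work is modelled as a
flag, and the locking that the real driver needs is left out.

```c
#include <assert.h>

/* Hypothetical model of the GFXOFF bookkeeping. */
struct gfx_off_stub {
	unsigned int req_count;	/* outstanding "disable GFXOFF" requests */
	int work_pending;	/* models the delayed work being queued */
	int cancels;		/* how often the work was cancelled */
	int schedules;		/* how often the work was scheduled */
};

/*
 * enable == 0: a caller needs GFXOFF disabled.
 * enable != 0: a caller drops its earlier disable request.
 */
void gfx_off_ctrl_stub(struct gfx_off_stub *g, int enable)
{
	assert(g);

	if (!enable) {
		/* Cancel pending work only on the 0 -> 1 transition. */
		if (g->req_count++ == 0) {
			g->work_pending = 0;
			g->cancels++;
		}
		return;
	}

	/* Already at 0: nothing to release, so do not schedule again. */
	if (g->req_count == 0)
		return;

	/* Schedule the delayed enable only on the 1 -> 0 transition. */
	if (--g->req_count == 0) {
		g->work_pending = 1;
		g->schedules++;
	}
}
```

Because the work is cancelled before the count leaves 0 and queued only
when it returns to 0, the work function can assume that the count is 0 when
it runs, which is what the WARN_ON_ONCE() checks in the patch document.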


[RFC PATCH 0/5] drm/mediatek: Add mt8195 DisplayPort driver

2021-08-16 Thread Markus Schneider-Pargmann
Hi everyone,

this series is built around the DisplayPort driver. The dpi/dpintf driver and
the added helper functions are required for the DisplayPort.

Note that this is an RFC. I would like to have your opinion on the driver and
what needs to change. The driver itself still has rough edges that I am
working on; in particular, the training and powerup/down need some work in
my opinion. The code compiles without issues but is not fully tested yet.

However I already wanted to reach out for some feedback to see what can and
should be improved. I am happy about every comment, thanks for taking the
time.

The series is currently based on v5.14-rc5 but it requires other patches to
actually work on mt8195 (clock, etc.). A binding documentation is not included
yet.

Thanks,
Markus

Markus Schneider-Pargmann (5):
  dt-bindings: mediatek,dpi: Add mt8195 dpintf
  drm/mediatek: dpi: Add dpintf support
  drm/edid: Add cea_sad helpers for freq/length
  video/hdmi: Add audio_infoframe packing for DP
  drm/mediatek: Add mt8195 DisplayPort driver

 .../display/mediatek/mediatek,dpi.yaml|   48 +-
 drivers/gpu/drm/drm_edid.c|   57 +
 drivers/gpu/drm/mediatek/Kconfig  |7 +
 drivers/gpu/drm/mediatek/Makefile |2 +
 drivers/gpu/drm/mediatek/mtk_dp.c | 3025 
 drivers/gpu/drm/mediatek/mtk_dp_reg.h | 3095 +
 drivers/gpu/drm/mediatek/mtk_dpi.c|  282 +-
 drivers/gpu/drm/mediatek/mtk_dpi_regs.h   |   12 +
 drivers/video/hdmi.c  |   87 +-
 include/drm/drm_edid.h|   18 +-
 include/linux/hdmi.h  |4 +
 11 files changed, 6564 insertions(+), 73 deletions(-)
 create mode 100644 drivers/gpu/drm/mediatek/mtk_dp.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_dp_reg.h

-- 
2.32.0



[RFC PATCH 3/5] drm/edid: Add cea_sad helpers for freq/length

2021-08-16 Thread Markus Schneider-Pargmann
This patch adds two helper functions that extract the frequency and word
length from a struct cea_sad.

For these helper functions new defines are added that help translate the
'freq' and 'byte2' fields into real numbers.

Signed-off-by: Markus Schneider-Pargmann 
---
 drivers/gpu/drm/drm_edid.c | 57 ++
 include/drm/drm_edid.h | 18 ++--
 2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 81d5f2524246..2389d34ce10e 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4666,6 +4666,63 @@ int drm_edid_to_speaker_allocation(struct edid *edid, u8 
**sadb)
 }
 EXPORT_SYMBOL(drm_edid_to_speaker_allocation);
 
+/**
+ * drm_cea_sad_get_sample_rate - Extract the sample rate from cea_sad
+ * @sad: Pointer to the cea_sad struct
+ *
+ * Extracts the cea_sad frequency field and returns the sample rate in Hz.
+ *
+ * Return: Sample rate in Hz or a negative errno if parsing failed.
+ */
+int drm_cea_sad_get_sample_rate(struct cea_sad *sad)
+{
+   switch (sad->freq) {
+   case CEA_SAD_FREQ_32KHZ:
+   return 32000;
+   case CEA_SAD_FREQ_44KHZ:
+   return 44100;
+   case CEA_SAD_FREQ_48KHZ:
+   return 48000;
+   case CEA_SAD_FREQ_88KHZ:
+   return 88200;
+   case CEA_SAD_FREQ_96KHZ:
+   return 96000;
+   case CEA_SAD_FREQ_176KHZ:
+   return 176400;
+   case CEA_SAD_FREQ_192KHZ:
+   return 192000;
+   default:
+   return -EINVAL;
+   }
+}
+EXPORT_SYMBOL(drm_cea_sad_get_sample_rate);
+
+/**
+ * drm_cea_sad_get_uncompressed_word_length - Extract word length
+ * @sad: Pointer to the cea_sad struct
+ *
+ * Extracts the cea_sad byte2 field and returns the word length for an
+ * uncompressed stream.
+ *
+ * Note: This function may only be called for uncompressed audio.
+ *
+ * Return: Word length in bits or a negative errno if parsing failed.
+ */
+int drm_cea_sad_get_uncompressed_word_length(struct cea_sad *sad)
+{
+   switch (sad->byte2) {
+   case CEA_SAD_UNCOMPRESSED_WORD_16BIT:
+   return 16;
+   case CEA_SAD_UNCOMPRESSED_WORD_20BIT:
+   return 20;
+   case CEA_SAD_UNCOMPRESSED_WORD_24BIT:
+   return 24;
+   default:
+   return -EINVAL;
+   }
+}
+EXPORT_SYMBOL(drm_cea_sad_get_uncompressed_word_length);
+
 /**
  * drm_av_sync_delay - compute the HDMI/DP sink audio-video sync delay
  * @connector: connector associated with the HDMI/DP sink
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 759328a5eeb2..bed091a749ef 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -361,12 +361,24 @@ struct edid {
 
 /* Short Audio Descriptor */
 struct cea_sad {
-   u8 format;
+   u8 format; /* See HDMI_AUDIO_CODING_TYPE_* */
u8 channels; /* max number of channels - 1 */
-   u8 freq;
+   u8 freq; /* See CEA_SAD_FREQ_* */
u8 byte2; /* meaning depends on format */
 };
 
+#define CEA_SAD_FREQ_32KHZ  BIT(0)
+#define CEA_SAD_FREQ_44KHZ  BIT(1)
+#define CEA_SAD_FREQ_48KHZ  BIT(2)
+#define CEA_SAD_FREQ_88KHZ  BIT(3)
+#define CEA_SAD_FREQ_96KHZ  BIT(4)
+#define CEA_SAD_FREQ_176KHZ BIT(5)
+#define CEA_SAD_FREQ_192KHZ BIT(6)
+
+#define CEA_SAD_UNCOMPRESSED_WORD_16BIT BIT(0)
+#define CEA_SAD_UNCOMPRESSED_WORD_20BIT BIT(1)
+#define CEA_SAD_UNCOMPRESSED_WORD_24BIT BIT(2)
+
 struct drm_encoder;
 struct drm_connector;
 struct drm_connector_state;
@@ -374,6 +386,8 @@ struct drm_display_mode;
 
 int drm_edid_to_sad(struct edid *edid, struct cea_sad **sads);
 int drm_edid_to_speaker_allocation(struct edid *edid, u8 **sadb);
+int drm_cea_sad_get_sample_rate(struct cea_sad *sad);
+int drm_cea_sad_get_uncompressed_word_length(struct cea_sad *sad);
 int drm_av_sync_delay(struct drm_connector *connector,
  const struct drm_display_mode *mode);
 
-- 
2.32.0
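
A stand-alone sketch of how the sample-rate helper behaves, for anyone
reviewing the switch above. The SAD_FREQ_STUB_* macros, struct sad_stub and
sad_sample_rate_stub() are hypothetical user-space copies written for this
example; only the bit positions and rates are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* One bit per supported sample rate, same bit positions as in the patch. */
#define SAD_FREQ_STUB_32KHZ	(1u << 0)
#define SAD_FREQ_STUB_44KHZ	(1u << 1)
#define SAD_FREQ_STUB_48KHZ	(1u << 2)
#define SAD_FREQ_STUB_88KHZ	(1u << 3)
#define SAD_FREQ_STUB_96KHZ	(1u << 4)
#define SAD_FREQ_STUB_176KHZ	(1u << 5)
#define SAD_FREQ_STUB_192KHZ	(1u << 6)

/* Hypothetical user-space copy of the short audio descriptor layout. */
struct sad_stub {
	uint8_t format;
	uint8_t channels;
	uint8_t freq;
	uint8_t byte2;
};

/* Returns the sample rate in Hz, or -EINVAL for any other freq value. */
int sad_sample_rate_stub(const struct sad_stub *sad)
{
	assert(sad);

	switch (sad->freq) {
	case SAD_FREQ_STUB_32KHZ:
		return 32000;
	case SAD_FREQ_STUB_44KHZ:
		return 44100;
	case SAD_FREQ_STUB_48KHZ:
		return 48000;
	case SAD_FREQ_STUB_88KHZ:
		return 88200;
	case SAD_FREQ_STUB_96KHZ:
		return 96000;
	case SAD_FREQ_STUB_176KHZ:
		return 176400;
	case SAD_FREQ_STUB_192KHZ:
		return 192000;
	default:
		return -EINVAL;
	}
}
```

Since the switch compares the whole field, a freq value with more than one
bit set ends up in the default case and yields -EINVAL, so a caller holding
a multi-rate mask would have to pick a single bit first.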



[RFC PATCH 1/5] dt-bindings: mediatek,dpi: Add mt8195 dpintf

2021-08-16 Thread Markus Schneider-Pargmann
DP_INTF is similar to the actual dpi. They differ in some points
regarding registers and what needs to be set, but the function blocks
themselves are similar in design.

Signed-off-by: Markus Schneider-Pargmann 
---
 .../display/mediatek/mediatek,dpi.yaml| 48 ---
 1 file changed, 42 insertions(+), 6 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.yaml 
b/Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.yaml
index dd2896a40ff0..de4bdacd83ac 100644
--- a/Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.yaml
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.yaml
@@ -4,7 +4,7 @@
 $id: http://devicetree.org/schemas/display/mediatek/mediatek,dpi.yaml#
 $schema: http://devicetree.org/meta-schemas/core.yaml#
 
-title: mediatek DPI Controller Device Tree Bindings
+title: mediatek DPI/DP_INTF Controller Device Tree Bindings
 
 maintainers:
   - CK Hu 
@@ -13,7 +13,8 @@ maintainers:
 description: |
   The Mediatek DPI function block is a sink of the display subsystem and
   provides 8-bit RGB/YUV444 or 8/10/10-bit YUV422 pixel data on a parallel
-  output bus.
+  output bus. The Mediatek DP_INTF is a similar function block that is
+  connected to the (embedded) display port function block.
 
 properties:
   compatible:
@@ -23,6 +24,7 @@ properties:
   - mediatek,mt8173-dpi
   - mediatek,mt8183-dpi
   - mediatek,mt8192-dpi
+  - mediatek,mt8195-dpintf
 
   reg:
 maxItems: 1
@@ -37,10 +39,11 @@ properties:
   - description: DPI PLL
 
   clock-names:
-items:
-  - const: pixel
-  - const: engine
-  - const: pll
+description:
+  For dpi clocks pixel, engine and pll are required. For dpintf pixel, pll,
+  pll_d2, pll_d4, pll_d8, pll_d16, hf_fmm, hf_fdp are required.
+minItems: 3
+maxItems: 8
 
   pinctrl-0: true
   pinctrl-1: true
@@ -64,6 +67,39 @@ required:
   - clock-names
   - port
 
+allOf:
+  - if:
+  properties:
+compatible:
+  contains:
+enum:
+  - mediatek,mt8195-dpintf
+then:
+  properties:
+clocks:
+  minItems: 8
+  maxItems: 8
+clock-names:
+  items:
+- const: pixel
+- const: pll
+- const: pll_d2
+- const: pll_d4
+- const: pll_d8
+- const: pll_d16
+- const: hf_fmm
+- const: hf_fdp
+else:
+  properties:
+clocks:
+  minItems: 3
+  maxItems: 3
+clock-names:
+  items:
+- const: pixel
+- const: engine
+- const: pll
+
 additionalProperties: false
 
 examples:
-- 
2.32.0



[RFC PATCH 2/5] drm/mediatek: dpi: Add dpintf support

2021-08-16 Thread Markus Schneider-Pargmann
dpintf is the displayport interface hardware unit. This unit is similar
to dpi and can reuse most of the code.

This patch adds support for mt8195-dpintf to this dpi driver. Main
differences are:
 - Some features/functional components are not available for dpintf
   which are now excluded from code execution once is_dpintf is set
 - dpintf can and needs to choose between different clock dividers based
   on the clock speed. This is done by choosing a different clock parent.
 - There are two additional clocks that need to be managed. These are
   only set for dpintf and will be set to NULL if not supplied. The
   clk_* calls handle these as normal clocks then.
 - Some register contents differ slightly between the two components. To
   work around this I added register bits/masks with a DPINTF_ prefix
   and use them where different.

Based on a separate driver for dpintf created by Jason-JH.Lin.

Signed-off-by: Markus Schneider-Pargmann 
---
 drivers/gpu/drm/mediatek/mtk_dpi.c  | 282 
 drivers/gpu/drm/mediatek/mtk_dpi_regs.h |  12 +
 2 files changed, 247 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c b/drivers/gpu/drm/mediatek/mtk_dpi.c
index bced555648b0..4ad6d1fc6bde 100644
--- a/drivers/gpu/drm/mediatek/mtk_dpi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
@@ -63,6 +63,14 @@ enum mtk_dpi_out_color_format {
MTK_DPI_COLOR_FORMAT_YCBCR_422_FULL
 };
 
+enum mtk_dpi_tvdpll_clk {
+   MTK_DPI_TVDPLL_D2 = 0,
+   MTK_DPI_TVDPLL_D4 = 1,
+   MTK_DPI_TVDPLL_D8 = 2,
+   MTK_DPI_TVDPLL_D16 = 3,
+   MTK_DPI_TVDPLL_NUM_CLKS = 4
+};
+
 struct mtk_dpi {
struct drm_encoder encoder;
struct drm_bridge bridge;
@@ -71,8 +79,11 @@ struct mtk_dpi {
void __iomem *regs;
struct device *dev;
struct clk *engine_clk;
+   struct clk *hf_fmm_clk;
+   struct clk *hf_fdp_clk;
struct clk *pixel_clk;
struct clk *tvd_clk;
+   struct clk_bulk_data tvd_clks[MTK_DPI_TVDPLL_NUM_CLKS];
int irq;
struct drm_display_mode mode;
const struct mtk_dpi_conf *conf;
@@ -125,6 +136,7 @@ struct mtk_dpi_conf {
bool edge_sel_en;
const u32 *output_fmts;
u32 num_output_fmts;
+   bool is_dpintf;
 };
 
 static void mtk_dpi_mask(struct mtk_dpi *dpi, u32 offset, u32 val, u32 mask)
@@ -153,30 +165,52 @@ static void mtk_dpi_disable(struct mtk_dpi *dpi)
 static void mtk_dpi_config_hsync(struct mtk_dpi *dpi,
 struct mtk_dpi_sync_param *sync)
 {
-   mtk_dpi_mask(dpi, DPI_TGEN_HWIDTH,
-sync->sync_width << HPW, HPW_MASK);
-   mtk_dpi_mask(dpi, DPI_TGEN_HPORCH,
-sync->back_porch << HBP, HBP_MASK);
-   mtk_dpi_mask(dpi, DPI_TGEN_HPORCH, sync->front_porch << HFP,
-HFP_MASK);
+   if (dpi->conf->is_dpintf) {
+   mtk_dpi_mask(dpi, DPI_TGEN_HWIDTH,
+sync->sync_width << HPW, DPINTF_HPW_MASK);
+   mtk_dpi_mask(dpi, DPI_TGEN_HPORCH,
+sync->back_porch << HBP, DPINTF_HBP_MASK);
+   mtk_dpi_mask(dpi, DPI_TGEN_HPORCH, sync->front_porch << HFP,
+DPINTF_HFP_MASK);
+   } else {
+   mtk_dpi_mask(dpi, DPI_TGEN_HWIDTH,
+sync->sync_width << HPW, HPW_MASK);
+   mtk_dpi_mask(dpi, DPI_TGEN_HPORCH,
+sync->back_porch << HBP, HBP_MASK);
+   mtk_dpi_mask(dpi, DPI_TGEN_HPORCH, sync->front_porch << HFP,
+HFP_MASK);
+   }
 }
 
 static void mtk_dpi_config_vsync(struct mtk_dpi *dpi,
 struct mtk_dpi_sync_param *sync,
 u32 width_addr, u32 porch_addr)
 {
-   mtk_dpi_mask(dpi, width_addr,
-sync->sync_width << VSYNC_WIDTH_SHIFT,
-VSYNC_WIDTH_MASK);
mtk_dpi_mask(dpi, width_addr,
 sync->shift_half_line << VSYNC_HALF_LINE_SHIFT,
 VSYNC_HALF_LINE_MASK);
-   mtk_dpi_mask(dpi, porch_addr,
-sync->back_porch << VSYNC_BACK_PORCH_SHIFT,
-VSYNC_BACK_PORCH_MASK);
-   mtk_dpi_mask(dpi, porch_addr,
-sync->front_porch << VSYNC_FRONT_PORCH_SHIFT,
-VSYNC_FRONT_PORCH_MASK);
+
+   if (dpi->conf->is_dpintf) {
+   mtk_dpi_mask(dpi, width_addr,
+sync->sync_width << VSYNC_WIDTH_SHIFT,
+DPINTF_VSYNC_WIDTH_MASK);
+   mtk_dpi_mask(dpi, porch_addr,
+sync->back_porch << VSYNC_BACK_PORCH_SHIFT,
+DPINTF_VSYNC_BACK_PORCH_MASK);
+   mtk_dpi_mask(dpi, porch_addr,
+sync->front_porch << VSYNC_FRONT_PORCH_SHIFT,
+DPINTF_VSYNC_FRONT_PORCH_MASK);
+ 

[RFC PATCH 4/5] video/hdmi: Add audio_infoframe packing for DP

2021-08-16 Thread Markus Schneider-Pargmann
Like HDMI, DP uses audio infoframes, and they are structured very
similarly to the HDMI ones.

This patch adds a helper function to pack the HDMI audio infoframe for
DP, called hdmi_audio_infoframe_pack_for_dp().
hdmi_audio_infoframe_pack_only() is split into two parts. One of them
packs the payload only and can be used for HDMI and DP.

Signed-off-by: Markus Schneider-Pargmann 
---
 drivers/video/hdmi.c | 87 +++-
 include/linux/hdmi.h |  4 ++
 2 files changed, 73 insertions(+), 18 deletions(-)

diff --git a/drivers/video/hdmi.c b/drivers/video/hdmi.c
index 947be761dfa4..59c4341549e4 100644
--- a/drivers/video/hdmi.c
+++ b/drivers/video/hdmi.c
@@ -387,6 +387,28 @@ int hdmi_audio_infoframe_check(struct hdmi_audio_infoframe *frame)
 }
 EXPORT_SYMBOL(hdmi_audio_infoframe_check);
 
+static void
+hdmi_audio_infoframe_pack_payload(const struct hdmi_audio_infoframe *frame,
+ u8 *buffer)
+{
+   u8 channels;
+
+   if (frame->channels >= 2)
+   channels = frame->channels - 1;
+   else
+   channels = 0;
+
+   buffer[0] = ((frame->coding_type & 0xf) << 4) | (channels & 0x7);
+   buffer[1] = ((frame->sample_frequency & 0x7) << 2) |
+(frame->sample_size & 0x3);
+   buffer[2] = frame->coding_type_ext & 0x1f;
+   buffer[3] = frame->channel_allocation;
+   buffer[4] = (frame->level_shift_value & 0xf) << 3;
+
+   if (frame->downmix_inhibit)
+   buffer[4] |= BIT(7);
+}
+
 /**
  * hdmi_audio_infoframe_pack_only() - write HDMI audio infoframe to binary 
buffer
  * @frame: HDMI audio infoframe
@@ -404,7 +426,6 @@ EXPORT_SYMBOL(hdmi_audio_infoframe_check);
 ssize_t hdmi_audio_infoframe_pack_only(const struct hdmi_audio_infoframe 
*frame,
   void *buffer, size_t size)
 {
-   unsigned char channels;
u8 *ptr = buffer;
size_t length;
int ret;
@@ -420,28 +441,13 @@ ssize_t hdmi_audio_infoframe_pack_only(const struct hdmi_audio_infoframe *frame,
 
memset(buffer, 0, size);
 
-   if (frame->channels >= 2)
-   channels = frame->channels - 1;
-   else
-   channels = 0;
-
ptr[0] = frame->type;
ptr[1] = frame->version;
ptr[2] = frame->length;
ptr[3] = 0; /* checksum */
 
-   /* start infoframe payload */
-   ptr += HDMI_INFOFRAME_HEADER_SIZE;
-
-   ptr[0] = ((frame->coding_type & 0xf) << 4) | (channels & 0x7);
-   ptr[1] = ((frame->sample_frequency & 0x7) << 2) |
-(frame->sample_size & 0x3);
-   ptr[2] = frame->coding_type_ext & 0x1f;
-   ptr[3] = frame->channel_allocation;
-   ptr[4] = (frame->level_shift_value & 0xf) << 3;
-
-   if (frame->downmix_inhibit)
-   ptr[4] |= BIT(7);
+   hdmi_audio_infoframe_pack_payload(frame,
+ ptr + HDMI_INFOFRAME_HEADER_SIZE);
 
hdmi_infoframe_set_checksum(buffer, length);
 
@@ -479,6 +485,51 @@ ssize_t hdmi_audio_infoframe_pack(struct hdmi_audio_infoframe *frame,
 }
 EXPORT_SYMBOL(hdmi_audio_infoframe_pack);
 
+/**
+ * hdmi_audio_infoframe_pack_for_dp - Pack a HDMI Audio infoframe for
+ *displayport
+ *
+ * @frame: HDMI Audio infoframe
+ * @header: Header buffer to be used
+ * @header_size: Size of header buffer
+ * @data: Data buffer to be used
+ * @data_size: Size of data buffer
+ * @dp_version: Display Port version to be encoded in the header
+ *
+ * Packs a HDMI Audio Infoframe to be sent over Display Port. This function
+ * fills both header and data buffer with the required data.
+ *
+ * Return: Number of total written bytes or a negative errno on failure.
+ */
+ssize_t hdmi_audio_infoframe_pack_for_dp(struct hdmi_audio_infoframe *frame,
+void *header, size_t header_size,
+void *data, size_t data_size,
+u8 dp_version)
+{
+   int ret;
+   u8 *hdr_ptr = header;
+
+   ret = hdmi_audio_infoframe_check(frame);
+   if (ret)
+   return ret;
+
+   if (header_size < 4 || data_size < frame->length)
+   return -ENOSPC;
+
+   memset(header, 0, header_size);
+   memset(data, 0, data_size);
+
+   // Secondary-data packet header
+   hdr_ptr[1] = frame->type;
+   hdr_ptr[2] = 0x1B;  // As documented by DP spec for Secondary-data Packets
+   hdr_ptr[3] = (dp_version & 0x3f) << 2;
+
+   hdmi_audio_infoframe_pack_payload(frame, data);
+
+   return frame->length + 4;
+}
+EXPORT_SYMBOL(hdmi_audio_infoframe_pack_for_dp);
+
 /**
  * hdmi_vendor_infoframe_init() - initialize an HDMI vendor infoframe
  * @frame: HDMI vendor infoframe
diff --git a/include/linux/hdmi.h b/include/linux/hdmi.h
index c8ec982ff498..f576a0b08c85 100644
--- a/include/linux/hdmi.h
+++ b/include/linux/hdmi.h
@@ -334,6
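
For readers of the archive, the packing scheme this patch describes can be sketched in standalone C. This is a simplified model under stated assumptions, not the kernel code: struct audio_infoframe_model and the model_* functions are hypothetical stand-ins for struct hdmi_audio_infoframe and the functions in the diff, hdmi_audio_infoframe_check() is omitted, and the type (0x84) and length (10) are hard-coded to the values of HDMI_INFOFRAME_TYPE_AUDIO and HDMI_AUDIO_INFOFRAME_SIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AUDIO_INFOFRAME_TYPE	0x84	/* HDMI_INFOFRAME_TYPE_AUDIO */
#define AUDIO_INFOFRAME_LENGTH	10	/* HDMI_AUDIO_INFOFRAME_SIZE */
#define DP_SDP_HEADER_SIZE	4	/* header size checked by the patch */

/* Hypothetical, flattened stand-in for struct hdmi_audio_infoframe. */
struct audio_infoframe_model {
	uint8_t channels;
	uint8_t coding_type;
	uint8_t sample_size;
	uint8_t sample_frequency;
	uint8_t coding_type_ext;
	uint8_t channel_allocation;
	uint8_t level_shift_value;
	int downmix_inhibit;
};

/* Payload bytes only, shared by the HDMI and DP paths (same bit layout as
 * hdmi_audio_infoframe_pack_payload() in the diff). */
void model_pack_payload(const struct audio_infoframe_model *frame,
			uint8_t *buffer)
{
	uint8_t channels = frame->channels >= 2 ? frame->channels - 1 : 0;

	buffer[0] = ((frame->coding_type & 0xf) << 4) | (channels & 0x7);
	buffer[1] = ((frame->sample_frequency & 0x7) << 2) |
		    (frame->sample_size & 0x3);
	buffer[2] = frame->coding_type_ext & 0x1f;
	buffer[3] = frame->channel_allocation;
	buffer[4] = (frame->level_shift_value & 0xf) << 3;
	if (frame->downmix_inhibit)
		buffer[4] |= 1u << 7;
}

/* DP variant: separate header and data buffers, no HDMI checksum. */
int model_pack_for_dp(const struct audio_infoframe_model *frame,
		      uint8_t *header, size_t header_size,
		      uint8_t *data, size_t data_size, uint8_t dp_version)
{
	assert(frame && header && data);

	if (header_size < DP_SDP_HEADER_SIZE ||
	    data_size < AUDIO_INFOFRAME_LENGTH)
		return -ENOSPC;

	memset(header, 0, header_size);
	memset(data, 0, data_size);

	header[1] = AUDIO_INFOFRAME_TYPE;
	header[2] = 0x1B;	/* value used by the patch */
	header[3] = (dp_version & 0x3f) << 2;

	model_pack_payload(frame, data);

	return AUDIO_INFOFRAME_LENGTH + DP_SDP_HEADER_SIZE;
}
```

The point of the split is visible here: the payload routine knows nothing about the transport, so the HDMI path can prepend its own header and checksum while the DP path writes a secondary-data packet header into a separate buffer.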

Re: [PATCH v6 05/13] drm/amdkfd: generic type as sys mem on migration to ram

2021-08-16 Thread Sierra Guiza, Alejandro (Alex)



On 8/15/2021 10:38 AM, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 01:31:42AM -0500, Alex Sierra wrote:
>>  migrate.vma = vma;
>>  migrate.start = start;
>>  migrate.end = end;
>> -migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
>>  migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
>>
>> +if (adev->gmc.xgmi.connected_to_cpu)
>> +	migrate.flags = MIGRATE_VMA_SELECT_SYSTEM;
>> +else
>> +	migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
> It's been a while since I touched this migrate code, but doesn't this
> mean that if the range already contains system memory the migration
> now won't do anything? for the connected_to_cpu case?


In the connected_to_cpu case above, we're explicitly migrating from
device memory to system memory with the device generic type. With this
type, device PTEs are present in the CPU page table.

During the migrate_vma_collect_pmd walk in the migrate_vma_setup call,
there's a condition for present PTEs that requires migrate->flags to have
MIGRATE_VMA_SELECT_SYSTEM set. Otherwise, the migration for this entry
will be ignored.

Regards,
Alex S.
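
For readers of the archive, the selection rule described above can be modelled as a small standalone sketch. It is a simplification under stated assumptions, not the mm/migrate.c code: the enum names and values are hypothetical stand-ins for the MIGRATE_VMA_SELECT_* flags, and the pgmap owner check and the empty-PTE case are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MIGRATE_VMA_SELECT_* flags. */
enum model_select {
	MODEL_SELECT_SYSTEM		= 1 << 0,
	MODEL_SELECT_DEVICE_PRIVATE	= 1 << 1,
};

/* Simplified state of one page-table entry as seen by the collect walk. */
enum model_pte {
	/* Mapped in the CPU page table: system RAM or device generic. */
	MODEL_PTE_PRESENT,
	/* Not present: special entry standing in for a device private page. */
	MODEL_PTE_DEVICE_PRIVATE,
};

/* Returns true if the entry would be collected for migration. */
bool model_pte_is_collected(enum model_pte pte, unsigned int flags)
{
	switch (pte) {
	case MODEL_PTE_PRESENT:
		return (flags & MODEL_SELECT_SYSTEM) != 0;
	case MODEL_PTE_DEVICE_PRIVATE:
		return (flags & MODEL_SELECT_DEVICE_PRIVATE) != 0;
	}
	return false;
}

/* Usage: the connected_to_cpu case, where device pages have present PTEs. */
void model_demo(void)
{
	/* With the old flag, present entries are skipped. */
	assert(!model_pte_is_collected(MODEL_PTE_PRESENT,
				       MODEL_SELECT_DEVICE_PRIVATE));
	/* With the flag chosen by the patch, they are collected. */
	assert(model_pte_is_collected(MODEL_PTE_PRESENT,
				      MODEL_SELECT_SYSTEM));
}
```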



Re: [RFC PATCH 5/5] drm/mediatek: Add mt8195 DisplayPort driver

2021-08-16 Thread Sam Ravnborg
Hi Markus,

A few general things in the following. This is what I look for first in
a bridge driver - and I had no time today to review the driver in full.
Please address these, then cc: me on next revision where I hopefully
have more time.

Sam

> +static int mtk_dp_bridge_attach(struct drm_bridge *bridge,
> +   enum drm_bridge_attach_flags flags)
> +{
> +   struct mtk_dp *mtk_dp = mtk_dp_from_bridge(bridge);
> +   int ret;
> +
> +   mtk_dp_poweron(mtk_dp);
> +
> +   if (mtk_dp->next_bridge) {
> +   ret = drm_bridge_attach(bridge->encoder, mtk_dp->next_bridge,
> +   &mtk_dp->bridge, flags);
> +   if (ret) {
> +   drm_warn(mtk_dp->drm_dev,
> +"Failed to attach external bridge: %d\n", 
> ret);
> +   return ret;
> +   }
> +   }
> +
> +   if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR) {
> +   drm_err(mtk_dp->drm_dev,
> +   "Fix bridge driver to make connector optional!");
> +   return 0;
> +   }

This driver is only used by mediatek, and I thought all of mediatek is
converted so the display driver creates the connector.

It would be better to migrate mediatek over to always let the display
driver create the connector and drop the connector support in this
driver.


> + struct drm_bridge_funcs mtk_dp_bridge_funcs = {
> + .attach = mtk_dp_bridge_attach,
> + .mode_fixup = mtk_dp_bridge_mode_fixup,
> + .disable = mtk_dp_bridge_disable,
> + .post_disable = mtk_dp_bridge_post_disable,
> + .mode_set = mtk_dp_bridge_mode_set,
> + .pre_enable = mtk_dp_bridge_pre_enable,
> + .enable = mtk_dp_bridge_enable,
> + .get_edid = mtk_get_edid,
> + .detect = mtk_dp_bdg_detect,
> +};


For new drivers please avoid the recently deprecated functions.

- Use the atomic versions of pre_enable, enable, disable and post_disable.

- Merge mode_set with atomic_enable, as there is no need for the mode_set
  operation.

- Use atomic_check in favour of mode_fixup, albeit the rules for
  atomic_check are at best vague at the moment.
 



Re: [PATCH v6 05/13] drm/amdkfd: generic type as sys mem on migration to ram

2021-08-16 Thread Zeng, Oak


Regards,
Oak 

 

On 2021-08-16, 3:53 PM, "amd-gfx on behalf of Sierra Guiza, Alejandro (Alex)" wrote:

> On 8/15/2021 10:38 AM, Christoph Hellwig wrote:
>> On Fri, Aug 13, 2021 at 01:31:42AM -0500, Alex Sierra wrote:
>>>  migrate.vma = vma;
>>>  migrate.start = start;
>>>  migrate.end = end;
>>> -migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
>>>  migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
>>>
>>> +if (adev->gmc.xgmi.connected_to_cpu)
>>> +	migrate.flags = MIGRATE_VMA_SELECT_SYSTEM;
>>> +else
>>> +	migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
>> It's been a while since I touched this migrate code, but doesn't this
>> mean that if the range already contains system memory the migration
>> now won't do anything? for the connected_to_cpu case?
>
> For above’s condition equal to connected_to_cpu , we’re explicitly
> migrating from device memory to system memory with device generic type.

For the MEMORY_DEVICE_GENERIC memory type, why do we need to explicitly
migrate it from device memory to normal system memory? I thought the design
was that, for this type of memory, the CPU can access it in place without
migration (just like the CPU accesses normal system memory), so there is no
need to migrate this type of memory to normal system memory...

With this patch, the migration behavior will be: when memory is accessed by
the CPU, it will be migrated to normal system memory; when memory is
accessed by the GPU, it will be migrated to device vram. This is basically
the same behavior as when vram is treated as DEVICE_PRIVATE.

I thought the whole goal of introducing DEVICE_GENERIC was to avoid such
back-and-forth migration between device memory and normal system memory.
But maybe I am missing something here.

Regards,
Oak

> In this type, device PTEs are present in CPU page table.
>
> During migrate_vma_collect_pmd walk op at migrate_vma_setup call,
> there’s a condition for present pte that require migrate->flags be set
> for MIGRATE_VMA_SELECT_SYSTEM.
> Otherwise, the migration for this entry will be ignored.
>
> Regards,
> Alex S.



