Re: [PATCH v2 2/2] drm/panel: Add support for E Ink VB3300-KCA

2021-07-05 Thread Alistair Francis
On Fri, Jun 25, 2021 at 6:52 AM Rob Herring  wrote:
>
> On Tue, Jun 15, 2021 at 08:33:12PM +1000, Alistair Francis wrote:
> > Add support for the 10.3" E Ink panel described at:
> > https://www.eink.com/product.html?type=productdetail&id=7
> >
> > Signed-off-by: Alistair Francis 
> > ---
> > v2:
> >  - Fix build warning
> >  - Document new string
> >
> >  .../bindings/display/panel/panel-simple.yaml  |  2 ++
> >  drivers/gpu/drm/panel/panel-simple.c  | 29 +++
> >  2 files changed, 31 insertions(+)
> >
> > diff --git 
> > a/Documentation/devicetree/bindings/display/panel/panel-simple.yaml 
> > b/Documentation/devicetree/bindings/display/panel/panel-simple.yaml
> > index b3797ba2698b..799e20222551 100644
> > --- a/Documentation/devicetree/bindings/display/panel/panel-simple.yaml
> > +++ b/Documentation/devicetree/bindings/display/panel/panel-simple.yaml
> > @@ -128,6 +128,8 @@ properties:
> >  # Emerging Display Technology Corp. WVGA TFT Display with 
> > capacitive touch
> >- edt,etm0700g0dh6
> >- edt,etm0700g0edh6
> > +# E Ink VB3300-KCA
> > +  - eink,vb3300-kca
>
> Combining this with patch 1 would be preferrable. Either way,
>
> Acked-by: Rob Herring 

This should be ready to merge.

Let me know if you want me to squash the patches and send a v3.

Alistair


Re: [Freedreno] [RFC 3/6] drm/msm/mdp5: move mdp5_encoder_set_intf_mode after msm_dsi_modeset_init

2021-07-05 Thread Alexey Minnekhanov

On 6/10/21 12:17 AM, Dmitry Baryshkov wrote:
> Move a call to mdp5_encoder_set_intf_mode() after
> msm_dsi_modeset_init(), removing set_encoder_mode callback.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>   drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 11 +++
>   1 file changed, 3 insertions(+), 8 deletions(-)
>

My Samsung Galaxy S5 with mdp5 and cmd mode panel seems to work same as 
before with these patches applied.


Tested-by: Alexey Minnekhanov 



Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-07-05 Thread Claire Chang
On Sat, Jul 3, 2021 at 1:55 PM Nathan Chancellor  wrote:
>
> Hi Will and Robin,
>
> On Fri, Jul 02, 2021 at 04:13:50PM +0100, Robin Murphy wrote:
> > On 2021-07-02 14:58, Will Deacon wrote:
> > > Hi Nathan,
> > >
> > > On Thu, Jul 01, 2021 at 12:52:20AM -0700, Nathan Chancellor wrote:
> > > > On 7/1/2021 12:40 AM, Will Deacon wrote:
> > > > > On Wed, Jun 30, 2021 at 08:56:51AM -0700, Nathan Chancellor wrote:
> > > > > > On Wed, Jun 30, 2021 at 12:43:48PM +0100, Will Deacon wrote:
> > > > > > > On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote:
> > > > > > > > `BUG: unable to handle page fault for address: 
> > > > > > > > 003a8290` and
> > > > > > > > the fact it crashed at `_raw_spin_lock_irqsave` look like the 
> > > > > > > > memory
> > > > > > > > (maybe dev->dma_io_tlb_mem) was corrupted?
> > > > > > > > The dev->dma_io_tlb_mem should be set here
> > > > > > > > (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528)
> > > > > > > > through device_initialize.
> > > > > > >
> > > > > > > I'm less sure about this. 'dma_io_tlb_mem' should be pointing at
> > > > > > > 'io_tlb_default_mem', which is a page-aligned allocation from 
> > > > > > > memblock.
> > > > > > > The spinlock is at offset 0x24 in that structure, and looking at 
> > > > > > > the
> > > > > > > register dump from the crash:
> > > > > > >
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 
> > > > > > > EFLAGS: 00010006
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX: 
> > > > > > >  RCX: 8900572ad580
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: 
> > > > > > > 000c RDI: 1d17
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: 
> > > > > > > 000c R09: 
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: 
> > > > > > > 89005653f000 R12: 0212
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: 
> > > > > > > 0002 R15: 0020
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: FS:  7f1f8898ea40() 
> > > > > > > GS:89005728() knlGS:
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: CS:  0010 DS:  ES:  CR0: 
> > > > > > > 80050033
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: 
> > > > > > > 0001020d CR4: 00350ee0
> > > > > > > Jun 29 18:28:42 hp-4300G kernel: Call Trace:
> > > > > > > Jun 29 18:28:42 hp-4300G kernel:  _raw_spin_lock_irqsave+0x39/0x50
> > > > > > > Jun 29 18:28:42 hp-4300G kernel:  
> > > > > > > swiotlb_tbl_map_single+0x12b/0x4c0
> > > > > > >
> > > > > > > Then that correlates with R11 holding the 'dma_io_tlb_mem' 
> > > > > > > pointer and
> > > > > > > RDX pointing at the spinlock. Yet RAX is holding junk :/
> > > > > > >
> > > > > > > I agree that enabling KASAN would be a good idea, but I also 
> > > > > > > think we
> > > > > > > probably need to get some more information out of 
> > > > > > > swiotlb_tbl_map_single()
> > > > > > > to see see what exactly is going wrong in there.
> > > > > >
> > > > > > I can certainly enable KASAN and if there is any debug print I can 
> > > > > > add
> > > > > > or dump anything, let me know!
> > > > >
> > > > > I bit the bullet and took v5.13 with swiotlb/for-linus-5.14 merged 
> > > > > in, built
> > > > > x86 defconfig and ran it on my laptop. However, it seems to work fine!
> > > > >
> > > > > Please can you share your .config?
> > > >
> > > > Sure thing, it is attached. It is just Arch Linux's config run through
> > > > olddefconfig. The original is below in case you need to diff it.
> > > >
> > > > https://raw.githubusercontent.com/archlinux/svntogit-packages/9045405dc835527164f3034b3ceb9a67c7a53cd4/trunk/config
> > > >
> > > > If there is anything more that I can provide, please let me know.
> > >
> > > I eventually got this booting (for some reason it was causing LD to SEGV
> > > trying to link it for a while...) and sadly it works fine on my laptop. 
> > > Hmm.
>
> Seems like it might be something specific to the amdgpu module?
>
> > > Did you manage to try again with KASAN?
>
> Yes, it took a few times to reproduce the issue but I did manage to get
> a dmesg, please find it attached. I build from commit 7d31f1c65cc9 ("swiotlb:
> fix implicit debugfs declarations") in Konrad's tree.

Looking at the logs, the use-after-free bug looked somehow relevant
(and it's nvme again. Qian's crash is about nvme too):

[2.468288] BUG: KASAN: use-after-free in __iommu_dma_unmap_swiotlb+0x64/0xb0
[2.468288] Read of size 8 at addr 8881d783 by task swapper/0/0

[2.468288] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc3-debug #1
[2.468288] Hardware name: HP HP Desktop M01-F1xxx/87D6, BIOS F.12 12/17/2020
[2.468288] Call Trace:
[2.468288]  
[2.479433]  dump_s

Re: [PATCH libdrm] headers: drm: Sync with drm-next

2021-07-05 Thread Michel Dänzer
On 2021-07-03 9:59 p.m., Mario Kleiner wrote:
> Generated using make headers_install from the drm-next
> tree - git://anongit.freedesktop.org/drm/drm
> branch - drm-next
> commit - 8a02ea42bc1d4c448caf1bab0e05899dad503f74
> 
> The changes were as follows (shortlog from
> b10733527bfd864605c33ab2e9a886eec317ec39..HEAD):

libdrm uses GitLab merge requests now: 
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


[PATCH 1/4] drm/gem: Export implementation of shadow-plane helpers

2021-07-05 Thread Thomas Zimmermann
Export the implementation of duplicate, destroy and reset helpers for
shadow-buffered plane state. Useful for drivers that subclass struct
drm_shadow_plane_state.

The exported functions are wrappers around plane-state implementation,
but using them is the correct thing to do for drivers.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_gem_atomic_helper.c | 55 +++--
 include/drm/drm_gem_atomic_helper.h |  6 +++
 2 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c 
b/drivers/gpu/drm/drm_gem_atomic_helper.c
index bc9396f2a0ed..26af09b959d4 100644
--- a/drivers/gpu/drm/drm_gem_atomic_helper.c
+++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
@@ -182,6 +182,27 @@ EXPORT_SYMBOL(drm_gem_simple_display_pipe_prepare_fb);
  * Shadow-buffered Planes
  */
 
+/**
+ * __drm_gem_duplicate_shadow_plane_state - duplicates shadow-buffered plane 
state
+ * @plane: the plane
+ * @new_shadow_plane_state: the new shadow-buffered plane state
+ *
+ * This function duplicates shadow-buffered plane state. This is helpful for 
drivers
+ * that subclass struct drm_shadow_plane_state.
+ *
+ * The function does not duplicate existing mappings of the shadow buffers.
+ * Mappings are maintained during the atomic commit by the plane's prepare_fb
+ * and cleanup_fb helpers. See drm_gem_prepare_shadow_fb() and 
drm_gem_cleanup_shadow_fb()
+ * for corresponding helpers.
+ */
+void
+__drm_gem_duplicate_shadow_plane_state(struct drm_plane *plane,
+  struct drm_shadow_plane_state 
*new_shadow_plane_state)
+{
+   __drm_atomic_helper_plane_duplicate_state(plane, 
&new_shadow_plane_state->base);
+}
+EXPORT_SYMBOL(__drm_gem_duplicate_shadow_plane_state);
+
 /**
  * drm_gem_duplicate_shadow_plane_state - duplicates shadow-buffered plane 
state
  * @plane: the plane
@@ -211,12 +232,25 @@ drm_gem_duplicate_shadow_plane_state(struct drm_plane 
*plane)
new_shadow_plane_state = kzalloc(sizeof(*new_shadow_plane_state), 
GFP_KERNEL);
if (!new_shadow_plane_state)
return NULL;
-   __drm_atomic_helper_plane_duplicate_state(plane, 
&new_shadow_plane_state->base);
+   __drm_gem_duplicate_shadow_plane_state(plane, new_shadow_plane_state);
 
return &new_shadow_plane_state->base;
 }
 EXPORT_SYMBOL(drm_gem_duplicate_shadow_plane_state);
 
+/**
+ * __drm_gem_destroy_shadow_plane_state - cleans up shadow-buffered plane state
+ * @shadow_plane_state: the shadow-buffered plane state
+ *
+ * This function cleans up shadow-buffered plane state. Helpful for drivers 
that
+ * subclass struct drm_shadow_plane_state.
+ */
+void __drm_gem_destroy_shadow_plane_state(struct drm_shadow_plane_state 
*shadow_plane_state)
+{
+   __drm_atomic_helper_plane_destroy_state(&shadow_plane_state->base);
+}
+EXPORT_SYMBOL(__drm_gem_destroy_shadow_plane_state);
+
 /**
  * drm_gem_destroy_shadow_plane_state - deletes shadow-buffered plane state
  * @plane: the plane
@@ -232,11 +266,26 @@ void drm_gem_destroy_shadow_plane_state(struct drm_plane 
*plane,
struct drm_shadow_plane_state *shadow_plane_state =
to_drm_shadow_plane_state(plane_state);
 
-   __drm_atomic_helper_plane_destroy_state(&shadow_plane_state->base);
+   __drm_gem_destroy_shadow_plane_state(shadow_plane_state);
kfree(shadow_plane_state);
 }
 EXPORT_SYMBOL(drm_gem_destroy_shadow_plane_state);
 
+/**
+ * __drm_gem_reset_shadow_plane - resets a shadow-buffered plane
+ * @plane: the plane
+ * @shadow_plane_state: the shadow-buffered plane state
+ *
+ * This function resets state for shadow-buffered planes. Helpful
+ * for drivers that subclass struct drm_shadow_plane_state.
+ */
+void __drm_gem_reset_shadow_plane(struct drm_plane *plane,
+ struct drm_shadow_plane_state 
*shadow_plane_state)
+{
+   __drm_atomic_helper_plane_reset(plane, &shadow_plane_state->base);
+}
+EXPORT_SYMBOL(__drm_gem_reset_shadow_plane);
+
 /**
  * drm_gem_reset_shadow_plane - resets a shadow-buffered plane
  * @plane: the plane
@@ -258,7 +307,7 @@ void drm_gem_reset_shadow_plane(struct drm_plane *plane)
shadow_plane_state = kzalloc(sizeof(*shadow_plane_state), GFP_KERNEL);
if (!shadow_plane_state)
return;
-   __drm_atomic_helper_plane_reset(plane, &shadow_plane_state->base);
+   __drm_gem_reset_shadow_plane(plane, shadow_plane_state);
 }
 EXPORT_SYMBOL(drm_gem_reset_shadow_plane);
 
diff --git a/include/drm/drm_gem_atomic_helper.h 
b/include/drm/drm_gem_atomic_helper.h
index cfc5adee3d13..d82c23622156 100644
--- a/include/drm/drm_gem_atomic_helper.h
+++ b/include/drm/drm_gem_atomic_helper.h
@@ -53,6 +53,12 @@ to_drm_shadow_plane_state(struct drm_plane_state *state)
return container_of(state, struct drm_shadow_plane_state, base);
 }
 
+void __drm_gem_duplicate_shadow_plane_state(struct drm_plane *plane,
+   struct drm_shado

[PATCH 2/4] drm/vkms: Inherit plane state from struct drm_shadow_plane_state

2021-07-05 Thread Thomas Zimmermann
Subclass struct drm_shadow_plane_state for VKMS planes and update
all plane-state callbacks accordingly.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/vkms/vkms_composer.c |  2 +-
 drivers/gpu/drm/vkms/vkms_drv.h  |  5 +++--
 drivers/gpu/drm/vkms/vkms_plane.c| 16 
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
b/drivers/gpu/drm/vkms/vkms_composer.c
index e49523866e1d..3ade0df173d2 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -251,7 +251,7 @@ void vkms_composer_worker(struct work_struct *work)
 
if (crtc_state->num_active_planes >= 1) {
act_plane = crtc_state->active_planes[0];
-   if (act_plane->base.plane->type == DRM_PLANE_TYPE_PRIMARY)
+   if (act_plane->base.base.plane->type == DRM_PLANE_TYPE_PRIMARY)
primary_composer = act_plane->composer;
}
 
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index ac8c9c2fa4ed..7a76dccd7316 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -7,6 +7,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -33,7 +34,7 @@ struct vkms_composer {
  * @composer: data required for composing computation
  */
 struct vkms_plane_state {
-   struct drm_plane_state base;
+   struct drm_shadow_plane_state base;
struct vkms_composer *composer;
 };
 
@@ -111,7 +112,7 @@ struct vkms_device {
container_of(target, struct vkms_crtc_state, base)
 
 #define to_vkms_plane_state(target)\
-   container_of(target, struct vkms_plane_state, base)
+   container_of(target, struct vkms_plane_state, base.base)
 
 /* CRTC */
 int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c 
b/drivers/gpu/drm/vkms/vkms_plane.c
index 107521ace597..6ee4fa71bd87 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -40,17 +40,16 @@ vkms_plane_duplicate_state(struct drm_plane *plane)
 
vkms_state->composer = composer;
 
-   __drm_atomic_helper_plane_duplicate_state(plane,
- &vkms_state->base);
+   __drm_gem_duplicate_shadow_plane_state(plane, &vkms_state->base);
 
-   return &vkms_state->base;
+   return &vkms_state->base.base;
 }
 
 static void vkms_plane_destroy_state(struct drm_plane *plane,
 struct drm_plane_state *old_state)
 {
struct vkms_plane_state *vkms_state = to_vkms_plane_state(old_state);
-   struct drm_crtc *crtc = vkms_state->base.crtc;
+   struct drm_crtc *crtc = vkms_state->base.base.crtc;
 
if (crtc) {
/* dropping the reference we acquired in
@@ -63,7 +62,7 @@ static void vkms_plane_destroy_state(struct drm_plane *plane,
kfree(vkms_state->composer);
vkms_state->composer = NULL;
 
-   __drm_atomic_helper_plane_destroy_state(old_state);
+   __drm_gem_destroy_shadow_plane_state(&vkms_state->base);
kfree(vkms_state);
 }
 
@@ -71,8 +70,10 @@ static void vkms_plane_reset(struct drm_plane *plane)
 {
struct vkms_plane_state *vkms_state;
 
-   if (plane->state)
+   if (plane->state) {
vkms_plane_destroy_state(plane, plane->state);
+   plane->state = NULL; /* must be set to NULL here */
+   }
 
vkms_state = kzalloc(sizeof(*vkms_state), GFP_KERNEL);
if (!vkms_state) {
@@ -80,8 +81,7 @@ static void vkms_plane_reset(struct drm_plane *plane)
return;
}
 
-   plane->state = &vkms_state->base;
-   plane->state->plane = plane;
+   __drm_gem_reset_shadow_plane(plane, &vkms_state->base);
 }
 
 static const struct drm_plane_funcs vkms_plane_funcs = {
-- 
2.32.0



[PATCH 4/4] drm/vkms: Use dma-buf mapping from shadow-plane state for composing

2021-07-05 Thread Thomas Zimmermann
Store the shadow-buffer mapping's address in struct vkms_composer and
use the value when composing the output. It's the same value as stored
in the GEM SHMEM BO, but frees the composer code from its dependency
on GEM SHMEM.

Using struct dma_buf_map is how framebuffer access is supposed to be.
The long-term plan is to perform all framebuffer access via struct
dma_buf_map and avoid the details of accessing I/O and system memory.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/vkms/vkms_composer.c | 24 +++-
 drivers/gpu/drm/vkms/vkms_drv.h  |  1 +
 drivers/gpu/drm/vkms/vkms_plane.c|  3 +++
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
b/drivers/gpu/drm/vkms/vkms_composer.c
index 3ade0df173d2..ead8fff81f30 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -6,7 +6,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #include "vkms_drv.h"
@@ -154,24 +153,21 @@ static void compose_plane(struct vkms_composer 
*primary_composer,
  struct vkms_composer *plane_composer,
  void *vaddr_out)
 {
-   struct drm_gem_object *plane_obj;
-   struct drm_gem_shmem_object *plane_shmem_obj;
struct drm_framebuffer *fb = &plane_composer->fb;
+   void *vaddr;
void (*pixel_blend)(const u8 *p_src, u8 *p_dst);
 
-   plane_obj = drm_gem_fb_get_obj(&plane_composer->fb, 0);
-   plane_shmem_obj = to_drm_gem_shmem_obj(plane_obj);
-
-   if (WARN_ON(!plane_shmem_obj->vaddr))
+   if (WARN_ON(dma_buf_map_is_null(&primary_composer->map[0])))
return;
 
+   vaddr = plane_composer->map[0].vaddr;
+
if (fb->format->format == DRM_FORMAT_ARGB)
pixel_blend = &alpha_blend;
else
pixel_blend = &x_blend;
 
-   blend(vaddr_out, plane_shmem_obj->vaddr, primary_composer,
- plane_composer, pixel_blend);
+   blend(vaddr_out, vaddr, primary_composer, plane_composer, pixel_blend);
 }
 
 static int compose_active_planes(void **vaddr_out,
@@ -180,21 +176,23 @@ static int compose_active_planes(void **vaddr_out,
 {
struct drm_framebuffer *fb = &primary_composer->fb;
struct drm_gem_object *gem_obj = drm_gem_fb_get_obj(fb, 0);
-   struct drm_gem_shmem_object *shmem_obj = to_drm_gem_shmem_obj(gem_obj);
+   const void *vaddr;
int i;
 
if (!*vaddr_out) {
-   *vaddr_out = kzalloc(shmem_obj->base.size, GFP_KERNEL);
+   *vaddr_out = kzalloc(gem_obj->size, GFP_KERNEL);
if (!*vaddr_out) {
DRM_ERROR("Cannot allocate memory for output frame.");
return -ENOMEM;
}
}
 
-   if (WARN_ON(!shmem_obj->vaddr))
+   if (WARN_ON(dma_buf_map_is_null(&primary_composer->map[0])))
return -EINVAL;
 
-   memcpy(*vaddr_out, shmem_obj->vaddr, shmem_obj->base.size);
+   vaddr = primary_composer->map[0].vaddr;
+
+   memcpy(*vaddr_out, vaddr, gem_obj->size);
 
/* If there are other planes besides primary, we consider the active
 * planes should be in z-order and compose them associatively:
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 7a76dccd7316..8c731b6dcba7 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -23,6 +23,7 @@
 struct vkms_composer {
struct drm_framebuffer fb;
struct drm_rect src, dst;
+   struct dma_buf_map map[4];
unsigned int offset;
unsigned int pitch;
unsigned int cpp;
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c 
b/drivers/gpu/drm/vkms/vkms_plane.c
index 8fbbd148163d..8a56fbf572b0 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -97,6 +97,7 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
struct drm_plane_state *new_state = 
drm_atomic_get_new_plane_state(state,
   
plane);
struct vkms_plane_state *vkms_plane_state;
+   struct drm_shadow_plane_state *shadow_plane_state;
struct drm_framebuffer *fb = new_state->fb;
struct vkms_composer *composer;
 
@@ -104,11 +105,13 @@ static void vkms_plane_atomic_update(struct drm_plane 
*plane,
return;
 
vkms_plane_state = to_vkms_plane_state(new_state);
+   shadow_plane_state = &vkms_plane_state->base;
 
composer = vkms_plane_state->composer;
memcpy(&composer->src, &new_state->src, sizeof(struct drm_rect));
memcpy(&composer->dst, &new_state->dst, sizeof(struct drm_rect));
memcpy(&composer->fb, fb, sizeof(struct drm_framebuffer));
+   memcpy(&composer->map, &shadow_plane_state->map, sizeof(composer->map));
drm_framebuffer_get(&composer->fb);
composer->offset 

[PATCH 3/4] drm/vkms: Let shadow-plane helpers prepare the plane's FB

2021-07-05 Thread Thomas Zimmermann
Replace vkms' prepare_fb and cleanup_fb functions with the generic
code for shadow-buffered planes. No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/vkms/vkms_plane.c | 38 +--
 1 file changed, 1 insertion(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_plane.c 
b/drivers/gpu/drm/vkms/vkms_plane.c
index 6ee4fa71bd87..8fbbd148163d 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -8,7 +8,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "vkms_drv.h"
 
@@ -150,45 +149,10 @@ static int vkms_plane_atomic_check(struct drm_plane 
*plane,
return 0;
 }
 
-static int vkms_prepare_fb(struct drm_plane *plane,
-  struct drm_plane_state *state)
-{
-   struct drm_gem_object *gem_obj;
-   struct dma_buf_map map;
-   int ret;
-
-   if (!state->fb)
-   return 0;
-
-   gem_obj = drm_gem_fb_get_obj(state->fb, 0);
-   ret = drm_gem_shmem_vmap(gem_obj, &map);
-   if (ret)
-   DRM_ERROR("vmap failed: %d\n", ret);
-
-   return drm_gem_plane_helper_prepare_fb(plane, state);
-}
-
-static void vkms_cleanup_fb(struct drm_plane *plane,
-   struct drm_plane_state *old_state)
-{
-   struct drm_gem_object *gem_obj;
-   struct drm_gem_shmem_object *shmem_obj;
-   struct dma_buf_map map;
-
-   if (!old_state->fb)
-   return;
-
-   gem_obj = drm_gem_fb_get_obj(old_state->fb, 0);
-   shmem_obj = to_drm_gem_shmem_obj(drm_gem_fb_get_obj(old_state->fb, 0));
-   dma_buf_map_set_vaddr(&map, shmem_obj->vaddr);
-   drm_gem_shmem_vunmap(gem_obj, &map);
-}
-
 static const struct drm_plane_helper_funcs vkms_primary_helper_funcs = {
.atomic_update  = vkms_plane_atomic_update,
.atomic_check   = vkms_plane_atomic_check,
-   .prepare_fb = vkms_prepare_fb,
-   .cleanup_fb = vkms_cleanup_fb,
+   DRM_GEM_SHADOW_PLANE_HELPER_FUNCS,
 };
 
 struct vkms_plane *vkms_plane_init(struct vkms_device *vkmsdev,
-- 
2.32.0



[PATCH 0/4] vkms: Switch to shadow-buffered plane state

2021-07-05 Thread Thomas Zimmermann
Vkms copies each plane's framebuffer into the output buffer; essentially
using a shadow buffer. DRM provides struct drm_shadow_plane_state, which
handles the details of mapping/unmapping shadow buffers into memory for
active planes.

Convert vkms to the helpers. Makes vkms use shared code and gives more
test exposure to shadow-plane helpers.

Thomas Zimmermann (4):
  drm/gem: Export implementation of shadow-plane helpers
  drm/vkms: Inherit plane state from struct drm_shadow_plane_state
  drm/vkms: Let shadow-plane helpers prepare the plane's FB
  drm/vkms: Use dma-buf mapping from shadow-plane state for composing

 drivers/gpu/drm/drm_gem_atomic_helper.c | 55 ++--
 drivers/gpu/drm/vkms/vkms_composer.c| 26 ++-
 drivers/gpu/drm/vkms/vkms_drv.h |  6 ++-
 drivers/gpu/drm/vkms/vkms_plane.c   | 57 ++---
 include/drm/drm_gem_atomic_helper.h |  6 +++
 5 files changed, 86 insertions(+), 64 deletions(-)


base-commit: 3d3b5479895dd6dd133571ded4318adf595708ba
--
2.32.0



Re: drm: screen stuck since commit f611b1e7624c: drm: Avoid circular dependencies for CONFIG_FB

2021-07-05 Thread Jani Nikula
On Sat, 03 Jul 2021, Corentin Labbe  wrote:
> Hello
>
> On next-20210701, my screen is stuck (see attached photo).
> I bisect the problem to:
> git bisect start
> # good: [62fb9874f5da54fdb243003b386128037319b219] Linux 5.13
> git bisect good 62fb9874f5da54fdb243003b386128037319b219
> # bad: [fb0ca446157a86b75502c1636b0d81e642fe6bf1] Add linux-next specific 
> files for 20210701
> git bisect bad fb0ca446157a86b75502c1636b0d81e642fe6bf1
> # good: [f63c4fda987a19b1194cc45cb72fd5bf968d9d90] Merge remote-tracking 
> branch 'rdma/for-next'
> git bisect good f63c4fda987a19b1194cc45cb72fd5bf968d9d90
> # bad: [49c8769be0b910d4134eba07cae5d9c71b861c4a] Merge remote-tracking 
> branch 'drm/drm-next'
> git bisect bad 49c8769be0b910d4134eba07cae5d9c71b861c4a
> # good: [4e3db44a242a4e2afe33b59793898ecbb61d478e] Merge tag 
> 'wireless-drivers-next-2021-06-25' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
> git bisect good 4e3db44a242a4e2afe33b59793898ecbb61d478e
> # good: [5745d647d5563d3e9d32013ad4e5c629acff04d7] Merge tag 
> 'amd-drm-next-5.14-2021-06-02' of https://gitlab.freedesktop.org/agd5f/linux 
> into drm-next
> git bisect good 5745d647d5563d3e9d32013ad4e5c629acff04d7
> # bad: [8fe44c080a53ac0ccbe88053a2e40f9acca33091] drm/amdgpu/display: fold 
> DRM_AMD_DC_DCN3_1 into DRM_AMD_DC_DCN
> git bisect bad 8fe44c080a53ac0ccbe88053a2e40f9acca33091
> # good: [2c1b1ac7084edf477309d27c02d9da7f79b33cec] drm/amdgpu/vcn: drop 
> gfxoff control for VCN2+
> git bisect good 2c1b1ac7084edf477309d27c02d9da7f79b33cec
> # good: [2c1b1ac7084edf477309d27c02d9da7f79b33cec] drm/amdgpu/vcn: drop 
> gfxoff control for VCN2+
> git bisect good 2c1b1ac7084edf477309d27c02d9da7f79b33cec
> # bad: [d4c9b03ff6a9914b55e4e23fcac11339a2706cc6] drm/amd/pm: Add renoir 
> throttler translation
> git bisect bad d4c9b03ff6a9914b55e4e23fcac11339a2706cc6
> # bad: [691cf8cd7a531dbfcc29d09a23c509a86fd9b24f] drm/amdgpu: use correct 
> rounding macro for 64-bit
> git bisect bad 691cf8cd7a531dbfcc29d09a23c509a86fd9b24f
> # bad: [2fdcb55dfc86835e4845e3f422180b5596d23cb4] drm/amdkfd: use resource 
> cursor in svm_migrate_copy_to_vram v2
> git bisect bad 2fdcb55dfc86835e4845e3f422180b5596d23cb4
> # bad: [6c3f953381e526a1623d4575660afae8b19ffa20] drm/sti/sti_hqvdp: Fix 
> incorrectly named function 'sti_hqvdp_vtg_cb()'
> git bisect bad 6c3f953381e526a1623d4575660afae8b19ffa20
> # bad: [5ea4dba68305d9648b9dba30036cc36d4e877bca] drm/msm/a6xx: add 
> CONFIG_QCOM_LLCC dependency
> git bisect bad 5ea4dba68305d9648b9dba30036cc36d4e877bca
> # good: [4a888ba03fd97d1cb0253581973533965bf348c4] drm/vgem/vgem_drv: 
> Standard comment blocks should not use kernel-doc format
> git bisect good 4a888ba03fd97d1cb0253581973533965bf348c4
> # good: [c5ef15ae09637fb51ae43e1d1d98329d67dd4fd6] video: fbdev: atyfb: 
> mach64_cursor.c: deleted the repeated word
> git bisect good c5ef15ae09637fb51ae43e1d1d98329d67dd4fd6
> # bad: [f611b1e7624ccdbd495c19e9805629e22265aa16] drm: Avoid circular 
> dependencies for CONFIG_FB
> git bisect bad f611b1e7624ccdbd495c19e9805629e22265aa16
> # good: [ff323d6d72e1e4971c8ba9e2f3cf8afc48f22383] video: fbdev: mb862xx: use 
> DEVICE_ATTR_RO macro
> git bisect good ff323d6d72e1e4971c8ba9e2f3cf8afc48f22383
> # first bad commit: [f611b1e7624ccdbd495c19e9805629e22265aa16] drm: Avoid 
> circular dependencies for CONFIG_FB
>
> reverting ff323d6d72e1e4971c8ba9e2f3cf8afc48f22383 lead to a correct boot (I 
> got a viewable login prompt).

I presume that's a copy-paste error, and you mean
f611b1e7624ccdbd495c19e9805629e22265aa16, the first bad commit, i.e.

commit f611b1e7624ccdbd495c19e9805629e22265aa16
Author: Kees Cook 
Date:   Wed Jun 2 14:52:50 2021 -0700

drm: Avoid circular dependencies for CONFIG_FB

> So ff323d6d72e1 cause this config change when compiling my defconfig:
> --- config.fbok   2021-07-03 21:31:08.527260693 +0200
> +++ config.fbko   2021-07-03 21:39:51.604275703 +0200
> -CONFIG_VT_HW_CONSOLE_BINDING=y
> +# CONFIG_VT_HW_CONSOLE_BINDING is not set
> -CONFIG_DRM_FBDEV_EMULATION=y
> -CONFIG_DRM_FBDEV_OVERALLOC=100
> -CONFIG_FB_NOTIFY=y
> -CONFIG_FB=y
> -# CONFIG_FIRMWARE_EDID is not set
> -CONFIG_FB_CFB_FILLRECT=y
> -CONFIG_FB_CFB_COPYAREA=y
> -CONFIG_FB_CFB_IMAGEBLIT=y
> -CONFIG_FB_SYS_FILLRECT=y
> -CONFIG_FB_SYS_COPYAREA=y
> -CONFIG_FB_SYS_IMAGEBLIT=y
> -# CONFIG_FB_FOREIGN_ENDIAN is not set
> -CONFIG_FB_SYS_FOPS=y
> -CONFIG_FB_DEFERRED_IO=y
> -CONFIG_FB_MODE_HELPERS=y
> -CONFIG_FB_TILEBLITTING=y
> -
> -#
> -# Frame buffer hardware drivers
> -#
> -# CONFIG_FB_CIRRUS is not set
> -# CONFIG_FB_PM2 is not set
> -# CONFIG_FB_CYBER2000 is not set
> -# CONFIG_FB_ARC is not set
> -# CONFIG_FB_ASILIANT is not set
> -# CONFIG_FB_IMSTT is not set
> -# CONFIG_FB_VGA16 is not set
> -# CONFIG_FB_UVESA is not set
> -# CONFIG_FB_VESA is not set
> -CONFIG_FB_EFI=y
> -# CONFIG_FB_N411 is not set
> -# CONFIG_FB_HGA is not set
> -# CONFIG_FB_OPENCORES is not set
> -# CONFIG_FB_S1D13XXX is not set
> -# CONFIG_FB_NVIDIA is

Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-05 Thread Steven Price
On 02/07/2021 19:11, Boris Brezillon wrote:
> On Fri, 2 Jul 2021 12:49:55 -0400
> Alyssa Rosenzweig  wrote:
> 
 ```  
>  #define PANFROST_BO_REF_EXCLUSIVE0x1
> +#define PANFROST_BO_REF_NO_IMPLICIT_DEP  0x2
 ```

 This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're
 trying to keep backwards compatibility, but here you're crafting a new
 interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP
 the flag you'd want?  
>>>
>>> AFAICT, all other drivers make the no-implicit-dep an opt-in, and I
>>> didn't want to do things differently in panfrost. But if that's really
>>> an issue, I can make it an opt-out.  
>>
>> I don't have strong feelings either way. I was just under the
>> impressions other drivers did this for b/w compat reasons which don't
>> apply here.
> 
> Okay, I think I'll keep it like that unless there's a strong reason to
> make no-implicit dep the default. It's safer to oversync than the skip
> the synchronization, so it does feel like something the user should
> explicitly enable.

I don't have strong feelings - ultimately the number of projects caring
about the uABI is so limited the "default" is pretty irrelevant (it's
not as if we are needing to guide random developers who are new to the
interface). But a conservative default seems sensible.

>>
 Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like
 somewhat unwarranted complexity, and there is a combinatoric explosion
 here (if jobs, bo refs, and syncobj refs use 3 different versions, as
 this encoding permits... as opposed to just specifying a UABI version or
 something like that)  
>>>
>>> Sounds like a good idea. I'll add a version field and map that
>>> to a  tuple.  
>>
>> Cc Steven, does this make sense?
> 
> I have this approach working, and I must admit I prefer it to the
> per-object stride field passed to the submit struct.
> 

There are benefits both ways:

 * Version number: easier to think about, and less worries about
combinatorial explosion of possible options to test.

 * Explicit structure sizes/strides: much harder to accidentally forgot
to change, clients 'naturally' move to newer versions just with recompiling.

For now I'd be tempted to go for the version number, but I suspect we
should also ensure there's a generic 'flags' field in there. That would
allow us to introduce new features/behaviours in a way which can be
backported more easily if necessary.

The main benefit of structure sizes/strides is if you can break binary
backwards compatibility after a few years - because source compatibility
can easily be maintained while dropping the code for the shorter/older
structs. But Linux tries to maintain binary compatibility so this isn't
so relevant.

Steve


Re: Xorg doesn't work anymore after the latest DRM updates

2021-07-05 Thread Das, Nirmoy

Hi Christian,


This issue looks similar to the one Mikel Rychliski fixed recently  : 
https://patchwork.freedesktop.org/patch/440791. Let us know if this helps.



Regards,

Nirmoy

On 7/3/2021 9:30 AM, Christian Zigotzky wrote:

Hi All,

Xorg doesn't work anymore after the latest DRM updates. [1]

Error messages:

Jul 03 08:54:51 Fienix systemd[1]: Starting Light Display Manager...
Jul 03 08:54:51 Fienix systemd[1]: Started Light Display Manager.
Jul 03 08:54:51 Fienix kernel: BUG: Kernel NULL pointer dereference on 
read at 0x0010
Jul 03 08:54:51 Fienix kernel: Faulting instruction address: 
0xc0630750
Jul 03 08:54:51 Fienix kernel: Oops: Kernel access of bad area, sig: 
11 [#1]
Jul 03 08:54:51 Fienix kernel: BE PAGE_SIZE=4K PREEMPT SMP NR_CPUS=4 
CoreNet Generic
Jul 03 08:54:51 Fienix kernel: Modules linked in: algif_skcipher bnep 
tuner_simple tuner_types tea5767 tuner tda7432 tvaudio msp3400 bttv 
tea575x tveeprom videobuf_dma_sg videobuf_core rc_core videodev mc 
btusb btrtl btbcm btintel bluetooth ecdh_generic ecc uio_pdrv_genirq uio
Jul 03 08:54:51 Fienix kernel: CPU: 3 PID: 4300 Comm: Xorg.wrap Not 
tainted 5.14.0-a3_A-EON_X5000-07637-g3dbdb38e2869-dirty #1
Jul 03 08:54:51 Fienix kernel: NIP:  c0630750 LR: 
c060fedc CTR: c0630728
Jul 03 08:54:51 Fienix kernel: REGS: c0008d903470 TRAP: 0300 Not 
tainted  (5.14.0-a3_A-EON_X5000-07637-g3dbdb38e2869-dirty)
Jul 03 08:54:51 Fienix kernel: MSR:  80029002   CR: 
2222  XER: 2000
Jul 03 08:54:51 Fienix kernel: DEAR: 0010 ESR: 
 IRQMASK: 0
   GPR00: c060fedc 
c0008d903710 c190c400 c00085d59c00
   GPR04: c0008d9035b8 
 c000870a4900 c00085b62d00
   GPR08: 000f 
 c0630728 0003
   GPR12: 2222 
c0003fffeac0 ffe51070 0086007c
   GPR16: 00862820 
ffb7ec68  
   GPR20: c04064a0 
00450088 ffca79e4 5deadbeef122
   GPR24: 5deadbeef100 
 c000876028f0 c00080bd4000
   GPR28: c00087603c48 
c00085d59d78 c00085d59c00 c00085d59c78
Jul 03 08:54:51 Fienix kernel: NIP [c0630750] 
.radeon_ttm_bo_destroy+0x28/0xc0
Jul 03 08:54:51 Fienix kernel: LR [c060fedc] 
.ttm_bo_put+0x2ec/0x344

Jul 03 08:54:51 Fienix kernel: Call Trace:
Jul 03 08:54:51 Fienix kernel: [c0008d903710] [c060fbe4] 
.ttm_bo_cleanup_memtype_use+0x54/0x60 (unreliable)
Jul 03 08:54:51 Fienix kernel: [c0008d903790] [c060fedc] 
.ttm_bo_put+0x2ec/0x344
Jul 03 08:54:51 Fienix kernel: [c0008d903820] [c0630b50] 
.radeon_bo_unref+0x28/0x3c
Jul 03 08:54:51 Fienix kernel: [c0008d9038a0] [c06d1f6c] 
.radeon_vm_fini+0x1b0/0x1b8
Jul 03 08:54:51 Fienix kernel: [c0008d903940] [c0618e38] 
.radeon_driver_postclose_kms+0x128/0x178
Jul 03 08:54:51 Fienix kernel: [c0008d9039e0] [c05deb14] 
.drm_file_free+0x1d8/0x278
Jul 03 08:54:51 Fienix kernel: [c0008d903aa0] [c05def00] 
.drm_release+0x64/0xc8
Jul 03 08:54:51 Fienix kernel: [c0008d903b30] [c017636c] 
.__fput+0x11c/0x25c
Jul 03 08:54:51 Fienix kernel: [c0008d903bd0] [c008b1e8] 
.task_work_run+0xa4/0xbc
Jul 03 08:54:51 Fienix kernel: [c0008d903c70] [c0004bf4] 
.do_notify_resume+0x144/0x2f0
Jul 03 08:54:51 Fienix kernel: [c0008d903d70] [c000b380] 
.syscall_exit_prepare+0x110/0x130
Jul 03 08:54:51 Fienix kernel: [c0008d903e10] [c688] 
system_call_common+0x100/0x1fc

Jul 03 08:54:51 Fienix kernel: --- interrupt: c00 at 0x3f4f58
Jul 03 08:54:51 Fienix kernel: NIP:  003f4f58 LR: 
003f4f2c CTR: 
Jul 03 08:54:51 Fienix kernel: REGS: c0008d903e80 TRAP: 0c00 Not 
tainted  (5.14.0-a3_A-EON_X5000-07637-g3dbdb38e2869-dirty)
Jul 03 08:54:51 Fienix kernel: MSR:  0002d002   
CR: 2420  XER: 

Jul 03 08:54:51 Fienix kernel: IRQMASK: 0
   GPR00: 0006 
ffca66a0 f798a310 
   GPR04:  
  
   GPR08:  
  
   GPR12:  
0044fff4 ffe51070 0086007c
   GPR16: 00862820 
ffb7ec68  
   GPR20: c04064a0 
00450088 ffca79e4 004317ac
   GPR24: 004317b8 
ffca66d0 

[PATCH v4 0/7] drm/panfrost: drm/panfrost: Add a new submit ioctl

2021-07-05 Thread Boris Brezillon
Hello,

This is an attempt at providing a new submit ioctl that's more
Vulkan-friendly than the existing one. This ioctl

1/ allows passing several out syncobjs so we can easily update
   several fence/semaphore in a single ioctl() call
2/ allows passing several jobs so we don't have to have one ioctl
   per job-chain recorded in the command buffer
3/ supports disabling implicit dependencies as well as 
   non-exclusive access to BOs, thus removing unnecessary
   synchronization

I've also been looking at adding {IN,OUT}_FENCE_FD support (allowing
one to pass at most one sync_file object in input and/or creating a
sync_file FD embedding the render out fence), but it's not entirely
clear to me when that's useful. Indeed, we can already do the
sync_file <-> syncobj conversion using the
SYNCOBJ_{FD_TO_HANDLE,HANDLE_TO_FD} ioctls if we have to.
Note that, unlike Turnip, PanVk is using syncobjs to implement
vkQueueWaitIdle(), so the syncobj -> sync_file conversion doesn't
have to happen for each submission, but maybe there's a good reason
to use sync_files for that too. Any feedback on that aspect would
be useful I guess.

Any feedback on this new ioctl is welcome, in particular, do you
think other things are missing/would be nice to have for Vulkan?

Regards,

Boris

P.S.: basic igt tests for these new ioctls re available there [1]

[1]https://gitlab.freedesktop.org/bbrezillon/igt-gpu-tools/-/tree/panfrost-batch-submit

Changes in v4:
* Replace the desc strides by a version field
* Change the submitqueue_create() prototype to return the queue id
  directly
* Implement the old submit ioctl() as a simple wrapper around
  panfrost_submit_job()

Changes in v3:
* Fix a deadlock in the submitqueue logic
* Limit the number of submitqueue per context to 16

*** BLURB HERE ***

Boris Brezillon (7):
  drm/panfrost: Pass a job to panfrost_{acquire,attach}_object_fences()
  drm/panfrost: Move the mappings collection out of
panfrost_lookup_bos()
  drm/panfrost: Add BO access flags to relax dependencies between jobs
  drm/panfrost: Add the ability to create submit queues
  drm/panfrost: Add a new ioctl to submit batches
  drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature
  drm/panfrost: Bump minor version to reflect the feature additions

 drivers/gpu/drm/panfrost/Makefile |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.h|   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c   | 611 +-
 drivers/gpu/drm/panfrost/panfrost_job.c   |  89 ++-
 drivers/gpu/drm/panfrost/panfrost_job.h   |  10 +-
 .../gpu/drm/panfrost/panfrost_submitqueue.c   | 132 
 .../gpu/drm/panfrost/panfrost_submitqueue.h   |  26 +
 include/uapi/drm/panfrost_drm.h   | 112 
 8 files changed, 766 insertions(+), 219 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h

-- 
2.31.1



[PATCH v4 3/7] drm/panfrost: Add BO access flags to relax dependencies between jobs

2021-07-05 Thread Boris Brezillon
Jobs reading from the same BO should not be serialized. Add access
flags so we can relax the implicit dependencies in that case. We force
exclusive access for now to keep the behavior unchanged, but a new
SUBMIT ioctl taking explicit access flags will be introduced.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c |  9 +
 drivers/gpu/drm/panfrost/panfrost_job.c | 23 +++
 drivers/gpu/drm/panfrost/panfrost_job.h |  1 +
 include/uapi/drm/panfrost_drm.h |  3 +++
 4 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 9bbc9e78cc85..b6b5997c9366 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -164,6 +164,15 @@ panfrost_lookup_bos(struct drm_device *dev,
if (!job->bo_count)
return 0;
 
+   job->bo_flags = kvmalloc_array(job->bo_count,
+  sizeof(*job->bo_flags),
+  GFP_KERNEL | __GFP_ZERO);
+   if (!job->bo_flags)
+   return -ENOMEM;
+
+   for (i = 0; i < job->bo_count; i++)
+   job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
+
ret = drm_gem_objects_lookup(file_priv,
 (void __user *)(uintptr_t)args->bo_handles,
 job->bo_count, &job->bos);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index fdc1bd7ecf12..152245b122be 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -245,8 +245,16 @@ static int panfrost_acquire_object_fences(struct 
panfrost_job *job)
int i, ret;
 
for (i = 0; i < job->bo_count; i++) {
-   /* panfrost always uses write mode in its current uapi */
-   ret = drm_gem_fence_array_add_implicit(&job->deps, job->bos[i], 
true);
+   bool exclusive = job->bo_flags[i] & PANFROST_BO_REF_EXCLUSIVE;
+
+   if (!exclusive) {
+   ret = dma_resv_reserve_shared(job->bos[i]->resv, 1);
+   if (ret)
+   return ret;
+   }
+
+   ret = drm_gem_fence_array_add_implicit(&job->deps, job->bos[i],
+  exclusive);
if (ret)
return ret;
}
@@ -258,8 +266,14 @@ static void panfrost_attach_object_fences(struct 
panfrost_job *job)
 {
int i;
 
-   for (i = 0; i < job->bo_count; i++)
-   dma_resv_add_excl_fence(job->bos[i]->resv, 
job->render_done_fence);
+   for (i = 0; i < job->bo_count; i++) {
+   struct dma_resv *robj = job->bos[i]->resv;
+
+   if (job->bo_flags[i] & PANFROST_BO_REF_EXCLUSIVE)
+   dma_resv_add_excl_fence(robj, job->render_done_fence);
+   else
+   dma_resv_add_shared_fence(robj, job->render_done_fence);
+   }
 }
 
 int panfrost_job_push(struct panfrost_job *job)
@@ -340,6 +354,7 @@ static void panfrost_job_cleanup(struct kref *ref)
kvfree(job->bos);
}
 
+   kvfree(job->bo_flags);
kfree(job);
 }
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h 
b/drivers/gpu/drm/panfrost/panfrost_job.h
index 82306a03b57e..1cbc3621b663 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.h
+++ b/drivers/gpu/drm/panfrost/panfrost_job.h
@@ -32,6 +32,7 @@ struct panfrost_job {
 
struct panfrost_gem_mapping **mappings;
struct drm_gem_object **bos;
+   u32 *bo_flags;
u32 bo_count;
 
/* Fence to be signaled by drm-sched once its done with the job */
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 061e700dd06c..3723c9d231b5 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -224,6 +224,9 @@ struct drm_panfrost_madvise {
__u32 retained;   /* out, whether backing store still exists */
 };
 
+/* Exclusive (AKA write) access to the BO */
+#define PANFROST_BO_REF_EXCLUSIVE  0x1
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.31.1



[PATCH v4 1/7] drm/panfrost: Pass a job to panfrost_{acquire, attach}_object_fences()

2021-07-05 Thread Boris Brezillon
So we don't have to change the prototype if we extend the function.

v3:
* Fix subject

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 71a72fb50e6b..fdc1bd7ecf12 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -240,15 +240,13 @@ static void panfrost_job_hw_submit(struct panfrost_job 
*job, int js)
spin_unlock(&pfdev->js->job_lock);
 }
 
-static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
- int bo_count,
- struct xarray *deps)
+static int panfrost_acquire_object_fences(struct panfrost_job *job)
 {
int i, ret;
 
-   for (i = 0; i < bo_count; i++) {
+   for (i = 0; i < job->bo_count; i++) {
/* panfrost always uses write mode in its current uapi */
-   ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
+   ret = drm_gem_fence_array_add_implicit(&job->deps, job->bos[i], 
true);
if (ret)
return ret;
}
@@ -256,14 +254,12 @@ static int panfrost_acquire_object_fences(struct 
drm_gem_object **bos,
return 0;
 }
 
-static void panfrost_attach_object_fences(struct drm_gem_object **bos,
- int bo_count,
- struct dma_fence *fence)
+static void panfrost_attach_object_fences(struct panfrost_job *job)
 {
int i;
 
-   for (i = 0; i < bo_count; i++)
-   dma_resv_add_excl_fence(bos[i]->resv, fence);
+   for (i = 0; i < job->bo_count; i++)
+   dma_resv_add_excl_fence(job->bos[i]->resv, 
job->render_done_fence);
 }
 
 int panfrost_job_push(struct panfrost_job *job)
@@ -290,8 +286,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
-   ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
-&job->deps);
+   ret = panfrost_acquire_object_fences(job);
if (ret) {
mutex_unlock(&pfdev->sched_lock);
goto unlock;
@@ -303,8 +298,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
mutex_unlock(&pfdev->sched_lock);
 
-   panfrost_attach_object_fences(job->bos, job->bo_count,
- job->render_done_fence);
+   panfrost_attach_object_fences(job);
 
 unlock:
drm_gem_unlock_reservations(job->bos, job->bo_count, &acquire_ctx);
-- 
2.31.1



[PATCH v4 2/7] drm/panfrost: Move the mappings collection out of panfrost_lookup_bos()

2021-07-05 Thread Boris Brezillon
So we can re-use it from elsewhere.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 52 ++---
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 1ffaef5ec5ff..9bbc9e78cc85 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -109,6 +109,34 @@ static int panfrost_ioctl_create_bo(struct drm_device 
*dev, void *data,
return 0;
 }
 
+static int
+panfrost_get_job_mappings(struct drm_file *file_priv, struct panfrost_job *job)
+{
+   struct panfrost_file_priv *priv = file_priv->driver_priv;
+   unsigned int i;
+
+   job->mappings = kvmalloc_array(job->bo_count,
+  sizeof(*job->mappings),
+  GFP_KERNEL | __GFP_ZERO);
+   if (!job->mappings)
+   return -ENOMEM;
+
+   for (i = 0; i < job->bo_count; i++) {
+   struct panfrost_gem_mapping *mapping;
+   struct panfrost_gem_object *bo;
+
+   bo = to_panfrost_bo(job->bos[i]);
+   mapping = panfrost_gem_mapping_get(bo, priv);
+   if (!mapping)
+   return -EINVAL;
+
+   atomic_inc(&bo->gpu_usecount);
+   job->mappings[i] = mapping;
+   }
+
+   return 0;
+}
+
 /**
  * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
  * referenced by the job.
@@ -128,8 +156,6 @@ panfrost_lookup_bos(struct drm_device *dev,
  struct drm_panfrost_submit *args,
  struct panfrost_job *job)
 {
-   struct panfrost_file_priv *priv = file_priv->driver_priv;
-   struct panfrost_gem_object *bo;
unsigned int i;
int ret;
 
@@ -144,27 +170,7 @@ panfrost_lookup_bos(struct drm_device *dev,
if (ret)
return ret;
 
-   job->mappings = kvmalloc_array(job->bo_count,
-  sizeof(struct panfrost_gem_mapping *),
-  GFP_KERNEL | __GFP_ZERO);
-   if (!job->mappings)
-   return -ENOMEM;
-
-   for (i = 0; i < job->bo_count; i++) {
-   struct panfrost_gem_mapping *mapping;
-
-   bo = to_panfrost_bo(job->bos[i]);
-   mapping = panfrost_gem_mapping_get(bo, priv);
-   if (!mapping) {
-   ret = -EINVAL;
-   break;
-   }
-
-   atomic_inc(&bo->gpu_usecount);
-   job->mappings[i] = mapping;
-   }
-
-   return ret;
+   return panfrost_get_job_mappings(file_priv, job);
 }
 
 /**
-- 
2.31.1



[PATCH v4 6/7] drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature

2021-07-05 Thread Boris Brezillon
Now that we have a new SUBMIT ioctl dealing with timelined syncojbs we
can advertise the feature.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index a624e4f86aff..fae62142c878 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -789,7 +789,8 @@ DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
  * - 1.2 - adds AFBC_FEATURES query
  */
 static const struct drm_driver panfrost_drm_driver = {
-   .driver_features= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ,
+   .driver_features= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
+ DRIVER_SYNCOBJ_TIMELINE,
.open   = panfrost_open,
.postclose  = panfrost_postclose,
.ioctls = panfrost_drm_driver_ioctls,
-- 
2.31.1



[PATCH v4 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-05 Thread Boris Brezillon
Needed to keep VkQueues isolated from each other.

v4:
* Make panfrost_ioctl_create_submitqueue() return the queue ID
  instead of a queue object

v3:
* Limit the number of submitqueue per context to 16
* Fix a deadlock

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/Makefile |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.h|   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c   |  69 -
 drivers/gpu/drm/panfrost/panfrost_job.c   |  47 ++-
 drivers/gpu/drm/panfrost/panfrost_job.h   |   9 +-
 .../gpu/drm/panfrost/panfrost_submitqueue.c   | 132 ++
 .../gpu/drm/panfrost/panfrost_submitqueue.h   |  26 
 include/uapi/drm/panfrost_drm.h   |  17 +++
 8 files changed, 260 insertions(+), 45 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h

diff --git a/drivers/gpu/drm/panfrost/Makefile 
b/drivers/gpu/drm/panfrost/Makefile
index b71935862417..e99192b66ec9 100644
--- a/drivers/gpu/drm/panfrost/Makefile
+++ b/drivers/gpu/drm/panfrost/Makefile
@@ -9,6 +9,7 @@ panfrost-y := \
panfrost_gpu.o \
panfrost_job.o \
panfrost_mmu.o \
-   panfrost_perfcnt.o
+   panfrost_perfcnt.o \
+   panfrost_submitqueue.o
 
 obj-$(CONFIG_DRM_PANFROST) += panfrost.o
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index 8b25278f34c8..51c0ba4e50f5 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -137,7 +137,7 @@ struct panfrost_mmu {
 struct panfrost_file_priv {
struct panfrost_device *pfdev;
 
-   struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
+   struct idr queues;
 
struct panfrost_mmu *mmu;
 };
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index b6b5997c9366..8e28ef30310b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -19,6 +19,7 @@
 #include "panfrost_job.h"
 #include "panfrost_gpu.h"
 #include "panfrost_perfcnt.h"
+#include "panfrost_submitqueue.h"
 
 static bool unstable_ioctls;
 module_param_unsafe(unstable_ioctls, bool, 0600);
@@ -250,6 +251,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
struct panfrost_device *pfdev = dev->dev_private;
struct drm_panfrost_submit *args = data;
struct drm_syncobj *sync_out = NULL;
+   struct panfrost_submitqueue *queue;
struct panfrost_job *job;
int ret = 0;
 
@@ -259,10 +261,16 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
return -EINVAL;
 
+   queue = panfrost_submitqueue_get(file->driver_priv, 0);
+   if (IS_ERR(queue))
+   return PTR_ERR(queue);
+
if (args->out_sync > 0) {
sync_out = drm_syncobj_find(file, args->out_sync);
-   if (!sync_out)
-   return -ENODEV;
+   if (!sync_out) {
+   ret = -ENODEV;
+   goto fail_put_queue;
+   }
}
 
job = kzalloc(sizeof(*job), GFP_KERNEL);
@@ -289,7 +297,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
if (ret)
goto fail_job;
 
-   ret = panfrost_job_push(job);
+   ret = panfrost_job_push(queue, job);
if (ret)
goto fail_job;
 
@@ -302,6 +310,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
 fail_out_sync:
if (sync_out)
drm_syncobj_put(sync_out);
+fail_put_queue:
+   panfrost_submitqueue_put(queue);
 
return ret;
 }
@@ -451,6 +461,36 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, 
void *data,
return ret;
 }
 
+static int
+panfrost_ioctl_create_submitqueue(struct drm_device *dev, void *data,
+ struct drm_file *file_priv)
+{
+   struct panfrost_file_priv *priv = file_priv->driver_priv;
+   struct drm_panfrost_create_submitqueue *args = data;
+   int ret;
+
+   ret = panfrost_submitqueue_create(priv, args->priority, args->flags);
+   if (ret < 0)
+   return ret;
+
+   args->id = ret;
+   return 0;
+}
+
+static int
+panfrost_ioctl_destroy_submitqueue(struct drm_device *dev, void *data,
+  struct drm_file *file_priv)
+{
+   struct panfrost_file_priv *priv = file_priv->driver_priv;
+   u32 id = *((u32 *)data);
+
+   /* Default queue can't be destroyed. */
+   if (!id)
+   return -ENOENT;
+
+   return panfrost_submitqueue_destroy(priv, id);
+}
+
 int panfrost_unstable_ioctl_check(void)
 {
if (!unstable_ioctls)
@@ -479,13 +519,22 @@ panfrost_open(struct drm_device *dev,

[PATCH v4 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-05 Thread Boris Brezillon
This should help limit the number of ioctls when submitting multiple
jobs. The new ioctl also supports syncobj timelines and BO access flags.

v4:
* Implement panfrost_ioctl_submit() as a wrapper around
  panfrost_submit_job()
* Replace stride fields by a version field which is mapped to
  a  tuple internally

v3:
* Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the
  old submit path

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 562 
 drivers/gpu/drm/panfrost/panfrost_job.c |   3 +
 include/uapi/drm/panfrost_drm.h |  92 
 3 files changed, 479 insertions(+), 178 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 8e28ef30310b..a624e4f86aff 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -138,184 +138,6 @@ panfrost_get_job_mappings(struct drm_file *file_priv, 
struct panfrost_job *job)
return 0;
 }
 
-/**
- * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @args: IOCTL args
- * @job: job being set up
- *
- * Resolve handles from userspace to BOs and attach them to job.
- *
- * Note that this function doesn't need to unreference the BOs on
- * failure, because that will happen at panfrost_job_cleanup() time.
- */
-static int
-panfrost_lookup_bos(struct drm_device *dev,
- struct drm_file *file_priv,
- struct drm_panfrost_submit *args,
- struct panfrost_job *job)
-{
-   unsigned int i;
-   int ret;
-
-   job->bo_count = args->bo_handle_count;
-
-   if (!job->bo_count)
-   return 0;
-
-   job->bo_flags = kvmalloc_array(job->bo_count,
-  sizeof(*job->bo_flags),
-  GFP_KERNEL | __GFP_ZERO);
-   if (!job->bo_flags)
-   return -ENOMEM;
-
-   for (i = 0; i < job->bo_count; i++)
-   job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
-
-   ret = drm_gem_objects_lookup(file_priv,
-(void __user *)(uintptr_t)args->bo_handles,
-job->bo_count, &job->bos);
-   if (ret)
-   return ret;
-
-   return panfrost_get_job_mappings(file_priv, job);
-}
-
-/**
- * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @args: IOCTL args
- * @job: job being set up
- *
- * Resolve syncobjs from userspace to fences and attach them to job.
- *
- * Note that this function doesn't need to unreference the fences on
- * failure, because that will happen at panfrost_job_cleanup() time.
- */
-static int
-panfrost_copy_in_sync(struct drm_device *dev,
- struct drm_file *file_priv,
- struct drm_panfrost_submit *args,
- struct panfrost_job *job)
-{
-   u32 *handles;
-   int ret = 0;
-   int i, in_fence_count;
-
-   in_fence_count = args->in_sync_count;
-
-   if (!in_fence_count)
-   return 0;
-
-   handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
-   if (!handles) {
-   ret = -ENOMEM;
-   DRM_DEBUG("Failed to allocate incoming syncobj handles\n");
-   goto fail;
-   }
-
-   if (copy_from_user(handles,
-  (void __user *)(uintptr_t)args->in_syncs,
-  in_fence_count * sizeof(u32))) {
-   ret = -EFAULT;
-   DRM_DEBUG("Failed to copy in syncobj handles\n");
-   goto fail;
-   }
-
-   for (i = 0; i < in_fence_count; i++) {
-   struct dma_fence *fence;
-
-   ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
-&fence);
-   if (ret)
-   goto fail;
-
-   ret = drm_gem_fence_array_add(&job->deps, fence);
-
-   if (ret)
-   goto fail;
-   }
-
-fail:
-   kvfree(handles);
-   return ret;
-}
-
-static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
-   struct drm_file *file)
-{
-   struct panfrost_device *pfdev = dev->dev_private;
-   struct drm_panfrost_submit *args = data;
-   struct drm_syncobj *sync_out = NULL;
-   struct panfrost_submitqueue *queue;
-   struct panfrost_job *job;
-   int ret = 0;
-
-   if (!args->jc)
-   return -EINVAL;
-
-   if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
-   return -EINVAL;
-
-   queue = panfrost_submitqueue_get(file->driver_priv, 0);
-   if (IS_ERR(queue))
-   return PTR_ERR(queue);
-
-   if (args->out_sync > 0) {
- 

[PATCH v4 7/7] drm/panfrost: Bump minor version to reflect the feature additions

2021-07-05 Thread Boris Brezillon
We now have a new ioctl that allows submitting multiple jobs at once
(among other things) and we support timelined syncobjs. Bump the
minor version number to reflect those changes.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index fae62142c878..27e2e8b36ec9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -787,6 +787,8 @@ DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
  * - 1.0 - initial interface
  * - 1.1 - adds HEAP and NOEXEC flags for CREATE_BO
  * - 1.2 - adds AFBC_FEATURES query
+ * - 1.3 - adds the BATCH_SUBMIT, CREATE_SUBMITQUEUE, DESTROY_SUBMITQUEUE
+ *ioctls and advertises the SYNCOBJ_TIMELINE feature
  */
 static const struct drm_driver panfrost_drm_driver = {
.driver_features= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
@@ -800,7 +802,7 @@ static const struct drm_driver panfrost_drm_driver = {
.desc   = "panfrost DRM",
.date   = "20180908",
.major  = 1,
-   .minor  = 2,
+   .minor  = 3,
 
.gem_create_object  = panfrost_gem_create_object,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
-- 
2.31.1



Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete

2021-07-05 Thread Tvrtko Ursulin



On 02/07/2021 20:22, Daniel Vetter wrote:

On Fri, Jul 02, 2021 at 03:31:08PM +0100, Tvrtko Ursulin wrote:


On 01/07/2021 16:10, Matthew Auld wrote:

The CPU domain should be static for discrete, and on DG1 we don't need
any flushing since everything is already coherent, so really all this


Knowledge of the write combine buffer is assumed to be had by anyone involved?


What about this question? For discrete userspace will assume WC and will 
know how to flush WC buffer? Or it is assumed the flush will be hit 
implicitly? Will this be documented?



does is an object wait, for which we have an ioctl. Longer term the
desired caching should be an immutable creation time property for the
BO, which can be set with something like gem_create_ext.

One other user is iris + userptr, which uses the set_domain to probe all
the pages to check if the GUP succeeds, however keeping the set_domain
around just for that seems rather scuffed. We could equally just submit
a dummy batch, which should hopefully be good enough, otherwise adding a
new creation time flag for userptr might be an option. Although longer
term we will also have vm_bind, which should also be a nice fit for
this, so adding a whole new flag is likely overkill.


Execbuf sounds horrible. But it all reminds me of past work by Chris which is 
surprisingly hard to find in the archives. Patches like:

commit 7706a433388016983052a27c0fd74a64b1897ae7
Author: Chris Wilson 
Date:   Wed Nov 8 17:04:07 2017 +

 drm/i915/userptr: Probe existence of backing struct pages upon creation
 Jason Ekstrand requested a more efficient method than userptr+set-domain
 to determine if the userptr object was backed by a complete set of pages
 upon creation. To be more efficient than simply populating the userptr
 using get_user_pages() (as done by the call to set-domain or execbuf),
 we can walk the tree of vm_area_struct and check for gaps or vma not
 backed by struct page (VM_PFNMAP). The question is how to handle
 VM_MIXEDMAP which may be either struct page or pfn backed...

commit 7ca21d3390eec23db99b8131ed18bc036efaba18
Author: Chris Wilson 
Date:   Wed Nov 8 17:48:22 2017 +

 drm/i915/userptr: Add a flag to populate the userptr on creation
 Acquiring the backing struct pages for the userptr range is not free;
 the first client for userptr would insist on frequently creating userptr
 objects ahead of time and not use them. For that first client, deferring
 the cost of populating the userptr (calling get_user_pages()) to the
 actual execbuf was a substantial improvement. However, not all clients
 are the same, and most would like to validate that the userptr is valid
 and backed by struct pages upon creation, so offer a
 I915_USERPTR_POPULATE flag to do just that.
 Note that big difference between I915_USERPTR_POPULATE and the deferred
 scheme is that POPULATE is guaranteed to be synchronous, the result is
 known before the ioctl returns (and the handle exposed). However, due to
 system memory pressure, the object may be paged out before use,
 requiring them to be paged back in on execbuf (as may always happen).

At least with the first one I think I was skeptical, since probing at
point A makes a weak test versus userptr getting used at point B.
Populate is kind of same really when user controls the backing store. At
least these two arguments I think stand if we are trying to sell these
flags as validation. But if the idea is limited to pure preload, with no
guarantees that it keeps working by time of real use, then I guess it
may be passable.


Well we've thrown this out again because there was no userspace. But if
this is requested by mesa, then the _PROBE flag should be entirely
sufficient.


Why probe and not populate? For me probe is weak and implies to give a 
guarantee which cannot really be given. If the pointer is not trusted, 
there is no reason to think it cannot go bad between creating the buffer 
(probe) and actual use. Populate on the other hand could be described as 
simply instantiate the backing store with the same caveat mentioned. No 
guarantees about the future validity of the backing store in either case 
should be implied.



Since I don't want to hold up dg1 pciids on this it'd be nice if we could
just go ahead with the dummy batch, if Ken/Jordan don't object - iris is
the only umd that needs this.


I am not up to speed to understand how to PCI ids come into play here, 
but what is the suggestion with the dummy batch - to actually submit 
something which ends up executing, waking up the GPU etc? Or be crafty 
and make it fail after it acquires backing store? Not sure if we have 
such a spot that late so just asking to start with. If the plan is to 
wake up the GPU that's quite ugly in my opinion. Especially since patch 
which adds the flag already exists so shouldn't really be much a delay 
to sync userspace and i915 merge.



Disclaimer that I haven'

Re: drm: screen stuck since commit f611b1e7624c: drm: Avoid circular dependencies for CONFIG_FB

2021-07-05 Thread Daniel Vetter
On Mon, Jul 05, 2021 at 11:21:22AM +0300, Jani Nikula wrote:
> On Sat, 03 Jul 2021, Corentin Labbe  wrote:
> > Hello
> >
> > On next-20210701, my screen is stuck (see attached photo).
> > I bisect the problem to:
> > git bisect start
> > # good: [62fb9874f5da54fdb243003b386128037319b219] Linux 5.13
> > git bisect good 62fb9874f5da54fdb243003b386128037319b219
> > # bad: [fb0ca446157a86b75502c1636b0d81e642fe6bf1] Add linux-next specific 
> > files for 20210701
> > git bisect bad fb0ca446157a86b75502c1636b0d81e642fe6bf1
> > # good: [f63c4fda987a19b1194cc45cb72fd5bf968d9d90] Merge remote-tracking 
> > branch 'rdma/for-next'
> > git bisect good f63c4fda987a19b1194cc45cb72fd5bf968d9d90
> > # bad: [49c8769be0b910d4134eba07cae5d9c71b861c4a] Merge remote-tracking 
> > branch 'drm/drm-next'
> > git bisect bad 49c8769be0b910d4134eba07cae5d9c71b861c4a
> > # good: [4e3db44a242a4e2afe33b59793898ecbb61d478e] Merge tag 
> > 'wireless-drivers-next-2021-06-25' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
> > git bisect good 4e3db44a242a4e2afe33b59793898ecbb61d478e
> > # good: [5745d647d5563d3e9d32013ad4e5c629acff04d7] Merge tag 
> > 'amd-drm-next-5.14-2021-06-02' of 
> > https://gitlab.freedesktop.org/agd5f/linux into drm-next
> > git bisect good 5745d647d5563d3e9d32013ad4e5c629acff04d7
> > # bad: [8fe44c080a53ac0ccbe88053a2e40f9acca33091] drm/amdgpu/display: fold 
> > DRM_AMD_DC_DCN3_1 into DRM_AMD_DC_DCN
> > git bisect bad 8fe44c080a53ac0ccbe88053a2e40f9acca33091
> > # good: [2c1b1ac7084edf477309d27c02d9da7f79b33cec] drm/amdgpu/vcn: drop 
> > gfxoff control for VCN2+
> > git bisect good 2c1b1ac7084edf477309d27c02d9da7f79b33cec
> > # good: [2c1b1ac7084edf477309d27c02d9da7f79b33cec] drm/amdgpu/vcn: drop 
> > gfxoff control for VCN2+
> > git bisect good 2c1b1ac7084edf477309d27c02d9da7f79b33cec
> > # bad: [d4c9b03ff6a9914b55e4e23fcac11339a2706cc6] drm/amd/pm: Add renoir 
> > throttler translation
> > git bisect bad d4c9b03ff6a9914b55e4e23fcac11339a2706cc6
> > # bad: [691cf8cd7a531dbfcc29d09a23c509a86fd9b24f] drm/amdgpu: use correct 
> > rounding macro for 64-bit
> > git bisect bad 691cf8cd7a531dbfcc29d09a23c509a86fd9b24f
> > # bad: [2fdcb55dfc86835e4845e3f422180b5596d23cb4] drm/amdkfd: use resource 
> > cursor in svm_migrate_copy_to_vram v2
> > git bisect bad 2fdcb55dfc86835e4845e3f422180b5596d23cb4
> > # bad: [6c3f953381e526a1623d4575660afae8b19ffa20] drm/sti/sti_hqvdp: Fix 
> > incorrectly named function 'sti_hqvdp_vtg_cb()'
> > git bisect bad 6c3f953381e526a1623d4575660afae8b19ffa20
> > # bad: [5ea4dba68305d9648b9dba30036cc36d4e877bca] drm/msm/a6xx: add 
> > CONFIG_QCOM_LLCC dependency
> > git bisect bad 5ea4dba68305d9648b9dba30036cc36d4e877bca
> > # good: [4a888ba03fd97d1cb0253581973533965bf348c4] drm/vgem/vgem_drv: 
> > Standard comment blocks should not use kernel-doc format
> > git bisect good 4a888ba03fd97d1cb0253581973533965bf348c4
> > # good: [c5ef15ae09637fb51ae43e1d1d98329d67dd4fd6] video: fbdev: atyfb: 
> > mach64_cursor.c: deleted the repeated word
> > git bisect good c5ef15ae09637fb51ae43e1d1d98329d67dd4fd6
> > # bad: [f611b1e7624ccdbd495c19e9805629e22265aa16] drm: Avoid circular 
> > dependencies for CONFIG_FB
> > git bisect bad f611b1e7624ccdbd495c19e9805629e22265aa16
> > # good: [ff323d6d72e1e4971c8ba9e2f3cf8afc48f22383] video: fbdev: mb862xx: 
> > use DEVICE_ATTR_RO macro
> > git bisect good ff323d6d72e1e4971c8ba9e2f3cf8afc48f22383
> > # first bad commit: [f611b1e7624ccdbd495c19e9805629e22265aa16] drm: Avoid 
> > circular dependencies for CONFIG_FB
> >
> > reverting ff323d6d72e1e4971c8ba9e2f3cf8afc48f22383 lead to a correct boot 
> > (I got a viewable login prompt).
> 
> I presume that's a copy-paste error, and you mean
> f611b1e7624ccdbd495c19e9805629e22265aa16, the first bad commit, i.e.
> 
> commit f611b1e7624ccdbd495c19e9805629e22265aa16
> Author: Kees Cook 
> Date:   Wed Jun 2 14:52:50 2021 -0700
> 
> drm: Avoid circular dependencies for CONFIG_FB
> 
> > So ff323d6d72e1 cause this config change when compiling my defconfig:
> > --- config.fbok 2021-07-03 21:31:08.527260693 +0200
> > +++ config.fbko 2021-07-03 21:39:51.604275703 +0200
> > -CONFIG_VT_HW_CONSOLE_BINDING=y
> > +# CONFIG_VT_HW_CONSOLE_BINDING is not set
> > -CONFIG_DRM_FBDEV_EMULATION=y
> > -CONFIG_DRM_FBDEV_OVERALLOC=100
> > -CONFIG_FB_NOTIFY=y
> > -CONFIG_FB=y
> > -# CONFIG_FIRMWARE_EDID is not set
> > -CONFIG_FB_CFB_FILLRECT=y
> > -CONFIG_FB_CFB_COPYAREA=y
> > -CONFIG_FB_CFB_IMAGEBLIT=y
> > -CONFIG_FB_SYS_FILLRECT=y
> > -CONFIG_FB_SYS_COPYAREA=y
> > -CONFIG_FB_SYS_IMAGEBLIT=y
> > -# CONFIG_FB_FOREIGN_ENDIAN is not set
> > -CONFIG_FB_SYS_FOPS=y
> > -CONFIG_FB_DEFERRED_IO=y
> > -CONFIG_FB_MODE_HELPERS=y
> > -CONFIG_FB_TILEBLITTING=y
> > -
> > -#
> > -# Frame buffer hardware drivers
> > -#
> > -# CONFIG_FB_CIRRUS is not set
> > -# CONFIG_FB_PM2 is not set
> > -# CONFIG_FB_CYBER2000 is not set
> > -# CONFIG_FB_ARC is not set
> > -# CONFIG_FB_ASILIANT is not set
> > -# CONFIG_FB_IMSTT is not set

Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-05 Thread Boris Brezillon
Hi Steven,

On Mon, 5 Jul 2021 09:22:39 +0100
Steven Price  wrote:

> On 02/07/2021 19:11, Boris Brezillon wrote:
> > On Fri, 2 Jul 2021 12:49:55 -0400
> > Alyssa Rosenzweig  wrote:
> >   
>  ```
> >  #define PANFROST_BO_REF_EXCLUSIVE  0x1
> > +#define PANFROST_BO_REF_NO_IMPLICIT_DEP0x2  
>  ```
> 
>  This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're
>  trying to keep backwards compatibility, but here you're crafting a new
>  interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP
>  the flag you'd want?
> >>>
> >>> AFAICT, all other drivers make the no-implicit-dep an opt-in, and I
> >>> didn't want to do things differently in panfrost. But if that's really
> >>> an issue, I can make it an opt-out.
> >>
> >> I don't have strong feelings either way. I was just under the
> >> impressions other drivers did this for b/w compat reasons which don't
> >> apply here.  
> > 
> > Okay, I think I'll keep it like that unless there's a strong reason to
> > make no-implicit dep the default. It's safer to oversync than the skip
> > the synchronization, so it does feel like something the user should
> > explicitly enable.  
> 
> I don't have strong feelings - ultimately the number of projects caring
> about the uABI is so limited the "default" is pretty irrelevant (it's
> not as if we are needing to guide random developers who are new to the
> interface). But a conservative default seems sensible.
> 
> >>  
>  Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like
>  somewhat unwarranted complexity, and there is a combinatoric explosion
>  here (if jobs, bo refs, and syncobj refs use 3 different versions, as
>  this encoding permits... as opposed to just specifying a UABI version or
>  something like that)
> >>>
> >>> Sounds like a good idea. I'll add a version field and map that
> >>> to a  tuple.
> >>
> >> Cc Steven, does this make sense?  
> > 
> > I have this approach working, and I must admit I prefer it to the
> > per-object stride field passed to the submit struct.
> >   
> 
> There are benefits both ways:
> 
>  * Version number: easier to think about, and less worries about
> combinatorial explosion of possible options to test.
> 
>  * Explicit structure sizes/strides: much harder to accidentally forgot
> to change, clients 'naturally' move to newer versions just with recompiling.

The version I just sent has a PANFROST_SUBMIT_BATCH_VERSION macro
defined in the the uAPI header, so getting right without changing the
code should be fine (same has with the sizeof(struct xx)) trick with
the per-desc stride approach).

> 
> For now I'd be tempted to go for the version number, but I suspect we
> should also ensure there's a generic 'flags' field in there. That would
> allow us to introduce new features/behaviours in a way which can be
> backported more easily if necessary.

Adding features at the submit level without changing the version number
is already possible (we can extend drm_panfrost_batch_submit without
breaking the uABI), but I'm not sure that's a good idea...

If you mean adding a flags field at the job level, I can add it, but I
wonder what you have in mind when you say some features might be
interesting to backport. I really thought we'd force people to update
their kernel when they want those new features.

Regards,

Boris


Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-05 Thread Steven Price
On 05/07/2021 09:43, Boris Brezillon wrote:
> Hi Steven,
> 
> On Mon, 5 Jul 2021 09:22:39 +0100
> Steven Price  wrote:
> 
>> On 02/07/2021 19:11, Boris Brezillon wrote:
>>> On Fri, 2 Jul 2021 12:49:55 -0400
>>> Alyssa Rosenzweig  wrote:
>>>   
>> ```
>>>  #define PANFROST_BO_REF_EXCLUSIVE  0x1
>>> +#define PANFROST_BO_REF_NO_IMPLICIT_DEP0x2  
>> ```
>>
>> This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're
>> trying to keep backwards compatibility, but here you're crafting a new
>> interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP
>> the flag you'd want?
>
> AFAICT, all other drivers make the no-implicit-dep an opt-in, and I
> didn't want to do things differently in panfrost. But if that's really
> an issue, I can make it an opt-out.

 I don't have strong feelings either way. I was just under the
 impressions other drivers did this for b/w compat reasons which don't
 apply here.  
>>>
>>> Okay, I think I'll keep it like that unless there's a strong reason to
>>> make no-implicit dep the default. It's safer to oversync than the skip
>>> the synchronization, so it does feel like something the user should
>>> explicitly enable.  
>>
>> I don't have strong feelings - ultimately the number of projects caring
>> about the uABI is so limited the "default" is pretty irrelevant (it's
>> not as if we are needing to guide random developers who are new to the
>> interface). But a conservative default seems sensible.
>>
  
>> Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like
>> somewhat unwarranted complexity, and there is a combinatoric explosion
>> here (if jobs, bo refs, and syncobj refs use 3 different versions, as
>> this encoding permits... as opposed to just specifying a UABI version or
>> something like that)
>
> Sounds like a good idea. I'll add a version field and map that
> to a  tuple.

 Cc Steven, does this make sense?  
>>>
>>> I have this approach working, and I must admit I prefer it to the
>>> per-object stride field passed to the submit struct.
>>>   
>>
>> There are benefits both ways:
>>
>>  * Version number: easier to think about, and less worries about
>> combinatorial explosion of possible options to test.
>>
>>  * Explicit structure sizes/strides: much harder to accidentally forgot
>> to change, clients 'naturally' move to newer versions just with recompiling.
> 
> The version I just sent has a PANFROST_SUBMIT_BATCH_VERSION macro
> defined in the the uAPI header, so getting right without changing the
> code should be fine (same has with the sizeof(struct xx)) trick with
> the per-desc stride approach).
> 
>>
>> For now I'd be tempted to go for the version number, but I suspect we
>> should also ensure there's a generic 'flags' field in there. That would
>> allow us to introduce new features/behaviours in a way which can be
>> backported more easily if necessary.
> 
> Adding features at the submit level without changing the version number
> is already possible (we can extend drm_panfrost_batch_submit without
> breaking the uABI), but I'm not sure that's a good idea...

Ah, yes I'd forgotten the ioctl itself already had the implicit
sizeof(struct) encoding. I guess there's no need for flags (now) because
it can be added later if it even becomes useful.

> If you mean adding a flags field at the job level, I can add it, but I
> wonder what you have in mind when you say some features might be
> interesting to backport. I really thought we'd force people to update
> their kernel when they want those new features.

My concern is if we ever find a security bug which requires new
information/behaviour in the submit ABI to properly fix. In this case it
would be appropriate to backport a 'feature' (bug fix) which provides a
new ABI but it would need to be a small change. A flags field where we
can set a "PANFROST_ACTUALLY_BE_SECURE" bit would be useful then - but
we wouldn't want to start bumping version numbers in the backport.

But at least for now we could just assume we'll expand the ioctl struct
if we ever hit that situation, so no need for an explicit flags field.

Steve


Re: Questions over DSI within DRM.

2021-07-05 Thread Jani Nikula
On Fri, 02 Jul 2021, Laurent Pinchart  wrote:
> On Fri, Jul 02, 2021 at 12:03:31PM +0100, Dave Stevenson wrote:
>> Hi All
>> 
>> I'm trying to get DSI devices working reliably on the Raspberry Pi,
>> but I'm hitting a number of places where it isn't clear as to the
>> expected behaviour within DRM.
>
> Not a surprise. I dread reading the rest of this e-mail though :-)

Indeed.

I'll just note that i915 is *not* a prime example to look at. We
basically just abuse the abstractions as helpers. We have one
parametrized panel driver with both the DSI host and device embedded in
i915, instead of one panel device driver per panel.

>> Power on state. Many devices want the DSI clock and/or data lanes in
>> LP-11 state when they are powered up.
>
> When they are powered up, or when they are enabled ?
>
>> With the normal calling sequence of:
>> - panel/bridge pre_enable calls from connector towards the encoder.
>> - encoder enable which also enables video.
>> - panel/bridge enable calls from encoder to connector.
>> there is no point at which the DSI tx is initialised but not
>> transmitting video. What DSI states are expected to be adopted at each
>> point?
>
> That's undefined I'm afraid, and it should be documented. The upside is
> that you can propose the behaviour that you need :-)
>
>> On a similar theme, some devices want the clock lane in HS mode early
>> so they can use it in place of an external oscillator, but the data
>> lanes still in LP-11. There appears to be no way for the
>> display/bridge to signal this requirement or it be achieved.
>
> You're right. A lng time ago, the omapdrm driver had an internal
> infrastructure that didn't use drm_bridge or drm_panel and instead
> required omapdrm-specific drivers for those components. It used to model
> the display pipeline in a different way than drm_bridge, with the sync
> explicitly setting the source state. A DSI sink could thus control its
> enable sequence, interleaving programming of the sink with control of
> the source.
>
> Migrating omapdrm to the drm_bridge model took a really large effort,
> which makes me believe that transitioning the whole subsystem to
> sink-controlled sources would be close to impossible. We could add
> DSI-specific operations, or add another enable bridge operation
> (post_pre_enable ? :-D). Neither would scale, but it may be enough.

There's also quite a wide range of DSI abstraction in hardware. IIRC in
omapdrm you need to control the LP and ULPS etc. yourself in the driver,
while in i915 the hardware does quite a bunch of stuff for you. Any drm
level abstraction that does not take this into account will get in the
way.

BR,
Jani.

>> host_transfer calls can supposedly be made at any time, however unless
>> MIPI_DSI_MSG_USE_LPM is set in the message then we're meant to send it
>> in high speed mode. If this is before a mode has been set, what
>> defines the link frequency parameters at this point? Adopting a random
>> default sounds like a good way to get undefined behaviour.
>> 
>> DSI burst mode needs to set the DSI link frequency independently of
>> the display mode. How is that meant to be configured? I would have
>> expected it to come from DT due to link frequency often being chosen
>> based on EMC restrictions, but I don't see such a thing in any
>> binding.
>
> Undefined too. DSI support was added to DRM without any design effort,
> it's more a hack than a real solution. The issue with devices that can
> be controlled over both DSI and I2C is completely unhandled. So far
> nobody has really cared about implementing DSI right as far as I can
> tell.
>
>> As a follow on, bridge devices can support burst mode (eg TI's
>> SN65DSI83 that's just been merged), so it needs to know the desired
>> panel timings for the output side of the bridge, but the DSI link
>> timings to set up the bridge's PLL. What's the correct way for
>> signalling that? drm_crtc_state->adjusted_mode vs
>> drm_crtc_state->mode? Except mode is userspace's request, not what has
>> been validated/updated by the panel/bridge.
>
> adjusted_mode is also a bit of a hack, it solves very specific issues,
> and its design assumes a single encoder in the chain with no extra
> bridges. We should instead add modes to the bridge state, and negotiate
> modes along the pipeline the same way we negotiate formats.
>
>> vc4 has constraints that the DSI host interface is fed off an integer
>> divider from a typically 3GHz clock, so the host interface needs to
>> signal that burst mode is in use even if the panel/bridge doesn't need
>> to run in burst mode. (This does mean that displays that require a
>> very precise link frequency can not be supported).
>> It currently updates the adjusted_mode via drm_encoder_helper_funcs
>> mode_fixup, but is that the correct thing to do, or is there a better
>> solution?
>> I'd have expected the DSI tx to be responsible for configuring burst
>> mode parameters anyway, so the mechanism required would seem to be
>> just the nor

Re: [PATCH v4 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-05 Thread Steven Price
On 05/07/2021 09:29, Boris Brezillon wrote:
> Needed to keep VkQueues isolated from each other.
> 
> v4:
> * Make panfrost_ioctl_create_submitqueue() return the queue ID
>   instead of a queue object
> 
> v3:
> * Limit the number of submitqueue per context to 16
> * Fix a deadlock
> 
> Signed-off-by: Boris Brezillon 

Reviewed-by: Steven Price 

> ---
>  drivers/gpu/drm/panfrost/Makefile |   3 +-
>  drivers/gpu/drm/panfrost/panfrost_device.h|   2 +-
>  drivers/gpu/drm/panfrost/panfrost_drv.c   |  69 -
>  drivers/gpu/drm/panfrost/panfrost_job.c   |  47 ++-
>  drivers/gpu/drm/panfrost/panfrost_job.h   |   9 +-
>  .../gpu/drm/panfrost/panfrost_submitqueue.c   | 132 ++
>  .../gpu/drm/panfrost/panfrost_submitqueue.h   |  26 
>  include/uapi/drm/panfrost_drm.h   |  17 +++
>  8 files changed, 260 insertions(+), 45 deletions(-)
>  create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>  create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h
> 
> diff --git a/drivers/gpu/drm/panfrost/Makefile 
> b/drivers/gpu/drm/panfrost/Makefile
> index b71935862417..e99192b66ec9 100644
> --- a/drivers/gpu/drm/panfrost/Makefile
> +++ b/drivers/gpu/drm/panfrost/Makefile
> @@ -9,6 +9,7 @@ panfrost-y := \
>   panfrost_gpu.o \
>   panfrost_job.o \
>   panfrost_mmu.o \
> - panfrost_perfcnt.o
> + panfrost_perfcnt.o \
> + panfrost_submitqueue.o
>  
>  obj-$(CONFIG_DRM_PANFROST) += panfrost.o
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
> b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 8b25278f34c8..51c0ba4e50f5 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -137,7 +137,7 @@ struct panfrost_mmu {
>  struct panfrost_file_priv {
>   struct panfrost_device *pfdev;
>  
> - struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
> + struct idr queues;
>  
>   struct panfrost_mmu *mmu;
>  };
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index b6b5997c9366..8e28ef30310b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -19,6 +19,7 @@
>  #include "panfrost_job.h"
>  #include "panfrost_gpu.h"
>  #include "panfrost_perfcnt.h"
> +#include "panfrost_submitqueue.h"
>  
>  static bool unstable_ioctls;
>  module_param_unsafe(unstable_ioctls, bool, 0600);
> @@ -250,6 +251,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>   struct panfrost_device *pfdev = dev->dev_private;
>   struct drm_panfrost_submit *args = data;
>   struct drm_syncobj *sync_out = NULL;
> + struct panfrost_submitqueue *queue;
>   struct panfrost_job *job;
>   int ret = 0;
>  
> @@ -259,10 +261,16 @@ static int panfrost_ioctl_submit(struct drm_device 
> *dev, void *data,
>   if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
>   return -EINVAL;
>  
> + queue = panfrost_submitqueue_get(file->driver_priv, 0);
> + if (IS_ERR(queue))
> + return PTR_ERR(queue);
> +
>   if (args->out_sync > 0) {
>   sync_out = drm_syncobj_find(file, args->out_sync);
> - if (!sync_out)
> - return -ENODEV;
> + if (!sync_out) {
> + ret = -ENODEV;
> + goto fail_put_queue;
> + }
>   }
>  
>   job = kzalloc(sizeof(*job), GFP_KERNEL);
> @@ -289,7 +297,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>   if (ret)
>   goto fail_job;
>  
> - ret = panfrost_job_push(job);
> + ret = panfrost_job_push(queue, job);
>   if (ret)
>   goto fail_job;
>  
> @@ -302,6 +310,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>  fail_out_sync:
>   if (sync_out)
>   drm_syncobj_put(sync_out);
> +fail_put_queue:
> + panfrost_submitqueue_put(queue);
>  
>   return ret;
>  }
> @@ -451,6 +461,36 @@ static int panfrost_ioctl_madvise(struct drm_device 
> *dev, void *data,
>   return ret;
>  }
>  
> +static int
> +panfrost_ioctl_create_submitqueue(struct drm_device *dev, void *data,
> +   struct drm_file *file_priv)
> +{
> + struct panfrost_file_priv *priv = file_priv->driver_priv;
> + struct drm_panfrost_create_submitqueue *args = data;
> + int ret;
> +
> + ret = panfrost_submitqueue_create(priv, args->priority, args->flags);
> + if (ret < 0)
> + return ret;
> +
> + args->id = ret;
> + return 0;
> +}
> +
> +static int
> +panfrost_ioctl_destroy_submitqueue(struct drm_device *dev, void *data,
> +struct drm_file *file_priv)
> +{
> + struct panfrost_file_priv *priv = file_priv->driver_priv;
> + u32 id = *((u32 *)data);
> +
> + /* Default queue can't be destroyed

[PATCH] drm/nouveau: remove unused varialble "struct device *dev"

2021-07-05 Thread Cai Huoqing
Signed-off-by: Cai Huoqing 
Reviewed-by: Karol Herbst 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 4f3a5357dd56..c5624048de5e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1242,7 +1242,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev,
 {
struct ttm_tt *ttm_dma = (void *)ttm;
struct nouveau_drm *drm;
-   struct device *dev;
bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);

if (ttm_tt_is_populated(ttm))
@@ -1255,7 +1254,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev,
}

drm = nouveau_bdev(bdev);
-   dev = drm->dev->dev;

return ttm_pool_alloc(&drm->ttm.bdev.pool, ttm, ctx);
 }
@@ -1265,14 +1263,12 @@ nouveau_ttm_tt_unpopulate(struct ttm_device *bdev,
  struct ttm_tt *ttm)
 {
struct nouveau_drm *drm;
-   struct device *dev;
bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);

if (slave)
return;

drm = nouveau_bdev(bdev);
-   dev = drm->dev->dev;

return ttm_pool_free(&drm->ttm.bdev.pool, ttm);
 }
--
2.25.1


Re: [PATCH v2] drm/dp_mst: Fix return code on sideband message failure

2021-07-05 Thread Jani Nikula
On Tue, 29 Jun 2021, Kuogee Hsieh  wrote:
> From: Rajkumar Subbiah 
>
> Commit 2f015ec6eab6 ("drm/dp_mst: Add sideband down request tracing +
> selftests") added some debug code for sideband message tracing. But
> it seems to have unintentionally changed the behavior on sideband message
> failure. It catches and returns failure only if DRM_UT_DP is enabled.
> Otherwise it ignores the error code and returns success. So on an MST
> unplug, the caller is unaware that the clear payload message failed and
> ends up waiting for 4 seconds for the response. Fixes the issue by
> returning the proper error code.
>
> Changes in V2:
> -- Revise commit text as review comment
> -- add Fixes text
>
> Fixes: 2f015ec6eab6 ("drm/dp_mst: Add sideband down request tracing + 
> selftests")
>
> Signed-off-by: Rajkumar Subbiah 
> Signed-off-by: Kuogee Hsieh 
>
> Reviewed-by: Stephen Boyd 
> ---
>  drivers/gpu/drm/drm_dp_mst_topology.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
> b/drivers/gpu/drm/drm_dp_mst_topology.c
> index 1590144..8d97430 100644
> --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> @@ -2887,11 +2887,13 @@ static int process_single_tx_qlock(struct 
> drm_dp_mst_topology_mgr *mgr,
>   idx += tosend + 1;
>  
>   ret = drm_dp_send_sideband_msg(mgr, up, chunk, idx);
> - if (unlikely(ret) && drm_debug_enabled(DRM_UT_DP)) {
> - struct drm_printer p = drm_debug_printer(DBG_PREFIX);
> + if (unlikely(ret)) {
> + if (drm_debug_enabled(DRM_UT_DP)) {
> + struct drm_printer p = drm_debug_printer(DBG_PREFIX);
>  
> - drm_printf(&p, "sideband msg failed to send\n");
> - drm_dp_mst_dump_sideband_msg_tx(&p, txmsg);
> + drm_printf(&p, "sideband msg failed to send\n");
> + drm_dp_mst_dump_sideband_msg_tx(&p, txmsg);
> + }
>   return ret;
>   }

Seems like a sensible thing to do.

(I'd probably rip out the "unlikely" while at it, as it feels like
unnecessary optimization, but *shrug*.)

Reviewed-by: Jani Nikula 


-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH 0/4] vkms: Switch to shadow-buffered plane state

2021-07-05 Thread Daniel Vetter
On Mon, Jul 05, 2021 at 09:46:29AM +0200, Thomas Zimmermann wrote:
> Vkms copies each plane's framebuffer into the output buffer; essentially
> using a shadow buffer. DRM provides struct drm_shadow_plane_state, which
> handles the details of mapping/unmapping shadow buffers into memory for
> active planes.
> 
> Convert vkms to the helpers. Makes vkms use shared code and gives more
> test exposure to shadow-plane helpers.
> 
> Thomas Zimmermann (4):
>   drm/gem: Export implementation of shadow-plane helpers
>   drm/vkms: Inherit plane state from struct drm_shadow_plane_state
>   drm/vkms: Let shadow-plane helpers prepare the plane's FB
>   drm/vkms: Use dma-buf mapping from shadow-plane state for composing

So I think right now this fits, but I think it'll mismit going forward: We
don't really have a shadow-plane that we then toss to the hw, it's a
shadow-crtc-area. Right now there's no difference, because we don't
support positioning/scaling the primary plane. But that's all kinda stuff
that's on the table.

But conceptually at least the compositioning buffer should bet part of the
crtc, not of the primary plane.

So not sure what to do, but also coffee hasn't kicked in yet, so maybe I'm
just confused.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-05 Thread Daniel Vetter
On Mon, Jul 05, 2021 at 10:29:48AM +0200, Boris Brezillon wrote:
> This should help limit the number of ioctls when submitting multiple
> jobs. The new ioctl also supports syncobj timelines and BO access flags.
> 
> v4:
> * Implement panfrost_ioctl_submit() as a wrapper around
>   panfrost_submit_job()
> * Replace stride fields by a version field which is mapped to
>   a  tuple internally
> 
> v3:
> * Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the
>   old submit path
> 
> Signed-off-by: Boris Brezillon 
> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 562 
>  drivers/gpu/drm/panfrost/panfrost_job.c |   3 +
>  include/uapi/drm/panfrost_drm.h |  92 
>  3 files changed, 479 insertions(+), 178 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 8e28ef30310b..a624e4f86aff 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -138,184 +138,6 @@ panfrost_get_job_mappings(struct drm_file *file_priv, 
> struct panfrost_job *job)
>   return 0;
>  }
>  
> -/**
> - * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
> - * referenced by the job.
> - * @dev: DRM device
> - * @file_priv: DRM file for this fd
> - * @args: IOCTL args
> - * @job: job being set up
> - *
> - * Resolve handles from userspace to BOs and attach them to job.
> - *
> - * Note that this function doesn't need to unreference the BOs on
> - * failure, because that will happen at panfrost_job_cleanup() time.
> - */
> -static int
> -panfrost_lookup_bos(struct drm_device *dev,
> -   struct drm_file *file_priv,
> -   struct drm_panfrost_submit *args,
> -   struct panfrost_job *job)
> -{
> - unsigned int i;
> - int ret;
> -
> - job->bo_count = args->bo_handle_count;
> -
> - if (!job->bo_count)
> - return 0;
> -
> - job->bo_flags = kvmalloc_array(job->bo_count,
> -sizeof(*job->bo_flags),
> -GFP_KERNEL | __GFP_ZERO);
> - if (!job->bo_flags)
> - return -ENOMEM;
> -
> - for (i = 0; i < job->bo_count; i++)
> - job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
> -
> - ret = drm_gem_objects_lookup(file_priv,
> -  (void __user *)(uintptr_t)args->bo_handles,
> -  job->bo_count, &job->bos);
> - if (ret)
> - return ret;
> -
> - return panfrost_get_job_mappings(file_priv, job);
> -}
> -
> -/**
> - * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
> - * referenced by the job.
> - * @dev: DRM device
> - * @file_priv: DRM file for this fd
> - * @args: IOCTL args
> - * @job: job being set up
> - *
> - * Resolve syncobjs from userspace to fences and attach them to job.
> - *
> - * Note that this function doesn't need to unreference the fences on
> - * failure, because that will happen at panfrost_job_cleanup() time.
> - */
> -static int
> -panfrost_copy_in_sync(struct drm_device *dev,
> -   struct drm_file *file_priv,
> -   struct drm_panfrost_submit *args,
> -   struct panfrost_job *job)
> -{
> - u32 *handles;
> - int ret = 0;
> - int i, in_fence_count;
> -
> - in_fence_count = args->in_sync_count;
> -
> - if (!in_fence_count)
> - return 0;
> -
> - handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
> - if (!handles) {
> - ret = -ENOMEM;
> - DRM_DEBUG("Failed to allocate incoming syncobj handles\n");
> - goto fail;
> - }
> -
> - if (copy_from_user(handles,
> -(void __user *)(uintptr_t)args->in_syncs,
> -in_fence_count * sizeof(u32))) {
> - ret = -EFAULT;
> - DRM_DEBUG("Failed to copy in syncobj handles\n");
> - goto fail;
> - }
> -
> - for (i = 0; i < in_fence_count; i++) {
> - struct dma_fence *fence;
> -
> - ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
> -  &fence);
> - if (ret)
> - goto fail;
> -
> - ret = drm_gem_fence_array_add(&job->deps, fence);
> -
> - if (ret)
> - goto fail;
> - }
> -
> -fail:
> - kvfree(handles);
> - return ret;
> -}
> -
> -static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
> - struct drm_file *file)
> -{
> - struct panfrost_device *pfdev = dev->dev_private;
> - struct drm_panfrost_submit *args = data;
> - struct drm_syncobj *sync_out = NULL;
> - struct panfrost_submitqueue *queue;
> - struct panfrost_job *job;
> - int ret = 0;
> -
> - if (!args->jc)
> - return -EINVAL;
> -
> - if (args->requirements && args->requirement

Re: [PATCH v4 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-05 Thread Steven Price
On 05/07/2021 09:29, Boris Brezillon wrote:
> This should help limit the number of ioctls when submitting multiple
> jobs. The new ioctl also supports syncobj timelines and BO access flags.
> 
> v4:
> * Implement panfrost_ioctl_submit() as a wrapper around
>   panfrost_submit_job()
> * Replace stride fields by a version field which is mapped to
>   a  tuple internally
> 
> v3:
> * Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the
>   old submit path
> 
> Signed-off-by: Boris Brezillon 
> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 562 
>  drivers/gpu/drm/panfrost/panfrost_job.c |   3 +
>  include/uapi/drm/panfrost_drm.h |  92 
>  3 files changed, 479 insertions(+), 178 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 8e28ef30310b..a624e4f86aff 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -138,184 +138,6 @@ panfrost_get_job_mappings(struct drm_file *file_priv, 
> struct panfrost_job *job)
>   return 0;
>  }
>  
> -/**
> - * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
> - * referenced by the job.
> - * @dev: DRM device
> - * @file_priv: DRM file for this fd
> - * @args: IOCTL args
> - * @job: job being set up
> - *
> - * Resolve handles from userspace to BOs and attach them to job.
> - *
> - * Note that this function doesn't need to unreference the BOs on
> - * failure, because that will happen at panfrost_job_cleanup() time.
> - */
> -static int
> -panfrost_lookup_bos(struct drm_device *dev,
> -   struct drm_file *file_priv,
> -   struct drm_panfrost_submit *args,
> -   struct panfrost_job *job)
> -{
> - unsigned int i;
> - int ret;
> -
> - job->bo_count = args->bo_handle_count;
> -
> - if (!job->bo_count)
> - return 0;
> -
> - job->bo_flags = kvmalloc_array(job->bo_count,
> -sizeof(*job->bo_flags),
> -GFP_KERNEL | __GFP_ZERO);
> - if (!job->bo_flags)
> - return -ENOMEM;
> -
> - for (i = 0; i < job->bo_count; i++)
> - job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
> -
> - ret = drm_gem_objects_lookup(file_priv,
> -  (void __user *)(uintptr_t)args->bo_handles,
> -  job->bo_count, &job->bos);
> - if (ret)
> - return ret;
> -
> - return panfrost_get_job_mappings(file_priv, job);
> -}
> -
> -/**
> - * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
> - * referenced by the job.
> - * @dev: DRM device
> - * @file_priv: DRM file for this fd
> - * @args: IOCTL args
> - * @job: job being set up
> - *
> - * Resolve syncobjs from userspace to fences and attach them to job.
> - *
> - * Note that this function doesn't need to unreference the fences on
> - * failure, because that will happen at panfrost_job_cleanup() time.
> - */
> -static int
> -panfrost_copy_in_sync(struct drm_device *dev,
> -   struct drm_file *file_priv,
> -   struct drm_panfrost_submit *args,
> -   struct panfrost_job *job)
> -{
> - u32 *handles;
> - int ret = 0;
> - int i, in_fence_count;
> -
> - in_fence_count = args->in_sync_count;
> -
> - if (!in_fence_count)
> - return 0;
> -
> - handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
> - if (!handles) {
> - ret = -ENOMEM;
> - DRM_DEBUG("Failed to allocate incoming syncobj handles\n");
> - goto fail;
> - }
> -
> - if (copy_from_user(handles,
> -(void __user *)(uintptr_t)args->in_syncs,
> -in_fence_count * sizeof(u32))) {
> - ret = -EFAULT;
> - DRM_DEBUG("Failed to copy in syncobj handles\n");
> - goto fail;
> - }
> -
> - for (i = 0; i < in_fence_count; i++) {
> - struct dma_fence *fence;
> -
> - ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
> -  &fence);
> - if (ret)
> - goto fail;
> -
> - ret = drm_gem_fence_array_add(&job->deps, fence);
> -
> - if (ret)
> - goto fail;
> - }
> -
> -fail:
> - kvfree(handles);
> - return ret;
> -}
> -
> -static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
> - struct drm_file *file)
> -{
> - struct panfrost_device *pfdev = dev->dev_private;
> - struct drm_panfrost_submit *args = data;
> - struct drm_syncobj *sync_out = NULL;
> - struct panfrost_submitqueue *queue;
> - struct panfrost_job *job;
> - int ret = 0;
> -
> - if (!args->jc)
> - return -EINVAL;
> -
> - if (args->requirements && args->requirements != PANFROST_JD_REQ_

Re: Questions over DSI within DRM.

2021-07-05 Thread Jagan Teki
On Fri, Jul 2, 2021 at 10:12 PM Laurent Pinchart
 wrote:
>
> Hi Dave,
>
> (Expanding the CC list a bit)
>
> On Fri, Jul 02, 2021 at 12:03:31PM +0100, Dave Stevenson wrote:
> > Hi All
> >
> > I'm trying to get DSI devices working reliably on the Raspberry Pi,
> > but I'm hitting a number of places where it isn't clear as to the
> > expected behaviour within DRM.
>
> Not a surprise. I dread reading the rest of this e-mail though :-)
>
> > Power on state. Many devices want the DSI clock and/or data lanes in
> > LP-11 state when they are powered up.
>
> When they are powered up, or when they are enabled ?
>
> > With the normal calling sequence of:
> > - panel/bridge pre_enable calls from connector towards the encoder.
> > - encoder enable which also enables video.
> > - panel/bridge enable calls from encoder to connector.
> > there is no point at which the DSI tx is initialised but not
> > transmitting video. What DSI states are expected to be adopted at each
> > point?
>
> That's undefined I'm afraid, and it should be documented. The upside is
> that you can propose the behaviour that you need :-)
>
> > On a similar theme, some devices want the clock lane in HS mode early
> > so they can use it in place of an external oscillator, but the data
> > lanes still in LP-11. There appears to be no way for the
> > display/bridge to signal this requirement or it be achieved.
>
> You're right. A lng time ago, the omapdrm driver had an internal
> infrastructure that didn't use drm_bridge or drm_panel and instead
> required omapdrm-specific drivers for those components. It used to model
> the display pipeline in a different way than drm_bridge, with the sync
> explicitly setting the source state. A DSI sink could thus control its
> enable sequence, interleaving programming of the sink with control of
> the source.
>
> Migrating omapdrm to the drm_bridge model took a really large effort,
> which makes me believe that transitioning the whole subsystem to
> sink-controlled sources would be close to impossible. We could add
> DSI-specific operations, or add another enable bridge operation
> (post_pre_enable ? :-D). Neither would scale, but it may be enough.
>
> > host_transfer calls can supposedly be made at any time, however unless
> > MIPI_DSI_MSG_USE_LPM is set in the message then we're meant to send it
> > in high speed mode. If this is before a mode has been set, what
> > defines the link frequency parameters at this point? Adopting a random
> > default sounds like a good way to get undefined behaviour.
> >
> > DSI burst mode needs to set the DSI link frequency independently of
> > the display mode. How is that meant to be configured? I would have
> > expected it to come from DT due to link frequency often being chosen
> > based on EMC restrictions, but I don't see such a thing in any
> > binding.
>
> Undefined too. DSI support was added to DRM without any design effort,
> it's more a hack than a real solution. The issue with devices that can
> be controlled over both DSI and I2C is completely unhandled. So far
> nobody has really cared about implementing DSI right as far as I can
> tell.
>
> > As a follow on, bridge devices can support burst mode (eg TI's
> > SN65DSI83 that's just been merged), so it needs to know the desired
> > panel timings for the output side of the bridge, but the DSI link
> > timings to set up the bridge's PLL. What's the correct way for
> > signalling that? drm_crtc_state->adjusted_mode vs
> > drm_crtc_state->mode? Except mode is userspace's request, not what has
> > been validated/updated by the panel/bridge.
>
> adjusted_mode is also a bit of a hack, it solves very specific issues,
> and its design assumes a single encoder in the chain with no extra
> bridges. We should instead add modes to the bridge state, and negotiate
> modes along the pipeline the same way we negotiate formats.
>
> > vc4 has constraints that the DSI host interface is fed off an integer
> > divider from a typically 3GHz clock, so the host interface needs to
> > signal that burst mode is in use even if the panel/bridge doesn't need
> > to run in burst mode. (This does mean that displays that require a
> > very precise link frequency can not be supported).
> > It currently updates the adjusted_mode via drm_encoder_helper_funcs
> > mode_fixup, but is that the correct thing to do, or is there a better
> > solution?
> > I'd have expected the DSI tx to be responsible for configuring burst
> > mode parameters anyway, so the mechanism required would seem to be
> > just the normal approach for adopting burst mode if that is defined.
> >
> > Some DSI host interfaces are implemented as bridges, others are
> > encoders. Pro's and con's of each? I suspect I'm just missing the
> > history here.
>
> It's indeed history. drm_encoder can't go away as it has been erronously
> exposed to userspace, but going forward, everything should be a bridge.
> The drm_encoder will still be required, but should just be a dummy,
> representing the ch

Re: [PATCH 0/4] vkms: Switch to shadow-buffered plane state

2021-07-05 Thread Thomas Zimmermann

Hi

Am 05.07.21 um 11:27 schrieb Daniel Vetter:

On Mon, Jul 05, 2021 at 09:46:29AM +0200, Thomas Zimmermann wrote:

Vkms copies each plane's framebuffer into the output buffer; essentially
using a shadow buffer. DRM provides struct drm_shadow_plane_state, which
handles the details of mapping/unmapping shadow buffers into memory for
active planes.

Convert vkms to the helpers. Makes vkms use shared code and gives more
test exposure to shadow-plane helpers.

Thomas Zimmermann (4):
   drm/gem: Export implementation of shadow-plane helpers
   drm/vkms: Inherit plane state from struct drm_shadow_plane_state
   drm/vkms: Let shadow-plane helpers prepare the plane's FB
   drm/vkms: Use dma-buf mapping from shadow-plane state for composing


So I think right now this fits, but I think it'll mismit going forward: We
don't really have a shadow-plane that we then toss to the hw, it's a
shadow-crtc-area. Right now there's no difference, because we don't
support positioning/scaling the primary plane. But that's all kinda stuff
that's on the table.

But conceptually at least the compositioning buffer should bet part of the
crtc, not of the primary plane.

So not sure what to do, but also coffee hasn't kicked in yet, so maybe I'm
just confused.


I'm not sure if I understand your concern. Can you elaborate? The 
compositing output buffer is not affected by this patchset. Only the 
input frambuffers of the planes. Those are shadow buffers. AFAICT the 
composer code memcpy's the primary plane and then blends the other 
planes on top. Supporting transformation of the primary plane doesn't 
really change much wrt to the vmaping of input fbs.


Best regards
Thomas


-Daniel



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] drm/etnaviv: Implement mmap as GEM object function

2021-07-05 Thread Thomas Zimmermann

Hi,

will you review my patch?

Best regards
Thomas

Am 24.06.21 um 12:50 schrieb Lucas Stach:

Am Donnerstag, dem 24.06.2021 um 12:47 +0200 schrieb Thomas Zimmermann:

Hi

Am 24.06.21 um 11:11 schrieb Lucas Stach:

Am Donnerstag, dem 24.06.2021 um 10:58 +0200 schrieb Thomas Zimmermann:

Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

The respective etnaviv functions are being removed. The file_operations
structure fops is now being created by the helper macro
DEFINE_DRM_GEM_FOPS().

Signed-off-by: Thomas Zimmermann 
---
   drivers/gpu/drm/etnaviv/etnaviv_drv.c   | 14 ++
   drivers/gpu/drm/etnaviv/etnaviv_drv.h   |  3 ---
   drivers/gpu/drm/etnaviv/etnaviv_gem.c   | 18 +-
   drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 13 -
   4 files changed, 7 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index f0a07278ad04..7dcc6392792d 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -468,17 +468,7 @@ static const struct drm_ioctl_desc etnaviv_ioctls[] = {
ETNA_IOCTL(PM_QUERY_SIG, pm_query_sig, DRM_RENDER_ALLOW),
   };
   
-static const struct file_operations fops = {

-   .owner  = THIS_MODULE,
-   .open   = drm_open,
-   .release= drm_release,
-   .unlocked_ioctl = drm_ioctl,
-   .compat_ioctl   = drm_compat_ioctl,
-   .poll   = drm_poll,
-   .read   = drm_read,
-   .llseek = no_llseek,
-   .mmap   = etnaviv_gem_mmap,
-};
+DEFINE_DRM_GEM_FOPS(fops);
   
   static const struct drm_driver etnaviv_drm_driver = {

.driver_features= DRIVER_GEM | DRIVER_RENDER,
@@ -487,7 +477,7 @@ static const struct drm_driver etnaviv_drm_driver = {
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import_sg_table = etnaviv_gem_prime_import_sg_table,
-   .gem_prime_mmap = etnaviv_gem_prime_mmap,
+   .gem_prime_mmap = drm_gem_prime_mmap,
   #ifdef CONFIG_DEBUG_FS
.debugfs_init   = etnaviv_debugfs_init,
   #endif
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
index 003288ebd896..049ae87de9be 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
@@ -47,12 +47,9 @@ struct etnaviv_drm_private {
   int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
struct drm_file *file);
   
-int etnaviv_gem_mmap(struct file *filp, struct vm_area_struct *vma);

   int etnaviv_gem_mmap_offset(struct drm_gem_object *obj, u64 *offset);
   struct sg_table *etnaviv_gem_prime_get_sg_table(struct drm_gem_object *obj);
   int etnaviv_gem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map 
*map);
-int etnaviv_gem_prime_mmap(struct drm_gem_object *obj,
-  struct vm_area_struct *vma);
   struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device 
*dev,
struct dma_buf_attachment *attach, struct sg_table *sg);
   int etnaviv_gem_prime_pin(struct drm_gem_object *obj);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index b8fa6ed3dd73..8f1b5af47dd6 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -130,8 +130,7 @@ static int etnaviv_gem_mmap_obj(struct etnaviv_gem_object 
*etnaviv_obj,
   {
pgprot_t vm_page_prot;
   
-	vma->vm_flags &= ~VM_PFNMAP;

-   vma->vm_flags |= VM_MIXEDMAP;
+   vma->vm_flags |= VM_IO | VM_MIXEDMAP | VM_DONTEXPAND | VM_DONTDUMP;


I don't fully understand why this change is needed and the commit log
is silent about it. Excuse my ignorance, but can you please explain the
reasoning behind this change?


Sure, sorry for being brief.

I worked on cleaning up the deprecated gem_prime_* callbacks in struct
drm_driver. These are supposed to be GEM object functions. The only
obsolete gem prime callback in drm_driver is gem_prime_mmap.


Sorry, that's a misunderstanding. I see the justification for the patch
as a whole. I was asking specifically about the hunk above my comment.
Why are the vm_flags changed and how did you come up with this exact
combination of flags?

Regards,
Lucas



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature


Re: [RFC PATCH 03/17] drm/exynos: dsi: Use the drm_panel_bridge API

2021-07-05 Thread Marek Szyprowski
On 04.07.2021 11:02, Jagan Teki wrote:
> Use drm_panel_bridge to replace manual panel and
> bridge_chain handling code.
>
> This makes the driver simpler to allow all components
> in the display pipeline to be treated as bridges by
> cleaning the way to generic connector handling.
>
> Signed-off-by: Jagan Teki 

This breaks Exysos DSI driver operation (Trats board worked fine with 
only patches 1-2):

[    2.540066] exynos4-fb 11c0.fimd: Adding to iommu group 0
[    2.554733] OF: graph: no port node found in /soc/fimd@11c0
[    2.602819] [drm] Exynos DRM: using 11c0.fimd device for DMA 
mapping operations
[    2.609649] exynos-drm exynos-drm: bound 11c0.fimd (ops 
fimd_component_ops)
[    2.632558] exynos-drm exynos-drm: failed to bind 11c8.dsi (ops 
exynos_dsi_component_ops): -22
[    2.642263] exynos-drm exynos-drm: master bind failed: -22
[    2.651017] exynos-drm: probe of exynos-drm failed with error -22

> ---
>   drivers/gpu/drm/exynos/exynos_drm_dsi.c | 167 
>   1 file changed, 23 insertions(+), 144 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
> b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> index d7d60aee465b..24f0b082ac6d 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> @@ -254,9 +254,6 @@ struct exynos_dsi_driver_data {
>   struct exynos_dsi {
>   struct drm_encoder encoder;
>   struct mipi_dsi_host dsi_host;
> - struct drm_connector connector;
> - struct drm_panel *panel;
> - struct list_head bridge_chain;
>   struct drm_bridge bridge;
>   struct drm_bridge *out_bridge;
>   struct drm_device *drm;
> @@ -287,7 +284,6 @@ struct exynos_dsi {
>   };
>   
>   #define host_to_dsi(host) container_of(host, struct exynos_dsi, dsi_host)
> -#define connector_to_dsi(c) container_of(c, struct exynos_dsi, connector)
>   
>   static inline struct exynos_dsi *bridge_to_dsi(struct drm_bridge *b)
>   {
> @@ -1379,7 +1375,6 @@ static void exynos_dsi_unregister_te_irq(struct 
> exynos_dsi *dsi)
>   static void exynos_dsi_bridge_enable(struct drm_bridge *bridge)
>   {
>   struct exynos_dsi *dsi = bridge_to_dsi(bridge);
> - struct drm_bridge *iter;
>   int ret;
>   
>   if (dsi->state & DSIM_STATE_ENABLED)
> @@ -1393,134 +1388,51 @@ static void exynos_dsi_bridge_enable(struct 
> drm_bridge *bridge)
>   
>   dsi->state |= DSIM_STATE_ENABLED;
>   
> - if (dsi->panel) {
> - ret = drm_panel_prepare(dsi->panel);
> - if (ret < 0)
> - goto err_put_sync;
> - } else {
> - list_for_each_entry_reverse(iter, &dsi->bridge_chain,
> - chain_node) {
> - if (iter->funcs->pre_enable)
> - iter->funcs->pre_enable(iter);
> - }
> - }
> -
>   exynos_dsi_set_display_mode(dsi);
>   exynos_dsi_set_display_enable(dsi, true);
>   
> - if (dsi->panel) {
> - ret = drm_panel_enable(dsi->panel);
> - if (ret < 0)
> - goto err_display_disable;
> - } else {
> - list_for_each_entry(iter, &dsi->bridge_chain, chain_node) {
> - if (iter->funcs->enable)
> - iter->funcs->enable(iter);
> - }
> - }
> -
>   dsi->state |= DSIM_STATE_VIDOUT_AVAILABLE;
>   return;
> -
> -err_display_disable:
> - exynos_dsi_set_display_enable(dsi, false);
> - drm_panel_unprepare(dsi->panel);
> -
> -err_put_sync:
> - dsi->state &= ~DSIM_STATE_ENABLED;
> - pm_runtime_put(dsi->dev);
>   }
>   
>   static void exynos_dsi_bridge_disable(struct drm_bridge *bridge)
>   {
>   struct exynos_dsi *dsi = bridge_to_dsi(bridge);
> - struct drm_bridge *iter;
>   
>   if (!(dsi->state & DSIM_STATE_ENABLED))
>   return;
>   
>   dsi->state &= ~DSIM_STATE_VIDOUT_AVAILABLE;
>   
> - drm_panel_disable(dsi->panel);
> -
> - list_for_each_entry_reverse(iter, &dsi->bridge_chain, chain_node) {
> - if (iter->funcs->disable)
> - iter->funcs->disable(iter);
> - }
> -
>   exynos_dsi_set_display_enable(dsi, false);
> - drm_panel_unprepare(dsi->panel);
> -
> - list_for_each_entry(iter, &dsi->bridge_chain, chain_node) {
> - if (iter->funcs->post_disable)
> - iter->funcs->post_disable(iter);
> - }
>   
>   dsi->state &= ~DSIM_STATE_ENABLED;
>   pm_runtime_put_sync(dsi->dev);
>   }
>   
> -static enum drm_connector_status
> -exynos_dsi_detect(struct drm_connector *connector, bool force)
> -{
> - return connector->status;
> -}
> -
> -static void exynos_dsi_connector_destroy(struct drm_connector *connector)
> +static int exynos_dsi_panel_or_bridge(struct exynos_dsi *dsi,
> +   struct device_node *node)
>   {
> - drm_connector_unregister(connector);
> - 

Re: [PATCH] drm/i915: Improve debug Kconfig texts a bit

2021-07-05 Thread Joonas Lahtinen
Quoting Daniel Vetter (2021-07-02 23:17:08)
> We're not consistently recommending these for developers only.
> 
> I stumbled over this due to DRM_I915_LOW_LEVEL_TRACEPOINTS, which was
> added in
> 
> commit 354d036fcf70654cff2e2cbdda54a835d219b9d2
> Author: Tvrtko Ursulin 
> Date:   Tue Feb 21 11:01:42 2017 +
> 
> drm/i915/tracepoints: Add request submit and execute tracepoints
> 
> to "alleviate the performance impact concerns."
> 
> Which is nonsense.

I think that was the original reason why the patch was developed and
it got merged primarily for the latter reason. Unfortunately the patch
commit messages don't always get rewritten.

> Tvrtko and Joonas pointed out on irc that the real (but undocumented
> reason) was stable abi concerns for tracepoints, see
> 
> https://lwn.net/Articles/705270/
> 
> and the specific change that was blocked around tracepoints:
> 
> https://lwn.net/Articles/442113/
> 
> Anyway to make it a notch clearer why we have this Kconfig option
> consistly add the "Recommended for driver developers only." to it and
> all the other debug options we have.
> 
> Cc: Tvrtko Ursulin 
> Cc: Joonas Lahtinen 
> Cc: Matthew Brost 
> Signed-off-by: Daniel Vetter 

Reviewed-by: Joonas Lahtinen 

Regards, Joonas

> ---
>  drivers/gpu/drm/i915/Kconfig.debug | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
> b/drivers/gpu/drm/i915/Kconfig.debug
> index 2ca88072d30f..f27c0b5873f7 100644
> --- a/drivers/gpu/drm/i915/Kconfig.debug
> +++ b/drivers/gpu/drm/i915/Kconfig.debug
> @@ -215,6 +215,8 @@ config DRM_I915_LOW_LEVEL_TRACEPOINTS
>   This provides the ability to precisely monitor engine utilisation
>   and also analyze the request dependency resolving timeline.
>  
> + Recommended for driver developers only.
> +
>   If in doubt, say "N".
>  
>  config DRM_I915_DEBUG_VBLANK_EVADE
> @@ -228,6 +230,8 @@ config DRM_I915_DEBUG_VBLANK_EVADE
>   is exceeded, even if there isn't an actual risk of missing
>   the vblank.
>  
> + Recommended for driver developers only.
> +
>   If in doubt, say "N".
>  
>  config DRM_I915_DEBUG_RUNTIME_PM
> @@ -240,4 +244,6 @@ config DRM_I915_DEBUG_RUNTIME_PM
>   runtime PM functionality. This may introduce overhead during
>   driver loading, suspend and resume operations.
>  
> + Recommended for driver developers only.
> +
>   If in doubt, say "N"
> -- 
> 2.32.0.rc2
> 


Re: [Intel-gfx] [PATCH 01/53] drm/i915: Add "release id" version

2021-07-05 Thread Jani Nikula
On Fri, 02 Jul 2021, Tvrtko Ursulin  wrote:
> On 01/07/2021 21:23, Matt Roper wrote:
>> From: Lucas De Marchi 
>> 
>> Besides the arch version returned by GRAPHICS_VER(), new platforms
>> contain a "release id" to make clear the difference from one platform to
>> another. Although for the first ones we may use them as if they were a
>
> What does "first ones" refer to here?
>
>> major/minor version, that is not true for all platforms: we may have a
>> `release_id == n` that is closer to `n - 2` than to `n - 1`.
>
> Hm this is a bit confusing. Is the sentence simply trying to say that, 
> as the release id number is growing, hw capabilities are not simply 
> accumulating but can be removed as well? Otherwise I am not sure how the 
> user of these macros is supposed to act on this sentence.
>
>> However the release id number is not defined by hardware until we start
>> using the GMD_ID register. For the platforms before that register is
>> useful we will set the values in software and we can set them as we
>> please. So the plan is to set them so we can group different features
>> under a single GRAPHICS_VER_FULL() check.
>> 
>> After GMD_ID is used, the usefulness of a "full version check" will be
>> greatly reduced and will be mostly used for deciding workarounds and a
>> few code paths. So it makes sense to keep it as a separate field from
>> graphics_ver.
>> 
>> Also, currently there is not much use for the release id in media and
>> display, so keep them out.
>> 
>> This is a mix of 2 independent changes: one by me and the other by Matt
>> Roper.
>> 
>> Cc: Matt Roper 
>> Signed-off-by: Lucas De Marchi 
>> Signed-off-by: Matt Roper 
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h  | 6 ++
>>   drivers/gpu/drm/i915/intel_device_info.c | 2 ++
>>   drivers/gpu/drm/i915/intel_device_info.h | 2 ++
>>   3 files changed, 10 insertions(+)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 6dff4ca01241..9639800485b9 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private 
>> *pdev_to_i915(struct pci_dev *pdev)
>>*/
>>   #define IS_GEN(dev_priv, n)(GRAPHICS_VER(dev_priv) == (n))
>>   
>> +#define IP_VER(ver, release)((ver) << 8 | (release))
>> +
>>   #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver)
>> +#define GRAPHICS_VER_FULL(i915) 
>> IP_VER(INTEL_INFO(i915)->graphics_ver, \
>> +   
>> INTEL_INFO(i915)->graphics_ver_release)
>>   #define IS_GRAPHICS_VER(i915, from, until) \
>>  (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
>>   
>>   #define MEDIA_VER(i915)(INTEL_INFO(i915)->media_ver)
>> +#define MEDIA_VER_FULL(i915)
>> IP_VER(INTEL_INFO(i915)->media_ver, \
>> +   
>> INTEL_INFO(i915)->media_ver_release)
>>   #define IS_MEDIA_VER(i915, from, until) \
>>  (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c 
>> b/drivers/gpu/drm/i915/intel_device_info.c
>> index 7eaa92fee421..e8ad14f002c1 100644
>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>> @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct 
>> intel_device_info *info,
>>  struct drm_printer *p)
>>   {
>>  drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
>> +drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release);
>
> I get the VER and VER_FULL in the macros but could 'ver' and 
> 'ver_release' here and in the code simply be renamed to 'ver'/'version' 
> and 'release'? Maybe it is just me but don't think I encountered the 
> term "version release" before.

Just bikeshedding here, but I thought of:

if (info->grapics_ver_release)
drm_printf(p, "graphics_ver: %u.%u\n", info->graphics_ver, 
info->graphics_ver_release);
else
drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);

Also, I thought "x_ver" and "x_ver_release" sounds a bit odd, perhaps
having "x_ver" and "x_rel" is more natural?

Ultimately I think we've historically always sucked at trying to figure
this out up front, so maybe the right answer is merging something and
fixing later...


BR,
Jani.





>
> Regards,
>
> Tvrtko
>
>>  drm_printf(p, "media_ver: %u\n", info->media_ver);
>> +drm_printf(p, "media_ver_release: %u\n", info->media_ver_release);
>>  drm_printf(p, "display_ver: %u\n", info->display.ver);
>>  drm_printf(p, "gt: %d\n", info->gt);
>>  drm_printf(p, "iommu: %s\n", iommu_name());
>> diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
>> b/drivers/gpu/drm/i915/intel_device_info.h
>> index b326aff65cd6..944a5ff4df49 100644
>> --- a/drivers/gpu/drm/i915/intel_d

Re: [RFC PATCH 03/17] drm/exynos: dsi: Use the drm_panel_bridge API

2021-07-05 Thread Jagan Teki
On Mon, Jul 5, 2021 at 5:18 PM Marek Szyprowski
 wrote:
>
> On 04.07.2021 11:02, Jagan Teki wrote:
> > Use drm_panel_bridge to replace manual panel and
> > bridge_chain handling code.
> >
> > This makes the driver simpler to allow all components
> > in the display pipeline to be treated as bridges by
> > cleaning the way to generic connector handling.
> >
> > Signed-off-by: Jagan Teki 
>
> This breaks Exysos DSI driver operation (Trats board worked fine with
> only patches 1-2):
>
> [2.540066] exynos4-fb 11c0.fimd: Adding to iommu group 0
> [2.554733] OF: graph: no port node found in /soc/fimd@11c0
> [2.602819] [drm] Exynos DRM: using 11c0.fimd device for DMA
> mapping operations
> [2.609649] exynos-drm exynos-drm: bound 11c0.fimd (ops
> fimd_component_ops)
> [2.632558] exynos-drm exynos-drm: failed to bind 11c8.dsi (ops
> exynos_dsi_component_ops): -22
> [2.642263] exynos-drm exynos-drm: master bind failed: -22
> [2.651017] exynos-drm: probe of exynos-drm failed with error -22

Thanks for testing it.

Can you check Squash of 3,4 or 3,4,5 will work or not?

Thanks,
Jagan.


Re: [RFC PATCH 03/17] drm/exynos: dsi: Use the drm_panel_bridge API

2021-07-05 Thread Marek Szyprowski
On 05.07.2021 14:00, Jagan Teki wrote:
> On Mon, Jul 5, 2021 at 5:18 PM Marek Szyprowski
>  wrote:
>> On 04.07.2021 11:02, Jagan Teki wrote:
>>> Use drm_panel_bridge to replace manual panel and
>>> bridge_chain handling code.
>>>
>>> This makes the driver simpler to allow all components
>>> in the display pipeline to be treated as bridges by
>>> cleaning the way to generic connector handling.
>>>
>>> Signed-off-by: Jagan Teki 
>> This breaks Exysos DSI driver operation (Trats board worked fine with
>> only patches 1-2):
>>
>> [2.540066] exynos4-fb 11c0.fimd: Adding to iommu group 0
>> [2.554733] OF: graph: no port node found in /soc/fimd@11c0
>> [2.602819] [drm] Exynos DRM: using 11c0.fimd device for DMA
>> mapping operations
>> [2.609649] exynos-drm exynos-drm: bound 11c0.fimd (ops
>> fimd_component_ops)
>> [2.632558] exynos-drm exynos-drm: failed to bind 11c8.dsi (ops
>> exynos_dsi_component_ops): -22
>> [2.642263] exynos-drm exynos-drm: master bind failed: -22
>> [2.651017] exynos-drm: probe of exynos-drm failed with error -22
> Thanks for testing it.
>
> Can you check Squash of 3,4 or 3,4,5 will work or not?

I've check both sets: 1-4 and 1-5 and none of them works. The result is 
same as above. If I remember correctly, last time when I played with 
that code, there was a problem with DRM core calling bridge ops in 
different order than when they are used by the Exynos DSI driver.

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: [RFC PATCH 03/17] drm/exynos: dsi: Use the drm_panel_bridge API

2021-07-05 Thread Jagan Teki
On Mon, Jul 5, 2021 at 5:43 PM Marek Szyprowski
 wrote:
>
> On 05.07.2021 14:00, Jagan Teki wrote:
> > On Mon, Jul 5, 2021 at 5:18 PM Marek Szyprowski
> >  wrote:
> >> On 04.07.2021 11:02, Jagan Teki wrote:
> >>> Use drm_panel_bridge to replace manual panel and
> >>> bridge_chain handling code.
> >>>
> >>> This makes the driver simpler to allow all components
> >>> in the display pipeline to be treated as bridges by
> >>> cleaning the way to generic connector handling.
> >>>
> >>> Signed-off-by: Jagan Teki 
> >> This breaks Exysos DSI driver operation (Trats board worked fine with
> >> only patches 1-2):
> >>
> >> [2.540066] exynos4-fb 11c0.fimd: Adding to iommu group 0
> >> [2.554733] OF: graph: no port node found in /soc/fimd@11c0
> >> [2.602819] [drm] Exynos DRM: using 11c0.fimd device for DMA
> >> mapping operations
> >> [2.609649] exynos-drm exynos-drm: bound 11c0.fimd (ops
> >> fimd_component_ops)
> >> [2.632558] exynos-drm exynos-drm: failed to bind 11c8.dsi (ops
> >> exynos_dsi_component_ops): -22
> >> [2.642263] exynos-drm exynos-drm: master bind failed: -22
> >> [2.651017] exynos-drm: probe of exynos-drm failed with error -22
> > Thanks for testing it.
> >
> > Can you check Squash of 3,4 or 3,4,5 will work or not?
>
> I've check both sets: 1-4 and 1-5 and none of them works. The result is
> same as above. If I remember correctly, last time when I played with
> that code, there was a problem with DRM core calling bridge ops in
> different order than when they are used by the Exynos DSI driver.

Okay. Let me check with sun6i-mipi-dsi as it is component_ops based.

Jagan.


[PATCH 00/12] mgag200: Refactor PLL setup

2021-07-05 Thread Thomas Zimmermann
Split the PLL setup code into computation and update functions; compute
the PLL values during atomic checks, update the PLL during atomic commits;
cleanup the whole thing.

The current PLL setup code for mgag200 mixes up computation if the PLL
values and programming the HW. Both is done during atomic commits. As the
computation phase can fail, the patch splits the functions and moves
the computation to atomic-check phase. The PLL values are stores as part
of the CRTC's atomic state.

As the PLL code is currently unmaintainable, apply various cleanups. For
example, split functions that handle multiple HW revisions, constify values,
move compute and update code to distict locations, and unify the
representation of the PLL's values.

Tested on G200EH by setting modes in Weston, fbdev, and Xorg. Further
testing is welcome.

Thomas Zimmermann (12):
  drm/mgag200: Select clock in PLL update functions
  drm/mgag200: Return errno codes from PLL compute functions
  drm/mgag200: Remove P_ARRAY_SIZE
  drm/mgag200: Split PLL setup into compute and update functions
  drm/mgag200: Introduce separate variable for PLL S parameter
  drm/mgag200: Store values (not bits) in struct mgag200_pll_values
  drm/mgag200: Split several PLL functions by device type
  drm/mgag200: Separate PLL compute and update functions from each other
  drm/mgag200: Split PLL computation for G200SE
  drm/mgag200: Declare PLL clock constants static const
  drm/mgag200: Introduce custom CRTC state
  drm/mgag200: Compute PLL values during atomic check

 drivers/gpu/drm/drm_simple_kms_helper.c |  39 +-
 drivers/gpu/drm/mgag200/mgag200_drv.h   |  28 +
 drivers/gpu/drm/mgag200/mgag200_mode.c  | 965 +++-
 include/drm/drm_simple_kms_helper.h |  27 +
 4 files changed, 720 insertions(+), 339 deletions(-)

--
2.32.0



[PATCH 01/12] drm/mgag200: Select clock in PLL update functions

2021-07-05 Thread Thomas Zimmermann
Put the clock-selection code into each of the PLL-update functions to
make them select the correct pixel clock.

The pixel clock for video output was not actually set before programming
the clock's values. It worked because the device had the correct clock
pre-set.

Signed-off-by: Thomas Zimmermann 
Fixes: db05f8d3dc87 ("drm/mgag200: Split MISC register update into PLL 
selection, SYNC and I/O")
Cc: Sam Ravnborg 
Cc: Emil Velikov 
Cc: Dave Airlie 
Cc: dri-devel@lists.freedesktop.org
Cc:  # v5.9+
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 47 --
 1 file changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 3b3059f471c2..482843ebb69f 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -130,6 +130,7 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, 
long clock)
long ref_clk = mdev->model.g200.ref_clk;
long p_clk_min = mdev->model.g200.pclk_min;
long p_clk_max =  mdev->model.g200.pclk_max;
+   u8 misc;
 
if (clock > p_clk_max) {
drm_err(dev, "Pixel Clock %ld too high\n", clock);
@@ -174,6 +175,11 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, 
long clock)
drm_dbg_kms(dev, "clock: %ld vco: %ld m: %d n: %d p: %d s: %d\n",
clock, f_vco, m, n, p, s);
 
+   misc = RREG8(MGA_MISC_IN);
+   misc &= ~MGAREG_MISC_CLK_SEL_MASK;
+   misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
+   WREG8(MGA_MISC_OUT, misc);
+
WREG_DAC(MGA1064_PIX_PLLC_M, m);
WREG_DAC(MGA1064_PIX_PLLC_N, n);
WREG_DAC(MGA1064_PIX_PLLC_P, (p | (s << 3)));
@@ -194,6 +200,7 @@ static int mga_g200se_set_plls(struct mga_device *mdev, 
long clock)
unsigned int pvalues_e4[P_ARRAY_SIZE] = {16, 14, 12, 10, 8, 6, 4, 2, 1};
unsigned int fvv;
unsigned int i;
+   u8 misc;
 
if (unique_rev_id <= 0x03) {
 
@@ -289,6 +296,11 @@ static int mga_g200se_set_plls(struct mga_device *mdev, 
long clock)
return 1;
}
 
+   misc = RREG8(MGA_MISC_IN);
+   misc &= ~MGAREG_MISC_CLK_SEL_MASK;
+   misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
+   WREG8(MGA_MISC_OUT, misc);
+
WREG_DAC(MGA1064_PIX_PLLC_M, m);
WREG_DAC(MGA1064_PIX_PLLC_N, n);
WREG_DAC(MGA1064_PIX_PLLC_P, p);
@@ -312,7 +324,7 @@ static int mga_g200wb_set_plls(struct mga_device *mdev, 
long clock)
unsigned int computed;
int i, j, tmpcount, vcount;
bool pll_locked = false;
-   u8 tmp;
+   u8 tmp, misc;
 
m = n = p = 0;
 
@@ -385,6 +397,11 @@ static int mga_g200wb_set_plls(struct mga_device *mdev, 
long clock)
}
}
 
+   misc = RREG8(MGA_MISC_IN);
+   misc &= ~MGAREG_MISC_CLK_SEL_MASK;
+   misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
+   WREG8(MGA_MISC_OUT, misc);
+
for (i = 0; i <= 32 && pll_locked == false; i++) {
if (i > 0) {
WREG8(MGAREG_CRTC_INDEX, 0x1e);
@@ -489,7 +506,7 @@ static int mga_g200ev_set_plls(struct mga_device *mdev, 
long clock)
unsigned int testp, testm, testn;
unsigned int p, m, n;
unsigned int computed;
-   u8 tmp;
+   u8 tmp, misc;
 
m = n = p = 0;
vcomax = 55;
@@ -522,6 +539,11 @@ static int mga_g200ev_set_plls(struct mga_device *mdev, 
long clock)
}
}
 
+   misc = RREG8(MGA_MISC_IN);
+   misc &= ~MGAREG_MISC_CLK_SEL_MASK;
+   misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
+   WREG8(MGA_MISC_OUT, misc);
+
WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL);
tmp = RREG8(DAC_DATA);
tmp |= MGA1064_PIX_CLK_CTL_CLK_DIS;
@@ -583,7 +605,7 @@ static int mga_g200eh_set_plls(struct mga_device *mdev, 
long clock)
unsigned int p, m, n;
unsigned int computed;
int i, j, tmpcount, vcount;
-   u8 tmp;
+   u8 tmp, misc;
bool pll_locked = false;
 
m = n = p = 0;
@@ -654,6 +676,12 @@ static int mga_g200eh_set_plls(struct mga_device *mdev, 
long clock)
}
}
}
+
+   misc = RREG8(MGA_MISC_IN);
+   misc &= ~MGAREG_MISC_CLK_SEL_MASK;
+   misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
+   WREG8(MGA_MISC_OUT, misc);
+
for (i = 0; i <= 32 && pll_locked == false; i++) {
WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL);
tmp = RREG8(DAC_DATA);
@@ -714,6 +742,7 @@ static int mga_g200er_set_plls(struct mga_device *mdev, 
long clock)
unsigned int p, m, n;
unsigned int computed, vco;
int tmp;
+   u8 misc;
 
m = n = p = 0;
vcomax = 1488000;
@@ -754,6 +783,11 @@ static int mga_g200er_set_plls(struct mga_device *mdev, 
long clock)
}
}
 
+   misc = RREG8(MGA_MISC_IN);
+   misc &= ~MGAREG_MISC_CLK_SEL_MASK;
+   misc |= MGAREG_MISC_CLK_SEL_MGA_MSK

[PATCH 02/12] drm/mgag200: Return errno codes from PLL compute functions

2021-07-05 Thread Thomas Zimmermann
Return -EINVAL if there's no PLL configuration for the given pixel
clock.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 482843ebb69f..045a20055515 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -134,7 +134,7 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, 
long clock)
 
if (clock > p_clk_max) {
drm_err(dev, "Pixel Clock %ld too high\n", clock);
-   return 1;
+   return -EINVAL;
}
 
if (clock < p_clk_min >> 3)
@@ -293,7 +293,7 @@ static int mga_g200se_set_plls(struct mga_device *mdev, 
long clock)
 
if (delta > permitteddelta) {
pr_warn("PLL delta too large\n");
-   return 1;
+   return -EINVAL;
}
 
misc = RREG8(MGA_MISC_IN);
-- 
2.32.0



[PATCH 05/12] drm/mgag200: Introduce separate variable for PLL S parameter

2021-07-05 Thread Thomas Zimmermann
The S parameter is controls the loop filter bandwidth when programming
the PLL. It's currently stored as part of P (i.e., the clock divider.)
Add a separate variable for S prepares the PLL code for a further
refactoring.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 36 --
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 961bd128fea3..9f6fe7673674 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -210,18 +210,17 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta, permitteddelta;
unsigned int testp, testm, testn;
-   unsigned int p, m, n;
+   unsigned int p, m, n, s;
unsigned int computed;
unsigned int fvv;
unsigned int i;
 
-   if (unique_rev_id <= 0x03) {
+   m = n = p = s = 0;
 
-   m = n = p = 0;
+   if (unique_rev_id <= 0x03) {
vcomax = 32;
vcomin = 16;
pllreffreq = 25000;
-
delta = 0x;
permitteddelta = clock * 5 / 1000;
 
@@ -249,9 +248,6 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
}
}
} else {
-
-
-   m = n = p = 0;
vcomax= 160;
vcomin= 80;
pllreffreq= 25000;
@@ -312,7 +308,7 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
pixpllc->m = m;
pixpllc->n = n;
pixpllc->p = p;
-   pixpllc->s = 0;
+   pixpllc->s = s;
 
return 0;
 }
@@ -348,10 +344,10 @@ static int mgag200_compute_pixpll_values_g200wb(struct 
mga_device *mdev, long cl
unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta;
unsigned int testp, testm, testn, testp2;
-   unsigned int p, m, n;
+   unsigned int p, m, n, s;
unsigned int computed;
 
-   m = n = p = 0;
+   m = n = p = s = 0;
 
delta = 0x;
 
@@ -425,7 +421,7 @@ static int mgag200_compute_pixpll_values_g200wb(struct 
mga_device *mdev, long cl
pixpllc->m = m;
pixpllc->n = n;
pixpllc->p = p;
-   pixpllc->s = 0;
+   pixpllc->s = s;
 
return 0;
 }
@@ -549,10 +545,10 @@ static int mgag200_compute_pixpll_values_g200ev(struct 
mga_device *mdev, long cl
unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta;
unsigned int testp, testm, testn;
-   unsigned int p, m, n;
+   unsigned int p, m, n, s;
unsigned int computed;
 
-   m = n = p = 0;
+   m = n = p = s = 0;
vcomax = 55;
vcomin = 15;
pllreffreq = 5;
@@ -586,7 +582,7 @@ static int mgag200_compute_pixpll_values_g200ev(struct 
mga_device *mdev, long cl
pixpllc->m = m;
pixpllc->n = n;
pixpllc->p = p;
-   pixpllc->s = 0;
+   pixpllc->s = s;
 
return 0;
 }
@@ -662,10 +658,10 @@ static int mgag200_compute_pixpll_values_g200eh(struct 
mga_device *mdev, long cl
unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta;
unsigned int testp, testm, testn;
-   unsigned int p, m, n;
+   unsigned int p, m, n, s;
unsigned int computed;
 
-   m = n = p = 0;
+   m = n = p = s = 0;
 
if (mdev->type == G200_EH3) {
vcomax = 300;
@@ -737,7 +733,7 @@ static int mgag200_compute_pixpll_values_g200eh(struct 
mga_device *mdev, long cl
pixpllc->m = m;
pixpllc->n = n;
pixpllc->p = p;
-   pixpllc->s = 0;
+   pixpllc->s = s;
 
return 0;
 }
@@ -814,10 +810,10 @@ static int mgag200_compute_pixpll_values_g200er(struct 
mga_device *mdev, long cl
unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta;
int testr, testn, testm, testo;
-   unsigned int p, m, n;
+   unsigned int p, m, n, s;
unsigned int computed, vco;
 
-   m = n = p = 0;
+   m = n = p = s = 0;
vcomax = 1488000;
vcomin = 1056000;
pllreffreq = 48000;
@@ -859,7 +855,7 @@ static int mgag200_compute_pixpll_values_g200er(struct 
mga_device *mdev, long cl
pixpllc->m = m;
pixpllc->n = n;
pixpllc->p = p;
-   pixpllc->s = 0;
+   pixpllc->s = s;
 
return 0;
 }
-- 
2.32.0



[PATCH 06/12] drm/mgag200: Store values (not bits) in struct mgag200_pll_values

2021-07-05 Thread Thomas Zimmermann
The fields in struct mgag200_pll_values currently hold the bits of
each register. Store the PLL values instead and let the PLL-update
code figure out the bits for each register.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 153 +++--
 1 file changed, 91 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 9f6fe7673674..7d6707bd6c27 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -123,7 +123,7 @@ static int mgag200_compute_pixpll_values_g200(struct 
mga_device *mdev, long cloc
const int in_div_max = 6;
const int feed_div_min = 7;
const int feed_div_max = 127;
-   u8 testm, testn;
+   u8 testp, testm, testn;
u8 n = 0, m = 0, p, s;
long f_vco;
long computed;
@@ -141,10 +141,11 @@ static int mgag200_compute_pixpll_values_g200(struct 
mga_device *mdev, long cloc
clock = p_clk_min >> 3;
 
f_vco = clock;
-   for (p = 0;
-p <= post_div_max && f_vco < p_clk_min;
-p = (p << 1) + 1, f_vco <<= 1)
+   for (testp = 0;
+testp <= post_div_max && f_vco < p_clk_min;
+testp = (testp << 1) + 1, f_vco <<= 1)
;
+   p = testp + 1;
 
delta = clock;
 
@@ -157,12 +158,12 @@ static int mgag200_compute_pixpll_values_g200(struct 
mga_device *mdev, long cloc
tmp_delta = computed - f_vco;
if (tmp_delta < delta) {
delta = tmp_delta;
-   m = testm;
-   n = testn;
+   m = testm + 1;
+   n = testn + 1;
}
}
}
-   f_vco = ref_clk * (n + 1) / (m + 1);
+   f_vco = ref_clk * n / m;
if (f_vco < 10)
s = 0;
else if (f_vco < 14)
@@ -186,6 +187,7 @@ static int mgag200_compute_pixpll_values_g200(struct 
mga_device *mdev, long cloc
 static void mgag200_set_pixpll_g200(struct mga_device *mdev,
const struct mgag200_pll_values *pixpllc)
 {
+   unsigned int pixpllcm, pixpllcn, pixpllcp, pixpllcs;
u8 misc, xpixpllcm, xpixpllcn, xpixpllcp;
 
misc = RREG8(MGA_MISC_IN);
@@ -193,9 +195,14 @@ static void mgag200_set_pixpll_g200(struct mga_device 
*mdev,
misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
WREG8(MGA_MISC_OUT, misc);
 
-   xpixpllcm = pixpllc->m;
-   xpixpllcn = pixpllc->n;
-   xpixpllcp = pixpllc->p | (pixpllc->s << 3);
+   pixpllcm = pixpllc->m - 1;
+   pixpllcn = pixpllc->n - 1;
+   pixpllcp = pixpllc->p - 1;
+   pixpllcs = pixpllc->s;
+
+   xpixpllcm = pixpllcm;
+   xpixpllcn = pixpllcn;
+   xpixpllcp = (pixpllcs << 3) | pixpllcp;
WREG_DAC(MGA1064_PIX_PLLC_M, xpixpllcm);
WREG_DAC(MGA1064_PIX_PLLC_N, xpixpllcn);
WREG_DAC(MGA1064_PIX_PLLC_P, xpixpllcp);
@@ -240,9 +247,9 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
tmpdelta = clock - computed;
if (tmpdelta < delta) {
delta = tmpdelta;
-   m = testm - 1;
-   n = testn - 1;
-   p = testp - 1;
+   m = testm;
+   n = testn;
+   p = testp;
}
}
}
@@ -280,22 +287,19 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
 
if (tmpdelta < delta) {
delta = tmpdelta;
-   m = testm - 1;
-   n = testn - 1;
-   p = testp - 1;
+   m = testm;
+   n = testn;
+   p = testp;
}
}
}
}
 
-   fvv = pllreffreq * (n + 1) / (m + 1);
+   fvv = pllreffreq * n / m;
fvv = (fvv - 80) / 5;
-
if (fvv > 15)
fvv = 15;
-
-   p |= (fvv << 4);
-   m |= 0x80;
+   s = fvv << 1;
 
clock = clock / 2;
}
@@ -317,6 +321,7 @@ static vo

[PATCH 09/12] drm/mgag200: Split PLL computation for G200SE

2021-07-05 Thread Thomas Zimmermann
The compute function for G200SE pixle PLLs handles two revisions with
different algorithms. Split it accordingly to make it readable.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 165 +++--
 1 file changed, 97 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 72fdf242cac7..99b35e4f9353 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -184,100 +184,118 @@ static int mgag200_compute_pixpll_values_g200(struct 
mga_device *mdev, long cloc
return 0;
 }
 
-static int mgag200_compute_pixpll_values_g200se(struct mga_device *mdev, long 
clock,
-   struct mgag200_pll_values 
*pixpllc)
+static int mgag200_compute_pixpll_values_g200se_00(struct mga_device *mdev, 
long clock,
+  struct mgag200_pll_values 
*pixpllc)
 {
-   static const unsigned int pvalues_e4[] = {16, 14, 12, 10, 8, 6, 4, 2, 
1};
-
-   u32 unique_rev_id = mdev->model.g200se.unique_rev_id;
unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta, permitteddelta;
unsigned int testp, testm, testn;
unsigned int p, m, n, s;
unsigned int computed;
-   unsigned int fvv;
-   unsigned int i;
 
m = n = p = s = 0;
delta = 0x;
 
-   if (unique_rev_id <= 0x03) {
-   vcomax = 32;
-   vcomin = 16;
-   pllreffreq = 25000;
-   permitteddelta = clock * 5 / 1000;
+   vcomax = 32;
+   vcomin = 16;
+   pllreffreq = 25000;
+   permitteddelta = clock * 5 / 1000;
 
-   for (testp = 8; testp > 0; testp /= 2) {
-   if (clock * testp > vcomax)
-   continue;
-   if (clock * testp < vcomin)
-   continue;
+   for (testp = 8; testp > 0; testp /= 2) {
+   if (clock * testp > vcomax)
+   continue;
+   if (clock * testp < vcomin)
+   continue;
 
-   for (testn = 17; testn < 256; testn++) {
-   for (testm = 1; testm < 32; testm++) {
-   computed = (pllreffreq * testn) /
-   (testm * testp);
-   if (computed > clock)
-   tmpdelta = computed - clock;
-   else
-   tmpdelta = clock - computed;
-   if (tmpdelta < delta) {
-   delta = tmpdelta;
-   m = testm;
-   n = testn;
-   p = testp;
-   }
+   for (testn = 17; testn < 256; testn++) {
+   for (testm = 1; testm < 32; testm++) {
+   computed = (pllreffreq * testn) / (testm * 
testp);
+   if (computed > clock)
+   tmpdelta = computed - clock;
+   else
+   tmpdelta = clock - computed;
+   if (tmpdelta < delta) {
+   delta = tmpdelta;
+   m = testm;
+   n = testn;
+   p = testp;
}
}
}
-   } else {
-   vcomax= 160;
-   vcomin= 80;
-   pllreffreq= 25000;
+   }
 
-   if (clock < 25000)
-   clock = 25000;
-   clock = clock * 2;
+   if (delta > permitteddelta) {
+   pr_warn("PLL delta too large\n");
+   return -EINVAL;
+   }
 
-   /* Permited delta is 0.5% as VESA Specification */
-   permitteddelta = clock * 5 / 1000;
+   pixpllc->m = m;
+   pixpllc->n = n;
+   pixpllc->p = p;
+   pixpllc->s = s;
 
-   for (i = 0 ; i < ARRAY_SIZE(pvalues_e4); i++) {
-   testp = pvalues_e4[i];
+   return 0;
+}
 
-   if ((clock * testp) > vcomax)
-   continue;
-   if ((clock * testp) < vcomin)
-   continue;
+static int mgag200_compute_pixpll_values_g200se_04(struct mga_device *mdev, 
long clock,
+  struct mgag200_pll_values 
*pi

[PATCH 08/12] drm/mgag200: Separate PLL compute and update functions from each other

2021-07-05 Thread Thomas Zimmermann
Move PLL compute and update functionality to distict places within
the file. Contains some minor style cleanups, but no functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 737 +
 1 file changed, 369 insertions(+), 368 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index d3b15e60f057..72fdf242cac7 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -184,30 +184,6 @@ static int mgag200_compute_pixpll_values_g200(struct 
mga_device *mdev, long cloc
return 0;
 }
 
-static void mgag200_set_pixpll_g200(struct mga_device *mdev,
-   const struct mgag200_pll_values *pixpllc)
-{
-   unsigned int pixpllcm, pixpllcn, pixpllcp, pixpllcs;
-   u8 misc, xpixpllcm, xpixpllcn, xpixpllcp;
-
-   misc = RREG8(MGA_MISC_IN);
-   misc &= ~MGAREG_MISC_CLK_SEL_MASK;
-   misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
-   WREG8(MGA_MISC_OUT, misc);
-
-   pixpllcm = pixpllc->m - 1;
-   pixpllcn = pixpllc->n - 1;
-   pixpllcp = pixpllc->p - 1;
-   pixpllcs = pixpllc->s;
-
-   xpixpllcm = pixpllcm;
-   xpixpllcn = pixpllcn;
-   xpixpllcp = (pixpllcs << 3) | pixpllcp;
-   WREG_DAC(MGA1064_PIX_PLLC_M, xpixpllcm);
-   WREG_DAC(MGA1064_PIX_PLLC_N, xpixpllcn);
-   WREG_DAC(MGA1064_PIX_PLLC_P, xpixpllcp);
-}
-
 static int mgag200_compute_pixpll_values_g200se(struct mga_device *mdev, long 
clock,
struct mgag200_pll_values 
*pixpllc)
 {
@@ -223,12 +199,12 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
unsigned int i;
 
m = n = p = s = 0;
+   delta = 0x;
 
if (unique_rev_id <= 0x03) {
vcomax = 32;
vcomin = 16;
pllreffreq = 25000;
-   delta = 0x;
permitteddelta = clock * 5 / 1000;
 
for (testp = 8; testp > 0; testp /= 2) {
@@ -261,10 +237,8 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
 
if (clock < 25000)
clock = 25000;
-
clock = clock * 2;
 
-   delta = 0x;
/* Permited delta is 0.5% as VESA Specification */
permitteddelta = clock * 5 / 1000;
 
@@ -317,38 +291,55 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
return 0;
 }
 
-static void mgag200_set_pixpll_g200se(struct mga_device *mdev,
- const struct mgag200_pll_values *pixpllc)
+static int mgag200_compute_pixpll_values_g200wb(struct mga_device *mdev, long 
clock,
+   struct mgag200_pll_values 
*pixpllc)
 {
-   u32 unique_rev_id = mdev->model.g200se.unique_rev_id;
-   unsigned int pixpllcm, pixpllcn, pixpllcp, pixpllcs;
-   u8 misc, xpixpllcm, xpixpllcn, xpixpllcp;
+   unsigned int vcomax, vcomin, pllreffreq;
+   unsigned int delta, tmpdelta;
+   unsigned int testp, testm, testn;
+   unsigned int p, m, n, s;
+   unsigned int computed;
 
-   misc = RREG8(MGA_MISC_IN);
-   misc &= ~MGAREG_MISC_CLK_SEL_MASK;
-   misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
-   WREG8(MGA_MISC_OUT, misc);
+   m = n = p = s = 0;
+   delta = 0x;
 
-   pixpllcm = pixpllc->m - 1;
-   pixpllcn = pixpllc->n - 1;
-   pixpllcp = pixpllc->p - 1;
-   pixpllcs = pixpllc->s;
+   vcomax = 55;
+   vcomin = 15;
+   pllreffreq = 48000;
 
-   xpixpllcm = pixpllcm | ((pixpllcn & BIT(8)) >> 1);
-   xpixpllcn = pixpllcn;
-   xpixpllcp = (pixpllcs << 3) | pixpllcp;
-   WREG_DAC(MGA1064_PIX_PLLC_M, xpixpllcm);
-   WREG_DAC(MGA1064_PIX_PLLC_N, xpixpllcn);
-   WREG_DAC(MGA1064_PIX_PLLC_P, xpixpllcp);
+   for (testp = 1; testp < 9; testp++) {
+   if (clock * testp > vcomax)
+   continue;
+   if (clock * testp < vcomin)
+   continue;
 
-   if (unique_rev_id >= 0x04) {
-   WREG_DAC(0x1a, 0x09);
-   msleep(20);
-   WREG_DAC(0x1a, 0x01);
+   for (testm = 1; testm < 17; testm++) {
+   for (testn = 1; testn < 151; testn++) {
+   computed = (pllreffreq * testn) / (testm * 
testp);
+   if (computed > clock)
+   tmpdelta = computed - clock;
+   else
+   tmpdelta = clock - computed;
+   if (tmpdelta < delta) {
+   delta = tmpdelta;
+   n = testn;
+ 

[PATCH 12/12] drm/mgag200: Compute PLL values during atomic check

2021-07-05 Thread Thomas Zimmermann
PLL setup can fail if the display mode's clock is not supported by
any PLL configuration. Compute the PLL values during atomic check, so
that atomic commits can fail at the appropriate time. If successful,
use the values in the atomic-update phase.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_drv.h  |  2 ++
 drivers/gpu/drm/mgag200/mgag200_mode.c | 20 +---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h 
b/drivers/gpu/drm/mgag200/mgag200_drv.h
index c813d6c15f70..6473e9c037d0 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -129,6 +129,8 @@ struct mgag200_pll_values {
 
 struct mgag200_crtc_state {
struct drm_crtc_state base;
+
+   struct mgag200_pll_values pixpll;
 };
 
 static inline struct mgag200_crtc_state *to_mgag200_crtc_state(struct 
drm_crtc_state *base)
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 6a5c419c6641..55c4f76175bd 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -1802,6 +1802,7 @@ mgag200_simple_display_pipe_enable(struct 
drm_simple_display_pipe *pipe,
struct drm_device *dev = crtc->dev;
struct mga_device *mdev = to_mga_device(dev);
struct drm_display_mode *adjusted_mode = &crtc_state->adjusted_mode;
+   struct mgag200_crtc_state *mgag200_crtc_state = 
to_mgag200_crtc_state(crtc_state);
struct drm_framebuffer *fb = plane_state->fb;
struct drm_shadow_plane_state *shadow_plane_state = 
to_drm_shadow_plane_state(plane_state);
struct drm_rect fullscreen = {
@@ -1810,15 +1811,13 @@ mgag200_simple_display_pipe_enable(struct 
drm_simple_display_pipe *pipe,
.y1 = 0,
.y2 = fb->height,
};
-   struct mgag200_pll_values pixpll;
 
if (mdev->type == G200_WB || mdev->type == G200_EW3)
mgag200_g200wb_hold_bmc(mdev);
 
mgag200_set_format_regs(mdev, fb);
mgag200_set_mode_regs(mdev, adjusted_mode);
-   mgag200_compute_pixpll_values(mdev, adjusted_mode->clock, &pixpll);
-   mgag200_set_pixpll(mdev, &pixpll);
+   mgag200_set_pixpll(mdev, &mgag200_crtc_state->pixpll);
 
if (mdev->type == G200_ER)
mgag200_g200er_reset_tagfifo(mdev);
@@ -1852,8 +1851,12 @@ mgag200_simple_display_pipe_check(struct 
drm_simple_display_pipe *pipe,
  struct drm_crtc_state *crtc_state)
 {
struct drm_plane *plane = plane_state->plane;
+   struct drm_device *dev = plane->dev;
+   struct mga_device *mdev = to_mga_device(dev);
+   struct mgag200_crtc_state *mgag200_crtc_state = 
to_mgag200_crtc_state(crtc_state);
struct drm_framebuffer *new_fb = plane_state->fb;
struct drm_framebuffer *fb = NULL;
+   int ret;
 
if (!new_fb)
return 0;
@@ -1864,6 +1867,13 @@ mgag200_simple_display_pipe_check(struct 
drm_simple_display_pipe *pipe,
if (!fb || (fb->format != new_fb->format))
crtc_state->mode_changed = true; /* update PLL settings */
 
+   if (crtc_state->mode_changed) {
+   ret = mgag200_compute_pixpll_values(mdev, 
crtc_state->mode.clock,
+   
&mgag200_crtc_state->pixpll);
+   if (ret)
+   return ret;
+   }
+
return 0;
 }
 
@@ -1891,6 +1901,7 @@ mgag200_simple_display_pipe_duplicate_crtc_state(struct 
drm_simple_display_pipe
 {
struct drm_crtc *crtc = &pipe->crtc;
struct drm_crtc_state *crtc_state = crtc->state;
+   struct mgag200_crtc_state *mgag200_crtc_state = 
to_mgag200_crtc_state(crtc_state);
struct mgag200_crtc_state *new_mgag200_crtc_state;
 
if (!crtc_state)
@@ -1901,6 +1912,9 @@ mgag200_simple_display_pipe_duplicate_crtc_state(struct 
drm_simple_display_pipe
return NULL;
__drm_atomic_helper_crtc_duplicate_state(crtc, 
&new_mgag200_crtc_state->base);
 
+   memcpy(&new_mgag200_crtc_state->pixpll, &mgag200_crtc_state->pixpll,
+  sizeof(new_mgag200_crtc_state->pixpll));
+
return &new_mgag200_crtc_state->base;
 }
 
-- 
2.32.0



[PATCH 11/12] drm/mgag200: Introduce custom CRTC state

2021-07-05 Thread Thomas Zimmermann
The CRTC state in mgag200 will hold PLL values for modeset operations.
Simple KMS helpers already support custom state for planes. Extend the
helpers to support custom CRTC state as well.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_simple_kms_helper.c | 39 +++--
 drivers/gpu/drm/mgag200/mgag200_drv.h   |  9 +
 drivers/gpu/drm/mgag200/mgag200_mode.c  | 46 +
 include/drm/drm_simple_kms_helper.h | 27 +++
 4 files changed, 118 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c 
b/drivers/gpu/drm/drm_simple_kms_helper.c
index 735f4f34bcc4..72989ed1baba 100644
--- a/drivers/gpu/drm/drm_simple_kms_helper.c
+++ b/drivers/gpu/drm/drm_simple_kms_helper.c
@@ -145,6 +145,39 @@ static const struct drm_crtc_helper_funcs 
drm_simple_kms_crtc_helper_funcs = {
.atomic_disable = drm_simple_kms_crtc_disable,
 };
 
+static void drm_simple_kms_crtc_reset(struct drm_crtc *crtc)
+{
+   struct drm_simple_display_pipe *pipe;
+
+   pipe = container_of(crtc, struct drm_simple_display_pipe, crtc);
+   if (!pipe->funcs || !pipe->funcs->reset_crtc)
+   return drm_atomic_helper_crtc_reset(crtc);
+
+   return pipe->funcs->reset_crtc(pipe);
+}
+
+static struct drm_crtc_state *drm_simple_kms_crtc_duplicate_state(struct 
drm_crtc *crtc)
+{
+   struct drm_simple_display_pipe *pipe;
+
+   pipe = container_of(crtc, struct drm_simple_display_pipe, crtc);
+   if (!pipe->funcs || !pipe->funcs->duplicate_crtc_state)
+   return drm_atomic_helper_crtc_duplicate_state(crtc);
+
+   return pipe->funcs->duplicate_crtc_state(pipe);
+}
+
+static void drm_simple_kms_crtc_destroy_state(struct drm_crtc *crtc, struct 
drm_crtc_state *state)
+{
+   struct drm_simple_display_pipe *pipe;
+
+   pipe = container_of(crtc, struct drm_simple_display_pipe, crtc);
+   if (!pipe->funcs || !pipe->funcs->destroy_crtc_state)
+   drm_atomic_helper_crtc_destroy_state(crtc, state);
+   else
+   pipe->funcs->destroy_crtc_state(pipe, state);
+}
+
 static int drm_simple_kms_crtc_enable_vblank(struct drm_crtc *crtc)
 {
struct drm_simple_display_pipe *pipe;
@@ -168,12 +201,12 @@ static void drm_simple_kms_crtc_disable_vblank(struct 
drm_crtc *crtc)
 }
 
 static const struct drm_crtc_funcs drm_simple_kms_crtc_funcs = {
-   .reset = drm_atomic_helper_crtc_reset,
+   .reset = drm_simple_kms_crtc_reset,
.destroy = drm_crtc_cleanup,
.set_config = drm_atomic_helper_set_config,
.page_flip = drm_atomic_helper_page_flip,
-   .atomic_duplicate_state = drm_atomic_helper_crtc_duplicate_state,
-   .atomic_destroy_state = drm_atomic_helper_crtc_destroy_state,
+   .atomic_duplicate_state = drm_simple_kms_crtc_duplicate_state,
+   .atomic_destroy_state = drm_simple_kms_crtc_destroy_state,
.enable_vblank = drm_simple_kms_crtc_enable_vblank,
.disable_vblank = drm_simple_kms_crtc_disable_vblank,
 };
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h 
b/drivers/gpu/drm/mgag200/mgag200_drv.h
index fca3904fde0c..c813d6c15f70 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -127,6 +127,15 @@ struct mgag200_pll_values {
unsigned int s;
 };
 
+struct mgag200_crtc_state {
+   struct drm_crtc_state base;
+};
+
+static inline struct mgag200_crtc_state *to_mgag200_crtc_state(struct 
drm_crtc_state *base)
+{
+   return container_of(base, struct mgag200_crtc_state, base);
+}
+
 #define to_mga_connector(x) container_of(x, struct mga_connector, base)
 
 struct mga_i2c_chan {
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 5da72ebd8a7f..6a5c419c6641 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -1886,6 +1886,49 @@ mgag200_simple_display_pipe_update(struct 
drm_simple_display_pipe *pipe,
mgag200_handle_damage(mdev, fb, &damage, 
&shadow_plane_state->map[0]);
 }
 
+static struct drm_crtc_state *
+mgag200_simple_display_pipe_duplicate_crtc_state(struct 
drm_simple_display_pipe *pipe)
+{
+   struct drm_crtc *crtc = &pipe->crtc;
+   struct drm_crtc_state *crtc_state = crtc->state;
+   struct mgag200_crtc_state *new_mgag200_crtc_state;
+
+   if (!crtc_state)
+   return NULL;
+
+   new_mgag200_crtc_state = kzalloc(sizeof(*new_mgag200_crtc_state), 
GFP_KERNEL);
+   if (!new_mgag200_crtc_state)
+   return NULL;
+   __drm_atomic_helper_crtc_duplicate_state(crtc, 
&new_mgag200_crtc_state->base);
+
+   return &new_mgag200_crtc_state->base;
+}
+
+static void mgag200_simple_display_pipe_destroy_crtc_state(struct 
drm_simple_display_pipe *pipe,
+  struct 
drm_crtc_state *crtc_state)
+{
+   struct mgag200_crtc_state *mgag200_crtc_state = 
to_mgag200_crt

[PATCH 03/12] drm/mgag200: Remove P_ARRAY_SIZE

2021-07-05 Thread Thomas Zimmermann
Replace P_ARRAY_SIZE by array pre-initializing and ARRAY_SIZE(). No
functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 045a20055515..fa06f1994d68 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -187,17 +187,16 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, 
long clock)
return 0;
 }
 
-#define P_ARRAY_SIZE 9
-
 static int mga_g200se_set_plls(struct mga_device *mdev, long clock)
 {
+   static const unsigned int pvalues_e4[] = {16, 14, 12, 10, 8, 6, 4, 2, 
1};
+
u32 unique_rev_id = mdev->model.g200se.unique_rev_id;
unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta, permitteddelta;
unsigned int testp, testm, testn;
unsigned int p, m, n;
unsigned int computed;
-   unsigned int pvalues_e4[P_ARRAY_SIZE] = {16, 14, 12, 10, 8, 6, 4, 2, 1};
unsigned int fvv;
unsigned int i;
u8 misc;
@@ -252,7 +251,7 @@ static int mga_g200se_set_plls(struct mga_device *mdev, 
long clock)
/* Permited delta is 0.5% as VESA Specification */
permitteddelta = clock * 5 / 1000;
 
-   for (i = 0 ; i < P_ARRAY_SIZE ; i++) {
+   for (i = 0 ; i < ARRAY_SIZE(pvalues_e4); i++) {
testp = pvalues_e4[i];
 
if ((clock * testp) > vcomax)
-- 
2.32.0



[PATCH 07/12] drm/mgag200: Split several PLL functions by device type

2021-07-05 Thread Thomas Zimmermann
Several PLL functions compute values for different device types. Split
them up to make the code more readable. No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 256 ++---
 1 file changed, 146 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 7d6707bd6c27..d3b15e60f057 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -353,7 +353,7 @@ static int mgag200_compute_pixpll_values_g200wb(struct 
mga_device *mdev, long cl
 {
unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta;
-   unsigned int testp, testm, testn, testp2;
+   unsigned int testp, testm, testn;
unsigned int p, m, n, s;
unsigned int computed;
 
@@ -361,64 +361,29 @@ static int mgag200_compute_pixpll_values_g200wb(struct 
mga_device *mdev, long cl
 
delta = 0x;
 
-   if (mdev->type == G200_EW3) {
-   vcomax = 80;
-   vcomin = 40;
-   pllreffreq = 25000;
-
-   for (testp = 1; testp < 8; testp++) {
-   for (testp2 = 1; testp2 < 8; testp2++) {
-   if (testp < testp2)
-   continue;
-   if ((clock * testp * testp2) > vcomax)
-   continue;
-   if ((clock * testp * testp2) < vcomin)
-   continue;
-   for (testm = 1; testm < 26; testm++) {
-   for (testn = 32; testn < 2048 ; 
testn++) {
-   computed = (pllreffreq * testn) 
/
-   (testm * testp * 
testp2);
-   if (computed > clock)
-   tmpdelta = computed - 
clock;
-   else
-   tmpdelta = clock - 
computed;
-   if (tmpdelta < delta) {
-   delta = tmpdelta;
-   m = testm + 1;
-   n = testn + 1;
-   p = testp + 1;
-   s = testp2;
-   }
-   }
-   }
-   }
-   }
-   } else {
-   vcomax = 55;
-   vcomin = 15;
-   pllreffreq = 48000;
+   vcomax = 55;
+   vcomin = 15;
+   pllreffreq = 48000;
 
-   for (testp = 1; testp < 9; testp++) {
-   if (clock * testp > vcomax)
-   continue;
-   if (clock * testp < vcomin)
-   continue;
+   for (testp = 1; testp < 9; testp++) {
+   if (clock * testp > vcomax)
+   continue;
+   if (clock * testp < vcomin)
+   continue;
 
-   for (testm = 1; testm < 17; testm++) {
-   for (testn = 1; testn < 151; testn++) {
-   computed = (pllreffreq * testn) /
-   (testm * testp);
-   if (computed > clock)
-   tmpdelta = computed - clock;
-   else
-   tmpdelta = clock - computed;
-   if (tmpdelta < delta) {
-   delta = tmpdelta;
-   n = testn;
-   m = testm;
-   p = testp;
-   s = 0;
-   }
+   for (testm = 1; testm < 17; testm++) {
+   for (testn = 1; testn < 151; testn++) {
+   computed = (pllreffreq * testn) / (testm * 
testp);
+   if (computed > clock)
+   tmpdelta = computed - clock;
+   else
+   tmpdelta = clock - computed;
+   if (tmpdelta < delta) {
+   delta = tmpdelta;
+   n 

[PATCH 04/12] drm/mgag200: Split PLL setup into compute and update functions

2021-07-05 Thread Thomas Zimmermann
The _set_plls() functions compute a pixel clock's PLL values
and program the hardware accordingly. This happens during atomic
commits.

For atomic modesetting, it's better to separate computation and
programming from each other. This will allow to compute the PLL
value during atomic checks and catch unsupported modes early.

Split the PLL setup into a compute and a update functions, and
call them one after the other. Computed PLL values are store in
struct mgag200_pll_values. No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_drv.h  |  17 ++
 drivers/gpu/drm/mgag200/mgag200_mode.c | 234 +++--
 2 files changed, 196 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h 
b/drivers/gpu/drm/mgag200/mgag200_drv.h
index f7a0537c0d0a..fca3904fde0c 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -110,6 +110,23 @@
 #define MGAG200_MAX_FB_HEIGHT 4096
 #define MGAG200_MAX_FB_WIDTH 4096
 
+/*
+ * Stores parameters for programming the PLLs
+ *
+ * Fref: reference frequency (A: 25.175 Mhz, B: 28.361, C: XX Mhz)
+ * Fo: output frequency
+ * Fvco = Fref * (N / M)
+ * Fo = Fvco / P
+ *
+ * S = [0..3]
+ */
+struct mgag200_pll_values {
+   unsigned int m;
+   unsigned int n;
+   unsigned int p;
+   unsigned int s;
+};
+
 #define to_mga_connector(x) container_of(x, struct mga_connector, base)
 
 struct mga_i2c_chan {
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index fa06f1994d68..961bd128fea3 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -114,7 +114,8 @@ static inline void mga_wait_busy(struct mga_device *mdev)
  * PLL setup
  */
 
-static int mgag200_g200_set_plls(struct mga_device *mdev, long clock)
+static int mgag200_compute_pixpll_values_g200(struct mga_device *mdev, long 
clock,
+ struct mgag200_pll_values 
*pixpllc)
 {
struct drm_device *dev = &mdev->base;
const int post_div_max = 7;
@@ -130,7 +131,6 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, 
long clock)
long ref_clk = mdev->model.g200.ref_clk;
long p_clk_min = mdev->model.g200.pclk_min;
long p_clk_max =  mdev->model.g200.pclk_max;
-   u8 misc;
 
if (clock > p_clk_max) {
drm_err(dev, "Pixel Clock %ld too high\n", clock);
@@ -175,19 +175,34 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, 
long clock)
drm_dbg_kms(dev, "clock: %ld vco: %ld m: %d n: %d p: %d s: %d\n",
clock, f_vco, m, n, p, s);
 
+   pixpllc->m = m;
+   pixpllc->n = n;
+   pixpllc->p = p;
+   pixpllc->s = s;
+
+   return 0;
+}
+
+static void mgag200_set_pixpll_g200(struct mga_device *mdev,
+   const struct mgag200_pll_values *pixpllc)
+{
+   u8 misc, xpixpllcm, xpixpllcn, xpixpllcp;
+
misc = RREG8(MGA_MISC_IN);
misc &= ~MGAREG_MISC_CLK_SEL_MASK;
misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
WREG8(MGA_MISC_OUT, misc);
 
-   WREG_DAC(MGA1064_PIX_PLLC_M, m);
-   WREG_DAC(MGA1064_PIX_PLLC_N, n);
-   WREG_DAC(MGA1064_PIX_PLLC_P, (p | (s << 3)));
-
-   return 0;
+   xpixpllcm = pixpllc->m;
+   xpixpllcn = pixpllc->n;
+   xpixpllcp = pixpllc->p | (pixpllc->s << 3);
+   WREG_DAC(MGA1064_PIX_PLLC_M, xpixpllcm);
+   WREG_DAC(MGA1064_PIX_PLLC_N, xpixpllcn);
+   WREG_DAC(MGA1064_PIX_PLLC_P, xpixpllcp);
 }
 
-static int mga_g200se_set_plls(struct mga_device *mdev, long clock)
+static int mgag200_compute_pixpll_values_g200se(struct mga_device *mdev, long 
clock,
+   struct mgag200_pll_values 
*pixpllc)
 {
static const unsigned int pvalues_e4[] = {16, 14, 12, 10, 8, 6, 4, 2, 
1};
 
@@ -199,7 +214,6 @@ static int mga_g200se_set_plls(struct mga_device *mdev, 
long clock)
unsigned int computed;
unsigned int fvv;
unsigned int i;
-   u8 misc;
 
if (unique_rev_id <= 0x03) {
 
@@ -295,35 +309,47 @@ static int mga_g200se_set_plls(struct mga_device *mdev, 
long clock)
return -EINVAL;
}
 
+   pixpllc->m = m;
+   pixpllc->n = n;
+   pixpllc->p = p;
+   pixpllc->s = 0;
+
+   return 0;
+}
+
+static void mgag200_set_pixpll_g200se(struct mga_device *mdev,
+ const struct mgag200_pll_values *pixpllc)
+{
+   u32 unique_rev_id = mdev->model.g200se.unique_rev_id;
+   u8 misc, xpixpllcm, xpixpllcn, xpixpllcp;
+
misc = RREG8(MGA_MISC_IN);
misc &= ~MGAREG_MISC_CLK_SEL_MASK;
misc |= MGAREG_MISC_CLK_SEL_MGA_MSK;
WREG8(MGA_MISC_OUT, misc);
 
-   WREG_DAC(MGA1064_PIX_PLLC_M, m);
-   WREG_DAC(MGA1064_PIX_PLLC_N, n);
-   WREG_DAC(MGA1064_PIX_PLLC_P, p);
+   xpixpllcm = pixpllc->m;
+ 

[PATCH 10/12] drm/mgag200: Declare PLL clock constants static const

2021-07-05 Thread Thomas Zimmermann
Move the PLL constants to the RO data section by declaring them as
static const. No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 70 --
 1 file changed, 31 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 99b35e4f9353..5da72ebd8a7f 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -187,7 +187,10 @@ static int mgag200_compute_pixpll_values_g200(struct 
mga_device *mdev, long cloc
 static int mgag200_compute_pixpll_values_g200se_00(struct mga_device *mdev, 
long clock,
   struct mgag200_pll_values 
*pixpllc)
 {
-   unsigned int vcomax, vcomin, pllreffreq;
+   static const unsigned int vcomax = 32;
+   static const unsigned int vcomin = 16;
+   static const unsigned int pllreffreq = 25000;
+
unsigned int delta, tmpdelta, permitteddelta;
unsigned int testp, testm, testn;
unsigned int p, m, n, s;
@@ -196,9 +199,6 @@ static int mgag200_compute_pixpll_values_g200se_00(struct 
mga_device *mdev, long
m = n = p = s = 0;
delta = 0x;
 
-   vcomax = 32;
-   vcomin = 16;
-   pllreffreq = 25000;
permitteddelta = clock * 5 / 1000;
 
for (testp = 8; testp > 0; testp /= 2) {
@@ -241,8 +241,10 @@ static int mgag200_compute_pixpll_values_g200se_04(struct 
mga_device *mdev, long
   struct mgag200_pll_values 
*pixpllc)
 {
static const unsigned int pvalues_e4[] = {16, 14, 12, 10, 8, 6, 4, 2, 
1};
+   static const unsigned int vcomax = 160;
+   static const unsigned int vcomin = 80;
+   static const unsigned int pllreffreq = 25000;
 
-   unsigned int vcomax, vcomin, pllreffreq;
unsigned int delta, tmpdelta, permitteddelta;
unsigned int testp, testm, testn;
unsigned int p, m, n, s;
@@ -253,10 +255,6 @@ static int mgag200_compute_pixpll_values_g200se_04(struct 
mga_device *mdev, long
m = n = p = s = 0;
delta = 0x;
 
-   vcomax= 160;
-   vcomin= 80;
-   pllreffreq= 25000;
-
if (clock < 25000)
clock = 25000;
clock = clock * 2;
@@ -323,7 +321,10 @@ static int mgag200_compute_pixpll_values_g200se(struct 
mga_device *mdev, long cl
 static int mgag200_compute_pixpll_values_g200wb(struct mga_device *mdev, long 
clock,
struct mgag200_pll_values 
*pixpllc)
 {
-   unsigned int vcomax, vcomin, pllreffreq;
+   static const unsigned int vcomax = 55;
+   static const unsigned int vcomin = 15;
+   static const unsigned int pllreffreq = 48000;
+
unsigned int delta, tmpdelta;
unsigned int testp, testm, testn;
unsigned int p, m, n, s;
@@ -332,10 +333,6 @@ static int mgag200_compute_pixpll_values_g200wb(struct 
mga_device *mdev, long cl
m = n = p = s = 0;
delta = 0x;
 
-   vcomax = 55;
-   vcomin = 15;
-   pllreffreq = 48000;
-
for (testp = 1; testp < 9; testp++) {
if (clock * testp > vcomax)
continue;
@@ -371,7 +368,10 @@ static int mgag200_compute_pixpll_values_g200wb(struct 
mga_device *mdev, long cl
 static int mgag200_compute_pixpll_values_g200ev(struct mga_device *mdev, long 
clock,
struct mgag200_pll_values 
*pixpllc)
 {
-   unsigned int vcomax, vcomin, pllreffreq;
+   static const unsigned int vcomax = 55;
+   static const unsigned int vcomin = 15;
+   static const unsigned int pllreffreq = 5;
+
unsigned int delta, tmpdelta;
unsigned int testp, testm, testn;
unsigned int p, m, n, s;
@@ -380,10 +380,6 @@ static int mgag200_compute_pixpll_values_g200ev(struct 
mga_device *mdev, long cl
m = n = p = s = 0;
delta = 0x;
 
-   vcomax = 55;
-   vcomin = 15;
-   pllreffreq = 5;
-
for (testp = 16; testp > 0; testp--) {
if (clock * testp > vcomax)
continue;
@@ -419,7 +415,10 @@ static int mgag200_compute_pixpll_values_g200ev(struct 
mga_device *mdev, long cl
 static int mgag200_compute_pixpll_values_g200eh(struct mga_device *mdev, long 
clock,
struct mgag200_pll_values 
*pixpllc)
 {
-   unsigned int vcomax, vcomin, pllreffreq;
+   static const unsigned int vcomax = 80;
+   static const unsigned int vcomin = 40;
+   static const unsigned int pllreffreq = 3;
+
unsigned int delta, tmpdelta;
unsigned int testp, testm, testn;
unsigned int p, m, n, s;
@@ -428,10 +427,6 @@ static int mgag200_compute_pixpll_values_g200eh(struct 

[PATCH v4 0/2] Add p2p via dmabuf to habanalabs

2021-07-05 Thread Oded Gabbay
Hi,
I'm sending v4 of this patch-set following the long email thread.
I want to thank Jason for reviewing v3 and pointing out the errors, saving
us time later to debug it :)

I consulted with Christian on how to fix patch 2 (the implementation) and
at the end of the day I shamelessly copied the relevant content from
amdgpu_vram_mgr_alloc_sgt() and amdgpu_dma_buf_attach(), regarding the
usage of dma_map_resource() and pci_p2pdma_distance_many(), respectively.

I also made a few improvements after looking at the relevant code in amdgpu.
The details are in the changelog of patch 2.

I took the time to write an import code into the driver, allowing me to
check real P2P with two Gaudi devices, one as exporter and the other as
importer. I'm not going to include the import code in the product, it was
just for testing purposes (although I can share it if anyone wants).

I run it on a bare-metal environment with IOMMU enabled, on a sky-lake CPU
with a white-listed PCIe bridge (to make the pci_p2pdma_distance_many happy).

Greg, I hope this will be good enough for you to merge this code.

Thanks,
Oded

Oded Gabbay (1):
  habanalabs: define uAPI to export FD for DMA-BUF

Tomer Tayar (1):
  habanalabs: add support for dma-buf exporter

 drivers/misc/habanalabs/Kconfig |   1 +
 drivers/misc/habanalabs/common/habanalabs.h |  26 ++
 drivers/misc/habanalabs/common/memory.c | 480 +++-
 drivers/misc/habanalabs/gaudi/gaudi.c   |   1 +
 drivers/misc/habanalabs/goya/goya.c |   1 +
 include/uapi/misc/habanalabs.h  |  28 +-
 6 files changed, 532 insertions(+), 5 deletions(-)

-- 
2.25.1



[PATCH v4 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-07-05 Thread Oded Gabbay
User process might want to share the device memory with another
driver/device, and to allow it to access it over PCIe (P2P).

To enable this, we utilize the dma-buf mechanism and add a dma-buf
exporter support, so the other driver can import the device memory and
access it.

The device memory is allocated using our existing allocation uAPI,
where the user will get a handle that represents the allocation.

The user will then need to call the new
uAPI (HL_MEM_OP_EXPORT_DMABUF_FD) and give the handle as a parameter.

The driver will return a FD that represents the DMA-BUF object that
was created to match that allocation.

Signed-off-by: Oded Gabbay 
Reviewed-by: Tomer Tayar 
---
 include/uapi/misc/habanalabs.h | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
index 18765eb75b65..c5cbd60696d7 100644
--- a/include/uapi/misc/habanalabs.h
+++ b/include/uapi/misc/habanalabs.h
@@ -808,6 +808,10 @@ union hl_wait_cs_args {
 #define HL_MEM_OP_UNMAP3
 /* Opcode to map a hw block */
 #define HL_MEM_OP_MAP_BLOCK4
+/* Opcode to create DMA-BUF object for an existing device memory allocation
+ * and to export an FD of that DMA-BUF back to the caller
+ */
+#define HL_MEM_OP_EXPORT_DMABUF_FD 5
 
 /* Memory flags */
 #define HL_MEM_CONTIGUOUS  0x1
@@ -879,11 +883,26 @@ struct hl_mem_in {
/* Virtual address returned from HL_MEM_OP_MAP */
__u64 device_virt_addr;
} unmap;
+
+   /* HL_MEM_OP_EXPORT_DMABUF_FD */
+   struct {
+   /* Handle returned from HL_MEM_OP_ALLOC. In Gaudi,
+* where we don't have MMU for the device memory, the
+* driver expects a physical address (instead of
+* a handle) in the device memory space.
+*/
+   __u64 handle;
+   /* Size of memory allocation. Relevant only for GAUDI */
+   __u64 mem_size;
+   } export_dmabuf_fd;
};
 
/* HL_MEM_OP_* */
__u32 op;
-   /* HL_MEM_* flags */
+   /* HL_MEM_* flags.
+* For the HL_MEM_OP_EXPORT_DMABUF_FD opcode, this field holds the
+* DMA-BUF file/FD flags.
+*/
__u32 flags;
/* Context ID - Currently not in use */
__u32 ctx_id;
@@ -920,6 +939,13 @@ struct hl_mem_out {
 
__u32 pad;
};
+
+   /* Returned in HL_MEM_OP_EXPORT_DMABUF_FD. Represents the
+* DMA-BUF object that was created to describe a memory
+* allocation on the device's memory space. The FD should be
+* passed to the importer driver
+*/
+   __u64 fd;
};
 };
 
-- 
2.25.1



[PATCH v4 2/2] habanalabs: add support for dma-buf exporter

2021-07-05 Thread Oded Gabbay
From: Tomer Tayar 

Implement the calls to the dma-buf kernel api to create a dma-buf
object backed by FD.

We block the option to mmap the DMA-BUF object because we don't support
DIRECT_IO and implicit P2P. We only implement support for explicit P2P
through importing the FD of the DMA-BUF.

In the export phase, we provide a static SG list to the DMA-BUF object
because in  Habanalabs's ASICs, the device memory is pinned and
immutable. Therefore, there is no need for dynamic mappings and pinning
callbacks.

Note that in GAUDI we don't have an MMU towards the device memory and
the user works on physical addresses. Therefore, the user doesn't pass
through the kernel driver to allocate memory there. As a result, only
for GAUDI we receive from the user a device memory physical address
(instead of a handle) and a size.

To get the DMA address of the PCI bar, we use the dma_map_resources()
kernel API, because our device memory is not backed by page struct
and this API doesn't need page struct to map the physical address to
a DMA address.

We check the p2p distance using pci_p2pdma_distance_many() and refusing
to map dmabuf in case the distance doesn't allow p2p.

Signed-off-by: Tomer Tayar 
Reviewed-by: Oded Gabbay 
Reviewed-by: Gal Pressman 
Signed-off-by: Oded Gabbay 
---
Changes in v4:
 - Using dma_map_resources() to map the physical address of the PCI BAR
   to a DMA'able address that the other device can use

 - Check the p2p distance using pci_p2pdma_distance_many() and refusing
   to map dmabuf in case the distance doesn't allow p2p.

 - Move the creation of the SGT to the dmabuf map callback to use the
   PCI device of the importer in the call to dma_map_resources()

 - Move validity check whether the device address is exposed on the bar
   to the export phase

 - Allocate sgt per map instead of using single one for dmabuf

 - Move split pages code only to export from address

 - Check this works between two Gaudi devices. For that, I implemented
   a simple import code to import the FD of the DMABUF into Gaudi's MMU.
   This code is not upstreamed as it isn't a functionality I would like
   to support but it was done to verify that the export works correctly
   in real p2p scenario.

 drivers/misc/habanalabs/Kconfig |   1 +
 drivers/misc/habanalabs/common/habanalabs.h |  26 ++
 drivers/misc/habanalabs/common/memory.c | 480 +++-
 drivers/misc/habanalabs/gaudi/gaudi.c   |   1 +
 drivers/misc/habanalabs/goya/goya.c |   1 +
 5 files changed, 505 insertions(+), 4 deletions(-)

diff --git a/drivers/misc/habanalabs/Kconfig b/drivers/misc/habanalabs/Kconfig
index 293d79811372..c82d2e7b2035 100644
--- a/drivers/misc/habanalabs/Kconfig
+++ b/drivers/misc/habanalabs/Kconfig
@@ -8,6 +8,7 @@ config HABANA_AI
depends on PCI && HAS_IOMEM
select GENERIC_ALLOCATOR
select HWMON
+   select DMA_SHARED_BUFFER
help
  Enables PCIe card driver for Habana's AI Processors (AIP) that are
  designed to accelerate Deep Learning inference and training workloads.
diff --git a/drivers/misc/habanalabs/common/habanalabs.h 
b/drivers/misc/habanalabs/common/habanalabs.h
index 9aedea471ebe..f2605030286e 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define HL_NAME"habanalabs"
 
@@ -1326,6 +1327,27 @@ struct hl_pending_cb {
u32 hw_queue_id;
 };
 
+/**
+ * struct hl_dmabuf_wrapper - a dma-buf wrapper object.
+ * @dmabuf: pointer to dma-buf object.
+ * @ctx: pointer to the dma-buf owner's context.
+ * @phys_pg_pack: pointer to physical page pack if the dma-buf was exported for
+ *memory allocation handle.
+ * @pages: array of pages that represents the physical addresses in the device
+ * memory. This is relevant in case phys_pg_pack is NULL (For Gaudi).
+ * If phys_pg_pack is valid, we take the pages array from there.
+ * @npages: number of entries in pages array, relevant if phys_pg_pack is NULL
+ * @page_size: size of page in pages array, relevant if phys_pg_pack is NULL
+ */
+struct hl_dmabuf_wrapper {
+   struct dma_buf  *dmabuf;
+   struct hl_ctx   *ctx;
+   struct hl_vm_phys_pg_pack   *phys_pg_pack;
+   u64 *pages;
+   u64 npages;
+   u32 page_size;
+};
+
 /**
  * struct hl_ctx - user/kernel context.
  * @mem_hash: holds mapping from virtual address to virtual memory area
@@ -1634,6 +1656,7 @@ struct hl_vm_hw_block_list_node {
  * @npages: num physical pages in the pack.
  * @total_size: total size of all the pages in this list.
  * @mapping_cnt: number of shared mappings.
+ * @exporting_cnt: number of dma-buf exporting.
  * @asid: the context related to this list.
  * @page_size: size of

[PATCH RESEND] drm/msm: Fix error return code in msm_drm_init()

2021-07-05 Thread Wei Li
When it fail to create crtc_event kthread, it just jump to err_msm_uninit,
while the 'ret' is not updated. So assign the return code before that.

Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Reported-by: Hulk Robot 
Signed-off-by: Wei Li 
Reviewed-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/msm_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index fe7d17cd35ec..787522f92e63 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -524,6 +524,7 @@ static int msm_drm_init(struct device *dev, const struct 
drm_driver *drv)
"crtc_event:%d", priv->event_thread[i].crtc_id);
if (IS_ERR(priv->event_thread[i].worker)) {
DRM_DEV_ERROR(dev, "failed to create crtc_event 
kthread\n");
+   ret = PTR_ERR(priv->event_thread[i].worker);
goto err_msm_uninit;
}
 
-- 
2.25.1



Re: [PATCH v2] drm/panfrost:report the full raw fault information instead

2021-07-05 Thread Steven Price
On 02/07/2021 02:40, Chunyou Tang wrote:
> Hi Steve,
>> You didn't answer my previous question:
>>
>>> Is this device working with the kbase/DDK proprietary driver?
> 
> I don't know whether I used kbase/DDK,I only know I used the driver of
> panfrost in linux 5.11.

kbase is the Linux kernel driver for Arm's proprietary driver. The
proprietary driver from Arm is often called "the [Mali] DDK" by Arm (or
"the blob" by others). I guess "libmali.so" is the closest it has to a
real name.

>> What you are describing sounds like a hardware integration issue, so
>> it would be good to check that the hardware is working with the
>> proprietary driver to rule that out. And perhaps there is something
>> in the kbase for this device that is setting a chicken bit to 'fix'
>> the coherency?
> 
> I don't have the proprietary driver,I only used driver in linux 5.11.

Interesting - I would have expected the initial hardware bring up to
have happened with the proprietary driver from Arm. And my first step
would be to check whether any workarounds for integration issues were
applied in the kernel.

I think we'd need some input from the people who did the hardware
integration, and hopefully they would have access to the proprietary
driver from Arm.

Steve

> Thinks very much!
> 
> Chunyou.
> 
> 
> ?? Thu, 1 Jul 2021 11:15:14 +0100
> Steven Price  :
> 
>> On 29/06/2021 04:04, Chunyou Tang wrote:
> 
> 
>>> Hi Steve,
>>> thinks for your reply.
>>> I set the pte in arm_lpae_prot_to_pte(),
>>> ***
>>> /*
>>>  * Also Mali has its own notions of shareability wherein its
>>> Inner
>>>  * domain covers the cores within the GPU, and its Outer
>>> domain is
>>>  * "outside the GPU" (i.e. either the Inner or System
>>> domain in CPU
>>>  * terms, depending on coherency).
>>>  */
>>> if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE)
>>> pte |= ARM_LPAE_PTE_SH_IS;
>>> else
>>> pte |= ARM_LPAE_PTE_SH_OS;
>>> ***
>>> I set pte |= ARM_LPAE_PTE_SH_NS.
>>>
>>> If I set pte to ARM_LPAE_PTE_SH_OS or
>>> ARM_LPAE_PTE_SH_IS,whether I use singel core GPU or multi
>>> core GPU,it will occur GPU Fault.
>>> if I set pte to ARM_LPAE_PTE_SH_NS,whether I use singel core
>>> GPU or multi core GPU,it will not occur GPU Fault.
>>
>> Hi,
>>
>> So this is a difference between Panfrost and kbase. Panfrost (well
>> technically the IOMMU framework) enables the inner-shareable bit for
>> all memory, whereas kbase only enables it for some memory types (the
>> BASE_MEM_COHERENT_LOCAL flag in the UABI controls it). However this
>> should only be a performance/power difference (and AFAIK probably an
>> irrelevant one) and it's definitely required that "inner shareable"
>> (i.e. within the GPU) works for communication between the different
>> units of the GPU.
>>
>> You didn't answer my previous question:
>>
>>> Is this device working with the kbase/DDK proprietary driver?
>>
>> What you are describing sounds like a hardware integration issue, so
>> it would be good to check that the hardware is working with the
>> proprietary driver to rule that out. And perhaps there is something
>> in the kbase for this device that is setting a chicken bit to 'fix'
>> the coherency?
>>
>> Steve
> 
> 



[PATCH v3 1/5] drm/i915: use consistent CPU mappings for pin_map users

2021-07-05 Thread Matthew Auld
For discrete, users of pin_map() needs to obey the same rules at the TTM
backend, where we map system only objects as WB, and everything else as
WC. The simplest for now is to just force the correct mapping type as
per the new rules for discrete.

Suggested-by: Thomas Hellström 
Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Cc: Daniel Vetter 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 34 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  4 +++
 drivers/gpu/drm/i915/gem/i915_gem_pages.c  | 22 --
 3 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 547cc9dad90d..9da7b288b7ed 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -625,6 +625,40 @@ int i915_gem_object_migrate(struct drm_i915_gem_object 
*obj,
return obj->ops->migrate(obj, mr);
 }
 
+/**
+ * i915_gem_object_placement_possible - Check whether the object can be
+ * placed at certain memory type
+ * @obj: Pointer to the object
+ * @type: The memory type to check
+ *
+ * Return: True if the object can be placed in @type. False otherwise.
+ */
+bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
+   enum intel_memory_type type)
+{
+   unsigned int i;
+
+   if (!obj->mm.n_placements) {
+   switch (type) {
+   case INTEL_MEMORY_LOCAL:
+   return i915_gem_object_has_iomem(obj);
+   case INTEL_MEMORY_SYSTEM:
+   return i915_gem_object_has_pages(obj);
+   default:
+   /* Ignore stolen for now */
+   GEM_BUG_ON(1);
+   return false;
+   }
+   }
+
+   for (i = 0; i < obj->mm.n_placements; i++) {
+   if (obj->mm.placements[i]->type == type)
+   return true;
+   }
+
+   return false;
+}
+
 void i915_gem_init__objects(struct drm_i915_private *i915)
 {
INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index d423d8cac4f2..8be4fadeee48 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -12,6 +12,7 @@
 #include 
 
 #include "display/intel_frontbuffer.h"
+#include "intel_memory_region.h"
 #include "i915_gem_object_types.h"
 #include "i915_gem_gtt.h"
 #include "i915_gem_ww.h"
@@ -607,6 +608,9 @@ bool i915_gem_object_can_migrate(struct drm_i915_gem_object 
*obj,
 int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
   unsigned int flags);
 
+bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
+   enum intel_memory_type type);
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index f2f850e31b8e..810a157a18f8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -321,8 +321,7 @@ static void *i915_gem_object_map_pfn(struct 
drm_i915_gem_object *obj,
dma_addr_t addr;
void *vaddr;
 
-   if (type != I915_MAP_WC)
-   return ERR_PTR(-ENODEV);
+   GEM_BUG_ON(type != I915_MAP_WC);
 
if (n_pfn > ARRAY_SIZE(stack)) {
/* Too big for stack -- allocate temporary array instead */
@@ -374,6 +373,25 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object 
*obj,
}
GEM_BUG_ON(!i915_gem_object_has_pages(obj));
 
+   /*
+* For discrete our CPU mappings needs to be consistent in order to
+* function correctly on !x86. When mapping things through TTM, we use
+* the same rules to determine the caching type.
+*
+* Internal users of lmem are already expected to get this right, so no
+* fudging needed there.
+*/
+   if (i915_gem_object_placement_possible(obj, INTEL_MEMORY_LOCAL)) {
+   if (type != I915_MAP_WC && !obj->mm.n_placements) {
+   ptr = ERR_PTR(-ENODEV);
+   goto err_unpin;
+   }
+
+   type = I915_MAP_WC;
+   } else if (IS_DGFX(to_i915(obj->base.dev))) {
+   type = I915_MAP_WB;
+   }
+
ptr = page_unpack_bits(obj->mm.mapping, &has_type);
if (ptr && has_type != type) {
if (pinned) {
-- 
2.26.3



[PATCH v3 2/5] drm/i915/uapi: convert drm_i915_gem_caching to kernel doc

2021-07-05 Thread Matthew Auld
Convert all the drm_i915_gem_caching bits to proper kernel doc.

Suggested-by: Daniel Vetter 
Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Cc: Tvrtko Ursulin 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
Cc: Ramalingam C 
---
 include/uapi/drm/i915_drm.h | 70 +++--
 1 file changed, 43 insertions(+), 27 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 2f70c48567c0..d13c6c5fad04 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1363,43 +1363,59 @@ struct drm_i915_gem_busy {
 };
 
 /**
- * I915_CACHING_NONE
+ * struct drm_i915_gem_caching - Set or get the caching for given object
+ * handle.
  *
- * GPU access is not coherent with cpu caches. Default for machines without an
- * LLC.
- */
-#define I915_CACHING_NONE  0
-/**
- * I915_CACHING_CACHED
+ * Allow userspace to control the GTT caching bits for a given object when the
+ * object is later mapped through the ppGTT(or GGTT on older platforms lacking
+ * ppGTT support, or if the object is used for scanout). Note that this might
+ * require unbinding the object from the GTT first, if its current caching 
value
+ * doesn't match.
  *
- * GPU access is coherent with cpu caches and furthermore the data is cached in
- * last-level caches shared between cpu cores and the gpu GT. Default on
- * machines with HAS_LLC.
- */
-#define I915_CACHING_CACHED1
-/**
- * I915_CACHING_DISPLAY
  *
- * Special GPU caching mode which is coherent with the scanout engines.
- * Transparently falls back to I915_CACHING_NONE on platforms where no special
- * cache mode (like write-through or gfdt flushing) is available. The kernel
- * automatically sets this mode when using a buffer as a scanout target.
- * Userspace can manually set this mode to avoid a costly stall and clflush in
- * the hotpath of drawing the first frame.
  */
-#define I915_CACHING_DISPLAY   2
-
 struct drm_i915_gem_caching {
/**
-* Handle of the buffer to set/get the caching level of. */
+* @handle: Handle of the buffer to set/get the caching level.
+*/
__u32 handle;
 
/**
-* Cacheing level to apply or return value
+* @caching: The GTT caching level to apply or possible return value.
+*
+* The supported @caching values:
 *
-* bits0-15 are for generic caching control (i.e. the above defined
-* values). bits16-31 are reserved for platform-specific variations
-* (e.g. l3$ caching on gen7). */
+* I915_CACHING_NONE:
+*
+* GPU access is not coherent with CPU caches.  Default for machines
+* without an LLC. This means we need to manually clflush, if we want
+* GPU access to be coherent.
+*
+* I915_CACHING_CACHED:
+*
+* GPU access is coherent with CPU caches and furthermore the data is
+* cached in last-level caches shared between CPU cores and the GPU GT.
+* Default on machines with HAS_LLC. In general the fast shared
+* last-level cache(HAS_LLC) is considered much faster then platforms
+* which only support snooping(HAS_SNOOP), hence by default
+*
+* I915_CACHING_DISPLAY:
+*
+* Special GPU caching mode which is coherent with the scanout engines.
+* Transparently falls back to I915_CACHING_NONE on platforms where no
+* special cache mode (like write-through or gfdt flushing) is
+* available. The kernel automatically sets this mode when using a
+* buffer as a scanout target.  Userspace can manually set this mode to
+* avoid a costly stall and clflush in the hotpath of drawing the first
+* frame.
+*
+* Side note: On gen8+ this no longer does much since we lost the GGTT
+* caching bits. Although setting this is harmless, since it still
+* effectively falls back to I915_CACHING_NONE.
+*/
+#define I915_CACHING_NONE  0
+#define I915_CACHING_CACHED1
+#define I915_CACHING_DISPLAY   2
__u32 caching;
 };
 
-- 
2.26.3



[PATCH v3 3/5] drm/i915/uapi: reject caching ioctls for discrete

2021-07-05 Thread Matthew Auld
It's a noop on DG1, and in the future when need to support other devices
which let us control the coherency, then it should be an immutable
creation time property for the BO. This will likely be controlled
through a new gem_create_ext extension.

v2: add some kernel doc for the discrete changes, and document the
implicit rules

Suggested-by: Daniel Vetter 
Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Cc: Tvrtko Ursulin 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c |  6 +
 include/uapi/drm/i915_drm.h| 29 +-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 7d1400b13429..43004bef55cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -268,6 +268,9 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void 
*data,
struct drm_i915_gem_object *obj;
int err = 0;
 
+   if (IS_DGFX(to_i915(dev)))
+   return -ENODEV;
+
rcu_read_lock();
obj = i915_gem_object_lookup_rcu(file, args->handle);
if (!obj) {
@@ -303,6 +306,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void 
*data,
enum i915_cache_level level;
int ret = 0;
 
+   if (IS_DGFX(i915))
+   return -ENODEV;
+
switch (args->caching) {
case I915_CACHING_NONE:
level = I915_CACHE_NONE;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d13c6c5fad04..a4faceeb8c32 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1372,7 +1372,34 @@ struct drm_i915_gem_busy {
  * require unbinding the object from the GTT first, if its current caching 
value
  * doesn't match.
  *
- *
+ * Note that this all changes on discrete platforms, starting from DG1, the
+ * set/get caching is no longer supported, and is now rejected.  Instead the 
CPU
+ * caching attributes(WB vs WC) will become an immutable creation time property
+ * for the object, along with the GTT caching level. For now we don't expose 
any
+ * new uAPI for this, instead on DG1 this is all implicit, although this 
largely
+ * shouldn't matter since DG1 is coherent by default(without any way of
+ * controlling it).
+ *
+ * Implicit caching rules, starting from DG1:
+ *
+ * - If any of the object placements (see 
&drm_i915_gem_create_ext_memory_regions)
+ *   contain I915_MEMORY_CLASS_DEVICE then the object will be allocated and
+ *   mapped as write-combined only.
+ *
+ * - Everything else is always allocated and mapped as write-back, with the
+ *   guarantee that everything is also coherent with the GPU.
+ *
+ * Note that this is likely to change in the future again, where we might need
+ * more flexibility on future devices, so making this all explicit as part of a
+ * new &drm_i915_gem_create_ext extension is probable.
+ *
+ * Side note: Part of the reason for this is that changing the 
at-allocation-time CPU
+ * caching attributes for the pages might be required(and is expensive) if we
+ * need to then CPU map the pages later with different caching attributes. This
+ * inconsistent caching behaviour, while supported on x86, is not universally
+ * supported on other architectures. So for simplicity we opt for setting
+ * everything at creation time, whilst also making it immutable, on discrete
+ * platforms.
  */
 struct drm_i915_gem_caching {
/**
-- 
2.26.3



[PATCH v3 4/5] drm/i915/uapi: convert drm_i915_gem_set_domain to kernel doc

2021-07-05 Thread Matthew Auld
Convert all the drm_i915_gem_set_domain bits to proper kernel doc.

Suggested-by: Daniel Vetter 
Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Cc: Tvrtko Ursulin 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
Cc: Ramalingam C 
---
 include/uapi/drm/i915_drm.h | 34 +++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a4faceeb8c32..6f94e5e7569a 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -880,14 +880,42 @@ struct drm_i915_gem_mmap_offset {
__u64 extensions;
 };
 
+
+/**
+ * struct drm_i915_gem_set_domain - Adjust the objects write or read domain, in
+ * preparation for accessing the pages via some CPU domain.
+ *
+ * Specifying a new write or read domain will flush the object out of the
+ * previous domain(if required), before then updating the objects domain
+ * tracking with the new domain.
+ *
+ * Note this might involve waiting for the object first if it is still active 
on
+ * the GPU.
+ *
+ * Supported values for @read_domains and @write_domain:
+ *
+ * - I915_GEM_DOMAIN_WC: Uncached write-combined domain
+ * - I915_GEM_DOMAIN_CPU: CPU cache domain
+ * - I915_GEM_DOMAIN_GTT: Mappable aperture domain
+ *
+ * All other domains are rejected.
+ *
+ */
 struct drm_i915_gem_set_domain {
-   /** Handle for the object */
+   /** @handle: Handle for the object. */
__u32 handle;
 
-   /** New read domains */
+   /**
+* @read_domains: New read domains.
+*/
__u32 read_domains;
 
-   /** New write domain */
+   /**
+* @write_domain: New write domain.
+*
+* Note that having something in the write domain implies it's in the
+* read domain, and only that read domain.
+*/
__u32 write_domain;
 };
 
-- 
2.26.3



[PATCH v3 5/5] drm/i915/uapi: reject set_domain for discrete

2021-07-05 Thread Matthew Auld
The CPU domain should be static for discrete, and on DG1 we don't need
any flushing since everything is already coherent, so really all this
does is an object wait, for which we have an ioctl. Longer term the
desired caching should be an immutable creation time property for the
BO, which can be set with something like gem_create_ext.

One other user is iris + userptr, which uses the set_domain to probe all
the pages to check if the GUP succeeds, however keeping the set_domain
around just for that seems rather scuffed. We could equally just submit
a dummy batch, which should hopefully be good enough, otherwise adding a
new creation time flag for userptr might be an option. Although longer
term we will also have vm_bind, which should also be a nice fit for
this, so adding a whole new flag is likely overkill.

v2: add some more kernel doc, also add the implicit rules with caching

Suggested-by: Daniel Vetter 
Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Cc: Tvrtko Ursulin 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c |  3 +++
 include/uapi/drm/i915_drm.h| 18 ++
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 43004bef55cb..b684a62bf3b0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
*data,
u32 write_domain = args->write_domain;
int err;
 
+   if (IS_DGFX(to_i915(dev)))
+   return -ENODEV;
+
/* Only handle setting domains to types used by the CPU. */
if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 6f94e5e7569a..fd1a9878730c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -900,6 +900,24 @@ struct drm_i915_gem_mmap_offset {
  *
  * All other domains are rejected.
  *
+ * Note that for discrete, starting from DG1, this is no longer supported, and
+ * is instead rejected. On such platforms the CPU domain is effectively static,
+ * where we also only support a single &drm_i915_gem_mmap_offset cache mode,
+ * which can't be set explicitly and instead depends on the object placements,
+ * as per the below.
+ *
+ * Implicit caching rules, starting from DG1:
+ *
+ * - If any of the object placements (see 
&drm_i915_gem_create_ext_memory_regions)
+ *   contain I915_MEMORY_CLASS_DEVICE then the object will be allocated and
+ *   mapped as write-combined only.
+ *
+ * - Everything else is always allocated and mapped as write-back, with the
+ *   guarantee that everything is also coherent with the GPU.
+ *
+ * Note that this is likely to change in the future again, where we might need
+ * more flexibility on future devices, so making this all explicit as part of a
+ * new &drm_i915_gem_create_ext extension is probable.
  */
 struct drm_i915_gem_set_domain {
/** @handle: Handle for the object. */
-- 
2.26.3



Re: [PATCH 0/4] vkms: Switch to shadow-buffered plane state

2021-07-05 Thread Daniel Vetter
On Mon, Jul 05, 2021 at 12:05:28PM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 05.07.21 um 11:27 schrieb Daniel Vetter:
> > On Mon, Jul 05, 2021 at 09:46:29AM +0200, Thomas Zimmermann wrote:
> > > Vkms copies each plane's framebuffer into the output buffer; essentially
> > > using a shadow buffer. DRM provides struct drm_shadow_plane_state, which
> > > handles the details of mapping/unmapping shadow buffers into memory for
> > > active planes.
> > > 
> > > Convert vkms to the helpers. Makes vkms use shared code and gives more
> > > test exposure to shadow-plane helpers.
> > > 
> > > Thomas Zimmermann (4):
> > >drm/gem: Export implementation of shadow-plane helpers
> > >drm/vkms: Inherit plane state from struct drm_shadow_plane_state
> > >drm/vkms: Let shadow-plane helpers prepare the plane's FB
> > >drm/vkms: Use dma-buf mapping from shadow-plane state for composing
> > 
> > So I think right now this fits, but I think it'll mismit going forward: We
> > don't really have a shadow-plane that we then toss to the hw, it's a
> > shadow-crtc-area. Right now there's no difference, because we don't
> > support positioning/scaling the primary plane. But that's all kinda stuff
> > that's on the table.
> > 
> > But conceptually at least the compositioning buffer should bet part of the
> > crtc, not of the primary plane.
> > 
> > So not sure what to do, but also coffee hasn't kicked in yet, so maybe I'm
> > just confused.
> 
> I'm not sure if I understand your concern. Can you elaborate? The
> compositing output buffer is not affected by this patchset. Only the input
> frambuffers of the planes. Those are shadow buffers. AFAICT the composer
> code memcpy's the primary plane and then blends the other planes on top.
> Supporting transformation of the primary plane doesn't really change much
> wrt to the vmaping of input fbs.

Yeah that's the current implementation, because that's easier. But
fundamentally we don't need a copy of the input shadow plane, we need a
scratch area that's sized for the crtc.

So if the primary plane is smaller than the crtc window (because we use
plane hw for compositing, or maybe primary plane shows a vidoe with black
borders or whatever), then the primary plane shadow isn't the right size.

And yes this means some surgery, vkms isn't there yet at all. But still it
would mean we're going right here, but then have to backtrack before we
can go left again. So a detour.

Also I don't think any other driver will ever need this, you really only
need it when you want to composite planes in software - which defeats the
purpose of planes. Except when the goal of your driver is to be a software
model.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete

2021-07-05 Thread Daniel Vetter
On Mon, Jul 05, 2021 at 09:34:22AM +0100, Tvrtko Ursulin wrote:
> On 02/07/2021 20:22, Daniel Vetter wrote:
> > On Fri, Jul 02, 2021 at 03:31:08PM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 01/07/2021 16:10, Matthew Auld wrote:
> > > > The CPU domain should be static for discrete, and on DG1 we don't need
> > > > any flushing since everything is already coherent, so really all this
> > > 
> > > Knowledge of the write combine buffer is assumed to be had by anyone 
> > > involved?
> 
> What about this question? For discrete userspace will assume WC and will
> know how to flush WC buffer? Or it is assumed the flush will be hit
> implicitly? Will this be documented?

The kernel doesn't pick something at random, it's just fixed. So yeah
userspace needs to flush the WC buffer or anything else.

> > > > does is an object wait, for which we have an ioctl. Longer term the
> > > > desired caching should be an immutable creation time property for the
> > > > BO, which can be set with something like gem_create_ext.
> > > > 
> > > > One other user is iris + userptr, which uses the set_domain to probe all
> > > > the pages to check if the GUP succeeds, however keeping the set_domain
> > > > around just for that seems rather scuffed. We could equally just submit
> > > > a dummy batch, which should hopefully be good enough, otherwise adding a
> > > > new creation time flag for userptr might be an option. Although longer
> > > > term we will also have vm_bind, which should also be a nice fit for
> > > > this, so adding a whole new flag is likely overkill.
> > > 
> > > Execbuf sounds horrible. But it all reminds me of past work by Chris 
> > > which is surprisingly hard to find in the archives. Patches like:
> > > 
> > > commit 7706a433388016983052a27c0fd74a64b1897ae7
> > > Author: Chris Wilson 
> > > Date:   Wed Nov 8 17:04:07 2017 +
> > > 
> > >  drm/i915/userptr: Probe existence of backing struct pages upon 
> > > creation
> > >  Jason Ekstrand requested a more efficient method than 
> > > userptr+set-domain
> > >  to determine if the userptr object was backed by a complete set of 
> > > pages
> > >  upon creation. To be more efficient than simply populating the 
> > > userptr
> > >  using get_user_pages() (as done by the call to set-domain or 
> > > execbuf),
> > >  we can walk the tree of vm_area_struct and check for gaps or vma not
> > >  backed by struct page (VM_PFNMAP). The question is how to handle
> > >  VM_MIXEDMAP which may be either struct page or pfn backed...
> > > 
> > > commit 7ca21d3390eec23db99b8131ed18bc036efaba18
> > > Author: Chris Wilson 
> > > Date:   Wed Nov 8 17:48:22 2017 +
> > > 
> > >  drm/i915/userptr: Add a flag to populate the userptr on creation
> > >  Acquiring the backing struct pages for the userptr range is not free;
> > >  the first client for userptr would insist on frequently creating 
> > > userptr
> > >  objects ahead of time and not use them. For that first client, 
> > > deferring
> > >  the cost of populating the userptr (calling get_user_pages()) to the
> > >  actual execbuf was a substantial improvement. However, not all 
> > > clients
> > >  are the same, and most would like to validate that the userptr is 
> > > valid
> > >  and backed by struct pages upon creation, so offer a
> > >  I915_USERPTR_POPULATE flag to do just that.
> > >  Note that big difference between I915_USERPTR_POPULATE and the 
> > > deferred
> > >  scheme is that POPULATE is guaranteed to be synchronous, the result 
> > > is
> > >  known before the ioctl returns (and the handle exposed). However, 
> > > due to
> > >  system memory pressure, the object may be paged out before use,
> > >  requiring them to be paged back in on execbuf (as may always happen).
> > > 
> > > At least with the first one I think I was skeptical, since probing at
> > > point A makes a weak test versus userptr getting used at point B.
> > > Populate is kind of same really when user controls the backing store. At
> > > least these two arguments I think stand if we are trying to sell these
> > > flags as validation. But if the idea is limited to pure preload, with no
> > > guarantees that it keeps working by time of real use, then I guess it
> > > may be passable.
> > 
> > Well we've thrown this out again because there was no userspace. But if
> > this is requested by mesa, then the _PROBE flag should be entirely
> > sufficient.
> 
> Why probe and not populate? For me probe is weak and implies to give a
> guarantee which cannot really be given. If the pointer is not trusted, there
> is no reason to think it cannot go bad between creating the buffer (probe)
> and actual use. Populate on the other hand could be described as simply
> instantiate the backing store with the same caveat mentioned. No guarantees
> about the future validity of the backing store in either case should be
> implied.

The pointer can also go bad with populate. The on

Re: [PATCH v7 0/5] drm: address potential UAF bugs with drm_master ptrs

2021-07-05 Thread Daniel Vetter
On Mon, Jul 05, 2021 at 10:15:45AM +0800, Desmond Cheong Zhi Xi wrote:
> On 3/7/21 3:07 am, Daniel Vetter wrote:
> > On Fri, Jul 02, 2021 at 12:53:53AM +0800, Desmond Cheong Zhi Xi wrote:
> > > This patch series addresses potential use-after-free errors when 
> > > dereferencing pointers to struct drm_master. These were identified after 
> > > one such bug was caught by Syzbot in drm_getunique():
> > > https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803
> > > 
> > > The series is broken up into five patches:
> > > 
> > > 1. Move a call to drm_is_current_master() out from a section locked by 
> > > &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not 
> > > apply to stable.
> > > 
> > > 2. Move a call to _drm_lease_held() out from the section locked by 
> > > &dev->mode_config.idr_mutex in __drm_mode_object_find().
> > > 
> > > 3. Implement a locked version of drm_is_current_master() function that's 
> > > used within drm_auth.c.
> > > 
> > > 4. Serialize drm_file.master by introducing a new lock that's held 
> > > whenever the value of drm_file.master changes.
> > > 
> > > 5. Identify areas in drm_lease.c where pointers to struct drm_master are 
> > > dereferenced, and ensure that the master pointers are not freed during 
> > > use.
> > > 
> > > Changes in v6 -> v7:
> > > - Patch 2:
> > > Modify code alignment as suggested by the intel-gfx CI.
> > > 
> > > Update commit message based on the changes to patch 5.
> > > 
> > > - Patch 4:
> > > Add patch 4 to the series. This patch adds a new lock to serialize 
> > > drm_file.master, in response to the lockdep splat by the intel-gfx CI.
> > > 
> > > - Patch 5:
> > > Move kerneldoc comment about protecting drm_file.master with 
> > > drm_device.master_mutex into patch 4.
> > > 
> > > Update drm_file_get_master to use the new drm_file.master_lock instead of 
> > > drm_device.master_mutex, in response to the lockdep splat by the 
> > > intel-gfx CI.
> > 
> > So there's another one now because master->leases is protected by the
> > mode_config.idr_mutex, and that's a bit awkward to untangle.
> > 
> > Also I'm really surprised that there was now lockdep through the atomic
> > code anywhere. The reason seems to be that somehow CI reboot first before
> > it managed to run any of the kms_atomic tests, and we can only hit this
> > when we go through the atomic kms ioctl, the legacy kms ioctl don't have
> > that specific issue.
> > 
> > Anyway I think this approach doesn't look too workable, and we need
> > something new.
> > 
> > But first things first: Are you still on board working on this? You
> > started with a simple patch to fix a UAF bug, now we're deep into
> > reworking tricky locking ... If you feel like you want out I'm totally
> > fine with that.
> > 
> 
> Hi Daniel,
> 
> Thanks for asking, but I'm committed to seeing this through :) In fact, I
> really appreciate all your guidance and patience as the simple patch evolved
> into the current state of things.

Cool, it's definitely been fun trying to figure out a good solution for
this tricky problem here :-)

> > Anyway, I think we need to split drm_device->master_mutex up into two
> > parts:
> > 
> > - One part that protects the actual access/changes, which I think for
> >simplicity we'll just leave as the current lock. That lock is a very
> >inner lock, since for the drm_lease.c stuff it has to nest within
> >mode_config.idr_mutex even.
> > 
> > - Now the issue with checking master status/leases/whatever as an
> >innermost lock is that you can race, it's a classic time of check vs
> >time of use race: By the time we actually use the thing we validate
> >we'er allowed to use, we might now have access anymore. There's two
> >reasons for that:
> > 
> >* DROPMASTER ioctl could remove the master rights, which removes access
> >  rights also for all leases
> > 
> >* REVOKE_LEASE ioctl can do the same but only for a specific lease
> > 
> >This is the thing we're trying to protect against in fbcon code, but
> >that's very spotty protection because all the ioctls by other users
> >aren't actually protected against this.
> > 
> >So I think for this we need some kind of big reader lock.
> > 
> > Now for the implementation, there's a few things:
> > 
> > - I think best option for this big reader lock would be to just use srcu.
> >We only need to flush out all current readers when we drop master or
> >revoke a lease, so synchronize_srcu is perfectly good enough for this
> >purpose.
> > 
> > - The fbdev code would switch over to srcu in
> >drm_master_internal_acquire() and drm_master_internal_release(). Ofc
> >within drm_master_internal_acquire we'd still need to check master
> >status with the normal master_mutex.
> > 
> > - While we revamp all this we should fix the ioctl checks in drm_ioctl.c.
> >Just noticed that drm_ioctl_permit() could and should be unexported,
> >last user was remove

Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete

2021-07-05 Thread Tvrtko Ursulin



On 05/07/2021 15:25, Daniel Vetter wrote:

On Mon, Jul 05, 2021 at 09:34:22AM +0100, Tvrtko Ursulin wrote:

On 02/07/2021 20:22, Daniel Vetter wrote:

On Fri, Jul 02, 2021 at 03:31:08PM +0100, Tvrtko Ursulin wrote:


On 01/07/2021 16:10, Matthew Auld wrote:

The CPU domain should be static for discrete, and on DG1 we don't need
any flushing since everything is already coherent, so really all this


Knowledge of the write combine buffer is assumed to be had by anyone involved?


What about this question? For discrete userspace will assume WC and will
know how to flush WC buffer? Or it is assumed the flush will be hit
implicitly? Will this be documented?


The kernel doesn't pick something at random, it's just fixed. So yeah
userspace needs to flush the WC buffer or anything else.


Right, so does that needs to be documented somewhere or thinking is it 
is common knowledge? Probably does to be mentioned in conjunction with 
the mmap usage.



does is an object wait, for which we have an ioctl. Longer term the
desired caching should be an immutable creation time property for the
BO, which can be set with something like gem_create_ext.

One other user is iris + userptr, which uses the set_domain to probe all
the pages to check if the GUP succeeds, however keeping the set_domain
around just for that seems rather scuffed. We could equally just submit
a dummy batch, which should hopefully be good enough, otherwise adding a
new creation time flag for userptr might be an option. Although longer
term we will also have vm_bind, which should also be a nice fit for
this, so adding a whole new flag is likely overkill.


Execbuf sounds horrible. But it all reminds me of past work by Chris which is 
surprisingly hard to find in the archives. Patches like:

commit 7706a433388016983052a27c0fd74a64b1897ae7
Author: Chris Wilson 
Date:   Wed Nov 8 17:04:07 2017 +

  drm/i915/userptr: Probe existence of backing struct pages upon creation
  Jason Ekstrand requested a more efficient method than userptr+set-domain
  to determine if the userptr object was backed by a complete set of pages
  upon creation. To be more efficient than simply populating the userptr
  using get_user_pages() (as done by the call to set-domain or execbuf),
  we can walk the tree of vm_area_struct and check for gaps or vma not
  backed by struct page (VM_PFNMAP). The question is how to handle
  VM_MIXEDMAP which may be either struct page or pfn backed...

commit 7ca21d3390eec23db99b8131ed18bc036efaba18
Author: Chris Wilson 
Date:   Wed Nov 8 17:48:22 2017 +

  drm/i915/userptr: Add a flag to populate the userptr on creation
  Acquiring the backing struct pages for the userptr range is not free;
  the first client for userptr would insist on frequently creating userptr
  objects ahead of time and not use them. For that first client, deferring
  the cost of populating the userptr (calling get_user_pages()) to the
  actual execbuf was a substantial improvement. However, not all clients
  are the same, and most would like to validate that the userptr is valid
  and backed by struct pages upon creation, so offer a
  I915_USERPTR_POPULATE flag to do just that.
  Note that big difference between I915_USERPTR_POPULATE and the deferred
  scheme is that POPULATE is guaranteed to be synchronous, the result is
  known before the ioctl returns (and the handle exposed). However, due to
  system memory pressure, the object may be paged out before use,
  requiring them to be paged back in on execbuf (as may always happen).

At least with the first one I think I was skeptical, since probing at
point A makes a weak test versus userptr getting used at point B.
Populate is kind of same really when user controls the backing store. At
least these two arguments I think stand if we are trying to sell these
flags as validation. But if the idea is limited to pure preload, with no
guarantees that it keeps working by time of real use, then I guess it
may be passable.


Well we've thrown this out again because there was no userspace. But if
this is requested by mesa, then the _PROBE flag should be entirely
sufficient.


Why probe and not populate? For me probe is weak and implies to give a
guarantee which cannot really be given. If the pointer is not trusted, there
is no reason to think it cannot go bad between creating the buffer (probe)
and actual use. Populate on the other hand could be described as simply
instantiate the backing store with the same caveat mentioned. No guarantees
about the future validity of the backing store in either case should be
implied.


The pointer can also go bad with populate. The only thing probe guarantees
is that "right now I should be able to call get_user_pages and the only
reasons it could fail is ENOMEM". Which is pretty much the same as we
guarantee when we create a normal object.

Neither does guarantee that by the time you execbuf you won't hit an

Re: Questions over DSI within DRM.

2021-07-05 Thread Dave Stevenson
Hi Laurent.

Thanks to you, Jani, and Jagan for your replies.
I'm replying to Laurent's email as it has the greatest number of
discussion points.

Noted that NWL DSI and Exynos DSI have undergone the conversion to
bridges - hopefully I can take those as vague examples.

On Fri, 2 Jul 2021 at 17:42, Laurent Pinchart
 wrote:
>
> Hi Dave,
>
> (Expanding the CC list a bit)
>
> On Fri, Jul 02, 2021 at 12:03:31PM +0100, Dave Stevenson wrote:
> > Hi All
> >
> > I'm trying to get DSI devices working reliably on the Raspberry Pi,
> > but I'm hitting a number of places where it isn't clear as to the
> > expected behaviour within DRM.
>
> Not a surprise. I dread reading the rest of this e-mail though :-)
>
> > Power on state. Many devices want the DSI clock and/or data lanes in
> > LP-11 state when they are powered up.
>
> When they are powered up, or when they are enabled ?

Generally powered on, so it wants to be in before pre_enable.

Taking SN65DSI83 as an example, the datasheet section 7.4.2
Initialization Sequence says
"After power is applied and stable, the DSI CLK lanes MUST be in HS
state and the DSI data lanes MUST be driven to LP11 state"

It was discussed when Marek's was developing the merged driver that it
seems to be happy to violate this initialisation sequence by starting
video on the data lanes before configuring the DSI83, but is that more
luck than anything else?

Reading the functional spec for Toshiba TC358762 (DSI to DPI bridge
used on the Pi 7" DSI panel), it says:
"1.HW:  Put  D-PHY  data  lane  0 to  ULPS  state
  a)  D-PHY  clock  lane  must  continue  to  supply  clock  for
minimum  20  clocks  before put  the  clock  lane  to  idle
2. HW:  Enable  Clock  Lane  and  Data  Lane  0 D-PHYs
3. HW:  Disable  Data  Lane  1 D-PHY
4. HW:  Transition  to  Operation  mode  when  Data  Lane  0 D-PHY  is
 not  in  ULPS  state.
  a) D-PHY clock lane must supply before Data Lane 0 D-PHY changes to LP state
  b) Host must wait minimum 5200 HSBCLK clock for CORE2 Power to be
stable before transfer any DSI packets."

So there we have a timing constraint from pushing the clock lane into
HS before we can do anything else. Quick calcs do appear to say this
is around 0.5msecs, so possibly it can be ignored.

> > With the normal calling sequence of:
> > - panel/bridge pre_enable calls from connector towards the encoder.
> > - encoder enable which also enables video.
> > - panel/bridge enable calls from encoder to connector.
> > there is no point at which the DSI tx is initialised but not
> > transmitting video. What DSI states are expected to be adopted at each
> > point?
>
> That's undefined I'm afraid, and it should be documented. The upside is
> that you can propose the behaviour that you need :-)

Can we reduce it to one behaviour that is valid for all devices? I
suspect we need at least some set of options :-/

> > On a similar theme, some devices want the clock lane in HS mode early
> > so they can use it in place of an external oscillator, but the data
> > lanes still in LP-11. There appears to be no way for the
> > display/bridge to signal this requirement or it be achieved.
>
> You're right. A lng time ago, the omapdrm driver had an internal
> infrastructure that didn't use drm_bridge or drm_panel and instead
> required omapdrm-specific drivers for those components. It used to model
> the display pipeline in a different way than drm_bridge, with the sync
> explicitly setting the source state. A DSI sink could thus control its
> enable sequence, interleaving programming of the sink with control of
> the source.
>
> Migrating omapdrm to the drm_bridge model took a really large effort,
> which makes me believe that transitioning the whole subsystem to
> sink-controlled sources would be close to impossible. We could add
> DSI-specific operations, or add another enable bridge operation
> (post_pre_enable ? :-D). Neither would scale, but it may be enough.

I haven't thought it through for all generic cases, but I suspect it's
more a pre_pre_enable that is needed to initialise the PHY etc,
probably from source to sink.
If the panel/bridge can set a flag that can be checked at this point
for whether an early clock is required or not, I think that allows us
to comply with the requirements for a large number of panels/bridges
(LP-11 vs HS config for clock and or data lanes before pre_enable is
called).

pre_enable retains the current behaviour (initialise the chain from
sink to source).
enable then actually starts sending video and enabling outputs.

When I discussed this briefly with Maxime there was a suggestion of
using pm_runtime to be able to power up the pipeline as a whole. If
the bridge driver can use pm_runtime to power up the PHY when
required, then that may solve the issue, however I know too little of
the details to say whether that is actually practical.

> > host_transfer calls can supposedly be made at any time, however unless
> > MIPI_DSI_MSG_USE_LPM is set in the message then we're meant to 

Re: [PATCH v4 12/17] drm/uAPI: Add "preferred color format" drm property as setting for userspace

2021-07-05 Thread Werner Sembach


Am 01.07.21 um 15:24 schrieb Pekka Paalanen:
> On Thu, 1 Jul 2021 14:50:13 +0200
> Werner Sembach  wrote:
>
>> Am 01.07.21 um 10:07 schrieb Pekka Paalanen:
>>
>>> On Wed, 30 Jun 2021 11:20:18 +0200
>>> Werner Sembach  wrote:
>>>  
 Am 30.06.21 um 10:41 schrieb Pekka Paalanen:
  
> On Tue, 29 Jun 2021 13:39:18 +0200
> Werner Sembach  wrote:
>
>> Am 29.06.21 um 13:17 schrieb Pekka Paalanen:
>>> On Tue, 29 Jun 2021 08:12:54 +
>>> Simon Ser  wrote:
>>>   
 On Tuesday, June 22nd, 2021 at 09:15, Pekka Paalanen 
  wrote:
   
> yes, I think this makes sense, even if it is a property that one can't
> tell for sure what it does before hand.
>
> Using a pair of properties, preference and active, to ask for 
> something
> and then check what actually worked is good for reducing the
> combinatorial explosion caused by needing to "atomic TEST_ONLY commit"
> test different KMS configurations. Userspace has a better chance of
> finding a configuration that is possible.
>
> OTOH, this has the problem than in UI one cannot tell the user in
> advance which options are truly possible. Given that KMS properties 
> are
> rarely completely independent, and in this case known to depend on
> several other KMS properties, I think it is good enough to know after
> the fact.
>
> If a driver does not use what userspace prefers, there is no way to
> understand why, or what else to change to make it happen. That problem
> exists anyway, because TEST_ONLY commits do not give useful feedback
> but only a yes/no.
 By submitting incremental atomic reqs with TEST_ONLY (i.e. only 
 changing one
 property at a time), user-space can discover which property makes the 
 atomic
 commit fail.
>>> That works if the properties are independent of each other. Color
>>> range, color format, bpc and more may all be interconnected,
>>> allowing only certain combinations to work.
>>>
>>> If all these properties have "auto" setting too, then it would be
>>> possible to probe each property individually, but that still does not
>>> tell which combinations are valid.
>>>
>>> If you probe towards a certain configuration by setting the properties
>>> one by one, then depending on the order you pick the properties, you
>>> may come to a different conclusion on which property breaks the
>>> configuration.
>> My mind crossed another point that must be considered: When plugin in
>> a Monitor a list of possible Resolutions+Framerate combinations is
>> created for xrandr and other userspace (I guess by atomic checks? but
>> I don't know).
> Hi,
>
> I would not think so, but I hope to be corrected if I'm wrong.
>
> My belief is that the driver collects a list of modes from EDID, some
> standard modes, and maybe some other hardcoded modes, and then
> validates each entry against all the known limitations like vertical
> and horizontal frequency limits, discarding modes that do not fit.
>
> Not all limitations are known during that phase, which is why KMS
> property "link-status" exists. When userspace actually programs a mode
> (not a TEST_ONLY commit), the link training may fail. The kernel prunes
> the mode from the list and sets the link status property to signal
> failure, and sends a hotplug uevent. Userspace needs to re-check the
> mode list and try again.
>
> That is a generic escape hatch for when TEST_ONLY commit succeeds, but
> in reality the hardware cannot do it, you just cannot know until you
> actually try for real. It causes end user visible flicker if it happens
> on an already running connector, but since it usually happens when
> turning a connector on to begin with, there is no flicker to be seen,
> just a small delay in finding a mode that works.
>
>> During this drm
>> properties are already considered, which is no problem atm because as
>> far as i can tell there is currently no drm property that would make
>> a certain Resolutions+Framerate combination unreachable that would be
>> possible with everything on default.
> I would not expect KMS properties to be considered at all. It would
> reject modes that are actually possible if the some KMS properties were
> changed. So at least going forward, current KMS property values cannot
> factor in.
 At least the debugfs variable "force_yuv420_output" did change the 
 available modes here: 
 https://elixir.bootlin.com/linux/v5.13/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L5165
  
 before my patch 
 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/c

Re: [PATCH] drm/vc4: dsi: Only register our component once a DSI device is attached

2021-07-05 Thread Dave Stevenson
Hi Laurent

On Fri, 2 Jul 2021 at 21:19, Laurent Pinchart
 wrote:
>
> Hi Dave,
>
> On Fri, Jul 02, 2021 at 06:44:22PM +0100, Dave Stevenson wrote:
> > On Fri, 2 Jul 2021 at 17:47, Laurent Pinchart wrote:
> > > On Mon, Jun 21, 2021 at 04:59:51PM +0300, Laurent Pinchart wrote:
> > >> On Mon, Jun 21, 2021 at 04:09:05PM +0300, Laurent Pinchart wrote:
> > >>> On Mon, Jun 21, 2021 at 03:56:16PM +0300, Laurent Pinchart wrote:
> >  On Mon, Jun 21, 2021 at 12:49:14PM +0100, Dave Stevenson wrote:
> > > On Sun, 20 Jun 2021 at 23:49, Laurent Pinchart wrote:
> > >> On Sun, Jun 20, 2021 at 09:42:27PM +0300, Laurent Pinchart wrote:
> > >>> On Sun, Jun 20, 2021 at 03:29:03PM +0100, Dave Stevenson wrote:
> >  On Sun, 20 Jun 2021 at 04:26, Laurent Pinchart wrote:
> > >
> > > Hi Maxime,
> > >
> > > I'm testing this, and I'm afraid it causes an issue with all the
> > > I2C-controlled bridges. I'm focussing on the newly merged 
> > > ti-sn65dsi83
> > > driver at the moment, but other are affected the same way.
> > >
> > > With this patch, the DSI component is only added when the DSI 
> > > device is
> > > attached to the host with mipi_dsi_attach(). In the ti-sn65dsi83 
> > > driver,
> > > this happens in the bridge attach callback, which is called when 
> > > the
> > > bridge is attached by a call to drm_bridge_attach() in 
> > > vc4_dsi_bind().
> > > This creates a circular dependency, and the DRM/KMS device is 
> > > never
> > > created.
> > >
> > > How should this be solved ? Dave, I think you have shown an 
> > > interest in
> > > the sn65dsi83 recently, any help would be appreciated. On a side 
> > > note,
> > > I've tested the ti-sn65dsi83 driver on a v5.10 RPi kernel, 
> > > without much
> > > success (on top of commit e1499baa0b0c I get a very weird frame 
> > > rate -
> > > 147 fps of 99 fps instead of 60 fps - and nothing on the screen, 
> > > and on
> > > top of the latest v5.10 RPi branch, I get lock-related warnings 
> > > at every
> > > page flip), which is why I tried v5.12 and noticed this patch. Is 
> > > it
> > > worth trying to bring up the display on the v5.10 RPi kernel in 
> > > parallel
> > > to fixing the issue introduced in this patch, or is DSI known to 
> > > be
> > > broken there ?
> > 
> >  I've been looking at SN65DSI83/4, but as I don't have any hardware
> >  I've largely been suggesting things to try to those on the forums 
> >  who
> >  do [1].
> > 
> >  My branch at 
> >  https://github.com/6by9/linux/tree/rpi-5.10.y-sn65dsi8x-marek
> >  is the latest one I've worked on. It's rpi-5.10.y with Marek's 
> >  driver
> >  cherry-picked, and an overlay and simple-panel definition by 
> >  others.
> >  It also has a rework for vc4_dsi to use pm_runtime, instead of
> >  breaking up the DSI bridge chain (which is flawed as it never calls
> >  the bridge mode_set or mode_valid functions which sn65dsi83 relies
> >  on).
> > >>
> > >> I've looked at that, and I'm afraid it doesn't go in the right
> > >> direction. The drm_encoder.crtc field is deprecated and documented as
> > >> only meaningful for non-atomic drivers. You're not introducing its
> > >> usage, but moving the configuration code from .enable() to the runtime
> > >> PM resume handler will make it impossible to fix this. The driver should
> > >> instead move to the .atomic_enable() function. If you need
> > >> enable/pre_enable in the DSI encoder, then you should turn it into a
> > >> drm_bridge.
> > >
> > > Is this something you're looking at by any chance ? I'm testing the
> > > ti-sn65dsi83 driver with VC4. I've spent a couple of hours debugging,
> > > only to realise that the vc4_dsi driver (before the rework you mention
> > > above) doesn't call .mode_set() on the bridges... Applying my sn65dsi83
> > > series that removes .mode_set() didn't help much as vc4_dsi doesn't call
> > > the atomic operations either :-) I'll test your branch now.
> >
> > This is one of the reasons for my email earlier today - thank you for
> > your reply.
> >
> > The current mainline vc4_dsi driver deliberately breaks the bridge
> > chain so that it gets called before the panel/bridge pre_enable and
> > can power everything up, therefore pre_enable can call host_transfer
> > to configure the panel/bridge over the DSI interface.
> > However we've both noted that it doesn't forward on the mode_set and
> > mode_valid calls, and my investigations say that it doesn't have
> > enough information to make those calls.
> >
> > My branch returns the chain to normal, and tries to use pm_runtime to
> > power up the PHY 

Re: [PATCH libdrm] headers: drm: Sync with drm-next

2021-07-05 Thread Mario Kleiner
On Mon, Jul 5, 2021 at 9:39 AM Michel Dänzer  wrote:
>
> On 2021-07-03 9:59 p.m., Mario Kleiner wrote:
> > Generated using make headers_install from the drm-next
> > tree - git://anongit.freedesktop.org/drm/drm
> > branch - drm-next
> > commit - 8a02ea42bc1d4c448caf1bab0e05899dad503f74
> >
> > The changes were as follows (shortlog from
> > b10733527bfd864605c33ab2e9a886eec317ec39..HEAD):
>
> libdrm uses GitLab merge requests now: 
> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests
>

Thanks for the info Michel. Merge request sent out. Maybe the
Contributing doc needs some updates, as it describes a patch + mailing
list workflow?

-mario


Re: [PATCH v4 2/2] habanalabs: add support for dma-buf exporter

2021-07-05 Thread Jason Gunthorpe
On Mon, Jul 05, 2021 at 04:03:14PM +0300, Oded Gabbay wrote:

> + rc = sg_alloc_table(*sgt, nents, GFP_KERNEL | __GFP_ZERO);
> + if (rc)
> + goto error_free;

If you are not going to include a CPU list then I suggest setting
sg_table->orig_nents == 0

And using only the nents which is the length of the DMA list.

At least it gives some hope that other parts of the system could
detect this.

> +
> + /* Merge pages and put them into the scatterlist */
> + cur_page = 0;
> + for_each_sgtable_sg((*sgt), sg, i) {

for_each_sgtable_sg should never be used when working with
sg_dma_address() type stuff, here and everywhere else. The DMA list
should be iterated using the for_each_sgtable_dma_sg() macro.

> + /* In case we got a large memory area to export, we need to divide it
> +  * to smaller areas because each entry in the dmabuf sgt can only
> +  * describe unsigned int.
> +  */

Huh? This is forming a SGL, it should follow the SGL rules which means
you have to fragment based on the dma_get_max_seg_size() of the
importer device.

> + hl_dmabuf->pages = kcalloc(hl_dmabuf->npages, sizeof(*hl_dmabuf->pages),
> + GFP_KERNEL);
> + if (!hl_dmabuf->pages) {
> + rc = -ENOMEM;
> + goto err_free_dmabuf_wrapper;
> + }

Why not just create the SGL directly? Is there a reason it needs to
make a page list?

Jason


Re: Xorg doesn't work anymore after the latest DRM updates

2021-07-05 Thread Christian Zigotzky

Hi Nirmoy,

Many thanks for this information. We will test this patch asap.

Have a nice day,
Christian

On 05 July 2021 at 10:26pm, Nirmoy wrote:
> Hi Christian,
>
>
> This issue looks similar to the one Mikel Rychliski fixed recently  : 
https://patchwork.freedesktop.org/patch/440791. Let us know if this helps.

>
>
> Regards,
>
> Nirmoy
>
> On 7/3/2021 9:30 AM, Christian Zigotzky wrote:
>> Hi All,
>>
>> Xorg doesn't work anymore after the latest DRM updates. [1]
>>
>> Error messages:
>>
>> Jul 03 08:54:51 Fienix systemd[1]: Starting Light Display Manager...
>> Jul 03 08:54:51 Fienix systemd[1]: Started Light Display Manager.
>> Jul 03 08:54:51 Fienix kernel: BUG: Kernel NULL pointer dereference 
on read at 0x0010
>> Jul 03 08:54:51 Fienix kernel: Faulting instruction address: 
0xc0630750
>> Jul 03 08:54:51 Fienix kernel: Oops: Kernel access of bad area, sig: 
11 [#1]
>> Jul 03 08:54:51 Fienix kernel: BE PAGE_SIZE=4K PREEMPT SMP NR_CPUS=4 
CoreNet Generic
>> Jul 03 08:54:51 Fienix kernel: Modules linked in: algif_skcipher 
bnep tuner_simple tuner_types tea5767 tuner tda7432 tvaudio msp3400 bttv 
tea575x tveeprom videobuf_dma_sg videobuf_core rc_core videodev mc btusb 
btrtl btbcm btintel bluetooth ecdh_generic ecc uio_pdrv_genirq uio
>> Jul 03 08:54:51 Fienix kernel: CPU: 3 PID: 4300 Comm: Xorg.wrap Not 
tainted 5.14.0-a3_A-EON_X5000-07637-g3dbdb38e2869-dirty #1
>> Jul 03 08:54:51 Fienix kernel: NIP:  c0630750 LR: 
c060fedc CTR: c0630728
>> Jul 03 08:54:51 Fienix kernel: REGS: c0008d903470 TRAP: 0300 Not 
tainted  (5.14.0-a3_A-EON_X5000-07637-g3dbdb38e2869-dirty)
>> Jul 03 08:54:51 Fienix kernel: MSR:  80029002   
CR: 2222  XER: 2000
>> Jul 03 08:54:51 Fienix kernel: DEAR: 0010 ESR: 
 IRQMASK: 0
>>    GPR00: c060fedc 
c0008d903710 c190c400 c00085d59c00
>>    GPR04: c0008d9035b8 
 c000870a4900 c00085b62d00
>>    GPR08: 000f 
 c0630728 0003
>>    GPR12: 2222 
c0003fffeac0 ffe51070 0086007c
>>    GPR16: 00862820 
ffb7ec68  
>>    GPR20: c04064a0 
00450088 ffca79e4 5deadbeef122
>>    GPR24: 5deadbeef100 
 c000876028f0 c00080bd4000
>>    GPR28: c00087603c48 
c00085d59d78 c00085d59c00 c00085d59c78
>> Jul 03 08:54:51 Fienix kernel: NIP [c0630750] 
.radeon_ttm_bo_destroy+0x28/0xc0
>> Jul 03 08:54:51 Fienix kernel: LR [c060fedc] 
.ttm_bo_put+0x2ec/0x344

>> Jul 03 08:54:51 Fienix kernel: Call Trace:
>> Jul 03 08:54:51 Fienix kernel: [c0008d903710] [c060fbe4] 
.ttm_bo_cleanup_memtype_use+0x54/0x60 (unreliable)
>> Jul 03 08:54:51 Fienix kernel: [c0008d903790] [c060fedc] 
.ttm_bo_put+0x2ec/0x344
>> Jul 03 08:54:51 Fienix kernel: [c0008d903820] [c0630b50] 
.radeon_bo_unref+0x28/0x3c
>> Jul 03 08:54:51 Fienix kernel: [c0008d9038a0] [c06d1f6c] 
.radeon_vm_fini+0x1b0/0x1b8
>> Jul 03 08:54:51 Fienix kernel: [c0008d903940] [c0618e38] 
.radeon_driver_postclose_kms+0x128/0x178
>> Jul 03 08:54:51 Fienix kernel: [c0008d9039e0] [c05deb14] 
.drm_file_free+0x1d8/0x278
>> Jul 03 08:54:51 Fienix kernel: [c0008d903aa0] [c05def00] 
.drm_release+0x64/0xc8
>> Jul 03 08:54:51 Fienix kernel: [c0008d903b30] [c017636c] 
.__fput+0x11c/0x25c
>> Jul 03 08:54:51 Fienix kernel: [c0008d903bd0] [c008b1e8] 
.task_work_run+0xa4/0xbc
>> Jul 03 08:54:51 Fienix kernel: [c0008d903c70] [c0004bf4] 
.do_notify_resume+0x144/0x2f0
>> Jul 03 08:54:51 Fienix kernel: [c0008d903d70] [c000b380] 
.syscall_exit_prepare+0x110/0x130
>> Jul 03 08:54:51 Fienix kernel: [c0008d903e10] [c688] 
system_call_common+0x100/0x1fc

>> Jul 03 08:54:51 Fienix kernel: --- interrupt: c00 at 0x3f4f58
>> Jul 03 08:54:51 Fienix kernel: NIP:  003f4f58 LR: 
003f4f2c CTR: 
>> Jul 03 08:54:51 Fienix kernel: REGS: c0008d903e80 TRAP: 0c00 Not 
tainted  (5.14.0-a3_A-EON_X5000-07637-g3dbdb38e2869-dirty)
>> Jul 03 08:54:51 Fienix kernel: MSR:  0002d002   
CR: 2420  XER: 

>> Jul 03 08:54:51 Fienix kernel: IRQMASK: 0
>>    GPR00: 0006 
ffca66a0 f798a310 
>>    GPR04:  
  
>>    GPR08:  
  
>>    GPR12:  
0044fff4 ffe5

[Bug 213391] AMDGPU retries page fault with some specific processes amdgpu and sometimes followed [gfxhub0] retry page fault until *ERROR* ring gfx timeout, but soft recovered

2021-07-05 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213391

--- Comment #29 from Leandro Jacques (ls...@yahoo.com) ---
How to file a bug to the linux-firmware project for the amdgpu driver? After
the downgrade I haven't experienced any issues anymore.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 209457] AMDGPU resume fail with RX 580 GPU

2021-07-05 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=209457

--- Comment #31 from Leandro Jacques (ls...@yahoo.com) ---
How to file a bug to the linux-firmware project for the amdgpu driver? After
the downgrade I haven't experienced any issues anymore.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-07-05 Thread Will Deacon
Hi Nathan,

I may have just spotted something in these logs...

On Fri, Jul 02, 2021 at 10:55:17PM -0700, Nathan Chancellor wrote:
> [2.340956] pci :0c:00.1: Adding to iommu group 4
> [2.340996] pci :0c:00.2: Adding to iommu group 4
> [2.341038] pci :0c:00.3: Adding to iommu group 4
> [2.341078] pci :0c:00.4: Adding to iommu group 4
> [2.341122] pci :0c:00.6: Adding to iommu group 4
> [2.341163] pci :0d:00.0: Adding to iommu group 4
> [2.341203] pci :0d:00.1: Adding to iommu group 4
> [2.361821] pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40
> [2.361839] pci :00:00.2: AMD-Vi: Extended features 
> (0x206d73ef22254ade):
> [2.361846]  PPR X2APIC NX GT IA GA PC GA_vAPIC
> [2.361861] AMD-Vi: Interrupt remapping enabled
> [2.361865] AMD-Vi: Virtual APIC enabled
> [2.361870] AMD-Vi: X2APIC enabled
> [2.362272] AMD-Vi: Lazy IO/TLB flushing enabled

So at this point, the AMD IOMMU driver does:

swiotlb= (iommu_default_passthrough() || sme_me_mask) ? 1 : 0;

where 'swiotlb' is a global variable indicating whether or not swiotlb
is in use. It's picked up a bit later on by pci_swiotlb_late_init(), which
will call swiotlb_exit() if 'swiotlb' is false.

Now, that used to work fine, because swiotlb_exit() clears
'io_tlb_default_mem' to NULL, but now with the restricted DMA changes, I
think that all the devices which have successfully probed beforehand will
have stale pointers to the freed structure in their 'dev->dma_io_tlb_mem'
field.

Will


Re: [PATCH 0/4] vkms: Switch to shadow-buffered plane state

2021-07-05 Thread Melissa Wen
On 07/05, Daniel Vetter wrote:
> On Mon, Jul 05, 2021 at 12:05:28PM +0200, Thomas Zimmermann wrote:
> > Hi
> > 
> > Am 05.07.21 um 11:27 schrieb Daniel Vetter:
> > > On Mon, Jul 05, 2021 at 09:46:29AM +0200, Thomas Zimmermann wrote:
> > > > Vkms copies each plane's framebuffer into the output buffer; essentially
> > > > using a shadow buffer. DRM provides struct drm_shadow_plane_state, which
> > > > handles the details of mapping/unmapping shadow buffers into memory for
> > > > active planes.
> > > > 
> > > > Convert vkms to the helpers. Makes vkms use shared code and gives more
> > > > test exposure to shadow-plane helpers.
> > > > 
> > > > Thomas Zimmermann (4):
> > > >drm/gem: Export implementation of shadow-plane helpers
> > > >drm/vkms: Inherit plane state from struct drm_shadow_plane_state
> > > >drm/vkms: Let shadow-plane helpers prepare the plane's FB
> > > >drm/vkms: Use dma-buf mapping from shadow-plane state for composing
> > > 
> > > So I think right now this fits, but I think it'll mismit going forward: We
> > > don't really have a shadow-plane that we then toss to the hw, it's a
> > > shadow-crtc-area. Right now there's no difference, because we don't
> > > support positioning/scaling the primary plane. But that's all kinda stuff
> > > that's on the table.
> > > 
> > > But conceptually at least the compositioning buffer should bet part of the
> > > crtc, not of the primary plane.
> > > 
> > > So not sure what to do, but also coffee hasn't kicked in yet, so maybe I'm
> > > just confused.
> > 
> > I'm not sure if I understand your concern. Can you elaborate? The
> > compositing output buffer is not affected by this patchset. Only the input
> > frambuffers of the planes. Those are shadow buffers. AFAICT the composer
> > code memcpy's the primary plane and then blends the other planes on top.
> > Supporting transformation of the primary plane doesn't really change much
> > wrt to the vmaping of input fbs.
> 
> Yeah that's the current implementation, because that's easier. But
> fundamentally we don't need a copy of the input shadow plane, we need a
> scratch area that's sized for the crtc.

Maybe I'm missing something, but I am not sure the relevance for vkms to
switch to shadow-buffered plane. (?)

Btw, in terms of code changes, it works well and also vmap error stops
to happen (reported in the last update of todo list). This fix seems to
be a side-effect because it replaces the vkms_prepare_fb that got
problematic since (I guess) the last switch to shmem. 

Thanks,
Melissa

> So if the primary plane is smaller than the crtc window (because we use
> plane hw for compositing, or maybe primary plane shows a vidoe with black
> borders or whatever), then the primary plane shadow isn't the right size.
> 
> And yes this means some surgery, vkms isn't there yet at all. But still it
> would mean we're going right here, but then have to backtrack before we
> can go left again. So a detour.
> 
> Also I don't think any other driver will ever need this, you really only
> need it when you want to composite planes in software - which defeats the
> purpose of planes. Except when the goal of your driver is to be a software
> model.
> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


[PATCH RFC] drm/vc4: hdmi: Fix connector detect logic

2021-07-05 Thread Stefan Wahren
Commit "drm/vc4: hdmi: Convert to gpiod" changes the behavior of
vc4_hdmi_connector_detect() which results into CPU hangs in case there
is no HDMI connected. Let's restore the old behavior.

Reported-by: Nathan Chancellor 
Reported-by: Ojaswin Mujoo 
Fixes: 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod")
Signed-off-by: Stefan Wahren 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index aab1b36..cf8339c 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -168,9 +168,9 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, 
bool force)
 
WARN_ON(pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev));
 
-   if (vc4_hdmi->hpd_gpio &&
-   gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) {
-   connected = true;
+   if (vc4_hdmi->hpd_gpio) {
+   if (gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio))
+   connected = true;
} else if (drm_probe_ddc(vc4_hdmi->ddc)) {
connected = true;
} else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) {
-- 
2.7.4



Re: [PATCH 1/4] drm/amdgpu: fix a signedness bug in __verify_ras_table_checksum()

2021-07-05 Thread Luben Tuikov
Alex,

I think we should pull these through amd-staging-drm-next.

Regards,
Luben

On 2021-07-04 11:18 a.m., Luben Tuikov wrote:
> Series is,
> Reviewed-by: Luben Tuikov 
>
> Regards,
> Luben
>
> On 2021-07-03 5:44 a.m., Dan Carpenter wrote:
>> If amdgpu_eeprom_read() returns a negative error code then the error
>> handling checks:
>>
>>  if (res < buf_size) {
>>
>> The problem is that "buf_size" is a u32 so negative values are type
>> promoted to a high positive values and the condition is false.  Fix
>> this by changing the type of "buf_size" to int.
>>
>> Fixes: 79beb6114014 ("drm/amdgpu: Optimize EEPROM RAS table I/O")
>> Signed-off-by: Dan Carpenter 
>> ---
>> It's hard for me to tell the exact upper bound that "buf_size" can be,
>> but if it's over USHRT_MAX then we are well toasted.
>>
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> index fc70620369e4..f07a456506ef 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> @@ -978,9 +978,8 @@ const struct file_operations 
>> amdgpu_ras_debugfs_eeprom_table_ops = {
>>  static int __verify_ras_table_checksum(struct amdgpu_ras_eeprom_control 
>> *control)
>>  {
>>  struct amdgpu_device *adev = to_amdgpu_device(control);
>> -int res;
>> +int buf_size, res;
>>  u8  csum, *buf, *pp;
>> -u32 buf_size;
>>  
>>  buf_size = RAS_TABLE_HEADER_SIZE +
>>  control->ras_num_recs * RAS_TABLE_RECORD_SIZE;



Re: [PATCH 2/2] drm/rockchip: dw_hdmi: add rk3568 support

2021-07-05 Thread Huang Jiachai

add algea.cao and andy.yan

在 2021/7/5 22:03, Benjamin Gaignard 写道:

Add a new dw_hdmi_plat_data struct and new compatible for rk3568.

Signed-off-by: Benjamin Gaignard 
---
  drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c | 28 +
  1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c 
b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
index 830bdd5e9b7ce..5817c3a9fe64b 100644
--- a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
@@ -50,6 +50,10 @@
  #define RK3399_GRF_SOC_CON20  0x6250
  #define RK3399_HDMI_LCDC_SEL  BIT(6)
  
+#define RK3568_GRF_VO_CON1		0x0364

+#define RK3568_HDMI_SDAIN_MSK  BIT(15)
+#define RK3568_HDMI_SCLIN_MSK  BIT(14)
+
  #define HIWORD_UPDATE(val, mask)  (val | (mask) << 16)
  
  /**

@@ -467,6 +471,19 @@ static const struct dw_hdmi_plat_data rk3399_hdmi_drv_data 
= {
.use_drm_infoframe = true,
  };
  
+static struct rockchip_hdmi_chip_data rk3568_chip_data = {

+   .lcdsel_grf_reg = -1,
+};
+
+static const struct dw_hdmi_plat_data rk3568_hdmi_drv_data = {
+   .mode_valid = dw_hdmi_rockchip_mode_valid,
+   .mpll_cfg   = rockchip_mpll_cfg,
+   .cur_ctr= rockchip_cur_ctr,
+   .phy_config = rockchip_phy_config,
+   .phy_data = &rk3568_chip_data,
+   .use_drm_infoframe = true,
+};
+
  static const struct of_device_id dw_hdmi_rockchip_dt_ids[] = {
{ .compatible = "rockchip,rk3228-dw-hdmi",
  .data = &rk3228_hdmi_drv_data
@@ -480,6 +497,9 @@ static const struct of_device_id dw_hdmi_rockchip_dt_ids[] 
= {
{ .compatible = "rockchip,rk3399-dw-hdmi",
  .data = &rk3399_hdmi_drv_data
},
+   { .compatible = "rockchip,rk3568-dw-hdmi",
+ .data = &rk3568_hdmi_drv_data
+   },
{},
  };
  MODULE_DEVICE_TABLE(of, dw_hdmi_rockchip_dt_ids);
@@ -536,6 +556,14 @@ static int dw_hdmi_rockchip_bind(struct device *dev, 
struct device *master,
return ret;
}
  
+	if (hdmi->chip_data == &rk3568_chip_data) {

+   regmap_write(hdmi->regmap, RK3568_GRF_VO_CON1,
+HIWORD_UPDATE(RK3568_HDMI_SDAIN_MSK |
+  RK3568_HDMI_SCLIN_MSK,
+  RK3568_HDMI_SDAIN_MSK |
+  RK3568_HDMI_SCLIN_MSK));
+   }
+
hdmi->phy = devm_phy_optional_get(dev, "hdmi");
if (IS_ERR(hdmi->phy)) {
ret = PTR_ERR(hdmi->phy);


--
Best Regard

黄家钗
Sandy Huang
Addr: 福州市鼓楼区铜盘路软件大道89号福州软件园A区21号楼(350003)
  No. 21 Building, A District, No.89,software Boulevard Fuzhou,Fujian,PRC
Tel:+86 0591-87884919  8690
E-mail:h...@rock-chips.com





RE: [PATCH -next] drm/amdgpu: Fix missing unlock on error in amdgpu_ras_debugfs_table_read()

2021-07-05 Thread Chen, Guchun
[Public]

Thank you for the patch, Yingliang.

There is a similar patch sent out last Saturday and under review. Please check 
it.

[PATCH 3/4] drm/amdgpu: unlock on error in amdgpu_ras_debugfs

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Yang 
Yingliang
Sent: Monday, July 5, 2021 9:40 AM
To: linux-ker...@vger.kernel.org; dri-devel@lists.freedesktop.org; 
amd-...@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH -next] drm/amdgpu: Fix missing unlock on error in 
amdgpu_ras_debugfs_table_read()

Add the missing unlock before return from function
amdgpu_ras_debugfs_table_read() in the error handling case.

Fixes: 9b790694a031 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Reported-by: Hulk Robot 
Signed-off-by: Yang Yingliang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index fc70620369e4..dbeeb4986ca6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -912,8 +912,10 @@ static ssize_t amdgpu_ras_debugfs_table_read(struct file 
*f, char __user *buf,
 record.retired_page);
 
data_len = min_t(size_t, rec_hdr_fmt_size - r, size);
-   if (copy_to_user(buf, &data[r], data_len))
-   return -EINVAL;
+   if (copy_to_user(buf, &data[r], data_len)) {
+   res = -EINVAL;
+   goto Out;
+   }
buf += data_len;
size -= data_len;
*pos += data_len;
--
2.25.1

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Cguchun.chen%40amd.com%7C895d0b06d5e54b3598cf08d93f83454a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637610655312805026%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=b8UQJCZDgKs7CkMFMMXtFUfGe%2FQA4Cnm%2FKJKOlvV1K0%3D&reserved=0


Re: nouveau: failed to initialise sync

2021-07-05 Thread Alistair Popple
I am also hitting this with upstream. Reverting d02117f8efaa ("drm/ttm: remove 
special handling for non GEM drivers") also fixed it for me.

The change log for that commit reads:

drm/ttm: remove special handling for non GEM drivers

vmwgfx is the only driver actually using this. Move the handling into
the driver instead.

I wonder if Nouveau might actually have been using this somehow too?

 - Alistair

On Sunday, 4 July 2021 5:21:29 AM AEST Corentin Labbe wrote:
> Hello
> 
> Since some days on next, nouveau fail to load:
> [2.754087] nouveau :02:00.0: vgaarb: deactivate vga console
> [2.761260] Console: switching to colour dummy device 80x25
> [2.766888] nouveau :02:00.0: NVIDIA MCP77/MCP78 (0aa480a2)
> [2.783954] nouveau :02:00.0: bios: version 62.77.2a.00.04
> [2.810122] nouveau :02:00.0: fb: 256 MiB stolen system memory
> [3.484031] nouveau :02:00.0: DRM: VRAM: 256 MiB
> [3.488993] nouveau :02:00.0: DRM: GART: 1048576 MiB
> [3.494308] nouveau :02:00.0: DRM: TMDS table version 2.0
> [3.500052] nouveau :02:00.0: DRM: DCB version 4.0
> [3.505192] nouveau :02:00.0: DRM: DCB outp 00: 01000300 001e
> [3.511632] nouveau :02:00.0: DRM: DCB outp 01: 01011332 00020010
> [3.518074] nouveau :02:00.0: DRM: DCB conn 00: 0100
> [3.523728] nouveau :02:00.0: DRM: DCB conn 01: 1261
> [3.529455] nouveau :02:00.0: DRM: failed to initialise sync 
subsystem, -28
> [3.545946] nouveau: probe of :02:00.0 failed with error -28
> 
> I bisected it to:
> git bisect start
> # good: [62fb9874f5da54fdb243003b386128037319b219] Linux 5.13
> git bisect good 62fb9874f5da54fdb243003b386128037319b219
> # bad: [fb0ca446157a86b75502c1636b0d81e642fe6bf1] Add linux-next specific 
files for 20210701
> git bisect bad fb0ca446157a86b75502c1636b0d81e642fe6bf1
> # good: [f63c4fda987a19b1194cc45cb72fd5bf968d9d90] Merge remote-tracking 
branch 'rdma/for-next'
> git bisect good f63c4fda987a19b1194cc45cb72fd5bf968d9d90
> # bad: [49c8769be0b910d4134eba07cae5d9c71b861c4a] Merge remote-tracking 
branch 'drm/drm-next'
> git bisect bad 49c8769be0b910d4134eba07cae5d9c71b861c4a
> # good: [4e3db44a242a4e2afe33b59793898ecbb61d478e] Merge tag 'wireless-
drivers-next-2021-06-25' of git://git.kernel.org/pub/scm/linux/kernel/git/
kvalo/wireless-drivers-next
> git bisect good 4e3db44a242a4e2afe33b59793898ecbb61d478e
> # bad: [5745d647d5563d3e9d32013ad4e5c629acff04d7] Merge tag 'amd-drm-
next-5.14-2021-06-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-
next
> git bisect bad 5745d647d5563d3e9d32013ad4e5c629acff04d7
> # bad: [c99c4d0ca57c978dcc2a2f41ab8449684ea154cc] Merge tag 'amd-drm-
next-5.14-2021-05-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-
next
> git bisect bad c99c4d0ca57c978dcc2a2f41ab8449684ea154cc
> # bad: [ae25ec2fc6c5a9e5767bf1922cd648501d0f914c] Merge tag 'drm-misc-
next-2021-05-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
> git bisect bad ae25ec2fc6c5a9e5767bf1922cd648501d0f914c
> # bad: [cac80e71cfb0b00202d743c6e90333c45ba77cc5] drm/vkms: rename cursor to 
plane on ops of planes composition
> git bisect bad cac80e71cfb0b00202d743c6e90333c45ba77cc5
> # good: [178bdba84c5f0ad14de384fc7f15fba0e272919d] drm/ttm/ttm_device: 
Demote kernel-doc abuses
> git bisect good 178bdba84c5f0ad14de384fc7f15fba0e272919d
> # bad: [3f3a6524f6065fd3d130515e012f63eac74d96da] drm/dp: Clarify DP AUX 
registration time
> git bisect bad 3f3a6524f6065fd3d130515e012f63eac74d96da
> # bad: [6dd7efc437611db16d432e0030f72d0c7e890127] drm/gud: cleanup coding 
style a bit
> git bisect bad 6dd7efc437611db16d432e0030f72d0c7e890127
> # bad: [13b29cc3a722c2c0bc9ab9f72f9047d55d08a2f9] drm/mxsfb: Don't select 
DRM_KMS_FB_HELPER
> git bisect bad 13b29cc3a722c2c0bc9ab9f72f9047d55d08a2f9
> # bad: [d02117f8efaa5fbc37437df1ae955a147a2a424a] drm/ttm: remove special 
handling for non GEM drivers
> git bisect bad d02117f8efaa5fbc37437df1ae955a147a2a424a
> # good: [13ea9aa1e7d891e950230e82f1dd2c84e5debcff] drm/ttm: fix error 
handling if no BO can be swapped out v4
> git bisect good 13ea9aa1e7d891e950230e82f1dd2c84e5debcff
> # first bad commit: [d02117f8efaa5fbc37437df1ae955a147a2a424a] drm/ttm: 
remove special handling for non GEM drivers
> 
> Reverting the patch permit to have nouveau works again.
> 
> Regards
> 
> 






Re: [PATCH 0/4] vkms: Switch to shadow-buffered plane state

2021-07-05 Thread Thomas Zimmermann

Hi

Am 05.07.21 um 16:20 schrieb Daniel Vetter:

On Mon, Jul 05, 2021 at 12:05:28PM +0200, Thomas Zimmermann wrote:

Hi

Am 05.07.21 um 11:27 schrieb Daniel Vetter:

On Mon, Jul 05, 2021 at 09:46:29AM +0200, Thomas Zimmermann wrote:

Vkms copies each plane's framebuffer into the output buffer; essentially
using a shadow buffer. DRM provides struct drm_shadow_plane_state, which
handles the details of mapping/unmapping shadow buffers into memory for
active planes.

Convert vkms to the helpers. Makes vkms use shared code and gives more
test exposure to shadow-plane helpers.

Thomas Zimmermann (4):
drm/gem: Export implementation of shadow-plane helpers
drm/vkms: Inherit plane state from struct drm_shadow_plane_state
drm/vkms: Let shadow-plane helpers prepare the plane's FB
drm/vkms: Use dma-buf mapping from shadow-plane state for composing


So I think right now this fits, but I think it'll mismit going forward: We
don't really have a shadow-plane that we then toss to the hw, it's a
shadow-crtc-area. Right now there's no difference, because we don't
support positioning/scaling the primary plane. But that's all kinda stuff
that's on the table.

But conceptually at least the compositioning buffer should bet part of the
crtc, not of the primary plane.

So not sure what to do, but also coffee hasn't kicked in yet, so maybe I'm
just confused.


I'm not sure if I understand your concern. Can you elaborate? The
compositing output buffer is not affected by this patchset. Only the input
frambuffers of the planes. Those are shadow buffers. AFAICT the composer
code memcpy's the primary plane and then blends the other planes on top.
Supporting transformation of the primary plane doesn't really change much
wrt to the vmaping of input fbs.


Yeah that's the current implementation, because that's easier. But
fundamentally we don't need a copy of the input shadow plane, we need a
scratch area that's sized for the crtc.


I stll don't understand. Are you talking about the whole patchset or 
just patch 4? Because if you want to do anything with vkms planes, you 
have to vmap the framebuffer BOs into memory. (right?) That's all that 
shadow-buffer planes do. Patches 1 to 3 have nothing to do with memcpy. 
Admittedly, Patch 4 does some minor refactoring, but only because it 
became a low-hanging fruit.


Best regards
Thomas


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH 0/4] vkms: Switch to shadow-buffered plane state

2021-07-05 Thread Thomas Zimmermann

Hi

Am 05.07.21 um 23:29 schrieb Melissa Wen:

On 07/05, Daniel Vetter wrote:

On Mon, Jul 05, 2021 at 12:05:28PM +0200, Thomas Zimmermann wrote:

Hi

Am 05.07.21 um 11:27 schrieb Daniel Vetter:

On Mon, Jul 05, 2021 at 09:46:29AM +0200, Thomas Zimmermann wrote:

Vkms copies each plane's framebuffer into the output buffer; essentially
using a shadow buffer. DRM provides struct drm_shadow_plane_state, which
handles the details of mapping/unmapping shadow buffers into memory for
active planes.

Convert vkms to the helpers. Makes vkms use shared code and gives more
test exposure to shadow-plane helpers.

Thomas Zimmermann (4):
drm/gem: Export implementation of shadow-plane helpers
drm/vkms: Inherit plane state from struct drm_shadow_plane_state
drm/vkms: Let shadow-plane helpers prepare the plane's FB
drm/vkms: Use dma-buf mapping from shadow-plane state for composing


So I think right now this fits, but I think it'll mismit going forward: We
don't really have a shadow-plane that we then toss to the hw, it's a
shadow-crtc-area. Right now there's no difference, because we don't
support positioning/scaling the primary plane. But that's all kinda stuff
that's on the table.

But conceptually at least the compositioning buffer should bet part of the
crtc, not of the primary plane.

So not sure what to do, but also coffee hasn't kicked in yet, so maybe I'm
just confused.


I'm not sure if I understand your concern. Can you elaborate? The
compositing output buffer is not affected by this patchset. Only the input
frambuffers of the planes. Those are shadow buffers. AFAICT the composer
code memcpy's the primary plane and then blends the other planes on top.
Supporting transformation of the primary plane doesn't really change much
wrt to the vmaping of input fbs.


Yeah that's the current implementation, because that's easier. But
fundamentally we don't need a copy of the input shadow plane, we need a
scratch area that's sized for the crtc.


Maybe I'm missing something, but I am not sure the relevance for vkms to
switch to shadow-buffered plane. (?)


It replaces the vkms code with shared code. Nothing else. For the shared 
shadow-buffer code it means more testing. If vkms ever supports color 
formats that use multiple planes, the new code will be ready.


Best regards
Thomas


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature


Re: nouveau: failed to initialise sync

2021-07-05 Thread Christian König

Hi guys,

yes nouveau was using the same functionality for internal BOs without 
noticing it. This is fixes by the following commit:


commit d098775ed44021293b1962dea61efb19297b8d02
Author: Christian König 
Date:   Wed Jun 9 19:25:56 2021 +0200

    drm/nouveau: init the base GEM fields for internal BOs

    TTMs buffer objects are based on GEM objects for quite a while
    and rely on initializing those fields before initializing the TTM BO.

    Nouveau now doesn't init the GEM object for internally allocated BOs,
    so make sure that we at least initialize some necessary fields.

Could be that the patch needs to be send to stable as well.

Regards,
Christian.

Am 06.07.21 um 04:44 schrieb Alistair Popple:

I am also hitting this with upstream. Reverting d02117f8efaa ("drm/ttm: remove
special handling for non GEM drivers") also fixed it for me.

The change log for that commit reads:

 drm/ttm: remove special handling for non GEM drivers
 
 vmwgfx is the only driver actually using this. Move the handling into

 the driver instead.

I wonder if Nouveau might actually have been using this somehow too?

  - Alistair

On Sunday, 4 July 2021 5:21:29 AM AEST Corentin Labbe wrote:

Hello

Since some days on next, nouveau fail to load:
[2.754087] nouveau :02:00.0: vgaarb: deactivate vga console
[2.761260] Console: switching to colour dummy device 80x25
[2.766888] nouveau :02:00.0: NVIDIA MCP77/MCP78 (0aa480a2)
[2.783954] nouveau :02:00.0: bios: version 62.77.2a.00.04
[2.810122] nouveau :02:00.0: fb: 256 MiB stolen system memory
[3.484031] nouveau :02:00.0: DRM: VRAM: 256 MiB
[3.488993] nouveau :02:00.0: DRM: GART: 1048576 MiB
[3.494308] nouveau :02:00.0: DRM: TMDS table version 2.0
[3.500052] nouveau :02:00.0: DRM: DCB version 4.0
[3.505192] nouveau :02:00.0: DRM: DCB outp 00: 01000300 001e
[3.511632] nouveau :02:00.0: DRM: DCB outp 01: 01011332 00020010
[3.518074] nouveau :02:00.0: DRM: DCB conn 00: 0100
[3.523728] nouveau :02:00.0: DRM: DCB conn 01: 1261
[3.529455] nouveau :02:00.0: DRM: failed to initialise sync

subsystem, -28

[3.545946] nouveau: probe of :02:00.0 failed with error -28

I bisected it to:
git bisect start
# good: [62fb9874f5da54fdb243003b386128037319b219] Linux 5.13
git bisect good 62fb9874f5da54fdb243003b386128037319b219
# bad: [fb0ca446157a86b75502c1636b0d81e642fe6bf1] Add linux-next specific

files for 20210701

git bisect bad fb0ca446157a86b75502c1636b0d81e642fe6bf1
# good: [f63c4fda987a19b1194cc45cb72fd5bf968d9d90] Merge remote-tracking

branch 'rdma/for-next'

git bisect good f63c4fda987a19b1194cc45cb72fd5bf968d9d90
# bad: [49c8769be0b910d4134eba07cae5d9c71b861c4a] Merge remote-tracking

branch 'drm/drm-next'

git bisect bad 49c8769be0b910d4134eba07cae5d9c71b861c4a
# good: [4e3db44a242a4e2afe33b59793898ecbb61d478e] Merge tag 'wireless-

drivers-next-2021-06-25' of git://git.kernel.org/pub/scm/linux/kernel/git/
kvalo/wireless-drivers-next

git bisect good 4e3db44a242a4e2afe33b59793898ecbb61d478e
# bad: [5745d647d5563d3e9d32013ad4e5c629acff04d7] Merge tag 'amd-drm-

next-5.14-2021-06-02' of 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fagd5f%2Flinux&data=04%7C01%7Cchristian.koenig%40amd.com%7C5f05fa59cd3b4432e71108d94027ede6%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637611362989756089%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=gJ98NtSRf3IyVfsCzw3DKydeMTGKIkHJNzUUhUfsWzY%3D&reserved=0
 into drm-
next

git bisect bad 5745d647d5563d3e9d32013ad4e5c629acff04d7
# bad: [c99c4d0ca57c978dcc2a2f41ab8449684ea154cc] Merge tag 'amd-drm-

next-5.14-2021-05-19' of 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fagd5f%2Flinux&data=04%7C01%7Cchristian.koenig%40amd.com%7C5f05fa59cd3b4432e71108d94027ede6%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637611362989756089%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=gJ98NtSRf3IyVfsCzw3DKydeMTGKIkHJNzUUhUfsWzY%3D&reserved=0
 into drm-
next

git bisect bad c99c4d0ca57c978dcc2a2f41ab8449684ea154cc
# bad: [ae25ec2fc6c5a9e5767bf1922cd648501d0f914c] Merge tag 'drm-misc-

next-2021-05-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

git bisect bad ae25ec2fc6c5a9e5767bf1922cd648501d0f914c
# bad: [cac80e71cfb0b00202d743c6e90333c45ba77cc5] drm/vkms: rename cursor to

plane on ops of planes composition

git bisect bad cac80e71cfb0b00202d743c6e90333c45ba77cc5
# good: [178bdba84c5f0ad14de384fc7f15fba0e272919d] drm/ttm/ttm_device:

Demote kernel-doc abuses

git bisect good 178bdba84c5f0ad14de384fc7f15fba0e272919d
# bad: [3f3a6524f6065fd3d130515e012f63eac74d96da] drm/dp: Clarify DP AUX

registration time

git bisect bad 3f3a6524f6065fd3d130515e012f63eac74d96da
# bad: [6dd7efc437611db16d432e0030f72d0c7e890127