[PATCH] dma-buf: add kernel count for dma_buf

2021-07-14 Thread guangming.cao
From: Guangming Cao 

Add a kernel refcount to dma_buf to prevent a UAF (use-after-free) issue.

Consider the following case:
1. kernel space allocates a dma_buf (file count = 1)
2. kernel uses the dma_buf to get an fd (file count = 1)
3. userspace uses the fd to do a mapping (file count = 2)
4. kernel calls dma_buf_put (file count = 1)
5. userspace closes the buffer fd (file count = 0)
6. at this point the buffer is released, but the va is still valid!
   So we can still read/write the buffer via the mmap va,
   which may cause a memory leak or a kernel exception.
   Also, using "ls -ll" to inspect the fd link info of the
   corresponding process will cause a kernel exception.

Another case:
 dma_buf_fd is used to generate more than one fd. Because
 dma_buf_fd does not increase the file count, closing the
 second fd may cause an error.

Solution:
Add a kernel count for dma_buf, and make sure the file count
of dma_buf.file held by the kernel is 1.

Notes: kref cannot be used for this solution, because the kernel ref
   may be incremented from 0, which kref does not allow.

Signed-off-by: Guangming Cao 
---
 drivers/dma-buf/dma-buf.c | 23 +++
 include/linux/dma-buf.h   |  6 --
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 511fe0d217a0..04ee92aac8b9 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -62,6 +62,7 @@ static void dma_buf_release(struct dentry *dentry)
if (unlikely(!dmabuf))
return;
 
+   WARN_ON(atomic64_read(&dmabuf->kernel_ref));
BUG_ON(dmabuf->vmapping_counter);
 
/*
@@ -555,6 +556,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
goto err_module;
}
 
+   atomic64_set(&dmabuf->kernel_ref, 1);
dmabuf->priv = exp_info->priv;
dmabuf->ops = exp_info->ops;
dmabuf->size = exp_info->size;
@@ -617,6 +619,9 @@ int dma_buf_fd(struct dma_buf *dmabuf, int flags)
 
fd_install(fd, dmabuf->file);
 
+   /* Add file cnt for each new fd */
+   get_file(dmabuf->file);
+
return fd;
 }
 EXPORT_SYMBOL_GPL(dma_buf_fd);
@@ -626,12 +631,13 @@ EXPORT_SYMBOL_GPL(dma_buf_fd);
  * @fd:[in]fd associated with the struct dma_buf to be returned
  *
  * On success, returns the struct dma_buf associated with an fd; uses
- * file's refcounting done by fget to increase refcount. returns ERR_PTR
- * otherwise.
+ * dmabuf's ref refcounting done by kref_get to increase refcount.
+ * Returns ERR_PTR otherwise.
  */
 struct dma_buf *dma_buf_get(int fd)
 {
struct file *file;
+   struct dma_buf *dmabuf;
 
file = fget(fd);
 
@@ -643,7 +649,12 @@ struct dma_buf *dma_buf_get(int fd)
return ERR_PTR(-EINVAL);
}
 
-   return file->private_data;
+   dmabuf = file->private_data;
+   /* replace file count increase as ref increase for kernel user */
+   get_dma_buf(dmabuf);
+   fput(file);
+
+   return dmabuf;
 }
 EXPORT_SYMBOL_GPL(dma_buf_get);
 
@@ -662,7 +673,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
if (WARN_ON(!dmabuf || !dmabuf->file))
return;
 
-   fput(dmabuf->file);
+   if (WARN_ON(!atomic64_read(&dmabuf->kernel_ref)))
+   return;
+
+   if (!atomic64_dec_return(&dmabuf->kernel_ref))
+   fput(dmabuf->file);
 }
 EXPORT_SYMBOL_GPL(dma_buf_put);
 
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index efdc56b9d95f..bc790cb028eb 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -308,6 +308,7 @@ struct dma_buf_ops {
 struct dma_buf {
size_t size;
struct file *file;
+   atomic64_t kernel_ref;
struct list_head attachments;
const struct dma_buf_ops *ops;
struct mutex lock;
@@ -436,7 +437,7 @@ struct dma_buf_export_info {
 .owner = THIS_MODULE }
 
 /**
- * get_dma_buf - convenience wrapper for get_file.
+ * get_dma_buf - increase a kernel ref of dma-buf
  * @dmabuf:[in]pointer to dma_buf
  *
  * Increments the reference count on the dma-buf, needed in case of drivers
@@ -446,7 +447,8 @@ struct dma_buf_export_info {
  */
 static inline void get_dma_buf(struct dma_buf *dmabuf)
 {
-   get_file(dmabuf->file);
+   if (atomic64_inc_return(&dmabuf->kernel_ref) == 1)
+   get_file(dmabuf->file);
 }
 
 /**
-- 
2.17.1
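
The two-level counting that the dma-buf patch above describes can be
tried out in userspace. The following is only a minimal single-threaded
sketch, not the patch itself: struct model_file, struct model_buf and all
model_* helpers are hypothetical stand-ins for struct file, struct dma_buf,
get_file(), fput(), get_dma_buf() and dma_buf_put(). It only illustrates
the intended invariant that all kernel users together hold at most one
file reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for struct file and struct dma_buf. */
struct model_file {
	atomic_long f_count;	/* models the file count */
	bool released;		/* set when the last file reference is dropped */
};

struct model_buf {
	struct model_file *file;
	atomic_long kernel_ref;	/* models the kernel_ref field from the patch */
};

static void model_get_file(struct model_file *f)
{
	atomic_fetch_add(&f->f_count, 1);
}

static void model_fput(struct model_file *f)
{
	/* atomic_fetch_sub() returns the old value: 1 means last reference. */
	if (atomic_fetch_sub(&f->f_count, 1) == 1)
		f->released = true;
}

/* Models dma_buf_export(): one file reference, owned by the kernel side. */
static void model_export(struct model_buf *b, struct model_file *f)
{
	atomic_init(&f->f_count, 1);
	f->released = false;
	b->file = f;
	atomic_init(&b->kernel_ref, 1);
}

/* Models get_dma_buf(): only the 0 -> 1 transition takes a file reference. */
static void model_buf_get(struct model_buf *b)
{
	if (atomic_fetch_add(&b->kernel_ref, 1) == 0)
		model_get_file(b->file);
}

/* Models dma_buf_put(): only the 1 -> 0 transition drops the file reference. */
static void model_buf_put(struct model_buf *b)
{
	assert(atomic_load(&b->kernel_ref) > 0);
	if (atomic_fetch_sub(&b->kernel_ref, 1) == 1)
		model_fput(b->file);
}
```

In this model a second kernel user leaves the file count untouched, and
the file is only released once both the kernel side and every userspace
reference are gone. The model says nothing about concurrent callers.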



Re: [PATCH 1/4] drm/vmwgfx: Add support for CursorMob and CursorBypass 4

2021-07-14 Thread Thomas Zimmermann

Hi

Am 14.07.21 um 06:14 schrieb Zack Rusin:

From: Martin Krastev 

* Add support for CursorMob
* Add support for CursorBypass 4

Reviewed-by: Zack Rusin 
Signed-off-by: Martin Krastev 
Signed-off-by: Zack Rusin 
---
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 45 +++-
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  6 +++
  drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 79 +++--
  3 files changed, 125 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 086dc75e7b42..7d8cc2f6b04e 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -1,7 +1,7 @@
  // SPDX-License-Identifier: GPL-2.0 OR MIT
  /**
   *
- * Copyright 2009-2016 VMware, Inc., Palo Alto, CA., USA
+ * Copyright 2009-2021 VMware, Inc., Palo Alto, CA., USA
   *
   * Permission is hereby granted, free of charge, to any person obtaining a
   * copy of this software and associated documentation files (the
@@ -301,8 +301,12 @@ static void vmw_print_capabilities2(uint32_t capabilities2)
DRM_INFO("  Grow oTable.\n");


These macros have been out of fashion for a while. There's drm_info(), 
drm_warn(), drm_err(), etc. as replacements. They also print device 
information. This applies here and to the rest of the patchset.




if (capabilities2 & SVGA_CAP2_INTRA_SURFACE_COPY)
DRM_INFO("  IntraSurface copy.\n");
+   if (capabilities2 & SVGA_CAP2_CURSOR_MOB)
+   DRM_INFO("  Cursor Mob.\n");
if (capabilities2 & SVGA_CAP2_DX3)
DRM_INFO("  DX3.\n");
+   if (capabilities2 & SVGA_CAP2_EXTRA_REGS)
+   DRM_INFO("  Extra Regs.\n");
  }
  
  static void vmw_print_capabilities(uint32_t capabilities)

@@ -505,6 +509,7 @@ static int vmw_request_device_late(struct vmw_private *dev_priv)
  static int vmw_request_device(struct vmw_private *dev_priv)
  {
int ret;
+   size_t i;
  
  	ret = vmw_device_init(dev_priv);

if (unlikely(ret != 0)) {
@@ -526,6 +531,37 @@ static int vmw_request_device(struct vmw_private *dev_priv)
if (unlikely(ret != 0))
goto out_no_query_bo;
  
+	/* Set up mobs for cursor updates */

+   if (dev_priv->has_mob && dev_priv->capabilities2 & 
SVGA_CAP2_CURSOR_MOB) {
+   const uint32_t cursor_max_dim = vmw_read(dev_priv, 
SVGA_REG_CURSOR_MAX_DIMENSION);
+
+   for (i = 0; i < ARRAY_SIZE(dev_priv->cursor_mob); i++) {
+   struct ttm_buffer_object **const bo = 
&dev_priv->cursor_mob[i];
+
+   ret = vmw_bo_create_kernel(dev_priv,
+   cursor_max_dim * cursor_max_dim * sizeof(u32) + 
sizeof(SVGAGBCursorHeader),
+   &vmw_mob_placement, bo);
+
+   if (ret != 0) {
+   DRM_ERROR("Unable to create CursorMob 
array.\n");
+   break;
+   }
+
+   BUG_ON((*bo)->resource->mem_type != VMW_PL_MOB);


BUG_ON() crashes the kernel. The preferred way is to use drm_WARN_*() and 
return.
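
As an illustration of the "warn and return" pattern, here is a small
userspace sketch. It is not the vmwgfx code: model_warn_on(),
model_check_placement() and MODEL_PL_MOB are hypothetical names that
only mimic the shape of a WARN-style check followed by an error return,
in place of an assertion that takes the whole system down.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_PL_MOB 3	/* hypothetical placement id, stands in for VMW_PL_MOB */

/* Hypothetical WARN-style helper: log the problem, then report the condition. */
static bool model_warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/* Returns 0 if the buffer has the expected placement, -EINVAL otherwise. */
static int model_check_placement(int mem_type)
{
	if (model_warn_on(mem_type != MODEL_PL_MOB, "unexpected placement"))
		return -EINVAL;	/* caller unwinds instead of crashing */

	return 0;
}
```

The caller sees an ordinary error code and can release what it has
allocated so far, while the log still records that something unexpected
happened.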



+
+   /* Fence the mob creation so we are guarateed to have 
the mob */
+   ret = ttm_bo_reserve(*bo, false, true, NULL);
+   BUG_ON(ret);


I'm not quite sure, but this line is probably a no-go wrt best 
practices. See the comment above.



+
+   vmw_bo_fence_single(*bo, NULL);
+
+   ttm_bo_unreserve(*bo);
+
+   DRM_INFO("Using CursorMob mobid %lu, max dimension 
%u\n",
+(*bo)->resource->start, cursor_max_dim);


IIRC anything *_info() just puts random info into the log. Most of the 
time, no one cares. Better use one of the drm_dbg_*() calls.



+   }
+   }
+
return 0;
  
  out_no_query_bo:

@@ -556,6 +592,8 @@ static int vmw_request_device(struct vmw_private *dev_priv)
   */
  static void vmw_release_device_early(struct vmw_private *dev_priv)
  {
+   size_t i;
+
/*
 * Previous destructions should've released
 * the pinned bo.
@@ -570,6 +608,11 @@ static void vmw_release_device_early(struct vmw_private *dev_priv)
if (dev_priv->has_mob) {
struct ttm_resource_manager *man;
  
+		for (i = 0; i < ARRAY_SIZE(dev_priv->cursor_mob); i++) {

+   if (dev_priv->cursor_mob[i] != NULL)
+   ttm_bo_put(dev_priv->cursor_mob[i]);
+   }
+
man = ttm_manager_type(&dev_priv->bdev, VMW_PL_MOB);
ttm_resource_manager_evict_all(&dev_priv->bdev, man);
vmw_otables_takedown(dev_priv);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 356f82c26f59..46bf54f6169a 100644
--- a/drivers/g

Re: [PATCH 1/2] drm: add crtc background color property

2021-07-14 Thread Pekka Paalanen
On Tue, 13 Jul 2021 09:54:35 -0400
Harry Wentland  wrote:

> On 2021-07-13 3:52 a.m., Pekka Paalanen wrote:
> > On Mon, 12 Jul 2021 12:15:59 -0400
> > Harry Wentland  wrote:
> >   
> >> On 2021-07-12 4:03 a.m., Pekka Paalanen wrote:  
> >>> On Fri, 9 Jul 2021 18:23:26 +0200
> >>> Raphael Gallais-Pou  wrote:
> >>> 
>  On 7/9/21 10:04 AM, Pekka Paalanen wrote:
> > On Wed, 7 Jul 2021 08:48:47 +
> > Raphael GALLAIS-POU - foss  wrote:
> >  
> >> Some display controllers can be programmed to present non-black colors
> >> for pixels not covered by any plane (or pixels covered by the
> >> transparent regions of higher planes).  Compositors that want a UI with
> >> a solid color background can potentially save memory bandwidth by
> >> setting the CRTC background property and using smaller planes to 
> >> display
> >> the rest of the content.
> >>
> >> To avoid confusion between different ways of encoding RGB data, we
> >> define a standard 64-bit format that should be used for this property's
> >> value.  Helper functions and macros are provided to generate and 
> >> dissect
> >> values in this standard format with varying component precision values.
> >>
> >> Signed-off-by: Raphael Gallais-Pou 
> >> Signed-off-by: Matt Roper 
> >> ---
> >>   drivers/gpu/drm/drm_atomic_state_helper.c |  1 +
> >>   drivers/gpu/drm/drm_atomic_uapi.c |  4 +++
> >>   drivers/gpu/drm/drm_blend.c   | 34 
> >> +--
> >>   drivers/gpu/drm/drm_mode_config.c |  6 
> >>   include/drm/drm_blend.h   |  1 +
> >>   include/drm/drm_crtc.h| 12 
> >>   include/drm/drm_mode_config.h |  5 
> >>   include/uapi/drm/drm_mode.h   | 28 +++
> >>   8 files changed, 89 insertions(+), 2 deletions(-)  
> > 
> > ...
> >   
> > The question about full vs. limited range seems unnecessary to me, as
> > the background color will be used as-is in the blending stage, so
> > userspace can just program the correct value that fits the pipeline it
> > is setting up.
> >
> > One more question is, as HDR exists, could we need background colors
> > with component values greater than 1.0?  
> 
>  AR4H color format should cover that case, isn't it ?
> >>>
> >>> Yes, but with the inconvenience I mentioned.
> >>>
> >>> This is a genuine question though, would anyone actually need
> >>> background color values > 1.0. I don't know of any case yet where it
> >>> would be required. It would imply that plane blending happens in a
> >>> color space where >1.0 values are meaningful. I'm not even sure if any
> >>> hardware supporting that exists.
> >>>
> >>> Maybe it would be best to assume that only [0.0, 1.0] pixel value range
> >>> is useful, and mention in the commit message that if someone really
> >>> needs values outside of that, they should create another background
> >>> color property. Then, you can pick a simple unsigned integer pixel
> >>> format, too. (I didn't see any 16 bit-per-channel formats like that in
> >>> drm_fourcc.h though.)
> >>> 
> >>
> >> I don't think we should artificially limit this to [0.0, 1.0]. As you
> >> mentioned above when talking about full vs limited, the userspace
> >> understands what's the correct value that fits the pipeline. If that
> >> pipeline is FP16 with > 1.0 values then it would make sense that the
> >> background color can be > 1.0.  
> > 
> > Ok. The standard FP32 format then for ease of use and guaranteed enough
> > range and precision for far into the future?
> >   
> 
> I don't have a strong preference for FP16 vs FP32. My understanding is
> that FP16 is enough to represent linearly encoded data in a way that
> looks smooth to humans.
> 
> scRGB uses FP16 with linear encoding in a range of [-0.5, 7.4999].
> 
> > Or do you want to keep it in 64 bits total, so the UABI can pack
> > everything into a u64 instead of needing to create a blob?
> > 
> > I don't mind as long as it's clearly documented what it is and how it
> > works, and it carries enough precision.
> > 
> > But FP16 with its 10 bits of precision might be too little for integer
> > 12-16 bpc pipelines and sinks?

The 10 bits still worry me.

If you have a pipeline that works in [0.0, 1.0] range only, then FP16
limits precision to 10 bits (in the upper half of the range?).
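
To put a number on that concern, here is a small sketch that compares
the spacing of FP16 values with the step of an integer pipeline. It
assumes IEEE 754 binary16 (10 explicit significand bits) and unsigned
normalized integers; fp16_spacing() and unorm_step() are hypothetical
helper names made up for this illustration.

```c
#include <assert.h>

/*
 * Spacing between adjacent IEEE 754 binary16 (FP16) values near x.
 * binary16 stores 10 explicit significand bits, so values in [1.0, 2.0)
 * are 2^-10 apart and the spacing halves with every lower binade.
 */
static double fp16_spacing(double x)
{
	double lo = 1.0;		/* lower bound of the current binade */
	double step = 1.0 / 1024.0;	/* spacing in [1.0, 2.0) */

	if (x < 0.0)
		x = -x;
	if (x > 65504.0)		/* clamp to the largest finite FP16 value */
		x = 65504.0;
	if (x < 1.0 / 16384.0)		/* below 2^-14: subnormal range */
		return 1.0 / 16777216.0;	/* fixed spacing of 2^-24 */

	while (x < lo) {
		lo /= 2.0;
		step /= 2.0;
	}
	while (x >= 2.0 * lo) {
		lo *= 2.0;
		step *= 2.0;
	}
	return step;
}

/* Step size of an unsigned normalized integer with the given bit depth. */
static double unorm_step(unsigned int bits)
{
	assert(bits > 0 && bits < 32);
	return 1.0 / (double)((1u << bits) - 1u);
}
```

Under these assumptions, fp16_spacing(0.75) is 2^-11, which is coarser
than unorm_step(12), so in the upper half of [0.0, 1.0] FP16 cannot
represent every 12-bit code. Closer to zero the FP16 spacing shrinks and
the comparison turns around.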

> > 
> > If the values can go beyond [0.0, 1.0] range, then does the blending
> > hardware and the degamma/ctm/gamma coming afterwards cope with them, or
> > do they get clamped anyway?
> >   
> 
> That probably depends on the HW and how it's configured. AMD HW can handle
> values above and below [0.0, 1.0].

Right, so how would userspace know what will happen?

Or do we need to specify that while values outside that unit range are
expressable, it is hardware-specific on how they w

[syzbot] BUG: unable to handle kernel paging request in vga16fb_fillrect

2021-07-14 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:3dbdb38e Merge branch 'for-5.14' of git://git.kernel.org/p..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1323c40230
kernel config:  https://syzkaller.appspot.com/x/.config?x=a1fcf15a09815757
dashboard link: https://syzkaller.appspot.com/bug?extid=04168c8063cfdde1db5e
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=11f0e77230
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1114b9b030

Bisection is inconclusive: the issue happens on the oldest tested release.

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10fa45d830
final oops: https://syzkaller.appspot.com/x/report.txt?x=12fa45d830
console output: https://syzkaller.appspot.com/x/log.txt?x=14fa45d830

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+04168c8063cfdde1d...@syzkaller.appspotmail.com

BUG: unable to handle page fault for address: 88800150
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
PGD 10e01067 P4D 10e01067 PUD 10e02067 PMD 810001e1 
Oops: 0003 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 8433 Comm: syz-executor067 Tainted: GW 
5.13.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:writeb arch/x86/include/asm/io.h:65 [inline]
RIP: 0010:vga16fb_fillrect+0x993/0x18d0 drivers/video/fbdev/vga16fb.c:923
Code: 6c fd 48 63 44 24 10 45 31 f6 48 89 04 24 e8 44 a6 6c fd 31 ff 89 de 31 
ed e8 79 ad 6c fd 85 db 4d 89 ec 74 22 e8 2d a6 6c fd <45> 88 34 24 83 c5 01 89 
df 49 83 c4 01 89 ee e8 49 ae 6c fd 39 eb
RSP: 0018:c9eff848 EFLAGS: 00010293
RAX:  RBX: 001b RCX: 
RDX: 88802d949c40 RSI: 8408e403 RDI: 0003
RBP:  R08:  R09: 8408dd8d
R10: 8408e3f7 R11:  R12: 88800150
R13: 88800150 R14:  R15: 0ffeb7ff
FS:  01aa2300() GS:8880b9d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 88800150 CR3: 346fb000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 bit_clear_margins+0x3f6/0x4b0 drivers/video/fbdev/core/bitblit.c:224
 fbcon_clear_margins+0x1f1/0x280 drivers/video/fbdev/core/fbcon.c:1315
 fbcon_switch+0xa8c/0x1620 drivers/video/fbdev/core/fbcon.c:2146
 redraw_screen+0x2b9/0x740 drivers/tty/vt/vt.c:1021
 fbcon_modechanged+0x593/0x6d0 drivers/video/fbdev/core/fbcon.c:2651
 fbcon_update_vcs+0x3a/0x50 drivers/video/fbdev/core/fbcon.c:2696
 do_fb_ioctl+0x62e/0x690 drivers/video/fbdev/core/fbmem.c:1110
 fb_ioctl+0xe7/0x150 drivers/video/fbdev/core/fbmem.c:1185
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:1069 [inline]
 __se_sys_ioctl fs/ioctl.c:1055 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:1055
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x43efd9
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:7ffc362df848 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 00400488 RCX: 0043efd9
RDX: 2200 RSI: 4601 RDI: 0003
RBP: 00402fc0 R08:  R09: 00400488
R10:  R11: 0246 R12: 00403050
R13:  R14: 004ac018 R15: 00400488
Modules linked in:
CR2: 88800150
---[ end trace 39dce64bc5621bd3 ]---
RIP: 0010:writeb arch/x86/include/asm/io.h:65 [inline]
RIP: 0010:vga16fb_fillrect+0x993/0x18d0 drivers/video/fbdev/vga16fb.c:923
Code: 6c fd 48 63 44 24 10 45 31 f6 48 89 04 24 e8 44 a6 6c fd 31 ff 89 de 31 
ed e8 79 ad 6c fd 85 db 4d 89 ec 74 22 e8 2d a6 6c fd <45> 88 34 24 83 c5 01 89 
df 49 83 c4 01 89 ee e8 49 ae 6c fd 39 eb
RSP: 0018:c9eff848 EFLAGS: 00010293
RAX:  RBX: 001b RCX: 
RDX: 88802d949c40 RSI: 8408e403 RDI: 0003
RBP:  R08:  R09: 8408dd8d
R10: 8408e3f7 R11:  R12: 88800150
R13: 88800150 R14:  R15: 0ffeb7ff
FS:  01aa2300() GS:8880b9d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 88800150 CR3: 346fb000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400


---
T

[PATCH v6 10/11] arm: dts: mediatek: Get rid of mediatek, larb for MM nodes

2021-07-14 Thread Yong Wu
Now that a device_link is added between the IOMMU consumer and smi,
the mediatek,larb property is no longer needed.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 arch/arm/boot/dts/mt2701.dtsi  | 2 --
 arch/arm/boot/dts/mt7623n.dtsi | 5 -
 2 files changed, 7 deletions(-)

diff --git a/arch/arm/boot/dts/mt2701.dtsi b/arch/arm/boot/dts/mt2701.dtsi
index 4776f85d6d5b..ef583cfd3baf 100644
--- a/arch/arm/boot/dts/mt2701.dtsi
+++ b/arch/arm/boot/dts/mt2701.dtsi
@@ -564,7 +564,6 @@
clock-names = "jpgdec-smi",
  "jpgdec";
power-domains = <&scpsys MT2701_POWER_DOMAIN_ISP>;
-   mediatek,larb = <&larb2>;
iommus = <&iommu MT2701_M4U_PORT_JPGDEC_WDMA>,
 <&iommu MT2701_M4U_PORT_JPGDEC_BSDMA>;
};
@@ -577,7 +576,6 @@
clocks =  <&imgsys CLK_IMG_VENC>;
clock-names = "jpgenc";
power-domains = <&scpsys MT2701_POWER_DOMAIN_ISP>;
-   mediatek,larb = <&larb2>;
iommus = <&iommu MT2701_M4U_PORT_JPGENC_RDMA>,
 <&iommu MT2701_M4U_PORT_JPGENC_BSDMA>;
};
diff --git a/arch/arm/boot/dts/mt7623n.dtsi b/arch/arm/boot/dts/mt7623n.dtsi
index bcb0846e29fd..3adab5cd1fef 100644
--- a/arch/arm/boot/dts/mt7623n.dtsi
+++ b/arch/arm/boot/dts/mt7623n.dtsi
@@ -121,7 +121,6 @@
clock-names = "jpgdec-smi",
  "jpgdec";
power-domains = <&scpsys MT2701_POWER_DOMAIN_ISP>;
-   mediatek,larb = <&larb2>;
iommus = <&iommu MT2701_M4U_PORT_JPGDEC_WDMA>,
 <&iommu MT2701_M4U_PORT_JPGDEC_BSDMA>;
};
@@ -144,7 +143,6 @@
interrupts = ;
clocks = <&mmsys CLK_MM_DISP_OVL>;
iommus = <&iommu MT2701_M4U_PORT_DISP_OVL_0>;
-   mediatek,larb = <&larb0>;
};
 
rdma0: rdma@14008000 {
@@ -154,7 +152,6 @@
interrupts = ;
clocks = <&mmsys CLK_MM_DISP_RDMA>;
iommus = <&iommu MT2701_M4U_PORT_DISP_RDMA>;
-   mediatek,larb = <&larb0>;
};
 
wdma@14009000 {
@@ -164,7 +161,6 @@
interrupts = ;
clocks = <&mmsys CLK_MM_DISP_WDMA>;
iommus = <&iommu MT2701_M4U_PORT_DISP_WDMA>;
-   mediatek,larb = <&larb0>;
};
 
bls: pwm@1400a000 {
@@ -215,7 +211,6 @@
interrupts = ;
clocks = <&mmsys CLK_MM_DISP_RDMA1>;
iommus = <&iommu MT2701_M4U_PORT_DISP_RDMA1>;
-   mediatek,larb = <&larb0>;
};
 
dpi0: dpi@14014000 {
-- 
2.18.0



Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-07-14 Thread Konrad Rzeszutek Wilk
..snip..
> > > I think the main question I have is how would you like to see patches for
> > > 5.15? i.e. as patches on top of devel/for-linus-5.14 or something else?
> > 
> > Yes that would be perfect. If there are any dependencies on the rc1, I
> > can rebase it on top of that.
> 
> Yes, please, rebasing would be very helpful. The broader rework of
> 'io_tlb_default_mem' is going to conflict quite badly otherwise.

There is a devel/for-linus-5.15 (based on v5.14-rc1) now.

Thank you!
> 
> Cheers,
> 
> Will


[PATCH v6 06/11] drm/mediatek: Add pm runtime support for ovl and rdma

2021-07-14 Thread Yong Wu
From: Yongqiang Niu 

Prepare for the smi cleanup of "mediatek,larb".

Previously, display used the dispsys device to call pm_runtime_get_sync.
This patch adds pm_runtime_xx calls for the ovl and rdma devices, whose
nodes have the "iommus" property, so that display can do pm_runtime_get
for smi via the ovl or rdma device.

CC: CK Hu 
Signed-off-by: Yongqiang Niu 
Signed-off-by: Yong Wu 
(Yong: Use pm_runtime_resume_and_get instead of pm_runtime_get_sync)
Acked-by: Chun-Kuang Hu 
---
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c  |  9 -
 drivers/gpu/drm/mediatek/mtk_disp_rdma.c |  9 -
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c  | 12 +++-
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index fa9d79963cd3..ea5760f856ec 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "mtk_disp_drv.h"
@@ -414,15 +415,21 @@ static int mtk_disp_ovl_probe(struct platform_device *pdev)
return ret;
}
 
+   pm_runtime_enable(dev);
+
ret = component_add(dev, &mtk_disp_ovl_component_ops);
-   if (ret)
+   if (ret) {
+   pm_runtime_disable(dev);
dev_err(dev, "Failed to add component: %d\n", ret);
+   }
 
return ret;
 }
 
 static int mtk_disp_ovl_remove(struct platform_device *pdev)
 {
+   pm_runtime_disable(&pdev->dev);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
index 705f28ceb4dd..0f31d1c8e37c 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "mtk_disp_drv.h"
@@ -327,9 +328,13 @@ static int mtk_disp_rdma_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, priv);
 
+   pm_runtime_enable(dev);
+
ret = component_add(dev, &mtk_disp_rdma_component_ops);
-   if (ret)
+   if (ret) {
+   pm_runtime_disable(dev);
dev_err(dev, "Failed to add component: %d\n", ret);
+   }
 
return ret;
 }
@@ -338,6 +343,8 @@ static int mtk_disp_rdma_remove(struct platform_device *pdev)
 {
component_del(&pdev->dev, &mtk_disp_rdma_component_ops);
 
+   pm_runtime_disable(&pdev->dev);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index 474efb844249..08e3f352377d 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -557,9 +557,15 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc,
return;
}
 
+   ret = pm_runtime_resume_and_get(comp->dev);
+   if (ret < 0)
+   DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n",
+ ret);
+
ret = mtk_crtc_ddp_hw_init(mtk_crtc);
if (ret) {
mtk_smi_larb_put(comp->larb_dev);
+   pm_runtime_put(comp->dev);
return;
}
 
@@ -572,7 +578,7 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,
 {
struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
-   int i;
+   int i, ret;
 
DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
if (!mtk_crtc->enabled)
@@ -596,6 +602,10 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,
drm_crtc_vblank_off(crtc);
mtk_crtc_ddp_hw_fini(mtk_crtc);
mtk_smi_larb_put(comp->larb_dev);
+   ret = pm_runtime_put(comp->dev);
+   if (ret < 0)
+   DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n",
+ ret);
 
mtk_crtc->enabled = false;
 }
-- 
2.18.0



[RFC] drm: return int error code from mode_fixup

2021-07-14 Thread Grace An
When CONFIG_PROVE_LOCKING is defined, the kernel randomly injects
-EDEADLK errors for all the ww_mutex. This results in
drm_atomic_get_private_obj_state randomly returning -EDEADLK.
However, the mode_fixup functions do not propagate these error
codes and return false, causing the atomic commit to fail with
-EINVAL instead of retrying.

Change encoder, crtc, and bridge mode_fixup functions to return
an int instead of a boolean to indicate success or failure. If
any of these functions fail, the mode_fixup function now returns
the provided integer error code instead of -EINVAL.

This change needs modifications across drivers, but before submitting
the entire change, we want to get feedback on this RFC.

Signed-off-by: Grace An 
---
 drivers/gpu/drm/drm_atomic_helper.c  | 8 
 drivers/gpu/drm/drm_bridge.c | 4 ++--
 include/drm/drm_bridge.h | 2 +-
 include/drm/drm_modeset_helper_vtables.h | 4 ++--
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index f2b3e28..d75f09a 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -457,10 +457,10 @@ mode_fixup(struct drm_atomic_state *state)
} else if (funcs && funcs->mode_fixup) {
ret = funcs->mode_fixup(encoder, &new_crtc_state->mode,
&new_crtc_state->adjusted_mode);
-   if (!ret) {
+   if (ret) {
DRM_DEBUG_ATOMIC("[ENCODER:%d:%s] fixup 
failed\n",
 encoder->base.id, 
encoder->name);
-   return -EINVAL;
+   return ret;
}
}
}
@@ -481,10 +481,10 @@ mode_fixup(struct drm_atomic_state *state)
 
ret = funcs->mode_fixup(crtc, &new_crtc_state->mode,
&new_crtc_state->adjusted_mode);
-   if (!ret) {
+   if (ret) {
DRM_DEBUG_ATOMIC("[CRTC:%d:%s] fixup failed\n",
 crtc->base.id, crtc->name);
-   return -EINVAL;
+   return ret;
}
}
 
diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index 64f0eff..3ad16b5 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -736,9 +736,9 @@ static int drm_atomic_bridge_check(struct drm_bridge *bridge,
if (ret)
return ret;
} else if (bridge->funcs->mode_fixup) {
-   if (!bridge->funcs->mode_fixup(bridge, &crtc_state->mode,
+   if (bridge->funcs->mode_fixup(bridge, &crtc_state->mode,
   &crtc_state->adjusted_mode))
-   return -EINVAL;
+   return ret;
}
 
return 0;
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
index 2195daa..5d02dfc 100644
--- a/include/drm/drm_bridge.h
+++ b/include/drm/drm_bridge.h
@@ -153,7 +153,7 @@ struct drm_bridge_funcs {
 * True if an acceptable configuration is possible, false if the modeset
 * operation should be rejected.
 */
-   bool (*mode_fixup)(struct drm_bridge *bridge,
+   int (*mode_fixup)(struct drm_bridge *bridge,
   const struct drm_display_mode *mode,
   struct drm_display_mode *adjusted_mode);
/**
diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
index f3a4b47..e305c97 100644
--- a/include/drm/drm_modeset_helper_vtables.h
+++ b/include/drm/drm_modeset_helper_vtables.h
@@ -184,7 +184,7 @@ struct drm_crtc_helper_funcs {
 * True if an acceptable configuration is possible, false if the modeset
 * operation should be rejected.
 */
-   bool (*mode_fixup)(struct drm_crtc *crtc,
+   int (*mode_fixup)(struct drm_crtc *crtc,
   const struct drm_display_mode *mode,
   struct drm_display_mode *adjusted_mode);
 
@@ -599,7 +599,7 @@ struct drm_encoder_helper_funcs {
 * True if an acceptable configuration is possible, false if the modeset
 * operation should be rejected.
 */
-   bool (*mode_fixup)(struct drm_encoder *encoder,
+   int (*mode_fixup)(struct drm_encoder *encoder,
   const struct drm_display_mode *mode,
   struct drm_display_mode *adjusted_mode);
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
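
The effect of returning an int from mode_fixup can be shown with a small
userspace sketch. This is not DRM code: model_fixup_fn, model_check(),
model_commit() and model_flaky_fixup() are hypothetical names, and the
retry loop only stands in for a caller that backs off and retries when
it sees -EDEADLK.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical callback type: 0 on success, negative errno on failure. */
typedef int (*model_fixup_fn)(void *ctx);

/* Models the check step: the callback's error code is passed on unchanged. */
static int model_check(model_fixup_fn fixup, void *ctx)
{
	int ret = fixup(ctx);

	if (ret)
		return ret;	/* not collapsed into -EINVAL */

	return 0;
}

/* Models a caller that retries on -EDEADLK, at most max_tries times. */
static int model_commit(model_fixup_fn fixup, void *ctx, int max_tries)
{
	int ret = -EINVAL;

	for (int i = 0; i < max_tries; i++) {
		ret = model_check(fixup, ctx);
		if (ret != -EDEADLK)
			break;
		/* a real caller would drop and reacquire its locks here */
	}

	return ret;
}

/* Example callback: fails with -EDEADLK until the counter reaches zero. */
static int model_flaky_fixup(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EDEADLK;
	}

	return 0;
}
```

If model_check() turned every failure into -EINVAL, model_commit() would
give up on the first injected deadlock. With the code passed through,
it succeeds once the callback stops failing.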



[PATCH v6 07/11] drm/mediatek: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Yong Wu
The MediaTek IOMMU has already added the device_link between the consumer
and the smi-larb device. If the drm device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

CC: CK Hu 
CC: Philipp Zabel 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Chun-Kuang Hu 
---
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  9 --
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 36 ++---
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |  1 -
 drivers/gpu/drm/mediatek/mtk_drm_drv.c  |  5 +--
 4 files changed, 3 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index 08e3f352377d..d046abcf66ce 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -10,7 +10,6 @@
 #include 
 
 #include 
-#include 
 
 #include 
 #include 
@@ -551,12 +550,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc,
 
DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
 
-   ret = mtk_smi_larb_get(comp->larb_dev);
-   if (ret) {
-   DRM_ERROR("Failed to get larb: %d\n", ret);
-   return;
-   }
-
ret = pm_runtime_resume_and_get(comp->dev);
if (ret < 0)
DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n",
@@ -564,7 +557,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc,
 
ret = mtk_crtc_ddp_hw_init(mtk_crtc);
if (ret) {
-   mtk_smi_larb_put(comp->larb_dev);
pm_runtime_put(comp->dev);
return;
}
@@ -601,7 +593,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,
 
drm_crtc_vblank_off(crtc);
mtk_crtc_ddp_hw_fini(mtk_crtc);
-   mtk_smi_larb_put(comp->larb_dev);
ret = pm_runtime_put(comp->dev);
if (ret < 0)
DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n",
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index 75bc00e17fc4..7d240218d4c7 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -449,37 +449,15 @@ unsigned int mtk_drm_find_possible_crtc_by_comp(struct drm_device *drm,
return ret;
 }
 
-static int mtk_ddp_get_larb_dev(struct device_node *node, struct mtk_ddp_comp 
*comp,
-   struct device *dev)
-{
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev, "Missing mediadek,larb phandle in %pOF node\n", 
node);
-   return -EINVAL;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   return -EPROBE_DEFER;
-   }
-   of_node_put(larb_node);
-   comp->larb_dev = &larb_pdev->dev;
-
-   return 0;
-}
-
 int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp,
  enum mtk_ddp_comp_id comp_id)
 {
struct platform_device *comp_pdev;
enum mtk_ddp_comp_type type;
struct mtk_ddp_comp_dev *priv;
+#if IS_REACHABLE(CONFIG_MTK_CMDQ)
int ret;
+#endif
 
if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX)
return -EINVAL;
@@ -495,16 +473,6 @@ int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp,
}
comp->dev = &comp_pdev->dev;
 
-   /* Only DMA capable components need the LARB property */
-   if (type == MTK_DISP_OVL ||
-   type == MTK_DISP_OVL_2L ||
-   type == MTK_DISP_RDMA ||
-   type == MTK_DISP_WDMA) {
-   ret = mtk_ddp_get_larb_dev(node, comp, comp->dev);
-   if (ret)
-   return ret;
-   }
-
if (type == MTK_DISP_BLS ||
type == MTK_DISP_CCORR ||
type == MTK_DISP_COLOR ||
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
index bb914d976cf5..1b582262b682 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
@@ -70,7 +70,6 @@ struct mtk_ddp_comp_funcs {
 struct mtk_ddp_comp {
struct device *dev;
int irq;
-   struct device *larb_dev;
enum mtk_ddp_comp_id id;
const struct mtk_ddp_comp_funcs *funcs;
 };
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index b46bdb8985da..0d5ef3d8d081 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -577,11 +577,8 @@ static int mtk_drm_probe(struct platform_device *pdev)
pm_runtime_disable(dev);
 err_node:
of_node_

[PATCH v6 09/11] memory: mtk-smi: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Yong Wu
After adding the device_link between the iommu consumer and smi-larb,
the pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically, so we can get rid of mtk_smi_larb_get/put.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Krzysztof Kozlowski 
Acked-by: Matthias Brugger 
---
 drivers/memory/mtk-smi.c   | 14 --
 include/soc/mediatek/smi.h | 20 
 2 files changed, 34 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index c5fb51f73b34..7c61c924e220 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -134,20 +134,6 @@ static void mtk_smi_clk_disable(const struct mtk_smi *smi)
clk_disable_unprepare(smi->clk_apb);
 }
 
-int mtk_smi_larb_get(struct device *larbdev)
-{
-   int ret = pm_runtime_resume_and_get(larbdev);
-
-   return (ret < 0) ? ret : 0;
-}
-EXPORT_SYMBOL_GPL(mtk_smi_larb_get);
-
-void mtk_smi_larb_put(struct device *larbdev)
-{
-   pm_runtime_put_sync(larbdev);
-}
-EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
-
 static int
 mtk_smi_larb_bind(struct device *dev, struct device *master, void *data)
 {
diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
index 15e3397cec58..11f7d6b59642 100644
--- a/include/soc/mediatek/smi.h
+++ b/include/soc/mediatek/smi.h
@@ -19,26 +19,6 @@ struct mtk_smi_larb_iommu {
unsigned char  bank[32];
 };
 
-/*
- * mtk_smi_larb_get: Enable the power domain and clocks for this local arbiter.
- *   It also initialize some basic setting(like iommu).
- * mtk_smi_larb_put: Disable the power domain and clocks for this local arbiter.
- * Both should be called in non-atomic context.
- *
- * Returns 0 if successful, negative on failure.
- */
-int mtk_smi_larb_get(struct device *larbdev);
-void mtk_smi_larb_put(struct device *larbdev);
-
-#else
-
-static inline int mtk_smi_larb_get(struct device *larbdev)
-{
-   return 0;
-}
-
-static inline void mtk_smi_larb_put(struct device *larbdev) { }
-
 #endif
 
 #endif
-- 
2.18.0



[PATCH v6 00/11] Clean up "mediatek,larb"

2021-07-14 Thread Yong Wu
The MediaTek IOMMU block diagram always looks like this:

        M4U
         |
    smi-common
         |
  -------------
  |         |    ...
  |         |
larb1     larb2
  |         |
vdec       venc

All consumers connect to an smi-larb, which in turn connects to smi-common.

When a consumer operates, it must enable the smi-larb's power, which in
turn requires the smi-common's power to be enabled first.

Thus, first use a device link to connect the consumer and the
smi-larbs, then add a device link between the smi-larb and smi-common.

After adding the device_link, the "mediatek,larb" property can be removed.
The iommu consumer doesn't need to call mtk_smi_larb_get/put to enable
the power and clock of smi-larb and smi-common.
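
The power ordering described above can be sketched in plain userspace C.
This is only a simplified model, not kernel code: struct model_dev and the
model_pm_* helpers are hypothetical names invented for illustration, and
the single "supplier" pointer stands in for a device link created with
DL_FLAG_PM_RUNTIME.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a device with runtime-PM state. */
struct model_dev {
	const char *name;
	struct model_dev *supplier;	/* models a PM-runtime device link */
	int usage_count;		/* > 0 means the device is powered */
};

/* Power up the supplier chain first, then the device itself. */
static void model_pm_get(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->supplier)
		model_pm_get(dev->supplier);
	dev->usage_count++;
}

/* Release the device first, then walk up and release its suppliers. */
static void model_pm_put(struct model_dev *dev)
{
	assert(dev != NULL && dev->usage_count > 0);
	dev->usage_count--;
	if (dev->supplier)
		model_pm_put(dev->supplier);
}

static int model_is_powered(const struct model_dev *dev)
{
	return dev->usage_count > 0;
}
```

In this model a consumer such as vdec only calls model_pm_get() on itself;
larb1 and smi-common are powered as a side effect of the links, which is
the behaviour the series relies on when it drops mtk_smi_larb_get/put.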

As for the MM dt-binding/dtsi patches, I guess they should go together, so
I haven't split them per MM module and per SoC.

Based on v5.14-rc1, plus a jpeg [1] and an mdp [2] patchset.

[1] https://lore.kernel.org/linux-mediatek/20210702102304.3346429-1-hsi...@chromium.org/
[2] https://lore.kernel.org/linux-mediatek/20210709022324.1607884-1-ei...@chromium.org/

Change notes:
v6: 1) Rebase on v5.14-rc1.
    2) Fix the issues raised on v5 by Dafna and Hsin-Yi.
    3) Remove the patches about using pm_runtime_resume_and_get, since that
       change has already been merged via other patches.

v5: https://lore.kernel.org/linux-mediatek/20210410091128.31823-1-yong...@mediatek.com/
    1) Based on v5.12-rc2.
    2) Remove the patch changing mtk-iommu to module_platform_driver; it is
       already an independent patch.

v4: https://lore.kernel.org/linux-mediatek/1590826218-23653-1-git-send-email-yong...@mediatek.com/
    Based on v5.7-rc1.
    1) Move the drm PM patch before the smi patches.
    2) Change builtin_platform_driver to module_platform_driver, since we may
       need to build as a module.
    3) Rebase many patchsets as above.

v3: https://lore.kernel.org/linux-iommu/1567503456-24725-1-git-send-email-yong...@mediatek.com/
    1) Rebase on v5.3-rc1 and the latest mt8183 patchset.
    2) Use device_is_bound to check whether the driver is ready, as suggested
       by Matthias.
    3) Add the DL_FLAG_STATELESS flag when calling device_link_add and explain
       the reason in the commit message [3/14].
    4) Add a display patch [12/14] to this series; otherwise it may affect the
       display HW fastlogo, even though that doesn't happen on mt8183.

v2: https://lore.kernel.org/linux-iommu/1560171313-28299-1-git-send-email-yong...@mediatek.com/
    1) Rebase on v5.2-rc1.
    2) Move adding the device_link between the consumer and smi-larb into
       iommu_add_device, as suggested by Robin.
    3) Add DL_FLAG_AUTOREMOVE_CONSUMER even though the smi is built-in, as
       suggested by Evan.
    4) Remove the shutdown callback in iommu.

v1: https://lore.kernel.org/linux-iommu/1546318276-18993-1-git-send-email-yong...@mediatek.com/

Yong Wu (10):
  dt-binding: mediatek: Get rid of mediatek,larb for multimedia HW
  iommu/mediatek: Add probe_defer for smi-larb
  iommu/mediatek: Add device_link between the consumer and the larb
devices
  media: mtk-jpeg: Get rid of mtk_smi_larb_get/put
  media: mtk-mdp: Get rid of mtk_smi_larb_get/put
  drm/mediatek: Get rid of mtk_smi_larb_get/put
  media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
  memory: mtk-smi: Get rid of mtk_smi_larb_get/put
  arm: dts: mediatek: Get rid of mediatek,larb for MM nodes
  arm64: dts: mediatek: Get rid of mediatek,larb for MM nodes

Yongqiang Niu (1):
  drm/mediatek: Add pm runtime support for ovl and rdma

 .../display/mediatek/mediatek,disp.txt|  9 
 .../bindings/media/mediatek-jpeg-decoder.yaml |  9 
 .../bindings/media/mediatek-jpeg-encoder.yaml |  9 
 .../bindings/media/mediatek-mdp.txt   |  8 
 .../bindings/media/mediatek-vcodec.txt|  4 --
 arch/arm/boot/dts/mt2701.dtsi |  2 -
 arch/arm/boot/dts/mt7623n.dtsi|  5 --
 arch/arm64/boot/dts/mediatek/mt8173.dtsi  | 16 ---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi  |  6 ---
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c   |  9 +++-
 drivers/gpu/drm/mediatek/mtk_disp_rdma.c  |  9 +++-
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c   | 19 
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c   | 36 +--
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h   |  1 -
 drivers/gpu/drm/mediatek/mtk_drm_drv.c|  5 +-
 drivers/iommu/mtk_iommu.c | 24 +-
 drivers/iommu/mtk_iommu_v1.c  | 22 -
 .../media/platform/mtk-jpeg/mtk_jpeg_core.c   | 45 +-
 .../media/platform/mtk-jpeg/mtk_jpeg_core.h   |  2 -
 drivers/media/platform/mtk-mdp/mtk_mdp_comp.c | 46 +--
 drivers/media/platform/mtk-mdp/mtk_mdp_comp.h |  2 -
 drivers/media/platform/mtk-mdp/mtk_mdp_core.c |  1 -
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 37 ++-
 .../platform/mtk-vcodec/mtk_vcodec_drv.h  |  3 --
 .../platform/mtk-vcodec/mtk_vcodec_enc.c  |  1 -
 .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   | 44 ++
 drivers

[PATCH v6 02/11] iommu/mediatek: Add probe_defer for smi-larb

2021-07-14 Thread Yong Wu
Prepare for adding device_link.

The iommu consumer should use a device_link to connect with the
smi-larb (supplier), so the smi-larb must probe before the iommu
consumer. Here we defer the iommu driver until the smi driver is
ready, so that every iommu consumer always probes after the smi driver.

Without this patch, if some consumer drivers probe before smi-larb,
the supplier's link_status is DL_DEV_NO_DRIVER(0) in device_link_add,
and device_links_driver_bound will then use WARN_ON to complain that
the supplier's link_status is not right.
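
The deferral rule above can be sketched as a small userspace model. All
names here (model_device, model_driver, model_iommu_probe,
MODEL_EPROBE_DEFER) are hypothetical stand-ins rather than kernel
definitions; the only thing taken from the patch is the condition "device
missing, or device present but no driver bound yet".

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's EPROBE_DEFER error code (assumed value 517). */
#define MODEL_EPROBE_DEFER 517

struct model_driver {
	const char *name;
};

struct model_device {
	const struct model_driver *driver;	/* NULL until a driver has bound */
};

/*
 * Model of the probe-time check: every larb must exist and must already
 * have a driver bound, otherwise the caller asks to be probed again later.
 */
static int model_iommu_probe(const struct model_device *const *larbs, size_t n)
{
	size_t i;

	assert(larbs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!larbs[i] || !larbs[i]->driver)
			return -MODEL_EPROBE_DEFER;
	}
	return 0;
}
```

Checking the driver pointer rather than only the device pointer is what
guarantees that the supplier is bound before any link is created against it.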

Signed-off-by: Yong Wu 
---
Since [1], device_is_bound is no longer allowed to be exported, which would
affect this driver when built as a module. Thus dev.driver is still used here.

[1] https://lore.kernel.org/patchwork/patch/1334670/
---
 drivers/iommu/mtk_iommu.c| 2 +-
 drivers/iommu/mtk_iommu_v1.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6f7c69688ce2..a02dde094788 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -855,7 +855,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
id = i;
 
plarbdev = of_find_device_by_node(larbnode);
-   if (!plarbdev) {
+   if (!plarbdev || !plarbdev->dev.driver) {
of_node_put(larbnode);
return -EPROBE_DEFER;
}
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 778e66f5f1aa..d9365a3d8dc9 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -594,7 +594,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
}
 
plarbdev = of_find_device_by_node(larbnode);
-   if (!plarbdev) {
+   if (!plarbdev || !plarbdev->dev.driver) {
of_node_put(larbnode);
return -EPROBE_DEFER;
}
-- 
2.18.0



[PATCH v6 03/11] iommu/mediatek: Add device_link between the consumer and the larb devices

2021-07-14 Thread Yong Wu
The MediaTek IOMMU-SMI diagram is shown below. All consumers connect to an
smi-larb, which in turn connects to smi-common.

        M4U
         |
    smi-common
         |
  -------------
  |         |    ...
  |         |
larb1     larb2
  |         |
vdec       venc

When a consumer operates, it must enable the smi-larb's power, which
in turn requires the smi-common's power to be enabled first.

Thus, first use a device link to connect the consumer and the
smi-larbs, then add a device link between the smi-larb and smi-common.

This patch adds the device_link between the consumer and the larbs.

When calling device_link_add, I add the flag DL_FLAG_STATELESS to avoid
calling pm_runtime_xx, which keeps the original status of the clocks. It
avoids two issues:
1) Display HW shows the fastlogo abnormally, as reported in [1]. At the
beginning, all the clocks are enabled before entering the kernel, but the
clocks for display HW (always in larb0) will be gated after clk_enable and
clk_disable are called from device_link_add(->pm_runtime_resume) and
rpm_idle. The clock operation happens before the display driver probes. At
that time, the display HW will be abnormal.

2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
pm_runtime_xx and avoid the deadlock.

Correspondingly, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, so
device_link_remove must be called explicitly.
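
The add/remove pairing that follows from this flag choice can be modelled
in userspace. This is a hedged sketch only: the MODEL_FLAG_* constants,
struct model_link and the model_* functions are hypothetical names, and the
rule that a stateless link cannot also be auto-removed is modelled here as
a simple rejection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, loosely mirroring the DL_FLAG_* names in the text. */
#define MODEL_FLAG_STATELESS		(1u << 0)
#define MODEL_FLAG_AUTOREMOVE_CONSUMER	(1u << 1)
#define MODEL_FLAG_PM_RUNTIME		(1u << 2)

struct model_link {
	int in_use;
	unsigned int flags;
};

/* Create a link; a stateless link may not also request auto-removal. */
static int model_link_add(struct model_link *link, unsigned int flags)
{
	assert(link != NULL);
	if ((flags & MODEL_FLAG_STATELESS) &&
	    (flags & MODEL_FLAG_AUTOREMOVE_CONSUMER))
		return -1;
	link->in_use = 1;
	link->flags = flags;
	return 0;
}

/* On consumer unbind, only auto-remove links go away by themselves. */
static void model_consumer_unbind(struct model_link *link)
{
	assert(link != NULL);
	if (link->in_use && (link->flags & MODEL_FLAG_AUTOREMOVE_CONSUMER))
		link->in_use = 0;
}

/* Explicit removal, required for stateless links. */
static void model_link_remove(struct model_link *link)
{
	assert(link != NULL);
	link->in_use = 0;
}
```

In the model, a stateless link survives model_consumer_unbind(), which is
why the release_device path in this patch removes the link by hand.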

[1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
[2] https://lore.kernel.org/patchwork/patch/1086569/

Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c| 22 ++
 drivers/iommu/mtk_iommu_v1.c | 20 +++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a02dde094788..ee742900cf4b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
+   struct device_link *link;
+   struct device *larbdev;
+   unsigned int larbid;
 
if (!fwspec || fwspec->ops != &mtk_iommu_ops)
return ERR_PTR(-ENODEV); /* Not a iommu client device */
 
data = dev_iommu_priv_get(dev);
 
+   /*
+* Link the consumer device with the smi-larb device(supplier)
+* The device in each a larb is a independent HW. thus only link
+* one larb here.
+*/
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
return &data->iommu;
 }
 
 static void mtk_iommu_release_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct mtk_iommu_data *data;
+   struct device *larbdev;
+   unsigned int larbid;
 
if (!fwspec || fwspec->ops != &mtk_iommu_ops)
return;
 
+   data = dev_iommu_priv_get(dev);
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   device_link_remove(dev, larbdev);
+
iommu_fwspec_free(dev);
 }
 
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index d9365a3d8dc9..d2a7c66b8239 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -424,7 +424,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct of_phandle_args iommu_spec;
struct mtk_iommu_data *data;
-   int err, idx = 0;
+   int err, idx = 0, larbid;
+   struct device_link *link;
+   struct device *larbdev;
 
while (!of_parse_phandle_with_args(dev->of_node, "iommus",
   "#iommu-cells",
@@ -445,6 +447,14 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 
data = dev_iommu_priv_get(dev);
 
+   /* Link the consumer device with the smi-larb device(supplier) */
+   larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
+
return &data->iommu;
 }
 
@@ -465,10 +475,18 @@ static void mtk_iommu_probe_finalize(struct device *dev)
 static void mtk_iommu_release_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct mtk_iommu_data *data;
+   struct device *larbdev;
+   unsigned int larbid;
 
if (!fwspec || fwspec->ops != &mtk_iommu_ops)
return;
 
+

[PATCH v6 01/11] dt-binding: mediatek: Get rid of mediatek,larb for multimedia HW

2021-07-14 Thread Yong Wu
After adding the device_link between the consumer and the smi-larbs,
if the consumer calls its own pm_runtime_get(_sync), the
pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically. Thus, the consumer doesn't need the property.

The IOMMU also knows which larb this consumer connects to from the
iommu id in the "iommus=" property.
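
How a larb index can be recovered from an iommu id can be sketched as
below. The bit layout (larb in the bits above a 5-bit port field) is an
assumption made for illustration, and the MODEL_M4U_* macros and
model_consumer_larb() are hypothetical names; the driver's own
MTK_M4U_TO_LARB macro is the authority on the real encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encoding: 5 bits of port number, larb index in the next 5 bits. */
#define MODEL_M4U_ID(larb, port)	(((larb) << 5) | (port))
#define MODEL_M4U_TO_LARB(id)		(((id) >> 5) & 0x1f)
#define MODEL_M4U_TO_PORT(id)		((id) & 0x1f)

/*
 * Model of picking the supplier larb for a consumer: only the first id is
 * inspected, matching the patch's use of fwspec->ids[0].
 */
static unsigned int model_consumer_larb(const unsigned int *ids, size_t num_ids)
{
	assert(ids != NULL && num_ids > 0);
	return MODEL_M4U_TO_LARB(ids[0]);
}
```

Because the larb is derivable from the id already present in "iommus=", a
separate phandle to the larb carries no extra information.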

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
Reviewed-by: Evan Green 
---
 .../bindings/display/mediatek/mediatek,disp.txt  | 9 -
 .../devicetree/bindings/media/mediatek-jpeg-decoder.yaml | 9 -
 .../devicetree/bindings/media/mediatek-jpeg-encoder.yaml | 9 -
 Documentation/devicetree/bindings/media/mediatek-mdp.txt | 8 
 .../devicetree/bindings/media/mediatek-vcodec.txt| 4 
 5 files changed, 39 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
index fbb59c9ddda6..867bd82e2f03 100644
--- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
@@ -61,8 +61,6 @@ Required properties (DMA function blocks):
"mediatek,-disp-rdma"
"mediatek,-disp-wdma"
   the supported chips are mt2701, mt8167 and mt8173.
-- larb: Should contain a phandle pointing to the local arbiter device as defined
-  in Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
 - iommus: Should point to the respective IOMMU block with master port as
   argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
   for details.
@@ -91,7 +89,6 @@ ovl0: ovl@1400c000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_OVL0>;
iommus = <&iommu M4U_PORT_DISP_OVL0>;
-   mediatek,larb = <&larb0>;
 };
 
 ovl1: ovl@1400d000 {
@@ -101,7 +98,6 @@ ovl1: ovl@1400d000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_OVL1>;
iommus = <&iommu M4U_PORT_DISP_OVL1>;
-   mediatek,larb = <&larb4>;
 };
 
 rdma0: rdma@1400e000 {
@@ -111,7 +107,6 @@ rdma0: rdma@1400e000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA0>;
iommus = <&iommu M4U_PORT_DISP_RDMA0>;
-   mediatek,larb = <&larb0>;
mediatek,rdma-fifosize = <8192>;
 };
 
@@ -122,7 +117,6 @@ rdma1: rdma@1400f000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA1>;
iommus = <&iommu M4U_PORT_DISP_RDMA1>;
-   mediatek,larb = <&larb4>;
 };
 
 rdma2: rdma@1401 {
@@ -132,7 +126,6 @@ rdma2: rdma@1401 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA2>;
iommus = <&iommu M4U_PORT_DISP_RDMA2>;
-   mediatek,larb = <&larb4>;
 };
 
 wdma0: wdma@14011000 {
@@ -142,7 +135,6 @@ wdma0: wdma@14011000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_WDMA0>;
iommus = <&iommu M4U_PORT_DISP_WDMA0>;
-   mediatek,larb = <&larb0>;
 };
 
 wdma1: wdma@14012000 {
@@ -152,7 +144,6 @@ wdma1: wdma@14012000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_WDMA1>;
iommus = <&iommu M4U_PORT_DISP_WDMA1>;
-   mediatek,larb = <&larb4>;
 };
 
 color0: color@14013000 {
diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
index 9b87f036f178..052e752157b4 100644
--- a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
+++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
@@ -42,13 +42,6 @@ properties:
   power-domains:
 maxItems: 1
 
-  mediatek,larb:
-$ref: '/schemas/types.yaml#/definitions/phandle'
-description: |
-  Must contain the local arbiters in the current Socs, see
-  Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
-  for details.
-
   iommus:
 maxItems: 2
 description: |
@@ -63,7 +56,6 @@ required:
   - clocks
   - clock-names
   - power-domains
-  - mediatek,larb
   - iommus
 
 additionalProperties: false
@@ -83,7 +75,6 @@ examples:
   clock-names = "jpgdec-smi",
 "jpgdec";
   power-domains = <&scpsys MT2701_POWER_DOMAIN_ISP>;
-  mediatek,larb = <&larb2>;
   iommus = <&iommu MT2701_M4U_PORT_JPGDEC_WDMA>,
<&iommu MT2701_M4U_PORT_JPGDEC_BSDMA>;
 };
diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml b/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml
index fcd9b829e036..8bfdfdfaba59 100644
--- a/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml
+++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml
@@ -35,13 +35,6 @@ properties:
   

[PATCH v6 04/11] media: mtk-jpeg: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Yong Wu
MediaTek IOMMU has already added a device_link between the consumer
and the smi-larb device. If the jpg device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

After removing the larb_get operations, mtk_jpeg_clk_init is
also unnecessary. Remove it too.

CC: Rick Chang 
CC: Xia Jiang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Rick Chang 
---
 .../media/platform/mtk-jpeg/mtk_jpeg_core.c   | 45 +--
 .../media/platform/mtk-jpeg/mtk_jpeg_core.h   |  2 -
 2 files changed, 2 insertions(+), 45 deletions(-)

diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
index a89c7b206eef..4fea2c512434 100644
--- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
@@ -22,7 +22,6 @@
 #include 
 #include 
 #include 
-#include <soc/mediatek/smi.h>
 
 #include "mtk_jpeg_enc_hw.h"
 #include "mtk_jpeg_dec_hw.h"
@@ -1055,10 +1054,6 @@ static void mtk_jpeg_clk_on(struct mtk_jpeg_dev *jpeg)
 {
int ret;
 
-   ret = mtk_smi_larb_get(jpeg->larb);
-   if (ret)
-   dev_err(jpeg->dev, "mtk_smi_larb_get larbvdec fail %d\n", ret);
-
ret = clk_bulk_prepare_enable(jpeg->variant->num_clks,
  jpeg->variant->clks);
if (ret)
@@ -1069,7 +1064,6 @@ static void mtk_jpeg_clk_off(struct mtk_jpeg_dev *jpeg)
 {
clk_bulk_disable_unprepare(jpeg->variant->num_clks,
   jpeg->variant->clks);
-   mtk_smi_larb_put(jpeg->larb);
 }
 
 static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg)
@@ -1284,35 +1278,6 @@ static struct clk_bulk_data mtk_jpeg_clocks[] = {
{ .id = "jpgenc" },
 };
 
-static int mtk_jpeg_clk_init(struct mtk_jpeg_dev *jpeg)
-{
-   struct device_node *node;
-   struct platform_device *pdev;
-   int ret;
-
-   node = of_parse_phandle(jpeg->dev->of_node, "mediatek,larb", 0);
-   if (!node)
-   return -EINVAL;
-   pdev = of_find_device_by_node(node);
-   if (WARN_ON(!pdev)) {
-   of_node_put(node);
-   return -EINVAL;
-   }
-   of_node_put(node);
-
-   jpeg->larb = &pdev->dev;
-
-   ret = devm_clk_bulk_get(jpeg->dev, jpeg->variant->num_clks,
-   jpeg->variant->clks);
-   if (ret) {
-   dev_err(&pdev->dev, "failed to get jpeg clock:%d\n", ret);
-   put_device(&pdev->dev);
-   return ret;
-   }
-
-   return 0;
-}
-
 static void mtk_jpeg_job_timeout_work(struct work_struct *work)
 {
struct mtk_jpeg_dev *jpeg = container_of(work, struct mtk_jpeg_dev,
@@ -1333,11 +1298,6 @@ static void mtk_jpeg_job_timeout_work(struct work_struct *work)
v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
 }
 
-static inline void mtk_jpeg_clk_release(struct mtk_jpeg_dev *jpeg)
-{
-   put_device(jpeg->larb);
-}
-
 static int mtk_jpeg_probe(struct platform_device *pdev)
 {
struct mtk_jpeg_dev *jpeg;
@@ -1376,7 +1336,8 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
goto err_req_irq;
}
 
-   ret = mtk_jpeg_clk_init(jpeg);
+   ret = devm_clk_bulk_get(jpeg->dev, jpeg->variant->num_clks,
+   jpeg->variant->clks);
if (ret) {
dev_err(&pdev->dev, "Failed to init clk, err %d\n", ret);
goto err_clk_init;
@@ -1442,7 +1403,6 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
v4l2_device_unregister(&jpeg->v4l2_dev);
 
 err_dev_register:
-   mtk_jpeg_clk_release(jpeg);
 
 err_clk_init:
 
@@ -1460,7 +1420,6 @@ static int mtk_jpeg_remove(struct platform_device *pdev)
video_device_release(jpeg->vdev);
v4l2_m2m_release(jpeg->m2m_dev);
v4l2_device_unregister(&jpeg->v4l2_dev);
-   mtk_jpeg_clk_release(jpeg);
 
return 0;
 }
diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
index 595f7f10c9fd..3e4811a41ba2 100644
--- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
@@ -85,7 +85,6 @@ struct mtk_jpeg_variant {
  * @alloc_ctx: videobuf2 memory allocator's context
  * @vdev:  video device node for jpeg mem2mem mode
  * @reg_base:  JPEG registers mapping
- * @larb:  SMI device
  * @job_timeout_work:  IRQ timeout structure
  * @variant:   driver variant to be used
  */
@@ -99,7 +98,6 @@ struct mtk_jpeg_dev {
void*alloc_ctx;
struct video_device *vdev;
void __iomem*reg_base;
-   struct device   *larb;
struct delayed_work job_timeout_work;
const struct mtk_jpeg_variant *variant;
 };
-- 
2.18.0



[PATCH v6 08/11] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Yong Wu
MediaTek IOMMU has already added the device_link between the consumer
and the smi-larb device. If the vcodec device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

CC: Tiffany Lin 
CC: Irui Wang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Tiffany Lin 
---
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 37 +++-
 .../platform/mtk-vcodec/mtk_vcodec_drv.h  |  3 --
 .../platform/mtk-vcodec/mtk_vcodec_enc.c  |  1 -
 .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   | 44 +++
 4 files changed, 10 insertions(+), 75 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 6038db96f71c..d0bf9aa3b29d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -8,14 +8,12 @@
 #include 
 #include 
 #include 
-#include <soc/mediatek/smi.h>
 
 #include "mtk_vcodec_dec_pm.h"
 #include "mtk_vcodec_util.h"
 
 int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
 {
-   struct device_node *node;
struct platform_device *pdev;
struct mtk_vcodec_pm *pm;
struct mtk_vcodec_clk *dec_clk;
@@ -26,18 +24,7 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
pm = &mtkdev->pm;
pm->mtkdev = mtkdev;
dec_clk = &pm->vdec_clk;
-   node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0);
-   if (!node) {
-   mtk_v4l2_err("of_parse_phandle mediatek,larb fail!");
-   return -1;
-   }
 
-   pdev = of_find_device_by_node(node);
-   of_node_put(node);
-   if (WARN_ON(!pdev)) {
-   return -1;
-   }
-   pm->larbvdec = &pdev->dev;
pdev = mtkdev->plat_dev;
pm->dev = &pdev->dev;
 
@@ -47,14 +34,11 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
dec_clk->clk_info = devm_kcalloc(&pdev->dev,
dec_clk->clk_num, sizeof(*clk_info),
GFP_KERNEL);
-   if (!dec_clk->clk_info) {
-   ret = -ENOMEM;
-   goto put_device;
-   }
+   if (!dec_clk->clk_info)
+   return -ENOMEM;
} else {
mtk_v4l2_err("Failed to get vdec clock count");
-   ret = -EINVAL;
-   goto put_device;
+   return -EINVAL;
}
 
for (i = 0; i < dec_clk->clk_num; i++) {
@@ -63,29 +47,24 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
"clock-names", i, &clk_info->clk_name);
if (ret) {
mtk_v4l2_err("Failed to get clock name id = %d", i);
-   goto put_device;
+   return ret;
}
clk_info->vcodec_clk = devm_clk_get(&pdev->dev,
clk_info->clk_name);
if (IS_ERR(clk_info->vcodec_clk)) {
mtk_v4l2_err("devm_clk_get (%d)%s fail", i,
clk_info->clk_name);
-   ret = PTR_ERR(clk_info->vcodec_clk);
-   goto put_device;
+   return PTR_ERR(clk_info->vcodec_clk);
}
}
 
pm_runtime_enable(&pdev->dev);
return 0;
-put_device:
-   put_device(pm->larbvdec);
-   return ret;
 }
 
 void mtk_vcodec_release_dec_pm(struct mtk_vcodec_dev *dev)
 {
pm_runtime_disable(dev->pm.dev);
-   put_device(dev->pm.larbvdec);
 }
 
 int mtk_vcodec_dec_pw_on(struct mtk_vcodec_pm *pm)
@@ -122,11 +101,6 @@ void mtk_vcodec_dec_clock_on(struct mtk_vcodec_pm *pm)
}
}
 
-   ret = mtk_smi_larb_get(pm->larbvdec);
-   if (ret) {
-   mtk_v4l2_err("mtk_smi_larb_get larbvdec fail %d", ret);
-   goto error;
-   }
return;
 
 error:
@@ -139,7 +113,6 @@ void mtk_vcodec_dec_clock_off(struct mtk_vcodec_pm *pm)
struct mtk_vcodec_clk *dec_clk = &pm->vdec_clk;
int i = 0;
 
-   mtk_smi_larb_put(pm->larbvdec);
for (i = dec_clk->clk_num - 1; i >= 0; i--)
clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk);
 }
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index c6c7672fecfb..64b73dd880ce 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -189,10 +189,7 @@ struct mtk_vcodec_clk {
  */
 struct mtk_vcodec_pm {
struct mtk_vcodec_clk   vdec_clk;
-   struct device   *larbvdec;
-
struct mtk_vcodec_clk   venc_clk;
-   struct device   *larbvenc;
struct device   *dev;
struct mtk_vcodec_dev   *mtkdev;
 };
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcode

[PATCH v6 11/11] arm64: dts: mediatek: Get rid of mediatek,larb for MM nodes

2021-07-14 Thread Yong Wu
After adding the device_link between the IOMMU consumer and smi,
the mediatek,larb property is no longer necessary.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 arch/arm64/boot/dts/mediatek/mt8173.dtsi | 16 
 arch/arm64/boot/dts/mediatek/mt8183.dtsi |  6 --
 2 files changed, 22 deletions(-)

diff --git a/arch/arm64/boot/dts/mediatek/mt8173.dtsi b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
index 2f0fc1e317d7..cf5d26db82b8 100644
--- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
@@ -1009,7 +1009,6 @@
 <&mmsys CLK_MM_MUTEX_32K>;
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
iommus = <&iommu M4U_PORT_MDP_RDMA0>;
-   mediatek,larb = <&larb0>;
};
 
mdp_rdma1: rdma@14002000 {
@@ -1019,7 +1018,6 @@
 <&mmsys CLK_MM_MUTEX_32K>;
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
iommus = <&iommu M4U_PORT_MDP_RDMA1>;
-   mediatek,larb = <&larb4>;
};
 
mdp_rsz0: rsz@14003000 {
@@ -1049,7 +1047,6 @@
clocks = <&mmsys CLK_MM_MDP_WDMA>;
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
iommus = <&iommu M4U_PORT_MDP_WDMA>;
-   mediatek,larb = <&larb0>;
};
 
mdp_wrot0: wrot@14007000 {
@@ -1058,7 +1055,6 @@
clocks = <&mmsys CLK_MM_MDP_WROT0>;
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
iommus = <&iommu M4U_PORT_MDP_WROT0>;
-   mediatek,larb = <&larb0>;
};
 
mdp_wrot1: wrot@14008000 {
@@ -1067,7 +1063,6 @@
clocks = <&mmsys CLK_MM_MDP_WROT1>;
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
iommus = <&iommu M4U_PORT_MDP_WROT1>;
-   mediatek,larb = <&larb4>;
};
 
ovl0: ovl@1400c000 {
@@ -1077,7 +1072,6 @@
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_OVL0>;
iommus = <&iommu M4U_PORT_DISP_OVL0>;
-   mediatek,larb = <&larb0>;
mediatek,gce-client-reg = <&gce SUBSYS_1400 0xc000 0x1000>;
};
 
@@ -1088,7 +1082,6 @@
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_OVL1>;
iommus = <&iommu M4U_PORT_DISP_OVL1>;
-   mediatek,larb = <&larb4>;
mediatek,gce-client-reg = <&gce SUBSYS_1400 0xd000 0x1000>;
};
 
@@ -1099,7 +1092,6 @@
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA0>;
iommus = <&iommu M4U_PORT_DISP_RDMA0>;
-   mediatek,larb = <&larb0>;
mediatek,gce-client-reg = <&gce SUBSYS_1400 0xe000 0x1000>;
};
 
@@ -1110,7 +1102,6 @@
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA1>;
iommus = <&iommu M4U_PORT_DISP_RDMA1>;
-   mediatek,larb = <&larb4>;
mediatek,gce-client-reg = <&gce SUBSYS_1400 0xf000 0x1000>;
};
 
@@ -1121,7 +1112,6 @@
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA2>;
iommus = <&iommu M4U_PORT_DISP_RDMA2>;
-   mediatek,larb = <&larb4>;
mediatek,gce-client-reg = <&gce SUBSYS_1401 0 0x1000>;
};
 
@@ -1132,7 +1122,6 @@
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_WDMA0>;
iommus = <&iommu M4U_PORT_DISP_WDMA0>;
-   mediatek,larb = <&larb0>;
mediatek,gce-client-reg = <&gce SUBSYS_1401 0x1000 0x1000>;
};
 
@@ -1143,7 +1132,6 @@
power-domains = <&spm MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_WDMA1>;
iommus = <&iommu M4U_PORT_DISP_WDMA1>;
-   mediatek,larb = <&larb4>;
mediatek,gce-client-reg = <&gce SUBSYS_1401 0x2000 0x1000>;
};
 
@@ -1394,7 +1382,6 @@
  <0 0x16027800 0 0x800>,   /* VDEC_HWB */
  <0 0x16028400 0 0x400>;   /* VDEC_HWG */
  

[PATCH v6 05/11] media: mtk-mdp: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Yong Wu
MediaTek IOMMU has already added the device_link between the consumer
and the smi-larb device. If the mdp device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

CC: Minghsiu Tsai 
CC: Houlong Wei 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Reviewed-by: Houlong Wei 
---
 drivers/media/platform/mtk-mdp/mtk_mdp_comp.c | 46 +--
 drivers/media/platform/mtk-mdp/mtk_mdp_comp.h |  2 -
 drivers/media/platform/mtk-mdp/mtk_mdp_core.c |  1 -
 3 files changed, 1 insertion(+), 48 deletions(-)

diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
index de2d425efdd1..5e0ea83a9f7f 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include <soc/mediatek/smi.h>
 #include 
 
 #include "mtk_mdp_comp.h"
@@ -57,13 +56,6 @@ int mtk_mdp_comp_power_on(struct mtk_mdp_comp *comp)
 {
int status, err;
 
-   if (comp->larb_dev) {
-   err = mtk_smi_larb_get(comp->larb_dev);
-   if (err)
-   dev_err(comp->dev,
-   "failed to get larb, err %d.\n", err);
-   }
-
err = pm_runtime_get_sync(comp->dev);
if (err < 0) {
dev_err(comp->dev, "failed to runtime get, err %d.\n", err);
@@ -146,9 +138,6 @@ void mtk_mdp_comp_clock_off(struct mtk_mdp_comp *comp)
continue;
clk_disable_unprepare(comp->clk[i]);
}
-
-   if (comp->larb_dev)
-   mtk_smi_larb_put(comp->larb_dev);
 }
 
 /*
@@ -236,9 +225,6 @@ static const struct component_ops mtk_mdp_component_ops = {
 
 int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct device *dev)
 {
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
-   int ret;
int i;
struct device_node *node = dev->of_node;
enum mtk_mdp_comp_type comp_type =
@@ -252,8 +238,7 @@ int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct device *dev)
if (IS_ERR(comp->clk[i])) {
if (PTR_ERR(comp->clk[i]) != -EPROBE_DEFER)
dev_err(dev, "Failed to get clock\n");
-   ret = PTR_ERR(comp->clk[i]);
-   goto err;
+   return PTR_ERR(comp->clk[i]);
}
 
/* Only RDMA needs two clocks */
@@ -261,36 +246,7 @@ int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct 
device *dev)
break;
}
 
-   /* Only DMA capable components need the LARB property */
-   comp->larb_dev = NULL;
-   if (comp_type != MTK_MDP_RDMA &&
-   comp_type != MTK_MDP_WDMA &&
-   comp_type != MTK_MDP_WROT)
-   return 0;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev,
-   "Missing mediadek,larb phandle in %pOF node\n", node);
-   ret = -EINVAL;
-   goto err;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   ret = -EPROBE_DEFER;
-   goto err;
-   }
-   of_node_put(larb_node);
-
-   comp->larb_dev = &larb_pdev->dev;
-
return 0;
-
-err:
-   return ret;
 }
 
 static int mtk_mdp_comp_probe(struct platform_device *pdev)
diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h 
b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
index 5201c47f7baa..2bd229cc7eae 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
@@ -11,13 +11,11 @@
  * struct mtk_mdp_comp - the MDP's function component data
  * @node:  list node to track sibing MDP components
  * @clk:   clocks required for component
- * @larb_dev:  SMI device required for component
  * @dev:   component's device
  */
 struct mtk_mdp_comp {
struct list_headnode;
struct clk  *clk[2];
-   struct device   *larb_dev;
struct device   *dev;
 };
 
diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c 
b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
index e1fb39231248..be7d35b3e3ff 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
@@ -18,7 +18,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mtk_mdp_comp.h"
 #include "mtk_mdp_core.h"
-- 
2.18.0
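
The commit message above relies on runtime-PM requests propagating from a consumer to its supplier once the two are linked. A minimal userspace sketch of that idea follows. It is not kernel code: `struct model_dev`, `model_pm_get()` and `model_pm_put()` are hypothetical names, and the single `supplier` pointer only stands in for a device link.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a device with one optional supplier. */
struct model_dev {
	const char *name;
	int usage_count;            /* runtime-PM style reference count */
	int powered;                /* 1 while usage_count > 0 */
	struct model_dev *supplier; /* stands in for a device link */
};

/* Take a reference; power the supplier chain before the device itself. */
void model_pm_get(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->usage_count++ == 0) {
		if (dev->supplier)
			model_pm_get(dev->supplier);
		dev->powered = 1;
	}
}

/* Drop a reference; power the device down before releasing its supplier. */
void model_pm_put(struct model_dev *dev)
{
	assert(dev != NULL && dev->usage_count > 0);
	if (--dev->usage_count == 0) {
		dev->powered = 0;
		if (dev->supplier)
			model_pm_put(dev->supplier);
	}
}
```

With an "mdp" device whose supplier is an "smi-larb" device, one `model_pm_get()` on the consumer powers both, so the consumer needs no separate call for the larb.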



[PATCH v2] video: fbdev: kyro: fix a DoS bug by restricting user input

2021-07-14 Thread Zheyu Ma
The user can pass any value to the driver through the 'ioctl'
interface. The driver does not validate it, which may cause DoS bugs.

The following log reveals it:

divide error:  [#1] PREEMPT SMP KASAN PTI
RIP: 0010:SetOverlayViewPort+0x133/0x5f0 
drivers/video/fbdev/kyro/STG4000OverlayDevice.c:476
Call Trace:
 kyro_dev_overlay_viewport_set drivers/video/fbdev/kyro/fbdev.c:378 [inline]
 kyrofb_ioctl+0x2eb/0x330 drivers/video/fbdev/kyro/fbdev.c:603
 do_fb_ioctl+0x1f3/0x700 drivers/video/fbdev/core/fbmem.c:1171
 fb_ioctl+0xeb/0x130 drivers/video/fbdev/core/fbmem.c:1185
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl fs/ioctl.c:739 [inline]
 __x64_sys_ioctl+0x19b/0x220 fs/ioctl.c:739
 do_syscall_64+0x32/0x80 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Zheyu Ma 
---
Changes in v2:
- Validate the inputs on a higher level
---
 drivers/video/fbdev/kyro/fbdev.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/video/fbdev/kyro/fbdev.c b/drivers/video/fbdev/kyro/fbdev.c
index 8fbde92ae8b9..eb0cbd1d12d5 100644
--- a/drivers/video/fbdev/kyro/fbdev.c
+++ b/drivers/video/fbdev/kyro/fbdev.c
@@ -372,6 +372,11 @@ static int kyro_dev_overlay_viewport_set(u32 x, u32 y, u32 
ulWidth, u32 ulHeight
/* probably haven't called CreateOverlay yet */
return -EINVAL;
 
+   if (ulWidth == 0 || ulWidth == 0x ||
+   ulHeight == 0 || ulHeight == 0x ||
+   (x < 2 && ulWidth + 2 == 0))
+   return -EINVAL;
+
/* Stop Ramdac Output */
DisableRamdacOutput(deviceInfo.pSTGReg);
 
-- 
2.17.6
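
The class of bug fixed above is a division whose divisor comes from unchecked user input. A minimal sketch of that kind of guard follows, assuming the divisor is a width plus a small margin computed in 32-bit unsigned arithmetic. `divisor_ok()`, `checked_scale()` and the `margin` parameter are hypothetical and are not taken from the kyro driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when width + margin is a usable, non-zero u32 divisor. */
bool divisor_ok(uint32_t width, uint32_t margin)
{
	uint32_t sum = width + margin; /* wraps modulo 2^32 */

	return width != 0 && sum != 0;
}

/* Divide only after the divisor has been validated (hypothetical helper). */
int checked_scale(uint32_t src, uint32_t width, uint32_t margin, uint32_t *out)
{
	uint32_t sum = width + margin;

	assert(out != NULL);
	if (!divisor_ok(width, margin))
		return -EINVAL;
	*out = src / sum;
	return 0;
}
```

Checking the wrapped sum, not just the raw input, matters because a very large width can still produce a zero divisor after the addition.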



Re: [PATCH v6 01/11] dt-binding: mediatek: Get rid of mediatek, larb for multimedia HW

2021-07-14 Thread Dafna Hirschfeld

Hi,
thanks for the patch

On 14.07.21 04:56, Yong Wu wrote:

After adding a device_link between the consumer and the smi-larbs,
if the consumer calls its own pm_runtime_get(_sync), the
pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically. Thus, the consumer doesn't need the property.

The IOMMU also knows which larb this consumer connects to from the
iommu id in the "iommus=" property.

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
Reviewed-by: Evan Green 
---
  .../bindings/display/mediatek/mediatek,disp.txt  | 9 -
  .../devicetree/bindings/media/mediatek-jpeg-decoder.yaml | 9 -
  .../devicetree/bindings/media/mediatek-jpeg-encoder.yaml | 9 -


Which repo are these patches based on?
In linux-next the file mediatek-jpeg-encoder.yaml doesn't exist.

Thanks,
Dafna


  Documentation/devicetree/bindings/media/mediatek-mdp.txt | 8 
  .../devicetree/bindings/media/mediatek-vcodec.txt| 4 
  5 files changed, 39 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt 
b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
index fbb59c9ddda6..867bd82e2f03 100644
--- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
@@ -61,8 +61,6 @@ Required properties (DMA function blocks):
"mediatek,-disp-rdma"
"mediatek,-disp-wdma"
the supported chips are mt2701, mt8167 and mt8173.
-- larb: Should contain a phandle pointing to the local arbiter device as 
defined
-  in 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
  - iommus: Should point to the respective IOMMU block with master port as
argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
for details.
@@ -91,7 +89,6 @@ ovl0: ovl@1400c000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_OVL0>;
iommus = <&iommu M4U_PORT_DISP_OVL0>;
-   mediatek,larb = <&larb0>;
  };
  
  ovl1: ovl@1400d000 {

@@ -101,7 +98,6 @@ ovl1: ovl@1400d000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_OVL1>;
iommus = <&iommu M4U_PORT_DISP_OVL1>;
-   mediatek,larb = <&larb4>;
  };
  
  rdma0: rdma@1400e000 {

@@ -111,7 +107,6 @@ rdma0: rdma@1400e000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA0>;
iommus = <&iommu M4U_PORT_DISP_RDMA0>;
-   mediatek,larb = <&larb0>;
mediatek,rdma-fifosize = <8192>;
  };
  
@@ -122,7 +117,6 @@ rdma1: rdma@1400f000 {

power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA1>;
iommus = <&iommu M4U_PORT_DISP_RDMA1>;
-   mediatek,larb = <&larb4>;
  };
  
  rdma2: rdma@1401 {

@@ -132,7 +126,6 @@ rdma2: rdma@1401 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_RDMA2>;
iommus = <&iommu M4U_PORT_DISP_RDMA2>;
-   mediatek,larb = <&larb4>;
  };
  
  wdma0: wdma@14011000 {

@@ -142,7 +135,6 @@ wdma0: wdma@14011000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_WDMA0>;
iommus = <&iommu M4U_PORT_DISP_WDMA0>;
-   mediatek,larb = <&larb0>;
  };
  
  wdma1: wdma@14012000 {

@@ -152,7 +144,6 @@ wdma1: wdma@14012000 {
power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
clocks = <&mmsys CLK_MM_DISP_WDMA1>;
iommus = <&iommu M4U_PORT_DISP_WDMA1>;
-   mediatek,larb = <&larb4>;
  };
  
  color0: color@14013000 {

diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml 
b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
index 9b87f036f178..052e752157b4 100644
--- a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
+++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
@@ -42,13 +42,6 @@ properties:
power-domains:
  maxItems: 1
  
-  mediatek,larb:

-$ref: '/schemas/types.yaml#/definitions/phandle'
-description: |
-  Must contain the local arbiters in the current Socs, see
-  
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
-  for details.
-
iommus:
  maxItems: 2
  description: |
@@ -63,7 +56,6 @@ required:
- clocks
- clock-names
- power-domains
-  - mediatek,larb
- iommus
  
  additionalProperties: false

@@ -83,7 +75,6 @@ examples:
clock-names = "jpgdec-smi",
  "jpgdec";
power-domains = <&scpsys MT2701_POWER_DOMAIN_ISP>;
-  mediatek,larb = <&larb2>;
iommus = <&iommu MT2701_M4U_PORT_JPGDEC_WDMA>,
 <&iommu MT2701_M4U_PORT_JPGDEC_BSDMA>;
  };
diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml 
b/Documentation/devicetree/bindings/media/medi

[PATCH] drm/amd/display: Fix 10bit 4K display on CIK GPUs

2021-07-14 Thread Liviu Dudau
Commit 72a7cf0aec0c ("drm/amd/display: Keep linebuffer pixel depth at
30bpp for DCE-11.0.") doesn't seem to have fixed 10bit 4K rendering over
DisplayPort for CIK GPUs. On my machine with a HAWAII GPU I get a broken
image that looks like it has an effective resolution of 1920x1080 but
scaled up in an irregular way. Reverting the commit or applying this
patch fixes the problem on v5.14-rc1.

Fixes: 72a7cf0aec0c ("drm/amd/display: Keep linebuffer pixel depth at 30bpp for DCE-11.0.")
Signed-off-by: Liviu Dudau 
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index a6a67244a322e..1596f6b7fed7c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1062,7 +1062,7 @@ bool resource_build_scaling_params(struct pipe_ctx 
*pipe_ctx)
 * so use only 30 bpp on DCE_VERSION_11_0. Testing with DCE 11.2 and 8.3
 * did not show such problems, so this seems to be the exception.
 */
-   if (plane_state->ctx->dce_version != DCE_VERSION_11_0)
+   if (plane_state->ctx->dce_version > DCE_VERSION_11_0)
pipe_ctx->plane_res.scl_data.lb_params.depth = 
LB_PIXEL_DEPTH_36BPP;
else
pipe_ctx->plane_res.scl_data.lb_params.depth = 
LB_PIXEL_DEPTH_30BPP;
-- 
2.32.0
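
The one-line change above swaps an inequality test for an ordered comparison, which changes the result for every generation older than DCE 11.0. A small sketch of the two predicates follows. The `enum model_dce` values are hypothetical, ordered stand-ins and not the driver's real `dce_version` constants.

```c
#include <stdbool.h>

/* Hypothetical, ordered stand-ins for display engine generations. */
enum model_dce {
	MODEL_DCE_8 = 0,
	MODEL_DCE_10,
	MODEL_DCE_11_0,
	MODEL_DCE_11_2,
};

/* Before the patch: every generation except 11.0 gets 36 bpp. */
bool uses_36bpp_before(enum model_dce v)
{
	return v != MODEL_DCE_11_0;
}

/* After the patch: only generations newer than 11.0 get 36 bpp. */
bool uses_36bpp_after(enum model_dce v)
{
	return v > MODEL_DCE_11_0;
}
```

In this model an older generation such as `MODEL_DCE_8` moves from 36 bpp to 30 bpp, while 11.0 and anything newer keep their previous result.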



Re: [PATCH v6 03/11] iommu/mediatek: Add device_link between the consumer and the larb devices

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

The MediaTek IOMMU-SMI diagram is shown below. All consumers connect to
an smi-larb, which in turn connects to smi-common.

        M4U
         |
    smi-common
         |
  -------------
  |         |    ...
  |         |
larb1     larb2
  |         |
vdec       venc

When a consumer works, it must enable the smi-larb's power, which in turn
requires the smi-common's power to be enabled first.

Thus, first use a device link to connect the consumer and the smi-larbs,
then add a device link between the smi-larb and smi-common.

This patch adds the device_link between the consumer and the larbs.

When calling device_link_add, I add the flag DL_FLAG_STATELESS to avoid
calling pm_runtime_xx, which keeps the original status of the clocks. This
avoids two issues:
1) The display HW shows the fastlogo abnormally, as reported in [1]. At the
beginning, all the clocks are enabled before entering the kernel, but the
clocks for the display HW (always in larb0) will be gated after clk_enable
and clk_disable are called from device_link_add(->pm_runtime_resume) and
rpm_idle. The clock operation happens before the display driver probes. At
that time, the display HW will be abnormal.

2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
pm_runtime_xx to avoid the deadlock.

Correspondingly, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, so
device_link_remove must be called explicitly.

[1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
[2] https://lore.kernel.org/patchwork/patch/1086569/

Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
---
  drivers/iommu/mtk_iommu.c| 22 ++
  drivers/iommu/mtk_iommu_v1.c | 20 +++-
  2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a02dde094788..ee742900cf4b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
+   struct device_link *link;
+   struct device *larbdev;
+   unsigned int larbid;
  
  	if (!fwspec || fwspec->ops != &mtk_iommu_ops)

return ERR_PTR(-ENODEV); /* Not a iommu client device */
  
  	data = dev_iommu_priv_get(dev);
  
+	/*

+* Link the consumer device with the smi-larb device(supplier)
+* The device in each a larb is a independent HW. thus only link
+* one larb here.
+*/
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));

Shouldn't an ERR_PTR be returned in case of failure?

Thanks,
Dafna
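
The question above is whether the probe path should propagate a link failure to its caller instead of only logging it. A minimal userspace sketch of the error-pointer pattern follows. The `model_*` helpers are hypothetical stand-ins that mimic the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(), `link_ok` stands in for device_link_add() succeeding, and -ENODEV is an assumed error code that the thread does not specify.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the errno from an error pointer. */
long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True when the pointer lies in the top MODEL_MAX_ERRNO addresses. */
bool model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_iommu {
	int id;
};

/* Hypothetical probe: fail with an error pointer when linking fails. */
void *model_probe(struct model_iommu *iommu, bool link_ok)
{
	if (!link_ok)
		return model_err_ptr(-ENODEV); /* assumed error code */
	return iommu;
}
```

A caller can then test the result with `model_is_err()` and stop early, rather than continuing with a consumer whose supplier link was never created.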


return &data->iommu;
  }
  
  static void mtk_iommu_release_device(struct device *dev)

  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct mtk_iommu_data *data;
+   struct device *larbdev;
+   unsigned int larbid;
  
  	if (!fwspec || fwspec->ops != &mtk_iommu_ops)

return;
  
+	data = dev_iommu_priv_get(dev);

+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   device_link_remove(dev, larbdev);
+
iommu_fwspec_free(dev);
  }
  
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c

index d9365a3d8dc9..d2a7c66b8239 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -424,7 +424,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct of_phandle_args iommu_spec;
struct mtk_iommu_data *data;
-   int err, idx = 0;
+   int err, idx = 0, larbid;
+   struct device_link *link;
+   struct device *larbdev;
  
  	while (!of_parse_phandle_with_args(dev->of_node, "iommus",

   "#iommu-cells",
@@ -445,6 +447,14 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
  
  	data = dev_iommu_priv_get(dev);
  
+	/* Link the consumer device with the smi-larb device(supplier) */

+   larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
+
return &data->iommu;
  }
  
@@ -465,10 +475,18 @@ static void mtk_iommu_probe_finalize(struct device *dev)

  static void mtk_iommu_release_device(struct device *dev)
  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct mtk_iommu_data *data;
+   struct device *larbdev;
+   un

Re: [PATCH v8 00/14] drm/tegra: Introduce a modern UABI

2021-07-14 Thread Thierry Reding
On Sat, Jul 10, 2021 at 12:16:28AM +0300, Dmitry Osipenko wrote:
> Hello Thierry,
> 
> 09.07.2021 22:31, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Hi all,
> > 
> > Mikko has been away for a few weeks, so I've been testing and revising
> > the new UABI patches in the meantime. There are very minor changes to
> > the naming of some of the UABI fields, but other than that it's mostly
> > unchanged from v7.
> 
> Why you haven't addressed any of the previous review comments? There
> were some obvious problems in v7 and v8 still has them.
> 
> > One notable change is that mappings can now be read-only, write-only,
> > read-write or none of them (rather than just read-only or read-write),
> > since those combinations are all supported by the IOMMUs and it might
> > be useful to make some mappings write-only.
> > 
> > For a full list of changes in v8, see the changelog in patch 6.
> > 
> > I've also updated the libdrm_tegra library to work against this version
> > of the UABI. A branch can be found here:
> > 
> >   https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8
> > 
> > That contains helper APIs for the concepts introduced in this series and
> > shows how they can be used in various tests that can be run for sanity
> > checking.
> > 
> > In addition, Mikko has made updates to the following projects, though
> > they may need to be updated for the minor changes in v8:
> > 
> > * vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
> >   Experimental support for MPEG2 and H264 decoding on T210, T186
> >   and T194.
> > 
> > * xf86-video-opentegra - 
> > https://github.com/grate-driver/xf86-video-opentegra
> >   X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.
> > 
> > * grate - https://github.com/grate-driver/grate
> >   3D rendering testbed for Tegra20, Tegra30, and Tegra114
> > 
> > I plan on putting this into linux-next soon after v5.14-rc1 so that this
> > can get some soak time.
> 
> It should be a bit too early to push it into kernel. The UAPI is not
> ready because it's missing essential features. We can't call this a
> 'modern UABI' until it's fully implemented. The design decisions are
> still questionable because this UAPI is built around the proprietary
> firmware (and based on UAPI of downstream driver) which doesn't fit well
> into DRM world. I haven't got all the answers to my previous questions,
> should I repeat them?

I don't know what you mean by "built around the proprietary firmware".
Yes, this ends up using proprietary firmware for some of the hardware
engines that host1x drives, but that's completely orthogonal to the
UABI. No matter what UABI we'd be introducing, we'd be using that same
firmware.

And yes, this is based on the UABI of the downstream drivers. The design
is guided by what we've learned over the last decade working with this
hardware in use-cases that customers need. It'd be dumb not to use that
knowledge to our advantage. This is the only way to ensure we can
deliver an upstream driver that's on par with our downstream drivers and
therefore make it possible to eventually adopt the upstream driver.

And frankly, you did get answers to previous questions, though perhaps
not all, but I'm out of patience. We've been going in circles and at
some point we have to make a decision so we can make progress.

I made several attempts over the years to get something usable merged
upstream so that we can finally make use of this hardware and get it
supported upstream and each time I made the mistake of trying to make it
perfect and accommodate all wishlist items. The result is that I wasted a
lot of time and have nothing to show for it.

I've also been very hard on Mikko with his work on this and I think we've
stretched this as far as we can without compromising too much on what we
are going to need from this UABI in the future.

We've gone through the process of making sure all existing userspace can
and does work with this new UABI and even left the old UABI in place in
case we need it.

I'm reasonably satisfied with what we have now and I don't see any
reason to hold this back any further. We always have the option of
adding UABI if we need it for something, or extend functionality of
existing UABI where it makes sense. But we also do have to start
somewhere, otherwise we're just not going to get anywhere, as the last
10 years have shown.

> UAPI is not the only problem that we have. The performance and stability
> of the driver are in a very bad shape too. The modern UAPI can't be
> built on top of the old code. It's clear now that this is a very serious
> problem that must be addressed along with the UAPI work and I'm getting
> silence from you guys.

We've been over this multiple times before, though perhaps never over
email. So let me make this clear here again and for future reference: we
will *not* be rewriting the driver from scratch.

If there are any serious performance and stability issues, then we'll
find them and

Re: [PATCH v6 04/11] media: mtk-jpeg: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

The MediaTek IOMMU driver already adds a device_link between the consumer
and the smi-larb device. When the jpg device calls pm_runtime_get_sync(),
the smi-larb's pm_runtime_get_sync() is therefore called automatically.

Once the larb_get operations are removed, mtk_jpeg_clk_init is also
unnecessary. Remove it too.

CC: Rick Chang 
CC: Xia Jiang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Rick Chang 


Reviewed-by: Dafna Hirschfeld 


---
  .../media/platform/mtk-jpeg/mtk_jpeg_core.c   | 45 +--
  .../media/platform/mtk-jpeg/mtk_jpeg_core.h   |  2 -
  2 files changed, 2 insertions(+), 45 deletions(-)

diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c 
b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
index a89c7b206eef..4fea2c512434 100644
--- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
@@ -22,7 +22,6 @@
  #include 
  #include 
  #include 
-#include 
  
  #include "mtk_jpeg_enc_hw.h"

  #include "mtk_jpeg_dec_hw.h"
@@ -1055,10 +1054,6 @@ static void mtk_jpeg_clk_on(struct mtk_jpeg_dev *jpeg)
  {
int ret;
  
-	ret = mtk_smi_larb_get(jpeg->larb);

-   if (ret)
-   dev_err(jpeg->dev, "mtk_smi_larb_get larbvdec fail %d\n", ret);
-
ret = clk_bulk_prepare_enable(jpeg->variant->num_clks,
  jpeg->variant->clks);
if (ret)
@@ -1069,7 +1064,6 @@ static void mtk_jpeg_clk_off(struct mtk_jpeg_dev *jpeg)
  {
clk_bulk_disable_unprepare(jpeg->variant->num_clks,
   jpeg->variant->clks);
-   mtk_smi_larb_put(jpeg->larb);
  }
  
  static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg)

@@ -1284,35 +1278,6 @@ static struct clk_bulk_data mtk_jpeg_clocks[] = {
{ .id = "jpgenc" },
  };
  
-static int mtk_jpeg_clk_init(struct mtk_jpeg_dev *jpeg)

-{
-   struct device_node *node;
-   struct platform_device *pdev;
-   int ret;
-
-   node = of_parse_phandle(jpeg->dev->of_node, "mediatek,larb", 0);
-   if (!node)
-   return -EINVAL;
-   pdev = of_find_device_by_node(node);
-   if (WARN_ON(!pdev)) {
-   of_node_put(node);
-   return -EINVAL;
-   }
-   of_node_put(node);
-
-   jpeg->larb = &pdev->dev;
-
-   ret = devm_clk_bulk_get(jpeg->dev, jpeg->variant->num_clks,
-   jpeg->variant->clks);
-   if (ret) {
-   dev_err(&pdev->dev, "failed to get jpeg clock:%d\n", ret);
-   put_device(&pdev->dev);
-   return ret;
-   }
-
-   return 0;
-}
-
  static void mtk_jpeg_job_timeout_work(struct work_struct *work)
  {
struct mtk_jpeg_dev *jpeg = container_of(work, struct mtk_jpeg_dev,
@@ -1333,11 +1298,6 @@ static void mtk_jpeg_job_timeout_work(struct work_struct 
*work)
v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
  }
  
-static inline void mtk_jpeg_clk_release(struct mtk_jpeg_dev *jpeg)

-{
-   put_device(jpeg->larb);
-}
-
  static int mtk_jpeg_probe(struct platform_device *pdev)
  {
struct mtk_jpeg_dev *jpeg;
@@ -1376,7 +1336,8 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
goto err_req_irq;
}
  
-	ret = mtk_jpeg_clk_init(jpeg);

+   ret = devm_clk_bulk_get(jpeg->dev, jpeg->variant->num_clks,
+   jpeg->variant->clks);
if (ret) {
dev_err(&pdev->dev, "Failed to init clk, err %d\n", ret);
goto err_clk_init;
@@ -1442,7 +1403,6 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
v4l2_device_unregister(&jpeg->v4l2_dev);
  
  err_dev_register:

-   mtk_jpeg_clk_release(jpeg);
  
  err_clk_init:
  
@@ -1460,7 +1420,6 @@ static int mtk_jpeg_remove(struct platform_device *pdev)

video_device_release(jpeg->vdev);
v4l2_m2m_release(jpeg->m2m_dev);
v4l2_device_unregister(&jpeg->v4l2_dev);
-   mtk_jpeg_clk_release(jpeg);
  
  	return 0;

  }
diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h 
b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
index 595f7f10c9fd..3e4811a41ba2 100644
--- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
@@ -85,7 +85,6 @@ struct mtk_jpeg_variant {
   * @alloc_ctx:videobuf2 memory allocator's context
   * @vdev: video device node for jpeg mem2mem mode
   * @reg_base: JPEG registers mapping
- * @larb:  SMI device
   * @job_timeout_work: IRQ timeout structure
   * @variant:  driver variant to be used
   */
@@ -99,7 +98,6 @@ struct mtk_jpeg_dev {
void*alloc_ctx;
struct video_device *vdev;
void __iomem*reg_base;
-   struct device   *larb;
struct delayed_work job_timeout_work;
const struct mtk_jpeg_variant *variant;
  };



Re: [PATCH v6 05/11] media: mtk-mdp: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

The MediaTek IOMMU driver already adds a device_link between the consumer
and the smi-larb device. When the mdp device calls pm_runtime_get_sync(),
the smi-larb's pm_runtime_get_sync() is therefore called automatically.

CC: Minghsiu Tsai 
CC: Houlong Wei 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Reviewed-by: Houlong Wei 


Reviewed-by: Dafna Hirschfeld 


---
  drivers/media/platform/mtk-mdp/mtk_mdp_comp.c | 46 +--
  drivers/media/platform/mtk-mdp/mtk_mdp_comp.h |  2 -
  drivers/media/platform/mtk-mdp/mtk_mdp_core.c |  1 -
  3 files changed, 1 insertion(+), 48 deletions(-)

diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c 
b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
index de2d425efdd1..5e0ea83a9f7f 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
@@ -13,7 +13,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  
  #include "mtk_mdp_comp.h"

@@ -57,13 +56,6 @@ int mtk_mdp_comp_power_on(struct mtk_mdp_comp *comp)
  {
int status, err;
  
-	if (comp->larb_dev) {

-   err = mtk_smi_larb_get(comp->larb_dev);
-   if (err)
-   dev_err(comp->dev,
-   "failed to get larb, err %d.\n", err);
-   }
-
err = pm_runtime_get_sync(comp->dev);
if (err < 0) {
dev_err(comp->dev, "failed to runtime get, err %d.\n", err);
@@ -146,9 +138,6 @@ void mtk_mdp_comp_clock_off(struct mtk_mdp_comp *comp)
continue;
clk_disable_unprepare(comp->clk[i]);
}
-
-   if (comp->larb_dev)
-   mtk_smi_larb_put(comp->larb_dev);
  }
  
  /*

@@ -236,9 +225,6 @@ static const struct component_ops mtk_mdp_component_ops = {
  
  int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct device *dev)

  {
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
-   int ret;
int i;
struct device_node *node = dev->of_node;
enum mtk_mdp_comp_type comp_type =
@@ -252,8 +238,7 @@ int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct 
device *dev)
if (IS_ERR(comp->clk[i])) {
if (PTR_ERR(comp->clk[i]) != -EPROBE_DEFER)
dev_err(dev, "Failed to get clock\n");
-   ret = PTR_ERR(comp->clk[i]);
-   goto err;
+   return PTR_ERR(comp->clk[i]);
}
  
  		/* Only RDMA needs two clocks */

@@ -261,36 +246,7 @@ int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct 
device *dev)
break;
}
  
-	/* Only DMA capable components need the LARB property */

-   comp->larb_dev = NULL;
-   if (comp_type != MTK_MDP_RDMA &&
-   comp_type != MTK_MDP_WDMA &&
-   comp_type != MTK_MDP_WROT)
-   return 0;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev,
-   "Missing mediadek,larb phandle in %pOF node\n", node);
-   ret = -EINVAL;
-   goto err;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   ret = -EPROBE_DEFER;
-   goto err;
-   }
-   of_node_put(larb_node);
-
-   comp->larb_dev = &larb_pdev->dev;
-
return 0;
-
-err:
-   return ret;
  }
  
  static int mtk_mdp_comp_probe(struct platform_device *pdev)

diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h 
b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
index 5201c47f7baa..2bd229cc7eae 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
@@ -11,13 +11,11 @@
   * struct mtk_mdp_comp - the MDP's function component data
   * @node: list node to track sibing MDP components
   * @clk:  clocks required for component
- * @larb_dev:  SMI device required for component
   * @dev:  component's device
   */
  struct mtk_mdp_comp {
struct list_headnode;
struct clk  *clk[2];
-   struct device   *larb_dev;
struct device   *dev;
  };
  
diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c

index e1fb39231248..be7d35b3e3ff 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
@@ -18,7 +18,6 @@
  #include 
  #include 
  #include 
-#include 
  
  #include "mtk_mdp_comp.h"

  #include "mtk_mdp_core.h"



Re: [PATCH v6 07/11] drm/mediatek: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

The MediaTek IOMMU driver already adds a device_link between the consumer
and the smi-larb device. When the drm device calls pm_runtime_get_sync(),
the smi-larb's pm_runtime_get_sync() is therefore called automatically.

CC: CK Hu 
CC: Philipp Zabel 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Chun-Kuang Hu 


Reviewed-by: Dafna Hirschfeld 


---
  drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  9 --
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 36 ++---
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |  1 -
  drivers/gpu/drm/mediatek/mtk_drm_drv.c  |  5 +--
  4 files changed, 3 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c 
b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index 08e3f352377d..d046abcf66ce 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -10,7 +10,6 @@
  #include 
  
  #include 

-#include 
  
  #include 

  #include 
@@ -551,12 +550,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
  
  	DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
  
-	ret = mtk_smi_larb_get(comp->larb_dev);

-   if (ret) {
-   DRM_ERROR("Failed to get larb: %d\n", ret);
-   return;
-   }
-
ret = pm_runtime_resume_and_get(comp->dev);
if (ret < 0)
DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n",
@@ -564,7 +557,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
  
  	ret = mtk_crtc_ddp_hw_init(mtk_crtc);

if (ret) {
-   mtk_smi_larb_put(comp->larb_dev);
pm_runtime_put(comp->dev);
return;
}
@@ -601,7 +593,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc 
*crtc,
  
  	drm_crtc_vblank_off(crtc);

mtk_crtc_ddp_hw_fini(mtk_crtc);
-   mtk_smi_larb_put(comp->larb_dev);
ret = pm_runtime_put(comp->dev);
if (ret < 0)
DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n",
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index 75bc00e17fc4..7d240218d4c7 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -449,37 +449,15 @@ unsigned int mtk_drm_find_possible_crtc_by_comp(struct 
drm_device *drm,
return ret;
  }
  
-static int mtk_ddp_get_larb_dev(struct device_node *node, struct mtk_ddp_comp *comp,

-   struct device *dev)
-{
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev, "Missing mediadek,larb phandle in %pOF node\n", 
node);
-   return -EINVAL;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   return -EPROBE_DEFER;
-   }
-   of_node_put(larb_node);
-   comp->larb_dev = &larb_pdev->dev;
-
-   return 0;
-}
-
  int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp,
  enum mtk_ddp_comp_id comp_id)
  {
struct platform_device *comp_pdev;
enum mtk_ddp_comp_type type;
struct mtk_ddp_comp_dev *priv;
+#if IS_REACHABLE(CONFIG_MTK_CMDQ)
int ret;
+#endif
  
  	if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX)

return -EINVAL;
@@ -495,16 +473,6 @@ int mtk_ddp_comp_init(struct device_node *node, struct 
mtk_ddp_comp *comp,
}
comp->dev = &comp_pdev->dev;
  
-	/* Only DMA capable components need the LARB property */

-   if (type == MTK_DISP_OVL ||
-   type == MTK_DISP_OVL_2L ||
-   type == MTK_DISP_RDMA ||
-   type == MTK_DISP_WDMA) {
-   ret = mtk_ddp_get_larb_dev(node, comp, comp->dev);
-   if (ret)
-   return ret;
-   }
-
if (type == MTK_DISP_BLS ||
type == MTK_DISP_CCORR ||
type == MTK_DISP_COLOR ||
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
index bb914d976cf5..1b582262b682 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
@@ -70,7 +70,6 @@ struct mtk_ddp_comp_funcs {
  struct mtk_ddp_comp {
struct device *dev;
int irq;
-   struct device *larb_dev;
enum mtk_ddp_comp_id id;
const struct mtk_ddp_comp_funcs *funcs;
  };
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index b46bdb8985da..0d5ef3d8d081 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -577,11 +577,8 @@ static int mtk_drm_probe(struct platform_de

Re: [PATCH v6 08/11] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the vcodec device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

CC: Tiffany Lin 
CC: Irui Wang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Tiffany Lin 


Reviewed-by: Dafna Hirschfeld 


---
  .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 37 +++-
  .../platform/mtk-vcodec/mtk_vcodec_drv.h  |  3 --
  .../platform/mtk-vcodec/mtk_vcodec_enc.c  |  1 -
  .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   | 44 +++
  4 files changed, 10 insertions(+), 75 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 6038db96f71c..d0bf9aa3b29d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -8,14 +8,12 @@
  #include 
  #include 
  #include 
-#include 
  
  #include "mtk_vcodec_dec_pm.h"

  #include "mtk_vcodec_util.h"
  
  int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)

  {
-   struct device_node *node;
struct platform_device *pdev;
struct mtk_vcodec_pm *pm;
struct mtk_vcodec_clk *dec_clk;
@@ -26,18 +24,7 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
pm = &mtkdev->pm;
pm->mtkdev = mtkdev;
dec_clk = &pm->vdec_clk;
-   node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0);
-   if (!node) {
-   mtk_v4l2_err("of_parse_phandle mediatek,larb fail!");
-   return -1;
-   }
  
-	pdev = of_find_device_by_node(node);

-   of_node_put(node);
-   if (WARN_ON(!pdev)) {
-   return -1;
-   }
-   pm->larbvdec = &pdev->dev;
pdev = mtkdev->plat_dev;
pm->dev = &pdev->dev;
  
@@ -47,14 +34,11 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)

dec_clk->clk_info = devm_kcalloc(&pdev->dev,
dec_clk->clk_num, sizeof(*clk_info),
GFP_KERNEL);
-   if (!dec_clk->clk_info) {
-   ret = -ENOMEM;
-   goto put_device;
-   }
+   if (!dec_clk->clk_info)
+   return -ENOMEM;
} else {
mtk_v4l2_err("Failed to get vdec clock count");
-   ret = -EINVAL;
-   goto put_device;
+   return -EINVAL;
}
  
  	for (i = 0; i < dec_clk->clk_num; i++) {

@@ -63,29 +47,24 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
"clock-names", i, &clk_info->clk_name);
if (ret) {
mtk_v4l2_err("Failed to get clock name id = %d", i);
-   goto put_device;
+   return ret;
}
clk_info->vcodec_clk = devm_clk_get(&pdev->dev,
clk_info->clk_name);
if (IS_ERR(clk_info->vcodec_clk)) {
mtk_v4l2_err("devm_clk_get (%d)%s fail", i,
clk_info->clk_name);
-   ret = PTR_ERR(clk_info->vcodec_clk);
-   goto put_device;
+   return PTR_ERR(clk_info->vcodec_clk);
}
}
  
  	pm_runtime_enable(&pdev->dev);

return 0;
-put_device:
-   put_device(pm->larbvdec);
-   return ret;
  }
  
  void mtk_vcodec_release_dec_pm(struct mtk_vcodec_dev *dev)

  {
pm_runtime_disable(dev->pm.dev);
-   put_device(dev->pm.larbvdec);
  }
  
  int mtk_vcodec_dec_pw_on(struct mtk_vcodec_pm *pm)

@@ -122,11 +101,6 @@ void mtk_vcodec_dec_clock_on(struct mtk_vcodec_pm *pm)
}
}
  
-	ret = mtk_smi_larb_get(pm->larbvdec);

-   if (ret) {
-   mtk_v4l2_err("mtk_smi_larb_get larbvdec fail %d", ret);
-   goto error;
-   }
return;
  
  error:

@@ -139,7 +113,6 @@ void mtk_vcodec_dec_clock_off(struct mtk_vcodec_pm *pm)
struct mtk_vcodec_clk *dec_clk = &pm->vdec_clk;
int i = 0;
  
-	mtk_smi_larb_put(pm->larbvdec);

for (i = dec_clk->clk_num - 1; i >= 0; i--)
clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk);
  }
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index c6c7672fecfb..64b73dd880ce 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -189,10 +189,7 @@ struct mtk_vcodec_clk {
   */
  struct mtk_vcodec_pm {
struct mtk_vcodec_clk   vdec_clk;
-   struct device   *larbvdec;
-
struct mtk_vcodec_clk   venc_clk;
-   struct device   *larbvenc;
struct device   *dev;
struct mtk_vcodec_dev   *mtkdev;
  };
diff --git a/drivers

Re: [PATCH v6 09/11] memory: mtk-smi: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

After adding device_link between the iommu consumer and smi-larb,
the pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically. We can get rid of mtk_smi_larb_get/put.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Krzysztof Kozlowski 
Acked-by: Matthias Brugger 


Reviewed-by: Dafna Hirschfeld 


---
  drivers/memory/mtk-smi.c   | 14 --
  include/soc/mediatek/smi.h | 20 
  2 files changed, 34 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index c5fb51f73b34..7c61c924e220 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -134,20 +134,6 @@ static void mtk_smi_clk_disable(const struct mtk_smi *smi)
clk_disable_unprepare(smi->clk_apb);
  }
  
-int mtk_smi_larb_get(struct device *larbdev)

-{
-   int ret = pm_runtime_resume_and_get(larbdev);
-
-   return (ret < 0) ? ret : 0;
-}
-EXPORT_SYMBOL_GPL(mtk_smi_larb_get);
-
-void mtk_smi_larb_put(struct device *larbdev)
-{
-   pm_runtime_put_sync(larbdev);
-}
-EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
-
  static int
  mtk_smi_larb_bind(struct device *dev, struct device *master, void *data)
  {
diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
index 15e3397cec58..11f7d6b59642 100644
--- a/include/soc/mediatek/smi.h
+++ b/include/soc/mediatek/smi.h
@@ -19,26 +19,6 @@ struct mtk_smi_larb_iommu {
unsigned char  bank[32];
  };
  
-/*

- * mtk_smi_larb_get: Enable the power domain and clocks for this local arbiter.
- *   It also initialize some basic setting(like iommu).
- * mtk_smi_larb_put: Disable the power domain and clocks for this local arbiter.
- * Both should be called in non-atomic context.
- *
- * Returns 0 if successful, negative on failure.
- */
-int mtk_smi_larb_get(struct device *larbdev);
-void mtk_smi_larb_put(struct device *larbdev);
-
-#else
-
-static inline int mtk_smi_larb_get(struct device *larbdev)
-{
-   return 0;
-}
-
-static inline void mtk_smi_larb_put(struct device *larbdev) { }
-
  #endif
  
  #endif




Re: [PATCH v9 1/4] dt-bindings:drm/bridge:anx7625:add vendor define flags

2021-07-14 Thread Laurent Pinchart
Hi Xin,

On Wed, Jul 07, 2021 at 03:30:51PM +0800, Xin Ji wrote:
> On Thu, Jun 24, 2021 at 01:57:22PM +0200, Robert Foss wrote:
> > Hey Xin,
> > 
> > I would like to merge this series now, but this patch needs a review
> > first. Maybe Laurent/Rob Herring are good candidates.
>
> Hi Rob, I got comments from Laurent and Rob Herring before and explained
> why we need these DT properties, but so far I haven't received any response.
> 
> Hi Rob Herring and Laurent, for the DT property lane0/1-swing, Google
> engineers have a strong demand for them. They don't want to move the DP
> swing adjustment into the kernel, since that may require changing the driver
> code in each project, so configuring them in DT is the best option.

Hardcoding it in the driver is certainly not a good option, but
hardcoding it in DT isn't either unless you can explain how the value
should be computed. "Contact the vendor" isn't good enough.

> > On Tue, 22 Jun 2021 at 14:31, Xin Ji  wrote:
> > >
> > > Add 'bus-type' and 'data-lanes' definitions for port0. Define the DP tx
> > > lane0 and lane1 swing register arrays and the audio enable flag.
> > >
> > > Signed-off-by: Xin Ji 
> > > ---
> > >  .../display/bridge/analogix,anx7625.yaml  | 57 ++-
> > >  1 file changed, 56 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml b/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
> > > index ab48ab2f4240..9e604d19a3d5 100644
> > > --- a/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
> > > +++ b/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
> > > @@ -43,6 +43,26 @@ properties:
> > >vdd33-supply:
> > >  description: Regulator that provides the supply 3.3V power.
> > >
> > > +  analogix,lane0-swing:
> > > +$ref: /schemas/types.yaml#/definitions/uint32-array
> > > +minItems: 1
> > > +maxItems: 20
> > > +description:
> > > +  an array of swing register setting for DP tx lane0 PHY, please don't
> > > +  add this property, or contact vendor.
> > > +
> > > +  analogix,lane1-swing:
> > > +$ref: /schemas/types.yaml#/definitions/uint32-array
> > > +minItems: 1
> > > +maxItems: 20
> > > +description:
> > > +  an array of swing register setting for DP tx lane1 PHY, please don't
> > > +  add this property, or contact vendor.
> > > +
> > > +  analogix,audio-enable:
> > > +type: boolean
> > > +description: let the driver enable audio HDMI codec function or not.
> > > +
> > >ports:
> > >  $ref: /schemas/graph.yaml#/properties/ports
> > >
> > > @@ -50,13 +70,43 @@ properties:
> > >port@0:
> > >  $ref: /schemas/graph.yaml#/properties/port
> > >  description:
> > > -  Video port for MIPI DSI input.
> > > +  MIPI DSI/DPI input.
> > > +
> > > +properties:
> > > +  endpoint:
> > > +$ref: /schemas/media/video-interfaces.yaml#
> > > +type: object
> > > +additionalProperties: false
> > > +
> > > +properties:
> > > +  remote-endpoint: true
> > > +  bus-type: true
> > > +  data-lanes: true
> > > +
> > > +required:
> > > +  - remote-endpoint
> > > +
> > > +required:
> > > +  - endpoint
> > > +
> > >
> > >port@1:
> > >  $ref: /schemas/graph.yaml#/properties/port
> > >  description:
> > >Video port for panel or connector.
> > >
> > > +properties:
> > > +  endpoint:
> > > +$ref: /schemas/media/video-interfaces.yaml#
> > > +type: object
> > > +additionalProperties: false
> > > +
> > > +properties:
> > > +  remote-endpoint: true
> > > +
> > > +required:
> > > +  - remote-endpoint
> > > +
> > >  required:
> > >- port@0
> > >- port@1
> > > @@ -87,6 +137,9 @@ examples:
> > >  vdd10-supply = <&pp1000_mipibrdg>;
> > >  vdd18-supply = <&pp1800_mipibrdg>;
> > >  vdd33-supply = <&pp3300_mipibrdg>;
> > > +analogix,audio-enable;
> > > +analogix,lane0-swing = <0x14 0x54 0x64 0x74 0x29 0x7b 0x77 0x5b>;
> > > +analogix,lane1-swing = <0x14 0x54 0x64 0x74 0x29 0x7b 0x77 0x5b>;
> > >
> > >  ports {
> > >  #address-cells = <1>;
> > > @@ -96,6 +149,8 @@ examples:
> > >  reg = <0>;
> > >  anx7625_in: endpoint {
> > >  remote-endpoint = <&mipi_dsi>;
> > > +bus-type = <5>;
> > > +data-lanes = <0 1 2 3>;
> > >  };
> > >  };
> > >

-- 
Regards,

Laurent Pinchart


Re: [PATCH] drm/of: free the iterator object on failure

2021-07-14 Thread Laurent Pinchart
Hi Steven,

On Tue, Jul 13, 2021 at 05:16:16PM +0100, Steven Price wrote:
> On 12/07/2021 22:55, Laurent Pinchart wrote:
> > On Mon, Jul 12, 2021 at 10:31:52PM +0100, Steven Price wrote:
> >> On 12/07/2021 17:50, Laurent Pinchart wrote:
> >>> On Mon, Jul 12, 2021 at 04:57:58PM +0100, Steven Price wrote:
>  When bailing out due to the sanity check the iterator value needs to be
>  freed because the early return prevents for_each_child_of_node() from
>  doing the dereference itself.
> 
>  Fixes: 4ee48cc5586b ("drm: of: Fix double-free bug")
> >>>
> >>> I don't think the Fixes tag is correct, the issue was already present
> >>> before 4ee48cc5586b. The fix looks right though.
> >>
> >> I'm not sure quite what you mean by "already present". As I understand
> >> it the timeline was:
> >>
> >> 1. 6529007522de drm: of: Add drm_of_lvds_get_dual_link_pixel_order
> >>The function was originally added. This made the mistake twice of
> >>calling of_node_put() on the wrong variable (remote_port rather than
> >>endpoint).
> > 
> > Correct.
> > 
> >> 2. 4ee48cc5586b drm: of: Fix double-free bug
> >>One of the of_node_put() calls was removed as it was a double-free.
> >>This left the first incorrect of_node_put() in place, and the second
> >>is now a straight leak.
> > 
> > That's right, but this commit didn't introduce the leak, it was already
> > there in 6529007522de (in addition to the double-free).
> 
> Ah, I see what you mean. My thought process was that the original
> commit had the bug "using the wrong variable", and (2) (partially)
> fixed that but in the process introduced a new bug (a memory leak). But
> I guess technically the memory leak was there from the beginning.
> 
> The other reason I referenced (2) in the Fixes line is that this
> patch depends on patch (2) and won't apply cleanly without it.
> 
> However I don't think it really matters either way: (2) has already been
> backported, and either way this needs fixing if either (1) or (2) are
> present.
> 
> Would you like me to resend with a "Fixes: 6529007522de drm: of: Add
> drm_of_lvds_get_dual_link_pixel_order", or are you happy to just fix
> this up when merging?

I don't mind either way, from my point of view it can be fixed up by
whoever will pick the patch up and merge it.

> >> 3. b557a5f8da57 drm/of: free the right object
> >>This (correctly) fixes the first of_node_put() to free endpoint. And
> >>the post from Daniel was what caused me to look.
> >>
> >> 4. This patch
> >>Reintroduces the of_node_put() removed in (2) but putting endpoint
> >>rather than remote_port.
> >>
> >> I've put (2) in the Fixes line as this patch is fixing the leak
> >> introduced by that patch, but that in itself was of course 'fixing' the
> >> double free of the original patch.
> >>
>  Signed-off-by: Steven Price 
>  ---
>   drivers/gpu/drm/drm_of.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
>  Daniel's email[1] made me take a look at this function and it appears
>  that for_each_child_of_node()'s interface had caused a bad bug fix due
>  to the hidden reference counting in the iterator.
> 
>  [1] https://lore.kernel.org/r/YOxQ5TbkNrqCGBDJ%40phenom.ffwll.local
> 
>  diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
>  index 197c57477344..997b8827fed2 100644
>  --- a/drivers/gpu/drm/drm_of.c
>  +++ b/drivers/gpu/drm/drm_of.c
>  @@ -331,8 +331,10 @@ static int drm_of_lvds_get_remote_pixels_type(
>    * configurations by passing the endpoints explicitly to
>    * drm_of_lvds_get_dual_link_pixel_order().
>    */
>  -if (!current_pt || pixels_type != current_pt)
>  +if (!current_pt || pixels_type != current_pt) {
>  +of_node_put(endpoint);
>   return -EINVAL;
>  +}
>   }
>   
>   return pixels_type;
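
To illustrate the hidden reference counting discussed in this thread, here is a minimal userspace sketch, not kernel code. The names `struct node`, `node_get()`, `node_put()`, `next_child()` and `for_each_child()` are hypothetical stand-ins for the OF helpers. The sketch assumes an iterator that holds a reference on the current child and drops it only when advancing to the next one, so an early return has to drop that reference itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device-tree node. */
struct node {
	int refcount;
	int pixels_type;
	struct node *sibling;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Take a reference on the next child and drop the one held on the
 * previous child. This is the refcounting hidden inside the iterator.
 */
static struct node *next_child(struct node *first, struct node *prev)
{
	struct node *next = prev ? prev->sibling : first;

	node_get(next);
	node_put(prev);
	return next;
}

#define for_each_child(first, child) \
	for (child = next_child(first, NULL); child; \
	     child = next_child(first, child))

/*
 * Return the common pixels_type of all children, or -1 on a mismatch.
 * With put_on_bail == 0 the early return leaks the reference that the
 * iterator still holds on the current child.
 */
static int common_pixels_type(struct node *first, int put_on_bail)
{
	struct node *child;
	int type = 0;

	for_each_child(first, child) {
		if (type && child->pixels_type != type) {
			if (put_on_bail)
				node_put(child);
			return -1;
		}
		type = child->pixels_type;
	}
	return type;
}

/* Run the check on two mismatching children; report the second one's refcount. */
static int refcount_after_bail(int put_on_bail)
{
	struct node b = { .refcount = 1, .pixels_type = 2, .sibling = NULL };
	struct node a = { .refcount = 1, .pixels_type = 1, .sibling = &b };
	int ret = common_pixels_type(&a, put_on_bail);

	assert(ret == -1);
	assert(a.refcount == 1);	/* dropped by the iterator when advancing */
	return b.refcount;
}
```

In this model `refcount_after_bail(1)` returns the baseline count of 1, while `refcount_after_bail(0)` returns 2, i.e. one reference is leaked by the early return.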

-- 
Regards,

Laurent Pinchart


[PATCH -next] drm/bochs: Fix missing pci_disable_device() on error in bochs_pci_probe()

2021-07-14 Thread Yang Yingliang
Fix the missing pci_disable_device() before return
from bochs_pci_probe() in the error handling case.

Reported-by: Hulk Robot 
Signed-off-by: Yang Yingliang 
---
 drivers/gpu/drm/tiny/bochs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/tiny/bochs.c b/drivers/gpu/drm/tiny/bochs.c
index a2cfecfa8556..74832b9d3eae 100644
--- a/drivers/gpu/drm/tiny/bochs.c
+++ b/drivers/gpu/drm/tiny/bochs.c
@@ -666,6 +666,7 @@ static int bochs_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent
return ret;
 
 err_free_dev:
+   pci_disable_device(pdev);
drm_dev_put(dev);
return ret;
 }
-- 
2.25.1
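
For readers unfamiliar with the pattern, the following is a minimal userspace sketch of goto-based error unwinding in a probe function. All names here (`struct fake_dev`, `enable_device()`, `disable_device()`, `load_hw()`, `free_dev()`, `fake_probe()`) are hypothetical stand-ins, not the bochs or PCI API. The only point is that each label undoes the steps that succeeded before the failure, in reverse order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state; stands in for the PCI and DRM objects. */
struct fake_dev {
	bool allocated;
	bool enabled;
	bool loaded;
};

static int enable_device(struct fake_dev *d, bool fail)
{
	if (fail)
		return -1;
	d->enabled = true;
	return 0;
}

static void disable_device(struct fake_dev *d)
{
	/* Disabling a device that was never enabled would be a bug. */
	assert(d->enabled);
	d->enabled = false;
}

static int load_hw(struct fake_dev *d, bool fail)
{
	if (fail)
		return -2;
	d->loaded = true;
	return 0;
}

static void free_dev(struct fake_dev *d)
{
	d->allocated = false;
}

static int fake_probe(struct fake_dev *d, bool fail_enable, bool fail_load)
{
	int ret;

	d->allocated = true;

	ret = enable_device(d, fail_enable);
	if (ret)
		goto err_free_dev;	/* nothing to disable yet */

	ret = load_hw(d, fail_load);
	if (ret)
		goto err_disable;	/* enable succeeded, so undo it */

	return 0;

err_disable:
	disable_device(d);
err_free_dev:
	free_dev(d);
	return ret;
}
```

The sketch uses a separate `err_disable` label so that the disable step runs only when the enable step succeeded. Whether the real probe function needs the same split depends on where the enable call sits relative to the first jump to `err_free_dev`, which is not visible in the hunk above.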



Re: [PATCH v6 06/11] drm/mediatek: Add pm runtime support for ovl and rdma

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

From: Yongqiang Niu 

Prepare for smi cleaning up "mediatek,larb".

Previously, display used the dispsys device to call pm_runtime_get_sync.
This patch adds pm_runtime_xx calls for the ovl and rdma devices, whose
nodes have the "iommus" property, so that display can do pm_runtime_get
for smi via the ovl or rdma device.

CC: CK Hu 
Signed-off-by: Yongqiang Niu 
Signed-off-by: Yong Wu 
(Yong: Use pm_runtime_resume_and_get instead of pm_runtime_get_sync)
Acked-by: Chun-Kuang Hu 
---
  drivers/gpu/drm/mediatek/mtk_disp_ovl.c  |  9 -
  drivers/gpu/drm/mediatek/mtk_disp_rdma.c |  9 -
  drivers/gpu/drm/mediatek/mtk_drm_crtc.c  | 12 +++-
  3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index fa9d79963cd3..ea5760f856ec 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -11,6 +11,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  
  #include "mtk_disp_drv.h"

@@ -414,15 +415,21 @@ static int mtk_disp_ovl_probe(struct platform_device *pdev)
return ret;
}
  
+	pm_runtime_enable(dev);

+
ret = component_add(dev, &mtk_disp_ovl_component_ops);
-   if (ret)
+   if (ret) {
+   pm_runtime_disable(dev);
dev_err(dev, "Failed to add component: %d\n", ret);
+   }
  
  	return ret;

  }
  
  static int mtk_disp_ovl_remove(struct platform_device *pdev)

  {
+   pm_runtime_disable(&pdev->dev);
+
return 0;
  }
  
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c

index 705f28ceb4dd..0f31d1c8e37c 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
@@ -9,6 +9,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  
  #include "mtk_disp_drv.h"

@@ -327,9 +328,13 @@ static int mtk_disp_rdma_probe(struct platform_device *pdev)
  
  	platform_set_drvdata(pdev, priv);
  
+	pm_runtime_enable(dev);

+
ret = component_add(dev, &mtk_disp_rdma_component_ops);
-   if (ret)
+   if (ret) {
+   pm_runtime_disable(dev);
dev_err(dev, "Failed to add component: %d\n", ret);
+   }
  
  	return ret;

  }
@@ -338,6 +343,8 @@ static int mtk_disp_rdma_remove(struct platform_device *pdev)
  {
component_del(&pdev->dev, &mtk_disp_rdma_component_ops);
  
+	pm_runtime_disable(&pdev->dev);

+
return 0;
  }
  
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c

index 474efb844249..08e3f352377d 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -557,9 +557,15 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc,
return;
}
  
+	ret = pm_runtime_resume_and_get(comp->dev);

+   if (ret < 0)
+   DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n",
+ ret);


Shouldn't the code return in case of failure here?

Thanks,
Dafna
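
To make the question concrete, here is a userspace sketch of the two control flows. `fake_resume_and_get()`, `fake_hw_init()` and `fake_put()` are hypothetical stand-ins for pm_runtime_resume_and_get(), mtk_crtc_ddp_hw_init() and pm_runtime_put(). The sketch assumes, as with pm_runtime_resume_and_get(), that a failed resume leaves the usage count unchanged.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device's runtime PM state. */
struct fake_pm {
	int usage_count;
	bool resume_fails;
	bool hw_init_fails;
	bool hw_touched_while_off;
};

/* On failure the usage count is left untouched. */
static int fake_resume_and_get(struct fake_pm *pm)
{
	if (pm->resume_fails)
		return -1;
	pm->usage_count++;
	return 0;
}

static void fake_put(struct fake_pm *pm)
{
	pm->usage_count--;
}

/* Records whether hardware init ran without a held power reference. */
static int fake_hw_init(struct fake_pm *pm)
{
	if (pm->usage_count == 0)
		pm->hw_touched_while_off = true;
	return pm->hw_init_fails ? -1 : 0;
}

/* Enable path that returns early when the resume fails. */
static void enable_with_return(struct fake_pm *pm)
{
	if (fake_resume_and_get(pm) < 0)
		return;

	if (fake_hw_init(pm))
		fake_put(pm);
}

/* Enable path that only reports the failure and carries on. */
static void enable_without_return(struct fake_pm *pm)
{
	if (fake_resume_and_get(pm) < 0) {
		/* the error would only be logged here */
	}

	if (fake_hw_init(pm))
		fake_put(pm);
}
```

In this model, when both the resume and the hardware init fail, `enable_without_return()` touches the hardware while it is off and ends with a usage count of -1, because the error path drops a reference that was never taken. `enable_with_return()` leaves the count at 0 and never touches the hardware.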


+
ret = mtk_crtc_ddp_hw_init(mtk_crtc);
if (ret) {
mtk_smi_larb_put(comp->larb_dev);
+   pm_runtime_put(comp->dev);
return;
}
  
@@ -572,7 +578,7 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,

  {
struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
-   int i;
+   int i, ret;
  
  	DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);

if (!mtk_crtc->enabled)
@@ -596,6 +602,10 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,
drm_crtc_vblank_off(crtc);
mtk_crtc_ddp_hw_fini(mtk_crtc);
mtk_smi_larb_put(comp->larb_dev);
+   ret = pm_runtime_put(comp->dev);
+   if (ret < 0)
+   DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n",
+ ret);
  
  	mtk_crtc->enabled = false;

  }



Re: [PATCH v1 1/1] drm: bridge: Mark mode_fixup deprecated

2021-07-14 Thread Laurent Pinchart
Hi Sam,

Thank you for the patch.

On Tue, Jul 13, 2021 at 09:32:57PM +0200, Sam Ravnborg wrote:
> Make it obvious that mode_fixup is deprecated and new drivers shall use
> atomic_check.

Could you also mark drm_bridge_chain_mode_fixup() as deprecated?

Regarding usage of .atomic_check(), while I agree that's the way to go,
we have more work to do. .mode_fixup() was created a long time ago, when
we were supposed to have a single bridge at the output of the CRTC. The
bridge could then instruct the CRTC to output a different mode than what
the display requires. Now that we have support for multiple bridges,
it's not as straightforward, and we've so far just pretended to ignore
the problem. The .mode_fixup() operation is used and abused, and just
telling people to use .atomic_check() will likely make things worse as
that operation has access to the full atomic commit and can alter the
mode of pretty much anything. We need to define clear semantics for
.atomic_check() in bridges.

> Signed-off-by: Sam Ravnborg 
> Cc: Laurent Pinchart 
> Cc: Andrzej Hajda 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> ---
>  include/drm/drm_bridge.h | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
> index 46bdfa48c413..668f14234459 100644
> --- a/include/drm/drm_bridge.h
> +++ b/include/drm/drm_bridge.h
> @@ -136,6 +136,9 @@ struct drm_bridge_funcs {
>*
>* NOTE:
>*
> +  * This is deprecated, do not use!
> +  * New drivers shall use &drm_bridge_funcs.atomic_check.
> +  *
>* This function is called in the check phase of atomic modesets, which
>* can be aborted for any reason (including on userspace's request to
>* just check whether a configuration would be possible). Drivers MUST

-- 
Regards,

Laurent Pinchart


Re: [PATCH] dma-buf: add kernel count for dma_buf

2021-07-14 Thread Christian König

Am 14.07.21 um 09:11 schrieb guangming@mediatek.com:

From: Guangming Cao 

Add a refcount for kernel to prevent UAF(Use After Free) issue.


Well NAK on so many levels.



We can assume a case like below:
 1. kernel space allocates a dma_buf (file count = 1)
 2. kernel uses the dma_buf to get an fd (file count = 1)
 3. userspace uses the fd to do a mapping (file count = 2)


Creating a userspace mapping increases the reference count for the
underlying file object.


See the implementation of mmap_region():
...
    vma->vm_file = get_file(file);
    error = call_mmap(file, vma);
...

What can happen is that the underlying exporter redirects the mmap to a
different file, e.g. TTM or GEM drivers do that all the time.


But this is fine since then the VA mapping is independent of the DMA-buf.
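
The reference held by a mapping can be observed from userspace with an ordinary file. The following is a minimal sketch assuming a POSIX system with a writable /tmp. It does not use dma-buf at all, and the function name `mapping_survives_close()` is made up for illustration. It only shows that a mapping stays usable after the fd that created it has been closed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one page of an unlinked temporary file, close the fd, then
 * write and read through the mapping. Returns 0 on success.
 */
static int mapping_survives_close(void)
{
	char path[] = "/tmp/mmap-ref-XXXXXX";
	const char msg[] = "still mapped";
	long page = sysconf(_SC_PAGESIZE);
	char *va;
	int fd, ret;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);		/* the name is gone, the file is not */

	if (page <= 0 || ftruncate(fd, page) < 0) {
		close(fd);
		return -1;
	}
	assert(sizeof(msg) <= (size_t)page);

	va = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);		/* drops only the reference held by the fd */
	if (va == MAP_FAILED)
		return -1;

	/* The mapping is still backed by the file. */
	memcpy(va, msg, sizeof(msg));
	ret = memcmp(va, msg, sizeof(msg)) ? -1 : 0;

	munmap(va, (size_t)page);	/* drops the mapping's reference */
	return ret;
}
```

Calling `mapping_survives_close()` from a test program is expected to return 0 on such a system, which matches the point above: closing the fd does not release a file that is still mapped.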


 4. kernel calls dma_buf_put (file count = 1)
 5. userspace closes the buffer fd (file count = 0)
 6. at this time, the buffer is released, but the va is still valid!!
    So we can still read/write the buffer via the mmap va,
    which may cause a memory leak or a kernel exception.
    Also, if we use "ls -ll" to watch the corresponding process
    fd link info, it will also cause a kernel exception.

Another case:
  Using dma_buf_fd to generate more than 1 fd: because
  dma_buf_fd does not increase the file count, closing
  the second fd may cause an error.


Each opened fd increases the reference count, so what you describe here
is certainly not correct.


Regards,
Christian.



Solution:
 Add a kernel count for dma_buf, and make sure the file count
 of dma_buf.file held by the kernel is 1.

Notes: For this solution, kref couldn't work because the kernel ref
may be incremented from 0, which kref doesn't allow.

Signed-off-by: Guangming Cao 
---
  drivers/dma-buf/dma-buf.c | 23 +++
  include/linux/dma-buf.h   |  6 --
  2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 511fe0d217a0..04ee92aac8b9 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -62,6 +62,7 @@ static void dma_buf_release(struct dentry *dentry)
if (unlikely(!dmabuf))
return;
  
+	WARN_ON(atomic64_read(&dmabuf->kernel_ref));

BUG_ON(dmabuf->vmapping_counter);
  
  	/*

@@ -555,6 +556,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
goto err_module;
}
  
+	atomic64_set(&dmabuf->kernel_ref, 1);

dmabuf->priv = exp_info->priv;
dmabuf->ops = exp_info->ops;
dmabuf->size = exp_info->size;
@@ -617,6 +619,9 @@ int dma_buf_fd(struct dma_buf *dmabuf, int flags)
  
  	fd_install(fd, dmabuf->file);
  
+	/* Add file cnt for each new fd */

+   get_file(dmabuf->file);
+
return fd;
  }
  EXPORT_SYMBOL_GPL(dma_buf_fd);
@@ -626,12 +631,13 @@ EXPORT_SYMBOL_GPL(dma_buf_fd);
   * @fd:   [in]fd associated with the struct dma_buf to be returned
   *
   * On success, returns the struct dma_buf associated with an fd; uses
- * file's refcounting done by fget to increase refcount. returns ERR_PTR
- * otherwise.
+ * dmabuf's ref refcounting done by kref_get to increase refcount.
+ * Returns ERR_PTR otherwise.
   */
  struct dma_buf *dma_buf_get(int fd)
  {
struct file *file;
+   struct dma_buf *dmabuf;
  
  	file = fget(fd);
  
@@ -643,7 +649,12 @@ struct dma_buf *dma_buf_get(int fd)

return ERR_PTR(-EINVAL);
}
  
-	return file->private_data;

+   dmabuf = file->private_data;
+   /* replace file count increase as ref increase for kernel user */
+   get_dma_buf(dmabuf);
+   fput(file);
+
+   return dmabuf;
  }
  EXPORT_SYMBOL_GPL(dma_buf_get);
  
@@ -662,7 +673,11 @@ void dma_buf_put(struct dma_buf *dmabuf)

if (WARN_ON(!dmabuf || !dmabuf->file))
return;
  
-	fput(dmabuf->file);

+   if (WARN_ON(!atomic64_read(&dmabuf->kernel_ref)))
+   return;
+
+   if (!atomic64_dec_return(&dmabuf->kernel_ref))
+   fput(dmabuf->file);
  }
  EXPORT_SYMBOL_GPL(dma_buf_put);
  
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h

index efdc56b9d95f..bc790cb028eb 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -308,6 +308,7 @@ struct dma_buf_ops {
  struct dma_buf {
size_t size;
struct file *file;
+   atomic64_t kernel_ref;
struct list_head attachments;
const struct dma_buf_ops *ops;
struct mutex lock;
@@ -436,7 +437,7 @@ struct dma_buf_export_info {
 .owner = THIS_MODULE }
  
  /**

- * get_dma_buf - convenience wrapper for get_file.
+ * get_dma_buf - increase a kernel ref of dma-buf
   * @dmabuf:   [in]pointer to dma_buf
   *
   * Increments the reference count on the dma-buf, needed in case of drivers
@@ -446,7 +447,8 @@ struct dma_buf_export_info {
   */
  static inlin

Re: [PATCH v6 01/11] dt-binding: mediatek: Get rid of mediatek, larb for multimedia HW

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 10:13, Dafna Hirschfeld wrote:

Hi,
thanks for the patch

On 14.07.21 04:56, Yong Wu wrote:

After adding a device_link between the consumer and the smi-larbs,
if the consumer calls its own pm_runtime_get(_sync), the
pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically. Thus, the consumer doesn't need the property.

The IOMMU also knows which larb this consumer connects to from
the iommu id in the "iommus=" property.

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
Reviewed-by: Evan Green 
---
  .../bindings/display/mediatek/mediatek,disp.txt  | 9 -
  .../devicetree/bindings/media/mediatek-jpeg-decoder.yaml | 9 -
  .../devicetree/bindings/media/mediatek-jpeg-encoder.yaml | 9 -


Which repo are these patches based on?
In linux-next the file mediatek-jpeg-encoder.yaml doesn't exist.

Thanks,
Dafna


Sorry, I see you reference the patch that converts to yaml in the cover letter.

Thanks,
Dafna




  Documentation/devicetree/bindings/media/mediatek-mdp.txt | 8 
  .../devicetree/bindings/media/mediatek-vcodec.txt    | 4 
  5 files changed, 39 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
index fbb59c9ddda6..867bd82e2f03 100644
--- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
@@ -61,8 +61,6 @@ Required properties (DMA function blocks):
  "mediatek,-disp-rdma"
  "mediatek,-disp-wdma"
    the supported chips are mt2701, mt8167 and mt8173.
-- larb: Should contain a phandle pointing to the local arbiter device as defined
-  in Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
  - iommus: Should point to the respective IOMMU block with master port as
    argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
    for details.
@@ -91,7 +89,6 @@ ovl0: ovl@1400c000 {
  power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
  clocks = <&mmsys CLK_MM_DISP_OVL0>;
  iommus = <&iommu M4U_PORT_DISP_OVL0>;
-    mediatek,larb = <&larb0>;
  };
  ovl1: ovl@1400d000 {
@@ -101,7 +98,6 @@ ovl1: ovl@1400d000 {
  power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
  clocks = <&mmsys CLK_MM_DISP_OVL1>;
  iommus = <&iommu M4U_PORT_DISP_OVL1>;
-    mediatek,larb = <&larb4>;
  };
  rdma0: rdma@1400e000 {
@@ -111,7 +107,6 @@ rdma0: rdma@1400e000 {
  power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
  clocks = <&mmsys CLK_MM_DISP_RDMA0>;
  iommus = <&iommu M4U_PORT_DISP_RDMA0>;
-    mediatek,larb = <&larb0>;
  mediatek,rdma-fifosize = <8192>;
  };
@@ -122,7 +117,6 @@ rdma1: rdma@1400f000 {
  power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
  clocks = <&mmsys CLK_MM_DISP_RDMA1>;
  iommus = <&iommu M4U_PORT_DISP_RDMA1>;
-    mediatek,larb = <&larb4>;
  };
  rdma2: rdma@1401 {
@@ -132,7 +126,6 @@ rdma2: rdma@1401 {
  power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
  clocks = <&mmsys CLK_MM_DISP_RDMA2>;
  iommus = <&iommu M4U_PORT_DISP_RDMA2>;
-    mediatek,larb = <&larb4>;
  };
  wdma0: wdma@14011000 {
@@ -142,7 +135,6 @@ wdma0: wdma@14011000 {
  power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
  clocks = <&mmsys CLK_MM_DISP_WDMA0>;
  iommus = <&iommu M4U_PORT_DISP_WDMA0>;
-    mediatek,larb = <&larb0>;
  };
  wdma1: wdma@14012000 {
@@ -152,7 +144,6 @@ wdma1: wdma@14012000 {
  power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
  clocks = <&mmsys CLK_MM_DISP_WDMA1>;
  iommus = <&iommu M4U_PORT_DISP_WDMA1>;
-    mediatek,larb = <&larb4>;
  };
  color0: color@14013000 {
diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
index 9b87f036f178..052e752157b4 100644
--- a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
+++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
@@ -42,13 +42,6 @@ properties:
    power-domains:
  maxItems: 1
-  mediatek,larb:
-    $ref: '/schemas/types.yaml#/definitions/phandle'
-    description: |
-  Must contain the local arbiters in the current Socs, see
-  Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
-  for details.
-
    iommus:
  maxItems: 2
  description: |
@@ -63,7 +56,6 @@ required:
    - clocks
    - clock-names
    - power-domains
-  - mediatek,larb
    - iommus
  additionalProperties: false
@@ -83,7 +75,6 @@ examples:
    clock-names = "jpgdec-smi",
  "jpgdec";
    power-domains = <&scpsys MT2701_POWER_DOMAIN_ISP>;
-  mediatek,larb = <&larb2>;
    iommus = <&iommu MT2701_M4U_PORT_JPGDEC_WDMA>,
 <&iommu MT2701_M4U_PORT_JPGDEC_BSDMA>;
  };
diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml 
b/Docu

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-14 Thread Christian König




Am 13.07.21 um 17:28 schrieb Alex Deucher:

On Tue, Jul 13, 2021 at 2:57 AM Christian König
 wrote:



Am 13.07.21 um 00:06 schrieb Felix Kuehling:

KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
is_cow_mapping returns true for these mappings. Add a check for
vm_flags & VM_WRITE to avoid mmap failures on private read-only or
PROT_NONE mappings.

v2: protect against mprotect making a mapping writable after the fact
v3: update driver-specific vm_operations_structs
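
To make the effect of the extra VM_WRITE test concrete, here is a small userspace sketch. It is not the TTM code: the TOY_ names are made up for illustration, the flag values mirror the kernel's vm_flags bits, and the predicate is written the way is_cow_mapping() is commonly defined (private mapping that may become writable).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy copies of the relevant vm_flags bits (hypothetical TOY_ names). */
#define TOY_VM_WRITE    0x00000002UL
#define TOY_VM_SHARED   0x00000008UL
#define TOY_VM_MAYREAD  0x00000010UL
#define TOY_VM_MAYWRITE 0x00000020UL

/* Assumed shape of is_cow_mapping(): not shared, but may become writable. */
bool toy_is_cow_mapping(unsigned long flags)
{
	return (flags & (TOY_VM_SHARED | TOY_VM_MAYWRITE)) == TOY_VM_MAYWRITE;
}

/* Check before the patch: refuses every private, may-write mapping. */
bool toy_old_check_rejects(unsigned long flags)
{
	return toy_is_cow_mapping(flags);
}

/* Check after the patch: refuses only if the mapping is writable right now. */
bool toy_new_check_rejects(unsigned long flags)
{
	return toy_is_cow_mapping(flags) && (flags & TOY_VM_WRITE);
}
```

A PROT_NONE, MAP_PRIVATE mapping has the may-write bit but not the write bit, so the old check refuses it and the new one lets it through. The mprotect hook from v2 covers the case where the write bit is added later.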

Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
Signed-off-by: Felix Kuehling 
Signed-off-by: Alex Deucher 

Reviewed-by: Christian König 

Are you planning to push this to drm-misc?


Yes, I just didn't find the time yesterday.

Christian.



Alex


---
   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
   drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
   drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
   drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
   include/drm/ttm/ttm_bo_api.h |  4 
   6 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index b3404c43a911..1aa750a6a5d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -79,7 +79,8 @@ static const struct vm_operations_struct amdgpu_gem_vm_ops = {
   .fault = amdgpu_gem_fault,
   .open = ttm_bo_vm_open,
   .close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
   };

   static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 5b27845075a1..164ea564bb7a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -70,7 +70,8 @@ static const struct vm_operations_struct nouveau_ttm_vm_ops = 
{
   .fault = nouveau_ttm_fault,
   .open = ttm_bo_vm_open,
   .close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
   };

   void
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 458f92a70887..c19ad07eb7b5 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -77,7 +77,8 @@ static const struct vm_operations_struct radeon_gem_vm_ops = {
   .fault = radeon_gem_fault,
   .open = ttm_bo_vm_open,
   .close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
   };

   static void radeon_gem_object_free(struct drm_gem_object *gobj)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..fb325bad5db6 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned 
long addr,
   }
   EXPORT_SYMBOL(ttm_bo_vm_access);

+int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
+unsigned long end, unsigned long newflags)
+{
+ /* Enforce no COW since would have really strange behavior with it. */
+ if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
+ return -EINVAL;
+
+ return 0;
+}
+EXPORT_SYMBOL(ttm_bo_vm_mprotect);
+
   static const struct vm_operations_struct ttm_bo_vm_ops = {
   .fault = ttm_bo_vm_fault,
   .open = ttm_bo_vm_open,
   .close = ttm_bo_vm_close,
   .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect,
   };

   int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo)
   {
   /* Enforce no COW since would have really strange behavior with it. */
- if (is_cow_mapping(vma->vm_flags))
+ if (is_cow_mapping(vma->vm_flags) && (vma->vm_flags & VM_WRITE))
   return -EINVAL;

   ttm_bo_get(bo);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
index e6b1f98ec99f..e4bf7dc99320 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
@@ -61,6 +61,7 @@ int vmw_mmap(struct file *filp, struct vm_area_struct *vma)
   .fault = vmw_bo_vm_fault,
   .open = ttm_bo_vm_open,
   .close = ttm_bo_vm_close,
+ .mprotect = ttm_bo_vm_mprotect,
   #ifdef CONFIG_TRANSPARENT_HUGEPAGE
   .huge_fault = vmw_bo_vm_huge_fault,
   #endif
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index f681bbdbc698..40eb95875355 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -605,6 +605,10 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);

   int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
   

Re: [PATCH v6 00/11] Clean up "mediatek,larb"

2021-07-14 Thread Dafna Hirschfeld

Hi

Thanks for the patchset.

I tested it on mt8173 (elm) with chromeos userspace.
Before that patchset, the test:

tast -verbose run -build=false 10.42.0.175 video.DecodeAccel.h264

sometimes passed and sometimes failed with 'context deadline exceeded'.
With this patchset it seems that the test always passes so I added tested-by:

Tested-by: Dafna Hirschfeld 

Thanks,
Dafna




On 14.07.21 04:56, Yong Wu wrote:

The MediaTek IOMMU block diagram always looks like this:

         M4U
          |
     smi-common
          |
   ---------------
   |             |  ...
   |             |
 larb1         larb2
   |             |
 vdec          venc

All the consumers connect to an smi-larb, which in turn connects to smi-common.

When a consumer works, it must enable the smi-larb's power, which in turn
requires enabling the smi-common's power first.

Thus, first use a device link to connect the consumer and the smi-larbs,
then add a device link between the smi-larb and smi-common.

After adding the device links, the "mediatek,larb" property can be removed.
The iommu consumers no longer need to call mtk_smi_larb_get/put to enable
the power and clock of smi-larb and smi-common.
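
The power ordering described above can be sketched with a toy userspace model. This is only an illustration under stated assumptions, not the device-link or runtime-PM implementation: struct toy_dev, toy_pm_get() and toy_pm_put() are hypothetical names, and each device has at most one supplier.

```c
#include <assert.h>
#include <stddef.h>

/* Toy device with a single supplier link (hypothetical model). */
struct toy_dev {
	struct toy_dev *supplier; /* device this one depends on, or NULL */
	int usage;                /* usage counter in the style of runtime PM */
	int powered;              /* 1 while usage > 0 */
};

/* Power up the supplier chain first, then the device itself. */
void toy_pm_get(struct toy_dev *dev)
{
	if (dev->supplier)
		toy_pm_get(dev->supplier);
	if (dev->usage++ == 0)
		dev->powered = 1;
}

/* Drop the device first, then release its supplier chain. */
void toy_pm_put(struct toy_dev *dev)
{
	if (--dev->usage == 0)
		dev->powered = 0;
	if (dev->supplier)
		toy_pm_put(dev->supplier);
}
```

With vdec linked to larb1 and larb1 linked to smi-common, a single toy_pm_get() on vdec powers the whole chain, which is the job the consumer previously did by hand through mtk_smi_larb_get/put.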

About the MM dt-binding/dtsi patches, I guess they should go together, so
I have not split them per MM module and per SoC.

Based on v5.14-rc1 and on a jpeg[1] and an mdp[2] patchset.

[1] 
https://lore.kernel.org/linux-mediatek/20210702102304.3346429-1-hsi...@chromium.org/
[2] 
https://lore.kernel.org/linux-mediatek/20210709022324.1607884-1-ei...@chromium.org/

Change notes:
v6: 1) rebase on v5.14-rc1.
 2) Fix the issue commented in v5 from Dafna and Hsin-Yi.
 3) Remove the patches about using pm_runtime_resume_and_get since they have
already been merged by other patches.

v5: 
https://lore.kernel.org/linux-mediatek/20210410091128.31823-1-yong...@mediatek.com/
 1) Base v5.12-rc2.
 2) Remove the patch changing mtk-iommu to module_platform_driver; it has
 already become an independent patch.

v4: 
https://lore.kernel.org/linux-mediatek/1590826218-23653-1-git-send-email-yong...@mediatek.com/
 base on v5.7-rc1.
   1) Move drm PM patch before smi patchs.
   2) Change builtin_platform_driver to module_platform_driver since we may need
  build as module.
   3) Rebase many patchset as above.

v3: 
https://lore.kernel.org/linux-iommu/1567503456-24725-1-git-send-email-yong...@mediatek.com/
 1) rebase on v5.3-rc1 and the latest mt8183 patchset.
 2) Use device_is_bound to check whether the driver is ready from Matthias.
 3) Add DL_FLAG_STATELESS flag when calling device_link_add and explain the
reason in the commit message[3/14].
 4) Add a display patch[12/14] into this series. otherwise it may affect
display HW fastlogo even though it don't happen in mt8183.

v2: https://lore.kernel.org/linux-iommu/1560171313-28299-1-git-send-email-yong...@mediatek.com/

1) rebase on v5.2-rc1.
2) Move adding device_link between the consumer and smi-larb into
iommu_add_device from Robin.
3) add DL_FLAG_AUTOREMOVE_CONSUMER even though the smi is built-in from 
Evan.
4) Remove the shutdown callback in iommu.

v1: 
https://lore.kernel.org/linux-iommu/1546318276-18993-1-git-send-email-yong...@mediatek.com/

Yong Wu (10):
   dt-binding: mediatek: Get rid of mediatek,larb for multimedia HW
   iommu/mediatek: Add probe_defer for smi-larb
   iommu/mediatek: Add device_link between the consumer and the larb
 devices
   media: mtk-jpeg: Get rid of mtk_smi_larb_get/put
   media: mtk-mdp: Get rid of mtk_smi_larb_get/put
   drm/mediatek: Get rid of mtk_smi_larb_get/put
   media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
   memory: mtk-smi: Get rid of mtk_smi_larb_get/put
   arm: dts: mediatek: Get rid of mediatek,larb for MM nodes
   arm64: dts: mediatek: Get rid of mediatek,larb for MM nodes

Yongqiang Niu (1):
   drm/mediatek: Add pm runtime support for ovl and rdma

  .../display/mediatek/mediatek,disp.txt|  9 
  .../bindings/media/mediatek-jpeg-decoder.yaml |  9 
  .../bindings/media/mediatek-jpeg-encoder.yaml |  9 
  .../bindings/media/mediatek-mdp.txt   |  8 
  .../bindings/media/mediatek-vcodec.txt|  4 --
  arch/arm/boot/dts/mt2701.dtsi |  2 -
  arch/arm/boot/dts/mt7623n.dtsi|  5 --
  arch/arm64/boot/dts/mediatek/mt8173.dtsi  | 16 ---
  arch/arm64/boot/dts/mediatek/mt8183.dtsi  |  6 ---
  drivers/gpu/drm/mediatek/mtk_disp_ovl.c   |  9 +++-
  drivers/gpu/drm/mediatek/mtk_disp_rdma.c  |  9 +++-
  drivers/gpu/drm/mediatek/mtk_drm_crtc.c   | 19 
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c   | 36 +--
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h   |  1 -
  drivers/gpu/drm/mediatek/mtk_drm_drv.c|  5 +-
  drivers/iommu/mtk_iommu.c | 24 +-
  drivers/iommu/mtk_iommu_v1.c  | 22 -
  .../media/platform/mtk-jpeg/mtk_jpeg_core.c   | 45 +-
  .../media/platform/

Re: [PATCH v6 00/11] Clean up "mediatek,larb"

2021-07-14 Thread Frank Wunderlich
Hi,

Sorry, this series (or one of the 2 dependency series) causes a NULL pointer
dereference in iommu_group_remove_device on mt7623/bpi-r2.

I wonder why a cleanup is run on bootup, but I have no hint about this.

With "dts: mtk-mdp: remove mediatek, vpu property from primary MDP device" all
is still good, so I guess the problem comes up while removing larb from the DT.

This is the backtrace:

[6.274465] PC is at iommu_group_remove_device+0x28/0x148
[6.279877] LR is at iommu_release_device+0x4c/0x70

[6.674347] Backtrace:
[6.676797] [] (iommu_group_remove_device) from [] (iomm)
[6.686221]  r7: r6:c06bf04c r5:c0d7a1ac r4:c21fc010
[6.691883] [] (iommu_release_device) from [] (remove_io)
[6.700689]  r5: r4:
[6.704265] [] (remove_iommu_group) from [] (bus_for_eac)
[6.712725] [] (bus_for_each_dev) from [] (bus_set_iommu)
[6.720753]  r6:c331f440 r5:c1406f58 r4:ffea
[6.725370] [] (bus_set_iommu) from [] (mtk_iommu_probe+)
[6.733484]  r7:c32db0b8 r6:c21f9c00 r5:c331f1c0 r4:
[6.739145] [] (mtk_iommu_probe) from [] (platform_probe)
[6.747176]  r10:c21f9c10 r9:c2496f54 r8:c14623b8 r7:c14623b8 r6:c1405b90 r50
[6.755012]  r4:
[6.757544] [] (platform_probe) from [] (really_probe.pa)
[6.766006]  r7:c14623b8 r6:c1405b90 r5: r4:c21f9c10
[6.771667] [] (really_probe.part.0) from [] (really_pro)
[6.779866]  r7:c21f9c10 r6:c2549e74 r5:c1405b90 r4:c21f9c10
[6.785527] [] (really_probe) from [] (__driver_probe_de)
[6.793984]  r5:c1405b90 r4:c21f9c10
[6.797560] [] (__driver_probe_device) from [] (driver_p)
[6.806543]  r9:c2496f54 r8:0008 r7:c21f9c10 r6:c2549e74 r5:c14c6ec8 r4:4
[6.814291] [] (driver_probe_device) from [] (__device_a)
[6.823448]  r9:c2496f54 r8: r7:c21f9c10 r6:c2549e74 r5:c1405b90 r4:1
[6.831196] [] (__device_attach_driver) from [] (bus_for)
[6.840007]  r7:c14623b8 r6:c073635c r5:c2549e74 r4:
[6.845669] [] (bus_for_each_drv) from [] (__device_atta)
[6.854044]  r6:0001 r5:c21f9c54 r4:c21f9c10
[6.858662] [] (__device_attach) from [] (device_initial)
[6.867207]  r6:c21f9c10 r5:c1406f58 r4:c1406ca0
[6.871825] [] (device_initial_probe) from [] (bus_probe)
[6.880454] [] (bus_probe_device) from [] (deferred_prob)


bisect shows this commit as breaking:

Author: Yong Wu 
Date:   Wed Jul 14 10:56:17 2021 +0800

iommu/mediatek: Add probe_defer for smi-larb

Prepare for adding device_link.

regards Frank


Re: [PATCH v2 1/2] dt-bindings: display: rockchip: Add compatible for rk3568 HDMI

2021-07-14 Thread Michael Riesch
Hello Heiko,

On 7/13/21 10:49 AM, Heiko Stübner wrote:
> Hi Michael,
> 
> On Tuesday, 13 July 2021, 10:44:00 CEST, Michael Riesch wrote:
>> The HDMI TX block in the RK3568 requires two power supplies, which have
>> to be enabled in some cases (at least on the RK3568 EVB1 the voltages
>> VDDA0V9_IMAGE and VCCA1V8_IMAGE are disabled by default). It would be
>> great if this was considered by the driver and the device tree binding.
>> I am not sure, though, whether this is a RK3568 specific or
>> rockchip_dw_hdmi specific thing. Maybe it can even enter the Synopsys DW
>> HDMI driver.
> 
> I do remember that this discussion happened many years back already.
> And yes the supplies are needed for all but back then there was opposition
> as these are supposedly phy-related supplies, not for the dw-hdmi itself.
> [There are variants with an external phy, like on the rk3328]
> 
> See discussion on [0]
> 
> [0] 
> https://dri-devel.freedesktop.narkive.com/pen2zWo1/patch-v3-1-2-drm-bridge-dw-hdmi-support-optional-supply-regulators

Thanks for the pointer. My summary of this discussion would be the
following:

 - There was no consensus on how to handle the issue. The voltages still
have to be enabled from the outside of the driver.
 - Open question: rockchip-specific or general solution? (one may detect
a tendency towards a rockchip-specific solution)
 - Open question: separation of the phy from the dw_hdmi IP core?

First of all, IMHO the driver should enable those voltages, otherwise we
will have the same discussion again in 5-6 years :-)

Then, the rockchip,dw-hdmi binding features a property "phys",
presumably to handle external phys (e.g., for the RK3328). This fact and
the referenced discussion suggest a rockchip-specific solution.

In the Rockchip documentation (at least for RK3328, RK3399 and RK3568),
there are two extra voltages denoted as "HDMI PHY analog power". It
would be tempting to add the internal phy to the device tree and glue it
to the dw-hdmi using the "phys" property. However, as pointed out in the
referenced discussion, the configuration registers of the phy are
somewhat interleaved with the dw-hdmi registers and a clear separation
may be tricky.

As a more pragmatic alternative, we could add optional supplies to the
rockchip,dw-hdmi binding and evaluate the "phys" property. If the latter
is not specified, the internal phy is used and the supplies must be
enabled. Would such an approach be acceptable?

Best regards,
Michael

>> On 7/7/21 2:03 PM, Benjamin Gaignard wrote:
>>> Define a new compatible for rk3568 HDMI.
>>> This version of HDMI hardware block needs two new clocks hclk_vio and hclk
>>> to provide phy reference clocks.
>>>
>>> Signed-off-by: Benjamin Gaignard 
>>> ---
>>> version 2:
>>> - Add the clocks needed for the phy.
>>>
>>>  .../bindings/display/rockchip/rockchip,dw-hdmi.yaml | 6 +-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git 
>>> a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml 
>>> b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
>>> index 75cd9c686e985..cb8643b3a8b84 100644
>>> --- 
>>> a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
>>> +++ 
>>> b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
>>> @@ -23,6 +23,7 @@ properties:
>>>- rockchip,rk3288-dw-hdmi
>>>- rockchip,rk3328-dw-hdmi
>>>- rockchip,rk3399-dw-hdmi
>>> +  - rockchip,rk3568-dw-hdmi
>>>  
>>>reg-io-width:
>>>  const: 4
>>> @@ -51,8 +52,11 @@ properties:
>>>- vpll
>>>- enum:
>>>- grf
>>> +  - hclk_vio
>>> +  - vpll
>>> +  - enum:
>>> +  - hclk
>>>- vpll
>>> -  - const: vpll
>>
>> The description and documentation of the clocks are somewhat misleading
>> IMHO. This is not caused by your patches, of course. But maybe this is a
>> chance to clean them up a bit.
>>
>> It seems that the CEC clock is an optional clock of the dw-hdmi driver.
>> Shouldn't it be documented in the synopsys,dw-hdmi.yaml?
>>
>> Also, it would be nice if the clocks hclk_vio and hclk featured a
>> description in the binding.
>>
>> BTW, I am not too familiar with the syntax here, but shouldn't items in
>> clocks and items in clock-names be aligned (currently, there is a plain
>> list vs. an enum structure)?
>>
>> Best regards,
>> Michael
>>
>>>  
>>>ddc-i2c-bus:
>>>  $ref: /schemas/types.yaml#/definitions/phandle
>>>
>>
> 
> 
> 
> 


Re: [PATCH -next] drm/bochs: Fix missing pci_disable_device() on error in bochs_pci_probe()

2021-07-14 Thread Thomas Zimmermann

Hi

On 14.07.21 at 10:39, Yang Yingliang wrote:

Fix the missing pci_disable_device() before return
from bochs_pci_probe() in the error handling case.


It may be better to replace pci_enable_device() with
pcim_enable_device() [1], so that the release happens automatically.
Does this work?


Best regards
Thomas

[1] https://elixir.bootlin.com/linux/v5.13.1/source/drivers/pci/pci.c#L2042
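
The idea behind a managed enable can be sketched in plain userspace C. This is a toy model under stated assumptions, not the PCI or devres code: all toy_ names are hypothetical, and toy_bind() stands in for whatever runs the queued release actions when a probe fails.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ACTIONS 4

/* Toy device carrying a small stack of release actions (hypothetical). */
struct toy_dev {
	int enabled;
	void (*release[TOY_MAX_ACTIONS])(struct toy_dev *);
	size_t nr_release;
};

static void toy_disable(struct toy_dev *dev)
{
	dev->enabled = 0;
}

/* Managed enable: enable the device and queue the matching disable. */
int toy_managed_enable(struct toy_dev *dev)
{
	if (dev->nr_release == TOY_MAX_ACTIONS)
		return -1;
	dev->enabled = 1;
	dev->release[dev->nr_release++] = toy_disable;
	return 0;
}

/* Probe routine whose error path contains no explicit disable call. */
int toy_probe(struct toy_dev *dev, int fail_late)
{
	int ret = toy_managed_enable(dev);

	if (ret)
		return ret;
	return fail_late ? -1 : 0;
}

/* Stand-in for the core: undo queued actions in reverse order on failure. */
int toy_bind(struct toy_dev *dev, int fail_late)
{
	int ret = toy_probe(dev, fail_late);

	if (ret)
		while (dev->nr_release)
			dev->release[--dev->nr_release](dev);
	return ret;
}
```

In this model the probe function never needs an err_free_dev style disable, because the release was registered at the moment the device was enabled.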



Reported-by: Hulk Robot 
Signed-off-by: Yang Yingliang 
---
  drivers/gpu/drm/tiny/bochs.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/tiny/bochs.c b/drivers/gpu/drm/tiny/bochs.c
index a2cfecfa8556..74832b9d3eae 100644
--- a/drivers/gpu/drm/tiny/bochs.c
+++ b/drivers/gpu/drm/tiny/bochs.c
@@ -666,6 +666,7 @@ static int bochs_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent
return ret;
  
  err_free_dev:

+   pci_disable_device(pdev);
drm_dev_put(dev);
return ret;
  }



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Managing Director: Felix Imendörffer





Re: [PATCH] dma-buf: add kernel count for dma_buf

2021-07-14 Thread Guangming . Cao
On Wed, 2021-07-14 at 10:46 +0200, Christian König wrote:
> On 14.07.21 at 09:11, guangming@mediatek.com wrote:
> > From: Guangming Cao 
> > 
> > Add a refcount for kernel to prevent UAF(Use After Free) issue.
> 
> Well NAK on so many levels.
> 
> > 
> > We can assume a case like below:
> >  1. kernel space alloc dma_buf(file count = 1)
> >  2. kernel use dma_buf to get fd(file count = 1)
> >  3. userspace use fd to do mapping (file count = 2)
> 
> Creating an userspace mapping increases the reference count for the 
> underlying file object.
> 
> See the implementation of mmap_region():
> ...
>  vma->vm_file = get_file(file);
>  error = call_mmap(file, vma);
> ...
> 
> What can happen is that the underlying exporter redirects the mmap to a
> different file, e.g. TTM or GEM drivers do that all the time.
> 
> But this is fine since then the VA mapping is independent of the DMA-buf.
> 
> >  4. kernel call dma_buf_put (file count = 1)
> >  5. userpsace close buffer fd(file count = 0)
> >  6. at this time, buffer is released, but va is valid!!
> > So we still can read/write buffer via mmap va,
> > it maybe cause memory leak, or kernel exception.
> > And also, if we use "ls -ll" to watch corresponding process
> > fd link info, it also will cause kernel exception.
> > 
> > Another case:
> >   Using dma_buf_fd to generate more than 1 fd, because
> >   dma_buf_fd will not increase file count, thus, when close
> >   the second fd, it maybe occurs error.
> 
> Each opened fd will increase the reference count, so what you describe
> here is certainly not correct.
> 
> Regards,
> Christian.
> 

Yes, mmap increases the file count by calling get_file, so from step[2] to
step[3] the file count increases by 1.

But dma_buf_fd() does not increase the file count.
The function "dma_buf_fd(struct dma_buf *dmabuf, int flags)" just gets an
unused fd by calling "get_unused_fd_flags(flags)" and then calls
"fd_install(fd, dmabuf->file)", which makes the associated "struct file *"
in the task's fdt->fd[fd] point to this dma_buf.file without increasing the
file count of dma_buf.file.
I think this is confusing: I can get more than one fd via dma_buf_fd,
but they don't need to be closed because they don't increase the file count.

However, dma_buf_put() can decrease the file count on the kernel side directly.
If somebody writes a ko that puts the file count of dma_buf.file many times, it
will cause the buffer to be freed earlier than expected. At least on Android, I
think this is a little bit dangerous.
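
The counting in this exchange can be replayed with a toy userspace model. It is only a sketch under stated assumptions, not the dma-buf code: the toy_ names are hypothetical, and it assumes, as I read the thread, that installing the fd hands the exporter's existing reference over to the fd table while mmap takes a reference of its own.

```c
#include <assert.h>

/* Toy stand-in for a struct file reference count (hypothetical model). */
struct toy_file {
	int count;    /* number of references currently held */
	int released; /* set once the count drops to zero */
};

static void toy_get(struct toy_file *f)
{
	f->count++;
}

static void toy_put(struct toy_file *f)
{
	if (--f->count == 0)
		f->released = 1;
}

/* Returns 1 if the buffer was released while the mapping still existed. */
int toy_released_while_mapped(int extra_kernel_puts)
{
	struct toy_file f = { 1, 0 }; /* export: exporter holds one reference */
	int i;

	/* fd creation: the exporter's reference moves to the fd, no increment */
	toy_get(&f);                  /* mmap: the mapping takes its own reference */
	for (i = 0; i < extra_kernel_puts; i++)
		toy_put(&f);          /* put of a reference the caller gave away */
	toy_put(&f);                  /* userspace closes the fd */
	return f.released;            /* the mapping has not been torn down yet */
}
```

In this model, toy_released_while_mapped(0) leaves the mapping's reference in place, while a single unbalanced put reproduces the early release from steps 4 to 6 of the patch description.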

> > 
> > Solution:
> >  Add a kernel count for dma_buf, and make sure the file count
> >  of dma_buf.file hold by kernel is 1.
> > 
> > Notes: For this solution, kref couldn't work because kernel ref
> > maybe added from 0, but kref don't allow it.
> > 
> > Signed-off-by: Guangming Cao 
> > ---
> >   drivers/dma-buf/dma-buf.c | 23 +++
> >   include/linux/dma-buf.h   |  6 --
> >   2 files changed, 23 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index 511fe0d217a0..04ee92aac8b9 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -62,6 +62,7 @@ static void dma_buf_release(struct dentry
> > *dentry)
> > if (unlikely(!dmabuf))
> > return;
> >   
> > +   WARN_ON(atomic64_read(&dmabuf->kernel_ref));
> > BUG_ON(dmabuf->vmapping_counter);
> >   
> > /*
> > @@ -555,6 +556,7 @@ struct dma_buf *dma_buf_export(const struct
> > dma_buf_export_info *exp_info)
> > goto err_module;
> > }
> >   
> > +   atomic64_set(&dmabuf->kernel_ref, 1);
> > dmabuf->priv = exp_info->priv;
> > dmabuf->ops = exp_info->ops;
> > dmabuf->size = exp_info->size;
> > @@ -617,6 +619,9 @@ int dma_buf_fd(struct dma_buf *dmabuf, int
> > flags)
> >   
> > fd_install(fd, dmabuf->file);
> >   
> > +   /* Add file cnt for each new fd */
> > +   get_file(dmabuf->file);
> > +
> > return fd;
> >   }
> >   EXPORT_SYMBOL_GPL(dma_buf_fd);
> > @@ -626,12 +631,13 @@ EXPORT_SYMBOL_GPL(dma_buf_fd);
> >* @fd:   [in]fd associated with the struct dma_buf to be
> > returned
> >*
> >* On success, returns the struct dma_buf associated with an fd;
> > uses
> > - * file's refcounting done by fget to increase refcount. returns
> > ERR_PTR
> > - * otherwise.
> > + * dmabuf's ref refcounting done by kref_get to increase refcount.
> > + * Returns ERR_PTR otherwise.
> >*/
> >   struct dma_buf *dma_buf_get(int fd)
> >   {
> > struct file *file;
> > +   struct dma_buf *dmabuf;
> >   
> > file = fget(fd);
> >   
> > @@ -643,7 +649,12 @@ struct dma_buf *dma_buf_get(int fd)
> > return ERR_PTR(-EINVAL);
> > }
> >   
> > -   return file->private_data;
> > +   dmabuf = file->private_data;
> > +   /* replace file count increase as ref increase for kernel user
> > */
> > +   get_dma_buf(dmabuf);
> > +   fput(file);
> > +
> > +

Re: [PATCH v4 08/18] drm/v3d: Move drm_sched_job_init to v3d_job_init

2021-07-14 Thread Melissa Wen
On 07/12, Daniel Vetter wrote:
> Prep work for using the scheduler dependency handling. We need to call
> drm_sched_job_init earlier so we can use the new drm_sched_job_await*
> functions for dependency handling here.
> 
> v2: Slightly better commit message and rebase to include the
> drm_sched_job_arm() call (Emma).
> 
> v3: Cleanup jobs under construction correctly (Emma)
> 
> Cc: Melissa Wen 
> Signed-off-by: Daniel Vetter 
> Cc: Emma Anholt 
> ---
>  drivers/gpu/drm/v3d/v3d_drv.h   |  1 +
>  drivers/gpu/drm/v3d/v3d_gem.c   | 88 ++---
>  drivers/gpu/drm/v3d/v3d_sched.c | 15 +++---
>  3 files changed, 44 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
> index 8a390738d65b..1d870261eaac 100644
> --- a/drivers/gpu/drm/v3d/v3d_drv.h
> +++ b/drivers/gpu/drm/v3d/v3d_drv.h
> @@ -332,6 +332,7 @@ int v3d_submit_csd_ioctl(struct drm_device *dev, void 
> *data,
>struct drm_file *file_priv);
>  int v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv);
> +void v3d_job_cleanup(struct v3d_job *job);
>  void v3d_job_put(struct v3d_job *job);
>  void v3d_reset(struct v3d_dev *v3d);
>  void v3d_invalidate_caches(struct v3d_dev *v3d);
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index 69ac20e11b09..5eccd3658938 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -392,6 +392,12 @@ v3d_render_job_free(struct kref *ref)
>   v3d_job_free(ref);
>  }
>  
> +void v3d_job_cleanup(struct v3d_job *job)
> +{
> + drm_sched_job_cleanup(&job->base);
> + v3d_job_put(job);
> +}
> +
>  void v3d_job_put(struct v3d_job *job)
>  {
>   kref_put(&job->refcount, job->free);
> @@ -433,9 +439,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
>  static int
>  v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
>struct v3d_job *job, void (*free)(struct kref *ref),
> -  u32 in_sync)
> +  u32 in_sync, enum v3d_queue queue)
>  {
>   struct dma_fence *in_fence = NULL;
> + struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
>   int ret;
>  
>   job->v3d = v3d;
> @@ -446,35 +453,33 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
> *file_priv,
>   return ret;
>  
>   xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
> + ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
> +  v3d_priv);
> + if (ret)
> + goto fail;
>  
>   ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence);
>   if (ret == -EINVAL)
> - goto fail;
> + goto fail_job;
>  
>   ret = drm_gem_fence_array_add(&job->deps, in_fence);
>   if (ret)
> - goto fail;
> + goto fail_job;
>  
>   kref_init(&job->refcount);
>  
>   return 0;
> +fail_job:
> + drm_sched_job_cleanup(&job->base);
>  fail:
>   xa_destroy(&job->deps);
>   pm_runtime_put_autosuspend(v3d->drm.dev);
>   return ret;
>  }
>  
> -static int
> -v3d_push_job(struct v3d_file_priv *v3d_priv,
> -  struct v3d_job *job, enum v3d_queue queue)
> +static void
> +v3d_push_job(struct v3d_job *job)
>  {
> - int ret;
> -
> - ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
> -  v3d_priv);
> - if (ret)
> - return ret;
> -
>   drm_sched_job_arm(&job->base);
>  
>   job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> @@ -483,8 +488,6 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>   kref_get(&job->refcount);
>  
>   drm_sched_entity_push_job(&job->base);
> -
> - return 0;
>  }
>  
>  static void
> @@ -530,7 +533,6 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>   struct drm_file *file_priv)
>  {
>   struct v3d_dev *v3d = to_v3d_dev(dev);
> - struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
>   struct drm_v3d_submit_cl *args = data;
>   struct v3d_bin_job *bin = NULL;
>   struct v3d_render_job *render;
> @@ -556,7 +558,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>   INIT_LIST_HEAD(&render->unref_list);
>  
>   ret = v3d_job_init(v3d, file_priv, &render->base,
> -v3d_render_job_free, args->in_sync_rcl);
> +v3d_render_job_free, args->in_sync_rcl, V3D_RENDER);
>   if (ret) {
>   kfree(render);
>   return ret;
> @@ -570,7 +572,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>   }
>  
>   ret = v3d_job_init(v3d, file_priv, &bin->base,
> -v3d_job_free, args->in_sync_bcl);
> +v3d_job_free, args->in_sync_bcl, V3D_BIN);
>   if (ret) {
>   v3d_job_put(&render->base);
>

Re: [PATCH v4 09/18] drm/v3d: Use scheduler dependency handling

2021-07-14 Thread Melissa Wen
On 07/12, Daniel Vetter wrote:
> With the prep work out of the way this isn't tricky anymore.
> 
> Aside: The chaining of the various jobs is a bit awkward, with the
> possibility of failure in bad places. I think with the
> drm_sched_job_init/arm split and maybe preloading the
> job->dependencies xarray this should be fixable.
> 
> Cc: Melissa Wen 
> Signed-off-by: Daniel Vetter 
> Cc: Emma Anholt 
> ---
>  drivers/gpu/drm/v3d/v3d_drv.h   |  5 -
>  drivers/gpu/drm/v3d/v3d_gem.c   | 25 -
>  drivers/gpu/drm/v3d/v3d_sched.c | 29 +
>  3 files changed, 9 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
> index 1d870261eaac..f80f4ff1f7aa 100644
> --- a/drivers/gpu/drm/v3d/v3d_drv.h
> +++ b/drivers/gpu/drm/v3d/v3d_drv.h
> @@ -192,11 +192,6 @@ struct v3d_job {
>   struct drm_gem_object **bo;
>   u32 bo_count;
>  
> - /* Array of struct dma_fence * to block on before submitting this job.
> -  */
> - struct xarray deps;
> - unsigned long last_dep;
> -
>   /* v3d fence to be signaled by IRQ handler when the job is complete. */
>   struct dma_fence *irq_fence;
>  
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index 5eccd3658938..42b07ffbea5e 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -257,8 +257,8 @@ v3d_lock_bo_reservations(struct v3d_job *job,
>   return ret;
>  
>   for (i = 0; i < job->bo_count; i++) {
> - ret = drm_gem_fence_array_add_implicit(&job->deps,
> -job->bo[i], true);
> + ret = drm_sched_job_await_implicit(&job->base,
> +job->bo[i], true);
>   if (ret) {
>   drm_gem_unlock_reservations(job->bo, job->bo_count,
>   acquire_ctx);
> @@ -354,8 +354,6 @@ static void
>  v3d_job_free(struct kref *ref)
>  {
>   struct v3d_job *job = container_of(ref, struct v3d_job, refcount);
> - unsigned long index;
> - struct dma_fence *fence;
>   int i;
>  
>   for (i = 0; i < job->bo_count; i++) {
> @@ -364,11 +362,6 @@ v3d_job_free(struct kref *ref)
>   }
>   kvfree(job->bo);
>  
> - xa_for_each(&job->deps, index, fence) {
> - dma_fence_put(fence);
> - }
> - xa_destroy(&job->deps);
> -
>   dma_fence_put(job->irq_fence);
>   dma_fence_put(job->done_fence);
>  
> @@ -452,7 +445,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
> *file_priv,
>   if (ret < 0)
>   return ret;
>  
> - xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
>   ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
>v3d_priv);
>   if (ret)
> @@ -462,7 +454,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
> *file_priv,
>   if (ret == -EINVAL)
>   goto fail_job;
>  
> - ret = drm_gem_fence_array_add(&job->deps, in_fence);
> + ret = drm_sched_job_await_fence(&job->base, in_fence);
>   if (ret)
>   goto fail_job;
>  
> @@ -472,7 +464,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
> *file_priv,
>  fail_job:
>   drm_sched_job_cleanup(&job->base);
>  fail:
> - xa_destroy(&job->deps);
>   pm_runtime_put_autosuspend(v3d->drm.dev);
>   return ret;
>  }
> @@ -619,8 +610,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>   if (bin) {
>   v3d_push_job(&bin->base);
>  
> - ret = drm_gem_fence_array_add(&render->base.deps,
> -   
> dma_fence_get(bin->base.done_fence));
> + ret = drm_sched_job_await_fence(&render->base.base,
> + 
> dma_fence_get(bin->base.done_fence));
>   if (ret)
>   goto fail_unreserve;
>   }
> @@ -630,7 +621,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>   if (clean_job) {
>   struct dma_fence *render_fence =
>   dma_fence_get(render->base.done_fence);
> - ret = drm_gem_fence_array_add(&clean_job->deps, render_fence);
> + ret = drm_sched_job_await_fence(&clean_job->base, render_fence);
>   if (ret)
>   goto fail_unreserve;
>   v3d_push_job(clean_job);
> @@ -820,8 +811,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
>   mutex_lock(&v3d->sched_lock);
>   v3d_push_job(&job->base);
>  
> - ret = drm_gem_fence_array_add(&clean_job->deps,
> -   dma_fence_get(job->base.done_fence));
> + ret = drm_sched_job_await_fence(&clean_job->base,
> + dma_fence_get(job->base.done_fence));
>   if (ret)
>   goto fail_unreserv

[PATCH v2] drm/bridge: nwl-dsi: Avoid potential multiplication overflow on 32-bit

2021-07-14 Thread Geert Uytterhoeven
As nwl_dsi.lanes is u32, and NSEC_PER_SEC is 1000000000L, the second
multiplication in

dsi->lanes * 8 * NSEC_PER_SEC

will overflow on a 32-bit platform.  Fix this by making the constant
unsigned long long, forcing 64-bit arithmetic.

As iMX8 is arm64, this driver is currently used on 64-bit platforms
only, where long is 64-bit, so this cannot happen.  But the issue will
start to happen when the driver is reused for a 32-bit SoC (e.g.
i.MX7ULP), or when code is copied for a new driver.

Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Fabio Estevam 
Reviewed-by: Laurent Pinchart 
---
Compile-tested only.

v2:
  - Add Reviewed-by,
  - Add reference to i.MX7ULP.
---
 drivers/gpu/drm/bridge/nwl-dsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/nwl-dsi.c b/drivers/gpu/drm/bridge/nwl-dsi.c
index 873995f0a7416e58..6002404ffcb9df08 100644
--- a/drivers/gpu/drm/bridge/nwl-dsi.c
+++ b/drivers/gpu/drm/bridge/nwl-dsi.c
@@ -196,7 +196,7 @@ static u32 ps2bc(struct nwl_dsi *dsi, unsigned long long ps)
u32 bpp = mipi_dsi_pixel_format_to_bpp(dsi->format);
 
return DIV64_U64_ROUND_UP(ps * dsi->mode.clock * bpp,
- dsi->lanes * 8 * NSEC_PER_SEC);
+ dsi->lanes * 8ULL * NSEC_PER_SEC);
 }
 
 /*
-- 
2.25.1
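
The wrap-around described in this patch can be reproduced in user space.
The sketch below is an illustration, not driver code: uint32_t stands in
for the 32-bit type the product would be evaluated in on a 32-bit
platform, and the helper names (divisor_32bit, divisor_64bit,
demo_overflow) are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* NSEC_PER_SEC held in 32 bits, mimicking 'long' on a 32-bit target. */
#define NSEC_PER_SEC_32 UINT32_C(1000000000)

/* Hypothetical helper: the divisor computed entirely in 32-bit arithmetic. */
static uint64_t divisor_32bit(uint32_t lanes)
{
	uint32_t d = lanes * UINT32_C(8) * NSEC_PER_SEC_32; /* wraps modulo 2^32 */

	return d;
}

/* Hypothetical helper: the 8ULL constant promotes the whole product to 64 bits. */
static uint64_t divisor_64bit(uint32_t lanes)
{
	return lanes * 8ULL * NSEC_PER_SEC_32;
}

/* Usage: a single lane already overflows in the 32-bit variant. */
static void demo_overflow(void)
{
	assert(divisor_32bit(1) == 3705032704ULL); /* 8000000000 mod 2^32 */
	assert(divisor_64bit(1) == 8000000000ULL);
	assert(divisor_64bit(4) == 32000000000ULL);
}
```

Only one operand needs to be 64-bit for the usual arithmetic conversions
to widen the rest, which is why changing the literal 8 to 8ULL is enough.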



[PATCH v2 0/7] Add support to the mmsys driver to be a reset controller

2021-07-14 Thread Enric Balletbo i Serra
Dear all,

The following patchset is a reimplementation of the patch sent by Jitao
Shi [1] some time ago. As suggested by Chun-Kuang Hu, this time the
reset is done using the reset API, where the mmsys driver is the reset
controller and the mtk_dsi driver is the reset consumer.

Note that the first patch is a largely unrelated cleanup, but it is needed
if you want to apply the following patches cleanly.

This patchset is needed to get the DSI panel working on some kukui MT8183
Chromebooks (e.g. the Lenovo IdeaPad Duet). Without it, you just get a
black screen.

Best regards,
  Enric

[1] https://lore.kernel.org/linux-arm-kernel/20210420132614.150242-4-jitao@mediatek.com/


Changes in v2:
- Fix the build error reported by the kernel test robot.
- Added a new patch to describe the dsi reset optional property.

Enric Balletbo i Serra (7):
  arm64: dts: mediatek: Move reset controller constants into common
location
  dt-bindings: mediatek: Add #reset-cells to mmsys system controller
  dt-bindings: display: mediatek: add dsi reset optional property
  arm64: dts: mt8173: Add the mmsys reset bit to reset the dsi0
  arm64: dts: mt8183: Add the mmsys reset bit to reset the dsi0
  soc: mediatek: mmsys: Add reset controller support
  drm/mediatek: mtk_dsi: Reset the dsi0 hardware

 .../bindings/arm/mediatek/mediatek,mmsys.txt  |  2 +
 .../display/mediatek/mediatek,dsi.txt |  6 ++
 arch/arm64/boot/dts/mediatek/mt8173.dtsi  |  2 +
 arch/arm64/boot/dts/mediatek/mt8183.dtsi  |  5 +-
 drivers/gpu/drm/mediatek/mtk_dsi.c|  5 +-
 drivers/soc/mediatek/mtk-mmsys.c  | 69 +++
 drivers/soc/mediatek/mtk-mmsys.h  |  2 +
 drivers/watchdog/mtk_wdt.c|  6 +-
 .../mt2712-resets.h   |  0
 include/dt-bindings/reset/mt8173-resets.h |  2 +
 .../mt8183-resets.h   |  3 +
 .../mt8192-resets.h   |  0
 12 files changed, 96 insertions(+), 6 deletions(-)
 rename include/dt-bindings/{reset-controller => reset}/mt2712-resets.h (100%)
 rename include/dt-bindings/{reset-controller => reset}/mt8183-resets.h (98%)
 rename include/dt-bindings/{reset-controller => reset}/mt8192-resets.h (100%)

-- 
2.30.2



[PATCH v2 3/7] dt-bindings: display: mediatek: add dsi reset optional property

2021-07-14 Thread Enric Balletbo i Serra
Update device tree binding documentation for the dsi to add the optional
property to reset the dsi controller.

Signed-off-by: Enric Balletbo i Serra 
---

Changes in v2:
- Added a new patch to describe the dsi reset optional property.

 .../devicetree/bindings/display/mediatek/mediatek,dsi.txt   | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,dsi.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,dsi.txt
index 8238a86686be..3209b700ded6 100644
--- a/Documentation/devicetree/bindings/display/mediatek/mediatek,dsi.txt
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,dsi.txt
@@ -19,6 +19,11 @@ Required properties:
   Documentation/devicetree/bindings/graph.txt. This port should be connected
   to the input port of an attached DSI panel or DSI-to-eDP encoder chip.
 
+Optional properties:
+- resets: list of phandle + reset specifier pair, as described in [1].
+
+[1] Documentation/devicetree/bindings/reset/reset.txt
+
 MIPI TX Configuration Module
 
 
@@ -45,6 +50,7 @@ dsi0: dsi@1401b000 {
clocks = <&mmsys MM_DSI0_ENGINE>, <&mmsys MM_DSI0_DIGITAL>,
 <&mipi_tx0>;
clock-names = "engine", "digital", "hs";
+   resets = <&mmsys MT8173_MMSYS_SW0_RST_B_DISP_DSI0>;
phys = <&mipi_tx0>;
phy-names = "dphy";
 
-- 
2.30.2



[PATCH v2 7/7] drm/mediatek: mtk_dsi: Reset the dsi0 hardware

2021-07-14 Thread Enric Balletbo i Serra
Reset the dsi0 HW to its default state at power on. This avoids ending up
with different settings between the bootloader and the kernel.

As not all Mediatek boards have the reset consumer configured in their
board description, and the reset is not needed on all of them, the reset is
optional, so the change is compatible with all boards.

Cc: Jitao Shi 
Suggested-by: Chun-Kuang Hu 
Signed-off-by: Enric Balletbo i Serra 
---

(no changes since v1)

 drivers/gpu/drm/mediatek/mtk_dsi.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c
index ae403c67cbd9..d8b81e2ab841 100644
--- a/drivers/gpu/drm/mediatek/mtk_dsi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -980,8 +981,10 @@ static int mtk_dsi_bind(struct device *dev, struct device *master, void *data)
struct mtk_dsi *dsi = dev_get_drvdata(dev);
 
ret = mtk_dsi_encoder_init(drm, dsi);
+   if (ret)
+   return ret;
 
-   return ret;
+   return device_reset_optional(dev);
 }
 
 static void mtk_dsi_unbind(struct device *dev, struct device *master,
-- 
2.30.2
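
The control flow after this change can be modelled outside the kernel.
The sketch below is a user-space mock, not the driver: struct mock_dev,
mock_encoder_init() and mock_reset_optional() are hypothetical stand-ins
for the device, mtk_dsi_encoder_init() and device_reset_optional(), and
the mock assumes that an optional reset reports success when no reset
line is described for the device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state relevant to bind. */
struct mock_dev {
	int encoder_init_ret; /* what the encoder init step returns */
	bool has_reset;       /* whether a "resets" property is present */
	int reset_ret;        /* what the reset returns when it is present */
	bool reset_done;      /* set once the reset was actually triggered */
};

static int mock_encoder_init(struct mock_dev *dev)
{
	return dev->encoder_init_ret;
}

/* Assumed semantics: a missing reset line is not an error. */
static int mock_reset_optional(struct mock_dev *dev)
{
	if (!dev->has_reset)
		return 0;

	dev->reset_done = true;
	return dev->reset_ret;
}

/* Same shape as the patched bind: bail out early, then reset. */
static int mock_bind(struct mock_dev *dev)
{
	int ret = mock_encoder_init(dev);

	if (ret)
		return ret;

	return mock_reset_optional(dev);
}
```

With this shape a board without a "resets" property binds exactly as
before, while a failing encoder init never reaches the reset step.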



[PATCH v2] dt-bindings: display: renesas, du: Make resets optional on R-Car H1

2021-07-14 Thread Geert Uytterhoeven
The "resets" property is not present on R-Car Gen1 SoCs.
Supporting it would require migrating from renesas,cpg-clocks to
renesas,cpg-mssr.

Reflect this in the DT bindings by removing the global "required:
resets".  All SoCs that do have "resets" properties already have
SoC-specific rules making it required.

Fixes: 99d66127fad25ebb ("dt-bindings: display: renesas,du: Convert binding to YAML")
Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Laurent Pinchart 
---
v2:
  - Add Reviewed-by.
---
 Documentation/devicetree/bindings/display/renesas,du.yaml | 1 -
 1 file changed, 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/renesas,du.yaml b/Documentation/devicetree/bindings/display/renesas,du.yaml
index 5f4345d43020fd53..e3ca5389c17d34e9 100644
--- a/Documentation/devicetree/bindings/display/renesas,du.yaml
+++ b/Documentation/devicetree/bindings/display/renesas,du.yaml
@@ -92,7 +92,6 @@ required:
   - reg
   - clocks
   - interrupts
-  - resets
   - ports
 
 allOf:
-- 
2.25.1



Re: [PATCH] drm/bridge: anx7625: Use pm_runtime_force_{suspend,resume}

2021-07-14 Thread Daniel Vetter
On Wed, Jul 14, 2021 at 02:01:59PM +0800, Pi-Hsun Shih wrote:
> Use pm_runtime_force_suspend and pm_runtime_force_resume to ensure that
> anx7625 would always be powered off when suspended. Also update the
> bridge enable hook to always ensure that the anx7625 is powered on
> before starting DP operations.
> 
> Fixes: 409776fa3c42 ("drm/bridge: anx7625: add suspend / resume hooks")
> 
> Signed-off-by: Pi-Hsun Shih 
> 
> ---
> 
> An issue was found that the anx7625 driver won't power off when used as
> eDP bridge on Asurada board if suspend is entered via VT2.
> 
> The reason is that in this case, anx7625_suspend won't power off anx7625
> (since intp_irq is not set). And anx7625_bridge_disable is only called
> indirectly by other driver's (mediatek-drm) suspend.
> pm_runtime_put_sync won't do anything since it's already in system
> suspend.
> 
> If not in VT2, the bridge disable is indirectly called when Chrome
> stops, so anx7625 will be powered off correctly.
> 
> To fix the issue, the suspend resume hooks are changed to
> pm_runtime_force_{suspend,resume} to ensure the runtime suspend / resume
> is always called correctly when system suspend / resume.
> (Note that IRQ no longer needs to be disabled on suspend after commit
> f03ab6629c7b ("drm/bridge: anx7625: Make hpd workqueue freezable"))
> 
> Since bridge disable is called indirectly by mediatek-drm driver's
> suspend, it might happen after anx7625 suspend is called. So a check
> if the driver is already suspended via pm_runtime_force_suspend is also
> added, to ensure that anx7625_dp_stop won't be called when power
> is off. Also, since bridge enable might happen before anx7625 resume
> is called, a check for that is added as well, which force-resumes the
> device in this case.
> 
> I'm not sure if the approach to fix this is the most appropriate way,
> since using pm_runtime_force_resume in bridge enable kinda feels hacky
> to me. I'm open to any suggestions.

I thought the real fix was to create device links between the bridge and
the other parts of the overall drm driver, so that the driver core can
resume devices in the right order.

Unfortunately those device link patches haven't made it in yet. Quick
search on lore didn't find anything, maybe I was just dreaming, or maybe
the patches only existed for panels.

Either way, this is a drm_bridge.c problem that needs to be fixed there,
not individually in each driver.
-Daniel
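
To make the ordering argument concrete, here is a toy user-space model of
"resume suppliers before consumers". It is not driver-core code: struct
toy_dev, toy_resume() and the log array are invented for illustration. In
the kernel such a dependency would be expressed with device_link_add(),
with the bridge as supplier and the display driver as consumer; that
mapping is an assumption of this sketch, not something the thread spells
out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy device: at most one supplier, for simplicity. */
struct toy_dev {
	const char *name;
	struct toy_dev *supplier; /* device this one depends on, or NULL */
	int resumed;
};

#define TOY_LOG_MAX 8

static const char *resume_log[TOY_LOG_MAX]; /* names in resume order */
static size_t resume_count;

/* Resume a device, making sure its supplier is resumed first. */
static void toy_resume(struct toy_dev *dev)
{
	if (dev->resumed)
		return;

	if (dev->supplier)
		toy_resume(dev->supplier);

	dev->resumed = 1;
	if (resume_count < TOY_LOG_MAX)
		resume_log[resume_count++] = dev->name;
}
```

In this model the bridge is always powered by the time the consumer's
resume runs, whichever of the two the caller asks to resume first, so the
consumer never has to force-resume the bridge itself.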

> 
> ---
>  drivers/gpu/drm/bridge/analogix/anx7625.c | 55 +--
>  1 file changed, 20 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c 
> b/drivers/gpu/drm/bridge/analogix/anx7625.c
> index a3d82377066b..9d0f5dc88b16 100644
> --- a/drivers/gpu/drm/bridge/analogix/anx7625.c
> +++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
> @@ -1559,7 +1559,20 @@ static void anx7625_bridge_enable(struct drm_bridge 
> *bridge)
>  
>   DRM_DEV_DEBUG_DRIVER(dev, "drm enable\n");
>  
> - pm_runtime_get_sync(dev);
> + /*
> +  * The only case where pm_runtime is disabled here is when the function
> +  * is called other driver's resume hook by
> +  * drm_mode_config_helper_resume, but when the pm_runtime_force_resume
> +  * hasn't been called on this device.
> +  *
> +  * pm_runtime_get_sync won't power on anx7625 in this case since we're
> +  * in system resume, so instead we force resume anx7625 to make sure
> +  * the following anx7625_dp_start would succeed.
> +  */
> + if (pm_runtime_enabled(dev))
> + pm_runtime_get_sync(dev);
> + else
> + pm_runtime_force_resume(dev);
>  
>   anx7625_dp_start(ctx);
>  }
> @@ -1571,9 +1584,10 @@ static void anx7625_bridge_disable(struct drm_bridge 
> *bridge)
>  
>   DRM_DEV_DEBUG_DRIVER(dev, "drm disable\n");
>  
> - anx7625_dp_stop(ctx);
> -
> - pm_runtime_put_sync(dev);
> + if (pm_runtime_enabled(dev)) {
> + anx7625_dp_stop(ctx);
> + pm_runtime_put_sync(dev);
> + }
>  }
>  
>  static enum drm_connector_status
> @@ -1705,38 +1719,9 @@ static int __maybe_unused 
> anx7625_runtime_pm_resume(struct device *dev)
>   return 0;
>  }
>  
> -static int __maybe_unused anx7625_resume(struct device *dev)
> -{
> - struct anx7625_data *ctx = dev_get_drvdata(dev);
> -
> - if (!ctx->pdata.intp_irq)
> - return 0;
> -
> - if (!pm_runtime_enabled(dev) || !pm_runtime_suspended(dev)) {
> - enable_irq(ctx->pdata.intp_irq);
> - anx7625_runtime_pm_resume(dev);
> - }
> -
> - return 0;
> -}
> -
> -static int __maybe_unused anx7625_suspend(struct device *dev)
> -{
> - struct anx7625_data *ctx = dev_get_drvdata(dev);
> -
> - if (!ctx->pdata.intp_irq)
> - return 0;
> -
> - if (!pm_runtime_enabled(dev) || !pm_runtime_suspended(dev)) {
> - anx7625_runtime_pm_suspend(dev);
> - disable_irq(ctx->pdata.intp_irq);
> - }
> -
> - return 0;

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-14 Thread Daniel Vetter
On Mon, Jul 12, 2021 at 06:06:36PM -0400, Felix Kuehling wrote:
> KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
> is_cow_mapping returns true for these mappings. Add a check for
> vm_flags & VM_WRITE to avoid mmap failures on private read-only or
> PROT_NONE mappings.
> 
> v2: protect against mprotect making a mapping writable after the fact
> v3: update driver-specific vm_operations_structs
> 
> Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
> Signed-off-by: Felix Kuehling 
> Signed-off-by: Alex Deucher 

So looking at vmf_insert_pfn_prot() and the comment there we can't have
VM_PFNMAP and is_cow_mapping ever be true, or things break. On platforms
without pte_special at least.

So I'm not sure this is a great idea, and definitely not for all drivers
...

Can we clear VM_MAYWRITE instead to force this to be a non-cow mapping
instead?
-Daniel
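
For reference, why a PROT_NONE, MAP_PRIVATE mapping trips the existing
check can be shown with the flag arithmetic alone. This is a user-space
sketch: the flag values and the is_cow_mapping() expression mirror the
kernel's definitions, while old_check_rejects(), patched_check_rejects()
and demo_prot_none_private() are names made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in include/linux/mm.h. */
#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_SHARED   0x00000008UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYWRITE 0x00000020UL

/* Same expression as the kernel helper of this name. */
static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Hypothetical name for the check before the patch. */
static bool old_check_rejects(unsigned long vm_flags)
{
	return is_cow_mapping(vm_flags);
}

/* Hypothetical name for the check after the patch. */
static bool patched_check_rejects(unsigned long vm_flags)
{
	return is_cow_mapping(vm_flags) && (vm_flags & VM_WRITE);
}

static void demo_prot_none_private(void)
{
	/* A private mapping keeps VM_MAYWRITE and lacks VM_SHARED. */
	unsigned long vm_flags = VM_MAYREAD | VM_MAYWRITE;

	assert(is_cow_mapping(vm_flags));         /* counted as COW */
	assert(old_check_rejects(vm_flags));      /* mmap fails today */
	assert(!patched_check_rejects(vm_flags)); /* accepted with the patch */
}
```

The mapping is classified as COW purely because of VM_MAYWRITE; no access
bit is set at all.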

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
>  drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
>  drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
>  drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
>  include/drm/ttm/ttm_bo_api.h |  4 
>  6 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index b3404c43a911..1aa750a6a5d2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -79,7 +79,8 @@ static const struct vm_operations_struct amdgpu_gem_vm_ops 
> = {
>   .fault = amdgpu_gem_fault,
>   .open = ttm_bo_vm_open,
>   .close = ttm_bo_vm_close,
> - .access = ttm_bo_vm_access
> + .access = ttm_bo_vm_access,
> + .mprotect = ttm_bo_vm_mprotect
>  };
>  
>  static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
> b/drivers/gpu/drm/nouveau/nouveau_gem.c
> index 5b27845075a1..164ea564bb7a 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> @@ -70,7 +70,8 @@ static const struct vm_operations_struct nouveau_ttm_vm_ops 
> = {
>   .fault = nouveau_ttm_fault,
>   .open = ttm_bo_vm_open,
>   .close = ttm_bo_vm_close,
> - .access = ttm_bo_vm_access
> + .access = ttm_bo_vm_access,
> + .mprotect = ttm_bo_vm_mprotect
>  };
>  
>  void
> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> b/drivers/gpu/drm/radeon/radeon_gem.c
> index 458f92a70887..c19ad07eb7b5 100644
> --- a/drivers/gpu/drm/radeon/radeon_gem.c
> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> @@ -77,7 +77,8 @@ static const struct vm_operations_struct radeon_gem_vm_ops 
> = {
>   .fault = radeon_gem_fault,
>   .open = ttm_bo_vm_open,
>   .close = ttm_bo_vm_close,
> - .access = ttm_bo_vm_access
> + .access = ttm_bo_vm_access,
> + .mprotect = ttm_bo_vm_mprotect
>  };
>  
>  static void radeon_gem_object_free(struct drm_gem_object *gobj)
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index f56be5bc0861..fb325bad5db6 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, 
> unsigned long addr,
>  }
>  EXPORT_SYMBOL(ttm_bo_vm_access);
>  
> +int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
> +unsigned long end, unsigned long newflags)
> +{
> + /* Enforce no COW since would have really strange behavior with it. */
> + if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
> + return -EINVAL;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(ttm_bo_vm_mprotect);
> +
>  static const struct vm_operations_struct ttm_bo_vm_ops = {
>   .fault = ttm_bo_vm_fault,
>   .open = ttm_bo_vm_open,
>   .close = ttm_bo_vm_close,
>   .access = ttm_bo_vm_access,
> + .mprotect = ttm_bo_vm_mprotect,
>  };
>  
>  int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo)
>  {
>   /* Enforce no COW since would have really strange behavior with it. */
> - if (is_cow_mapping(vma->vm_flags))
> + if (is_cow_mapping(vma->vm_flags) && (vma->vm_flags & VM_WRITE))
>   return -EINVAL;
>  
>   ttm_bo_get(bo);
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
> index e6b1f98ec99f..e4bf7dc99320 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
> @@ -61,6 +61,7 @@ int vmw_mmap(struct file *filp, struct vm_area_struct *vma)
>   .fault = vmw_bo_vm_fault,
>   .open = ttm_bo_vm_open,
>   .close = ttm_bo_vm_close,
> + .mprotect = ttm_bo_vm_mprotect,
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>   .huge_fault = vmw_bo_vm_huge_fault,
>  #endif
> diff --git a/include/drm/ttm/ttm_bo_api.h b

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-14 Thread Daniel Vetter
On Wed, Jul 14, 2021 at 12:44:00PM +0200, Daniel Vetter wrote:
> On Mon, Jul 12, 2021 at 06:06:36PM -0400, Felix Kuehling wrote:
> > KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
> > is_cow_mapping returns true for these mappings. Add a check for
> > vm_flags & VM_WRITE to avoid mmap failures on private read-only or
> > PROT_NONE mappings.
> > 
> > v2: protect against mprotect making a mapping writable after the fact
> > v3: update driver-specific vm_operations_structs
> > 
> > Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
> > Signed-off-by: Felix Kuehling 
> > Signed-off-by: Alex Deucher 
> 
> So looking at vmf_insert_pfn_prot() and the comment there we can't have
> VM_PFNMAP and is_cow_mapping ever be true, or things break. On platforms
> without pte_special at least.
> 
> So I'm not sure this is a great idea, and definitely not for all drivers
> ...
> 
> Can we clear VM_MAYWRITE instead to force this to be a non-cow mapping
> instead?

Quick git grep says there's plenty of drivers which clear MAYWRITE, and
often with comments to block mprotect upfront. That feels like the cleaner
approach, perhaps limited to an override in just amdgpu, with a comment
explaining why it's needed? That is, an amdgpu mmap function which just
clears VM_MAYWRITE if it's a cow mapping and then calls into the ttm mmap
implementation.
-Daniel
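
A rough user-space sketch of that idea, to show the effect on the flags.
sketch_prepare_flags() is a hypothetical stand-in for the driver-side mmap
step, and refusing writable private mappings in it is an assumption of
this sketch. mprotect_permitted() mirrors the "each requested VM_x needs
the matching VM_MAYx" test that mprotect applies; flag values are the ones
from include/linux/mm.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_EXEC     0x00000004UL
#define VM_SHARED   0x00000008UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYWRITE 0x00000020UL

#define VM_ACCESS_BITS (VM_READ | VM_WRITE | VM_EXEC)

static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Hypothetical driver step run before handing the vma to the TTM mmap code. */
static int sketch_prepare_flags(unsigned long *vm_flags)
{
	if (!is_cow_mapping(*vm_flags))
		return 0;

	/* Assumption of this sketch: writable private mappings stay refused. */
	if (*vm_flags & VM_WRITE)
		return -EINVAL;

	/* Dropping VM_MAYWRITE makes the mapping non-COW for good. */
	*vm_flags &= ~VM_MAYWRITE;
	return 0;
}

/* Would mprotect allow switching the access bits to new_access? */
static bool mprotect_permitted(unsigned long vm_flags, unsigned long new_access)
{
	unsigned long newflags = (vm_flags & ~VM_ACCESS_BITS) | new_access;

	/* Shifting by 4 lines each VM_MAYx bit up with its VM_x bit. */
	return ((newflags & ~(newflags >> 4)) & VM_ACCESS_BITS) == 0;
}
```

With VM_MAYWRITE cleared at mmap time, a later mprotect(PROT_WRITE) is
refused by the generic permission test, so no driver-specific mprotect
hook is needed in this model.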

> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
> >  drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
> >  drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
> >  drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
> >  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
> >  include/drm/ttm/ttm_bo_api.h |  4 
> >  6 files changed, 24 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > index b3404c43a911..1aa750a6a5d2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > @@ -79,7 +79,8 @@ static const struct vm_operations_struct 
> > amdgpu_gem_vm_ops = {
> > .fault = amdgpu_gem_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > -   .access = ttm_bo_vm_access
> > +   .access = ttm_bo_vm_access,
> > +   .mprotect = ttm_bo_vm_mprotect
> >  };
> >  
> >  static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
> > b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > index 5b27845075a1..164ea564bb7a 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > @@ -70,7 +70,8 @@ static const struct vm_operations_struct 
> > nouveau_ttm_vm_ops = {
> > .fault = nouveau_ttm_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > -   .access = ttm_bo_vm_access
> > +   .access = ttm_bo_vm_access,
> > +   .mprotect = ttm_bo_vm_mprotect
> >  };
> >  
> >  void
> > diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> > b/drivers/gpu/drm/radeon/radeon_gem.c
> > index 458f92a70887..c19ad07eb7b5 100644
> > --- a/drivers/gpu/drm/radeon/radeon_gem.c
> > +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> > @@ -77,7 +77,8 @@ static const struct vm_operations_struct 
> > radeon_gem_vm_ops = {
> > .fault = radeon_gem_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > -   .access = ttm_bo_vm_access
> > +   .access = ttm_bo_vm_access,
> > +   .mprotect = ttm_bo_vm_mprotect
> >  };
> >  
> >  static void radeon_gem_object_free(struct drm_gem_object *gobj)
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c 
> > b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > index f56be5bc0861..fb325bad5db6 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > @@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, 
> > unsigned long addr,
> >  }
> >  EXPORT_SYMBOL(ttm_bo_vm_access);
> >  
> > +int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
> > +  unsigned long end, unsigned long newflags)
> > +{
> > +   /* Enforce no COW since would have really strange behavior with it. */
> > +   if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
> > +   return -EINVAL;
> > +
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL(ttm_bo_vm_mprotect);
> > +
> >  static const struct vm_operations_struct ttm_bo_vm_ops = {
> > .fault = ttm_bo_vm_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > .access = ttm_bo_vm_access,
> > +   .mprotect = ttm_bo_vm_mprotect,
> >  };
> >  
> >  int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object 
> > *bo)
> >  {
> > /* Enforce no COW since would have really strange behavior with it. */
> > -   if (is_cow_mapping(vma->vm_flags))
> > +   if (is_cow_mapping(vma->vm_flags) && (vma->vm_flags & VM_WRITE))
> > return -EINVAL;
> >  
> > ttm_bo_get(bo);
> > diff --git a/drivers/gpu

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-14 Thread Christian König

On 14.07.21 at 12:44, Daniel Vetter wrote:
> On Mon, Jul 12, 2021 at 06:06:36PM -0400, Felix Kuehling wrote:
> > KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
> > is_cow_mapping returns true for these mappings. Add a check for
> > vm_flags & VM_WRITE to avoid mmap failures on private read-only or
> > PROT_NONE mappings.
> > 
> > v2: protect against mprotect making a mapping writable after the fact
> > v3: update driver-specific vm_operations_structs
> > 
> > Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
> > Signed-off-by: Felix Kuehling 
> > Signed-off-by: Alex Deucher 
> So looking at vmf_insert_pfn_prot() and the comment there we can't have
> VM_PFNMAP and is_cow_mapping ever be true, or things break. On platforms
> without pte_special at least.

Key idea is that we never end up in vmf_insert_pfn_prot() because the 
vma is mapped with PROT_NONE.

> So I'm not sure this is a great idea, and definitely not for all drivers

Yeah, I'm absolutely not happy with this either but it seemed to be the 
least painful thing to do.

> ...
> 
> Can we clear VM_MAYWRITE instead to force this to be a non-cow mapping
> instead?

Well we have considered forcefully setting VM_SHARED, which won't work 
easily for a couple of reasons.

But clearing VM_MAYWRITE in amdgpu/amdkfd may actually work as well.

Felix can you test this?

Thanks,
Christian.

> -Daniel


> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
> >   drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
> >   drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
> >   drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
> >   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
> >   include/drm/ttm/ttm_bo_api.h |  4 
> >   6 files changed, 24 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > index b3404c43a911..1aa750a6a5d2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > @@ -79,7 +79,8 @@ static const struct vm_operations_struct amdgpu_gem_vm_ops = {
> > .fault = amdgpu_gem_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > -   .access = ttm_bo_vm_access
> > +   .access = ttm_bo_vm_access,
> > +   .mprotect = ttm_bo_vm_mprotect
> >   };
> >   
> >   static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
> > b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > index 5b27845075a1..164ea564bb7a 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > @@ -70,7 +70,8 @@ static const struct vm_operations_struct nouveau_ttm_vm_ops = 
> > {
> > .fault = nouveau_ttm_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > -   .access = ttm_bo_vm_access
> > +   .access = ttm_bo_vm_access,
> > +   .mprotect = ttm_bo_vm_mprotect
> >   };
> >   
> >   void
> > diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> > b/drivers/gpu/drm/radeon/radeon_gem.c
> > index 458f92a70887..c19ad07eb7b5 100644
> > --- a/drivers/gpu/drm/radeon/radeon_gem.c
> > +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> > @@ -77,7 +77,8 @@ static const struct vm_operations_struct radeon_gem_vm_ops = {
> > .fault = radeon_gem_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > -   .access = ttm_bo_vm_access
> > +   .access = ttm_bo_vm_access,
> > +   .mprotect = ttm_bo_vm_mprotect
> >   };
> >   
> >   static void radeon_gem_object_free(struct drm_gem_object *gobj)
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > index f56be5bc0861..fb325bad5db6 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > @@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned 
> > long addr,
> >   }
> >   EXPORT_SYMBOL(ttm_bo_vm_access);
> >   
> > +int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
> > +  unsigned long end, unsigned long newflags)
> > +{
> > +   /* Enforce no COW since would have really strange behavior with it. */
> > +   if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
> > +   return -EINVAL;
> > +
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL(ttm_bo_vm_mprotect);
> > +
> >   static const struct vm_operations_struct ttm_bo_vm_ops = {
> > .fault = ttm_bo_vm_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > .access = ttm_bo_vm_access,
> > +   .mprotect = ttm_bo_vm_mprotect,
> >   };
> >   
> >   int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo)
> >   {
> > /* Enforce no COW since would have really strange behavior with it. */
> > -   if (is_cow_mapping(vma->vm_flags))
> > +   if (is_cow_mapping(vma->vm_flags) && (vma->vm_flags & VM_WRITE))
> > return -EINVAL;
> >   
> >   	ttm_bo_get(bo);
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c 
> > b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
> > index e6b1f98ec99f..e4bf7dc99320 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
> > @@ -61,6 +61,7 @@ int vmw_mmap(struct fil

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-14 Thread Daniel Vetter
On Wed, Jul 14, 2021 at 12:51:15PM +0200, Christian König wrote:
> On 14.07.21 at 12:44, Daniel Vetter wrote:
> > On Mon, Jul 12, 2021 at 06:06:36PM -0400, Felix Kuehling wrote:
> > > KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
> > > is_cow_mapping returns true for these mappings. Add a check for
> > > vm_flags & VM_WRITE to avoid mmap failures on private read-only or
> > > PROT_NONE mappings.
> > > 
> > > v2: protect against mprotect making a mapping writable after the fact
> > > v3: update driver-specific vm_operations_structs
> > > 
> > > Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
> > > Signed-off-by: Felix Kuehling 
> > > Signed-off-by: Alex Deucher 
> > So looking at vmf_insert_pfn_prot() and the comment there we can't have
> > VM_PFNMAP and is_cow_mapping ever be true, or things break. On platforms
> > without pte_special at least.
> 
> Key idea is that we never end up in vmf_insert_pfn_prot() because the vma is
> mapped with PROT_NONE.

Ah right if it's PROT_NONE then it's ok. But the code here only checks for
VM_WRITE, not VM_READ, so PROT_READ can get through and go boom? Or
something else I'm missing?

Maybe time for a few amdgpu mmap tests that go through the combos and make
sure it works/fails all correctly.
-Daniel
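
As a starting point for such tests, the protection combinations can be
walked in a small table. This is a user-space sketch of the flag logic
only, not an amdgpu test: the table, patched_check_accepts() and
count_accepted_cow() are hypothetical, and the vm_flags values assume a
MAP_PRIVATE mapping, which keeps VM_MAYREAD and VM_MAYWRITE and lacks
VM_SHARED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_SHARED   0x00000008UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYWRITE 0x00000020UL

#define PRIVATE_MAY (VM_MAYREAD | VM_MAYWRITE)

static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* The check as it reads after the v3 patch, inverted to "accepts". */
static bool patched_check_accepts(unsigned long vm_flags)
{
	return !(is_cow_mapping(vm_flags) && (vm_flags & VM_WRITE));
}

struct combo {
	const char *prot;
	unsigned long vm_flags;
};

/* MAP_PRIVATE with each protection combination. */
static const struct combo private_combos[] = {
	{ "PROT_NONE",            PRIVATE_MAY },
	{ "PROT_READ",            PRIVATE_MAY | VM_READ },
	{ "PROT_WRITE",           PRIVATE_MAY | VM_WRITE },
	{ "PROT_READ|PROT_WRITE", PRIVATE_MAY | VM_READ | VM_WRITE },
};

/* Count combinations that pass the check while still being COW mappings. */
static size_t count_accepted_cow(void)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(private_combos) / sizeof(private_combos[0]); i++) {
		unsigned long f = private_combos[i].vm_flags;

		if (patched_check_accepts(f) && is_cow_mapping(f))
			n++;
	}
	return n;
}
```

In this model two combinations are accepted while still counting as COW:
PROT_NONE, which cannot fault pages in, and PROT_READ, which is the case
the question above is about.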

> > So I'm not sure this is a great idea, and definitely not for all drivers
> 
> Yeah, I'm absolutely not happy with this either but it seemed to be the
> least painful thing to do.
> 
> > ...
> > 
> > Can we clear VM_MAYWRITE instead to force this to be a non-cow mapping
> > instead?
> 
> Well we have considered forcefully setting VM_SHARED, which won't work
> easily for a couple of reasons.
> 
> But clearing VM_MAYWRITE in amdgpu/amdkfd may actually work as well.
> 
> Felix can you test this?
> 
> Thanks,
> Christian.
> 
> > -Daniel
> > 
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
> > >   drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
> > >   drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
> > >   drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
> > >   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
> > >   include/drm/ttm/ttm_bo_api.h |  4 
> > >   6 files changed, 24 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > > index b3404c43a911..1aa750a6a5d2 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > > @@ -79,7 +79,8 @@ static const struct vm_operations_struct 
> > > amdgpu_gem_vm_ops = {
> > >   .fault = amdgpu_gem_fault,
> > >   .open = ttm_bo_vm_open,
> > >   .close = ttm_bo_vm_close,
> > > - .access = ttm_bo_vm_access
> > > + .access = ttm_bo_vm_access,
> > > + .mprotect = ttm_bo_vm_mprotect
> > >   };
> > >   static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
> > > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
> > > b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > index 5b27845075a1..164ea564bb7a 100644
> > > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > @@ -70,7 +70,8 @@ static const struct vm_operations_struct 
> > > nouveau_ttm_vm_ops = {
> > >   .fault = nouveau_ttm_fault,
> > >   .open = ttm_bo_vm_open,
> > >   .close = ttm_bo_vm_close,
> > > - .access = ttm_bo_vm_access
> > > + .access = ttm_bo_vm_access,
> > > + .mprotect = ttm_bo_vm_mprotect
> > >   };
> > >   void
> > > diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> > > b/drivers/gpu/drm/radeon/radeon_gem.c
> > > index 458f92a70887..c19ad07eb7b5 100644
> > > --- a/drivers/gpu/drm/radeon/radeon_gem.c
> > > +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> > > @@ -77,7 +77,8 @@ static const struct vm_operations_struct 
> > > radeon_gem_vm_ops = {
> > >   .fault = radeon_gem_fault,
> > >   .open = ttm_bo_vm_open,
> > >   .close = ttm_bo_vm_close,
> > > - .access = ttm_bo_vm_access
> > > + .access = ttm_bo_vm_access,
> > > + .mprotect = ttm_bo_vm_mprotect
> > >   };
> > >   static void radeon_gem_object_free(struct drm_gem_object *gobj)
> > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c 
> > > b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > index f56be5bc0861..fb325bad5db6 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > @@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, 
> > > unsigned long addr,
> > >   }
> > >   EXPORT_SYMBOL(ttm_bo_vm_access);
> > > +int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
> > > +unsigned long end, unsigned long newflags)
> > > +{
> > > + /* Enforce no COW since would have really strange behavior with it. */
> > > + if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
> > > + return -EINVAL;
> > > +
> > > + return 0;
> > > +}
> > > +EXPORT_SYMBOL(ttm_bo_vm_mprotect);
> > > +
> 

Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-14 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 09:46:30PM +0300, Ville Syrjälä wrote:
> On Tue, Jul 13, 2021 at 07:24:23PM +0100, Matthew Auld wrote:
> > On Tue, 13 Jul 2021 at 18:47, Ville Syrjälä
> >  wrote:
> > >
> > > On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
> > > > On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
> > > >  wrote:
> > > > >
> > > > > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > > > > + /**
> > > > > > +  * @cache_coherent:
> > > > > > +  *
> > > > > > +  * Track whether the pages are coherent with the GPU if 
> > > > > > reading or
> > > > > > +  * writing through the CPU cache.
> > > > > > +  *
> > > > > > +  * This largely depends on the @cache_level, for example if 
> > > > > > the object
> > > > > > +  * is marked as I915_CACHE_LLC, then GPU access is coherent 
> > > > > > for both
> > > > > > +  * reads and writes through the CPU cache.
> > > > > > +  *
> > > > > > +  * Note that on platforms with shared-LLC support(HAS_LLC) 
> > > > > > reads through
> > > > > > +  * the CPU cache are always coherent, regardless of the 
> > > > > > @cache_level. On
> > > > > > +  * snooping based platforms this is not the case, unless the 
> > > > > > full
> > > > > > +  * I915_CACHE_LLC or similar setting is used.
> > > > > > +  *
> > > > > > +  * As a result of this we need to track coherency separately 
> > > > > > for reads
> > > > > > +  * and writes, in order to avoid superfluous flushing on 
> > > > > > shared-LLC
> > > > > > +  * platforms, for reads.
> > > > > > +  *
> > > > > > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > > > > > +  *
> > > > > > +  * When reading through the CPU cache, the GPU is still 
> > > > > > coherent. Note
> > > > > > +  * that no data has actually been modified here, so it might 
> > > > > > seem
> > > > > > +  * strange that we care about this.
> > > > > > +  *
> > > > > > +  * As an example, if some object is mapped on the CPU with 
> > > > > > write-back
> > > > > > +  * caching, and we read some page, then the cache likely now 
> > > > > > contains
> > > > > > +  * the data from that read. At this point the cache and main 
> > > > > > memory
> > > > > > +  * match up, so all good. But next the GPU needs to write 
> > > > > > some data to
> > > > > > +  * that same page. Now if the @cache_level is I915_CACHE_NONE 
> > > > > > and the
> > > > > > +  * the platform doesn't have the shared-LLC, then the GPU will
> > > > > > +  * effectively skip invalidating the cache(or however that 
> > > > > > works
> > > > > > +  * internally) when writing the new value.  This is really 
> > > > > > bad since the
> > > > > > +  * GPU has just written some new data to main memory, but the 
> > > > > > CPU cache
> > > > > > +  * is still valid and now contains stale data. As a result 
> > > > > > the next time
> > > > > > +  * we do a cached read with the CPU, we are rewarded with 
> > > > > > stale data.
> > > > > > +  * Likewise if the cache is later flushed, we might be 
> > > > > > rewarded with
> > > > > > +  * overwriting main memory with stale data.
> > > > > > +  *
> > > > > > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > > > > > +  *
> > > > > > +  * When writing through the CPU cache, the GPU is still 
> > > > > > coherent. Note
> > > > > > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > > > > > +  *
> > > > > > +  * This is never set when I915_CACHE_NONE is used for 
> > > > > > @cache_level,
> > > > > > +  * where instead we have to manually flush the caches after 
> > > > > > writing
> > > > > > +  * through the CPU cache. For other cache levels this should 
> > > > > > be set and
> > > > > > +  * the object is therefore considered coherent for both reads 
> > > > > > and writes
> > > > > > +  * through the CPU cache.
> > > > >
> > > > > I don't remember why we have this read vs. write split and this new
> > > > > documentation doesn't seem to really explain it either.
> > > >
> > > > Hmm, I attempted to explain that earlier:
> > > >
> > > > * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> > > > * the CPU cache are always coherent, regardless of the @cache_level. On
> > > > * snooping based platforms this is not the case, unless the full
> > > > * I915_CACHE_LLC or similar setting is used.
> > > > *
> > > > * As a result of this we need to track coherency separately for reads
> > > > * and writes, in order to avoid superfluous flushing on shared-LLC
> > > > * platforms, for reads.
> > > >
> > > > So AFAIK it's just because shared-LLC can be coherent for reads, while
> > > > also not being coherent for writes(CACHE_NONE),
> > >
> > > CPU vs. GPU is fully coherent when it comes to LLC. Or at least I've
> > > never heard of any mechanism that would make it only partially coherent.
> > 
> > What do you mean by "comes to LLC"

Re: [PATCH 5/5] drm/i915/ehl: unconditionally flush the pages on acquire

2021-07-14 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 11:45:54AM +0100, Matthew Auld wrote:
> EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> possible for userspace to bypass the GTT caching bits set by the kernel,
> as per the given object cache_level. This is troublesome since the heavy
> flush we apply when first acquiring the pages is skipped if the kernel
> thinks the object is coherent with the GPU. As a result it might be
> possible to bypass the cache and read the contents of the page directly,
> which could be stale data. If it's just a case of userspace shooting
> themselves in the foot then so be it, but since i915 takes the stance of
> always zeroing memory before handing it to userspace, we need to prevent
> this.
> 
> v2: this time actually set cache_dirty in put_pages()
> v3: move to get_pages() which looks simpler
> 
> BSpec: 34007
> References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
> Signed-off-by: Matthew Auld 
> Cc: Tejas Upadhyay 
> Cc: Francisco Jerez 
> Cc: Lucas De Marchi 
> Cc: Jon Bloomfield 
> Cc: Chris Wilson 
> Cc: Matt Roper 
> Cc: Daniel Vetter 

Reviewed-by: Daniel Vetter 

I was pondering whether we can have a solid testcase for this, but:
- igt lacks the visibility, since we can't check easily whether stuff
  leaks.
- selftests don't have rendercopy, where we could select the nasty
  mocs entry

So it's a bit awkward. Is there something, or is this pure hw workaround
stuff on theoretical grounds?
-Daniel
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h   |  6 ++
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 18 ++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index da2194290436..7089d1b222c5 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -522,6 +522,12 @@ struct drm_i915_gem_object {
>* I915_BO_CACHE_COHERENT_FOR_WRITE, i.e that the GPU will be coherent
>* for both reads and writes though the CPU cache. So pretty much this
>* should only be needed for I915_CACHE_NONE objects.
> +  *
> +  * Update: Some bonkers hardware decided to add the 'Bypass LLC' MOCS
> +  * entry, which defeats our @cache_coherent tracking, since userspace
> +  * can freely bypass the CPU cache when touching the pages with the GPU,
> +  * where the kernel is completely unaware. On such platforms we need to
> +  * apply the sledgehammer-on-acquire regardless of the @cache_coherent.
>*/
>   unsigned int cache_dirty:1;
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index 6a04cce188fc..11f072193f3b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -182,6 +182,24 @@ static int shmem_get_pages(struct drm_i915_gem_object 
> *obj)
>   if (i915_gem_object_needs_bit17_swizzle(obj))
>   i915_gem_object_do_bit_17_swizzle(obj, st);
>  
> + /*
> +  * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> +  * possible for userspace to bypass the GTT caching bits set by the
> +  * kernel, as per the given object cache_level. This is troublesome
> +  * since the heavy flush we apply when first gathering the pages is
> +  * skipped if the kernel thinks the object is coherent with the GPU. As
> +  * a result it might be possible to bypass the cache and read the
> +  * contents of the page directly, which could be stale data. If it's
> +  * just a case of userspace shooting themselves in the foot then so be
> +  * it, but since i915 takes the stance of always zeroing memory before
> +  * handing it to userspace, we need to prevent this.
> +  *
> +  * By setting cache_dirty here we make the clflush in set_pages
> +  * unconditional on such platforms.
> +  */
> + if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
> + obj->cache_dirty = true;
> +
>   __i915_gem_object_set_pages(obj, st, sg_page_sizes);
>  
>   return 0;
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Re: [PATCH v6 00/11] Clean up "mediatek,larb"

2021-07-14 Thread Frank Wunderlich
> Sent: Wednesday, 14 July 2021 at 13:18
> From: "Yong Wu" 
> Hi Frank,
>
> Thanks for your report. mt7623 use mtk_iommu_v1.c.
>
> I will try to reproduce this locally.

Hi,

As far as I have debugged it, dev->iommu_group is NULL, so the function
crashes on the first dereference of group (the dev_info() call):

drivers/iommu/iommu.c:

void iommu_group_remove_device(struct device *dev)
{
    printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
    struct iommu_group *group = dev->iommu_group;
    struct group_device *tmp_device, *device = NULL;

    printk(KERN_ALERT "DEBUG: Passed %s %d 0x%08x\n",__FUNCTION__,__LINE__,(unsigned int)group);
    dev_info(dev, "Removing from iommu group %d\n", group->id);
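
For illustration only, here is a minimal userspace sketch of the kind of
guard that would avoid this dereference. The *_model types and the function
name are hypothetical stand-ins, not the kernel's structures, and this thread
does not settle whether a real fix belongs in iommu_group_remove_device() or
in the caller:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct iommu_group_model {
	int id;
	int device_count;
};

struct device_model {
	const char *name;
	struct iommu_group_model *iommu_group; /* NULL if never added to a group */
};

/*
 * Returns 0 if the device was removed from its group, -1 if it had no
 * group. The NULL check runs before any dereference of group, so the
 * log call below can no longer fault.
 */
static int group_remove_device_model(struct device_model *dev)
{
	struct iommu_group_model *group = dev->iommu_group;

	if (!group)
		return -1;

	printf("%s: removing from iommu group %d\n", dev->name, group->id);
	group->device_count--;
	dev->iommu_group = NULL;
	return 0;
}
```

With this shape, a device that never joined a group produces an error return
instead of a NULL pointer dereference in the log call.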


regards Frank


Re: [PATCH] dma-buf: add kernel count for dma_buf

2021-07-14 Thread Guangming . Cao
On Wed, 2021-07-14 at 12:43 +0200, Christian König wrote:
> On 14.07.21 at 11:44, guangming@mediatek.com wrote:
> > From: Guangming Cao 
> > 
> > On Wed, 2021-07-14 at 10:46 +0200, Christian König wrote:
> > > On 14.07.21 at 09:11, guangming@mediatek.com wrote:
> > > > From: Guangming Cao 
> > > > 
> > > > Add a refcount for kernel to prevent UAF(Use After Free) issue.
> > > 
> > > Well NAK on so many levels.
> > > 
> > > > We can assume a case like below:
> > > >   1. kernel space alloc dma_buf(file count = 1)
> > > >   2. kernel use dma_buf to get fd(file count = 1)
> > > >   3. userspace use fd to do mapping (file count = 2)
> > > 
> > > > Creating a userspace mapping increases the reference count for the
> > > > underlying file object.
> > > > 
> > > > See the implementation of mmap_region():
> > > > ...
> > > >   vma->vm_file = get_file(file);
> > > >   error = call_mmap(file, vma);
> > > > ...
> > > > 
> > > > What can happen is that the underlying exporter redirects the mmap
> > > > to a different file, e.g. TTM or GEM drivers do that all the time.
> > > > 
> > > > But this is fine since then the VA mapping is independent of the
> > > > DMA-buf.
> > > 
> > > > >   4. kernel calls dma_buf_put (file count = 1)
> > > > >   5. userspace closes buffer fd (file count = 0)
> > > > >   6. at this time, buffer is released, but va is valid!!
> > > > >  So we still can read/write buffer via mmap va,
> > > > >  which may cause a memory leak or a kernel exception.
> > > > >  And also, if we use "ls -ll" to watch the corresponding
> > > > >  process fd link info, it will also cause a kernel exception.
> > > > > 
> > > > > Another case:
> > > > >Using dma_buf_fd to generate more than 1 fd, because
> > > > >dma_buf_fd will not increase file count, thus, when
> > > > >the second fd is closed, an error may occur.
> > > 
> > > > Each opened fd will increase the reference count, so what you
> > > > describe here is certainly not correct.
> > > 
> > > Regards,
> > > Christian.
> > > 
> > 
> > > Yes, mmap increases the file count by calling get_file(), so from
> > > step 2 to step 3 the file count increases by 1.
> > > 
> > > But dma_buf_fd() does not increase the file count.
> > > dma_buf_fd(struct dma_buf *dmabuf, int flags) just gets an unused fd
> > > by calling get_unused_fd_flags(flags) and then calls
> > > fd_install(fd, dmabuf->file), which makes the associated struct file *
> > > in the task's fdt->fd[fd] point to this dma_buf.file without
> > > increasing the file count of dma_buf.file.
> > > I think this is confusing: I can get more than one fd via
> > > dma_buf_fd(), but they don't need to be closed because they don't
> > > increase the file count.
> > > 
> > > However, dma_buf_put() can decrease the file count directly on the
> > > kernel side. If somebody writes a ko that puts the file count of
> > > dma_buf.file many times, the buffer will be freed earlier than
> > > expected. At least on Android, I think this is a little bit dangerous.
> 
> > dma_buf_fd() takes the dma_buf pointer and converts it into a fd. So
> > the reference is consumed.
> > 
> > That's why users of this interface make sure to get a separate
> > reference, see drm_gem_prime_handle_to_fd() for example:
> 
> ...
> out_have_handle:
>  ret = dma_buf_fd(dmabuf, flags);
>  /*
>   * We must _not_ remove the buffer from the handle cache since the newly
>   * created dma buf is already linked in the global obj->dma_buf pointer,
>   * and that is invariant as long as a userspace gem handle exists.
>   * Closing the handle will clean out the cache anyway, so we don't leak.
>   */
>  if (ret < 0) {
>  goto fail_put_dmabuf;
>  } else {
>  *prime_fd = ret;
>  ret = 0;
>  }
> 
>  goto out;
> 
> fail_put_dmabuf:
>  dma_buf_put(dmabuf);
> out:
> ...
> 
> > You could submit a patch to improve the documentation and explicitly
> > note on dma_buf_fd() that the reference is consumed, but all of this
> > is working perfectly fine.
> 
> Regards,
> Christian.
> 

Thanks for your reply!

Yes, DRM works fine because it fully understands what the dma-buf API
does. Improving the documentation is a really good idea to prevent this case.

But what I can't understand is this: for kernel APIs exported to other
users, don't we need to ensure that every API is safe?

And in the general case, shouldn't the dma-buf framework itself also
prevent this? That would make the framework more robust.
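
To make the ownership rule from the quoted reply concrete, here is a small
userspace model of the file count. It is only a sketch under stated
assumptions: struct file_model and the model_* helpers are made-up names, and
the real kernel paths (dma_buf_export(), dma_buf_fd(), mmap_region(),
dma_buf_put()) do far more than move a counter:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a file's reference count plus a release flag. */
struct file_model {
	long count;
	bool released;
};

static void model_get(struct file_model *f)
{
	f->count++;
}

static void model_put(struct file_model *f)
{
	assert(f->count > 0); /* a put without a matching reference is a bug */
	if (--f->count == 0)
		f->released = true;
}

/* Export: the creator starts out holding the one and only reference. */
static void model_export(struct file_model *f)
{
	f->count = 1;
	f->released = false;
}

/* Models dma_buf_fd(): count unchanged, the caller's reference moves to the fd. */
static void model_fd_install(struct file_model *f)
{
	(void)f;
}

/* Each scenario returns true if the buffer is still alive while it is mapped. */

static bool scenario_reference_handed_over(void)
{
	struct file_model f;

	model_export(&f);     /* count = 1, owned by the exporter */
	model_fd_install(&f); /* count = 1, now owned by the fd */
	model_get(&f);        /* mmap: count = 2, owned by the mapping */
	model_put(&f);        /* close(fd): count = 1 */
	return !f.released;
}

static bool scenario_extra_put(void)
{
	struct file_model f;

	model_export(&f);
	model_fd_install(&f);
	model_get(&f);        /* mmap: count = 2 */
	model_put(&f);        /* bug: kernel puts a reference it already gave away */
	model_put(&f);        /* close(fd): count = 0 while still mapped */
	return !f.released;
}

static bool scenario_kernel_keeps_own_reference(void)
{
	struct file_model f;

	model_export(&f);
	model_get(&f);        /* separate reference kept by the kernel: count = 2 */
	model_fd_install(&f);
	model_get(&f);        /* mmap: count = 3 */
	model_put(&f);        /* kernel drops its own reference: count = 2 */
	model_put(&f);        /* close(fd): count = 1 */
	return !f.released;
}
```

In this model only the middle scenario frees the buffer while it is mapped,
and it does so because of a put on a reference that was already handed to
the fd, not because fd installation fails to take a reference.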


BRs!
Guangming
> > 
> > > > Solution:
> > > >   Add a kernel count for dma_buf, and make sure the file
> > > > count
> > > >   of dma_buf.file hold by kernel is 1.
> > > > 
> > > > Notes: For this solution, kref couldn't work because kernel ref
> > > >  maybe added from 0, but kref don't allow it.
> > > > 
> > > > Signed-off-by: Guangming Cao 
> > > > ---
> > > >drivers/dma-buf/dma-buf.c | 23 +++
> > > >include/

Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-14 Thread Ville Syrjälä
On Wed, Jul 14, 2021 at 01:16:57PM +0200, Daniel Vetter wrote:
> On Tue, Jul 13, 2021 at 09:46:30PM +0300, Ville Syrjälä wrote:
> > On Tue, Jul 13, 2021 at 07:24:23PM +0100, Matthew Auld wrote:
> > > On Tue, 13 Jul 2021 at 18:47, Ville Syrjälä
> > >  wrote:
> > > >
> > > > On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
> > > > > On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
> > > > >  wrote:
> > > > > >
> > > > > > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > > > > > + /**
> > > > > > > +  * @cache_coherent:
> > > > > > > +  *
> > > > > > > +  * Track whether the pages are coherent with the GPU if 
> > > > > > > reading or
> > > > > > > +  * writing through the CPU cache.
> > > > > > > +  *
> > > > > > > +  * This largely depends on the @cache_level, for example if 
> > > > > > > the object
> > > > > > > +  * is marked as I915_CACHE_LLC, then GPU access is coherent 
> > > > > > > for both
> > > > > > > +  * reads and writes through the CPU cache.
> > > > > > > +  *
> > > > > > > +  * Note that on platforms with shared-LLC support(HAS_LLC) 
> > > > > > > reads through
> > > > > > > +  * the CPU cache are always coherent, regardless of the 
> > > > > > > @cache_level. On
> > > > > > > +  * snooping based platforms this is not the case, unless 
> > > > > > > the full
> > > > > > > +  * I915_CACHE_LLC or similar setting is used.
> > > > > > > +  *
> > > > > > > +  * As a result of this we need to track coherency 
> > > > > > > separately for reads
> > > > > > > +  * and writes, in order to avoid superfluous flushing on 
> > > > > > > shared-LLC
> > > > > > > +  * platforms, for reads.
> > > > > > > +  *
> > > > > > > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > > > > > > +  *
> > > > > > > +  * When reading through the CPU cache, the GPU is still 
> > > > > > > coherent. Note
> > > > > > > +  * that no data has actually been modified here, so it 
> > > > > > > might seem
> > > > > > > +  * strange that we care about this.
> > > > > > > +  *
> > > > > > > +  * As an example, if some object is mapped on the CPU with 
> > > > > > > write-back
> > > > > > > +  * caching, and we read some page, then the cache likely 
> > > > > > > now contains
> > > > > > > +  * the data from that read. At this point the cache and 
> > > > > > > main memory
> > > > > > > +  * match up, so all good. But next the GPU needs to write 
> > > > > > > some data to
> > > > > > > +  * that same page. Now if the @cache_level is 
> > > > > > > I915_CACHE_NONE and the
> > > > > > > +  * the platform doesn't have the shared-LLC, then the GPU 
> > > > > > > will
> > > > > > > +  * effectively skip invalidating the cache(or however that 
> > > > > > > works
> > > > > > > +  * internally) when writing the new value.  This is really 
> > > > > > > bad since the
> > > > > > > +  * GPU has just written some new data to main memory, but 
> > > > > > > the CPU cache
> > > > > > > +  * is still valid and now contains stale data. As a result 
> > > > > > > the next time
> > > > > > > +  * we do a cached read with the CPU, we are rewarded with 
> > > > > > > stale data.
> > > > > > > +  * Likewise if the cache is later flushed, we might be 
> > > > > > > rewarded with
> > > > > > > +  * overwriting main memory with stale data.
> > > > > > > +  *
> > > > > > > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > > > > > > +  *
> > > > > > > +  * When writing through the CPU cache, the GPU is still 
> > > > > > > coherent. Note
> > > > > > > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > > > > > > +  *
> > > > > > > +  * This is never set when I915_CACHE_NONE is used for 
> > > > > > > @cache_level,
> > > > > > > +  * where instead we have to manually flush the caches after 
> > > > > > > writing
> > > > > > > +  * through the CPU cache. For other cache levels this 
> > > > > > > should be set and
> > > > > > > +  * the object is therefore considered coherent for both 
> > > > > > > reads and writes
> > > > > > > +  * through the CPU cache.
> > > > > >
> > > > > > I don't remember why we have this read vs. write split and this new
> > > > > > documentation doesn't seem to really explain it either.
> > > > >
> > > > > Hmm, I attempted to explain that earlier:
> > > > >
> > > > > * Note that on platforms with shared-LLC support(HAS_LLC) reads 
> > > > > through
> > > > > * the CPU cache are always coherent, regardless of the @cache_level. 
> > > > > On
> > > > > * snooping based platforms this is not the case, unless the full
> > > > > * I915_CACHE_LLC or similar setting is used.
> > > > > *
> > > > > * As a result of this we need to track coherency separately for reads
> > > > > * and writes, in order to avoid superfluous flushing on shared-LLC
> > > > > * platforms, for reads.
> > > > >
> > > > > So AFAIK it's just because s

Re: [PATCH v4 3/4] drm/shmem-helpers: Allocate wc pages on x86

2021-07-14 Thread Christian König

On 13.07.21 at 22:51, Daniel Vetter wrote:

intel-gfx-ci realized that something is not quite coherent anymore on
some platforms for our i915+vgem tests, when I tried to switch vgem
over to shmem helpers.

After lots of head-scratching I realized that I've removed calls to
drm_clflush. And we need those. To make this a bit cleaner use the
same page allocation tooling as ttm, which does internally clflush
(and more, as needed on any platform instead of just the intel x86
cpus i915 can be combined with).

Unfortunately this doesn't exist on arm, or as a generic feature. For
that I think only the dma-api can get at wc memory reliably, so maybe
we'd need some kind of GFP_WC flag to do this properly.


The problem is that this stuff is extremely architecture specific. So 
GFP_WC and GFP_UNCACHED are really what we should aim for in the long term.


And as far as I know there are at least the following ways it can be 
implemented:

* A fixed number of registers which tell the CPU the caching behavior 
for a memory region, e.g. MTRR.
* Some bits of the memory pointers used, e.g. you see the same memory at 
different locations with different caching attributes.
* Some bits in the CPU's page table.
* Some bits in a separate page table.

On top of that there is the PCIe specification which defines non-cache 
snooping access as an extension.


Mixing that with the CPU caching behavior gets you some really nice ways 
to break a driver. In general x86 seems to be rather graceful, but arm 
and PowerPC are easily pissed if you mess that up.



Signed-off-by: Daniel Vetter 
Cc: Christian König 
Cc: "Thomas Hellström" 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 


Acked-by: Christian König 

Regards,
Christian.


---
  drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 296ab1b7c07f..657d2490aaa5 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -10,6 +10,10 @@
  #include 
  #include 
  
+#ifdef CONFIG_X86

+#include 
+#endif
+
  #include 
  #include 
  #include 
@@ -162,6 +166,11 @@ static int drm_gem_shmem_get_pages_locked(struct 
drm_gem_shmem_object *shmem)
return PTR_ERR(pages);
}
  
+#ifdef CONFIG_X86

+   if (shmem->map_wc)
+   set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
+#endif
+
shmem->pages = pages;
  
  	return 0;

@@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct 
drm_gem_shmem_object *shmem)
if (--shmem->pages_use_count > 0)
return;
  
+#ifdef CONFIG_X86

+   if (shmem->map_wc)
+   set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+#endif
+
drm_gem_put_pages(obj, shmem->pages,
  shmem->pages_mark_dirty_on_put,
  shmem->pages_mark_accessed_on_put);




Re: [PATCH v2 1/2] dt-bindings: display: rockchip: Add compatible for rk3568 HDMI

2021-07-14 Thread Robin Murphy

On 2021-07-14 10:19, Michael Riesch wrote:

Hello Heiko,

On 7/13/21 10:49 AM, Heiko Stübner wrote:

Hi Michael,

On Tuesday, 13 July 2021, 10:44:00 CEST, Michael Riesch wrote:

The HDMI TX block in the RK3568 requires two power supplies, which have
to be enabled in some cases (at least on the RK3568 EVB1 the voltages
VDDA0V9_IMAGE and VCCA1V8_IMAGE are disabled by default). It would be
great if this was considered by the driver and the device tree binding.
I am not sure, though, whether this is a RK3568 specific or
rockchip_dw_hdmi specific thing. Maybe it can even enter the Synopsys DW
HDMI driver.


I do remember that this discussion happened many years back already.
And yes the supplies are needed for all but back then there was opposition
as these are supposedly phy-related supplies, not for the dw-hdmi itself.
[There are variants with an external phy, like on the rk3328]

See discussion on [0]

[0] 
https://dri-devel.freedesktop.narkive.com/pen2zWo1/patch-v3-1-2-drm-bridge-dw-hdmi-support-optional-supply-regulators


Thanks for the pointer. My summary of this discussion would be the
following:

  - There was no consensus on how to handle the issue. The voltages still
have to be enabled from the outside of the driver.
  - Open question: rockchip-specific or general solution? (one may detect
a tendency towards a rockchip-specific solution)
  - Open question: separation of the phy from the dw_hdmi IP core?

First of all, IMHO the driver should enable those voltages, otherwise we
will have the same discussion again in 5-6 years :-)

Then, the rockchip,dw-hdmi binding features a property "phys",
presumably to handle external phys (e.g., for the RK3328). This fact and
the referenced discussion suggest a rockchip-specific solution.


FWIW I've long thought that cleaning up the phy situation in dw-hdmi 
would be a good idea. It's always seemed a bit sketchy that on RK3328 we 
still validate modes against the tables for the Synopsys phy which isn't 
relevant, and if that does allow a clock rate through that the actual 
phy rejects then things just go horribly wrong and the display breaks.



In the Rockchip documentation (at least for RK3328, RK3399 and RK3568),
there are two extra voltages denoted as "HDMI PHY analog power". It
would be tempting to add the internal phy to the device tree and glue it
to the dw-hdmi using the "phys" property. However, as pointed out in the
referenced discussion, the configuration registers of the phy are
somewhat interleaved with the dw-hdmi registers and a clear separation
may be tricky.


Conceptually I don't think there's any issue with the HDMI node being 
its own phy provider where appropriate. At the DT level it should simply 
be a case of having both sets of properties, e.g.:


&hdmi {
#phy-cells = <0>;
phys = <&hdmi>;
};

And at the driver level AFAICS it's pretty much just a case of dw-hdmi 
additionally registering itself as a phy provider if the internal phy is 
present - the only difference then should be that it can end up calling 
back into itself via the common phy API rather than directly via 
internal special-cases.


Robin.


As a more pragmatic alternative, we could add optional supplies to the
rockchip,dw-hdmi binding and evaluate the "phys" property. If the latter
is not specified, the internal phy is used and the supplies must be
enabled. Would such an approach be acceptable?
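
As a rough sketch of the decision this proposal implies (not driver code),
the logic could look like the following. The struct, the enum and the
function name are hypothetical, and treating "internal phy with no supplies
listed" as "assume always-on" is an assumption about what "optional" would
mean for existing device trees:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the driver found in the device tree node. */
struct hdmi_dt_model {
	bool has_phys_property; /* "phys" present: an external phy is used */
	bool has_phy_supplies;  /* optional analog supplies are described */
};

enum supply_action {
	SUPPLY_SKIP,      /* external phy: its own driver handles power */
	SUPPLY_ENABLE,    /* internal phy with supplies described: enable them */
	SUPPLY_ASSUME_ON, /* internal phy, nothing described: assume always-on */
};

static enum supply_action hdmi_supply_action(const struct hdmi_dt_model *dt)
{
	if (dt->has_phys_property)
		return SUPPLY_SKIP;

	return dt->has_phy_supplies ? SUPPLY_ENABLE : SUPPLY_ASSUME_ON;
}
```

The third case is what would keep device trees without the new supply
properties working unchanged.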

Best regards,
Michael


On 7/7/21 2:03 PM, Benjamin Gaignard wrote:

Define a new compatible for rk3568 HDMI.
This version of HDMI hardware block needs two new clocks hclk_vio and hclk
to provide phy reference clocks.

Signed-off-by: Benjamin Gaignard 
---
version 2:
- Add the clocks needed for the phy.

  .../bindings/display/rockchip/rockchip,dw-hdmi.yaml | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git 
a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml 
b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
index 75cd9c686e985..cb8643b3a8b84 100644
--- a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
+++ b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
@@ -23,6 +23,7 @@ properties:
- rockchip,rk3288-dw-hdmi
- rockchip,rk3328-dw-hdmi
- rockchip,rk3399-dw-hdmi
+  - rockchip,rk3568-dw-hdmi
  
reg-io-width:

  const: 4
@@ -51,8 +52,11 @@ properties:
- vpll
- enum:
- grf
+  - hclk_vio
+  - vpll
+  - enum:
+  - hclk
- vpll
-  - const: vpll


The description and documentation of the clocks are somewhat misleading
IMHO. This is not caused by your patches, of course. But maybe this is a
chance to clean them up a bit.

It seems that the CEC clock is an optional clock of the dw-hdmi driver.
Shouldn't it be documented in the synopsys,dw-hdmi.yaml?

Also, it would be nice if the clocks hclk_vio and hclk fe

Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-14 Thread Ville Syrjälä
On Wed, Jul 14, 2021 at 02:42:53PM +0300, Ville Syrjälä wrote:
> On Wed, Jul 14, 2021 at 01:16:57PM +0200, Daniel Vetter wrote:
> > On Tue, Jul 13, 2021 at 09:46:30PM +0300, Ville Syrjälä wrote:
> > > On Tue, Jul 13, 2021 at 07:24:23PM +0100, Matthew Auld wrote:
> > > > On Tue, 13 Jul 2021 at 18:47, Ville Syrjälä
> > > >  wrote:
> > > > >
> > > > > On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
> > > > > > On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > > > > > > + /**
> > > > > > > > +  * @cache_coherent:
> > > > > > > > +  *
> > > > > > > > +  * Track whether the pages are coherent with the GPU if 
> > > > > > > > reading or
> > > > > > > > +  * writing through the CPU cache.
> > > > > > > > +  *
> > > > > > > > +  * This largely depends on the @cache_level, for example 
> > > > > > > > if the object
> > > > > > > > +  * is marked as I915_CACHE_LLC, then GPU access is 
> > > > > > > > coherent for both
> > > > > > > > +  * reads and writes through the CPU cache.
> > > > > > > > +  *
> > > > > > > > +  * Note that on platforms with shared-LLC 
> > > > > > > > support(HAS_LLC) reads through
> > > > > > > > +  * the CPU cache are always coherent, regardless of the 
> > > > > > > > @cache_level. On
> > > > > > > > +  * snooping based platforms this is not the case, unless 
> > > > > > > > the full
> > > > > > > > +  * I915_CACHE_LLC or similar setting is used.
> > > > > > > > +  *
> > > > > > > > +  * As a result of this we need to track coherency 
> > > > > > > > separately for reads
> > > > > > > > +  * and writes, in order to avoid superfluous flushing on 
> > > > > > > > shared-LLC
> > > > > > > > +  * platforms, for reads.
> > > > > > > > +  *
> > > > > > > > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > > > > > > > +  *
> > > > > > > > +  * When reading through the CPU cache, the GPU is still 
> > > > > > > > coherent. Note
> > > > > > > > +  * that no data has actually been modified here, so it 
> > > > > > > > might seem
> > > > > > > > +  * strange that we care about this.
> > > > > > > > +  *
> > > > > > > > +  * As an example, if some object is mapped on the CPU 
> > > > > > > > with write-back
> > > > > > > > +  * caching, and we read some page, then the cache likely 
> > > > > > > > now contains
> > > > > > > > +  * the data from that read. At this point the cache and 
> > > > > > > > main memory
> > > > > > > > +  * match up, so all good. But next the GPU needs to write 
> > > > > > > > some data to
> > > > > > > > +  * that same page. Now if the @cache_level is 
> > > > > > > > I915_CACHE_NONE and the
> > > > > > > > +  * the platform doesn't have the shared-LLC, then the GPU 
> > > > > > > > will
> > > > > > > > +  * effectively skip invalidating the cache(or however 
> > > > > > > > that works
> > > > > > > > +  * internally) when writing the new value.  This is 
> > > > > > > > really bad since the
> > > > > > > > +  * GPU has just written some new data to main memory, but 
> > > > > > > > the CPU cache
> > > > > > > > +  * is still valid and now contains stale data. As a 
> > > > > > > > result the next time
> > > > > > > > +  * we do a cached read with the CPU, we are rewarded with 
> > > > > > > > stale data.
> > > > > > > > +  * Likewise if the cache is later flushed, we might be 
> > > > > > > > rewarded with
> > > > > > > > +  * overwriting main memory with stale data.
> > > > > > > > +  *
> > > > > > > > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > > > > > > > +  *
> > > > > > > > +  * When writing through the CPU cache, the GPU is still 
> > > > > > > > coherent. Note
> > > > > > > > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > > > > > > > +  *
> > > > > > > > +  * This is never set when I915_CACHE_NONE is used for 
> > > > > > > > @cache_level,
> > > > > > > > +  * where instead we have to manually flush the caches 
> > > > > > > > after writing
> > > > > > > > +  * through the CPU cache. For other cache levels this 
> > > > > > > > should be set and
> > > > > > > > +  * the object is therefore considered coherent for both 
> > > > > > > > reads and writes
> > > > > > > > +  * through the CPU cache.
> > > > > > >
> > > > > > > I don't remember why we have this read vs. write split and this 
> > > > > > > new
> > > > > > > documentation doesn't seem to really explain it either.
> > > > > >
> > > > > > Hmm, I attempted to explain that earlier:
> > > > > >
> > > > > > * Note that on platforms with shared-LLC support(HAS_LLC) reads 
> > > > > > through
> > > > > > * the CPU cache are always coherent, regardless of the 
> > > > > > @cache_level. On
> > > > > > * snooping based platforms this is not the case, unless the full
> > > > > > * I915_CACHE_LLC or 

Re: [PATCH] dma-buf: add kernel count for dma_buf

2021-07-14 Thread Christian König

Am 14.07.21 um 14:03 schrieb guangming@mediatek.com:

From: Guangming.Cao 

On Wed, 2021-07-14 at 12:43 +0200, Christian König wrote:

Am 14.07.21 um 11:44 schrieb guangming@mediatek.com:

From: Guangming Cao 

On Wed, 2021-07-14 at 10:46 +0200, Christian König wrote:

Am 14.07.21 um 09:11 schrieb guangming@mediatek.com:

From: Guangming Cao 

Add a refcount for kernel to prevent UAF(Use After Free) issue.

Well NAK on so many levels.


We can assume a case like below:
   1. kernel space alloc dma_buf(file count = 1)
   2. kernel use dma_buf to get fd(file count = 1)
   3. userspace use fd to do mapping (file count = 2)

Creating a userspace mapping increases the reference count for the
underlying file object.

See the implementation of mmap_region():
...
   vma->vm_file = get_file(file);
   error = call_mmap(file, vma);
...

What can happen is that the underlying exporter redirects the mmap to a
different file, e.g. TTM or GEM drivers do that all the time.

But this is fine since then the VA mapping is independent of the
DMA-buf.
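
The point about the mapping holding its own file reference can be
observed from plain userspace. The following is a minimal sketch using an
ordinary temporary file rather than a dma-buf; the helper name, the path
template and the 4096-byte length are arbitrary choices for illustration,
not anything from the patch under discussion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 0 if a shared mapping stays usable after its fd is closed,
 * -1 if any setup step fails. Uses a regular temp file, not a dma-buf.
 */
static int mapping_survives_close(void)
{
	char path[] = "/tmp/mmap-ref-XXXXXX";
	const size_t len = 4096;
	char *va;
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);		/* file now lives only through references */

	if (ftruncate(fd, len) != 0) {
		close(fd);
		return -1;
	}

	va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (va == MAP_FAILED) {
		close(fd);
		return -1;
	}

	close(fd);		/* drops only the reference held by the fd */

	strcpy(va, "still mapped");	/* the mapping keeps the file alive */
	ok = strcmp(va, "still mapped") == 0;

	munmap(va, len);	/* the mapping's reference goes away here */
	return ok ? 0 : -1;
}
```

Closing the fd does not tear down the mapping; the backing file is only
released once munmap() drops the reference taken at mmap() time.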


   4. kernel call dma_buf_put (file count = 1)
   5. userspace close buffer fd(file count = 0)
   6. at this time, buffer is released, but va is valid!!
  So we still can read/write buffer via mmap va,
  it may cause memory leak, or kernel exception.
  And also, if we use "ls -ll" to watch corresponding process
  fd link info, it also will cause kernel exception.

Another case:
Using dma_buf_fd to generate more than 1 fd, because
dma_buf_fd will not increase file count, thus, when closing
the second fd, an error may occur.

Each opened fd will increase the reference count, so what you describe
here is certainly not correct.

Regards,
Christian.


Yes, mmap will increase the file count by calling get_file, so from
step [2] to step [3] the file count increases by 1.

But dma_buf_fd() will not increase the file count.
The function "dma_buf_fd(struct dma_buf *dmabuf, int flags)" just gets
an unused fd by calling "get_unused_fd_flags(flags)" and then calls
"fd_install(fd, dmabuf->file)", which makes the associated "struct file *"
in the task's fdt->fd[fd] point to this dma_buf.file without increasing
the file count of dma_buf.file.
I think this is confusing: I can get more than one fd via dma_buf_fd,
but they don't need to be closed, because they don't increase the file
count.

However, dma_buf_put() can decrease the file count on the kernel side
directly.
If somebody writes a ko (kernel module) that puts the file count of
dma_buf.file too many times, the buffer will be freed earlier than
expected. At least on Android, I think this is a little bit dangerous.

dma_buf_fd() takes the dma_buf pointer and converts it into a fd. So the
reference is consumed.

That's why users of this interface make sure to get a separate
reference, see drm_gem_prime_handle_to_fd() for example:

...
out_have_handle:
  ret = dma_buf_fd(dmabuf, flags);
  /*
   * We must _not_ remove the buffer from the handle cache since the newly
   * created dma buf is already linked in the global obj->dma_buf pointer,
   * and that is invariant as long as a userspace gem handle exists.
   * Closing the handle will clean out the cache anyway, so we don't leak.
   */
  if (ret < 0) {
          goto fail_put_dmabuf;
  } else {
          *prime_fd = ret;
          ret = 0;
  }

  goto out;

fail_put_dmabuf:
  dma_buf_put(dmabuf);
out:
...

You could submit a patch to improve the documentation and explicitly
note on dma_buf_fd() that the reference is consumed, but all of this is
working perfectly fine.

Regards,
Christian.
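
The "reference is consumed" rule can be modelled in a few lines of
userspace C. This is only a toy sketch under stated assumptions: struct
toy_buf and every toy_* helper are hypothetical stand-ins for the file
count, get_dma_buf(), dma_buf_put(), dma_buf_fd() and close(); none of
them is a kernel API.

```c
#include <assert.h>

/* Toy stand-in for the struct file behind a dma_buf; not kernel code. */
struct toy_buf {
	int refs;	/* models the file count */
	int freed;	/* set once the last reference is dropped */
};

static void toy_export(struct toy_buf *b) { b->refs = 1; b->freed = 0; }
static void toy_get(struct toy_buf *b)    { b->refs++; }
static void toy_put(struct toy_buf *b)    { if (--b->refs == 0) b->freed = 1; }

/* Models dma_buf_fd(): the caller's reference now belongs to the fd. */
static void toy_fd(struct toy_buf *b)     { (void)b; }
/* Models userspace close(fd): drops the reference the fd owns. */
static void toy_close(struct toy_buf *b)  { toy_put(b); }

/* Returns 1 if the buffer was freed while the fd was still open. */
static int put_after_fd_without_own_ref(void)
{
	struct toy_buf b;

	toy_export(&b);		/* refs = 1 */
	toy_fd(&b);		/* refs = 1, now owned by the fd */
	toy_put(&b);		/* caller puts a reference it no longer owns */
	return b.freed;		/* freed underneath the still-open fd */
}

/* Same sequence, but the caller takes a separate reference first. */
static int put_after_fd_with_own_ref(void)
{
	struct toy_buf b;
	int freed_early;

	toy_export(&b);		/* refs = 1 */
	toy_get(&b);		/* refs = 2: one for the caller, one for the fd */
	toy_fd(&b);		/* the fd consumes one of them */
	toy_put(&b);		/* refs = 1: caller drops its own reference */
	freed_early = b.freed;	/* the fd still holds a reference */
	toy_close(&b);		/* refs = 0: freed only now */
	return freed_early;
}
```

In this model the first sequence is the misuse described in the patch
description, and the second is the pattern that callers such as
drm_gem_prime_handle_to_fd() follow by holding their own reference.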


Thanks for your reply!

Yes, drm works fine because it fully understands what the dma-buf API
will do. Improving the documentation is a really good idea to prevent
this case.

But what I can't understand is this: for a kernel API exported to its
users, don't we need to ensure that every call is safe?


Well the API is perfectly safe, it is just not what you are expecting.


And in the general case, shouldn't the dma-buf framework itself prevent
this? That would make the framework more robust.


What we could do is to move getting the reference into that function if 
all users of that function do that anyway.


This would then be more defensive because new users of dma_buf_fd() 
can't forget to grab a reference.


But this needs a complete audit of the kernel with all of the users of 
dma_buf_fd().


Regards,
Christian.
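
A rough userspace sketch of that idea, and of why the audit matters.
Everything here is hypothetical: toy_fd_takes_ref() models a dma_buf_fd()
variant that grabs its own reference, which is not how the function
behaves in the kernel being discussed.

```c
#include <assert.h>

/* Toy stand-in for the file count behind a dma_buf; not kernel code. */
struct toy_buf {
	int refs;
	int freed;
};

static void toy_put(struct toy_buf *b) { if (--b->refs == 0) b->freed = 1; }

/* Hypothetical defensive variant: each fd gets a reference of its own. */
static void toy_fd_takes_ref(struct toy_buf *b) { b->refs++; }

/*
 * Hand out nfds fds, then drop the caller's reference.
 * Returns the remaining count, or -1 if the buffer was freed.
 */
static int refs_after_n_fds(int nfds)
{
	struct toy_buf b = { .refs = 1, .freed = 0 };
	int i;

	for (i = 0; i < nfds; i++)
		toy_fd_takes_ref(&b);
	toy_put(&b);		/* caller is done with the buffer */
	return b.freed ? -1 : b.refs;
}

/* A caller that was not converted and still takes its own extra reference. */
static int leaked_refs_if_caller_not_audited(void)
{
	struct toy_buf b = { .refs = 1, .freed = 0 };

	b.refs++;		/* old-style separate reference meant for the fd */
	toy_fd_takes_ref(&b);	/* refs = 3: the new variant adds another */
	toy_put(&b);		/* close(fd): refs = 2 */
	toy_put(&b);		/* caller drops its original reference: refs = 1 */
	return b.refs;		/* one reference is left over and never dropped */
}
```

With the defensive variant every fd is backed by its own reference, but a
caller that keeps the old pattern ends up leaking one, which is why all
existing users would have to be converted together.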




BRs!
Guangming

Solution:
   Add a kernel count for dma_buf, and make sure the file count
   of dma_buf.file held by kernel is 1.

Notes: For this solution, kref couldn't work because kernel ref
  may be added from 0, but kref doesn't allow it.

Signed-off-by: Guangming Cao 
---
drivers/dma-buf/dma-buf.c | 23 +++
include/linux/dma-buf.h   |  6 --
2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dm

[PATCH v2 resent] drm/i915: Add TTM offset argument to mmap.

2021-07-14 Thread Maarten Lankhorst
The FIXED mapping is only used for ttm, and tells userspace that the
mapping type is pre-defined. This disables the other type of mmap
offsets when discrete memory is used, so fix the selftests as well.

Document the struct as well, so it shows up in docbook.

Cc: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
Signed-off-by: Maarten Lankhorst 
---
Resent, forgot to cc dri-devel

 drivers/gpu/drm/i915/gem/i915_gem_mman.c  | 17 ++-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 .../drm/i915/gem/selftests/i915_gem_mman.c| 27 ++-
 include/uapi/drm/i915_drm.h   | 46 ++-
 4 files changed, 77 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index a90f796e85c0..31c4021bb6be 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -679,10 +679,16 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj,
return -ENODEV;
 
if (obj->ops->mmap_offset)  {
+   if (mmap_type != I915_MMAP_TYPE_FIXED)
+   return -ENODEV;
+
*offset = obj->ops->mmap_offset(obj);
return 0;
}
 
+   if (mmap_type == I915_MMAP_TYPE_FIXED)
+   return -ENODEV;
+
if (mmap_type != I915_MMAP_TYPE_GTT &&
!i915_gem_object_has_struct_page(obj) &&
!i915_gem_object_has_iomem(obj))
@@ -727,7 +733,9 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
 {
enum i915_mmap_type mmap_type;
 
-   if (boot_cpu_has(X86_FEATURE_PAT))
+   if (HAS_LMEM(to_i915(dev)))
+   mmap_type = I915_MMAP_TYPE_FIXED;
+   else if (boot_cpu_has(X86_FEATURE_PAT))
mmap_type = I915_MMAP_TYPE_WC;
else if (!i915_ggtt_has_aperture(&to_i915(dev)->ggtt))
return -ENODEV;
@@ -798,6 +806,10 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void 
*data,
type = I915_MMAP_TYPE_UC;
break;
 
+   case I915_MMAP_OFFSET_FIXED:
+   type = I915_MMAP_TYPE_FIXED;
+   break;
+
default:
return -EINVAL;
}
@@ -968,6 +980,9 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct 
*vma)
vma->vm_ops = &vm_ops_cpu;
break;
 
+   case I915_MMAP_TYPE_FIXED:
+   GEM_WARN_ON(1);
+   /* fall-through */
case I915_MMAP_TYPE_WB:
vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
vma->vm_ops = &vm_ops_cpu;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index ef3de2ae9723..afbadfc5516b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -105,6 +105,7 @@ enum i915_mmap_type {
I915_MMAP_TYPE_WC,
I915_MMAP_TYPE_WB,
I915_MMAP_TYPE_UC,
+   I915_MMAP_TYPE_FIXED,
 };
 
 struct i915_mmap_offset {
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 1da8bd675e54..52789c8ad337 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -573,6 +573,14 @@ static int make_obj_busy(struct drm_i915_gem_object *obj)
return 0;
 }
 
+static enum i915_mmap_type default_mapping(struct drm_i915_private *i915)
+{
+   if (HAS_LMEM(i915))
+   return I915_MMAP_TYPE_FIXED;
+
+   return I915_MMAP_TYPE_GTT;
+}
+
 static bool assert_mmap_offset(struct drm_i915_private *i915,
   unsigned long size,
   int expected)
@@ -585,7 +593,7 @@ static bool assert_mmap_offset(struct drm_i915_private 
*i915,
if (IS_ERR(obj))
return expected && expected == PTR_ERR(obj);
 
-   ret = __assign_mmap_offset(obj, I915_MMAP_TYPE_GTT, &offset, NULL);
+   ret = __assign_mmap_offset(obj, default_mapping(i915), &offset, NULL);
i915_gem_object_put(obj);
 
return ret == expected;
@@ -689,7 +697,7 @@ static int igt_mmap_offset_exhaustion(void *arg)
goto out;
}
 
-   err = __assign_mmap_offset(obj, I915_MMAP_TYPE_GTT, &offset, NULL);
+   err = __assign_mmap_offset(obj, default_mapping(i915), &offset, NULL);
if (err) {
pr_err("Unable to insert object into reclaimed hole\n");
goto err_obj;
@@ -831,8 +839,14 @@ static int wc_check(struct drm_i915_gem_object *obj)
 
 static bool can_mmap(struct drm_i915_gem_object *obj, enum i915_mmap_type type)
 {
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
bool no_map;
 
+   if (HAS_LMEM(i915))
+   return type == I915_MMAP_TYPE_FIXED;
+   else if (type == I915_MMAP_TYPE_FIXED)
+   return false;
+
if 

[PATCH] dma-buf: support users to change dma_buf.name

2021-07-14 Thread guangming.cao
From: Guangming Cao 

Userspace can call DMA_BUF_SET_NAME to set dma_buf.name; also add a
kernel API so that kernel users can do the same thing.

Signed-off-by: Guangming Cao 
---
 drivers/dma-buf/dma-buf.c | 28 ++--
 include/linux/dma-buf.h   |  1 +
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 511fe0d217a0..949af232c644 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -331,20 +331,20 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)
  * purpose between different devices.
  *
  * @dmabuf: [in] dmabuf buffer that will be renamed.
- * @buf:[in] A piece of userspace memory that contains the name of
+ * @buf:[in] A piece of memory that contains the name of
  *   the dma-buf.
  *
  * Returns 0 on success. If the dma-buf buffer is already attached to
  * devices, return -EBUSY.
  *
  */
-static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
+long dma_buf_set_name(struct dma_buf *dmabuf, const char *buf)
 {
-   char *name = strndup_user(buf, DMA_BUF_NAME_LEN);
+   char *name = kstrndup(buf, DMA_BUF_NAME_LEN, GFP_KERNEL);
long ret = 0;
 
-   if (IS_ERR(name))
-   return PTR_ERR(name);
+   if (!name)
+   return -ENOMEM;
 
dma_resv_lock(dmabuf->resv, NULL);
if (!list_empty(&dmabuf->attachments)) {
@@ -361,6 +361,22 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const 
char __user *buf)
dma_resv_unlock(dmabuf->resv);
return ret;
 }
+EXPORT_SYMBOL_GPL(dma_buf_set_name);
+
+static long
+dma_buf_set_name_user(struct dma_buf *dmabuf, const char __user *buf)
+{
+   char *name = strndup_user(buf, DMA_BUF_NAME_LEN);
+   long ret = 0;
+
+   if (IS_ERR(name))
+   return PTR_ERR(name);
+
+   ret = dma_buf_set_name(dmabuf, name);
+   kfree(name);
+
+   return ret;
+}
 
 static long dma_buf_ioctl(struct file *file,
  unsigned int cmd, unsigned long arg)
@@ -403,7 +419,7 @@ static long dma_buf_ioctl(struct file *file,
 
case DMA_BUF_SET_NAME_A:
case DMA_BUF_SET_NAME_B:
-   return dma_buf_set_name(dmabuf, (const char __user *)arg);
+   return dma_buf_set_name_user(dmabuf, (const char __user *)arg);
 
default:
return -ENOTTY;
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index efdc56b9d95f..e6612ab59a59 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -507,4 +507,5 @@ int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
 unsigned long);
 int dma_buf_vmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
 void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
+long dma_buf_set_name(struct dma_buf *dmabuf, const char *name);
 #endif /* __DMA_BUF_H__ */
-- 
2.17.1



Re: [PATCH] dma-buf: support users to change dma_buf.name

2021-07-14 Thread Christian König

Am 14.07.21 um 14:29 schrieb guangming@mediatek.com:

From: Guangming Cao 

Userspace can call DMA_BUF_SET_NAME to set dma_buf.name; also add a
kernel API so that kernel users can do the same thing.


Well if you want to add a kernel API to set the name you also need to 
provide a user for this.


Christian.



Signed-off-by: Guangming Cao 
---
  drivers/dma-buf/dma-buf.c | 28 ++--
  include/linux/dma-buf.h   |  1 +
  2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 511fe0d217a0..949af232c644 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -331,20 +331,20 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)
   * purpose between different devices.
   *
   * @dmabuf: [in] dmabuf buffer that will be renamed.
- * @buf:[in] A piece of userspace memory that contains the name of
+ * @buf:[in] A piece of memory that contains the name of
   *   the dma-buf.
   *
   * Returns 0 on success. If the dma-buf buffer is already attached to
   * devices, return -EBUSY.
   *
   */
-static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
+long dma_buf_set_name(struct dma_buf *dmabuf, const char *buf)
  {
-   char *name = strndup_user(buf, DMA_BUF_NAME_LEN);
+   char *name = kstrndup(buf, DMA_BUF_NAME_LEN, GFP_KERNEL);
long ret = 0;
  
-   if (IS_ERR(name))
-   return PTR_ERR(name);
+   if (!name)
+   return -ENOMEM;
  
dma_resv_lock(dmabuf->resv, NULL);
if (!list_empty(&dmabuf->attachments)) {
@@ -361,6 +361,22 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const 
char __user *buf)
dma_resv_unlock(dmabuf->resv);
return ret;
  }
+EXPORT_SYMBOL_GPL(dma_buf_set_name);
+
+static long
+dma_buf_set_name_user(struct dma_buf *dmabuf, const char __user *buf)
+{
+   char *name = strndup_user(buf, DMA_BUF_NAME_LEN);
+   long ret = 0;
+
+   if (IS_ERR(name))
+   return PTR_ERR(name);
+
+   ret = dma_buf_set_name(dmabuf, name);
+   kfree(name);
+
+   return ret;
+}
  
  static long dma_buf_ioctl(struct file *file,
  unsigned int cmd, unsigned long arg)
@@ -403,7 +419,7 @@ static long dma_buf_ioctl(struct file *file,
  
case DMA_BUF_SET_NAME_A:
case DMA_BUF_SET_NAME_B:
-   return dma_buf_set_name(dmabuf, (const char __user *)arg);
+   return dma_buf_set_name_user(dmabuf, (const char __user *)arg);
  
default:
return -ENOTTY;
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index efdc56b9d95f..e6612ab59a59 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -507,4 +507,5 @@ int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
 unsigned long);
  int dma_buf_vmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
  void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
+long dma_buf_set_name(struct dma_buf *dmabuf, const char *name);
  #endif /* __DMA_BUF_H__ */




[PATCH] drm/vgem: use shmem helpers

2021-07-14 Thread Daniel Vetter
Aside from deleting lots of code the real motivation here is to switch
the mmap over to VM_PFNMAP, to be more consistent with what real gpu
drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
work, and even if you try and there's a struct page behind that,
touching it and mucking around with its refcount can upset drivers
real bad.

v2: Review from Thomas:
- sort #include
- drop more dead code that I didn't spot somehow

v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)

v4: I got tricked by 0cf2ef46c6c0 ("drm/shmem-helper: Use cached
mappings by default"), and we need WC in vgem because vgem doesn't
have explicit begin/end cpu access ioctls.

Also add a comment why exactly vgem has to use wc.

v5: Don't set obj->base.funcs, it will default to drm_gem_shmem_funcs
(Thomas)

v6: vgem also needs an MMU for remapping

v7: I absolutely butchered the rebases over the vgem mmap change and
revert and broke the patch. Actually go back to v6 from before the
vgem mmap changes.

Cc: Thomas Zimmermann 
Acked-by: Thomas Zimmermann 
Cc: John Stultz 
Cc: Sumit Semwal 
Cc: "Christian König" 
Signed-off-by: Daniel Vetter 
Cc: Melissa Wen 
Cc: Chris Wilson 
---
 drivers/gpu/drm/Kconfig |   5 +-
 drivers/gpu/drm/vgem/vgem_drv.c | 342 ++--
 2 files changed, 16 insertions(+), 331 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 314eefa39892..28f7d2006e8b 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -272,7 +272,8 @@ source "drivers/gpu/drm/kmb/Kconfig"
 
 config DRM_VGEM
tristate "Virtual GEM provider"
-   depends on DRM
+   depends on DRM && MMU
+   select DRM_GEM_SHMEM_HELPER
help
  Choose this option to get a virtual graphics memory manager,
  as used by Mesa's software renderer for enhanced performance.
@@ -280,7 +281,7 @@ config DRM_VGEM
 
 config DRM_VKMS
tristate "Virtual KMS (EXPERIMENTAL)"
-   depends on DRM
+   depends on DRM && MMU
select DRM_KMS_HELPER
select DRM_GEM_SHMEM_HELPER
select CRC32
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index bf38a7e319d1..a87eafa89e9f 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -38,6 +38,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,87 +51,11 @@
 #define DRIVER_MAJOR   1
 #define DRIVER_MINOR   0
 
-static const struct drm_gem_object_funcs vgem_gem_object_funcs;
-
 static struct vgem_device {
struct drm_device drm;
struct platform_device *platform;
 } *vgem_device;
 
-static void vgem_gem_free_object(struct drm_gem_object *obj)
-{
-   struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
-
-   kvfree(vgem_obj->pages);
-   mutex_destroy(&vgem_obj->pages_lock);
-
-   if (obj->import_attach)
-   drm_prime_gem_destroy(obj, vgem_obj->table);
-
-   drm_gem_object_release(obj);
-   kfree(vgem_obj);
-}
-
-static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
-{
-   struct vm_area_struct *vma = vmf->vma;
-   struct drm_vgem_gem_object *obj = vma->vm_private_data;
-   /* We don't use vmf->pgoff since that has the fake offset */
-   unsigned long vaddr = vmf->address;
-   vm_fault_t ret = VM_FAULT_SIGBUS;
-   loff_t num_pages;
-   pgoff_t page_offset;
-   page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
-
-   num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
-
-   if (page_offset >= num_pages)
-   return VM_FAULT_SIGBUS;
-
-   mutex_lock(&obj->pages_lock);
-   if (obj->pages) {
-   get_page(obj->pages[page_offset]);
-   vmf->page = obj->pages[page_offset];
-   ret = 0;
-   }
-   mutex_unlock(&obj->pages_lock);
-   if (ret) {
-   struct page *page;
-
-   page = shmem_read_mapping_page(
-   file_inode(obj->base.filp)->i_mapping,
-   page_offset);
-   if (!IS_ERR(page)) {
-   vmf->page = page;
-   ret = 0;
-   } else switch (PTR_ERR(page)) {
-   case -ENOSPC:
-   case -ENOMEM:
-   ret = VM_FAULT_OOM;
-   break;
-   case -EBUSY:
-   ret = VM_FAULT_RETRY;
-   break;
-   case -EFAULT:
-   case -EINVAL:
-   ret = VM_FAULT_SIGBUS;
-   break;
-   default:
-   WARN_ON(PTR_ERR(page));
-   ret = VM_FAULT_SIGBUS;
-   break;
-   }
-
-   }
-   return ret;
-}
-
-static const struct vm_op

Re: [PATCH v4 3/4] drm/shmem-helpers: Allocate wc pages on x86

2021-07-14 Thread Daniel Vetter
On Wed, Jul 14, 2021 at 01:54:50PM +0200, Christian König wrote:
> Am 13.07.21 um 22:51 schrieb Daniel Vetter:
> > intel-gfx-ci realized that something is not quite coherent anymore on
> > some platforms for our i915+vgem tests, when I tried to switch vgem
> > over to shmem helpers.
> > 
> > After lots of head-scratching I realized that I've removed calls to
> > drm_clflush. And we need those. To make this a bit cleaner use the
> > same page allocation tooling as ttm, which does internally clflush
> > (and more, as needed on any platform instead of just the intel x86
> > cpus i915 can be combined with).
> > 
> > Unfortunately this doesn't exist on arm, or as a generic feature. For
> > that I think only the dma-api can get at wc memory reliably, so maybe
> > we'd need some kind of GFP_WC flag to do this properly.
> 
> The problem is that this stuff is extremely architecture specific. So GFP_WC
> and GFP_UNCACHED are really what we should aim for in the long term.
> 
> And as far as I know we have at least the following possibilities how it is
> implemented:
> 
> * A fixed amount of registers which tells the CPU the caching behavior for a
> memory region, e.g. MTRR.
> * Some bits of the memory pointers used, e.g. you see the same memory at
> different locations with different caching attributes.
> * Some bits in the CPUs page table.
> * Some bits in a separate page table.
> 
> On top of that there is the PCIe specification which defines non-cache
> snooping access as an extension.

Yeah dma-buf is extremely ill-defined even on x86 if you combine these
all. We just play a game of whack-a-mole with the cacheline dirt until
it's gone.

That's the other piece here, how do you even make sure that the page is
properly flushed and ready for wc access:
- easy case is x86 with clflush available pretty much everywhere (since
  10+ years at least)
- next are cpus which have some cache flush instructions, but it's highly
  cpu model specific
- next up is the same, but you absolutely have to make sure there's no
  other mapping around anymore or the coherency fabric just dies
- and I'm pretty sure there's worse stuff where you de facto can only
  allocate wc memory that's set aside at boot-up and that's all you ever
  get.

Cheers, Daniel

> Mixing that with the CPU caching behavior gets you some really nice ways to
> break a driver. In general x86 seems to be rather graceful, but arm and
> PowerPC are easily pissed if you mess that up.
> 
> > Signed-off-by: Daniel Vetter 
> > Cc: Christian König 
> > Cc: "Thomas Hellström" 
> > Cc: Maarten Lankhorst 
> > Cc: Maxime Ripard 
> > Cc: Thomas Zimmermann 
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> 
> Acked-by: Christian könig 
> 
> Regards,
> Christian.
> 
> > ---
> >   drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++
> >   1 file changed, 14 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> > b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index 296ab1b7c07f..657d2490aaa5 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -10,6 +10,10 @@
> >   #include 
> >   #include 
> > +#ifdef CONFIG_X86
> > +#include 
> > +#endif
> > +
> >   #include 
> >   #include 
> >   #include 
> > @@ -162,6 +166,11 @@ static int drm_gem_shmem_get_pages_locked(struct 
> > drm_gem_shmem_object *shmem)
> > return PTR_ERR(pages);
> > }
> > +#ifdef CONFIG_X86
> > +   if (shmem->map_wc)
> > +   set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
> > +#endif
> > +
> > shmem->pages = pages;
> > return 0;
> > @@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct 
> > drm_gem_shmem_object *shmem)
> > if (--shmem->pages_use_count > 0)
> > return;
> > +#ifdef CONFIG_X86
> > +   if (shmem->map_wc)
> > +   set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
> > +#endif
> > +
> > drm_gem_put_pages(obj, shmem->pages,
> >   shmem->pages_mark_dirty_on_put,
> >   shmem->pages_mark_accessed_on_put);
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: nouveau: failed to initialise sync

2021-07-14 Thread Kirill A. Shutemov
On Tue, Jul 06, 2021 at 08:58:37AM +0200, Christian König wrote:
> Hi guys,
> 
> yes nouveau was using the same functionality for internal BOs without
> noticing it. This is fixed by the following commit:
> 
> commit d098775ed44021293b1962dea61efb19297b8d02
> Author: Christian König 
> Date:   Wed Jun 9 19:25:56 2021 +0200
> 
>     drm/nouveau: init the base GEM fields for internal BOs
> 
>     TTMs buffer objects are based on GEM objects for quite a while
>     and rely on initializing those fields before initializing the TTM BO.
> 
>     Nouveau now doesn't init the GEM object for internally allocated BOs,
>     so make sure that we at least initialize some necessary fields.
> 
> Could be that the patch needs to be sent to stable as well.

The regression is present in v5.14-rc1. Any idea when it will hit
upstream? I don't see it being applied to drm-next.

-- 
 Kirill A. Shutemov


Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-14 Thread Daniel Vetter
On Wed, Jul 14, 2021 at 02:42:53PM +0300, Ville Syrjälä wrote:
> On Wed, Jul 14, 2021 at 01:16:57PM +0200, Daniel Vetter wrote:
> > On Tue, Jul 13, 2021 at 09:46:30PM +0300, Ville Syrjälä wrote:
> > > On Tue, Jul 13, 2021 at 07:24:23PM +0100, Matthew Auld wrote:
> > > > On Tue, 13 Jul 2021 at 18:47, Ville Syrjälä
> > > >  wrote:
> > > > >
> > > > > On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
> > > > > > On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > > > > > > + /**
> > > > > > > > +  * @cache_coherent:
> > > > > > > > +  *
> > > > > > > > +  * Track whether the pages are coherent with the GPU if 
> > > > > > > > reading or
> > > > > > > > +  * writing through the CPU cache.
> > > > > > > > +  *
> > > > > > > > +  * This largely depends on the @cache_level, for example 
> > > > > > > > if the object
> > > > > > > > +  * is marked as I915_CACHE_LLC, then GPU access is 
> > > > > > > > coherent for both
> > > > > > > > +  * reads and writes through the CPU cache.
> > > > > > > > +  *
> > > > > > > > +  * Note that on platforms with shared-LLC 
> > > > > > > > support(HAS_LLC) reads through
> > > > > > > > +  * the CPU cache are always coherent, regardless of the 
> > > > > > > > @cache_level. On
> > > > > > > > +  * snooping based platforms this is not the case, unless 
> > > > > > > > the full
> > > > > > > > +  * I915_CACHE_LLC or similar setting is used.
> > > > > > > > +  *
> > > > > > > > +  * As a result of this we need to track coherency 
> > > > > > > > separately for reads
> > > > > > > > +  * and writes, in order to avoid superfluous flushing on 
> > > > > > > > shared-LLC
> > > > > > > > +  * platforms, for reads.
> > > > > > > > +  *
> > > > > > > > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > > > > > > > +  *
> > > > > > > > +  * When reading through the CPU cache, the GPU is still 
> > > > > > > > coherent. Note
> > > > > > > > +  * that no data has actually been modified here, so it 
> > > > > > > > might seem
> > > > > > > > +  * strange that we care about this.
> > > > > > > > +  *
> > > > > > > > +  * As an example, if some object is mapped on the CPU 
> > > > > > > > with write-back
> > > > > > > > +  * caching, and we read some page, then the cache likely 
> > > > > > > > now contains
> > > > > > > > +  * the data from that read. At this point the cache and 
> > > > > > > > main memory
> > > > > > > > +  * match up, so all good. But next the GPU needs to write 
> > > > > > > > some data to
> > > > > > > > +  * that same page. Now if the @cache_level is 
> > > > > > > > I915_CACHE_NONE and the
> > > > > > > > +  * the platform doesn't have the shared-LLC, then the GPU 
> > > > > > > > will
> > > > > > > > +  * effectively skip invalidating the cache(or however 
> > > > > > > > that works
> > > > > > > > +  * internally) when writing the new value.  This is 
> > > > > > > > really bad since the
> > > > > > > > +  * GPU has just written some new data to main memory, but 
> > > > > > > > the CPU cache
> > > > > > > > +  * is still valid and now contains stale data. As a 
> > > > > > > > result the next time
> > > > > > > > +  * we do a cached read with the CPU, we are rewarded with 
> > > > > > > > stale data.
> > > > > > > > +  * Likewise if the cache is later flushed, we might be 
> > > > > > > > rewarded with
> > > > > > > > +  * overwriting main memory with stale data.
> > > > > > > > +  *
> > > > > > > > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > > > > > > > +  *
> > > > > > > > +  * When writing through the CPU cache, the GPU is still 
> > > > > > > > coherent. Note
> > > > > > > > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > > > > > > > +  *
> > > > > > > > +  * This is never set when I915_CACHE_NONE is used for 
> > > > > > > > @cache_level,
> > > > > > > > +  * where instead we have to manually flush the caches 
> > > > > > > > after writing
> > > > > > > > +  * through the CPU cache. For other cache levels this 
> > > > > > > > should be set and
> > > > > > > > +  * the object is therefore considered coherent for both 
> > > > > > > > reads and writes
> > > > > > > > +  * through the CPU cache.
> > > > > > >
> > > > > > > I don't remember why we have this read vs. write split and this 
> > > > > > > new
> > > > > > > documentation doesn't seem to really explain it either.
> > > > > >
> > > > > > Hmm, I attempted to explain that earlier:
> > > > > >
> > > > > > * Note that on platforms with shared-LLC support(HAS_LLC) reads 
> > > > > > through
> > > > > > * the CPU cache are always coherent, regardless of the 
> > > > > > @cache_level. On
> > > > > > * snooping based platforms this is not the case, unless the full
> > > > > > * I915_CACHE_LLC or 

Re: [PATCH v4 3/4] drm/shmem-helpers: Allocate wc pages on x86

2021-07-14 Thread Christian König

Am 14.07.21 um 14:48 schrieb Daniel Vetter:

On Wed, Jul 14, 2021 at 01:54:50PM +0200, Christian König wrote:

Am 13.07.21 um 22:51 schrieb Daniel Vetter:

intel-gfx-ci realized that something is not quite coherent anymore on
some platforms for our i915+vgem tests, when I tried to switch vgem
over to shmem helpers.

After lots of head-scratching I realized that I've removed calls to
drm_clflush. And we need those. To make this a bit cleaner use the
same page allocation tooling as ttm, which does internally clflush
(and more, as needed on any platform instead of just the intel x86
cpus i915 can be combined with).

Unfortunately this doesn't exist on arm, or as a generic feature. For
that I think only the dma-api can get at wc memory reliably, so maybe
we'd need some kind of GFP_WC flag to do this properly.

The problem is that this stuff is extremely architecture specific. So GFP_WC
and GFP_UNCACHED are really what we should aim for in the long term.

And as far as I know there are at least the following ways in which this is
implemented:

* A fixed number of registers which tell the CPU the caching behavior for a
memory region, e.g. MTRR.
* Some bits of the memory pointers used, e.g. you see the same memory at
different locations with different caching attributes.
* Some bits in the CPUs page table.
* Some bits in a separate page table.

On top of that there is the PCIe specification which defines non-cache
snooping access as an extension.

Yeah dma-buf is extremely ill-defined even on x86 if you combine these
all. We just play a game of whack-a-mole with the cacheline dirt until
it's gone.

That's the other piece here, how do you even make sure that the page is
properly flushed and ready for wc access:
- easy case is x86 with clflush available pretty much everywhere (since
   10+ years at least)
- next are cpus which have some cache flush instructions, but it's highly
   cpu model specific
- next up is the same, but you absolutely have to make sure there's no
   other mapping around anymore or the coherency fabric just dies
- and I'm pretty sure there's worse stuff where you de facto can only
   allocate wc memory that's set aside at boot-up and that's all you ever
   get.


Well, long story short: you don't make sure that the page is flushed at all.

What you do is allocate the page as WC in the first place; if you 
fail to do this you can't use it.


The whole idea TTM tried to sell until a while ago, that you can actually 
change that on the fly, only works on x86, and even there only in a very 
limited way.


Cheers,
Christian.



Cheers, Daniel


Mixing that with the CPU caching behavior gets you some really nice ways to
break a driver. In general x86 seems to be rather graceful, but arm and
PowerPC are easily pissed if you mess that up.


Signed-off-by: Daniel Vetter 
Cc: Christian König 
Cc: "Thomas Hellström" 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 

Acked-by: Christian König 

Regards,
Christian.


---
   drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++
   1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 296ab1b7c07f..657d2490aaa5 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -10,6 +10,10 @@
   #include 
   #include 
+#ifdef CONFIG_X86
+#include 
+#endif
+
   #include 
   #include 
   #include 
@@ -162,6 +166,11 @@ static int drm_gem_shmem_get_pages_locked(struct 
drm_gem_shmem_object *shmem)
return PTR_ERR(pages);
}
+#ifdef CONFIG_X86
+   if (shmem->map_wc)
+   set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
+#endif
+
shmem->pages = pages;
return 0;
@@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct 
drm_gem_shmem_object *shmem)
if (--shmem->pages_use_count > 0)
return;
+#ifdef CONFIG_X86
+   if (shmem->map_wc)
+   set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+#endif
+
drm_gem_put_pages(obj, shmem->pages,
  shmem->pages_mark_dirty_on_put,
  shmem->pages_mark_accessed_on_put);




Re: nouveau: failed to initialise sync

2021-07-14 Thread Christian König

Am 14.07.21 um 14:56 schrieb Kirill A. Shutemov:

On Tue, Jul 06, 2021 at 08:58:37AM +0200, Christian König wrote:

Hi guys,

yes nouveau was using the same functionality for internal BOs without
noticing it. This is fixed by the following commit:

commit d098775ed44021293b1962dea61efb19297b8d02
Author: Christian König 
Date:   Wed Jun 9 19:25:56 2021 +0200

     drm/nouveau: init the base GEM fields for internal BOs

     TTMs buffer objects are based on GEM objects for quite a while
     and rely on initializing those fields before initializing the TTM BO.

     Nouveau now doesn't init the GEM object for internally allocated BOs,
     so make sure that we at least initialize some necessary fields.

Could be that the patch needs to be sent to stable as well.

The regression is present in v5.14-rc1. Any idea when it will hit
upstream? I don't see it being applied to drm-next.


Well, that question needs to be answered by Dave or somebody else from the 
drm-misc maintainer team.


This fix, together with some others, is already in drm-misc-next-fixes 
waiting to be pushed upstream, but it looks like that hasn't happened yet.


Even Linus already pinged me asking where the fix for qxl got stuck.

Regards,
Christian.


Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-14 Thread Christian König

Am 14.07.21 um 13:15 schrieb Daniel Vetter:

On Wed, Jul 14, 2021 at 12:51:15PM +0200, Christian König wrote:

Am 14.07.21 um 12:44 schrieb Daniel Vetter:

On Mon, Jul 12, 2021 at 06:06:36PM -0400, Felix Kuehling wrote:

KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
is_cow_mapping returns true for these mappings. Add a check for
vm_flags & VM_WRITE to avoid mmap failures on private read-only or
PROT_NONE mappings.

v2: protect against mprotect making a mapping writable after the fact
v3: update driver-specific vm_operations_structs

Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
Signed-off-by: Felix Kuehling 
Signed-off-by: Alex Deucher 

So looking at vmf_insert_pfn_prot() and the comment there we can't have
VM_PFNMAP and is_cow_mapping ever be true, or things break. On platforms
without pte_special at least.

Key idea is that we never end up in vmf_insert_pfn_prot() because the vma is
mapped with PROT_NONE.

Ah right if it's PROT_NONE then it's ok. But the code here only checks for
VM_WRITE, not VM_READ, so PROT_READ can get through and go boom? Or
something else I'm missing?


Ah, good point. Yeah that is indeed not handled correctly and can cause 
a BUG_ON().
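
To make the gap concrete, here is a minimal userspace sketch of the check under
discussion. It is only a model: the flag values and the is_cow_mapping() test
mirror the kernel definitions, while patch_rejects() and private_vm_flags() are
hypothetical helpers made up for illustration, and the sketch assumes the mm
core leaves VM_MAYWRITE set on a MAP_PRIVATE mapping, as it normally does.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as used by the kernel's vm_flags. */
#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_SHARED   0x00000008UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYWRITE 0x00000020UL

/* Same test as the kernel's is_cow_mapping(): private and may become writable. */
static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Hypothetical helper: the condition the patch uses to refuse a mapping. */
static bool patch_rejects(unsigned long flags)
{
	return is_cow_mapping(flags) && (flags & VM_WRITE);
}

/*
 * Hypothetical helper: vm_flags of a MAP_PRIVATE mapping, assuming
 * VM_MAYREAD and VM_MAYWRITE are both left set by the mm core.
 */
static unsigned long private_vm_flags(bool prot_read, bool prot_write)
{
	unsigned long flags = VM_MAYREAD | VM_MAYWRITE;

	if (prot_read)
		flags |= VM_READ;
	if (prot_write)
		flags |= VM_WRITE;
	return flags;
}
```

With these definitions, private_vm_flags(true, false) (a PROT_READ private
mapping) is still a COW mapping, yet patch_rejects() lets it through, just like
the PROT_NONE case. Only the PROT_READ|PROT_WRITE combination is refused.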


Looks like we need to revert that patch and go back to the drawing board 
then.


Christian.



Maybe time for a few amdgpu mmap tests that go through the combos and make
sure it works/fails all correctly.
-Daniel


So I'm not sure this is a great idea, and definitely not for all drivers

Yeah, I'm absolutely not happy with this either but it seemed to be the
least painful thing to do.


...

Can we clear VM_MAYWRITE instead to force this to be a non-cow mapping
instead?

Well we have considered forcefully setting VM_SHARED, which won't work
easily for a couple of reasons.

But clearing VM_MAYWRITE in amdgpu/amdkfd may actually work as well.
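
A rough userspace sketch of what clearing VM_MAYWRITE would do to the COW test,
under the assumption that the driver does this from its mmap path for private
mappings that are not writable. demo_force_non_cow() is a hypothetical name,
not an existing TTM or amdgpu function; the flag values and is_cow_mapping()
mirror the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_SHARED   0x00000008UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYWRITE 0x00000020UL

/* Same test as the kernel's is_cow_mapping(). */
static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical mmap-time fixup: refuse writable private mappings, and turn
 * read-only or PROT_NONE private mappings into non-COW ones by dropping
 * VM_MAYWRITE.
 */
static int demo_force_non_cow(unsigned long *vm_flags)
{
	if (!is_cow_mapping(*vm_flags))
		return 0;               /* shared mapping, nothing to do */
	if (*vm_flags & VM_WRITE)
		return -EINVAL;         /* writable COW mapping: still refused */

	*vm_flags &= ~VM_MAYWRITE;
	assert(!is_cow_mapping(*vm_flags));
	return 0;
}
```

Once VM_MAYWRITE is gone the mapping no longer counts as COW, and a later
mprotect(PROT_WRITE) would be expected to fail in the mm core, so a separate
mprotect hook might not be needed. That last point is an expectation, not
something tested here.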

Felix can you test this?

Thanks,
Christian.


-Daniel


---
   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
   drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
   drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
   drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
   include/drm/ttm/ttm_bo_api.h |  4 
   6 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index b3404c43a911..1aa750a6a5d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -79,7 +79,8 @@ static const struct vm_operations_struct amdgpu_gem_vm_ops = {
.fault = amdgpu_gem_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
-   .access = ttm_bo_vm_access
+   .access = ttm_bo_vm_access,
+   .mprotect = ttm_bo_vm_mprotect
   };
   static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 5b27845075a1..164ea564bb7a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -70,7 +70,8 @@ static const struct vm_operations_struct nouveau_ttm_vm_ops = 
{
.fault = nouveau_ttm_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
-   .access = ttm_bo_vm_access
+   .access = ttm_bo_vm_access,
+   .mprotect = ttm_bo_vm_mprotect
   };
   void
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 458f92a70887..c19ad07eb7b5 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -77,7 +77,8 @@ static const struct vm_operations_struct radeon_gem_vm_ops = {
.fault = radeon_gem_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
-   .access = ttm_bo_vm_access
+   .access = ttm_bo_vm_access,
+   .mprotect = ttm_bo_vm_mprotect
   };
   static void radeon_gem_object_free(struct drm_gem_object *gobj)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..fb325bad5db6 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned 
long addr,
   }
   EXPORT_SYMBOL(ttm_bo_vm_access);
+int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
+  unsigned long end, unsigned long newflags)
+{
+   /* Enforce no COW since would have really strange behavior with it. */
+   if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
+   return -EINVAL;
+
+   return 0;
+}
+EXPORT_SYMBOL(ttm_bo_vm_mprotect);
+
   static const struct vm_operations_struct ttm_bo_vm_ops = {
.fault = ttm_bo_vm_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
.access = ttm_bo_vm_access,
+   .mprotect = ttm_bo_vm_mprotect,
   };
   int ttm_bo_mmap_obj(struct vm_a

Re: [PATCH] dma-buf: fix and rework dma_buf_poll v6

2021-07-14 Thread Christian König

Just a gentle ping. Or have I missed your reply?

Thanks,
Christian.

Am 09.07.21 um 14:07 schrieb Christian König:

Daniel pointed me towards this function and there are multiple obvious problems
in the implementation.

First of all, the retry loop is not working as intended. In general, a retry
only makes sense if you grab the reference first and then check the sequence
values.
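
The ordering can be sketched in a small single-threaded userspace model. All
names here (struct fence, struct resv, resv_get_excl_ref(), demo_racer()) are
made up for illustration, the racer hook merely stands in for a concurrent
writer, and the model ignores that a real RCU lookup can fail to take a
reference on a fence that is already being freed.

```c
#include <assert.h>
#include <stddef.h>

struct fence { int refcount; };
struct resv { unsigned int seq; struct fence *excl; };

/* Writer side: swap the fence and bump the sequence around the update. */
static void resv_set_excl(struct resv *r, struct fence *f)
{
	r->seq++;               /* kernel analogue: write_seqcount_begin() */
	r->excl = f;
	r->seq++;               /* kernel analogue: write_seqcount_end() */
}

/*
 * Reader side: snapshot the pointer, take the reference FIRST, and only
 * then validate the sequence. A stale snapshot drops its reference and
 * retries, so the caller never ends up holding an unvalidated pointer.
 */
static struct fence *resv_get_excl_ref(struct resv *r,
				       void (*racer)(struct resv *))
{
	for (;;) {
		unsigned int seq = r->seq;
		struct fence *f = r->excl;

		if (racer) {            /* simulated writer, fires once */
			racer(r);
			racer = NULL;
		}

		if (f)
			f->refcount++;
		if (r->seq == seq)
			return f;
		if (f)
			f->refcount--;  /* snapshot was stale: undo and retry */
	}
}

static struct fence old_fence = { .refcount = 1 };
static struct fence new_fence = { .refcount = 1 };

/* Demo writer that replaces the fence between snapshot and reference. */
static void demo_racer(struct resv *r)
{
	resv_set_excl(r, &new_fence);
}
```

Starting from a struct resv that points at old_fence and passing demo_racer,
the first pass detects the changed sequence, gives back its reference on
old_fence, and the second pass returns new_fence with its count raised.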

Then we should always also wait for the exclusive fence.

It's also good practice to keep the reference around when installing callbacks
to fences you don't own.
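
As a sketch of that rule, the following userspace model takes a reference
before installing the callback and lets the callback drop it. The types and
functions are simplified stand-ins with made-up names, not the dma_fence API;
the only behaviour borrowed from the real dma_fence_add_callback() is that
installing on an already signaled fence fails with -ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct fence;
typedef void (*fence_cb_t)(struct fence *f);

struct fence {
	int refcount;
	bool signaled;
	fence_cb_t cb;
};

static void fence_get(struct fence *f) { f->refcount++; }

static void fence_put(struct fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Fails with -ENOENT when the fence has already signaled. */
static int fence_add_callback(struct fence *f, fence_cb_t cb)
{
	if (f->signaled)
		return -ENOENT;
	f->cb = cb;
	return 0;
}

static void fence_signal(struct fence *f)
{
	fence_cb_t cb = f->cb;

	f->signaled = true;
	f->cb = NULL;
	if (cb)
		cb(f);
}

static int wakeups;

/* The callback owns the reference taken at install time and drops it. */
static void poll_cb(struct fence *f)
{
	wakeups++;
	fence_put(f);
}

/* Returns true if a callback was installed and is now pending. */
static bool poll_install(struct fence *f)
{
	fence_get(f);
	if (!fence_add_callback(f, poll_cb))
		return true;    /* reference is handed over to poll_cb() */
	fence_put(f);           /* already signaled: nothing pending */
	return false;
}
```

The point of the extra reference is that the fence cannot go away between
installing the callback and the callback running, even if whoever owns the
fence drops its own reference in the meantime.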

And last, the whole implementation was unnecessarily complex and rather hard to
understand, which could lead to unexpected behavior of the IOCTL.

Fix all this by reworking the implementation from scratch, dropping the
whole RCU approach and taking the lock instead.

Only mildly tested and needs a thoughtful review of the code.

v2: fix the reference counting as well
v3: keep the excl fence handling as is for stable
v4: back to testing all fences, drop RCU
v5: handle in and out separately
v6: add missing clear of events

Signed-off-by: Christian König 
CC: sta...@vger.kernel.org
---
  drivers/dma-buf/dma-buf.c | 156 +-
  include/linux/dma-buf.h   |   2 +-
  2 files changed, 72 insertions(+), 86 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index eadd1eaa2fb5..39e1ef872829 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -72,7 +72,7 @@ static void dma_buf_release(struct dentry *dentry)
 * If you hit this BUG() it means someone dropped their ref to the
 * dma-buf while still having pending operation to the buffer.
 */
-   BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
+   BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
  
  	dmabuf->ops->release(dmabuf);
  
@@ -202,16 +202,57 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb)

wake_up_locked_poll(dcb->poll, dcb->active);
dcb->active = 0;
spin_unlock_irqrestore(&dcb->poll->lock, flags);
+   dma_fence_put(fence);
+}
+
+static bool dma_buf_poll_shared(struct dma_resv *resv,
+   struct dma_buf_poll_cb_t *dcb)
+{
+   struct dma_resv_list *fobj = dma_resv_get_list(resv);
+   struct dma_fence *fence;
+   int i, r;
+
+   if (!fobj)
+   return false;
+
+   for (i = 0; i < fobj->shared_count; ++i) {
+   fence = rcu_dereference_protected(fobj->shared[i],
+ dma_resv_held(resv));
+   dma_fence_get(fence);
+   r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
+   if (!r)
+   return true;
+   dma_fence_put(fence);
+   }
+
+   return false;
+}
+
+static bool dma_buf_poll_excl(struct dma_resv *resv,
+ struct dma_buf_poll_cb_t *dcb)
+{
+   struct dma_fence *fence = dma_resv_get_excl(resv);
+   int r;
+
+   if (!fence)
+   return false;
+
+   dma_fence_get(fence);
+   r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
+   if (!r)
+   return true;
+   dma_fence_put(fence);
+
+   return false;
  }
  
  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)

  {
struct dma_buf *dmabuf;
struct dma_resv *resv;
-   struct dma_resv_list *fobj;
-   struct dma_fence *fence_excl;
+   unsigned shared_count;
__poll_t events;
-   unsigned shared_count, seq;
+   int r, i;
  
  	dmabuf = file->private_data;

if (!dmabuf || !dmabuf->resv)
@@ -225,101 +266,46 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)
if (!events)
return 0;
  
-retry:

-   seq = read_seqcount_begin(&resv->seq);
-   rcu_read_lock();
-
-   fobj = rcu_dereference(resv->fence);
-   if (fobj)
-   shared_count = fobj->shared_count;
-   else
-   shared_count = 0;
-   fence_excl = rcu_dereference(resv->fence_excl);
-   if (read_seqcount_retry(&resv->seq, seq)) {
-   rcu_read_unlock();
-   goto retry;
-   }
+   dma_resv_lock(resv, NULL);
  
-	if (fence_excl && (!(events & EPOLLOUT) || shared_count == 0)) {

-   struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
-   __poll_t pevents = EPOLLIN;
-
-   if (shared_count == 0)
-   pevents |= EPOLLOUT;
+   if (events & EPOLLOUT) {
+   struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_out;
  
+		/* Check that callback isn't busy */

spin_lock_irq(&dmabuf->poll.lock);
-   if (dcb->active) {
-   dcb->active |= pevents;
-   events &= ~pevents;
-   } else
-   dcb->active = pevents;
+   if

Re: [PATCH v2 1/4] drm/amd/display: Introduce FPU directory inside DC

2021-07-14 Thread Christian König

Am 13.07.21 um 16:06 schrieb Rodrigo Siqueira:

The display core files rely on FPU operations, which requires them to be
compiled with special flags. Ideally, we don't want these FPU operations
spread around the DC code; nevertheless, it happens in the current
source. This commit introduces a new directory named fpu_operations that
intends to centralize all files that require the FPU compilation flag.
As part of this new component, this patch also moves one of the
functions that require FPU access to a single shared file. Notice that
this is the first part of the work, and it does not fix the FPU issue
yet; we still need other patches for achieving the complete isolation of
this file.

Change since V1:
- Update documentation and rebase.

Signed-off-by: Rodrigo Siqueira 
---
  drivers/gpu/drm/amd/display/dc/Makefile   |  1 +
  .../drm/amd/display/dc/dcn20/dcn20_resource.c | 39 +
  .../drm/amd/display/dc/dcn20/dcn20_resource.h |  2 -
  .../drm/amd/display/dc/dcn21/dcn21_resource.c |  2 +
  .../amd/display/dc/fpu_operations/Makefile| 58 +
  .../drm/amd/display/dc/fpu_operations/dcn2x.c | 87 +++
  .../drm/amd/display/dc/fpu_operations/dcn2x.h | 33 +++
  7 files changed, 183 insertions(+), 39 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/Makefile
  create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
  create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.h

diff --git a/drivers/gpu/drm/amd/display/dc/Makefile 
b/drivers/gpu/drm/amd/display/dc/Makefile
index 943fcb164876..93e731a9be68 100644
--- a/drivers/gpu/drm/amd/display/dc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/Makefile
@@ -37,6 +37,7 @@ DC_LIBS += dcn303
  DC_LIBS += dcn31
  endif
  
+DC_LIBS += fpu_operations

  DC_LIBS += dce120
  
  DC_LIBS += dce112

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 1b05a37b674d..f99b09643a52 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -35,6 +35,8 @@
  #include "include/irq_service_interface.h"
  #include "dcn20/dcn20_resource.h"
  
+#include "fpu_operations/dcn2x.h"

+
  #include "dcn10/dcn10_hubp.h"
  #include "dcn10/dcn10_ipp.h"
  #include "dcn20_hubbub.h"
@@ -1974,43 +1976,6 @@ void dcn20_split_stream_for_mpc(
ASSERT(primary_pipe->plane_state);
  }
  
-void dcn20_populate_dml_writeback_from_context(

-   struct dc *dc, struct resource_context *res_ctx, 
display_e2e_pipe_params_st *pipes)
-{
-   int pipe_cnt, i;
-
-   for (i = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) {
-   struct dc_writeback_info *wb_info = 
&res_ctx->pipe_ctx[i].stream->writeback_info[0];
-
-   if (!res_ctx->pipe_ctx[i].stream)
-   continue;
-
-   /* Set writeback information */
-   pipes[pipe_cnt].dout.wb_enable = (wb_info->wb_enabled == true) 
? 1 : 0;
-   pipes[pipe_cnt].dout.num_active_wb++;
-   pipes[pipe_cnt].dout.wb.wb_src_height = 
wb_info->dwb_params.cnv_params.crop_height;
-   pipes[pipe_cnt].dout.wb.wb_src_width = 
wb_info->dwb_params.cnv_params.crop_width;
-   pipes[pipe_cnt].dout.wb.wb_dst_width = 
wb_info->dwb_params.dest_width;
-   pipes[pipe_cnt].dout.wb.wb_dst_height = 
wb_info->dwb_params.dest_height;
-   pipes[pipe_cnt].dout.wb.wb_htaps_luma = 1;
-   pipes[pipe_cnt].dout.wb.wb_vtaps_luma = 1;
-   pipes[pipe_cnt].dout.wb.wb_htaps_chroma = 
wb_info->dwb_params.scaler_taps.h_taps_c;
-   pipes[pipe_cnt].dout.wb.wb_vtaps_chroma = 
wb_info->dwb_params.scaler_taps.v_taps_c;
-   pipes[pipe_cnt].dout.wb.wb_hratio = 1.0;
-   pipes[pipe_cnt].dout.wb.wb_vratio = 1.0;
-   if (wb_info->dwb_params.out_format == dwb_scaler_mode_yuv420) {
-   if (wb_info->dwb_params.output_depth == 
DWB_OUTPUT_PIXEL_DEPTH_8BPC)
-   pipes[pipe_cnt].dout.wb.wb_pixel_format = 
dm_420_8;
-   else
-   pipes[pipe_cnt].dout.wb.wb_pixel_format = 
dm_420_10;
-   } else
-   pipes[pipe_cnt].dout.wb.wb_pixel_format = dm_444_32;
-
-   pipe_cnt++;
-   }
-
-}
-
  int dcn20_populate_dml_pipes_from_context(
struct dc *dc,
struct dc_state *context,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.h 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.h
index c8f3127bbcdf..6ec8ff45f0f7 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.h
@@ -58,8 +58,6 @@ struct pipe_ctx *dcn20_acquire_idle_pipe_for_layer(
struct dc_state *state,
const struct resource_pool *

Re: [PATCH v2 2/4] drm/amd/display: Add FPU event trace

2021-07-14 Thread Christian König

Am 13.07.21 um 16:06 schrieb Rodrigo Siqueira:

We don't have any mechanism for tracing FPU operations inside the
display core, making the debug work a little bit tricky. This commit
introduces a trace mechanism inside our DC_FP_START/END macros for
trying to alleviate this problem.

Signed-off-by: Rodrigo Siqueira 
---
  .../gpu/drm/amd/display/amdgpu_dm/Makefile|  3 +-
  .../amd/display/amdgpu_dm/amdgpu_dm_trace.h   | 21 ++
  .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 64 +++
  .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h| 33 ++
  drivers/gpu/drm/amd/display/dc/dc_trace.h |  3 +
  drivers/gpu/drm/amd/display/dc/os_types.h |  6 +-
  6 files changed, 126 insertions(+), 4 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile 
b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
index 91fb72c96545..5f7fd4474379 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
@@ -25,7 +25,8 @@
  
  
  
-AMDGPUDM = amdgpu_dm.o amdgpu_dm_irq.o amdgpu_dm_mst_types.o amdgpu_dm_color.o

+AMDGPUDM = amdgpu_dm.o amdgpu_dm_irq.o amdgpu_dm_mst_types.o amdgpu_dm_color.o 
\
+   dc_fpu.o
  
  ifneq ($(CONFIG_DRM_AMD_DC),)

  AMDGPUDM += amdgpu_dm_services.o amdgpu_dm_helpers.o amdgpu_dm_pp_smu.o 
amdgpu_dm_psr.o
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
index 46a33f64cf8e..230bb12c405e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
@@ -637,6 +637,27 @@ TRACE_EVENT(amdgpu_refresh_rate_track,
  __entry->refresh_rate_ns)
  );
  
+TRACE_EVENT(dcn_fpu,

+   TP_PROTO(bool begin, const char *function, const int line),
+   TP_ARGS(begin, function, line),
+
+   TP_STRUCT__entry(
+__field(bool, begin)
+__field(const char *, function)
+__field(int, line)
+   ),
+   TP_fast_assign(
+  __entry->begin = begin;
+  __entry->function = function;
+  __entry->line = line;
+   ),
+   TP_printk("%s()+%d: %s",
+ __entry->function,
+ __entry->line,
+ __entry->begin ? "begin" : "end"
+   )
+);
+
  #endif /* _AMDGPU_DM_TRACE_H_ */
  
  #undef TRACE_INCLUDE_PATH

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
new file mode 100644
index ..d5d156a4517e
--- /dev/null
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: AMD
+ *
+ */
+
+#include "dc_trace.h"
+
+#include 
+
+/**
+ * dc_fpu_begin - Enables FPU protection
+ * @function_name: A string containing the function name for debug purposes
+ *   (usually __func__)
+ *
+ * @line: A line number where DC_FP_START was invoked for debug purpose
+ *   (usually __LINE__)
+ *
+ * This function is responsible for managing the use of kernel_fpu_begin() with
+ * the advantage of providing an event trace for debugging.
+ *
+ * Note: Do not call this function directly; always use DC_FP_START().
+ */
+void dc_fpu_begin(const char *function_name, const int line)
+{
+   TRACE_DCN_FPU(true, function_name, line);
+   kernel_fpu_begin();


As the build robot has pointed out as well, the kernel_fpu_begin() and 
kernel_fpu_end() functions are x86-specific and don't exist on other 
architectures in this form.



+}
+
+/**
+ * dc_fpu_end -

Re: [PATCH v2 3/4] drm/amd/display: Add control mechanism for FPU utilization

2021-07-14 Thread Christian König

Am 13.07.21 um 16:06 schrieb Rodrigo Siqueira:

DC invokes DC_FP_START/END in multiple parts of the code; this can
create a situation where we invoke this FPU operation in a nested way or
exit too early. To avoid this situation, this commit adds a
mechanism where dc_fpu_begin/end manages the access to
kernel_fpu_begin/end.

Change since V1:
- Use better variable names
- Use get_cpu_ptr and put_cpu_ptr to better balance preemption enable
and disable

Signed-off-by: Rodrigo Siqueira 
---
  .../amd/display/amdgpu_dm/amdgpu_dm_trace.h   | 13 ---
  .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 36 ---
  drivers/gpu/drm/amd/display/dc/dc_trace.h |  4 +--
  3 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
index 230bb12c405e..fdcaea22b456 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
@@ -638,23 +638,26 @@ TRACE_EVENT(amdgpu_refresh_rate_track,
  );
  
  TRACE_EVENT(dcn_fpu,

-   TP_PROTO(bool begin, const char *function, const int line),
-   TP_ARGS(begin, function, line),
+   TP_PROTO(bool begin, const char *function, const int line, const 
int recursion_depth),
+   TP_ARGS(begin, function, line, recursion_depth),
  
  	TP_STRUCT__entry(

 __field(bool, begin)
 __field(const char *, function)
 __field(int, line)
+__field(int, recursion_depth)
),
TP_fast_assign(
   __entry->begin = begin;
   __entry->function = function;
   __entry->line = line;
+  __entry->recursion_depth = recursion_depth;
),
-   TP_printk("%s()+%d: %s",
+   TP_printk("%s: recursion_depth: %d: %s()+%d:",
+ __entry->begin ? "begin" : "end",
+ __entry->recursion_depth,
  __entry->function,
- __entry->line,
- __entry->begin ? "begin" : "end"
+ __entry->line
)
  );
  
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c

index d5d156a4517e..73179e9e859a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -28,6 +28,19 @@
  
  #include 
  
+/**

+ * DOC: DC FPU manipulation overview
+ *
+ * DC core uses FPU operations in multiple parts of the code, which requires a
+ * more specialized way to manage these areas' entrance. To fulfill this
+ * requirement, we created some wrapper functions that encapsulate
+ * kernel_fpu_begin/end to better fit our need in the display component. In
+ * summary, in this file, you can find functions related to FPU operation
+ * management.
+ */
+
+static DEFINE_PER_CPU(int, fpu_recursion_depth);
+
  /**
   * dc_fpu_begin - Enables FPU protection
   * @function_name: A string containing the function name for debug purposes
@@ -43,8 +56,16 @@
   */
  void dc_fpu_begin(const char *function_name, const int line)
  {
-   TRACE_DCN_FPU(true, function_name, line);
-   kernel_fpu_begin();
+   int *pcpu;
+
+   pcpu = get_cpu_ptr(&fpu_recursion_depth);
+   *pcpu = this_cpu_inc_return(fpu_recursion_depth);


That doesn't make sense. Please don't use this_cpu_inc_return() in 
combination with get_cpu_ptr().


Christian.
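
For illustration, a userspace model of the counting logic with only one access
style: the counter is touched solely through the pointer, which is what
get_cpu_ptr() would hand back with preemption already disabled. Everything
prefixed model_ is a hypothetical stand-in, and a plain global plays the role
of one CPU's fpu_recursion_depth slot.

```c
#include <assert.h>

/* Stand-ins for kernel_fpu_begin()/kernel_fpu_end(); they only count calls. */
static int fpu_begin_calls, fpu_end_calls;

static void model_kernel_fpu_begin(void) { fpu_begin_calls++; }
static void model_kernel_fpu_end(void) { fpu_end_calls++; }

/* Models one CPU's slot of the per-CPU fpu_recursion_depth variable. */
static int fpu_recursion_depth;

static void model_dc_fpu_begin(void)
{
	/* kernel analogue: pcpu = get_cpu_ptr(&fpu_recursion_depth); */
	int *depth = &fpu_recursion_depth;

	*depth += 1;                    /* plain increment through the pointer */
	if (*depth == 1)
		model_kernel_fpu_begin();   /* only the outermost caller enters */

	/* kernel analogue: put_cpu_ptr(&fpu_recursion_depth); */
}

static void model_dc_fpu_end(void)
{
	int *depth = &fpu_recursion_depth;

	assert(*depth > 0);             /* unbalanced END would be a bug */
	*depth -= 1;
	if (*depth == 0)
		model_kernel_fpu_end();     /* only the outermost caller leaves */
}
```

Two nested begin calls followed by two end calls result in exactly one
model_kernel_fpu_begin() and one model_kernel_fpu_end(). The alternative
would be to drop the pointer and use only the this_cpu_*() accessors, with
preemption handled separately.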


+
+   if (*pcpu == 1)
+   kernel_fpu_begin();
+
+   TRACE_DCN_FPU(true, function_name, line, *pcpu);
+   put_cpu_ptr(&fpu_recursion_depth);
  }
  
  /**

@@ -59,6 +80,13 @@ void dc_fpu_begin(const char *function_name, const int line)
   */
  void dc_fpu_end(const char *function_name, const int line)
  {
-   TRACE_DCN_FPU(false, function_name, line);
-   kernel_fpu_end();
+   int *pcpu;
+
+   pcpu = get_cpu_ptr(&fpu_recursion_depth);
+   *pcpu = this_cpu_dec_return(fpu_recursion_depth);
+   if (*pcpu <= 0)
+   kernel_fpu_end();
+
+   TRACE_DCN_FPU(false, function_name, line, *pcpu);
+   put_cpu_ptr(&fpu_recursion_depth);
  }
diff --git a/drivers/gpu/drm/amd/display/dc/dc_trace.h 
b/drivers/gpu/drm/amd/display/dc/dc_trace.h
index d598ba697e45..c711797e5c9e 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_trace.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_trace.h
@@ -38,5 +38,5 @@
  #define TRACE_DCN_CLOCK_STATE(dcn_clocks) \
trace_amdgpu_dm_dc_clocks_state(dcn_clocks)
  
-#define TRACE_DCN_FPU(begin, function, line) \

-   trace_dcn_fpu(begin, function, line)
+#define TRACE_DCN_FPU(begin, function, line, ref_count) \
+   trace_dcn_fpu(begin, function, line, ref_count)




Re: [PATCH v2 4/4] drm/amd/display: Add DC_FP helper to check FPU state

2021-07-14 Thread Christian König

Am 13.07.21 um 16:06 schrieb Rodrigo Siqueira:

To fully isolate FPU operations in a single place, we must avoid
situations where compilers spill FP values to registers due to FP enable
in a specific C file. Note that even if we isolate all FPU functions in
a single file and call its interface from other files, the compiler
might enable the use of FPU before we call DC_FP_START. Nevertheless, it
is the programmer's responsibility to invoke DC_FP_START/END in the
correct place. To highlight situations where developers forgot to use
the FP protection before calling the DC FPU interface functions, we
introduce a helper that checks if the function is invoked under FP
protection. If not, it will trigger a kernel warning.

Changes since V1:
- Remove fp_enable variables
- Rename dc_is_fp_enabled to dc_assert_fp_enabled
- Replace wrong variable type

Signed-off-by: Rodrigo Siqueira 
---
  .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 22 +++
  .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h|  1 +
  .../drm/amd/display/dc/dcn20/dcn20_resource.c |  2 ++
  .../drm/amd/display/dc/fpu_operations/dcn2x.c | 17 ++
  4 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
index 73179e9e859a..74153a2816f9 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -41,6 +41,28 @@
  
  static DEFINE_PER_CPU(int, fpu_recursion_depth);
  
+/**

+ * dc_assert_fp_enabled - Check if FPU protection is enabled
+ *
+ * This function tells if the code is already under FPU protection or not. A
+ * function that works as an API for a set of FPU operations can use this
+ * function for checking if the caller invoked it after DC_FP_START(). For
+ * example, take a look at dcn2x.c file.
+ *
+ * Return:
+ * Return true if we already enabled FPU protection, otherwise return false.
+ */
+inline bool dc_assert_fp_enabled(void)


Assert indicates that you print a warning if the condition isn't met, 
but you only return the condition.


Either rename the function or raise the warning directly.


+{
+   int *pcpu, depth = 0;
+
+   pcpu = get_cpu_ptr(&fpu_recursion_depth);
+   depth = this_cpu_read(fpu_recursion_depth);
+   put_cpu_ptr(&fpu_recursion_depth);


Again this doesn't make sense.

Either you use this_cpu_read() or you use get_cpu_ptr()/put_cpu_ptr(), 
but not both.
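
The two review points above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the amdgpu code: fp_depth, fp_begin(), fp_end(), fp_is_enabled() and fp_assert_enabled() are hypothetical names, per-CPU storage is not modelled, and a non-zero depth is assumed to mean "protected". The counter is read through one accessor only, and the helper named "assert" raises the warning itself instead of returning the condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-CPU recursion counter. */
static int fp_depth;
/* Counts emitted warnings so the behaviour can be checked. */
static int fp_warnings;

static void fp_begin(void)
{
	fp_depth++;
}

static void fp_end(void)
{
	assert(fp_depth > 0);
	fp_depth--;
}

/* Predicate: only reports the state, so it is not named "assert". */
static bool fp_is_enabled(void)
{
	return fp_depth > 0;
}

/* Assertion: raises the warning itself and returns nothing. */
static void fp_assert_enabled(const char *caller)
{
	if (!fp_is_enabled()) {
		fp_warnings++;
		fprintf(stderr, "WARNING: %s called without FP protection\n",
			caller);
	}
}
```

With this split, a public FPU helper would simply call fp_assert_enabled(__func__) on entry, and callers that need the boolean would use fp_is_enabled().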



+
+   return depth > 1;
+}
+
  /**
   * dc_fpu_begin - Enables FPU protection
   * @function_name: A string containing the function name for debug purposes
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h
index fb54983c5c60..97941794b77c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h
@@ -27,6 +27,7 @@
  #ifndef __DC_FPU_H__
  #define __DC_FPU_H__
  
+bool dc_assert_fp_enabled(void);

  void dc_fpu_begin(const char *function_name, const int line);
  void dc_fpu_end(const char *function_name, const int line);
  
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c

index f99b09643a52..d0b34c7f99dc 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2355,7 +2355,9 @@ int dcn20_populate_dml_pipes_from_context(
}
  
  	/* populate writeback information */

+   DC_FP_START();
dc->res_pool->funcs->populate_dml_writeback_from_context(dc, res_ctx, 
pipes);
+   DC_FP_END();
  
  	return pipe_cnt;

  }
diff --git a/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c 
b/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
index c815d6c01d64..d8183da0c2b0 100644
--- a/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
+++ b/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
@@ -41,6 +41,22 @@
   *that deals with FP register is contained within this call.
   * 3. All function that needs to be accessed outside this file requires a
   *public interface that not uses any FPU reference.
+ * 4. Developers should not use DC_FP_START/END in this file, but they need to


This needs to be harder, e.g. "Developers must not use".

Regards,
Christian.


+ *ensure that the caller invokes it before access any function available in
+ *this file. For this reason, public API in this file must invoke
+ *ASSERT(dc_assert_fp_enabled());
+ *
+ * Let's expand a little bit more the idea in the code pattern number for. To
+ * fully isolate FPU operations in a single place, we must avoid situations
+ * where compilers spill FP values to registers due to FP enable in a specific
+ * C file. Note that even if we isolate all FPU functions in a single file and
+ * call its interface from other files, the compiler might enable the use of
+ * FPU before we call DC_FP_START. Nevertheless, it is the 

[PATCH v2 01/13] drm/mgag200: Select clock in PLL update functions

2021-07-14 Thread Thomas Zimmermann
Put the clock-selection code into each of the PLL-update functions to
make them select the correct pixel clock. Instead of copying the code,
introduce a new helper WREG_MISC_MASKED, which does masked writes into
. Use it from each individual PLL update function.

The pixel clock for video output was not actually set before programming
the clock's values. It worked because the device had the correct clock
pre-set.
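
As a rough userspace illustration of the masked write described above (a sketch only: misc_reg, misc_read(), misc_write() and misc_write_masked() are hypothetical stand-ins for the register accessors, and the register is modelled as a plain variable):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for the hardware register. */
static uint8_t misc_reg;

static uint8_t misc_read(void)
{
	return misc_reg;
}

static void misc_write(uint8_t v)
{
	misc_reg = v;
}

/* Replace only the bits selected by mask; all other bits keep their value. */
static void misc_write_masked(uint8_t v, uint8_t mask)
{
	uint8_t misc = misc_read();

	misc &= (uint8_t)~mask;
	misc |= v & mask;
	misc_write(misc);
}
```

Because the helper reads the register first, each PLL update function can select the clock without disturbing the unrelated bits in the same register.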

v2:
* don't duplicate  update code (Sam)

Signed-off-by: Thomas Zimmermann 
Fixes: db05f8d3dc87 ("drm/mgag200: Split MISC register update into PLL selection, SYNC and I/O")
Cc: Sam Ravnborg 
Cc: Emil Velikov 
Cc: Dave Airlie 
Cc: dri-devel@lists.freedesktop.org
Cc:  # v5.9+
---
 drivers/gpu/drm/mgag200/mgag200_drv.h  | 16 
 drivers/gpu/drm/mgag200/mgag200_mode.c | 20 +---
 drivers/gpu/drm/mgag200/mgag200_reg.h  |  9 -
 3 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h 
b/drivers/gpu/drm/mgag200/mgag200_drv.h
index f7a0537c0d0a..5302d6566d7c 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -43,6 +43,22 @@
 #define ATTR_INDEX 0x1fc0
 #define ATTR_DATA 0x1fc1
 
+#define WREG_MISC(v)   \
+   WREG8(MGA_MISC_OUT, v)
+
+#define RREG_MISC(v)   \
+   ((v) = RREG8(MGA_MISC_IN))
+
+#define WREG_MISC_MASKED(v, mask)  \
+   do {\
+   u8 misc_;   \
+   u8 mask_ = (mask);  \
+   RREG_MISC(misc_);   \
+   misc_ &= ~mask_;\
+   misc_ |= ((v) & mask_); \
+   WREG_MISC(misc_);   \
+   } while (0)
+
 #define WREG_ATTR(reg, v)  \
do {\
RREG8(0x1fda);  \
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 3b3059f471c2..1bdf21474bcb 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -174,6 +174,8 @@ static int mgag200_g200_set_plls(struct mga_device *mdev, 
long clock)
drm_dbg_kms(dev, "clock: %ld vco: %ld m: %d n: %d p: %d s: %d\n",
clock, f_vco, m, n, p, s);
 
+   WREG_MISC_MASKED(MGAREG_MISC_CLKSEL_MGA, MGAREG_MISC_CLKSEL_MASK);
+
WREG_DAC(MGA1064_PIX_PLLC_M, m);
WREG_DAC(MGA1064_PIX_PLLC_N, n);
WREG_DAC(MGA1064_PIX_PLLC_P, (p | (s << 3)));
@@ -289,6 +291,8 @@ static int mga_g200se_set_plls(struct mga_device *mdev, 
long clock)
return 1;
}
 
+   WREG_MISC_MASKED(MGAREG_MISC_CLKSEL_MGA, MGAREG_MISC_CLKSEL_MASK);
+
WREG_DAC(MGA1064_PIX_PLLC_M, m);
WREG_DAC(MGA1064_PIX_PLLC_N, n);
WREG_DAC(MGA1064_PIX_PLLC_P, p);
@@ -385,6 +389,8 @@ static int mga_g200wb_set_plls(struct mga_device *mdev, 
long clock)
}
}
 
+   WREG_MISC_MASKED(MGAREG_MISC_CLKSEL_MGA, MGAREG_MISC_CLKSEL_MASK);
+
for (i = 0; i <= 32 && pll_locked == false; i++) {
if (i > 0) {
WREG8(MGAREG_CRTC_INDEX, 0x1e);
@@ -522,6 +528,8 @@ static int mga_g200ev_set_plls(struct mga_device *mdev, 
long clock)
}
}
 
+   WREG_MISC_MASKED(MGAREG_MISC_CLKSEL_MGA, MGAREG_MISC_CLKSEL_MASK);
+
WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL);
tmp = RREG8(DAC_DATA);
tmp |= MGA1064_PIX_CLK_CTL_CLK_DIS;
@@ -654,6 +662,9 @@ static int mga_g200eh_set_plls(struct mga_device *mdev, 
long clock)
}
}
}
+
+   WREG_MISC_MASKED(MGAREG_MISC_CLKSEL_MGA, MGAREG_MISC_CLKSEL_MASK);
+
for (i = 0; i <= 32 && pll_locked == false; i++) {
WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL);
tmp = RREG8(DAC_DATA);
@@ -754,6 +765,8 @@ static int mga_g200er_set_plls(struct mga_device *mdev, 
long clock)
}
}
 
+   WREG_MISC_MASKED(MGAREG_MISC_CLKSEL_MGA, MGAREG_MISC_CLKSEL_MASK);
+
WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL);
tmp = RREG8(DAC_DATA);
tmp |= MGA1064_PIX_CLK_CTL_CLK_DIS;
@@ -787,8 +800,6 @@ static int mga_g200er_set_plls(struct mga_device *mdev, 
long clock)
 
 static int mgag200_crtc_set_plls(struct mga_device *mdev, long clock)
 {
-   u8 misc;
-
switch(mdev->type) {
case G200_PCI:
case G200_AGP:
@@ -808,11 +819,6 @@ static int mgag200_crtc_set_plls(struct mga_device *mdev, 
long clock)
return mga_g200er_set_plls(mdev, clock);
}
 
-   misc = RREG8(MGA_MISC_IN);
-   misc &= ~MGAREG_MISC_

Re: [PATCH] drm: mxsfb: Clear FIFO_CLEAR bit

2021-07-14 Thread Lucas Stach
On Thursday, 01.07.2021 at 00:50 +0200, Marek Vasut wrote:
> On 6/29/21 10:02 AM, Lucas Stach wrote:
> > On Tuesday, 29.06.2021 at 05:04 +0200, Marek Vasut wrote:
> > > On 6/28/21 10:09 AM, Lucas Stach wrote:
> > > > On Saturday, 26.06.2021 at 20:15 +0200, Marek Vasut wrote:
> > > > > On 6/24/21 2:01 PM, Lucas Stach wrote:
> > > > > > On Tuesday, 22.06.2021 at 11:33 +0200, Marek Vasut wrote:
> > > > > > > On 6/22/21 9:28 AM, Lucas Stach wrote:
> > > > > > > > On Monday, 21.06.2021 at 18:30 +0200, Marek Vasut wrote:
> > > > > > > > > On 6/21/21 2:14 PM, Lucas Stach wrote:
> > > > > > > > > 
> > > > > > > > > [...]
> > > > > > > > > 
> > > > > > > > > > > diff --git a/drivers/gpu/drm/mxsfb/mxsfb_kms.c 
> > > > > > > > > > > b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
> > > > > > > > > > > index 98d8ba0bae84..22cb749fc9bc 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/mxsfb/mxsfb_kms.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
> > > > > > > > > > > @@ -241,6 +241,9 @@ static void 
> > > > > > > > > > > mxsfb_crtc_mode_set_nofb(struct mxsfb_drm_private *mxsfb,
> > > > > > > > > > >   
> > > > > > > > > > >   /* Clear the FIFOs */
> > > > > > > > > > >   writel(CTRL1_FIFO_CLEAR, mxsfb->base + 
> > > > > > > > > > > LCDC_CTRL1 + REG_SET);
> > > > > > > > > > > + readl(mxsfb->base + LCDC_CTRL1);
> > > > > > > > > > 
> > > > > > > > > > Do you really need those readbacks? As both writes are 
> > > > > > > > > > targeting the
> > > > > > > > > > same slave interface, the memory barrier in the clear write 
> > > > > > > > > > should push
> > > > > > > > > > the set write.
> > > > > > > > > 
> > > > > > > > > What would push the clear write then ? We can drop one of the 
> > > > > > > > > readl()s,
> > > > > > > > > but not the last one.
> > > > > > > > 
> > > > > > > > There are a lot of more writes with barriers to the controller 
> > > > > > > > slave
> > > > > > > > interface in that function after clearing the FIFO. I don't see 
> > > > > > > > why
> > > > > > > > this readback would be required.
> > > > > > > 
> > > > > > > Because you really do want to make sure the fifo is cleared 
> > > > > > > before you
> > > > > > > start doing any of those other writes or configuring the 
> > > > > > > controller in
> > > > > > > any way.
> > > > > > 
> > > > > > I still don't see the reason. What additional properties do you 
> > > > > > think
> > > > > > the readback provides that isn't already provided by the barriers in
> > > > > > the following writes?
> > > > > 
> > > > > See the paragraph above -- we have to make sure the writes that 
> > > > > trigger
> > > > > the FIFO clearing really take place before any other writes do.
> > > > 
> > > > And they do, as there are write barriers prepended to the writes
> > > > following the FIFO clear. The readback just lets the CPU wait until the
> > > > write reached the peripheral, which I don't see a reason to do here.
> > > > The ordering of the writes from the perspective of the peripheral is
> > > > completely the same with or without the readback. The later writes can
> > > > not overtake the FIFO clear writes due to the barriers.
> > > > 
> > > > I'm strongly against adding stuff because it "might have an effect", if
> > > > it isn't required by architectural rules. It clutters the code and some
> > > > months/years down the line nobody dares to cleanup/remove this stuff
> > > > anymore, because everyone assumes that there was a good reason for
> > > > adding those things.
> > > 
> > > Since there is no RTL for any of the iMXes or their IPs, how do you
> > > propose anyone except NXP can validate what is and what is not required ?
> > > 
> > > This patch helps with a problem where I sporadically observe shifted
> > > image on boot on mx8mm.
> > 
> > The order of writes to a device mapped region are defined by the ARM
> > architecture and the AMBA bus standard, not the peripheral. I'm not
> > saying this patch isn't needed. I'm saying the readbacks look bogus.
> > 
> > Have you checked that just adding the write to the REG_CLR doesn't fix
> > your issue?
> 
> No, it does not help with the issue.

Okay, I don't want to hold up this patch over technicalities if it
fixes the issue, in which case the readbacks probably provide just the
right amount of delay for the FIFO clear to happen in hardware. FWIW:

Acked-by: Lucas Stach 

Regards,
Lucas



[PATCH v2] drm/of: free the iterator object on failure

2021-07-14 Thread Steven Price
When bailing out due to the sanity check the iterator value needs to be
freed because the early return prevents for_each_child_of_node() from
doing the dereference itself.
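
The reference-counting rule behind this fix can be shown with a small userspace model. This is a sketch under assumptions, not the kernel's OF API: struct node, node_get(), node_put(), next_child() and common_type() are hypothetical names, and the iterator is assumed to hold a reference on the current node and drop it only when advancing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted node; not the kernel's struct device_node. */
struct node {
	int refcount;
	int type;		/* assumed non-zero for valid nodes */
	struct node *next;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Drop the reference on prev and return the next node with a reference held. */
static struct node *next_child(struct node *first, struct node *prev)
{
	struct node *next = prev ? prev->next : first;

	node_get(next);
	node_put(prev);
	return next;
}

/* Return the common type of all children, or -1 if they disagree. */
static int common_type(struct node *first)
{
	struct node *child;
	int type = 0;

	for (child = next_child(first, NULL); child;
	     child = next_child(first, child)) {
		if (type && child->type != type) {
			/* Leaving early: the iterator will not drop this reference. */
			node_put(child);
			return -1;
		}
		type = child->type;
	}

	return type;
}
```

Without the node_put() before the early return, the node that triggered the mismatch would keep a reference forever, which is the kind of leak the patch addresses.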

Fixes: 6529007522de ("drm: of: Add drm_of_lvds_get_dual_link_pixel_order")
Signed-off-by: Steven Price 
---
 drivers/gpu/drm/drm_of.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

v2: Fixes now refers to the original commit as suggested by Laurent, rather
than 4ee48cc5586b ("drm: of: Fix double-free bug") which only fixed part of
the problem. Note that 4ee48cc5586b is a dependency for this patch to
cleanly apply.

diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
index 197c57477344..997b8827fed2 100644
--- a/drivers/gpu/drm/drm_of.c
+++ b/drivers/gpu/drm/drm_of.c
@@ -331,8 +331,10 @@ static int drm_of_lvds_get_remote_pixels_type(
 * configurations by passing the endpoints explicitly to
 * drm_of_lvds_get_dual_link_pixel_order().
 */
-   if (!current_pt || pixels_type != current_pt)
+   if (!current_pt || pixels_type != current_pt) {
+   of_node_put(endpoint);
return -EINVAL;
+   }
}
 
return pixels_type;
-- 
2.20.1



Re: [PATCH v8 00/14] drm/tegra: Introduce a modern UABI

2021-07-14 Thread Dmitry Osipenko
14.07.2021 11:30, Thierry Reding wrote:
> On Sat, Jul 10, 2021 at 12:16:28AM +0300, Dmitry Osipenko wrote:
>> Hello Thierry,
>>
>> 09.07.2021 22:31, Thierry Reding wrote:
>>> From: Thierry Reding 
>>>
>>> Hi all,
>>>
>>> Mikko has been away for a few weeks, so I've been testing and revising
>>> the new UABI patches in the meantime. There are very minor changes to
>>> the naming of some of the UABI fields, but other than that it's mostly
>>> unchanged from v7.
>>
>> Why haven't you addressed any of the previous review comments? There
>> were some obvious problems in v7 and v8 still has them.
>>
>>> One notable change is that mappings can now be read-only, write-only,
>>> read-write or none of them (rather than just read-only or read-write),
>>> since those combinations are all supported by the IOMMUs and it might
>>> be useful to make some mappings write-only.
>>>
>>> For a full list of changes in v8, see the changelog in patch 6.
>>>
>>> I've also updated the libdrm_tegra library to work against this version
>>> of the UABI. A branch can be found here:
>>>
>>>   https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8
>>>
>>> That contains helper APIs for the concepts introduced in this series and
>>> shows how they can be used in various tests that can be run for sanity
>>> checking.
>>>
>>> In addition, Mikko has made updates to the following projects, though
>>> they may need to be updated for the minor changes in v8:
>>>
>>> * vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
>>>   Experimental support for MPEG2 and H264 decoding on T210, T186
>>>   and T194.
>>>
>>> * xf86-video-opentegra - 
>>> https://github.com/grate-driver/xf86-video-opentegra
>>>   X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.
>>>
>>> * grate - https://github.com/grate-driver/grate
>>>   3D rendering testbed for Tegra20, Tegra30, and Tegra114
>>>
>>> I plan on putting this into linux-next soon after v5.14-rc1 so that this
>>> can get some soak time.
>>
>> It should be a bit too early to push it into kernel. The UAPI is not
>> ready because it's missing essential features. We can't call this a
>> 'modern UABI' until it's fully implemented. The design decisions are
>> still questionable because this UAPI is built around the proprietary
>> firmware (and based on UAPI of downstream driver) which doesn't fit well
>> into DRM world. I haven't got all the answers to my previous questions,
>> should I repeat them?
> 
> I don't know what you mean by "built around the proprietary firmware".
> Yes, this ends up using proprietary firmware for some of the hardware
> engines that host1x drives, but that's completely orthogonal to the
> UABI. No matter what UABI we'd be introducing, we'd be using that same
> firmware.
> 
> And yes, this is based on the UABI of the downstream drivers. The design
> is guided by what we've learned over the last decade working with this
> hardware in use-cases that customers need. It'd be dumb not to use that
> knowledge to our advantage. This is the only way to ensure we can
> deliver an upstream driver that's on par with our downstream drivers and
> therefore make it possible to eventually adopt the upstream driver.
> 
> And frankly, you did get answers to previous questions, though perhaps
> not all, but I'm out of patience. We've been going in circles and at
> some point we have to make a decision so we can make progress.

By firmware I was referring to the supervisor OS and inter-VM
integration, sorry for not making it clear. My rough understanding is
that it's all software defined and technically it's possible to avoid
going through the trouble of supporting the firmware convention defined
by downstream, and thus, making the driver less optimal than it could be.
It's still not clear to me how much that firmware is relevant to
upstream in practice.

> I made several attempts over the years to get something usable merged
> upstream so that we can finally make use of this hardware and get it
> supported upstream and each time I made the mistake of trying to make it
> perfect and accommodate all wishlist items. The result is that I wasted a
> lot of time and have nothing to show for it.

It's a problem that you try to do everything on your own and are not
collaborating as much as you could. Writing code isn't a problem, the
problem is that there is no clear understanding of what needs to be
done, IMO. I have a vision of what needs to be done from the perspective
of older SoCs, but I never could start implementing it for upstream
because it requires your feedback and preliminary agreement, since
you're the only maintainer of the driver who could merge patches. I don't
want to waste my time either.

> I've also been very hard on Mikko with his work on this and I think we've
> stretched this as far as we can without compromising too much on what we
> are going to need from this UABI in the future.
> 
> We've gone through the process of making sure all existing userspace can
> and doe

[Bug 209457] AMDGPU resume fail with RX 580 GPU

2021-07-14 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=209457

--- Comment #34 from Leandro Jacques (ls...@yahoo.com) ---
Created attachment 297851
  --> https://bugzilla.kernel.org/attachment.cgi?id=297851&action=edit
Linux Firmware version info 20210511.7685cf4

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 209457] AMDGPU resume fail with RX 580 GPU

2021-07-14 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=209457

Leandro Jacques (ls...@yahoo.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #297851|0                           |1
        is obsolete|                            |

--- Comment #35 from Leandro Jacques (ls...@yahoo.com) ---
Created attachment 297853
  --> https://bugzilla.kernel.org/attachment.cgi?id=297853&action=edit
Linux Firmware version info 20210511.7685cf4

Firmware version when crashed


[Bug 209457] AMDGPU resume fail with RX 580 GPU

2021-07-14 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=209457

--- Comment #36 from Leandro Jacques (ls...@yahoo.com) ---
Created attachment 297855
  --> https://bugzilla.kernel.org/attachment.cgi?id=297855&action=edit
Kernel crash log for linux firmware version 20210511.7685cf4

Kernel log when crashed.


[PATCH resend v2] dt-bindings: display: ssd1307fb: Convert to json-schema

2021-07-14 Thread Geert Uytterhoeven
Convert the Solomon SSD1307 Framebuffer Device Tree binding
documentation to json-schema.

Fix the spelling of the "pwms" property.
Document default values.
Make properties with default values not required.

Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Rob Herring 
---
v2:
  - Add Reviewed-by,
  - Document solomon,dclk-{div,freq} defaults.
---
 .../bindings/display/solomon,ssd1307fb.yaml   | 208 ++
 .../devicetree/bindings/display/ssd1307fb.txt |  60 -
 2 files changed, 208 insertions(+), 60 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml
 delete mode 100644 Documentation/devicetree/bindings/display/ssd1307fb.txt

diff --git a/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml 
b/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml
new file mode 100644
index ..2ed2a7d0ca2fa23e
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml
@@ -0,0 +1,208 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/solomon,ssd1307fb.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Solomon SSD1307 OLED Controller Framebuffer
+
+maintainers:
+  - Maxime Ripard 
+
+properties:
+  compatible:
+enum:
+  - solomon,ssd1305fb-i2c
+  - solomon,ssd1306fb-i2c
+  - solomon,ssd1307fb-i2c
+  - solomon,ssd1309fb-i2c
+
+  reg:
+maxItems: 1
+
+  pwms:
+maxItems: 1
+
+  reset-gpios:
+maxItems: 1
+
+  vbat-supply:
+description: The supply for VBAT
+
+  solomon,height:
+$ref: /schemas/types.yaml#/definitions/uint32
+default: 16
+description:
+  Height in pixel of the screen driven by the controller
+
+  solomon,width:
+$ref: /schemas/types.yaml#/definitions/uint32
+default: 96
+description:
+  Width in pixel of the screen driven by the controller
+
+  solomon,page-offset:
+$ref: /schemas/types.yaml#/definitions/uint32
+default: 1
+description:
+  Offset of pages (band of 8 pixels) that the screen is mapped to
+
+  solomon,segment-no-remap:
+type: boolean
+description:
+  Display needs normal (non-inverted) data column to segment mapping
+
+  solomon,col-offset:
+$ref: /schemas/types.yaml#/definitions/uint32
+default: 0
+description:
+  Offset of columns (COL/SEG) that the screen is mapped to
+
+  solomon,com-seq:
+type: boolean
+description:
+  Display uses sequential COM pin configuration
+
+  solomon,com-lrremap:
+type: boolean
+description:
+  Display uses left-right COM pin remap
+
+  solomon,com-invdir:
+type: boolean
+description:
+  Display uses inverted COM pin scan direction
+
+  solomon,com-offset:
+$ref: /schemas/types.yaml#/definitions/uint32
+default: 0
+description:
+  Number of the COM pin wired to the first display line
+
+  solomon,prechargep1:
+$ref: /schemas/types.yaml#/definitions/uint32
+default: 2
+description:
+  Length of deselect period (phase 1) in clock cycles
+
+  solomon,prechargep2:
+$ref: /schemas/types.yaml#/definitions/uint32
+default: 2
+description:
+  Length of precharge period (phase 2) in clock cycles.  This needs to be
+  the higher, the higher the capacitance of the OLED's pixels is.
+
+  solomon,dclk-div:
+$ref: /schemas/types.yaml#/definitions/uint32
+minimum: 1
+maximum: 16
+description:
+  Clock divisor. The default value is controller-dependent.
+
+  solomon,dclk-frq:
+$ref: /schemas/types.yaml#/definitions/uint32
+minimum: 0
+maximum: 15
+description:
+  Clock frequency, higher value means higher frequency.
+  The default value is controller-dependent.
+
+  solomon,lookup-table:
+$ref: /schemas/types.yaml#/definitions/uint8-array
+maxItems: 4
+description:
+  8 bit value array of current drive pulse widths for BANK0, and colors A,
+  B, and C. Each value in range of 31 to 63 for pulse widths of 32 to 64.
+  Color D is always width 64.
+
+  solomon,area-color-enable:
+type: boolean
+description:
+  Display uses color mode
+
+  solomon,low-power:
+type: boolean
+description:
+  Display runs in low power mode
+
+required:
+  - compatible
+  - reg
+
+allOf:
+  - if:
+  properties:
+compatible:
+  contains:
+const: solomon,ssd1305fb-i2c
+then:
+  properties:
+solomon,dclk-div:
+  default: 1
+solomon,dclk-frq:
+  default: 7
+
+  - if:
+  properties:
+compatible:
+  contains:
+const: solomon,ssd1306fb-i2c
+then:
+  properties:
+solomon,dclk-div:
+  default: 1
+solomon,dclk-frq:
+  default: 8
+
+  - if:
+  properties:
+compatible:
+  contains:
+const: solomon,ssd1307fb-i2c
+then:
+  properties:
+so

[PATCH resend 1/5] video: fbdev: ssd1307fb: Propagate errors via ssd1307fb_update_display()

2021-07-14 Thread Geert Uytterhoeven
Make ssd1307fb_update_display() return an error code, so callers that
can handle failures can propagate it.
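
The propagation pattern can be sketched in userspace C as follows. Everything here is hypothetical (fake_bus_error, write_array(), update_display(), init_display()); the sketch only assumes that the low-level transfer returns 0 or a negative errno and that the caller should pass that value on unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical hook so the sketch can simulate a bus failure. */
static int fake_bus_error;

/* Stand-in for the bus transfer: returns 0 or a negative errno. */
static int write_array(const unsigned char *buf, size_t len)
{
	(void)buf;
	(void)len;
	return fake_bus_error;
}

/* Returns the transfer result instead of silently dropping it. */
static int update_display(size_t len)
{
	unsigned char *array = calloc(len, 1);
	int ret;

	if (!array)
		return -ENOMEM;

	ret = write_array(array, len);
	free(array);
	return ret;
}

/* A caller that can handle failures simply forwards the error code. */
static int init_display(void)
{
	int ret = update_display(512);

	if (ret < 0)
		return ret;

	return 0;
}
```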

Signed-off-by: Geert Uytterhoeven 
---
 drivers/video/fbdev/ssd1307fb.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/video/fbdev/ssd1307fb.c b/drivers/video/fbdev/ssd1307fb.c
index eda448b7a0c9d8ce..e6b6263e3bef847f 100644
--- a/drivers/video/fbdev/ssd1307fb.c
+++ b/drivers/video/fbdev/ssd1307fb.c
@@ -152,17 +152,17 @@ static inline int ssd1307fb_write_cmd(struct i2c_client 
*client, u8 cmd)
return ret;
 }
 
-static void ssd1307fb_update_display(struct ssd1307fb_par *par)
+static int ssd1307fb_update_display(struct ssd1307fb_par *par)
 {
struct ssd1307fb_array *array;
u8 *vmem = par->info->screen_buffer;
unsigned int line_length = par->info->fix.line_length;
unsigned int pages = DIV_ROUND_UP(par->height, 8);
-   int i, j, k;
+   int ret, i, j, k;
 
array = ssd1307fb_alloc_array(par->width * pages, SSD1307FB_DATA);
if (!array)
-   return;
+   return -ENOMEM;
 
/*
 * The screen is divided in pages, each having a height of 8
@@ -210,8 +210,9 @@ static void ssd1307fb_update_display(struct ssd1307fb_par 
*par)
}
}
 
-   ssd1307fb_write_array(par->client, array, par->width * pages);
+   ret = ssd1307fb_write_array(par->client, array, par->width * pages);
kfree(array);
+   return ret;
 }
 
 
@@ -222,6 +223,7 @@ static ssize_t ssd1307fb_write(struct fb_info *info, const 
char __user *buf,
unsigned long total_size;
unsigned long p = *ppos;
void *dst;
+   int ret;
 
total_size = info->fix.smem_len;
 
@@ -239,7 +241,9 @@ static ssize_t ssd1307fb_write(struct fb_info *info, const 
char __user *buf,
if (copy_from_user(dst, buf, count))
return -EFAULT;
 
-   ssd1307fb_update_display(par);
+   ret = ssd1307fb_update_display(par);
+   if (ret < 0)
+   return ret;
 
*ppos += count;
 
@@ -483,7 +487,9 @@ static int ssd1307fb_init(struct ssd1307fb_par *par)
return ret;
 
/* Clear the screen */
-   ssd1307fb_update_display(par);
+   ret = ssd1307fb_update_display(par);
+   if (ret < 0)
+   return ret;
 
/* Turn on the display */
ret = ssd1307fb_write_cmd(par->client, SSD1307FB_DISPLAY_ON);
-- 
2.25.1



[PATCH resend 3/5] video: fbdev: ssd1307fb: Extract ssd1307fb_set_address_range()

2021-07-14 Thread Geert Uytterhoeven
Extract the code to set the column and page ranges into a helper
function.
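
The arithmetic the helper encapsulates is small: a start and a count are turned into an inclusive start/end pair, and a pixel height is rounded up to whole 8-pixel pages. A minimal userspace sketch, with struct addr_range, make_range() and pages_for_height() as hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive start/end pair, matching the start + count - 1 form used in the patch. */
struct addr_range {
	uint8_t start;
	uint8_t end;
};

/* Convert a start and a count into an inclusive range; count must be >= 1. */
static struct addr_range make_range(uint8_t start, uint8_t count)
{
	struct addr_range r = { start, (uint8_t)(start + count - 1) };

	return r;
}

/* Number of 8-pixel pages needed to cover a height in pixels. */
static unsigned int pages_for_height(unsigned int height)
{
	return (height + 7) / 8;
}
```

For a 128x32 panel with a page offset of 1, this would give columns 0..127 and pages 1..4.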

Signed-off-by: Geert Uytterhoeven 
---
 drivers/video/fbdev/ssd1307fb.c | 61 +++--
 1 file changed, 36 insertions(+), 25 deletions(-)

diff --git a/drivers/video/fbdev/ssd1307fb.c b/drivers/video/fbdev/ssd1307fb.c
index 6d7bd025bca1a175..cfa27ea0feab4f01 100644
--- a/drivers/video/fbdev/ssd1307fb.c
+++ b/drivers/video/fbdev/ssd1307fb.c
@@ -152,6 +152,38 @@ static inline int ssd1307fb_write_cmd(struct i2c_client 
*client, u8 cmd)
return ret;
 }
 
+static int ssd1307fb_set_address_range(struct ssd1307fb_par *par, u8 col_start,
+  u8 cols, u8 page_start, u8 pages)
+{
+   u8 col_end = col_start + cols - 1;
+   u8 page_end = page_start + pages - 1;
+   int ret;
+
+   /* Set column range */
+   ret = ssd1307fb_write_cmd(par->client, SSD1307FB_SET_COL_RANGE);
+   if (ret < 0)
+   return ret;
+
+   ret = ssd1307fb_write_cmd(par->client, col_start);
+   if (ret < 0)
+   return ret;
+
+   ret = ssd1307fb_write_cmd(par->client, col_end);
+   if (ret < 0)
+   return ret;
+
+   /* Set page range */
+   ret = ssd1307fb_write_cmd(par->client, SSD1307FB_SET_PAGE_RANGE);
+   if (ret < 0)
+   return ret;
+
+   ret = ssd1307fb_write_cmd(par->client, page_start);
+   if (ret < 0)
+   return ret;
+
+   return ssd1307fb_write_cmd(par->client, page_end);
+}
+
 static int ssd1307fb_update_display(struct ssd1307fb_par *par)
 {
struct ssd1307fb_array *array;
@@ -461,31 +493,10 @@ static int ssd1307fb_init(struct ssd1307fb_par *par)
if (ret < 0)
return ret;
 
-   /* Set column range */
-   ret = ssd1307fb_write_cmd(par->client, SSD1307FB_SET_COL_RANGE);
-   if (ret < 0)
-   return ret;
-
-   ret = ssd1307fb_write_cmd(par->client, par->col_offset);
-   if (ret < 0)
-   return ret;
-
-   ret = ssd1307fb_write_cmd(par->client, par->col_offset + par->width - 
1);
-   if (ret < 0)
-   return ret;
-
-   /* Set page range */
-   ret = ssd1307fb_write_cmd(par->client, SSD1307FB_SET_PAGE_RANGE);
-   if (ret < 0)
-   return ret;
-
-   ret = ssd1307fb_write_cmd(par->client, par->page_offset);
-   if (ret < 0)
-   return ret;
-
-   ret = ssd1307fb_write_cmd(par->client,
- par->page_offset +
- DIV_ROUND_UP(par->height, 8) - 1);
+   /* Set column and page range */
+   ret = ssd1307fb_set_address_range(par, par->col_offset, par->width,
+ par->page_offset,
+ DIV_ROUND_UP(par->height, 8));
if (ret < 0)
return ret;
 
-- 
2.25.1



[PATCH resend 4/5] video: fbdev: ssd1307fb: Optimize screen updates

2021-07-14 Thread Geert Uytterhoeven
Currently, each screen update triggers an I2C transfer of all screen
data, up to 1 KiB of data for a 128x64 display, which takes at least 20
ms in Fast mode.

Reduce the amount of transferred data by only updating the rectangle
that changed.  Remove the call to ssd1307fb_set_address_range() during
initialization, as ssd1307fb_update_rect() now takes care of that.

Note that for now the optimized operation is only used for fillrect,
copyarea, and imageblit, which are used by fbcon.
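
The numbers above can be checked with a short calculation. The sketch below is illustrative only: rect_bytes() and min_transfer_us() are hypothetical helpers, the page count mirrors the DIV_ROUND_UP(height + y % 8, 8) expression from the patch, and the time estimate counts data bits only at the 400 kbit/s Fast-mode rate, ignoring addressing and acknowledge overhead.

```c
#include <assert.h>

/* Bytes of pixel data for a rectangle: one byte per column per 8-pixel page. */
static unsigned int rect_bytes(unsigned int y, unsigned int width,
			       unsigned int height)
{
	unsigned int pages = (height + y % 8 + 7) / 8;

	return width * pages;
}

/* Lower bound on the transfer time in microseconds, data bits only. */
static unsigned int min_transfer_us(unsigned int bytes, unsigned int bus_hz)
{
	return (unsigned int)((unsigned long long)bytes * 8 * 1000000 / bus_hz);
}
```

A full 128x64 update is 1024 bytes, or about 20.5 ms at 400 kHz, whereas an 8x16 rectangle aligned to a page boundary needs only 16 bytes; the same rectangle starting at y = 20 straddles three pages and needs 24.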

Signed-off-by: Geert Uytterhoeven 
---
 drivers/video/fbdev/ssd1307fb.c | 43 -
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/video/fbdev/ssd1307fb.c b/drivers/video/fbdev/ssd1307fb.c
index cfa27ea0feab4f01..8e3d4be74723b9bf 100644
--- a/drivers/video/fbdev/ssd1307fb.c
+++ b/drivers/video/fbdev/ssd1307fb.c
@@ -184,16 +184,18 @@ static int ssd1307fb_set_address_range(struct 
ssd1307fb_par *par, u8 col_start,
return ssd1307fb_write_cmd(par->client, page_end);
 }
 
-static int ssd1307fb_update_display(struct ssd1307fb_par *par)
+static int ssd1307fb_update_rect(struct ssd1307fb_par *par, unsigned int x,
+unsigned int y, unsigned int width,
+unsigned int height)
 {
struct ssd1307fb_array *array;
u8 *vmem = par->info->screen_buffer;
unsigned int line_length = par->info->fix.line_length;
-   unsigned int pages = DIV_ROUND_UP(par->height, 8);
+   unsigned int pages = DIV_ROUND_UP(height + y % 8, 8);
u32 array_idx = 0;
int ret, i, j, k;
 
-   array = ssd1307fb_alloc_array(par->width * pages, SSD1307FB_DATA);
+   array = ssd1307fb_alloc_array(width * pages, SSD1307FB_DATA);
if (!array)
return -ENOMEM;
 
@@ -226,13 +228,18 @@ static int ssd1307fb_update_display(struct ssd1307fb_par 
*par)
 *  (5) A4 B4 C4 D4 E4 F4 G4 H4
 */
 
-   for (i = 0; i < pages; i++) {
+   ret = ssd1307fb_set_address_range(par, par->col_offset + x, width,
+ par->page_offset + y / 8, pages);
+   if (ret < 0)
+   goto out_free;
+
+   for (i = y / 8; i < y / 8 + pages; i++) {
int m = 8;
 
/* Last page may be partial */
-   if (i + 1 == pages && par->height % 8)
+   if (8 * (i + 1) > par->height)
m = par->height % 8;
-   for (j = 0; j < par->width; j++) {
+   for (j = x; j < x + width; j++) {
u8 data = 0;
 
for (k = 0; k < m; k++) {
@@ -245,11 +252,17 @@ static int ssd1307fb_update_display(struct ssd1307fb_par 
*par)
}
}
 
-   ret = ssd1307fb_write_array(par->client, array, par->width * pages);
+   ret = ssd1307fb_write_array(par->client, array, width * pages);
+
+out_free:
kfree(array);
return ret;
 }
 
+static int ssd1307fb_update_display(struct ssd1307fb_par *par)
+{
+   return ssd1307fb_update_rect(par, 0, 0, par->width, par->height);
+}
 
 static ssize_t ssd1307fb_write(struct fb_info *info, const char __user *buf,
size_t count, loff_t *ppos)
@@ -299,21 +312,24 @@ static void ssd1307fb_fillrect(struct fb_info *info, 
const struct fb_fillrect *r
 {
struct ssd1307fb_par *par = info->par;
sys_fillrect(info, rect);
-   ssd1307fb_update_display(par);
+   ssd1307fb_update_rect(par, rect->dx, rect->dy, rect->width,
+ rect->height);
 }
 
 static void ssd1307fb_copyarea(struct fb_info *info, const struct fb_copyarea 
*area)
 {
struct ssd1307fb_par *par = info->par;
sys_copyarea(info, area);
-   ssd1307fb_update_display(par);
+   ssd1307fb_update_rect(par, area->dx, area->dy, area->width,
+ area->height);
 }
 
 static void ssd1307fb_imageblit(struct fb_info *info, const struct fb_image 
*image)
 {
struct ssd1307fb_par *par = info->par;
sys_imageblit(info, image);
-   ssd1307fb_update_display(par);
+   ssd1307fb_update_rect(par, image->dx, image->dy, image->width,
+ image->height);
 }
 
 static const struct fb_ops ssd1307fb_ops = {
@@ -493,13 +509,6 @@ static int ssd1307fb_init(struct ssd1307fb_par *par)
if (ret < 0)
return ret;
 
-   /* Set column and page range */
-   ret = ssd1307fb_set_address_range(par, par->col_offset, par->width,
- par->page_offset,
- DIV_ROUND_UP(par->height, 8));
-   if (ret < 0)
-   return ret;
-
/* Clear the screen */
ret = ssd1307fb_update_display(par);
if (ret < 0)
-- 
2.25.1
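
For readers following the page arithmetic in ssd1307fb_update_rect(): the
controller addresses rows in pages of 8, so a rectangle that starts at row y
and spans height rows begins in page y / 8, and the number of pages it touches
is DIV_ROUND_UP(height + y % 8, 8), as in the patch. Below is a minimal
userspace sketch of just that calculation. struct page_span and
rect_to_pages() are hypothetical names used for illustration; they are not
part of the driver.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper type, for illustration only. */
struct page_span {
	unsigned int first;	/* first 8-row page touched by the rectangle */
	unsigned int count;	/* number of pages touched */
};

/* Map a vertical pixel range [y, y + height) to the pages it covers. */
static struct page_span rect_to_pages(unsigned int y, unsigned int height)
{
	struct page_span span;

	assert(height > 0);

	span.first = y / 8;
	/* y % 8 rows of the first page lie above the rectangle but still count. */
	span.count = DIV_ROUND_UP(height + y % 8, 8);
	return span;
}
```

For example, y = 5 and height = 10 cover rows 5..14, which gives first = 0
and count = 2, whereas y = 8 and height = 8 fit exactly into page 1.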



[PATCH resend 0/5] video: fbdev: ssd1307fb: Optimizations and improvements

2021-07-14 Thread Geert Uytterhoeven
Hi all,

This patch series optimizes console operations on ssd1307fb, after the
customary fixes and cleanups.

Currently, each screen update triggers an I2C transfer of all screen
data, up to 1 KiB of data for a 128x64 display, which takes at least 20
ms in Fast mode.  While many displays are smaller, and thus require less
data to be transferred, 20 ms is still an optimistic value, as the
actual data transfer may be much slower, especially on bitbanged I2C
drivers.  After this series, the amount of data transfer is reduced, as
fillrect, copyarea, and imageblit only update the rectangle that
changed.
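
As a rough check of the figures above, here is a small sketch of the
arithmetic. It assumes 1 bit per pixel and the 400 kHz clock of I2C Fast
mode, and it counts only the 8 data bits of each byte. ACK bits, addressing
and control bytes are ignored, so the result is a lower bound. The function
names are made up for this example.

```c
#include <assert.h>

/* Bytes in a 1 bpp frame, with 8 rows packed into each page byte. */
static unsigned int frame_bytes(unsigned int width, unsigned int height)
{
	assert(width > 0 && height > 0);
	return width * ((height + 7) / 8);
}

/* Lower bound on the transfer time in microseconds (data bits only). */
static unsigned int min_transfer_us(unsigned int bytes, unsigned int bus_hz)
{
	assert(bus_hz > 0);
	return (unsigned int)((unsigned long long)bytes * 8 * 1000000 / bus_hz);
}
```

With these assumptions a 128x64 display needs 1024 bytes and at least
20480 us per full update; a 128x32 display needs half of that.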

This has been tested on an Adafruit FeatherWing OLED with an SSD1306
controller and a 128x32 OLED, connected to an OrangeCrab ECP5 FPGA board
running a 64 MHz VexRiscv RISC-V softcore, where it reduced the CPU
usage for blinking the cursor from more than 70% to ca. 10%.

Thanks for your comments!

Geert Uytterhoeven (5):
  video: fbdev: ssd1307fb: Propagate errors via
ssd1307fb_update_display()
  video: fbdev: ssd1307fb: Simplify ssd1307fb_update_display()
  video: fbdev: ssd1307fb: Extract ssd1307fb_set_address_range()
  video: fbdev: ssd1307fb: Optimize screen updates
  video: fbdev: ssd1307fb: Cache address ranges

 drivers/video/fbdev/ssd1307fb.c | 143 +---
 1 file changed, 96 insertions(+), 47 deletions(-)

-- 
2.25.1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH resend 5/5] video: fbdev: ssd1307fb: Cache address ranges

2021-07-14 Thread Geert Uytterhoeven
Cache the column and page ranges, to avoid doing unneeded I2C transfers
when the values haven't changed.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/video/fbdev/ssd1307fb.c | 52 +++--
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/video/fbdev/ssd1307fb.c b/drivers/video/fbdev/ssd1307fb.c
index 8e3d4be74723b9bf..23b43ce479898813 100644
--- a/drivers/video/fbdev/ssd1307fb.c
+++ b/drivers/video/fbdev/ssd1307fb.c
@@ -82,6 +82,11 @@ struct ssd1307fb_par {
struct regulator *vbat_reg;
u32 vcomh;
u32 width;
+   /* Cached address ranges */
+   u8 col_start;
+   u8 col_end;
+   u8 page_start;
+   u8 page_end;
 };
 
 struct ssd1307fb_array {
@@ -160,28 +165,43 @@ static int ssd1307fb_set_address_range(struct ssd1307fb_par *par, u8 col_start,
int ret;
 
/* Set column range */
-   ret = ssd1307fb_write_cmd(par->client, SSD1307FB_SET_COL_RANGE);
-   if (ret < 0)
-   return ret;
+   if (col_start != par->col_start || col_end != par->col_end) {
+   ret = ssd1307fb_write_cmd(par->client, SSD1307FB_SET_COL_RANGE);
+   if (ret < 0)
+   return ret;
 
-   ret = ssd1307fb_write_cmd(par->client, col_start);
-   if (ret < 0)
-   return ret;
+   ret = ssd1307fb_write_cmd(par->client, col_start);
+   if (ret < 0)
+   return ret;
 
-   ret = ssd1307fb_write_cmd(par->client, col_end);
-   if (ret < 0)
-   return ret;
+   ret = ssd1307fb_write_cmd(par->client, col_end);
+   if (ret < 0)
+   return ret;
+
+   par->col_start = col_start;
+   par->col_end = col_end;
+   }
 
/* Set page range */
-   ret = ssd1307fb_write_cmd(par->client, SSD1307FB_SET_PAGE_RANGE);
-   if (ret < 0)
-   return ret;
+   if (page_start != par->page_start || page_end != par->page_end) {
+   ret = ssd1307fb_write_cmd(par->client,
+ SSD1307FB_SET_PAGE_RANGE);
+   if (ret < 0)
+   return ret;
 
-   ret = ssd1307fb_write_cmd(par->client, page_start);
-   if (ret < 0)
-   return ret;
+   ret = ssd1307fb_write_cmd(par->client, page_start);
+   if (ret < 0)
+   return ret;
+
+   ret = ssd1307fb_write_cmd(par->client, page_end);
+   if (ret < 0)
+   return ret;
 
-   return ssd1307fb_write_cmd(par->client, page_end);
+   par->page_start = page_start;
+   par->page_end = page_end;
+   }
+
+   return 0;
 }
 
 static int ssd1307fb_update_rect(struct ssd1307fb_par *par, unsigned int x,
-- 
2.25.1
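
The effect of the caching can be tried outside the kernel with a mock. The
sketch below mirrors the structure of the patched
ssd1307fb_set_address_range(), but struct panel, write_cmd() and the
cmds_sent counter are stand-ins invented for this example, and no I2C
traffic is involved. The two command values are the SSD1306 column and page
address commands.

```c
#include <assert.h>
#include <stdint.h>

#define SET_COL_RANGE	0x21	/* SSD1306 "set column address" */
#define SET_PAGE_RANGE	0x22	/* SSD1306 "set page address" */

/* Mock device state: cached ranges plus a counter instead of a real bus. */
struct panel {
	uint8_t col_start, col_end;
	uint8_t page_start, page_end;
	unsigned int cmds_sent;
};

/* Stand-in for ssd1307fb_write_cmd(): only counts the command bytes. */
static int write_cmd(struct panel *p, uint8_t cmd)
{
	(void)cmd;
	p->cmds_sent++;
	return 0;
}

static int set_address_range(struct panel *p, uint8_t col_start, uint8_t cols,
			     uint8_t page_start, uint8_t pages)
{
	uint8_t col_end, page_end;

	assert(cols > 0 && pages > 0);
	col_end = (uint8_t)(col_start + cols - 1);
	page_end = (uint8_t)(page_start + pages - 1);

	/* Only talk to the device when the column range really changed. */
	if (col_start != p->col_start || col_end != p->col_end) {
		write_cmd(p, SET_COL_RANGE);
		write_cmd(p, col_start);
		write_cmd(p, col_end);
		p->col_start = col_start;
		p->col_end = col_end;
	}

	/* Same for the page range. */
	if (page_start != p->page_start || page_end != p->page_end) {
		write_cmd(p, SET_PAGE_RANGE);
		write_cmd(p, page_start);
		write_cmd(p, page_end);
		p->page_start = page_start;
		p->page_end = page_end;
	}

	return 0;
}
```

Starting from a zeroed struct panel, a first call for the full 128x32 area
sends six command bytes, an identical second call sends none, and a call
that changes only the columns sends three.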



[PATCH resend 2/5] video: fbdev: ssd1307fb: Simplify ssd1307fb_update_display()

2021-07-14 Thread Geert Uytterhoeven
Simplify the nested loops to handle conversion from linear frame buffer
to ssd1307 page layout:
  1. Move last page handling one level up, as the value of "m" is the
 same inside a page,
  2. array->data[] is filled linearly, so there is no need to
 recalculate array_idx over and over again; a simple increment is
 sufficient.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/video/fbdev/ssd1307fb.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/video/fbdev/ssd1307fb.c b/drivers/video/fbdev/ssd1307fb.c
index e6b6263e3bef847f..6d7bd025bca1a175 100644
--- a/drivers/video/fbdev/ssd1307fb.c
+++ b/drivers/video/fbdev/ssd1307fb.c
@@ -158,6 +158,7 @@ static int ssd1307fb_update_display(struct ssd1307fb_par *par)
u8 *vmem = par->info->screen_buffer;
unsigned int line_length = par->info->fix.line_length;
unsigned int pages = DIV_ROUND_UP(par->height, 8);
+   u32 array_idx = 0;
int ret, i, j, k;
 
array = ssd1307fb_alloc_array(par->width * pages, SSD1307FB_DATA);
@@ -194,19 +195,21 @@ static int ssd1307fb_update_display(struct ssd1307fb_par *par)
 */
 
for (i = 0; i < pages; i++) {
+   int m = 8;
+
+   /* Last page may be partial */
+   if (i + 1 == pages && par->height % 8)
+   m = par->height % 8;
for (j = 0; j < par->width; j++) {
-   int m = 8;
-   u32 array_idx = i * par->width + j;
-   array->data[array_idx] = 0;
-   /* Last page may be partial */
-   if (i + 1 == pages && par->height % 8)
-   m = par->height % 8;
+   u8 data = 0;
+
for (k = 0; k < m; k++) {
u8 byte = vmem[(8 * i + k) * line_length +
   j / 8];
u8 bit = (byte >> (j % 8)) & 1;
-   array->data[array_idx] |= bit << k;
+   data |= bit << k;
}
+   array->data[array_idx++] = data;
}
}
 
-- 
2.25.1
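
To make the simplified loop structure easier to try out, here is a
standalone sketch of the same conversion from a linear 1 bpp frame buffer to
the page layout. It follows the patched loops (m computed once per page, the
output index simply incremented), but fb_to_pages() and its parameters are
hypothetical and take plain buffers instead of the driver's structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a linear 1 bpp buffer (bit j % 8 of byte j / 8 in each row) into
 * page layout: one output byte per column per page, bit k = row 8 * page + k.
 * Returns the number of bytes written to out.
 */
static size_t fb_to_pages(const uint8_t *vmem, unsigned int line_length,
			  unsigned int width, unsigned int height,
			  uint8_t *out)
{
	unsigned int pages = (height + 7) / 8;
	unsigned int i, j, k;
	size_t idx = 0;

	assert(line_length * 8 >= width);

	for (i = 0; i < pages; i++) {
		unsigned int m = 8;

		/* Last page may be partial */
		if (i + 1 == pages && height % 8)
			m = height % 8;

		for (j = 0; j < width; j++) {
			uint8_t data = 0;

			for (k = 0; k < m; k++) {
				uint8_t byte = vmem[(8 * i + k) * line_length +
						    j / 8];
				uint8_t bit = (byte >> (j % 8)) & 1;

				data |= (uint8_t)(bit << k);
			}
			/* Output is filled linearly, so just increment. */
			out[idx++] = data;
		}
	}

	return idx;
}
```

For an 8x10 buffer the function writes 16 bytes (two pages of 8 columns). A
pixel at row 9, column 7 ends up as bit 1 of the last output byte.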



Re: nouveau: failed to initialise sync

2021-07-14 Thread Daniel Vetter
On Wed, Jul 14, 2021 at 03:02:21PM +0200, Christian König wrote:
> On 14.07.21 at 14:56, Kirill A. Shutemov wrote:
> > On Tue, Jul 06, 2021 at 08:58:37AM +0200, Christian König wrote:
> > > Hi guys,
> > > 
> > > yes nouveau was using the same functionality for internal BOs without
> > > noticing it. This is fixed by the following commit:
> > > 
> > > commit d098775ed44021293b1962dea61efb19297b8d02
> > > Author: Christian König 
> > > Date:   Wed Jun 9 19:25:56 2021 +0200
> > > 
> > >      drm/nouveau: init the base GEM fields for internal BOs
> > > 
> > >      TTMs buffer objects are based on GEM objects for quite a while
> > >      and rely on initializing those fields before initializing the TTM BO.
> > > 
> > >      Nouveau now doesn't init the GEM object for internally allocated BOs,
> > >      so make sure that we at least initialize some necessary fields.
> > > 
> > > Could be that the patch needs to be sent to stable as well.
> > The regression is present in v5.14-rc1. Any idea when it will hit
> > upstream? I don't see it being applied to drm-next.
> 
> Well, that question needs to be answered by Dave or somebody else from the
> drm-misc maintainer team.
> 
> This fix together with some others are already in drm-misc-next-fixes
> waiting to be pushed upstream, but it looks like that hasn't happened yet.
> 
> Even Linus already pinged me where the fix for qxl got stuck.

Yeah, there were some missed patches. drm-misc-fixes is now in drm-fixes,
and drm-misc-next-fixes is included in drm-misc-fixes, for which Thomas
will do a pull request on Thu so it will land in -rc2.

It should also now be in linux-next.

But yes somehow bugfixes got a bit lost during the merge window.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/ttm: add a check against null pointer dereference

2021-07-14 Thread Christian König

On 14.07.21 at 16:54, Zheyu Ma wrote:

When calling ttm_range_man_fini(), 'man' may be uninitialized, which may
cause a null pointer dereference bug.

Fix this by checking if it is a null pointer.


It would be better if the driver doesn't try to fini a manager which was
never initialized, but for now that should work.

This log reveals it:

[7.902580 ] BUG: kernel NULL pointer dereference, address: 0058
[7.905721 ] RIP: 0010:ttm_range_man_fini+0x40/0x160
[7.911826 ] Call Trace:
[7.911826 ]  radeon_ttm_fini+0x167/0x210
[7.911826 ]  radeon_bo_fini+0x15/0x40
[7.913767 ]  rs400_fini+0x55/0x80
[7.914358 ]  radeon_device_fini+0x3c/0x140
[7.914358 ]  radeon_driver_unload_kms+0x5c/0xe0
[7.914358 ]  radeon_driver_load_kms+0x13a/0x200
[7.914358 ]  ? radeon_driver_unload_kms+0xe0/0xe0
[7.914358 ]  drm_dev_register+0x1db/0x290
[7.914358 ]  radeon_pci_probe+0x16a/0x230
[7.914358 ]  local_pci_probe+0x4a/0xb0

Signed-off-by: Zheyu Ma 


Reviewed-by: Christian König 

Going to push it later.

Thanks,
Christian.


---
  drivers/gpu/drm/ttm/ttm_range_manager.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
index 03395386e8a7..f4b08a8705b3 100644
--- a/drivers/gpu/drm/ttm/ttm_range_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
@@ -181,6 +181,9 @@ int ttm_range_man_fini(struct ttm_device *bdev,
struct drm_mm *mm = &rman->mm;
int ret;
  
+	if (!man)
+		return 0;
+
ttm_resource_manager_set_used(man, false);
  
  	ret = ttm_resource_manager_evict_all(bdev, man);




Re: [PATCH v8 00/14] drm/tegra: Introduce a modern UABI

2021-07-14 Thread Mikko Perttunen

On 7/14/21 5:50 PM, Dmitry Osipenko wrote:

14.07.2021 11:30, Thierry Reding wrote:

On Sat, Jul 10, 2021 at 12:16:28AM +0300, Dmitry Osipenko wrote:

Hello Thierry,

09.07.2021 22:31, Thierry Reding wrote:

From: Thierry Reding 

Hi all,

Mikko has been away for a few weeks, so I've been testing and revising
the new UABI patches in the meantime. There are very minor changes to
the naming of some of the UABI fields, but other than that it's mostly
unchanged from v7.


Why haven't you addressed any of the previous review comments? There
were some obvious problems in v7 and v8 still has them.


One notable change is that mappings can now be read-only, write-only,
read-write or none of them (rather than just read-only or read-write),
since those combinations are all supported by the IOMMUs and it might
be useful to make some mappings write-only.

For a full list of changes in v8, see the changelog in patch 6.

I've also updated the libdrm_tegra library to work against this version
of the UABI. A branch can be found here:

   https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8

That contains helper APIs for the concepts introduced in this series and
shows how they can be used in various tests that can be run for sanity
checking.

In addition, Mikko has made updates to the following projects, though
they may need to be updated for the minor changes in v8:

* vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
   Experimental support for MPEG2 and H264 decoding on T210, T186
   and T194.

* xf86-video-opentegra - https://github.com/grate-driver/xf86-video-opentegra
   X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.

* grate - https://github.com/grate-driver/grate
   3D rendering testbed for Tegra20, Tegra30, and Tegra114

I plan on putting this into linux-next soon after v5.14-rc1 so that this
can get some soak time.


It should be a bit too early to push it into kernel. The UAPI is not
ready because it's missing essential features. We can't call this a
'modern UABI' until it's fully implemented. The design decisions are
still questionable because this UAPI is built around the proprietary
firmware (and based on UAPI of downstream driver) which doesn't fit well
into DRM world. I haven't got all the answers to my previous questions,
should I repeat them?


I don't know what you mean by "built around the proprietary firmware".
Yes, this ends up using proprietary firmware for some of the hardware
engines that host1x drives, but that's completely orthogonal to the
UABI. No matter what UABI we'd be introducing, we'd be using that same
firmware.

And yes, this is based on the UABI of the downstream drivers. The design
is guided by what we've learned over the last decade working with this
hardware in use-cases that customers need. It'd be dumb not to use that
knowledge to our advantage. This is the only way to ensure we can
deliver an upstream driver that's on par with our downstream drivers and
therefore make it possible to eventually adopt the upstream driver.

And frankly, you did get answers to previous questions, though perhaps
not all, but I'm out of patience. We've been going in circles and at
some point we have to make a decision so we can make progress.


By firmware I was referring to the supervisor OS and inter-VM
integration, sorry for not making it clear. My rough understanding is
that it's all software defined and technically it's possible to avoid
going through the trouble of supporting the firmware convention defined
by downstream, and thus, making the driver less optimal than it could be.
It's still not clear to me how much that firmware is relevant to
upstream in practice.


As mentioned in discussion elsewhere, there is no 'firmware convention'. 
The view I've formed so far is that the model of ephemeral syncpoint 
allocations and value resets, which I believe you are talking about 
here, is fundamentally opposed to the design intent of the hardware, and 
would result in a less efficient system regardless of inter-VM 
integration convention.





I made several attempts over the years to get something usable merged
upstream so that we can finally make use of this hardware and get it
supported upstream and each time I made the mistake of trying to make it
perfect and accommodate all wishlist items. The result is that I wasted a
lot of time and have nothing to show for it.


It's a problem that you try to do everything on your own and aren't
collaborating as much as you could. Writing code isn't a problem; the
problem is that there is no clear understanding of what needs to be
done, IMO. I have a vision of what needs to be done from the perspective
of older SoCs, but I never could start implementing it for upstream
because it requires your feedback and preliminary agreement, since
you're the only maintainer of the driver who could merge patches. I don't
want to waste my time either.


I've also been very hard on Mikko with his work on this and I think we've
stretched 

Re: [PATCH v2] drm/of: free the iterator object on failure

2021-07-14 Thread Laurent Pinchart
Hi Steven,

Thank you for the patch.

On Wed, Jul 14, 2021 at 03:33:00PM +0100, Steven Price wrote:
> When bailing out due to the sanity check the iterator value needs to be
> freed because the early return prevents for_each_child_of_node() from
> doing the dereference itself.
> 
> Fixes: 6529007522de ("drm: of: Add drm_of_lvds_get_dual_link_pixel_order")
> Signed-off-by: Steven Price 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/gpu/drm/drm_of.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> v2: Fixes now refers to the original commit as suggested by Laurent, rather
> than 4ee48cc5586b ("drm: of: Fix double-free bug") which only fixed part of
> the problem. Note that 4ee48cc5586b is a dependency for this patch to
> cleanly apply.
> 
> diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
> index 197c57477344..997b8827fed2 100644
> --- a/drivers/gpu/drm/drm_of.c
> +++ b/drivers/gpu/drm/drm_of.c
> @@ -331,8 +331,10 @@ static int drm_of_lvds_get_remote_pixels_type(
>* configurations by passing the endpoints explicitly to
>* drm_of_lvds_get_dual_link_pixel_order().
>*/
> - if (!current_pt || pixels_type != current_pt)
> + if (!current_pt || pixels_type != current_pt) {
> + of_node_put(endpoint);
>   return -EINVAL;
> + }
>   }
>  
>   return pixels_type;

-- 
Regards,

Laurent Pinchart


Re: [PATCH resend 0/5] video: fbdev: ssd1307fb: Optimizations and improvements

2021-07-14 Thread Sam Ravnborg
Hi Geert,

On Wed, Jul 14, 2021 at 04:57:59PM +0200, Geert Uytterhoeven wrote:
>   Hi all,
> 
> This patch series optimizes console operations on ssd1307fb, after the
> customary fixes and cleanups.

What is required to have a drm driver that could do the same?

Note: I will take a look at the patches a bit later.

Sam


[PATCH] drm/msm/dp: Initialize dp->aux->drm_dev before registration

2021-07-14 Thread Sean Paul
From: Sean Paul 

Avoids the following WARN:
[3.009556] [ cut here ]
[3.014306] WARNING: CPU: 7 PID: 109 at drivers/gpu/drm/drm_dp_helper.c:1796 drm_dp_aux_register+0xa4/0xac
[3.024209] Modules linked in:
[3.027351] CPU: 7 PID: 109 Comm: kworker/7:8 Not tainted 5.10.47 #69
[3.033958] Hardware name: Google Lazor (rev1 - 2) (DT)
[3.039323] Workqueue: events deferred_probe_work_func
[3.044596] pstate: 60c9 (nZCv daif +PAN +UAO -TCO BTYPE=--)
[3.050761] pc : drm_dp_aux_register+0xa4/0xac
[3.055329] lr : dp_aux_register+0x40/0x88
[3.059538] sp : ffc010ad3920
[3.062948] x29: ffc010ad3920 x28: ffa64196ac70
[3.067239] mmc1: Command Queue Engine enabled
[3.068406] x27: ffa64196ac68 x26: 0001
[3.068407] x25: 0002 x24: 0060
[3.068409] x23: ffa642ab3400 x22: ffe126c10e5b
[3.068410] x21: ffa641dc3188 x20: ffa641963c10
[3.068412] x19: ffa642aba910 x18: 0a00
[3.068414] x17: 00476f8e002a x16: 00b8
[3.073008] mmc1: new HS400 Enhanced strobe MMC card at address 0001
[3.078448] x15:  x14: 
[3.078450] x13: 0030 x12: 0030
[3.078452] x11: 0101010101010101 x10: ffe12647a914
[3.078453] x9 : ffe12647a8cc x8 : 
[3.084452] mmcblk1: mmc1:0001 DA4032 29.1 GiB
[3.089372]
[3.089372] x7 : 6c6064717372fefe x6 : ffa642b11494
[3.089374] x5 :  x4 : 6d006c657869
[3.089375] x3 : 6c657869 x2 : 000c
[3.089376] x1 : ffe126c3ae3c x0 : ffa642aba910
[3.089381] Call trace:
[3.094931] mmcblk1boot0: mmc1:0001 DA4032 partition 1 4.00 MiB
[3.100291]  drm_dp_aux_register+0xa4/0xac
[3.100292]  dp_aux_register+0x40/0x88
[3.100294]  dp_display_bind+0x64/0xcc
[3.100295]  component_bind_all+0xdc/0x210
[3.100298]  msm_drm_bind+0x1e8/0x5d4
[3.100301]  try_to_bring_up_master+0x168/0x1b0
[3.105861] mmcblk1boot1: mmc1:0001 DA4032 partition 2 4.00 MiB
[3.112282]  __component_add+0xa0/0x158
[3.112283]  component_add+0x1c/0x28
[3.112284]  dp_display_probe+0x33c/0x380
[3.112286]  platform_drv_probe+0x9c/0xbc
[3.112287]  really_probe+0x140/0x35c
[3.112289]  driver_probe_device+0x84/0xc0
[3.112292]  __device_attach_driver+0x94/0xb0
[3.117967] mmcblk1rpmb: mmc1:0001 DA4032 partition 3 16.0 MiB,
chardev (239:0)
[3.123201]  bus_for_each_drv+0x8c/0xd8
[3.123202]  __device_attach+0xc4/0x150
[3.123204]  device_initial_probe+0x1c/0x28
[3.123205]  bus_probe_device+0x3c/0x9c
[3.123206]  deferred_probe_work_func+0x90/0xcc
[3.123211]  process_one_work+0x218/0x3ec
[3.131976]  mmcblk1: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12
[3.134123]  worker_thread+0x288/0x3e8
[3.134124]  kthread+0x148/0x1b0
[3.134127]  ret_from_fork+0x10/0x30
[3.134128] ---[ end trace cfb9fce3f70f824d ]---

Signed-off-by: Sean Paul 
---
 drivers/gpu/drm/msm/dp/dp_display.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 051c1be1de7e..987f9e330138 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -219,6 +219,7 @@ static int dp_display_bind(struct device *dev, struct device *master,
goto end;
}
 
+   dp->aux->drm_dev = drm;
rc = dp_aux_register(dp->aux);
if (rc) {
DRM_ERROR("DRM DP AUX register failed\n");
-- 
Sean Paul, Software Engineer, Google / Chromium OS



Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-14 Thread Felix Kuehling
On 2021-07-14 at 6:51 a.m., Christian König wrote:
> On 14.07.21 at 12:44, Daniel Vetter wrote:
>> On Mon, Jul 12, 2021 at 06:06:36PM -0400, Felix Kuehling wrote:
>>> KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
>>> is_cow_mapping returns true for these mappings. Add a check for
>>> vm_flags & VM_WRITE to avoid mmap failures on private read-only or
>>> PROT_NONE mappings.
>>>
>>> v2: protect against mprotect making a mapping writable after the fact
>>> v3: update driver-specific vm_operations_structs
>>>
>>> Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
>>> Signed-off-by: Felix Kuehling 
>>> Signed-off-by: Alex Deucher 
>> So looking at vmf_insert_pfn_prot() and the comment there we can't have
>> VM_PFNMAP and is_cow_mapping ever be true, or things break. On platforms
>> without pte_special at least.
>
> Key idea is that we never end up in vmf_insert_pfn_prot() because the
> vma is mapped with PROT_NONE.

Ah, thanks for that pointer. I wasn't aware of that BUG_ON. I thought it
was more of an abstract "copy-on-write faults may be bad on these mappings."


>
>>
>> So I'm not sure this is a great idea, and definitely not for all drivers
>
> Yeah, I'm absolutely not happy with this either but it seemed to be
> the least painful thing to do.
>
>> ...
>>
>> Can we clear VM_MAYWRITE instead to force this to be a non-cow mapping
>> instead?
>
> Well we have considered forcefully setting VM_SHARED, which won't work
> easily for a couple of reasons.
>
> But clearing VM_MAYWRITE in amdgpu/amdkfd may actually work as well.
>
> Felix can you test this?

Sounds like it should work and be straightforward (I thought that about
setting VM_SHARED, too ...). I'll give it a try.
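
For anyone following along, the flag logic under discussion can be modelled
in userspace. In the sketch below the VM_* values and is_cow_mapping() match
the kernel definitions as far as I know, while rejected_by_v3_check(),
clear_maywrite() and cow_check_examples() are names made up for this
illustration. It only models the flag tests, not the fault path or mprotect
handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in include/linux/mm.h. */
#define VM_READ		0x01UL
#define VM_WRITE	0x02UL
#define VM_SHARED	0x08UL
#define VM_MAYREAD	0x10UL
#define VM_MAYWRITE	0x20UL

/* Private mapping that may become writable: a potential COW mapping. */
static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* The v3 approach: only refuse COW mappings that are currently writable. */
static bool rejected_by_v3_check(unsigned long flags)
{
	return is_cow_mapping(flags) && (flags & VM_WRITE);
}

/* The alternative suggested in this thread: drop VM_MAYWRITE up front. */
static unsigned long clear_maywrite(unsigned long flags)
{
	return flags & ~VM_MAYWRITE;
}

static void cow_check_examples(void)
{
	/* PROT_NONE, MAP_PRIVATE: no access now, but write could be added. */
	unsigned long prot_none_private = VM_MAYREAD | VM_MAYWRITE;
	/* PROT_READ | PROT_WRITE, MAP_PRIVATE */
	unsigned long rw_private = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE;
	/* PROT_READ | PROT_WRITE, MAP_SHARED */
	unsigned long rw_shared = rw_private | VM_SHARED;

	assert(is_cow_mapping(prot_none_private));
	assert(!rejected_by_v3_check(prot_none_private));
	assert(rejected_by_v3_check(rw_private));
	assert(!is_cow_mapping(rw_shared));
	/* Without VM_MAYWRITE the mapping is no longer COW at all. */
	assert(!is_cow_mapping(clear_maywrite(prot_none_private)));
}
```

With VM_MAYWRITE cleared, a later mprotect() cannot make the mapping
writable, so in this model the separate mprotect hook would not be needed.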

Thanks,
  Felix


>
> Thanks,
> Christian.
>
>> -Daniel
>>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
>>>   drivers/gpu/drm/nouveau/nouveau_gem.c    |  3 ++-
>>>   drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
>>>   drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
>>>   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
>>>   include/drm/ttm/ttm_bo_api.h |  4 
>>>   6 files changed, 24 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> index b3404c43a911..1aa750a6a5d2 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> @@ -79,7 +79,8 @@ static const struct vm_operations_struct
>>> amdgpu_gem_vm_ops = {
>>>   .fault = amdgpu_gem_fault,
>>>   .open = ttm_bo_vm_open,
>>>   .close = ttm_bo_vm_close,
>>> -    .access = ttm_bo_vm_access
>>> +    .access = ttm_bo_vm_access,
>>> +    .mprotect = ttm_bo_vm_mprotect
>>>   };
>>>     static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c
>>> b/drivers/gpu/drm/nouveau/nouveau_gem.c
>>> index 5b27845075a1..164ea564bb7a 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
>>> @@ -70,7 +70,8 @@ static const struct vm_operations_struct
>>> nouveau_ttm_vm_ops = {
>>>   .fault = nouveau_ttm_fault,
>>>   .open = ttm_bo_vm_open,
>>>   .close = ttm_bo_vm_close,
>>> -    .access = ttm_bo_vm_access
>>> +    .access = ttm_bo_vm_access,
>>> +    .mprotect = ttm_bo_vm_mprotect
>>>   };
>>>     void
>>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c
>>> b/drivers/gpu/drm/radeon/radeon_gem.c
>>> index 458f92a70887..c19ad07eb7b5 100644
>>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
>>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
>>> @@ -77,7 +77,8 @@ static const struct vm_operations_struct
>>> radeon_gem_vm_ops = {
>>>   .fault = radeon_gem_fault,
>>>   .open = ttm_bo_vm_open,
>>>   .close = ttm_bo_vm_close,
>>> -    .access = ttm_bo_vm_access
>>> +    .access = ttm_bo_vm_access,
>>> +    .mprotect = ttm_bo_vm_mprotect
>>>   };
>>>     static void radeon_gem_object_free(struct drm_gem_object *gobj)
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> index f56be5bc0861..fb325bad5db6 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> @@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct
>>> *vma, unsigned long addr,
>>>   }
>>>   EXPORT_SYMBOL(ttm_bo_vm_access);
>>>   +int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long
>>> start,
>>> +   unsigned long end, unsigned long newflags)
>>> +{
>>> +    /* Enforce no COW since would have really strange behavior with
>>> it. */
>>> +    if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
>>> +    return -EINVAL;
>>> +
>>> +    return 0;
>>> +}
>>> +EXPORT_SYMBOL(ttm_bo_vm_mprotect);
>>> +
>>>   static const struct vm_operations_struct ttm_bo_vm_ops = {
>>>   .fault = ttm_bo_vm_fault,
>>>   .open = ttm_bo_vm_open,
>>>   .close = ttm_bo_vm_close,
>>>   .access = ttm_

Re: [PATCH 1/2] drm: add crtc background color property

2021-07-14 Thread Harry Wentland



On 2021-07-14 3:35 a.m., Pekka Paalanen wrote:
> On Tue, 13 Jul 2021 09:54:35 -0400
> Harry Wentland  wrote:
> 
>> On 2021-07-13 3:52 a.m., Pekka Paalanen wrote:
>>> On Mon, 12 Jul 2021 12:15:59 -0400
>>> Harry Wentland  wrote:
>>>   
 On 2021-07-12 4:03 a.m., Pekka Paalanen wrote:  
> On Fri, 9 Jul 2021 18:23:26 +0200
> Raphael Gallais-Pou  wrote:
> 
>> On 7/9/21 10:04 AM, Pekka Paalanen wrote:
>>> On Wed, 7 Jul 2021 08:48:47 +
>>> Raphael GALLAIS-POU - foss  wrote:
>>>  
 Some display controllers can be programmed to present non-black colors
 for pixels not covered by any plane (or pixels covered by the
 transparent regions of higher planes).  Compositors that want a UI with
 a solid color background can potentially save memory bandwidth by
 setting the CRTC background property and using smaller planes to 
 display
 the rest of the content.

 To avoid confusion between different ways of encoding RGB data, we
 define a standard 64-bit format that should be used for this property's
 value.  Helper functions and macros are provided to generate and 
 dissect
 values in this standard format with varying component precision values.

 Signed-off-by: Raphael Gallais-Pou 
 Signed-off-by: Matt Roper 
 ---
   drivers/gpu/drm/drm_atomic_state_helper.c |  1 +
   drivers/gpu/drm/drm_atomic_uapi.c |  4 +++
   drivers/gpu/drm/drm_blend.c   | 34 
 +--
   drivers/gpu/drm/drm_mode_config.c |  6 
   include/drm/drm_blend.h   |  1 +
   include/drm/drm_crtc.h| 12 
   include/drm/drm_mode_config.h |  5 
   include/uapi/drm/drm_mode.h   | 28 +++
   8 files changed, 89 insertions(+), 2 deletions(-)  
>>>
>>> ...
>>>   
>>> The question about full vs. limited range seems unnecessary to me, as
>>> the background color will be used as-is in the blending stage, so
>>> userspace can just program the correct value that fits the pipeline it
>>> is setting up.
>>>
>>> One more question is, as HDR exists, could we need background colors
>>> with component values greater than 1.0?  
>>
>> AR4H color format should cover that case, isn't it ?
>
> Yes, but with the inconvenience I mentioned.
>
> This is a genuine question though, would anyone actually need
> background color values > 1.0. I don't know of any case yet where it
> would be required. It would imply that plane blending happens in a
> color space where >1.0 values are meaningful. I'm not even sure if any
> hardware supporting that exists.
>
> Maybe it would be best to assume that only [0.0, 1.0] pixel value range
> is useful, and mention in the commit message that if someone really
> needs values outside of that, they should create another background
> color property. Then, you can pick a simple unsigned integer pixel
> format, too. (I didn't see any 16 bit-per-channel formats like that in
> drm_fourcc.h though.)
> 

 I don't think we should artificially limit this to [0.0, 1.0]. As you
 mentioned above when talking about full vs limited, the userspace
 understands what's the correct value that fits the pipeline. If that
 pipeline is FP16 with > 1.0 values then it would make sense that the
 background color can be > 1.0.  
>>>
>>> Ok. The standard FP32 format then for ease of use and guaranteed enough
>>> range and precision for far into the future?
>>>   
>>
>> I don't have a strong preference for FP16 vs FP32. My understanding is
>> that FP16 is enough to represent linearly encoded data in a way that
>> looks smooth to humans.
>>
>> scRGB uses FP16 with linear encoding in a range of [-0.5, 7.4999].
>>
>>> Or do you want to keep it in 64 bits total, so the UABI can pack
>>> everything into a u64 instead of needing to create a blob?
>>>
>>> I don't mind as long as it's clearly documented what it is and how it
>>> works, and it carries enough precision.
>>>
>>> But FP16 with its 10 bits of precision might be too little for integer
>>> 12-16 bpc pipelines and sinks?
> 
> The 10 bits worries me still.
> 
> If you have a pipeline that works in [0.0, 1.0] range only, then FP16
> limits precision to 10 bits (in the upper half of the range?).
> 
>>>
>>> If the values can go beyond [0.0, 1.0] range, then does the blending
>>> hardware and the degamma/ctm/gamma coming afterwards cope with them, or
>>> do they get clamped anyway?
>>>   
>>
>> That probably depends on the HW and how it's configured. AMD HW can handle
>> values above and below [0.0, 1.0].
> 
> Right, so how would userspace know what will happen?
> 
> Or do we need to specify that while values 
