Re: [PATCH] dt-bindings: display: renesas,du: Make resets optional on R-Car H1

2021-06-17 Thread Geert Uytterhoeven
Hi Laurent,

On Thu, Jun 17, 2021 at 3:57 AM Laurent Pinchart
 wrote:
> On Thu, Apr 29, 2021 at 06:47:06PM +0300, Laurent Pinchart wrote:
> > On Thu, Apr 29, 2021 at 02:47:31PM +0200, Geert Uytterhoeven wrote:
> > > The "resets" property is not present on R-Car Gen1 SoCs.
> > > Supporting it would require migrating from renesas,cpg-clocks to
> > > renesas,cpg-mssr.
> > >
> > > Reflect this in the DT bindings by removing the global "required:
> > > resets".  All SoCs that do have "resets" properties already have
> > > SoC-specific rules making it required.
> >
> > Should we drop the
> >
> > resets:
> > maxItems: 1
> >
> > from renesas,du-r8a7779 then ? And maybe the
> >
> >   resets: true
> >
> > in the general case ?
>
> Any opinion on this ?

Oops, I did reply to this on April 29, but accidentally dropped
all CCs, which made it disappear from your radar, too?

| R-Car H1 does have a reset controller, we just don't have support for
| it in the DT bindings and Linux driver yet.  So from that point of view
| it makes sense to keep it.
|
| Of course we can remove it, and re-add it later if we ever add support,
| as at that time we probably will want to change the bindings anyway
| to make it required again.

And you replied on April 30, also in private:

|> R-Car H1 does have a reset controller, we just don't have support for
| > it in the DT bindings and Linux driver yet.  So from that point of view
| > it makes sense to keep it.
|
| Not sure what we would "keep", given that there's no reset controller
| available :-)
|
| > Of course we can remove it, and re-add it later if we ever add support,
| > as at that time we probably will want to change the bindings anyway
| > to make it required again.
|
| Let's not bother. I doubt H1 will get support for a reset controller as
| that's an old platform, and the DT bindings thus don't matter too much.
| I'll take this patch as-is.
|
| Reviewed-by: Laurent Pinchart 

> > > Fixes: 99d66127fad25ebb ("dt-bindings: display: renesas,du: Convert binding to YAML")
> > > Signed-off-by: Geert Uytterhoeven 
> > > ---
> > >  Documentation/devicetree/bindings/display/renesas,du.yaml | 1 -
> > >  1 file changed, 1 deletion(-)
> > >
> > > diff --git a/Documentation/devicetree/bindings/display/renesas,du.yaml b/Documentation/devicetree/bindings/display/renesas,du.yaml
> > > index 552a99ce4f1280d7..e955034da53b86e2 100644
> > > --- a/Documentation/devicetree/bindings/display/renesas,du.yaml
> > > +++ b/Documentation/devicetree/bindings/display/renesas,du.yaml
> > > @@ -89,7 +89,6 @@ required:
> > >- reg
> > >- clocks
> > >- interrupts
> > > -  - resets
> > >- ports
> > >
> > >  allOf:

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)

2021-06-17 Thread Christian König

Am 16.06.21 um 20:30 schrieb Jason Ekstrand:

On Tue, Jun 15, 2021 at 3:41 AM Christian König
 wrote:

Hi Jason & Daniel,

maybe I should explain once more where the problem with this approach is
and why I think we need to get that fixed before we can do something
like this here.

To summarize, what this patch does is copy the exclusive
fence and/or the shared fences into a sync_file. That alone is totally
unproblematic.

The problem is what this implies: when you need to copy the exclusive
fence to a sync_file, it means that the driver at some point ignores
the exclusive fence on a buffer object.

Not necessarily.  Part of the point of this is to allow for CPU waits
on a past point in a buffer's timeline.  Today, we have poll() and
GEM_WAIT, both of which wait for the buffer to be idle from whatever
GPU work is currently happening.  We want to wait on something in the
past and ignore anything happening now.


Good point, yes that is indeed a valid use case.


But, to the broader point, maybe?  I'm a little fuzzy on exactly where
i915 inserts and/or depends on fences.


When you combine that with complex drivers which use TTM and buffer
moves underneath, you can construct an information leak and give
userspace access to memory which is allocated to the driver but not
yet initialized.

This way you can leak things like page tables, passwords, kernel data,
etc. in large amounts to userspace, which is an absolute no-go for
security.

Ugh...  Unfortunately, I'm really out of my depth on the implications
going on here but I think I see your point.


That's why I said we need to get this fixed before we upstream this
patch set, and especially the driver change which is using it.

Well, i915 has had uAPI for a while to ignore fences.


Yeah, exactly that's illegal.

At least the kernel internal fences like moving or clearing a buffer
object need to be taken into account before a driver is allowed to
access a buffer.


Otherwise we have an information leak worth a CVE and that is certainly 
not something we want.



Those changes are years in the past.  If we have a real problem here (not
sure on that yet), then we'll have to figure out how to fix it without
nuking uAPI.


Well, that was the basic idea of attaching flags to the fences in the 
dma_resv object.


In other words, you clearly denote when you have to wait for a fence
before accessing a buffer; otherwise you cause a security issue.


Christian.



--Jason



Regards,
Christian.

Am 10.06.21 um 23:09 schrieb Jason Ekstrand:

Modern userspace APIs like Vulkan are built on an explicit
synchronization model.  This doesn't always play nicely with the
implicit synchronization used in the kernel and assumed by X11 and
Wayland.  The client -> compositor half of the synchronization isn't too
bad, at least on intel, because we can control whether or not i915
synchronizes on the buffer and whether or not it's considered written.

The harder part is the compositor -> client synchronization when we get
the buffer back from the compositor.  We're required to be able to
provide the client with a VkSemaphore and VkFence representing the point
in time where the window system (compositor and/or display) finished
using the buffer.  With current APIs, it's very hard to do this in such
a way that we don't get confused by the Vulkan driver's access of the
buffer.  In particular, once we tell the kernel that we're rendering to
the buffer again, any CPU waits on the buffer or GPU dependencies will
wait on some of the client rendering and not just the compositor.

This new IOCTL solves this problem by allowing us to get a snapshot of
the implicit synchronization state of a given dma-buf in the form of a
sync file.  It's effectively the same as a poll() or I915_GEM_WAIT only,
instead of CPU waiting directly, it encapsulates the wait operation, at
the current moment in time, in a sync_file so we can check/wait on it
later.  As long as the Vulkan driver does the sync_file export from the
dma-buf before we re-introduce it for rendering, it will only contain
fences from the compositor or display.  This allows us to accurately turn
it into a VkFence or VkSemaphore without any over-synchronization.

This patch series actually contains two new ioctls.  There is the export
one mentioned above as well as an RFC for an import ioctl which provides
the other half.  The intention is to land the export ioctl since it seems
like there's no real disagreement on that one.  The import ioctl, however,
has a lot of debate around it so it's intended to be RFC-only for now.

Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4037

Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-17 Thread Marc Kleine-Budde
On 15.06.2021 13:15:43, Rob Herring wrote:
> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that this is DT
> schema specific behavior and not standard json-schema behavior. The tooling
> will fix up the final schema, adding any unspecified minItems/maxItems.
> 
> This condition is partially checked with the meta-schema already, but
> only if both 'minItems' and 'maxItems' are equal to the 'items' length.
> An improved meta-schema is pending.
[...]
>  Documentation/devicetree/bindings/net/can/bosch,m_can.yaml  | 2 --

Acked-by: Marc Kleine-Budde 

regards,
Marc

-- 
Pengutronix e.K. | Marc Kleine-Budde   |
Embedded Linux   | https://www.pengutronix.de  |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917- |


signature.asc
Description: PGP signature


[PATCH] drm/meson: fix potential NULL pointer exception in meson_drv_unbind()

2021-06-17 Thread Jiajun Cao
Fix a potential NULL pointer exception when meson_drv_unbind()
attempts to operate on the driver data priv, which may be NULL.
Add a NULL pointer check on the priv struct, obtained from
dev_get_drvdata(), before it is dereferenced, just like the NULL
pointer checks done on priv in meson_drv_shutdown(),
meson_drv_pm_suspend() and meson_drv_pm_resume().

Signed-off-by: Jiajun Cao 
Signed-off-by: Xin Tan 
---
 drivers/gpu/drm/meson/meson_drv.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c
index 07fcd12dca16..adea6a2b28f5 100644
--- a/drivers/gpu/drm/meson/meson_drv.c
+++ b/drivers/gpu/drm/meson/meson_drv.c
@@ -380,6 +380,11 @@ static int meson_drv_bind(struct device *dev)
 static void meson_drv_unbind(struct device *dev)
 {
 	struct meson_drm *priv = dev_get_drvdata(dev);
-	struct drm_device *drm = priv->drm;
+	struct drm_device *drm;
+
+	if (!priv)
+		return;
+
+	drm = priv->drm;
 
if (priv->canvas) {
-- 
2.17.1



[PATCH] drm/nouveau/core: fix the uninitialized use in nvkm_ioctl_map()

2021-06-17 Thread Yizhuo Zhai
In nvkm_ioctl_map(), the variable "type" may be used uninitialized:
nvkm_object_map() only fills it in on success, but the code does not
check the return value and uses "type" directly in the following
if statement, which is potentially unsafe.

Signed-off-by: Yizhuo 
---
 drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
index d777df5a64e6..7f2e8482f167 100644
--- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
+++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
@@ -266,6 +266,8 @@ nvkm_ioctl_map(struct nvkm_client *client,
ret = nvkm_object_map(object, data, size, &type,
  &args->v0.handle,
  &args->v0.length);
+   if (ret)
+   return ret;
if (type == NVKM_OBJECT_MAP_IO)
args->v0.type = NVIF_IOCTL_MAP_V0_IO;
else
-- 
2.17.1


Re: [PATCH] drm/ttm: fix error handling in ttm_bo_handle_move_mem()

2021-06-17 Thread Christian König




Am 16.06.21 um 21:19 schrieb Dan Carpenter:

On Wed, Jun 16, 2021 at 01:00:38PM +0200, Christian König wrote:


Am 16.06.21 um 11:36 schrieb Dan Carpenter:

On Wed, Jun 16, 2021 at 10:47:14AM +0200, Christian König wrote:

Am 16.06.21 um 10:37 schrieb Dan Carpenter:

On Wed, Jun 16, 2021 at 08:46:33AM +0200, Christian König wrote:

Sending the first message didn't work, so let's try again.

Am 16.06.21 um 08:30 schrieb Dan Carpenter:

There are three bugs here:
1) We need to call unpopulate() if ttm_tt_populate() succeeds.
2) The "new_man = ttm_manager_type(bdev, bo->mem.mem_type);" assignment
   was wrong and it was really assigning "new_mem = old_mem;".  There
   is no need for this assignment anyway as we already have the value
   for "new_mem".
3) The (!new_man->use_tt) condition is reversed.

Fixes: ba4e7d973dd0 ("drm: Add the TTM GPU memory manager subsystem.")
Signed-off-by: Dan Carpenter 
---
This is from reading the code and I can't swear that I have understood
it correctly.  My nouveau driver is currently unusable and this patch
has not helped.  But hopefully if I fix enough bugs eventually it will
start to work.

Well NAK, the code previously looked quite good and you are breaking it now.

What's the problem with nouveau?


The new Firefox seems to exercise nouveau more than the old one, so
when I start 10 Firefox windows it just hangs the graphics.

I've added debug code and it seems like the problem is that
nv50_mem_new() is failing.

Sounds like it is running out of memory to me.

Do you have a dmesg?


At first there was a very straight forward use after free bug which I
fixed.
https://lore.kernel.org/nouveau/YMinJwpIei9n1Pn1@mwanda/T/#u

But now the use after free is gone the only thing in dmesg is:
"[TTM] Buffer eviction failed".  And I have some firmware missing.

[  205.489763] rfkill: input handler disabled
[  205.678292] nouveau :01:00.0: Direct firmware load for 
nouveau/nva8_fuc084 failed with error -2
[  205.678300] nouveau :01:00.0: Direct firmware load for 
nouveau/nva8_fuc084d failed with error -2
[  205.678302] nouveau :01:00.0: msvld: unable to load firmware data
[  205.678304] nouveau :01:00.0: msvld: init failed, -19
[  296.150632] [TTM] Buffer eviction failed
[  417.084265] [TTM] Buffer eviction failed
[  447.295961] [TTM] Buffer eviction failed
[  510.800231] [TTM] Buffer eviction failed
[  556.101384] [TTM] Buffer eviction failed
[  616.495790] [TTM] Buffer eviction failed
[  692.014007] [TTM] Buffer eviction failed

The eviction failed message only shows up a minute after the hang so it
seems more like a symptom than a root cause.

Yeah, look at the timing. What happens is that the buffer eviction timed out
because the hardware is locked up.

No idea what that could be. It might not even be kernel related at all.

I don't think it's hardware related...  Using an old version of firefox
"fixes" the problem.  I downloaded the firmware so that's not the issue.
Here's the dmesg load info with the new firmware.


Oh, I was not suggesting a hardware problem.

The most likely cause is a software issue in userspace, e.g. the wrong
order of doing things, doing things too fast without waiting, etc.


There are tons of ways userspace can crash GPU hardware that you can't
prevent in the kernel. Detecting an endless loop in particular is the
well-known Turing halting problem and is not even theoretically solvable.


I suggest digging in userspace instead.

Christian.



[1.412458] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel 
[1.412527] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[1.412710] nouveau :01:00.0: vgaarb: deactivate vga console
[1.417213] Console: switching to colour dummy device 80x25
[1.417272] nouveau :01:00.0: NVIDIA GT218 (0a8280b1)
[1.531565] nouveau :01:00.0: bios: nvkm_bios_new: version 70.18.6f.00.05
[1.531916] nouveau :01:00.0: fb: nvkm_ram_ctor: 1024 MiB DDR3
[2.248212] tsc: Refined TSC clocksource calibration: 3392.144 MHz
[2.248218] clocksource: tsc: mask: 0x max_cycles: 
0x30e5517d4e4, max_idle_ns: 440795261668 ns
[2.252203] clocksource: Switched to clocksource tsc
[2.848138] nouveau :01:00.0: DRM: VRAM: 1024 MiB
[2.848142] nouveau :01:00.0: DRM: GART: 1048576 MiB
[2.848145] nouveau :01:00.0: DRM: TMDS table version 2.0
[2.848147] nouveau :01:00.0: DRM: DCB version 4.0
[2.848149] nouveau :01:00.0: DRM: DCB outp 00: 01000302 00020030
[2.848151] nouveau :01:00.0: DRM: DCB outp 01: 02000300 
[2.848154] nouveau :01:00.0: DRM: DCB outp 02: 02011362 00020010
[2.848155] nouveau :01:00.0: DRM: DCB outp 03: 01022310 
[2.848157] nouveau :01:00.0: DRM: DCB conn 00: 1030
[2.848159] nouveau :01:00.0: DRM: DCB conn 01: 2161
[2.848161] nouveau :01:00.0: DRM: DCB conn 02: 0200
[2.850214] nouveau :01:00.0: DRM: MM: using COPY for buf

Re: [PATCH 1/2] drm/amdgpu: unwrap fence chains in the explicit sync fence

2021-06-17 Thread Christian König

Alex, do you want to review those so that we can close the ticket?

Thanks,
Christian.

Am 14.06.21 um 19:45 schrieb Christian König:

Unwrap the explicit fence if it is a dma_fence_chain and
sync to the first fence not matching the owner rules.

Signed-off-by: Christian König 
Acked-by: Daniel Vetter 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 118 +--
  1 file changed, 68 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 1b2ceccaf5b0..862eb3c1c4c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -28,6 +28,8 @@
   *Christian König 
   */
  
+#include 

+
  #include "amdgpu.h"
  #include "amdgpu_trace.h"
  #include "amdgpu_amdkfd.h"
@@ -186,6 +188,55 @@ int amdgpu_sync_vm_fence(struct amdgpu_sync *sync, struct dma_fence *fence)
return amdgpu_sync_fence(sync, fence);
  }
  
+/* Determine based on the owner and mode if we should sync to a fence or not */

+static bool amdgpu_sync_test_fence(struct amdgpu_device *adev,
+  enum amdgpu_sync_mode mode,
+  void *owner, struct dma_fence *f)
+{
+   void *fence_owner = amdgpu_sync_get_owner(f);
+
+   /* Always sync to moves, no matter what */
+   if (fence_owner == AMDGPU_FENCE_OWNER_UNDEFINED)
+   return true;
+
+   /* We only want to trigger KFD eviction fences on
+* evict or move jobs. Skip KFD fences otherwise.
+*/
+   if (fence_owner == AMDGPU_FENCE_OWNER_KFD &&
+   owner != AMDGPU_FENCE_OWNER_UNDEFINED)
+   return false;
+
+   /* Never sync to VM updates either. */
+   if (fence_owner == AMDGPU_FENCE_OWNER_VM &&
+   owner != AMDGPU_FENCE_OWNER_UNDEFINED)
+   return false;
+
+   /* Ignore fences depending on the sync mode */
+   switch (mode) {
+   case AMDGPU_SYNC_ALWAYS:
+   return true;
+
+   case AMDGPU_SYNC_NE_OWNER:
+   if (amdgpu_sync_same_dev(adev, f) &&
+   fence_owner == owner)
+   return false;
+   break;
+
+   case AMDGPU_SYNC_EQ_OWNER:
+   if (amdgpu_sync_same_dev(adev, f) &&
+   fence_owner != owner)
+   return false;
+   break;
+
+   case AMDGPU_SYNC_EXPLICIT:
+   return false;
+   }
+
+   WARN(debug_evictions && fence_owner == AMDGPU_FENCE_OWNER_KFD,
+"Adding eviction fence to sync obj");
+   return true;
+}
+
  /**
   * amdgpu_sync_resv - sync to a reservation object
   *
@@ -211,67 +262,34 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
  
  	/* always sync to the exclusive fence */

f = dma_resv_excl_fence(resv);
-   r = amdgpu_sync_fence(sync, f);
+   dma_fence_chain_for_each(f, f) {
+   struct dma_fence_chain *chain = to_dma_fence_chain(f);
+
+   if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
+  chain->fence : f)) {
+   r = amdgpu_sync_fence(sync, f);
+   dma_fence_put(f);
+   if (r)
+   return r;
+   break;
+   }
+   }
  
  	flist = dma_resv_shared_list(resv);

-   if (!flist || r)
-   return r;
+   if (!flist)
+   return 0;
  
  	for (i = 0; i < flist->shared_count; ++i) {

-   void *fence_owner;
-
f = rcu_dereference_protected(flist->shared[i],
  dma_resv_held(resv));
  
-		fence_owner = amdgpu_sync_get_owner(f);

-
-   /* Always sync to moves, no matter what */
-   if (fence_owner == AMDGPU_FENCE_OWNER_UNDEFINED) {
+   if (amdgpu_sync_test_fence(adev, mode, owner, f)) {
r = amdgpu_sync_fence(sync, f);
if (r)
-   break;
-   }
-
-   /* We only want to trigger KFD eviction fences on
-* evict or move jobs. Skip KFD fences otherwise.
-*/
-   if (fence_owner == AMDGPU_FENCE_OWNER_KFD &&
-   owner != AMDGPU_FENCE_OWNER_UNDEFINED)
-   continue;
-
-   /* Never sync to VM updates either. */
-   if (fence_owner == AMDGPU_FENCE_OWNER_VM &&
-   owner != AMDGPU_FENCE_OWNER_UNDEFINED)
-   continue;
-
-   /* Ignore fences depending on the sync mode */
-   switch (mode) {
-   case AMDGPU_SYNC_ALWAYS:
-   break;
-
-   case AMDGPU_SYNC_NE_OWNER:
-   if (amdgpu_sync_same_dev(adev, f) &&
-   fence_owner == owner)
-  

[PATCH] drm/panfrost:modify 'break' to 'continue' to traverse the circulation

2021-06-17 Thread ChunyouTang
From: ChunyouTang 

Using 'break' can cause a 'Memory manager not clean during takedown'
warning.

The loop must not be terminated early with 'break'; it should use
'continue' so that it traverses the whole mapping array and puts every
mapping which is not NULL.

Signed-off-by: ChunyouTang 
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 6003cfeb1322..52bccc1d2d42 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -281,7 +281,7 @@ static void panfrost_job_cleanup(struct kref *ref)
if (job->mappings) {
for (i = 0; i < job->bo_count; i++) {
if (!job->mappings[i])
-   break;
+   continue;
 
atomic_dec(&job->mappings[i]->obj->gpu_usecount);
panfrost_gem_mapping_put(job->mappings[i]);
-- 
2.25.1




Re: [Freedreno] [RFC PATCH 00/13] drm/msm: Add Display Stream Compression Support

2021-06-17 Thread Vinod Koul
On 03-06-21, 16:40, abhin...@codeaurora.org wrote:
> On 2021-06-02 04:01, Vinod Koul wrote:
> > On 27-05-21, 16:30, Rob Clark wrote:
> > 
> > yeah that is always a very different world. although it might make sense
> > to use information in tables, and trying to deduce information about the
> > system can be helpful...
> > 
> > > I'd worry more about what makes sense in a DT world, when it comes to
> > > DT bindings.
> > 
> > And do you have thoughts on that..?
> 
> At the moment, I will comment on the bindings first and my idea on how to
> proceed.
> The bindings mentioned here:
> https://lore.kernel.org/dri-devel/20210521124946.3617862-3-vk...@kernel.org/
> seem to be just
> taken directly from downstream which was not the plan.
> 
> I think all of these should be part of the generic panel bindings as none of
> these are QC specific:

Okay so we have discussed this w/ Bjorn and Abhinav and here are the
conclusions and recommendations for binding

1. the properties are generic and not msm specific
2. The host supports multiple formats but the one we choose depends
mostly upon the panel. Notably, the host runs the config which the panel
supports.

So the recommendation is to add a table of dsc properties in the panel
driver. No DT binding here.

I should also note that for DP we should be able to calculate these
values from EDID, like the i915 driver seems to do.

With this I will drop the binding patch and move dsc properties to panel
driver

Thanks

-- 
~Vinod


Re: [PATCH] drm/dp_mst: Add missing drm parameters to recently added call to drm_dbg_kms()

2021-06-17 Thread Lin, Wayne
[Public]

Really sorry for the mistake that I made and any inconvenience it brought.
Thanks José and Lyude.

Regards,
Wayne


> From: Lyude Paul 
> Sent: Thursday, June 17, 2021 03:47
> To: José Roberto de Souza; intel-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org; Lin, Wayne
> Subject: Re: [PATCH] drm/dp_mst: Add missing drm parameters to recently added call to drm_dbg_kms()
>
> Reviewed-by: Lyude Paul 
>
> Will go ahead and push this to drm-misc-next-fixes, thanks
>
> On Wed, 2021-06-16 at 12:44 -0700, José Roberto de Souza wrote:
> > Commit 3769e4c0af5b ("drm/dp_mst: Avoid to mess up payload table by
> > ports in stale topology") added two calls to drm_dbg_kms() but missed
> > the first parameter, the drm device, breaking the build.
> >
> > Fixes: 3769e4c0af5b ("drm/dp_mst: Avoid to mess up payload table by ports in
> > stale topology")
> > Cc: Wayne Lin 
> > Cc: Lyude Paul 
> > Cc: dri-devel@lists.freedesktop.org
> > Signed-off-by: José Roberto de Souza 
> > ---
> >  drivers/gpu/drm/drm_dp_mst_topology.c | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index 9ac148efd9e43..ad0795afc21cf 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -3389,7 +3389,9 @@ int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr)
> >  			mutex_unlock(&mgr->lock);
> >  
> >  			if (skip) {
> > -				drm_dbg_kms("Virtual channel %d is not in current topology\n", i);
> > +				drm_dbg_kms(mgr->dev,
> > +					    "Virtual channel %d is not in current topology\n",
> > +					    i);
> >  				continue;
> >  			}
> >  			/* Validated ports don't matter if we're releasing
> > @@ -3404,7 +3406,8 @@ int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr)
> >  				payload->start_slot = req_payload.start_slot;
> >  				continue;
> >  			} else {
> > -				drm_dbg_kms("Fail:set payload to invalid sink");
> > +				drm_dbg_kms(mgr->dev,
> > +					    "Fail:set payload to invalid sink");
> >  				mutex_unlock(&mgr->payload_lock);
> >  				return -EINVAL;
> >  			}
>
> --
> Cheers,
>  Lyude Paul (she/her)
>  Software Engineer at Red Hat



Re: [PATCH v4] Documentation: gpu: Mention the requirements for new properties

2021-06-17 Thread Pekka Paalanen
On Wed, 16 Jun 2021 16:38:42 +0200
Maxime Ripard  wrote:

> New KMS properties come with a bunch of requirements to avoid each
> driver from running their own, inconsistent, set of properties,
> eventually leading to issues like property conflicts, inconsistencies
> between drivers and semantics, etc.
> 
> Let's document what we expect.
> 
> Cc: Alexandre Belloni 
> Cc: Alexandre Torgue 
> Cc: Alex Deucher 
> Cc: Alison Wang 
> Cc: Alyssa Rosenzweig 
> Cc: Andrew Jeffery 
> Cc: Andrzej Hajda 
> Cc: Anitha Chrisanthus 
> Cc: Benjamin Gaignard 
> Cc: Ben Skeggs 
> Cc: Boris Brezillon 
> Cc: Brian Starkey 
> Cc: Chen Feng 
> Cc: Chen-Yu Tsai 
> Cc: Christian Gmeiner 
> Cc: "Christian König" 
> Cc: Chun-Kuang Hu 
> Cc: Edmund Dea 
> Cc: Eric Anholt 
> Cc: Fabio Estevam 
> Cc: Gerd Hoffmann 
> Cc: Haneen Mohammed 
> Cc: Hans de Goede 
> Cc: "Heiko Stübner" 
> Cc: Huang Rui 
> Cc: Hyun Kwon 
> Cc: Inki Dae 
> Cc: Jani Nikula 
> Cc: Jernej Skrabec 
> Cc: Jerome Brunet 
> Cc: Joel Stanley 
> Cc: John Stultz 
> Cc: Jonas Karlman 
> Cc: Jonathan Hunter 
> Cc: Joonas Lahtinen 
> Cc: Joonyoung Shim 
> Cc: Jyri Sarha 
> Cc: Kevin Hilman 
> Cc: Kieran Bingham 
> Cc: Krzysztof Kozlowski 
> Cc: Kyungmin Park 
> Cc: Laurent Pinchart 
> Cc: Linus Walleij 
> Cc: Liviu Dudau 
> Cc: Lucas Stach 
> Cc: Ludovic Desroches 
> Cc: Marek Vasut 
> Cc: Martin Blumenstingl 
> Cc: Matthias Brugger 
> Cc: Maxime Coquelin 
> Cc: Maxime Ripard 
> Cc: Melissa Wen 
> Cc: Neil Armstrong 
> Cc: Nicolas Ferre 
> Cc: "Noralf Trønnes" 
> Cc: NXP Linux Team 
> Cc: Oleksandr Andrushchenko 
> Cc: Patrik Jakobsson 
> Cc: Paul Cercueil 
> Cc: Pekka Paalanen 
> Cc: Pengutronix Kernel Team 
> Cc: Philippe Cornu 
> Cc: Philipp Zabel 
> Cc: Qiang Yu 
> Cc: Rob Clark 
> Cc: Robert Foss 
> Cc: Rob Herring 
> Cc: Rodrigo Siqueira 
> Cc: Rodrigo Vivi 
> Cc: Roland Scheidegger 
> Cc: Russell King 
> Cc: Sam Ravnborg 
> Cc: Sandy Huang 
> Cc: Sascha Hauer 
> Cc: Sean Paul 
> Cc: Seung-Woo Kim 
> Cc: Shawn Guo 
> Cc: Simon Ser 
> Cc: Stefan Agner 
> Cc: Steven Price 
> Cc: Sumit Semwal 
> Cc: Thierry Reding 
> Cc: Tian Tao 
> Cc: Tomeu Vizoso 
> Cc: Tomi Valkeinen 
> Cc: VMware Graphics 
> Cc: Xinliang Liu 
> Cc: Xinwei Kong 
> Cc: Yannick Fertre 
> Cc: Zack Rusin 
> Reviewed-by: Daniel Vetter 
> Signed-off-by: Maxime Ripard 
> 
> ---
> 
> Changes from v3:
>   - Roll back to the v2
>   - Add Simon and Pekka in Cc
> 
> Changes from v2:
>   - Take into account the feedback from Laurent and Lidiu to no longer
> force generic properties, but prefix vendor-specific properties with
> the vendor name
> 
> Changes from v1:
>   - Typos and wording reported by Daniel and Alex
> ---
>  Documentation/gpu/drm-kms.rst | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
> index 87e5023e3f55..c28b464dd397 100644
> --- a/Documentation/gpu/drm-kms.rst
> +++ b/Documentation/gpu/drm-kms.rst
> @@ -463,6 +463,25 @@ KMS Properties
>  This section of the documentation is primarily aimed at user-space 
> developers.
>  For the driver APIs, see the other sections.
>  
> +Requirements
> +
> +
> +KMS drivers might need to add extra properties to support new features.
> +Each new property introduced in a driver needs to meet a few
> +requirements, in addition to the ones mentioned above:
> +
> +- It must be standardized, with some documentation to describe how the
> +  property can be used.

Hi,

I might replace "some" with "full" documentation. Also document not only
how the property can be used but also what it does.

FYI, some common things that tend to be forgotten IME:
- Spell out exactly the name string for the property in the
  documentation so that it is unambiguous what string userspace should
  look for.
- The same for string names of enum values.
- Explicitly document what each enum value means, do not trust that the
  value name describes it well enough.
- Explain how the property interacts with other, existing properties.

Not sure if these should be written down here or anywhere though.
Interaction with other properties is kind of important.

> +
> +- It must provide a generic helper in the core code to register that
> +  property on the object it attaches to.
> +
> +- Its content must be decoded by the core and provided in the object's
> +  associated state structure. That includes anything drivers might want to
> +  precompute, like :c:type:`struct drm_clip_rect ` for planes.
> +
> +- An IGT test must be submitted where reasonable.

Would it be too much to replace "where reasonable" with "if it is at
all possible to write a test."?

> +

How about adding the following somewhere?

- The initial state of the property (set during driver initialization)
  must match how the driver+hardware behaved before introducing this
  property. It may be some fixed value or it may be inherited from e.g.
  the firmware that booted the system. How the initial state is
  determined must also be documented.

Re: [Intel-gfx] [PATCH] drm/i915/gem: Remove duplicated call to ops->pread

2021-06-17 Thread Daniel Vetter
On Wed, Jun 16, 2021 at 11:45:28AM +0100, Matthew Auld wrote:
> On Wed, 16 Jun 2021 at 10:04, Daniel Vetter  wrote:
> >
> > Between
> >
> > commit ae30af84edb5b7cc95485922e43afd909a892e1b
> > Author: Maarten Lankhorst 
> > Date:   Tue Mar 23 16:50:00 2021 +0100
> >
> > drm/i915: Disable userptr pread/pwrite support.
> >
> > and
> >
> > commit 0049b688459b846f819b6e51c24cd0781fcfde41
> > Author: Matthew Auld 
> > Date:   Thu Nov 5 15:49:33 2020 +
> >
> > drm/i915/gem: Allow backends to override pread implementation
> >
> > this accidentally landed twice.
> >
> > Cc: Matthew Auld 
> > Cc: Thomas Hellström 
> > Cc: Jason Ekstrand 
> > Cc: Daniel Vetter 
> > Signed-off-by: Daniel Vetter 
> Reviewed-by: Matthew Auld  
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 6 --
> >  1 file changed, 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 6a0a3f0e36e1..07aa80773a02 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -469,12 +469,6 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
> > if (ret != -ENODEV)
> > goto out;
> >
> > -   ret = -ENODEV;
> > -   if (obj->ops->pread)
> > -   ret = obj->ops->pread(obj, args);
> > -   if (ret != -ENODEV)
> > -   goto out;
> > -
> > ret = i915_gem_object_wait(obj,
> >I915_WAIT_INTERRUPTIBLE,
> >MAX_SCHEDULE_TIMEOUT);
> > --
> > 2.32.0.rc2
> >
> > ___
> > Intel-gfx mailing list
> > intel-...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm/dp_mst: Do not set proposed vcpi directly

2021-06-17 Thread Lin, Wayne
[AMD Official Use Only]


> From: Wentland, Harry 
> Sent: Thursday, June 17, 2021 03:53
> To: Lin, Wayne; dri-devel@lists.freedesktop.org
> Cc: ly...@redhat.com; Kazlauskas, Nicholas; Zuo, Jerry; Pillai, Aurabindo; 
> Maarten Lankhorst; Maxime Ripard; Thomas Zimmermann; sta...@vger.kernel.org
> Subject: Re: [PATCH v2 1/2] drm/dp_mst: Do not set proposed vcpi directly
>
>
>
> On 2021-06-15 11:55 p.m., Wayne Lin wrote:
> > [Why]
> > When we receive CSN message to notify one port is disconnected, we will
> > implicitly set its corresponding num_slots to 0. Later on, we will
> > eventually call drm_dp_update_payload_part1() to arrange down streams.
> >
> > In drm_dp_update_payload_part1(), we iterate over all proposed_vcpis[]
> > to do the update. Not specific to a target sink only. For example, if we
> > light up 2 monitors, Monitor_A and Monitor_B, and then we unplug
> > Monitor_B. Later on, when we call drm_dp_update_payload_part1() to try
> > to update payload for Monitor_A, we'll also implicitly clean payload for
> > Monitor_B at the same time. And finally, when we try to call
> > drm_dp_update_payload_part1() to clean payload for Monitor_B, we will do
> > nothing at this time since payload for Monitor_B has been cleaned up
> > previously.
> >
> > For StarTech 1to3 DP hub, it seems like if we didn't update DPCD payload
> > ID table then polling for "ACT Handled"(BIT_1 of DPCD 002C0h) will fail
> > and this polling will last for 3 seconds.
> >
> > Therefore, I guess the best way is that we don't set the proposed_vcpi[]
> > directly. Let the user of these helper functions set the proposed_vcpi
> > directly.
> >
> > [How]
> > 1. Revert commit 7617e9621bf2 ("drm/dp_mst: clear time slots for ports
> > invalid")
> > 2. Tackle the issue in the previous commit by skipping those transient
> > proposed VCPIs. These stale VCPIs should be explicitly cleared by the
> > user later on.
> >
> > Changes since v1:
> > * Change debug macro to use drm_dbg_kms() instead
> > * Amend the commit message to add Fixes & Cc tags
> >
> > Signed-off-by: Wayne Lin 
> > Fixes: 7617e9621bf2 ("drm/dp_mst: clear time slots for ports invalid")
> > Cc: Lyude Paul 
> > Cc: Wayne Lin 
> > Cc: Maarten Lankhorst 
> > Cc: Maxime Ripard 
> > Cc: Thomas Zimmermann 
> > Cc: dri-devel@lists.freedesktop.org
> > Cc:  # v5.5+
> > ---
> >  drivers/gpu/drm/drm_dp_mst_topology.c | 36 ---
> >  1 file changed, 10 insertions(+), 26 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
> > b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index 32b7f8983b94..b41b837db66d 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -2501,7 +2501,7 @@ drm_dp_mst_handle_conn_stat(struct drm_dp_mst_branch 
> > *mstb,
> >  {
> >   struct drm_dp_mst_topology_mgr *mgr = mstb->mgr;
> >   struct drm_dp_mst_port *port;
> > - int old_ddps, old_input, ret, i;
> > + int old_ddps, ret;
> >   u8 new_pdt;
> >   bool new_mcs;
> >   bool dowork = false, create_connector = false;
> > @@ -2533,7 +2533,6 @@ drm_dp_mst_handle_conn_stat(struct drm_dp_mst_branch 
> > *mstb,
> >   }
> >
> >   old_ddps = port->ddps;
> > - old_input = port->input;
> >   port->input = conn_stat->input_port;
> >   port->ldps = conn_stat->legacy_device_plug_status;
> >   port->ddps = conn_stat->displayport_device_plug_status;
> > @@ -2555,28 +2554,6 @@ drm_dp_mst_handle_conn_stat(struct drm_dp_mst_branch 
> > *mstb,
> >   dowork = false;
> >   }
> >
> > - if (!old_input && old_ddps != port->ddps && !port->ddps) {
> > - for (i = 0; i < mgr->max_payloads; i++) {
> > - struct drm_dp_vcpi *vcpi = mgr->proposed_vcpis[i];
> > - struct drm_dp_mst_port *port_validated;
> > -
> > - if (!vcpi)
> > - continue;
> > -
> > - port_validated =
> > - container_of(vcpi, struct drm_dp_mst_port, 
> > vcpi);
> > - port_validated =
> > - drm_dp_mst_topology_get_port_validated(mgr, 
> > port_validated);
> > - if (!port_validated) {
> > - mutex_lock(&mgr->payload_lock);
> > - vcpi->num_slots = 0;
> > - mutex_unlock(&mgr->payload_lock);
> > - } else {
> > - drm_dp_mst_topology_put_port(port_validated);
> > - }
> > - }
> > - }
> > -
> >   if (port->connector)
> >   drm_modeset_unlock(&mgr->base.lock);
> >   else if (create_connector)
> > @@ -3410,8 +3387,15 @@ int drm_dp_update_payload_part1(struct 
> > drm_dp_mst_topology_mgr *mgr)
> >   port = drm_dp_mst_topology_get_port_validated(
> >   mgr, 
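
The skip-and-let-the-user-clear approach described under [How] can be sketched in isolation. Everything below is a stand-in: `payload_slot` and its `port_valid` flag model the kernel's `struct drm_dp_vcpi` and `drm_dp_mst_topology_get_port_validated()`; this is illustrative, not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a proposed VCPI slot; the real code uses struct drm_dp_vcpi. */
struct payload_slot {
    int num_slots;      /* 0 means "no payload proposed" */
    bool port_valid;    /* stand-in for a successful port validation */
};

/*
 * Walk the proposed payloads and count how many are actually updated.
 * Slots whose port fails validation are transient (the sink was unplugged
 * but the user has not released the VCPI yet): skip them instead of
 * zeroing them from the hotplug path, and let the user of the helpers
 * clear them explicitly later, as the patch suggests.
 */
static int update_payloads(struct payload_slot *slots, size_t n)
{
    int updated = 0;

    for (size_t i = 0; i < n; i++) {
        if (slots[i].num_slots == 0)
            continue;           /* nothing proposed here */
        if (!slots[i].port_valid)
            continue;           /* transient slot: skip, do not zero it */
        updated++;              /* real code would program the payload here */
    }
    return updated;
}
```

With two payloads proposed and one port invalidated by an unplug, only the surviving payload is updated, and the stale slot keeps its slot count until the user clears it — the behaviour the patch argues for.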

Re: [PATCH v3] Documentation: gpu: Mention the requirements for new properties

2021-06-17 Thread Pekka Paalanen
On Thu, 17 Jun 2021 00:05:24 +0300
Laurent Pinchart  wrote:

> On Tue, Jun 15, 2021 at 01:16:56PM +0300, Pekka Paalanen wrote:
> > On Tue, 15 Jun 2021 12:45:57 +0300 Laurent Pinchart wrote:  
> > > On Tue, Jun 15, 2021 at 07:15:18AM +, Simon Ser wrote:  
> > > > On Tuesday, June 15th, 2021 at 09:03, Pekka Paalanen wrote:
> > > > 
> > > > > indeed it will, but what else could one do to test userspace KMS
> > > > > clients in generic CI where all you can have is virtual hardware? 
> > > > > Maybe
> > > > > in the long run VKMS needs to loop back to a userspace daemon that
> > > > > implements all the complex processing and returns the writeback result
> > > > > via VKMS again? That daemon would then need a single upstream, like 
> > > > > the
> > > > > kernel, where it is maintained and correctness verified.
> > > > 
> > > > The complex processing must be implemented even without write-back, 
> > > > because
> > > > user-space can ask for CRCs of the CRTC.
> > > > 
> > > > > Or an LD_PRELOAD that hijacks all KMS ioctls and implements virtual
> > > > > stuff in userspace? Didn't someone already have something like that?
> > > > > It would need to be lifted to be a required part of kernel UAPI
> > > > > submissions, I suppose like IGT is nowadays.
> > > > 
> > > > FWIW, I have a mock libdrm [1] for libliftoff. This is nowhere near a 
> > > > full
> > > > software implementation with write-back connectors, but allows to expose
> > > > virtual planes and check atomic commits in CI.
> > > > 
> > > > [1]: 
> > > > https://github.com/emersion/libliftoff/blob/master/test/libdrm_mock.c
> > > > 
> > > > > For compositor developers like me knowing the exact formulas would be 
> > > > > a huge
> > > > > benefit as it would allow me to use KMS to off-load 
> > > > > precision-sensitive
> > > > > operations (e.g.  professional color management). Otherwise, 
> > > > > compositors
> > > > > probably need a switch: "high quality color management? Then do not 
> > > > > use KMS
> > > > > features."
> > > > 
> > > > I think for alpha blending there are already rounding issues depending 
> > > > on the
> > > > hardware. I wouldn't keep my hopes up for any guarantee that all hw 
> > > > uses the
> > > > exact same formulae for color management stuff.
> > > 
> > > Good, because otherwise you would be very quickly disappointed :-)
> > > 
> > > For scaling we would also need to replicate the exact same filter taps,
> > > which are often not documented.  
> > 
> > That is where the documented tolerances come into play.  
> 
> This is something I've experimented with a while ago, when developing
> automated tests for the rcar-du driver. When playing with different
> input images we had to constantly increases tolerances, up to a point
> where the tests started to miss real problems :-(

What should we infer from that? That the hardware is broken and
exposing those KMS properties is a false promise?

If a driver on certain hardware cannot correctly implement a KMS
property over the full domain of the input space, should that driver
then simply not expose the KMS property at all?

But I would assume that the vendor still wants to expose the features
in upstream kernels, yet they cannot use the standard KMS properties
for that. Should the driver then expose vendor-specific properties with
the disclaimer that the result is not always what one would expect, so
that userspace written and tested explicitly for that hardware can
still work?

That is, a sufficient justification for a vendor-specific KMS property
would be that a standard property already exists, but the hardware is
too buggy to make it work. IOW, give up trying to make sense.

I would like to move towards a direction where *hardware* design and
testing is eventually guided by Linux KMS property definitions and
their tests. If we could have a rule that if a driver cannot correctly
implement a property then it must not expose the property, maybe in the
long term that might start having an effect?

My underlying assumption is that generic userspace will not use
vendor-specific properties.

Or, since we have atomic commits with TEST_ONLY, should it be driver's
responsibility to carefully inspect the full state and reject the
commit if the hardware is incapable of implementing it correctly?
Vendor-specific userspace would know to avoid failing configurations to
begin with. I suppose that might put an endless whack-a-mole game on
drivers though.
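
For reference, the TEST_ONLY probing pattern discussed here looks roughly like the sketch below. The commit callback is a stub standing in for libdrm's drmModeAtomicCommit() with DRM_MODE_ATOMIC_TEST_ONLY; the flag value and helper names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

#define TEST_ONLY_FLAG 0x1u  /* stand-in for DRM_MODE_ATOMIC_TEST_ONLY */

/* Stand-in for drmModeAtomicCommit(); a compositor would call libdrm here. */
typedef int (*commit_fn)(unsigned int flags);

/*
 * Probe a candidate state with TEST_ONLY first; only commit for real if
 * the driver accepted it. A driver that carefully inspects the state can
 * reject configurations its hardware cannot implement correctly, and
 * userspace falls back to another configuration instead of failing.
 */
static bool try_config(commit_fn commit)
{
    if (commit(TEST_ONLY_FLAG) != 0)
        return false;           /* rejected: try a different configuration */
    return commit(0) == 0;      /* accepted: do the real commit */
}

/* Stub "drivers" for illustration. */
static int rejecting_commit(unsigned int flags) { (void)flags; return -1; }
static int accepting_commit(unsigned int flags) { (void)flags; return 0; }
```

The whack-a-mole concern is then about how much state inspection each driver must do before answering the probe, not about the userspace pattern itself.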


Thanks,
pq




Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-17 Thread Ulf Hansson
On Tue, 15 Jun 2021 at 21:15, Rob Herring  wrote:
>
> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that this is
> DT-schema-specific behavior and not standard json-schema behavior. The tooling
> will fixup the final schema adding any unspecified minItems/maxItems.
>
> This condition is partially checked with the meta-schema already, but
> only if both 'minItems' and 'maxItems' are equal to the 'items' length.
> An improved meta-schema is pending.
>
> Cc: Jens Axboe 
> Cc: Stephen Boyd 
> Cc: Herbert Xu 
> Cc: "David S. Miller" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Vinod Koul 
> Cc: Bartosz Golaszewski 
> Cc: Kamal Dasu 
> Cc: Jonathan Cameron 
> Cc: Lars-Peter Clausen 
> Cc: Thomas Gleixner 
> Cc: Marc Zyngier 
> Cc: Joerg Roedel 
> Cc: Jassi Brar 
> Cc: Mauro Carvalho Chehab 
> Cc: Krzysztof Kozlowski 
> Cc: Ulf Hansson 
> Cc: Jakub Kicinski 
> Cc: Wolfgang Grandegger 
> Cc: Marc Kleine-Budde 
> Cc: Andrew Lunn 
> Cc: Vivien Didelot 
> Cc: Vladimir Oltean 
> Cc: Bjorn Helgaas 
> Cc: Kishon Vijay Abraham I 
> Cc: Linus Walleij 
> Cc: "Uwe Kleine-König" 
> Cc: Lee Jones 
> Cc: Ohad Ben-Cohen 
> Cc: Mathieu Poirier 
> Cc: Philipp Zabel 
> Cc: Paul Walmsley 
> Cc: Palmer Dabbelt 
> Cc: Albert Ou 
> Cc: Alessandro Zummo 
> Cc: Alexandre Belloni 
> Cc: Greg Kroah-Hartman 
> Cc: Mark Brown 
> Cc: Zhang Rui 
> Cc: Daniel Lezcano 
> Cc: Wim Van Sebroeck 
> Cc: Guenter Roeck 
> Signed-off-by: Rob Herring 

Acked-by: Ulf Hansson  # for MMC

[...]

Kind regards
Uffe


Re: [PATCH] intel: Do not assert on unknown chips in drm_intel_decode_context_alloc

2021-06-17 Thread Tvrtko Ursulin



+ a bunch of recent committers to libdrm

Guys, anyone okay to push this patch? I can resend if required.

Regards,

Tvrtko

On 19/11/2020 13:58, Tvrtko Ursulin wrote:


On 19/11/2020 13:52, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2020-11-19 13:42:07)


On 18/11/2020 17:04, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2020-11-18 16:36:01)

From: Tvrtko Ursulin 

There is this long standing nit of igt/tools/intel_error_decode
asserting when you feed it an error state from a GPU the local libdrm
does not know of.

To fix this I need a tweak in drm_intel_decode_context_alloc to make it
not assert but just return NULL (which seems an already possible return
value).

Signed-off-by: Tvrtko Ursulin 


Good riddance,
Reviewed-by: Chris Wilson 


Thanks, now how can push to drm and is there some testing to be
triggered before, or after?


cd intel; for i in tests/gen*.sh; do $i; done

But clearly I haven't built libdrm since automake was dropped.


Thanks, all good:

$ for t in ../../intel/tests/gen*.sh; do bash -x $t; done
++ echo ../../intel/tests/gen4-3d.batch.sh
++ sed 's|\.sh$||'
+ TEST_FILENAME=../../intel/tests/gen4-3d.batch
+ ./test_decode ../../intel/tests/gen4-3d.batch
+ ret=0
+ test 0 = 1
+ exit 0
++ echo ../../intel/tests/gen5-3d.batch.sh
++ sed 's|\.sh$||'
+ TEST_FILENAME=../../intel/tests/gen5-3d.batch
+ ./test_decode ../../intel/tests/gen5-3d.batch
+ ret=0
+ test 0 = 1
+ exit 0
++ echo ../../intel/tests/gen6-3d.batch.sh
++ sed 's|\.sh$||'
+ TEST_FILENAME=../../intel/tests/gen6-3d.batch
+ ./test_decode ../../intel/tests/gen6-3d.batch
+ ret=0
+ test 0 = 1
+ exit 0
++ echo ../../intel/tests/gen7-2d-copy.batch.sh
++ sed 's|\.sh$||'
+ TEST_FILENAME=../../intel/tests/gen7-2d-copy.batch
+ ./test_decode ../../intel/tests/gen7-2d-copy.batch
+ ret=0
+ test 0 = 1
+ exit 0
++ echo ../../intel/tests/gen7-3d.batch.sh
++ sed 's|\.sh$||'
+ TEST_FILENAME=../../intel/tests/gen7-3d.batch
+ ./test_decode ../../intel/tests/gen7-3d.batch
+ ret=0
+ test 0 = 1
+ exit 0

Regards,

Tvrtko
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] dma-buf: fix and rework dma_buf_poll

2021-06-17 Thread kernel test robot
Hi Christian,

I love your patch! Perhaps something to improve:

[auto build test WARNING on next-20210616]
[cannot apply to tegra-drm/drm/tegra/for-next linus/master v5.13-rc6 v5.13-rc5 
v5.13-rc4 v5.13-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Christian-K-nig/dma-buf-fix-and-rework-dma_buf_poll/20210617-103036
base:    c7d4c1fd91ab4a6d2620497921a9c6bf54650ab8
config: x86_64-randconfig-r022-20210617 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
64720f57bea6a6bf033feef4a5751ab9c0c3b401)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/dfa9f2ec4c082b73e644e2c565e58e2291f94463
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Christian-K-nig/dma-buf-fix-and-rework-dma_buf_poll/20210617-103036
git checkout dfa9f2ec4c082b73e644e2c565e58e2291f94463
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> drivers/dma-buf/dma-buf.c:284:17: warning: variable 'fence_excl' is 
>> uninitialized when used here [-Wuninitialized]
   dma_fence_put(fence_excl);
 ^~
   drivers/dma-buf/dma-buf.c:213:30: note: initialize the variable 'fence_excl' 
to silence this warning
   struct dma_fence *fence_excl;
   ^
= NULL
   1 warning generated.
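
The warning pattern is easy to reproduce in isolation: a cleanup path drops a reference that is only assigned on some branches. Initializing the pointer to NULL, as the robot's note suggests, makes the put safe on every path. A minimal stand-alone analogue — `obj_put` and `struct obj` are stand-ins, not the dma-buf API:

```c
#include <assert.h>
#include <stddef.h>

struct obj { int refs; };

/* Stand-in for dma_fence_put(): must tolerate NULL, like the kernel's does. */
static void obj_put(struct obj *o)
{
    if (o)
        o->refs--;
}

/*
 * Mirrors the dma_buf_poll() shape: 'excl' is only assigned inside a
 * conditional, but the exit path unconditionally drops it. Declared
 * without an initializer this is exactly clang's -Wuninitialized
 * complaint; starting from NULL makes every path well-defined.
 */
static int poll_like(struct obj *maybe, int take_branch)
{
    struct obj *excl = NULL;    /* the fix: initialize to NULL */

    if (take_branch)
        excl = maybe;

    obj_put(excl);              /* safe even when the branch wasn't taken */
    return excl ? 1 : 0;
}
```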


vim +/fence_excl +284 drivers/dma-buf/dma-buf.c

   206  
   207  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
   208  {
   209  struct dma_buf_poll_cb_t *dcb;
   210  struct dma_buf *dmabuf;
   211  struct dma_resv *resv;
   212  struct dma_resv_list *fobj;
   213  struct dma_fence *fence_excl;
   214  unsigned shared_count, seq;
   215  struct dma_fence *fence;
   216  __poll_t events;
   217  int r, i;
   218  
   219  dmabuf = file->private_data;
   220  if (!dmabuf || !dmabuf->resv)
   221  return EPOLLERR;
   222  
   223  resv = dmabuf->resv;
   224  
   225  poll_wait(file, &dmabuf->poll, poll);
   226  
   227  events = poll_requested_events(poll) & (EPOLLIN | EPOLLOUT);
   228  if (!events)
   229  return 0;
   230  
   231  dcb = events & EPOLLOUT ? &dmabuf->cb_out : &dmabuf->cb_in;
   232  
   233  /* Only queue a new one if we are not still waiting for the old 
one */
   234  spin_lock_irq(&dmabuf->poll.lock);
   235  if (dcb->active)
   236  events = 0;
   237  else
   238  dcb->active = events;
   239  spin_unlock_irq(&dmabuf->poll.lock);
   240  if (!events)
   241  return 0;
   242  
   243  retry:
   244  seq = read_seqcount_begin(&resv->seq);
   245  rcu_read_lock();
   246  
   247  fobj = rcu_dereference(resv->fence);
   248  if (fobj && events & EPOLLOUT)
   249  shared_count = fobj->shared_count;
   250  else
   251  shared_count = 0;
   252  
   253  for (i = 0; i < shared_count; ++i) {
   254  fence = rcu_dereference(fobj->shared[i]);
   255  fence = dma_fence_get_rcu(fence);
   256  if (!fence || read_seqcount_retry(&resv->seq, seq)) {
   257  /* Concurrent modify detected, force re-check */
   258  dma_fence_put(fence);
   259  rcu_read_unlock();
   260  goto retry;
   261  }
   262  
   263  r = dma_fence_add_callback(fence, &dcb->cb, 
dma_buf_poll_cb);
   264  dma_fence_put(fence);
   265  if (!r) {
   266  /* Callback queued */
   267  events = 0;
   268  goto out;
   269  }
   270  }
   271  
   272  fence = dma_resv_excl_fence(resv);
   273  if (fence) {
   274  fence = 

Re: vc4_bo_create: Failed to allocate from CMA

2021-06-17 Thread nicolas saenz julienne
On Sat, 2021-06-12 at 17:17 +0200, Stefan Wahren wrote:
> Hi,
> 
> while testing the mainline kernel (arm64, defconfig) on Raspberry Pi 3 B
> Plus with Raspberry Pi OS - 64 bit, sometimes X doesn't start into
> desktop properly (an unexpected and unusable login screen instead of auto
> login, or the mouse pointer is shown shortly and then switches back to a black
> screen in a loop). In that case dmesg shows the following:
> 
> [   74.737106] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from
> CMA:
> [   74.737558] vc4-drm soc:gpu: [drm]    V3D: 
> 28976kb BOs (10)
> [   74.737602] vc4-drm soc:gpu: [drm] V3D
> shader: 44kb BOs (11)
> [   74.737632] vc4-drm soc:gpu: [drm]   dumb:  
> 4564kb BOs (5)
> [   74.737664] vc4-drm soc:gpu: [drm] binner: 
> 16384kb BOs (1)
> [   74.737697] vc4-drm soc:gpu: [drm]    total purged
> BO:  4kb BOs (1)
> [   74.739039] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from
> CMA:
> [   74.739466] vc4-drm soc:gpu: [drm]    V3D: 
> 28972kb BOs (9)
> [   74.739512] vc4-drm soc:gpu: [drm] V3D
> shader: 44kb BOs (11)
> [   74.739541] vc4-drm soc:gpu: [drm]   dumb:  
> 4564kb BOs (5)
> [   74.739570] vc4-drm soc:gpu: [drm] binner: 
> 16384kb BOs (1)
> [   74.739602] vc4-drm soc:gpu: [drm]    total purged
> BO:  4kb BOs (1)
> [   74.740718] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from
> CMA:
> [   74.741138] vc4-drm soc:gpu: [drm]    V3D: 
> 28972kb BOs (9)
> [   74.741171] vc4-drm soc:gpu: [drm] V3D
> shader: 44kb BOs (11)
> [   74.741202] vc4-drm soc:gpu: [drm]   dumb:  
> 4564kb BOs (5)
> [   74.741231] vc4-drm soc:gpu: [drm] binner: 
> 16384kb BOs (1)
> [   74.741263] vc4-drm soc:gpu: [drm]    total purged
> BO:  4kb BOs (1)
> ...
> 
> I have only seen this issue on arm64 with latest mainline kernel
> (5.13.0-rc5-00130-gf21b807c3cf8), but also with older kernel versions.
> So it's not a regression. It seems 64 bit needs more CMA.
> 
> In case X started properly, I was also able to reproduce these errors
> above by disconnecting and reconnecting HDMI.
> 
> So i increased CMA in bcm283x.dtsi and the problem disappeared:
> 
> diff --git a/arch/arm/boot/dts/bcm283x.dtsi b/arch/arm/boot/dts/bcm283x.dtsi
> index b83a864..d1304cb 100644
> --- a/arch/arm/boot/dts/bcm283x.dtsi
> +++ b/arch/arm/boot/dts/bcm283x.dtsi
> @@ -37,7 +37,7 @@
>  
>      cma: linux,cma {
>          compatible = "shared-dma-pool";
> -            size = <0x4000000>; /* 64MB */
> +            size = <0x6000000>; /* 96MB */
>          reusable;
>          linux,cma-default;
>      };
> 
> The questions are:
> 
> Is this the right way (tm) to fix this problem?

Frankly I don't know if there is a better way. IIRC opensuse and downstream use
DT overlays to cater for this limitation. It seems reasonable to bump the
value. But it'll be to the detriment of users that don't care much for graphical
interfaces. Nonetheless, I'm not familiar with how DRM handles CMA/DMA memory.
So let me have a look at it. Maybe there is a SW fix. At first glance I'm
surprised they can't defer to normal page allocations when CMA isn't capable of
honoring the request (like the dma code does).

> And what is a sensible value (don't have a 4K display to test)?

The default for downstream is 256MB. But I've read discussions in the forum
where people needed even more. IIUC it's use-case dependent, resolution is only
one variable, you might then try to run a game and run out of memory there.
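
For a rough feel of the numbers: the scanout part of the CMA budget is at least resolution × bytes per pixel × buffer count, with the V3D working set (the ~28 MB of BOs in the log above) on top. A back-of-the-envelope helper, assuming XRGB8888 (4 bytes per pixel):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for `nbufs` full-screen XRGB8888 (4 bpp) framebuffers. */
static uint64_t scanout_bytes(uint32_t width, uint32_t height, uint32_t nbufs)
{
    return (uint64_t)width * height * 4 * nbufs;
}
```

Triple-buffered 1080p needs roughly 24 MB of scanout alone, which together with ~28 MB of V3D BOs already crowds a 64 MB pool; triple-buffered 4K needs roughly 95 MB by itself, consistent with the 256 MB downstream default.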

Regards,
Nicolas



Re: [PATCH] dma-buf: fix and rework dma_buf_poll

2021-06-17 Thread kernel test robot
Hi Christian,

I love your patch! Perhaps something to improve:

[auto build test WARNING on next-20210616]
[cannot apply to tegra-drm/drm/tegra/for-next linus/master v5.13-rc6 v5.13-rc5 
v5.13-rc4 v5.13-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Christian-K-nig/dma-buf-fix-and-rework-dma_buf_poll/20210617-103036
base:    c7d4c1fd91ab4a6d2620497921a9c6bf54650ab8
config: s390-randconfig-r022-20210617 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
64720f57bea6a6bf033feef4a5751ab9c0c3b401)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install s390 cross compiling tool for clang build
# apt-get install binutils-s390x-linux-gnu
# 
https://github.com/0day-ci/linux/commit/dfa9f2ec4c082b73e644e2c565e58e2291f94463
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Christian-K-nig/dma-buf-fix-and-rework-dma_buf_poll/20210617-103036
git checkout dfa9f2ec4c082b73e644e2c565e58e2291f94463
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=s390 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> drivers/dma-buf/dma-buf.c:284:17: warning: variable 'fence_excl' is 
>> uninitialized when used here [-Wuninitialized]
   dma_fence_put(fence_excl);
 ^~
   drivers/dma-buf/dma-buf.c:213:30: note: initialize the variable 'fence_excl' 
to silence this warning
   struct dma_fence *fence_excl;
   ^
= NULL
   1 warning generated.


vim +/fence_excl +284 drivers/dma-buf/dma-buf.c

   206  
   207  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
   208  {
   209  struct dma_buf_poll_cb_t *dcb;
   210  struct dma_buf *dmabuf;
   211  struct dma_resv *resv;
   212  struct dma_resv_list *fobj;
   213  struct dma_fence *fence_excl;
   214  unsigned shared_count, seq;
   215  struct dma_fence *fence;
   216  __poll_t events;
   217  int r, i;
   218  
   219  dmabuf = file->private_data;
   220  if (!dmabuf || !dmabuf->resv)
   221  return EPOLLERR;
   222  
   223  resv = dmabuf->resv;
   224  
   225  poll_wait(file, &dmabuf->poll, poll);
   226  
   227  events = poll_requested_events(poll) & (EPOLLIN | EPOLLOUT);
   228  if (!events)
   229  return 0;
   230  
   231  dcb = events & EPOLLOUT ? &dmabuf->cb_out : &dmabuf->cb_in;
   232  
   233  /* Only queue a new one if we are not still waiting for the old 
one */
   234  spin_lock_irq(&dmabuf->poll.lock);
   235  if (dcb->active)
   236  events = 0;
   237  else
   238  dcb->active = events;
   239  spin_unlock_irq(&dmabuf->poll.lock);
   240  if (!events)
   241  return 0;
   242  
   243  retry:
   244  seq = read_seqcount_begin(&resv->seq);
   245  rcu_read_lock();
   246  
   247  fobj = rcu_dereference(resv->fence);
   248  if (fobj && events & EPOLLOUT)
   249  shared_count = fobj->shared_count;
   250  else
   251  shared_count = 0;
   252  
   253  for (i = 0; i < shared_count; ++i) {
   254  fence = rcu_dereference(fobj->shared[i]);
   255  fence = dma_fence_get_rcu(fence);
   256  if (!fence || read_seqcount_retry(&resv->seq, seq)) {
   257  /* Concurrent modify detected, force re-check */
   258  dma_fence_put(fence);
   259  rcu_read_unlock();
   260  goto retry;
   261  }
   262  
   263  r = dma_fence_add_callback(fence, &dcb->cb, 
dma_buf_poll_cb);
   264  dma_fence_put(fence);
   265  if (!r) {
   266  /* Callback queued */
   267  events = 0;
   268  goto out;
   269  }
   270  }
   271  
   272  fence = dma_resv_excl_fence(resv);
   273  if (fence) {
   274  fence = dma_fence

Re: [PATCH] drm/i915: Perform execbuffer object locking as a separate step

2021-06-17 Thread Ramalingam C
On 2021-06-15 at 13:36:00 +0200, Thomas Hellström wrote:
> To help avoid evicting already resident buffers from the batch we're
> processing, perform locking as a separate step.
> 
Looks reasonable to me.

Reviewed-by: Ramalingam C 

> Signed-off-by: Thomas Hellström 
> ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 25 ---
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 201fed19d120..394eb40c95b5 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -922,21 +922,38 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
>   return err;
>  }
>  
> -static int eb_validate_vmas(struct i915_execbuffer *eb)
> +static int eb_lock_vmas(struct i915_execbuffer *eb)
>  {
>   unsigned int i;
>   int err;
>  
> - INIT_LIST_HEAD(&eb->unbound);
> -
>   for (i = 0; i < eb->buffer_count; i++) {
> - struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
>   struct eb_vma *ev = &eb->vma[i];
>   struct i915_vma *vma = ev->vma;
>  
>   err = i915_gem_object_lock(vma->obj, &eb->ww);
>   if (err)
>   return err;
> + }
> +
> + return 0;
> +}
> +
> +static int eb_validate_vmas(struct i915_execbuffer *eb)
> +{
> + unsigned int i;
> + int err;
> +
> + INIT_LIST_HEAD(&eb->unbound);
> +
> + err = eb_lock_vmas(eb);
> + if (err)
> + return err;
> +
> + for (i = 0; i < eb->buffer_count; i++) {
> + struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
> + struct eb_vma *ev = &eb->vma[i];
> + struct i915_vma *vma = ev->vma;
>  
>   err = eb_pin_vma(eb, entry, ev);
>   if (err == -EDEADLK)
> -- 
> 2.31.1
> 


Re: [PATCH] drm/i915: Perform execbuffer object locking as a separate step

2021-06-17 Thread Thomas Hellström



On 6/17/21 11:56 AM, Ramalingam C wrote:

On 2021-06-15 at 13:36:00 +0200, Thomas Hellström wrote:

To help avoid evicting already resident buffers from the batch we're
processing, perform locking as a separate step.


Looks reasonable to me.

Reviewed-by: Ramalingam C 



Thanks for reviewing!

/Thomas




Re: [PATCH] drm/i915: Perform execbuffer object locking as a separate step

2021-06-17 Thread Maarten Lankhorst
On 15-06-2021 at 13:36, Thomas Hellström wrote:
> To help avoid evicting already resident buffers from the batch we're
> processing, perform locking as a separate step.
>
> Signed-off-by: Thomas Hellström 
> ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 25 ---
>  1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 201fed19d120..394eb40c95b5 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -922,21 +922,38 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
>   return err;
>  }
>  
> -static int eb_validate_vmas(struct i915_execbuffer *eb)
> +static int eb_lock_vmas(struct i915_execbuffer *eb)
>  {
>   unsigned int i;
>   int err;
>  
> - INIT_LIST_HEAD(&eb->unbound);
> -
>   for (i = 0; i < eb->buffer_count; i++) {
> - struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
>   struct eb_vma *ev = &eb->vma[i];
>   struct i915_vma *vma = ev->vma;
>  
>   err = i915_gem_object_lock(vma->obj, &eb->ww);
>   if (err)
>   return err;
> + }
> +
> + return 0;
> +}
> +
> +static int eb_validate_vmas(struct i915_execbuffer *eb)
> +{
> + unsigned int i;
> + int err;
> +
> + INIT_LIST_HEAD(&eb->unbound);
> +
> + err = eb_lock_vmas(eb);
> + if (err)
> + return err;
> +
> + for (i = 0; i < eb->buffer_count; i++) {
> + struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
> + struct eb_vma *ev = &eb->vma[i];
> + struct i915_vma *vma = ev->vma;
>  
>   err = eb_pin_vma(eb, entry, ev);
>   if (err == -EDEADLK)

Reviewed-by: Maarten Lankhorst 



Re: vc4_bo_create: Failed to allocate from CMA

2021-06-17 Thread Daniel Stone
On Thu, 17 Jun 2021 at 10:36, nicolas saenz julienne  wrote:
> Frankly I don't know if there is a better way. IIRC opensuse and downstream 
> use
> DT overlays to cater for this limitation. It seems reasonable to bump the
> value. But it'll be to the detriment of users that don't care much for 
> graphical
> interfaces. Nonetheless, I'm not familiar with how DRM handles CMA/DMA memory.
> So let me have a look at it. Maybe there is a SW fix. At first glance I'm
> surprised they can't defer to normal page allocations when CMA isn't capable 
> of
> honoring the request (like the dma code does).

DMA transfers can be split into multiple transactions at the cost of
being a bit slower. But there isn't a fallback for graphics buffers;
you can't display a quarter of a screen at a time. If the hardware did
support buffers being backed by multiple discontiguous pages, then it
wouldn't need CMA in the first place ...

Cheers,
Daniel


Re: [PATCH v3] Documentation: gpu: Mention the requirements for new properties

2021-06-17 Thread Laurent Pinchart
Hi Pekka,

On Thu, Jun 17, 2021 at 10:27:01AM +0300, Pekka Paalanen wrote:
> On Thu, 17 Jun 2021 00:05:24 +0300 Laurent Pinchart wrote:
> > On Tue, Jun 15, 2021 at 01:16:56PM +0300, Pekka Paalanen wrote:
> > > On Tue, 15 Jun 2021 12:45:57 +0300 Laurent Pinchart wrote:  
> > > > On Tue, Jun 15, 2021 at 07:15:18AM +, Simon Ser wrote:  
> > > > > On Tuesday, June 15th, 2021 at 09:03, Pekka Paalanen wrote:
> > > > > 
> > > > > > indeed it will, but what else could one do to test userspace KMS
> > > > > > clients in generic CI where all you can have is virtual hardware? 
> > > > > > Maybe
> > > > > > in the long run VKMS needs to loop back to a userspace daemon that
> > > > > > implements all the complex processing and returns the writeback 
> > > > > > result
> > > > > > via VKMS again? That daemon would then need a single upstream, like 
> > > > > > the
> > > > > > kernel, where it is maintained and correctness verified.
> > > > > 
> > > > > The complex processing must be implemented even without write-back, 
> > > > > because
> > > > > user-space can ask for CRCs of the CRTC.
> > > > > 
> > > > > > Or an LD_PRELOAD that hijacks all KMS ioctls and implements virtual
> > > > > > stuff in userspace? Didn't someone already have something like that?
> > > > > > It would need to be lifted to be a required part of kernel UAPI
> > > > > > submissions, I suppose like IGT is nowadays.
> > > > > 
> > > > > FWIW, I have a mock libdrm [1] for libliftoff. This is nowhere near a 
> > > > > full
> > > > > software implementation with write-back connectors, but allows to 
> > > > > expose
> > > > > virtual planes and check atomic commits in CI.
> > > > > 
> > > > > [1]: 
> > > > > https://github.com/emersion/libliftoff/blob/master/test/libdrm_mock.c
> > > > > 
> > > > > > For compositor developers like me knowing the exact formulas would 
> > > > > > be a huge
> > > > > > benefit as it would allow me to use KMS to off-load 
> > > > > > precision-sensitive
> > > > > > operations (e.g.  professional color management). Otherwise, 
> > > > > > compositors
> > > > > > probably need a switch: "high quality color management? Then do not 
> > > > > > use KMS
> > > > > > features."
> > > > > 
> > > > > I think for alpha blending there are already rounding issues 
> > > > > depending on the
> > > > > hardware. I wouldn't keep my hopes up for any guarantee that all hw 
> > > > > uses the
> > > > > exact same formulae for color management stuff.
> > > > 
> > > > Good, because otherwise you would be very quickly disappointed :-)
> > > > 
> > > > For scaling we would also need to replicate the exact same filter taps,
> > > > which are often not documented.  
> > > 
> > > That is where the documented tolerances come into play.  
> > 
> > This is something I've experimented with a while ago, when developing
> > automated tests for the rcar-du driver. When playing with different
> > input images we had to constantly increase tolerances, up to a point
> > where the tests started to miss real problems :-(
> 
> What should we infer from that? That the hardware is broken and
> exposing those KMS properties is a false promise?

No, just that the scaler doesn't document the internal hardware
implementation (number of taps in the filters, coefficients, rounding,
...). That's the rule, not the exception, and it doesn't prevent correct
operation: images get scaled in a reproducible way (the same input
produces the same output).

> If a driver on certain hardware cannot correctly implement a KMS
> property over the full domain of the input space, should that driver
> then simply not expose the KMS property at all?

The properties involved here would be the SRC and CRTC rectangles for
the planes. They don't document pixel-perfect scaling :-)

> But I would assume that the vendor still wants to expose the features
> in upstream kernels, yet they cannot use the standard KMS properties
> for that. Should the driver then expose vendor-specific properties with
> the disclaimer that the result is not always what one would expect, so
> that userspace written and tested explicitly for that hardware can
> still work?
> 
> That is, a sufficient justification for a vendor-specific KMS property
> would be that a standard property already exists, but the hardware is
> too buggy to make it work. IOW, give up trying to make sense.

It's not just about buggy hardware, it's also about implementation
specificities, such as rounding, filters, order of operations in the
color management pipeline (it's relatively easy when you only have two
LUTs and a CCM matrix, but if you throw 3D LUTs and other tonemapping
features into the mix, not all hardware will implement the same pipeline),
or various types of image compression (this device implements a
"near-lossless" compression scheme that reduces the memory bandwidth by
50% for instance).

> I would like to move towards a direction where *hardware* design and
> testing is eventually guided by 

[PATCH v2] drm/mediatek: force hsa hbp hfp packets multiple of lanenum to avoid screen shift

2021-06-17 Thread Jitao Shi
The "ANX7625" bridge chip requires the packets on the lanes to be aligned
at the end of a line, or ANX7625 will shift the screen.

Signed-off-by: Jitao Shi 
---
 drivers/gpu/drm/mediatek/mtk_dsi.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c 
b/drivers/gpu/drm/mediatek/mtk_dsi.c
index ae403c67cbd9..4735e0092ffe 100644
--- a/drivers/gpu/drm/mediatek/mtk_dsi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
@@ -194,6 +194,8 @@ struct mtk_dsi {
struct clk *hs_clk;
 
u32 data_rate;
+   /* force dsi line end without dsi_null data */
+   bool force_dsi_end_without_null;
 
unsigned long mode_flags;
enum mipi_dsi_pixel_format format;
@@ -499,6 +501,13 @@ static void mtk_dsi_config_vdo_timing(struct mtk_dsi *dsi)
DRM_WARN("HFP + HBP less than d-phy, FPS will under 60Hz\n");
}
 
+   if (dsi->force_dsi_end_without_null) {
+   horizontal_sync_active_byte = roundup(horizontal_sync_active_byte, dsi->lanes) - 2;
+   horizontal_frontporch_byte = roundup(horizontal_frontporch_byte, dsi->lanes) - 2;
+   horizontal_backporch_byte = roundup(horizontal_backporch_byte, dsi->lanes) - 2;
+   horizontal_backporch_byte -= (vm->hactive * dsi_tmp_buf_bpp + 2) % dsi->lanes;
+   }
+
writel(horizontal_sync_active_byte, dsi->regs + DSI_HSA_WC);
writel(horizontal_backporch_byte, dsi->regs + DSI_HBP_WC);
writel(horizontal_frontporch_byte, dsi->regs + DSI_HFP_WC);
@@ -1095,6 +1104,10 @@ static int mtk_dsi_probe(struct platform_device *pdev)
dsi->bridge.of_node = dev->of_node;
dsi->bridge.type = DRM_MODE_CONNECTOR_DSI;
 
+   if (dsi->next_bridge)
+   dsi->force_dsi_end_without_null = of_property_read_bool(dsi->next_bridge->of_node,
+   "force_dsi_end_without_null");
+
drm_bridge_add(&dsi->bridge);
 
ret = component_add(&pdev->dev, &mtk_dsi_component_ops);
-- 
2.25.1


Introduce fence iterators to abstract dma_resv RCU handling

2021-06-17 Thread Christian König
Hi guys,

during the recent discussion about SLAB_TYPESAFE_BY_RCU, dma_fence_get_rcu and 
dma_fence_get_rcu_safe we found that the RCU handling for dma_resv objects was 
implemented multiple times.

Unfortunately a lot of those implementations get the rather complicated dance 
with RCU and the sequence number handling wrong.

So this patch set aims to audit and unify this by providing an iterator which 
automatically restarts when a modification to the dma_resv object is detected.

The result is pretty impressive I think, since this not only means that we get 
rid of all those incorrect dma_fence_get_rcu() cases, but also reduces the 
overall line count quite a bit.

Please review and/or comment.

Cheers,
Christian.




[PATCH 02/16] dma-buf: add dma_resv_for_each_fence

2021-06-17 Thread Christian König
A simpler version of the iterator to be used when the dma_resv object is
locked.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 38 ++
 include/linux/dma-resv.h   | 18 ++
 2 files changed, 56 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index d8da8a914b07..a0386cf5824c 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -316,6 +316,44 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct 
dma_fence *fence)
 }
 EXPORT_SYMBOL(dma_resv_add_excl_fence);
 
+/**
+ * dma_resv_walk - walk over fences in a dma_resv obj
+ * @obj: the dma_resv object
+ * @cursor: cursor to record the current position
+ * @all_fences: true to also return the shared fences
+ * @first: true to start the walk over from the beginning
+ *
+ * Return all the fences in the dma_resv object while holding the
+ * dma_resv::lock.
+ */
+struct dma_fence *dma_resv_walk(struct dma_resv *obj,
+   struct dma_resv_cursor *cursor,
+   bool all_fences, bool first)
+{
+   dma_resv_assert_held(obj);
+
+   cursor->is_first = first;
+   if (first) {
+   struct dma_fence *fence;
+
+   cursor->index = -1;
+   cursor->fences = dma_resv_shared_list(obj);
+   cursor->is_exclusive = true;
+
+   fence = dma_resv_excl_fence(obj);
+   if (fence)
+   return fence;
+   }
+
+   if (!all_fences || !cursor->fences ||
+   ++cursor->index >= cursor->fences->shared_count)
+   return NULL;
+
+   return rcu_dereference_protected(cursor->fences->shared[cursor->index],
+dma_resv_held(obj));
+}
+EXPORT_SYMBOL_GPL(dma_resv_walk);
+
 /**
  * dma_resv_walk_unlocked - walk over fences in a dma_resv obj
  * @obj: the dma_resv object
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 74775f2cb679..84de4dff4ecc 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -91,6 +91,21 @@ struct dma_resv_cursor {
bool is_exclusive;
 };
 
+/**
+ * dma_resv_for_each_fence - fence iterator
+ * @obj: a dma_resv object pointer
+ * @cursor: a struct dma_resv_cursor pointer
+ * @all_fences: true if all fences should be returned
+ * @fence: the current fence
+ *
+ * Iterate over the fences in a struct dma_resv object while holding the
+ * dma_resv::lock. @all_fences controls if the shared fences are returned as
+ * well.
+ */
+#define dma_resv_for_each_fence(obj, cursor, all_fences, fence)
  \
+   for (fence = dma_resv_walk(obj, cursor, all_fences, true); fence; \
+fence = dma_resv_walk(obj, cursor, all_fences, false))
+
 /**
  * dma_resv_for_each_fence_unlocked - fence iterator
  * @obj: a dma_resv object pointer
@@ -305,6 +320,9 @@ void dma_resv_fini(struct dma_resv *obj);
 int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences);
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
+struct dma_fence *dma_resv_walk(struct dma_resv *obj,
+   struct dma_resv_cursor *cursor,
+   bool all_fences, bool first);
 struct dma_fence *dma_resv_walk_unlocked(struct dma_resv *obj,
 struct dma_resv_cursor *cursor,
 bool all_fences, bool first);
-- 
2.25.1



[PATCH 03/16] dma-buf: use new iterator in dma_resv_copy_fences

2021-06-17 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 81 +++---
 1 file changed, 32 insertions(+), 49 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index a0386cf5824c..a5d78bf401b5 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -426,74 +426,57 @@ EXPORT_SYMBOL_GPL(dma_resv_walk_unlocked);
  */
 int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 {
-   struct dma_resv_list *src_list, *dst_list;
-   struct dma_fence *old, *new;
-   unsigned int i;
+   struct dma_resv_cursor cursor;
+   struct dma_resv_list *list;
+   struct dma_fence *f, *excl;
 
dma_resv_assert_held(dst);
 
-   rcu_read_lock();
-   src_list = dma_resv_shared_list(src);
+   list = NULL;
+   excl = NULL;
 
-retry:
-   if (src_list) {
-   unsigned int shared_count = src_list->shared_count;
+   rcu_read_lock();
+   dma_resv_for_each_fence_unlocked(dst, &cursor, true, f) {
 
-   rcu_read_unlock();
+   if (cursor.is_first) {
+   dma_resv_list_free(list);
+   dma_fence_put(excl);
 
-   dst_list = dma_resv_list_alloc(shared_count);
-   if (!dst_list)
-   return -ENOMEM;
+   if (cursor.fences) {
+   unsigned int cnt = cursor.fences->shared_count;
 
-   rcu_read_lock();
-   src_list = dma_resv_shared_list(src);
-   if (!src_list || src_list->shared_count > shared_count) {
-   kfree(dst_list);
-   goto retry;
-   }
+   rcu_read_unlock();
+   list = dma_resv_list_alloc(cnt);
+   if (!list)
+   return -ENOMEM;
 
-   dst_list->shared_count = 0;
-   for (i = 0; i < src_list->shared_count; ++i) {
-   struct dma_fence __rcu **dst;
-   struct dma_fence *fence;
+   list->shared_count = 0;
+   rcu_read_lock();
 
-   fence = rcu_dereference(src_list->shared[i]);
-   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-&fence->flags))
-   continue;
-
-   if (!dma_fence_get_rcu(fence)) {
-   dma_resv_list_free(dst_list);
-   src_list = dma_resv_shared_list(src);
-   goto retry;
+   } else {
+   list = NULL;
}
+   excl = NULL;
+   }
 
-   if (dma_fence_is_signaled(fence)) {
-   dma_fence_put(fence);
-   continue;
-   }
+   if (cursor.is_exclusive)
+   excl = f;
+   else
+   RCU_INIT_POINTER(list->shared[list->shared_count++], f);
 
-   dst = &dst_list->shared[dst_list->shared_count++];
-   rcu_assign_pointer(*dst, fence);
-   }
-   } else {
-   dst_list = NULL;
+   /* Don't drop the reference */
+   f = NULL;
}
 
-   new = dma_fence_get_rcu_safe(&src->fence_excl);
rcu_read_unlock();
 
-   src_list = dma_resv_shared_list(dst);
-   old = dma_resv_excl_fence(dst);
-
write_seqcount_begin(&dst->seq);
-   /* write_seqcount_begin provides the necessary memory barrier */
-   RCU_INIT_POINTER(dst->fence_excl, new);
-   RCU_INIT_POINTER(dst->fence, dst_list);
+   excl = rcu_replace_pointer(dst->fence_excl, excl, dma_resv_held(dst));
+   list = rcu_replace_pointer(dst->fence, list, dma_resv_held(dst));
write_seqcount_end(&dst->seq);
 
-   dma_resv_list_free(src_list);
-   dma_fence_put(old);
+   dma_resv_list_free(list);
+   dma_fence_put(excl);
 
return 0;
 }
-- 
2.25.1



[PATCH 11/16] drm/amdgpu: use the new iterator in amdgpu_sync_resv

2021-06-17 Thread Christian König
Simplifying the code a bit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 44 
 1 file changed, 14 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 862eb3c1c4c5..031ba20debb9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -252,41 +252,25 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct 
amdgpu_sync *sync,
 struct dma_resv *resv, enum amdgpu_sync_mode mode,
 void *owner)
 {
-   struct dma_resv_list *flist;
+   struct dma_resv_cursor cursor;
struct dma_fence *f;
-   unsigned i;
-   int r = 0;
+   int r;
 
if (resv == NULL)
return -EINVAL;
 
-   /* always sync to the exclusive fence */
-   f = dma_resv_excl_fence(resv);
-   dma_fence_chain_for_each(f, f) {
-   struct dma_fence_chain *chain = to_dma_fence_chain(f);
-
-   if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
-  chain->fence : f)) {
-   r = amdgpu_sync_fence(sync, f);
-   dma_fence_put(f);
-   if (r)
-   return r;
-   break;
-   }
-   }
-
-   flist = dma_resv_shared_list(resv);
-   if (!flist)
-   return 0;
-
-   for (i = 0; i < flist->shared_count; ++i) {
-   f = rcu_dereference_protected(flist->shared[i],
- dma_resv_held(resv));
-
-   if (amdgpu_sync_test_fence(adev, mode, owner, f)) {
-   r = amdgpu_sync_fence(sync, f);
-   if (r)
-   return r;
+   dma_resv_for_each_fence(resv, &cursor, true, f) {
+   dma_fence_chain_for_each(f, f) {
+   struct dma_fence_chain *chain = to_dma_fence_chain(f);
+
+   if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
+  chain->fence : f)) {
+   r = amdgpu_sync_fence(sync, f);
+   dma_fence_put(f);
+   if (r)
+   return r;
+   break;
+   }
}
}
return 0;
-- 
2.25.1



[PATCH 01/16] dma-buf: add dma_resv_for_each_fence_unlocked

2021-06-17 Thread Christian König
Abstract the complexity of iterating over all the fences
in a dma_resv object.

The new loop handles the whole RCU and retry dance and
returns only fences where we can be sure we grabbed the
right one.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 63 ++
 include/linux/dma-resv.h   | 36 ++
 2 files changed, 99 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 18dd5a6ca06c..d8da8a914b07 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -316,6 +316,69 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct 
dma_fence *fence)
 }
 EXPORT_SYMBOL(dma_resv_add_excl_fence);
 
+/**
+ * dma_resv_walk_unlocked - walk over fences in a dma_resv obj
+ * @obj: the dma_resv object
+ * @cursor: cursor to record the current position
+ * @all_fences: true to also return the shared fences
+ * @first: true to start the walk over from the beginning
+ *
+ * Return all the fences in the dma_resv object which are not yet signaled.
+ * The returned fence has an extra local reference so will stay alive.
+ * If a concurrent modify is detected the whole iterator is started over again.
+ */
+struct dma_fence *dma_resv_walk_unlocked(struct dma_resv *obj,
+struct dma_resv_cursor *cursor,
+bool all_fences, bool first)
+{
+   struct dma_fence *fence = NULL;
+
+   do {
+   /* Drop the reference from the previous round */
+   dma_fence_put(fence);
+
+   cursor->is_first = first;
+   if (first) {
+   cursor->seq = read_seqcount_begin(&obj->seq);
+   cursor->index = -1;
+   cursor->fences = dma_resv_shared_list(obj);
+   cursor->is_exclusive = true;
+
+   fence = dma_resv_excl_fence(obj);
+   if (fence && test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+ &fence->flags))
+   fence = NULL;
+   } else {
+   fence = NULL;
+   }
+
+   if (fence) {
+   fence = dma_fence_get_rcu(fence);
+   } else if (all_fences && cursor->fences) {
+   struct dma_resv_list *fences = cursor->fences;
+
+   cursor->is_exclusive = false;
+   while (++cursor->index < fences->shared_count) {
+   fence = rcu_dereference(fences->shared[
+   cursor->index]);
+   if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+ &fence->flags))
+   break;
+   }
+   if (cursor->index < fences->shared_count)
+   fence = dma_fence_get_rcu(fence);
+   else
+   fence = NULL;
+   }
+
+   /* For the potential next round */
+   first = true;
+   } while (read_seqcount_retry(&obj->seq, cursor->seq));
+
+   return fence;
+}
+EXPORT_SYMBOL_GPL(dma_resv_walk_unlocked);
+
 /**
  * dma_resv_copy_fences - Copy all fences from src to dst.
  * @dst: the destination reservation object
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 562b885cf9c3..74775f2cb679 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -75,6 +75,39 @@ struct dma_resv {
struct dma_resv_list __rcu *fence;
 };
 
+/**
+ * struct dma_resv_cursor - current position into the dma_resv fences
+ * @seq: sequence number to check
+ * @index: index into the shared fences
+ * @fences: the shared fences
+ * @is_first: true if this is the first returned fence
+ * @is_exclusive: if the current fence is the exclusive one
+ */
+struct dma_resv_cursor {
+   unsigned int seq;
+   unsigned int index;
+   struct dma_resv_list *fences;
+   bool is_first;
+   bool is_exclusive;
+};
+
+/**
+ * dma_resv_for_each_fence_unlocked - fence iterator
+ * @obj: a dma_resv object pointer
+ * @cursor: a struct dma_resv_cursor pointer
+ * @all_fences: true if all fences should be returned
+ * @fence: the current fence
+ *
+ * Iterate over the fences in a struct dma_resv object without holding the
+ * dma_resv::lock. The RCU read side lock must be hold when using this, but can
+ * be dropped and re-taken as necessary inside the loop. @all_fences controls
+ * if the shared fences are returned as well.
+ */
+#define dma_resv_for_each_fence_unlocked(obj, cursor, all_fences, fence)\
+   for (fence = dma_resv_walk_unlocked(obj, cursor, all_fences, true); \
+fence; dma_fence_put(fence),   \
fence = dma_resv_walk_unlocked(obj, cursor, all_fences, false))

[PATCH 06/16] dma-buf: use new iterator in dma_resv_test_signaled

2021-06-17 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 54 +-
 1 file changed, 7 insertions(+), 47 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 5192cf4271ac..85e07becdb93 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -586,22 +586,6 @@ long dma_resv_wait_timeout(struct dma_resv *obj, bool 
wait_all, bool intr,
 EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
 
 
-static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
-{
-   struct dma_fence *fence, *lfence = passed_fence;
-   int ret = 1;
-
-   if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &lfence->flags)) {
-   fence = dma_fence_get_rcu(lfence);
-   if (!fence)
-   return -1;
-
-   ret = !!dma_fence_is_signaled(fence);
-   dma_fence_put(fence);
-   }
-   return ret;
-}
-
 /**
  * dma_resv_test_signaled - Test if a reservation object's fences have been
  * signaled.
@@ -616,43 +600,19 @@ static inline int dma_resv_test_signaled_single(struct 
dma_fence *passed_fence)
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
-   unsigned int seq;
-   int ret;
 
rcu_read_lock();
-retry:
-   ret = true;
-   seq = read_seqcount_begin(&obj->seq);
-
-   if (test_all) {
-   struct dma_resv_list *fobj = dma_resv_shared_list(obj);
-   unsigned int i, shared_count;
-
-   shared_count = fobj ? fobj->shared_count : 0;
-   for (i = 0; i < shared_count; ++i) {
-   fence = rcu_dereference(fobj->shared[i]);
-   ret = dma_resv_test_signaled_single(fence);
-   if (ret < 0)
-   goto retry;
-   else if (!ret)
-   break;
+   dma_resv_for_each_fence_unlocked(obj, &cursor, test_all, fence) {
+   if (!dma_fence_is_signaled(fence)) {
+   rcu_read_unlock();
+   dma_fence_put(fence);
+   return false;
}
}
-
-   fence = dma_resv_excl_fence(obj);
-   if (ret && fence) {
-   ret = dma_resv_test_signaled_single(fence);
-   if (ret < 0)
-   goto retry;
-
-   }
-
-   if (read_seqcount_retry(&obj->seq, seq))
-   goto retry;
-
rcu_read_unlock();
-   return ret;
+   return true;
 }
 EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
 
-- 
2.25.1



[PATCH 08/16] drm/i915: use the new iterator in i915_gem_busy_ioctl

2021-06-17 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/i915/gem/i915_gem_busy.c | 30 +++-
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
index 6234e17259c1..c6c6d747b33e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 {
struct drm_i915_gem_busy *args = data;
struct drm_i915_gem_object *obj;
-   struct dma_resv_list *list;
-   unsigned int seq;
+   struct dma_resv_cursor cursor;
+   struct dma_fence *fence;
int err;
 
err = -ENOENT;
@@ -109,28 +109,16 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 * to report the overall busyness. This is what the wait-ioctl does.
 *
 */
-retry:
-   seq = raw_read_seqcount(&obj->base.resv->seq);
-
-   /* Translate the exclusive fence to the READ *and* WRITE engine */
-   args->busy = busy_check_writer(dma_resv_excl_fence(obj->base.resv));
-
-   /* Translate shared fences to READ set of engines */
-   list = dma_resv_shared_list(obj->base.resv);
-   if (list) {
-   unsigned int shared_count = list->shared_count, i;
-
-   for (i = 0; i < shared_count; ++i) {
-   struct dma_fence *fence =
-   rcu_dereference(list->shared[i]);
-
+   args->busy = false;
+   dma_resv_for_each_fence_unlocked(obj->base.resv, &cursor, true, fence) {
+   if (cursor.is_exclusive)
+   /* Translate the exclusive fence to the READ *and* 
WRITE engine */
+   args->busy = busy_check_writer(fence);
+   else
+   /* Translate shared fences to READ set of engines */
args->busy |= busy_check_reader(fence);
-   }
}
 
-   if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq))
-   goto retry;
-
err = 0;
 out:
rcu_read_unlock();
-- 
2.25.1



[PATCH 09/16] drm/ttm: use the new iterator in ttm_bo_flush_all_fences

2021-06-17 Thread Christian König
This is probably a fix, since previously we didn't even grab a reference to
the fences.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index db53fecca696..15edb308e5a9 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -256,19 +256,11 @@ static int ttm_bo_individualize_resv(struct 
ttm_buffer_object *bo)
 static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
 {
struct dma_resv *resv = &bo->base._resv;
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
-   int i;
 
rcu_read_lock();
-   fobj = dma_resv_shared_list(resv);
-   fence = dma_resv_excl_fence(resv);
-   if (fence && !fence->ops->signaled)
-   dma_fence_enable_sw_signaling(fence);
-
-   for (i = 0; fobj && i < fobj->shared_count; ++i) {
-   fence = rcu_dereference(fobj->shared[i]);
-
+   dma_resv_for_each_fence_unlocked(resv, &cursor, true, fence) {
if (!fence->ops->signaled)
dma_fence_enable_sw_signaling(fence);
}
-- 
2.25.1



[PATCH 07/16] dma-buf: use new iterator in dma_buf_poll

2021-06-17 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-buf.c | 49 ---
 1 file changed, 4 insertions(+), 45 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index b67fbf4e3705..4173f1f70ac1 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -207,15 +207,13 @@ static void dma_buf_poll_cb(struct dma_fence *fence, 
struct dma_fence_cb *cb)
 
 static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 {
+   struct dma_resv_cursor cursor;
struct dma_buf_poll_cb_t *dcb;
struct dma_buf *dmabuf;
struct dma_resv *resv;
-   struct dma_resv_list *fobj;
-   struct dma_fence *fence_excl;
-   unsigned shared_count, seq;
struct dma_fence *fence;
__poll_t events;
-   int r, i;
+   int r;
 
dmabuf = file->private_data;
if (!dmabuf || !dmabuf->resv)
@@ -241,53 +239,14 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)
if (!events)
return 0;
 
-retry:
-   seq = read_seqcount_begin(&resv->seq);
-   rcu_read_lock();
-
-   fobj = rcu_dereference(resv->fence);
-   if (fobj && events & EPOLLOUT)
-   shared_count = fobj->shared_count;
-   else
-   shared_count = 0;
-
-   for (i = 0; i < shared_count; ++i) {
-   fence = rcu_dereference(fobj->shared[i]);
-   fence = dma_fence_get_rcu(fence);
-   if (!fence || read_seqcount_retry(&resv->seq, seq)) {
-   /* Concurrent modify detected, force re-check */
-   dma_fence_put(fence);
-   rcu_read_unlock();
-   goto retry;
-   }
-
-   r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
-   if (!r) {
-   /* Callback queued */
-   events = 0;
-   goto out;
-   }
-   dma_fence_put(fence);
-   }
-
-   fence = dma_resv_excl_fence(resv);
-   if (fence) {
-   fence = dma_fence_get_rcu(fence);
-   if (!fence || read_seqcount_retry(&resv->seq, seq)) {
-   /* Concurrent modify detected, force re-check */
-   dma_fence_put(fence);
-   rcu_read_unlock();
-   goto retry;
-
-   }
-
+   dma_resv_for_each_fence_unlocked(resv, &cursor, events & EPOLLOUT,
+fence) {
r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
if (!r) {
/* Callback queued */
events = 0;
goto out;
}
-   dma_fence_put(fence_excl);
}
 
/* No callback queued, wake up any additional waiters. */
-- 
2.25.1



[PATCH 04/16] dma-buf: use new iterator in dma_resv_get_fences

2021-06-17 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 110 +
 1 file changed, 37 insertions(+), 73 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index a5d78bf401b5..b77bf46c0f48 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -486,99 +486,63 @@ EXPORT_SYMBOL(dma_resv_copy_fences);
  * dma_resv_get_fences - Get an object's shared and exclusive
  * fences without update side lock held
  * @obj: the reservation object
- * @pfence_excl: the returned exclusive fence (or NULL)
- * @pshared_count: the number of shared fences returned
- * @pshared: the array of shared fence ptrs returned (array is krealloc'd to
+ * @fence_excl: the returned exclusive fence (or NULL)
+ * @shared_count: the number of shared fences returned
+ * @shared: the array of shared fence ptrs returned (array is krealloc'd to
  * the required size, and must be freed by caller)
  *
  * Retrieve all fences from the reservation object. If the pointer for the
  * exclusive fence is not specified the fence is put into the array of the
  * shared fences as well. Returns either zero or -ENOMEM.
  */
-int dma_resv_get_fences(struct dma_resv *obj, struct dma_fence **pfence_excl,
-   unsigned int *pshared_count,
-   struct dma_fence ***pshared)
+int dma_resv_get_fences(struct dma_resv *obj, struct dma_fence **fence_excl,
+   unsigned int *shared_count, struct dma_fence ***shared)
 {
-   struct dma_fence **shared = NULL;
-   struct dma_fence *fence_excl;
-   unsigned int shared_count;
-   int ret = 1;
-
-   do {
-   struct dma_resv_list *fobj;
-   unsigned int i, seq;
-   size_t sz = 0;
-
-   shared_count = i = 0;
-
-   rcu_read_lock();
-   seq = read_seqcount_begin(&obj->seq);
-
-   fence_excl = dma_resv_excl_fence(obj);
-   if (fence_excl && !dma_fence_get_rcu(fence_excl))
-   goto unlock;
+   struct dma_resv_cursor cursor;
+   struct dma_fence *fence;
 
-   fobj = dma_resv_shared_list(obj);
-   if (fobj)
-   sz += sizeof(*shared) * fobj->shared_max;
+   *shared_count = 0;
+   *shared = NULL;
 
-   if (!pfence_excl && fence_excl)
-   sz += sizeof(*shared);
+   if (fence_excl)
+   *fence_excl = NULL;
 
-   if (sz) {
-   struct dma_fence **nshared;
+   rcu_read_lock();
+   dma_resv_for_each_fence_unlocked(obj, &cursor, true, fence) {
 
-   nshared = krealloc(shared, sz,
-  GFP_NOWAIT | __GFP_NOWARN);
-   if (!nshared) {
-   rcu_read_unlock();
+   if (cursor.is_first) {
+   unsigned int count;
 
-   dma_fence_put(fence_excl);
-   fence_excl = NULL;
+   while (*shared_count)
+   dma_fence_put((*shared)[--(*shared_count)]);
 
-   nshared = krealloc(shared, sz, GFP_KERNEL);
-   if (nshared) {
-   shared = nshared;
-   continue;
-   }
+   if (fence_excl)
+   dma_fence_put(*fence_excl);
 
-   ret = -ENOMEM;
-   break;
-   }
-   shared = nshared;
-   shared_count = fobj ? fobj->shared_count : 0;
-   for (i = 0; i < shared_count; ++i) {
-   shared[i] = rcu_dereference(fobj->shared[i]);
-   if (!dma_fence_get_rcu(shared[i]))
-   break;
-   }
-   }
+   count = cursor.fences ? cursor.fences->shared_count : 0;
+   count += fence_excl ? 0 : 1;
+   rcu_read_unlock();
 
-   if (i != shared_count || read_seqcount_retry(&obj->seq, seq)) {
-   while (i--)
-   dma_fence_put(shared[i]);
-   dma_fence_put(fence_excl);
-   goto unlock;
+   /* Eventually re-allocate the array */
+   *shared = krealloc_array(*shared, count,
+sizeof(*shared),
+GFP_KERNEL);
+   if (count && !*shared)
+   return -ENOMEM;
+

[PATCH 15/16] drm/nouveau: use the new iterator in nouveau_fence_sync

2021-06-17 Thread Christian König
Simplifying the code a bit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 48 +++--
 1 file changed, 12 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 05d0b3eb3690..dc8d7ca1e239 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -339,14 +339,15 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool 
lazy, bool intr)
 }
 
 int
-nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool 
exclusive, bool intr)
+nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
+  bool exclusive, bool intr)
 {
struct nouveau_fence_chan *fctx = chan->fence;
-   struct dma_fence *fence;
struct dma_resv *resv = nvbo->bo.base.resv;
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
+   struct dma_fence *fence;
struct nouveau_fence *f;
-   int ret = 0, i;
+   int ret;
 
if (!exclusive) {
ret = dma_resv_reserve_shared(resv, 1);
@@ -355,10 +356,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
nouveau_channel *chan, bool e
return ret;
}
 
-   fobj = dma_resv_shared_list(resv);
-   fence = dma_resv_excl_fence(resv);
-
-   if (fence) {
+   dma_resv_for_each_fence(resv, &cursor, exclusive, fence) {
struct nouveau_channel *prev = NULL;
bool must_wait = true;
 
@@ -366,41 +364,19 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
nouveau_channel *chan, bool e
if (f) {
rcu_read_lock();
prev = rcu_dereference(f->channel);
-   if (prev && (prev == chan || fctx->sync(f, prev, chan) 
== 0))
+   if (prev && (prev == chan ||
+fctx->sync(f, prev, chan) == 0))
must_wait = false;
rcu_read_unlock();
}
 
-   if (must_wait)
+   if (must_wait) {
ret = dma_fence_wait(fence, intr);
-
-   return ret;
-   }
-
-   if (!exclusive || !fobj)
-   return ret;
-
-   for (i = 0; i < fobj->shared_count && !ret; ++i) {
-   struct nouveau_channel *prev = NULL;
-   bool must_wait = true;
-
-   fence = rcu_dereference_protected(fobj->shared[i],
-   dma_resv_held(resv));
-
-   f = nouveau_local_fence(fence, chan->drm);
-   if (f) {
-   rcu_read_lock();
-   prev = rcu_dereference(f->channel);
-   if (prev && (prev == chan || fctx->sync(f, prev, chan) 
== 0))
-   must_wait = false;
-   rcu_read_unlock();
+   if (ret)
+   return ret;
}
-
-   if (must_wait)
-   ret = dma_fence_wait(fence, intr);
}
-
-   return ret;
+   return 0;
 }
 
 void
-- 
2.25.1



[PATCH 14/16] drm/msm: use new iterator in msm_gem_describe

2021-06-17 Thread Christian König
Simplifying the code a bit. Also drop the RCU read side lock since the
object is locked anyway.

Untested since I can't get the driver to compile on !ARM.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/msm/msm_gem.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 24f8c0603385..8b10d82b5d7b 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -932,7 +932,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
 {
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
struct msm_gem_vma *vma;
uint64_t off = drm_vma_node_start(&obj->vma_node);
@@ -1007,22 +1007,13 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
seq_puts(m, "\n");
}
 
-   rcu_read_lock();
-   fobj = dma_resv_shared_list(robj);
-   if (fobj) {
-   unsigned int i, shared_count = fobj->shared_count;
-
-   for (i = 0; i < shared_count; i++) {
-   fence = rcu_dereference(fobj->shared[i]);
+   dma_resv_for_each_fence(robj, &cursor, true, fence) {
+   if (cursor.is_exclusive)
+   describe_fence(fence, "Exclusive", m);
+   else
describe_fence(fence, "Shared", m);
-   }
}
 
-   fence = dma_resv_excl_fence(robj);
-   if (fence)
-   describe_fence(fence, "Exclusive", m);
-   rcu_read_unlock();
-
msm_gem_unlock(obj);
 }
 
-- 
2.25.1
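The patches above all convert hand-rolled "exclusive fence first, then the shared list" walks into a single cursor-driven loop. The traversal order that loop provides can be modeled with a small standalone userspace sketch. All names here (`resv`, `resv_cursor`, `resv_for_each_fence`, ...) are illustrative stand-ins, not the kernel's dma_resv API; the point is only the iteration pattern: the exclusive fence is always visited, the shared fences only when all fences are requested, and the cursor records which kind the current fence is.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a reservation object: one optional exclusive fence plus
 * an array of shared fences.  Hypothetical names, not the kernel API. */
struct fence { int seqno; };

struct resv {
	struct fence *exclusive;	/* may be NULL */
	struct fence **shared;
	unsigned int shared_count;
};

struct resv_cursor {
	int is_exclusive;		/* true while the exclusive fence is yielded */
	unsigned int index;		/* next shared fence to visit */
};

/* First fence to visit, or NULL if there is nothing to do. */
static struct fence *resv_first(struct resv *r, struct resv_cursor *c,
				int all_fences)
{
	c->index = 0;
	if (r->exclusive) {
		c->is_exclusive = 1;
		return r->exclusive;
	}
	c->is_exclusive = 0;
	return (all_fences && r->shared_count) ? r->shared[c->index++] : NULL;
}

/* Next fence, or NULL when iteration is done.  Shared fences are only
 * walked when the caller asked for all fences. */
static struct fence *resv_next(struct resv *r, struct resv_cursor *c,
			       int all_fences)
{
	c->is_exclusive = 0;
	if (!all_fences || c->index >= r->shared_count)
		return NULL;
	return r->shared[c->index++];
}

#define resv_for_each_fence(r, c, all, f) \
	for ((f) = resv_first((r), (c), (all)); (f); \
	     (f) = resv_next((r), (c), (all)))

/* Count how many fences one loop visits, mirroring the driver loops above. */
static unsigned int count_fences(struct resv *r, int all_fences)
{
	struct resv_cursor cursor;
	struct fence *f;
	unsigned int n = 0;

	resv_for_each_fence(r, &cursor, all_fences, f)
		n++;
	return n;
}
```

This is why the converted drivers no longer need separate exclusive/shared code paths: a single loop body checks `cursor.is_exclusive` where the distinction still matters (as in msm_gem_describe above) and ignores it where it does not.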



[PATCH 12/16] drm/amdgpu: use new iterator in amdgpu_ttm_bo_eviction_valuable

2021-06-17 Thread Christian König
Simplifying the code a bit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 80dff29f2bc7..d86b0cbff889 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1334,10 +1334,9 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
const struct ttm_place *place)
 {
unsigned long num_pages = bo->resource->num_pages;
+   struct dma_resv_cursor resv_cursor;
struct amdgpu_res_cursor cursor;
-   struct dma_resv_list *flist;
struct dma_fence *f;
-   int i;
 
/* Swapout? */
if (bo->resource->mem_type == TTM_PL_SYSTEM)
@@ -1351,14 +1350,9 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 * If true, then return false as any KFD process needs all its BOs to
 * be resident to run successfully
 */
-   flist = dma_resv_shared_list(bo->base.resv);
-   if (flist) {
-   for (i = 0; i < flist->shared_count; ++i) {
-   f = rcu_dereference_protected(flist->shared[i],
-   dma_resv_held(bo->base.resv));
-   if (amdkfd_fence_check_mm(f, current->mm))
-   return false;
-   }
+   dma_resv_for_each_fence(bo->base.resv, &resv_cursor, true, f) {
+   if (amdkfd_fence_check_mm(f, current->mm))
+   return false;
}
 
switch (bo->resource->mem_type) {
-- 
2.25.1



[PATCH 05/16] dma-buf: use new iterator in dma_resv_wait_timeout

2021-06-17 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 64 +-
 1 file changed, 7 insertions(+), 57 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index b77bf46c0f48..5192cf4271ac 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -564,74 +564,24 @@ long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr,
   unsigned long timeout)
 {
long ret = timeout ? timeout : 1;
-   unsigned int seq, shared_count;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
-   int i;
 
-retry:
-   shared_count = 0;
-   seq = read_seqcount_begin(&obj->seq);
rcu_read_lock();
-   i = -1;
-
-   fence = dma_resv_excl_fence(obj);
-   if (fence && !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
-   if (!dma_fence_get_rcu(fence))
-   goto unlock_retry;
+   dma_resv_for_each_fence_unlocked(obj, &cursor, wait_all, fence) {
+   rcu_read_unlock();
 
-   if (dma_fence_is_signaled(fence)) {
+   ret = dma_fence_wait_timeout(fence, intr, ret);
+   if (ret <= 0) {
dma_fence_put(fence);
-   fence = NULL;
+   return ret;
}
 
-   } else {
-   fence = NULL;
-   }
-
-   if (wait_all) {
-   struct dma_resv_list *fobj = dma_resv_shared_list(obj);
-
-   if (fobj)
-   shared_count = fobj->shared_count;
-
-   for (i = 0; !fence && i < shared_count; ++i) {
-   struct dma_fence *lfence;
-
-   lfence = rcu_dereference(fobj->shared[i]);
-   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-&lfence->flags))
-   continue;
-
-   if (!dma_fence_get_rcu(lfence))
-   goto unlock_retry;
-
-   if (dma_fence_is_signaled(lfence)) {
-   dma_fence_put(lfence);
-   continue;
-   }
-
-   fence = lfence;
-   break;
-   }
+   rcu_read_lock();
}
-
rcu_read_unlock();
-   if (fence) {
-   if (read_seqcount_retry(&obj->seq, seq)) {
-   dma_fence_put(fence);
-   goto retry;
-   }
 
-   ret = dma_fence_wait_timeout(fence, intr, ret);
-   dma_fence_put(fence);
-   if (ret > 0 && wait_all && (i + 1 < shared_count))
-   goto retry;
-   }
return ret;
-
-unlock_retry:
-   rcu_read_unlock();
-   goto retry;
 }
 EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
 
-- 
2.25.1



[PATCH 16/16] drm/radeon: use new iterator in radeon_sync_resv

2021-06-17 Thread Christian König
Simplifying the code a bit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon_sync.c | 22 +++---
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_sync.c b/drivers/gpu/drm/radeon/radeon_sync.c
index 9257b60144c4..14a4d8135bad 100644
--- a/drivers/gpu/drm/radeon/radeon_sync.c
+++ b/drivers/gpu/drm/radeon/radeon_sync.c
@@ -91,33 +91,17 @@ int radeon_sync_resv(struct radeon_device *rdev,
 struct dma_resv *resv,
 bool shared)
 {
-   struct dma_resv_list *flist;
-   struct dma_fence *f;
+   struct dma_resv_cursor cursor;
struct radeon_fence *fence;
-   unsigned i;
+   struct dma_fence *f;
int r = 0;
 
-   /* always sync to the exclusive fence */
-   f = dma_resv_excl_fence(resv);
-   fence = f ? to_radeon_fence(f) : NULL;
-   if (fence && fence->rdev == rdev)
-   radeon_sync_fence(sync, fence);
-   else if (f)
-   r = dma_fence_wait(f, true);
-
-   flist = dma_resv_shared_list(resv);
-   if (shared || !flist || r)
-   return r;
-
-   for (i = 0; i < flist->shared_count; ++i) {
-   f = rcu_dereference_protected(flist->shared[i],
- dma_resv_held(resv));
+   dma_resv_for_each_fence(resv, &cursor, shared, f) {
fence = to_radeon_fence(f);
if (fence && fence->rdev == rdev)
radeon_sync_fence(sync, fence);
else
r = dma_fence_wait(f, true);
-
if (r)
break;
}
-- 
2.25.1



[PATCH 10/16] drm/etnaviv: use new iterator in etnaviv_gem_describe

2021-06-17 Thread Christian König
Instead of hand rolling the logic.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index b8fa6ed3dd73..6808dbef5c79 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -437,19 +437,17 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct drm_gem_object *obj,
 static void etnaviv_gem_describe_fence(struct dma_fence *fence,
const char *type, struct seq_file *m)
 {
-   if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
-   seq_printf(m, "\t%9s: %s %s seq %llu\n",
-  type,
-  fence->ops->get_driver_name(fence),
-  fence->ops->get_timeline_name(fence),
-  fence->seqno);
+   seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
+  fence->ops->get_driver_name(fence),
+  fence->ops->get_timeline_name(fence),
+  fence->seqno);
 }
 
static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file *m)
 {
struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
unsigned long off = drm_vma_node_start(&obj->vma_node);
 
@@ -459,19 +457,12 @@ static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file *m)
off, etnaviv_obj->vaddr, obj->size);
 
rcu_read_lock();
-   fobj = dma_resv_shared_list(robj);
-   if (fobj) {
-   unsigned int i, shared_count = fobj->shared_count;
-
-   for (i = 0; i < shared_count; i++) {
-   fence = rcu_dereference(fobj->shared[i]);
+   dma_resv_for_each_fence_unlocked(robj, &cursor, true, fence) {
+   if (cursor.is_exclusive)
+   etnaviv_gem_describe_fence(fence, "Exclusive", m);
+   else
etnaviv_gem_describe_fence(fence, "Shared", m);
-   }
}
-
-   fence = dma_resv_excl_fence(robj);
-   if (fence)
-   etnaviv_gem_describe_fence(fence, "Exclusive", m);
rcu_read_unlock();
 }
 
-- 
2.25.1



[PATCH 13/16] drm/msm: use new iterator in msm_gem_sync_object

2021-06-17 Thread Christian König
Simplifying the code a bit.

Untested since I can't get the driver to compile on !ARM.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/msm/msm_gem.c | 20 +++-
 1 file changed, 3 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 72a07e311de3..24f8c0603385 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -813,25 +813,11 @@ void msm_gem_vunmap(struct drm_gem_object *obj)
 int msm_gem_sync_object(struct drm_gem_object *obj,
struct msm_fence_context *fctx, bool exclusive)
 {
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
-   int i, ret;
-
-   fence = dma_resv_excl_fence(obj->resv);
-   /* don't need to wait on our own fences, since ring is fifo */
-   if (fence && (fence->context != fctx->context)) {
-   ret = dma_fence_wait(fence, true);
-   if (ret)
-   return ret;
-   }
-
-   fobj = dma_resv_shared_list(obj->resv);
-   if (!exclusive || !fobj)
-   return 0;
+   int ret;
 
-   for (i = 0; i < fobj->shared_count; i++) {
-   fence = rcu_dereference_protected(fobj->shared[i],
-   dma_resv_held(obj->resv));
+   dma_resv_for_each_fence(obj->resv, &cursor, exclusive, fence) {
if (fence->context != fctx->context) {
ret = dma_fence_wait(fence, true);
if (ret)
-- 
2.25.1



[PATCH] drm/bridge: ti-sn65dsi83: Fix null pointer dereference in remove callback

2021-06-17 Thread Jonathan Liu
If attach has not been called, unloading the driver can result in a null
pointer dereference in mipi_dsi_detach as ctx->dsi has not been assigned
yet.

Fixes: ceb515ba29ba6b ("drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and SN65DSI84 driver")
Signed-off-by: Jonathan Liu 
---
 drivers/gpu/drm/bridge/ti-sn65dsi83.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index 750f2172ef08..8e9f45c5c7c1 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -671,8 +671,11 @@ static int sn65dsi83_remove(struct i2c_client *client)
 {
struct sn65dsi83 *ctx = i2c_get_clientdata(client);
 
-   mipi_dsi_detach(ctx->dsi);
-   mipi_dsi_device_unregister(ctx->dsi);
+   if (ctx->dsi) {
+   mipi_dsi_detach(ctx->dsi);
+   mipi_dsi_device_unregister(ctx->dsi);
+   }
+
drm_bridge_remove(&ctx->bridge);
of_node_put(ctx->host_node);
 
-- 
2.32.0
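The fix above is a common teardown pattern: a remove callback must tolerate resources that the attach callback never got to allocate. A minimal standalone sketch of that pattern follows; the names (`bridge_ctx`, `dsi_detach`, ...) are illustrative only, not the real driver or mipi_dsi API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the bug: teardown helpers dereference a sub-device
 * pointer that is only assigned in attach().  Hypothetical names. */
struct dsi_dev { int detached; };

struct bridge_ctx {
	struct dsi_dev *dsi;	/* NULL until attach() runs */
};

static int detach_calls;

/* Dereferences its argument unconditionally, like mipi_dsi_detach(). */
static void dsi_detach(struct dsi_dev *dsi)
{
	dsi->detached = 1;
	detach_calls++;
}

/* Guarded teardown: skip the DSI helpers when attach never happened,
 * then run the teardown steps that are safe either way. */
static void bridge_remove(struct bridge_ctx *ctx)
{
	if (ctx->dsi)
		dsi_detach(ctx->dsi);
	/* ...drm_bridge_remove(), of_node_put(), etc. would go here... */
}
```

With the guard, unloading the module before attach is a no-op for the DSI device instead of a NULL dereference.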



Re: [PATCH] drm/nouveau/core/object: fix double free on error in nvkm_ioctl_new()

2021-06-17 Thread Dan Carpenter
On Mon, Jun 14, 2021 at 01:43:27PM +0300, Dan Carpenter wrote:
> If nvkm_object_init() fails then we should not call nvkm_object_fini()
> because it results in calling object->func->fini(object, suspend) twice.
> Once inside the nvkm_object_init() function and once inside the
> nvkm_object_fini() function.
> 
> Fixes: fbd58ebda9c8 ("drm/nouveau/object: merge with handle")
> Signed-off-by: Dan Carpenter 
> ---
> This is something that I spotted while looking for reference counting
> bugs.  I have tried running it, but it does not fix my crashes.  My
> system is basically unusable.  It's something to do with the new version
> of Firefox which triggers the refcount_t underflow, but switching to
> Epiphany doesn't solve the issue either.
> 
>  drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> index d777df5a64e6..87c761fb475a 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> @@ -134,8 +134,8 @@ nvkm_ioctl_new(struct nvkm_client *client,
>   return 0;
>   }
>   ret = -EEXIST;
> + nvkm_object_fini(object, false);
>   }
> - nvkm_object_fini(object, false);

Actually calling nvkm_object_fini() is probably fine.  It just screws
around with the registers and it's probably fine if we do that twice.

Calling .dtor() when .ctor() fails is actually required because .ctor
doesn't clean up after itself.

So this patch is not required.  The other patch is required.
https://lore.kernel.org/nouveau/YMinJwpIei9n1Pn1@mwanda/T/

In the end, I had to give up on fixing the hang and downgrade to
debian's long term support version of firefox.

regards,
dan carpenter
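The double-fini hazard the original patch targeted can be shown with a small standalone sketch. This is a toy model with hypothetical names, not the nvkm API: init() runs its own teardown hook when it fails, so a caller that also runs the hook on every path invokes it twice, while the fixed caller only unwinds after a successful init (e.g. the -EEXIST path in the patch above).

```c
#include <assert.h>

static int fini_calls;

/* Toy teardown hook, standing in for object->func->fini(). */
static void object_fini(void)
{
	fini_calls++;
}

/* Toy init: cleans up after itself on failure, like the code discussed. */
static int object_init(int fail)
{
	if (fail) {
		object_fini();	/* init already unwinds its own failure */
		return -1;
	}
	return 0;
}

/* Buggy caller: fini runs even after a failed init, so it runs twice. */
static int create_buggy(int fail_init)
{
	int ret = object_init(fail_init);

	object_fini();		/* unconditional unwind: double fini on error */
	return ret;
}

/* Fixed caller: fini only after init succeeded and a later step
 * (e.g. inserting the object, which may hit -EEXIST) must be undone. */
static int create_fixed(int fail_init)
{
	int ret = object_init(fail_init);

	if (ret == 0)
		object_fini();	/* unwind the successfully initialized object */
	return ret;
}
```

As the follow-up notes, the double call may be harmless for this particular fini(), which is why the patch was ultimately dropped; the sketch only illustrates why the pattern looked suspicious.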



[PATCH] drm/auth: Move master pointer from drm_device to drm_file

2021-06-17 Thread Qiang Ma
The master pointer in drm_file is cleared to zero during multi-user switching,
so drm_master_open() needs to check file_priv->master and call
drm_new_set_master() based on the master pointer from drm_file.

Signed-off-by: Qiang Ma 
---
 drivers/gpu/drm/drm_auth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index f2d46b7ac6f9..02431af6d0c5 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -302,7 +302,7 @@ int drm_master_open(struct drm_file *file_priv)
/* if there is no current master make this fd it, but do not create
 * any master object for render clients */
mutex_lock(&dev->master_mutex);
-   if (!dev->master)
+   if (!file_priv->master)
ret = drm_new_set_master(dev, file_priv);
else
file_priv->master = drm_master_get(dev->master);
-- 
2.20.1





Re: [PATCH] drm/bridge: ti-sn65dsi83: Fix null pointer dereference in remove callback

2021-06-17 Thread Marek Vasut

On 6/17/21 1:19 PM, Jonathan Liu wrote:

If attach has not been called, unloading the driver can result in a null
pointer dereference in mipi_dsi_detach as ctx->dsi has not been assigned
yet.

Fixes: ceb515ba29ba6b ("drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and SN65DSI84 driver")
Signed-off-by: Jonathan Liu 
---
  drivers/gpu/drm/bridge/ti-sn65dsi83.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index 750f2172ef08..8e9f45c5c7c1 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -671,8 +671,11 @@ static int sn65dsi83_remove(struct i2c_client *client)
  {
struct sn65dsi83 *ctx = i2c_get_clientdata(client);
  
-	mipi_dsi_detach(ctx->dsi);
-	mipi_dsi_device_unregister(ctx->dsi);
+   if (ctx->dsi) {
+   mipi_dsi_detach(ctx->dsi);
+   mipi_dsi_device_unregister(ctx->dsi);
+   }
+
drm_bridge_remove(&ctx->bridge);
of_node_put(ctx->host_node);


Looks OK to me.

Reviewed-by: Marek Vasut 

Thanks !


Re: [PATCH v3] Documentation: gpu: Mention the requirements for new properties

2021-06-17 Thread Pekka Paalanen
On Thu, 17 Jun 2021 13:29:48 +0300
Laurent Pinchart  wrote:

> Hi Pekka,
> 
> On Thu, Jun 17, 2021 at 10:27:01AM +0300, Pekka Paalanen wrote:
> > On Thu, 17 Jun 2021 00:05:24 +0300 Laurent Pinchart wrote:  
> > > On Tue, Jun 15, 2021 at 01:16:56PM +0300, Pekka Paalanen wrote:  
> > > > On Tue, 15 Jun 2021 12:45:57 +0300 Laurent Pinchart wrote:
> > > > > On Tue, Jun 15, 2021 at 07:15:18AM +, Simon Ser wrote:
> > > > > > On Tuesday, June 15th, 2021 at 09:03, Pekka Paalanen wrote:
> > > > > >   
> > > > > > > indeed it will, but what else could one do to test userspace KMS
> > > > > > > clients in generic CI where all you can have is virtual hardware? 
> > > > > > > Maybe
> > > > > > > in the long run VKMS needs to loop back to a userspace daemon that
> > > > > > > implements all the complex processing and returns the writeback 
> > > > > > > result
> > > > > > > via VKMS again? That daemon would then need a single upstream, 
> > > > > > > like the
> > > > > > > kernel, where it is maintained and correctness verified.  
> > > > > > 
> > > > > > The complex processing must be implemented even without write-back, 
> > > > > > because
> > > > > > user-space can ask for CRCs of the CRTC.
> > > > > >   
> > > > > > > Or an LD_PRELOAD that hijacks all KMS ioctls and implements 
> > > > > > > virtual
> > > > > > > stuff in userspace? Didn't someone already have something like 
> > > > > > > that?
> > > > > > > It would need to be lifted to be a required part of kernel UAPI
> > > > > > > submissions, I suppose like IGT is nowadays.  
> > > > > > 
> > > > > > FWIW, I have a mock libdrm [1] for libliftoff. This is nowhere near 
> > > > > > a full
> > > > > > software implementation with write-back connectors, but allows to 
> > > > > > expose
> > > > > > virtual planes and check atomic commits in CI.
> > > > > > 
> > > > > > [1]: 
> > > > > > https://github.com/emersion/libliftoff/blob/master/test/libdrm_mock.c
> > > > > >   
> > > > > > > For compositor developers like me knowing the exact formulas 
> > > > > > > would be a huge
> > > > > > > benefit as it would allow me to use KMS to off-load 
> > > > > > > precision-sensitive
> > > > > > > operations (e.g.  professional color management). Otherwise, 
> > > > > > > compositors
> > > > > > > probably need a switch: "high quality color management? Then do 
> > > > > > > not use KMS
> > > > > > > features."  
> > > > > > 
> > > > > > I think for alpha blending there are already rounding issues 
> > > > > > depending on the
> > > > > > hardware. I wouldn't keep my hopes up for any guarantee that all hw 
> > > > > > uses the
> > > > > > exact same formulae for color management stuff.  
> > > > > 
> > > > > Good, because otherwise you would be very quickly disappointed :-)
> > > > > 
> > > > > For scaling we would also need to replicate the exact same filter 
> > > > > taps,
> > > > > which are often not documented.
> > > > 
> > > > That is where the documented tolerances come into play.
> > > 
> > > This is something I've experimented with a while ago, when developing
> > > automated tests for the rcar-du driver. When playing with different
> > > input images we had to constantly increases tolerances, up to a point
> > > where the tests started to miss real problems :-(  
> > 
> > What should we infer from that? That the hardware is broken and
> > exposing those KMS properties is a false promise?  
> 
> No, just that the scaler doesn't document the internal hardware
> implementation (number of taps in the filters, coefficients, rounding,
> ...). That's the rule, not the exception, and it doesn't prevent correct
> operation, images get scaled in a reproducible way (the same input
> produces the same output).
> 
> > If a driver on certain hardware cannot correctly implement a KMS
> > property over the full domain of the input space, should that driver
> > then simply not expose the KMS property at all?  
> 
> The properties involved here would the the SRC and CRTC rectangles for
> the planes. They don't document pixel-perfect scaling :-)
> 
> > But I would assume that the vendor still wants to expose the features
> > in upstream kernels, yet they cannot use the standard KMS properties
> > for that. Should the driver then expose vendor-specific properties with
> > the disclaimer that the result is not always what one would expect, so
> > that userspace written and tested explicitly for that hardware can
> > still work?
> > 
> > That is, a sufficient justification for a vendor-specific KMS property
> > would be that a standard property already exists, but the hardware is
> > too buggy to make it work. IOW, give up trying to make sense.  
> 
> It's not just about buggy hardware, it's also about implementation
> specificities, such as rounding, filters, order of operations in the
> color management pipeline (it's relatively easy when you only have two
> LUTs and a CCM matrix, but if you through 3D LUTs and other tonemapping
> feat

Re: [PATCH v3] Documentation: gpu: Mention the requirements for new properties

2021-06-17 Thread Laurent Pinchart
Hi Pekka,

On Thu, Jun 17, 2021 at 02:33:11PM +0300, Pekka Paalanen wrote:
> On Thu, 17 Jun 2021 13:29:48 +0300 Laurent Pinchart wrote:
> > On Thu, Jun 17, 2021 at 10:27:01AM +0300, Pekka Paalanen wrote:
> > > On Thu, 17 Jun 2021 00:05:24 +0300 Laurent Pinchart wrote:  
> > > > On Tue, Jun 15, 2021 at 01:16:56PM +0300, Pekka Paalanen wrote:  
> > > > > On Tue, 15 Jun 2021 12:45:57 +0300 Laurent Pinchart wrote:
> > > > > > On Tue, Jun 15, 2021 at 07:15:18AM +, Simon Ser wrote:
> > > > > > > On Tuesday, June 15th, 2021 at 09:03, Pekka Paalanen wrote:
> > > > > > >   
> > > > > > > > indeed it will, but what else could one do to test userspace KMS
> > > > > > > > clients in generic CI where all you can have is virtual 
> > > > > > > > hardware? Maybe
> > > > > > > > in the long run VKMS needs to loop back to a userspace daemon 
> > > > > > > > that
> > > > > > > > implements all the complex processing and returns the writeback 
> > > > > > > > result
> > > > > > > > via VKMS again? That daemon would then need a single upstream, 
> > > > > > > > like the
> > > > > > > > kernel, where it is maintained and correctness verified.  
> > > > > > > 
> > > > > > > The complex processing must be implemented even without 
> > > > > > > write-back, because
> > > > > > > user-space can ask for CRCs of the CRTC.
> > > > > > >   
> > > > > > > > Or an LD_PRELOAD that hijacks all KMS ioctls and implements 
> > > > > > > > virtual
> > > > > > > > stuff in userspace? Didn't someone already have something like 
> > > > > > > > that?
> > > > > > > > It would need to be lifted to be a required part of kernel UAPI
> > > > > > > > submissions, I suppose like IGT is nowadays.  
> > > > > > > 
> > > > > > > FWIW, I have a mock libdrm [1] for libliftoff. This is nowhere 
> > > > > > > near a full
> > > > > > > software implementation with write-back connectors, but allows to 
> > > > > > > expose
> > > > > > > virtual planes and check atomic commits in CI.
> > > > > > > 
> > > > > > > [1]: 
> > > > > > > https://github.com/emersion/libliftoff/blob/master/test/libdrm_mock.c
> > > > > > >   
> > > > > > > > For compositor developers like me knowing the exact formulas 
> > > > > > > > would be a huge
> > > > > > > > benefit as it would allow me to use KMS to off-load 
> > > > > > > > precision-sensitive
> > > > > > > > operations (e.g.  professional color management). Otherwise, 
> > > > > > > > compositors
> > > > > > > > probably need a switch: "high quality color management? Then do 
> > > > > > > > not use KMS
> > > > > > > > features."  
> > > > > > > 
> > > > > > > I think for alpha blending there are already rounding issues 
> > > > > > > depending on the
> > > > > > > hardware. I wouldn't keep my hopes up for any guarantee that all 
> > > > > > > hw uses the
> > > > > > > exact same formulae for color management stuff.  
> > > > > > 
> > > > > > Good, because otherwise you would be very quickly disappointed :-)
> > > > > > 
> > > > > > For scaling we would also need to replicate the exact same filter 
> > > > > > taps,
> > > > > > which are often not documented.
> > > > > 
> > > > > That is where the documented tolerances come into play.
> > > > 
> > > > This is something I've experimented with a while ago, when developing
> > > > automated tests for the rcar-du driver. When playing with different
> > > > input images we had to constantly increases tolerances, up to a point
> > > > where the tests started to miss real problems :-(  
> > > 
> > > What should we infer from that? That the hardware is broken and
> > > exposing those KMS properties is a false promise?  
> > 
> > No, just that the scaler doesn't document the internal hardware
> > implementation (number of taps in the filters, coefficients, rounding,
> > ...). That's the rule, not the exception, and it doesn't prevent correct
> > operation, images get scaled in a reproducible way (the same input
> > produces the same output).
> > 
> > > If a driver on certain hardware cannot correctly implement a KMS
> > > property over the full domain of the input space, should that driver
> > > then simply not expose the KMS property at all?  
> > 
> > The properties involved here would the the SRC and CRTC rectangles for
> > the planes. They don't document pixel-perfect scaling :-)
> > 
> > > But I would assume that the vendor still wants to expose the features
> > > in upstream kernels, yet they cannot use the standard KMS properties
> > > for that. Should the driver then expose vendor-specific properties with
> > > the disclaimer that the result is not always what one would expect, so
> > > that userspace written and tested explicitly for that hardware can
> > > still work?
> > > 
> > > That is, a sufficient justification for a vendor-specific KMS property
> > > would be that a standard property already exists, but the hardware is
> > > too buggy to make it work. IOW, give up trying to make sense.  
> > 
> > It's not just about buggy 

Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-17 Thread Jassi Brar
On Tue, Jun 15, 2021 at 2:15 PM Rob Herring  wrote:
>
> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that is DT
> schema specific behavior and not standard json-schema behavior. The tooling
> will fixup the final schema adding any unspecified minItems/maxItems.
>
> This condition is partially checked with the meta-schema already, but
> only if both 'minItems' and 'maxItems' are equal to the 'items' length.
> An improved meta-schema is pending.
>
> Cc: Jens Axboe 
> Cc: Stephen Boyd 
> Cc: Herbert Xu 
> Cc: "David S. Miller" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Vinod Koul 
> Cc: Bartosz Golaszewski 
> Cc: Kamal Dasu 
> Cc: Jonathan Cameron 
> Cc: Lars-Peter Clausen 
> Cc: Thomas Gleixner 
> Cc: Marc Zyngier 
> Cc: Joerg Roedel 
> Cc: Jassi Brar 
> Cc: Mauro Carvalho Chehab 
> Cc: Krzysztof Kozlowski 
> Cc: Ulf Hansson 
> Cc: Jakub Kicinski 
> Cc: Wolfgang Grandegger 
> Cc: Marc Kleine-Budde 
> Cc: Andrew Lunn 
> Cc: Vivien Didelot 
> Cc: Vladimir Oltean 
> Cc: Bjorn Helgaas 
> Cc: Kishon Vijay Abraham I 
> Cc: Linus Walleij 
> Cc: "Uwe Kleine-König" 
> Cc: Lee Jones 
> Cc: Ohad Ben-Cohen 
> Cc: Mathieu Poirier 
> Cc: Philipp Zabel 
> Cc: Paul Walmsley 
> Cc: Palmer Dabbelt 
> Cc: Albert Ou 
> Cc: Alessandro Zummo 
> Cc: Alexandre Belloni 
> Cc: Greg Kroah-Hartman 
> Cc: Mark Brown 
> Cc: Zhang Rui 
> Cc: Daniel Lezcano 
> Cc: Wim Van Sebroeck 
> Cc: Guenter Roeck 
> Signed-off-by: Rob Herring 
> ---
>  .../devicetree/bindings/ata/nvidia,tegra-ahci.yaml  | 1 -
>  .../devicetree/bindings/clock/allwinner,sun4i-a10-ccu.yaml  | 2 --
>  .../devicetree/bindings/clock/qcom,gcc-apq8064.yaml | 1 -
>  Documentation/devicetree/bindings/clock/qcom,gcc-sdx55.yaml | 2 --
>  .../devicetree/bindings/clock/qcom,gcc-sm8350.yaml  | 2 --
>  .../devicetree/bindings/clock/sprd,sc9863a-clk.yaml | 1 -
>  .../devicetree/bindings/crypto/allwinner,sun8i-ce.yaml  | 2 --
>  Documentation/devicetree/bindings/crypto/fsl-dcp.yaml   | 1 -
>  .../display/allwinner,sun4i-a10-display-backend.yaml| 6 --
>  .../bindings/display/allwinner,sun6i-a31-mipi-dsi.yaml  | 1 -
>  .../bindings/display/allwinner,sun8i-a83t-dw-hdmi.yaml  | 4 
>  .../bindings/display/allwinner,sun8i-a83t-hdmi-phy.yaml | 2 --
>  .../bindings/display/allwinner,sun8i-r40-tcon-top.yaml  | 2 --
>  .../devicetree/bindings/display/bridge/cdns,mhdp8546.yaml   | 2 --
>  .../bindings/display/rockchip/rockchip,dw-hdmi.yaml | 2 --
>  Documentation/devicetree/bindings/display/st,stm32-dsi.yaml | 2 --
>  .../devicetree/bindings/display/st,stm32-ltdc.yaml  | 1 -
>  .../devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.yaml | 4 
>  .../devicetree/bindings/dma/renesas,rcar-dmac.yaml  | 1 -
>  .../devicetree/bindings/edac/amazon,al-mc-edac.yaml | 2 --
>  Documentation/devicetree/bindings/eeprom/at24.yaml  | 1 -
>  Documentation/devicetree/bindings/example-schema.yaml   | 2 --
>  Documentation/devicetree/bindings/gpu/brcm,bcm-v3d.yaml | 1 -
>  Documentation/devicetree/bindings/gpu/vivante,gc.yaml   | 1 -
>  Documentation/devicetree/bindings/i2c/brcm,brcmstb-i2c.yaml | 1 -
>  .../devicetree/bindings/i2c/marvell,mv64xxx-i2c.yaml| 2 --
>  .../devicetree/bindings/i2c/mellanox,i2c-mlxbf.yaml | 1 -
>  .../devicetree/bindings/iio/adc/amlogic,meson-saradc.yaml   | 1 -
>  .../devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.yaml | 2 --
>  .../bindings/interrupt-controller/fsl,irqsteer.yaml | 1 -
>  .../bindings/interrupt-controller/loongson,liointc.yaml | 1 -
>  Documentation/devicetree/bindings/iommu/arm,smmu-v3.yaml| 1 -
>  .../devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml   | 1 -
>  .../devicetree/bindings/mailbox/st,stm32-ipcc.yaml  | 2 --
>  .../devicetree/bindings/media/amlogic,gx-vdec.yaml  | 1 -
>  Documentation/devicetree/bindings/media/i2c/adv7604.yaml| 1 -
>  .../devicetree/bindings/media/marvell,mmp2-ccic.yaml| 1 -
>  .../devicetree/bindings/media/qcom,sc7180-venus.yaml| 1 -
>  .../devicetree/bindings/media/qcom,sdm845-venus-v2.yaml | 1 -
>  .../devicetree/bindings/media/qcom,sm8250-venus.yaml| 1 -
>  Documentation/devicetree/bindings/media/renesas,drif.yaml   | 1 -
>  .../bindings/memory-controllers/mediatek,smi-common.yaml| 6 ++
>  .../bindings/memory-controllers/mediatek,smi-larb.yaml  | 1 -
>  .../devicetree/bindings/mmc/allwinner,sun4i-a10-mmc.yaml| 2 --
>  Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml| 1 -
>  Documentation/devicetree/bindings/mmc/mtk-sd.yaml   | 2 --
>  Documentation/devicetree/bindings/mmc/renesas,sdhi.yaml | 2 --
>  Documentation/devicetree/bindings/mmc/sdhci-am654.yaml  | 1 -
>  Documentation/devicetree/bindings/mmc/sdhci-pxa.yaml| 1 -
>  .../devicetree/bindings/net/amlogic,meson-dwmac.yaml

[PATCH 0/7] drm/msm/dpu: merge dpu_core_irq into dpu_hw_interrupts

2021-06-17 Thread Dmitry Baryshkov
This patch series reworks DPU's irq handling code by merging
dpu_core_irq into dpu_hw_intr, reworking/dropping irq-related helpers
and wrappers, etc.

Dependencies: 
https://lore.kernel.org/linux-arm-msm/20210611170003.3539059-1-bjorn.anders...@linaro.org/


Dmitry Baryshkov (7):
  drm/msm/dpu: squash dpu_core_irq into dpu_hw_interrupts
  drm/msm/dpu: don't clear IRQ register twice
  drm/msm/dpu: merge struct dpu_irq into struct dpu_hw_intr
  drm/msm/dpu: hide struct dpu_irq_callback
  drm/msm/dpu: remove extra wrappers around dpu_core_irq
  drm/msm/dpu: get rid of dpu_encoder_helper_(un)register_irq
  drm/msm/dpu: remove struct dpu_encoder_irq and enum dpu_intr_idx

 drivers/gpu/drm/msm/Makefile   |   1 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c   | 256 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h   |  30 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c| 111 ++--
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |  66 +
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c   |  99 +++
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c   |  56 ++--
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c  | 306 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h  |  92 +--
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c|  27 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h|  25 --
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h  |  49 ++--
 12 files changed, 383 insertions(+), 735 deletions(-)
 delete mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c




[PATCH 3/7] drm/msm/dpu: merge struct dpu_irq into struct dpu_hw_intr

2021-06-17 Thread Dmitry Baryshkov
As dpu_core_irq was merged into dpu_hw_intr, merge the data structures too,
removing the need for a separate struct dpu_irq.

Signed-off-by: Dmitry Baryshkov 
---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 51 +--
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h |  5 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h   | 13 -
 3 files changed, 28 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
index 17ad78d49948..e5dce884e7c0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
@@ -127,20 +127,19 @@ static const struct dpu_intr_reg dpu_intr_set[] = {
  */
 static void dpu_core_irq_callback_handler(struct dpu_kms *dpu_kms, int irq_idx)
 {
-   struct dpu_irq *irq_obj = &dpu_kms->irq_obj;
struct dpu_irq_callback *cb;
 
pr_debug("irq_idx=%d\n", irq_idx);
 
-   if (list_empty(&irq_obj->irq_cb_tbl[irq_idx]))
+   if (list_empty(&dpu_kms->hw_intr->irq_cb_tbl[irq_idx]))
DRM_ERROR("no registered cb, idx:%d\n", irq_idx);
 
-   atomic_inc(&irq_obj->irq_counts[irq_idx]);
+   atomic_inc(&dpu_kms->hw_intr->irq_counts[irq_idx]);
 
/*
 * Perform registered function callback
 */
-   list_for_each_entry(cb, &irq_obj->irq_cb_tbl[irq_idx], list)
+   list_for_each_entry(cb, &dpu_kms->hw_intr->irq_cb_tbl[irq_idx], list)
if (cb->func)
cb->func(cb->arg, irq_idx);
 }
@@ -420,6 +419,10 @@ void dpu_hw_intr_destroy(struct dpu_hw_intr *intr)
 {
if (intr) {
kfree(intr->cache_irq_mask);
+
+   kfree(intr->irq_cb_tbl);
+   kfree(intr->irq_counts);
+
kfree(intr);
}
 }
@@ -429,7 +432,7 @@ int dpu_core_irq_register_callback(struct dpu_kms *dpu_kms, 
int irq_idx,
 {
unsigned long irq_flags;
 
-   if (!dpu_kms->irq_obj.irq_cb_tbl) {
+   if (!dpu_kms->hw_intr->irq_cb_tbl) {
DPU_ERROR("invalid params\n");
return -EINVAL;
}
@@ -453,9 +456,9 @@ int dpu_core_irq_register_callback(struct dpu_kms *dpu_kms, 
int irq_idx,
trace_dpu_core_irq_register_callback(irq_idx, register_irq_cb);
list_del_init(&register_irq_cb->list);
list_add_tail(&register_irq_cb->list,
-   &dpu_kms->irq_obj.irq_cb_tbl[irq_idx]);
+   &dpu_kms->hw_intr->irq_cb_tbl[irq_idx]);
if (list_is_first(&register_irq_cb->list,
-   &dpu_kms->irq_obj.irq_cb_tbl[irq_idx])) {
+   &dpu_kms->hw_intr->irq_cb_tbl[irq_idx])) {
int ret = dpu_hw_intr_enable_irq_locked(
dpu_kms->hw_intr,
irq_idx);
@@ -473,7 +476,7 @@ int dpu_core_irq_unregister_callback(struct dpu_kms 
*dpu_kms, int irq_idx,
 {
unsigned long irq_flags;
 
-   if (!dpu_kms->irq_obj.irq_cb_tbl) {
+   if (!dpu_kms->hw_intr->irq_cb_tbl) {
DPU_ERROR("invalid params\n");
return -EINVAL;
}
@@ -497,7 +500,7 @@ int dpu_core_irq_unregister_callback(struct dpu_kms 
*dpu_kms, int irq_idx,
trace_dpu_core_irq_unregister_callback(irq_idx, register_irq_cb);
list_del_init(&register_irq_cb->list);
/* empty callback list but interrupt is still enabled */
-   if (list_empty(&dpu_kms->irq_obj.irq_cb_tbl[irq_idx])) {
+   if (list_empty(&dpu_kms->hw_intr->irq_cb_tbl[irq_idx])) {
int ret = dpu_hw_intr_disable_irq_locked(
dpu_kms->hw_intr,
irq_idx);
@@ -515,19 +518,18 @@ int dpu_core_irq_unregister_callback(struct dpu_kms 
*dpu_kms, int irq_idx,
 static int dpu_debugfs_core_irq_show(struct seq_file *s, void *v)
 {
struct dpu_kms *dpu_kms = s->private;
-   struct dpu_irq *irq_obj = &dpu_kms->irq_obj;
struct dpu_irq_callback *cb;
unsigned long irq_flags;
int i, irq_count, cb_count;
 
-   if (WARN_ON(!irq_obj->irq_cb_tbl))
+   if (WARN_ON(!dpu_kms->hw_intr->irq_cb_tbl))
return 0;
 
-   for (i = 0; i < irq_obj->total_irqs; i++) {
+   for (i = 0; i < dpu_kms->hw_intr->total_irqs; i++) {
spin_lock_irqsave(&dpu_kms->hw_intr->irq_lock, irq_flags);
cb_count = 0;
-   irq_count = atomic_read(&irq_obj->irq_counts[i]);
-   list_for_each_entry(cb, &irq_obj->irq_cb_tbl[i], list)
+   irq_count = atomic_read(&dpu_kms->hw_intr->irq_counts[i]);
+   list_for_each_entry(cb, &dpu_kms->hw_intr->irq_cb_tbl[i], list)
cb_count++;
spin_unlock_irqrestore(&dpu_kms->hw_intr->irq_lock, irq_flags);
 
@@ -559,14 +561,13 @@ void dpu_core_irq_preinstall(struct dpu_kms *dpu_kms)
pm_runtime_put_sync(&dpu_kms->pdev->dev);
 
/* Creat

[PATCH 1/7] drm/msm/dpu: squash dpu_core_irq into dpu_hw_interrupts

2021-06-17 Thread Dmitry Baryshkov
With dpu_core_irq being the wrapper around dpu_hw_interrupts, there is
little sense in having them separate. Squash them together to remove
another layer of abstraction (hw_intr ops).

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/Makefile  |   1 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c  | 256 -
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 269 ++
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h |  87 --
 4 files changed, 214 insertions(+), 399 deletions(-)
 delete mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 65d86cecb571..54655e459866 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -51,7 +51,6 @@ msm-y := \
disp/mdp5/mdp5_mixer.o \
disp/mdp5/mdp5_plane.o \
disp/mdp5/mdp5_smp.o \
-   disp/dpu1/dpu_core_irq.o \
disp/dpu1/dpu_core_perf.o \
disp/dpu1/dpu_crtc.o \
disp/dpu1/dpu_encoder.o \
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c
deleted file mode 100644
index 18557b9713b6..
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c
+++ /dev/null
@@ -1,256 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
- */
-
-#define pr_fmt(fmt)"[drm:%s:%d] " fmt, __func__, __LINE__
-
-#include 
-#include 
-#include 
-#include 
-
-#include "dpu_core_irq.h"
-#include "dpu_trace.h"
-
-/**
- * dpu_core_irq_callback_handler - dispatch core interrupts
- * @arg:   private data of callback handler
- * @irq_idx:   interrupt index
- */
-static void dpu_core_irq_callback_handler(void *arg, int irq_idx)
-{
-   struct dpu_kms *dpu_kms = arg;
-   struct dpu_irq *irq_obj = &dpu_kms->irq_obj;
-   struct dpu_irq_callback *cb;
-
-   pr_debug("irq_idx=%d\n", irq_idx);
-
-   if (list_empty(&irq_obj->irq_cb_tbl[irq_idx]))
-   DRM_ERROR("no registered cb, idx:%d\n", irq_idx);
-
-   atomic_inc(&irq_obj->irq_counts[irq_idx]);
-
-   /*
-* Perform registered function callback
-*/
-   list_for_each_entry(cb, &irq_obj->irq_cb_tbl[irq_idx], list)
-   if (cb->func)
-   cb->func(cb->arg, irq_idx);
-}
-
-u32 dpu_core_irq_read(struct dpu_kms *dpu_kms, int irq_idx, bool clear)
-{
-   if (!dpu_kms->hw_intr ||
-   !dpu_kms->hw_intr->ops.get_interrupt_status)
-   return 0;
-
-   if (irq_idx < 0) {
-   DPU_ERROR("[%pS] invalid irq_idx=%d\n",
-   __builtin_return_address(0), irq_idx);
-   return 0;
-   }
-
-   return dpu_kms->hw_intr->ops.get_interrupt_status(dpu_kms->hw_intr,
-   irq_idx, clear);
-}
-
-int dpu_core_irq_register_callback(struct dpu_kms *dpu_kms, int irq_idx,
-   struct dpu_irq_callback *register_irq_cb)
-{
-   unsigned long irq_flags;
-
-   if (!dpu_kms->irq_obj.irq_cb_tbl) {
-   DPU_ERROR("invalid params\n");
-   return -EINVAL;
-   }
-
-   if (!register_irq_cb || !register_irq_cb->func) {
-   DPU_ERROR("invalid irq_cb:%d func:%d\n",
-   register_irq_cb != NULL,
-   register_irq_cb ?
-   register_irq_cb->func != NULL : -1);
-   return -EINVAL;
-   }
-
-   if (irq_idx < 0 || irq_idx >= dpu_kms->hw_intr->total_irqs) {
-   DPU_ERROR("invalid IRQ index: [%d]\n", irq_idx);
-   return -EINVAL;
-   }
-
-   DPU_DEBUG("[%pS] irq_idx=%d\n", __builtin_return_address(0), irq_idx);
-
-   irq_flags = dpu_kms->hw_intr->ops.lock(dpu_kms->hw_intr);
-   trace_dpu_core_irq_register_callback(irq_idx, register_irq_cb);
-   list_del_init(&register_irq_cb->list);
-   list_add_tail(&register_irq_cb->list,
-   &dpu_kms->irq_obj.irq_cb_tbl[irq_idx]);
-   if (list_is_first(&register_irq_cb->list,
-   &dpu_kms->irq_obj.irq_cb_tbl[irq_idx])) {
-   int ret = dpu_kms->hw_intr->ops.enable_irq_locked(
-   dpu_kms->hw_intr,
-   irq_idx);
-   if (ret)
-   DPU_ERROR("Fail to enable IRQ for irq_idx:%d\n",
-   irq_idx);
-   }
-   dpu_kms->hw_intr->ops.unlock(dpu_kms->hw_intr, irq_flags);
-
-   return 0;
-}
-
-int dpu_core_irq_unregister_callback(struct dpu_kms *dpu_kms, int irq_idx,
-   struct dpu_irq_callback *register_irq_cb)
-{
-   unsigned long irq_flags;
-
-   if (!dpu_kms->irq_obj.irq_cb_tbl) {
-   DPU_ERROR("invalid params\n");
-   return -EINVAL;
-   }
-
-   if (!register_irq_cb || !register_irq_cb->func) {
-

[PATCH 2/7] drm/msm/dpu: don't clear IRQ register twice

2021-06-17 Thread Dmitry Baryshkov
We already clear the IRQ status register before processing IRQs, so do
not clear the register again. Especially do not clear the IRQ status
_after_ processing the IRQ as this way we can lose the event.

Signed-off-by: Dmitry Baryshkov 
---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c   | 17 -
 1 file changed, 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
index 8e890f981afd..17ad78d49948 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
@@ -120,21 +120,6 @@ static const struct dpu_intr_reg dpu_intr_set[] = {
 #define DPU_IRQ_REG(irq_idx)   (irq_idx / 32)
 #define DPU_IRQ_MASK(irq_idx)  (BIT(irq_idx % 32))
 
-static void dpu_hw_intr_clear_intr_status_nolock(struct dpu_hw_intr *intr,
-   int irq_idx)
-{
-   int reg_idx;
-
-   if (!intr)
-   return;
-
-   reg_idx = DPU_IRQ_REG(irq_idx);
-   DPU_REG_WRITE(&intr->hw, dpu_intr_set[reg_idx].clr_off, 
DPU_IRQ_MASK(irq_idx));
-
-   /* ensure register writes go through */
-   wmb();
-}
-
 /**
  * dpu_core_irq_callback_handler - dispatch core interrupts
  * @arg:   private data of callback handler
@@ -203,8 +188,6 @@ irqreturn_t dpu_core_irq(struct dpu_kms *dpu_kms)
 
dpu_core_irq_callback_handler(dpu_kms, irq_idx);
 
-   dpu_hw_intr_clear_intr_status_nolock(intr, irq_idx);
-
/*
 * When callback finish, clear the irq_status
 * with the matching mask. Once irq_status
-- 
2.30.2



[PATCH 6/7] drm/msm/dpu: get rid of dpu_encoder_helper_(un)register_irq

2021-06-17 Thread Dmitry Baryshkov
Get rid of the dpu_encoder_helper_register_irq/unregister_irq helpers and
call dpu_core_irq_register_callback/unregister_callback directly, without
the extra wrappers around them.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   | 64 ---
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  | 18 --
 .../drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c  | 39 +++
 .../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c  | 21 --
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c |  4 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h | 29 +++--
 6 files changed, 56 insertions(+), 119 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 186b2f87d193..23a7a22d4f3f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -333,70 +333,6 @@ int dpu_encoder_helper_wait_for_irq(struct 
dpu_encoder_phys *phys_enc,
return ret;
 }
 
-int dpu_encoder_helper_register_irq(struct dpu_encoder_phys *phys_enc,
-   enum dpu_intr_idx intr_idx)
-{
-   struct dpu_encoder_irq *irq;
-   int ret = 0;
-
-   if (intr_idx >= INTR_IDX_MAX) {
-   DPU_ERROR("invalid params\n");
-   return -EINVAL;
-   }
-   irq = &phys_enc->irq[intr_idx];
-
-   if (irq->irq_idx < 0) {
-   DPU_ERROR_PHYS(phys_enc,
-   "invalid IRQ index:%d\n", irq->irq_idx);
-   return -EINVAL;
-   }
-
-   ret = dpu_core_irq_register_callback(phys_enc->dpu_kms, irq->irq_idx,
-   irq->func, phys_enc);
-   if (ret) {
-   DPU_ERROR_PHYS(phys_enc,
-   "failed to register IRQ callback for %s\n",
-   irq->name);
-   irq->irq_idx = -EINVAL;
-   return ret;
-   }
-
-   trace_dpu_enc_irq_register_success(DRMID(phys_enc->parent), intr_idx,
-   irq->irq_idx);
-
-   return ret;
-}
-
-int dpu_encoder_helper_unregister_irq(struct dpu_encoder_phys *phys_enc,
-   enum dpu_intr_idx intr_idx)
-{
-   struct dpu_encoder_irq *irq;
-   int ret;
-
-   irq = &phys_enc->irq[intr_idx];
-
-   /* silently skip irqs that weren't registered */
-   if (irq->irq_idx < 0) {
-   DRM_ERROR("duplicate unregister id=%u, intr=%d, irq=%d",
- DRMID(phys_enc->parent), intr_idx,
- irq->irq_idx);
-   return 0;
-   }
-
-   ret = dpu_core_irq_unregister_callback(phys_enc->dpu_kms, irq->irq_idx,
-   irq->func, phys_enc);
-   if (ret) {
-   DRM_ERROR("unreg cb fail id=%u, intr=%d, irq=%d ret=%d",
- DRMID(phys_enc->parent), intr_idx,
- irq->irq_idx, ret);
-   }
-
-   trace_dpu_enc_irq_unregister_success(DRMID(phys_enc->parent), intr_idx,
-irq->irq_idx);
-
-   return 0;
-}
-
 int dpu_encoder_get_frame_count(struct drm_encoder *drm_enc)
 {
struct dpu_encoder_virt *dpu_enc;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
index 80d87871fd94..ff2218155b44 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -364,22 +364,4 @@ int dpu_encoder_helper_wait_for_irq(struct 
dpu_encoder_phys *phys_enc,
enum dpu_intr_idx intr_idx,
struct dpu_encoder_wait_info *wait_info);
 
-/**
- * dpu_encoder_helper_register_irq - register and enable an irq
- * @phys_enc: Pointer to physical encoder structure
- * @intr_idx: encoder interrupt index
- * @Return: 0 or -ERROR
- */
-int dpu_encoder_helper_register_irq(struct dpu_encoder_phys *phys_enc,
-   enum dpu_intr_idx intr_idx);
-
-/**
- * dpu_encoder_helper_unregister_irq - unregister and disable an irq
- * @phys_enc: Pointer to physical encoder structure
- * @intr_idx: encoder interrupt index
- * @Return: 0 or -ERROR
- */
-int dpu_encoder_helper_unregister_irq(struct dpu_encoder_phys *phys_enc,
-   enum dpu_intr_idx intr_idx);
-
 #endif /* __dpu_encoder_phys_H__ */
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
index dbc8f0811dd1..d5d4ee7f0a10 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
@@ -211,7 +211,9 @@ static int _dpu_encoder_phys_cmd_handle_ppdone_timeout(
  cmd_enc->pp_timeout_report_cnt,
  atomic_read(&phys_enc->pending_kickoff_cnt));
msm_disp_snapshot_state(drm_enc->dev);
-   dpu_encoder_helper_unregister_irq(phys_enc, INTR_IDX_RDPTR);
+   dpu_core_irq_unregister_callback(phys_enc->dpu_kms,
+   

[PATCH 7/7] drm/msm/dpu: remove struct dpu_encoder_irq and enum dpu_intr_idx

2021-06-17 Thread Dmitry Baryshkov
Drop the wrapping structures and the enum used to index them in
dpu_kms. Use IRQ indices and callback functions directly instead.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   | 47 +-
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  | 48 +++---
 .../drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c  | 94 +++
 .../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c  | 53 ---
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h | 12 +--
 5 files changed, 92 insertions(+), 162 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 23a7a22d4f3f..cbc07591c17f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -241,11 +241,11 @@ static void _dpu_encoder_setup_dither(struct 
dpu_hw_pingpong *hw_pp, unsigned bp
 }
 
 void dpu_encoder_helper_report_irq_timeout(struct dpu_encoder_phys *phys_enc,
-   enum dpu_intr_idx intr_idx)
+   int irq_idx)
 {
DRM_ERROR("irq timeout id=%u, intf=%d, pp=%d, intr=%d\n",
  DRMID(phys_enc->parent), phys_enc->intf_idx - INTF_0,
- phys_enc->hw_pp->idx - PINGPONG_0, intr_idx);
+ phys_enc->hw_pp->idx - PINGPONG_0, irq_idx);
 
if (phys_enc->parent_ops->handle_frame_done)
phys_enc->parent_ops->handle_frame_done(
@@ -257,75 +257,70 @@ static int dpu_encoder_helper_wait_event_timeout(int32_t 
drm_id,
u32 irq_idx, struct dpu_encoder_wait_info *info);
 
 int dpu_encoder_helper_wait_for_irq(struct dpu_encoder_phys *phys_enc,
-   enum dpu_intr_idx intr_idx,
+   int irq_idx, void (*irq_cb)(void *, int),
struct dpu_encoder_wait_info *wait_info)
 {
-   struct dpu_encoder_irq *irq;
u32 irq_status;
int ret;
 
-   if (!wait_info || intr_idx >= INTR_IDX_MAX) {
+   if (!wait_info || irq_idx < 0) {
DPU_ERROR("invalid params\n");
return -EINVAL;
}
-   irq = &phys_enc->irq[intr_idx];
 
/* note: do master / slave checking outside */
 
/* return EWOULDBLOCK since we know the wait isn't necessary */
if (phys_enc->enable_state == DPU_ENC_DISABLED) {
-   DRM_ERROR("encoder is disabled id=%u, intr=%d, irq=%d",
- DRMID(phys_enc->parent), intr_idx,
- irq->irq_idx);
+   DRM_ERROR("encoder is disabled id=%u, irq=%d",
+ DRMID(phys_enc->parent), irq_idx);
return -EWOULDBLOCK;
}
 
-   if (irq->irq_idx < 0) {
-   DRM_DEBUG_KMS("skip irq wait id=%u, intr=%d, irq=%s",
- DRMID(phys_enc->parent), intr_idx,
- irq->name);
+   if (irq_idx < 0) {
+   DRM_DEBUG_KMS("skip irq wait id=%u", DRMID(phys_enc->parent));
return 0;
}
 
-   DRM_DEBUG_KMS("id=%u, intr=%d, irq=%d, pp=%d, pending_cnt=%d",
- DRMID(phys_enc->parent), intr_idx,
- irq->irq_idx, phys_enc->hw_pp->idx - PINGPONG_0,
+   DRM_DEBUG_KMS("id=%u, irq=%d, pp=%d, pending_cnt=%d",
+ DRMID(phys_enc->parent),
+ irq_idx, phys_enc->hw_pp->idx - PINGPONG_0,
  atomic_read(wait_info->atomic_cnt));
 
ret = dpu_encoder_helper_wait_event_timeout(
DRMID(phys_enc->parent),
-   irq->irq_idx,
+   irq_idx,
wait_info);
 
if (ret <= 0) {
irq_status = dpu_core_irq_read(phys_enc->dpu_kms,
-   irq->irq_idx, true);
+   irq_idx, true);
if (irq_status) {
unsigned long flags;
 
-   DRM_DEBUG_KMS("irq not triggered id=%u, intr=%d, "
+   DRM_DEBUG_KMS("irq not triggered id=%u, "
  "irq=%d, pp=%d, atomic_cnt=%d",
- DRMID(phys_enc->parent), intr_idx,
- irq->irq_idx,
+ DRMID(phys_enc->parent),
+ irq_idx,
  phys_enc->hw_pp->idx - PINGPONG_0,
  atomic_read(wait_info->atomic_cnt));
local_irq_save(flags);
-   irq->func(phys_enc, irq->irq_idx);
+   irq_cb(phys_enc, irq_idx);
local_irq_restore(flags);
ret = 0;
} else {
ret = -ETIMEDOUT;
-   DRM_DEBUG_KMS("irq timeout id=%u, intr=%d, "
+   DRM_DEBUG_KMS("irq timeout id=%u, "
   

[PATCH 4/7] drm/msm/dpu: hide struct dpu_irq_callback

2021-06-17 Thread Dmitry Baryshkov
The struct dpu_irq_callback is internal to the IRQ handling code. Hide
it from the rest of the DPU driver.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h  | 18 +++---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   |  6 +-
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  2 +-
 .../drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c  | 10 ++-
 .../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c  |  6 +-
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 62 ++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h   | 12 
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h |  8 +--
 8 files changed, 69 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h
index 90ae6c9ccc95..44ab97fb2964 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h
@@ -46,10 +46,8 @@ u32 dpu_core_irq_read(
  * interrupt
  * @dpu_kms:   DPU handle
  * @irq_idx:   irq index
- * @irq_cb:IRQ callback structure, containing callback function
- * and argument. Passing NULL for irq_cb will unregister
- * the callback for the given irq_idx
- * This must exist until un-registration.
+ * @irq_cb:IRQ callback function.
+ * @irq_arg:   IRQ callback argument.
  * @return:0 for success registering callback, otherwise failure
  *
  * This function supports registration of multiple callbacks for each 
interrupt.
@@ -57,17 +55,16 @@ u32 dpu_core_irq_read(
 int dpu_core_irq_register_callback(
struct dpu_kms *dpu_kms,
int irq_idx,
-   struct dpu_irq_callback *irq_cb);
+   void (*irq_cb)(void *arg, int irq_idx),
+   void *irq_arg);
 
 /**
  * dpu_core_irq_unregister_callback - For unregistering callback function on 
IRQ
  * interrupt
  * @dpu_kms:   DPU handle
  * @irq_idx:   irq index
- * @irq_cb:IRQ callback structure, containing callback function
- * and argument. Passing NULL for irq_cb will unregister
- * the callback for the given irq_idx
- * This must match with registration.
+ * @irq_cb:IRQ callback function.
+ * @irq_arg:   IRQ callback argument.
  * @return:0 for success registering callback, otherwise failure
  *
  * This function supports registration of multiple callbacks for each 
interrupt.
@@ -75,7 +72,8 @@ int dpu_core_irq_register_callback(
 int dpu_core_irq_unregister_callback(
struct dpu_kms *dpu_kms,
int irq_idx,
-   struct dpu_irq_callback *irq_cb);
+   void (*irq_cb)(void *arg, int irq_idx),
+   void *irq_arg);
 
 /**
  * dpu_debugfs_core_irq_init - register core irq debugfs
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 7f06238a7c64..186b2f87d193 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -310,7 +310,7 @@ int dpu_encoder_helper_wait_for_irq(struct dpu_encoder_phys 
*phys_enc,
  phys_enc->hw_pp->idx - PINGPONG_0,
  atomic_read(wait_info->atomic_cnt));
local_irq_save(flags);
-   irq->cb.func(phys_enc, irq->irq_idx);
+   irq->func(phys_enc, irq->irq_idx);
local_irq_restore(flags);
ret = 0;
} else {
@@ -352,7 +352,7 @@ int dpu_encoder_helper_register_irq(struct dpu_encoder_phys 
*phys_enc,
}
 
ret = dpu_core_irq_register_callback(phys_enc->dpu_kms, irq->irq_idx,
-   &irq->cb);
+   irq->func, phys_enc);
if (ret) {
DPU_ERROR_PHYS(phys_enc,
"failed to register IRQ callback for %s\n",
@@ -384,7 +384,7 @@ int dpu_encoder_helper_unregister_irq(struct 
dpu_encoder_phys *phys_enc,
}
 
ret = dpu_core_irq_unregister_callback(phys_enc->dpu_kms, irq->irq_idx,
-   &irq->cb);
+   irq->func, phys_enc);
if (ret) {
DRM_ERROR("unreg cb fail id=%u, intr=%d, irq=%d ret=%d",
  DRMID(phys_enc->parent), intr_idx,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
index e7270eb6b84b..80d87871fd94 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -174,7 +174,7 @@ struct dpu_encoder_irq {
const char *name;
enum dpu_intr_idx intr_idx;
int irq_idx;
-   struct dpu_irq_callback cb;
+   void (*

[PATCH 5/7] drm/msm/dpu: remove extra wrappers around dpu_core_irq

2021-06-17 Thread Dmitry Baryshkov
Remove extra dpu_irq_* wrappers from dpu_kms.c, merge them directly into
dpu_core_irq_* functions.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h  | 12 -
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c |  9 ---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   | 27 +++
 3 files changed, 15 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h
index 44ab97fb2964..afc8cd546368 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.h
@@ -10,24 +10,24 @@
 
 /**
  * dpu_core_irq_preinstall - perform pre-installation of core IRQ handler
- * @dpu_kms:   DPU handle
+ * @kms:   MSM KMS handle
  * @return:none
  */
-void dpu_core_irq_preinstall(struct dpu_kms *dpu_kms);
+void dpu_core_irq_preinstall(struct msm_kms *kms);
 
 /**
  * dpu_core_irq_uninstall - uninstall core IRQ handler
- * @dpu_kms:   DPU handle
+ * @kms:   MSM KMS handle
  * @return:none
  */
-void dpu_core_irq_uninstall(struct dpu_kms *dpu_kms);
+void dpu_core_irq_uninstall(struct msm_kms *kms);
 
 /**
  * dpu_core_irq - core IRQ handler
- * @dpu_kms:   DPU handle
+ * @kms:   MSM KMS handle
  * @return:interrupt handling status
  */
-irqreturn_t dpu_core_irq(struct dpu_kms *dpu_kms);
+irqreturn_t dpu_core_irq(struct msm_kms *kms);
 
 /**
  * dpu_core_irq_read - IRQ helper function for reading IRQ status
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
index 73a20fc5c766..124b38e2102c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
@@ -156,8 +156,9 @@ static void dpu_core_irq_callback_handler(struct dpu_kms 
*dpu_kms, int irq_idx)
cb->func(cb->arg, irq_idx);
 }
 
-irqreturn_t dpu_core_irq(struct dpu_kms *dpu_kms)
+irqreturn_t dpu_core_irq(struct msm_kms *kms)
 {
+   struct dpu_kms *dpu_kms = to_dpu_kms(kms);
struct dpu_hw_intr *intr = dpu_kms->hw_intr;
int reg_idx;
int irq_idx;
@@ -583,8 +584,9 @@ void dpu_debugfs_core_irq_init(struct dpu_kms *dpu_kms,
 }
 #endif
 
-void dpu_core_irq_preinstall(struct dpu_kms *dpu_kms)
+void dpu_core_irq_preinstall(struct msm_kms *kms)
 {
+   struct dpu_kms *dpu_kms = to_dpu_kms(kms);
int i;
 
pm_runtime_get_sync(&dpu_kms->pdev->dev);
@@ -603,8 +605,9 @@ void dpu_core_irq_preinstall(struct dpu_kms *dpu_kms)
}
 }
 
-void dpu_core_irq_uninstall(struct dpu_kms *dpu_kms)
+void dpu_core_irq_uninstall(struct msm_kms *kms)
 {
+   struct dpu_kms *dpu_kms = to_dpu_kms(kms);
int i;
 
pm_runtime_get_sync(&dpu_kms->pdev->dev);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index e500a9294528..0e4352a4c28c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -761,20 +761,6 @@ static void _dpu_kms_set_encoder_mode(struct msm_kms *kms,
encoder->base.id, rc);
 }
 
-static irqreturn_t dpu_irq(struct msm_kms *kms)
-{
-   struct dpu_kms *dpu_kms = to_dpu_kms(kms);
-
-   return dpu_core_irq(dpu_kms);
-}
-
-static void dpu_irq_preinstall(struct msm_kms *kms)
-{
-   struct dpu_kms *dpu_kms = to_dpu_kms(kms);
-
-   dpu_core_irq_preinstall(dpu_kms);
-}
-
 static int dpu_irq_postinstall(struct msm_kms *kms)
 {
struct msm_drm_private *priv;
@@ -792,13 +778,6 @@ static int dpu_irq_postinstall(struct msm_kms *kms)
return 0;
 }
 
-static void dpu_irq_uninstall(struct msm_kms *kms)
-{
-   struct dpu_kms *dpu_kms = to_dpu_kms(kms);
-
-   dpu_core_irq_uninstall(dpu_kms);
-}
-
 static void dpu_kms_mdp_snapshot(struct msm_disp_state *disp_state, struct 
msm_kms *kms)
 {
int i;
@@ -846,10 +825,10 @@ static void dpu_kms_mdp_snapshot(struct msm_disp_state 
*disp_state, struct msm_k
 
 static const struct msm_kms_funcs kms_funcs = {
.hw_init = dpu_kms_hw_init,
-   .irq_preinstall  = dpu_irq_preinstall,
+   .irq_preinstall  = dpu_core_irq_preinstall,
.irq_postinstall = dpu_irq_postinstall,
-   .irq_uninstall   = dpu_irq_uninstall,
-   .irq = dpu_irq,
+   .irq_uninstall   = dpu_core_irq_uninstall,
+   .irq = dpu_core_irq,
.enable_commit   = dpu_kms_enable_commit,
.disable_commit  = dpu_kms_disable_commit,
.vsync_time  = dpu_kms_vsync_time,
-- 
2.30.2



Re: [PATCH] drm/bridge: ti-sn65dsi83: Fix null pointer dereference in remove callback

2021-06-17 Thread Laurent Pinchart
Hi Jonathan,

Thank you for the patch.

On Thu, Jun 17, 2021 at 09:19:25PM +1000, Jonathan Liu wrote:
> If attach has not been called, unloading the driver can result in a null
> pointer dereference in mipi_dsi_detach as ctx->dsi has not been assigned
> yet.

Shouldn't this be done in a bridge .detach() operation instead?

> Fixes: ceb515ba29ba6b ("drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and 
> SN65DSI84 driver")
> Signed-off-by: Jonathan Liu 
> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi83.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> index 750f2172ef08..8e9f45c5c7c1 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> @@ -671,8 +671,11 @@ static int sn65dsi83_remove(struct i2c_client *client)
>  {
>   struct sn65dsi83 *ctx = i2c_get_clientdata(client);
>  
> - mipi_dsi_detach(ctx->dsi);
> - mipi_dsi_device_unregister(ctx->dsi);
> + if (ctx->dsi) {
> + mipi_dsi_detach(ctx->dsi);
> + mipi_dsi_device_unregister(ctx->dsi);
> + }
> +
>   drm_bridge_remove(&ctx->bridge);
>   of_node_put(ctx->host_node);
>  

-- 
Regards,

Laurent Pinchart


Re: [PATCH] drm/meson: fix potential NULL pointer exception in meson_drv_unbind()

2021-06-17 Thread Neil Armstrong
Hi,

On 17/06/2021 09:07, Jiajun Cao wrote:
> Fix a potential NULL pointer exception when meson_drv_unbind()
> attempts to operate on the driver_data priv which may be NULL.
> Add a null pointer check on the priv struct to avoid the NULL
> pointer dereference after calling dev_get_drvdata(), just like
> the null pointer checks done on the struct priv in the functions
> meson_drv_shutdown(), meson_drv_pm_suspend() and meson_drv_pm_resume().
> 
> Signed-off-by: Jiajun Cao 
> Signed-off-by: Xin Tan 
> ---
>  drivers/gpu/drm/meson/meson_drv.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/meson/meson_drv.c 
> b/drivers/gpu/drm/meson/meson_drv.c
> index 07fcd12dca16..adea6a2b28f5 100644
> --- a/drivers/gpu/drm/meson/meson_drv.c
> +++ b/drivers/gpu/drm/meson/meson_drv.c
> @@ -380,6 +380,8 @@ static int meson_drv_bind(struct device *dev)
>  static void meson_drv_unbind(struct device *dev)
>  {
>   struct meson_drm *priv = dev_get_drvdata(dev);
> + if (!priv)
> + return;
>   struct drm_device *drm = priv->drm;

Please move struct drm_device before the if like :

struct meson_drm *priv = dev_get_drvdata(dev);
struct drm_device *drm;

if (!priv)
return;
drm = priv->drm;


>  
>   if (priv->canvas) {
> 

Thanks,
Neil


Re: [v1 1/3] dt-bindings: msm/dsi: Add yaml schema for 7nm DSI PHY

2021-06-17 Thread Jonathan Marek

On 6/16/21 1:50 AM, rajee...@codeaurora.org wrote:

On 03-06-2021 01:32, rajee...@codeaurora.org wrote:

On 02-06-2021 02:28, Rob Herring wrote:

On Mon, May 31, 2021 at 07:03:53PM +0530, Rajeev Nandan wrote:



+
+properties:
+  compatible:
+    oneOf:
+  - const: qcom,dsi-phy-7nm


When would one use this?

This is for SM8250.




+  - const: qcom,dsi-phy-7nm-7280
+  - const: qcom,dsi-phy-7nm-8150


These don't look like full SoC names (sm8150?) and it's
<vendor>,<soc>-<device>.


Thanks, Rob, for the review.

I just took the `compatible` property currently used in the DSI PHY 
driver
(drivers/gpu/drm/msm/dsi/phy/dsi_phy.c), and added a new entry for 
sc7280.

A similar pattern of `compatible` names is used in other variants of the
DSI PHY driver, e.g. qcom,dsi-phy-10nm-8998, qcom,dsi-phy-14nm-660,
etc.


The existing compatible names "qcom,dsi-phy-7nm-8150" (SoC at the end) make
some sense, if we look at the organization of the DSI PHY driver code.
I am new to this and don't know the reason behind the current code
organization and this naming.

Yes, I agree with you, we should use full SoC names. Adding the SoC name
at the end does not feel very convincing, so I will change this to the
suggested format, e.g. "qcom,sm8250-dsi-phy-7nm", and will rename the
occurrences in the driver and device tree accordingly.
Do I need to make changes for 10nm, 14nm, 20nm, and 28nm DSI PHY too?
Bindings doc for these PHYs recently got merged to msm-next [1]


[1]
https://gitlab.freedesktop.org/drm/msm/-/commit/8fc939e72ff80116c090aaf03952253a124d2a8e 





Hi Rob,

I missed adding "robh...@kernel.org" earlier in this thread.

Please check my response to your review comments. Regarding your 
suggestion to use the <vendor>,<soc>-<device> format for the compatible property, 
should I also upload a new patch to make changes in 10nm, 14nm, 20nm, 
and 28nm DSI PHY DT bindings?


Thanks,
Rajeev



Hi,

I missed this and ended up sending a similar patch a week later (as part 
of my cphy series, because I needed it to add a "phy-type" property).


"qcom,dsi-phy-7nm" and "qcom,dsi-phy-7nm-8150" aren't new compatibles, 
they were previously documented in the .txt bindings, which are getting 
removed, but the new .yaml bindings didn't include them. Documenting 
them is just a fixup to that patch [1] which is already R-B'd by RobH 
(and has similar compatibles such as "qcom,dsi-phy-10nm" and
"qcom,dsi-phy-10nm-8998").

You can use a different/better naming scheme for sc7280, but changing 
the others has nothing to do with adding support for sc7280.


[1] 
https://gitlab.freedesktop.org/drm/msm/-/commit/8fc939e72ff80116c090aaf03952253a124d2a8e 







[PATCH v4 0/3] drm/msm/dsi: support CPHY mode for 7nm pll/phy

2021-06-17 Thread Jonathan Marek
Add the required changes to support 7nm pll/phy in CPHY mode.

This adds a "qcom,dsi-phy-cphy-mode" property for the PHY node to enable
the CPHY mode.

v2:
 - rebased on DSI PHY reworks
 - reworked getting cphy_mode in dsi_host.c
 - documentation change in separate patch

v3:
 - yaml bindings
 - changed binding to "phy-type = <PHY_TYPE_DSI_CPHY>;"

v4:
 - PHY_TYPE_{DPHY,CPHY} instead of PHY_TYPE_DSI_{DPHY,CPHY}
 - use enum/default for phy-type property
 - remove a stray semicolon in dts example

Jonathan Marek (3):
  dt-bindings: msm: dsi: add missing 7nm bindings
  dt-bindings: msm: dsi: document phy-type property for 7nm dsi phy
  drm/msm/dsi: support CPHY mode for 7nm pll/phy

 .../bindings/display/msm/dsi-phy-7nm.yaml |  71 +
 drivers/gpu/drm/msm/dsi/dsi.xml.h |   2 +
 drivers/gpu/drm/msm/dsi/dsi_host.c|  34 +++-
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.c |  49 ++
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.h |   3 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c | 145 --
 include/dt-bindings/phy/phy.h |   2 +
 7 files changed, 259 insertions(+), 47 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml

-- 
2.26.1



[PATCH v4 1/3] dt-bindings: msm: dsi: add missing 7nm bindings

2021-06-17 Thread Jonathan Marek
These got lost when going from .txt to .yaml bindings, add them back.

Signed-off-by: Jonathan Marek 
---
 .../bindings/display/msm/dsi-phy-7nm.yaml | 66 +++
 1 file changed, 66 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml 
b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
new file mode 100644
index ..c0077ca7e9e7
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -0,0 +1,66 @@
+# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/msm/dsi-phy-7nm.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Display DSI 7nm PHY
+
+maintainers:
+  - Jonathan Marek 
+
+allOf:
+  - $ref: dsi-phy-common.yaml#
+
+properties:
+  compatible:
+oneOf:
+  - const: qcom,dsi-phy-7nm
+  - const: qcom,dsi-phy-7nm-8150
+
+  reg:
+items:
+  - description: dsi phy register set
+  - description: dsi phy lane register set
+  - description: dsi pll register set
+
+  reg-names:
+items:
+  - const: dsi_phy
+  - const: dsi_phy_lane
+  - const: dsi_pll
+
+  vdds-supply:
+description: |
+  Connected to VDD_A_DSI_PLL_0P9 pin (or VDDA_DSI{0,1}_PLL_0P9 for sm8150)
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - vdds-supply
+
+unevaluatedProperties: false
+
+examples:
+  - |
 #include <dt-bindings/clock/qcom,dispcc-sm8250.h>
 #include <dt-bindings/clock/qcom,rpmh.h>
+
+ dsi-phy@ae94400 {
+ compatible = "qcom,dsi-phy-7nm";
+ reg = <0x0ae94400 0x200>,
+   <0x0ae94600 0x280>,
+   <0x0ae94900 0x260>;
+ reg-names = "dsi_phy",
+ "dsi_phy_lane",
+ "dsi_pll";
+
+ #clock-cells = <1>;
+ #phy-cells = <0>;
+
+ vdds-supply = <&vreg_l5a_0p88>;
+ clocks = <&dispcc DISP_CC_MDSS_AHB_CLK>,
+  <&rpmhcc RPMH_CXO_CLK>;
+ clock-names = "iface", "ref";
+ };
-- 
2.26.1



[PATCH v4 2/3] dt-bindings: msm: dsi: document phy-type property for 7nm dsi phy

2021-06-17 Thread Jonathan Marek
Document a new phy-type property which will be used to determine whether
the phy should operate in D-PHY or C-PHY mode.

Signed-off-by: Jonathan Marek 
Reviewed-by: Laurent Pinchart 
---
 .../devicetree/bindings/display/msm/dsi-phy-7nm.yaml | 5 +
 include/dt-bindings/phy/phy.h| 2 ++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml 
b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
index c0077ca7e9e7..70809d1cac54 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -34,6 +34,11 @@ properties:
 description: |
   Connected to VDD_A_DSI_PLL_0P9 pin (or VDDA_DSI{0,1}_PLL_0P9 for sm8150)
 
+  phy-type:
+description: D-PHY (default) or C-PHY mode
+enum: [ 10, 11 ]
+default: 10
+
 required:
   - compatible
   - reg
diff --git a/include/dt-bindings/phy/phy.h b/include/dt-bindings/phy/phy.h
index 887a31b250a8..f48c9acf251e 100644
--- a/include/dt-bindings/phy/phy.h
+++ b/include/dt-bindings/phy/phy.h
@@ -20,5 +20,7 @@
 #define PHY_TYPE_XPCS  7
 #define PHY_TYPE_SGMII 8
 #define PHY_TYPE_QSGMII9
+#define PHY_TYPE_DPHY  10
+#define PHY_TYPE_CPHY  11
 
 #endif /* _DT_BINDINGS_PHY */
-- 
2.26.1
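For reference, a device-tree node using the new property could look like the
following sketch. It adapts the example node from patch 1/3 of this series;
PHY_TYPE_CPHY is the constant this patch adds to include/dt-bindings/phy/phy.h,
and the regulator phandle is illustrative rather than taken from a real board:

```dts
dsi-phy@ae94400 {
    compatible = "qcom,dsi-phy-7nm";
    reg = <0x0ae94400 0x200>,
          <0x0ae94600 0x280>,
          <0x0ae94900 0x260>;
    reg-names = "dsi_phy", "dsi_phy_lane", "dsi_pll";

    #clock-cells = <1>;
    #phy-cells = <0>;

    vdds-supply = <&vreg_l5a_0p88>;
    /* Select C-PHY operation; omit the property for the D-PHY default. */
    phy-type = <PHY_TYPE_CPHY>;
};
```

Since the property has a default, existing D-PHY device trees need no change.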



[PATCH v4 3/3] drm/msm/dsi: support CPHY mode for 7nm pll/phy

2021-06-17 Thread Jonathan Marek
Add the required changes to support 7nm pll/phy in CPHY mode.

This adds a "qcom,dsi-phy-cphy-mode" property for the PHY node to enable
the CPHY mode.

Signed-off-by: Jonathan Marek 
Reviewed-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/dsi/dsi.xml.h |   2 +
 drivers/gpu/drm/msm/dsi/dsi_host.c|  34 -
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.c |  49 
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.h |   3 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c | 145 +++---
 5 files changed, 186 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi.xml.h 
b/drivers/gpu/drm/msm/dsi/dsi.xml.h
index b8e9e608abfc..a59a9bd3f5d1 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.xml.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.xml.h
@@ -621,6 +621,8 @@ static inline uint32_t DSI_VERSION_MAJOR(uint32_t val)
return ((val) << DSI_VERSION_MAJOR__SHIFT) & DSI_VERSION_MAJOR__MASK;
 }
 
+#define REG_DSI_CPHY_MODE_CTRL 0x02d4
+
 #define REG_DSI_PHY_PLL_CTRL_0 0x0200
 #define DSI_PHY_PLL_CTRL_0_ENABLE  0x0001
 
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index 809997f870f6..262d6d3b9c4b 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -27,6 +27,7 @@
 #include "dsi_cfg.h"
 #include "msm_kms.h"
 #include "msm_gem.h"
+#include "phy/dsi_phy.h"
 
 #define DSI_RESET_TOGGLE_DELAY_MS 20
 
@@ -170,6 +171,9 @@ struct msm_dsi_host {
int dlane_swap;
int num_data_lanes;
 
+   /* from phy DT */
+   bool cphy_mode;
+
u32 dma_cmd_ctrl_restore;
 
bool registered;
@@ -513,6 +517,7 @@ int msm_dsi_runtime_resume(struct device *dev)
 
 int dsi_link_clk_set_rate_6g(struct msm_dsi_host *msm_host)
 {
+   u32 byte_intf_rate;
int ret;
 
DBG("Set clk rates: pclk=%d, byteclk=%d",
@@ -532,8 +537,13 @@ int dsi_link_clk_set_rate_6g(struct msm_dsi_host *msm_host)
}
 
if (msm_host->byte_intf_clk) {
-   ret = clk_set_rate(msm_host->byte_intf_clk,
-  msm_host->byte_clk_rate / 2);
+   /* For CPHY, byte_intf_clk is same as byte_clk */
+   if (msm_host->cphy_mode)
+   byte_intf_rate = msm_host->byte_clk_rate;
+   else
+   byte_intf_rate = msm_host->byte_clk_rate / 2;
+
+   ret = clk_set_rate(msm_host->byte_intf_clk, byte_intf_rate);
if (ret) {
pr_err("%s: Failed to set rate byte intf clk, %d\n",
   __func__, ret);
@@ -721,7 +731,11 @@ static void dsi_calc_pclk(struct msm_dsi_host *msm_host, 
bool is_dual_dsi)
lanes = 1;
}
 
-   do_div(pclk_bpp, (8 * lanes));
+   /* CPHY "byte_clk" is in units of 16 bits */
+   if (msm_host->cphy_mode)
+   do_div(pclk_bpp, (16 * lanes));
+   else
+   do_div(pclk_bpp, (8 * lanes));
 
msm_host->pixel_clk_rate = pclk_rate;
msm_host->byte_clk_rate = pclk_bpp;
@@ -947,6 +961,9 @@ static void dsi_ctrl_config(struct msm_dsi_host *msm_host, 
bool enable,
data |= DSI_CTRL_ENABLE;
 
dsi_write(msm_host, REG_DSI_CTRL, data);
+
+   if (msm_host->cphy_mode)
+   dsi_write(msm_host, REG_DSI_CPHY_MODE_CTRL, BIT(0));
 }
 
 static void dsi_set_video_dsc(struct msm_dsi_host *msm_host,
@@ -2278,6 +2295,8 @@ int msm_dsi_host_set_src_pll(struct mipi_dsi_host *host,
struct clk *byte_clk_provider, *pixel_clk_provider;
int ret;
 
+   msm_host->cphy_mode = src_phy->cphy_mode;
+
ret = msm_dsi_phy_get_clk_provider(src_phy,
&byte_clk_provider, &pixel_clk_provider);
if (ret) {
@@ -2349,7 +2368,14 @@ void msm_dsi_host_get_phy_clk_req(struct mipi_dsi_host 
*host,
return;
}
 
-   clk_req->bitclk_rate = msm_host->byte_clk_rate * 8;
+   /* CPHY transmits 16 bits over 7 clock cycles
+* "byte_clk" is in units of 16-bits (see dsi_calc_pclk),
+* so multiply by 7 to get the "bitclk rate"
+*/
+   if (msm_host->cphy_mode)
+   clk_req->bitclk_rate = msm_host->byte_clk_rate * 7;
+   else
+   clk_req->bitclk_rate = msm_host->byte_clk_rate * 8;
clk_req->escclk_rate = msm_host->esc_clk_rate;
 }
 
diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c 
b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
index 6ca6bfd4809b..3e64f1840672 100644
--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
+++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
@@ -5,6 +5,7 @@
 
 #include 
 #include 
+#include 
 
 #include "dsi_phy.h"
 
@@ -461,6 +462,51 @@ int msm_dsi_dphy_timing_calc_v4(struct msm_dsi_dphy_timing 
*timing,
return 0;
 }
 
+int msm_dsi_cphy_timing_calc_v4(struct msm_dsi_dphy_timing *timing,
+   struct msm_dsi_phy_clk_request *clk_req

[PATCH v3 0/8] Support DEVICE_GENERIC memory in migrate_vma_*

2021-06-17 Thread Alex Sierra
v1:
AMD is building a system architecture for the Frontier supercomputer with a
coherent interconnect between CPUs and GPUs. This hardware architecture allows
the CPUs to coherently access GPU device memory. We have hardware in our labs
and we are working with our partner HPE on the BIOS, firmware and software
for delivery to the DOE.

The system BIOS advertises the GPU device memory (aka VRAM) as SPM
(special purpose memory) in the UEFI system address map. The amdgpu driver looks
it up with lookup_resource and registers it with devmap as MEMORY_DEVICE_GENERIC
using devm_memremap_pages.

Now we're trying to migrate data to and from that memory using the migrate_vma_*
helpers so we can support page-based migration in our unified memory 
allocations,
while also supporting CPU access to those pages.

This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages behave
correctly in the migrate_vma_* helpers. We are looking for feedback about this
approach. If we're close, what's needed to make our patches acceptable upstream?
If we're not close, any suggestions how else to achieve what we are trying to do
(i.e. page migration and coherent CPU access to VRAM)?

This work is based on HMM and our SVM memory manager that was recently 
upstreamed
to Dave Airlie's drm-next branch
https://lore.kernel.org/dri-devel/20210527205606.2660-6-felix.kuehl...@amd.com/T/#r996356015e295780eb50453e7dbd5d0d68b47cbc
On top of that we did some rework of our VRAM management for migrations to 
remove
some incorrect assumptions, allow partially successful migrations and GPU memory
mappings that mix pages in VRAM and system memory.
https://patchwork.kernel.org/project/dri-devel/list/?series=489811

v2:
This patch series version has merged "[RFC PATCH v3 0/2]
mm: remove extra ZONE_DEVICE struct page refcount" patch series made by
Ralph Campbell. It also applies at the top of these series, our changes
to support device generic type in migration_vma helpers.
This has been tested in systems with device memory that has coherent
access by CPU.

Also addresses the following feedback made in v1:
- Isolate in one patch kernel/resource.c modification, based
on Christoph's feedback.
- Add helpers check for generic and private type to avoid
duplicated long lines.

v3:
- Include cover letter from v1
- Rename dax_layout_is_idle_page func to dax_page_unused in patch
ext4/xfs: add page refcount helper

Patches 1-2 Rebased Ralph Campbell's ZONE_DEVICE page refcounting patches
Patches 4-5 are for context to show how we are looking up the SPM 
memory and registering it with devmap.
Patches 3,6-8 are the changes we are trying to upstream or rework to 
make them acceptable upstream.

Alex Sierra (6):
  kernel: resource: lookup_resource as exported symbol
  drm/amdkfd: add SPM support for SVM
  drm/amdkfd: generic type as sys mem on migration to ram
  include/linux/mm.h: helpers to check zone device generic type
  mm: add generic type support to migrate_vma helpers
  mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

Ralph Campbell (2):
  ext4/xfs: add page refcount helper
  mm: remove extra ZONE_DEVICE struct page refcount

 arch/powerpc/kvm/book3s_hv_uvmem.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 15 --
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |  2 +-
 fs/dax.c |  8 +--
 fs/ext4/inode.c  |  5 +-
 fs/xfs/xfs_file.c|  4 +-
 include/linux/dax.h  | 10 
 include/linux/memremap.h |  7 +--
 include/linux/mm.h   | 52 +++---
 kernel/resource.c|  2 +-
 lib/test_hmm.c   |  2 +-
 mm/internal.h|  8 +++
 mm/memremap.c| 69 +++-
 mm/migrate.c | 13 ++---
 mm/page_alloc.c  |  3 ++
 mm/swap.c| 45 ++--
 16 files changed, 83 insertions(+), 164 deletions(-)

-- 
2.17.1



[PATCH v3 1/8] ext4/xfs: add page refcount helper

2021-06-17 Thread Alex Sierra
From: Ralph Campbell 

There are several places where ZONE_DEVICE struct pages assume a reference
count == 1 means the page is idle and free. Instead of open coding this,
add a helper function to hide this detail.

v2:
[AS]: rename dax_layout_is_idle_page func to dax_page_unused

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
---
 fs/dax.c|  4 ++--
 fs/ext4/inode.c |  5 +
 fs/xfs/xfs_file.c   |  4 +---
 include/linux/dax.h | 10 ++
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 26d5dcd2d69e..321f4ddc6643 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -358,7 +358,7 @@ static void dax_disassociate_entry(void *entry, struct 
address_space *mapping,
for_each_mapped_pfn(entry, pfn) {
struct page *page = pfn_to_page(pfn);
 
-   WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
+   WARN_ON_ONCE(trunc && !dax_layout_is_idle_page(page));
WARN_ON_ONCE(page->mapping && page->mapping != mapping);
page->mapping = NULL;
page->index = 0;
@@ -372,7 +372,7 @@ static struct page *dax_busy_page(void *entry)
for_each_mapped_pfn(entry, pfn) {
struct page *page = pfn_to_page(pfn);
 
-   if (page_ref_count(page) > 1)
+   if (!dax_layout_is_idle_page(page))
return page;
}
return NULL;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c173c8405856..9ee00186412f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3972,10 +3972,7 @@ int ext4_break_layouts(struct inode *inode)
if (!page)
return 0;
 
-   error = ___wait_var_event(&page->_refcount,
-   atomic_read(&page->_refcount) == 1,
-   TASK_INTERRUPTIBLE, 0, 0,
-   ext4_wait_dax_page(ei));
+   error = dax_wait_page(ei, page, ext4_wait_dax_page);
} while (error == 0);
 
return error;
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5b0f93f73837..39565fe5f817 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -782,9 +782,7 @@ xfs_break_dax_layouts(
return 0;
 
*retry = true;
-   return ___wait_var_event(&page->_refcount,
-   atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
-   0, 0, xfs_wait_dax_page(inode));
+   return dax_wait_page(inode, page, xfs_wait_dax_page);
 }
 
 int
diff --git a/include/linux/dax.h b/include/linux/dax.h
index b52f084aa643..8b5da1d60dbc 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -243,6 +243,16 @@ static inline bool dax_mapping(struct address_space 
*mapping)
return mapping->host && IS_DAX(mapping->host);
 }
 
+static inline bool dax_page_unused(struct page *page)
+{
+   return page_ref_count(page) == 1;
+}
+
+#define dax_wait_page(_inode, _page, _wait_cb) \
+   ___wait_var_event(&(_page)->_refcount,  \
+   dax_page_unused(_page), \
+   TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode))
+
 #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
 void hmem_register_device(int target_nid, struct resource *r);
 #else
-- 
2.17.1



[PATCH v3 2/8] mm: remove extra ZONE_DEVICE struct page refcount

2021-06-17 Thread Alex Sierra
From: Ralph Campbell 

ZONE_DEVICE struct pages have an extra reference count that complicates the
code for put_page() and several places in the kernel that need to check the
reference count to see that a page is not being used (gup, compaction,
migration, etc.). Clean up the code so the reference count doesn't need to
be treated specially for ZONE_DEVICE.

v2:
AS: merged this patch in linux 5.11 version

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
---
 arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
 fs/dax.c   |  4 +-
 include/linux/dax.h|  2 +-
 include/linux/memremap.h   |  7 +--
 include/linux/mm.h | 44 -
 lib/test_hmm.c |  2 +-
 mm/internal.h  |  8 +++
 mm/memremap.c  | 68 +++---
 mm/migrate.c   |  5 --
 mm/page_alloc.c|  3 ++
 mm/swap.c  | 45 ++---
 12 files changed, 45 insertions(+), 147 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c 
b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 84e5a2dc8be5..acee67710620 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -711,7 +711,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long 
gpa, struct kvm *kvm)
 
dpage = pfn_to_page(uvmem_pfn);
dpage->zone_device_data = pvt;
-   get_page(dpage);
+   init_page_count(dpage);
lock_page(dpage);
return dpage;
 out_clear:
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c 
b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 92987daa5e17..8bc7120e1216 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -324,7 +324,7 @@ nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm)
return NULL;
}
 
-   get_page(page);
+   init_page_count(page);
lock_page(page);
return page;
 }
diff --git a/fs/dax.c b/fs/dax.c
index 321f4ddc6643..7b4c6b35b098 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -560,14 +560,14 @@ static void *grab_mapping_entry(struct xa_state *xas,
 
 /**
  * dax_layout_busy_page_range - find first pinned page in @mapping
- * @mapping: address space to scan for a page with ref count > 1
+ * @mapping: address space to scan for a page with ref count > 0
  * @start: Starting offset. Page containing 'start' is included.
  * @end: End offset. Page containing 'end' is included. If 'end' is LLONG_MAX,
  *   pages from 'start' till the end of file are included.
  *
  * DAX requires ZONE_DEVICE mapped pages. These pages are never
  * 'onlined' to the page allocator so they are considered idle when
- * page->count == 1. A filesystem uses this interface to determine if
+ * page->count == 0. A filesystem uses this interface to determine if
  * any page in the mapping is busy, i.e. for DMA, or other
  * get_user_pages() usages.
  *
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 8b5da1d60dbc..05fc982ce153 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -245,7 +245,7 @@ static inline bool dax_mapping(struct address_space 
*mapping)
 
 static inline bool dax_page_unused(struct page *page)
 {
-   return page_ref_count(page) == 1;
+   return page_ref_count(page) == 0;
 }
 
 #define dax_wait_page(_inode, _page, _wait_cb) \
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 79c49e7f5c30..327f32427d21 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -66,9 +66,10 @@ enum memory_type {
 
 struct dev_pagemap_ops {
/*
-* Called once the page refcount reaches 1.  (ZONE_DEVICE pages never
-* reach 0 refcount unless there is a refcount bug. This allows the
-* device driver to implement its own memory management.)
+* Called once the page refcount reaches 0. The reference count
+* should be reset to one with init_page_count(page) before reusing
+* the page. This allows the device driver to implement its own
+* memory management.
 */
void (*page_free)(struct page *page);
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9900aedc195..d8d79bb94be8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1117,39 +1117,6 @@ static inline bool is_zone_device_page(const struct page 
*page)
 }
 #endif
 
-#ifdef CONFIG_DEV_PAGEMAP_OPS
-void free_devmap_managed_page(struct page *page);
-DECLARE_STATIC_KEY_FALSE(devmap_managed_key);
-
-static inline bool page_is_devmap_managed(struct page *page)
-{
-   if (!static_branch_unlikely(&devmap_managed_key))
-   return false;
-   if (!is_zone_device_page(page))
-   return false;
-   switch (page->pgmap->type) {
-   case MEMORY_DEVICE_PRIVATE:
-   case ME

[PATCH v3 3/8] kernel: resource: lookup_resource as exported symbol

2021-06-17 Thread Alex Sierra
The AMD architecture for the Frontier supercomputer will
have device memory which can be coherently accessed by
the CPU. The system BIOS advertises this memory as SPM
(special purpose memory) in the UEFI system address map.

The AMDGPU driver needs to be able to lookup this resource
in order to claim it as MEMORY_DEVICE_GENERIC using
devm_memremap_pages.

Signed-off-by: Alex Sierra 
---
 kernel/resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 627e61b0c124..269489bb7097 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -783,7 +783,7 @@ struct resource *lookup_resource(struct resource *root, 
resource_size_t start)
 
return res;
 }
-
+EXPORT_SYMBOL_GPL(lookup_resource);
 /*
  * Insert a resource into the resource tree. If successful, return NULL,
  * otherwise return the conflicting resource (compare to __request_resource())
-- 
2.17.1



[PATCH v3 4/8] drm/amdkfd: add SPM support for SVM

2021-06-17 Thread Alex Sierra
When the CPU is connected through XGMI, it has coherent
access to the VRAM resource. In this case that resource
is taken from a table in the device gmc aperture base.
This resource is used along with the device type, which can
be DEVICE_PRIVATE or DEVICE_GENERIC, to create the device
page map region.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index c8ca3252cbc2..f5939449a99f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -895,6 +895,7 @@ int svm_migrate_init(struct amdgpu_device *adev)
struct resource *res;
unsigned long size;
void *r;
+   bool xgmi_connected_to_cpu = adev->gmc.xgmi.connected_to_cpu;
 
/* Page migration works on Vega10 or newer */
if (kfddev->device_info->asic_family < CHIP_VEGA10)
@@ -907,17 +908,22 @@ int svm_migrate_init(struct amdgpu_device *adev)
 * should remove reserved size
 */
size = ALIGN(adev->gmc.real_vram_size, 2ULL << 20);
-   res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
+   if (xgmi_connected_to_cpu)
+   res = lookup_resource(&iomem_resource, adev->gmc.aper_base);
+   else
+   res = devm_request_free_mem_region(adev->dev, &iomem_resource, 
size);
+
if (IS_ERR(res))
return -ENOMEM;
 
-   pgmap->type = MEMORY_DEVICE_PRIVATE;
pgmap->nr_range = 1;
pgmap->range.start = res->start;
pgmap->range.end = res->end;
+   pgmap->type = xgmi_connected_to_cpu ?
+   MEMORY_DEVICE_GENERIC : MEMORY_DEVICE_PRIVATE;
pgmap->ops = &svm_migrate_pgmap_ops;
pgmap->owner = SVM_ADEV_PGMAP_OWNER(adev);
-   pgmap->flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
+   pgmap->flags = 0;
r = devm_memremap_pages(adev->dev, pgmap);
if (IS_ERR(r)) {
pr_err("failed to register HMM device memory\n");
-- 
2.17.1



[PATCH v3 5/8] drm/amdkfd: generic type as sys mem on migration to ram

2021-06-17 Thread Alex Sierra
Generic device type memory, on VRAM to RAM migration, has
similar access semantics to system RAM from the CPU. The
migrate flags select the migration source, which in the
generic type case should be set to SYSTEM.
Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index f5939449a99f..7b41006c1164 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -653,8 +653,9 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate.vma = vma;
migrate.start = start;
migrate.end = end;
-   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
+   migrate.flags = adev->gmc.xgmi.connected_to_cpu ?
+   MIGRATE_VMA_SELECT_SYSTEM : 
MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
 
size = 2 * sizeof(*migrate.src) + sizeof(uint64_t) + sizeof(dma_addr_t);
size *= npages;
-- 
2.17.1



[PATCH v3 6/8] include/linux/mm.h: helpers to check zone device generic type

2021-06-17 Thread Alex Sierra
Add two helpers: one checks whether a zone device page is of
generic type; the other checks whether a page is of either
private or generic type.

Signed-off-by: Alex Sierra 
---
 include/linux/mm.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d8d79bb94be8..f5b247a63044 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1125,6 +1125,14 @@ static inline bool is_device_private_page(const struct 
page *page)
page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }
 
+static inline bool is_device_page(const struct page *page)
+{
+   return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+   is_zone_device_page(page) &&
+   (page->pgmap->type == MEMORY_DEVICE_PRIVATE ||
+page->pgmap->type == MEMORY_DEVICE_GENERIC);
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
-- 
2.17.1



[PATCH v3 7/8] mm: add generic type support to migrate_vma helpers

2021-06-17 Thread Alex Sierra
Add the device generic type case to the migrate_vma_pages and
migrate_vma_check_page helpers. Both generic and private device
types have the same conditions for deciding whether to migrate
pages from/to device memory.

Signed-off-by: Alex Sierra 
---
 mm/migrate.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 8c2430d3e77b..3b6aaba96fe6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2602,7 +2602,7 @@ static bool migrate_vma_check_page(struct page *page)
 * FIXME proper solution is to rework migration_entry_wait() so
 * it does not need to take a reference on page.
 */
-   return is_device_private_page(page);
+   return is_device_page(page);
}
 
/* For file back page */
@@ -3064,10 +3064,10 @@ void migrate_vma_pages(struct migrate_vma *migrate)
mapping = page_mapping(page);
 
if (is_zone_device_page(newpage)) {
-   if (is_device_private_page(newpage)) {
+   if (is_device_page(newpage)) {
/*
-* For now only support private anonymous when
-* migrating to un-addressable device memory.
+* For now only support private and generic
+* anonymous when migrating to device memory.
 */
if (mapping) {
migrate->src[i] &= ~MIGRATE_PFN_MIGRATE;
-- 
2.17.1



[PATCH v3 8/8] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

2021-06-17 Thread Alex Sierra
Add a MEMORY_DEVICE_GENERIC case to the free_zone_device_page
callback, so that device generic type memory can now free its
pages properly.

Signed-off-by: Alex Sierra 
---
 mm/memremap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/memremap.c b/mm/memremap.c
index 614b3d600e95..6c884e2542a9 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -438,7 +438,7 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
 EXPORT_SYMBOL_GPL(get_dev_pagemap);
 
 #ifdef CONFIG_DEV_PAGEMAP_OPS
-static void free_device_private_page(struct page *page)
+static void free_device_page(struct page *page)
 {
 
__ClearPageWaiters(page);
@@ -477,7 +477,8 @@ void free_zone_device_page(struct page *page)
wake_up_var(&page->_refcount);
return;
case MEMORY_DEVICE_PRIVATE:
-   free_device_private_page(page);
+   case MEMORY_DEVICE_GENERIC:
+   free_device_page(page);
return;
default:
return;
-- 
2.17.1



Re: vc4: hdmi: audio: ASoC: error at snd_soc_dai_startup on fef00700.hdmi

2021-06-17 Thread Maxime Ripard
Hi Stefan,

On Sat, Jun 12, 2021 at 12:04:08PM +0200, Stefan Wahren wrote:
> Hi Maxime,
> 
> Am 04.06.21 um 11:02 schrieb Maxime Ripard:
> > Hi Stefan,
> >
> > I would assume it's due to this:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/vc4/vc4_hdmi.c#n1083
> >
> > It pre-dates my time working on the vc4 driver so I'm not really sure
> > what this is supposed to prevent, but my guess is that it's there to
> > avoid someone using the audio card before we have a display detected and
> > connected, and its capabilities known (the first and more obvious one
> > being does it support audio in the first place).
> >
> > It's nothing new though, maybe it's the error printing itself that is?
> 
> i'm sorry, i forgot about this discussion here:
> 
> https://lists.freedesktop.org/archives/dri-devel/2020-December/292701.html

It looks like there's no discussion on that link, is it the link you wanted to 
paste?

Maxime




RE: [Intel-gfx] [PATCH] drm/i915: Perform execbuffer object locking as a separate step

2021-06-17 Thread Tang, CQ


> -Original Message-
> From: Intel-gfx  On Behalf Of
> Thomas Hellström
> Sent: Tuesday, June 15, 2021 4:36 AM
> To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Thomas Hellström ; Auld, Matthew
> 
> Subject: [Intel-gfx] [PATCH] drm/i915: Perform execbuffer object locking as a
> separate step
> 
> To help avoid evicting already resident buffers from the batch we're
> processing, perform locking as a separate step.
> 
> Signed-off-by: Thomas Hellström 
> ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 25 --
> -
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 201fed19d120..394eb40c95b5 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -922,21 +922,38 @@ static int eb_lookup_vmas(struct i915_execbuffer
> *eb)
>   return err;
>  }
> 
> -static int eb_validate_vmas(struct i915_execbuffer *eb)
> +static int eb_lock_vmas(struct i915_execbuffer *eb)
>  {
>   unsigned int i;
>   int err;
> 
> - INIT_LIST_HEAD(&eb->unbound);
> -
>   for (i = 0; i < eb->buffer_count; i++) {
> - struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
>   struct eb_vma *ev = &eb->vma[i];
>   struct i915_vma *vma = ev->vma;
> 
>   err = i915_gem_object_lock(vma->obj, &eb->ww);
>   if (err)
>   return err;
> + }
> +
> + return 0;
> +}
> +
> +static int eb_validate_vmas(struct i915_execbuffer *eb) {
> + unsigned int i;
> + int err;
> +
> + INIT_LIST_HEAD(&eb->unbound);
> +
> + err = eb_lock_vmas(eb);
> + if (err)
> + return err;
> +
> + for (i = 0; i < eb->buffer_count; i++) {
> + struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
> + struct eb_vma *ev = &eb->vma[i];
> + struct i915_vma *vma = ev->vma;
> 
>   err = eb_pin_vma(eb, entry, ev);
>   if (err == -EDEADLK)

Thomas, I just checked eb_pin_vma(); it calls i915_vma_pin_ww(). If the
object is already locked, under what conditions do these calls still
return -EDEADLK?

--CQ

> --
> 2.31.1
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [PATCH v4] Documentation: gpu: Mention the requirements for new properties

2021-06-17 Thread Philippe CORNU




On 6/16/21 4:38 PM, Maxime Ripard wrote:

New KMS properties come with a bunch of requirements to avoid each
driver from running their own, inconsistent, set of properties,
eventually leading to issues like property conflicts, inconsistencies
between drivers and semantics, etc.

Let's document what we expect.

Cc: Alexandre Belloni 
Cc: Alexandre Torgue 
Cc: Alex Deucher 
Cc: Alison Wang 
Cc: Alyssa Rosenzweig 
Cc: Andrew Jeffery 
Cc: Andrzej Hajda 
Cc: Anitha Chrisanthus 
Cc: Benjamin Gaignard 
Cc: Ben Skeggs 
Cc: Boris Brezillon 
Cc: Brian Starkey 
Cc: Chen Feng 
Cc: Chen-Yu Tsai 
Cc: Christian Gmeiner 
Cc: "Christian König" 
Cc: Chun-Kuang Hu 
Cc: Edmund Dea 
Cc: Eric Anholt 
Cc: Fabio Estevam 
Cc: Gerd Hoffmann 
Cc: Haneen Mohammed 
Cc: Hans de Goede 
Cc: "Heiko Stübner" 
Cc: Huang Rui 
Cc: Hyun Kwon 
Cc: Inki Dae 
Cc: Jani Nikula 
Cc: Jernej Skrabec 
Cc: Jerome Brunet 
Cc: Joel Stanley 
Cc: John Stultz 
Cc: Jonas Karlman 
Cc: Jonathan Hunter 
Cc: Joonas Lahtinen 
Cc: Joonyoung Shim 
Cc: Jyri Sarha 
Cc: Kevin Hilman 
Cc: Kieran Bingham 
Cc: Krzysztof Kozlowski 
Cc: Kyungmin Park 
Cc: Laurent Pinchart 
Cc: Linus Walleij 
Cc: Liviu Dudau 
Cc: Lucas Stach 
Cc: Ludovic Desroches 
Cc: Marek Vasut 
Cc: Martin Blumenstingl 
Cc: Matthias Brugger 
Cc: Maxime Coquelin 
Cc: Maxime Ripard 
Cc: Melissa Wen 
Cc: Neil Armstrong 
Cc: Nicolas Ferre 
Cc: "Noralf Trønnes" 
Cc: NXP Linux Team 
Cc: Oleksandr Andrushchenko 
Cc: Patrik Jakobsson 
Cc: Paul Cercueil 
Cc: Pekka Paalanen 
Cc: Pengutronix Kernel Team 
Cc: Philippe Cornu 
Cc: Philipp Zabel 
Cc: Qiang Yu 
Cc: Rob Clark 
Cc: Robert Foss 
Cc: Rob Herring 
Cc: Rodrigo Siqueira 
Cc: Rodrigo Vivi 
Cc: Roland Scheidegger 
Cc: Russell King 
Cc: Sam Ravnborg 
Cc: Sandy Huang 
Cc: Sascha Hauer 
Cc: Sean Paul 
Cc: Seung-Woo Kim 
Cc: Shawn Guo 
Cc: Simon Ser 
Cc: Stefan Agner 
Cc: Steven Price 
Cc: Sumit Semwal 
Cc: Thierry Reding 
Cc: Tian Tao 
Cc: Tomeu Vizoso 
Cc: Tomi Valkeinen 
Cc: VMware Graphics 
Cc: Xinliang Liu 
Cc: Xinwei Kong 
Cc: Yannick Fertre 
Cc: Zack Rusin 
Reviewed-by: Daniel Vetter 
Signed-off-by: Maxime Ripard 

---

Changes from v3:
   - Roll back to the v2
   - Add Simon and Pekka in Cc

Changes from v2:
   - Take into account the feedback from Laurent and Liviu to no longer
 force generic properties, but prefix vendor-specific properties with
 the vendor name

Changes from v1:
   - Typos and wording reported by Daniel and Alex
---
  Documentation/gpu/drm-kms.rst | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
index 87e5023e3f55..c28b464dd397 100644
--- a/Documentation/gpu/drm-kms.rst
+++ b/Documentation/gpu/drm-kms.rst
@@ -463,6 +463,25 @@ KMS Properties
  This section of the documentation is primarily aimed at user-space developers.
  For the driver APIs, see the other sections.
  
+Requirements
+------------
+
+KMS drivers might need to add extra properties to support new features.
+Each new property introduced in a driver needs to meet a few
+requirements, in addition to the one mentioned above:
+
+- It must be standardized, with some documentation to describe how the
+  property can be used.
+
+- It must provide a generic helper in the core code to register that
+  property on the object it attaches to.
+
+- Its content must be decoded by the core and provided in the object's
+  associated state structure. That includes anything drivers might want to
+  precompute, like :c:type:`struct drm_clip_rect ` for planes.
+
+- An IGT test must be submitted where reasonable.
+
  Property Types and Blob Property Support
  ----------------------------------------
  



Hi,

Regarding properties, we have a “case study example” related in a 
certain way to this documentation update :-)


The use case: on a front desk at an exhibition, there is a welcome 
screen you can touch to search for various information. When this 
welcome screen is idle, a small logo is displayed at its center 
(around 20% of the full screen). The logo has a white background color. 
We want to reduce DDR usage to lower power consumption (the board is 
battery powered), so the idea is to have the DRM CRTC produce a white 
background color around this logo, so that the image in DDR is only the 
size of the logo.


Reading the thread 
https://lists.freedesktop.org/archives/dri-devel/2019-October/239733.html 
dissuaded us from coding a generic solution, so we started to implement a 
"STM_" private background color property. It works... but we are not at 
all convinced this is the right way, and we clearly prefer 
mainline/generic software for both kernel & userland.


So now, what are our options... well, this v4 documentation update is I 
think clear enough: we have to document + provide a generic helper in 
the core code (similar to the original patch) + update IGT test, right?
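For illustration, the kind of generic core helper those requirements call for could look roughly like the pseudocode-style sketch below. The helper name, the property name, the 16-bits-per-channel ARGB encoding, and the bgcolor_property field are invented assumptions, not an existing kernel interface; drm_property_create_range() and drm_object_attach_property() are the existing core APIs such a helper would build on:

```c
/* Hypothetical generic helper for a standardized CRTC background
 * color property (names and encoding are assumptions for illustration).
 */
int drm_crtc_create_background_color_property(struct drm_crtc *crtc)
{
	struct drm_property *prop;

	/* 64-bit value, e.g. ARGB with 16 bits per channel */
	prop = drm_property_create_range(crtc->dev, 0, "BACKGROUND_COLOR",
					 0, 0xffffffffffffffffULL);
	if (!prop)
		return -ENOMEM;

	drm_object_attach_property(&crtc->base, prop, 0);
	crtc->bgcolor_property = prop;	/* decoded into the CRTC state by the core */

	return 0;
}
```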


Thanks
Philippe :-)

Note: It is really a pleasure to read such an interesting thread, exposing 
the “complexity” of our job, dea

Re: [PATCH v3 0/8] Support DEVICE_GENERIC memory in migrate_vma_*

2021-06-17 Thread Sierra Guiza, Alejandro (Alex)



On 6/17/2021 10:16 AM, Alex Sierra wrote:

v1:
AMD is building a system architecture for the Frontier supercomputer with a
coherent interconnect between CPUs and GPUs. This hardware architecture allows
the CPUs to coherently access GPU device memory. We have hardware in our labs
and we are working with our partner HPE on the BIOS, firmware and software
for delivery to the DOE.

The system BIOS advertises the GPU device memory (aka VRAM) as SPM
(special purpose memory) in the UEFI system address map. The amdgpu driver looks
it up with lookup_resource and registers it with devmap as MEMORY_DEVICE_GENERIC
using devm_memremap_pages.

Now we're trying to migrate data to and from that memory using the migrate_vma_*
helpers so we can support page-based migration in our unified memory 
allocations,
while also supporting CPU access to those pages.

This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages behave
correctly in the migrate_vma_* helpers. We are looking for feedback about this
approach. If we're close, what's needed to make our patches acceptable upstream?
If we're not close, any suggestions how else to achieve what we are trying to do
(i.e. page migration and coherent CPU access to VRAM)?
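For readers unfamiliar with the helpers the series modifies, the overall migrate_vma_* flow looks roughly like this pseudocode-style sketch (NPAGES and alloc_and_copy_to_device() are hypothetical stand-ins for driver-specific logic; the helper names and struct fields are the mm/migrate.c interface):

```c
/* Pseudocode-style sketch of a system-memory-to-device migration. */
#define NPAGES 64

static int migrate_range_to_device(struct vm_area_struct *vma,
				   unsigned long start, void *pgmap_owner)
{
	unsigned long src[NPAGES], dst[NPAGES];
	struct migrate_vma args = {
		.vma		= vma,
		.start		= start,
		.end		= start + NPAGES * PAGE_SIZE,
		.src		= src,
		.dst		= dst,
		.pgmap_owner	= pgmap_owner,
		.flags		= MIGRATE_VMA_SELECT_SYSTEM,
	};
	int ret;

	ret = migrate_vma_setup(&args);		/* isolate and collect source pages */
	if (ret)
		return ret;

	/* driver-specific: allocate device pages into dst[] and copy data */
	alloc_and_copy_to_device(&args);

	migrate_vma_pages(&args);	/* install the new pages in the page tables */
	migrate_vma_finalize(&args);	/* drop refs; restores originals on failure */
	return 0;
}
```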

This work is based on HMM and our SVM memory manager that was recently 
upstreamed
to Dave Airlie's drm-next branch
https://lore.kernel.org/dri-devel/20210527205606.2660-6-felix.kuehl...@amd.com/T/#r996356015e295780eb50453e7dbd5d0d68b47cbc

Corrected link:

https://cgit.freedesktop.org/drm/drm/log/?h=drm-next

Regards,
Alex Sierra


On top of that we did some rework of our VRAM management for migrations to 
remove
some incorrect assumptions, allow partially successful migrations and GPU memory
mappings that mix pages in VRAM and system memory.
https://patchwork.kernel.org/project/dri-devel/list/?series=489811


Corrected link:

https://lore.kernel.org/dri-devel/20210527205606.2660-6-felix.kuehl...@amd.com/T/#r996356015e295780eb50453e7dbd5d0d68b47cbc

Regards,
Alex Sierra



v2:
This patch series version has merged "[RFC PATCH v3 0/2]
mm: remove extra ZONE_DEVICE struct page refcount" patch series made by
Ralph Campbell. It also applies at the top of these series, our changes
to support device generic type in migration_vma helpers.
This has been tested in systems with device memory that has coherent
access by CPU.

Also addresses the following feedback made in v1:
- Isolate in one patch kernel/resource.c modification, based
on Christoph's feedback.
- Add helpers check for generic and private type to avoid
duplicated long lines.

v3:
- Include cover letter from v1
- Rename dax_layout_is_idle_page func to dax_page_unused in patch
ext4/xfs: add page refcount helper

Patches 1-2 Rebased Ralph Campbell's ZONE_DEVICE page refcounting patches
Patches 4-5 are for context to show how we are looking up the SPM
memory and registering it with devmap.
Patches 3,6-8 are the changes we are trying to upstream or rework to
make them acceptable upstream.

Alex Sierra (6):
   kernel: resource: lookup_resource as exported symbol
   drm/amdkfd: add SPM support for SVM
   drm/amdkfd: generic type as sys mem on migration to ram
   include/linux/mm.h: helpers to check zone device generic type
   mm: add generic type support to migrate_vma helpers
   mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

Ralph Campbell (2):
   ext4/xfs: add page refcount helper
   mm: remove extra ZONE_DEVICE struct page refcount

  arch/powerpc/kvm/book3s_hv_uvmem.c   |  2 +-
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 15 --
  drivers/gpu/drm/nouveau/nouveau_dmem.c   |  2 +-
  fs/dax.c |  8 +--
  fs/ext4/inode.c  |  5 +-
  fs/xfs/xfs_file.c|  4 +-
  include/linux/dax.h  | 10 
  include/linux/memremap.h |  7 +--
  include/linux/mm.h   | 52 +++---
  kernel/resource.c|  2 +-
  lib/test_hmm.c   |  2 +-
  mm/internal.h|  8 +++
  mm/memremap.c| 69 +++-
  mm/migrate.c | 13 ++---
  mm/page_alloc.c  |  3 ++
  mm/swap.c| 45 ++--
  16 files changed, 83 insertions(+), 164 deletions(-)



Re: [PATCH v3 1/8] ext4/xfs: add page refcount helper

2021-06-17 Thread Darrick J. Wong
On Thu, Jun 17, 2021 at 10:16:58AM -0500, Alex Sierra wrote:
> From: Ralph Campbell 
> 
> There are several places where ZONE_DEVICE struct pages assume a reference
> count == 1 means the page is idle and free. Instead of open coding this,
> add a helper function to hide this detail.
> 
> v2:
> [AS]: rename dax_layout_is_idle_page func to dax_page_unused
> 
> Signed-off-by: Ralph Campbell 
> Signed-off-by: Alex Sierra 
> ---
>  fs/dax.c|  4 ++--
>  fs/ext4/inode.c |  5 +
>  fs/xfs/xfs_file.c   |  4 +---
>  include/linux/dax.h | 10 ++
>  4 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 26d5dcd2d69e..321f4ddc6643 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -358,7 +358,7 @@ static void dax_disassociate_entry(void *entry, struct 
> address_space *mapping,
>   for_each_mapped_pfn(entry, pfn) {
>   struct page *page = pfn_to_page(pfn);
>  
> - WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
> + WARN_ON_ONCE(trunc && !dax_layout_is_idle_page(page));
>   WARN_ON_ONCE(page->mapping && page->mapping != mapping);
>   page->mapping = NULL;
>   page->index = 0;
> @@ -372,7 +372,7 @@ static struct page *dax_busy_page(void *entry)
>   for_each_mapped_pfn(entry, pfn) {
>   struct page *page = pfn_to_page(pfn);
>  
> - if (page_ref_count(page) > 1)
> + if (!dax_layout_is_idle_page(page))
>   return page;
>   }
>   return NULL;
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index c173c8405856..9ee00186412f 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3972,10 +3972,7 @@ int ext4_break_layouts(struct inode *inode)
>   if (!page)
>   return 0;
>  
> - error = ___wait_var_event(&page->_refcount,
> - atomic_read(&page->_refcount) == 1,
> - TASK_INTERRUPTIBLE, 0, 0,
> - ext4_wait_dax_page(ei));
> + error = dax_wait_page(ei, page, ext4_wait_dax_page);
>   } while (error == 0);
>  
>   return error;
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 5b0f93f73837..39565fe5f817 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -782,9 +782,7 @@ xfs_break_dax_layouts(
>   return 0;
>  
>   *retry = true;
> - return ___wait_var_event(&page->_refcount,
> - atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
> - 0, 0, xfs_wait_dax_page(inode));
> + return dax_wait_page(inode, page, xfs_wait_dax_page);

Mechanically, this looks like a straightforward replacement, so:
Acked-by: Darrick J. Wong 

--D

>  }
>  
>  int
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index b52f084aa643..8b5da1d60dbc 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -243,6 +243,16 @@ static inline bool dax_mapping(struct address_space 
> *mapping)
>   return mapping->host && IS_DAX(mapping->host);
>  }
>  
> +static inline bool dax_page_unused(struct page *page)
> +{
> + return page_ref_count(page) == 1;
> +}
> +
> +#define dax_wait_page(_inode, _page, _wait_cb)			\
> + ___wait_var_event(&(_page)->_refcount,  \
> + dax_page_unused(_page), \
> + TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode))
> +
>  #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
>  void hmem_register_device(int target_nid, struct resource *r);
>  #else
> -- 
> 2.17.1
> 


Re: [PATCH v4 1/3] dt-bindings: msm: dsi: add missing 7nm bindings

2021-06-17 Thread Rob Clark
On Thu, Jun 17, 2021 at 8:09 AM Jonathan Marek  wrote:
>
> These got lost when going from .txt to .yaml bindings, add them back.
>

Fixes: 8fc939e72ff8 ("dt-bindings: msm: dsi: add yaml schemas for DSI
PHY bindings")

> Signed-off-by: Jonathan Marek 
> ---
>  .../bindings/display/msm/dsi-phy-7nm.yaml | 66 +++
>  1 file changed, 66 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
>
> diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml 
> b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
> new file mode 100644
> index ..c0077ca7e9e7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
> @@ -0,0 +1,66 @@
> +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/msm/dsi-phy-7nm.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Qualcomm Display DSI 7nm PHY
> +
> +maintainers:
> +  - Jonathan Marek 
> +
> +allOf:
> +  - $ref: dsi-phy-common.yaml#
> +
> +properties:
> +  compatible:
> +    oneOf:
> +      - const: qcom,dsi-phy-7nm
> +      - const: qcom,dsi-phy-7nm-8150
> +
> +  reg:
> +    items:
> +      - description: dsi phy register set
> +      - description: dsi phy lane register set
> +      - description: dsi pll register set
> +
> +  reg-names:
> +    items:
> +      - const: dsi_phy
> +      - const: dsi_phy_lane
> +      - const: dsi_pll
> +
> +  vdds-supply:
> +    description: |
> +      Connected to VDD_A_DSI_PLL_0P9 pin (or VDDA_DSI{0,1}_PLL_0P9 for sm8150)
> +
> +required:
> +  - compatible
> +  - reg
> +  - reg-names
> +  - vdds-supply
> +
> +unevaluatedProperties: false
> +
> +examples:
> +  - |
> + #include 
> + #include 
> +
> +    dsi-phy@ae94400 {
> +        compatible = "qcom,dsi-phy-7nm";
> +        reg = <0x0ae94400 0x200>,
> +              <0x0ae94600 0x280>,
> +              <0x0ae94900 0x260>;
> +        reg-names = "dsi_phy",
> +                    "dsi_phy_lane",
> +                    "dsi_pll";
> +
> +        #clock-cells = <1>;
> +        #phy-cells = <0>;
> +
> +        vdds-supply = <&vreg_l5a_0p88>;
> +        clocks = <&dispcc DISP_CC_MDSS_AHB_CLK>,
> +                 <&rpmhcc RPMH_CXO_CLK>;
> +        clock-names = "iface", "ref";
> +    };
> --
> 2.26.1
>


Re: [PATCH v2 2/2] drm/bridge: ti-sn65dsi86: Implement the pwm_chip

2021-06-17 Thread Bjorn Andersson
On Thu 17 Jun 01:24 CDT 2021, Uwe Kleine-König wrote:

> Hello Bjorn,
> 
> On Wed, Jun 16, 2021 at 10:22:17PM -0500, Bjorn Andersson wrote:
> > > > +static int ti_sn_pwm_apply(struct pwm_chip *chip, struct pwm_device 
> > > > *pwm,
> > > > +  const struct pwm_state *state)
> > > > +{
> > > > +   struct ti_sn65dsi86 *pdata = pwm_chip_to_ti_sn_bridge(chip);
> > > > +   unsigned int pwm_en_inv;
> > > > +   unsigned int backlight;
> > > > +   unsigned int pre_div;
> > > > +   unsigned int scale;
> > > > +   int ret;
> > > > +
> > > > +   if (!pdata->pwm_enabled) {
> > > > +   ret = pm_runtime_get_sync(pdata->dev);
> > > > +   if (ret < 0)
> > > > +   return ret;
> > > > +
> > > > +   ret = regmap_update_bits(pdata->regmap, 
> > > > SN_GPIO_CTRL_REG,
> > > > +   SN_GPIO_MUX_MASK << (2 * 
> > > > SN_PWM_GPIO_IDX),
> > > > +   SN_GPIO_MUX_SPECIAL << (2 * 
> > > > SN_PWM_GPIO_IDX));
> > > > +   if (ret) {
> > > > +   dev_err(pdata->dev, "failed to mux in PWM 
> > > > function\n");
> > > > +   goto out;
> > > > +   }
> > > 
> > > Do you need to do this even if state->enabled is false?
> > 
> > I presume I should be able to explicitly mux in the GPIO function and
> > configure that to output low. But I am not able to find anything in the
> > data sheet that would indicate this to be preferred.
> 
> My question targetted a different case. If the PWM is off
> (!pdata->pwm_enabled) and should remain off (state->enabled is false)
> you can shortcut here, can you not?
> 

Right, if we're going off->off then we don't need to touch the hardware.

But am I expected to return -EINVAL for an improper period and duty cycle
even though enabled is false?


And also, what is the supposed behavior of enabled = false? Is it
supposed to be equivalent to asking for a duty_cycle of 0?

> > > Does this already modify the output pin?
> > 
> > Yes, coming out of reset this pin is configured as input, so switching
> > the mux here will effectively start driving the pin.
> 
> So please document this in the format the recently added drivers do,
> too. See e.g. drivers/pwm/pwm-sifive.c. (The idea is to start that with
> " * Limitations:" to make it easy to grep it.)
> 

Okay, will do. Although I believe that for this driver it makes sense to
place such a comment close to this function, rather than at the top of the
driver.

> > > Lets continue the above example with the fixed calculation. So we have:
> > > 
> > >   pdata->pwm_refclk_freq = 334
> > >   state->period = 10 [ns]
> > >   state->duty_cycle = 600
> > >   scale = 332
> > > 
> > > so the actually emitted period = 99899.98002000399 ns
> > > 
> > > Now you calculate:
> > > 
> > >   backlight = 1
> > > 
> > > which yields an actual duty_cycle of 299.4 ns, with backlight = 2
> > > you would get an actual duty_cycle of 599.99988 ns, which is better. The
> > > culprit here is that you divide by state->period but instead should
> > > divide by the actual period.
> > 
> > What do I do about the case where the actual period is lower than the
> > requested one and thereby the duty cycle becomes larger than the period?
> 
> The general principle is: Pick the biggest possible duty_cycle available
> for the just picked period. So in your example you have to clamp it to
> period (assuming you can, otherwise pick the next lower possible value).
> 

Sounds good.

Thank you,
Bjorn

> Best regards
> Uwe
> 
> -- 
> Pengutronix e.K.   | Uwe Kleine-König|
> Industrial Linux Solutions | https://www.pengutronix.de/ |
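Uwe's rounding point can be checked with plain integer math. Below is a standalone, hedged sketch: the register model (duty ≈ backlight / (scale + 1) of the *actual* period) and all names are illustrative assumptions rather than the sn65dsi86 driver's code, and since the archived example numbers above are truncated, the values in the usage note are made up to be in the same ballpark.

```c
#include <assert.h>

/*
 * Pick a BACKLIGHT register value from the *actual* (hardware-achievable)
 * period rather than the requested one, rounding to nearest and clamping
 * so the duty cycle can never exceed the emitted period.
 */
unsigned int pick_backlight(unsigned long long duty_ns,
			    unsigned long long actual_period_ns,
			    unsigned int scale)
{
	/* round to nearest: duty * (scale + 1) / actual_period */
	unsigned long long backlight =
		(duty_ns * (scale + 1) + actual_period_ns / 2) / actual_period_ns;

	if (backlight > scale + 1)	/* requested duty exceeds actual period */
		backlight = scale + 1;
	return (unsigned int)backlight;
}
```

With scale = 332 and an actual period of 99900 ns, a requested 600 ns duty cycle yields backlight = 2, whereas truncating division against the requested 100000 ns period gives the too-small value 1 that the discussion points out.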




Re: [Intel-gfx] [RFC PATCH 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-06-17 Thread Daniel Vetter
Sorry, I'm behind on mails ...

On Fri, Jun 11, 2021 at 12:50:29PM -0700, Matthew Brost wrote:
> On Fri, Jun 04, 2021 at 07:59:05PM +0200, Daniel Vetter wrote:
> > On Wed, May 26, 2021 at 04:33:57PM -0700, Matthew Brost wrote:
> > > Add entry for i915 new parallel submission uAPI plan.
> > > 
> > > v2:
> > >  (Daniel Vetter):
> > >   - Expand logical order explanation
> > >   - Add dummy header
> > >   - Only allow N BBs in execbuf IOCTL
> > >   - Configure parallel submission per slot not per gem context
> > > v3:
> > >  (Marcin Ślusarz):
> > >   - Lots of typos / bad English fixed
> > >  (Tvrtko Ursulin):
> > >   - Consistent pseudo code, clean up wording in descriptions
> > > 
> > > Cc: Tvrtko Ursulin 
> > > Cc: Tony Ye 
> > > CC: Carl Zhang 
> > > Cc: Daniel Vetter 
> > > Cc: Jason Ekstrand 
> > > Signed-off-by: Matthew Brost 
> > > ---
> > >  Documentation/gpu/rfc/i915_parallel_execbuf.h | 145 ++
> > >  Documentation/gpu/rfc/i915_scheduler.rst  |  55 ++-
> > >  2 files changed, 198 insertions(+), 2 deletions(-)
> > >  create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > 
> > > diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
> > > b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > new file mode 100644
> > > index ..20de206e3ab4
> > > --- /dev/null
> > > +++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > @@ -0,0 +1,145 @@
> > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > > i915_context_engines_parallel_submit */
> > > +
> > > +/*
> > > + * i915_context_engines_parallel_submit:
> > 
> > So the idea is to make these kerneldoc and pull them into the rfc section.
> > Then when we merge, move them to the real uapi section, like what Matt has
> > done for lmem.
> > 
> 
> Yep, will fix in next rev.
> 
> > > + *
> > > + * Setup a slot in the context engine map to allow multiple BBs to be 
> > > submitted
> > > + * in a single execbuf IOCTL. Those BBs will then be scheduled to run on 
> > > the GPU
> > > + * in parallel. Multiple hardware contexts are created internally in the 
> > > i915
> > > + * run these BBs. Once a slot is configured for N BBs only N BBs can be
> > > + * submitted in each execbuf IOCTL and this is implicit behavior e.g. 
> > > The user
> > > + * doesn't tell the execbuf IOCTL there are N BBs, the execbuf IOCTL 
> > > know how
> > > + * many BBs there are based on the slots configuration. The N BBs are 
> > > the last N
> > > + * buffer objects for first N if I915_EXEC_BATCH_FIRST is set.
> > 
> > s/for/or/
> > 
> > > + *
> > > + * There are two currently defined ways to control the placement of the
> > > + * hardware contexts on physical engines: default behavior (no flags) and
> > > + * I915_PARALLEL_IMPLICIT_BONDS (a flag). More flags may be added the in 
> > > the
> > > + * future as new hardware / use cases arise. Details of how to use this
> > > + * interface are above the flags field in this structure.
> > > + *
> > > + * Returns -EINVAL if hardware context placement configuration is 
> > > invalid or if
> > > + * the placement configuration isn't supported on the platform / 
> > > submission
> > > + * interface.
> > > + * Returns -ENODEV if extension isn't supported on the platform / 
> > > submission
> > > + * interface.
> > > + */
> > > +struct i915_context_engines_parallel_submit {
> > > + struct i915_user_extension base;
> > > +
> > > + __u16 engine_index; /* slot for parallel engine */
> > 
> > Kernel doc here for the inline comments too.
> >
> 
> Yep.
>  
> > > + __u16 width;/* number of contexts per parallel engine */
> > > + __u16 num_siblings; /* number of siblings per context */
> > > + __u16 mbz16;
> > > +/*
> > > + * Default placement behavior (currently unsupported):
> > > + *
> > > + * Allow BBs to be placed on any available engine instance. In this case 
> > > each
> > > + * context's engine mask indicates where that context can be placed. It 
> > > is
> > > + * implied in this mode that all contexts have mutual exclusive 
> > > placement.
> > > + * e.g. If one context is running CSX[0] no other contexts can run on 
> > > CSX[0]).
> > > + *
> > > + * Example 1 pseudo code:
> > > + * CSX,Y[N] = generic engine class X or Y, logical instance N
> > > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> > > + * set_engines(INVALID)
> > > + * set_parallel(engine_index=0, width=2, num_siblings=2,
> > > + *   engines=CSX[0],CSX[1],CSY[0],CSY[1])
> > > + *
> > > + * Results in the following valid placements:
> > > + * CSX[0], CSY[0]
> > > + * CSX[0], CSY[1]
> > > + * CSX[1], CSY[0]
> > > + * CSX[1], CSY[1]
> > > + *
> > > + * This can also be thought of as 2 virtual engines described by 2-D 
> > > array in
> > > + * the engines the field:
> > > + * VE[0] = CSX[0], CSX[1]
> > > + * VE[1] = CSY[0], CSY[1]
> > > + *
> > > + * Example 2 pseudo code:
> > > + * CSX[N] = generic engine of same class X, logical instance N
> > > + * INVALID = 

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-17 Thread Daniel Vetter
On Mon, Jun 14, 2021 at 07:13:00PM +0200, Christian König wrote:
> As long as we can figure out who touched a certain sync object last, that
> would indeed work, yes.

Don't you need to know who will touch it next, i.e. who is holding up your
fence? Or maybe I'm just again totally confused.
-Daniel

> 
> Christian.
> 
> Am 14.06.21 um 19:10 schrieb Marek Olšák:
> > The call to the hw scheduler has a limitation on the size of all
> > parameters combined. I think we can only pass a 32-bit sequence number
> > and a ~16-bit global (per-GPU) syncobj handle in one call and not much
> > else.
> > 
> > The syncobj handle can be an element index in a global (per-GPU) syncobj
> > table and it's read only for all processes with the exception of the
> > signal command. Syncobjs can either have per VMID write access flags for
> > the signal command (slow), or any process can write to any syncobjs and
> > only rely on the kernel checking the write log (fast).
> > 
> > In any case, we can execute the memory write in the queue engine and
> > only use the hw scheduler for logging, which would be perfect.
> > 
> > Marek
> > 
> > On Thu, Jun 10, 2021 at 12:33 PM Christian König
> >  > > wrote:
> > 
> > Hi guys,
> > 
> > maybe soften that a bit. Reading from the shared memory of the
> > user fence is ok for everybody. What we need to take more care of
> > is the writing side.
> > 
> > So my current thinking is that we allow read only access, but
> > writing a new sequence value needs to go through the scheduler/kernel.
> > 
> > So when the CPU wants to signal a timeline fence it needs to call
> > an IOCTL. When the GPU wants to signal the timeline fence it needs
> > to hand that of to the hardware scheduler.
> > 
> > If we lockup the kernel can check with the hardware who did the
> > last write and what value was written.
> > 
> > That together with an IOCTL to give out sequence number for
> > implicit sync to applications should be sufficient for the kernel
> > to track who is responsible if something bad happens.
> > 
> > In other words when the hardware says that the shader wrote stuff
> > like 0xdeadbeef 0x0 or 0x into memory we kill the process
> > who did that.
> > 
> > If the hardware says that seq - 1 was written fine, but seq is
> > missing then the kernel blames whoever was supposed to write seq.
> > 
> > Just piping the write through a privileged instance should be
> > fine to make sure that we don't run into issues.
> > 
> > Christian.
> > 
> > Am 10.06.21 um 17:59 schrieb Marek Olšák:
> > > Hi Daniel,
> > > 
> > > We just talked about this whole topic internally and we came up
> > > to the conclusion that the hardware needs to understand sync
> > > object handles and have high-level wait and signal operations in
> > > the command stream. Sync objects will be backed by memory, but
> > > they won't be readable or writable by processes directly. The
> > > hardware will log all accesses to sync objects and will send the
> > > log to the kernel periodically. The kernel will identify
> > > malicious behavior.
> > > 
> > > Example of a hardware command stream:
> > > ...
> > > ImplicitSyncWait(syncObjHandle, sequenceNumber); // the sequence
> > > number is assigned by the kernel
> > > Draw();
> > > ImplicitSyncSignalWhenDone(syncObjHandle);
> > > ...
> > > 
> > > I'm afraid we have no other choice because of the TLB
> > > invalidation overhead.
> > > 
> > > Marek
> > > 
> > > 
> > > On Wed, Jun 9, 2021 at 2:31 PM Daniel Vetter  > > > wrote:
> > > 
> > > On Wed, Jun 09, 2021 at 03:58:26PM +0200, Christian König wrote:
> > > > Am 09.06.21 um 15:19 schrieb Daniel Vetter:
> > > > > [SNIP]
> > > > > > Yeah, we call this the lightweight and the heavyweight
> > > tlb flush.
> > > > > >
> > > > > > The lightweight can be used when you are sure that you
> > > don't have any of the
> > > > > > PTEs currently in flight in the 3D/DMA engine and you
> > > just need to
> > > > > > invalidate the TLB.
> > > > > >
> > > > > > The heavyweight must be used when you need to
> > > invalidate the TLB *AND* make
> > > > > > sure that no concurrently operation moves new stuff
> > > into the TLB.
> > > > > >
> > > > > > The problem is for this use case we have to use the
> > > heavyweight one.
> > > > > Just for my own curiosity: So the lightweight flush is
> > > only for in-between
> > > > > CS when you know access is idle? Or does that also not
> > > work if userspace
> > > > > has a CS on a dma engine going at the same time because
> > > the tlb aren't
> > > > > isolated enough between engin

Re: [PATCH 6/7] drm/amdgpu: unwrap fence chains in the explicit sync fence

2021-06-17 Thread Daniel Vetter
On Mon, Jun 14, 2021 at 09:25:44AM +0200, Christian König wrote:
> Am 11.06.21 um 17:18 schrieb Daniel Vetter:
> > On Fri, Jun 11, 2021 at 12:09:19PM +0200, Christian König wrote:
> > > Am 11.06.21 um 11:07 schrieb Daniel Vetter:
> > > > On Thu, Jun 10, 2021 at 11:17:59AM +0200, Christian König wrote:
> > > > > Unwrap the explicit fence if it is a dma_fence_chain and
> > > > > sync to the first fence not matching the owner rules.
> > > > > 
> > > > > Signed-off-by: Christian König 
> > > > > ---
> > > > >drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 118 
> > > > > +--
> > > > >1 file changed, 68 insertions(+), 50 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
> > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> > > > > index 1b2ceccaf5b0..862eb3c1c4c5 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> > > > > @@ -28,6 +28,8 @@
> > > > > *Christian König 
> > > > > */
> > > > > +#include 
> > > > > +
> > > > >#include "amdgpu.h"
> > > > >#include "amdgpu_trace.h"
> > > > >#include "amdgpu_amdkfd.h"
> > > > > @@ -186,6 +188,55 @@ int amdgpu_sync_vm_fence(struct amdgpu_sync 
> > > > > *sync, struct dma_fence *fence)
> > > > >   return amdgpu_sync_fence(sync, fence);
> > > > >}
> > > > > +/* Determine based on the owner and mode if we should sync to a 
> > > > > fence or not */
> > > > > +static bool amdgpu_sync_test_fence(struct amdgpu_device *adev,
> > > > > +enum amdgpu_sync_mode mode,
> > > > > +void *owner, struct dma_fence *f)
> > > > > +{
> > > > > + void *fence_owner = amdgpu_sync_get_owner(f);
> > > > > +
> > > > > + /* Always sync to moves, no matter what */
> > > > > + if (fence_owner == AMDGPU_FENCE_OWNER_UNDEFINED)
> > > > > + return true;
> > > > > +
> > > > > + /* We only want to trigger KFD eviction fences on
> > > > > +  * evict or move jobs. Skip KFD fences otherwise.
> > > > > +  */
> > > > > + if (fence_owner == AMDGPU_FENCE_OWNER_KFD &&
> > > > > + owner != AMDGPU_FENCE_OWNER_UNDEFINED)
> > > > > + return false;
> > > > > +
> > > > > + /* Never sync to VM updates either. */
> > > > > + if (fence_owner == AMDGPU_FENCE_OWNER_VM &&
> > > > > + owner != AMDGPU_FENCE_OWNER_UNDEFINED)
> > > > > + return false;
> > > > > +
> > > > > + /* Ignore fences depending on the sync mode */
> > > > > + switch (mode) {
> > > > > + case AMDGPU_SYNC_ALWAYS:
> > > > > + return true;
> > > > > +
> > > > > + case AMDGPU_SYNC_NE_OWNER:
> > > > > + if (amdgpu_sync_same_dev(adev, f) &&
> > > > > + fence_owner == owner)
> > > > > + return false;
> > > > > + break;
> > > > > +
> > > > > + case AMDGPU_SYNC_EQ_OWNER:
> > > > > + if (amdgpu_sync_same_dev(adev, f) &&
> > > > > + fence_owner != owner)
> > > > > + return false;
> > > > > + break;
> > > > > +
> > > > > + case AMDGPU_SYNC_EXPLICIT:
> > > > > + return false;
> > > > > + }
> > > > > +
> > > > > + WARN(debug_evictions && fence_owner == AMDGPU_FENCE_OWNER_KFD,
> > > > > +  "Adding eviction fence to sync obj");
> > > > > + return true;
> > > > > +}
> > > > > +
> > > > >/**
> > > > > * amdgpu_sync_resv - sync to a reservation object
> > > > > *
> > > > > @@ -211,67 +262,34 @@ int amdgpu_sync_resv(struct amdgpu_device 
> > > > > *adev, struct amdgpu_sync *sync,
> > > > >   /* always sync to the exclusive fence */
> > > > >   f = dma_resv_excl_fence(resv);
> > > > > - r = amdgpu_sync_fence(sync, f);
> > > > > + dma_fence_chain_for_each(f, f) {
> > > > Jason has some helper for deep-walking fence chains/arrays here I think.
> > > > Might want to look into that, so that we have some consistency in how we
> > > > pile up multiple exclusive fences.
> > > Well those helpers are not from Jason, but from me :)
> > > 
> > > But no, for now the deep inspection is not really helpful here since
> > > grabbing a reference to a certain chain node is what that makes the 
> > > handling
> > > easier and faster here.
> > > 
> > > Thinking more about it that should also make it possible for the garbage
> > > collection to kick in properly.
> > Hm this is tricky to reason about, but yeah with this here it's a true
> > chain, and you just need to connect them. But then if a buffer is on
> > multiple engines, collapsing things down occasionally might be useful.
> > 
> > But maybe we need to do that in the bigger rework where exclusive fences
> > are also just in the dma_fence_list with a "this is an exclusive one btw"
> > tag.
> > 
> > I think for the vk import case doing the deep scan makes more sense, it's
> > a once-per-frame thing, and there's a muc

Re: [PATCH] drm/ttm: fix error handling in ttm_bo_handle_move_mem()

2021-06-17 Thread Daniel Vetter
On Thu, Jun 17, 2021 at 09:41:35AM +0200, Christian König wrote:
> 
> 
> Am 16.06.21 um 21:19 schrieb Dan Carpenter:
> > On Wed, Jun 16, 2021 at 01:00:38PM +0200, Christian König wrote:
> > > 
> > > Am 16.06.21 um 11:36 schrieb Dan Carpenter:
> > > > On Wed, Jun 16, 2021 at 10:47:14AM +0200, Christian König wrote:
> > > > > Am 16.06.21 um 10:37 schrieb Dan Carpenter:
> > > > > > On Wed, Jun 16, 2021 at 08:46:33AM +0200, Christian König wrote:
> > > > > > > Sending the first message didn't work, so let's try again.
> > > > > > > 
> > > > > > > Am 16.06.21 um 08:30 schrieb Dan Carpenter:
> > > > > > > > There are three bugs here:
> > > > > > > > 1) We need to call unpopulate() if ttm_tt_populate() succeeds.
> > > > > > > > 2) The "new_man = ttm_manager_type(bdev, bo->mem.mem_type);" assignment
> > > > > > > >was wrong and it was really assigning "new_mem = old_mem;".  There
> > > > > > > >is no need for this assignment anyway as we already have the value
> > > > > > > >for "new_mem".
> > > > > > > > 3) The (!new_man->use_tt) condition is reversed.
> > > > > > > > 
> > > > > > > > Fixes: ba4e7d973dd0 ("drm: Add the TTM GPU memory manager subsystem.")
> > > > > > > > Signed-off-by: Dan Carpenter 
> > > > > > > > ---
> > > > > > > > This is from reading the code and I can't swear that I have understood
> > > > > > > > it correctly.  My nouveau driver is currently unusable and this patch
> > > > > > > > has not helped.  But hopefully if I fix enough bugs eventually it will
> > > > > > > > start to work.
> > > > > > > Well NAK, the code previously looked quite well and you are 
> > > > > > > breaking it now.
> > > > > > > 
> > > > > > > What's the problem with nouveau?
> > > > > > > 
> > > > > > The new Firefox seems to exercise nouveau more than the old one so
> > > > > > when I start 10 firefox windows it just hangs the graphics.
> > > > > > 
> > > > > > I've added debug code and it seems like the problem is that
> > > > > > nv50_mem_new() is failing.
> > > > > Sounds like it is running out of memory to me.
> > > > > 
> > > > > Do you have a dmesg?
> > > > > 
> > > > At first there was a very straight forward use after free bug which I
> > > > fixed.
> > > > https://lore.kernel.org/nouveau/YMinJwpIei9n1Pn1@mwanda/T/#u
> > > > 
> > > > But now the use after free is gone the only thing in dmesg is:
> > > > "[TTM] Buffer eviction failed".  And I have some firmware missing.
> > > > 
> > > > [  205.489763] rfkill: input handler disabled
> > > > [  205.678292] nouveau :01:00.0: Direct firmware load for nouveau/nva8_fuc084 failed with error -2
> > > > [  205.678300] nouveau :01:00.0: Direct firmware load for nouveau/nva8_fuc084d failed with error -2
> > > > [  205.678302] nouveau :01:00.0: msvld: unable to load firmware data
> > > > [  205.678304] nouveau :01:00.0: msvld: init failed, -19
> > > > [  296.150632] [TTM] Buffer eviction failed
> > > > [  417.084265] [TTM] Buffer eviction failed
> > > > [  447.295961] [TTM] Buffer eviction failed
> > > > [  510.800231] [TTM] Buffer eviction failed
> > > > [  556.101384] [TTM] Buffer eviction failed
> > > > [  616.495790] [TTM] Buffer eviction failed
> > > > [  692.014007] [TTM] Buffer eviction failed
> > > > 
> > > > The eviction failed message only shows up a minute after the hang so it
> > > > seems more like a symptom than a root cause.
> > > Yeah, look at the timing. What happens is that the buffer eviction timed out
> > > because the hardware is locked up.
> > > 
> > > No idea what that could be. It might not even be kernel related at all.
> > I don't think it's hardware related...  Using an old version of firefox
> > "fixes" the problem.  I downloaded the firmware so that's not the issue.
> > Here's the dmesg load info with the new firmware.
> 
> Oh, I was not suggesting a hardware problem.
> 
> The most likely cause is a software issue in userspace, e.g. wrong order of
> doing things, doing things too fast without waiting, etc...
> 
> There are tons of ways userspace can crash GPU hardware that you can't
> prevent in the kernel. Detecting an endless loop in particular is the
> well-known halting problem and not even theoretically solvable.
> 
> I suggest to start digging in userspace instead.

I guess nouveau doesn't have reset when the fences time out? That would at
least paper over this, plus it makes debugging the bug in mesa easier.

Also, as Christian points out, because of the halting problem the lack of tdr
(timeout and device reset) is actually a security bug itself.
-Daniel

> 
> Christian.
> 
> > 
> > [1.412458] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel 
> > [1.412527] AMD-Vi: AMD IOMMUv2 functionality not available on this system
> > [1.412710] nouveau :01:00.0: vgaarb: deactivate vga console
> > [1.417213] Console: switching to colour dummy device 80x25

Re: [PATCH v2 2/2] drm/bridge: ti-sn65dsi86: Implement the pwm_chip

2021-06-17 Thread Uwe Kleine-König
Hello Bjorn,

On Thu, Jun 17, 2021 at 11:38:26AM -0500, Bjorn Andersson wrote:
> On Thu 17 Jun 01:24 CDT 2021, Uwe Kleine-König wrote:
> > On Wed, Jun 16, 2021 at 10:22:17PM -0500, Bjorn Andersson wrote:
> > > > > +static int ti_sn_pwm_apply(struct pwm_chip *chip, struct pwm_device 
> > > > > *pwm,
> > > > > +const struct pwm_state *state)
> > > > > +{
> > > > > + struct ti_sn65dsi86 *pdata = pwm_chip_to_ti_sn_bridge(chip);
> > > > > + unsigned int pwm_en_inv;
> > > > > + unsigned int backlight;
> > > > > + unsigned int pre_div;
> > > > > + unsigned int scale;
> > > > > + int ret;
> > > > > +
> > > > > + if (!pdata->pwm_enabled) {
> > > > > + ret = pm_runtime_get_sync(pdata->dev);
> > > > > + if (ret < 0)
> > > > > + return ret;
> > > > > +
> > > > > > + ret = regmap_update_bits(pdata->regmap, SN_GPIO_CTRL_REG,
> > > > > > + SN_GPIO_MUX_MASK << (2 * SN_PWM_GPIO_IDX),
> > > > > > + SN_GPIO_MUX_SPECIAL << (2 * SN_PWM_GPIO_IDX));
> > > > > + if (ret) {
> > > > > > + dev_err(pdata->dev, "failed to mux in PWM function\n");
> > > > > + goto out;
> > > > > + }
> > > > 
> > > > Do you need to do this even if state->enabled is false?
> > > 
> > > I presume I should be able to explicitly mux in the GPIO function and
> > > configure that to output low. But I am not able to find anything in the
> > > data sheet that would indicate this to be preferred.
> > 
> > My question targetted a different case. If the PWM is off
> > (!pdata->pwm_enabled) and should remain off (state->enabled is false)
> > you can shortcut here, can you not?
> 
> Right, if we're going off->off then we don't need to touch the hardware.
> 
> But am I expected to -EINVAL improper period and duty cycle even though
> enabled is false?
> 
> And also, what is the supposed behavior of enabled = false? Is it
> supposedly equivalent of asking for a duty_cycle of 0?

In my book enabled = false is just syntactic sugar to say:
"duty_cycle=0, period=something small". So to answer your questions: if
enabled = false, the consumer doesn't really care about period and
duty_cycle. Some care that the output becomes inactive, some others
don't, so from my POV just emit the inactive level on the output and
ignore period and duty_cycle.

> > > > Does this already modify the output pin?
> > > 
> > > Yes, coming out of reset this pin is configured as input, so switching
> > > the mux here will effectively start driving the pin.
> > 
> > So please document this in the format the recently added drivers do,
> > too. See e.g. drivers/pwm/pwm-sifive.c. (The idea is to start that with
> > " * Limitations:" to make it easy to grep it.)
> > 
> 
> Okay, will do. Although I believe that for this driver it makes sense to
> place such comment close to this function, rather than at the top of the
> driver.

Yes, agreed.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |




Re: [PATCH 1/5] dma-buf: fix dma_resv_test_signaled test_all handling

2021-06-17 Thread Daniel Vetter
On Mon, Jun 14, 2021 at 07:15:44PM +0200, Christian König wrote:
> Am 11.06.21 um 16:55 schrieb Daniel Vetter:
> > On Fri, Jun 11, 2021 at 04:53:11PM +0200, Christian König wrote:
> > > 
> > > Am 11.06.21 um 16:47 schrieb Daniel Vetter:
> > > > On Fri, Jun 11, 2021 at 02:02:57PM +0200, Christian König wrote:
> > > > > As the name implies if testing all fences is requested we
> > > > > should indeed test all fences and not skip the exclusive
> > > > > one because we see shared ones.
> > > > > 
> > > > > Signed-off-by: Christian König 
> > > > Hm I thought we've had the rule that when both fences exist, then
> > > > collectively the shared ones must signal no earlier than the exclusive
> > > > one.
> > > > 
> > > > That's at least the contract we've implemented in dma_resv.h. But I've
> > > > also found a bunch of drivers who are a lot more yolo on this.
> > > > 
> > > > I think there's a solid case here to just always take all the fences if we
> > > > ask for all the shared ones, but if we go that way then I'd say
> > > > - clear kerneldoc patch to really hammer this in (currently we're not good
> > > > at all in this regard)
> > > > - going through drivers a bit to check for this (I have some of that done
> > > > already in my earlier series, need to respin it and send it out)
> > > > 
> > > > But I'm kinda not seeing why this needs to be in this patch series here.
> > > You mentioned that this is a problem in the last patch and if you ask me
> > > that's just a bug or at least very inconsistent.
> > > 
> > > See dma_resv_wait_timeout() always waits for all fences, including the
> > > exclusive one even if shared ones are present. But dma_resv_test_signaled()
> > > ignores the exclusive one if shared ones are present.
> > Hm the only one I thought I've mentioned is that dma_buf_poll doesn't use
> > dma_fence_get_rcu_safe where I think it should. Different problem. I think
> > this is one you spotted.
> > 
> > > The only other driver I could find trying to make use of this is nouveau and
> > > I already provided a fix for this as well.
> > i915 also does this, and I think I've found a few more.
> > 
> > > I just think that this is the more defensive approach to fix this and have
> > > at least the core functions consistent on the handling.
> > Oh fully agree, it's just current dma_resv docs aren't the greatest, and
> > hacking on semantics without updating the docs isn't great. Especially
> > when it's ad-hoc.
> 
> Well when the requirement that shared fences should always signal after the
> exclusive fence is not documented anywhere then I would say that it is
> naturally allowed to just add any fence to the list of shared fences, and any
> code assuming something else is just broken and needs fixing.

That's not what I meant. I thought the rule is that the shared fences
_together_ need to signal after the exclusive ones. Not each individual
one.

This means that if you have both exclusive fences and shared fences, and
you want to wait for just the shared fences, then you can ignore the
exclusive ones.

You have a patch series floating around which "fixes" this, but I think
it's incomplete. And I'm pretty sure it's a change of de facto rules, since
not obeying this breaks a bunch of existing code (as you've noticed).
-Daniel

> 
> Christian.
> 
> > -Daniel
> > 
> > > Christian.
> > > 
> > > > -Daniel
> > > > 
> > > > > ---
> > > > >drivers/dma-buf/dma-resv.c | 33 -
> > > > >1 file changed, 12 insertions(+), 21 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> > > > > index f26c71747d43..c66bfdde9454 100644
> > > > > --- a/drivers/dma-buf/dma-resv.c
> > > > > +++ b/drivers/dma-buf/dma-resv.c
> > > > > @@ -615,25 +615,21 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
> > > > > */
> > > > >bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
> > > > >{
> > > > > - unsigned int seq, shared_count;
> > > > > + struct dma_fence *fence;
> > > > > + unsigned int seq;
> > > > >   int ret;
> > > > >   rcu_read_lock();
> > > > >retry:
> > > > >   ret = true;
> > > > > - shared_count = 0;
> > > > >   seq = read_seqcount_begin(&obj->seq);
> > > > >   if (test_all) {
> > > > >   struct dma_resv_list *fobj = dma_resv_shared_list(obj);
> > > > > - unsigned int i;
> > > > > -
> > > > > - if (fobj)
> > > > > - shared_count = fobj->shared_count;
> > > > > + unsigned int i, shared_count;
> > > > > + shared_count = fobj ? fobj->shared_count : 0;
> > > > >   for (i = 0; i < shared_count; ++i) {
> > > > > - struct dma_fence *fence;
> > > > > -
> > > > >   fence = rcu_dereference(fobj->shared[i]);
> > > > >   ret = dma_resv_test_sign

Re: [Intel-gfx] [PATCH 0/2] GuC submission / DRM scheduler integration plan + new uAPI

2021-06-17 Thread Daniel Vetter
On Fri, Jun 11, 2021 at 04:40:42PM -0700, Matthew Brost wrote:
> Subject and patches say it all.
> 
> v2: Address comments, patches have details of changes
> v3: Address comments, patches have details of changes
> v4: Address comments, patches have details of changes
> 
> Signed-off-by: Matthew Brost 

Imo ready (well overdue) for merging, please annoy Carl or someone from
media for an ack and then ask John or Daniele to merge it into
drm-intel-gt-next.
-Daniel

> 
> Matthew Brost (2):
>   drm/doc/rfc: i915 GuC submission / DRM scheduler
>   drm/doc/rfc: i915 new parallel submission uAPI plan
> 
>  Documentation/gpu/rfc/i915_parallel_execbuf.h | 117 ++
>  Documentation/gpu/rfc/i915_scheduler.rst  | 148 ++
>  Documentation/gpu/rfc/index.rst   |   4 +
>  3 files changed, 269 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
>  create mode 100644 Documentation/gpu/rfc/i915_scheduler.rst
> 
> -- 
> 2.28.0
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add a locked version of drm_is_current_master

2021-06-17 Thread Daniel Vetter
On Tue, Jun 15, 2021 at 10:36:44AM +0800, Desmond Cheong Zhi Xi wrote:
> While checking the master status of the DRM file in
> drm_is_current_master(), the device's master mutex should be
> held. Without the mutex, the pointer fpriv->master may be freed
> concurrently by another process calling drm_setmaster_ioctl(). This
> could lead to use-after-free errors when the pointer is subsequently
> dereferenced in drm_lease_owner().
> 
> The callers of drm_is_current_master() from drm_auth.c hold the
> device's master mutex, but external callers do not. Hence, we implement
> drm_is_current_master_locked() to be used within drm_auth.c, and
> modify drm_is_current_master() to grab the device's master mutex
> before checking the master status.
> 
> Reported-by: Daniel Vetter 
> Signed-off-by: Desmond Cheong Zhi Xi 
> Reviewed-by: Emil Velikov 
> ---
>  drivers/gpu/drm/drm_auth.c | 23 +++
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
> index 232abbba3686..c6bf52c310a9 100644
> --- a/drivers/gpu/drm/drm_auth.c
> +++ b/drivers/gpu/drm/drm_auth.c
> @@ -61,6 +61,8 @@
>   * trusted clients.
>   */
>  
> +static bool drm_is_current_master_locked(struct drm_file *fpriv);

A bit of a bikeshed, but we try to avoid forward declarations when they're
not needed. If you don't want to tear apart drm_is_current_master and the
_locked version then just move them together.

Can you pls do that and respin?

Otherwise looks all great.
-Daniel


> +
>  int drm_getmagic(struct drm_device *dev, void *data, struct drm_file *file_priv)
>  {
>   struct drm_auth *auth = data;
> @@ -223,7 +225,7 @@ int drm_setmaster_ioctl(struct drm_device *dev, void *data,
>   if (ret)
>   goto out_unlock;
>  
> - if (drm_is_current_master(file_priv))
> + if (drm_is_current_master_locked(file_priv))
>   goto out_unlock;
>  
>   if (dev->master) {
> @@ -272,7 +274,7 @@ int drm_dropmaster_ioctl(struct drm_device *dev, void *data,
>   if (ret)
>   goto out_unlock;
>  
> - if (!drm_is_current_master(file_priv)) {
> + if (!drm_is_current_master_locked(file_priv)) {
>   ret = -EINVAL;
>   goto out_unlock;
>   }
> @@ -321,7 +323,7 @@ void drm_master_release(struct drm_file *file_priv)
>   if (file_priv->magic)
>   idr_remove(&file_priv->master->magic_map, file_priv->magic);
>  
> - if (!drm_is_current_master(file_priv))
> + if (!drm_is_current_master_locked(file_priv))
>   goto out;
>  
>   drm_legacy_lock_master_cleanup(dev, master);
> @@ -342,6 +344,13 @@ void drm_master_release(struct drm_file *file_priv)
>   mutex_unlock(&dev->master_mutex);
>  }
>  
> +static bool drm_is_current_master_locked(struct drm_file *fpriv)
> +{
> + lockdep_assert_held_once(&fpriv->master->dev->master_mutex);
> +
> + return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
> +}
> +
>  /**
>   * drm_is_current_master - checks whether @priv is the current master
>   * @fpriv: DRM file private
> @@ -354,7 +363,13 @@ void drm_master_release(struct drm_file *file_priv)
>   */
>  bool drm_is_current_master(struct drm_file *fpriv)
>  {
> - return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
> + bool ret;
> +
> + mutex_lock(&fpriv->master->dev->master_mutex);
> + ret = drm_is_current_master_locked(fpriv);
> + mutex_unlock(&fpriv->master->dev->master_mutex);
> +
> + return ret;
>  }
>  EXPORT_SYMBOL(drm_is_current_master);
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 2/2] drm: Protect drm_master pointers in drm_lease.c

2021-06-17 Thread Daniel Vetter
On Tue, Jun 15, 2021 at 10:36:45AM +0800, Desmond Cheong Zhi Xi wrote:
> This patch ensures that the device's master mutex is acquired before
> accessing pointers to struct drm_master that are subsequently
> dereferenced. Without the mutex, the struct drm_master may be freed
> concurrently by another process calling drm_setmaster_ioctl(). This
> could then lead to use-after-free errors.
> 
> Reported-by: Daniel Vetter 
> Signed-off-by: Desmond Cheong Zhi Xi 
> Reviewed-by: Emil Velikov 
> ---
>  drivers/gpu/drm/drm_lease.c | 58 +++--
>  1 file changed, 43 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c
> index da4f085fc09e..3e6f689236e5 100644
> --- a/drivers/gpu/drm/drm_lease.c
> +++ b/drivers/gpu/drm/drm_lease.c
> @@ -107,10 +107,16 @@ static bool _drm_has_leased(struct drm_master *master, int id)
>   */
>  bool _drm_lease_held(struct drm_file *file_priv, int id)
>  {
> + bool ret;
> +
>   if (!file_priv || !file_priv->master)
>   return true;
>  
> - return _drm_lease_held_master(file_priv->master, id);
> + mutex_lock(&file_priv->master->dev->master_mutex);

So maybe we have a bug somewhere, and the kerneldoc isn't 100% clear, but
I thought file_priv->master is invariant over the lifetime of file_priv.
So we don't need a lock to check anything here.

It's the drm_device->master derefence that gets us into trouble. Well also
file_priv->is_owner is protected by dev->master_mutex.

So I think with your previous patch all the access here in drm_lease.c is
ok and already protected? Or am I missing something?

Thanks, Daniel


> + ret = _drm_lease_held_master(file_priv->master, id);
> + mutex_unlock(&file_priv->master->dev->master_mutex);
> +
> + return ret;
>  }
>  
>  /**
> @@ -132,10 +138,12 @@ bool drm_lease_held(struct drm_file *file_priv, int id)
>   if (!file_priv || !file_priv->master || !file_priv->master->lessor)
>   return true;
>  
> + mutex_lock(&file_priv->master->dev->master_mutex);
>   master = file_priv->master;
>   mutex_lock(&master->dev->mode_config.idr_mutex);
>   ret = _drm_lease_held_master(master, id);
>   mutex_unlock(&master->dev->mode_config.idr_mutex);
> + mutex_unlock(&file_priv->master->dev->master_mutex);
>   return ret;
>  }
>  
> @@ -158,6 +166,7 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, uint32_t crtcs_in)
>   if (!file_priv || !file_priv->master || !file_priv->master->lessor)
>   return crtcs_in;
>  
> + mutex_lock(&file_priv->master->dev->master_mutex);
>   master = file_priv->master;
>   dev = master->dev;
>  
> @@ -177,6 +186,7 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, uint32_t crtcs_in)
>   count_in++;
>   }
>   mutex_unlock(&master->dev->mode_config.idr_mutex);
> + mutex_unlock(&file_priv->master->dev->master_mutex);
>   return crtcs_out;
>  }
>  
> @@ -490,7 +500,7 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
>   size_t object_count;
>   int ret = 0;
>   struct idr leases;
> - struct drm_master *lessor = lessor_priv->master;
> + struct drm_master *lessor;
>   struct drm_master *lessee = NULL;
>   struct file *lessee_file = NULL;
>   struct file *lessor_file = lessor_priv->filp;
> @@ -502,12 +512,6 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
>   if (!drm_core_check_feature(dev, DRIVER_MODESET))
>   return -EOPNOTSUPP;
>  
> - /* Do not allow sub-leases */
> - if (lessor->lessor) {
> - DRM_DEBUG_LEASE("recursive leasing not allowed\n");
> - return -EINVAL;
> - }
> -
>   /* need some objects */
>   if (cl->object_count == 0) {
>   DRM_DEBUG_LEASE("no objects in lease\n");
> @@ -519,12 +523,23 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
>   return -EINVAL;
>   }
>  
> + mutex_lock(&dev->master_mutex);
> + lessor = lessor_priv->master;
> + /* Do not allow sub-leases */
> + if (lessor->lessor) {
> + DRM_DEBUG_LEASE("recursive leasing not allowed\n");
> + ret = -EINVAL;
> + goto unlock;
> + }
> +
>   object_count = cl->object_count;
>  
>   object_ids = memdup_user(u64_to_user_ptr(cl->object_ids),
>   array_size(object_count, sizeof(__u32)));
> - if (IS_ERR(object_ids))
> - return PTR_ERR(object_ids);
> + if (IS_ERR(object_ids)) {
> + ret = PTR_ERR(object_ids);
> + goto unlock;
> + }
>  
>   idr_init(&leases);
>  
> @@ -535,14 +550,15 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
>   if (ret) {
>   DRM_DEBUG_LEASE("lease object lookup failed: %i\n", ret);
>   idr_destroy(&leases);
> - return ret;
> + goto unlock;
>   }
>  
>   /* Allo

Re: [PATCH v2] drm/i915: Document the Virtual Engine uAPI

2021-06-17 Thread Daniel Vetter
On Mon, Jun 14, 2021 at 10:09:59AM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> A little bit of documentation covering the topics of engine discovery,
> context engine maps and virtual engines. It is not very detailed but
> supposed to be a starting point of giving a brief high level overview of
> general principles and intended use cases.
> 
> v2:
>  * Have the text in uapi header and link from there.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Daniel Vetter 

What I meant was the kerneldoc directly as kerneldoc for the uapi structs,
like Matt has done for e.g. drm_i915_gem_create_ext_memory_regions.

But then I also realized that Matt hasn't set up the include for this, so
it's not automatic at all yet :-/
-Daniel

> ---
>  Documentation/gpu/i915.rst  |  18 
>  include/uapi/drm/i915_drm.h | 188 
>  2 files changed, 206 insertions(+)
> 
> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> index 42ce0196930a..00aa55bbe0fd 100644
> --- a/Documentation/gpu/i915.rst
> +++ b/Documentation/gpu/i915.rst
>  for execution also include a list of all locations within buffers that
>  refer to GPU-addresses so that the kernel can edit the buffer correctly.
>  This process is dubbed relocation.
>  
> +Engine Discovery uAPI
> +-
> +
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +   :doc: Engine Discovery uAPI
> +
> +Context Engine Map uAPI
> +---
> +
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +   :doc: Context Engine Map uAPI
> +
> +Virtual Engine uAPI
> +---
> +
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +   :doc: Virtual Engine uAPI
> +
>  Locking Guidelines
>  --
>  
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index a1cb4aa035a9..2f70c48567c0 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1806,6 +1806,69 @@ struct drm_i915_gem_context_param_sseu {
>   __u32 rsvd;
>  };
>  
> +/**
> + * DOC: Virtual Engine uAPI
> + *
> + * Virtual engine is a concept where userspace is able to configure a set of
> + * physical engines, submit a batch buffer, and let the driver execute it on any
> + * engine from the set as it sees fit.
> + *
> + * This is primarily useful on parts which have multiple instances of a same
> + * class engine, like for example GT3+ Skylake parts with their two VCS engines.
> + *
> + * For instance userspace can enumerate all engines of a certain class using the
> + * previously described `Engine Discovery uAPI`_. After that userspace can
> + * create a GEM context with a placeholder slot for the virtual engine (using
> + * `I915_ENGINE_CLASS_INVALID` and `I915_ENGINE_CLASS_INVALID_NONE` for class
> + * and instance respectively) and finally using the
> + * `I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE` extension place a virtual engine in
> + * the same reserved slot.
> + *
> + * Example of creating a virtual engine and submitting a batch buffer to it:
> + *
> + * .. code-block:: C
> + *
> + *   I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(virtual, 2) = {
> + *   .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
> + *   .engine_index = 0, // Place this virtual engine into engine map slot 0
> + *   .num_siblings = 2,
> + *   .engines = { { I915_ENGINE_CLASS_VIDEO, 0 },
> + *{ I915_ENGINE_CLASS_VIDEO, 1 }, },
> + *   };
> + *   I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1) = {
> + *   .engines = { { I915_ENGINE_CLASS_INVALID,
> + *  I915_ENGINE_CLASS_INVALID_NONE } },
> + *   .extensions = to_user_pointer(&virtual), // Chains after load_balance extension
> + *   };
> + *   struct drm_i915_gem_context_create_ext_setparam p_engines = {
> + *   .base = {
> + *   .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> + *   },
> + *   .param = {
> + *   .param = I915_CONTEXT_PARAM_ENGINES,
> + *   .value = to_user_pointer(&engines),
> + *   .size = sizeof(engines),
> + *   },
> + *   };
> + *   struct drm_i915_gem_context_create_ext create = {
> + *   .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> + *   .extensions = to_user_pointer(&p_engines);
> + *   };
> + *
> + *   ctx_id = gem_context_create_ext(drm_fd, &create);
> + *
> + *   // Now we have created a GEM context with its engine map containing a
> + *   // single virtual engine. Submissions to this slot can go either to
> + *   // vcs0 or vcs1, depending on the load balancing algorithm used inside
> + *   // the driver. The load balancing is dynamic from one batch buffer to
> + *   // another and transparent to userspace.
> + *
> + *   ...
> + *   execbuf.rsvd1 = ctx_id;
> + *   execbuf.flags = 0; // Submits to index 0 which is the virtual engine
> + *   gem_execbuf(drm_fd, &execbuf);
> + */
> +
>  /

Re: [PATCH] drm/i915: allow DG1 autoprobe for CONFIG_BROKEN

2021-06-17 Thread Daniel Vetter
On Wed, Jun 16, 2021 at 03:29:26PM +0100, Matthew Auld wrote:
> On Mon, 14 Jun 2021 at 10:22, Matthew Auld  wrote:
> >
> > Purely for CI so we can get some pre-merge results for DG1. This is
> > especially useful for cross driver TTM changes where CI can hopefully
> > catch regressions. This is similar to how we already handle the DG1
> > specific uAPI, which are also hidden behind CONFIG_BROKEN.
> >
> > Signed-off-by: Matthew Auld 
> > Cc: Thomas Hellström 
> > Cc: Daniel Vetter 
> > Cc: Dave Airlie 
> 
> Daniel, any objections to landing this?

I think stuffing this into topic/core-for-CI is fine, lets wait a bit more
until mesa and everything is ready with adding the pciids to an official
tree.

(Catching up on mails, apologies and all that).
-Daniel

> 
> > ---
> >  drivers/gpu/drm/i915/i915_pci.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > index 83b500bb170c..78742157aaa3 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -1040,6 +1040,9 @@ static const struct pci_device_id pciidlist[] = {
> > INTEL_RKL_IDS(&rkl_info),
> > INTEL_ADLS_IDS(&adl_s_info),
> > INTEL_ADLP_IDS(&adl_p_info),
> > +#if IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)
> > +   INTEL_DG1_IDS(&dg1_info),
> > +#endif
> > {0, 0, 0}
> >  };
> >  MODULE_DEVICE_TABLE(pci, pciidlist);
> > --
> > 2.26.3
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: drm/i915: __GFP_RETRY_MAYFAIL allocations in stable kernels

2021-06-17 Thread Daniel Vetter
On Mon, Jun 14, 2021 at 09:45:37PM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> We are observing some user-space crashes (sigabort, segfaults etc.)
> under moderate memory pressure (pretty far from severe pressure) which
> have one thing in common - restrictive GFP mask in setup_scratch_page().
> 
> For instance, (stable 4.19) drivers/gpu/drm/i915/i915_gem_gtt.c
> 
> (trimmed down version)
> 
> static int gen8_init_scratch(struct i915_address_space *vm)
> {
> setup_scratch_page(vm, __GFP_HIGHMEM);
> 
> vm->scratch_pt = alloc_pt(vm);
> vm->scratch_pd = alloc_pd(vm);
> if (use_4lvl(vm)) {
> vm->scratch_pdp = alloc_pdp(vm);
> }
> }
> 
> The gen8_init_scratch() function puts rather inconsistent restrictions on mm.
> 
> Looking at it line by line:
> 
> setup_scratch_page() uses very restrictive gfp mask:
>   __GFP_HIGHMEM | __GFP_ZERO | __GFP_RETRY_MAYFAIL
> 
> it doesn't try to reclaim anything and fails almost immediately.
> 
> alloc_pt() - uses more permissive gfp mask:
>   GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN
> 
> alloc_pd() - likewise:
>   GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN
> 
> alloc_pdp() - very permissive gfp mask:
>   GFP_KERNEL
> 
> 
> So can all allocations in gen8_init_scratch() use
>   GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN

Yeah that looks all fairly broken tbh. The only thing I didn't know was
that GFP_DMA32 wasn't a full gfp mask with reclaim bits set as needed. I
guess it would be clearer if we used GFP_KERNEL | __GFP_DMA32 for these.

The commit that introduced a lot of this, including I915_GFP_ALLOW_FAIL
seems to be

commit 1abb70f5955d1a9021f96359a2c6502ca569b68d
Author: Chris Wilson 
Date:   Tue May 22 09:36:43 2018 +0100

drm/i915/gtt: Allow pagedirectory allocations to fail

which used a selftest as justification, not real world workloads, so looks
rather dubious.

Adding Matt Auld to this thread, maybe he has ideas.

Thanks, Daniel

> ?
> 
> E.g.
> 
> ---
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index a12430187108..e862680b9c93 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -792,7 +792,7 @@ alloc_pdp(struct i915_address_space *vm)
>  
> GEM_BUG_ON(!use_4lvl(vm));
>  
> -   pdp = kzalloc(sizeof(*pdp), GFP_KERNEL);
> +   pdp = kzalloc(sizeof(*pdp), I915_GFP_ALLOW_FAIL);
> if (!pdp)
> return ERR_PTR(-ENOMEM);
>  
> @@ -1262,7 +1262,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>  {
> int ret;
>  
> -   ret = setup_scratch_page(vm, __GFP_HIGHMEM);
> +   ret = setup_scratch_page(vm, GFP_KERNEL | __GFP_HIGHMEM);
> if (ret)
> return ret;
>  
> @@ -1972,7 +1972,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_hw_ppgtt *ppgtt)
> u32 pde;
> int ret;
>  
> -   ret = setup_scratch_page(vm, __GFP_HIGHMEM);
> +   ret = setup_scratch_page(vm, GFP_KERNEL | __GFP_HIGHMEM);
> if (ret)
> return ret;
>  
> @@ -3078,7 +3078,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
> return -ENOMEM;
> }
>  
> -   ret = setup_scratch_page(&ggtt->vm, GFP_DMA32);
> +   ret = setup_scratch_page(&ggtt->vm, GFP_KERNEL | GFP_DMA32);
> if (ret) {
> DRM_ERROR("Scratch setup failed\n");
> /* iounmap will also get called at remove, but meh */
> ---
> 
> 
> 
> It's quite similar on stable 5.4 - setup_scratch_page() uses a restrictive
> gfp mask again.
> 
> ---
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
> b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index f614646ed3f9..99d78b1052df 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1378,7 +1378,7 @@ static int gen8_init_scratch(struct i915_address_space 
> *vm)
> return 0;
> }
>  
> -   ret = setup_scratch_page(vm, __GFP_HIGHMEM);
> +   ret = setup_scratch_page(vm, GFP_KERNEL | __GFP_HIGHMEM);
> if (ret)
> return ret;
>  
> @@ -1753,7 +1753,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt 
> *ppgtt)
> struct i915_page_directory * const pd = ppgtt->base.pd;
> int ret;
>  
> -   ret = setup_scratch_page(vm, __GFP_HIGHMEM);
> +   ret = setup_scratch_page(vm, GFP_KERNEL | __GFP_HIGHMEM);
> if (ret)
> return ret;
>  
> @@ -2860,7 +2860,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, 
> u64 size)
> return -ENOMEM;
> }
>  
> -   ret = setup_scratch_page(&ggtt->vm, GFP_DMA32);
> +   ret = setup_scratch_page(&ggtt->vm, GFP_KERNEL | GFP_DMA32);
> if (ret) {
> DRM_ERROR("Scratch setup failed\n");
> /* iounmap will also get called at remove, but meh */
> ---

-- 
Daniel Vetter
Software Engineer

Re: [PATCH 1/2] drm/amdgpu: unwrap fence chains in the explicit sync fence

2021-06-17 Thread Daniel Vetter
On Thu, Jun 17, 2021 at 09:44:25AM +0200, Christian König wrote:
> Alex, do you want to review those so that we can close the ticket?

Maybe I'm behind on mails, but 2nd patch still has the issues I think I'm
seeing ...
-Daniel

> 
> Thanks,
> Christian.
> 
> Am 14.06.21 um 19:45 schrieb Christian König:
> > Unwrap the explicit fence if it is a dma_fence_chain and
> > sync to the first fence not matching the owner rules.
> > 
> > Signed-off-by: Christian König 
> > Acked-by: Daniel Vetter 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 118 +--
> >   1 file changed, 68 insertions(+), 50 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> > index 1b2ceccaf5b0..862eb3c1c4c5 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> > @@ -28,6 +28,8 @@
> >*Christian König 
> >*/
> > +#include 
> > +
> >   #include "amdgpu.h"
> >   #include "amdgpu_trace.h"
> >   #include "amdgpu_amdkfd.h"
> > @@ -186,6 +188,55 @@ int amdgpu_sync_vm_fence(struct amdgpu_sync *sync, 
> > struct dma_fence *fence)
> > return amdgpu_sync_fence(sync, fence);
> >   }
> > +/* Determine based on the owner and mode if we should sync to a fence or 
> > not */
> > +static bool amdgpu_sync_test_fence(struct amdgpu_device *adev,
> > +  enum amdgpu_sync_mode mode,
> > +  void *owner, struct dma_fence *f)
> > +{
> > +   void *fence_owner = amdgpu_sync_get_owner(f);
> > +
> > +   /* Always sync to moves, no matter what */
> > +   if (fence_owner == AMDGPU_FENCE_OWNER_UNDEFINED)
> > +   return true;
> > +
> > +   /* We only want to trigger KFD eviction fences on
> > +* evict or move jobs. Skip KFD fences otherwise.
> > +*/
> > +   if (fence_owner == AMDGPU_FENCE_OWNER_KFD &&
> > +   owner != AMDGPU_FENCE_OWNER_UNDEFINED)
> > +   return false;
> > +
> > +   /* Never sync to VM updates either. */
> > +   if (fence_owner == AMDGPU_FENCE_OWNER_VM &&
> > +   owner != AMDGPU_FENCE_OWNER_UNDEFINED)
> > +   return false;
> > +
> > +   /* Ignore fences depending on the sync mode */
> > +   switch (mode) {
> > +   case AMDGPU_SYNC_ALWAYS:
> > +   return true;
> > +
> > +   case AMDGPU_SYNC_NE_OWNER:
> > +   if (amdgpu_sync_same_dev(adev, f) &&
> > +   fence_owner == owner)
> > +   return false;
> > +   break;
> > +
> > +   case AMDGPU_SYNC_EQ_OWNER:
> > +   if (amdgpu_sync_same_dev(adev, f) &&
> > +   fence_owner != owner)
> > +   return false;
> > +   break;
> > +
> > +   case AMDGPU_SYNC_EXPLICIT:
> > +   return false;
> > +   }
> > +
> > +   WARN(debug_evictions && fence_owner == AMDGPU_FENCE_OWNER_KFD,
> > +"Adding eviction fence to sync obj");
> > +   return true;
> > +}
> > +
> >   /**
> >* amdgpu_sync_resv - sync to a reservation object
> >*
> > @@ -211,67 +262,34 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, 
> > struct amdgpu_sync *sync,
> > /* always sync to the exclusive fence */
> > f = dma_resv_excl_fence(resv);
> > -   r = amdgpu_sync_fence(sync, f);
> > +   dma_fence_chain_for_each(f, f) {
> > +   struct dma_fence_chain *chain = to_dma_fence_chain(f);
> > +
> > +   if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
> > +  chain->fence : f)) {
> > +   r = amdgpu_sync_fence(sync, f);
> > +   dma_fence_put(f);
> > +   if (r)
> > +   return r;
> > +   break;
> > +   }
> > +   }
> > flist = dma_resv_shared_list(resv);
> > -   if (!flist || r)
> > -   return r;
> > +   if (!flist)
> > +   return 0;
> > for (i = 0; i < flist->shared_count; ++i) {
> > -   void *fence_owner;
> > -
> > f = rcu_dereference_protected(flist->shared[i],
> >   dma_resv_held(resv));
> > -   fence_owner = amdgpu_sync_get_owner(f);
> > -
> > -   /* Always sync to moves, no matter what */
> > -   if (fence_owner == AMDGPU_FENCE_OWNER_UNDEFINED) {
> > +   if (amdgpu_sync_test_fence(adev, mode, owner, f)) {
> > r = amdgpu_sync_fence(sync, f);
> > if (r)
> > -   break;
> > -   }
> > -
> > -   /* We only want to trigger KFD eviction fences on
> > -* evict or move jobs. Skip KFD fences otherwise.
> > -*/
> > -   if (fence_owner == AMDGPU_FENCE_OWNER_KFD &&
> > -   owner != AMDGPU_FENCE_OWNER_UNDEFINED)
> > -   continue;
> > -
> > -   /* Never sync to VM updates either. */
> > -   if (fence_owner == AMDGPU_FENCE_OWNER_VM &&
> > -   owner != AMDGPU_FENCE_OWNER_UNDEFI
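For reference, the control flow of the new dma_fence_chain_for_each() loop quoted above — walk the chain, test each contained fence against the owner rules, and sync to the first one that fails the test — can be modeled in a few lines of plain C. The types and names below are hypothetical stand-ins, not the kernel API:

```c
#include <stddef.h>

/* Hypothetical stand-in for a dma_fence_chain link. */
struct fence_link {
	struct fence_link *next;	/* next link in the chain, NULL at end */
	int owner;			/* creator of this fence */
};

/* Model of the unwrap-and-test pattern from the patch: walk the
 * chain and return the first fence whose owner differs from ours,
 * i.e. the first fence we would have to sync to.  Returns NULL if
 * every link belongs to us and nothing needs syncing. */
static struct fence_link *first_foreign_fence(struct fence_link *chain,
					      int owner)
{
	struct fence_link *f;

	for (f = chain; f; f = f->next)
		if (f->owner != owner)
			return f;
	return NULL;
}
```

The real amdgpu_sync_test_fence() applies richer rules (KFD eviction fences, VM updates, the sync mode), but the chain walk terminating at the first match is the same shape.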

Re: [PATCH] dma-buf: fix and rework dma_buf_poll

2021-06-17 Thread Daniel Vetter
On Tue, Jun 15, 2021 at 01:21:17PM +0200, Christian König wrote:
> Daniel pointed me towards this function and there are multiple obvious 
> problems
> in the implementation.
> 
> First of all, the retry loop is not working as intended. In general the retry
> only makes sense if you grab the reference first and then check the retry.
> Then we skipped checking the exclusive fence when shared fences were present.
> And last, the whole implementation was unnecessarily complex and rather hard
> to understand, which could lead to unexpected behavior of the IOCTL.
> 
> Fix all this by reworking the implementation from scratch.

Can't we split this a bit?

The other thing I'm wondering, instead of open-coding this and breaking
our heads trying to make sure we got it right. Can't we reuse
dma_resv_get_fences? That's what a lot of drivers use already to get a
consistent copy of the fence set without holding the lock.

I think then the actual semantics, i.e. do we need to include the
exclusive fence or not, stick out more.
-Daniel
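For the record, the semantics in question — EPOLLIN only needs the last writer (the exclusive fence) to be done, while EPOLLOUT needs the writer plus all readers (the shared fences) to be done — can be stated as a small model. This is a sketch of the intended behavior, not the kernel implementation; all names are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of a reservation object's fence state. */
struct resv_snapshot {
	bool excl_signaled;		/* exclusive (write) fence done? */
	const bool *shared_signaled;	/* per shared (read) fence: done? */
	size_t shared_count;
};

/* EPOLLIN (read readiness): only the last writer must have finished.
 * EPOLLOUT (write readiness): the writer and all readers must have
 * finished before the buffer may be written again. */
static bool poll_ready(const struct resv_snapshot *r, bool for_write)
{
	size_t i;

	if (!r->excl_signaled)
		return false;
	if (for_write)
		for (i = 0; i < r->shared_count; i++)
			if (!r->shared_signaled[i])
				return false;
	return true;
}
```

Whatever mechanism produces the snapshot (seqcount retry loop or a helper like dma_resv_get_fences), the readiness rule itself should reduce to something this simple.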

> 
> Only mildly tested and needs a thoughtful review of the code.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/dma-buf/dma-buf.c | 132 +++---
>  include/linux/dma-buf.h   |   2 +-
>  2 files changed, 54 insertions(+), 80 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 511fe0d217a0..1bd00e18291f 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -72,7 +72,7 @@ static void dma_buf_release(struct dentry *dentry)
>* If you hit this BUG() it means someone dropped their ref to the
>* dma-buf while still having pending operation to the buffer.
>*/
> - BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
> + BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
>  
>   dmabuf->ops->release(dmabuf);
>  
> @@ -206,12 +206,15 @@ static void dma_buf_poll_cb(struct dma_fence *fence, 
> struct dma_fence_cb *cb)
>  
>  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>  {
> + struct dma_buf_poll_cb_t *dcb;
>   struct dma_buf *dmabuf;
>   struct dma_resv *resv;
>   struct dma_resv_list *fobj;
>   struct dma_fence *fence_excl;
> - __poll_t events;
>   unsigned shared_count, seq;
> + struct dma_fence *fence;
> + __poll_t events;
> + int r, i;
>  
>   dmabuf = file->private_data;
>   if (!dmabuf || !dmabuf->resv)
> @@ -225,99 +228,70 @@ static __poll_t dma_buf_poll(struct file *file, 
> poll_table *poll)
>   if (!events)
>   return 0;
>  
> + dcb = events & EPOLLOUT ? &dmabuf->cb_out : &dmabuf->cb_in;
> +
> + /* Only queue a new one if we are not still waiting for the old one */
> + spin_lock_irq(&dmabuf->poll.lock);
> + if (dcb->active)
> + events = 0;
> + else
> + dcb->active = events;
> + spin_unlock_irq(&dmabuf->poll.lock);
> + if (!events)
> + return 0;
> +
>  retry:
>   seq = read_seqcount_begin(&resv->seq);
>   rcu_read_lock();
>  
>   fobj = rcu_dereference(resv->fence);
> - if (fobj)
> + if (fobj && events & EPOLLOUT)
>   shared_count = fobj->shared_count;
>   else
>   shared_count = 0;
> - fence_excl = dma_resv_excl_fence(resv);
> - if (read_seqcount_retry(&resv->seq, seq)) {
> - rcu_read_unlock();
> - goto retry;
> - }
>  
> - if (fence_excl && (!(events & EPOLLOUT) || shared_count == 0)) {
> - struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
> - __poll_t pevents = EPOLLIN;
> -
> - if (shared_count == 0)
> - pevents |= EPOLLOUT;
> -
> - spin_lock_irq(&dmabuf->poll.lock);
> - if (dcb->active) {
> - dcb->active |= pevents;
> - events &= ~pevents;
> - } else
> - dcb->active = pevents;
> - spin_unlock_irq(&dmabuf->poll.lock);
> -
> - if (events & pevents) {
> - if (!dma_fence_get_rcu(fence_excl)) {
> - /* force a recheck */
> - events &= ~pevents;
> - dma_buf_poll_cb(NULL, &dcb->cb);
> - } else if (!dma_fence_add_callback(fence_excl, &dcb->cb,
> -dma_buf_poll_cb)) {
> - events &= ~pevents;
> - dma_fence_put(fence_excl);
> - } else {
> - /*
> -  * No callback queued, wake up any additional
> -  * waiters.
> -  */
> - dma_fence_put(fence_excl);
> - dma_buf_poll_cb(NULL, &dcb->cb);
> - }
> + for (i

Re: [Intel-gfx] [RFC PATCH 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-06-17 Thread Matthew Brost
On Thu, Jun 17, 2021 at 06:46:48PM +0200, Daniel Vetter wrote:
> Sorry I'm behind on mails  ...
> 

Aren't we all.

> On Fri, Jun 11, 2021 at 12:50:29PM -0700, Matthew Brost wrote:
> > On Fri, Jun 04, 2021 at 07:59:05PM +0200, Daniel Vetter wrote:
> > > On Wed, May 26, 2021 at 04:33:57PM -0700, Matthew Brost wrote:
> > > > Add entry for i915 new parallel submission uAPI plan.
> > > > 
> > > > v2:
> > > >  (Daniel Vetter):
> > > >   - Expand logical order explanation
> > > >   - Add dummy header
> > > >   - Only allow N BBs in execbuf IOCTL
> > > >   - Configure parallel submission per slot not per gem context
> > > > v3:
> > > >  (Marcin Ślusarz):
> > > >   - Lot's of typos / bad english fixed
> > > >  (Tvrtko Ursulin):
> > > >   - Consistent pseudo code, clean up wording in descriptions
> > > > 
> > > > Cc: Tvrtko Ursulin 
> > > > Cc: Tony Ye 
> > > > CC: Carl Zhang 
> > > > Cc: Daniel Vetter 
> > > > Cc: Jason Ekstrand 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >  Documentation/gpu/rfc/i915_parallel_execbuf.h | 145 ++
> > > >  Documentation/gpu/rfc/i915_scheduler.rst  |  55 ++-
> > > >  2 files changed, 198 insertions(+), 2 deletions(-)
> > > >  create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > 
> > > > diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
> > > > b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > new file mode 100644
> > > > index ..20de206e3ab4
> > > > --- /dev/null
> > > > +++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > > > @@ -0,0 +1,145 @@
> > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > > > i915_context_engines_parallel_submit */
> > > > +
> > > > +/*
> > > > + * i915_context_engines_parallel_submit:
> > > 
> > > So the idea is to make these kerneldoc and pull them into the rfc section.
> > > Then when we merge, move them to the real uapi section, like what Matt has
> > > done for lmem.
> > > 
> > 
> > Yep, will fix in next rev.
> > 
> > > > + *
> > > > + * Setup a slot in the context engine map to allow multiple BBs to be 
> > > > submitted
> > > > + * in a single execbuf IOCTL. Those BBs will then be scheduled to run 
> > > > on the GPU
> > > > + * in parallel. Multiple hardware contexts are created internally in 
> > > > the i915 to
> > > > + * run these BBs. Once a slot is configured for N BBs only N BBs can be
> > > > + * submitted in each execbuf IOCTL and this is implicit behavior e.g. 
> > > > The user
> > > > + * doesn't tell the execbuf IOCTL there are N BBs, the execbuf IOCTL 
> > > > knows how
> > > > + * many BBs there are based on the slots configuration. The N BBs are 
> > > > the last N
> > > > + * buffer objects for first N if I915_EXEC_BATCH_FIRST is set.
> > > 
> > > s/for/or/
> > > 
> > > > + *
> > > > + * There are two currently defined ways to control the placement of the
> > > > + * hardware contexts on physical engines: default behavior (no flags) 
> > > > and
> > > > + * I915_PARALLEL_IMPLICIT_BONDS (a flag). More flags may be added in the
> > > > + * future as new hardware / use cases arise. Details of how to use this
> > > > + * interface are above the flags field in this structure.
> > > > + *
> > > > + * Returns -EINVAL if hardware context placement configuration is 
> > > > invalid or if
> > > > + * the placement configuration isn't supported on the platform / 
> > > > submission
> > > > + * interface.
> > > > + * Returns -ENODEV if extension isn't supported on the platform / 
> > > > submission
> > > > + * interface.
> > > > + */
> > > > +struct i915_context_engines_parallel_submit {
> > > > +   struct i915_user_extension base;
> > > > +
> > > > +   __u16 engine_index; /* slot for parallel engine */
> > > 
> > > Kernel doc here for the inline comments too.
> > >
> > 
> > Yep.
> >  
> > > > +   __u16 width;/* number of contexts per parallel 
> > > > engine */
> > > > +   __u16 num_siblings; /* number of siblings per context */
> > > > +   __u16 mbz16;
> > > > +/*
> > > > + * Default placement behavior (currently unsupported):
> > > > + *
> > > > + * Allow BBs to be placed on any available engine instance. In this 
> > > > case each
> > > > + * context's engine mask indicates where that context can be placed. 
> > > > It is
> > > > + * implied in this mode that all contexts have mutual exclusive 
> > > > placement.
> > > > + * e.g. If one context is running CSX[0] no other contexts can run on 
> > > > CSX[0]).
> > > > + *
> > > > + * Example 1 pseudo code:
> > > > + * CSX,Y[N] = generic engine class X or Y, logical instance N
> > > > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> > > > + * set_engines(INVALID)
> > > > + * set_parallel(engine_index=0, width=2, num_siblings=2,
> > > > + * engines=CSX[0],CSX[1],CSY[0],CSY[1])
> > > > + *
> > > > + * Results in the following valid placements:
> > > > + * CSX[0], CSY[0]
> > > > + * CSX[0], CSY[1]
> > > > + * CSX[
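The placement list in the quoted example is just the cartesian product of the two contexts' sibling sets. A minimal sketch of that enumeration, with hypothetical engine names and the example's width = 2, num_siblings = 2:

```c
#include <string.h>	/* for the checks in the usage example */

#define WIDTH	 2	/* contexts per parallel engine */
#define SIBLINGS 2	/* candidate engines per context */

/* Enumerate every valid placement for the quoted set_parallel()
 * example: one sibling is picked per context, independently, so
 * the placements are the cartesian product of the sibling lists.
 * Writes up to 'max' placements into 'out' and returns the count. */
static int enum_placements(const char *const eng[WIDTH][SIBLINGS],
			   const char *out[][WIDTH], int max)
{
	int i, j, n = 0;

	for (i = 0; i < SIBLINGS; i++)
		for (j = 0; j < SIBLINGS; j++) {
			if (n == max)
				return n;
			out[n][0] = eng[0][i];	/* engine for context 0 */
			out[n][1] = eng[1][j];	/* engine for context 1 */
			n++;
		}
	return n;
}
```

Feeding it {CSX[0], CSX[1]} and {CSY[0], CSY[1]} yields the four pairs the documentation lists: CSX[0]+CSY[0], CSX[0]+CSY[1], CSX[1]+CSY[0], CSX[1]+CSY[1].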

Re: [PATCH] drm/i915: Remove duplicate include of intel_region_lmem.h

2021-06-17 Thread Daniel Vetter
On Tue, Jun 15, 2021 at 07:35:20PM +0800, Wan Jiabing wrote:
> Fix the following checkinclude.pl warning:
> drivers/gpu/drm/i915/gt/intel_region_lmem.c
> 8 #include "intel_region_lmem.h"
>  12   #include "intel_region_lmem.h"
> 
> Signed-off-by: Wan Jiabing 

Applied to drm-intel-gt-next, thanks for your patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c 
> b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> index f7366b054f8e..119eeec98837 100644
> --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> @@ -9,7 +9,6 @@
>  #include "intel_region_ttm.h"
>  #include "gem/i915_gem_lmem.h"
>  #include "gem/i915_gem_region.h"
> -#include "intel_region_lmem.h"
>  
>  static int init_fake_lmem_bar(struct intel_memory_region *mem)
>  {
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: vc4: hdmi: audio: ASoC: error at snd_soc_dai_startup on fef00700.hdmi

2021-06-17 Thread Stefan Wahren
Hi Maxime,

Am 17.06.21 um 17:25 schrieb Maxime Ripard:
> Hi Stefan,
>
> On Sat, Jun 12, 2021 at 12:04:08PM +0200, Stefan Wahren wrote:
>> Hi Maxime,
>>
>> Am 04.06.21 um 11:02 schrieb Maxime Ripard:
>>> Hi Stefan,
>>>
>>> I would assume it's due to this:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/vc4/vc4_hdmi.c#n1083
>>>
>>> It pre-dates my time working on the vc4 driver so I'm not really sure
>>> what this is supposed to prevent, but my guess is that it's there to
>>> avoid someone using the audio card before we have a display detected and
>>> connected, and its capabilities known (the first and more obvious one
>>> being does it support audio in the first place).
>>>
>>> It's nothing new though, maybe it's the error printing itself that is?
>> I'm sorry, I forgot about this discussion here:
>>
>> https://lists.freedesktop.org/archives/dri-devel/2020-December/292701.html
> It looks like there's no discussion on that link, is it the link you wanted 
> to paste?

It was the right patch, but the discussion got lost around the turn of
the year. Next try:

https://www.spinics.net/lists/dri-devel/msg284535.html

>
> Maxime


Re: [PATCH -next] apply: use DEFINE_SPINLOCK() instead of spin_lock_init().

2021-06-17 Thread Daniel Vetter
On Tue, Jun 15, 2021 at 07:17:13PM -0800, Yu Jiahua wrote:
> From: Jiahua Yu 
> 
> spinlock can be initialized automatically with DEFINE_SPINLOCK()
> rather than explicitly calling spin_lock_init().
> 
> Signed-off-by: Jiahua Yu 

Stuffed into drm-misc-next. The subject looked a bit strange, so I fixed
that up.
-Daniel

> ---
>  drivers/video/fbdev/omap2/omapfb/dss/apply.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/video/fbdev/omap2/omapfb/dss/apply.c 
> b/drivers/video/fbdev/omap2/omapfb/dss/apply.c
> index c71021091828..acca991c7540 100644
> --- a/drivers/video/fbdev/omap2/omapfb/dss/apply.c
> +++ b/drivers/video/fbdev/omap2/omapfb/dss/apply.c
> @@ -108,7 +108,7 @@ static struct {
>  } dss_data;
>  
>  /* protects dss_data */
> -static spinlock_t data_lock;
> +static DEFINE_SPINLOCK(data_lock);
>  /* lock for blocking functions */
>  static DEFINE_MUTEX(apply_lock);
>  static DECLARE_COMPLETION(extra_updated_completion);
> @@ -131,8 +131,6 @@ static void apply_init_priv(void)
>   struct mgr_priv_data *mp;
>   int i;
>  
> - spin_lock_init(&data_lock);
> -
>   for (i = 0; i < num_ovls; ++i) {
>   struct ovl_priv_data *op;
>  
> -- 
> 2.17.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
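The patch merged above trades a runtime spin_lock_init() call for a compile-time DEFINE_SPINLOCK() initializer. The same idiom, sketched in plain C with a hypothetical lock type (not the kernel's spinlock_t):

```c
/* Hypothetical lock type, standing in for spinlock_t. */
struct xlock {
	int locked;
	int initialized;
};

/* Static initialization, analogous to DEFINE_SPINLOCK(name): the
 * lock is valid from program start, so there is no init call that
 * could be forgotten, duplicated, or raced against by early users. */
#define DEFINE_XLOCK(name) struct xlock name = { 0, 1 }

/* Runtime initialization, analogous to spin_lock_init(). */
static void xlock_init(struct xlock *l)
{
	l->locked = 0;
	l->initialized = 1;
}

/* Usable before any init function has had a chance to run. */
static DEFINE_XLOCK(data_lock);
```

That is exactly the hazard the original code had: data_lock was declared as a bare spinlock_t and only became valid once apply_init_priv() ran.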


Re: [PATCH] drm/i915/gt: Fix duplicate included intel_region_lmem.h

2021-06-17 Thread Daniel Vetter
On Wed, Jun 16, 2021 at 02:01:58PM +0800, Jiapeng Chong wrote:
> Clean up the following includecheck warning:
> 
> ./drivers/gpu/drm/i915/gt/intel_region_lmem.c: intel_region_lmem.h is
> included more than once.
> 
> No functional change.
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 

Already merged another one of these:

commit 6796c772850574ec0a9adc977e9889606b23d0f4 (HEAD -> drm-intel-gt-next, 
drm-intel/drm-intel-gt-next)
Author: Wan Jiabing 
Date:   Tue Jun 15 19:35:20 2021 +0800

drm/i915: Remove duplicate include of intel_region_lmem.h

Thanks anyway.

Cheers, Daniel

> ---
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c 
> b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> index f7366b0..aa3cfca 100644
> --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> @@ -5,7 +5,6 @@
>  
>  #include "i915_drv.h"
>  #include "intel_memory_region.h"
> -#include "intel_region_lmem.h"
>  #include "intel_region_ttm.h"
>  #include "gem/i915_gem_lmem.h"
>  #include "gem/i915_gem_region.h"
> -- 
> 1.8.3.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: vc4_bo_create: Failed to allocate from CMA

2021-06-17 Thread Stefan Wahren
Hi Nicolas,

Am 17.06.21 um 11:36 schrieb nicolas saenz julienne:
> On Sat, 2021-06-12 at 17:17 +0200, Stefan Wahren wrote:
>> Hi,
>>
>> while testing the mainline kernel (arm64, defconfig) on Raspberry Pi 3 B
>> Plus with Raspberry Pi OS - 64 bit, sometimes X doesn't start into
>> desktop properly (unexpected and unusable login screen instead of auto
>> login, or the mouse pointer is shown shortly and then switches back to a black
>> screen in a loop). In that case dmesg shows the following:
>>
>> [   74.737106] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from CMA:
>> [   74.737558] vc4-drm soc:gpu: [drm]             V3D:  28976kb BOs (10)
>> [   74.737602] vc4-drm soc:gpu: [drm]      V3D shader:     44kb BOs (11)
>> [   74.737632] vc4-drm soc:gpu: [drm]            dumb:   4564kb BOs (5)
>> [   74.737664] vc4-drm soc:gpu: [drm]          binner:  16384kb BOs (1)
>> [   74.737697] vc4-drm soc:gpu: [drm] total purged BO:      4kb BOs (1)
>> [   74.739039] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from CMA:
>> [   74.739466] vc4-drm soc:gpu: [drm]             V3D:  28972kb BOs (9)
>> [   74.739512] vc4-drm soc:gpu: [drm]      V3D shader:     44kb BOs (11)
>> [   74.739541] vc4-drm soc:gpu: [drm]            dumb:   4564kb BOs (5)
>> [   74.739570] vc4-drm soc:gpu: [drm]          binner:  16384kb BOs (1)
>> [   74.739602] vc4-drm soc:gpu: [drm] total purged BO:      4kb BOs (1)
>> [   74.740718] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from CMA:
>> [   74.741138] vc4-drm soc:gpu: [drm]             V3D:  28972kb BOs (9)
>> [   74.741171] vc4-drm soc:gpu: [drm]      V3D shader:     44kb BOs (11)
>> [   74.741202] vc4-drm soc:gpu: [drm]            dumb:   4564kb BOs (5)
>> [   74.741231] vc4-drm soc:gpu: [drm]          binner:  16384kb BOs (1)
>> [   74.741263] vc4-drm soc:gpu: [drm] total purged BO:      4kb BOs (1)
>> ...
>>
>> I have only seen this issue on arm64 with latest mainline kernel
>> (5.13.0-rc5-00130-gf21b807c3cf8), but also with older kernel versions.
>> So it's not a regression. It seems 64 bit needs more CMA.
>>
>> In case X started properly, I was also able to reproduce these errors
>> above by dis- and reconnecting HDMI.
>>
>> So I increased CMA in bcm283x.dtsi and the problem disappeared:
>>
>> diff --git a/arch/arm/boot/dts/bcm283x.dtsi b/arch/arm/boot/dts/bcm283x.dtsi
>> index b83a864..d1304cb 100644
>> --- a/arch/arm/boot/dts/bcm283x.dtsi
>> +++ b/arch/arm/boot/dts/bcm283x.dtsi
>> @@ -37,7 +37,7 @@
>>  
>>  	cma: linux,cma {
>>  		compatible = "shared-dma-pool";
>> -		size = <0x4000000>; /* 64MB */
>> +		size = <0x6000000>; /* 96MB */
>>  		reusable;
>>  		linux,cma-default;
>>  	};
>>
>> The questions are:
>>
>> Is this the right way (tm) to fix this problem?
> Frankly I don't know if there is a better way. IIRC opensuse and downstream 
> use
> DT overlays to cater for this limitation. It seems reasonable to bump the
> value. But it'll be to the detriment of users that don't care much for graphical
> interfaces. Nonetheless, I'm not familiar with how DRM handles CMA/DMA memory.
> So let me have a look at it. Maybe there is a SW fix. At first glance I'm
> surprised they can't defer to normal page allocations when CMA isn't capable 
> of
> honoring the request (like the dma code does).

a compromise might be to increase the CMA size based on the SoC type
(newer generations have more memory)

BCM2835 => 64 MB
BCM2836, BCM2837 => 256 MB
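Since the dtsi `size` cell is a raw byte count written in hex, the candidate values are easy to double-check (MB here meaning MiB):

```c
/* A DT "size" cell is a plain byte count; 1 MB (MiB) = 1 << 20 bytes. */
static unsigned long cma_mb(unsigned long mb)
{
	return mb << 20;
}
```

e.g. 64 MB is 0x4000000, 96 MB is 0x6000000, and the downstream default of 256 MB is 0x10000000.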

>
>> And what is a sensible value (don't have a 4K display to test)?
> The default for downstream is 256MB. But I've read discussions in the forum
> where people needed even more. IIUC it's use-case dependent: resolution is
> only one variable; you might then try to run a game and run out of memory there.

Sure, this wasn't intended to make everybody happy. But I would expect X
to start reliably at least.

Regards
Stefan

>
> Regards,
> Nicolas
>

