Re: [PATCH] drm/bridge: Ignore -EPROBE_DEFER when bridge attach fails

2021-10-13 Thread Guido Günther
Hi,
On Wed, Oct 13, 2021 at 08:48:32AM +0200, Andrzej Hajda wrote:
> On 12.10.2021 22:47, Guido Günther wrote:
> > Hi Laurent,
> > On Tue, Oct 12, 2021 at 11:17:07PM +0300, Laurent Pinchart wrote:
> > > Hi Guido,
> > > 
> > > Thank you for the patch.
> > > 
> > > On Tue, Oct 12, 2021 at 09:58:58PM +0200, Guido Günther wrote:
> > > > Otherwise logs are filled with
> > > > 
> > > >    [drm:drm_bridge_attach] *ERROR* failed to attach bridge /soc@0/bus@3080/mipi-dsi@30a0 to encoder None-34: -517
> > > > 
> > > > when the bridge isn't ready yet.
> > > > 
> > > > Fixes: fb8d617f8fd6 ("drm/bridge: Centralize error message when bridge attach fails")
> > > > Signed-off-by: Guido Günther 
> > > > ---
> > > >   drivers/gpu/drm/drm_bridge.c | 11 ++-
> > > >   1 file changed, 6 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
> > > > index a8ed66751c2d..f0508e85ae98 100644
> > > > --- a/drivers/gpu/drm/drm_bridge.c
> > > > +++ b/drivers/gpu/drm/drm_bridge.c
> > > > @@ -227,14 +227,15 @@ int drm_bridge_attach(struct drm_encoder *encoder, struct drm_bridge *bridge,
> > > > bridge->encoder = NULL;
> > > > list_del(&bridge->chain_node);
> > > > +   if (ret != -EPROBE_DEFER) {
> > > >   #ifdef CONFIG_OF
> > > > -   DRM_ERROR("failed to attach bridge %pOF to encoder %s: %d\n",
> > > > - bridge->of_node, encoder->name, ret);
> > > > +   DRM_ERROR("failed to attach bridge %pOF to encoder %s: %d\n",
> > > > + bridge->of_node, encoder->name, ret);
> > > >   #else
> > > > -   DRM_ERROR("failed to attach bridge to encoder %s: %d\n",
> > > > - encoder->name, ret);
> > > > +   DRM_ERROR("failed to attach bridge to encoder %s: %d\n",
> > > > + encoder->name, ret);
> > > >   #endif
> > > > -
> > > > +   }
> > > 
> > > This looks fine as such, but I'm concerned about the direction it's
> > > taking. Ideally, probe deferral should happen at probe time, way before
> > > the bridge is attached. Doing otherwise is a step in the wrong direction
> > > in my opinion, and something we'll end up regretting when we feel the
> > > pain it inflicts.
> > 
> > The particular case where I'm seeing this is the nwl driver, which
> > probe-defers if the panel bridge isn't ready (the panel needs a bunch of
> > components: dsi, panel, backlight-wrapped LED, ...), and it probes fine
> > later on. So I wonder where you see the actual error cause: that
> > downstream of the bridge isn't ready, or that the display controller is
> > already attaching the bridge?
> 
> So there is something wrong there: nwl should not publish the bridge
> interface until it gathers its resources (the panel in this case).

That helps, I'll look at that. Thanks!
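
Presumably the fix is along these lines (a rough sketch with made-up
names, not the actual nwl code): resolve the downstream panel/bridge in
probe, where a -EPROBE_DEFER is handled silently by the driver core,
and only publish the bridge afterwards:

  struct example_dsi {
          struct drm_bridge bridge;
          struct drm_bridge *next_bridge;
  };

  static int example_dsi_probe(struct platform_device *pdev)
  {
          struct example_dsi *ctx;
          struct drm_panel *panel;
          int ret;

          ctx = devm_kzalloc(&pdev->dev, sizeof(*ctx), GFP_KERNEL);
          if (!ctx)
                  return -ENOMEM;

          /* Gather downstream resources first; deferring here never
           * reaches drm_bridge_attach(). */
          ret = drm_of_find_panel_or_bridge(pdev->dev.of_node, 1, 0,
                                            &panel, &ctx->next_bridge);
          if (ret)
                  return ret;

          if (panel) {
                  ctx->next_bridge = drm_panel_bridge_add(panel);
                  if (IS_ERR(ctx->next_bridge))
                          return PTR_ERR(ctx->next_bridge);
          }

          /* Only now is the bridge interface published. */
          drm_bridge_add(&ctx->bridge);
          return 0;
  }
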
 -- Guido

> 
> Regards
> Andrzej
> 
> 
> > 
> > Cheers,
> >   -- Guido
> > 
> > > 
> > > > return ret;
> > > >   }
> > > >   EXPORT_SYMBOL(drm_bridge_attach);
> > > 
> > > -- 
> > > Regards,
> > > 
> > > Laurent Pinchart
> > > 
> 


Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline

2021-10-13 Thread Pekka Paalanen
On Tue, 12 Oct 2021 19:11:29 +
"Shankar, Uma"  wrote:

> > -Original Message-
> > From: Pekka Paalanen 
> > Sent: Tuesday, October 12, 2021 5:30 PM
> > To: Simon Ser 
> > Cc: Shankar, Uma ; intel-...@lists.freedesktop.org; 
> > dri-
> > de...@lists.freedesktop.org; harry.wentl...@amd.com;
> > ville.syrj...@linux.intel.com; brian.star...@arm.com;
> > sebast...@sebastianwick.net; shashank.sha...@amd.com
> > Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline
> > 
> > On Tue, 12 Oct 2021 10:35:37 +
> > Simon Ser  wrote:
> >   
> > > On Tuesday, October 12th, 2021 at 12:30, Pekka Paalanen  
> >  wrote:  
> > >  
> > > > is there a practise of landing proposal documents in the kernel? How
> > > > does that work, will a kernel tree carry the patch files?
> > > > Or should this document be worded like documentation for an accepted
> > > > feature, and then the patches either land or don't?  
> > >
> > > Once everyone agrees, the RFC can land. I don't think a kernel tree is
> > > necessary. See:
> > >
> > > https://dri.freedesktop.org/docs/drm/gpu/rfc/index.html  
> > 
> > Does this mean the RFC doc patch will land, but the code patches will 
> > remain in the
> > review cycles waiting for userspace proving vehicles?
> > Rather than e.g. committed as files that people would need to apply 
> > themselves? Or
> > how does one find the code patches corresponding to RFC docs?  
> 
> As I understand it, this section was added to finalize the design and to
> debate the UAPI: structures, headers, design, etc. Once a general agreement
> is in place with all the stakeholders, we can have an ack on the design and
> approach and get it merged. This hence serves as an approved reference for
> the UAPI, accepted and agreed by the community at large.
> 
> Once the code lands, all the documentation will be added to the right driver 
> sections and
> helpers, like it's been done currently.

I'm just wondering: someone browses a kernel tree, and discovers this
RFC doc in there. They want to see or test the latest (WIP) kernel
implementation of it. How will they find the code / patches?


Thanks,
pq




[PATCH v3] lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()

2021-10-13 Thread Vlastimil Babka
Currently, enabling CONFIG_STACKDEPOT means its stack_table will be allocated
from memblock, even if stack depot ends up not actually used. The default size
of stack_table is 4MB on 32-bit, 8MB on 64-bit.

This is fine for use-cases such as KASAN which is also a config option and
has overhead on its own. But it's an issue for functionality that has to be
actually enabled on boot (page_owner) or depends on hardware (GPU drivers)
and thus the memory might be wasted. This was raised as an issue [1] when
attempting to add stackdepot support for SLUB's debug object tracking
functionality. It's common to build kernels with CONFIG_SLUB_DEBUG and enable
slub_debug on boot only when needed, or create only specific kmem caches with
debugging for testing purposes.

It would thus be more efficient if stackdepot's table was allocated only when
actually going to be used. This patch thus makes the allocation (and whole
stack_depot_init() call) optional:

- Add a CONFIG_STACKDEPOT_ALWAYS_INIT flag to keep using the current
  well-defined point of allocation as part of mem_init(). Make CONFIG_KASAN
  select this flag.
- Other users have to call stack_depot_init() as part of their own init when
  it's determined that stack depot will actually be used. This may depend on
  both config and runtime conditions (a sketch of this calling pattern
  follows below the list). Convert current users, which are page_owner
  and several in the DRM subsystem. The same will be done for SLUB
  later.
- Because the init might now be called after the boot-time memblock allocation
  has given all memory to the buddy allocator, change stack_depot_init() to
  allocate stack_table with kvmalloc() when memblock is no longer available.
  Also handle allocation failure by disabling stackdepot (could have
  theoretically happened even with memblock allocation previously), and don't
  unnecessarily align the memblock allocation to its own size anymore.
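
Such a user then does roughly the following (an illustrative sketch with
hypothetical names, not part of the diff below):

  /* hypothetical subsystem that only sometimes needs stack depot */
  static bool my_debug_enabled;   /* e.g. set from a boot parameter */

  static int __init my_feature_init(void)
  {
          if (my_debug_enabled)
                  stack_depot_init(); /* allocates stack_table on demand */
          return 0;
  }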

[1] 
https://lore.kernel.org/all/CAMuHMdW=eovzm1re5fvoen87nkfilmm2+ah7enu2kxehcvb...@mail.gmail.com/

Signed-off-by: Vlastimil Babka 
Acked-by: Dmitry Vyukov 
Reviewed-by: Marco Elver  # stackdepot
Cc: Marco Elver 
Cc: Vijayanand Jitta 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Andrey Ryabinin 
Cc: Alexander Potapenko 
Cc: Andrey Konovalov 
Cc: Dmitry Vyukov 
Cc: Geert Uytterhoeven 
Cc: Oliver Glitta 
Cc: Imran Khan 
---
Changes in v3:
- stack_depot_init_mutex made static and moved inside stack_depot_init()
  Reported-by: kernel test robot 
- use !stack_table condition instead of stack_table == NULL
  reported by checkpatch on freedesktop.org patchwork
 drivers/gpu/drm/drm_dp_mst_topology.c   |  1 +
 drivers/gpu/drm/drm_mm.c|  4 +++
 drivers/gpu/drm/i915/intel_runtime_pm.c |  3 +++
 include/linux/stackdepot.h  | 25 ---
 init/main.c |  2 +-
 lib/Kconfig |  4 +++
 lib/Kconfig.kasan   |  2 +-
 lib/stackdepot.c| 33 +
 mm/page_owner.c |  2 ++
 9 files changed, 60 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
b/drivers/gpu/drm/drm_dp_mst_topology.c
index 86d13d6bc463..b0ebdc843a00 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -5493,6 +5493,7 @@ int drm_dp_mst_topology_mgr_init(struct drm_dp_mst_topology_mgr *mgr,
mutex_init(&mgr->probe_lock);
 #if IS_ENABLED(CONFIG_DRM_DEBUG_DP_MST_TOPOLOGY_REFS)
mutex_init(&mgr->topology_ref_history_lock);
+   stack_depot_init();
 #endif
INIT_LIST_HEAD(&mgr->tx_msg_downq);
INIT_LIST_HEAD(&mgr->destroy_port_list);
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 93d48a6f04ab..5916228ea0c9 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -983,6 +983,10 @@ void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)
add_hole(&mm->head_node);
 
mm->scan_active = 0;
+
+#ifdef CONFIG_DRM_DEBUG_MM
+   stack_depot_init();
+#endif
 }
 EXPORT_SYMBOL(drm_mm_init);
 
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
b/drivers/gpu/drm/i915/intel_runtime_pm.c
index eaf7688f517d..d083506986e1 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -78,6 +78,9 @@ static void __print_depot_stack(depot_stack_handle_t stack,
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
spin_lock_init(&rpm->debug.lock);
+
+   if (rpm->available)
+   stack_depot_init();
 }
 
 static noinline depot_stack_handle_t
diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 6bb4bc1a5f54..40fc5e92194f 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -13,6 +13,22 @@
 
 typedef u32 depot_stack_handle_t;
 
+/*
+ * Every user of stack depot has to call this during its own init when it's
+ * decided that stack depot will actually be used. This may depend on both
+ * config and runtime conditions.
+ */

Re: [PATCH v2 3/4] dt-bindings: drm/bridge: ti-sn65dsi83: Add vcc supply bindings

2021-10-13 Thread Maxime Ripard
Hi,

On Tue, Oct 12, 2021 at 08:48:42AM +0200, Alexander Stein wrote:
> Add a VCC regulator which needs to be enabled before the EN pin is
> released.
> 
> Reviewed-by: Sam Ravnborg 
> Signed-off-by: Alexander Stein 
> ---
>  .../devicetree/bindings/display/bridge/ti,sn65dsi83.yaml | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml 
> b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml
> index a5779bf17849..49ace6f312d5 100644
> --- a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml
> +++ b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml
> @@ -32,6 +32,9 @@ properties:
>  maxItems: 1
>  description: GPIO specifier for bridge_en pin (active high).
>  
> +  vcc-supply:
> +description: A 1.8V power supply (see regulator/regulator.yaml).
> +
>ports:
>  $ref: /schemas/graph.yaml#/properties/ports
>  
> @@ -93,6 +96,7 @@ properties:
>  required:
>- compatible
>- reg
> +  - vcc-supply

This isn't a backward-compatible change. All the previous users of that
binding will now require a vcc-supply property even though it was
working fine for them before.

You handle that nicely in the code, but you can't make that new property
required.

Maxime




[PATCH] drm/bridge: display-connector: fix an uninitialized pointer in probe()

2021-10-13 Thread Dan Carpenter
The "label" pointer is used for debug output.  The code assumes that it
is either NULL or valid, but it is never set to NULL.  It is either
valid or uninitialized.

Fixes: 0c275c30176b ("drm/bridge: Add bridge driver for display connectors")
Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/bridge/display-connector.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/display-connector.c 
b/drivers/gpu/drm/bridge/display-connector.c
index 05eb759da6fc..847a0dce7f1d 100644
--- a/drivers/gpu/drm/bridge/display-connector.c
+++ b/drivers/gpu/drm/bridge/display-connector.c
@@ -107,7 +107,7 @@ static int display_connector_probe(struct platform_device *pdev)
 {
struct display_connector *conn;
unsigned int type;
-   const char *label;
+   const char *label = NULL;
int ret;
 
conn = devm_kzalloc(&pdev->dev, sizeof(*conn), GFP_KERNEL);
-- 
2.20.1



[PATCH 1/2] drm/msm: fix potential NULL dereference in cleanup

2021-10-13 Thread Dan Carpenter
The "msm_obj->node" list needs to be initialized earlier so that the
list_del() in msm_gem_free_object() doesn't experience a NULL pointer
dereference.

Fixes: 6ed0897cd800 ("drm/msm: Fix debugfs deadlock")
Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/msm/msm_gem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 40a9863f5951..49185d524be3 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1132,6 +1132,7 @@ static int msm_gem_new_impl(struct drm_device *dev,
msm_obj->flags = flags;
msm_obj->madv = MSM_MADV_WILLNEED;
 
+   INIT_LIST_HEAD(&msm_obj->node);
INIT_LIST_HEAD(&msm_obj->vmas);
 
*obj = &msm_obj->base;
-- 
2.20.1



[PATCH 2/2] drm/msm: uninitialized variable in msm_gem_import()

2021-10-13 Thread Dan Carpenter
The msm_gem_new_impl() function cleans up after itself, so there is no
need to call drm_gem_object_put().  Conceptually, it does not make sense
to call a kref_put() function until after the reference counting has
been initialized, which happens immediately after this call, in the
drm_gem_(private_)object_init() functions.

In the msm_gem_import() function the "obj" pointer is uninitialized at
that point, so calling drm_gem_object_put() on it will lead to a crash.

Fixes: 05b849111c07 ("drm/msm: prime support")
Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/msm/msm_gem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 49185d524be3..0e491cd21c53 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1167,7 +1167,7 @@ struct drm_gem_object *msm_gem_new(struct drm_device *dev, uint32_t size, uint32_t flags)
 
ret = msm_gem_new_impl(dev, size, flags, &obj);
if (ret)
-   goto fail;
+   return ERR_PTR(ret);
 
msm_obj = to_msm_bo(obj);
 
@@ -1251,7 +1251,7 @@ struct drm_gem_object *msm_gem_import(struct drm_device *dev,
 
ret = msm_gem_new_impl(dev, size, MSM_BO_WC, &obj);
if (ret)
-   goto fail;
+   return ERR_PTR(ret);
 
drm_gem_private_object_init(dev, obj, size);
 
-- 
2.20.1



Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline

2021-10-13 Thread Pekka Paalanen
On Tue, 12 Oct 2021 20:58:27 +
"Shankar, Uma"  wrote:

> > -Original Message-
> > From: Pekka Paalanen 
> > Sent: Tuesday, October 12, 2021 4:01 PM
> > To: Shankar, Uma 
> > Cc: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; 
> > harry.wentl...@amd.com; ville.syrj...@linux.intel.com; 
> > brian.star...@arm.com; sebast...@sebastianwick.net; 
> > shashank.sha...@amd.com
> > Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline
> > 
> > On Tue,  7 Sep 2021 03:08:43 +0530
> > Uma Shankar  wrote:
> >   
> > > This is an RFC proposal for plane color hardware blocks.
> > > It exposes the property interface to userspace and calls out the
> > > details of the interfaces created and the intended purpose.
> > >
> > > Credits: Ville Syrjälä 
> > > Signed-off-by: Uma Shankar 
> > > ---
> > >  Documentation/gpu/rfc/drm_color_pipeline.rst | 167 +++
> > >  1 file changed, 167 insertions(+)
> > >  create mode 100644 Documentation/gpu/rfc/drm_color_pipeline.rst
> > >
> > > diff --git a/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > b/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > new file mode 100644
> > > index ..0d1ca858783b
> > > --- /dev/null
> > > +++ b/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > @@ -0,0 +1,167 @@
> > > +==
> > > +Display Color Pipeline: Proposed DRM Properties  

...

> > > +Proposal is to have below properties for a plane:
> > > +
> > > +* Plane Degamma or Pre-Curve:
> > > + * This will be used to linearize the input framebuffer data.
> > > + * It will apply the reverse of the color transfer function.
> > > + * It can be a degamma curve or OETF for HDR.  
> > 
> > As you want to produce light-linear values, you use EOTF or inverse OETF.
> > 
> > The term OETF has a built-in assumption that that happens in a camera:
> > it takes in light and produces and electrical signal. Lately I have 
> > personally started talking about non-linear encoding of color values, 
> > since EOTF is often associated with displays if nothing else is said 
> > (taking in an electrical signal and producing light).
> > 
> > So this would be decoding the color values into light-linear color 
> > values. That is what an EOTF does, yes, but I feel there is a nuanced 
> > difference. A piece of equipment implements an EOTF by turning an 
> > electrical signal into light, hence EOTF often refers to specific 
> > equipment. You could talk about content EOTF to denote content value 
> > encoding, as opposed to output or display EOTF, but that might be 
> > confusing if you look at e.g. the diagrams in BT.2100: is it the EOTF or is 
> > it the inverse OETF? Is the (inverse?) OOTF included?
> > 
> > So I try to side-step those questions by talking about encoding.  
> 
> The idea here is that the frame buffer presented to the display plane
> engine will be non-linear. So the output of a media decode should result
> in content with the EOTF applied.

Hi,

sure, but the question is: which EOTF. There can be many different
things called "EOTF" in a single pipeline, and then it's up to the
document writer to make the difference between them. Comparing two
documents with different conventions causes a lot of confusion in my
personal experience, so it is good to define the concepts more
carefully.

> So the output of a media decode should result in content with the EOTF applied.

I suspect you have it backwards. Media decode produces electrical
(non-linear) pixel color values. If EOTF was applied, they would be
linear instead (and require more memory to achieve the same visual
precision).

If you want to put it this way, you could say "with inverse EOTF
applied", but that might be slightly confusing because it is already
baked in to the video, it's not something a media decoder has to
specifically apply, I think. However, the (inverse) EOTF in this case
is the content EOTF, not the display EOTF.

If content and display EOTF differ, then one must apply first content
EOTF and then inverse display EOTF to get values that are correctly
encoded for the display. (This is necessary but not sufficient in
general.) Mind, that this is not an OOTF nor an artistic adjustment,
this is purely a value encoding conversion.
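
In other words, as a pure re-encoding step:

  V_display = EOTF_display^-1( EOTF_content( V_content ) )

where V_content is the non-linear value stored in the framebuffer and
V_display is the value encoded the way the display expects.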

> Playback transfer function (EOTF): inverse OETF plus rendering intent gamma. 

Does "rendering intent gamma" refer to artistic adjustments, not OOTF?

cf. BT.2100 Annex 1, "The relationship between the OETF, the EOTF and
the OOTF", although I find those diagrams somewhat confusing still. It
does not seem to clearly account for transmission non-linear encoding
being different from the display EOTF.

Different documents use OOTF to refer to different things. Then there
is also the fundamental difference between PQ and HLG systems, where
OOTF is by definition in different places of the
camera-transmission-display pipeline.

> 
> To make it linear, we should apply the OETF. Confusion is whether the OETF
> is equivalent to the inverse EOTF.

AW: (EXT) Re: [PATCH v2 4/4] drm/bridge: ti-sn65dsi83: Add vcc supply regulator support

2021-10-13 Thread Alexander Stein
Hello Laurent,

On Tue, Oct 12, 2021 at 10:43 +0200, Laurent Pinchart wrote:
> On Tue, Oct 12, 2021 at 08:48:43AM +0200, Alexander Stein wrote:
> > VCC needs to be enabled before releasing the enable GPIO.
> > 
> > Reviewed-by: Sam Ravnborg 
> > Signed-off-by: Alexander Stein 
> > ---
> >  drivers/gpu/drm/bridge/ti-sn65dsi83.c | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c 
> > b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > index 9072342566f3..a6b1fd71dfee 100644
> > --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > @@ -33,6 +33,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  #include 
> > @@ -143,6 +144,7 @@ struct sn65dsi83 {
> > struct mipi_dsi_device  *dsi;
> > struct drm_bridge   *panel_bridge;
> > struct gpio_desc*enable_gpio;
> > +   struct regulator*vcc;
> > int dsi_lanes;
> > boollvds_dual_link;
> > boollvds_dual_link_even_odd_swap;
> > @@ -647,6 +649,12 @@ static int sn65dsi83_parse_dt(struct sn65dsi83 *ctx, enum sn65dsi83_model model)
> >  
> > ctx->panel_bridge = panel_bridge;
> >  
> > +   ctx->vcc = devm_regulator_get(dev, "vcc");
> > +   if (IS_ERR(ctx->vcc))
> > +   return dev_err_probe(dev, PTR_ERR(ctx->vcc),
> > +"Failed to get supply 'vcc': %pe\n",
> > +ctx->vcc);
> > +
> > return 0;
> >  }
> >  
> > @@ -691,7 +699,11 @@ static int sn65dsi83_probe(struct i2c_client *client,
> > ctx->bridge.of_node = dev->of_node;
> > drm_bridge_add(&ctx->bridge);
> >  
> > -   return 0;
> > +   ret = regulator_enable(ctx->vcc);
> > +   if (ret)
> > +   dev_err(dev, "Failed to enable vcc: %i\n", ret);
> 
> I think this should move to sn65dsi83_atomic_pre_enable() (and similarly
> for regulator_disable()) as keeping the regulator enabled at all times
> will cost power.

I get your idea. The thing is that unless 1V8 is provided, the bridge is not
even accessible on I2C. So any access to sn65dsi83.regmap without the vcc
regulator enabled will fail. AFAICS this is not an issue right now, as regmap
is only used in sn65dsi83_atomic_enable(), sn65dsi83_atomic_disable() and
sn65dsi83_atomic_pre_enable(), so your suggestion would work, but I'm
hesitating a bit. The driver then has to ensure all regmap uses are done
only when vcc is enabled.

Best regards,
Alexander



Re: (EXT) Re: [PATCH v2 4/4] drm/bridge: ti-sn65dsi83: Add vcc supply regulator support

2021-10-13 Thread Laurent Pinchart
Hi Alexander,

On Wed, Oct 13, 2021 at 08:59:22AM +, Alexander Stein wrote:
> On Tue, Oct 12, 2021 at 10:43 +0200, Laurent Pinchart wrote:
> > On Tue, Oct 12, 2021 at 08:48:43AM +0200, Alexander Stein wrote:
> > > VCC needs to be enabled before releasing the enable GPIO.
> > > 
> > > Reviewed-by: Sam Ravnborg 
> > > Signed-off-by: Alexander Stein 
> > > ---
> > >  drivers/gpu/drm/bridge/ti-sn65dsi83.c | 15 ++-
> > >  1 file changed, 14 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c 
> > > b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > > index 9072342566f3..a6b1fd71dfee 100644
> > > --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > > +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > > @@ -33,6 +33,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  
> > >  #include 
> > >  #include 
> > > @@ -143,6 +144,7 @@ struct sn65dsi83 {
> > >   struct mipi_dsi_device  *dsi;
> > >   struct drm_bridge   *panel_bridge;
> > >   struct gpio_desc*enable_gpio;
> > > + struct regulator*vcc;
> > >   int dsi_lanes;
> > >   boollvds_dual_link;
> > >   boollvds_dual_link_even_odd_swap;
> > > @@ -647,6 +649,12 @@ static int sn65dsi83_parse_dt(struct sn65dsi83 *ctx, enum sn65dsi83_model model)
> > >  
> > >   ctx->panel_bridge = panel_bridge;
> > >  
> > > + ctx->vcc = devm_regulator_get(dev, "vcc");
> > > + if (IS_ERR(ctx->vcc))
> > > + return dev_err_probe(dev, PTR_ERR(ctx->vcc),
> > > +  "Failed to get supply 'vcc': %pe\n",
> > > +  ctx->vcc);
> > > +
> > >   return 0;
> > >  }
> > >  
> > > @@ -691,7 +699,11 @@ static int sn65dsi83_probe(struct i2c_client *client,
> > >   ctx->bridge.of_node = dev->of_node;
> > >   drm_bridge_add(&ctx->bridge);
> > >  
> > > - return 0;
> > > + ret = regulator_enable(ctx->vcc);
> > > + if (ret)
> > > + dev_err(dev, "Failed to enable vcc: %i\n", ret);
> > 
> > I think this should move to sn65dsi83_atomic_pre_enable() (and similarly
> > for regulator_disable()) as keeping the regulator enabled at all times
> > will cost power.
> 
> I get your idea. The thing is that unless 1V8 is provided, the bridge is not
> even accessible on I2C. So any access to sn65dsi83.regmap without the vcc
> regulator enabled will fail. AFAICS this is not an issue right now, as regmap
> is only used in sn65dsi83_atomic_enable(), sn65dsi83_atomic_disable() and
> sn65dsi83_atomic_pre_enable(), so your suggestion would work, but I'm
> hesitating a bit. The driver then has to ensure all regmap uses are done
> only when vcc is enabled.

Correct, and that's the usual pattern: drivers need to call
pm_runtime_get_sync() before accessing registers. For all you know, even
if the power to the bridge is on, the I2C controller it is connected to
could be suspended.
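
Something along these lines, where the supply is enabled in
atomic_pre_enable() and the regmap is only touched afterwards (a rough,
untested sketch; the settle delay is a placeholder):

  static void sn65dsi83_atomic_pre_enable(struct drm_bridge *bridge,
                                          struct drm_bridge_state *old_bridge_state)
  {
          struct sn65dsi83 *ctx = bridge_to_sn65dsi83(bridge);
          int ret;

          ret = regulator_enable(ctx->vcc);
          if (ret) {
                  dev_err(ctx->dev, "Failed to enable vcc: %d\n", ret);
                  return;
          }

          /* give the 1V8 rail time to settle before releasing EN */
          usleep_range(1000, 2000);
          gpiod_set_value_cansleep(ctx->enable_gpio, 1);

          /* from this point on it is safe to access ctx->regmap */
  }

with the matching regulator_disable() in atomic_post_disable().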

-- 
Regards,

Laurent Pinchart


Re: [PATCH v2 3/4] dt-bindings: drm/bridge: ti-sn65dsi83: Add vcc supply bindings

2021-10-13 Thread Laurent Pinchart
Hi Maxime,

On Wed, Oct 13, 2021 at 09:47:22AM +0200, Maxime Ripard wrote:
> On Tue, Oct 12, 2021 at 08:48:42AM +0200, Alexander Stein wrote:
> > Add a VCC regulator which needs to be enabled before the EN pin is
> > released.
> > 
> > Reviewed-by: Sam Ravnborg 
> > Signed-off-by: Alexander Stein 
> > ---
> >  .../devicetree/bindings/display/bridge/ti,sn65dsi83.yaml | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml 
> > b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml
> > index a5779bf17849..49ace6f312d5 100644
> > --- a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml
> > +++ b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml
> > @@ -32,6 +32,9 @@ properties:
> >  maxItems: 1
> >  description: GPIO specifier for bridge_en pin (active high).
> >  
> > +  vcc-supply:
> > +description: A 1.8V power supply (see regulator/regulator.yaml).
> > +
> >ports:
> >  $ref: /schemas/graph.yaml#/properties/ports
> >  
> > @@ -93,6 +96,7 @@ properties:
> >  required:
> >- compatible
> >- reg
> > +  - vcc-supply
> 
> This isn't a backward-compatible change. All the previous users of that
> binding will now require a vcc-supply property even though it was
> working fine for them before.
> 
> You handle that nicely in the code, but you can't make that new property
> required.

We can't make it required in the driver, but can't we make it required
in the bindings? This indicates that all new DTs need to set the
property. We also need to mass-patch the in-tree DTs to avoid validation
failures, but apart from that, I don't see any issue.

-- 
Regards,

Laurent Pinchart


[PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Maarten Lankhorst
No memory should be allocated when calling i915_gem_object_wait,
because it may be called to idle a BO when evicting memory.

Fix this by using dma_resv_iter helpers to call
i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
Also remove dma_resv_prune, it's questionable.
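
The resulting wait loop is roughly of this shape (a sketch of the
dma_resv_iter pattern, not the actual hunk):

  struct dma_resv_iter cursor;
  struct dma_fence *fence;
  long ret = timeout;

  dma_resv_iter_begin(&cursor, resv, wait_all);
  dma_resv_for_each_fence_unlocked(&cursor, fence) {
          /* no allocation here, unlike dma_resv_get_fences() */
          ret = i915_gem_object_wait_fence(fence, flags, ret);
          if (ret <= 0)
                  break;
  }
  dma_resv_iter_end(&cursor);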

Without these changes, the following lockdep splat is seen:

<4> [83.538517] ==
<4> [83.538520] WARNING: possible circular locking dependency detected
<4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
<4> [83.538525] --
<4> [83.538527] gem_render_line/5242 is trying to acquire lock:
<4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
__kmalloc_track_caller+0x56/0x270
<4> [83.538538]
but task is already holding lock:
<4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.538638]
which lock already depends on the new lock.
<4> [83.538642]
the existing dependency chain (in reverse order) is:
<4> [83.538645]
-> #1 (&vm->mutex/1){+.+.}-{3:3}:
<4> [83.538649]lock_acquire+0xd3/0x310
<4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
<4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
<4> [83.538794]ppgtt_init+0x55/0x70 [i915]
<4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
<4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
<4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
<4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
<4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
<4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
<4> [83.539197]pci_device_probe+0x9b/0x110
<4> [83.539201]really_probe+0x1b0/0x3b0
<4> [83.539205]__driver_probe_device+0xf6/0x170
<4> [83.539208]driver_probe_device+0x1a/0x90
<4> [83.539210]__driver_attach+0x93/0x160
<4> [83.539213]bus_for_each_dev+0x72/0xc0
<4> [83.539216]bus_add_driver+0x14b/0x1f0
<4> [83.539220]driver_register+0x66/0xb0
<4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
<4> [83.539227]do_one_initcall+0x53/0x2e0
<4> [83.539230]do_init_module+0x55/0x200
<4> [83.539234]load_module+0x2700/0x2980
<4> [83.539237]__do_sys_finit_module+0xaa/0x110
<4> [83.539241]do_syscall_64+0x37/0xb0
<4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539247]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [83.539251]validate_chain+0xb37/0x1e70
<4> [83.539254]__lock_acquire+0x5a1/0xb70
<4> [83.539258]lock_acquire+0xd3/0x310
<4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
<4> [83.539264]__kmalloc_track_caller+0x56/0x270
<4> [83.539267]krealloc+0x48/0xa0
<4> [83.539270]dma_resv_get_fences+0x1c3/0x280
<4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
<4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
<4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
<4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
<4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
<4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
<4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
<4> [83.539759]drm_ioctl_kernel+0xac/0x140
<4> [83.539763]drm_ioctl+0x201/0x3d0
<4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
<4> [83.539769]do_syscall_64+0x37/0xb0
<4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539775]
other info that might help us debug this:
<4> [83.539778]  Possible unsafe locking scenario:
<4> [83.539781]CPU0CPU1
<4> [83.539783]
<4> [83.539785]   lock(&vm->mutex/1);
<4> [83.539788]lock(fs_reclaim);
<4> [83.539791]lock(&vm->mutex/1);
<4> [83.539794]   lock(fs_reclaim);
<4> [83.539796]
 *** DEADLOCK ***
<4> [83.539799] 3 locks held by gem_render_line/5242:
<4> [83.539802]  #0: c9d4bbf0 
(reservation_ww_class_acquire){+.+.}-{0:0}, at: 
i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
<4> [83.539870]  #1: 88811e48bae8 (reservation_ww_class_mutex){+.+.}-{3:3}, 
at: eb_validate_vmas+0x81/0x8e0 [i915]
<4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.540011]
stack backtrace:
<4> [83.540014] CPU: 2 PID: 5242 Comm: gem_render_line Not tainted 
5.15.0-rc5-CI-Trybot_8062+ #1
<4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, 
BIOS TNTGL357.0038.2020.1124.1648 11/24/2020
<4> [83.540023] Call Trace:
<4> [83.540026]  dump_stack_lvl+0x56/0x7b
<4> [83.540030]  check_noncircular+0x12e/0x150
<4> [83.540034]  ? _raw_spin_unlock_irqrestore+0x50/0x60
<4> [83.540038]  validate_chain+0xb37/0x1e70
<4> [83.540042]  __lock_acquire+0x5a1/0xb70
<4> [83.540046]  lock_acquire+0

Re: mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o)

2021-10-13 Thread Arnd Bergmann
On Thu, Oct 7, 2021 at 11:51 AM Geert Uytterhoeven  wrote:
> On Wed, Oct 6, 2021 at 9:28 AM Christian König  
> wrote:
> > Am 06.10.21 um 09:20 schrieb Stephen Rothwell:
> > > On Tue, 5 Oct 2021 22:48:03 -0700 Randy Dunlap  
> > > wrote:
> > >> on i386:
> > >>
> > >> ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined 
> > >> reference to `msm_hdmi_phy_8996_cfg'

I ran into the same thing now as well.
>
> I'd make that:
>
> -depends on DRM
> +   depends on COMMON_CLK && DRM && IOMMU_SUPPORT
> depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
> -depends on IOMMU_SUPPORT
> -   depends on (OF && COMMON_CLK) || COMPILE_TEST
> +   depends on OF || COMPILE_TEST
>
> to keep a better separation between hard and soft dependencies.
>
> Note that the "depends on OF || COMPILE_TEST" can even be
> deleted, as the dependency on ARCH_QCOM || SOC_IMX5 implies OF.

Looks good to me, I would also drop that last line in this case, and maybe
add this change as building without COMMON_CLK is no longer possible:

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 904535eda0c4..a5d87e03812f 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -116,10 +116,10 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
  dp/dp_power.o \
  dp/dp_audio.o

-msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
-msm-$(CONFIG_COMMON_CLK) += disp/mdp4/mdp4_lvds_pll.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_pll_8960.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_phy_8996.o
+msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o \
+ disp/mdp4/mdp4_lvds_pll.o \
+ hdmi/hdmi_pll_8960.o \
+ hdmi/hdmi_phy_8996.o

 msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o

Has anyone submitted a patch already, or should I send the version
that I am using locally now?

Arnd


Re: [PATCH v7, 00/15] Support multi hardware decode using of_platform_populate

2021-10-13 Thread Andrzej Pietrasiewicz

Hi,

On 13.10.2021 at 03:08, yunfei.d...@mediatek.com wrote:

Hi Andrzej,


On Tue, 2021-10-12 at 16:27 +0200, Andrzej Pietrasiewicz wrote:

Hi Yunfei Dong,

On 11.10.2021 at 09:02, Yunfei Dong wrote:

This series adds support for multi hardware decode into mtk-vcodec,
by first
adding use of_platform_populate to manage each hardware
information: interrupt,
clock, register bases and power. Secondly add core thread to deal
with core
hardware message, at the same time, add msg queue for different
hardware
share messages. Lastly, the architecture of different specs are not
the same,
using specs type to separate them.

This series has been tested with both MT8183 and MT8173. Decoding
was working
for both chips.

Patches 1~3 rewrite get register bases and power on/off interface.

Patch 4 add to support multi hardware.

Patch 5 separate video encoder and decoder document

Patches 6-15 add interfaces to support core hardware.


Which tree does the series apply to?


I don't clearly understand what you mean. The media tree?

You can get the patches from this link:

https://patchwork.linuxtv.org/project/linux-media/cover/20211011070247.792-1-yunfei.d...@mediatek.com/



Here's what I get:

$ git remote update media_tree
Fetching media_tree

$ git branch
  master
* media_tree
  mediatek-master

$ git-pw --server https://patchwork.linuxtv.org/api/1.1 --project linux-media 
series apply 6465 -3

Failed to apply patch:
Applying: media: mtk-vcodec: Get numbers of register bases from DT
Applying: media: mtk-vcodec: Align vcodec wake up interrupt interface
Applying: media: mtk-vcodec: Refactor vcodec pm interface
Applying: media: mtk-vcodec: Manage multi hardware information
error: sha1 information is lacking or useless 
(drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c).

error: could not build fake ancestor
Patch failed at 0004 media: mtk-vcodec: Manage multi hardware information
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Regards,

Andrzej


[PATCH RFC] virtio: wrap config->reset calls

2021-10-13 Thread Michael S. Tsirkin
This will enable cleanups down the road.
The idea is to disable cbs, then add a "flush_queued_cbs" callback
as a parameter; this way drivers can flush any work
queued after callbacks have been disabled.

Signed-off-by: Michael S. Tsirkin 
---
 arch/um/drivers/virt-pci.c | 2 +-
 drivers/block/virtio_blk.c | 4 ++--
 drivers/bluetooth/virtio_bt.c  | 2 +-
 drivers/char/hw_random/virtio-rng.c| 2 +-
 drivers/char/virtio_console.c  | 4 ++--
 drivers/crypto/virtio/virtio_crypto_core.c | 8 
 drivers/firmware/arm_scmi/virtio.c | 2 +-
 drivers/gpio/gpio-virtio.c | 2 +-
 drivers/gpu/drm/virtio/virtgpu_kms.c   | 2 +-
 drivers/i2c/busses/i2c-virtio.c| 2 +-
 drivers/iommu/virtio-iommu.c   | 2 +-
 drivers/net/caif/caif_virtio.c | 2 +-
 drivers/net/virtio_net.c   | 4 ++--
 drivers/net/wireless/mac80211_hwsim.c  | 2 +-
 drivers/nvdimm/virtio_pmem.c   | 2 +-
 drivers/rpmsg/virtio_rpmsg_bus.c   | 2 +-
 drivers/scsi/virtio_scsi.c | 2 +-
 drivers/virtio/virtio.c| 5 +
 drivers/virtio/virtio_balloon.c| 2 +-
 drivers/virtio/virtio_input.c  | 2 +-
 drivers/virtio/virtio_mem.c| 2 +-
 fs/fuse/virtio_fs.c| 4 ++--
 include/linux/virtio.h | 1 +
 net/9p/trans_virtio.c  | 2 +-
 net/vmw_vsock/virtio_transport.c   | 4 ++--
 sound/virtio/virtio_card.c | 4 ++--
 26 files changed, 39 insertions(+), 33 deletions(-)

diff --git a/arch/um/drivers/virt-pci.c b/arch/um/drivers/virt-pci.c
index c08066633023..22c4d87c9c15 100644
--- a/arch/um/drivers/virt-pci.c
+++ b/arch/um/drivers/virt-pci.c
@@ -616,7 +616,7 @@ static void um_pci_virtio_remove(struct virtio_device *vdev)
int i;
 
 /* Stop all virtqueues */
-vdev->config->reset(vdev);
+virtio_reset_device(vdev);
 vdev->config->del_vqs(vdev);
 
device_set_wakeup_enable(&vdev->dev, false);
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 303caf2d17d0..83d0af3fbf30 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -910,7 +910,7 @@ static void virtblk_remove(struct virtio_device *vdev)
mutex_lock(&vblk->vdev_mutex);
 
/* Stop all the virtqueues. */
-   vdev->config->reset(vdev);
+   virtio_reset_device(vdev);
 
/* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
vblk->vdev = NULL;
@@ -929,7 +929,7 @@ static int virtblk_freeze(struct virtio_device *vdev)
struct virtio_blk *vblk = vdev->priv;
 
/* Ensure we don't receive any more interrupts */
-   vdev->config->reset(vdev);
+   virtio_reset_device(vdev);
 
/* Make sure no work handler is accessing the device. */
flush_work(&vblk->config_work);
diff --git a/drivers/bluetooth/virtio_bt.c b/drivers/bluetooth/virtio_bt.c
index 57908ce4fae8..24a9258962fa 100644
--- a/drivers/bluetooth/virtio_bt.c
+++ b/drivers/bluetooth/virtio_bt.c
@@ -364,7 +364,7 @@ static void virtbt_remove(struct virtio_device *vdev)
struct hci_dev *hdev = vbt->hdev;
 
hci_unregister_dev(hdev);
-   vdev->config->reset(vdev);
+   virtio_reset_device(vdev);
 
hci_free_dev(hdev);
vbt->hdev = NULL;
diff --git a/drivers/char/hw_random/virtio-rng.c 
b/drivers/char/hw_random/virtio-rng.c
index a90001e02bf7..95980489514b 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -134,7 +134,7 @@ static void remove_common(struct virtio_device *vdev)
vi->hwrng_removed = true;
vi->data_avail = 0;
complete(&vi->have_data);
-   vdev->config->reset(vdev);
+   virtio_reset_device(vdev);
vi->busy = false;
if (vi->hwrng_register_done)
hwrng_unregister(&vi->hwrng);
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 7eaf303a7a86..08bbd693436f 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1957,7 +1957,7 @@ static void virtcons_remove(struct virtio_device *vdev)
spin_unlock_irq(&pdrvdata_lock);
 
/* Disable interrupts for vqs */
-   vdev->config->reset(vdev);
+   virtio_reset_device(vdev);
/* Finish up work that's lined up */
if (use_multiport(portdev))
cancel_work_sync(&portdev->control_work);
@@ -2139,7 +2139,7 @@ static int virtcons_freeze(struct virtio_device *vdev)
 
portdev = vdev->priv;
 
-   vdev->config->reset(vdev);
+   virtio_reset_device(vdev);
 
if (use_multiport(portdev))
virtqueue_disable_cb(portdev->c_ivq);
diff --git a/drivers/crypto/virtio/virtio_crypto_core.c 
b/drivers/crypto/virtio/virtio_crypto_core.c
index e2375d992308..8e977b7627cb 100644
--- a/drivers/crypto/virti
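
The helper itself, added in drivers/virtio/virtio.c (the hunk is not
visible in the diff as quoted here), is presumably just a thin wrapper:

  void virtio_reset_device(struct virtio_device *dev)
  {
          dev->config->reset(dev);
  }
  EXPORT_SYMBOL_GPL(virtio_reset_device);

so every open-coded vdev->config->reset(vdev) call site becomes a
virtio_reset_device(vdev) call.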

Re: [PATCH RFC] virtio: wrap config->reset calls

2021-10-13 Thread Viresh Kumar
On 13-10-21, 06:55, Michael S. Tsirkin wrote:
> This will enable cleanups down the road.
> The idea is to disable cbs, then add "flush_queued_cbs" callback
> as a parameter, this way drivers can flush any work
> queued after callbacks have been disabled.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/gpio/gpio-virtio.c | 2 +-
>  drivers/i2c/busses/i2c-virtio.c| 2 +-

Reviewed-by: Viresh Kumar 

-- 
viresh


Re: [PATCH RFC] virtio: wrap config->reset calls

2021-10-13 Thread David Hildenbrand

On 13.10.21 12:55, Michael S. Tsirkin wrote:

This will enable cleanups down the road.
The idea is to disable cbs, then add a "flush_queued_cbs" callback
as a parameter; this way drivers can flush any work
queued after callbacks have been disabled.

Signed-off-by: Michael S. Tsirkin 
---
  arch/um/drivers/virt-pci.c | 2 +-
  drivers/block/virtio_blk.c | 4 ++--
  drivers/bluetooth/virtio_bt.c  | 2 +-
  drivers/char/hw_random/virtio-rng.c| 2 +-
  drivers/char/virtio_console.c  | 4 ++--
  drivers/crypto/virtio/virtio_crypto_core.c | 8 
  drivers/firmware/arm_scmi/virtio.c | 2 +-
  drivers/gpio/gpio-virtio.c | 2 +-
  drivers/gpu/drm/virtio/virtgpu_kms.c   | 2 +-
  drivers/i2c/busses/i2c-virtio.c| 2 +-
  drivers/iommu/virtio-iommu.c   | 2 +-
  drivers/net/caif/caif_virtio.c | 2 +-
  drivers/net/virtio_net.c   | 4 ++--
  drivers/net/wireless/mac80211_hwsim.c  | 2 +-
  drivers/nvdimm/virtio_pmem.c   | 2 +-
  drivers/rpmsg/virtio_rpmsg_bus.c   | 2 +-
  drivers/scsi/virtio_scsi.c | 2 +-
  drivers/virtio/virtio.c| 5 +
  drivers/virtio/virtio_balloon.c| 2 +-
  drivers/virtio/virtio_input.c  | 2 +-
  drivers/virtio/virtio_mem.c| 2 +-
  fs/fuse/virtio_fs.c| 4 ++--
  include/linux/virtio.h | 1 +
  net/9p/trans_virtio.c  | 2 +-
  net/vmw_vsock/virtio_transport.c   | 4 ++--
  sound/virtio/virtio_card.c | 4 ++--
  26 files changed, 39 insertions(+), 33 deletions(-)

diff --git a/arch/um/drivers/virt-pci.c b/arch/um/drivers/virt-pci.c
index c08066633023..22c4d87c9c15 100644
--- a/arch/um/drivers/virt-pci.c
+++ b/arch/um/drivers/virt-pci.c
@@ -616,7 +616,7 @@ static void um_pci_virtio_remove(struct virtio_device *vdev)
int i;
  
  /* Stop all virtqueues */

-vdev->config->reset(vdev);
+virtio_reset_device(vdev);
  vdev->config->del_vqs(vdev);


Nit: virtio_device_reset()?

Because I see:

int virtio_device_freeze(struct virtio_device *dev);
int virtio_device_restore(struct virtio_device *dev);
void virtio_device_ready(struct virtio_device *dev)

But well, there is:
void virtio_break_device(struct virtio_device *dev);

--
Thanks,

David / dhildenb



Re: [PATCH] drm/i915: Prefer struct_size over open coded arithmetic

2021-10-13 Thread Jani Nikula
On Mon, 11 Oct 2021, Len Baker  wrote:
> Hi,
>
> On Sun, Oct 03, 2021 at 12:42:58PM +0200, Len Baker wrote:
>> As noted in the "Deprecated Interfaces, Language Features, Attributes,
>> and Conventions" documentation [1], size calculations (especially
>> multiplication) should not be performed in memory allocator (or similar)
>> function arguments due to the risk of them overflowing. This could lead
>> to values wrapping around and a smaller allocation being made than the
>> caller was expecting. Using those allocations could lead to linear
>> overflows of heap memory and other misbehaviors.
>>
>> In this case these are not actually dynamic sizes: all the operands
>> involved in the calculation are constant values. However it is better to
>> refactor them anyway, just to keep the open-coded math idiom out of
>> code.
>>
>> So, add at the end of the struct i915_syncmap a union with two flexible
>> array members (these arrays share the same memory layout). This is
>> possible using the new DECLARE_FLEX_ARRAY macro. And then, use the
>> struct_size() helper to do the arithmetic instead of the argument
>> "size + count * size" in the kmalloc and kzalloc() functions.
>>
>> Also, take the opportunity to refactor the __sync_seqno and __sync_child
>> making them more readable.
>>
>> This code was detected with the help of Coccinelle and audited and fixed
>> manually.
>>
>> [1] 
>> https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
>>
>> Signed-off-by: Len Baker 
>> ---
>>  drivers/gpu/drm/i915/i915_syncmap.c | 12 
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> I received a mail telling that this patch doesn't build:
>
> == Series Details ==
>
> Series: drm/i915: Prefer struct_size over open coded arithmetic
> URL   : https://patchwork.freedesktop.org/series/95408/
> State : failure
>
> But it builds without error against linux-next (tag next-20211001). Against
> which tree and branch do I need to build?

drm-tip [1]. It's a sort of linux-next for graphics. I think there are
still some branches that don't feed to linux-next.

BR,
Jani.


[1] https://cgit.freedesktop.org/drm/drm-tip


>
> Regards,
> Len

-- 
Jani Nikula, Intel Open Source Graphics Center
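
For reference, the transformation the quoted commit message describes
has this shape (illustrative only, not the exact i915 hunk):

  struct i915_syncmap {
          /* ... existing members ... */
          union {
                  DECLARE_FLEX_ARRAY(u32, seqno);
                  DECLARE_FLEX_ARRAY(struct i915_syncmap *, child);
          };
  };

  /* before: kmalloc(sizeof(*p) + KSYNCMAP * sizeof(u32), GFP_KERNEL) */
  p = kmalloc(struct_size(p, seqno, KSYNCMAP), GFP_KERNEL);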


Re: [RFC PATCH v2 2/2] RDMA/rxe: Add dma-buf support

2021-10-13 Thread Daniel Vetter
On Fri, Oct 01, 2021 at 12:56:48PM +0900, Shunsuke Mie wrote:
> On Thu, Sep 30, 2021 at 23:41, Daniel Vetter wrote:
> >
> > On Wed, Sep 29, 2021 at 01:19:05PM +0900, Shunsuke Mie wrote:
> > > Implement an ib device operation 'reg_user_mr_dmabuf'. Generate a
> > > rxe_map from the memory space linked to the passed dma-buf.
> > >
> > > Signed-off-by: Shunsuke Mie 
> > > ---
> > >  drivers/infiniband/sw/rxe/rxe_loc.h   |   2 +
> > >  drivers/infiniband/sw/rxe/rxe_mr.c| 118 ++
> > >  drivers/infiniband/sw/rxe/rxe_verbs.c |  34 
> > >  drivers/infiniband/sw/rxe/rxe_verbs.h |   2 +
> > >  4 files changed, 156 insertions(+)
> > >
> > > diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h 
> > > b/drivers/infiniband/sw/rxe/rxe_loc.h
> > > index 1ca43b859d80..8bc19ea1a376 100644
> > > --- a/drivers/infiniband/sw/rxe/rxe_loc.h
> > > +++ b/drivers/infiniband/sw/rxe/rxe_loc.h
> > > @@ -75,6 +75,8 @@ u8 rxe_get_next_key(u32 last_key);
> > >  void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr);
> > >  int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
> > >int access, struct rxe_mr *mr);
> > > +int rxe_mr_dmabuf_init_user(struct rxe_pd *pd, int fd, u64 start, u64 
> > > length,
> > > + u64 iova, int access, struct rxe_mr *mr);
> > >  int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr 
> > > *mr);
> > >  int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
> > >   enum rxe_mr_copy_dir dir);
> > > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c 
> > > b/drivers/infiniband/sw/rxe/rxe_mr.c
> > > index 53271df10e47..af6ef671c3a5 100644
> > > --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> > > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> > > @@ -4,6 +4,7 @@
> > >   * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
> > >   */
> > >
> > > +#include 
> > >  #include "rxe.h"
> > >  #include "rxe_loc.h"
> > >
> > > @@ -245,6 +246,120 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, 
> > > u64 length, u64 iova,
> > >   return err;
> > >  }
> > >
> > > +static int rxe_map_dmabuf_mr(struct rxe_mr *mr,
> > > +  struct ib_umem_dmabuf *umem_dmabuf)
> > > +{
> > > + struct rxe_map_set *set;
> > > + struct rxe_phys_buf *buf = NULL;
> > > + struct rxe_map **map;
> > > + void *vaddr, *vaddr_end;
> > > + int num_buf = 0;
> > > + int err;
> > > + size_t remain;
> > > +
> > > + mr->dmabuf_map = kzalloc(sizeof &mr->dmabuf_map, GFP_KERNEL);
> >
> > dmabuf_maps are just tagged pointers (and we could shrink them to actually
> > just a tagged pointer if anyone cares about the overhead of the separate
> > bool), allocating them separately is overkill.
> 
> I agree with you. However, I think it is needed in order to unmap with
> dma_buf_vunmap(). If there is another simple way to unmap it, it is not
> needed, I think. What do you think about it?

dma_buf_vunmap does not kfree the dma_buf_map argument, so that's no
reason to allocate it separately. Or I'm confused.
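
I.e. the map can simply be embedded in rxe_mr (sketch):

  struct rxe_mr {
          /* ... */
          struct dma_buf_map dmabuf_map;  /* small tagged pointer, no kzalloc */
  };

  err = dma_buf_vmap(umem_dmabuf->dmabuf, &mr->dmabuf_map);
  if (err)
          goto err_out;

  /* ... access the mapping via the dma-buf-map.h helpers ... */

  /* and on teardown, nothing to kfree: */
  dma_buf_vunmap(umem_dmabuf->dmabuf, &mr->dmabuf_map);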

Also apologies, I'm way behind on mails.
-Daniel

> 
> > > + if (!mr->dmabuf_map) {
> > > + err = -ENOMEM;
> > > + goto err_out;
> > > + }
> > > +
> > > + err = dma_buf_vmap(umem_dmabuf->dmabuf, mr->dmabuf_map);
> > > + if (err)
> > > + goto err_free_dmabuf_map;
> > > +
> > > + set = mr->cur_map_set;
> > > + set->page_shift = PAGE_SHIFT;
> > > + set->page_mask = PAGE_SIZE - 1;
> > > +
> > > + map = set->map;
> > > + buf = map[0]->buf;
> > > +
> > > + vaddr = mr->dmabuf_map->vaddr;
> >
> > dma_buf_map can be an __iomem too, you shouldn't dig around in this, but
> > use the dma-buf-map.h helpers instead. On x86 (and I think also on most
> > arm) it doesn't matter, but it's kinda not very nice in a pure software
> > driver.
> >
> > If anything is missing in dma-buf-map.h wrappers just add more.
> >
> > Or alternatively you need to fail the import if you can't handle __iomem.
> >
> > Aside from these I think the dma-buf side here for cpu access looks
> > reasonable now.
> > -Daniel
> I'll look at dma-buf-map.h and consider the error handling that you suggested.
> I appreciate your support.
> 
> Thanks a lot,
> Shunsuke.
> 
> > > + vaddr_end = vaddr + umem_dmabuf->dmabuf->size;
> > > + remain = umem_dmabuf->dmabuf->size;
> > > +
> > > + for (; remain; vaddr += PAGE_SIZE) {
> > > + if (num_buf >= RXE_BUF_PER_MAP) {
> > > + map++;
> > > + buf = map[0]->buf;
> > > + num_buf = 0;
> > > + }
> > > +
> > > + buf->addr = (uintptr_t)vaddr;
> > > + if (remain >= PAGE_SIZE)
> > > + buf->size = PAGE_SIZE;
> > > + else
> > > + buf->size = remain;
> > > + remain -= buf->size;
> > > +
> > > + num_buf++;
> > > + buf++;
> > > +  

Re: mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o)

2021-10-13 Thread Arnd Bergmann
On Wed, Oct 13, 2021 at 12:54 PM Arnd Bergmann  wrote:
> On Thu, Oct 7, 2021 at 11:51 AM Geert Uytterhoeven  
> wrote:
>
> -msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
> -msm-$(CONFIG_COMMON_CLK) += disp/mdp4/mdp4_lvds_pll.o
> -msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_pll_8960.o
> -msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_phy_8996.o
> +msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o \
> + disp/mdp4/mdp4_lvds_pll.o \
> + hdmi/hdmi_pll_8960.o \
> + hdmi/hdmi_phy_8996.o
>
>  msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o

I fixed my local copy now after noticing that these should not go
under CONFIG_DRM_FBDEV_EMULATION but under the top-level option:

@@ -23,8 +23,10 @@ msm-y := \
hdmi/hdmi_i2c.o \
hdmi/hdmi_phy.o \
hdmi/hdmi_phy_8960.o \
+   hdmi/hdmi_phy_8996.o \
hdmi/hdmi_phy_8x60.o \
hdmi/hdmi_phy_8x74.o \
+   hdmi/hdmi_pll_8960.o \
edp/edp.o \
edp/edp_aux.o \
edp/edp_bridge.o \
@@ -37,6 +39,7 @@ msm-y := \
disp/mdp4/mdp4_dtv_encoder.o \
disp/mdp4/mdp4_lcdc_encoder.o \
disp/mdp4/mdp4_lvds_connector.o \
+   disp/mdp4/mdp4_lvds_pll.o \
disp/mdp4/mdp4_irq.o \
disp/mdp4/mdp4_kms.o \
disp/mdp4/mdp4_plane.o \

   Arnd


Re: [PATCH 15/16] Revert "drm/i915: cleanup: drm_modeset_lock_all_ctx() --> DRM_MODESET_LOCK_ALL_BEGIN()"

2021-10-13 Thread Daniel Vetter
On Mon, Oct 04, 2021 at 10:31:04PM +0200, Fernando Ramos wrote:
> On 21/10/04 11:56AM, Sean Paul wrote:
> > @Fernando, hopefully you can revise and post again. Thank you for your 
> > patches
> > and your effort!
> 
> No problem :)
> 
> Just to be sure I do the right thing this time (and to better understand the
> process), please confirm that this is the correct sequence of events:
> 
>   1. I fix the lock issue and test on my local machine.
> 
>   2. I then post this new patch set (v3) rebased on top of drm-tip (instead of
>  drm-next). This will automatically trigger tests on intel hardware (and
>  maybe on other hardware?)
> 
> NOTE: I originally chose drm-next because that's what is mentioned 
> here:
> 
> https://01.org/linuxgraphics/gfx-docs/drm/gpu/introduction.html#contribution-process
> Maybe this doc should be updated?
> 
>   3. Once reviewed and approved, someone (Sean?) merges them into "somewhere"
>  (drm-next? drm-misc-next? drm-intel-next? How is this decided?).
> 
>   4. Eventually, that other branch from the previous point is merged into
>  drm-tip.
> 
>   5. ??
> 
>   6. The branch is merged into linux-next.

This part should happen automatically, plus/minus right around the merge
window. At least not your problem.

Otherwise don't worry, and don't sweat it too much. We know that our CI
situation just isn't great yet for drm contributors :-/ There's plans to
improve it though, but it all takes time.

> There must be something wrong in my description above, as it doesn't make 
> sense
> to post the patch series based on "drm-tip" only to later have one of the
> maintainers merge them into a different branch that will eventually be merged
> back into "drm-tip".
> 
> Sorry for being completely lost! Is there a document explaining how all of 
> this
> works so that I can learn for the next time?

drm-tip is just linux-next for drm area, it's the same principle. If there
are conflicts while merging, maintainers will sort these out. And yeah
that's a bit of a speciality of Linux with the multi-branch model for
development.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/i915: Prefer struct_size over open coded arithmetic

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 02:24:05PM +0300, Jani Nikula wrote:
> On Mon, 11 Oct 2021, Len Baker  wrote:
> > Hi,
> >
> > On Sun, Oct 03, 2021 at 12:42:58PM +0200, Len Baker wrote:
> >> As noted in the "Deprecated Interfaces, Language Features, Attributes,
> >> and Conventions" documentation [1], size calculations (especially
> >> multiplication) should not be performed in memory allocator (or similar)
> >> function arguments due to the risk of them overflowing. This could lead
> >> to values wrapping around and a smaller allocation being made than the
> >> caller was expecting. Using those allocations could lead to linear
> >> overflows of heap memory and other misbehaviors.
> >>
> >> In this case these are not actually dynamic sizes: all the operands
> >> involved in the calculation are constant values. However it is better to
> >> refactor them anyway, just to keep the open-coded math idiom out of
> >> code.
> >>
> >> So, add at the end of the struct i915_syncmap a union with two flexible
> >> array members (these arrays share the same memory layout). This is
> >> possible using the new DECLARE_FLEX_ARRAY macro. And then, use the
> >> struct_size() helper to do the arithmetic instead of the argument
> >> "size + count * size" in the kmalloc and kzalloc() functions.
> >>
> >> Also, take the opportunity to refactor the __sync_seqno and __sync_child
> >> making them more readable.
> >>
> >> This code was detected with the help of Coccinelle and audited and fixed
> >> manually.
> >>
> >> [1] 
> >> https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
> >>
> >> Signed-off-by: Len Baker 
> >> ---
> >>  drivers/gpu/drm/i915/i915_syncmap.c | 12 
> >>  1 file changed, 8 insertions(+), 4 deletions(-)
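
For reference, the refactoring pattern Len describes above translates
roughly to this (an illustrative struct, not the actual i915_syncmap
layout):

	#include <linux/overflow.h>	/* struct_size() */
	#include <linux/stddef.h>	/* DECLARE_FLEX_ARRAY() */

	struct example {
		unsigned int height;
		union {
			/* Two flexible arrays sharing the same storage. */
			DECLARE_FLEX_ARRAY(u32, seqno);
			DECLARE_FLEX_ARRAY(struct example *, child);
		};
	};

	/* "kmalloc(sizeof(*p) + count * sizeof(u32), ...)" becomes: */
	struct example *p = kmalloc(struct_size(p, seqno, count), GFP_KERNEL);

The multiplication and addition are then checked for overflow inside
struct_size() instead of being open-coded at the call site.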
> >
> > I received a mail telling me that this patch doesn't build:
> >
> > == Series Details ==
> >
> > Series: drm/i915: Prefer struct_size over open coded arithmetic
> > URL   : https://patchwork.freedesktop.org/series/95408/
> > State : failure
> >
> > But it builds without error against linux-next (tag next-20211001). Against
> > which tree and branch do I need to build?
> 
> drm-tip [1]. It's a sort of linux-next for graphics. I think there are
> still some branches that don't feed to linux-next.

Yeah we need to get gt-next in linux-next asap. Joonas promised to send
out his patch to make that happen in dim.
-Daniel

> 
> BR,
> Jani.
> 
> 
> [1] https://cgit.freedesktop.org/drm/drm-tip
> 
> 
> >
> > Regards,
> > Len
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/4] drm: Introduce drm_modeset_lock_ctx_retry()

2021-10-13 Thread Daniel Vetter
On Mon, Oct 04, 2021 at 02:15:51PM +0300, Ville Syrjälä wrote:
> On Tue, Jul 20, 2021 at 03:44:49PM +0200, Daniel Vetter wrote:
> > On Thu, Jul 15, 2021 at 09:49:51PM +0300, Ville Syrjala wrote:
> > > From: Ville Syrjälä 
> > > 
> > > Quite a few places are hand rolling the modeset lock backoff dance.
> > > Let's suck that into a helper macro that is easier to use without
> > > forgetting some steps.
> > > 
> > > The main downside is probably that the implementation of
> > > drm_with_modeset_lock_ctx() is a bit harder to read than a hand
> > > rolled version on account of being split across three functions,
> > > but the actual code using it ends up being much simpler.
> > > 
> > > Cc: Sean Paul 
> > > Cc: Daniel Vetter 
> > > Signed-off-by: Ville Syrjälä 
> > > ---
> > >  drivers/gpu/drm/drm_modeset_lock.c | 44 ++
> > >  include/drm/drm_modeset_lock.h | 20 ++
> > >  2 files changed, 64 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_modeset_lock.c 
> > > b/drivers/gpu/drm/drm_modeset_lock.c
> > > index fcfe1a03c4a1..083df96632e8 100644
> > > --- a/drivers/gpu/drm/drm_modeset_lock.c
> > > +++ b/drivers/gpu/drm/drm_modeset_lock.c
> > > @@ -425,3 +425,47 @@ int drm_modeset_lock_all_ctx(struct drm_device *dev,
> > >   return 0;
> > >  }
> > >  EXPORT_SYMBOL(drm_modeset_lock_all_ctx);
> > > +
> > > +void _drm_modeset_lock_begin(struct drm_modeset_acquire_ctx *ctx,
> > > +  struct drm_atomic_state *state,
> > > +  unsigned int flags, int *ret)
> > > +{
> > > + drm_modeset_acquire_init(ctx, flags);
> > > +
> > > + if (state)
> > > + state->acquire_ctx = ctx;
> > > +
> > > + *ret = -EDEADLK;
> > > +}
> > > +EXPORT_SYMBOL(_drm_modeset_lock_begin);
> > > +
> > > +bool _drm_modeset_lock_loop(int *ret)
> > > +{
> > > + if (*ret == -EDEADLK) {
> > > + *ret = 0;
> > > + return true;
> > > + }
> > > +
> > > + return false;
> > > +}
> > > +EXPORT_SYMBOL(_drm_modeset_lock_loop);
> > > +
> > > +void _drm_modeset_lock_end(struct drm_modeset_acquire_ctx *ctx,
> > > +struct drm_atomic_state *state,
> > > +int *ret)
> > > +{
> > > + if (*ret == -EDEADLK) {
> > > + if (state)
> > > + drm_atomic_state_clear(state);
> > > +
> > > + *ret = drm_modeset_backoff(ctx);
> > > + if (*ret == 0) {
> > > + *ret = -EDEADLK;
> > > + return;
> > > + }
> > > + }
> > > +
> > > + drm_modeset_drop_locks(ctx);
> > > + drm_modeset_acquire_fini(ctx);
> > > +}
> > > +EXPORT_SYMBOL(_drm_modeset_lock_end);
> > > diff --git a/include/drm/drm_modeset_lock.h 
> > > b/include/drm/drm_modeset_lock.h
> > > index aafd07388eb7..5eaad2533de5 100644
> > > --- a/include/drm/drm_modeset_lock.h
> > > +++ b/include/drm/drm_modeset_lock.h
> > > @@ -26,6 +26,7 @@
> > >  
> > >  #include 
> > >  
> > > +struct drm_atomic_state;
> > >  struct drm_modeset_lock;
> > >  
> > >  /**
> > > @@ -203,4 +204,23 @@ modeset_lock_fail:   
> > > \
> > >   if (!drm_drv_uses_atomic_modeset(dev))  \
> > >   mutex_unlock(&dev->mode_config.mutex);
> > >  
> > > +void _drm_modeset_lock_begin(struct drm_modeset_acquire_ctx *ctx,
> > > +  struct drm_atomic_state *state,
> > > +  unsigned int flags,
> > > +  int *ret);
> > > +bool _drm_modeset_lock_loop(int *ret);
> > > +void _drm_modeset_lock_end(struct drm_modeset_acquire_ctx *ctx,
> > > +struct drm_atomic_state *state,
> > > +int *ret);
> > > +
> > > +/*
> > > + * Note that one must always use "continue" rather than
> > > + * "break" or "return" to handle errors within the
> > > + * drm_modeset_lock_ctx_retry() block.
> > 
> > I'm not sold on loop macros with these kind of restrictions, C just isn't
> > a great language for these. That's why e.g. drm_connector_iter doesn't
> > give you a macro, but only the begin/next/end function calls explicitly.
> 
> We already use this pattern extensively in i915. Gem ww ctx has one,
> power domains/pps/etc. use similar things. It makes the code pretty nice,
> with the slight caveat that an accidental 'break' can ruin your day. But
> so can an accidental return with other constructs (and we even had that
> happen a few times with the connector iterators), so not a dealbreaker
> IMO.
> 
> So if we don't want this drm wide I guess I can propose this just for
> i915 since it fits in perfectly there.

Well I don't like them for i915 either.

And yes C is dangerous, but also C is verbose. I think one lesson from igt
is that too many magic block constructs are bad, it's just not how C
works. Definitely not in the kernel, where "oops I got it wrong because it
was too clever" is bad.

> > Yes the macro we have is also not nice, but at least it's a screaming
> > macro since it's all uppercase, so
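
For context, given the three helpers in the quoted patch, usage of the
proposed drm_modeset_lock_ctx_retry() presumably expands to a for-loop
along these lines (a sketch; the macro body itself is not quoted in this
thread, and 'crtc'/'state' are illustrative):

	struct drm_modeset_acquire_ctx ctx;
	int ret;

	for (_drm_modeset_lock_begin(&ctx, state, 0, &ret);
	     _drm_modeset_lock_loop(&ret);
	     _drm_modeset_lock_end(&ctx, state, &ret)) {
		ret = drm_modeset_lock(&crtc->mutex, &ctx);
		if (ret)
			continue;	/* 'continue', never 'break'/'return' */

		/* ... modeset work under the lock ... */
	}

On -EDEADLK the end helper backs off and re-arms ret, so the loop retries;
any other error (or success) drops the locks and exits. That is also why an
accidental 'break' is dangerous: it skips the cleanup done in
_drm_modeset_lock_end().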

Re: [Intel-gfx] [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute

2021-10-13 Thread Daniel Vetter
On Wed, Oct 06, 2021 at 10:12:29AM -0700, Matthew Brost wrote:
> On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin 
> > 
> > Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
> > i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
> > priority management") introduced some scheduling related vfuncs which
> > take integer request priority as argument.
> > 
> > Make them instead take struct i915_sched_attr, which is the type
> > encapsulating this information, so it probably aligns with the design
> > better. It definitely enables extending the set of scheduling attributes.
> > 
> 
> Understand the motivation here but the i915_scheduler is going to
> disappear when we move to the DRM scheduler, or at least its functionality
> of priority inheritance will be pushed into the DRM scheduler. I'd be
> very careful making any changes here as the priority in the DRM
> scheduler is defined as single enum:

Yeah I'm not sure it makes sense to build this and make the conversion to
drm/sched even harder. We've already merged a lot of code with a "we'll
totally convert to drm/sched right after" promise, there's not really room
for more fun like this built on top of i915-scheduler.
-Daniel

> 
> /* These are often used as an (initial) index
>  * to an array, and as such should start at 0.
>  */
> enum drm_sched_priority {
> DRM_SCHED_PRIORITY_MIN,
> DRM_SCHED_PRIORITY_NORMAL,
> DRM_SCHED_PRIORITY_HIGH,
> DRM_SCHED_PRIORITY_KERNEL,
> 
> DRM_SCHED_PRIORITY_COUNT,
> DRM_SCHED_PRIORITY_UNSET = -2
> };
> 
> Adding a field to the i915_sched_attr is fairly easy as we already have
> a structure but changing the DRM scheduler might be a tougher sell.
> Anyway you can make this work without adding the 'nice' field to
> i915_sched_attr? Might be worth exploring so when we move to the DRM
> scheduler this feature drops in a little cleaner.
> 
> Matt
> 
> > Signed-off-by: Tvrtko Ursulin 
> > Cc: Matthew Brost 
> > Cc: Daniele Ceraolo Spurio 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c| 3 ++-
> >  drivers/gpu/drm/i915/i915_scheduler.c| 4 ++--
> >  drivers/gpu/drm/i915/i915_scheduler_types.h  | 4 ++--
> >  4 files changed, 9 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index 7147fe80919e..e91d803a6453 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs 
> > *engine)
> > return engine->class != RENDER_CLASS;
> >  }
> >  
> > -static void kick_execlists(const struct i915_request *rq, int prio)
> > +static void kick_execlists(const struct i915_request *rq,
> > +  const struct i915_sched_attr *attr)
> >  {
> > struct intel_engine_cs *engine = rq->engine;
> > struct i915_sched_engine *sched_engine = engine->sched_engine;
> > const struct i915_request *inflight;
> > +   const int prio = attr->priority;
> >  
> > /*
> >  * We only need to kick the tasklet once for the high priority
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index ba0de35f6323..b5883a4365ca 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct 
> > intel_engine_cs *engine)
> >  }
> >  
> >  static void guc_bump_inflight_request_prio(struct i915_request *rq,
> > -  int prio)
> > +  const struct i915_sched_attr *attr)
> >  {
> > struct intel_context *ce = rq->context;
> > +   const int prio = attr->priority;
> > u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
> >  
> > /* Short circuit function */
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
> > b/drivers/gpu/drm/i915/i915_scheduler.c
> > index 762127dd56c5..534bab99fcdc 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > @@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node 
> > *node,
> >  
> > /* Must be called before changing the nodes priority */
> > if (sched_engine->bump_inflight_request_prio)
> > -   sched_engine->bump_inflight_request_prio(from, prio);
> > +   sched_engine->bump_inflight_request_prio(from, attr);
> >  
> > WRITE_ONCE(node->attr.priority, prio);
> >  
> > @@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node 
> > *node,
> >  
> > /* Defer (tasklet) submission until after all of

Re: [RFC PATCH] drm: Increase DRM_OBJECT_MAX_PROPERTY by 18.

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 08:51:51AM +0200, Sebastian Andrzej Siewior wrote:
> The warning popped up; it says to increase the limit by the number of
> occurrences. I saw it 18 times, so here it is.
> It started to show up since commit
>2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
> property")
> 
> Increase DRM_OBJECT_MAX_PROPERTY by 18.
> 
> Signed-off-by: Sebastian Andrzej Siewior 

Which driver, where? Whoever added that into upstream should also have
realized this (things will just not work) and include it in there. So if
things are tested correctly this should be part of a larger series to add
these 18 props somewhere.

Also maybe we should just dynamically allocate this array if people have
this many properties on their objects.
-Daniel
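
A rough sketch of that dynamic-allocation idea, purely illustrative:
drm_object_properties really uses fixed-size arrays today, so the pointer
members and 'size' field below are hypothetical:

	static int drm_object_grow_properties(struct drm_object_properties *props)
	{
		size_t new_size = props->size ? props->size * 2 : 24;
		struct drm_property **p;
		uint64_t *v;

		p = krealloc_array(props->properties, new_size, sizeof(*p),
				   GFP_KERNEL);
		if (!p)
			return -ENOMEM;
		props->properties = p;

		v = krealloc_array(props->values, new_size, sizeof(*v),
				   GFP_KERNEL);
		if (!v)
			return -ENOMEM;
		props->values = v;

		props->size = new_size;
		return 0;
	}

drm_object_attach_property() would call this instead of warning once the
static limit is hit.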

> ---
> 
> I have no idea whether this is correct or just a symptom of another
> problem. This has been observed with i915 and full debug.
> 
>  include/drm/drm_mode_object.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/drm/drm_mode_object.h b/include/drm/drm_mode_object.h
> index c34a3e8030e12..1e5399e47c3a5 100644
> --- a/include/drm/drm_mode_object.h
> +++ b/include/drm/drm_mode_object.h
> @@ -60,7 +60,7 @@ struct drm_mode_object {
>   void (*free_cb)(struct kref *kref);
>  };
>  
> -#define DRM_OBJECT_MAX_PROPERTY 24
> +#define DRM_OBJECT_MAX_PROPERTY 42
>  /**
>   * struct drm_object_properties - property tracking for &drm_mode_object
>   */
> -- 
> 2.33.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/i915: Handle Intel igfx + Intel dgfx hybrid graphics setup

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 03:05:25PM +0200, Thomas Hellström wrote:
> Hi, Tvrtko,
> 
> On 10/5/21 13:31, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin 
> > 
> > In short this makes i915 work for hybrid setups (DRI_PRIME=1 with Mesa)
> > when rendering is done on Intel dgfx and scanout/composition on Intel
> > igfx.
> > 
> > Before this patch the driver was not quite ready for that setup, mainly
> > because it was able to emit a semaphore wait between the two GPUs, which
> > results in deadlocks because semaphore target location in HWSP is neither
> > shared between the two, nor mapped in both GGTT spaces.
> > 
> > To fix it the patch adds an additional check to a couple of relevant code
> > paths in order to prevent using semaphores for inter-engine
> > synchronisation when relevant objects are not in the same GGTT space.
> > 
> > v2:
> >   * Avoid adding rq->i915. (Chris)
> > 
> > v3:
> >   * Use GGTT which describes the limit more precisely.
> > 
> > Signed-off-by: Tvrtko Ursulin 
> > Cc: Daniel Vetter 
> > Cc: Matthew Auld 
> > Cc: Thomas Hellström 
> 
> An IMO pretty important bugfix. I read up a bit on the previous discussion
> on this, and from what I understand the other two options were
> 
> 1) Ripping out the semaphore code,
> 2) Consider dma-fences from other instances of the same driver as foreign.
> 
> For imported dma-bufs we do 2), but particularly with lmem and p2p that's a
> more straightforward decision.
> 
> I don't think 1) is a reasonable approach to fix this bug, (but perhaps as a
> general cleanup?), and for 2) yes I guess we might end up doing that, unless
> we find some real benefits in treating same-driver-separate-device
> dma-fences as local, but for this particular bug, IMO this is a reasonable
> fix.

The foreign dma-fences have uapi impact, which Tvrtko shrugged off as
"it's a good idea", and no, it's really not. So we still need to do
this properly.

> Reviewed-by: Thomas Hellström 

But I'm also ok with just merging this as-is so the situation doesn't
become too entertaining.
-Daniel

> 
> 
> 
> 
> 
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 12 +++-
> >   1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c 
> > b/drivers/gpu/drm/i915/i915_request.c
> > index 79da5eca60af..4f189982f67e 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1145,6 +1145,12 @@ __emit_semaphore_wait(struct i915_request *to,
> > return 0;
> >   }
> > +static bool
> > +can_use_semaphore_wait(struct i915_request *to, struct i915_request *from)
> > +{
> > +   return to->engine->gt->ggtt == from->engine->gt->ggtt;
> > +}
> > +
> >   static int
> >   emit_semaphore_wait(struct i915_request *to,
> > struct i915_request *from,
> > @@ -1153,6 +1159,9 @@ emit_semaphore_wait(struct i915_request *to,
> > const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask;
> > struct i915_sw_fence *wait = &to->submit;
> > +   if (!can_use_semaphore_wait(to, from))
> > +   goto await_fence;
> > +
> > if (!intel_context_use_semaphores(to->context))
> > goto await_fence;
> > @@ -1256,7 +1265,8 @@ __i915_request_await_execution(struct i915_request 
> > *to,
> >  * immediate execution, and so we must wait until it reaches the
> >  * active slot.
> >  */
> > -   if (intel_engine_has_semaphores(to->engine) &&
> > +   if (can_use_semaphore_wait(to, from) &&
> > +   intel_engine_has_semaphores(to->engine) &&
> > !i915_request_has_initial_breadcrumb(to)) {
> > err = __emit_semaphore_wait(to, from, from->fence.seqno - 1);
> > if (err < 0)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/4] drm: Introduce drm_modeset_lock_ctx_retry()

2021-10-13 Thread Ville Syrjälä
On Wed, Oct 13, 2021 at 01:59:47PM +0200, Daniel Vetter wrote:
> On Mon, Oct 04, 2021 at 02:15:51PM +0300, Ville Syrjälä wrote:
> > On Tue, Jul 20, 2021 at 03:44:49PM +0200, Daniel Vetter wrote:
> > > On Thu, Jul 15, 2021 at 09:49:51PM +0300, Ville Syrjala wrote:
> > > > From: Ville Syrjälä 
> > > > 
> > > > Quite a few places are hand rolling the modeset lock backoff dance.
> > > > Let's suck that into a helper macro that is easier to use without
> > > > forgetting some steps.
> > > > 
> > > > The main downside is probably that the implementation of
> > > > drm_with_modeset_lock_ctx() is a bit harder to read than a hand
> > > > rolled version on account of being split across three functions,
> > > > but the actual code using it ends up being much simpler.
> > > > 
> > > > Cc: Sean Paul 
> > > > Cc: Daniel Vetter 
> > > > Signed-off-by: Ville Syrjälä 
> > > > ---
> > > >  drivers/gpu/drm/drm_modeset_lock.c | 44 ++
> > > >  include/drm/drm_modeset_lock.h | 20 ++
> > > >  2 files changed, 64 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_modeset_lock.c 
> > > > b/drivers/gpu/drm/drm_modeset_lock.c
> > > > index fcfe1a03c4a1..083df96632e8 100644
> > > > --- a/drivers/gpu/drm/drm_modeset_lock.c
> > > > +++ b/drivers/gpu/drm/drm_modeset_lock.c
> > > > @@ -425,3 +425,47 @@ int drm_modeset_lock_all_ctx(struct drm_device 
> > > > *dev,
> > > > return 0;
> > > >  }
> > > >  EXPORT_SYMBOL(drm_modeset_lock_all_ctx);
> > > > +
> > > > +void _drm_modeset_lock_begin(struct drm_modeset_acquire_ctx *ctx,
> > > > +struct drm_atomic_state *state,
> > > > +unsigned int flags, int *ret)
> > > > +{
> > > > +   drm_modeset_acquire_init(ctx, flags);
> > > > +
> > > > +   if (state)
> > > > +   state->acquire_ctx = ctx;
> > > > +
> > > > +   *ret = -EDEADLK;
> > > > +}
> > > > +EXPORT_SYMBOL(_drm_modeset_lock_begin);
> > > > +
> > > > +bool _drm_modeset_lock_loop(int *ret)
> > > > +{
> > > > +   if (*ret == -EDEADLK) {
> > > > +   *ret = 0;
> > > > +   return true;
> > > > +   }
> > > > +
> > > > +   return false;
> > > > +}
> > > > +EXPORT_SYMBOL(_drm_modeset_lock_loop);
> > > > +
> > > > +void _drm_modeset_lock_end(struct drm_modeset_acquire_ctx *ctx,
> > > > +  struct drm_atomic_state *state,
> > > > +  int *ret)
> > > > +{
> > > > +   if (*ret == -EDEADLK) {
> > > > +   if (state)
> > > > +   drm_atomic_state_clear(state);
> > > > +
> > > > +   *ret = drm_modeset_backoff(ctx);
> > > > +   if (*ret == 0) {
> > > > +   *ret = -EDEADLK;
> > > > +   return;
> > > > +   }
> > > > +   }
> > > > +
> > > > +   drm_modeset_drop_locks(ctx);
> > > > +   drm_modeset_acquire_fini(ctx);
> > > > +}
> > > > +EXPORT_SYMBOL(_drm_modeset_lock_end);
> > > > diff --git a/include/drm/drm_modeset_lock.h 
> > > > b/include/drm/drm_modeset_lock.h
> > > > index aafd07388eb7..5eaad2533de5 100644
> > > > --- a/include/drm/drm_modeset_lock.h
> > > > +++ b/include/drm/drm_modeset_lock.h
> > > > @@ -26,6 +26,7 @@
> > > >  
> > > >  #include 
> > > >  
> > > > +struct drm_atomic_state;
> > > >  struct drm_modeset_lock;
> > > >  
> > > >  /**
> > > > @@ -203,4 +204,23 @@ modeset_lock_fail: 
> > > > \
> > > > if (!drm_drv_uses_atomic_modeset(dev))  
> > > > \
> > > > mutex_unlock(&dev->mode_config.mutex);
> > > >  
> > > > +void _drm_modeset_lock_begin(struct drm_modeset_acquire_ctx *ctx,
> > > > +struct drm_atomic_state *state,
> > > > +unsigned int flags,
> > > > +int *ret);
> > > > +bool _drm_modeset_lock_loop(int *ret);
> > > > +void _drm_modeset_lock_end(struct drm_modeset_acquire_ctx *ctx,
> > > > +  struct drm_atomic_state *state,
> > > > +  int *ret);
> > > > +
> > > > +/*
> > > > + * Note that one must always use "continue" rather than
> > > > + * "break" or "return" to handle errors within the
> > > > + * drm_modeset_lock_ctx_retry() block.
> > > 
> > > I'm not sold on loop macros with these kind of restrictions, C just isn't
> > > a great language for these. That's why e.g. drm_connector_iter doesn't
> > > give you a macro, but only the begin/next/end function calls explicitly.
> > 
> > We already use this pattern extensively in i915. Gem ww ctx has one,
> > power domains/pps/etc. use similar things. It makes the code pretty nice,
> > with the slight caveat that an accidental 'break' can ruin your day. But
> > so can an accidental return with other constructs (and we even had that
> > happen a few times with the connector iterators), so not a dealbreaker
> > IMO.
> > 
> > 

Re: [PATCH 03/11] drm/i915: Restructure probe to handle multi-tile platforms

2021-10-13 Thread Jani Nikula
On Fri, 08 Oct 2021, Matt Roper  wrote:
> On a multi-tile platform, each tile has its own registers + GGTT space,
> and BAR 0 is extended to cover all of them.  Upcoming patches will start
> exposing the tiles as multiple GTs within a single PCI device.  In
> preparation for supporting such setups, restructure the driver's probe
> code a bit.
>
> Only the primary/root tile is initialized for now; the other tiles will
> be detected and plugged in by future patches once the necessary
> infrastructure is in place to handle them.
>
> Original-author: Abdiel Janulgue
> Cc: Daniele Ceraolo Spurio 
> Cc: Matthew Auld 
> Cc: Joonas Lahtinen 
> Signed-off-by: Daniele Ceraolo Spurio 
> Signed-off-by: Tvrtko Ursulin 
> Signed-off-by: Matt Roper 
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c   | 45 
>  drivers/gpu/drm/i915/gt/intel_gt.h   |  3 ++
>  drivers/gpu/drm/i915/gt/intel_gt_pm.c|  9 -
>  drivers/gpu/drm/i915/gt/intel_gt_types.h |  5 +++
>  drivers/gpu/drm/i915/i915_drv.c  | 20 +--
>  drivers/gpu/drm/i915/intel_uncore.c  | 12 +++
>  drivers/gpu/drm/i915/intel_uncore.h  |  3 +-
>  7 files changed, 76 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
> b/drivers/gpu/drm/i915/gt/intel_gt.c
> index 1cb1948ac959..f4bea1f1de77 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -900,6 +900,51 @@ u32 intel_gt_read_register_fw(struct intel_gt *gt, 
> i915_reg_t reg)
>   return intel_uncore_read_fw(gt->uncore, reg);
>  }
>  
> +static int
> +tile_setup(struct intel_gt *gt, unsigned int id, phys_addr_t phys_addr)
> +{
> + int ret;
> +
> + intel_uncore_init_early(gt->uncore, gt->i915);
> +
> + ret = intel_uncore_setup_mmio(gt->uncore, phys_addr);
> + if (ret)
> + return ret;
> +
> + gt->phys_addr = phys_addr;
> +
> + return 0;
> +}
> +
> +static void tile_cleanup(struct intel_gt *gt)
> +{
> + intel_uncore_cleanup_mmio(gt->uncore);
> +}
> +
> +int intel_probe_gts(struct drm_i915_private *i915)
> +{
> + struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
> + phys_addr_t phys_addr;
> + unsigned int mmio_bar;
> + int ret;
> +
> + mmio_bar = GRAPHICS_VER(i915) == 2 ? 1 : 0;
> + phys_addr = pci_resource_start(pdev, mmio_bar);
> +
> + /* We always have at least one primary GT on any device */
> + ret = tile_setup(&i915->gt, 0, phys_addr);
> + if (ret)
> + return ret;
> +
> + /* TODO: add more tiles */
> + return 0;
> +}
> +
> +void intel_gts_release(struct drm_i915_private *i915)
> +{
> + tile_cleanup(&i915->gt);
> +}

Please call the functions intel_gt_*.

BR,
Jani.



> +
>  void intel_gt_info_print(const struct intel_gt_info *info,
>struct drm_printer *p)
>  {
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
> b/drivers/gpu/drm/i915/gt/intel_gt.h
> index 74e771871a9b..f4f35a70cbe4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.h
> @@ -85,6 +85,9 @@ static inline bool intel_gt_needs_read_steering(struct 
> intel_gt *gt,
>  
>  u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
>  
> +int intel_probe_gts(struct drm_i915_private *i915);
> +void intel_gts_release(struct drm_i915_private *i915);
> +
>  void intel_gt_info_print(const struct intel_gt_info *info,
>struct drm_printer *p);
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
> b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> index 524eaf678790..76f498edb0d5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> @@ -126,7 +126,14 @@ static const struct intel_wakeref_ops wf_ops = {
>  
>  void intel_gt_pm_init_early(struct intel_gt *gt)
>  {
> - intel_wakeref_init(>->wakeref, gt->uncore->rpm, &wf_ops);
> + /*
> +  * We access the runtime_pm structure via gt->i915 here rather than
> +  * gt->uncore as we do elsewhere in the file because gt->uncore is not
> +  * yet initialized for all tiles at this point in the driver startup.
> +  * runtime_pm is per-device rather than per-tile, so this is still the
> +  * correct structure.
> +  */
> + intel_wakeref_init(>->wakeref, >->i915->runtime_pm, &wf_ops);
>   seqcount_mutex_init(>->stats.lock, >->wakeref.mutex);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
> b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index 14216cc471b1..66143316d92e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -180,6 +180,11 @@ struct intel_gt {
>  
>   const struct intel_mmio_range *steering_table[NUM_STEERING_TYPES];
>  
> + /*
> +  * Base of per-tile GTTMMADR where we can derive the MMIO and the GGTT.
> +  */
> + phys_addr_t phys_addr;
> +
>   struct intel_gt_info {
>   intel_engine_mask_t engine_mask;

[PATCH] drm/bridge: nwl-dsi: Move bridge add/remove to dsi callbacks

2021-10-13 Thread Guido Günther
Move the panel lookup and drm_bridge_{add,remove}() from the bridge
callbacks to the DSI host callbacks to make sure we don't indicate
readiness to participate in the display pipeline before the panel is
attached.

This was prompted by commit fb8d617f8fd6 ("drm/bridge: Centralize error
message when bridge attach fails") which triggered

  [drm:drm_bridge_attach] *ERROR* failed to attach bridge 
/soc@0/bus@3080/mipi-dsi@30a0  to encoder None-34: -517

during boot.

Signed-off-by: Guido Günther 
---
This was prompted by the discussion at
https://lore.kernel.org/dri-devel/00493cc61d1443dab1c131c46c5890f95f6f9b25.1634068657.git@sigxcpu.org/

 drivers/gpu/drm/bridge/nwl-dsi.c | 64 ++--
 1 file changed, 37 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/bridge/nwl-dsi.c b/drivers/gpu/drm/bridge/nwl-dsi.c
index a7389a0facfb..77aa6f13afef 100644
--- a/drivers/gpu/drm/bridge/nwl-dsi.c
+++ b/drivers/gpu/drm/bridge/nwl-dsi.c
@@ -355,6 +355,9 @@ static int nwl_dsi_host_attach(struct mipi_dsi_host 
*dsi_host,
 {
struct nwl_dsi *dsi = container_of(dsi_host, struct nwl_dsi, dsi_host);
struct device *dev = dsi->dev;
+   struct drm_bridge *panel_bridge;
+   struct drm_panel *panel;
+   int ret;
 
DRM_DEV_INFO(dev, "lanes=%u, format=0x%x flags=0x%lx\n", device->lanes,
 device->format, device->mode_flags);
@@ -362,10 +365,43 @@ static int nwl_dsi_host_attach(struct mipi_dsi_host 
*dsi_host,
if (device->lanes < 1 || device->lanes > 4)
return -EINVAL;
 
+   ret = drm_of_find_panel_or_bridge(dsi->dev->of_node, 1, 0, &panel,
+ &panel_bridge);
+   if (ret)
+   return ret;
+
+   if (panel) {
+   panel_bridge = drm_panel_bridge_add(panel);
+   if (IS_ERR(panel_bridge))
+   return PTR_ERR(panel_bridge);
+   }
+   if (!panel_bridge)
+   return -EPROBE_DEFER;
+
+   dsi->panel_bridge = panel_bridge;
dsi->lanes = device->lanes;
dsi->format = device->format;
dsi->dsi_mode_flags = device->mode_flags;
 
+   /*
+* The DSI output has been properly configured, we can now safely
+* register the input to the bridge framework so that it can take place
+* in a display pipeline.
+*/
+   drm_bridge_add(&dsi->bridge);
+
+   return 0;
+}
+
+static int nwl_dsi_host_detach(struct mipi_dsi_host *dsi_host,
+  struct mipi_dsi_device *dev)
+{
+   struct nwl_dsi *dsi = container_of(dsi_host, struct nwl_dsi, dsi_host);
+
+   drm_bridge_remove(&dsi->bridge);
+   if (dsi->panel_bridge)
+   drm_panel_bridge_remove(dsi->panel_bridge);
+
return 0;
 }
 
@@ -632,6 +668,7 @@ static ssize_t nwl_dsi_host_transfer(struct mipi_dsi_host 
*dsi_host,
 
 static const struct mipi_dsi_host_ops nwl_dsi_host_ops = {
.attach = nwl_dsi_host_attach,
+   .detach = nwl_dsi_host_detach,
.transfer = nwl_dsi_host_transfer,
 };
 
@@ -910,35 +947,11 @@ static int nwl_dsi_bridge_attach(struct drm_bridge 
*bridge,
 enum drm_bridge_attach_flags flags)
 {
struct nwl_dsi *dsi = bridge_to_dsi(bridge);
-   struct drm_bridge *panel_bridge;
-   struct drm_panel *panel;
-   int ret;
-
-   ret = drm_of_find_panel_or_bridge(dsi->dev->of_node, 1, 0, &panel,
- &panel_bridge);
-   if (ret)
-   return ret;
-
-   if (panel) {
-   panel_bridge = drm_panel_bridge_add(panel);
-   if (IS_ERR(panel_bridge))
-   return PTR_ERR(panel_bridge);
-   }
-   dsi->panel_bridge = panel_bridge;
-
-   if (!dsi->panel_bridge)
-   return -EPROBE_DEFER;
 
return drm_bridge_attach(bridge->encoder, dsi->panel_bridge, bridge,
 flags);
 }
 
-static void nwl_dsi_bridge_detach(struct drm_bridge *bridge)
-{
-   struct nwl_dsi *dsi = bridge_to_dsi(bridge);
-
-   drm_of_panel_bridge_remove(dsi->dev->of_node, 1, 0);
-}
-
 static u32 *nwl_bridge_atomic_get_input_bus_fmts(struct drm_bridge *bridge,
 struct drm_bridge_state 
*bridge_state,
 struct drm_crtc_state 
*crtc_state,
@@ -984,7 +997,6 @@ static const struct drm_bridge_funcs nwl_dsi_bridge_funcs = 
{
.mode_set   = nwl_dsi_bridge_mode_set,
.mode_valid = nwl_dsi_bridge_mode_valid,
.attach = nwl_dsi_bridge_attach,
-   .detach = nwl_dsi_bridge_detach,
 };
 
 static int nwl_dsi_parse_dt(struct nwl_dsi *dsi)
@@ -1210,7 +1222,6 @@ static int nwl_dsi_probe(struct platform_device *pdev)
return ret;
}
 
-   drm_bridge_add(&dsi->bridge);
return 0;
 }
 
@@ -1220,7 +12

Re: [PATCH v2 00/34] component: Make into an aggregate bus

2021-10-13 Thread Daniel Vetter
On Thu, Oct 07, 2021 at 04:46:22PM -0400, Stephen Boyd wrote:
> Quoting Andrzej Hajda (2021-10-07 03:16:27)
> > Hi Stephen,
> >
> > On 06.10.2021 21:37, Stephen Boyd wrote:
> > > This series is from discussion we had on reordering the device lists for
> > > drm shutdown paths[1]. I've introduced an 'aggregate' bus that we put
> > > the aggregate device onto and then we probe the aggregate device once
> > > all the components are probed and call component_add(). The probe/remove
> > > hooks are where the bind/unbind calls go, and then a shutdown hook is
> > > added that can be used to shutdown the drm display pipeline at the right
> > > time.
> > >
> > > This works for me on my sc7180 board. I no longer get a warning from i2c
> > > at shutdown that we're trying to make an i2c transaction after the i2c
> > > bus has been shutdown. There's more work to do on the msm drm driver to
> > > extract component device resources like clks, regulators, etc. out of
> > > the component bind function into the driver probe but I wanted to move
> > > everything over now in other component drivers before tackling that
> > > problem.
> >
> >
> > As I understand it, you have a DSI host with an i2c-controlled DSI bridge,
> > and the issue is that the bridge is shut down before msm drm. Your solution
> > is to 'adjust' the device order on the pm list.
> > I had a similar issue and solved it locally by adding a notification from
> > the DSI bridge to the DSI host that it has to be removed: mipi_dsi_detach.
> > This notification escalates in the DSI host to component_del, which lets it
> > react properly (see the sketch below).
> >
> > Advantages:
> > - it is local (only involves DSI host and DSI device),
> > - it does not depend on PM internals,
> > - it can be used in other scenarios as well - unbinding DSI device driver
> >
> > Disadvantage:
> > - It is DSI specific (but this is your case), I have advertised some
> > time ago more general approach [1][2].
> >
> > [1]: https://static.sched.com/hosted_files/osseu18/0f/deferred_problem.pdf
> > [2]: https://lwn.net/Articles/625454/
> >
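
A sketch of the notification pattern Andrzej describes, with illustrative
names (my_dsi, host_to_my_dsi() and my_dsi_component_ops are placeholders,
not a real driver):

	static int my_dsi_host_detach(struct mipi_dsi_host *host,
				      struct mipi_dsi_device *device)
	{
		struct my_dsi *priv = host_to_my_dsi(host);

		/* The downstream bridge/panel is going away: withdraw our
		 * component so the aggregate driver unbinds cleanly first.
		 */
		component_del(priv->dev, &my_dsi_component_ops);
		return 0;
	}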
> 
> I think these are all points for or against using the component code in
> general? Maybe you can send patches that you think can solve the problem
> I'm experiencing and we can review them on the list.

Yeah I think this is entirely orthogonal. If you use component, then
component should provide a way to handle this.

If you use something else, like drm_bridge or dsi or whatever, then that
part should provide a solution to stage stuff correctly and handle all the
ordering.

Now there's a bunch of drivers which mix up component with bridge use and
hilarity ensues, but since there's no real effort to fix that I think it's
totally fine to just improve component.c meanwhile.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH RFC] virtio: wrap config->reset calls

2021-10-13 Thread Michael S. Tsirkin
On Wed, Oct 13, 2021 at 01:03:46PM +0200, David Hildenbrand wrote:
> On 13.10.21 12:55, Michael S. Tsirkin wrote:
> > This will enable cleanups down the road.
> > The idea is to disable cbs, then add "flush_queued_cbs" callback
> > as a parameter, this way drivers can flush any work
> > queued after callbacks have been disabled.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >   arch/um/drivers/virt-pci.c | 2 +-
> >   drivers/block/virtio_blk.c | 4 ++--
> >   drivers/bluetooth/virtio_bt.c  | 2 +-
> >   drivers/char/hw_random/virtio-rng.c| 2 +-
> >   drivers/char/virtio_console.c  | 4 ++--
> >   drivers/crypto/virtio/virtio_crypto_core.c | 8 
> >   drivers/firmware/arm_scmi/virtio.c | 2 +-
> >   drivers/gpio/gpio-virtio.c | 2 +-
> >   drivers/gpu/drm/virtio/virtgpu_kms.c   | 2 +-
> >   drivers/i2c/busses/i2c-virtio.c| 2 +-
> >   drivers/iommu/virtio-iommu.c   | 2 +-
> >   drivers/net/caif/caif_virtio.c | 2 +-
> >   drivers/net/virtio_net.c   | 4 ++--
> >   drivers/net/wireless/mac80211_hwsim.c  | 2 +-
> >   drivers/nvdimm/virtio_pmem.c   | 2 +-
> >   drivers/rpmsg/virtio_rpmsg_bus.c   | 2 +-
> >   drivers/scsi/virtio_scsi.c | 2 +-
> >   drivers/virtio/virtio.c| 5 +
> >   drivers/virtio/virtio_balloon.c| 2 +-
> >   drivers/virtio/virtio_input.c  | 2 +-
> >   drivers/virtio/virtio_mem.c| 2 +-
> >   fs/fuse/virtio_fs.c| 4 ++--
> >   include/linux/virtio.h | 1 +
> >   net/9p/trans_virtio.c  | 2 +-
> >   net/vmw_vsock/virtio_transport.c   | 4 ++--
> >   sound/virtio/virtio_card.c | 4 ++--
> >   26 files changed, 39 insertions(+), 33 deletions(-)
> > 
> > diff --git a/arch/um/drivers/virt-pci.c b/arch/um/drivers/virt-pci.c
> > index c08066633023..22c4d87c9c15 100644
> > --- a/arch/um/drivers/virt-pci.c
> > +++ b/arch/um/drivers/virt-pci.c
> > @@ -616,7 +616,7 @@ static void um_pci_virtio_remove(struct virtio_device 
> > *vdev)
> > int i;
> >   /* Stop all virtqueues */
> > -vdev->config->reset(vdev);
> > +virtio_reset_device(vdev);
> >   vdev->config->del_vqs(vdev);
> 
> Nit: virtio_device_reset()?
> 
> Because I see:
> 
> int virtio_device_freeze(struct virtio_device *dev);
> int virtio_device_restore(struct virtio_device *dev);
> void virtio_device_ready(struct virtio_device *dev)
> 
> But well, there is:
> void virtio_break_device(struct virtio_device *dev);

Exactly. I don't know what's best, so I opted for plain English :)


> -- 
> Thanks,
> 
> David / dhildenb
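
The new helper itself is not shown in the quoted hunks; presumably it is
just a thin wrapper along these lines (a sketch, not necessarily the final
form):

	void virtio_reset_device(struct virtio_device *dev)
	{
		dev->config->reset(dev);
	}
	EXPORT_SYMBOL_GPL(virtio_reset_device);

The point of the indirection is that a later change can disable callbacks
and flush queued work inside the wrapper without touching every caller.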



Re: [PATCH v2] dma-buf: remove restriction of IOCTL:DMA_BUF_SET_NAME

2021-10-13 Thread Christian König

On 13.10.21 at 01:56, Sumit Semwal wrote:
> Hello Guangming, Christian,
> 
> On Tue, 12 Oct 2021, 14:09, Guangming Cao wrote:
> > From: Guangming Cao <guangming@mediatek.com>
> > 
> > > On 09.10.21 at 07:55, guangming@mediatek.com wrote:
> > > > From: Guangming Cao <guangming@mediatek.com>
> > > > 
> > > > If dma-buf doesn't want userspace users to touch the dmabuf buffer,
> > > > it seems we should add this restriction into dma_buf_ops.mmap,
> > > > not in this IOCTL:DMA_BUF_SET_NAME.
> > > > 
> > > > With this restriction, we can only know the kernel users of the dmabuf
> > > > by attachments.
> > > > However, many userspace users, such as userspace users of dma_heap,
> > > > also need to mark the usage of a dma-buf, and they don't care about
> > > > who attached to this dmabuf, so there seems to be no reason to make
> > > > them wait for IOCTL:DMA_BUF_SET_NAME rather than mmap.
> > > 
> > > Sounds valid to me, but I have no idea why this restriction was added in
> > > the first place.
> > > 
> > > Can you double check the git history and maybe identify when that was
> > > added? Mentioning this change in the commit message then might make
> > > things a bit easier to understand.
> > > 
> > > Thanks,
> > > Christian.
> > 
> > It was added in this patch:
> > https://patchwork.freedesktop.org/patch/310349/
> > However, there is no explanation of it there.
> > I guess it wants users to set_name while there are no attachments on the
> > dmabuf; for the case with attachments, we can find the owner by device in
> > the attachments. But as I said in the commit message, that might not be a
> > good idea.
> > 
> > Do you have any idea?
> 
> For the original series, the idea was that allowing a name change mid-use
> could confuse the users about the dma-buf. However, the rest of the series
> also makes sure each dma-buf has a unique inode, and any accounting should
> probably use that, without relying on the name as much.
> 
> So I don't have an objection to this change.

I suggest to add that explanation and the original commit id into the
commit message.

With that changed the patch has my rb as well.

Regards,
Christian.

> Best,
> Sumit.
> 
> > > > Signed-off-by: Guangming Cao <guangming@mediatek.com>
> > > > ---
> > > >   drivers/dma-buf/dma-buf.c | 14 ++
> > > >   1 file changed, 2 insertions(+), 12 deletions(-)
> > > > 
> > > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > > > index 511fe0d217a0..db2f4efdec32 100644
> > > > --- a/drivers/dma-buf/dma-buf.c
> > > > +++ b/drivers/dma-buf/dma-buf.c
> > > > @@ -325,10 +325,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
> > > > 
> > > >   /**
> > > >    * dma_buf_set_name - Set a name to a specific dma_buf to track the usage.
> > > > - * The name of the dma-buf buffer can only be set when the dma-buf is not
> > > > - * attached to any devices. It could theoritically support changing the
> > > > - * name of the dma-buf if the same piece of memory is used for multiple
> > > > - * purpose between different devices.
> > > > + * It could theoretically support changing the name of the dma-buf if the same
> > > > + * piece of memory is used for multiple purpose between different devices.
> > > >    *
> > > >    * @dmabuf: [in]     dmabuf buffer that will be renamed.
> > > >    * @buf:    [in]     A piece of userspace memory that contains the name of
> > > > @@ -346,19 +344,11 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
> > > >     if (IS_ERR(name))
> > > >             return PTR_ERR(name);
> > > > 
> > > > -   dma_resv_lock(dmabuf->resv, NULL);
> > > > -   if (!list_empty(&dmabuf->attachments)) {
> > > > -           ret = -EBUSY;
> > > > -           kfree(name);
> > > > -           goto out_unlock;
> > > > -   }
> > > >     spin_lock(&dmabuf->name_lock);
> > > >     kfree(dmabuf->name);
> > > >     dmabuf->name = name;
> > > >     spin_unlock(&dmabuf->name_lock);
> > > > 
> > > > -out_unlock:
> > > > -   dma_resv_unlock(dmabuf->resv);
> > > >     return ret;
> > > >   }




Re: [PATCH v2 01/34] component: Introduce struct aggregate_device

2021-10-13 Thread Daniel Vetter
On Wed, Oct 06, 2021 at 12:37:46PM -0700, Stephen Boyd wrote:
> Replace 'struct master' with 'struct aggregate_device' and then rename
> 'master' to 'adev' everywhere in the code. While we're here, put a
> struct device inside the aggregate device so that we can register it
> with a bus_type in the next patch.
> 
> The diff is large but that's because this is mostly a rename, where
> sometimes 'master' is replaced with 'adev' and other times it is
> replaced with 'parent' to indicate that the struct device that was being
> used is actually the parent of the aggregate device and driver.
> 
> Cc: Daniel Vetter 
> Cc: "Rafael J. Wysocki" 
> Cc: Rob Clark 
> Cc: Russell King 
> Cc: Saravana Kannan 
> Signed-off-by: Stephen Boyd 

This adds device model stuff, please cc Greg KH and ask him to review
this. Maybe also an ack from Rafael would be good whether this makes
sense.

Once we have that I think we can then go&collect acks/review for all the
driver changes and get this sorted. Thanks a lot for pushing this forward.
-Daniel

> ---
>  drivers/base/component.c  | 250 --
>  include/linux/component.h |   2 +-
>  2 files changed, 134 insertions(+), 118 deletions(-)
> 
> diff --git a/drivers/base/component.c b/drivers/base/component.c
> index 5e79299f6c3f..0a41bbe14981 100644
> --- a/drivers/base/component.c
> +++ b/drivers/base/component.c
> @@ -9,6 +9,7 @@
>   */
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -58,18 +59,21 @@ struct component_match {
>   struct component_match_array *compare;
>  };
>  
> -struct master {
> +struct aggregate_device {
>   struct list_head node;
>   bool bound;
>  
>   const struct component_master_ops *ops;
>   struct device *parent;
> + struct device dev;
>   struct component_match *match;
> +
> + int id;
>  };
>  
>  struct component {
>   struct list_head node;
> - struct master *master;
> + struct aggregate_device *adev;
>   bool bound;
>  
>   const struct component_ops *ops;
> @@ -79,7 +83,9 @@ struct component {
>  
>  static DEFINE_MUTEX(component_mutex);
>  static LIST_HEAD(component_list);
> -static LIST_HEAD(masters);
> +static LIST_HEAD(aggregate_devices);
> +
> +static DEFINE_IDA(aggregate_ida);
>  
>  #ifdef CONFIG_DEBUG_FS
>  
> @@ -87,12 +93,12 @@ static struct dentry *component_debugfs_dir;
>  
>  static int component_devices_show(struct seq_file *s, void *data)
>  {
> - struct master *m = s->private;
> + struct aggregate_device *m = s->private;
>   struct component_match *match = m->match;
>   size_t i;
>  
>   mutex_lock(&component_mutex);
> - seq_printf(s, "%-40s %20s\n", "master name", "status");
> + seq_printf(s, "%-40s %20s\n", "aggregate_device name", "status");
>   seq_puts(s, 
> "-\n");
>   seq_printf(s, "%-40s %20s\n\n",
>  dev_name(m->parent), m->bound ? "bound" : "not bound");
> @@ -122,46 +128,46 @@ static int __init component_debug_init(void)
>  
>  core_initcall(component_debug_init);
>  
> -static void component_master_debugfs_add(struct master *m)
> +static void component_master_debugfs_add(struct aggregate_device *m)
>  {
>   debugfs_create_file(dev_name(m->parent), 0444, component_debugfs_dir, m,
>   &component_devices_fops);
>  }
>  
> -static void component_master_debugfs_del(struct master *m)
> +static void component_master_debugfs_del(struct aggregate_device *m)
>  {
>   debugfs_remove(debugfs_lookup(dev_name(m->parent), 
> component_debugfs_dir));
>  }
>  
>  #else
>  
> -static void component_master_debugfs_add(struct master *m)
> +static void component_master_debugfs_add(struct aggregate_device *m)
>  { }
>  
> -static void component_master_debugfs_del(struct master *m)
> +static void component_master_debugfs_del(struct aggregate_device *m)
>  { }
>  
>  #endif
>  
> -static struct master *__master_find(struct device *parent,
> +static struct aggregate_device *__aggregate_find(struct device *parent,
>   const struct component_master_ops *ops)
>  {
> - struct master *m;
> + struct aggregate_device *m;
>  
> - list_for_each_entry(m, &masters, node)
> + list_for_each_entry(m, &aggregate_devices, node)
>   if (m->parent == parent && (!ops || m->ops == ops))
>   return m;
>  
>   return NULL;
>  }
>  
> -static struct component *find_component(struct master *master,
> +static struct component *find_component(struct aggregate_device *adev,
>   struct component_match_array *mc)
>  {
>   struct component *c;
>  
>   list_for_each_entry(c, &component_list, node) {
> - if (c->master && c->master != master)
> + if (c->adev && c->adev != adev)
>   continue;
>  
>   if (mc->compare && mc->compare(c->dev, mc->data))
> @@ -175,101 +181,102 @@ static struct component *

Re: [PATCH] drm/bridge: nwl-dsi: Move bridge add/remove to dsi callbacks

2021-10-13 Thread Jagan Teki
On Wed, Oct 13, 2021 at 5:44 PM Guido Günther  wrote:
>
> Move the panel lookup and drm_bridge_{add,remove}() from the bridge
> callbacks to the DSI host callbacks to make sure we don't indicate
> readiness to participate in the display pipeline before the panel is
> attached.
>
> This was prompted by commit fb8d617f8fd6 ("drm/bridge: Centralize error
> message when bridge attach fails") which triggered
>
>   [drm:drm_bridge_attach] *ERROR* failed to attach bridge 
> /soc@0/bus@3080/mipi-dsi@30a0  to encoder None-34: -517
>
> during boot.
>
> Signed-off-by: Guido Günther 
> ---
> This was prompted by the discussion at
> https://lore.kernel.org/dri-devel/00493cc61d1443dab1c131c46c5890f95f6f9b25.1634068657.git@sigxcpu.org/
>
>  drivers/gpu/drm/bridge/nwl-dsi.c | 64 ++--
>  1 file changed, 37 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/nwl-dsi.c 
> b/drivers/gpu/drm/bridge/nwl-dsi.c
> index a7389a0facfb..77aa6f13afef 100644
> --- a/drivers/gpu/drm/bridge/nwl-dsi.c
> +++ b/drivers/gpu/drm/bridge/nwl-dsi.c
> @@ -355,6 +355,9 @@ static int nwl_dsi_host_attach(struct mipi_dsi_host 
> *dsi_host,
>  {
> struct nwl_dsi *dsi = container_of(dsi_host, struct nwl_dsi, 
> dsi_host);
> struct device *dev = dsi->dev;
> +   struct drm_bridge *panel_bridge;
> +   struct drm_panel *panel;
> +   int ret;
>
> DRM_DEV_INFO(dev, "lanes=%u, format=0x%x flags=0x%lx\n", 
> device->lanes,
>  device->format, device->mode_flags);
> @@ -362,10 +365,43 @@ static int nwl_dsi_host_attach(struct mipi_dsi_host 
> *dsi_host,
> if (device->lanes < 1 || device->lanes > 4)
> return -EINVAL;
>
> +   ret = drm_of_find_panel_or_bridge(dsi->dev->of_node, 1, 0, &panel,
> + &panel_bridge);
> +   if (ret)
> +   return ret;
> +
> +   if (panel) {
> +   panel_bridge = drm_panel_bridge_add(panel);
> +   if (IS_ERR(panel_bridge))
> +   return PTR_ERR(panel_bridge);
> +   }
> +   if (!panel_bridge)
> +   return -EPROBE_DEFER;
> +
> +   dsi->panel_bridge = panel_bridge;
> dsi->lanes = device->lanes;
> dsi->format = device->format;
> dsi->dsi_mode_flags = device->mode_flags;
>
> +   /*
> +* The DSI output has been properly configured, we can now safely
> +* register the input to the bridge framework so that it can take 
> place
> +* in a display pipeline.
> +*/
> +   drm_bridge_add(&dsi->bridge);
> +
> +   return 0;
> +}
> +
> +static int nwl_dsi_host_detach(struct mipi_dsi_host *dsi_host,
> +  struct mipi_dsi_device *dev)
> +{
> +   struct nwl_dsi *dsi = container_of(dsi_host, struct nwl_dsi, 
> dsi_host);
> +
> +   drm_bridge_remove(&dsi->bridge);
> +   if (dsi->panel_bridge)
> +   drm_panel_bridge_remove(dsi->panel_bridge);
> +

If I'm right, this logic will fail to find direct and I2C-based bridges,
as those peripheral bridges try to find the bridge device from
bridge_attach, unlike DSI panels, which are found via host attach
directly from probe.

We ran into a similar issue with the dw-mipi-dsi bridge:
c206c7faeb3263a7cc7b4de443a3877cd7a5e74b

Jagan.


Re: [PATCH] Revert "drm/fb-helper: improve DRM fbdev emulation device names"

2021-10-13 Thread Daniel Vetter
On Fri, Oct 08, 2021 at 02:02:40PM +0300, Ville Syrjälä wrote:
> On Fri, Oct 08, 2021 at 09:17:08AM +0200, Javier Martinez Canillas wrote:
> > This reverts commit b3484d2b03e4c940a9598aa841a52d69729c582a.
> > 
> > That change attempted to improve the DRM drivers fbdev emulation device
> > names to avoid having confusing names like "simpledrmdrmfb" in /proc/fb.
> > 
> > But unfortunately there are user-space programs, such as pm-utils, that
> > query that information and so broke after the mentioned commit. Since
> > the names in /proc/fb are used by programs that consider them an ABI, let's
> > restore the old names even when this leads to silly naming like the one
> > mentioned above.
> 
> The usage Johannes listed was this specificially:
>  using_kms() { grep -q -E '(nouveau|drm)fb' /proc/fb; }   
>  
> 
> So it actually looks like  Daniel's
> commit f243dd06180a ("drm/nouveau: Use drm_fb_helper_fill_info")
> also broke the abi. But for the pm-utils use case at least
> just having the "drmfb" in there should cover even nouveau.
> 
> Cc: sta...@vger.kernel.org
> Reviewed-by: Ville Syrjälä 
> 
> > 
> > Reported-by: Johannes Stezenbach 
> > Signed-off-by: Javier Martinez Canillas 
> > ---
> > 
> >  drivers/gpu/drm/drm_fb_helper.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_fb_helper.c 
> > b/drivers/gpu/drm/drm_fb_helper.c
> > index 3ab07832104..8993b02e783 100644
> > --- a/drivers/gpu/drm/drm_fb_helper.c
> > +++ b/drivers/gpu/drm/drm_fb_helper.c
> > @@ -1737,7 +1737,7 @@ void drm_fb_helper_fill_info(struct fb_info *info,
> >sizes->fb_width, sizes->fb_height);
> >  
> > info->par = fb_helper;
> > -   snprintf(info->fix.id, sizeof(info->fix.id), "%s",

Please add a comment here that drmfb is uapi because pm-utils matches
against it ...

Otherwise this will be lost in time again :-(
-Daniel
> > +   snprintf(info->fix.id, sizeof(info->fix.id), "%sdrmfb",
> >  fb_helper->dev->driver->name);
> >  
> >  }
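
Concretely, the comment being asked for could look something like this (a
sketch, not part of the posted patch):

	info->par = fb_helper;
	/*
	 * The "%sdrmfb" naming is de facto uAPI: pm-utils greps /proc/fb
	 * for "(nouveau|drm)fb", so the suffix must not change.
	 */
	snprintf(info->fix.id, sizeof(info->fix.id), "%sdrmfb",
		 fb_helper->dev->driver->name);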
> > -- 
> > 2.31.1
> 
> -- 
> Ville Syrjälä
> Intel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] dma-buf: Update obsoluted comments on dma_buf_vmap/vunmap()

2021-10-13 Thread Daniel Vetter
On Fri, Oct 08, 2021 at 02:09:41PM +0200, Christian König wrote:
> On 08.10.21 at 13:20, Shunsuke Mie wrote:
> > A comment for the dma_buf_vmap/vunmap() is not catching up a
> > corresponding implementation.
> > 
> > Signed-off-by: Shunsuke Mie 
> 
> Reviewed-by: Christian König 

You're also pushing?
-Daniel

> 
> > ---
> >   drivers/dma-buf/dma-buf.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index beb504a92d60..7b619998f03a 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -1052,8 +1052,8 @@ EXPORT_SYMBOL_GPL(dma_buf_move_notify);
> >*
> >*   Interfaces::
> >*
> > - *  void \*dma_buf_vmap(struct dma_buf \*dmabuf)
> > - *  void dma_buf_vunmap(struct dma_buf \*dmabuf, void \*vaddr)
> > + *  void \*dma_buf_vmap(struct dma_buf \*dmabuf, struct dma_buf_map 
> > \*map)
> > + *  void dma_buf_vunmap(struct dma_buf \*dmabuf, struct dma_buf_map 
> > \*map)
> >*
> >*   The vmap call can fail if there is no vmap support in the exporter, 
> > or if
> >*   it runs out of vmalloc space. Note that the dma-buf layer keeps a 
> > reference
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Maarten Lankhorst
No memory should be allocated when calling i915_gem_object_wait,
because it may be called to idle a BO when evicting memory.

Fix this by using dma_resv_iter helpers to call
i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
Also remove dma_resv_prune; it's questionable.
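
The dma_resv_iter pattern referred to above looks roughly like this (a
sketch, since the patch body is not quoted here; 'resv', 'wait_all',
'flags' and 'timeout' stand in for the real arguments):

	struct dma_resv_iter cursor;
	struct dma_fence *fence;

	dma_resv_iter_begin(&cursor, resv, wait_all);
	dma_resv_for_each_fence_unlocked(&cursor, fence) {
		/* No allocation here, unlike dma_resv_get_fences(). */
		timeout = i915_gem_object_wait_fence(fence, flags, timeout);
		if (timeout <= 0)
			break;
	}
	dma_resv_iter_end(&cursor);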

This will result in the following lockdep splat.

<4> [83.538517] ==
<4> [83.538520] WARNING: possible circular locking dependency detected
<4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
<4> [83.538525] --
<4> [83.538527] gem_render_line/5242 is trying to acquire lock:
<4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
__kmalloc_track_caller+0x56/0x270
<4> [83.538538]
but task is already holding lock:
<4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.538638]
which lock already depends on the new lock.
<4> [83.538642]
the existing dependency chain (in reverse order) is:
<4> [83.538645]
-> #1 (&vm->mutex/1){+.+.}-{3:3}:
<4> [83.538649]lock_acquire+0xd3/0x310
<4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
<4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
<4> [83.538794]ppgtt_init+0x55/0x70 [i915]
<4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
<4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
<4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
<4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
<4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
<4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
<4> [83.539197]pci_device_probe+0x9b/0x110
<4> [83.539201]really_probe+0x1b0/0x3b0
<4> [83.539205]__driver_probe_device+0xf6/0x170
<4> [83.539208]driver_probe_device+0x1a/0x90
<4> [83.539210]__driver_attach+0x93/0x160
<4> [83.539213]bus_for_each_dev+0x72/0xc0
<4> [83.539216]bus_add_driver+0x14b/0x1f0
<4> [83.539220]driver_register+0x66/0xb0
<4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
<4> [83.539227]do_one_initcall+0x53/0x2e0
<4> [83.539230]do_init_module+0x55/0x200
<4> [83.539234]load_module+0x2700/0x2980
<4> [83.539237]__do_sys_finit_module+0xaa/0x110
<4> [83.539241]do_syscall_64+0x37/0xb0
<4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539247]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [83.539251]validate_chain+0xb37/0x1e70
<4> [83.539254]__lock_acquire+0x5a1/0xb70
<4> [83.539258]lock_acquire+0xd3/0x310
<4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
<4> [83.539264]__kmalloc_track_caller+0x56/0x270
<4> [83.539267]krealloc+0x48/0xa0
<4> [83.539270]dma_resv_get_fences+0x1c3/0x280
<4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
<4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
<4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
<4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
<4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
<4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
<4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
<4> [83.539759]drm_ioctl_kernel+0xac/0x140
<4> [83.539763]drm_ioctl+0x201/0x3d0
<4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
<4> [83.539769]do_syscall_64+0x37/0xb0
<4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539775]
other info that might help us debug this:
<4> [83.539778]  Possible unsafe locking scenario:
<4> [83.539781]CPU0CPU1
<4> [83.539783]
<4> [83.539785]   lock(&vm->mutex/1);
<4> [83.539788]lock(fs_reclaim);
<4> [83.539791]lock(&vm->mutex/1);
<4> [83.539794]   lock(fs_reclaim);
<4> [83.539796]
 *** DEADLOCK ***
<4> [83.539799] 3 locks held by gem_render_line/5242:
<4> [83.539802]  #0: c9d4bbf0 
(reservation_ww_class_acquire){+.+.}-{0:0}, at: 
i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
<4> [83.539870]  #1: 88811e48bae8 (reservation_ww_class_mutex){+.+.}-{3:3}, 
at: eb_validate_vmas+0x81/0x8e0 [i915]
<4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.540011]
stack backtrace:
<4> [83.540014] CPU: 2 PID: 5242 Comm: gem_render_line Not tainted 
5.15.0-rc5-CI-Trybot_8062+ #1
<4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, 
BIOS TNTGL357.0038.2020.1124.1648 11/24/2020
<4> [83.540023] Call Trace:
<4> [83.540026]  dump_stack_lvl+0x56/0x7b
<4> [83.540030]  check_noncircular+0x12e/0x150
<4> [83.540034]  ? _raw_spin_unlock_irqrestore+0x50/0x60
<4> [83.540038]  validate_chain+0xb37/0x1e70
<4> [83.540042]  __lock_acquire+0x5a1/0xb70
<4> [83.540046]  lock_acquire+0

Re: [RFC PATCH] drm: Increase DRM_OBJECT_MAX_PROPERTY by 18.

2021-10-13 Thread Sebastian Andrzej Siewior
On 2021-10-13 14:02:59 [+0200], Daniel Vetter wrote:
> On Tue, Oct 05, 2021 at 08:51:51AM +0200, Sebastian Andrzej Siewior wrote:
> > The warning popped up; it says to increase it by the number of occurrences.
> > I saw it 18 times, so here it is.
> > It started to show up since commit
> >2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
> > property")
> > 
> > Increase DRM_OBJECT_MAX_PROPERTY by 18.
> > 
> > Signed-off-by: Sebastian Andrzej Siewior 
> 
> Which driver where? Whomever added that into upstream should also have
> realized this (things will just not work) and include it in there. So if
> things are tested correctly this should be part of a larger series to add
> these 18 props somewhere.

This is on i915 with full debug. If I remember correctly, it wasn't
there before commit
   c7fcbf2513973 ("drm/plane: check that fb_damage is set up when used")

With that commit the box crashed until commit 
   2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
property")

where I then observed this.

> Also maybe we should just dynamically allocate this array if people have
> this many properties on their objects.
> -Daniel

Sebastian
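
For context, the warning in question comes from the fixed-size property
array check in drm_object_attach_property(), roughly the following
(paraphrased sketch, not the verbatim source):

void drm_object_attach_property(struct drm_mode_object *obj,
				struct drm_property *property,
				uint64_t init_val)
{
	int count = obj->properties->count;

	/* properties[] is a fixed array of DRM_OBJECT_MAX_PROPERTY slots */
	if (count == DRM_OBJECT_MAX_PROPERTY) {
		WARN(1, "Failed to attach object property (type: 0x%x). Please "
			"increase DRM_OBJECT_MAX_PROPERTY by 1 for each time "
			"you see this message on the same object type.\n",
			obj->type);
		return;
	}

	obj->properties->properties[count] = property;
	obj->properties->values[count] = init_val;
	obj->properties->count++;
}

Attaches beyond the limit are silently dropped after the WARN, which is
why the extra properties "just not work" rather than fail gracefully.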


Re: [PATCH 1/6] drm/i915: Update dma_fence_work

2021-10-13 Thread Daniel Vetter
On Fri, Oct 08, 2021 at 03:35:25PM +0200, Thomas Hellström wrote:
> Move the release callback to after fence signaling to align with
> what's done for upcoming VM_BIND user-fence signaling.
> 
> Finally call the work callback regardless of whether we have a fence
> error or not and update the existing callbacks accordingly. We will
> need this to intercept the error for failsafe migration.
> 
> Signed-off-by: Thomas Hellström 

I think before we make this thing more complex we really should either
move this into dma-buf/ as a proper thing, or just open-code.

Minimally at least any new async dma_fence worker needs to have
dma_fence_begin/end_signalling annotations, or we're just digging a grave
here.

I'm also not seeing the point in building everything on top of this, for
many cases just an open-coded work_struct should be a lot simpler. It's
just more to clean up later on, that part is for sure.
-Daniel
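
As a concrete reference, a minimal sketch of what such annotations could
look like in this worker, assuming the generic
dma_fence_begin/end_signalling() helpers from drivers/dma-buf (sketch
only, not part of the patch):

static void dma_fence_work_work(struct work_struct *work)
{
	struct dma_fence_work *f = container_of(work, typeof(*f), work);
	bool cookie;

	/*
	 * Everything between begin/end must be safe to run while someone
	 * is waiting on f->dma: no allocations that can recurse into
	 * reclaim, no locks that are also held while waiting on fences.
	 */
	cookie = dma_fence_begin_signalling();

	if (f->ops->work)
		f->ops->work(f);
	dma_fence_work_complete(f);

	dma_fence_end_signalling(cookie);
}

With that in place lockdep can flag deadlock-prone dependency chains at
annotation time, even when the actual deadlock never fires in testing.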

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_clflush.c |  5 +++
>  drivers/gpu/drm/i915/i915_sw_fence_work.c   | 36 ++---
>  drivers/gpu/drm/i915/i915_sw_fence_work.h   |  1 +
>  drivers/gpu/drm/i915/i915_vma.c | 12 +--
>  4 files changed, 33 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> index f0435c6feb68..2143ebaf5b6f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> @@ -28,6 +28,11 @@ static void clflush_work(struct dma_fence_work *base)
>  {
>   struct clflush *clflush = container_of(base, typeof(*clflush), base);
>  
> + if (base->error) {
> + dma_fence_set_error(&base->dma, base->error);
> + return;
> + }
> +
>   __do_clflush(clflush->obj);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c 
> b/drivers/gpu/drm/i915/i915_sw_fence_work.c
> index 5b33ef23d54c..5b55cddafc9b 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
> @@ -6,21 +6,24 @@
>  
>  #include "i915_sw_fence_work.h"
>  
> -static void fence_complete(struct dma_fence_work *f)
> +static void dma_fence_work_complete(struct dma_fence_work *f)
>  {
> + dma_fence_signal(&f->dma);
> +
>   if (f->ops->release)
>   f->ops->release(f);
> - dma_fence_signal(&f->dma);
> +
> + dma_fence_put(&f->dma);
>  }
>  
> -static void fence_work(struct work_struct *work)
> +static void dma_fence_work_work(struct work_struct *work)
>  {
>   struct dma_fence_work *f = container_of(work, typeof(*f), work);
>  
> - f->ops->work(f);
> + if (f->ops->work)
> + f->ops->work(f);
>  
> - fence_complete(f);
> - dma_fence_put(&f->dma);
> + dma_fence_work_complete(f);
>  }
>  
>  static int __i915_sw_fence_call
> @@ -31,17 +34,13 @@ fence_notify(struct i915_sw_fence *fence, enum 
> i915_sw_fence_notify state)
>   switch (state) {
>   case FENCE_COMPLETE:
>   if (fence->error)
> - dma_fence_set_error(&f->dma, fence->error);
> -
> - if (!f->dma.error) {
> - dma_fence_get(&f->dma);
> - if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
> - fence_work(&f->work);
> - else
> - queue_work(system_unbound_wq, &f->work);
> - } else {
> - fence_complete(f);
> - }
> + cmpxchg(&f->error, 0, fence->error);
> +
> + dma_fence_get(&f->dma);
> + if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
> + dma_fence_work_work(&f->work);
> + else
> + queue_work(system_unbound_wq, &f->work);
>   break;
>  
>   case FENCE_FREE:
> @@ -84,10 +83,11 @@ void dma_fence_work_init(struct dma_fence_work *f,
>const struct dma_fence_work_ops *ops)
>  {
>   f->ops = ops;
> + f->error = 0;
>   spin_lock_init(&f->lock);
>   dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0);
>   i915_sw_fence_init(&f->chain, fence_notify);
> - INIT_WORK(&f->work, fence_work);
> + INIT_WORK(&f->work, dma_fence_work_work);
>  }
>  
>  int dma_fence_work_chain(struct dma_fence_work *f, struct dma_fence *signal)
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.h 
> b/drivers/gpu/drm/i915/i915_sw_fence_work.h
> index d56806918d13..caa59fb5252b 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence_work.h
> +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.h
> @@ -24,6 +24,7 @@ struct dma_fence_work_ops {
>  struct dma_fence_work {
>   struct dma_fence dma;
>   spinlock_t lock;
> + int error;
>  
>   struct i915_sw_fence chain;
>   struct i915_sw_dma_fence_cb cb;
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 4b7fc4647e46..5123ac28ad9a 100644
> --- a/d

Re: [Intel-gfx] [PATCH 4/6] drm/i915: Add a struct dma_fence_work timeline

2021-10-13 Thread Daniel Vetter
On Fri, Oct 08, 2021 at 03:35:28PM +0200, Thomas Hellström wrote:
> The TTM managers and, possibly, the gtt address space managers will
> need to be able to order fences for async operation.
> Using dma_fence_is_later() for this will require that the fences we hand
> them are from a single fence context and ordered.
> 
> Introduce a struct dma_fence_work_timeline, and a function to attach
> struct dma_fence_work to such a timeline in a way that all previous
> fences attached to the timeline will be signaled when the latest
> attached struct dma_fence_work signals.
> 
> Signed-off-by: Thomas Hellström 

I'm not understanding why we need this:

- if we just want to order dma_fence work, then an ordered workqueue is
  what we want. Which is why hand-rolling is better than reusing
  dma_fence_work for absolutely everything.

- if we just need to make sure the public fences signal in order, then
  it's a dma_fence_chain.

Definitely no more "it looks like it's shared code but isn't" stuff in
i915.
-Daniel
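
For comparison, keeping the public fences ordered with a dma_fence_chain
would look roughly like this (hypothetical usage sketch of the core
dma_fence_chain API; the timeline field names are assumed):

	struct dma_fence *prev = tl->last_fence;	/* may be NULL */
	struct dma_fence_chain *chain = dma_fence_chain_alloc();

	if (chain) {
		/*
		 * dma_fence_chain_init() consumes both references; the
		 * chain fence signals only once prev and the new fence
		 * have signaled, so consumers observe a single ordered
		 * timeline.
		 */
		dma_fence_chain_init(chain, prev, dma_fence_get(&f->dma),
				     ++tl->seqno);
		tl->last_fence = &chain->base;
	}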

> ---
>  drivers/gpu/drm/i915/i915_sw_fence_work.c | 89 ++-
>  drivers/gpu/drm/i915/i915_sw_fence_work.h | 58 +++
>  2 files changed, 145 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c 
> b/drivers/gpu/drm/i915/i915_sw_fence_work.c
> index 5b55cddafc9b..87cdb3158042 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
> @@ -5,6 +5,66 @@
>   */
>  
>  #include "i915_sw_fence_work.h"
> +#include "i915_utils.h"
> +
> +/**
> + * dma_fence_work_timeline_attach - Attach a struct dma_fence_work to a
> + * timeline.
> + * @tl: The timeline to attach to.
> + * @f: The struct dma_fence_work.
> + * @tl_cb: The i915_sw_dma_fence_cb needed to attach to the
> + * timeline. This is typically embedded into the structure that also
> + * embeds the struct dma_fence_work.
> + *
> + * This function takes a timeline reference and associates it with the
> + * struct dma_fence_work. That reference is given up when the fence
> + * signals. Furthermore it assigns a fence context and a seqno to the
> + * dma-fence, and then chains upon the previous fence of the timeline
> + * if any, to make sure that the fence signals after that fence. The
> + * @tl_cb callback structure is needed for that chaining. Finally
> + * the registered last fence of the timeline is replaced by this fence, and
> + * the timeline takes a reference on the fence, which is released when
> + * the fence signals.
> + */
> +void dma_fence_work_timeline_attach(struct dma_fence_work_timeline *tl,
> + struct dma_fence_work *f,
> + struct i915_sw_dma_fence_cb *tl_cb)
> +{
> + struct dma_fence *await;
> +
> + if (tl->ops->get)
> + tl->ops->get(tl);
> +
> + spin_lock(&tl->lock);
> + await = tl->last_fence;
> + tl->last_fence = dma_fence_get(&f->dma);
> + f->dma.seqno = tl->seqno++;
> + f->dma.context = tl->context;
> + f->tl = tl;
> + spin_unlock(&tl->lock);
> +
> + if (await) {
> + __i915_sw_fence_await_dma_fence(&f->chain, await, tl_cb);
> + dma_fence_put(await);
> + }
> +}
> +
> +static void dma_fence_work_timeline_detach(struct dma_fence_work *f)
> +{
> + struct dma_fence_work_timeline *tl = f->tl;
> + bool put = false;
> +
> + spin_lock(&tl->lock);
> + if (tl->last_fence == &f->dma) {
> + put = true;
> + tl->last_fence = NULL;
> + }
> + spin_unlock(&tl->lock);
> + if (tl->ops->put)
> + tl->ops->put(tl);
> + if (put)
> + dma_fence_put(&f->dma);
> +}
>  
>  static void dma_fence_work_complete(struct dma_fence_work *f)
>  {
> @@ -13,6 +73,9 @@ static void dma_fence_work_complete(struct dma_fence_work 
> *f)
>   if (f->ops->release)
>   f->ops->release(f);
>  
> + if (f->tl)
> + dma_fence_work_timeline_detach(f);
> +
>   dma_fence_put(&f->dma);
>  }
>  
> @@ -53,14 +116,17 @@ fence_notify(struct i915_sw_fence *fence, enum 
> i915_sw_fence_notify state)
>  
>  static const char *get_driver_name(struct dma_fence *fence)
>  {
> - return "dma-fence";
> + struct dma_fence_work *f = container_of(fence, typeof(*f), dma);
> +
> + return (f->tl && f->tl->ops->name) ? f->tl->ops->name : "dma-fence";
>  }
>  
>  static const char *get_timeline_name(struct dma_fence *fence)
>  {
>   struct dma_fence_work *f = container_of(fence, typeof(*f), dma);
>  
> - return f->ops->name ?: "work";
> + return (f->tl && f->tl->name) ? f->tl->name :
> + f->ops->name ?: "work";
>  }
>  
>  static void fence_release(struct dma_fence *fence)
> @@ -84,6 +150,7 @@ void dma_fence_work_init(struct dma_fence_work *f,
>  {
>   f->ops = ops;
>   f->error = 0;
> + f->tl = NULL;
>   spin_lock_init(&f->lock);
>   dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0);
>   i

Re: [PATCH 0/5] drm/vmwgfx: Support module unload and hotunplug

2021-10-13 Thread Daniel Vetter
On Tue, Oct 12, 2021 at 05:34:50PM +, Zack Rusin wrote:
> On Tue, 2021-10-12 at 11:10 +0200, Thomas Hellström wrote:
> > On Tue, 2021-10-12 at 10:27 +0200, Christian König wrote:
> > > Am 11.10.21 um 14:04 schrieb Thomas Hellström:
> > > 
> > > > > 
> > 
> > > > So now if this is going to be changed, I think we need to
> > > > understand
> > > > why and think this through really thoroughly:
> > > > 
> > > > * What is not working and why (the teardown seems to be a trivial
> > > > fix).
> > > > * How did we end up here,
> > > > * What's the cost of fixing that up compared to refactoring the
> > > > drivers
> > > > that rely on bindable system memory,
> > > > * What's the justification of a system type at all if it's not
> > > > GPU-
> > > > bindable, meaning it's basically equivalent to swapped-out shmem
> > > > with
> > > > the exception that it's mappable?
> > > 
> > > Well, once more that isn't correct. This is nothing new and as far
> > > as I know that behavior has existed as long as TTM itself.
> > 
> > I'm not sure what's incorrect? I'm trying to explain what the initial
> > design was; it may of course have been bad and the one you propose a
> > better one, and if required we certainly need to fix i915 to align
> > with the new one.
> > 
> > What worries me, though, is if you perceive the design differently,
> > change things in TTM according to that perception in a way that breaks
> > drivers that rely on the initial design, and then force those drivers
> > to change, claiming they are incorrect, without a thorough discussion
> > on dri-devel - that's IMHO not good.
> 
> We should probably do that in a separate thread so that this,
> fundamentally important, discussion is easier to find and reference in
> the future. It looks like we're settling on a decision here so I'd
> appreciate an Acked-by for the patch 4/5 just so it doesn't look like I
> was making things up to someone looking at git history in the future.

Jumping in sideways and late, and also without real context on the
decision itself:

The way to properly formalize this is
- type a kerneldoc patch which writes down the rules we agree on, whether
  that's uapi, or internal helper api like for ttm, or on types or
  whatever
- get acks from everyone who participated + everyone who might care
- merge it

> It seems that in general TTM was designed to be able to handle an
> amazing number of special/corner cases at a cost of complexity which
> meant that over the years very few people understood it and the code
> handling those cases sometimes broke. It sounds like Christian is now
> trying to rein it in and make the code a lot more focused.
> 
> Working on other OS'es for the last few years, certainly made me
> appreciate simple frameworks that move complexity towards drivers that
> actually need them, e.g. it's of course anecdotal but I found wddm gpu
> virtual addressing models (iommu/gpummu) a lot easier to grok.
> 
> On the flip side that does mean that vmwgfx and i915 need to redo some
> code. For vmwgfx it's probably a net positive anyway as we've been
> using TTM for, what is really nowadays, an integrated GPU so maybe it's
> time for us to think about transition to gem.

Aside, but we're looking at adopting ttm for integrated gpu too. The
execbuf utils and dynamic memory management helpers for pure gem just
aren't quite there yet, and improving ttm a bit in this area looks
reasonable (like adding a unified memory aware shrinker like we have in
i915-gem).

Also I thought vmwgfx is using ttm to also manage some id spaces, you'd
have to hand-roll that.

Anyway entirely orthogonal.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH] drm: Increase DRM_OBJECT_MAX_PROPERTY by 18.

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 02:35:25PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-10-13 14:02:59 [+0200], Daniel Vetter wrote:
> > On Tue, Oct 05, 2021 at 08:51:51AM +0200, Sebastian Andrzej Siewior wrote:
> > > The warning popped up; it says to increase it by the number of occurrences.
> > > I saw it 18 times, so here it is.
> > > It started to show up since commit
> > >2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
> > > property")
> > > 
> > > Increase DRM_OBJECT_MAX_PROPERTY by 18.
> > > 
> > > Signed-off-by: Sebastian Andrzej Siewior 
> > 
> > Which driver where? Whomever added that into upstream should also have
> > realized this (things will just not work) and include it in there. So if
> > things are tested correctly this should be part of a larger series to add
> > these 18 props somewhere.
> 
> This is on i915 with full debug. If I remember correctly, it wasn't
> there before commit
>c7fcbf2513973 ("drm/plane: check that fb_damage is set up when used")
> 
> With that commit the box crashed until commit 
>2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
> property")
> 
> where I then observed this.

Hm, there's a pile of commits there, and nothing immediately jumps out.
The thing is, 18 is likely way too much, since if e.g. we have a
single new property on a plane and that pushes over the limit on all of
them, you get iirc 3x4 already simply because we have that many planes.

So would be good to know the actual culprit.

Can you pls try to bisect the above range, applying the patch as a fixup
locally (without commit, that will confuse git bisect a bit I think), so
we know what/where went wrong?
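
Something like this, i.e. (hypothetical command sketch; adjust the range
endpoints and the patch file name to your setup):

	git bisect start 2f425cf5242a0 c7fcbf2513973~1
	# at each bisect step, apply the bump locally without committing:
	git apply increase-max-property.diff
	make -j$(nproc)     # then boot and check for the warning
	git checkout -- .   # drop the fixup again before marking
	git bisect good     # or: git bisect bad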

I'm still confused why this isn't showing up anywhere in our intel ci ...

Thanks, Daniel

> 
> > Also maybe we should just dynamically allocate this array if people have
> > this many properties on their objects.
> > -Daniel
> 
> Sebastian

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/6] drm/i915: Update dma_fence_work

2021-10-13 Thread Thomas Hellström



On 10/13/21 14:41, Daniel Vetter wrote:

On Fri, Oct 08, 2021 at 03:35:25PM +0200, Thomas Hellström wrote:

Move the release callback to after fence signaling to align with
what's done for upcoming VM_BIND user-fence signaling.

Finally call the work callback regardless of whether we have a fence
error or not and update the existing callbacks accordingly. We will
need this to intercept the error for failsafe migration.

Signed-off-by: Thomas Hellström 

I think before we make this thing more complex we really should either
move this into dma-buf/ as a proper thing, or just open-code.

Minimally at least any new async dma_fence worker needs to have
dma_fence_begin/end_signalling annotations, or we're just digging a grave
here.

I'm also not seeing the point in building everything on top of this, for
many cases just an open-coded work_struct should be a lot simpler. It's
just more to clean up later on, that part is for sure.
-Daniel


Yes, I mentioned to Matthew, I'm going to respin this based on our 
previous discussions.


Forgot to mention on the ML.

/Thomas



---
  drivers/gpu/drm/i915/gem/i915_gem_clflush.c |  5 +++
  drivers/gpu/drm/i915/i915_sw_fence_work.c   | 36 ++---
  drivers/gpu/drm/i915/i915_sw_fence_work.h   |  1 +
  drivers/gpu/drm/i915/i915_vma.c | 12 +--
  4 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index f0435c6feb68..2143ebaf5b6f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -28,6 +28,11 @@ static void clflush_work(struct dma_fence_work *base)
  {
struct clflush *clflush = container_of(base, typeof(*clflush), base);
  
+	if (base->error) {

+   dma_fence_set_error(&base->dma, base->error);
+   return;
+   }
+
__do_clflush(clflush->obj);
  }
  
diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c

index 5b33ef23d54c..5b55cddafc9b 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
@@ -6,21 +6,24 @@
  
  #include "i915_sw_fence_work.h"
  
-static void fence_complete(struct dma_fence_work *f)

+static void dma_fence_work_complete(struct dma_fence_work *f)
  {
+   dma_fence_signal(&f->dma);
+
if (f->ops->release)
f->ops->release(f);
-   dma_fence_signal(&f->dma);
+
+   dma_fence_put(&f->dma);
  }
  
-static void fence_work(struct work_struct *work)

+static void dma_fence_work_work(struct work_struct *work)
  {
struct dma_fence_work *f = container_of(work, typeof(*f), work);
  
-	f->ops->work(f);

+   if (f->ops->work)
+   f->ops->work(f);
  
-	fence_complete(f);

-   dma_fence_put(&f->dma);
+   dma_fence_work_complete(f);
  }
  
  static int __i915_sw_fence_call

@@ -31,17 +34,13 @@ fence_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)
switch (state) {
case FENCE_COMPLETE:
if (fence->error)
-   dma_fence_set_error(&f->dma, fence->error);
-
-   if (!f->dma.error) {
-   dma_fence_get(&f->dma);
-   if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
-   fence_work(&f->work);
-   else
-   queue_work(system_unbound_wq, &f->work);
-   } else {
-   fence_complete(f);
-   }
+   cmpxchg(&f->error, 0, fence->error);
+
+   dma_fence_get(&f->dma);
+   if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
+   dma_fence_work_work(&f->work);
+   else
+   queue_work(system_unbound_wq, &f->work);
break;
  
  	case FENCE_FREE:

@@ -84,10 +83,11 @@ void dma_fence_work_init(struct dma_fence_work *f,
 const struct dma_fence_work_ops *ops)
  {
f->ops = ops;
+   f->error = 0;
spin_lock_init(&f->lock);
dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0);
i915_sw_fence_init(&f->chain, fence_notify);
-   INIT_WORK(&f->work, fence_work);
+   INIT_WORK(&f->work, dma_fence_work_work);
  }
  
  int dma_fence_work_chain(struct dma_fence_work *f, struct dma_fence *signal)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.h 
b/drivers/gpu/drm/i915/i915_sw_fence_work.h
index d56806918d13..caa59fb5252b 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.h
@@ -24,6 +24,7 @@ struct dma_fence_work_ops {
  struct dma_fence_work {
struct dma_fence dma;
spinlock_t lock;
+   int error;
  
  	struct i915_sw_fence chain;

struct i915_sw_dma_fence_cb cb;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
i

Re: [PATCH v2] drm/locking: add backtrace for locking contended locks without backoff

2021-10-13 Thread Jani Nikula
On Fri, 01 Oct 2021, Jani Nikula  wrote:
> If drm_modeset_lock() returns -EDEADLK, the caller is supposed to drop
> all currently held locks using drm_modeset_backoff(). Failing to do so
> will result in warnings and backtraces on the paths trying to lock a
> contended lock. Add support for optionally printing the backtrace on the
> path that hit the deadlock and didn't gracefully handle the situation.
>
> For example, the patch [1] inadvertently dropped the return value check
> and error return on replacing calc_watermark_data() with
> intel_compute_global_watermarks(). The backtraces on the subsequent
> locking paths hitting WARN_ON(ctx->contended) were unhelpful, but adding
> the backtrace to the deadlock path produced this helpful printout:
>
> <7> [98.002465] drm_modeset_lock attempting to lock a contended lock without 
> backoff:
>drm_modeset_lock+0x107/0x130
>drm_atomic_get_plane_state+0x76/0x150
>skl_compute_wm+0x251d/0x2b20 [i915]
>intel_atomic_check+0x1942/0x29e0 [i915]
>drm_atomic_check_only+0x554/0x910
>drm_atomic_nonblocking_commit+0xe/0x50
>drm_mode_atomic_ioctl+0x8c2/0xab0
>drm_ioctl_kernel+0xac/0x140
>
> Add new CONFIG_DRM_DEBUG_MODESET_LOCK to enable modeset lock debugging
> with stack depot and trace.
>
> [1] https://lore.kernel.org/r/20210924114741.15940-4-jani.nik...@intel.com
>
> v2:
> - default y if DEBUG_WW_MUTEX_SLOWPATH (Daniel)
> - depends on DEBUG_KERNEL
>
> Cc: Daniel Vetter 
> Cc: Dave Airlie 
> Reviewed-by: Daniel Vetter 
> Signed-off-by: Jani Nikula 

Pushed to drm-misc-next, thanks for the review.

BR,
Jani.
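
For reference, the pattern the new option helps enforce is the documented
lock/backoff dance (simplified sketch):

	struct drm_modeset_acquire_ctx ctx;
	int ret;

	drm_modeset_acquire_init(&ctx, 0);
retry:
	ret = drm_modeset_lock(&crtc->mutex, &ctx);
	if (ret == -EDEADLK) {
		/*
		 * Drop all currently held locks before retrying;
		 * skipping this step is exactly what the new backtrace
		 * catches.
		 */
		ret = drm_modeset_backoff(&ctx);
		if (!ret)
			goto retry;
	}
	/* ... locked work ... */
	drm_modeset_drop_locks(&ctx);
	drm_modeset_acquire_fini(&ctx);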

> ---
>  drivers/gpu/drm/Kconfig| 15 +
>  drivers/gpu/drm/drm_modeset_lock.c | 49 --
>  include/drm/drm_modeset_lock.h |  8 +
>  3 files changed, 70 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 2a926d0de423..a4c020a9a0eb 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -100,6 +100,21 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>This has the potential to use a lot of memory and print some very
>large kernel messages. If in doubt, say "N".
>  
> +config DRM_DEBUG_MODESET_LOCK
> + bool "Enable backtrace history for lock contention"
> + depends on STACKTRACE_SUPPORT
> + depends on DEBUG_KERNEL
> + depends on EXPERT
> + select STACKDEPOT
> + default y if DEBUG_WW_MUTEX_SLOWPATH
> + help
> +   Enable debug tracing of failures to gracefully handle drm modeset lock
> +   contention. A history of each drm modeset lock path hitting -EDEADLK
> +   will be saved until gracefully handled, and the backtrace will be
> +   printed when attempting to lock a contended lock.
> +
> +   If in doubt, say "N".
> +
>  config DRM_FBDEV_EMULATION
>   bool "Enable legacy fbdev support for your modesetting driver"
>   depends on DRM
> diff --git a/drivers/gpu/drm/drm_modeset_lock.c 
> b/drivers/gpu/drm/drm_modeset_lock.c
> index bf8a6e823a15..4d32b61fa1fd 100644
> --- a/drivers/gpu/drm/drm_modeset_lock.c
> +++ b/drivers/gpu/drm/drm_modeset_lock.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  /**
>   * DOC: kms locking
> @@ -77,6 +78,45 @@
>  
>  static DEFINE_WW_CLASS(crtc_ww_class);
>  
> +#if IS_ENABLED(CONFIG_DRM_DEBUG_MODESET_LOCK)
> +static noinline depot_stack_handle_t __stack_depot_save(void)
> +{
> + unsigned long entries[8];
> + unsigned int n;
> +
> + n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> +
> + return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
> +}
> +
> +static void __stack_depot_print(depot_stack_handle_t stack_depot)
> +{
> + struct drm_printer p = drm_debug_printer("drm_modeset_lock");
> + unsigned long *entries;
> + unsigned int nr_entries;
> + char *buf;
> +
> + buf = kmalloc(PAGE_SIZE, GFP_NOWAIT | __GFP_NOWARN);
> + if (!buf)
> + return;
> +
> + nr_entries = stack_depot_fetch(stack_depot, &entries);
> + stack_trace_snprint(buf, PAGE_SIZE, entries, nr_entries, 2);
> +
> + drm_printf(&p, "attempting to lock a contended lock without 
> backoff:\n%s", buf);
> +
> + kfree(buf);
> +}
> +#else /* CONFIG_DRM_DEBUG_MODESET_LOCK */
> +static depot_stack_handle_t __stack_depot_save(void)
> +{
> + return 0;
> +}
> +static void __stack_depot_print(depot_stack_handle_t stack_depot)
> +{
> +}
> +#endif /* CONFIG_DRM_DEBUG_MODESET_LOCK */
> +
>  /**
>   * drm_modeset_lock_all - take all modeset locks
>   * @dev: DRM device
> @@ -225,7 +265,9 @@ EXPORT_SYMBOL(drm_modeset_acquire_fini);
>   */
>  void drm_modeset_drop_locks(struct drm_modeset_acquire_ctx *ctx)
>  {
> - WARN_ON(ctx->contended);
> + if (WARN_ON(ctx->contended))
> + __stack_depot_print(ctx->stack_depot);
> +
>   while (!list_empty(&ctx->locked)) {
>   struct drm_modeset_lock *lock;
>  
> @@ -243,7 +285,8 @@ static i

Re: (subset) [PATCH] drm/vc4: crtc: Make sure the HDMI controller is powered when disabling

2021-10-13 Thread Maxime Ripard
On Thu, 23 Sep 2021 20:50:13 +0200, Maxime Ripard wrote:
> Since commit 875a4d536842 ("drm/vc4: drv: Disable the CRTC at boot
> time"), during the initial setup of the driver we call into the VC4 HDMI
> controller hooks to make sure the controller is properly disabled.
> 
> However, we were never making sure that the device was properly powered
> while doing so. This never resulted in any (reported) issue in practice,
> but since the introduction of commit 4209f03fcb8e ("drm/vc4: hdmi: Warn
> if we access the controller while disabled") we get a loud complaint
> when we do that kind of access.
> 
> [...]

Applied to drm/drm-misc (drm-misc-fixes).

Thanks!
Maxime


Re: [PATCH v2] component: do not leave master devres group open after bind

2021-10-13 Thread Greg KH
On Wed, Oct 06, 2021 at 04:47:57PM +0300, Kai Vehmanen wrote:
> Hi,
> 
> On Tue, 5 Oct 2021, Greg KH wrote:
> 
> > On Wed, Sep 22, 2021 at 11:54:32AM +0300, Kai Vehmanen wrote:
> > > In current code, the devres group for aggregate master is left open
> > > after call to component_master_add_*(). This leads to problems when the
> > > master does further managed allocations on its own. When any
> > > participating driver calls component_del(), this leads to immediate
> > > release of resources.
> [...]
> > > the devres group, and by closing the devres group after
> > > the master->ops->bind() call is done. This allows devres allocations
> > > done by the driver acting as master to be isolated from the binding state
> > > of the aggregate driver. This modifies the logic originally introduced in
> > > commit 9e1ccb4a7700 ("drivers/base: fix devres handling for master 
> > > device")
> > > 
> > > BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
> > > Signed-off-by: Kai Vehmanen 
> > > Acked-by: Imre Deak 
> > > Acked-by: Russell King (Oracle) 
> > 
> > What commit does this "fix:"?  And does it need to go to stable
> > kernel(s)?
> 
> I didn't put a "Fixes" on the original commit 9e1ccb4a7700 
> ("drivers/base: fix devres handling for master device") as it alone
> didn't cause problems. It did open the door for possible devres issues
> for anybody calling component_master_add_*().
> 
> On audio side, this surfaced with the more recent commit 3fcaf24e5dce 
> ("ALSA: hda: Allocate resources with device-managed APIs"). In theory one 
> could have hit issues already before, but this made it very easy to hit
> on actual systems.
> 
> If I had to pick one, it would be 9e1ccb4a7700 ("drivers/base: fix 
> devres handling for master device"). And yes, given comments on this 
> thread, I'd say this needs to go to stable kernels.

Then please add a fixes: line and a cc: stable line and resend.

thanks,

greg k-h
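
That is, something along these lines in the resent patch's tags, using
the commit Kai identified above:

	Fixes: 9e1ccb4a7700 ("drivers/base: fix devres handling for master device")
	Cc: stable@vger.kernel.org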


Re: [PATCH RFC] virtio: wrap config->reset calls

2021-10-13 Thread Vivek Goyal
On Wed, Oct 13, 2021 at 06:55:31AM -0400, Michael S. Tsirkin wrote:
> This will enable cleanups down the road.
> The idea is to disable cbs, then add "flush_queued_cbs" callback
> as a parameter, this way drivers can flush any work
> queued after callbacks have been disabled.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  arch/um/drivers/virt-pci.c | 2 +-
>  drivers/block/virtio_blk.c | 4 ++--
>  drivers/bluetooth/virtio_bt.c  | 2 +-
>  drivers/char/hw_random/virtio-rng.c| 2 +-
>  drivers/char/virtio_console.c  | 4 ++--
>  drivers/crypto/virtio/virtio_crypto_core.c | 8 
>  drivers/firmware/arm_scmi/virtio.c | 2 +-
>  drivers/gpio/gpio-virtio.c | 2 +-
>  drivers/gpu/drm/virtio/virtgpu_kms.c   | 2 +-
>  drivers/i2c/busses/i2c-virtio.c| 2 +-
>  drivers/iommu/virtio-iommu.c   | 2 +-
>  drivers/net/caif/caif_virtio.c | 2 +-
>  drivers/net/virtio_net.c   | 4 ++--
>  drivers/net/wireless/mac80211_hwsim.c  | 2 +-
>  drivers/nvdimm/virtio_pmem.c   | 2 +-
>  drivers/rpmsg/virtio_rpmsg_bus.c   | 2 +-
>  drivers/scsi/virtio_scsi.c | 2 +-
>  drivers/virtio/virtio.c| 5 +
>  drivers/virtio/virtio_balloon.c| 2 +-
>  drivers/virtio/virtio_input.c  | 2 +-
>  drivers/virtio/virtio_mem.c| 2 +-
>  fs/fuse/virtio_fs.c| 4 ++--

fs/fuse/virtio_fs.c changes look good to me.

Reviewed-by: Vivek Goyal 

Vivek

[..]
> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> index 0ad89c6629d7..27c3b74070a2 100644
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -895,7 +895,7 @@ static int virtio_fs_probe(struct virtio_device *vdev)
>   return 0;
>  
>  out_vqs:
> - vdev->config->reset(vdev);
> + virtio_reset_device(vdev);
>   virtio_fs_cleanup_vqs(vdev, fs);
>   kfree(fs->vqs);
>  
> @@ -927,7 +927,7 @@ static void virtio_fs_remove(struct virtio_device *vdev)
>   list_del_init(&fs->list);
>   virtio_fs_stop_all_queues(fs);
>   virtio_fs_drain_all_queues_locked(fs);
> - vdev->config->reset(vdev);
> + virtio_reset_device(vdev);
>   virtio_fs_cleanup_vqs(vdev, fs);
>  
>   vdev->priv = NULL;


Thanks
Vivek
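
For readers following along: judging by the diffstat, the new helper in
drivers/virtio/virtio.c is presumably just a thin wrapper for now, along
the lines of the following (sketch inferred from the cover text, not
quoted from the patch):

void virtio_reset_device(struct virtio_device *dev)
{
	/*
	 * Today this only forwards to the transport; the stated plan is
	 * to later disable callbacks and flush queued work here before
	 * performing the actual reset.
	 */
	dev->config->reset(dev);
}
EXPORT_SYMBOL_GPL(virtio_reset_device);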



Re: DRM KUnit hackathon

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 10:23:33AM -0500, Nícolas F. R. A. Prado wrote:
> Hello,
> 
> We belong to a student group, LKCAMP [1], which is focused on sharing kernel 
> and
> free software development knowledge and mentoring newcomers to become
> contributors to these projects.
> 
> As part of our efforts, we'll be organizing a hackathon to convert the drm
> selftests in drivers/gpu/drm/selftests/ (and possibly the ones in
> drivers/dma-buf too) to the KUnit framework. It will take place on October 30.
> 
> So please expect to receive some patches from our mentees on that date. It
> probably won't be a big volume (experience tells it'll be around half a dozen
> patches). We'll also make sure to do an internal review beforehand to catch
> common first-timer mistakes and teach the basics.
> 
> We're already working on making sure that the converted KUnit tests can still 
> be
> run by IGT.
> 
> Please let us know if there's any issue with this date. Otherwise we look
> forward to helping a few newcomers get their patches in the kernel on the 30th
> :).

Welcome all, looking forward to cool stuff!

Cheers, Daniel

> 
> Thanks!
> 
> [1] - https://lkcamp.dev/

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
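
For anyone preparing for the hackathon, a conversion target typically
ends up as a self-registering KUnit suite; a minimal generic skeleton
(sketch only, not one of the actual drm conversions):

#include <kunit/test.h>

static void drm_example_test(struct kunit *test)
{
	/* assertions replace the hand-rolled pass/fail bookkeeping
	 * of the old selftests */
	KUNIT_EXPECT_EQ(test, 2 + 2, 4);
}

static struct kunit_case drm_example_cases[] = {
	KUNIT_CASE(drm_example_test),
	{}
};

static struct kunit_suite drm_example_suite = {
	.name = "drm-example",
	.test_cases = drm_example_cases,
};
kunit_test_suite(drm_example_suite);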


Re: Should multiple PRIME_FD_TO_HANDLE ioctls on the same fd require multiple GEM_CLOSE?

2021-10-13 Thread Daniel Vetter
On Tue, Oct 12, 2021 at 07:37:11PM +0100, John Cox wrote:
> On Tue, 12 Oct 2021 17:33:18 +, you wrote:
> 
> >Yes, this is expected behavior, even if it's not intuitive. For more
> >details, see:
> >
> >https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/110
> 
> Thanks - as noted in that discussion the behaviour is a bit unhelpful
> but just knowing that it is expected means I can deal with it.

Kerneldoc in that uapi header to explain precisely what is going on, and
why, would be good too.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
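
For readers hitting the same surprise, the behaviour boils down to the
following (hypothetical userspace snippet, error handling omitted):

	struct drm_prime_handle args = { .fd = dmabuf_fd };
	uint32_t h1, h2;

	ioctl(drm_fd, DRM_IOCTL_PRIME_FD_TO_HANDLE, &args);
	h1 = args.handle;
	ioctl(drm_fd, DRM_IOCTL_PRIME_FD_TO_HANDLE, &args);
	h2 = args.handle;

	/*
	 * h1 == h2: re-importing the same buffer returns the existing
	 * GEM handle instead of a new reference, so a single GEM_CLOSE
	 * destroys it for both "imports".
	 */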


Re: [PATCH v1 00/12] MEMORY_DEVICE_COHERENT for CPU-accessible coherent device memory

2021-10-13 Thread Daniel Vetter
On Tue, Oct 12, 2021 at 03:56:29PM -0300, Jason Gunthorpe wrote:
> On Tue, Oct 12, 2021 at 11:39:57AM -0700, Andrew Morton wrote:
> > On Tue, 12 Oct 2021 12:12:35 -0500 Alex Sierra  wrote:
> > 
> > > This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
> > > owned by a device that can be mapped into CPU page tables like
> > > MEMORY_DEVICE_GENERIC and can also be migrated like MEMORY_DEVICE_PRIVATE.
> > > With MEMORY_DEVICE_COHERENT, we isolate the new memory type from other
> > > subsystems as far as possible, though there are some small changes to
> > > other subsystems such as filesystem DAX, to handle the new memory type
> > > appropriately.
> > > 
> > > We use ZONE_DEVICE for this instead of NUMA so that the amdgpu
> > > allocator can manage it without conflicting with core mm for non-unified
> > > memory use cases.
> > > 
> > > How it works: The system BIOS advertises the GPU device memory (aka VRAM)
> > > as SPM (special purpose memory) in the UEFI system address map.
> > > The amdgpu driver registers the memory with devmap as
> > > MEMORY_DEVICE_COHERENT using devm_memremap_pages.
> > > 
> > > The initial user for this hardware page migration capability will be
> > > the Frontier supercomputer project.
> > 
> > To what other uses will this infrastructure be put?
> > 
> > Because I must ask: if this feature is for one single computer which
> > presumably has a custom kernel, why add it to mainline Linux?
> 
> Well, it certainly isn't just "one single computer". Overall I know of
> about, hmm, ~10 *datacenters* worth of installations that are using
> similar technology underpinnings.
> 
> "Frontier" is the code name for a specific installation but as the
> technology is proven out there will be many copies made of that same
> approach.
> 
> The previous program "Summit" was done with NVIDIA GPUs and PowerPC
> CPUs and also included a very similar capability. I think this is a
> good sign that this coherently attached accelerator will continue to
> be a theme in computing going forward. IIRC this was done using out of
> tree kernel patches and NUMA localities.
> 
> Specifically with CXL now being standardized and on a path to ubiquity
> I think we will see an explosion in deployments of coherently attached
> accelerator memory. This is the high end trickling down to wider
> usage.
> 
> I strongly think many CXL accelerators are going to want to manage
> their on-accelerator memory in this way as it makes universal sense to
> want to carefully manage memory access locality to optimize for
> performance.

Yeah with CXL this will be used by a lot more drivers/devices, not
even including nvidia's blob.

I guess if you want make sure get an ack on this from CXL folks, so that
we don't end up with a mess.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 03/14] drm/i915/xehpsdv: enforce min GTT alignment

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 09:41:44PM +0530, Ramalingam C wrote:
> From: Matthew Auld 
> 
> For local-memory objects we need to align the GTT addresses to 64K, both
> for the ppgtt and ggtt.
> 
> Signed-off-by: Matthew Auld 
> Signed-off-by: Stuart Summers 
> Signed-off-by: Ramalingam C 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 

Do we still need this with relocations removed? Userspace is picking all
the addresses for us, so all we have to check is whether userspace got it
right.
-Daniel
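
In other words the insert path could presumably shrink to a pure validity
check on the userspace-supplied offset, something like this (hypothetical
sketch, not the patch):

	if (flags & PIN_OFFSET_FIXED) {
		u64 offset = flags & PIN_OFFSET_MASK;

		/* lmem objects must sit at a 64K-aligned GTT address */
		if (HAS_64K_PAGES(vma->vm->i915) &&
		    i915_gem_object_is_lmem(vma->obj) &&
		    !IS_ALIGNED(offset, I915_GTT_PAGE_SIZE_64K))
			return -EINVAL;
	}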


> ---
>  drivers/gpu/drm/i915/i915_vma.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 4b7fc4647e46..1ea1fa08efdf 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -670,8 +670,13 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 
> alignment, u64 flags)
>   }
>  
>   color = 0;
> - if (vma->obj && i915_vm_has_cache_coloring(vma->vm))
> - color = vma->obj->cache_level;
> + if (vma->obj) {
> + if (HAS_64K_PAGES(vma->vm->i915) && 
> i915_gem_object_is_lmem(vma->obj))
> + alignment = max(alignment, I915_GTT_PAGE_SIZE_64K);
> +
> + if (i915_vm_has_cache_coloring(vma->vm))
> + color = vma->obj->cache_level;
> + }
>  
>   if (flags & PIN_OFFSET_FIXED) {
>   u64 offset = flags & PIN_OFFSET_MASK;
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH 2/3] drm/amdgpu:move vram manager defines into a header file

2021-10-13 Thread Arunpravin
Move vram related defines and inline functions into
a separate header file

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 72 
 1 file changed, 72 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
new file mode 100644
index ..fcab6475ccbb
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_VRAM_MGR_H__
+#define __AMDGPU_VRAM_MGR_H__
+
+#include 
+
+struct amdgpu_vram_mgr_node {
+   struct ttm_resource base;
+   struct list_head blocks;
+   unsigned long flags;
+};
+
+struct amdgpu_vram_reservation {
+   uint64_t start;
+   uint64_t size;
+   uint64_t min_size;
+   unsigned long flags;
+   struct list_head block;
+   struct list_head node;
+};
+
+static inline uint64_t node_start(struct drm_buddy_block *block)
+{
+   return drm_buddy_block_offset(block);
+}
+
+static inline uint64_t node_size(struct drm_buddy_block *block)
+{
+   return PAGE_SIZE << drm_buddy_block_order(block);
+}
+
+static inline struct amdgpu_vram_mgr_node *
+to_amdgpu_vram_mgr_node(struct ttm_resource *res)
+{
+   return container_of(res, struct amdgpu_vram_mgr_node, base);
+}
+
+static inline struct amdgpu_vram_mgr *
+to_vram_mgr(struct ttm_resource_manager *man)
+{
+   return container_of(man, struct amdgpu_vram_mgr, manager);
+}
+
+static inline struct amdgpu_device *
+to_amdgpu_device(struct amdgpu_vram_mgr *mgr)
+{
+   return container_of(mgr, struct amdgpu_device, mman.vram_mgr);
+}
+
+#endif
-- 
2.25.1
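
Usage sketch for the helpers above (hypothetical debug walk over an
allocation; assumes the blocks are linked via drm_buddy_block.link, as in
the follow-up patch):

	struct amdgpu_vram_mgr_node *node = to_amdgpu_vram_mgr_node(res);
	struct drm_buddy_block *block;

	list_for_each_entry(block, &node->blocks, link)
		pr_debug("vram block: start %llu size %llu\n",
			 node_start(block), node_size(block));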



[PATCH 3/3] drm/amdgpu: Replace drm_mm with drm buddy manager

2021-10-13 Thread Arunpravin
Add drm buddy allocator support for vram memory management

Signed-off-by: Arunpravin 
---
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 251 ++
 3 files changed, 217 insertions(+), 135 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index acfa207cf970..2c17e948355e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
 #include 
 #include 
 
+#include "amdgpu_vram_mgr.h"
+
 /* state back for walking over vram_mgr and gtt_mgr allocations */
 struct amdgpu_res_cursor {
 	uint64_t		start;
 	uint64_t		size;
 	uint64_t		remaining;
-	struct drm_mm_node	*node;
+	void			*node;
+	uint32_t		mem_type;
 };
 
 /**
@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
 {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
 
-   if (!res || res->mem_type == TTM_PL_SYSTEM) {
-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto err_out;
 
BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
 
-   node = to_ttm_range_mgr_node(res)->mm_nodes;
-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_node(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto err_out;
+
+   while (start >= node_size(block)) {
+   start -= node_size(block);
+
+   next = block->link.next;
+   if (next != head)
+   block = list_entry(next, struct 
drm_buddy_block, link);
+   }
+
+   cur->start = node_start(block) + start;
+   cur->size = min(node_size(block) - start, size);
+   cur->remaining = size;
+   cur->node = block;
+   break;
+   case TTM_PL_TT:
+   node = to_ttm_range_mgr_node(res)->mm_nodes;
+   while (start >= node->size << PAGE_SHIFT)
+   start -= node++->size << PAGE_SHIFT;
+
+   cur->start = (node->start << PAGE_SHIFT) + start;
+   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   cur->remaining = size;
+   cur->node = node;
+   break;
+   default:
+   goto err_out;
+   }
 
-   cur->start = (node->start << PAGE_SHIFT) + start;
-   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   return;
+
+err_out:
+   cur->start = start;
+   cur->size = size;
cur->remaining = size;
-   cur->node = node;
+   cur->node = NULL;
+   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
+   return;
 }
 
 /**
@@ -85,7 +124,9 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
  */
 static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t 
size)
 {
-   struct drm_mm_node *node = cur->node;
+   struct drm_buddy_block *block;
+   struct drm_mm_node *node;
+   struct list_head *next;
 
BUG_ON(size > cur->remaining);
 
@@ -99,9 +140,27 @@ static inline void amdgpu_res_next(struct amdgpu_res_cursor 
*cur, uint64_t size)
return;
}
 
-   cur->node = ++node;
-   cur->start = node->start << PAGE_SHIFT;
-   cur->size = min(node->size << PAGE_SHIFT, cur->remaining);
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   block = cur->node;
+
+   next = block->link.next;
+   block = list_entry(next, struct drm_buddy_block, link);
+
+   cur->node = block;
+   cur->start = node_start(block);
+   cur->size = min(node_size(block), cur->remaining);
+   break;
+   case TTM_PL_TT:
+   node = cur->node;
+
+   cur->node = ++node;
+   cur->start = node->start << PAGE_SHIFT;
+   cur->size = min(node->size << PAGE_SHIFT, cur->re

Re: [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread kernel test robot
Hi Maarten,

I love your patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-tip/drm-tip drm-exynos/exynos-drm-next 
tegra-drm/drm/tegra/for-next v5.15-rc5 next-20211013]
[cannot apply to airlied/drm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Maarten-Lankhorst/drm-i915-Use-dma_resv_iter-for-waiting-in-i915_gem_object_wait_reservation/20211013-184219
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-a015-20211013 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://github.com/0day-ci/linux/commit/647f0c4c47ffea53967daf523e8b935707e7a586
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Maarten-Lankhorst/drm-i915-Use-dma_resv_iter-for-waiting-in-i915_gem_object_wait_reservation/20211013-184219
git checkout 647f0c4c47ffea53967daf523e8b935707e7a586
# save the attached .config to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:18:10: fatal error: 
>> dma_resv_utils.h: No such file or directory
  18 | #include "dma_resv_utils.h"
 |  ^~
   compilation terminated.


vim +18 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c

09137e94543761 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c Chris Wilson  
2020-07-08  17  
6d393ef5ff5cac drivers/gpu/drm/i915/gem/i915_gem_shrinker.c Chris Wilson  
2020-12-23 @18  #include "dma_resv_utils.h"
be6a0376950475 drivers/gpu/drm/i915/i915_gem_shrinker.c Daniel Vetter 
2015-03-18  19  #include "i915_trace.h"
be6a0376950475 drivers/gpu/drm/i915/i915_gem_shrinker.c Daniel Vetter 
2015-03-18  20  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-10-13 Thread Maxime Ripard
Hi John,

On Wed, Sep 29, 2021 at 04:29:42PM -0700, John Stultz wrote:
> On Wed, Sep 29, 2021 at 2:51 PM John Stultz  wrote:
> >
> > On Wed, Sep 29, 2021 at 2:32 PM John Stultz  wrote:
> > > On Wed, Sep 29, 2021 at 2:27 PM John Stultz  
> > > wrote:
> > > > On Fri, Sep 10, 2021 at 3:12 AM Maxime Ripard  wrote:
> > > > > The best practice to avoid those issues is to register its functions 
> > > > > only after
> > > > > all its dependencies are live. We also shouldn't wait any longer than 
> > > > > we should
> > > > > to play nice with the other components that are waiting for us, so in 
> > > > > our case
> > > > > that would mean moving the DSI device registration to the bridge 
> > > > > probe.
> > > > >
> > > > > I also had a look at all the DSI hosts, and it seems that exynos, 
> > > > > kirin and msm
> > > > > would be affected by this and wouldn't probe anymore after those 
> > > > > changes.
> > > > > Exynos and kirin seem to be simple enough for a mechanical change
> > > > > (that still requires testing), but the changes in msm seemed to be
> > > > > far more important and I wasn't comfortable doing them.
> > > >
> > > >
> > > > Hey Maxime,
> > > >   Sorry for taking so long to get to this, but now that plumbers is
> > > > over I've had a chance to check it out on kirin
> > > >
> > > > Rob Clark pointed me to his branch with some fixups here:
> > > >
> > > > https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework
> > > >
> > > > But trying to boot hikey with that, I see the following loop 
> > > > indefinitely:
> > > > [4.632132] adv7511 2-0039: supply avdd not found, using dummy 
> > > > regulator
> > > > [4.638961] adv7511 2-0039: supply dvdd not found, using dummy 
> > > > regulator
> > > > [4.645741] adv7511 2-0039: supply pvdd not found, using dummy 
> > > > regulator
> > > > [4.652483] adv7511 2-0039: supply a2vdd not found, using dummy 
> > > > regulator
> > > > [4.659342] adv7511 2-0039: supply v3p3 not found, using dummy 
> > > > regulator
> > > > [4.666086] adv7511 2-0039: supply v1p2 not found, using dummy 
> > > > regulator
> > > > [4.681898] adv7511 2-0039: failed to find dsi host
> > >
> > > I just realized Rob's tree is missing the kirin patch. My apologies!
> > > I'll retest and let you know.
> >
> > Ok, just retested including the kirin patch and unfortunately I'm
> > still seeing the same thing.  :(
> >
> > Will dig a bit and let you know when I find more.
> 
> Hey Maxime!
>   I chased down the issue. The dsi probe code still required the
> drm_of_find_panel_or_bridge() call to succeed in order to probe.
> 
> I've moved the logic that looks for the bridge into the bridge_init
> and with that it seems to work.
> 
> Feel free (assuming it looks ok) to fold this change into your kirin patch:
>   
> https://git.linaro.org/people/john.stultz/android-dev.git/commit/?id=4a35ccc4d7a53f68d6d93da3b47e232a7c75b91d

Thanks for testing, I've picked and squashed your fixup

Maxime




Re: [PATCH 13/14] drm/i915/uapi: document behaviour for DG2 64K support

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 09:41:54PM +0530, Ramalingam C wrote:
> From: Matthew Auld 
> 
> On discrete platforms like DG2, we need to support a minimum page size
> of 64K when dealing with device local-memory. This is quite tricky for
> various reasons, so try to document the new implicit uapi for this.
> 
> Signed-off-by: Matthew Auld 
> Signed-off-by: Ramalingam C 
> ---
>  include/uapi/drm/i915_drm.h | 61 ++---
>  1 file changed, 56 insertions(+), 5 deletions(-)
> 
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index aa2a7eccfb94..d62e8b7ed8b6 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 {
>   /**
>* When the EXEC_OBJECT_PINNED flag is specified this is populated by
>* the user with the GTT offset at which this object will be pinned.
> +  *
>* When the I915_EXEC_NO_RELOC flag is specified this must contain the
>* presumed_offset of the object.
> +  *
>* During execbuffer2 the kernel populates it with the value of the
>* current GTT offset of the object, for future presumed_offset writes.
> +  *
> +  * See struct drm_i915_gem_create_ext for the rules when dealing with
> +  * alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with
> +  * minimum page sizes, like DG2.
>*/
>   __u64 offset;
>  
> @@ -3001,11 +3007,56 @@ struct drm_i915_gem_create_ext {
>*

I think a heading here (or a bit earlier) about Page alignment would be
good. Just mark it up as bold or something (since real sphinx headings
won't work).

>* The (page-aligned) allocated size for the object will be returned.
>*
> -  * Note that for some devices we have might have further minimum
> -  * page-size restrictions(larger than 4K), like for device local-memory.
> -  * However in general the final size here should always reflect any
> -  * rounding up, if for example using the 
> I915_GEM_CREATE_EXT_MEMORY_REGIONS
> -  * extension to place the object in device local-memory.
> +  * On discrete platforms, starting from DG2, we have to contend with GTT
> +  * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
> +  * objects.  Specifically the hardware only supports 64K or larger GTT
> +  * page sizes for such memory. The kernel will already ensure that all
> +  * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
> +  * sizes underneath.
> +  *
> +  * Note that the returned size here will always reflect any required
> +  * rounding up done by the kernel, i.e 4K will now become 64K on devices
> +  * such as DG2. The GTT alignment will also need be at least 64K for
> +  * such objects.
> +  *

I think here we should have a "Special DG2 placement restrictions" heading
for clarity

> +  * Note that due to how the hardware implements 64K GTT page support, we
> +  * have some further complications:
> +  *
> +  *   1.) The entire PDE(which covers a 2M virtual address range), must

Does this really format into a nice list in the html output? Also don't
use both "." and ")" for the list numbering; in plain text it's usually
just ")".

> +  *   contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same
> +  *   PDE is forbidden by the hardware.
> +  *
> +  *   2.) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
> +  *   objects.
> +  *
> +  * To handle the above the kernel implements a memory coloring scheme to
> +  * prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and
> +  * I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is
> +  * ever unable to evict the required pages for the given PDE(different
> +  * color) when inserting the object into the GTT then it will simply
> +  * fail the request.
> +  *
> +  * Since userspace needs to manage the GTT address space themselves,
> +  * special care is needed to ensure this doesn't happen. The simplest
> +  * scheme is to simply align and round up all I915_MEMORY_CLASS_DEVICE
> +  * objects to 2M, which avoids any issues here. At the very least this
> +  * is likely needed for objects that can be placed in both
> +  * I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid
> +  * potential issues when the kernel needs to migrate the object behind
> +  * the scenes, since that might also involve evicting other objects.
> +  *
> +  * To summarise the GTT rules, on platforms like DG2:
> +  *
> +  *   1.) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must
> +  *   have 64K alignment. The kernel will reject this otherwise.
> +  *
> +  *   2.) All I915_MEMORY_CLASS_DEVICE objects must never be placed in
> +  *   the same PDE with other I915_MEMORY_CLASS_SYSTEM objects. The
> +  *   kernel will r
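
A minimal sketch of the "align everything to 2M" scheme described above
(hypothetical userspace helper, not part of the patch):

#define ALIGN_POW2(x, a) (((x) + (a) - 1) & ~((uint64_t)(a) - 1))

/* round object size (and GTT placement) for lmem-capable objects */
static uint64_t gtt_align_for_placement(uint64_t size, int lmem_capable)
{
	/*
	 * 2M alignment guarantees an I915_MEMORY_CLASS_DEVICE object
	 * never shares a PDE with I915_MEMORY_CLASS_SYSTEM PTEs.
	 */
	const uint64_t align = lmem_capable ? (2ull << 20) : 4096;

	return ALIGN_POW2(size, align);
}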

Re: [PATCH 14/14] Doc/gpu/rfc/i915: i915 DG2 uAPI

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 09:41:55PM +0530, Ramalingam C wrote:
> Details of the new features getting added as part of DG2 enabling and their
> implicit impact on the uAPI.
> 
> Signed-off-by: Ramalingam C 
> cc: Daniel Vetter 
> cc: Matthew Auld 
> ---
>  Documentation/gpu/rfc/i915_dg2.rst | 47 ++
>  Documentation/gpu/rfc/index.rst|  3 ++
>  2 files changed, 50 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/i915_dg2.rst

Please move this and any uapi doc patch this relies on to the front of the
series, so it serves as an intro.

I think the 64k side looks good with the uapi docs, once it's fully
reviewed and acked.

What we still need is proper uapi docs for flat CCS. I think for that a
separate flat ccs DOC: section would be good, which is then referenced by
the gem_create_ext kerneldoc with a sphinx hyperlink.

The other thing that's missing here are the dg2 flat ccs drm_modifiers. So
we need another patch for that, which in it's kerneldoc then also links to
the flat ccs DOC: section.

Finally that flat ccs doc section needs to discuss all the flat ccs issues
and uapi we've discussed. That patch needs to be acked both by userspace
driver folks, and by compositor folks (because of the modifier uapi
aspect). Please cc Pekka and Simon Ser for the compositor acks (but feel
free to add more people).
-Daniel

> 
> diff --git a/Documentation/gpu/rfc/i915_dg2.rst 
> b/Documentation/gpu/rfc/i915_dg2.rst
> new file mode 100644
> index ..a83ca26cd758
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_dg2.rst
> @@ -0,0 +1,47 @@
> +
> +I915 DG2 RFC Section
> +
> +
> +Upstream plan
> +=
> +Plan to upstream the DG2 enabling is:
> +
> +* Merge basic HW enabling for DG2(Still without pciid)
> +* Merge the 64k support for lmem
> +* Merge the flat CCS enabling patches
> +* Add the pciid for DG2 and enable the DG2 in CI
> +
> +
> +64K page support for lmem
> +=
> +On DG2 hw, local-memory supports a minimum GTT page size of 64k only. 4k is 
> not supported anymore.
> +
> +DG2 hw doesn't support 64k (lmem) and 4k (smem) pages in the same ppgtt page 
> table. Refer to the
> +struct drm_i915_gem_create_ext for the implications of handling the 64k page 
> size.
> +
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +:functions: drm_i915_gem_create_ext
> +
> +
> +flat CCS support for lmem
> +=
> +Gen 12+ devices support 3D surface compression and compression formats. 
> This is
> +accomplished by an additional compression control state (CCS) stored for 
> each surface.
> +
> +Gen 12 devices (TGL and DG1) store compression state in a separate region of 
> memory.
> +It is managed by userspace and has an associated set of userspace managed 
> page tables
> +used by hardware for address translation.
> +
> +In Gen 12.5 devices (XEHPSDV and DG2) Flat CCS is introduced to replace the 
> userspace
> +managed AUX pagetable with the flat indexed region of device memory for 
> storing the
> +compression state.
> +
> +The GOP driver steals a chunk of memory for the CCS surface corresponding to the 
> entire
> +range of local memory. The memory required for the CCS of the entire local 
> memory is
> +1/256 of the main local memory. The GOP driver will also program a secure 
> register
> +(XEHPSDV_FLAT_CCS_BASE_ADDR 0x4910) with this address value.
> +
> +So the total local memory available for driver allocation is: total lmem size 
> - CCS data size.
> +
> +Flat CCS data needs to be cleared when a lmem object is allocated. And CCS 
> data can
> +be copied in and out of CCS region through XY_CTRL_SURF_COPY_BLT.
> diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
> index 91e93a705230..afb320ed4028 100644
> --- a/Documentation/gpu/rfc/index.rst
> +++ b/Documentation/gpu/rfc/index.rst
> @@ -20,6 +20,9 @@ host such documentation:
>  
>  i915_gem_lmem.rst
>  
> +.. toctree::
> +i915_dg2.rst
> +
>  .. toctree::
>  
>  i915_scheduler.rst
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 00/14] drm/i915/dg2: Enabling 64k page size and flat ccs

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 09:41:41PM +0530, Ramalingam C wrote:
> This series introduces the enabling patches for new flat ccs feature and
> 64k page support for i915 local memory, along with documentation on the
> uAPI impact.
> 
> 64k page support
> 
> 
> On discrete platforms, starting from DG2, we have to contend with GTT
> page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
> objects. Specifically the hardware only supports 64K or larger GTT page
> sizes for such memory. The kernel will already ensure that all
> I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
> sizes underneath.
> 
> Note that the returned size here will always reflect any required
> rounding up done by the kernel, i.e 4K will now become 64K on devices
> such as DG2. The GTT alignment will also need be at least 64K for such
> objects.
> 
> Note that due to how the hardware implements 64K GTT page support, we
> have some further complications:
> 
> 1.) The entire PDE(which covers a 2M virtual address range), must
> contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same PDE is
> forbidden by the hardware.
> 
> 2.) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
> objects.
> 
> To handle the above the kernel implements a memory coloring scheme to
> prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and
> I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is ever
> unable to evict the required pages for the given PDE(different color)
> when inserting the object into the GTT then it will simply fail the
> request.
> 
> Since userspace needs to manage the GTT address space themselves,
> special care is needed to ensure this doesn’t happen. The simplest
> scheme is to simply align and round up all I915_MEMORY_CLASS_DEVICE
> objects to 2M, which avoids any issues here. At the very least this is
> likely needed for objects that can be placed in both
> I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid
> potential issues when the kernel needs to migrate the object behind the
> scenes, since that might also involve evicting other objects.
> 
> To summarise the GTT rules, on platforms like DG2:
> 
> 1.) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must have
> 64K alignment. The kernel will reject this otherwise.
> 
> 2.) All I915_MEMORY_CLASS_DEVICE objects must never be placed in the
> same PDE with other I915_MEMORY_CLASS_SYSTEM objects. The kernel will
> reject this otherwise.
> 
> 3.) Objects that can be placed in both I915_MEMORY_CLASS_DEVICE and
> I915_MEMORY_CLASS_SYSTEM should probably be aligned and padded out to
> 2M.
> 
> Flat CCS:
> =
> Gen 12+ devices support 3D surface compression and compression formats.
> This is accomplished by an additional compression control state (CCS)
> stored for each surface.
> 
> Gen 12 devices (TGL and DG1) store compression state in a separate
> region of memory. It is managed by userspace and has an associated set
> of userspace managed page tables used by hardware for address
> translation.
> 
> In Gen 12.5 devices (XEHPSDV and DG2) Flat CCS is introduced to replace
> the userspace managed AUX pagetable with the flat indexed region of
> device memory for storing the compression state.
> 
> The GOP driver steals a chunk of memory for the CCS surface corresponding to
> the entire range of local memory. The memory required for the CCS of the
> entire local memory is 1/256 of the main local memory. The GOP driver
> will also program a secure register (XEHPSDV_FLAT_CCS_BASE_ADDR 0x4910)
> with this address value.
> 
> TODO: add patches for the flatccs modifiers and kdoc for them.

Ah it's here too :-)

Since this is uapi we also need link to igts (or at least where the tests
are), and to mesa MR (if that hasn't all landed yet).
-Daniel
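
To make the placement rules above concrete, here is a minimal userspace-side
sketch of the 64K/2M rule. All names here are illustrative, not real uAPI; it
only shows the arithmetic a GTT allocator would apply on DG2-like platforms:

#include <stdbool.h>
#include <stdint.h>

#define SZ_64K (64ull << 10)
#define SZ_2M  (2ull << 20)

/* Round v up to the next multiple of the power-of-two a. */
static uint64_t round_up_pow2(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/*
 * Objects placeable in I915_MEMORY_CLASS_DEVICE need at least 64K GTT
 * alignment; objects placeable in *both* device and system memory are
 * padded to 2M so a PDE never has to mix 64K and 4K PTEs when the
 * kernel migrates them behind the scenes.
 */
static void gtt_placement(uint64_t size, bool lmem, bool smem,
			  uint64_t *align, uint64_t *padded_size)
{
	*align = lmem ? (smem ? SZ_2M : SZ_64K) : 4096;
	*padded_size = round_up_pow2(size, *align);
}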

> 
> *** BLURB HERE ***
> 
> Abdiel Janulgue (1):
>   drm/i915/lmem: Enable lmem for platforms with Flat CCS
> 
> Ayaz A Siddiqui (1):
>   drm/i915/gt: Clear compress metadata for Gen12.5 >= platforms
> 
> Bommu Krishnaiah (1):
>   drm/i915: Add vm min alignment support
> 
> CQ Tang (1):
>   drm/i915/xehpsdv: Add has_flat_ccs to device info
> 
> Matthew Auld (8):
>   drm/i915/xehpsdv: set min page-size to 64K
>   drm/i915/xehpsdv: enforce min GTT alignment
>   drm/i915: enforce min page size for scratch
>   drm/i915/gtt/xehpsdv: move scratch page to system memory
>   drm/i915/xehpsdv: support 64K GTT pages
>   drm/i915/selftests: account for min_alignment in GTT selftests
>   drm/i915/xehpsdv: implement memory coloring
>   drm/i915/uapi: document behaviour for DG2 64K support
> 
> Ramalingam C (1):
>   Doc/gpu/rfc/i915: i915 DG2 uAPI
> 
> Stuart Summers (1):
>   drm/i915: Add has_64k_pages flag
> 
>  Documentation/gpu/rfc/i915_dg2.rst|  47 ++
>  Documentation/gpu/rfc/index.rst   |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_stolen.c|   6 +-
>  .../gpu/drm/i915/gem/selftests/huge_pages.c   |  61 
>  .../i915/gem/selftest

Re: [PATCH 1/3] drm:Enable buddy allocator support

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 07:05:34PM +0530, Arunpravin wrote:
> Port Intel buddy manager to drm root folder

One patch to move it 1:1, then follow-up patches to change it. Not
everything in one.

Also i915 needs to be adapted to use this too, or this just doesn't make
sense.

I'm also wondering whether we shouldn't have a ready-made ttm helper for this
so it all just glues in?
-Daniel
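
For reference, a minimal usage sketch of the API this patch introduces (error
handling trimmed; drm_buddy_fini() as the teardown counterpart is assumed, it
is not visible in this hunk):

#include <drm/drm_buddy.h>
#include <linux/sizes.h>

static int init_vram_mm(struct drm_buddy_mm *mm)
{
	/* Manage 1 GiB of VRAM in minimum 4 KiB chunks, so
	 * max_order = ilog2(SZ_1G) - ilog2(SZ_4K) = 30 - 12 = 18.
	 */
	return drm_buddy_init(mm, SZ_1G, SZ_4K);
}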

> Implemented range allocation support for the provided order
> Implemented TOP-DOWN support
> Implemented freeing up unused pages on contiguous allocation
> Moved range allocation and freelist pickup into a single function
> 
> Signed-off-by: Arunpravin 
> ---
>  drivers/gpu/drm/Makefile|   2 +-
>  drivers/gpu/drm/drm_buddy.c | 705 
>  drivers/gpu/drm/drm_drv.c   |   3 +
>  include/drm/drm_buddy.h | 157 
>  4 files changed, 866 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/drm_buddy.c
>  create mode 100644 include/drm/drm_buddy.h
> 
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index a118692a6df7..fe1a2fc09675 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -18,7 +18,7 @@ drm-y   :=  drm_aperture.o drm_auth.o drm_cache.o \
>   drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
>   drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
>   drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
> - drm_managed.o drm_vblank_work.o
> + drm_managed.o drm_vblank_work.o drm_buddy.o
>  
>  drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o 
> drm_dma.o \
>   drm_legacy_misc.o drm_lock.o drm_memory.o 
> drm_scatter.o \
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> new file mode 100644
> index ..8cd118574665
> --- /dev/null
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -0,0 +1,705 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#include 
> +#include 
> +
> +#include 
> +
> +static struct kmem_cache *slab_blocks;
> +
> +static struct drm_buddy_block *drm_block_alloc(struct drm_buddy_mm *mm,
> +struct drm_buddy_block *parent,
> +unsigned int order,
> +u64 offset)
> +{
> + struct drm_buddy_block *block;
> +
> + BUG_ON(order > DRM_BUDDY_MAX_ORDER);
> +
> + block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL);
> + if (!block)
> + return NULL;
> +
> + block->header = offset;
> + block->header |= order;
> + block->parent = parent;
> +
> + BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
> + return block;
> +}
> +
> +static void drm_block_free(struct drm_buddy_mm *mm,
> +struct drm_buddy_block *block)
> +{
> + kmem_cache_free(slab_blocks, block);
> +}
> +
> +static void mark_allocated(struct drm_buddy_block *block)
> +{
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_ALLOCATED;
> +
> + list_del(&block->link);
> +}
> +
> +static void mark_free(struct drm_buddy_mm *mm,
> +   struct drm_buddy_block *block)
> +{
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_FREE;
> +
> + list_add(&block->link,
> + &mm->free_list[drm_buddy_block_order(block)]);
> +}
> +
> +static void mark_split(struct drm_buddy_block *block)
> +{
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_SPLIT;
> +
> + list_del(&block->link);
> +}
> +
> +/**
> + * drm_buddy_init - init memory manager
> + *
> + * @mm: DRM buddy manager to initialize
> + * @size: size in bytes to manage
> + * @chunk_size: minimum page size in bytes for our allocations
> + *
> + * Initializes the memory manager and its resources.
> + *
> + * Returns:
> + * 0 on success, error code on failure.
> + */
> +int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size)
> +{
> + unsigned int i;
> + u64 offset;
> +
> + if (size < chunk_size)
> + return -EINVAL;
> +
> + if (chunk_size < PAGE_SIZE)
> + return -EINVAL;
> +
> + if (!is_power_of_2(chunk_size))
> + return -EINVAL;
> +
> + size = round_down(size, chunk_size);
> +
> + mm->size = size;
> + mm->avail = size;
> + mm->chunk_size = chunk_size;
> + mm->max_order = ilog2(size) - ilog2(chunk_size);
> +
> + BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER);
> +
> + mm->free_list = kmalloc_array(mm->max_order + 1,
> +   sizeof(struct list_head),
> +   GFP_KERNEL);
> + if (!mm->free_list)
> + return -ENOMEM;
> +
> + for (i = 0; i <= mm->max_order; ++i)
> + INIT_LIST_HEAD(&mm->free_list[i]);
> +
> + mm->n_roots = hweight

Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 02:32:03PM +0200, Maarten Lankhorst wrote:
> No memory should be allocated when calling i915_gem_object_wait,
> because it may be called to idle a BO when evicting memory.
> 
> Fix this by using dma_resv_iter helpers to call
> i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
> Also remove dma_resv_prune, it's questionable.
> 
> This will result in the following lockdep splat.
> 
> <4> [83.538517] ==
> <4> [83.538520] WARNING: possible circular locking dependency detected
> <4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
> <4> [83.538525] --
> <4> [83.538527] gem_render_line/5242 is trying to acquire lock:
> <4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
> __kmalloc_track_caller+0x56/0x270
> <4> [83.538538]
> but task is already holding lock:
> <4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
> i915_vma_pin_ww+0x1c7/0x970 [i915]
> <4> [83.538638]
> which lock already depends on the new lock.
> <4> [83.538642]
> the existing dependency chain (in reverse order) is:
> <4> [83.538645]
> -> #1 (&vm->mutex/1){+.+.}-{3:3}:
> <4> [83.538649]lock_acquire+0xd3/0x310
> <4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
> <4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
> <4> [83.538794]ppgtt_init+0x55/0x70 [i915]
> <4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
> <4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
> <4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
> <4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
> <4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
> <4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
> <4> [83.539197]pci_device_probe+0x9b/0x110
> <4> [83.539201]really_probe+0x1b0/0x3b0
> <4> [83.539205]__driver_probe_device+0xf6/0x170
> <4> [83.539208]driver_probe_device+0x1a/0x90
> <4> [83.539210]__driver_attach+0x93/0x160
> <4> [83.539213]bus_for_each_dev+0x72/0xc0
> <4> [83.539216]bus_add_driver+0x14b/0x1f0
> <4> [83.539220]driver_register+0x66/0xb0
> <4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
> <4> [83.539227]do_one_initcall+0x53/0x2e0
> <4> [83.539230]do_init_module+0x55/0x200
> <4> [83.539234]load_module+0x2700/0x2980
> <4> [83.539237]__do_sys_finit_module+0xaa/0x110
> <4> [83.539241]do_syscall_64+0x37/0xb0
> <4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
> <4> [83.539247]
> -> #0 (fs_reclaim){+.+.}-{0:0}:
> <4> [83.539251]validate_chain+0xb37/0x1e70
> <4> [83.539254]__lock_acquire+0x5a1/0xb70
> <4> [83.539258]lock_acquire+0xd3/0x310
> <4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
> <4> [83.539264]__kmalloc_track_caller+0x56/0x270
> <4> [83.539267]krealloc+0x48/0xa0
> <4> [83.539270]dma_resv_get_fences+0x1c3/0x280
> <4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
> <4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
> <4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
> <4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
> <4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
> <4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
> <4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
> <4> [83.539759]drm_ioctl_kernel+0xac/0x140
> <4> [83.539763]drm_ioctl+0x201/0x3d0
> <4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
> <4> [83.539769]do_syscall_64+0x37/0xb0
> <4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
> <4> [83.539775]
> other info that might help us debug this:
> <4> [83.539778]  Possible unsafe locking scenario:
> <4> [83.539781]   CPU0CPU1
> <4> [83.539783]
> <4> [83.539785]   lock(&vm->mutex/1);
> <4> [83.539788]lock(fs_reclaim);
> <4> [83.539791]lock(&vm->mutex/1);
> <4> [83.539794]   lock(fs_reclaim);
> <4> [83.539796]
>  *** DEADLOCK ***
> <4> [83.539799] 3 locks held by gem_render_line/5242:
> <4> [83.539802]  #0: c9d4bbf0 
> (reservation_ww_class_acquire){+.+.}-{0:0}, at: 
> i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
> <4> [83.539870]  #1: 88811e48bae8 
> (reservation_ww_class_mutex){+.+.}-{3:3}, at: eb_validate_vmas+0x81/0x8e0 
> [i915]
> <4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
> i915_vma_pin_ww+0x1c7/0x970 [i915]
> <4> [83.540011]
> stack backtrace:
> <4> [83.540014] CPU: 2 PID: 5242 Comm: gem_render_line Not tainted 
> 5.15.0-rc5-CI-Trybot_8062+ #1
> <4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, 
> BIOS TNTGL357.0038.2020.1124.1648 11/24/2020
> <4> [83.540023] Call Trace:

Re: [PATCH 03/28] dma-buf: add dma_resv selftest v3

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:17PM +0200, Christian König wrote:
> Just exercising a very minor subset of the functionality, but already
> proven useful.
> 
> v2: add missing locking
> v3: some more cleanup and consolidation, add unlocked test as well
> 
> Signed-off-by: Christian König 

Yeah this is great, since if we then get some specific bug later on it's
going to be very easy to add the unit test for the precise bug hopefully.

I scrolled through, looks correct.

Reviewed-by: Daniel Vetter 
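
The structure above also makes future regression tests cheap to add: a new
case is just another static function with the same shape, wired into the
subtest table. A hypothetical skeleton (the name is made up):

static int test_something_new(void *arg)
{
	struct dma_resv resv;
	int r;

	dma_resv_init(&resv);
	r = dma_resv_lock(&resv, NULL);
	if (r)
		goto out;
	/* ... exercise the behaviour under test here ... */
	dma_resv_unlock(&resv);
out:
	dma_resv_fini(&resv);
	return r;
}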

> ---
>  drivers/dma-buf/Makefile  |   3 +-
>  drivers/dma-buf/selftests.h   |   1 +
>  drivers/dma-buf/st-dma-resv.c | 282 ++
>  3 files changed, 285 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/dma-buf/st-dma-resv.c
> 
> diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
> index 1ef021273a06..511805dbeb75 100644
> --- a/drivers/dma-buf/Makefile
> +++ b/drivers/dma-buf/Makefile
> @@ -11,6 +11,7 @@ obj-$(CONFIG_DMABUF_SYSFS_STATS) += dma-buf-sysfs-stats.o
>  dmabuf_selftests-y := \
>   selftest.o \
>   st-dma-fence.o \
> - st-dma-fence-chain.o
> + st-dma-fence-chain.o \
> + st-dma-resv.o
>  
>  obj-$(CONFIG_DMABUF_SELFTESTS)   += dmabuf_selftests.o
> diff --git a/drivers/dma-buf/selftests.h b/drivers/dma-buf/selftests.h
> index bc8cea67bf1e..97d73aaa31da 100644
> --- a/drivers/dma-buf/selftests.h
> +++ b/drivers/dma-buf/selftests.h
> @@ -12,3 +12,4 @@
>  selftest(sanitycheck, __sanitycheck__) /* keep first (igt selfcheck) */
>  selftest(dma_fence, dma_fence)
>  selftest(dma_fence_chain, dma_fence_chain)
> +selftest(dma_resv, dma_resv)
> diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
> new file mode 100644
> index ..50d3791ccb8c
> --- /dev/null
> +++ b/drivers/dma-buf/st-dma-resv.c
> @@ -0,0 +1,282 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> +* Copyright © 2019 Intel Corporation
> +* Copyright © 2021 Advanced Micro Devices, Inc.
> +*/
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "selftest.h"
> +
> +static struct spinlock fence_lock;
> +
> +static const char *fence_name(struct dma_fence *f)
> +{
> + return "selftest";
> +}
> +
> +static const struct dma_fence_ops fence_ops = {
> + .get_driver_name = fence_name,
> + .get_timeline_name = fence_name,
> +};
> +
> +static struct dma_fence *alloc_fence(void)
> +{
> + struct dma_fence *f;
> +
> + f = kmalloc(sizeof(*f), GFP_KERNEL);
> + if (!f)
> + return NULL;
> +
> + dma_fence_init(f, &fence_ops, &fence_lock, 0, 0);
> + return f;
> +}
> +
> +static int sanitycheck(void *arg)
> +{
> + struct dma_resv resv;
> + struct dma_fence *f;
> + int r;
> +
> + f = alloc_fence();
> + if (!f)
> + return -ENOMEM;
> +
> + dma_fence_signal(f);
> + dma_fence_put(f);
> +
> + dma_resv_init(&resv);
> + r = dma_resv_lock(&resv, NULL);
> + if (r)
> + pr_err("Resv locking failed\n");
> + else
> + dma_resv_unlock(&resv);
> + dma_resv_fini(&resv);
> + return r;
> +}
> +
> +static int test_signaling(void *arg, bool shared)
> +{
> + struct dma_resv resv;
> + struct dma_fence *f;
> + int r;
> +
> + f = alloc_fence();
> + if (!f)
> + return -ENOMEM;
> +
> + dma_resv_init(&resv);
> + r = dma_resv_lock(&resv, NULL);
> + if (r) {
> + pr_err("Resv locking failed\n");
> + goto err_free;
> + }
> +
> + if (shared) {
> + r = dma_resv_reserve_shared(&resv, 1);
> + if (r) {
> + pr_err("Resv shared slot allocation failed\n");
> + goto err_unlock;
> + }
> +
> + dma_resv_add_shared_fence(&resv, f);
> + } else {
> + dma_resv_add_excl_fence(&resv, f);
> + }
> +
> + if (dma_resv_test_signaled(&resv, shared)) {
> + pr_err("Resv unexpectedly signaled\n");
> + r = -EINVAL;
> + goto err_unlock;
> + }
> + dma_fence_signal(f);
> + if (!dma_resv_test_signaled(&resv, shared)) {
> + pr_err("Resv not reporting signaled\n");
> + r = -EINVAL;
> + goto err_unlock;
> + }
> +err_unlock:
> + dma_resv_unlock(&resv);
> +err_free:
> + dma_resv_fini(&resv);
> + dma_fence_put(f);
> + return r;
> +}
> +
> +static int test_excl_signaling(void *arg)
> +{
> + return test_signaling(arg, false);
> +}
> +
> +static int test_shared_signaling(void *arg)
> +{
> + return test_signaling(arg, true);
> +}
> +
> +static int test_for_each(void *arg, bool shared)
> +{
> + struct dma_resv_iter cursor;
> + struct dma_fence *f, *fence;
> + struct dma_resv resv;
> + int r;
> +
> + f = alloc_fence();
> + if (!f)
> + return -ENOMEM;
> +
> + dma_resv_init(&resv);
> + r = dma_resv_lock(&resv, NULL);
> + if (r) {
> + pr_err("Resv 

Re: [PATCH 11/28] drm/amdgpu: use the new iterator in amdgpu_sync_resv

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:25PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> Signed-off-by: Christian König 

Reviewed-by: Daniel Vetter 

Yeah these iterators rock :-)
-Daniel
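
For reference, the locked iterator pattern the series converges on boils down
to the following sketch (handle() is a made-up placeholder for the per-fence
work; passing true as the third argument also visits the shared fences):

static void walk_fences(struct dma_resv *resv)
{
	struct dma_resv_iter cursor;
	struct dma_fence *f;

	/* The reservation object must stay locked across the walk, so
	 * no RCU tricks or seqcount retries are needed in the caller.
	 */
	dma_resv_for_each_fence(&cursor, resv, true, f)
		handle(f);	/* placeholder for per-fence work */
}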

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 44 
>  1 file changed, 14 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> index 862eb3c1c4c5..f7d8487799b2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> @@ -252,41 +252,25 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct 
> amdgpu_sync *sync,
>struct dma_resv *resv, enum amdgpu_sync_mode mode,
>void *owner)
>  {
> - struct dma_resv_list *flist;
> + struct dma_resv_iter cursor;
>   struct dma_fence *f;
> - unsigned i;
> - int r = 0;
> + int r;
>  
>   if (resv == NULL)
>   return -EINVAL;
>  
> - /* always sync to the exclusive fence */
> - f = dma_resv_excl_fence(resv);
> - dma_fence_chain_for_each(f, f) {
> - struct dma_fence_chain *chain = to_dma_fence_chain(f);
> -
> - if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
> -chain->fence : f)) {
> - r = amdgpu_sync_fence(sync, f);
> - dma_fence_put(f);
> - if (r)
> - return r;
> - break;
> - }
> - }
> -
> - flist = dma_resv_shared_list(resv);
> - if (!flist)
> - return 0;
> -
> - for (i = 0; i < flist->shared_count; ++i) {
> - f = rcu_dereference_protected(flist->shared[i],
> -   dma_resv_held(resv));
> -
> - if (amdgpu_sync_test_fence(adev, mode, owner, f)) {
> - r = amdgpu_sync_fence(sync, f);
> - if (r)
> - return r;
> + dma_resv_for_each_fence(&cursor, resv, true, f) {
> + dma_fence_chain_for_each(f, f) {
> + struct dma_fence_chain *chain = to_dma_fence_chain(f);
> +
> + if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
> +chain->fence : f)) {
> + r = amdgpu_sync_fence(sync, f);
> + dma_fence_put(f);
> + if (r)
> + return r;
> + break;
> + }
>   }
>   }
>   return 0;
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 12/28] drm/amdgpu: use new iterator in amdgpu_ttm_bo_eviction_valuable

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:26PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 14 --
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index e8d70b6e6737..722e3c9e8882 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1345,10 +1345,9 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
> ttm_buffer_object *bo,
>   const struct ttm_place *place)
>  {
>   unsigned long num_pages = bo->resource->num_pages;
> + struct dma_resv_iter resv_cursor;
>   struct amdgpu_res_cursor cursor;
> - struct dma_resv_list *flist;
>   struct dma_fence *f;
> - int i;
>  
>   /* Swapout? */
>   if (bo->resource->mem_type == TTM_PL_SYSTEM)
> @@ -1362,14 +1361,9 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
> ttm_buffer_object *bo,
>* If true, then return false as any KFD process needs all its BOs to
>* be resident to run successfully
>*/
> - flist = dma_resv_shared_list(bo->base.resv);
> - if (flist) {
> - for (i = 0; i < flist->shared_count; ++i) {
> - f = rcu_dereference_protected(flist->shared[i],
> - dma_resv_held(bo->base.resv));
> - if (amdkfd_fence_check_mm(f, current->mm))
> - return false;
> - }
> + dma_resv_for_each_fence(&resv_cursor, bo->base.resv, true, f) {

^false?

At least I'm not seeing the code look at the exclusive fence here.
-Daniel

> + if (amdkfd_fence_check_mm(f, current->mm))
> + return false;
>   }
>  
>   switch (bo->resource->mem_type) {
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] fbdev: Garbage collect fbdev scrolling acceleration, part 1 (from TODO list)

2021-10-13 Thread Thomas Zimmermann

Hi

On 01.10.21 at 14:48, Claudio Suarez wrote:

On Fri, Oct 01, 2021 at 10:21:44AM +0200, Thomas Zimmermann wrote:

Hi

On 30.09.21 at 17:10, Claudio wrote:

Scroll acceleration is disabled in fbcon by hard-wiring
p->scrollmode = SCROLL_REDRAW. Remove the obsolete code in fbcon.c
and fbdev/core/

Signed-off-by: Claudio Suarez 
---

- This is a task in the TODO list Documentation/gpu/todo.rst
- The contact in the task is Daniel Vetter. He is/you are in copy.
- To ease things and save time, I did a patch. It is included in this
message. I can redo it if there is something wrong.
- I tested it in some configurations.

My plan for new patches in this task:
- A bunch of patches to remove code from drivers: fb_copyarea and related.
- Simplify the code around fbcon_ops as much as possible to remove the hooks
as the TODO suggests.
- Remove fb_copyarea in headers and exported symbols: cfb_copyarea, etc. This
must be done when all the drivers are changed.

I think that the correct list to ask questions about this
is linux-fb...@vger.kernel.org . Is that correct?
My question: I can develop the new changes. I can test on two computers/two
drivers. Is there a way to test the rest of the patches? I don't have the
hardware to test them. Is anyone helping with this? Only regression tests are
needed. I can test other patches in return.

Thank you.
Claudio Suarez.

Patch follows:

   Documentation/gpu/todo.rst  |  13 +-
   drivers/video/fbdev/core/bitblit.c  |  16 -
   drivers/video/fbdev/core/fbcon.c| 509 ++--
   drivers/video/fbdev/core/fbcon.h|  59 
   drivers/video/fbdev/core/fbcon_ccw.c|  28 +-
   drivers/video/fbdev/core/fbcon_cw.c |  28 +-
   drivers/video/fbdev/core/fbcon_rotate.h |   9 -
   drivers/video/fbdev/core/fbcon_ud.c |  37 +--
   drivers/video/fbdev/core/tileblit.c |  16 -
   drivers/video/fbdev/skeletonfb.c|  12 +-
   include/linux/fb.h  |   2 +-
   11 files changed, 51 insertions(+), 678 deletions(-)


Nice stats :)

I looked through it and it looks good. Maybe double-check that everything
still builds.

Acked-by: Thomas Zimmermann 



Yes, it still builds :)
I had built with some different .config options, including
allyesconfig, allno, some randoms and debian default config. I tested
some .config options related to fbdev. I spent time running some kernels
with different parameters and everything was ok.
Today, I've just applied the patch to source from two gits: Linus
rc and drm. Both have built ok.
I think that I did enough tests to ensure it works fine. This code is going
to run on many computers, mine included!
Of course, if you or anyone is worried about something specific, please,
tell me and I can check and re-check it. I don't want to miss something
important.

Thank you!


I added the patch to drm-misc-next.

Best regards
Thomas



Best regards
Claudio Suarez


Best regards
Thomas



diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 12e61869939e..bb1e04bbf4fb 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -314,16 +314,19 @@ Level: Advanced
   Garbage collect fbdev scrolling acceleration
   
-Scroll acceleration is disabled in fbcon by hard-wiring p->scrollmode =
-SCROLL_REDRAW. There's a ton of code this will allow us to remove:
+Scroll acceleration has been disabled in fbcon. Now it works as the old
+SCROLL_REDRAW mode. A ton of code was removed in fbcon.c and the hook bmove was
+removed from fbcon_ops.
+Remaining tasks:
-- lots of code in fbcon.c
-
-- a bunch of the hooks in fbcon_ops, maybe the remaining hooks could be called
+- a bunch of the hooks in fbcon_ops could be removed or simplified by calling
 directly instead of the function table (with a switch on p->rotate)
   - fb_copyarea is unused after this, and can be deleted from all drivers
+- after that, fb_copyarea can be deleted from fb_ops in include/linux/fb.h as
+  well as cfb_copyarea
+
   Note that not all acceleration code can be deleted, since clearing and cursor
   support is still accelerated, which might be good candidates for further
   deletion projects.
diff --git a/drivers/video/fbdev/core/bitblit.c 
b/drivers/video/fbdev/core/bitblit.c
index f98e8f298bc1..01fae2c96965 100644
--- a/drivers/video/fbdev/core/bitblit.c
+++ b/drivers/video/fbdev/core/bitblit.c
@@ -43,21 +43,6 @@ static void update_attr(u8 *dst, u8 *src, int attribute,
}
   }
-static void bit_bmove(struct vc_data *vc, struct fb_info *info, int sy,
- int sx, int dy, int dx, int height, int width)
-{
-   struct fb_copyarea area;
-
-   area.sx = sx * vc->vc_font.width;
-   area.sy = sy * vc->vc_font.height;
-   area.dx = dx * vc->vc_font.width;
-   area.dy = dy * vc->vc_font.height;
-   area.height = height * vc->vc_font.height;
-   area.width = width * vc->vc_font.width;
-
-   inf

Re: [PATCH 13/28] drm/amdgpu: use new iterator in amdgpu_vm_prt_fini

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:27PM +0200, Christian König wrote:
> No need to actually allocate an array of fences here.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 26 +-
>  1 file changed, 5 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 6b15cad78de9..e42dd79ed6f4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2090,30 +2090,14 @@ static void amdgpu_vm_free_mapping(struct 
> amdgpu_device *adev,
>  static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm 
> *vm)
>  {
>   struct dma_resv *resv = vm->root.bo->tbo.base.resv;
> - struct dma_fence *excl, **shared;
> - unsigned i, shared_count;
> - int r;
> + struct dma_resv_iter cursor;
> + struct dma_fence *fence;
>  
> - r = dma_resv_get_fences(resv, &excl, &shared_count, &shared);
> - if (r) {
> - /* Not enough memory to grab the fence list, as last resort
> -  * block for all the fences to complete.
> -  */
> - dma_resv_wait_timeout(resv, true, false,
> - MAX_SCHEDULE_TIMEOUT);
> - return;
> - }
> -
> - /* Add a callback for each fence in the reservation object */
> - amdgpu_vm_prt_get(adev);

I was confused for a bit why the old code wouldn't leak a refcount for
!excl case, but it's all handled.

Not sure amdgpu_vm_add_prt_cb still needs to handle the !fence case, it's
a bit of a gotcha but I guess it can happen?

Either way, looks correct.

Reviewed-by: Daniel Vetter 

> - amdgpu_vm_add_prt_cb(adev, excl);
> -
> - for (i = 0; i < shared_count; ++i) {
> + dma_resv_for_each_fence(&cursor, resv, true, fence) {
> + /* Add a callback for each fence in the reservation object */
>   amdgpu_vm_prt_get(adev);
> - amdgpu_vm_add_prt_cb(adev, shared[i]);
> + amdgpu_vm_add_prt_cb(adev, fence);
>   }
> -
> - kfree(shared);
>  }
>  
>  /**
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v9 2/4] dt-bindings: mfd: logicvc: Add patternProperties for the display

2021-10-13 Thread Geert Uytterhoeven
Hi Lee,

On Wed, Sep 22, 2021 at 4:46 PM Lee Jones  wrote:
> On Tue, 14 Sep 2021, Paul Kocialkowski wrote:
> > The LogiCVC multi-function device has a display part which is now
> > described in its binding. Add a patternProperties match for it.
> >
> > Signed-off-by: Paul Kocialkowski 
> > ---
> >  Documentation/devicetree/bindings/mfd/xylon,logicvc.yaml | 3 +++
> >  1 file changed, 3 insertions(+)
>
> Applied, thanks.

Unknown file referenced: [Errno 2] No such file or directory:
'.../dt-schema/dtschema/schemas/display/xylon,logicvc-display.yaml'

as 1/4 hasn't been applied yet.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 03/14] drm/i915/xehpsdv: enforce min GTT alignment

2021-10-13 Thread Matthew Auld

On 13/10/2021 14:38, Daniel Vetter wrote:

On Mon, Oct 11, 2021 at 09:41:44PM +0530, Ramalingam C wrote:

From: Matthew Auld 

For local-memory objects we need to align the GTT addresses to 64K, both
for the ppgtt and ggtt.

Signed-off-by: Matthew Auld 
Signed-off-by: Stuart Summers 
Signed-off-by: Ramalingam C 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 


Do we still need this with relocations removed? Userspace is picking all
the addresses for us, so all we have to check is whether userspace got it
right.


Yeah, for OFFSET_FIXED this just validates that the provided address is 
correctly aligned to 64K, while for the in-kernel insertion stuff we 
still need to allocate an address that is aligned to 64K. Setting the 
alignment here handles both cases.
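
Roughly, the PIN_OFFSET_FIXED side of that reduces to the following sketch
(not the exact upstream code; @alignment has already been raised to 64K for
lmem objects by the hunk below):

	if (flags & PIN_OFFSET_FIXED) {
		u64 offset = flags & PIN_OFFSET_MASK;

		/* Only validate the address userspace picked. */
		if (!IS_ALIGNED(offset, alignment))
			return -EINVAL;
	}
	/* ...otherwise the in-kernel allocator searches with @alignment. */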



-Daniel



---
  drivers/gpu/drm/i915/i915_vma.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4b7fc4647e46..1ea1fa08efdf 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -670,8 +670,13 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 
alignment, u64 flags)
}
  
  	color = 0;

-   if (vma->obj && i915_vm_has_cache_coloring(vma->vm))
-   color = vma->obj->cache_level;
+   if (vma->obj) {
+   if (HAS_64K_PAGES(vma->vm->i915) && 
i915_gem_object_is_lmem(vma->obj))
+   alignment = max(alignment, I915_GTT_PAGE_SIZE_64K);
+
+   if (i915_vm_has_cache_coloring(vma->vm))
+   color = vma->obj->cache_level;
+   }
  
  	if (flags & PIN_OFFSET_FIXED) {

u64 offset = flags & PIN_OFFSET_MASK;
--
2.20.1





Re: [PATCH 14/28] drm/msm: use new iterator in msm_gem_describe

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:28PM +0200, Christian König wrote:
> Simplifying the code a bit. Also drop the RCU read side lock since the
> object is locked anyway.
> 
> Untested since I can't get the driver to compile on !ARM.

Cross-compiler install is pretty easy and you should have that for pushing
drm changes to drm-misc :-)

> Signed-off-by: Christian König 

Assuming this compiles, it looks correct.

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/msm/msm_gem.c | 19 +--
>  1 file changed, 5 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index 40a9863f5951..5bd511f07c07 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -880,7 +880,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct 
> seq_file *m,
>  {
>   struct msm_gem_object *msm_obj = to_msm_bo(obj);
>   struct dma_resv *robj = obj->resv;
> - struct dma_resv_list *fobj;
> + struct dma_resv_iter cursor;
>   struct dma_fence *fence;
>   struct msm_gem_vma *vma;
>   uint64_t off = drm_vma_node_start(&obj->vma_node);
> @@ -955,22 +955,13 @@ void msm_gem_describe(struct drm_gem_object *obj, 
> struct seq_file *m,
>   seq_puts(m, "\n");
>   }
>  
> - rcu_read_lock();
> - fobj = dma_resv_shared_list(robj);
> - if (fobj) {
> - unsigned int i, shared_count = fobj->shared_count;
> -
> - for (i = 0; i < shared_count; i++) {
> - fence = rcu_dereference(fobj->shared[i]);
> + dma_resv_for_each_fence(&cursor, robj, true, fence) {
> + if (dma_resv_iter_is_exclusive(&cursor))
> + describe_fence(fence, "Exclusive", m);
> + else
>   describe_fence(fence, "Shared", m);
> - }
>   }
>  
> - fence = dma_resv_excl_fence(robj);
> - if (fence)
> - describe_fence(fence, "Exclusive", m);
> - rcu_read_unlock();
> -
>   msm_gem_unlock(obj);
>  }
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v8 1/4] dt-bindings: display: Document the Xylon LogiCVC display controller

2021-10-13 Thread Geert Uytterhoeven
Hi Paul,

On Wed, Dec 23, 2020 at 10:32 PM Paul Kocialkowski
 wrote:
> The Xylon LogiCVC is a display controller implemented as programmable
> logic in Xilinx FPGAs.
>
> Signed-off-by: Paul Kocialkowski 
> Acked-by: Rob Herring 

> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/xylon,logicvc-display.yaml
> @@ -0,0 +1,313 @@

> +  clock-names:
> +minItems: 1
> +maxItems: 4

After applying this to my local tree, as it is a dependency for 2/4 in
for-mfd-next:

Documentation/devicetree/bindings/display/xylon,logicvc-display.yaml:
properties:clock-names: {'required': ['maxItems']} is not allowed for
{'minItems': 1, 'maxItems': 4, 'items': [{'const': 'vclk'}, {'enum':
['vclk2', 'lvdsclk', 'lvdsclkn']}, {'enum': ['vclk2', 'lvdsclk',
'lvdsclkn']}, {'enum': ['vclk2', 'lvdsclk', 'lvdsclkn']}]}
hint: "maxItems" is not needed with an "items" list
from schema $id: http://devicetree.org/meta-schemas/items.yaml#

> +items:
> +  # vclk is required and must be provided as first item.
> +  - const: vclk
> +  # Other clocks are optional and can be provided in any order.
> +  - enum:
> +  - vclk2
> +  - lvdsclk
> +  - lvdsclkn
> +  - enum:
> +  - vclk2
> +  - lvdsclk
> +  - lvdsclkn
> +  - enum:
> +  - vclk2
> +  - lvdsclk
> +  - lvdsclkn
> +

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 15/28] drm/radeon: use new iterator in radeon_sync_resv

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:29PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> Signed-off-by: Christian König 

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/radeon/radeon_sync.c | 22 +++---
>  1 file changed, 3 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_sync.c 
> b/drivers/gpu/drm/radeon/radeon_sync.c
> index 9257b60144c4..b991ba1bcd51 100644
> --- a/drivers/gpu/drm/radeon/radeon_sync.c
> +++ b/drivers/gpu/drm/radeon/radeon_sync.c
> @@ -91,33 +91,17 @@ int radeon_sync_resv(struct radeon_device *rdev,
>struct dma_resv *resv,
>bool shared)
>  {
> - struct dma_resv_list *flist;
> - struct dma_fence *f;
> + struct dma_resv_iter cursor;
>   struct radeon_fence *fence;
> - unsigned i;
> + struct dma_fence *f;
>   int r = 0;
>  
> - /* always sync to the exclusive fence */
> - f = dma_resv_excl_fence(resv);
> - fence = f ? to_radeon_fence(f) : NULL;
> - if (fence && fence->rdev == rdev)
> - radeon_sync_fence(sync, fence);
> - else if (f)
> - r = dma_fence_wait(f, true);
> -
> - flist = dma_resv_shared_list(resv);
> - if (shared || !flist || r)
> - return r;
> -
> - for (i = 0; i < flist->shared_count; ++i) {
> - f = rcu_dereference_protected(flist->shared[i],
> -   dma_resv_held(resv));
> + dma_resv_for_each_fence(&cursor, resv, shared, f) {
>   fence = to_radeon_fence(f);
>   if (fence && fence->rdev == rdev)
>   radeon_sync_fence(sync, fence);
>   else
>   r = dma_fence_wait(f, true);
> -
>   if (r)
>   break;
>   }
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-10-13 Thread Maxime Ripard
Hi Caleb,

On Thu, Sep 30, 2021 at 09:20:52PM +0100, Caleb Connolly wrote:
> Hi,
> 
> On 30/09/2021 20:49, Amit Pundir wrote:
> > On Thu, 30 Sept 2021 at 04:50, Rob Clark  wrote:
> > > 
> > > On Wed, Sep 29, 2021 at 2:51 PM John Stultz  
> > > wrote:
> > > > 
> > > > On Wed, Sep 29, 2021 at 2:32 PM John Stultz  
> > > > wrote:
> > > > > On Wed, Sep 29, 2021 at 2:27 PM John Stultz  
> > > > > wrote:
> > > > > > On Fri, Sep 10, 2021 at 3:12 AM Maxime Ripard  
> > > > > > wrote:
> > > > > > > The best practice to avoid those issues is to register its 
> > > > > > > functions only after
> > > > > > > all its dependencies are live. We also shouldn't wait any longer 
> > > > > > > than we should
> > > > > > > to play nice with the other components that are waiting for us, 
> > > > > > > so in our case
> > > > > > > that would mean moving the DSI device registration to the bridge 
> > > > > > > probe.
> > > > > > > 
> > > > > > > I also had a look at all the DSI hosts, and it seems that exynos, 
> > > > > > > kirin and msm
> > > > > > > would be affected by this and wouldn't probe anymore after those 
> > > > > > > changes.
> > > > > > > Exynos and kirin seems to be simple enough for a mechanical 
> > > > > > > change (that still
> > > > > > > requires to be tested), but the changes in msm seemed to be far 
> > > > > > > more important
> > > > > > > and I wasn't confortable doing them.
> > > > > > 
> > > > > > 
> > > > > > Hey Maxime,
> > > > > >Sorry for taking so long to get to this, but now that plumbers is
> > > > > > over I've had a chance to check it out on kirin
> > > > > > 
> > > > > > Rob Clark pointed me to his branch with some fixups here:
> > > > > > 
> > > > > > https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework
> > > > > > 
> > > > > > But trying to boot hikey with that, I see the following loop 
> > > > > > indefinitely:
> > > > > > [4.632132] adv7511 2-0039: supply avdd not found, using dummy 
> > > > > > regulator
> > > > > > [4.638961] adv7511 2-0039: supply dvdd not found, using dummy 
> > > > > > regulator
> > > > > > [4.645741] adv7511 2-0039: supply pvdd not found, using dummy 
> > > > > > regulator
> > > > > > [4.652483] adv7511 2-0039: supply a2vdd not found, using dummy 
> > > > > > regulator
> > > > > > [4.659342] adv7511 2-0039: supply v3p3 not found, using dummy 
> > > > > > regulator
> > > > > > [4.666086] adv7511 2-0039: supply v1p2 not found, using dummy 
> > > > > > regulator
> > > > > > [4.681898] adv7511 2-0039: failed to find dsi host
> > > > > 
> > > > > I just realized Rob's tree is missing the kirin patch. My apologies!
> > > > > I'll retest and let you know.
> > > > 
> > > > Ok, just retested including the kirin patch and unfortunately I'm
> > > > still seeing the same thing.  :(
> > > > 
> > > > Will dig a bit and let you know when I find more.
> > > 
> > > Did you have a chance to test it on anything using drm/msm with DSI
> > > panels?  That would at least confirm that I didn't miss anything in
> > > the drm/msm patch to swap the dsi-host vs bridge ordering..
> > 
> > Hi, smoke tested
> > https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework
> > on Pocophone F1 (sdm845 / A630) with v5.15-rc3. I see no obvious
> > regressions in my limited testing so far including video (youtube)
> > playback.
> Tested on the OnePlus 6 too booting AOSP, works fine. This *fixes*
> FBDEV_EMULATION (so we can get a working framebuffer console) which was
> otherwise broken on 5.15.
> 
> However it spits out some warnings during boot: 
> https://p.calebs.dev/gucysowyna.yaml

Thanks for testing. It looks like the runtime_pm ordering between the
msm devices changed a bit with the conversion Rob did.

Rob, do you know what could be going on?

Thanks!
Maxime




Re: [PATCH 17/28] drm/i915: use the new iterator in i915_gem_busy_ioctl v2

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 02:44:50PM +0200, Christian König wrote:
> On 05.10.21 at 14:40, Tvrtko Ursulin wrote:
> > 
> > On 05/10/2021 12:37, Christian König wrote:
> > > This makes the function much simpler since the complex
> > > retry logic is now handled elsewhere.
> > > 
> > > Signed-off-by: Christian König 
> > > Reviewed-by: Tvrtko Ursulin 
> > 
> > Reminder - r-b was retracted until at least more text is added to commit
> > message about pros and cons. But really some discussion had inside the
> > i915 team on the topic.
> 
> Sure, going to move those to a different branch.
> 
> But I really only see the following options:
> 1. Grab the lock.
> 2. Use the _unlocked variant with get/put.
> 3. Add another _rcu iterator just for this case.
> 
I'm fine with either, but Daniel pretty much already rejected #3, and #2/#1
have more overhead than the original one.

Anything that removes open-coded rcu/lockless magic from i915 gets my ack,
there's way too much of this everywhere. So on this:

Acked-by: Daniel Vetter 

I've asked Maarten to review the i915 ones for you, please pester him if
it's not happening :-)
-Daniel
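
For the archive, the unlocked pattern the conversion lands on looks roughly
like the sketch below (check() is a made-up placeholder); the important detail
is that any state accumulated so far must be thrown away whenever the iterator
restarts, because the fences changed underneath us:

static unsigned int busy_flags(struct drm_i915_gem_object *obj)
{
	struct dma_resv_iter cursor;
	struct dma_fence *fence;
	unsigned int busy = 0;

	dma_resv_iter_begin(&cursor, obj->base.resv, true);
	dma_resv_for_each_fence_unlocked(&cursor, fence) {
		if (dma_resv_iter_is_restarted(&cursor))
			busy = 0;	/* concurrent update, start over */
		busy |= check(fence);	/* placeholder per-fence check */
	}
	dma_resv_iter_end(&cursor);
	return busy;
}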

> 
> Regards,
> Christian.
> 
> > 
> > Regards,
> > 
> > Tvrtko
> > 
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++--
> > >   1 file changed, 14 insertions(+), 21 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > > b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > > index 6234e17259c1..dc72b36dae54 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > > @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void
> > > *data,
> > >   {
> > >   struct drm_i915_gem_busy *args = data;
> > >   struct drm_i915_gem_object *obj;
> > > -    struct dma_resv_list *list;
> > > -    unsigned int seq;
> > > +    struct dma_resv_iter cursor;
> > > +    struct dma_fence *fence;
> > >   int err;
> > >     err = -ENOENT;
> > > @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev,
> > > void *data,
> > >    * to report the overall busyness. This is what the wait-ioctl
> > > does.
> > >    *
> > >    */
> > > -retry:
> > > -    seq = raw_read_seqcount(&obj->base.resv->seq);
> > > -
> > > -    /* Translate the exclusive fence to the READ *and* WRITE engine */
> > > -    args->busy =
> > > busy_check_writer(dma_resv_excl_fence(obj->base.resv));
> > > -
> > > -    /* Translate shared fences to READ set of engines */
> > > -    list = dma_resv_shared_list(obj->base.resv);
> > > -    if (list) {
> > > -    unsigned int shared_count = list->shared_count, i;
> > > -
> > > -    for (i = 0; i < shared_count; ++i) {
> > > -    struct dma_fence *fence =
> > > -    rcu_dereference(list->shared[i]);
> > > -
> > > +    args->busy = 0;
> > > +    dma_resv_iter_begin(&cursor, obj->base.resv, true);
> > > +    dma_resv_for_each_fence_unlocked(&cursor, fence) {
> > > +    if (dma_resv_iter_is_restarted(&cursor))
> > > +    args->busy = 0;
> > > +
> > > +    if (dma_resv_iter_is_exclusive(&cursor))
> > > +    /* Translate the exclusive fence to the READ *and*
> > > WRITE engine */
> > > +    args->busy |= busy_check_writer(fence);
> > > +    else
> > > +    /* Translate shared fences to READ set of engines */
> > >   args->busy |= busy_check_reader(fence);
> > > -    }
> > >   }
> > > -
> > > -    if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq))
> > > -    goto retry;
> > > +    dma_resv_iter_end(&cursor);
> > >     err = 0;
> > >   out:
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 23/28] drm: use new iterator in drm_gem_fence_array_add_implicit v3

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:37PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> v2: add missing rcu_read_lock()/unlock()
> v3: switch to locked version
> 
> Signed-off-by: Christian König 
> Reviewed-by: Tvrtko Ursulin 

Please make sure you also apply this to the new copy of this code in
drm/sched. This one here is up for deletion, once I get all the driver
conversions I have landed ...
-Daniel

> ---
>  drivers/gpu/drm/drm_gem.c | 26 +-
>  1 file changed, 5 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 09c820045859..4dcdec6487bb 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1340,31 +1340,15 @@ int drm_gem_fence_array_add_implicit(struct xarray 
> *fence_array,
>struct drm_gem_object *obj,
>bool write)
>  {
> - int ret;
> - struct dma_fence **fences;
> - unsigned int i, fence_count;
> -
> - if (!write) {
> - struct dma_fence *fence =
> - dma_resv_get_excl_unlocked(obj->resv);
> -
> - return drm_gem_fence_array_add(fence_array, fence);
> - }
> + struct dma_resv_iter cursor;
> + struct dma_fence *fence;
> + int ret = 0;
>  
> - ret = dma_resv_get_fences(obj->resv, NULL,
> - &fence_count, &fences);
> - if (ret || !fence_count)
> - return ret;
> -
> - for (i = 0; i < fence_count; i++) {
> - ret = drm_gem_fence_array_add(fence_array, fences[i]);
> + dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
> + ret = drm_gem_fence_array_add(fence_array, fence);
>   if (ret)
>   break;
>   }
> -
> - for (; i < fence_count; i++)
> - dma_fence_put(fences[i]);
> - kfree(fences);
>   return ret;
>  }
>  EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 4/6] drm/i915: Add a struct dma_fence_work timeline

2021-10-13 Thread Thomas Hellström
On Wed, 2021-10-13 at 14:43 +0200, Daniel Vetter wrote:
> On Fri, Oct 08, 2021 at 03:35:28PM +0200, Thomas Hellström wrote:
> > The TTM managers and, possibly, the gtt address space managers will
> > need to be able to order fences for async operation.
> > Using dma_fence_is_later() for this will require that the fences we
> > hand
> > them are from a single fence context and ordered.
> > 
> > Introduce a struct dma_fence_work_timeline, and a function to
> > attach
> > struct dma_fence_work to such a timeline in a way that all previous
> > fences attached to the timeline will be signaled when the latest
> > attached struct dma_fence_work signals.
> > 
> > Signed-off-by: Thomas Hellström 
> 
> I'm not understanding why we need this:
> 
> - if we just want to order dma_fence work, then an ordered workqueue
> is
>   what we want. Which is why hand-rolling is better than reusing
>   dma_fence_work for absolutely everything.
> 
> - if we just need to make sure the public fences signal in order,
> then
>   it's a dma_fence_chain.

Part of the same series that needs reworking.

What we need here is a way to coalesce multiple fences from various
contexts (including both gpu and work fences) into a single fence and
then attach it to a timeline.

/Thomas
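
A rough sketch of that coalesce-then-attach idea using the existing fence
primitives (fences/num_fences, prev_tail and seqno are assumed to be collected
by the caller; error handling omitted):

static struct dma_fence *
coalesce_onto_timeline(struct dma_fence **fences, int num_fences,
		       struct dma_fence *prev_tail, u64 seqno)
{
	struct dma_fence_array *array;
	struct dma_fence_chain *link;

	/* Coalesce: one fence that signals once all inputs have signaled. */
	array = dma_fence_array_create(num_fences, fences,
				       dma_fence_context_alloc(1), 1, false);

	/* Attach: extend the timeline so dma_fence_is_later() keeps working. */
	link = dma_fence_chain_alloc();
	dma_fence_chain_init(link, prev_tail, &array->base, seqno);
	return &link->base;
}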






Re: [PATCH 24/28] drm: use new iterator in drm_gem_plane_helper_prepare_fb v2

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:38PM +0200, Christian König wrote:
> Makes the handling a bit more complex, but avoids the use of
> dma_resv_get_excl_unlocked().
> 
> v2: improve coding and documentation
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/drm_gem_atomic_helper.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c 
> b/drivers/gpu/drm/drm_gem_atomic_helper.c
> index e570398abd78..8534f78d4d6d 100644
> --- a/drivers/gpu/drm/drm_gem_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
> @@ -143,6 +143,7 @@
>   */
>  int drm_gem_plane_helper_prepare_fb(struct drm_plane *plane, struct 
> drm_plane_state *state)
>  {
> + struct dma_resv_iter cursor;
>   struct drm_gem_object *obj;
>   struct dma_fence *fence;
>  
> @@ -150,9 +151,17 @@ int drm_gem_plane_helper_prepare_fb(struct drm_plane 
> *plane, struct drm_plane_st
>   return 0;
>  
>   obj = drm_gem_fb_get_obj(state->fb, 0);
> - fence = dma_resv_get_excl_unlocked(obj->resv);
> - drm_atomic_set_fence_for_plane(state, fence);
> + dma_resv_iter_begin(&cursor, obj->resv, false);
> + dma_resv_for_each_fence_unlocked(&cursor, fence) {
> + /* TODO: We only use the first write fence here and need to fix
> +  * the drm_atomic_set_fence_for_plane() API to accept more than
> +  * one. */

I'm confused, right now there is only one write fence. So no need to
iterate, and also no need to add a TODO. If/when we add more write fences
then I think this needs to be revisited, and ofc then we do need to update
the set_fence helpers to carry an entire array of fences.
-Daniel

> + dma_fence_get(fence);
> + break;
> + }
> + dma_resv_iter_end(&cursor);
>  
> + drm_atomic_set_fence_for_plane(state, fence);
>   return 0;
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_plane_helper_prepare_fb);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 25/28] drm/nouveau: use the new iterator in nouveau_fence_sync

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:39PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> Signed-off-by: Christian König 

A bit of a tricky conversion since the previous code was clever with the ret
handling in the loop, but looks correct.

Please mention in the commit message that this code now also waits for all
shared fences in all cases. Previously if we found an exclusive fence, we
bailed out. That needs to be recorded in the commit message, together with
an explainer that, de facto, too many other drivers have broken this rule
already, and so you have to always iterate all fences.

With that added:

Reviewed-by: Daniel Vetter 


> ---
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 48 +++--
>  1 file changed, 12 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
> b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index 05d0b3eb3690..26f9299df881 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -339,14 +339,15 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool 
> lazy, bool intr)
>  }
>  
>  int
> -nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, 
> bool exclusive, bool intr)
> +nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
> +bool exclusive, bool intr)
>  {
>   struct nouveau_fence_chan *fctx = chan->fence;
> - struct dma_fence *fence;
>   struct dma_resv *resv = nvbo->bo.base.resv;
> - struct dma_resv_list *fobj;
> + struct dma_resv_iter cursor;
> + struct dma_fence *fence;
>   struct nouveau_fence *f;
> - int ret = 0, i;
> + int ret;
>  
>   if (!exclusive) {
>   ret = dma_resv_reserve_shared(resv, 1);
> @@ -355,10 +356,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
> nouveau_channel *chan, bool e
>   return ret;
>   }
>  
> - fobj = dma_resv_shared_list(resv);
> - fence = dma_resv_excl_fence(resv);
> -
> - if (fence) {
> + dma_resv_for_each_fence(&cursor, resv, exclusive, fence) {
>   struct nouveau_channel *prev = NULL;
>   bool must_wait = true;
>  
> @@ -366,41 +364,19 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
> nouveau_channel *chan, bool e
>   if (f) {
>   rcu_read_lock();
>   prev = rcu_dereference(f->channel);
> - if (prev && (prev == chan || fctx->sync(f, prev, chan) 
> == 0))
> + if (prev && (prev == chan ||
> +  fctx->sync(f, prev, chan) == 0))
>   must_wait = false;
>   rcu_read_unlock();
>   }
>  
> - if (must_wait)
> + if (must_wait) {
>   ret = dma_fence_wait(fence, intr);
> -
> - return ret;
> - }
> -
> - if (!exclusive || !fobj)
> - return ret;
> -
> - for (i = 0; i < fobj->shared_count && !ret; ++i) {
> - struct nouveau_channel *prev = NULL;
> - bool must_wait = true;
> -
> - fence = rcu_dereference_protected(fobj->shared[i],
> - dma_resv_held(resv));
> -
> - f = nouveau_local_fence(fence, chan->drm);
> - if (f) {
> - rcu_read_lock();
> - prev = rcu_dereference(f->channel);
> - if (prev && (prev == chan || fctx->sync(f, prev, chan) 
> == 0))
> - must_wait = false;
> - rcu_read_unlock();
> + if (ret)
> + return ret;
>   }
> -
> - if (must_wait)
> - ret = dma_fence_wait(fence, intr);
>   }
> -
> - return ret;
> + return 0;
>  }
>  
>  void
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 26/28] drm/nouveau: use the new interator in nv50_wndw_prepare_fb

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:40PM +0200, Christian König wrote:
> Makes the handling a bit more complex, but avoids the use of
> dma_resv_get_excl_unlocked().
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/nouveau/dispnv50/wndw.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c 
> b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> index 8d048bacd6f0..30712a681e2a 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> @@ -539,6 +539,8 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct 
> drm_plane_state *state)
>   struct nouveau_bo *nvbo;
>   struct nv50_head_atom *asyh;
>   struct nv50_wndw_ctxdma *ctxdma;
> + struct dma_resv_iter cursor;
> + struct dma_fence *fence;
>   int ret;
>  
>   NV_ATOMIC(drm, "%s prepare: %p\n", plane->name, fb);
> @@ -561,7 +563,13 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct 
> drm_plane_state *state)
>   asyw->image.handle[0] = ctxdma->object.handle;
>   }
>  
> - asyw->state.fence = dma_resv_get_excl_unlocked(nvbo->bo.base.resv);
> + dma_resv_iter_begin(&cursor, nvbo->bo.base.resv, false);
> + dma_resv_for_each_fence_unlocked(&cursor, fence) {
> + /* TODO: We only use the first writer here */

Same thing as with the atomic core helper. This is actually broken,
because for atomic we really do _not_ want to wait for any shared fences.
Which this will do, if there's no exclusive fence attached.

So upgrading my general concern on this and the atomic helper patch to a
reject, since I think it's broken.
-Daniel

> + asyw->state.fence = dma_fence_get(fence);
> + break;
> + }
> + dma_resv_iter_end(&cursor);
>   asyw->image.offset[0] = nvbo->offset;
>  
>   if (wndw->func->prepare) {
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 27/28] drm/etnaviv: use new iterator in etnaviv_gem_describe

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:41PM +0200, Christian König wrote:
> Instead of hand rolling the logic.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 31 ++-
>  1 file changed, 11 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> index 8f1b5af47dd6..0eeb33de2ff4 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> @@ -428,19 +428,17 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct 
> drm_gem_object *obj,
>  static void etnaviv_gem_describe_fence(struct dma_fence *fence,
>   const char *type, struct seq_file *m)
>  {
> - if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))

Yay for removing open-coded tests like this. Drivers really should have no
business digging around in fence->flags (i915 is terrible in this regard
unfortunately).

> - seq_printf(m, "\t%9s: %s %s seq %llu\n",
> -type,
> -fence->ops->get_driver_name(fence),
> -fence->ops->get_timeline_name(fence),
> -fence->seqno);
> + seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
> +fence->ops->get_driver_name(fence),
> +fence->ops->get_timeline_name(fence),
> +fence->seqno);
>  }
>  
>  static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file 
> *m)
>  {
>   struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
>   struct dma_resv *robj = obj->resv;
> - struct dma_resv_list *fobj;
> + struct dma_resv_iter cursor;
>   struct dma_fence *fence;
>   unsigned long off = drm_vma_node_start(&obj->vma_node);
>  
> @@ -449,21 +447,14 @@ static void etnaviv_gem_describe(struct drm_gem_object 
> *obj, struct seq_file *m)
>   obj->name, kref_read(&obj->refcount),
>   off, etnaviv_obj->vaddr, obj->size);
>  
> - rcu_read_lock();
> - fobj = dma_resv_shared_list(robj);
> - if (fobj) {
> - unsigned int i, shared_count = fobj->shared_count;
> -
> - for (i = 0; i < shared_count; i++) {
> - fence = rcu_dereference(fobj->shared[i]);
> + dma_resv_iter_begin(&cursor, robj, true);
> + dma_resv_for_each_fence_unlocked(&cursor, fence) {
> + if (dma_resv_iter_is_exclusive(&cursor))
> + etnaviv_gem_describe_fence(fence, "Exclusive", m);
> + else
>   etnaviv_gem_describe_fence(fence, "Shared", m);
> - }
>   }
> -
> - fence = dma_resv_excl_fence(robj);
> - if (fence)
> - etnaviv_gem_describe_fence(fence, "Exclusive", m);
> - rcu_read_unlock();
> + dma_resv_iter_end(&cursor);

Reviewed-by: Daniel Vetter 

Please make sure it compiles on arm before pushing :-)

>  }
>  
>  void etnaviv_gem_describe_objects(struct etnaviv_drm_private *priv,
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 28/28] drm/etnaviv: replace dma_resv_get_excl_unlocked

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:42PM +0200, Christian König wrote:
> We certainly hold the reservation lock here, no need for the RCU dance.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 4dd7d9d541c0..7e17bc2b5df1 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -195,7 +195,7 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
> *submit)
>   if (ret)
>   return ret;
>   } else {
> - bo->excl = dma_resv_get_excl_unlocked(robj);

Maybe have that in the series to sunset dma_resv_get_excl_unlocked()? Just
so it makes a bit more sense from a motivation pov. Or explain that in the
commit message.

Anyway looks correct.

Reviewed-by: Daniel Vetter 
> + bo->excl = dma_fence_get(dma_resv_excl_fence(robj));
>   }
>  
>   }
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 4/6] drm/i915: Add a struct dma_fence_work timeline

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 04:21:43PM +0200, Thomas Hellström wrote:
> On Wed, 2021-10-13 at 14:43 +0200, Daniel Vetter wrote:
> > On Fri, Oct 08, 2021 at 03:35:28PM +0200, Thomas Hellström wrote:
> > > The TTM managers and, possibly, the gtt address space managers will
> > > need to be able to order fences for async operation.
> > > Using dma_fence_is_later() for this will require that the fences we
> > > hand
> > > them are from a single fence context and ordered.
> > > 
> > > Introduce a struct dma_fence_work_timeline, and a function to
> > > attach
> > > struct dma_fence_work to such a timeline in a way that all previous
> > > fences attached to the timeline will be signaled when the latest
> > > attached struct dma_fence_work signals.
> > > 
> > > Signed-off-by: Thomas Hellström 
> > 
> > I'm not understanding why we need this:
> > 
> > - if we just want to order dma_fence work, then an ordered workqueue
> > is
> >   what we want. Which is why hand-rolling is better than reusing
> >   dma_fence_work for absolutely everything.
> > 
> > - if we just need to make sure the public fences signal in order,
> > then
> >   it's a dma_fence_chain.
> 
> Part of the same series that needs reworking.
> 
> What we need here is a way to coalesce multiple fences from various
> contexts (including both gpu and work fences) into a single fence and
> then attach it to a timeline.

I thought dma_fence_chain does this for you, including coalescing on the
same timeline. Or at least it's supposed to, because if it doesn't you can
produce some rather epic chain explosions with vulkan :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
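
A minimal sketch of the dma_fence_chain approach being pointed at here
(illustrative only; the helper name is made up, but dma_fence_chain_alloc()
and dma_fence_chain_init() are the actual dma-buf APIs):

	/*
	 * Append @fence to a timeline whose current head is @prev and
	 * return the new head. dma_fence_chain_init() consumes the @prev
	 * reference and the extra reference taken on @fence.
	 */
	static struct dma_fence *timeline_add_fence(struct dma_fence *prev,
						    struct dma_fence *fence,
						    u64 seqno)
	{
		struct dma_fence_chain *chain;

		chain = dma_fence_chain_alloc();
		if (!chain)
			return ERR_PTR(-ENOMEM);

		dma_fence_chain_init(chain, prev, dma_fence_get(fence), seqno);
		return &chain->base;
	}

Waiting on the returned head then orders all previously attached fences,
which is the coalescing property the timeline discussion above is after.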


Re: [Intel-gfx] [PATCH 4/6] drm/i915: Add a struct dma_fence_work timeline

2021-10-13 Thread Thomas Hellström



On 10/13/21 16:33, Daniel Vetter wrote:

On Wed, Oct 13, 2021 at 04:21:43PM +0200, Thomas Hellström wrote:

On Wed, 2021-10-13 at 14:43 +0200, Daniel Vetter wrote:

On Fri, Oct 08, 2021 at 03:35:28PM +0200, Thomas Hellström wrote:

The TTM managers and, possibly, the gtt address space managers will
need to be able to order fences for async operation.
Using dma_fence_is_later() for this will require that the fences we
hand
them are from a single fence context and ordered.

Introduce a struct dma_fence_work_timeline, and a function to
attach
struct dma_fence_work to such a timeline in a way that all previous
fences attached to the timeline will be signaled when the latest
attached struct dma_fence_work signals.

Signed-off-by: Thomas Hellström 

I'm not understanding why we need this:

- if we just want to order dma_fence work, then an ordered workqueue
is
   what we want. Which is why hand-rolling is better than reusing
   dma_fence_work for absolutely everything.

- if we just need to make sure the public fences signal in order,
then
   it's a dma_fence_chain.

Part of the same series that needs reworking.

What we need here is a way to coalesce multiple fences from various
contexts (including both gpu and work fences) into a single fence and
then attach it to a timeline.

I thought dma_fence_chain does this for you, including coalescing on the
same timeline. Or at least it's supposed to, because if it doesn't you can
produce some rather epic chain explosions with vulkan :-)


I'll take a look to see if I can use dma_fence_chain for this case.

Thanks,

/Thomas


-Daniel


[PATCH] drm/tegra: mark nvdec_writel as inline

2021-10-13 Thread Arnd Bergmann
From: Arnd Bergmann 

Without CONFIG_IOMMU_API, the nvdec_writel() function is
unused, causing a warning:

drivers/gpu/drm/tegra/nvdec.c:48:13: error: 'nvdec_writel' defined but not used 
[-Werror=unused-function]
   48 | static void nvdec_writel(struct nvdec *nvdec, u32 value, unsigned int 
offset)
  | ^~~~

As this is a trivial wrapper around an inline function, mark
it as inline itself, which avoids the warning as well.

Fixes: e76599df354d ("drm/tegra: Add NVDEC driver")
Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/tegra/nvdec.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tegra/nvdec.c b/drivers/gpu/drm/tegra/nvdec.c
index 410333e05de8..791bf1acf5f0 100644
--- a/drivers/gpu/drm/tegra/nvdec.c
+++ b/drivers/gpu/drm/tegra/nvdec.c
@@ -45,7 +45,8 @@ static inline struct nvdec *to_nvdec(struct tegra_drm_client 
*client)
return container_of(client, struct nvdec, client);
 }
 
-static void nvdec_writel(struct nvdec *nvdec, u32 value, unsigned int offset)
+static inline void nvdec_writel(struct nvdec *nvdec, u32 value,
+   unsigned int offset)
 {
writel(value, nvdec->regs + offset);
 }
-- 
2.29.2



Re: [PATCH 2/6] drm/i915: Introduce refcounted sg-tables

2021-10-13 Thread Daniel Vetter
On Fri, Oct 08, 2021 at 03:35:26PM +0200, Thomas Hellström wrote:
> As we start to introduce asynchronous failsafe object migration,
> where we update the object state and then submit asynchronous
> commands we need to record what memory resources are actually used
> by various parts of the command stream. Initially for three purposes:
> 
> 1) Error capture.
> 2) Asynchronous migration error recovery.
> 3) Asynchronous vma bind.
> 
> At the time when these happen, the object state may have been updated
> to be several migrations ahead and object sg-tables discarded.
> 
> In order to make it possible to keep sg-tables with memory resource
> information for these operations, introduce refcounted sg-tables that
> aren't freed until the last user is done with them.
> 
> The alternative would be to reference information sitting on the
> corresponding ttm_resources which typically have the same lifetime as
> these refcounted sg_tables, but that leads to other awkward constructs:
> Due to the design direction chosen for ttm resource managers that would
> lead to diamond-style inheritance, the LMEM resources may sometimes be
> prematurely freed, and finally the subclassed struct ttm_resource would
> have to bleed into the asynchronous vma bind code.
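
A minimal sketch of the refcounted sg-table idea, assuming a plain kref
wrapper (hypothetical shape; the actual struct i915_refct_sgt in the patch
below carries more state):

	struct i915_refct_sgt {
		struct kref kref;
		struct sg_table table;
	};

	static void i915_refct_sgt_release(struct kref *ref)
	{
		struct i915_refct_sgt *rsgt =
			container_of(ref, typeof(*rsgt), kref);

		sg_free_table(&rsgt->table);
		kfree(rsgt);
	}

	static void i915_refct_sgt_put(struct i915_refct_sgt *rsgt)
	{
		if (rsgt)
			kref_put(&rsgt->kref, i915_refct_sgt_release);
	}

Error capture, migration recovery and async vma bind would each hold a kref,
so the table survives even after the object has migrated on.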

On the diamond inheritance I was pondering some more whether we shouldn't
just do the classic C union horrors, i.e.

struct ttm_resource {
/* stuff */
};

struct ttm_drm_mm_resource {
	struct ttm_resource base;
	struct drm_mm_node node;
};

struct ttm_buddy_resource {
	struct ttm_resource base;
	struct drm_buddy_node node;
};

Whatever else we have, maybe also integer resources for guc_id.

And then the horrors:

struct i915_gem_resource {
	union {
		struct ttm_resource base;
		struct ttm_drm_mm_resource drm_mm;
		struct ttm_buddy_resource buddy;
	};

	/* i915 stuff */
};

BUILD_BUG_ON(offsetof(struct i915_gem_resource, base) !=
	     offsetof(struct i915_gem_resource, drm_mm.base));
BUILD_BUG_ON(offsetof(struct i915_gem_resource, base) !=
	     offsetof(struct i915_gem_resource, buddy.base));

This is horrible, but also in official C89 and later, unions are the only
way to do inheritance. The only reason we can do it differently in Linux is
because we compile with strict aliasing turned off.

So I think we can shrug this off as officially sanctioned horrors. There's
maybe a small downside in overhead, but I don't think the size difference
between the various allocators is big enough that we should care. Plus a
pointer to driver stuff to resolve the diamond inheritance through
different means isn't free either.

But also this is for much later, I think for now refcounting sglist as a
standalone thing is ok, since we do seem to need them in a bunch of
places. But eventually I do think we should aim to merge them with
ttm_resource, if/when those get refcounted.
-Daniel

> 
> Signed-off-by: Thomas Hellström 
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   3 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 159 +++---
>  drivers/gpu/drm/i915/i915_scatterlist.c   |  62 +--
>  drivers/gpu/drm/i915/i915_scatterlist.h   |  76 -
>  drivers/gpu/drm/i915/intel_region_ttm.c   |  15 +-
>  drivers/gpu/drm/i915/intel_region_ttm.h   |   5 +-
>  drivers/gpu/drm/i915/selftests/mock_region.c  |  12 +-
>  7 files changed, 238 insertions(+), 94 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 7c3da4e3e737..d600cf7ceb35 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -485,6 +485,7 @@ struct drm_i915_gem_object {
>*/
>   struct list_head region_link;
>  
> + struct i915_refct_sgt *rsgt;
>   struct sg_table *pages;
>   void *mapping;
>  
> @@ -538,7 +539,7 @@ struct drm_i915_gem_object {
>   } mm;
>  
>   struct {
> - struct sg_table *cached_io_st;
> + struct i915_refct_sgt *cached_io_rsgt;
>   struct i915_gem_object_page_iter get_io_page;
>   struct drm_i915_gem_object *backup;
>   bool created:1;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 74a1ffd0d7dd..4b4d7457bef9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -34,7 +34,7 @@
>   * struct i915_ttm_tt - TTM page vector with additional private information
>   * @ttm: The base TTM page vector.
>   * @dev: The struct device used for dma mapping and unmapping.
> - * @cached_st: The cached scatter-gather table.
> + * @cached_rsgt: The cached scatter-gather table.
>   *
>   * Note that DMA may be going on right up to the point where the page-
>   * vector is unpopulated in delayed

[PATCH] drm/tegra: mark nvdec PM functions as __maybe_unused

2021-10-13 Thread Arnd Bergmann
From: Arnd Bergmann 

The resume helper is called conditionally and causes a harmless
warning when stubbed out:

drivers/gpu/drm/tegra/nvdec.c:240:12: error: 'nvdec_runtime_resume' defined but 
not used [-Werror=unused-function]
  240 | static int nvdec_runtime_resume(struct device *dev)

Mark both suspend and resume as __maybe_unused for consistency
to avoid this warning.

Fixes: e76599df354d ("drm/tegra: Add NVDEC driver")
Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/tegra/nvdec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tegra/nvdec.c b/drivers/gpu/drm/tegra/nvdec.c
index 30105a93de9f..791bf1acf5f0 100644
--- a/drivers/gpu/drm/tegra/nvdec.c
+++ b/drivers/gpu/drm/tegra/nvdec.c
@@ -238,7 +238,7 @@ static int nvdec_load_firmware(struct nvdec *nvdec)
 }
 
 
-static int nvdec_runtime_resume(struct device *dev)
+static __maybe_unused int nvdec_runtime_resume(struct device *dev)
 {
struct nvdec *nvdec = dev_get_drvdata(dev);
int err;
@@ -264,7 +264,7 @@ static int nvdec_runtime_resume(struct device *dev)
return err;
 }
 
-static int nvdec_runtime_suspend(struct device *dev)
+static __maybe_unused int nvdec_runtime_suspend(struct device *dev)
 {
struct nvdec *nvdec = dev_get_drvdata(dev);
 
-- 
2.29.2



[PATCH] drm/amd/display: fix apply_degamma_for_user_regamma() warning

2021-10-13 Thread Arnd Bergmann
From: Arnd Bergmann 

It appears that the wrong argument was removed in this call:

drivers/gpu/drm/amd/amdgpu/../display/modules/color/color_gamma.c: In function 
'apply_degamma_for_user_regamma':
drivers/gpu/drm/amd/amdgpu/../display/modules/color/color_gamma.c:1694:36: 
error: implicit conversion from 'enum <anonymous>' to 'enum 
dc_transfer_func_predefined' [-Werror=enum-conversion]
 1694 | build_coefficients(&coeff, true);

Fixes: 9b3d76527f6e ("drm/amd/display: Revert adding degamma coefficients")
Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c 
b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 64a38f08f497..4cb6617059ae 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -1691,7 +1691,7 @@ static void apply_degamma_for_user_regamma(struct 
pwl_float_data_ex *rgb_regamma
struct pwl_float_data_ex *rgb = rgb_regamma;
const struct hw_x_point *coord_x = coordinates_x;
 
-   build_coefficients(&coeff, true);
+   build_coefficients(&coeff, TRANSFER_FUNCTION_SRGB);
 
i = 0;
while (i != hw_points_num + 1) {
-- 
2.29.2



[PATCH] drm: msm: fix building without CONFIG_COMMON_CLK

2021-10-13 Thread Arnd Bergmann
From: Arnd Bergmann 

When CONFIG_COMMON_CLK is disabled, the 8996 specific
phy code is left out, which results in a link failure:

ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to 
`msm_hdmi_phy_8996_cfg'

This was only exposed after it became possible to build
test the driver without the clock interfaces.

Make COMMON_CLK a hard dependency for compile testing,
and simplify it a little based on that.

Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
Reported-by: Randy Dunlap 
Suggested-by: Geert Uytterhoeven 
Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/msm/Kconfig  | 2 +-
 drivers/gpu/drm/msm/Makefile | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index f5107b6ded7b..cb204912e0f4 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -4,8 +4,8 @@ config DRM_MSM
tristate "MSM DRM"
depends on DRM
depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
+   depends on COMMON_CLK
depends on IOMMU_SUPPORT
-   depends on (OF && COMMON_CLK) || COMPILE_TEST
depends on QCOM_OCMEM || QCOM_OCMEM=n
depends on QCOM_LLCC || QCOM_LLCC=n
depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 904535eda0c4..bbee22b54b0c 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -23,8 +23,10 @@ msm-y := \
hdmi/hdmi_i2c.o \
hdmi/hdmi_phy.o \
hdmi/hdmi_phy_8960.o \
+   hdmi/hdmi_phy_8996.o \
hdmi/hdmi_phy_8x60.o \
hdmi/hdmi_phy_8x74.o \
+   hdmi/hdmi_pll_8960.o \
edp/edp.o \
edp/edp_aux.o \
edp/edp_bridge.o \
@@ -37,6 +39,7 @@ msm-y := \
disp/mdp4/mdp4_dtv_encoder.o \
disp/mdp4/mdp4_lcdc_encoder.o \
disp/mdp4/mdp4_lvds_connector.o \
+   disp/mdp4/mdp4_lvds_pll.o \
disp/mdp4/mdp4_irq.o \
disp/mdp4/mdp4_kms.o \
disp/mdp4/mdp4_plane.o \
@@ -117,9 +120,6 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
dp/dp_audio.o
 
 msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
-msm-$(CONFIG_COMMON_CLK) += disp/mdp4/mdp4_lvds_pll.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_pll_8960.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_phy_8996.o
 
 msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
 
-- 
2.29.2



[PATCH v2] drm/cma-helper: Set VM_DONTEXPAND for mmap

2021-10-13 Thread Alyssa Rosenzweig
From: Robin Murphy 

drm_gem_cma_mmap() cannot assume every implementation of dma_mmap_wc()
will end up calling remap_pfn_range() (which happens to set the relevant
vma flag, among others), so in order to make sure expectations around
VM_DONTEXPAND are met, let it explicitly set the flag like most other
GEM mmap implementations do.

This avoids repeated warnings on a small minority of systems where the
display is behind an IOMMU, and has a simple driver which does not
override drm_gem_cma_default_funcs. Arm hdlcd is an in-tree affected
driver. Out-of-tree, the Apple DCP driver is affected; this fix is
required for DCP to be mainlined.

Signed-off-by: Robin Murphy 
Reviewed-and-tested-by: Alyssa Rosenzweig 
---
 drivers/gpu/drm/drm_gem_cma_helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c 
b/drivers/gpu/drm/drm_gem_cma_helper.c
index d53388199f34..63e48d98263d 100644
--- a/drivers/gpu/drm/drm_gem_cma_helper.c
+++ b/drivers/gpu/drm/drm_gem_cma_helper.c
@@ -510,6 +510,7 @@ int drm_gem_cma_mmap(struct drm_gem_object *obj, struct 
vm_area_struct *vma)
 */
vma->vm_pgoff -= drm_vma_node_start(&obj->vma_node);
vma->vm_flags &= ~VM_PFNMAP;
+   vma->vm_flags |= VM_DONTEXPAND;
 
cma_obj = to_drm_gem_cma_obj(obj);
 
-- 
2.30.2



Re: [PATCH 2/6] drm/i915: Introduce refcounted sg-tables

2021-10-13 Thread Thomas Hellström



On 10/13/21 16:41, Daniel Vetter wrote:

On Fri, Oct 08, 2021 at 03:35:26PM +0200, Thomas Hellström wrote:

As we start to introduce asynchronous failsafe object migration,
where we update the object state and then submit asynchronous
commands we need to record what memory resources are actually used
by various parts of the command stream. Initially for three purposes:

1) Error capture.
2) Asynchronous migration error recovery.
3) Asynchronous vma bind.

At the time when these happen, the object state may have been updated
to be several migrations ahead and object sg-tables discarded.

In order to make it possible to keep sg-tables with memory resource
information for these operations, introduce refcounted sg-tables that
aren't freed until the last user is done with them.

The alternative would be to reference information sitting on the
corresponding ttm_resources which typically have the same lifetime as
these refcounted sg_tables, but that leads to other awkward constructs:
Due to the design direction chosen for ttm resource managers that would
lead to diamond-style inheritance, the LMEM resources may sometimes be
prematurely freed, and finally the subclassed struct ttm_resource would
have to bleed into the asynchronous vma bind code.

On the diamond inheritance I was pondering some more whether we shouldn't
just do the classic C union horrors, i.e.

struct ttm_resource {
/* stuff */
};

struct ttm_drm_mm_resource {
	struct ttm_resource base;
	struct drm_mm_node node;
};

struct ttm_buddy_resource {
	struct ttm_resource base;
	struct drm_buddy_node node;
};

Whatever else we have, maybe also integer resources for guc_id.

And then the horrors:

struct i915_gem_resource {
	union {
		struct ttm_resource base;
		struct ttm_drm_mm_resource drm_mm;
		struct ttm_buddy_resource buddy;
	};

	/* i915 stuff */
};

BUILD_BUG_ON(offsetof(struct i915_gem_resource, base) !=
	     offsetof(struct i915_gem_resource, drm_mm.base));
BUILD_BUG_ON(offsetof(struct i915_gem_resource, base) !=
	     offsetof(struct i915_gem_resource, buddy.base));

This is horrible, but also in official C89 and later, unions are the only
way to do inheritance. The only reason we can do it differently in Linux is
because we compile with strict aliasing turned off.

So I think we can shrug this off as officially sanctioned horrors. There's
maybe a small downside in overhead, but I don't think the size difference
between the various allocators is big enough that we should care. Plus a
pointer to driver stuff to resolve the diamond inheritance through
different means isn't free either.


Yes, this is exactly what was meant by "awkward constructs" in the 
commit message.


My thoughts are still that all this could be avoided by a different 
design for struct ttm_resource, but I agree we can make do with 
refcounted sg-lists for now, to see where this ends up when all the 
related resource-on-lru stuff lands in TTM.


/Thomas




Re: [PATCH 0/5] drm/vmwgfx: Support module unload and hotunplug

2021-10-13 Thread Zack Rusin
On Wed, 2021-10-13 at 14:50 +0200, Daniel Vetter wrote:
> On Tue, Oct 12, 2021 at 05:34:50PM +, Zack Rusin wrote:
> 
> > On the flip side that does mean that vmwgfx and i915 need to redo
> > some
> > code. For vmwgfx it's probably a net positive anyway as we've been
> > using TTM for, what is really nowadays, an integrated GPU so maybe
> > it's
> > time for us to think about transition to gem.
> 
> Aside, but we're looking at adopting ttm for integrated gpu too. The
> execbuf utils and dynamic memory management helpers for pure gem just
> aren't quite there yet, and improving ttm a bit in this area looks
> reasonable (like adding a unified memory aware shrinker like we have
> in
> i915-gem).
> 

That would certainly be a big help. The situation I want to avoid is
having vmwgfx use TTM for something no other driver uses it for.


> Also I thought vmwgfx is using ttm to also manage some id spaces,
> you'd
> have to hand-roll that.

Yes, it's work either way. It's likely less code with GEM, but we'd lose
support for 3D on older hardware where our device did have dedicated
VRAM.

Nowadays memory management in our device is rather trivial: every GPU
object is just kernel virtual memory. To allow our virtual device to
write into that memory, we send it an identifier to name the object (we
use ids, but it could just be the kernel virtual address as an integer)
plus a page table, because the vmx of course can't read the guest
kernel's page tables, so we need to map the kernel virtual address space
to physical addresses so that the host can write into them. So mm in
vmwgfx shouldn't require performance-enhancing drugs to understand, and
drug usage while writing vmwgfx code should remain purely recreational
;)

z
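
Purely to make that description concrete, a hypothetical shape of such a
request (the real vmwgfx SVGA commands and field names differ):

	struct gb_object_define {
		u32 object_id;	/* name the host uses for the object */
		u64 gva;	/* guest kernel virtual address */
		u64 pt_root;	/* page table translating gva to guest-
				 * physical pages, readable by the host */
	};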



Re: 572994bf18ff prevents system boot

2021-10-13 Thread Chuck Lever III


> On Oct 8, 2021, at 4:49 AM, Thomas Zimmermann  wrote:
> 
> Hi
> 
> Am 04.10.21 um 16:11 schrieb Chuck Lever III:
>>> On Oct 4, 2021, at 10:07 AM, Thomas Zimmermann  wrote:
>>> 
>>> Hi
>>> 
>>> Am 04.10.21 um 15:34 schrieb Chuck Lever III:
> On Oct 4, 2021, at 3:07 AM, Thomas Zimmermann  wrote:
> 
> (cc: ainux.w...@gmail.com)
> 
> Hi
> 
> Am 03.10.21 um 20:09 schrieb Chuck Lever III:
>> Hi-
>> After updating one of my test systems to v5.15-rc, I found that it
>> becomes unresponsive during the later part of the boot process. A
>> power-on reset is necessary to recover.
>> I bisected to this commit:
>> 572994bf18ff ("drm/ast: Zero is missing in detect function")
> 
> You don't have a monitor connected, I guess?
 Correct, my lab systems use IPMI and a browser-attached console.
> In that case, we now trigger the helpers that poll for connected 
> monitors. However, the overhead seems rather extreme.
> 
> I'll have to try to reproduce this, or otherwise we can revert the commit.
 It's strange, only that system in my lab seems to have a problem.
 The others work fine.
 Thanks for having a look!
>>> 
>>> Is it a HW or FW problem? Maybe a different revision?
>> It's possible. I don't know how to further diagnose the issue,
>> though. Any guidance appreciated!
> 
> v5.15-rc3 works well on my test machine.
> 
> For getting the firmware revisions, run
> 
>  sudo dmidecode
> 
> on the machine. It will print a long list of devices with related 
> information. Running
> 
>  sudo lspci -v
> 
> will give information about the PCI devices. There's an entry for the VGA 
> device somewhere. Maybe you can find some difference between the different 
> systems
> 
> If you think the machine got stuck, try to plug-in the VGA cable during the 
> boot and see if it makes the machine come up.

Yes, plugging in a physical monitor unsticks the machine and booting
continues normally.

However, after that, having a monitor present does not seem to be
necessary. The machine has been rebooted several times with
v5.15-rc5 and no monitor attached, without any delays.

I'll note this is Fedora 32, in case you suspect there is a user
space interaction involved. The system is going to be updated very
soon to a more recent release of Fedora.


> Best regards
> Thomas
> 
>>> I'm asking because the problematic commit does the correct thing. If there 
>>> is no VGA cable connected, the driver should poll until it detects one. The 
>>> overhead should be minimal.
>>> 
>>> But I'll try to reproduce anyway.
>>> 
>>> Best regards
>>> Thomas
>>> 
> Best regards
> Thomas
> 
>> Checking out v5.15-rc3 and reverting this commit enables the system
>> to boot again.
>> 0b:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED 
>> Graphics Family (rev 30) (prog-if 00 [VGA controller])
>> DeviceName:  ASPEED Video AST2400
>> Subsystem: Super Micro Computer Inc X10SRL-F
>> Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- 
>> ParErr- Stepping- SERR- FastB2B- DisINTx-
>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>> Interrupt: pin A routed to IRQ 18
>> Region 0: Memory at fa00 (32-bit, non-prefetchable) 
>> [size=16M]
>> Region 1: Memory at fb00 (32-bit, non-prefetchable) 
>> [size=128K]
>> Region 2: I/O ports at c000 [size=128]
>> Expansion ROM at 000c [virtual] [disabled] [size=128K]
>> Capabilities: [40] Power Management version 3
>> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA 
>> PME(D0+,D1+,D2+,D3hot+,D3cold+)
>> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>> Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
>> Address:   Data: 
>> Kernel driver in use: ast
>> Kernel modules: ast
>> --
>> Chuck Lever
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
 --
 Chuck Lever
>>> 
>>> -- 
>>> Thomas Zimmermann
>>> Graphics Driver Developer
>>> SUSE Software Solutions Germany GmbH
>>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>>> (HRB 36809, AG Nürnberg)
>>> Geschäftsführer: Felix Imendörffer
>> --
>> Chuck Lever
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer

--
Chuck Lever





Re: Regression with mainline kernel on rpi4

2021-10-13 Thread Maxime Ripard
On Thu, Sep 30, 2021 at 11:19:59AM +0200, Daniel Vetter wrote:
> On Tue, Sep 28, 2021 at 10:34:46AM +0200, Maxime Ripard wrote:
> > Hi Daniel,
> > 
> > On Sat, Sep 25, 2021 at 12:50:17AM +0200, Daniel Vetter wrote:
> > > On Fri, Sep 24, 2021 at 3:30 PM Maxime Ripard  wrote:
> > > >
> > > > On Wed, Sep 22, 2021 at 01:25:21PM -0700, Linus Torvalds wrote:
> > > > > On Wed, Sep 22, 2021 at 1:19 PM Sudip Mukherjee
> > > > >  wrote:
> > > > > >
> > > > > > I added some debugs to print the addresses, and I am getting:
> > > > > > [   38.813809] sudip crtc 
> > > > > >
> > > > > > This is from struct drm_crtc *crtc = connector->state->crtc;
> > > > >
> > > > > Yeah, that was my personal suspicion, because while the line number
> > > > > implied "crtc->state" being NULL, the drm data structure documentation
> > > > > and other drivers both imply that "crtc" was the more likely one.
> > > > >
> > > > > I suspect a simple
> > > > >
> > > > > if (!crtc)
> > > > > return;
> > > > >
> > > > > in vc4_hdmi_set_n_cts() is at least part of the fix for this all, but
> > > > > I didn't check if there is possibly something else that needs to be
> > > > > done too.
> > > >
> > > > Thanks for the decode_stacktrace.sh and the follow-up
> > > >
> > > > Yeah, it looks like we have several things wrong here:
> > > >
> > > >   * we only check that connector->state is set, and not
> > > > connector->state->crtc indeed.
> > > >
> > > >   * We also check only in startup(), so at open() and not later on when
> > > > the sound streaming actually starts. This has been there for a while,
> > > > so I guess it's never really been causing a practical issue before.
> > > 
> > > You also have no locking
> > 
> > Indeed. Do we just need locking to prevent a concurrent audio setup and
> > modeset, or do you have another corner case in mind?
> > 
> > Also, generally, what locks should we make sure we have locked when
> > accessing the connector and CRTC state? drm_mode_config.connection_mutex
> > and drm_mode_config.mutex, respectively?
> > 
> > > plus looking at ->state objects outside of atomic commit machinery
> > > makes no sense because you're not actually in sync with the hw state.
> > > Relevant bits need to be copied over at commit time, protected by some
> > > spinlock (and that spinlock also needs to be held over whatever other
> > > stuff you're setting to make sure we don't get a funny out-of-sync
> > > state anywhere).
> > 
> > If we already have a lock protecting against having both an ASoC and KMS
> > function running, it's not clear to me what the spinlock would prevent
> > here?
> 
> Replicating the irc chat here. With
> 
> commit 6c5ed5ae353cdf156f9ac4db17e15db56b4de880
> Author: Maarten Lankhorst 
> Date:   Thu Apr 6 20:55:20 2017 +0200
> 
> drm/atomic: Acquire connection_mutex lock in 
> drm_helper_probe_single_connector_modes, v4.
> 
> this is already taken care of for drivers and should be all good from a
> locking pov.

So, if I understand this properly, this supersedes your comment on the
spinlock for the hw state, but not the comment that we need some locking
to synchronize between the audio and KMS path (and CEC?). Right?

Maxime


signature.asc
Description: PGP signature


Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Tvrtko Ursulin



On 13/10/2021 15:00, Daniel Vetter wrote:

On Wed, Oct 13, 2021 at 02:32:03PM +0200, Maarten Lankhorst wrote:

No memory should be allocated when calling i915_gem_object_wait,
because it may be called to idle a BO when evicting memory.

Fix this by using dma_resv_iter helpers to call
i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
Also remove dma_resv_prune, it's questionable.
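
A rough sketch of the allocation-free shape this describes (assumed, since
the patch body is not quoted here; the iterator and dma_fence_wait_timeout()
are the real APIs):

	static long wait_all_fences(struct dma_resv *resv, bool intr,
				    long timeout)
	{
		struct dma_resv_iter cursor;
		struct dma_fence *fence;

		dma_resv_iter_begin(&cursor, resv, true);
		dma_resv_for_each_fence_unlocked(&cursor, fence) {
			timeout = dma_fence_wait_timeout(fence, intr, timeout);
			if (timeout <= 0)
				break;
		}
		dma_resv_iter_end(&cursor);

		return timeout;
	}

Nothing here calls krealloc(), which is exactly what the
dma_resv_get_fences() path in the splat below does under vm->mutex.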

This will result in the following lockdep splat.

<4> [83.538517] ==
<4> [83.538520] WARNING: possible circular locking dependency detected
<4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
<4> [83.538525] --
<4> [83.538527] gem_render_line/5242 is trying to acquire lock:
<4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
__kmalloc_track_caller+0x56/0x270
<4> [83.538538]
but task is already holding lock:
<4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.538638]
which lock already depends on the new lock.
<4> [83.538642]
the existing dependency chain (in reverse order) is:
<4> [83.538645]
-> #1 (&vm->mutex/1){+.+.}-{3:3}:
<4> [83.538649]lock_acquire+0xd3/0x310
<4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
<4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
<4> [83.538794]ppgtt_init+0x55/0x70 [i915]
<4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
<4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
<4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
<4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
<4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
<4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
<4> [83.539197]pci_device_probe+0x9b/0x110
<4> [83.539201]really_probe+0x1b0/0x3b0
<4> [83.539205]__driver_probe_device+0xf6/0x170
<4> [83.539208]driver_probe_device+0x1a/0x90
<4> [83.539210]__driver_attach+0x93/0x160
<4> [83.539213]bus_for_each_dev+0x72/0xc0
<4> [83.539216]bus_add_driver+0x14b/0x1f0
<4> [83.539220]driver_register+0x66/0xb0
<4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
<4> [83.539227]do_one_initcall+0x53/0x2e0
<4> [83.539230]do_init_module+0x55/0x200
<4> [83.539234]load_module+0x2700/0x2980
<4> [83.539237]__do_sys_finit_module+0xaa/0x110
<4> [83.539241]do_syscall_64+0x37/0xb0
<4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539247]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [83.539251]validate_chain+0xb37/0x1e70
<4> [83.539254]__lock_acquire+0x5a1/0xb70
<4> [83.539258]lock_acquire+0xd3/0x310
<4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
<4> [83.539264]__kmalloc_track_caller+0x56/0x270
<4> [83.539267]krealloc+0x48/0xa0
<4> [83.539270]dma_resv_get_fences+0x1c3/0x280
<4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
<4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
<4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
<4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
<4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
<4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
<4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
<4> [83.539759]drm_ioctl_kernel+0xac/0x140
<4> [83.539763]drm_ioctl+0x201/0x3d0
<4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
<4> [83.539769]do_syscall_64+0x37/0xb0
<4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539775]
other info that might help us debug this:
<4> [83.539778]  Possible unsafe locking scenario:
<4> [83.539781]CPU0CPU1
<4> [83.539783]
<4> [83.539785]   lock(&vm->mutex/1);
<4> [83.539788]lock(fs_reclaim);
<4> [83.539791]lock(&vm->mutex/1);
<4> [83.539794]   lock(fs_reclaim);
<4> [83.539796]
  *** DEADLOCK ***
<4> [83.539799] 3 locks held by gem_render_line/5242:
<4> [83.539802]  #0: c9d4bbf0 
(reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_gem_do_execbuffer+0x8e5/0x20a0 
[i915]
<4> [83.539870]  #1: 88811e48bae8 (reservation_ww_class_mutex){+.+.}-{3:3}, 
at: eb_validate_vmas+0x81/0x8e0 [i915]
<4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.540011]
stack backtrace:
<4> [83.540014] CPU: 2 PID: 5242 Comm: gem_render_line Not tainted 
5.15.0-rc5-CI-Trybot_8062+ #1
<4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, 
BIOS TNTGL357.0038.2020.1124.1648 11/24/2020
<4> [83.540023] Call Trace:
<4> [83.540026]  dump_stack_lvl+0x56/0x7b
<4> [83.540030]  check_noncircular+0x12e/0x150
<4> [83.540034]  ? _raw_spin_unlock_irqrestore+0x50/0x60
<4> [

Re: [Intel-gfx] [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute

2021-10-13 Thread Tvrtko Ursulin



On 13/10/2021 13:01, Daniel Vetter wrote:

On Wed, Oct 06, 2021 at 10:12:29AM -0700, Matthew Brost wrote:

On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
priority management") introduced some scheduling related vfuncs which
take integer request priority as argument.

Make them instead take struct i915_sched_attr, which is the type
encapsulating this information, so it probably aligns with the design
better. It definitely enables extending the set of scheduling attributes.



Understand the motivation here, but the i915_scheduler is going to
disappear when we move to the DRM scheduler, or at least its priority
inheritance functionality will be pushed into the DRM scheduler. I'd be
very careful making any changes here, as the priority in the DRM
scheduler is defined as a single enum:


Yeah I'm not sure it makes sense to build this and make the conversion to
drm/sched even harder. We've already merged a lot of code with a "we'll
totally convert to drm/sched right after" promise, there's not really room
for more fun like this built on top of i915-scheduler.


It is not really fun on top of the i915 scheduler. It is fun on top of the 
concept of uapi gem context priority. As long as there is gem context 
priority, and requests inherit from it, the concept works. This is 
demonstrated by the fact that it ties in with the GuC backend, which 
already reduces to three priorities. The granularity is limited, but it 
does something.


Implementation details aside, the key question is the proposal to tie 
process nice to GPU scheduling priority. There seems to be interest 
from other parties, so there probably is something here.


But I do plan to simplify this RFC to not add anything to 
i915_sched_attr and also drop the task sched attr change notifier.


Regards,

Tvrtko


-Daniel



/* These are often used as an (initial) index
  * to an array, and as such should start at 0.
  */
enum drm_sched_priority {
 DRM_SCHED_PRIORITY_MIN,
 DRM_SCHED_PRIORITY_NORMAL,
 DRM_SCHED_PRIORITY_HIGH,
 DRM_SCHED_PRIORITY_KERNEL,

 DRM_SCHED_PRIORITY_COUNT,
 DRM_SCHED_PRIORITY_UNSET = -2
};

Adding a field to the i915_sched_attr is fairly easy as we already have
a structure, but changing the DRM scheduler might be a tougher sell.
Anyway, can you make this work without adding the 'nice' field to
i915_sched_attr? Might be worth exploring, so that when we move to the DRM
scheduler this feature drops in a little cleaner.

Matt
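
One purely illustrative way to read that suggestion is to collapse nice
into the existing DRM scheduler enum at submission time, with no new
i915_sched_attr field (hypothetical helper, not from the patch):

	static enum drm_sched_priority nice_to_sched_priority(int nice)
	{
		if (nice < 0)
			return DRM_SCHED_PRIORITY_HIGH;
		if (nice > 0)
			return DRM_SCHED_PRIORITY_MIN;
		return DRM_SCHED_PRIORITY_NORMAL;
	}

That obviously loses granularity compared to the full -20..19 nice range,
which is part of the trade-off being debated above.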


Signed-off-by: Tvrtko Ursulin 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
---
  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c| 3 ++-
  drivers/gpu/drm/i915/i915_scheduler.c| 4 ++--
  drivers/gpu/drm/i915/i915_scheduler_types.h  | 4 ++--
  4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 7147fe80919e..e91d803a6453 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
return engine->class != RENDER_CLASS;
  }
  
-static void kick_execlists(const struct i915_request *rq, int prio)

+static void kick_execlists(const struct i915_request *rq,
+  const struct i915_sched_attr *attr)
  {
struct intel_engine_cs *engine = rq->engine;
struct i915_sched_engine *sched_engine = engine->sched_engine;
const struct i915_request *inflight;
+   const int prio = attr->priority;
  
  	/*

 * We only need to kick the tasklet once for the high priority
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index ba0de35f6323..b5883a4365ca 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs 
*engine)
  }
  
  static void guc_bump_inflight_request_prio(struct i915_request *rq,

-  int prio)
+  const struct i915_sched_attr *attr)
  {
struct intel_context *ce = rq->context;
+   const int prio = attr->priority;
u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
  
  	/* Short circuit function */

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index 762127dd56c5..534bab99fcdc 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,
  
  		/* Must 

Re: [PATCH RFC] virtio: wrap config->reset calls

2021-10-13 Thread David Hildenbrand
On 13.10.21 14:17, Michael S. Tsirkin wrote:
> On Wed, Oct 13, 2021 at 01:03:46PM +0200, David Hildenbrand wrote:
>> On 13.10.21 12:55, Michael S. Tsirkin wrote:
>>> This will enable cleanups down the road.
>>> The idea is to disable cbs, then add a "flush_queued_cbs" callback
>>> as a parameter; this way drivers can flush any work
>>> queued after callbacks have been disabled.
>>>
>>> Signed-off-by: Michael S. Tsirkin 
>>> ---
>>>   arch/um/drivers/virt-pci.c | 2 +-
>>>   drivers/block/virtio_blk.c | 4 ++--
>>>   drivers/bluetooth/virtio_bt.c  | 2 +-
>>>   drivers/char/hw_random/virtio-rng.c| 2 +-
>>>   drivers/char/virtio_console.c  | 4 ++--
>>>   drivers/crypto/virtio/virtio_crypto_core.c | 8 
>>>   drivers/firmware/arm_scmi/virtio.c | 2 +-
>>>   drivers/gpio/gpio-virtio.c | 2 +-
>>>   drivers/gpu/drm/virtio/virtgpu_kms.c   | 2 +-
>>>   drivers/i2c/busses/i2c-virtio.c| 2 +-
>>>   drivers/iommu/virtio-iommu.c   | 2 +-
>>>   drivers/net/caif/caif_virtio.c | 2 +-
>>>   drivers/net/virtio_net.c   | 4 ++--
>>>   drivers/net/wireless/mac80211_hwsim.c  | 2 +-
>>>   drivers/nvdimm/virtio_pmem.c   | 2 +-
>>>   drivers/rpmsg/virtio_rpmsg_bus.c   | 2 +-
>>>   drivers/scsi/virtio_scsi.c | 2 +-
>>>   drivers/virtio/virtio.c| 5 +
>>>   drivers/virtio/virtio_balloon.c| 2 +-
>>>   drivers/virtio/virtio_input.c  | 2 +-
>>>   drivers/virtio/virtio_mem.c| 2 +-
>>>   fs/fuse/virtio_fs.c| 4 ++--
>>>   include/linux/virtio.h | 1 +
>>>   net/9p/trans_virtio.c  | 2 +-
>>>   net/vmw_vsock/virtio_transport.c   | 4 ++--
>>>   sound/virtio/virtio_card.c | 4 ++--
>>>   26 files changed, 39 insertions(+), 33 deletions(-)
>>>
>>> diff --git a/arch/um/drivers/virt-pci.c b/arch/um/drivers/virt-pci.c
>>> index c08066633023..22c4d87c9c15 100644
>>> --- a/arch/um/drivers/virt-pci.c
>>> +++ b/arch/um/drivers/virt-pci.c
>>> @@ -616,7 +616,7 @@ static void um_pci_virtio_remove(struct virtio_device 
>>> *vdev)
>>> int i;
>>>   /* Stop all virtqueues */
>>> -vdev->config->reset(vdev);
>>> +virtio_reset_device(vdev);
>>>   vdev->config->del_vqs(vdev);
>>
>> Nit: virtio_device_reset()?
>>
>> Because I see:
>>
>> int virtio_device_freeze(struct virtio_device *dev);
>> int virtio_device_restore(struct virtio_device *dev);
>> void virtio_device_ready(struct virtio_device *dev)
>>
>> But well, there is:
>> void virtio_break_device(struct virtio_device *dev);
> 
> Exactly. I don't know what's best, so I opted for plain English :)

Fair enough, LGTM

Reviewed-by: David Hildenbrand 


-- 
Thanks,

David / dhildenb
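
For reference, the wrapper implied by the diffstat (the
drivers/virtio/virtio.c hunk is not quoted above, so this is the assumed
shape):

	void virtio_reset_device(struct virtio_device *dev)
	{
		dev->config->reset(dev);
	}
	EXPORT_SYMBOL_GPL(virtio_reset_device);

Callers then switch from vdev->config->reset(vdev) to
virtio_reset_device(vdev), as in the um_pci_virtio_remove() hunk.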



Re: [PATCH] drm/i915: Handle Intel igfx + Intel dgfx hybrid graphics setup

2021-10-13 Thread Tvrtko Ursulin



On 13/10/2021 13:06, Daniel Vetter wrote:

On Tue, Oct 05, 2021 at 03:05:25PM +0200, Thomas Hellström wrote:

Hi, Tvrtko,

On 10/5/21 13:31, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

In short this makes i915 work for hybrid setups (DRI_PRIME=1 with Mesa)
when rendering is done on Intel dgfx and scanout/composition on Intel
igfx.

Before this patch the driver was not quite ready for that setup, mainly
because it was able to emit a semaphore wait between the two GPUs, which
results in deadlocks because the semaphore target location in HWSP is neither
shared between the two, nor mapped in both GGTT spaces.

To fix it the patch adds an additional check to a couple of relevant code
paths in order to prevent using semaphores for inter-engine
synchronisation when relevant objects are not in the same GGTT space.

v2:
   * Avoid adding rq->i915. (Chris)

v3:
   * Use GGTT which describes the limit more precisely.

Signed-off-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
Cc: Matthew Auld 
Cc: Thomas Hellström 


An IMO pretty important bugfix. I read up a bit on the previous discussion
on this, and from what I understand the other two options were

1) Ripping out the semaphore code,
2) Considering dma-fences from other instances of the same driver as foreign.

For imported dma-bufs we do 2), but particularly with lmem and p2p that's a
more straightforward decision.

I don't think 1) is a reasonable approach to fix this bug (but perhaps as a
general cleanup?), and for 2) yes I guess we might end up doing that, unless
we find some real benefits in treating same-driver-separate-device
dma-fences as local, but for this particular bug, IMO this is a reasonable
fix.


The foreign dma-fences have uapi impact, which Tvrtko shrugged off as
"it's a good idea", and no, it really is not. So we still need to sort
this out properly.


I always said let's merge the fix and discuss it. The fix only improved one 
failure and did not introduce any of the new issues you are worried about. 
They were all already there.


So let's start the discussion: why is it not a good idea to extend the 
concept of priority inheritance to the hybrid case?


Today we can have a high priority compositor waiting for client rendering, 
or even I915_PRIORITY_DISPLAY, which I _think_ somehow ties into page 
flips with full screen stuff, and with igpu we do priority inheritance 
in those cases. Why is it a bad idea to do the same in the hybrid setup?


Regards,

Tvrtko




Reviewed-by: Thomas Hellström 


But I'm also ok with just merging this as-is so the situation doesn't
become too entertaining.
-Daniel








---
   drivers/gpu/drm/i915/i915_request.c | 12 +++-
   1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 79da5eca60af..4f189982f67e 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1145,6 +1145,12 @@ __emit_semaphore_wait(struct i915_request *to,
return 0;
   }
+static bool
+can_use_semaphore_wait(struct i915_request *to, struct i915_request *from)
+{
+   return to->engine->gt->ggtt == from->engine->gt->ggtt;
+}
+
   static int
   emit_semaphore_wait(struct i915_request *to,
struct i915_request *from,
@@ -1153,6 +1159,9 @@ emit_semaphore_wait(struct i915_request *to,
const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask;
struct i915_sw_fence *wait = &to->submit;
+   if (!can_use_semaphore_wait(to, from))
+   goto await_fence;
+
if (!intel_context_use_semaphores(to->context))
goto await_fence;
@@ -1256,7 +1265,8 @@ __i915_request_await_execution(struct i915_request *to,
 * immediate execution, and so we must wait until it reaches the
 * active slot.
 */
-   if (intel_engine_has_semaphores(to->engine) &&
+   if (can_use_semaphore_wait(to, from) &&
+   intel_engine_has_semaphores(to->engine) &&
!i915_request_has_initial_breadcrumb(to)) {
err = __emit_semaphore_wait(to, from, from->fence.seqno - 1);
if (err < 0)




Re: [PATCH 2/2] drm/i915/pmu: Connect engine busyness stats from GuC to pmu

2021-10-13 Thread Tvrtko Ursulin



On 13/10/2021 01:56, Umesh Nerlige Ramappa wrote:

With GuC handling scheduling, i915 is not aware of the time that a
context is scheduled in and out of the engine. Since i915 pmu relies on
this info to provide engine busyness to the user, GuC shares this info
with i915 for all engines using shared memory. For each engine, this
info contains:

- total busyness: total time that the context was running (total)
- id: id of the running context (id)
- start timestamp: timestamp when the context started running (start)

At the time (now) of sampling the engine busyness, if the id is valid
(!= ~0), and start is non-zero, then the context is considered to be
active and the engine busyness is calculated using the below equation

engine busyness = total + (now - start)

All times are obtained from the gt clock base. For inactive contexts,
engine busyness is just equal to the total.

The start and total values provided by GuC are 32 bits and wrap around
in a few minutes. Since perf pmu provides busyness as 64 bit
monotonically increasing values, there is a need for this implementation
to account for overflows and extend the time to 64 bits before returning
busyness to the user. In order to do that, a worker runs periodically, with
a period of 1/8th the time it takes for the timestamp to wrap. As an
example, that would be once in 27 seconds for a gt clock frequency of
19.2 MHz.

Note:
There might be an overaccounting of busyness due to the fact that GuC
may be updating the total and start values while kmd is reading them.
(i.e. kmd may read the updated total and the stale start). In such a
case, the user may see a higher busyness value followed by smaller ones which
would eventually catch up to the higher value.
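
A sketch of the 64-bit extension pattern the worker enables (illustrative;
the helper name is made up, but this is the standard trick given the worker
samples at least once per wrap period):

	static u64 extend_ts32(u64 *accumulated, u32 new_ts32)
	{
		u32 last = lower_32_bits(*accumulated);

		/* Unsigned 32-bit subtraction handles the wrap-around. */
		*accumulated += new_ts32 - last;

		return *accumulated;
	}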

v2: (Tvrtko)
- Include details in commit message
- Move intel engine busyness function into execlist code
- Use union inside engine->stats
- Use natural type for ping delay jiffies
- Drop active_work condition checks
- Use for_each_engine if iterating all engines
- Drop seq locking, use spinlock at guc level to update engine stats
- Document worker specific details

v3: (Tvrtko/Umesh)
- Demarcate guc and execlist stat objects with comments
- Document known over-accounting issue in commit
- Provide a consistent view of guc state
- Add hooks to gt park/unpark for guc busyness
- Stop/start worker in gt park/unpark path
- Drop inline
- Move spinlock and worker inits to guc initialization
- Drop helpers that are called only once

v4: (Tvrtko/Matt/Umesh)
- Drop addressed opens from commit message
- Get runtime pm in ping, remove from the park path
- Use cancel_delayed_work_sync in disable_submission path
- Update stats during reset prepare
- Skip ping if reset in progress
- Explicitly name execlists and guc stats objects
- Since disable_submission is called from many places, move resetting
   stats to intel_guc_submission_reset_prepare

v5: (Tvrtko)
- Add a trylock helper that does not sleep and synchronize PMU event
   callbacks and worker with gt reset


Looks good to me now, for some combination of high level and incomplete 
low level review (I did not check the overflow handling or the GuC page 
layout and flow). Both patches:


Acked-by: Tvrtko Ursulin 

Do you have someone available to check the parts I did not and r-b?

Regards,

Tvrtko



Signed-off-by: John Harrison 
Signed-off-by: Umesh Nerlige Ramappa 
---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c |  28 +-
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  33 ++-
  .../drm/i915/gt/intel_execlists_submission.c  |  34 +++
  drivers/gpu/drm/i915/gt/intel_gt_pm.c |   2 +
  drivers/gpu/drm/i915/gt/intel_reset.c |  16 ++
  drivers/gpu/drm/i915/gt/intel_reset.h |   1 +
  .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |   1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  30 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  21 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h|   5 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  13 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 267 ++
  .../gpu/drm/i915/gt/uc/intel_guc_submission.h |   2 +
  drivers/gpu/drm/i915/i915_reg.h   |   2 +
  14 files changed, 427 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 38436f4b5706..6b783fdcba2a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1873,23 +1873,6 @@ void intel_engine_dump(struct intel_engine_cs *engine,
intel_engine_print_breadcrumbs(engine, m);
  }
  
-static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine,

-   ktime_t *now)
-{
-   struct intel_engine_execlists_stats *stats = &engine->stats.execlists;
-   ktime_t total = stats->total;
-
-   /*
-* If the engine is executing something at the moment
-* add it to the total.
-*/
-   *now = 

Re: [PATCH] drm: Update MST First Link Slot Information Based on Encoding Format

2021-10-13 Thread Jani Nikula
On Tue, 12 Oct 2021, Bhawanpreet Lakha  wrote:
> The 8b/10b encoding format requires reserving the first slot for
> recording metadata. Real data transmission starts from the second slot,
> with a total of 63 slots available.
>
> In the 128b/132b encoding format, metadata is transmitted separately
> in an LLCP packet before the MTP. Real data transmission starts from
> the first slot, with a total of 64 slots available.
>
> v2:
> * Remove get_mst_link_encoding_cap
> * Move total/start slots to mst_state, and copy it to mst_mgr in
> atomic_check
>
> Signed-off-by: Fangzhi Zuo 
> Signed-off-by: Bhawanpreet Lakha 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 28 +++
>  drivers/gpu/drm/drm_dp_mst_topology.c | 35 +++
>  include/drm/drm_dp_mst_helper.h   | 13 +++
>  3 files changed, 69 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 5020f2d36fe1..4ad50eb0091a 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -10612,6 +10612,8 @@ static int amdgpu_dm_atomic_check(struct drm_device 
> *dev,
>  #if defined(CONFIG_DRM_AMD_DC_DCN)
>   struct dsc_mst_fairness_vars vars[MAX_PIPES];
>  #endif
> + struct drm_dp_mst_topology_state *mst_state;
> + struct drm_dp_mst_topology_mgr *mgr;
>  
>   trace_amdgpu_dm_atomic_check_begin(state);
>  
> @@ -10819,6 +10821,32 @@ static int amdgpu_dm_atomic_check(struct drm_device 
> *dev,
>   lock_and_validation_needed = true;
>   }
>  
> +#if defined(CONFIG_DRM_AMD_DC_DCN)
> + for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) {
> + struct amdgpu_dm_connector *aconnector;
> + struct drm_connector *connector;
> + struct drm_connector_list_iter iter;
> + u8 link_coding_cap;
> +
> + if (!mgr->mst_state)
> + continue;
> +
> + drm_connector_list_iter_begin(dev, &iter);
> + drm_for_each_connector_iter(connector, &iter) {
> + int id = connector->index;
> +
> + if (id == mst_state->mgr->conn_base_id) {
> + aconnector = to_amdgpu_dm_connector(connector);
> + link_coding_cap = dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link);
> + drm_dp_mst_update_coding_cap(mst_state, link_coding_cap);
> +
> + break;
> + }
> + }
> + drm_connector_list_iter_end(&iter);
> +
> + }
> +#endif

I wonder if we could split this into separate drm dp helper and amd driver
patches?

>   /**
>* Streams and planes are reset when there are changes that affect
>* bandwidth. Anything that affects bandwidth needs to go through
> diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
> b/drivers/gpu/drm/drm_dp_mst_topology.c
> index ad0795afc21c..fb5c47c4cb2e 100644
> --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> @@ -3368,7 +3368,7 @@ int drm_dp_update_payload_part1(struct 
> drm_dp_mst_topology_mgr *mgr)
>   struct drm_dp_payload req_payload;
>   struct drm_dp_mst_port *port;
>   int i, j;
> - int cur_slots = 1;
> + int cur_slots = mgr->start_slot;
>   bool skip;
>  
>   mutex_lock(&mgr->payload_lock);
> @@ -4321,7 +4321,7 @@ int drm_dp_find_vcpi_slots(struct 
> drm_dp_mst_topology_mgr *mgr,
>   num_slots = DIV_ROUND_UP(pbn, mgr->pbn_div);
>  
>   /* max. time slots - one slot for MTP header */
> - if (num_slots > 63)
> + if (num_slots > mgr->total_avail_slots)
>   return -ENOSPC;
>   return num_slots;
>  }
> @@ -4333,7 +4333,7 @@ static int drm_dp_init_vcpi(struct 
> drm_dp_mst_topology_mgr *mgr,
>   int ret;
>  
>   /* max. time slots - one slot for MTP header */
> - if (slots > 63)
> + if (slots > mgr->total_avail_slots)
>   return -ENOSPC;
>  
>   vcpi->pbn = pbn;
> @@ -4507,6 +4507,18 @@ int drm_dp_atomic_release_vcpi_slots(struct 
> drm_atomic_state *state,
>  }
>  EXPORT_SYMBOL(drm_dp_atomic_release_vcpi_slots);
>  
> +void drm_dp_mst_update_coding_cap(struct drm_dp_mst_topology_state 
> *mst_state, uint8_t link_coding_cap)
> +{
> + if (link_coding_cap == DP_CAP_ANSI_128B132B) {
> + mst_state->total_avail_slots = 64;
> + mst_state->start_slot = 0;
> + }

The values never change AFAICT; should we store the channel encoding
instead and use that information to initialize the values?

(Alternatively, why aren't the 8b/10b values initialized here if
128b/132b are?)

> +
> + DRM_DEBUG_KMS("%s coding format on mgr 0x%p\n",
> +   (link_coding_cap == DP_CAP_ANSI_128B132B) ? "128b/132b" : "8b/10b", mst_state->mgr);
> +}
> 
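
A minimal sketch of the alternative suggested above, assuming the field
names introduced by this patch and the DP_CAP_ANSI_* macros from
drm_dp_helper.h; initializing both encodings explicitly would also keep
the 8b/10b defaults in one place:

	/*
	 * Sketch only, not the patch: set the slot bookkeeping for both
	 * channel encodings instead of only the 128b/132b case.
	 */
	void drm_dp_mst_update_coding_cap(struct drm_dp_mst_topology_state *mst_state,
					  uint8_t link_coding_cap)
	{
		if (link_coding_cap == DP_CAP_ANSI_128B132B) {
			/* 128b/132b: metadata travels in LLCP packets,
			 * so all 64 MTP time slots carry payload. */
			mst_state->total_avail_slots = 64;
			mst_state->start_slot = 0;
		} else {
			/* 8b/10b: slot 0 carries the MTP header,
			 * leaving 63 payload slots. */
			mst_state->total_avail_slots = 63;
			mst_state->start_slot = 1;
		}
	}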

[PATCH v2 2/7] nouveau: ACPI: Use the ACPI_COMPANION() macro directly

2021-10-13 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

The ACPI_HANDLE() macro is a wrapper around the ACPI_COMPANION()
macro and the ACPI handle produced by the former comes from the
ACPI device object produced by the latter, so it is way more
straightforward to evaluate the latter directly instead of passing
the handle produced by the former to acpi_bus_get_device().

Modify nouveau_acpi_edid() accordingly (no intentional functional
impact).

Signed-off-by: Rafael J. Wysocki 
Reviewed-by: Ben Skeggs 
---

v1 -> v2:
   * Resend with a different From and S-o-b address and with R-by from Ben.
 No other changes.

---
 drivers/gpu/drm/nouveau/nouveau_acpi.c |9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/gpu/drm/nouveau/nouveau_acpi.c
===
--- linux-pm.orig/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ linux-pm/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -364,7 +364,6 @@ void *
 nouveau_acpi_edid(struct drm_device *dev, struct drm_connector *connector)
 {
struct acpi_device *acpidev;
-   acpi_handle handle;
int type, ret;
void *edid;
 
@@ -377,12 +376,8 @@ nouveau_acpi_edid(struct drm_device *dev
return NULL;
}
 
-   handle = ACPI_HANDLE(dev->dev);
-   if (!handle)
-   return NULL;
-
-   ret = acpi_bus_get_device(handle, &acpidev);
-   if (ret)
+   acpidev = ACPI_COMPANION(dev->dev);
+   if (!acpidev)
return NULL;
 
ret = acpi_video_get_edid(acpidev, type, -1, &edid);
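
For illustration, the resulting pattern in a minimal, self-contained form;
the helper name here is made up for the sketch:

	/*
	 * Sketch only: ACPI_COMPANION() yields the struct acpi_device
	 * directly, so the handle/acpi_bus_get_device() round trip is
	 * unnecessary.
	 */
	static struct acpi_device *example_acpi_companion(struct device *dev)
	{
		struct acpi_device *acpidev = ACPI_COMPANION(dev);

		/* NULL when the device has no ACPI companion; callers check. */
		return acpidev;
	}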





Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 04:37:03PM +0100, Tvrtko Ursulin wrote:
> 
> On 13/10/2021 15:00, Daniel Vetter wrote:
> > On Wed, Oct 13, 2021 at 02:32:03PM +0200, Maarten Lankhorst wrote:
> > > No memory should be allocated when calling i915_gem_object_wait,
> > > because it may be called to idle a BO when evicting memory.
> > > 
> > > Fix this by using dma_resv_iter helpers to call
> > > i915_gem_object_wait_fence() on each fence, which cleans up the code a 
> > > lot.
> > > Also remove dma_resv_prune, it's questionable.
> > > 
> > > This will result in the following lockdep splat.
> > > 
> > > <4> [83.538517] ==
> > > <4> [83.538520] WARNING: possible circular locking dependency detected
> > > <4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
> > > <4> [83.538525] --
> > > <4> [83.538527] gem_render_line/5242 is trying to acquire lock:
> > > <4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
> > > __kmalloc_track_caller+0x56/0x270
> > > <4> [83.538538]
> > > but task is already holding lock:
> > > <4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
> > > i915_vma_pin_ww+0x1c7/0x970 [i915]
> > > <4> [83.538638]
> > > which lock already depends on the new lock.
> > > <4> [83.538642]
> > > the existing dependency chain (in reverse order) is:
> > > <4> [83.538645]
> > > -> #1 (&vm->mutex/1){+.+.}-{3:3}:
> > > <4> [83.538649]lock_acquire+0xd3/0x310
> > > <4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
> > > <4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
> > > <4> [83.538794]ppgtt_init+0x55/0x70 [i915]
> > > <4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
> > > <4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
> > > <4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
> > > <4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
> > > <4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
> > > <4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
> > > <4> [83.539197]pci_device_probe+0x9b/0x110
> > > <4> [83.539201]really_probe+0x1b0/0x3b0
> > > <4> [83.539205]__driver_probe_device+0xf6/0x170
> > > <4> [83.539208]driver_probe_device+0x1a/0x90
> > > <4> [83.539210]__driver_attach+0x93/0x160
> > > <4> [83.539213]bus_for_each_dev+0x72/0xc0
> > > <4> [83.539216]bus_add_driver+0x14b/0x1f0
> > > <4> [83.539220]driver_register+0x66/0xb0
> > > <4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
> > > <4> [83.539227]do_one_initcall+0x53/0x2e0
> > > <4> [83.539230]do_init_module+0x55/0x200
> > > <4> [83.539234]load_module+0x2700/0x2980
> > > <4> [83.539237]__do_sys_finit_module+0xaa/0x110
> > > <4> [83.539241]do_syscall_64+0x37/0xb0
> > > <4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > <4> [83.539247]
> > > -> #0 (fs_reclaim){+.+.}-{0:0}:
> > > <4> [83.539251]validate_chain+0xb37/0x1e70
> > > <4> [83.539254]__lock_acquire+0x5a1/0xb70
> > > <4> [83.539258]lock_acquire+0xd3/0x310
> > > <4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
> > > <4> [83.539264]__kmalloc_track_caller+0x56/0x270
> > > <4> [83.539267]krealloc+0x48/0xa0
> > > <4> [83.539270]dma_resv_get_fences+0x1c3/0x280
> > > <4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
> > > <4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
> > > <4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
> > > <4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
> > > <4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
> > > <4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
> > > <4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
> > > <4> [83.539759]drm_ioctl_kernel+0xac/0x140
> > > <4> [83.539763]drm_ioctl+0x201/0x3d0
> > > <4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
> > > <4> [83.539769]do_syscall_64+0x37/0xb0
> > > <4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > <4> [83.539775]
> > > other info that might help us debug this:
> > > <4> [83.539778]  Possible unsafe locking scenario:
> > > <4> [83.539781]        CPU0                    CPU1
> > > <4> [83.539783]        ----                    ----
> > > <4> [83.539785]   lock(&vm->mutex/1);
> > > <4> [83.539788]                                lock(fs_reclaim);
> > > <4> [83.539791]                                lock(&vm->mutex/1);
> > > <4> [83.539794]   lock(fs_reclaim);
> > > <4> [83.539796]
> > >   *** DEADLOCK ***
> > > <4> [83.539799] 3 locks held by gem_render_line/5242:
> > > <4> [83.539802]  #0: c9d4bbf0 
> > > (reservation_ww_class_acquire){+.+.}-{0:0}, at: 
> > > i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
> > > <4> [83.539870]  #1: 88811e48bae8 
> > > (reservation
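
A minimal sketch of the approach the commit message describes, assuming the
bool-based dma_resv_iter API this series builds on; the wrapper name is
illustrative, while i915_gem_object_wait_fence() is the driver's existing
per-fence wait helper:

	/*
	 * Sketch only: walk the fences one at a time and wait on each,
	 * so no fence array needs to be allocated under memory pressure.
	 */
	static long example_wait_reservation(struct dma_resv *resv, bool wait_all,
					     unsigned int flags, long timeout)
	{
		struct dma_resv_iter cursor;
		struct dma_fence *fence;

		dma_resv_iter_begin(&cursor, resv, wait_all);
		dma_resv_for_each_fence_unlocked(&cursor, fence) {
			timeout = i915_gem_object_wait_fence(fence, flags, timeout);
			if (timeout <= 0)
				break;
		}
		dma_resv_iter_end(&cursor);

		return timeout;
	}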
