date:20220204

Re: [PATCH v2] drm/bridge: dw-hdmi: use safe format when first in bridge chain

2022-02-04 Thread Neil Armstrong


Hi,

On 19/01/2022 13:28, Neil Armstrong wrote:

When the dw-hdmi bridge is in first place of the bridge chain, this
means there is no way to select an input format of the dw-hdmi HW
component.

Since introduction of display-connector, negotiation was broken since
the dw-hdmi negotiation code only worked when the dw-hdmi bridge was
in last position of the bridge chain or behind another bridge also
supporting input & output format negotiation.

Commit 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts 
callbacks")
was introduced to make negotiation work again by making display-connector
act as a pass-through concerning input & output format negotiation.

But in the case where the dw-hdmi is single in the bridge chain, for
example on Renesas SoCs, with the display-connector bridge the dw-hdmi
is no more single, breaking output format.

Reported-by: Biju Das 
Bisected-by: Kieran Bingham 
Tested-by: Kieran Bingham 
Fixes: 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts 
callbacks").
Signed-off-by: Neil Armstrong 
---
Changes since v1:
- Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts
- Fix typos in commit message

  drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 54d8fdad395f..97cdc61b57f6 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -2551,8 +2551,9 @@ static u32 
*dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge,
if (!output_fmts)
return NULL;
  
-	/* If dw-hdmi is the only bridge, avoid negociating with ourselves */

-   if (list_is_singular(&bridge->encoder->bridge_chain)) {
+   /* If dw-hdmi is the first or only bridge, avoid negociating with 
ourselves */
+   if (list_is_singular(&bridge->encoder->bridge_chain) ||
+   list_is_first(&bridge->chain_node, &bridge->encoder->bridge_chain)) 
{
*num_output_fmts = 1;
output_fmts[0] = MEDIA_BUS_FMT_FIXED;
  


Can a bridge reviewer get an eye on this ? this actually solves an issue 
reported on a Renesas platform.

Neil

Re: [PATCH 21/21] fbdev: Make registered_fb[] private to fbmem.c

2022-02-04 Thread Geert Uytterhoeven

Hi Daniel,

Thanks for your patch!

On Tue, Feb 1, 2022 at 9:50 PM Daniel Vetter  wrote:
> Well except when the olpc dcon fbdev driver is enabled, that thing
> digs around in there in rather unfixable ways.

Can't the actual frame buffer driver (which one?) used on olpc export
a pointer to its fb_info?

> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -48,10 +48,14 @@
>  static DEFINE_MUTEX(registration_lock);
>
>  struct fb_info *registered_fb[FB_MAX] __read_mostly;
> -EXPORT_SYMBOL(registered_fb);
> -
>  int num_registered_fb __read_mostly;
> +#if IS_ENABLED(CONFIG_OLPC_DCON)

CONFIG_FB_OLPC_DCON (everywhere), cfr. the build failure reported
by the robot.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH v4 3/3] fbcon: Add option to enable legacy hardware acceleration

2022-02-04 Thread Geert Uytterhoeven

Hi Helge,

Thanks for your patch!

On Wed, Feb 2, 2022 at 8:05 PM Helge Deller  wrote:
> Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to
> enable bitblt and fillrect hardware acceleration in the framebuffer
> console. If disabled, such acceleration will not be used, even if it is
> supported by the graphics hardware driver.

Note that this also applies to vertical panning and wrapping.


Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH] HPE BMC GXP SUPPORT

2022-02-04 Thread Krzysztof Kozlowski

On 03/02/2022 18:07, Verdun, Jean-Marie wrote:
> 
>> Maybe it does not look like, but this is actually a v2. Nick was asked
>> to change the naming for the nodes already in v1. Unfortunately it did
>> not happen, so we have vuart, spifi, vic and more.
> 
>> It is a waste of reviewers' time to ask them to perform the same review
>> twice or to ignore their comments.
> 
> Hi Krysztof,
> 
> Accept our apologies if you think you lose your time, it is clearly not 
> our
> intent. 
> 
> This is the first time that we (I mean the team) introduce a new arch into
> the linux kernel and I must admit that we had hard time to understand
> from which angle we needed to start.
> 
> I will probably write a Documentation afterward, as it is easy to find doc
> on how to introduce a patch or a driver, but not when you want to 
> introduce a new chip. 
> 
> We are trying to do our best, and clearly want to follow all of your 
> inputs.
> Mistakes happen and they are clearly not intentional and not driven in 
> a way to make you lose your time.
> 
> Helping others, and teaching something new is definitely a way to 
> optimize your time and this is what you are currently doing with us.
> 
> We appreciate it and hope you will too.

I understand, I also maybe over-reacted on this. Just please go through
the comments you got for first submission and either apply them or
respond why you disagree.

The next submissions (patchset split into several commits) should be a
v3, preferably with cover letter (git format-patch --cover-letter -v3
...) where you can document also changes you did to the patchset.

It looks for example like this:
https://lore.kernel.org/linux-samsung-soc/31da451b-a36c-74fb-5667-d4193284c...@canonical.com/T/#mf98d2ac27a8481dc69dd110f9861c8318cade252

or like this (where changelogs are in each patch, although ordering is
not correct because dt-bindings should be the first in the series):
https://lore.kernel.org/all/20220103233948.198119-1-mr.bossman...@gmail.com/


Best regards,
Krzysztof

Re: [PATCH v3 2/2] drm/i915/uapi: Add query for hwconfig table

2022-02-04 Thread Daniel Vetter

On Wed, Jan 19, 2022 at 9:35 PM  wrote:
>
> From: Rodrigo Vivi 
>
> GuC contains a consolidated table with a bunch of information about the
> current device.
>
> Previously, this information was spread and hardcoded to all the components
> including GuC, i915 and various UMDs. The goal here is to consolidate
> the data into GuC in a way that all interested components can grab the
> very latest and synchronized information using a simple query.
>
> As per most of the other queries, this one can be called twice.
> Once with item.length=0 to determine the exact buffer size, then
> allocate the user memory and call it again for to retrieve the
> table data. For example:
>   struct drm_i915_query_item item = {
> .query_id = DRM_I915_QUERY_HWCONCFIG_TABLE;
>   };
>   query.items_ptr = (int64_t) &item;
>   query.num_items = 1;
>
>   ioctl(fd, DRM_IOCTL_I915_QUERY, query, sizeof(query));
>
>   if (item.length <= 0)
> return -ENOENT;
>
>   data = malloc(item.length);
>   item.data_ptr = (int64_t) &data;
>   ioctl(fd, DRM_IOCTL_I915_QUERY, query, sizeof(query));
>
>   // Parse the data as appropriate...
>
> The returned array is a simple and flexible KLV (Key/Length/Value)
> formatted table. For example, it could be just:
>   enum device_attr {
>  ATTR_SOME_VALUE = 0,
>  ATTR_SOME_MASK  = 1,
>   };
>
>   static const u32 hwconfig[] = {
>   ATTR_SOME_VALUE,
>   1, // Value Length in DWords
>   8, // Value
>
>   ATTR_SOME_MASK,
>   3,
>   0x00, 0x, 0xFF00,
>   };
>
> The attribute ids are defined in a hardware spec.
>
> Cc: Tvrtko Ursulin 
> Cc: Kenneth Graunke 
> Cc: Michal Wajdeczko 
> Cc: Slawomir Milczarek 
> Signed-off-by: Rodrigo Vivi 
> Signed-off-by: John Harrison 
> Reviewed-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/i915_query.c | 23 +++
>  include/uapi/drm/i915_drm.h   |  1 +
>  2 files changed, 24 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_query.c 
> b/drivers/gpu/drm/i915/i915_query.c
> index 2dfbc22857a3..609e64d5f395 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -479,12 +479,35 @@ static int query_memregion_info(struct drm_i915_private 
> *i915,
> return total_length;
>  }
>
> +static int query_hwconfig_table(struct drm_i915_private *i915,
> +   struct drm_i915_query_item *query_item)
> +{
> +   struct intel_gt *gt = to_gt(i915);
> +   struct intel_guc_hwconfig *hwconfig = >->uc.guc.hwconfig;
> +
> +   if (!hwconfig->size || !hwconfig->ptr)
> +   return -ENODEV;
> +
> +   if (query_item->length == 0)
> +   return hwconfig->size;
> +
> +   if (query_item->length < hwconfig->size)
> +   return -EINVAL;
> +
> +   if (copy_to_user(u64_to_user_ptr(query_item->data_ptr),
> +hwconfig->ptr, hwconfig->size))
> +   return -EFAULT;
> +
> +   return hwconfig->size;
> +}
> +
>  static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
> struct drm_i915_query_item 
> *query_item) = {
> query_topology_info,
> query_engine_info,
> query_perf_config,
> query_memregion_info,
> +   query_hwconfig_table,
>  };
>
>  int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file 
> *file)
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 914ebd9290e5..132515199f27 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -2685,6 +2685,7 @@ struct drm_i915_query_item {
>  #define DRM_I915_QUERY_ENGINE_INFO 2
>  #define DRM_I915_QUERY_PERF_CONFIG  3
>  #define DRM_I915_QUERY_MEMORY_REGIONS   4
> +#define DRM_I915_QUERY_HWCONFIG_TABLE   5
>  /* Must be kept compact -- no holes and well documented */

New uapi needs kerneldoc in the uapi header, and please fill in any
gaps you have (i.e. if the query uapi this is built on top of isn't
fully documented yet).

Also this holds across the board, so please keep in mind in patch review.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 04/21] fbcon: delete a few unneeded forward decl

2022-02-04 Thread Geert Uytterhoeven

On Tue, Feb 1, 2022 at 9:50 PM Daniel Vetter  wrote:
> I didn't bother with any code movement to fix the others, these just
> got a bit in the way.
>
> Signed-off-by: Daniel Vetter 

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Add warning for nesting dma_fence containers

2022-02-04 Thread Christian König

Hi everyone,

Since some operations can then lead to recursive handling nesting
dma_fence containers into each other is only allowed under some
restrictions.

dma_fence_array containers can be attached to dma_fence_chain
containers and dma_fence_chain containers by chaining them together.

In all other cases the individual fences should be extracted with
the appropriate iterators and added to the new containers
individually.

I've separated the i915 cleanup from this change since it is
generally a different functionality and the build bots complained
about some issues with those patches.

Most patches are already reviewd, but especially the first one still
needs an rb tag.

Please review and comment,
Christian.

[PATCH 1/6] dma-buf: consolidate dma_fence subclass checking

2022-02-04 Thread Christian König

Consolidate the wrapper functions to check for dma_fence
subclasses in the dma_fence header.

This makes it easier to document and also check the different
requirements for fence containers in the subclasses.

Signed-off-by: Christian König 
---
 include/linux/dma-fence-array.h | 15 +
 include/linux/dma-fence-chain.h |  3 +--
 include/linux/dma-fence.h   | 38 +
 3 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/include/linux/dma-fence-array.h b/include/linux/dma-fence-array.h
index 303dd712220f..fec374f69e12 100644
--- a/include/linux/dma-fence-array.h
+++ b/include/linux/dma-fence-array.h
@@ -45,19 +45,6 @@ struct dma_fence_array {
struct irq_work work;
 };
 
-extern const struct dma_fence_ops dma_fence_array_ops;
-
-/**
- * dma_fence_is_array - check if a fence is from the array subsclass
- * @fence: fence to test
- *
- * Return true if it is a dma_fence_array and false otherwise.
- */
-static inline bool dma_fence_is_array(struct dma_fence *fence)
-{
-   return fence->ops == &dma_fence_array_ops;
-}
-
 /**
  * to_dma_fence_array - cast a fence to a dma_fence_array
  * @fence: fence to cast to a dma_fence_array
@@ -68,7 +55,7 @@ static inline bool dma_fence_is_array(struct dma_fence *fence)
 static inline struct dma_fence_array *
 to_dma_fence_array(struct dma_fence *fence)
 {
-   if (fence->ops != &dma_fence_array_ops)
+   if (!fence || !dma_fence_is_array(fence))
return NULL;
 
return container_of(fence, struct dma_fence_array, base);
diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma-fence-chain.h
index 54fe3443fd2c..ee906b659694 100644
--- a/include/linux/dma-fence-chain.h
+++ b/include/linux/dma-fence-chain.h
@@ -49,7 +49,6 @@ struct dma_fence_chain {
spinlock_t lock;
 };
 
-extern const struct dma_fence_ops dma_fence_chain_ops;
 
 /**
  * to_dma_fence_chain - cast a fence to a dma_fence_chain
@@ -61,7 +60,7 @@ extern const struct dma_fence_ops dma_fence_chain_ops;
 static inline struct dma_fence_chain *
 to_dma_fence_chain(struct dma_fence *fence)
 {
-   if (!fence || fence->ops != &dma_fence_chain_ops)
+   if (!fence || !dma_fence_is_chain(fence))
return NULL;
 
return container_of(fence, struct dma_fence_chain, base);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 1ea691753bd3..775cdc0b4f24 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -587,4 +587,42 @@ struct dma_fence *dma_fence_get_stub(void);
 struct dma_fence *dma_fence_allocate_private_stub(void);
 u64 dma_fence_context_alloc(unsigned num);
 
+extern const struct dma_fence_ops dma_fence_array_ops;
+extern const struct dma_fence_ops dma_fence_chain_ops;
+
+/**
+ * dma_fence_is_array - check if a fence is from the array subclass
+ * @fence: the fence to test
+ *
+ * Return true if it is a dma_fence_array and false otherwise.
+ */
+static inline bool dma_fence_is_array(struct dma_fence *fence)
+{
+   return fence->ops == &dma_fence_array_ops;
+}
+
+/**
+ * dma_fence_is_chain - check if a fence is from the chain subclass
+ * @fence: the fence to test
+ *
+ * Return true if it is a dma_fence_chain and false otherwise.
+ */
+static inline bool dma_fence_is_chain(struct dma_fence *fence)
+{
+   return fence->ops == &dma_fence_chain_ops;
+}
+
+/**
+ * dma_fence_is_container - check if a fence is a container for other fences
+ * @fence: the fence to test
+ *
+ * Return true if this fence is a container for other fences, false otherwise.
+ * This is important since we can't build up large fence structure or otherwise
+ * we run into recursion during operation on those fences.
+ */
+static inline bool dma_fence_is_container(struct dma_fence *fence)
+{
+   return dma_fence_is_array(fence) || dma_fence_is_chain(fence);
+}
+
 #endif /* __LINUX_DMA_FENCE_H */
-- 
2.25.1

[PATCH 2/6] dma-buf: warn about dma_fence_array container rules v2

2022-02-04 Thread Christian König

It's not allowed to nest another dma_fence container into a dma_fence_array
or otherwise we can run into recursion.

Warn about that when we create a dma_fence_array.

v2: fix comment style and typo in the warning pointed out by Thomas

Signed-off-by: Christian König 
Reviewed-by: Daniel Vetter 
Reviewed-by: Thomas Hellström 
---
 drivers/dma-buf/dma-fence-array.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-array.c 
b/drivers/dma-buf/dma-fence-array.c
index 3e07f961e2f3..cb1bacb5a42b 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -176,6 +176,20 @@ struct dma_fence_array *dma_fence_array_create(int 
num_fences,
 
array->base.error = PENDING_ERROR;
 
+   /*
+* dma_fence_array objects should never contain any other fence
+* containers or otherwise we run into recursion and potential kernel
+* stack overflow on operations on the dma_fence_array.
+*
+* The correct way of handling this is to flatten out the array by the
+* caller instead.
+*
+* Enforce this here by checking that we don't create a dma_fence_array
+* with any container inside.
+*/
+   while (num_fences--)
+   WARN_ON(dma_fence_is_container(fences[num_fences]));
+
return array;
 }
 EXPORT_SYMBOL(dma_fence_array_create);
-- 
2.25.1

[PATCH 4/6] dma-buf: warn about containers in dma_resv object

2022-02-04 Thread Christian König

Drivers should not add containers as shared fences to the dma_resv
object, instead each fence should be added individually.

Signed-off-by: Christian König 
Reviewed-by: Daniel Vetter 
Reviewed-by: Thomas Hellström 
---
 drivers/dma-buf/dma-resv.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index ee31f15d633a..b51416405e86 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -256,6 +256,11 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, 
struct dma_fence *fence)
 
dma_resv_assert_held(obj);
 
+   /* Drivers should not add containers here, instead add each fence
+* individually.
+*/
+   WARN_ON(dma_fence_is_container(fence));
+
fobj = dma_resv_shared_list(obj);
count = fobj->shared_count;
 
-- 
2.25.1

[PATCH 5/6] dma-buf: add dma_fence_chain_contained helper

2022-02-04 Thread Christian König

It's a reoccurring pattern that we need to extract the fence
from a dma_fence_chain object. Add a helper for this.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-fence-chain.c |  6 ++
 include/linux/dma-fence-chain.h   | 15 +++
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/dma-buf/dma-fence-chain.c 
b/drivers/dma-buf/dma-fence-chain.c
index 084c6927b735..06f8ef97c6e8 100644
--- a/drivers/dma-buf/dma-fence-chain.c
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -148,8 +148,7 @@ static bool dma_fence_chain_enable_signaling(struct 
dma_fence *fence)
 
dma_fence_get(&head->base);
dma_fence_chain_for_each(fence, &head->base) {
-   struct dma_fence_chain *chain = to_dma_fence_chain(fence);
-   struct dma_fence *f = chain ? chain->fence : fence;
+   struct dma_fence *f = dma_fence_chain_contained(fence);
 
dma_fence_get(f);
if (!dma_fence_add_callback(f, &head->cb, dma_fence_chain_cb)) {
@@ -165,8 +164,7 @@ static bool dma_fence_chain_enable_signaling(struct 
dma_fence *fence)
 static bool dma_fence_chain_signaled(struct dma_fence *fence)
 {
dma_fence_chain_for_each(fence, fence) {
-   struct dma_fence_chain *chain = to_dma_fence_chain(fence);
-   struct dma_fence *f = chain ? chain->fence : fence;
+   struct dma_fence *f = dma_fence_chain_contained(fence);
 
if (!dma_fence_is_signaled(f)) {
dma_fence_put(fence);
diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma-fence-chain.h
index ee906b659694..10d51bcdf7b7 100644
--- a/include/linux/dma-fence-chain.h
+++ b/include/linux/dma-fence-chain.h
@@ -66,6 +66,21 @@ to_dma_fence_chain(struct dma_fence *fence)
return container_of(fence, struct dma_fence_chain, base);
 }
 
+/**
+ * dma_fence_chain_contained - return the contained fence
+ * @fence: the fence to test
+ *
+ * If the fence is a dma_fence_chain the function returns the fence contained
+ * inside the chain object, otherwise it returns the fence itself.
+ */
+static inline struct dma_fence *
+dma_fence_chain_contained(struct dma_fence *fence)
+{
+   struct dma_fence_chain *chain = to_dma_fence_chain(fence);
+
+   return chain ? chain->fence : fence;
+}
+
 /**
  * dma_fence_chain_alloc
  *
-- 
2.25.1

[PATCH 6/6] drm/amdgpu: use dma_fence_chain_contained

2022-02-04 Thread Christian König

Instead of manually extracting the fence.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index f7d8487799b2..40e06745fae9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -261,10 +261,9 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct 
amdgpu_sync *sync,
 
dma_resv_for_each_fence(&cursor, resv, true, f) {
dma_fence_chain_for_each(f, f) {
-   struct dma_fence_chain *chain = to_dma_fence_chain(f);
+   struct dma_fence *tmp = dma_fence_chain_contained(f);
 
-   if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
-  chain->fence : f)) {
+   if (amdgpu_sync_test_fence(adev, mode, owner, tmp)) {
r = amdgpu_sync_fence(sync, f);
dma_fence_put(f);
if (r)
-- 
2.25.1

[PATCH 3/6] dma-buf: Warn about dma_fence_chain container rules v2

2022-02-04 Thread Christian König

Chaining of dma_fence_chain objects is only allowed through the prev
fence and not through the contained fence.

Warn about that when we create a dma_fence_chain.

v2: fix comment style

Signed-off-by: Christian König 
Reviewed-by: Daniel Vetter 
Reviewed-by: Thomas Hellström 
---
 drivers/dma-buf/dma-fence-chain.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-chain.c 
b/drivers/dma-buf/dma-fence-chain.c
index 1b4cb3e5cec9..084c6927b735 100644
--- a/drivers/dma-buf/dma-fence-chain.c
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -254,5 +254,14 @@ void dma_fence_chain_init(struct dma_fence_chain *chain,
 
dma_fence_init(&chain->base, &dma_fence_chain_ops,
   &chain->lock, context, seqno);
+
+   /*
+* Chaining dma_fence_chain container together is only allowed through
+* the prev fence and not through the contained fence.
+*
+* The correct way of handling this is to flatten out the fence
+* structure into a dma_fence_array by the caller instead.
+*/
+   WARN_ON(dma_fence_is_chain(fence));
 }
 EXPORT_SYMBOL(dma_fence_chain_init);
-- 
2.25.1

Re: [PATCH v4 3/3] fbcon: Add option to enable legacy hardware acceleration

2022-02-04 Thread Helge Deller

Hello Geert,

On 2/4/22 09:37, Geert Uytterhoeven wrote:
> On Wed, Feb 2, 2022 at 8:05 PM Helge Deller  wrote:
>> Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to
>> enable bitblt and fillrect hardware acceleration in the framebuffer
>> console. If disabled, such acceleration will not be used, even if it is
>> supported by the graphics hardware driver.
>
> Note that this also applies to vertical panning and wrapping.

That's correct.
Would you mind to send a patch which adds this info?

Helge

Re: [PATCH v4 3/3] fbcon: Add option to enable legacy hardware acceleration

2022-02-04 Thread Geert Uytterhoeven

Hi Helge,

On Fri, Feb 4, 2022 at 11:17 AM Helge Deller  wrote:
> On 2/4/22 09:37, Geert Uytterhoeven wrote:
> > On Wed, Feb 2, 2022 at 8:05 PM Helge Deller  wrote:
> >> Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to
> >> enable bitblt and fillrect hardware acceleration in the framebuffer
> >> console. If disabled, such acceleration will not be used, even if it is
> >> supported by the graphics hardware driver.
> >
> > Note that this also applies to vertical panning and wrapping.
>
> That's correct.
> Would you mind to send a patch which adds this info?

To add it where? "bitblt and fillrect" are only mentioned in the patch
description.

The Kconfig help entry just talks about "hardware acceleration",
which can mean any trick supported by the hardware.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH 1/6] dma-buf: consolidate dma_fence subclass checking

2022-02-04 Thread Thomas Hellström

On Fri, 2022-02-04 at 11:04 +0100, Christian König wrote:
> Consolidate the wrapper functions to check for dma_fence
> subclasses in the dma_fence header.
> 
> This makes it easier to document and also check the different
> requirements for fence containers in the subclasses.
> 
> Signed-off-by: Christian König 

I'd probably still opt for a fence ops is_container member, but won't
insist.

Reviewed-by: Thomas Hellström 


> ---
>  include/linux/dma-fence-array.h | 15 +
>  include/linux/dma-fence-chain.h |  3 +--
>  include/linux/dma-fence.h   | 38
> +
>  3 files changed, 40 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/dma-fence-array.h b/include/linux/dma-
> fence-array.h
> index 303dd712220f..fec374f69e12 100644
> --- a/include/linux/dma-fence-array.h
> +++ b/include/linux/dma-fence-array.h
> @@ -45,19 +45,6 @@ struct dma_fence_array {
> struct irq_work work;
>  };
>  
> -extern const struct dma_fence_ops dma_fence_array_ops;
> -
> -/**
> - * dma_fence_is_array - check if a fence is from the array subsclass
> - * @fence: fence to test
> - *
> - * Return true if it is a dma_fence_array and false otherwise.
> - */
> -static inline bool dma_fence_is_array(struct dma_fence *fence)
> -{
> -   return fence->ops == &dma_fence_array_ops;
> -}
> -
>  /**
>   * to_dma_fence_array - cast a fence to a dma_fence_array
>   * @fence: fence to cast to a dma_fence_array
> @@ -68,7 +55,7 @@ static inline bool dma_fence_is_array(struct
> dma_fence *fence)
>  static inline struct dma_fence_array *
>  to_dma_fence_array(struct dma_fence *fence)
>  {
> -   if (fence->ops != &dma_fence_array_ops)
> +   if (!fence || !dma_fence_is_array(fence))
> return NULL;
>  
> return container_of(fence, struct dma_fence_array, base);
> diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma-
> fence-chain.h
> index 54fe3443fd2c..ee906b659694 100644
> --- a/include/linux/dma-fence-chain.h
> +++ b/include/linux/dma-fence-chain.h
> @@ -49,7 +49,6 @@ struct dma_fence_chain {
> spinlock_t lock;
>  };
>  
> -extern const struct dma_fence_ops dma_fence_chain_ops;
>  
>  /**
>   * to_dma_fence_chain - cast a fence to a dma_fence_chain
> @@ -61,7 +60,7 @@ extern const struct dma_fence_ops
> dma_fence_chain_ops;
>  static inline struct dma_fence_chain *
>  to_dma_fence_chain(struct dma_fence *fence)
>  {
> -   if (!fence || fence->ops != &dma_fence_chain_ops)
> +   if (!fence || !dma_fence_is_chain(fence))
> return NULL;
>  
> return container_of(fence, struct dma_fence_chain, base);
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 1ea691753bd3..775cdc0b4f24 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -587,4 +587,42 @@ struct dma_fence *dma_fence_get_stub(void);
>  struct dma_fence *dma_fence_allocate_private_stub(void);
>  u64 dma_fence_context_alloc(unsigned num);
>  
> +extern const struct dma_fence_ops dma_fence_array_ops;
> +extern const struct dma_fence_ops dma_fence_chain_ops;
> +
> +/**
> + * dma_fence_is_array - check if a fence is from the array subclass
> + * @fence: the fence to test
> + *
> + * Return true if it is a dma_fence_array and false otherwise.
> + */
> +static inline bool dma_fence_is_array(struct dma_fence *fence)
> +{
> +   return fence->ops == &dma_fence_array_ops;
> +}
> +
> +/**
> + * dma_fence_is_chain - check if a fence is from the chain subclass
> + * @fence: the fence to test
> + *
> + * Return true if it is a dma_fence_chain and false otherwise.
> + */
> +static inline bool dma_fence_is_chain(struct dma_fence *fence)
> +{
> +   return fence->ops == &dma_fence_chain_ops;
> +}
> +
> +/**
> + * dma_fence_is_container - check if a fence is a container for
> other fences
> + * @fence: the fence to test
> + *
> + * Return true if this fence is a container for other fences, false
> otherwise.
> + * This is important since we can't build up large fence structure
> or otherwise
> + * we run into recursion during operation on those fences.
> + */
> +static inline bool dma_fence_is_container(struct dma_fence *fence)
> +{
> +   return dma_fence_is_array(fence) ||
> dma_fence_is_chain(fence);
> +}
> +
>  #endif /* __LINUX_DMA_FENCE_H */

Re: [PATCH 5/6] dma-buf: add dma_fence_chain_contained helper

2022-02-04 Thread Thomas Hellström

On Fri, 2022-02-04 at 11:04 +0100, Christian König wrote:
> It's a reoccurring pattern that we need to extract the fence
> from a dma_fence_chain object. Add a helper for this.
> 
> Signed-off-by: Christian König 

I thought I'd reviewed this one already, but in case I didn't

Reviewed-by: Thomas Hellström 


> ---
>  drivers/dma-buf/dma-fence-chain.c |  6 ++
>  include/linux/dma-fence-chain.h   | 15 +++
>  2 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-
> fence-chain.c
> index 084c6927b735..06f8ef97c6e8 100644
> --- a/drivers/dma-buf/dma-fence-chain.c
> +++ b/drivers/dma-buf/dma-fence-chain.c
> @@ -148,8 +148,7 @@ static bool
> dma_fence_chain_enable_signaling(struct dma_fence *fence)
>  
> dma_fence_get(&head->base);
> dma_fence_chain_for_each(fence, &head->base) {
> -   struct dma_fence_chain *chain =
> to_dma_fence_chain(fence);
> -   struct dma_fence *f = chain ? chain->fence : fence;
> +   struct dma_fence *f =
> dma_fence_chain_contained(fence);
>  
> dma_fence_get(f);
> if (!dma_fence_add_callback(f, &head->cb,
> dma_fence_chain_cb)) {
> @@ -165,8 +164,7 @@ static bool
> dma_fence_chain_enable_signaling(struct dma_fence *fence)
>  static bool dma_fence_chain_signaled(struct dma_fence *fence)
>  {
> dma_fence_chain_for_each(fence, fence) {
> -   struct dma_fence_chain *chain =
> to_dma_fence_chain(fence);
> -   struct dma_fence *f = chain ? chain->fence : fence;
> +   struct dma_fence *f =
> dma_fence_chain_contained(fence);
>  
> if (!dma_fence_is_signaled(f)) {
> dma_fence_put(fence);
> diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma-
> fence-chain.h
> index ee906b659694..10d51bcdf7b7 100644
> --- a/include/linux/dma-fence-chain.h
> +++ b/include/linux/dma-fence-chain.h
> @@ -66,6 +66,21 @@ to_dma_fence_chain(struct dma_fence *fence)
> return container_of(fence, struct dma_fence_chain, base);
>  }
>  
> +/**
> + * dma_fence_chain_contained - return the contained fence
> + * @fence: the fence to test
> + *
> + * If the fence is a dma_fence_chain the function returns the fence
> contained
> + * inside the chain object, otherwise it returns the fence itself.
> + */
> +static inline struct dma_fence *
> +dma_fence_chain_contained(struct dma_fence *fence)
> +{
> +   struct dma_fence_chain *chain = to_dma_fence_chain(fence);
> +
> +   return chain ? chain->fence : fence;
> +}
> +
>  /**
>   * dma_fence_chain_alloc
>   *

Re: Add warning for nesting dma_fence containers

2022-02-04 Thread Thomas Hellström

On Fri, 2022-02-04 at 11:04 +0100, Christian König wrote:
> Hi everyone,
> 
> Since some operations can then lead to recursive handling nesting
> dma_fence containers into each other is only allowed under some
> restrictions.
> 
> dma_fence_array containers can be attached to dma_fence_chain
> containers and dma_fence_chain containers by chaining them together.
> 
> In all other cases the individual fences should be extracted with
> the appropriate iterators and added to the new containers
> individually.
> 
> I've separated the i915 cleanup from this change since it is
> generally a different functionality and the build bots complained
> about some issues with those patches.
> 
> Most patches are already reviewd, but especially the first one still
> needs an rb tag.
> 
> Please review and comment,

I see you dropped the i915 patch (probably due to lack of reviews?),
Got distracted with other things, but I'll see if I can resurrect that
and get it reviewed and merged.

Thanks, 
Thomas


> Christian.
> 
>

Re: [PATCH v11 5/5] drm/amdgpu: add drm buddy support to amdgpu

2022-02-04 Thread Arunpravin




On 28/01/22 7:48 pm, Matthew Auld wrote:
> On Thu, 27 Jan 2022 at 14:11, Arunpravin
>  wrote:
>>
>> - Remove drm_mm references and replace with drm buddy functionalities
>> - Add res cursor support for drm buddy
>>
>> v2(Matthew Auld):
>>   - replace spinlock with mutex as we call kmem_cache_zalloc
>> (..., GFP_KERNEL) in drm_buddy_alloc() function
>>
>>   - lock drm_buddy_block_trim() function as it calls
>> mark_free/mark_split are all globally visible
>>
>> v3(Matthew Auld):
>>   - remove trim method error handling as we address the failure case
>> at drm_buddy_block_trim() function
>>
>> v4:
>>   - fix warnings reported by kernel test robot 
>>
>> v5:
>>   - fix merge conflict issue
>>
>> v6:
>>   - fix warnings reported by kernel test robot 
>>
>> Signed-off-by: Arunpravin 
>> ---
>>  drivers/gpu/drm/Kconfig   |   1 +
>>  .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   7 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 259 ++
>>  4 files changed, 231 insertions(+), 133 deletions(-)
> 
> 
> 
>>
>> -/**
>> - * amdgpu_vram_mgr_virt_start - update virtual start address
>> - *
>> - * @mem: ttm_resource to update
>> - * @node: just allocated node
>> - *
>> - * Calculate a virtual BO start address to easily check if everything is CPU
>> - * accessible.
>> - */
>> -static void amdgpu_vram_mgr_virt_start(struct ttm_resource *mem,
>> -  struct drm_mm_node *node)
>> -{
>> -   unsigned long start;
>> -
>> -   start = node->start + node->size;
>> -   if (start > mem->num_pages)
>> -   start -= mem->num_pages;
>> -   else
>> -   start = 0;
>> -   mem->start = max(mem->start, start);
>> -}
>> -
>>  /**
>>   * amdgpu_vram_mgr_new - allocate new ranges
>>   *
>> @@ -366,13 +357,13 @@ static int amdgpu_vram_mgr_new(struct 
>> ttm_resource_manager *man,
>>const struct ttm_place *place,
>>struct ttm_resource **res)
>>  {
>> -   unsigned long lpfn, num_nodes, pages_per_node, pages_left, pages;
>> +   unsigned long lpfn, pages_per_node, pages_left, pages, n_pages;
>> +   u64 vis_usage = 0, mem_bytes, max_bytes, min_page_size;
>> struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
>> struct amdgpu_device *adev = to_amdgpu_device(mgr);
>> -   uint64_t vis_usage = 0, mem_bytes, max_bytes;
>> -   struct ttm_range_mgr_node *node;
>> -   struct drm_mm *mm = &mgr->mm;
>> -   enum drm_mm_insert_mode mode;
>> +   struct amdgpu_vram_mgr_node *node;
>> +   struct drm_buddy *mm = &mgr->mm;
>> +   struct drm_buddy_block *block;
>> unsigned i;
>> int r;
>>
>> @@ -391,10 +382,9 @@ static int amdgpu_vram_mgr_new(struct 
>> ttm_resource_manager *man,
>> goto error_sub;
>> }
>>
>> -   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
>> +   if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
>> pages_per_node = ~0ul;
>> -   num_nodes = 1;
>> -   } else {
>> +   else {
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> pages_per_node = HPAGE_PMD_NR;
>>  #else
>> @@ -403,11 +393,9 @@ static int amdgpu_vram_mgr_new(struct 
>> ttm_resource_manager *man,
>>  #endif
>> pages_per_node = max_t(uint32_t, pages_per_node,
>>tbo->page_alignment);
>> -   num_nodes = DIV_ROUND_UP_ULL(PFN_UP(mem_bytes), 
>> pages_per_node);
>> }
>>
>> -   node = kvmalloc(struct_size(node, mm_nodes, num_nodes),
>> -   GFP_KERNEL | __GFP_ZERO);
>> +   node = kzalloc(sizeof(*node), GFP_KERNEL);
>> if (!node) {
>> r = -ENOMEM;
>> goto error_sub;
>> @@ -415,9 +403,17 @@ static int amdgpu_vram_mgr_new(struct 
>> ttm_resource_manager *man,
>>
>> ttm_resource_init(tbo, place, &node->base);
>>
>> -   mode = DRM_MM_INSERT_BEST;
>> +   INIT_LIST_HEAD(&node->blocks);
>> +
>> if (place->flags & TTM_PL_FLAG_TOPDOWN)
>> -   mode = DRM_MM_INSERT_HIGH;
>> +   node->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
>> +
>> +   if (place->fpfn || lpfn != man->size)
>> +   /* Allocate blocks in desired range */
>> +   node->flags |= DRM_BUDDY_RANGE_ALLOCATION;
>> +
>> +   min_page_size = mgr->default_page_size;
>> +   BUG_ON(min_page_size < mm->chunk_size);
>>
>> pages_left = node->base.num_pages;
>>
>> @@ -425,36 +421,61 @@ static int amdgpu_vram_mgr_new(struct 
>> ttm_resource_manager *man,
>> pages = min(pages_left, 2UL << (30 - PAGE_SHIFT));
>>
>> i = 0;
>> -   spin_lock(&mgr->lock);
>> while (pages_left) {
>> -   uint32_t alignment = tbo->page_alignment;
>> -
>> if (pages >= pages_per_node)
>> -   align

Re: [PATCH v2 5/8] drm/i915/dp: rewrite DP 2.0 128b/132b link training based on errata

2022-02-04 Thread Ville Syrjälä

On Thu, Feb 03, 2022 at 11:03:54AM +0200, Jani Nikula wrote:
> The DP 2.0 errata completely overhauls the 128b/132b link training, with
> no provisions for backward compatibility with the original DP 2.0
> specification.
> 
> The changes are too intrusive to consider reusing the same code for both
> 8b/10b and 128b/132b, mainly because the LTTPR channel equalisation is
> done concurrently instead of serialized.
> 
> NOTES:
> 
> * It's a bit unclear when to wait for DP_INTERLANE_ALIGN_DONE and
>   per-lane DP_LANE_SYMBOL_LOCKED. Figure xx4 in the SCR implies the
>   LANEx_CHANNEL_EQ_DONE sequence may end with either 0x77,0x77,0x85 *or*
>   0x33,0x33,0x84 (for four lane configuration in DPCD 0x202..0x204)
>   i.e. without the above bits set. Text elsewhere seems contradictory or
>   incomplete.
> 
> * We read entire link status (6 bytes) everywhere instead of individual
>   DPCD addresses.
> 
> * There are some subtle ambiguities or contradictions in the order of
>   some DPCD access and TPS signal enables/disables. It's also not clear
>   whether these are significant.
> 
> v2:
> - Always try one last time after timeouts to avoid races (Ville)
> - Extend timeout to cover the entire LANEx_EQ_DONE sequence (Ville)
> - Also check for eq interlane align done in LANEx_CDS_DONE Sequence (Ville)
> - Check for Intra-hop status before link training
> 
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Signed-off-by: Jani Nikula 
> ---
>  .../drm/i915/display/intel_dp_link_training.c | 279 +-
>  1 file changed, 278 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c 
> b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> index 4e507aa75a03..cc2b82d9114c 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> @@ -1102,6 +1102,277 @@ intel_dp_link_train_all_phys(struct intel_dp 
> *intel_dp,
>   return ret;
>  }
>  
> +
> +/*
> + * 128b/132b DP LANEx_EQ_DONE Sequence (DP 2.0 E11 3.5.2.16.1)
> + */
> +static bool
> +intel_dp_128b132b_lane_eq(struct intel_dp *intel_dp,
> +   const struct intel_crtc_state *crtc_state)
> +{
> + struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
> + struct drm_i915_private *i915 = to_i915(encoder->base.dev);
> + u8 link_status[DP_LINK_STATUS_SIZE];
> + int delay_us;
> + int try, max_tries = 20;
> + unsigned long deadline;
> + bool timeout = false;
> +
> + /*
> +  * Reset signal levels. Start transmitting 128b/132b TPS1.
> +  *
> +  * Put DPRX and LTTPRs (if any) into intra-hop AUX mode by writing TPS1
> +  * in DP_TRAINING_PATTERN_SET.
> +  */
> + if (!intel_dp_reset_link_train(intel_dp, crtc_state, DP_PHY_DPRX,
> +DP_TRAINING_PATTERN_1)) {
> + drm_err(&i915->drm,
> + "[ENCODER:%d:%s] Failed to start 128b/132b TPS1\n",
> + encoder->base.base.id, encoder->base.name);
> + return false;
> + }
> +
> + delay_us = drm_dp_128b132b_read_aux_rd_interval(&intel_dp->aux);
> +
> + /* Read the initial TX FFE settings. */
> + if (drm_dp_dpcd_read_link_status(&intel_dp->aux, link_status) < 0) {
> + drm_err(&i915->drm,
> + "[ENCODER:%d:%s] Failed to read TX FFE presets\n",
> + encoder->base.base.id, encoder->base.name);
> + return false;
> + }
> +
> + /* Update signal levels and training set as requested. */
> + intel_dp_get_adjust_train(intel_dp, crtc_state, DP_PHY_DPRX, 
> link_status);
> + if (!intel_dp_update_link_train(intel_dp, crtc_state, DP_PHY_DPRX)) {
> + drm_err(&i915->drm,
> + "[ENCODER:%d:%s] Failed to set initial TX FFE 
> settings\n",
> + encoder->base.base.id, encoder->base.name);
> + return false;
> + }
> +
> + /* Start transmitting 128b/132b TPS2. */
> + if (!intel_dp_set_link_train(intel_dp, crtc_state, DP_PHY_DPRX,
> +  DP_TRAINING_PATTERN_2)) {
> + drm_err(&i915->drm,
> + "[ENCODER:%d:%s] Failed to start 128b/132b TPS2\n",
> + encoder->base.base.id, encoder->base.name);
> + return false;
> + }
> +
> + /* Time budget for the LANEx_EQ_DONE Sequence */
> + deadline = jiffies + msecs_to_jiffies(400);

Didn't we have a msecs_to_jiffies_timeout() that adds an extra
jiffy to make sure we don't bail too early?

> +
> + for (try = 0; try < max_tries; try++) {
> + usleep_range(delay_us, 2 * delay_us);
> +
> + /*
> +  * The delay may get updated. The transmitter shall read the
> +  * delay before link status during link training.
> +  */
> + delay_us = drm_dp_128b132b_read_aux_rd_interval(&intel_dp->aux);
> +
> +

Re: [PATCH] HPE BMC GXP SUPPORT

2022-02-04 Thread Russell King (Oracle)

Hi,

On Wed, Feb 02, 2022 at 10:52:50AM -0600, nick.hawk...@hpe.com wrote:
> diff --git a/arch/arm/mach-hpe/Makefile b/arch/arm/mach-hpe/Makefile
> new file mode 100644
> index ..8b0a91234df4
> --- /dev/null
> +++ b/arch/arm/mach-hpe/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_ARCH_HPE_GXP) += gxp.o
> diff --git a/arch/arm/mach-hpe/gxp.c b/arch/arm/mach-hpe/gxp.c
> new file mode 100644
> index ..a37838247948
> --- /dev/null
> +++ b/arch/arm/mach-hpe/gxp.c
> @@ -0,0 +1,62 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2022 Hewlett-Packard Enterprise Development Company, L.P.
> + *
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

It's normal to list all linux/ includes before asm/ includes. Please
rearrange.

> +
> +#define IOP_REGS_PHYS_BASE 0xc000
> +#define IOP_REGS_VIRT_BASE 0xf000
> +#define IOP_REGS_SIZE (240*SZ_1M)
> +
> +#define IOP_EHCI_USBCMD 0x0efe0010
> +
> +static struct map_desc gxp_io_desc[] __initdata = {
> + {
> + .virtual= (unsigned long)IOP_REGS_VIRT_BASE,
> + .pfn= __phys_to_pfn(IOP_REGS_PHYS_BASE),
> + .length = IOP_REGS_SIZE,
> + .type   = MT_DEVICE,

If you keep this, please indent the above four lines by one more tab.

> + },
> +};
> +
> +void __init gxp_map_io(void)
> +{
> + iotable_init(gxp_io_desc, ARRAY_SIZE(gxp_io_desc));
> +}
> +
> +static void __init gxp_dt_init(void)
> +{
> + /*reset EHCI host controller for clear start*/
> + __raw_writel(0x00080002,
> + (void __iomem *)(IOP_REGS_VIRT_BASE + IOP_EHCI_USBCMD));

Please consider making IOP_REGS_VIRT_BASE a 'void __iomem' pointer, it
being a _virtual_ iomem address. This should save you needing repeated
casts except for the initialiser above.

> + of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
> +}
> +
> +static void gxp_restart(enum reboot_mode mode, const char *cmd)
> +{
> + __raw_writel(1, (void __iomem *) IOP_REGS_VIRT_BASE);
> +}
> +
> +static const char * const gxp_board_dt_compat[] = {
> + "HPE,GXP",
> + NULL,
> +};
> +
> +DT_MACHINE_START(GXP_DT, "HPE GXP")
> + .init_machine   = gxp_dt_init,
> + .map_io = gxp_map_io,
> + .restart= gxp_restart,
> + .dt_compat  = gxp_board_dt_compat,
> +MACHINE_END
> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> index cfb8ea0df3b1..5916dade7608 100644
> --- a/drivers/clocksource/Kconfig
> +++ b/drivers/clocksource/Kconfig
> @@ -617,6 +617,14 @@ config CLKSRC_ST_LPC
> Enable this option to use the Low Power controller timer
> as clocksource.
>  
> +config GXP_TIMER
> + bool "GXP timer driver"
> + depends on ARCH_HPE
> + default y
> + help
> +   Provides a driver for the timer control found on HPE
> +   GXP SOCs. This is required for all GXP SOCs.
> +
>  config ATCPIT100_TIMER
>   bool "ATCPIT100 timer driver"
>   depends on NDS32 || COMPILE_TEST
> diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile
> index fa5f624eadb6..ffca09ec34de 100644
> --- a/drivers/clocksource/Makefile
> +++ b/drivers/clocksource/Makefile
> @@ -89,3 +89,4 @@ obj-$(CONFIG_GX6605S_TIMER) += timer-gx6605s.o
>  obj-$(CONFIG_HYPERV_TIMER)   += hyperv_timer.o
>  obj-$(CONFIG_MICROCHIP_PIT64B)   += timer-microchip-pit64b.o
>  obj-$(CONFIG_MSC313E_TIMER)  += timer-msc313e.o
> +obj-$(CONFIG_GXP_TIMER)  += gxp_timer.o
> diff --git a/drivers/clocksource/gxp_timer.c b/drivers/clocksource/gxp_timer.c
> new file mode 100644
> index ..e3c617036e0d
> --- /dev/null
> +++ b/drivers/clocksource/gxp_timer.c
> @@ -0,0 +1,158 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2022 Hewlett-Packard Enterprise Development Company, L.P.
> + *
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 

Why do you need asm/irq.h ?

> +
> +#define TIMER0_FREQ 100
> +#define TIMER1_FREQ 100
> +
> +#define MASK_TCS_ENABLE  0x01
> +#define MASK_TCS_PERIOD  0x02
> +#define MASK_TCS_RELOAD  0x04
> +#define MASK_TCS_TC  0x80
> +
> +struct gxp_timer {
> + void __iomem *counter;
> + void __iomem *control;
> + struct clock_event_device evt;
> +};
> +
> +static void __iomem *system_clock __read_mostly;
> +
> +static u64 notrace gxp_sche

Re: [PATCH] HPE BMC GXP SUPPORT

2022-02-04 Thread Joe Perches

On Fri, 2022-02-04 at 12:05 +, Russell King (Oracle) wrote:
> On Wed, Feb 02, 2022 at 10:52:50AM -0600, nick.hawk...@hpe.com wrote:
[]
> > diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile
[]
> > +static irqreturn_t gxp_time_interrupt(int irq, void *dev_id)
> > +{
> > +   struct gxp_timer *timer = dev_id;
> > +   void (*event_handler)(struct clock_event_device *timer);
> > +
> > +
> 
> One too many blank lines.
> 
> > +   if (readb_relaxed(timer->control) & MASK_TCS_TC) {
> > +   writeb_relaxed(MASK_TCS_TC, timer->control);
> > +
> > +   event_handler = READ_ONCE(timer->evt.event_handler);
> > +   if (event_handler)
> > +   event_handler(&timer->evt);
> > +   return IRQ_HANDLED;
> > +   } else {
> > +   return IRQ_NONE;
> > +   }
> > +}

It's also less indented code and perhaps clearer to reverse the test

if (!readb_relaxed(timer->control) & MASK_TCS_TC)
return IRQ_NONE;

writeb_relaxed(MASK_TCS_TC, timer->control);

event_handler = READ_ONCE(timer->evt.event_handler);
if (event_handler)
event_handler(&timer->evt);

return IRQ_HANDLED;

Re: [PATCH] HPE BMC GXP SUPPORT

2022-02-04 Thread Russell King (Oracle)

On Fri, Feb 04, 2022 at 04:18:24AM -0800, Joe Perches wrote:
> On Fri, 2022-02-04 at 12:05 +, Russell King (Oracle) wrote:
> > On Wed, Feb 02, 2022 at 10:52:50AM -0600, nick.hawk...@hpe.com wrote:
> > > + if (readb_relaxed(timer->control) & MASK_TCS_TC) {
> > > + writeb_relaxed(MASK_TCS_TC, timer->control);
> > > +
> > > + event_handler = READ_ONCE(timer->evt.event_handler);
> > > + if (event_handler)
> > > + event_handler(&timer->evt);
> > > + return IRQ_HANDLED;
> > > + } else {
> > > + return IRQ_NONE;
> > > + }
> > > +}
> 
> It's also less indented code and perhaps clearer to reverse the test
> 
>   if (!readb_relaxed(timer->control) & MASK_TCS_TC)

This will need to be:

if (!(readb_relaxed(timer->control) & MASK_TCS_TC))

>   return IRQ_NONE;
> 
>   writeb_relaxed(MASK_TCS_TC, timer->control);
> 
>   event_handler = READ_ONCE(timer->evt.event_handler);
>   if (event_handler)
>   event_handler(&timer->evt);
> 
>   return IRQ_HANDLED;
> 
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: Add warning for nesting dma_fence containers

2022-02-04 Thread Christian König





Am 04.02.22 um 11:40 schrieb Thomas Hellström:

On Fri, 2022-02-04 at 11:04 +0100, Christian König wrote:

Hi everyone,

Since some operations can then lead to recursive handling nesting
dma_fence containers into each other is only allowed under some
restrictions.

dma_fence_array containers can be attached to dma_fence_chain
containers and dma_fence_chain containers by chaining them together.

In all other cases the individual fences should be extracted with
the appropriate iterators and added to the new containers
individually.

I've separated the i915 cleanup from this change since it is
generally a different functionality and the build bots complained
about some issues with those patches.

Most patches are already reviewd, but especially the first one still
needs an rb tag.

Please review and comment,

I see you dropped the i915 patch (probably due to lack of reviews?),
Got distracted with other things, but I'll see if I can resurrect that
and get it reviewed and merged.


I was about to send out the i915 patch when that one here is merged.

The CI systems yielded some strange error with that one and I wanted to 
double check what's that all about.


Regards,
Christian.



Thanks,
Thomas



Christian.

Re: [PATCH v11 5/5] drm/amdgpu: add drm buddy support to amdgpu

2022-02-04 Thread Christian König


Am 04.02.22 um 12:22 schrieb Arunpravin:

On 28/01/22 7:48 pm, Matthew Auld wrote:

On Thu, 27 Jan 2022 at 14:11, Arunpravin
 wrote:

- Remove drm_mm references and replace with drm buddy functionalities
- Add res cursor support for drm buddy

v2(Matthew Auld):
   - replace spinlock with mutex as we call kmem_cache_zalloc
 (..., GFP_KERNEL) in drm_buddy_alloc() function

   - lock drm_buddy_block_trim() function as it calls
 mark_free/mark_split are all globally visible

v3(Matthew Auld):
   - remove trim method error handling as we address the failure case
 at drm_buddy_block_trim() function

v4:
   - fix warnings reported by kernel test robot 

v5:
   - fix merge conflict issue

v6:
   - fix warnings reported by kernel test robot 

Signed-off-by: Arunpravin 
---
  drivers/gpu/drm/Kconfig   |   1 +
  .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   7 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 259 ++
  4 files changed, 231 insertions(+), 133 deletions(-)




-/**
- * amdgpu_vram_mgr_virt_start - update virtual start address
- *
- * @mem: ttm_resource to update
- * @node: just allocated node
- *
- * Calculate a virtual BO start address to easily check if everything is CPU
- * accessible.
- */
-static void amdgpu_vram_mgr_virt_start(struct ttm_resource *mem,
-  struct drm_mm_node *node)
-{
-   unsigned long start;
-
-   start = node->start + node->size;
-   if (start > mem->num_pages)
-   start -= mem->num_pages;
-   else
-   start = 0;
-   mem->start = max(mem->start, start);
-}
-
  /**
   * amdgpu_vram_mgr_new - allocate new ranges
   *
@@ -366,13 +357,13 @@ static int amdgpu_vram_mgr_new(struct 
ttm_resource_manager *man,
const struct ttm_place *place,
struct ttm_resource **res)
  {
-   unsigned long lpfn, num_nodes, pages_per_node, pages_left, pages;
+   unsigned long lpfn, pages_per_node, pages_left, pages, n_pages;
+   u64 vis_usage = 0, mem_bytes, max_bytes, min_page_size;
 struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
 struct amdgpu_device *adev = to_amdgpu_device(mgr);
-   uint64_t vis_usage = 0, mem_bytes, max_bytes;
-   struct ttm_range_mgr_node *node;
-   struct drm_mm *mm = &mgr->mm;
-   enum drm_mm_insert_mode mode;
+   struct amdgpu_vram_mgr_node *node;
+   struct drm_buddy *mm = &mgr->mm;
+   struct drm_buddy_block *block;
 unsigned i;
 int r;

@@ -391,10 +382,9 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
 goto error_sub;
 }

-   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+   if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
 pages_per_node = ~0ul;
-   num_nodes = 1;
-   } else {
+   else {
  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 pages_per_node = HPAGE_PMD_NR;
  #else
@@ -403,11 +393,9 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
  #endif
 pages_per_node = max_t(uint32_t, pages_per_node,
tbo->page_alignment);
-   num_nodes = DIV_ROUND_UP_ULL(PFN_UP(mem_bytes), pages_per_node);
 }

-   node = kvmalloc(struct_size(node, mm_nodes, num_nodes),
-   GFP_KERNEL | __GFP_ZERO);
+   node = kzalloc(sizeof(*node), GFP_KERNEL);
 if (!node) {
 r = -ENOMEM;
 goto error_sub;
@@ -415,9 +403,17 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,

 ttm_resource_init(tbo, place, &node->base);

-   mode = DRM_MM_INSERT_BEST;
+   INIT_LIST_HEAD(&node->blocks);
+
 if (place->flags & TTM_PL_FLAG_TOPDOWN)
-   mode = DRM_MM_INSERT_HIGH;
+   node->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
+
+   if (place->fpfn || lpfn != man->size)
+   /* Allocate blocks in desired range */
+   node->flags |= DRM_BUDDY_RANGE_ALLOCATION;
+
+   min_page_size = mgr->default_page_size;
+   BUG_ON(min_page_size < mm->chunk_size);

 pages_left = node->base.num_pages;

@@ -425,36 +421,61 @@ static int amdgpu_vram_mgr_new(struct 
ttm_resource_manager *man,
 pages = min(pages_left, 2UL << (30 - PAGE_SHIFT));

 i = 0;
-   spin_lock(&mgr->lock);
 while (pages_left) {
-   uint32_t alignment = tbo->page_alignment;
-
 if (pages >= pages_per_node)
-   alignment = pages_per_node;
-
-   r = drm_mm_insert_node_in_range(mm, &node->mm_nodes[i], pages,
-   alignment, 0, place->fpfn,
-   lpfn, mode);
-   if (unlikely(r)) {
-   if (pages > pages_per_no

Re: [PATCH] HPE BMC GXP SUPPORT

2022-02-04 Thread Andrew Lunn

> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> 
> It's normal to list all linux/ includes before asm/ includes. Please
> rearrange.

Hi Nick

Since you are new to the kernel, please let me point out, you should
consider Russell comments for all your code, not just this one file.
Many of the comments are generic to code anywhere in the kernel. So it
would be good to fix the same issues in the rest of your code base
before submitting them.

I would also suggest that when you start submitting drivers, submit
just one or two to start with. You will learn a lot from the feedback
you get, and you can apply what you have learnt to the rest of your
code before you post them for review.

I would also suggest you spend 30 minutes a day just reading comments
other patches receive. You can also learn a lot that way, see if the
comments apply to your own code. You will also learn about processes
this way, which can be just as challenging to get right as code.

 Andrew

[PATCH v2 0/4] drm/tiny: Add driver for Solomon SSD1307 OLED displays

2022-02-04 Thread Javier Martinez Canillas

This patch series adds a DRM driver for the Solomon OLED SSD1305, SSD1306,
SSD1307 and SSD1309 displays. It is a port of the ssd1307fb fbdev driver.

Using the DRM fb emulation, all the tests from Geert Uytterhoeven's fbtest
(https://git.kernel.org/pub/scm/linux/kernel/git/geert/fbtest.git) passes:

./fbtest -f /dev/fb1
Using drawops cfb32 (32 bpp packed pixels)
Available visuals:
  Monochrome
  Grayscale 256
  Truecolor 8:8:8:0
Using visops truecolor
Running all tests
test001: PASSED
test002: PASSED
test003: PASSED
test004: PASSED
test005: PASSED
test006: PASSED
test008: PASSED
Screen size too small for this test
test010: PASSED
Benchmarking... 10x10 squares: 414.41 Mpixels/s
Benchmarking... 20x20 squares: 858.31 Mpixels/s
Benchmarking... 50x50 squares: 1586.33 Mpixels/s
test012: PASSED
Benchmarking... R5 circles: 234.68 Mpixels/s
Benchmarking... R10 circles: 498.24 Mpixels/s
Benchmarking... R25 circles: 942.34 Mpixels/s
test013: PASSED

This is a v2 that addresses all the issues pointed in v1, thanks a lot
to everyone that gave me feedback and reviews. I tried to not miss any
comment, but there were a lot so forgive me if something is not there.

Patch #1 adds two new helpers, drm_fb_gray8_to_mono_reversed() to convert
from grayscale to monochrome and a drm_fb_xrgb_to_mono_reversed() to
convert from XR24 to monochrome. The latter internally use thes former.

Patch #2 adds the driver. The name ssd130x was used instead of ssd1307fb
to denote that this driver is not only for SSD1307, but also for other
displays from the same chip family.

Patch #3 just adds a MAINTAINERS entry for the DRM driver and patch #4
adds myself as a co-maintainer of the existing Device Tree binding for
ssd1307fb, since the same is shared between the fbdev and DRM drivers.

Best regards,
Javier

Changes in v2:
- Drop patch that was adding a DRM_MODE_CONNECTOR_I2C type.
- Invert order of backlight {en,dis}able and display {on,off} (Sam Ravnborg)
- Don't clear the screen and turn on display on probe (Sam Ravnborg)
- Use backlight_get_brightness() macro to get BL brightness (Sam Ravnborg)
- Use dev managed version of devm_backlight_device_register() (Sam Ravnborg)
- Use dev_name(dev) for backlight name instead of an array (Sam Ravnborg)
- Drop the .get_brightness callback since isn't needed  (Sam Ravnborg)
- Add myself as co-maintainer of the ssd1370fb DT binding (Sam Ravnborg)
- Add Sam Ravnborg's acked-by tag to patch 3/4.
- Rename driver to ssd130x since supports a display family (Thomas Zimmermann)
- Drop the TINY prefix from the Kconfig symbol (Thomas Zimmermann)
- Sort the Kconfig symbol dependencies alphabetically (Thomas Zimmermann)
- Rename struct ssd130x_array to struct ssd130x_i2c_msg (Thomas Zimmermann)
- Rename struct ssd130x_i2c_msg .type member to .cmd (Thomas Zimmermann)
- Use sizeof(*foo) instead of sizeof(struct foo) (Thomas Zimmermann)
- Use struct_size() macro to calculate sizeof(*foo) + len (Thomas Zimmermann)
- Use kcalloc() instead of kmalloc_array() + memset() (Thomas Zimmermann)
- Use shadow plane helpers virtual screen support (Thomas Zimmermann)
- Remove unused goto label in ssd1307_fb_blit_rect() (Thomas Zimmermann)
- Use drm_set_preferred_mode() inset of manually set (Thomas Zimmermann)
- Use shadow plane helpers virtual screen support (Thomas Zimmermann)
- Remove unused goto label in ssd1307_fb_blit_rect() (Thomas Zimmermann)
- Use drm_set_preferred_mode() inset of manually set (Thomas Zimmermann)
- Reorganize code in probe to make it more legible (Thomas Zimmermann)
- ssd130x_write_cmd() uses varargs to simplify I2C code (Thomas Zimmermann)
- Move regulator/pwm init logic to display pipe enable callback.
- Also add a drm_fb_xrgb_to_mono_reversed() helper (Thomas Zimmermann)
- Add a drm_fb_gray8_to_mono_reversed_line() helper (Thomas Zimmermann)

Javier Martinez Canillas (4):
  drm/format-helper: Add drm_fb_{xrgb,gray8}_to_mono_reversed()
  drm/tiny: Add driver for Solomon SSD130X OLED displays
  MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver
  dt-bindings: display: ssd1307fb: Add myself as binding co-maintainer

 .../bindings/display/solomon,ssd1307fb.yaml   |   1 +
 MAINTAINERS   |   7 +
 drivers/gpu/drm/drm_format_helper.c   |  80 ++
 drivers/gpu/drm/tiny/Kconfig  |  12 +
 drivers/gpu/drm/tiny/Makefile |   1 +
 drivers/gpu/drm/tiny/ssd130x.c| 971 ++
 include/drm/drm_format_helper.h   |   7 +
 7 files changed, 1079 insertions(+)
 create mode 100644 drivers/gpu/drm/tiny/ssd130x.c

-- 
2.34.1

[PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888, gray8}_to_mono_reversed()

2022-02-04 Thread Javier Martinez Canillas

Add support to convert XR24 and 8-bit grayscale to reversed monochrome for
drivers that control monochromatic panels, that only have 1 bit per pixel.

The drm_fb_gray8_to_mono_reversed() helper was based on the function that
does the same in the drivers/gpu/drm/tiny/repaper.c driver.

Signed-off-by: Javier Martinez Canillas 
---

(no changes since v1)

 drivers/gpu/drm/drm_format_helper.c | 80 +
 include/drm/drm_format_helper.h |  7 +++
 2 files changed, 87 insertions(+)

diff --git a/drivers/gpu/drm/drm_format_helper.c 
b/drivers/gpu/drm/drm_format_helper.c
index 0f28dd2bdd72..cdce4b7c25d9 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -584,3 +584,83 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int 
dst_pitch, uint32_t dst_for
return -EINVAL;
 }
 EXPORT_SYMBOL(drm_fb_blit_toio);
+
+static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, size_t 
pixels)
+{
+   unsigned int xb, i;
+
+   for (xb = 0; xb < pixels / 8; xb++) {
+   u8 byte = 0x00;
+
+   for (i = 0; i < 8; i++) {
+   int x = xb * 8 + i;
+
+   byte >>= 1;
+   if (src[x] >> 7)
+   byte |= BIT(7);
+   }
+   *dst++ = byte;
+   }
+}
+
+/**
+ * drm_fb_gray8_to_mono_reversed - Convert grayscale to reversed monochrome
+ * @dst: reversed monochrome destination buffer
+ * @dst_pitch: Number of bytes between two consecutive scanlines within dst
+ * @src: 8-bit grayscale source buffer
+ * @clip: Clip rectangle area to copy
+ *
+ * DRM doesn't have native monochrome or grayscale support.
+ * Such drivers can announce the commonly supported XR24 format to userspace
+ * and use drm_fb_xrgb_to_gray8() to convert to grayscale and then this
+ * helper function to convert to the native format.
+ */
+void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const 
void *src,
+  const struct drm_rect *clip)
+{
+
+   size_t height = drm_rect_height(clip);
+   size_t width = drm_rect_width(clip);
+   unsigned int y;
+   const u8 *gray8 = src;
+   u8 *mono = dst;
+
+   if (!dst_pitch)
+   dst_pitch = width;
+
+   for (y = 0; y < height; y++) {
+   drm_fb_gray8_to_mono_reversed_line(mono, gray8, dst_pitch);
+   mono += (dst_pitch / 8);
+   gray8 += dst_pitch;
+   }
+}
+
+/**
+ * drm_fb_xrgb_to_mono_reversed - Convert XRGB to reversed monochrome
+ * @dst: reversed monochrome destination buffer
+ * @dst_pitch: Number of bytes between two consecutive scanlines within dst
+ * @src: XRGB source buffer
+ * @fb: DRM framebuffer
+ * @clip: Clip rectangle area to copy
+ *
+ * DRM doesn't have native monochrome support.
+ * Such drivers can announce the commonly supported XR24 format to userspace
+ * and use this function to convert to the native format.
+ *
+ * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and
+ * then the result is converted from grayscale to reversed monohrome.
+ */
+void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const 
void *src,
+ const struct drm_framebuffer *fb,
+ const struct drm_rect *clip)
+{
+   if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB))
+   return;
+
+   if (!dst_pitch)
+   dst_pitch = drm_rect_width(clip);
+
+   drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip);
+   drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip);
+}
+EXPORT_SYMBOL(drm_fb_xrgb_to_mono_reversed);
diff --git a/include/drm/drm_format_helper.h b/include/drm/drm_format_helper.h
index b30ed5de0a33..85e551a5cbe6 100644
--- a/include/drm/drm_format_helper.h
+++ b/include/drm/drm_format_helper.h
@@ -43,4 +43,11 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int 
dst_pitch, uint32_t dst_for
 const void *vmap, const struct drm_framebuffer *fb,
 const struct drm_rect *rect);
 
+void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const 
void *src,
+  const struct drm_rect *clip);
+
+void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const 
void *src,
+ const struct drm_framebuffer *fb,
+ const struct drm_rect *clip);
+
 #endif /* __LINUX_DRM_FORMAT_HELPER_H */
-- 
2.34.1

[PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver

2022-02-04 Thread Javier Martinez Canillas

To make sure that tools like the get_maintainer.pl script will suggest
to Cc me if patches are posted for this driver.

Also include the Device Tree binding for the old ssd1307fb fbdev driver
since the new DRM driver was made compatible with the existing binding.

Signed-off-by: Javier Martinez Canillas 
Acked-by: Sam Ravnborg 
---

(no changes since v1)

 MAINTAINERS | 7 +++
 drivers/gpu/drm/drm_format_helper.c | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d03ad8da1f36..9061488a4113 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6102,6 +6102,13 @@ T:   git git://anongit.freedesktop.org/drm/drm-misc
 F: Documentation/devicetree/bindings/display/repaper.txt
 F: drivers/gpu/drm/tiny/repaper.c
 
+DRM DRIVER FOR SOLOMON SSD130X OLED DISPLAYS
+M: Javier Martinez Canillas 
+S: Maintained
+T: git git://anongit.freedesktop.org/drm/drm-misc
+F: Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml
+F: drivers/gpu/drm/tiny/ssd130x.c
+
 DRM DRIVER FOR QEMU'S CIRRUS DEVICE
 M: Dave Airlie 
 M: Gerd Hoffmann 
diff --git a/drivers/gpu/drm/drm_format_helper.c 
b/drivers/gpu/drm/drm_format_helper.c
index cdce4b7c25d9..c3c1372fb771 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -661,6 +661,6 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned 
int dst_pitch, const v
dst_pitch = drm_rect_width(clip);
 
drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip);
-   drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip);
+   drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, clip);
 }
 EXPORT_SYMBOL(drm_fb_xrgb_to_mono_reversed);
-- 
2.34.1

[PATCH v2 2/4] drm/tiny: Add driver for Solomon SSD130X OLED displays

2022-02-04 Thread Javier Martinez Canillas

Add a DRM driver for SSD1305, SSD1306, SSD1307 and SSD1309 Solomon OLED
controllers that can be programmed via an I2C interface. This is a port
of the ssd1307fb driver that already supports these devices.

A Device Tree binding is not added because the DRM driver is compatible
with the existing binding for the ssd1307fb driver.

Signed-off-by: Javier Martinez Canillas 
---

(no changes since v1)

 drivers/gpu/drm/tiny/Kconfig   |  12 +
 drivers/gpu/drm/tiny/Makefile  |   1 +
 drivers/gpu/drm/tiny/ssd130x.c | 971 +
 3 files changed, 984 insertions(+)
 create mode 100644 drivers/gpu/drm/tiny/ssd130x.c

diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
index 712e0004e96e..1b6f5aa41d69 100644
--- a/drivers/gpu/drm/tiny/Kconfig
+++ b/drivers/gpu/drm/tiny/Kconfig
@@ -67,6 +67,18 @@ config DRM_SIMPLEDRM
  On x86 BIOS or UEFI systems, you should also select SYSFB_SIMPLEFB
  to use UEFI and VESA framebuffers.
 
+config DRM_SSD130X
+   tristate "DRM support for Solomon SSD130X OLED displays"
+   depends on DRM && I2C
+   select BACKLIGHT_CLASS_DEVICE
+   select DRM_GEM_SHMEM_HELPER
+   select DRM_KMS_HELPER
+   help
+ DRM driver for the SSD1305, SSD1306, SSD1307 and SSD1309 Solomon
+ OLED controllers that can be programmed via an I2C interface.
+
+ If M is selected the module will be called ssd130x.
+
 config TINYDRM_HX8357D
tristate "DRM support for HX8357D display panels"
depends on DRM && SPI
diff --git a/drivers/gpu/drm/tiny/Makefile b/drivers/gpu/drm/tiny/Makefile
index 5d5505d40e7b..18c3557dcb71 100644
--- a/drivers/gpu/drm/tiny/Makefile
+++ b/drivers/gpu/drm/tiny/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_DRM_BOCHS) += bochs.o
 obj-$(CONFIG_DRM_CIRRUS_QEMU)  += cirrus.o
 obj-$(CONFIG_DRM_GM12U320) += gm12u320.o
 obj-$(CONFIG_DRM_SIMPLEDRM)+= simpledrm.o
+obj-$(CONFIG_DRM_SSD130X)  += ssd130x.o
 obj-$(CONFIG_TINYDRM_HX8357D)  += hx8357d.o
 obj-$(CONFIG_TINYDRM_ILI9163)  += ili9163.o
 obj-$(CONFIG_TINYDRM_ILI9225)  += ili9225.o
diff --git a/drivers/gpu/drm/tiny/ssd130x.c b/drivers/gpu/drm/tiny/ssd130x.c
new file mode 100644
index ..b348768529dc
--- /dev/null
+++ b/drivers/gpu/drm/tiny/ssd130x.c
@@ -0,0 +1,971 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * DRM driver for Solomon SSD130X OLED displays
+ *
+ * Copyright 2022 Red Hat Inc.
+ *
+ * Based on drivers/video/fbdev/ssd1307fb.c
+ * Copyright 2012 Free Electrons
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRIVER_NAME"ssd130x"
+#define DRIVER_DESC"DRM driver for Solomon SSD130X OLED displays"
+#define DRIVER_DATE"20220131"
+#define DRIVER_MAJOR   1
+#define DRIVER_MINOR   0
+
+#define SSD130X_DATA   0x40
+#define SSD130X_COMMAND0x80
+
+#define SSD130X_SET_ADDRESS_MODE   0x20
+#define SSD130X_SET_ADDRESS_MODE_HORIZONTAL(0x00)
+#define SSD130X_SET_ADDRESS_MODE_VERTICAL  (0x01)
+#define SSD130X_SET_ADDRESS_MODE_PAGE  (0x02)
+#define SSD130X_SET_COL_RANGE  0x21
+#define SSD130X_SET_PAGE_RANGE 0x22
+#define SSD130X_CONTRAST   0x81
+#define SSD130X_SET_LOOKUP_TABLE   0x91
+#define SSD130X_CHARGE_PUMP0x8d
+#define SSD130X_SEG_REMAP_ON   0xa1
+#define SSD130X_DISPLAY_OFF0xae
+#define SSD130X_SET_MULTIPLEX_RATIO0xa8
+#define SSD130X_DISPLAY_ON 0xaf
+#define SSD130X_START_PAGE_ADDRESS 0xb0
+#define SSD130X_SET_DISPLAY_OFFSET 0xd3
+#define SSD130X_SET_CLOCK_FREQ 0xd5
+#define SSD130X_SET_AREA_COLOR_MODE0xd8
+#define SSD130X_SET_PRECHARGE_PERIOD   0xd9
+#define SSD130X_SET_COM_PINS_CONFIG0xda
+#define SSD130X_SET_VCOMH  0xdb
+
+#define MAX_CONTRAST 255
+
+struct ssd130x_deviceinfo {
+   u32 default_vcomh;
+   u32 default_dclk_div;
+   u32 default_dclk_frq;
+   int need_pwm;
+   int need_chargepump;
+};
+
+struct ssd130x_device {
+   struct drm_device drm;
+   struct drm_simple_display_pipe pipe;
+   struct drm_display_mode mode;
+   struct drm_connector connector;
+   struct i2c_client *client;
+
+   const struct ssd130x_deviceinfo *device_info;
+
+   unsigned area_color_enable : 1;
+   unsigned com_invdir : 1;
+   unsigned com_lrremap : 1;
+   unsigned com_seq : 1;
+   unsigned lookup_table_set : 1;
+   unsigned low_power : 1;
+   unsigned seg_remap : 1;
+   u32 com_offset;
+   u32 contrast;
+   u32 dclk_div;
+

[PATCH v2 4/4] dt-bindings: display: ssd1307fb: Add myself as binding co-maintainer

2022-02-04 Thread Javier Martinez Canillas

The ssd130x DRM driver also makes use of this Device Tree binding to allow
existing users of the fbdev driver to migrate without the need to change
their Device Trees.

Add myself as another maintainer of the binding, to make sure that I will
be on Cc when patches are proposed for it.

Suggested-by: Sam Ravnborg 
Signed-off-by: Javier Martinez Canillas 
---

(no changes since v1)

 Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml 
b/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml
index 2ed2a7d0ca2f..9baafd0c42dd 100644
--- a/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml
+++ b/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml
@@ -8,6 +8,7 @@ title: Solomon SSD1307 OLED Controller Framebuffer
 
 maintainers:
   - Maxime Ripard 
+  - Javier Martinez Canillas 
 
 properties:
   compatible:
-- 
2.34.1

Re: [PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver

2022-02-04 Thread Andy Shevchenko

On Fri, Feb 04, 2022 at 02:43:46PM +0100, Javier Martinez Canillas wrote:
> To make sure that tools like the get_maintainer.pl script will suggest
> to Cc me if patches are posted for this driver.
> 
> Also include the Device Tree binding for the old ssd1307fb fbdev driver
> since the new DRM driver was made compatible with the existing binding.

...

>  drivers/gpu/drm/drm_format_helper.c | 2 +-

Nothing about this in the commit message...

Stray change?

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH] HPE BMC GXP SUPPORT

2022-02-04 Thread Joe Perches

On Fri, 2022-02-04 at 12:31 +, Russell King (Oracle) wrote:
> On Fri, Feb 04, 2022 at 04:18:24AM -0800, Joe Perches wrote:
> > On Fri, 2022-02-04 at 12:05 +, Russell King (Oracle) wrote:
> > > On Wed, Feb 02, 2022 at 10:52:50AM -0600, nick.hawk...@hpe.com wrote:
> > > > +   if (readb_relaxed(timer->control) & MASK_TCS_TC) {
> > > > +   writeb_relaxed(MASK_TCS_TC, timer->control);
> > > > +
> > > > +   event_handler = READ_ONCE(timer->evt.event_handler);
> > > > +   if (event_handler)
> > > > +   event_handler(&timer->evt);
> > > > +   return IRQ_HANDLED;
> > > > +   } else {
> > > > +   return IRQ_NONE;
> > > > +   }
> > > > +}
> > 
> > It's also less indented code and perhaps clearer to reverse the test
> > 
> > if (!readb_relaxed(timer->control) & MASK_TCS_TC)
> 
> This will need to be:
> 
>   if (!(readb_relaxed(timer->control) & MASK_TCS_TC))

right, thanks.

Re: [PATCH v2] drm/bridge: dw-hdmi: use safe format when first in bridge chain

2022-02-04 Thread Robert Foss

On Wed, 19 Jan 2022 at 13:28, Neil Armstrong  wrote:
>
> When the dw-hdmi bridge is in first place of the bridge chain, this
> means there is no way to select an input format of the dw-hdmi HW
> component.
>
> Since introduction of display-connector, negotiation was broken since
> the dw-hdmi negotiation code only worked when the dw-hdmi bridge was
> in last position of the bridge chain or behind another bridge also
> supporting input & output format negotiation.
>
> Commit 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts 
> callbacks")

I think this is the wrong hash. Is 7cd70656d128 the actual hash?

> was introduced to make negotiation work again by making display-connector
> act as a pass-through concerning input & output format negotiation.
>
> But in the case where the dw-hdmi is single in the bridge chain, for
> example on Renesas SoCs, with the display-connector bridge the dw-hdmi
> is no more single, breaking output format.
>
> Reported-by: Biju Das 
> Bisected-by: Kieran Bingham 
> Tested-by: Kieran Bingham 
> Fixes: 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts 
> callbacks").

This hash too.

> Signed-off-by: Neil Armstrong 
> ---
> Changes since v1:
> - Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts
> - Fix typos in commit message
>
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
> b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> index 54d8fdad395f..97cdc61b57f6 100644
> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> @@ -2551,8 +2551,9 @@ static u32 
> *dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge,
> if (!output_fmts)
> return NULL;
>
> -   /* If dw-hdmi is the only bridge, avoid negociating with ourselves */
> -   if (list_is_singular(&bridge->encoder->bridge_chain)) {
> +   /* If dw-hdmi is the first or only bridge, avoid negociating with 
> ourselves */
> +   if (list_is_singular(&bridge->encoder->bridge_chain) ||
> +   list_is_first(&bridge->chain_node, 
> &bridge->encoder->bridge_chain)) {
> *num_output_fmts = 1;
> output_fmts[0] = MEDIA_BUS_FMT_FIXED;
>
> --
> 2.25.1
>

There are two checkstyle issues apart from the above mentioned hash
issues, and I think we can ignore those. With the above mentioned
issue fixed, feel free to add my r-b.

Reviewed-by: Robert Foss

Re: [PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver

2022-02-04 Thread Javier Martinez Canillas

Hello Andy,

On 2/4/22 14:57, Andy Shevchenko wrote:
> On Fri, Feb 04, 2022 at 02:43:46PM +0100, Javier Martinez Canillas wrote:
>> To make sure that tools like the get_maintainer.pl script will suggest
>> to Cc me if patches are posted for this driver.
>>
>> Also include the Device Tree binding for the old ssd1307fb fbdev driver
>> since the new DRM driver was made compatible with the existing binding.
> 
> ...
> 
>>  drivers/gpu/drm/drm_format_helper.c | 2 +-
> 
> Nothing about this in the commit message...
> 
> Stray change?
> 

Sigh, I'm not sure how added that change. Just ignore it, I'll fix it
on v3 or when applying if there isn't another revision of this series.

Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat

Re: [PATCH v2 2/4] drm/tiny: Add driver for Solomon SSD130X OLED displays

2022-02-04 Thread Andy Shevchenko

On Fri, Feb 04, 2022 at 02:43:45PM +0100, Javier Martinez Canillas wrote:
> Add a DRM driver for SSD1305, SSD1306, SSD1307 and SSD1309 Solomon OLED
> controllers that can be programmed via an I2C interface. This is a port
> of the ssd1307fb driver that already supports these devices.
> 
> A Device Tree binding is not added because the DRM driver is compatible
> with the existing binding for the ssd1307fb driver.

...

> +/*
> + * DRM driver for Solomon SSD130X OLED displays
> + *
> + * Copyright 2022 Red Hat Inc.
> + *
> + * Based on drivers/video/fbdev/ssd1307fb.c
> + * Copyright 2012 Free Electrons

> + *

No need for this blank line.

> + */

...

> +struct ssd130x_device {
> + struct drm_device drm;
> + struct drm_simple_display_pipe pipe;
> + struct drm_display_mode mode;
> + struct drm_connector connector;


> + struct i2c_client *client;

Can we logically separate hw protocol vs hw interface from day 1, please?
This will allow to add SPI support for this panel much easier.

Technically I would like to see here

struct device *dev;

and probably (I haven't looked into design)

struct ssd130x_ops *ops;

or something alike.

> + const struct ssd130x_deviceinfo *device_info;
> +
> + unsigned area_color_enable : 1;
> + unsigned com_invdir : 1;
> + unsigned com_lrremap : 1;
> + unsigned com_seq : 1;
> + unsigned lookup_table_set : 1;
> + unsigned low_power : 1;
> + unsigned seg_remap : 1;
> + u32 com_offset;
> + u32 contrast;
> + u32 dclk_div;
> + u32 dclk_frq;
> + u32 height;
> + u8 lookup_table[4];
> + u32 page_offset;
> + u32 col_offset;
> + u32 prechargep1;
> + u32 prechargep2;
> +
> + struct backlight_device *bl_dev;
> + struct pwm_device *pwm;
> + struct gpio_desc *reset;
> + struct regulator *vbat_reg;
> + u32 vcomh;
> + u32 width;
> + /* Cached address ranges */
> + u8 col_start;
> + u8 col_end;
> + u8 page_start;
> + u8 page_end;
> +};

...

> +static inline int ssd130x_write_value(struct i2c_client *client, u8 value)

Not sure inline does anything useful here.
Ditto for the rest similar cases.

...

> +static inline int ssd130x_write_cmd(struct i2c_client *client, int count,
> + /* u8 cmd, u8 value, ... */...)
> +{
> + va_list ap;
> + u8 value;
> + int ret;
> +
> + va_start(ap, count);

> + while (count--) {
> + value = va_arg(ap, int);
> + ret = ssd130x_write_value(client, (u8)value);
> + if (ret)
> + goto out_end;
> + }

I'm wondering if this can be written in a form

do {
...
} while (--count);

In this case it will give a hint that count can't be 0.

> +out_end:
> + va_end(ap);
> +
> + return ret;
> +}


...

> + ssd130x->pwm = pwm_get(dev, NULL);
> + if (IS_ERR(ssd130x->pwm)) {
> + dev_err(dev, "Could not get PWM from device tree!\n");

"device tree" is a bit confusing here if I run this on ACPI.
Maybe something like "firmware description"?

> + return PTR_ERR(ssd130x->pwm);
> + }

...

> + /* Set initial contrast */
> + ret = ssd130x_write_cmd(ssd130x->client, 2, SSD130X_CONTRAST, 
> ssd130x->contrast);

Creating a local variable for client allows to:
- make lines shorter and might even be less LOCs
- allow to convert struct device to client in one place
  (as per my above comment)

Ditto for other similar cases.

> + if (ret < 0)
> + return ret;

...

> + for (i = 0; i < ARRAY_SIZE(ssd130x->lookup_table); ++i) {

i++ should work same way.

> + }

...

> + /* Switch to horizontal addressing mode */
> + ret = ssd130x_write_cmd(ssd130x->client, 2, SSD130X_SET_ADDRESS_MODE,
> + SSD130X_SET_ADDRESS_MODE_HORIZONTAL);
> + if (ret < 0)
> + return ret;
> +
> + return 0;

Can it be

return ssd130x_write_cmd(...);

?

...

> + unsigned int line_length = DIV_ROUND_UP(width, 8);
> + unsigned int pages = DIV_ROUND_UP(height, 8);

For power of two there are more efficient roundup()/rounddown()
(or with _ in the names, I don't remember by heart).

...

> + for (k = 0; k < m; k++) {

> + u8 byte = buf[(8 * i + k) * line_length +
> +j / 8];

One line?

> + u8 bit = (byte >> (j % 8)) & 1;
> +
> + data |= bit << k;
> + }

...

> +static int ssd130x_display_pipe_mode_valid(struct drm_simple_display_pipe 
> *pipe,
> +const struct drm_display_mode *mode)
> +{
> + struct ssd130x_device *ssd130x = drm_to_ssd130x(pipe->crtc.dev);
> +
> + if (mode->hdisplay != ssd130x->mode.hdisplay &&
> + mode->vdisplay != ssd130x->mode.vdisplay)
> +

Re: [PATCH v2] drm/bridge: dw-hdmi: use safe format when first in bridge chain

2022-02-04 Thread Neil Armstrong


Hi,

On 04/02/2022 15:05, Robert Foss wrote:

On Wed, 19 Jan 2022 at 13:28, Neil Armstrong  wrote:


When the dw-hdmi bridge is in first place of the bridge chain, this
means there is no way to select an input format of the dw-hdmi HW
component.

Since introduction of display-connector, negotiation was broken since
the dw-hdmi negotiation code only worked when the dw-hdmi bridge was
in last position of the bridge chain or behind another bridge also
supporting input & output format negotiation.

Commit 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts 
callbacks")


I think this is the wrong hash. Is 7cd70656d128 the actual hash?


Wow indeed, thanks for checking...




was introduced to make negotiation work again by making display-connector
act as a pass-through concerning input & output format negotiation.

But in the case where the dw-hdmi is single in the bridge chain, for
example on Renesas SoCs, with the display-connector bridge the dw-hdmi
is no more single, breaking output format.

Reported-by: Biju Das 
Bisected-by: Kieran Bingham 
Tested-by: Kieran Bingham 
Fixes: 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts 
callbacks").


This hash too.


Signed-off-by: Neil Armstrong 
---
Changes since v1:
- Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts
- Fix typos in commit message

  drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 54d8fdad395f..97cdc61b57f6 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -2551,8 +2551,9 @@ static u32 
*dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge,
 if (!output_fmts)
 return NULL;

-   /* If dw-hdmi is the only bridge, avoid negociating with ourselves */
-   if (list_is_singular(&bridge->encoder->bridge_chain)) {
+   /* If dw-hdmi is the first or only bridge, avoid negociating with 
ourselves */
+   if (list_is_singular(&bridge->encoder->bridge_chain) ||
+   list_is_first(&bridge->chain_node, &bridge->encoder->bridge_chain)) 
{
 *num_output_fmts = 1;
 output_fmts[0] = MEDIA_BUS_FMT_FIXED;

--
2.25.1



There are two checkstyle issues apart from the above mentioned hash
issues, and I think we can ignore those. With the above mentioned
issue fixed, feel free to add my r-b.

Reviewed-by: Robert Foss 


Thanks, I'll fix & resend with your r-b.

Neil

Re: [PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver

2022-02-04 Thread Andy Shevchenko

On Fri, Feb 04, 2022 at 03:12:17PM +0100, Javier Martinez Canillas wrote:
> On 2/4/22 14:57, Andy Shevchenko wrote:
> > On Fri, Feb 04, 2022 at 02:43:46PM +0100, Javier Martinez Canillas wrote:

...

> > Stray change?
> 
> Sigh, I'm not sure how added that change. Just ignore it, I'll fix it
> on v3 or when applying if there isn't another revision of this series.

I believe v3 is warranted due to the other patch review.

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH v2 0/4] drm/tiny: Add driver for Solomon SSD1307 OLED displays

2022-02-04 Thread Geert Uytterhoeven

Hi Javier,

On Fri, Feb 4, 2022 at 2:43 PM Javier Martinez Canillas
 wrote:
> This patch series adds a DRM driver for the Solomon OLED SSD1305, SSD1306,
> SSD1307 and SSD1309 displays. It is a port of the ssd1307fb fbdev driver.

[...]

> This is a v2 that addresses all the issues pointed in v1, thanks a lot
> to everyone that gave me feedback and reviews. I tried to not miss any
> comment, but there were a lot so forgive me if something is not there.

Thanks for the update!

> Changes in v2:

[...]

Note that the individual patches say "(no changes since v1)"?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver

2022-02-04 Thread Javier Martinez Canillas

On 2/4/22 15:28, Andy Shevchenko wrote:
> On Fri, Feb 04, 2022 at 03:12:17PM +0100, Javier Martinez Canillas wrote:
>> On 2/4/22 14:57, Andy Shevchenko wrote:
>>> On Fri, Feb 04, 2022 at 02:43:46PM +0100, Javier Martinez Canillas wrote:
> 
> ...
> 
>>> Stray change?
>>
>> Sigh, I'm not sure how added that change. Just ignore it, I'll fix it
>> on v3 or when applying if there isn't another revision of this series.
> 
> I believe v3 is warranted due to the other patch review.
> 

Agreed. Thanks a lot for your feedback and comments.

Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat

[PATCH v3] drm/bridge: dw-hdmi: use safe format when first in bridge chain

2022-02-04 Thread Neil Armstrong

When the dw-hdmi bridge is in first place of the bridge chain, this
means there is no way to select an input format of the dw-hdmi HW
component.

Since introduction of display-connector, negotiation was broken since
the dw-hdmi negotiation code only worked when the dw-hdmi bridge was
in last position of the bridge chain or behind another bridge also
supporting input & output format negotiation.

Commit 7cd70656d128 ("drm/bridge: display-connector: implement bus fmts 
callbacks")
was introduced to make negotiation work again by making display-connector
act as a pass-through concerning input & output format negotiation.

But in the case where the dw-hdmi is single in the bridge chain, for
example on Renesas SoCs, with the display-connector bridge the dw-hdmi
is no more single, breaking output format.

Reported-by: Biju Das 
Bisected-by: Kieran Bingham 
Tested-by: Kieran Bingham 
Fixes: 7cd70656d128 ("drm/bridge: display-connector: implement bus fmts 
callbacks").
Signed-off-by: Neil Armstrong 
Reviewed-by: Robert Foss 
---
Changes since v2:
- Add rob's r-b
- Fix invalid Fixes commit hash

Changes since v1:
- Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts
- Fix typos in commit message

 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 54d8fdad395f..97cdc61b57f6 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -2551,8 +2551,9 @@ static u32 
*dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge,
if (!output_fmts)
return NULL;
 
-   /* If dw-hdmi is the only bridge, avoid negociating with ourselves */
-   if (list_is_singular(&bridge->encoder->bridge_chain)) {
+   /* If dw-hdmi is the first or only bridge, avoid negociating with 
ourselves */
+   if (list_is_singular(&bridge->encoder->bridge_chain) ||
+   list_is_first(&bridge->chain_node, &bridge->encoder->bridge_chain)) 
{
*num_output_fmts = 1;
output_fmts[0] = MEDIA_BUS_FMT_FIXED;
 
-- 
2.25.1

Re: [PATCH v2 0/4] drm/tiny: Add driver for Solomon SSD1307 OLED displays

2022-02-04 Thread Javier Martinez Canillas

Hello Geert,

On 2/4/22 15:31, Geert Uytterhoeven wrote:
> Hi Javier,
> 
> On Fri, Feb 4, 2022 at 2:43 PM Javier Martinez Canillas
>  wrote:
>> This patch series adds a DRM driver for the Solomon OLED SSD1305, SSD1306,
>> SSD1307 and SSD1309 displays. It is a port of the ssd1307fb fbdev driver.
> 
> [...]
> 
>> This is a v2 that addresses all the issues pointed in v1, thanks a lot
>> to everyone that gave me feedback and reviews. I tried to not miss any
>> comment, but there were a lot so forgive me if something is not there.
> 
> Thanks for the update!
>

You are welcome!

>> Changes in v2:
> 
> [...]
> 
> Note that the individual patches say "(no changes since v1)"?
> 

That's due patman (the tool I use to post patches) not being that smart.

I only added the v2 changelog in the cover letter and not the individual
patches to avoid adding noise, since there are a lot of changes since v1.

But patman then thought that means individual patches had no changes...

Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat

Re: [PATCH 6/6] drm/amdgpu: use dma_fence_chain_contained

2022-02-04 Thread Alex Deucher

On Fri, Feb 4, 2022 at 5:04 AM Christian König
 wrote:
>
> Instead of manually extracting the fence.
>
> Signed-off-by: Christian König 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> index f7d8487799b2..40e06745fae9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> @@ -261,10 +261,9 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct 
> amdgpu_sync *sync,
>
> dma_resv_for_each_fence(&cursor, resv, true, f) {
> dma_fence_chain_for_each(f, f) {
> -   struct dma_fence_chain *chain = to_dma_fence_chain(f);
> +   struct dma_fence *tmp = dma_fence_chain_contained(f);
>
> -   if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
> -  chain->fence : f)) {
> +   if (amdgpu_sync_test_fence(adev, mode, owner, tmp)) {
> r = amdgpu_sync_fence(sync, f);
> dma_fence_put(f);
> if (r)
> --
> 2.25.1
>

Re: [PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888,gray8}_to_mono_reversed()

2022-02-04 Thread Thomas Zimmermann


Hi

Am 04.02.22 um 14:43 schrieb Javier Martinez Canillas:

Add support to convert XR24 and 8-bit grayscale to reversed monochrome for
drivers that control monochromatic panels, that only have 1 bit per pixel.

The drm_fb_gray8_to_mono_reversed() helper was based on the function that
does the same in the drivers/gpu/drm/tiny/repaper.c driver.

Signed-off-by: Javier Martinez Canillas 
---

(no changes since v1)

  drivers/gpu/drm/drm_format_helper.c | 80 +
  include/drm/drm_format_helper.h |  7 +++
  2 files changed, 87 insertions(+)

diff --git a/drivers/gpu/drm/drm_format_helper.c 
b/drivers/gpu/drm/drm_format_helper.c
index 0f28dd2bdd72..cdce4b7c25d9 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -584,3 +584,83 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int 
dst_pitch, uint32_t dst_for
return -EINVAL;
  }
  EXPORT_SYMBOL(drm_fb_blit_toio);
+
+static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, size_t 
pixels)
+{
+   unsigned int xb, i;
+
+   for (xb = 0; xb < pixels / 8; xb++) {


In practice, all mode widths are multiples of 8 because VGA mandated it. 
So it's ok-ish to assume this here. You should probably at least print a 
warning somewhere if (pixels % 8 != 0)




+   u8 byte = 0x00;
+
+   for (i = 0; i < 8; i++) {
+   int x = xb * 8 + i;
+
+   byte >>= 1;
+   if (src[x] >> 7)
+   byte |= BIT(7);
+   }
+   *dst++ = byte;
+   }
+}
+
+/**
+ * drm_fb_gray8_to_mono_reversed - Convert grayscale to reversed monochrome
+ * @dst: reversed monochrome destination buffer
+ * @dst_pitch: Number of bytes between two consecutive scanlines within dst
+ * @src: 8-bit grayscale source buffer
+ * @clip: Clip rectangle area to copy
+ *
+ * DRM doesn't have native monochrome or grayscale support.
+ * Such drivers can announce the commonly supported XR24 format to userspace
+ * and use drm_fb_xrgb_to_gray8() to convert to grayscale and then this
+ * helper function to convert to the native format.
+ */
+void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const 
void *src,
+  const struct drm_rect *clip)


There's a bug here. You want to pass in a drm_framebuffer as fourth 
argument.



+{
+
+   size_t height = drm_rect_height(clip);
+   size_t width = drm_rect_width(clip);
+   unsigned int y;
+   const u8 *gray8 = src;
+   u8 *mono = dst;
+
+   if (!dst_pitch)
+   dst_pitch = width;


The dst_pitch is given in bytes. You have to device by 8. Here would be 
a good place to warn if (width % 8 != 0).



+
+   for (y = 0; y < height; y++) {
+   drm_fb_gray8_to_mono_reversed_line(mono, gray8, dst_pitch);
+   mono += (dst_pitch / 8);


The dst_pitch is already given in bytes.


+   gray8 += dst_pitch;


'gray8 += fb->pitches[0]' would be correct.


+   }
+}
+
+/**
+ * drm_fb_xrgb_to_mono_reversed - Convert XRGB to reversed monochrome
+ * @dst: reversed monochrome destination buffer
+ * @dst_pitch: Number of bytes between two consecutive scanlines within dst
+ * @src: XRGB source buffer
+ * @fb: DRM framebuffer
+ * @clip: Clip rectangle area to copy
+ *
+ * DRM doesn't have native monochrome support.
+ * Such drivers can announce the commonly supported XR24 format to userspace
+ * and use this function to convert to the native format.
+ *
+ * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and
+ * then the result is converted from grayscale to reversed monohrome.
+ */
+void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const 
void *src,
+ const struct drm_framebuffer *fb,
+ const struct drm_rect *clip)
+{
+   if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB))
+   return;
+
+   if (!dst_pitch)
+   dst_pitch = drm_rect_width(clip);
+
+   drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip);
+   drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip);


Converting from dst into dst can give incorrect results. At some point 
we probably want to add restrict qualifiers to these pointers, to help 
the compiler with optimizing.


A better approach here is to pull the per-line conversion from 
drm_fb_xrgb_to_gray8() into a separate helper and implement a 
line-by-line conversion here. something like this:


  drm_fb_xrgb_to_mono_reversed()
  {
char *tmp = kmalloc(size of a single line of gray8)

for (heigth) {
   drm_fb_xrgb_to_gray8_line(tmp, ..., src, ...);
   drm_fb_gray8_to_mono_reversed(dst, ..., tmp, ...);

   src += fb->pitches[0]
   dst += dst_pitch;
}

kfree(tmp);
  }

Best regards
Thomas


+}
+EXPORT_SYMBOL(drm_fb_

Re: [PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888,gray8}_to_mono_reversed()

2022-02-04 Thread Thomas Zimmermann



Am 04.02.22 um 16:52 schrieb Thomas Zimmermann:
[...]

+/**
+ * drm_fb_xrgb_to_mono_reversed - Convert XRGB to reversed 
monochrome

+ * @dst: reversed monochrome destination buffer
+ * @dst_pitch: Number of bytes between two consecutive scanlines 
within dst

+ * @src: XRGB source buffer
+ * @fb: DRM framebuffer
+ * @clip: Clip rectangle area to copy
+ *
+ * DRM doesn't have native monochrome support.
+ * Such drivers can announce the commonly supported XR24 format to 
userspace

+ * and use this function to convert to the native format.
+ *
+ * This function uses drm_fb_xrgb_to_gray8() to convert to 
grayscale and

+ * then the result is converted from grayscale to reversed monohrome.
+ */
+void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int 
dst_pitch, const void *src,

+  const struct drm_framebuffer *fb,
+  const struct drm_rect *clip)
+{
+    if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB))
+    return;
+
+    if (!dst_pitch)
+    dst_pitch = drm_rect_width(clip);
+
+    drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip);
+    drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip);


Converting from dst into dst can give incorrect results. At some point 
we probably want to add restrict qualifiers to these pointers, to help 
the compiler with optimizing.


A better approach here is to pull the per-line conversion from 
drm_fb_xrgb_to_gray8() into a separate helper and implement a 
line-by-line conversion here. something like this:


   drm_fb_xrgb_to_mono_reversed()
   {
     char *tmp = kmalloc(size of a single line of gray8)

     for (heigth) {
    drm_fb_xrgb_to_gray8_line(tmp, ..., src, ...);
    drm_fb_gray8_to_mono_reversed(dst, ..., tmp, ...);


Here, I meant 'drm_fb_gray8_to_mono_reversed_line()'



    src += fb->pitches[0]
    dst += dst_pitch;
     }

     kfree(tmp);
   }


To elaborate, this is an example of what I meant by composable. In the 
future, we can generalize this function and hand-in 2 function pointers 
the do the conversion with tmp as intermediate buffer. That would work 
for any combination of formats that have a common intermediate format.




Best regards
Thomas


+}
+EXPORT_SYMBOL(drm_fb_xrgb_to_mono_reversed);
diff --git a/include/drm/drm_format_helper.h 
b/include/drm/drm_format_helper.h

index b30ed5de0a33..85e551a5cbe6 100644
--- a/include/drm/drm_format_helper.h
+++ b/include/drm/drm_format_helper.h
@@ -43,4 +43,11 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned 
int dst_pitch, uint32_t dst_for

   const void *vmap, const struct drm_framebuffer *fb,
   const struct drm_rect *rect);
+void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, 
const void *src,

+   const struct drm_rect *clip);
+
+void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int 
dst_pitch, const void *src,

+  const struct drm_framebuffer *fb,
+  const struct drm_rect *clip);
+
  #endif /* __LINUX_DRM_FORMAT_HELPER_H */




--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH] drm/amdgpu: Fix recursive locking warning

2022-02-04 Thread Felix Kuehling




Am 2022-02-04 um 02:13 schrieb Christian König:

Am 04.02.22 um 04:11 schrieb Rajneesh Bhardwaj:

Noticed the below warning while running a pytorch workload on vega10
GPUs. Change to trylock to avoid conflicts with already held reservation
locks.

[  +0.03] WARNING: possible recursive locking detected
[  +0.03] 5.13.0-kfd-rajneesh #1030 Not tainted
[  +0.04] 
[  +0.02] python/4822 is trying to acquire lock:
[  +0.04] 932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3},
at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000203]
   but task is already holding lock:
[  +0.03] 932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3},
at: ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.17]
   other info that might help us debug this:
[  +0.02]  Possible unsafe locking scenario:

[  +0.03]    CPU0
[  +0.02]    
[  +0.02]   lock(reservation_ww_class_mutex);
[  +0.04]   lock(reservation_ww_class_mutex);
[  +0.03]
    *** DEADLOCK ***

[  +0.02]  May be due to missing lock nesting notation

[  +0.03] 7 locks held by python/4822:
[  +0.03]  #0: 932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at:
kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu]
[  +0.000232]  #1: 932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu]
[  +0.000241]  #2: 932cc45b5e68 (&(*mem)->lock){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu]
[  +0.000236]  #3: b2b35606fd28
(reservation_ww_class_acquire){+.+.}-{0:0}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu]
[  +0.000235]  #4: 932cbb7181f8
(reservation_ww_class_mutex){+.+.}-{3:3}, at:
ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.15]  #5: c045f700 (*(sspp++)){}-{0:0}, at:
drm_dev_enter+0x5/0xa0 [drm]
[  +0.38]  #6: 932c52da7078 (&vm->eviction_lock){+.+.}-{3:3},
at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu]
[  +0.000195]
   stack backtrace:
[  +0.03] CPU: 11 PID: 4822 Comm: python Not tainted
5.13.0-kfd-rajneesh #1030
[  +0.05] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02
08/29/2018
[  +0.03] Call Trace:
[  +0.03]  dump_stack+0x6d/0x89
[  +0.10]  __lock_acquire+0xb93/0x1a90
[  +0.09]  lock_acquire+0x25d/0x2d0
[  +0.05]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  ? lock_is_held_type+0xa2/0x110
[  +0.06]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  __ww_mutex_lock.constprop.17+0xca/0x1060
[  +0.07]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  ? lock_release+0x13f/0x270
[  +0.05]  ? lock_is_held_type+0xa2/0x110
[  +0.06]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000185]  ttm_bo_release+0x4c6/0x580 [ttm]
[  +0.10]  amdgpu_bo_unref+0x1a/0x30 [amdgpu]
[  +0.000183]  amdgpu_vm_free_table+0x76/0xa0 [amdgpu]
[  +0.000189]  amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu]
[  +0.000189]  amdgpu_vm_update_ptes+0x411/0x770 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update+0x251/0x610 [amdgpu]
[  +0.000191]  update_gpuvm_pte+0xcc/0x290 [amdgpu]
[  +0.000229]  ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu]
[  +0.000190]  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60
[amdgpu]
[  +0.000234]  kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu]
[  +0.000218]  kfd_ioctl+0x2b9/0x600 [amdgpu]
[  +0.000216]  ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu]
[  +0.000216]  ? lock_release+0x13f/0x270
[  +0.06]  ? __fget_files+0x107/0x1e0
[  +0.07]  __x64_sys_ioctl+0x8b/0xd0
[  +0.07]  do_syscall_64+0x36/0x70
[  +0.04]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.07] RIP: 0033:0x7fbff90a7317
[  +0.04] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[  +0.05] RSP: 002b:7fbe301fe648 EFLAGS: 0246 ORIG_RAX:
0010
[  +0.06] RAX: ffda RBX: 7fbcc402d820 RCX:
7fbff90a7317
[  +0.03] RDX: 7fbe301fe690 RSI: c0184b18 RDI:
0004
[  +0.03] RBP: 7fbe301fe690 R08:  R09:
7fbcc402d880
[  +0.03] R10: 02001000 R11: 0246 R12:
c0184b18
[  +0.03] R13: 0004 R14: 7fbf689593a0 R15:
7fbcc402d820

Cc: Christian König 
Cc: Felix Kuehling 
Cc: Alex Deucher 

Fixes: 627b92ef9d7c ("drm/amdgpu: Wipe all VRAM on free when RAS is
enabled")
Signed-off-by: Rajneesh Bhardwaj 


The fixes tag is not necessarily correct, I would remove that.

But apart from that the patch is Reviewed-by: Christian König 
.


I suggested the Fixes tag since it was my patch that introduced the 
problem. W

Re: [PATCH] drm/amdgpu: Fix recursive locking warning

2022-02-04 Thread Christian König


Am 04.02.22 um 17:23 schrieb Felix Kuehling:


Am 2022-02-04 um 02:13 schrieb Christian König:

Am 04.02.22 um 04:11 schrieb Rajneesh Bhardwaj:

Noticed the below warning while running a pytorch workload on vega10
GPUs. Change to trylock to avoid conflicts with already held 
reservation

locks.

[  +0.03] WARNING: possible recursive locking detected
[  +0.03] 5.13.0-kfd-rajneesh #1030 Not tainted
[  +0.04] 
[  +0.02] python/4822 is trying to acquire lock:
[  +0.04] 932cd9a259f8 
(reservation_ww_class_mutex){+.+.}-{3:3},

at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000203]
   but task is already holding lock:
[  +0.03] 932cbb7181f8 
(reservation_ww_class_mutex){+.+.}-{3:3},

at: ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.17]
   other info that might help us debug this:
[  +0.02]  Possible unsafe locking scenario:

[  +0.03]    CPU0
[  +0.02]    
[  +0.02]   lock(reservation_ww_class_mutex);
[  +0.04]   lock(reservation_ww_class_mutex);
[  +0.03]
    *** DEADLOCK ***

[  +0.02]  May be due to missing lock nesting notation

[  +0.03] 7 locks held by python/4822:
[  +0.03]  #0: 932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at:
kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu]
[  +0.000232]  #1: 932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu]
[  +0.000241]  #2: 932cc45b5e68 (&(*mem)->lock){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu]
[  +0.000236]  #3: b2b35606fd28
(reservation_ww_class_acquire){+.+.}-{0:0}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu]
[  +0.000235]  #4: 932cbb7181f8
(reservation_ww_class_mutex){+.+.}-{3:3}, at:
ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.15]  #5: c045f700 (*(sspp++)){}-{0:0}, at:
drm_dev_enter+0x5/0xa0 [drm]
[  +0.38]  #6: 932c52da7078 (&vm->eviction_lock){+.+.}-{3:3},
at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu]
[  +0.000195]
   stack backtrace:
[  +0.03] CPU: 11 PID: 4822 Comm: python Not tainted
5.13.0-kfd-rajneesh #1030
[  +0.05] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02
08/29/2018
[  +0.03] Call Trace:
[  +0.03]  dump_stack+0x6d/0x89
[  +0.10]  __lock_acquire+0xb93/0x1a90
[  +0.09]  lock_acquire+0x25d/0x2d0
[  +0.05]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  ? lock_is_held_type+0xa2/0x110
[  +0.06]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  __ww_mutex_lock.constprop.17+0xca/0x1060
[  +0.07]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  ? lock_release+0x13f/0x270
[  +0.05]  ? lock_is_held_type+0xa2/0x110
[  +0.06]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000185]  ttm_bo_release+0x4c6/0x580 [ttm]
[  +0.10]  amdgpu_bo_unref+0x1a/0x30 [amdgpu]
[  +0.000183]  amdgpu_vm_free_table+0x76/0xa0 [amdgpu]
[  +0.000189]  amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu]
[  +0.000189]  amdgpu_vm_update_ptes+0x411/0x770 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update+0x251/0x610 [amdgpu]
[  +0.000191]  update_gpuvm_pte+0xcc/0x290 [amdgpu]
[  +0.000229]  ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu]
[  +0.000190] amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60
[amdgpu]
[  +0.000234]  kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu]
[  +0.000218]  kfd_ioctl+0x2b9/0x600 [amdgpu]
[  +0.000216]  ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu]
[  +0.000216]  ? lock_release+0x13f/0x270
[  +0.06]  ? __fget_files+0x107/0x1e0
[  +0.07]  __x64_sys_ioctl+0x8b/0xd0
[  +0.07]  do_syscall_64+0x36/0x70
[  +0.04]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.07] RIP: 0033:0x7fbff90a7317
[  +0.04] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[  +0.05] RSP: 002b:7fbe301fe648 EFLAGS: 0246 ORIG_RAX:
0010
[  +0.06] RAX: ffda RBX: 7fbcc402d820 RCX:
7fbff90a7317
[  +0.03] RDX: 7fbe301fe690 RSI: c0184b18 RDI:
0004
[  +0.03] RBP: 7fbe301fe690 R08:  R09:
7fbcc402d880
[  +0.03] R10: 02001000 R11: 0246 R12:
c0184b18
[  +0.03] R13: 0004 R14: 7fbf689593a0 R15:
7fbcc402d820

Cc: Christian König 
Cc: Felix Kuehling 
Cc: Alex Deucher 

Fixes: 627b92ef9d7c ("drm/amdgpu: Wipe all VRAM on free when RAS is
enabled")
Signed-off-by: Rajneesh Bhardwaj 


The fixes tag is not necessarily correct, I would remove that.

But apart from that the patch is Reviewed-by: Christian König 
.


I suggested the Fixes tag sin

[PATCH v5 2/5] drm/i915/gt: Drop invalidate_csb_entries

2022-02-04 Thread Michael Cheng

Drop invalidate_csb_entries and directly call drm_clflush_virt_range.
This allows for one less function call, and prevent complier errors when
building for non-x86 architectures.

v2(Michael Cheng): Drop invalidate_csb_entries function and directly
   invoke drm_clflush_virt_range. Thanks to Tvrtko for the
   sugguestion.

Signed-off-by: Michael Cheng 
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 9bb7c863172f..7500c06562da 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1646,12 +1646,6 @@ cancel_port_requests(struct intel_engine_execlists * 
const execlists,
return inactive;
 }
 
-static void invalidate_csb_entries(const u64 *first, const u64 *last)
-{
-   clflush((void *)first);
-   clflush((void *)last);
-}
-
 /*
  * Starting with Gen12, the status has a new format:
  *
@@ -1999,7 +1993,7 @@ process_csb(struct intel_engine_cs *engine, struct 
i915_request **inactive)
 * the wash as hardware, working or not, will need to do the
 * invalidation before.
 */
-   invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
+   drm_clflush_virt_range(&buf[0], num_entries * sizeof(buf[0]));
 
/*
 * We assume that any event reflects a change in context flow
@@ -2783,8 +2777,8 @@ static void reset_csb_pointers(struct intel_engine_cs 
*engine)
 
/* Check that the GPU does indeed update the CSB entries! */
memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64));
-   invalidate_csb_entries(&execlists->csb_status[0],
-  &execlists->csb_status[reset_value]);
+   drm_clflush_virt_range(&execlists->csb_status[0],
+   sizeof(&execlists->csb_status[reset_value]));
 
/* Once more for luck and our trusty paranoia */
ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
-- 
2.25.1

[PATCH v5 5/5] drm/i915/gt: replace cache_clflush_range

2022-02-04 Thread Michael Cheng

Replace all occurance of cache_clflush_range with
drm_clflush_virt_range. This will prevent compile errors on non-x86
platforms.

Signed-off-by: Michael Cheng 
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 12 ++--
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c|  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c|  2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index c43e724afa9f..d0999e92621b 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -444,11 +444,11 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
pd = pdp->entry[gen8_pd_index(idx, 2)];
}
 
-   clflush_cache_range(vaddr, PAGE_SIZE);
+   drm_clflush_virt_range(vaddr, PAGE_SIZE);
vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 
1)));
}
} while (1);
-   clflush_cache_range(vaddr, PAGE_SIZE);
+   drm_clflush_virt_range(vaddr, PAGE_SIZE);
 
return idx;
 }
@@ -532,7 +532,7 @@ static void gen8_ppgtt_insert_huge(struct 
i915_address_space *vm,
}
} while (rem >= page_size && index < I915_PDES);
 
-   clflush_cache_range(vaddr, PAGE_SIZE);
+   drm_clflush_virt_range(vaddr, PAGE_SIZE);
 
/*
 * Is it safe to mark the 2M block as 64K? -- Either we have
@@ -548,7 +548,7 @@ static void gen8_ppgtt_insert_huge(struct 
i915_address_space *vm,
  I915_GTT_PAGE_SIZE_2M {
vaddr = px_vaddr(pd);
vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
-   clflush_cache_range(vaddr, PAGE_SIZE);
+   drm_clflush_virt_range(vaddr, PAGE_SIZE);
page_size = I915_GTT_PAGE_SIZE_64K;
 
/*
@@ -569,7 +569,7 @@ static void gen8_ppgtt_insert_huge(struct 
i915_address_space *vm,
for (i = 1; i < index; i += 16)
memset64(vaddr + i, encode, 15);
 
-   clflush_cache_range(vaddr, PAGE_SIZE);
+   drm_clflush_virt_range(vaddr, PAGE_SIZE);
}
}
 
@@ -617,7 +617,7 @@ static void gen8_ppgtt_insert_entry(struct 
i915_address_space *vm,
 
vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
-   clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
+   drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
 static int gen8_init_scratch(struct i915_address_space *vm)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 22505aa428d9..5e74c99d5929 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2822,7 +2822,7 @@ static void execlists_sanitize(struct intel_engine_cs 
*engine)
sanitize_hwsp(engine);
 
/* And scrub the dirty cachelines for the HWSP */
-   clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
+   drm_clflush_virt_range(engine->status_page.addr, PAGE_SIZE);
 
intel_engine_reset_pinned_contexts(engine);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 0d6bbc8c57f2..9b594be9102f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -255,7 +255,7 @@ fill_page_dma(struct drm_i915_gem_object *p, const u64 val, 
unsigned int count)
void *vaddr = __px_vaddr(p);
 
memset64(vaddr, val, count);
-   clflush_cache_range(vaddr, PAGE_SIZE);
+   drm_clflush_virt_range(vaddr, PAGE_SIZE);
 }
 
 static void poison_scratch_page(struct drm_i915_gem_object *scratch)
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c 
b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 48e6e2f87700..bd474a5123cb 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -90,7 +90,7 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
u64 * const vaddr = __px_vaddr(pdma);
 
vaddr[idx] = encoded_entry;
-   clflush_cache_range(&vaddr[idx], sizeof(u64));
+   drm_clflush_virt_range(&vaddr[idx], sizeof(u64));
 }
 
 void
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b3a429a92c0d..89020706adc4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915

[PATCH v5 3/5] drm/i915/gt: Re-work reset_csb

2022-02-04 Thread Michael Cheng

Use drm_clflush_virt_range instead of directly invoking clflush. This
will prevent compiler errors when building for non-x86 architectures.

v2(Michael Cheng): Remove extra clflush

v3(Michael Cheng): Remove memory barrier since drm_clflush_virt_range
   takes care of it.

Signed-off-by: Michael Cheng 
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 7500c06562da..22505aa428d9 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2944,9 +2944,8 @@ reset_csb(struct intel_engine_cs *engine, struct 
i915_request **inactive)
 {
struct intel_engine_execlists * const execlists = &engine->execlists;
 
-   mb(); /* paranoia: read the CSB pointers from after the reset */
-   clflush(execlists->csb_write);
-   mb();
+   drm_clflush_virt_range(execlists->csb_write,
+   sizeof(execlists->csb_write));
 
inactive = process_csb(engine, inactive); /* drain preemption events */
 
-- 
2.25.1

[PATCH v5 4/5] drm/i915/: Re-work clflush_write32

2022-02-04 Thread Michael Cheng

Use drm_clflush_virt_range instead of clflushopt and remove the memory
barrier, since drm_clflush_virt_range takes care of that.

Signed-off-by: Michael Cheng 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 498b458fd784..0854276ff7ba 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1332,10 +1332,8 @@ static void *reloc_vaddr(struct i915_vma *vma,
 static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
 {
if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) {
-   if (flushes & CLFLUSH_BEFORE) {
-   clflushopt(addr);
-   mb();
-   }
+   if (flushes & CLFLUSH_BEFORE)
+   drm_clflush_virt_range(addr, sizeof(addr));
 
*addr = value;
 
@@ -1347,7 +1345,7 @@ static void clflush_write32(u32 *addr, u32 value, 
unsigned int flushes)
 * to ensure ordering of clflush wrt to the system.
 */
if (flushes & CLFLUSH_AFTER)
-   clflushopt(addr);
+   drm_clflush_virt_range(addr, sizeof(addr));
} else
*addr = value;
 }
-- 
2.25.1

[PATCH v5 0/5] Use drm_clflush* instead of clflush

2022-02-04 Thread Michael Cheng

This patch series re-work a few i915 functions to use drm_clflush_virt_range
instead of calling clflush or clflushopt directly. This will prevent errors
when building for non-x86 architectures.

v2: s/PAGE_SIZE/sizeof(value) for Re-work intel_write_status_page and added
more patches to convert additional clflush/clflushopt to use drm_clflush*.
(Michael Cheng)

v3: Drop invalidate_csb_entries and directly invoke drm_clflush_virt_ran

v4: Remove extra memory barriers 

v5: s/cache_clflush_range/drm_clflush_virt_range

Michael Cheng (5):
  drm/i915/gt: Re-work intel_write_status_page
  drm/i915/gt: Drop invalidate_csb_entries
  drm/i915/gt: Re-work reset_csb
  drm/i915/: Re-work clflush_write32
  drm/i915/gt: replace cache_clflush_range

 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  8 +++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 12 ++--
 drivers/gpu/drm/i915/gt/intel_engine.h| 13 -
 .../drm/i915/gt/intel_execlists_submission.c  | 19 ++-
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 +-
 7 files changed, 22 insertions(+), 36 deletions(-)

-- 
2.25.1

[PATCH v5 1/5] drm/i915/gt: Re-work intel_write_status_page

2022-02-04 Thread Michael Cheng

Re-work intel_write_status_page to use drm_clflush_virt_range. This
will prevent compiler errors when building for non-x86 architectures.

Signed-off-by: Michael Cheng 
---
 drivers/gpu/drm/i915/gt/intel_engine.h | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 0e353d8c2bc8..986777c2430d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -143,15 +144,9 @@ intel_write_status_page(struct intel_engine_cs *engine, 
int reg, u32 value)
 * of extra paranoia to try and ensure that the HWS takes the value
 * we give and that it doesn't end up trapped inside the CPU!
 */
-   if (static_cpu_has(X86_FEATURE_CLFLUSH)) {
-   mb();
-   clflush(&engine->status_page.addr[reg]);
-   engine->status_page.addr[reg] = value;
-   clflush(&engine->status_page.addr[reg]);
-   mb();
-   } else {
-   WRITE_ONCE(engine->status_page.addr[reg], value);
-   }
+   drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value));
+   WRITE_ONCE(engine->status_page.addr[reg], value);
+   drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value));
 }
 
 /*
-- 
2.25.1

[PATCH v4] dma-buf-map: Rename to iosys-map

2022-02-04 Thread Lucas De Marchi

Rename struct dma_buf_map to struct iosys_map and corresponding APIs.
Over time dma-buf-map grew up to more functionality than the one used by
dma-buf: in fact it's just a shim layer to abstract system memory, that
can be accessed via regular load and store, from IO memory that needs to
be acessed via arch helpers.

The idea is to extend this API so it can fulfill other needs, internal
to a single driver. Example: in the i915 driver it's desired to share
the implementation for integrated graphics, which uses mostly system
memory, with discrete graphics, which may need to access IO memory.

The conversion was mostly done with the following semantic patch:

@r1@
@@
- struct dma_buf_map
+ struct iosys_map

@r2@
@@
(
- DMA_BUF_MAP_INIT_VADDR
+ IOSYS_MAP_INIT_VADDR
|
- dma_buf_map_set_vaddr
+ iosys_map_set_vaddr
|
- dma_buf_map_set_vaddr_iomem
+ iosys_map_set_vaddr_iomem
|
- dma_buf_map_is_equal
+ iosys_map_is_equal
|
- dma_buf_map_is_null
+ iosys_map_is_null
|
- dma_buf_map_is_set
+ iosys_map_is_set
|
- dma_buf_map_clear
+ iosys_map_clear
|
- dma_buf_map_memcpy_to
+ iosys_map_memcpy_to
|
- dma_buf_map_incr
+ iosys_map_incr
)

@@
@@
- #include 
+ #include 

Then some files had their includes adjusted and some comments were
update to remove mentions to dma-buf-map.

Since this is not specific to dma-buf anymore, move the documentation to
the "Bus-Independent Device Accesses" section.

v2:
  - Squash patches

v3:
  - Fix wrong removal of dma-buf.h from MAINTAINERS
  - Move documentation from dma-buf.rst to device-io.rst

v4:
  - Change documentation tile and level

Signed-off-by: Lucas De Marchi 
Acked-by: Christian König 
Acked-by: Sumit Semwal 
Acked-by: Thomas Zimmermann 
---
 Documentation/driver-api/device-io.rst|   9 +
 Documentation/driver-api/dma-buf.rst  |   9 -
 Documentation/gpu/todo.rst|  20 +-
 MAINTAINERS   |   9 +-
 drivers/dma-buf/dma-buf.c |  22 +-
 drivers/dma-buf/heaps/cma_heap.c  |  10 +-
 drivers/dma-buf/heaps/system_heap.c   |  10 +-
 drivers/gpu/drm/ast/ast_drv.h |   2 +-
 drivers/gpu/drm/ast/ast_mode.c|   8 +-
 drivers/gpu/drm/drm_cache.c   |  18 +-
 drivers/gpu/drm/drm_client.c  |   9 +-
 drivers/gpu/drm/drm_fb_helper.c   |  12 +-
 drivers/gpu/drm/drm_gem.c |  12 +-
 drivers/gpu/drm/drm_gem_cma_helper.c  |   9 +-
 drivers/gpu/drm/drm_gem_framebuffer_helper.c  |  16 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c|  15 +-
 drivers/gpu/drm/drm_gem_ttm_helper.c  |   4 +-
 drivers/gpu/drm/drm_gem_vram_helper.c |  25 +-
 drivers/gpu/drm/drm_internal.h|   6 +-
 drivers/gpu/drm/drm_mipi_dbi.c|   8 +-
 drivers/gpu/drm/drm_prime.c   |   4 +-
 drivers/gpu/drm/etnaviv/etnaviv_drv.h |   2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c   |   8 +-
 drivers/gpu/drm/gud/gud_pipe.c|   4 +-
 drivers/gpu/drm/hyperv/hyperv_drm_modeset.c   |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   8 +-
 .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |   6 +-
 .../gpu/drm/i915/gem/selftests/mock_dmabuf.c  |   6 +-
 drivers/gpu/drm/lima/lima_gem.c   |   3 +-
 drivers/gpu/drm/lima/lima_sched.c |   4 +-
 drivers/gpu/drm/mediatek/mtk_drm_gem.c|   7 +-
 drivers/gpu/drm/mediatek/mtk_drm_gem.h|   5 +-
 drivers/gpu/drm/mgag200/mgag200_mode.c|   4 +-
 drivers/gpu/drm/msm/msm_drv.h |   4 +-
 drivers/gpu/drm/msm/msm_gem_prime.c   |   6 +-
 drivers/gpu/drm/panfrost/panfrost_perfcnt.c   |  13 +-
 drivers/gpu/drm/qxl/qxl_display.c |   8 +-
 drivers/gpu/drm/qxl/qxl_draw.c|   6 +-
 drivers/gpu/drm/qxl/qxl_drv.h |  10 +-
 drivers/gpu/drm/qxl/qxl_object.c  |   8 +-
 drivers/gpu/drm/qxl/qxl_object.h  |   4 +-
 drivers/gpu/drm/qxl/qxl_prime.c   |   4 +-
 drivers/gpu/drm/radeon/radeon_gem.c   |   1 +
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c   |   9 +-
 drivers/gpu/drm/rockchip/rockchip_drm_gem.h   |   5 +-
 drivers/gpu/drm/tegra/gem.c   |  10 +-
 drivers/gpu/drm/tiny/cirrus.c |   8 +-
 drivers/gpu/drm/tiny/gm12u320.c   |   7 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c |  16 +-
 drivers/gpu/drm/ttm/ttm_resource.c|  42 +--
 drivers/gpu/drm/ttm/ttm_tt.c  |   8 +-
 drivers/gpu/drm/udl/udl_modeset.c |   3 +-
 drivers/gpu/drm/vboxvideo/vbox_mode.c |   4 +-
 drivers/gpu/drm/vkms/vkms_c

Re: [PATCH v4 3/3] fbcon: Add option to enable legacy hardware acceleration

2022-02-04 Thread Helge Deller

Hi Geert,

On 2/4/22 11:24, Geert Uytterhoeven wrote:
> On Fri, Feb 4, 2022 at 11:17 AM Helge Deller  wrote:
>> On 2/4/22 09:37, Geert Uytterhoeven wrote:
>>> On Wed, Feb 2, 2022 at 8:05 PM Helge Deller  wrote:
 Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to
 enable bitblt and fillrect hardware acceleration in the framebuffer
 console. If disabled, such acceleration will not be used, even if it is
 supported by the graphics hardware driver.
>>>
>>> Note that this also applies to vertical panning and wrapping.
>>
>> That's correct.
>> Would you mind to send a patch which adds this info?
>
> To add it where? "bitblt and fillrect" are only mentioned in the patch
> description.
>
> The Kconfig help entry just talks about "hardware acceleration",
> which can mean any trick supported by the hardware.

Yes, I thought about extending the Kconfig help entry text to
mention vertical panning and wrapping as well.
If you have an idea how that should be phrased, I think it
makes sense to add it.

Helge

[Important!] 2022 X.Org Foundation Membership deadline for voting in the election

2022-02-04 Thread Lyude Paul

The 2022 X.Org Foundation elections are rapidly approaching. We will be
forwarding instructions on the nomination process to membership in the
near future.

Please note that only current members can vote in the upcoming election,
and that the deadline for new memberships or renewals to vote in the
upcoming election is March 17th 2022 at 23:59 UTC.

If you are interested in joining the X.Org Foundation or in renewing
your membership, please visit the membership system site at:

https://members.x.org/

You can find the current election schedule here:

https://www.x.org/wiki/BoardOfDirectors/Elections/2022/

    Lyude Paul,
    On behalf of the X.Org elections committee

[PATCH 00/19] drm/i915/guc: Refactor ADS access to use iosys_map

2022-02-04 Thread Lucas De Marchi

2nd version of https://patchwork.freedesktop.org/series/99378/

As first patch I'm including the dma-buf-map rename to iosys-map for
completeness and to allow the other patches to be reviewed. However the
first patch was also sent by itself.

I think all the feedback from v1 was incorporated in this version. All
the new helpers in iosys-map were squashed in a single patch as
requested.

Original cover letter:

While porting i915 to arm64 we noticed some issues accessing lmem.
Some writes were getting corrupted and the final state of the buffer
didn't have exactly what we wrote. This became evident when enabling
GuC submission: depending on the number of engines the ADS struct was
being corrupted and GuC would reject it, refusin to initialize.

>From Documentation/core-api/bus-virt-phys-mapping.rst:

This memory is called "PCI memory" or "shared memory" or "IO memory" or
whatever, and there is only one way to access it: the readb/writeb and
related functions. You should never take the address of such memory, 
because
there is really nothing you can do with such an address: it's not
conceptually in the same memory space as "real memory" at all, so you 
cannot
just dereference a pointer. (Sadly, on x86 it **is** in the same memory 
space,
so on x86 it actually works to just deference a pointer, but it's not
portable).

When reading or writing words directly to IO memory, in order to be portable
the Linux kernel provides the abstraction detailed in section "Differences
between I/O access functions" of Documentation/driver-api/device-io.rst.

This limits our ability to simply overlay our structs on top a buffer
and directly access it since that buffer may come from IO memory rather than
system memory. Hence the approach taken in intel_guc_ads.c needs to be
refactored. This is not the only place in i915 that neeed to be changed, but
the one causing the most problems, with a real reproducer. This first set of
patch focuses on fixing the gem object to pass the ADS

After the addition of a few helpers in the dma_buf_map API, most of
intel_guc_ads.c can be converted to use it. The exception is the regset
initialization: we'd incur into a lot of extra indirection when
reading/writting each register. So the regset is converted to use a
temporary buffer allocated on probe, which is then copied to its
final location when finishing the initialization or on gt reset.

Testing on some discrete cards, after this change we can correctly pass the
ADS struct to GuC and have it initialized correctly.

thanks,
Lucas De Marchi

Lucas De Marchi (19):
  dma-buf-map: Rename to iosys-map
  iosys-map: Add offset to iosys_map_memcpy_to()
  iosys-map: Add a few more helpers
  drm/i915/gt: Add helper for shmem copy to iosys_map
  drm/i915/guc: Keep iosys_map of ads_blob around
  drm/i915/guc: Add read/write helpers for ADS blob
  drm/i915/guc: Convert golden context init to iosys_map
  drm/i915/guc: Convert policies update to iosys_map
  drm/i915/guc: Convert engine record to iosys_map
  drm/i915/guc: Convert guc_ads_private_data_reset to iosys_map
  drm/i915/guc: Convert golden context prep to iosys_map
  drm/i915/guc: Replace check for golden context size
  drm/i915/guc: Convert mapping table to iosys_map
  drm/i915/guc: Convert capture list to iosys_map
  drm/i915/guc: Prepare for error propagation
  drm/i915/guc: Use a single pass to calculate regset
  drm/i915/guc: Convert guc_mmio_reg_state_init to iosys_map
  drm/i915/guc: Convert __guc_ads_init to iosys_map
  drm/i915/guc: Remove plain ads_blob pointer

 Documentation/driver-api/device-io.rst|   9 +
 Documentation/driver-api/dma-buf.rst  |   9 -
 Documentation/gpu/todo.rst|  20 +-
 MAINTAINERS   |   9 +-
 drivers/dma-buf/dma-buf.c |  22 +-
 drivers/dma-buf/heaps/cma_heap.c  |  10 +-
 drivers/dma-buf/heaps/system_heap.c   |  10 +-
 drivers/gpu/drm/ast/ast_drv.h |   2 +-
 drivers/gpu/drm/ast/ast_mode.c|   8 +-
 drivers/gpu/drm/drm_cache.c   |  18 +-
 drivers/gpu/drm/drm_client.c  |   9 +-
 drivers/gpu/drm/drm_fb_helper.c   |  12 +-
 drivers/gpu/drm/drm_gem.c |  12 +-
 drivers/gpu/drm/drm_gem_cma_helper.c  |   9 +-
 drivers/gpu/drm/drm_gem_framebuffer_helper.c  |  16 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c|  15 +-
 drivers/gpu/drm/drm_gem_ttm_helper.c  |   4 +-
 drivers/gpu/drm/drm_gem_vram_helper.c |  25 +-
 drivers/gpu/drm/drm_internal.h|   6 +-
 drivers/gpu/drm/drm_mipi_dbi.c|   8 +-
 drivers/gpu/drm/drm_prime.c   |   4 +-
 drivers/gpu/drm/etnaviv/etnaviv_drv.h |   2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c   |   8 +-
 drivers/gpu/drm/gud/gud_pipe.c|   4 +-
 drivers/gpu/drm/hyperv/hyperv_drm_modeset.c   |

[PATCH 03/19] iosys-map: Add a few more helpers

2022-02-04 Thread Lucas De Marchi

First the simplest ones:

- iosys_map_memset(): when abstracting system and I/O memory,
  just like the memcpy() use case, memset() also has dedicated
  functions to be called for using IO memory.
- iosys_map_memcpy_from(): we may need to copy data from I/O
  memory, not only to.

In certain situations it's useful to be able to read or write to an
offset that is calculated by having the memory layout given by a struct
declaration. Usually we are going to read/write a u8, u16, u32 or u64.

As a pre-requisite for the implementation, add iosys_map_memcpy_from()
to be the equivalent of iosys_map_memcpy_to(), but in the other
direction. Then add 2 pairs of macros:

- iosys_map_rd() / iosys_map_wr()
- iosys_map_rd_field() / iosys_map_wr_field()

The first pair takes the C-type and offset to read/write. The second
pair uses a struct describing the layout of the mapping in order to
calculate the offset and size being read/written.

We could use readb, readw, readl, readq and the write* counterparts,
however due to alignment issues this may not work on all architectures.
If alignment needs to be checked to call the right function, it's not
possible to decide at compile-time which function to call: so just leave
the decision to the memcpy function that will do exactly that.

Finally, in order to use the above macros with a map derived from
another, add another initializer: IOSYS_MAP_INIT_OFFSET().

Cc: Sumit Semwal 
Cc: Christian König 
Cc: Thomas Zimmermann 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
 include/linux/iosys-map.h | 154 +-
 1 file changed, 153 insertions(+), 1 deletion(-)

diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index edd7fa3be9e9..96f8b61ac6fb 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -6,6 +6,7 @@
 #ifndef __IOSYS_MAP_H__
 #define __IOSYS_MAP_H__
 
+#include 
 #include 
 #include 
 
@@ -133,6 +134,45 @@ static inline void iosys_map_set_vaddr(struct iosys_map 
*map, void *vaddr)
map->is_iomem = false;
 }
 
+/**
+ * IOSYS_MAP_INIT_OFFSET - Initializes struct iosys_map from another iosys_map
+ * @map_:  The dma-buf mapping structure to copy from
+ * @offset_:   Offset to add to the other mapping
+ *
+ * Initializes a new iosys_map struct based on another passed as argument. It
+ * does a shallow copy of the struct so it's possible to update the back 
storage
+ * without changing where the original map points to. It is the equivalent of
+ * doing:
+ *
+ * .. code-block: c
+ *
+ * iosys_map map = other_map;
+ * iosys_map_incr(&map, &offset);
+ *
+ * Example usage:
+ *
+ * .. code-block: c
+ *
+ * void foo(struct device *dev, struct iosys_map *base_map)
+ * {
+ * ...
+ * struct iosys_map map = IOSYS_MAP_INIT_OFFSET(base_map, 
FIELD_OFFSET);
+ * ...
+ * }
+ *
+ * The advantage of using the initializer over just increasing the offset with
+ * ``iosys_map_incr()`` like above is that the new map will always point to the
+ * right place of the buffer during  its scope. It reduces the risk of updating
+ * the wrong part of the buffer and having no compiler warning about that. If
+ * the assignment to IOSYS_MAP_INIT_OFFSET() is forgotten, the compiler can 
warn
+ * using a uninitialized variable.
+ */
+#define IOSYS_MAP_INIT_OFFSET(map_, offset_)   (struct iosys_map)  \
+   {   \
+   .vaddr = (map_)->vaddr + (offset_), \
+   .is_iomem = (map_)->is_iomem,   \
+   }
+
 /**
  * iosys_map_set_vaddr_iomem - Sets a iosys mapping structure to an address in 
I/O memory
  * @map:   The iosys_map structure
@@ -220,7 +260,7 @@ static inline void iosys_map_clear(struct iosys_map *map)
 }
 
 /**
- * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map
+ * iosys_map_memcpy_to - Memcpy into iosys_map
  * @dst:   The iosys_map structure
  * @dst_offset:The offset from which to copy
  * @src:   The source buffer
@@ -239,6 +279,26 @@ static inline void iosys_map_memcpy_to(struct iosys_map 
*dst, size_t dst_offset,
memcpy(dst->vaddr + dst_offset, src, len);
 }
 
+/**
+ * iosys_map_memcpy_from - Memcpy from iosys_map into system memory
+ * @dst:   Destination in system memory
+ * @src:   The iosys_map structure
+ * @src_offset:The offset from which to copy
+ * @len:   The number of byte in src
+ *
+ * Copies data from a iosys_map with an offset. The dest buffer is in
+ * system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void iosys_map_memcpy_from(void *dst, const struct iosys_map 
*src,
+size_t src_offset, size_t len)
+{
+

[PATCH 02/19] iosys-map: Add offset to iosys_map_memcpy_to()

2022-02-04 Thread Lucas De Marchi

In certain situations it's useful to be able to write to an
offset of the mapping. Add a dst_offset to iosys_map_memcpy_to().

Cc: Sumit Semwal 
Cc: Christian König 
Cc: Thomas Zimmermann 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/drm_cache.c |  2 +-
 drivers/gpu/drm/drm_fb_helper.c |  2 +-
 include/linux/iosys-map.h   | 17 +
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 66597e411764..c3e6e615bf09 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -218,7 +218,7 @@ static void memcpy_fallback(struct iosys_map *dst,
if (!dst->is_iomem && !src->is_iomem) {
memcpy(dst->vaddr, src->vaddr, len);
} else if (!src->is_iomem) {
-   iosys_map_memcpy_to(dst, src->vaddr, len);
+   iosys_map_memcpy_to(dst, 0, src->vaddr, len);
} else if (!dst->is_iomem) {
memcpy_fromio(dst->vaddr, src->vaddr_iomem, len);
} else {
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 238f815cb2a0..bf5cc9a42e5a 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -385,7 +385,7 @@ static void drm_fb_helper_damage_blit_real(struct 
drm_fb_helper *fb_helper,
iosys_map_incr(dst, offset); /* go to first pixel within clip rect */
 
for (y = clip->y1; y < clip->y2; y++) {
-   iosys_map_memcpy_to(dst, src, len);
+   iosys_map_memcpy_to(dst, 0, src, len);
iosys_map_incr(dst, fb->pitches[0]);
src += fb->pitches[0];
}
diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index f4186f91caa6..edd7fa3be9e9 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -220,22 +220,23 @@ static inline void iosys_map_clear(struct iosys_map *map)
 }
 
 /**
- * iosys_map_memcpy_to - Memcpy into iosys mapping
+ * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map
  * @dst:   The iosys_map structure
+ * @dst_offset:The offset from which to copy
  * @src:   The source buffer
  * @len:   The number of byte in src
  *
- * Copies data into a iosys mapping. The source buffer is in system
- * memory. Depending on the buffer's location, the helper picks the correct
- * method of accessing the memory.
+ * Copies data into a iosys_map with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
  */
-static inline void iosys_map_memcpy_to(struct iosys_map *dst, const void *src,
-  size_t len)
+static inline void iosys_map_memcpy_to(struct iosys_map *dst, size_t 
dst_offset,
+  const void *src, size_t len)
 {
if (dst->is_iomem)
-   memcpy_toio(dst->vaddr_iomem, src, len);
+   memcpy_toio(dst->vaddr_iomem + dst_offset, src, len);
else
-   memcpy(dst->vaddr, src, len);
+   memcpy(dst->vaddr + dst_offset, src, len);
 }
 
 /**
-- 
2.35.1

[PATCH 04/19] drm/i915/gt: Add helper for shmem copy to iosys_map

2022-02-04 Thread Lucas De Marchi

Add a variant of shmem_read() that takes a iosys_map pointer rather
than a plain pointer as argument. It's mostly a copy __shmem_rw() but
adapting the api and removing the write support since there's currently
only need to use iosys_map as destination.

Reworking __shmem_rw() to share the implementation was tempting, but
finding a good balance between reuse and clarity pushed towards a little
code duplication. Since the function is small, just add the similar
function with a copy/paste/adapt approach.

Cc: Matt Roper 
Cc: Joonas Lahtinen 
Cc: Tvrtko Ursulin 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 33 +++
 drivers/gpu/drm/i915/gt/shmem_utils.h |  3 +++
 2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c 
b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 0683b27a3890..764adefdb4be 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -3,6 +3,7 @@
  * Copyright © 2020 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -123,6 +124,38 @@ static int __shmem_rw(struct file *file, loff_t off,
return 0;
 }
 
+int shmem_read_to_iosys_map(struct file *file, loff_t off,
+   struct iosys_map *map, size_t len)
+{
+   struct iosys_map map_iter = *map;
+   unsigned long pfn;
+
+   for (pfn = off >> PAGE_SHIFT; len; pfn++) {
+   unsigned int this =
+   min_t(size_t, PAGE_SIZE - offset_in_page(off), len);
+   struct page *page;
+   void *vaddr;
+
+   page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
+  GFP_KERNEL);
+   if (IS_ERR(page))
+   return PTR_ERR(page);
+
+   vaddr = kmap(page);
+   iosys_map_memcpy_to(&map_iter, 0, vaddr + offset_in_page(off),
+   this);
+   mark_page_accessed(page);
+   kunmap(page);
+   put_page(page);
+
+   len -= this;
+   iosys_map_incr(&map_iter, this);
+   off = 0;
+   }
+
+   return 0;
+}
+
 int shmem_read(struct file *file, loff_t off, void *dst, size_t len)
 {
return __shmem_rw(file, off, dst, len, false);
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.h 
b/drivers/gpu/drm/i915/gt/shmem_utils.h
index c1669170c351..e1784999faee 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.h
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.h
@@ -8,6 +8,7 @@
 
 #include 
 
+struct iosys_map;
 struct drm_i915_gem_object;
 struct file;
 
@@ -17,6 +18,8 @@ struct file *shmem_create_from_object(struct 
drm_i915_gem_object *obj);
 void *shmem_pin_map(struct file *file);
 void shmem_unpin_map(struct file *file, void *ptr);
 
+int shmem_read_to_iosys_map(struct file *file, loff_t off,
+   struct iosys_map *map, size_t len);
 int shmem_read(struct file *file, loff_t off, void *dst, size_t len);
 int shmem_write(struct file *file, loff_t off, void *src, size_t len);
 
-- 
2.35.1

[PATCH 05/19] drm/i915/guc: Keep iosys_map of ads_blob around

2022-02-04 Thread Lucas De Marchi

Convert intel_guc_ads_create() and initialization to use iosys_map
rather than plain pointer and save it in the guc struct. This will help
with additional updates to the ads_blob after the
creation/initialization by abstracting the IO vs system memory.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h | 4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 6 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 697d9d66acef..9b9ba79f7594 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -6,8 +6,9 @@
 #ifndef _INTEL_GUC_H_
 #define _INTEL_GUC_H_
 
-#include 
 #include 
+#include 
+#include 
 
 #include "intel_uncore.h"
 #include "intel_guc_fw.h"
@@ -148,6 +149,7 @@ struct intel_guc {
struct i915_vma *ads_vma;
/** @ads_blob: contents of the GuC ADS */
struct __guc_ads_blob *ads_blob;
+   struct iosys_map ads_map;
/** @ads_regset_size: size of the save/restore regsets in the ADS */
u32 ads_regset_size;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index e61150adcbe9..13671b186908 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -624,6 +624,11 @@ int intel_guc_ads_create(struct intel_guc *guc)
if (ret)
return ret;
 
+   if (i915_gem_object_is_lmem(guc->ads_vma->obj))
+   iosys_map_set_vaddr_iomem(&guc->ads_map, (void __iomem 
*)guc->ads_blob);
+   else
+   iosys_map_set_vaddr(&guc->ads_map, guc->ads_blob);
+
__guc_ads_init(guc);
 
return 0;
@@ -645,6 +650,7 @@ void intel_guc_ads_destroy(struct intel_guc *guc)
 {
i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
guc->ads_blob = NULL;
+   iosys_map_clear(&guc->ads_map);
 }
 
 static void guc_ads_private_data_reset(struct intel_guc *guc)
-- 
2.35.1

[PATCH 01/19] dma-buf-map: Rename to iosys-map

2022-02-04 Thread Lucas De Marchi

Rename struct dma_buf_map to struct iosys_map and corresponding APIs.
Over time dma-buf-map grew up to more functionality than the one used by
dma-buf: in fact it's just a shim layer to abstract system memory, that
can be accessed via regular load and store, from IO memory that needs to
be acessed via arch helpers.

The idea is to extend this API so it can fulfill other needs, internal
to a single driver. Example: in the i915 driver it's desired to share
the implementation for integrated graphics, which uses mostly system
memory, with discrete graphics, which may need to access IO memory.

The conversion was mostly done with the following semantic patch:

@r1@
@@
- struct dma_buf_map
+ struct iosys_map

@r2@
@@
(
- DMA_BUF_MAP_INIT_VADDR
+ IOSYS_MAP_INIT_VADDR
|
- dma_buf_map_set_vaddr
+ iosys_map_set_vaddr
|
- dma_buf_map_set_vaddr_iomem
+ iosys_map_set_vaddr_iomem
|
- dma_buf_map_is_equal
+ iosys_map_is_equal
|
- dma_buf_map_is_null
+ iosys_map_is_null
|
- dma_buf_map_is_set
+ iosys_map_is_set
|
- dma_buf_map_clear
+ iosys_map_clear
|
- dma_buf_map_memcpy_to
+ iosys_map_memcpy_to
|
- dma_buf_map_incr
+ iosys_map_incr
)

@@
@@
- #include 
+ #include 

Then some files had their includes adjusted and some comments were
update to remove mentions to dma-buf-map.

Since this is not specific to dma-buf anymore, move the documentation to
the "Bus-Independent Device Accesses" section.

v2:
  - Squash patches

v3:
  - Fix wrong removal of dma-buf.h from MAINTAINERS
  - Move documentation from dma-buf.rst to device-io.rst

v4:
  - Change documentation tile and level

Signed-off-by: Lucas De Marchi 
Acked-by: Christian König 
Acked-by: Sumit Semwal 
Acked-by: Thomas Zimmermann 
---
 Documentation/driver-api/device-io.rst|   9 +
 Documentation/driver-api/dma-buf.rst  |   9 -
 Documentation/gpu/todo.rst|  20 +-
 MAINTAINERS   |   9 +-
 drivers/dma-buf/dma-buf.c |  22 +-
 drivers/dma-buf/heaps/cma_heap.c  |  10 +-
 drivers/dma-buf/heaps/system_heap.c   |  10 +-
 drivers/gpu/drm/ast/ast_drv.h |   2 +-
 drivers/gpu/drm/ast/ast_mode.c|   8 +-
 drivers/gpu/drm/drm_cache.c   |  18 +-
 drivers/gpu/drm/drm_client.c  |   9 +-
 drivers/gpu/drm/drm_fb_helper.c   |  12 +-
 drivers/gpu/drm/drm_gem.c |  12 +-
 drivers/gpu/drm/drm_gem_cma_helper.c  |   9 +-
 drivers/gpu/drm/drm_gem_framebuffer_helper.c  |  16 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c|  15 +-
 drivers/gpu/drm/drm_gem_ttm_helper.c  |   4 +-
 drivers/gpu/drm/drm_gem_vram_helper.c |  25 +-
 drivers/gpu/drm/drm_internal.h|   6 +-
 drivers/gpu/drm/drm_mipi_dbi.c|   8 +-
 drivers/gpu/drm/drm_prime.c   |   4 +-
 drivers/gpu/drm/etnaviv/etnaviv_drv.h |   2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c   |   8 +-
 drivers/gpu/drm/gud/gud_pipe.c|   4 +-
 drivers/gpu/drm/hyperv/hyperv_drm_modeset.c   |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   8 +-
 .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |   6 +-
 .../gpu/drm/i915/gem/selftests/mock_dmabuf.c  |   6 +-
 drivers/gpu/drm/lima/lima_gem.c   |   3 +-
 drivers/gpu/drm/lima/lima_sched.c |   4 +-
 drivers/gpu/drm/mediatek/mtk_drm_gem.c|   7 +-
 drivers/gpu/drm/mediatek/mtk_drm_gem.h|   5 +-
 drivers/gpu/drm/mgag200/mgag200_mode.c|   4 +-
 drivers/gpu/drm/msm/msm_drv.h |   4 +-
 drivers/gpu/drm/msm/msm_gem_prime.c   |   6 +-
 drivers/gpu/drm/panfrost/panfrost_perfcnt.c   |  13 +-
 drivers/gpu/drm/qxl/qxl_display.c |   8 +-
 drivers/gpu/drm/qxl/qxl_draw.c|   6 +-
 drivers/gpu/drm/qxl/qxl_drv.h |  10 +-
 drivers/gpu/drm/qxl/qxl_object.c  |   8 +-
 drivers/gpu/drm/qxl/qxl_object.h  |   4 +-
 drivers/gpu/drm/qxl/qxl_prime.c   |   4 +-
 drivers/gpu/drm/radeon/radeon_gem.c   |   1 +
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c   |   9 +-
 drivers/gpu/drm/rockchip/rockchip_drm_gem.h   |   5 +-
 drivers/gpu/drm/tegra/gem.c   |  10 +-
 drivers/gpu/drm/tiny/cirrus.c |   8 +-
 drivers/gpu/drm/tiny/gm12u320.c   |   7 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c |  16 +-
 drivers/gpu/drm/ttm/ttm_resource.c|  42 +--
 drivers/gpu/drm/ttm/ttm_tt.c  |   8 +-
 drivers/gpu/drm/udl/udl_modeset.c |   3 +-
 drivers/gpu/drm/vboxvideo/vbox_mode.c |   4 +-
 drivers/gpu/drm/vkms/vkms_c

[PATCH 06/19] drm/i915/guc: Add read/write helpers for ADS blob

2022-02-04 Thread Lucas De Marchi

Add helpers on top of iosys_map_read_field() /
iosys_map_write_field() functions so they always use the right
arguments and make code easier to read.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 13671b186908..3a0afce7564e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -60,6 +60,13 @@ struct __guc_ads_blob {
struct guc_mmio_reg regset[0];
 } __packed;
 
+#define ads_blob_read(guc_, field_)\
+   iosys_map_rd_field(&(guc_)->ads_map, struct __guc_ads_blob, field_)
+
+#define ads_blob_write(guc_, field_, val_) \
+   iosys_map_wr_field(&(guc_)->ads_map, struct __guc_ads_blob, \
+  field_, val_)
+
 static u32 guc_ads_regset_size(struct intel_guc *guc)
 {
GEM_BUG_ON(!guc->ads_regset_size);
-- 
2.35.1

[PATCH 08/19] drm/i915/guc: Convert policies update to iosys_map

2022-02-04 Thread Lucas De Marchi

Use iosys_map to write the policies update so access to IO and system
memory is abstracted away.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 41 --
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index d32b407a2d25..6311b9da87e4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -130,33 +130,37 @@ static u32 guc_ads_blob_size(struct intel_guc *guc)
   guc_ads_private_data_size(guc);
 }
 
-static void guc_policies_init(struct intel_guc *guc, struct guc_policies 
*policies)
+static void guc_policies_init(struct intel_guc *guc)
 {
struct intel_gt *gt = guc_to_gt(guc);
struct drm_i915_private *i915 = gt->i915;
+   u32 global_flags = 0;
 
-   policies->dpc_promote_time = GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US;
-   policies->max_num_work_items = GLOBAL_POLICY_MAX_NUM_WI;
+   ads_blob_write(guc, policies.dpc_promote_time,
+  GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US);
+   ads_blob_write(guc, policies.max_num_work_items,
+  GLOBAL_POLICY_MAX_NUM_WI);
 
-   policies->global_flags = 0;
if (i915->params.reset < 2)
-   policies->global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET;
+   global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET;
 
-   policies->is_valid = 1;
+   ads_blob_write(guc, policies.global_flags, global_flags);
+   ads_blob_write(guc, policies.is_valid, 1);
 }
 
 void intel_guc_ads_print_policy_info(struct intel_guc *guc,
 struct drm_printer *dp)
 {
-   struct __guc_ads_blob *blob = guc->ads_blob;
-
-   if (unlikely(!blob))
+   if (unlikely(iosys_map_is_null(&guc->ads_map)))
return;
 
drm_printf(dp, "Global scheduling policies:\n");
-   drm_printf(dp, "  DPC promote time   = %u\n", 
blob->policies.dpc_promote_time);
-   drm_printf(dp, "  Max num work items = %u\n", 
blob->policies.max_num_work_items);
-   drm_printf(dp, "  Flags  = %u\n", 
blob->policies.global_flags);
+   drm_printf(dp, "  DPC promote time   = %u\n",
+  ads_blob_read(guc, policies.dpc_promote_time));
+   drm_printf(dp, "  Max num work items = %u\n",
+  ads_blob_read(guc, policies.max_num_work_items));
+   drm_printf(dp, "  Flags  = %u\n",
+  ads_blob_read(guc, policies.global_flags));
 }
 
 static int guc_action_policies_update(struct intel_guc *guc, u32 policy_offset)
@@ -171,23 +175,24 @@ static int guc_action_policies_update(struct intel_guc 
*guc, u32 policy_offset)
 
 int intel_guc_global_policies_update(struct intel_guc *guc)
 {
-   struct __guc_ads_blob *blob = guc->ads_blob;
struct intel_gt *gt = guc_to_gt(guc);
+   u32 scheduler_policies;
intel_wakeref_t wakeref;
int ret;
 
-   if (!blob)
+   if (iosys_map_is_null(&guc->ads_map))
return -EOPNOTSUPP;
 
-   GEM_BUG_ON(!blob->ads.scheduler_policies);
+   scheduler_policies = ads_blob_read(guc, ads.scheduler_policies);
+   GEM_BUG_ON(!scheduler_policies);
 
-   guc_policies_init(guc, &blob->policies);
+   guc_policies_init(guc);
 
if (!intel_guc_is_ready(guc))
return 0;
 
with_intel_runtime_pm(>->i915->runtime_pm, wakeref)
-   ret = guc_action_policies_update(guc, 
blob->ads.scheduler_policies);
+   ret = guc_action_policies_update(guc, scheduler_policies);
 
return ret;
 }
@@ -557,7 +562,7 @@ static void __guc_ads_init(struct intel_guc *guc)
u32 base;
 
/* GuC scheduling policies */
-   guc_policies_init(guc, &blob->policies);
+   guc_policies_init(guc);
 
/* System info */
fill_engine_enable_masks(gt, &blob->system_info);
-- 
2.35.1

[PATCH 07/19] drm/i915/guc: Convert golden context init to iosys_map

2022-02-04 Thread Lucas De Marchi

Now the map is saved during creation, so use it to initialize the
golden context, reading from shmem and writing to either system or IO
memory.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 +++---
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 3a0afce7564e..d32b407a2d25 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -473,18 +473,17 @@ static struct intel_engine_cs *find_engine_state(struct 
intel_gt *gt, u8 engine_
 
 static void guc_init_golden_context(struct intel_guc *guc)
 {
-   struct __guc_ads_blob *blob = guc->ads_blob;
struct intel_engine_cs *engine;
struct intel_gt *gt = guc_to_gt(guc);
+   struct iosys_map golden_context_map;
u32 addr_ggtt, offset;
u32 total_size = 0, alloc_size, real_size;
u8 engine_class, guc_class;
-   u8 *ptr;
 
if (!intel_uc_uses_guc_submission(>->uc))
return;
 
-   GEM_BUG_ON(!blob);
+   GEM_BUG_ON(iosys_map_is_null(&guc->ads_map));
 
/*
 * Go back and fill in the golden context data now that it is
@@ -492,15 +491,15 @@ static void guc_init_golden_context(struct intel_guc *guc)
 */
offset = guc_ads_golden_ctxt_offset(guc);
addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
-   ptr = ((u8 *)blob) + offset;
+
+   golden_context_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset);
 
for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; 
++engine_class) {
if (engine_class == OTHER_CLASS)
continue;
 
guc_class = engine_class_to_guc_class(engine_class);
-
-   if (!blob->system_info.engine_enabled_masks[guc_class])
+   if (!ads_blob_read(guc, 
system_info.engine_enabled_masks[guc_class]))
continue;
 
real_size = intel_engine_context_size(gt, engine_class);
@@ -511,18 +510,20 @@ static void guc_init_golden_context(struct intel_guc *guc)
if (!engine) {
drm_err(>->i915->drm, "No engine state recorded for 
class %d!\n",
engine_class);
-   blob->ads.eng_state_size[guc_class] = 0;
-   blob->ads.golden_context_lrca[guc_class] = 0;
+   ads_blob_write(guc, ads.eng_state_size[guc_class], 0);
+   ads_blob_write(guc, ads.golden_context_lrca[guc_class], 
0);
continue;
}
 
-   GEM_BUG_ON(blob->ads.eng_state_size[guc_class] !=
+   GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) !=
   real_size - LRC_SKIP_SIZE);
-   GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != 
addr_ggtt);
+   GEM_BUG_ON(ads_blob_read(guc, 
ads.golden_context_lrca[guc_class]) != addr_ggtt);
+
addr_ggtt += alloc_size;
 
-   shmem_read(engine->default_state, 0, ptr, real_size);
-   ptr += alloc_size;
+   shmem_read_to_iosys_map(engine->default_state, 0,
+   &golden_context_map, real_size);
+   iosys_map_incr(&golden_context_map, alloc_size);
}
 
GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
-- 
2.35.1

[PATCH 13/19] drm/i915/guc: Convert mapping table to iosys_map

2022-02-04 Thread Lucas De Marchi

Use iosys_map to write the fields system_info.mapping_table[][].
Since we already have the info_map around where needed, just use it
instead of going through guc->ads_map.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 098a4756e8c5..9e96a44a6bbc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -204,7 +204,7 @@ int intel_guc_global_policies_update(struct intel_guc *guc)
 }
 
 static void guc_mapping_table_init(struct intel_gt *gt,
-  struct guc_gt_system_info *system_info)
+  struct iosys_map *info_map)
 {
unsigned int i, j;
struct intel_engine_cs *engine;
@@ -213,14 +213,14 @@ static void guc_mapping_table_init(struct intel_gt *gt,
/* Table must be set to invalid values for entries not used */
for (i = 0; i < GUC_MAX_ENGINE_CLASSES; ++i)
for (j = 0; j < GUC_MAX_INSTANCES_PER_CLASS; ++j)
-   system_info->mapping_table[i][j] =
-   GUC_MAX_INSTANCES_PER_CLASS;
+   info_map_write(info_map, mapping_table[i][j],
+  GUC_MAX_INSTANCES_PER_CLASS);
 
for_each_engine(engine, gt, id) {
u8 guc_class = engine_class_to_guc_class(engine->class);
 
-   
system_info->mapping_table[guc_class][ilog2(engine->logical_mask)] =
-   engine->instance;
+   info_map_write(info_map, 
mapping_table[guc_class][ilog2(engine->logical_mask)],
+  engine->instance);
}
 }
 
@@ -595,7 +595,7 @@ static void __guc_ads_init(struct intel_guc *guc)
/* Golden contexts for re-initialising after a watchdog reset */
guc_prep_golden_context(guc);
 
-   guc_mapping_table_init(guc_to_gt(guc), &blob->system_info);
+   guc_mapping_table_init(guc_to_gt(guc), &info_map);
 
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
 
-- 
2.35.1

[PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset

2022-02-04 Thread Lucas De Marchi

The ADS initialitazion was using 2 passes to calculate the regset sent
to GuC to initialize each engine: the first pass to just have the final
object size and the second to set each register in place in the final
gem object.

However in order to maintain an ordered set of registers to pass to guc,
each register needs to be added and moved in the final array. The second
phase may actually happen in IO memory rather than system memory and
accessing IO memory by simply dereferencing the pointer doesn't work on
all architectures. Other places of the ADS initializaition were
converted to use the iosys_map API, but here there may be a lot more
accesses to IO memory. So, instead of following that same approach,
convert the regset initialization to calculate the final array in 1
pass and in the second pass that array is just copied to its final
location, updating the pointers for each engine written to the ADS blob.

One important thing is that struct temp_regset now have
different semantics: `registers` continues to track the registers of a
single engine, however the other fields are updated together, according
to the newly added `storage`, which tracks the memory allocated for
all the registers. So rename some of these fields and add a
__mmio_reg_add(): this function (possibly) allocates memory and operates
on the storage pointer while guc_mmio_reg_add() continues to manage the
registers pointer.

On a Tiger Lake system using enable_guc=3, the following log message is
now seen:

[  187.334310] i915 :00:02.0: [drm:intel_guc_ads_create [i915]] 
Used 4 KB for temporary ADS regset

This change has also been tested on an ARM64 host with DG2 and other
discrete graphics cards.

v2 (Daniele):
  - Fix leaking tempset on error path
  - Add comments on struct temp_regset to document the meaning of each
field

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h |   7 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 128 +
 2 files changed, 90 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 9b9ba79f7594..f857e9190750 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -152,6 +152,13 @@ struct intel_guc {
struct iosys_map ads_map;
/** @ads_regset_size: size of the save/restore regsets in the ADS */
u32 ads_regset_size;
+   /**
+* @ads_regset_count: number of save/restore registers in the ADS for
+* each engine
+*/
+   u32 ads_regset_count[I915_NUM_ENGINES];
+   /** @ads_regset: save/restore regsets in the ADS */
+   struct guc_mmio_reg *ads_regset;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
u32 ads_golden_ctxt_size;
/** @ads_engine_usage_size: size of engine usage in the ADS */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 3a4558948c31..c040d8d8d7a4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -226,14 +226,18 @@ static void guc_mapping_table_init(struct intel_gt *gt,
 
 /*
  * The save/restore register list must be pre-calculated to a temporary
- * buffer of driver defined size before it can be generated in place
- * inside the ADS.
+ * buffer before it can be copied inside the ADS.
  */
-#define MAX_MMIO_REGS  128 /* Arbitrary size, increase as needed */
 struct temp_regset {
+   /*
+* ptr to the section of the storage for the engine currently being
+* worked on
+*/
struct guc_mmio_reg *registers;
-   u32 used;
-   u32 size;
+   /* ptr to the base of the allocated storage for all engines */
+   struct guc_mmio_reg *storage;
+   u32 storage_used;
+   u32 storage_max;
 };
 
 static int guc_mmio_reg_cmp(const void *a, const void *b)
@@ -244,18 +248,44 @@ static int guc_mmio_reg_cmp(const void *a, const void *b)
return (int)ra->offset - (int)rb->offset;
 }
 
+static struct guc_mmio_reg * __must_check
+__mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg)
+{
+   u32 pos = regset->storage_used;
+   struct guc_mmio_reg *slot;
+
+   if (pos >= regset->storage_max) {
+   size_t size = ALIGN((pos + 1) * sizeof(*slot), PAGE_SIZE);
+   struct guc_mmio_reg *r = krealloc(regset->storage,
+ size, GFP_KERNEL);
+   if (!r) {
+   WARN_ONCE(1, "Incomplete regset list: can't add 
register (%d)\n",
+ -ENOMEM);
+   return ERR_PTR(-ENOMEM);
+   }
+
+   regset->registers = r + (regset->registers - regset->storage);
+

[PATCH 09/19] drm/i915/guc: Convert engine record to iosys_map

2022-02-04 Thread Lucas De Marchi

Use iosys_map to read fields from the dma_blob so access to IO and
system memory is abstracted away.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  | 14 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h  |  3 ++-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c   | 17 ++---
 3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 6311b9da87e4..1d21a2d457e0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -698,18 +698,16 @@ void intel_guc_ads_reset(struct intel_guc *guc)
 
 u32 intel_guc_engine_usage_offset(struct intel_guc *guc)
 {
-   struct __guc_ads_blob *blob = guc->ads_blob;
-   u32 base = intel_guc_ggtt_offset(guc, guc->ads_vma);
-   u32 offset = base + ptr_offset(blob, engine_usage);
-
-   return offset;
+   return intel_guc_ggtt_offset(guc, guc->ads_vma) +
+   offsetof(struct __guc_ads_blob, engine_usage);
 }
 
-struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs 
*engine)
+struct iosys_map intel_guc_engine_usage_record_map(struct intel_engine_cs 
*engine)
 {
struct intel_guc *guc = &engine->gt->uc.guc;
-   struct __guc_ads_blob *blob = guc->ads_blob;
u8 guc_class = engine_class_to_guc_class(engine->class);
+   size_t offset = offsetof(struct __guc_ads_blob,
+
engine_usage.engines[guc_class][ilog2(engine->logical_mask)]);
 
-   return 
&blob->engine_usage.engines[guc_class][ilog2(engine->logical_mask)];
+   return IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset);
 }
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
index e74c110facff..1c64f4d6ea21 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
@@ -7,6 +7,7 @@
 #define _INTEL_GUC_ADS_H_
 
 #include 
+#include 
 
 struct intel_guc;
 struct drm_printer;
@@ -18,7 +19,7 @@ void intel_guc_ads_init_late(struct intel_guc *guc);
 void intel_guc_ads_reset(struct intel_guc *guc);
 void intel_guc_ads_print_policy_info(struct intel_guc *guc,
 struct drm_printer *p);
-struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs 
*engine);
+struct iosys_map intel_guc_engine_usage_record_map(struct intel_engine_cs 
*engine);
 u32 intel_guc_engine_usage_offset(struct intel_guc *guc);
 
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b3a429a92c0d..6d34842f68b4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1139,6 +1139,9 @@ __extend_last_switch(struct intel_guc *guc, u64 
*prev_start, u32 new_start)
*prev_start = ((u64)gt_stamp_hi << 32) | new_start;
 }
 
+#define record_read(map_, field_) \
+   iosys_map_rd_field(map_, struct guc_engine_usage_record, field_)
+
 /*
  * GuC updates shared memory and KMD reads it. Since this is not synchronized,
  * we run into a race where the value read is inconsistent. Sometimes the
@@ -1153,17 +1156,17 @@ __extend_last_switch(struct intel_guc *guc, u64 
*prev_start, u32 new_start)
 static void __get_engine_usage_record(struct intel_engine_cs *engine,
  u32 *last_in, u32 *id, u32 *total)
 {
-   struct guc_engine_usage_record *rec = intel_guc_engine_usage(engine);
+   struct iosys_map rec_map = intel_guc_engine_usage_record_map(engine);
int i = 0;
 
do {
-   *last_in = READ_ONCE(rec->last_switch_in_stamp);
-   *id = READ_ONCE(rec->current_context_index);
-   *total = READ_ONCE(rec->total_runtime);
+   *last_in = record_read(&rec_map, last_switch_in_stamp);
+   *id = record_read(&rec_map, current_context_index);
+   *total = record_read(&rec_map, total_runtime);
 
-   if (READ_ONCE(rec->last_switch_in_stamp) == *last_in &&
-   READ_ONCE(rec->current_context_index) == *id &&
-   READ_ONCE(rec->total_runtime) == *total)
+   if (record_read(&rec_map, last_switch_in_stamp) == *last_in &&
+   record_read(&rec_map, current_context_index) == *id &&
+   record_read(&rec_map, total_runtime) == *total)
break;
} while (++i < 6);
 }
-- 
2.35.1

[PATCH 12/19] drm/i915/guc: Replace check for golden context size

2022-02-04 Thread Lucas De Marchi

In the other places in this function, guc->ads_map is being protected
from access when it's not yet set. However the last check is actually
about guc->ads_golden_ctxt_size been set before.  These checks should
always match as the size is initialized on the first call to
guc_prep_golden_context(), but it's clearer if we have a single return
and check for guc->ads_golden_ctxt_size.

This is just a readability improvement, no change in behavior.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 80fbab831536..098a4756e8c5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -461,10 +461,10 @@ static int guc_prep_golden_context(struct intel_guc *guc)
addr_ggtt += alloc_size;
}
 
-   if (iosys_map_is_null(&guc->ads_map))
-   return total_size;
+   /* Make sure current size matches what we calculated previously */
+   if (guc->ads_golden_ctxt_size)
+   GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
 
-   GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
return total_size;
 }
 
-- 
2.35.1

[PATCH 19/19] drm/i915/guc: Remove plain ads_blob pointer

2022-02-04 Thread Lucas De Marchi

Now we have the access to content of GuC ADS either using iosys_map
API or using a temporary buffer. Remove guc->ads_blob as there shouldn't
be updates using the bare pointer anymore.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h | 3 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 8 
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index f857e9190750..bf7079480d47 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -147,8 +147,7 @@ struct intel_guc {
 
/** @ads_vma: object allocated to hold the GuC ADS */
struct i915_vma *ads_vma;
-   /** @ads_blob: contents of the GuC ADS */
-   struct __guc_ads_blob *ads_blob;
+   /** @ads_map: contents of the GuC ADS */
struct iosys_map ads_map;
/** @ads_regset_size: size of the save/restore regsets in the ADS */
u32 ads_regset_size;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 6262fd4e0d4a..7dd44b6d76da 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -672,6 +672,7 @@ static void __guc_ads_init(struct intel_guc *guc)
  */
 int intel_guc_ads_create(struct intel_guc *guc)
 {
+   void *ads_blob;
u32 size;
int ret;
 
@@ -696,14 +697,14 @@ int intel_guc_ads_create(struct intel_guc *guc)
size = guc_ads_blob_size(guc);
 
ret = intel_guc_allocate_and_map_vma(guc, size, &guc->ads_vma,
-(void **)&guc->ads_blob);
+&ads_blob);
if (ret)
return ret;
 
if (i915_gem_object_is_lmem(guc->ads_vma->obj))
-   iosys_map_set_vaddr_iomem(&guc->ads_map, (void __iomem 
*)guc->ads_blob);
+   iosys_map_set_vaddr_iomem(&guc->ads_map, (void __iomem 
*)ads_blob);
else
-   iosys_map_set_vaddr(&guc->ads_map, guc->ads_blob);
+   iosys_map_set_vaddr(&guc->ads_map, ads_blob);
 
__guc_ads_init(guc);
 
@@ -725,7 +726,6 @@ void intel_guc_ads_init_late(struct intel_guc *guc)
 void intel_guc_ads_destroy(struct intel_guc *guc)
 {
i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
-   guc->ads_blob = NULL;
iosys_map_clear(&guc->ads_map);
kfree(guc->ads_regset);
 }
-- 
2.35.1

[PATCH 10/19] drm/i915/guc: Convert guc_ads_private_data_reset to iosys_map

2022-02-04 Thread Lucas De Marchi

Use iosys_map_memset() to zero the private data as ADS may be either
on system or IO memory.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 1d21a2d457e0..973762ce2196 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -674,8 +674,8 @@ static void guc_ads_private_data_reset(struct intel_guc 
*guc)
if (!size)
return;
 
-   memset((void *)guc->ads_blob + guc_ads_private_data_offset(guc), 0,
-  size);
+   iosys_map_memset(&guc->ads_map, guc_ads_private_data_offset(guc),
+0, size);
 }
 
 /**
-- 
2.35.1

[PATCH 17/19] drm/i915/guc: Convert guc_mmio_reg_state_init to iosys_map

2022-02-04 Thread Lucas De Marchi

Now that the regset list is prepared, convert guc_mmio_reg_state_init()
to use iosys_map to copy the array to the final location and
initialize additional fields in ads.reg_state_list.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 30 +-
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index c040d8d8d7a4..cf6fafa1024c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -383,40 +383,46 @@ static long guc_mmio_reg_state_create(struct intel_guc 
*guc)
return ret;
 }
 
-static void guc_mmio_reg_state_init(struct intel_guc *guc,
-   struct __guc_ads_blob *blob)
+static void guc_mmio_reg_state_init(struct intel_guc *guc)
 {
+   struct iosys_map ads_regset_map;
struct intel_gt *gt = guc_to_gt(guc);
struct intel_engine_cs *engine;
-   struct guc_mmio_reg *ads_registers;
enum intel_engine_id id;
u32 addr_ggtt, offset;
 
offset = guc_ads_regset_offset(guc);
addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
-   ads_registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset);
+   ads_regset_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset);
 
-   memcpy(ads_registers, guc->ads_regset, guc->ads_regset_size);
+   iosys_map_memcpy_to(&ads_regset_map, 0, guc->ads_regset,
+   guc->ads_regset_size);
 
for_each_engine(engine, gt, id) {
u32 count = guc->ads_regset_count[id];
-   struct guc_mmio_reg_set *ads_reg_set;
u8 guc_class;
 
/* Class index is checked in class converter */
GEM_BUG_ON(engine->instance >= GUC_MAX_INSTANCES_PER_CLASS);
 
guc_class = engine_class_to_guc_class(engine->class);
-   ads_reg_set = 
&blob->ads.reg_state_list[guc_class][engine->instance];
 
if (!count) {
-   ads_reg_set->address = 0;
-   ads_reg_set->count = 0;
+   ads_blob_write(guc,
+  
ads.reg_state_list[guc_class][engine->instance].address,
+  0);
+   ads_blob_write(guc,
+  
ads.reg_state_list[guc_class][engine->instance].count,
+  0);
continue;
}
 
-   ads_reg_set->address = addr_ggtt;
-   ads_reg_set->count = count;
+   ads_blob_write(guc,
+  
ads.reg_state_list[guc_class][engine->instance].address,
+  addr_ggtt);
+   ads_blob_write(guc,
+  
ads.reg_state_list[guc_class][engine->instance].count,
+  count);
 
addr_ggtt += count * sizeof(struct guc_mmio_reg);
}
@@ -646,7 +652,7 @@ static void __guc_ads_init(struct intel_guc *guc)
blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
 
/* MMIO save/restore list */
-   guc_mmio_reg_state_init(guc, blob);
+   guc_mmio_reg_state_init(guc);
 
/* Private Data */
blob->ads.private_data = base + guc_ads_private_data_offset(guc);
-- 
2.35.1

[PATCH 14/19] drm/i915/guc: Convert capture list to iosys_map

2022-02-04 Thread Lucas De Marchi

Use iosys_map to write the fields ads.capture_*.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 9e96a44a6bbc..245b703568ff 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -544,7 +544,7 @@ static void guc_init_golden_context(struct intel_guc *guc)
GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
 }
 
-static void guc_capture_list_init(struct intel_guc *guc, struct __guc_ads_blob 
*blob)
+static void guc_capture_list_init(struct intel_guc *guc)
 {
int i, j;
u32 addr_ggtt, offset;
@@ -556,11 +556,11 @@ static void guc_capture_list_init(struct intel_guc *guc, 
struct __guc_ads_blob *
 
for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) {
for (j = 0; j < GUC_MAX_ENGINE_CLASSES; j++) {
-   blob->ads.capture_instance[i][j] = addr_ggtt;
-   blob->ads.capture_class[i][j] = addr_ggtt;
+   ads_blob_write(guc, ads.capture_instance[i][j], 
addr_ggtt);
+   ads_blob_write(guc, ads.capture_class[i][j], addr_ggtt);
}
 
-   blob->ads.capture_global[i] = addr_ggtt;
+   ads_blob_write(guc, ads.capture_global[i], addr_ggtt);
}
 }
 
@@ -600,7 +600,7 @@ static void __guc_ads_init(struct intel_guc *guc)
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
 
/* Capture list for hang debug */
-   guc_capture_list_init(guc, blob);
+   guc_capture_list_init(guc);
 
/* ADS */
blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
-- 
2.35.1

[PATCH 15/19] drm/i915/guc: Prepare for error propagation

2022-02-04 Thread Lucas De Marchi

Currently guc_mmio_reg_add() relies on having enough memory available in
the array to add a new slot. It uses
`GEM_BUG_ON(count >= regset->size);` to protect going above the
threshold.

In order to allow guc_mmio_reg_add() to handle the memory allocation by
itself, it must return an error in case of failures.  Adjust return code
so this error can be propagated to the callers of guc_mmio_reg_add() and
guc_mmio_regset_init().

No intended change in behavior.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 31 +-
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 245b703568ff..3a4558948c31 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -244,8 +244,8 @@ static int guc_mmio_reg_cmp(const void *a, const void *b)
return (int)ra->offset - (int)rb->offset;
 }
 
-static void guc_mmio_reg_add(struct temp_regset *regset,
-u32 offset, u32 flags)
+static long __must_check guc_mmio_reg_add(struct temp_regset *regset,
+ u32 offset, u32 flags)
 {
u32 count = regset->used;
struct guc_mmio_reg reg = {
@@ -264,7 +264,7 @@ static void guc_mmio_reg_add(struct temp_regset *regset,
 */
if (bsearch(®, regset->registers, count,
sizeof(reg), guc_mmio_reg_cmp))
-   return;
+   return 0;
 
slot = ®set->registers[count];
regset->used++;
@@ -277,6 +277,8 @@ static void guc_mmio_reg_add(struct temp_regset *regset,
 
swap(slot[1], slot[0]);
}
+
+   return 0;
 }
 
 #define GUC_MMIO_REG_ADD(regset, reg, masked) \
@@ -284,32 +286,35 @@ static void guc_mmio_reg_add(struct temp_regset *regset,
 i915_mmio_reg_offset((reg)), \
 (masked) ? GUC_REGSET_MASKED : 0)
 
-static void guc_mmio_regset_init(struct temp_regset *regset,
-struct intel_engine_cs *engine)
+static int guc_mmio_regset_init(struct temp_regset *regset,
+   struct intel_engine_cs *engine)
 {
const u32 base = engine->mmio_base;
struct i915_wa_list *wal = &engine->wa_list;
struct i915_wa *wa;
unsigned int i;
+   int ret = 0;
 
regset->used = 0;
 
-   GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true);
-   GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
-   GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);
+   ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true);
+   ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
+   ret |= GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);
 
for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
-   GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg);
+   ret |= GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg);
 
/* Be extra paranoid and include all whitelist registers. */
for (i = 0; i < RING_MAX_NONPRIV_SLOTS; i++)
-   GUC_MMIO_REG_ADD(regset,
-RING_FORCE_TO_NONPRIV(base, i),
-false);
+   ret |= GUC_MMIO_REG_ADD(regset,
+   RING_FORCE_TO_NONPRIV(base, i),
+   false);
 
/* add in local MOCS registers */
for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++)
-   GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false);
+   ret |= GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false);
+
+   return ret ? -1 : 0;
 }
 
 static int guc_mmio_reg_state_query(struct intel_guc *guc)
-- 
2.35.1

[PATCH 18/19] drm/i915/guc: Convert __guc_ads_init to iosys_map

2022-02-04 Thread Lucas De Marchi

Now that all the called functions from __guc_ads_init() are converted to
use ads_map, stop using ads_blob in __guc_ads_init().

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 --
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index cf6fafa1024c..6262fd4e0d4a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -613,7 +613,6 @@ static void __guc_ads_init(struct intel_guc *guc)
 {
struct intel_gt *gt = guc_to_gt(guc);
struct drm_i915_private *i915 = gt->i915;
-   struct __guc_ads_blob *blob = guc->ads_blob;
struct iosys_map info_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map,
offsetof(struct __guc_ads_blob, system_info));
u32 base;
@@ -624,17 +623,18 @@ static void __guc_ads_init(struct intel_guc *guc)
/* System info */
fill_engine_enable_masks(gt, &info_map);
 
-   
blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED] =
-   hweight8(gt->info.sseu.slice_mask);
-   
blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK]
 =
-   gt->info.vdbox_sfc_access;
+   ads_blob_write(guc, 
system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED],
+  hweight8(gt->info.sseu.slice_mask));
+   ads_blob_write(guc, 
system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK],
+  gt->info.vdbox_sfc_access);
 
if (GRAPHICS_VER(i915) >= 12 && !IS_DGFX(i915)) {
u32 distdbreg = intel_uncore_read(gt->uncore,
  GEN12_DIST_DBS_POPULATED);
-   
blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI]
 =
-   ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT) &
-GEN12_DOORBELLS_PER_SQIDI) + 1;
+   ads_blob_write(guc,
+  
system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI],
+  ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT)
+   & GEN12_DOORBELLS_PER_SQIDI) + 1);
}
 
/* Golden contexts for re-initialising after a watchdog reset */
@@ -648,14 +648,17 @@ static void __guc_ads_init(struct intel_guc *guc)
guc_capture_list_init(guc);
 
/* ADS */
-   blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
-   blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
+   ads_blob_write(guc, ads.scheduler_policies, base +
+  offsetof(struct __guc_ads_blob, policies));
+   ads_blob_write(guc, ads.gt_system_info, base +
+  offsetof(struct __guc_ads_blob, system_info));
 
/* MMIO save/restore list */
guc_mmio_reg_state_init(guc);
 
/* Private Data */
-   blob->ads.private_data = base + guc_ads_private_data_offset(guc);
+   ads_blob_write(guc, ads.private_data, base +
+  guc_ads_private_data_offset(guc));
 
i915_gem_object_flush_map(guc->ads_vma->obj);
 }
-- 
2.35.1

[PATCH 11/19] drm/i915/guc: Convert golden context prep to iosys_map

2022-02-04 Thread Lucas De Marchi

Use the saved ads_map to prepare the golden context. One difference from
the init context is that this function can be called before there is a
gem object (and thus the guc->ads_map) to calculare the size of the
golden context that should be allocated for that object.

So in this case the function needs to be prepared for not having the
system_info with enabled engines filled out. To accomplish that an
info_map is prepared on the side to point either to the gem object
or the local variable on the stack. This allows making
fill_engine_enable_masks() operate always with a iosys_map
argument.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 52 +-
 1 file changed, 32 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 973762ce2196..80fbab831536 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -67,6 +67,12 @@ struct __guc_ads_blob {
iosys_map_wr_field(&(guc_)->ads_map, struct __guc_ads_blob, \
   field_, val_)
 
+#define info_map_write(map_, field_, val_) \
+   iosys_map_wr_field(map_, struct guc_gt_system_info, field_, val_)
+
+#define info_map_read(map_, field_) \
+   iosys_map_rd_field(map_, struct guc_gt_system_info, field_)
+
 static u32 guc_ads_regset_size(struct intel_guc *guc)
 {
GEM_BUG_ON(!guc->ads_regset_size);
@@ -378,24 +384,24 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc,
 }
 
 static void fill_engine_enable_masks(struct intel_gt *gt,
-struct guc_gt_system_info *info)
+struct iosys_map *info_map)
 {
-   info->engine_enabled_masks[GUC_RENDER_CLASS] = 1;
-   info->engine_enabled_masks[GUC_BLITTER_CLASS] = 1;
-   info->engine_enabled_masks[GUC_VIDEO_CLASS] = VDBOX_MASK(gt);
-   info->engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = VEBOX_MASK(gt);
+   info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1);
+   info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1);
+   info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], 
VDBOX_MASK(gt));
+   info_map_write(info_map, engine_enabled_masks[GUC_VIDEOENHANCE_CLASS], 
VEBOX_MASK(gt));
 }
 
 #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
 #define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
-static int guc_prep_golden_context(struct intel_guc *guc,
-  struct __guc_ads_blob *blob)
+static int guc_prep_golden_context(struct intel_guc *guc)
 {
struct intel_gt *gt = guc_to_gt(guc);
u32 addr_ggtt, offset;
u32 total_size = 0, alloc_size, real_size;
u8 engine_class, guc_class;
-   struct guc_gt_system_info *info, local_info;
+   struct guc_gt_system_info local_info;
+   struct iosys_map info_map;
 
/*
 * Reserve the memory for the golden contexts and point GuC at it but
@@ -409,14 +415,15 @@ static int guc_prep_golden_context(struct intel_guc *guc,
 * GuC will also validate that the LRC base + size fall within the
 * allowed GGTT range.
 */
-   if (blob) {
+   if (!iosys_map_is_null(&guc->ads_map)) {
offset = guc_ads_golden_ctxt_offset(guc);
addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
-   info = &blob->system_info;
+   info_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map,
+offsetof(struct 
__guc_ads_blob, system_info));
} else {
memset(&local_info, 0, sizeof(local_info));
-   info = &local_info;
-   fill_engine_enable_masks(gt, info);
+   iosys_map_set_vaddr(&info_map, &local_info);
+   fill_engine_enable_masks(gt, &info_map);
}
 
for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; 
++engine_class) {
@@ -425,14 +432,14 @@ static int guc_prep_golden_context(struct intel_guc *guc,
 
guc_class = engine_class_to_guc_class(engine_class);
 
-   if (!info->engine_enabled_masks[guc_class])
+   if (!info_map_read(&info_map, engine_enabled_masks[guc_class]))
continue;
 
real_size = intel_engine_context_size(gt, engine_class);
alloc_size = PAGE_ALIGN(real_size);
total_size += alloc_size;
 
-   if (!blob)
+   if (iosys_map_is_null(&guc->ads_map))
continue;
 
/*
@@ -446,12 +453,15 @@ static int guc_prep_golden_context(struct intel_guc *guc,
 * what comes before it in the context image (which is identical

[RFC PATCH 0/3] drm/helpers: Make the suballocation manager drm generic.

2022-02-04 Thread Maarten Lankhorst

The suballocation manager itself is not dependent on any implementation detail,
and can be made generic. I want to potentially use it inside i915, as it looks
like a clean implementation to do so. :)

Looking for feedback and some testing, as I don't have a amdgpu/radeon myself.
Only compile tested so far, so some stupid bugs may remain.

Maarten Lankhorst (3):
  drm: Extract amdgpu_sa.c as a generic suballocation helper
  drm/amd: Convert amdgpu to use suballocation helper.
  drm/radeon: Use the drm suballocation manager implementation.

 drivers/gpu/drm/Makefile   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  29 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  21 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c | 320 +---
 drivers/gpu/drm/drm_suballoc.c | 424 +
 drivers/gpu/drm/radeon/radeon.h|  55 +--
 drivers/gpu/drm/radeon/radeon_ib.c |  10 +-
 drivers/gpu/drm/radeon/radeon_object.h |  23 +-
 drivers/gpu/drm/radeon/radeon_sa.c | 314 ++-
 drivers/gpu/drm/radeon/radeon_semaphore.c  |   6 +-
 include/drm/drm_suballoc.h |  78 
 12 files changed, 595 insertions(+), 694 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_suballoc.c
 create mode 100644 include/drm/drm_suballoc.h

-- 
2.34.1

[RFC PATCH 3/3] drm/radeon: Use the drm suballocation manager implementation.

2022-02-04 Thread Maarten Lankhorst

Use the generic suballocation helper lifted from amdgpu.
Note that the generic suballocator only allows a single alignment,
so we may waste a few more bytes for radeon_semaphore, shouldn't
be a big deal, could be re-added if needed.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/radeon/radeon.h   |  55 +---
 drivers/gpu/drm/radeon/radeon_ib.c|  10 +-
 drivers/gpu/drm/radeon/radeon_object.h|  23 +-
 drivers/gpu/drm/radeon/radeon_sa.c| 314 ++
 drivers/gpu/drm/radeon/radeon_semaphore.c |   6 +-
 5 files changed, 51 insertions(+), 357 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 895776c421d4..a6339c9e7c47 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -79,6 +79,7 @@
 #include 
 
 #include 
+#include 
 
 #include "radeon_family.h"
 #include "radeon_mode.h"
@@ -512,52 +513,12 @@ struct radeon_bo {
 };
 #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, tbo.base)
 
-/* sub-allocation manager, it has to be protected by another lock.
- * By conception this is an helper for other part of the driver
- * like the indirect buffer or semaphore, which both have their
- * locking.
- *
- * Principe is simple, we keep a list of sub allocation in offset
- * order (first entry has offset == 0, last entry has the highest
- * offset).
- *
- * When allocating new object we first check if there is room at
- * the end total_size - (last_object_offset + last_object_size) >=
- * alloc_size. If so we allocate new object there.
- *
- * When there is not enough room at the end, we start waiting for
- * each sub object until we reach object_offset+object_size >=
- * alloc_size, this object then become the sub object we return.
- *
- * Alignment can't be bigger than page size.
- *
- * Hole are not considered for allocation to keep things simple.
- * Assumption is that there won't be hole (all object on same
- * alignment).
- */
 struct radeon_sa_manager {
-   wait_queue_head_t   wq;
-   struct radeon_bo*bo;
-   struct list_head*hole;
-   struct list_headflist[RADEON_NUM_RINGS];
-   struct list_headolist;
-   unsignedsize;
-   uint64_tgpu_addr;
-   void*cpu_ptr;
-   uint32_tdomain;
-   uint32_talign;
-};
-
-struct radeon_sa_bo;
-
-/* sub-allocation buffer */
-struct radeon_sa_bo {
-   struct list_headolist;
-   struct list_headflist;
-   struct radeon_sa_manager*manager;
-   unsignedsoffset;
-   unsignedeoffset;
-   struct radeon_fence *fence;
+   struct drm_suballoc_manager base;
+   struct radeon_bo*bo;
+   uint64_tgpu_addr;
+   void*cpu_ptr;
+   u32 domain;
 };
 
 /*
@@ -588,7 +549,7 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
  * Semaphores.
  */
 struct radeon_semaphore {
-   struct radeon_sa_bo *sa_bo;
+   struct drm_suballoc *sa_bo;
signed  waiters;
uint64_tgpu_addr;
 };
@@ -817,7 +778,7 @@ void radeon_irq_kms_disable_hpd(struct radeon_device *rdev, 
unsigned hpd_mask);
  */
 
 struct radeon_ib {
-   struct radeon_sa_bo *sa_bo;
+   struct drm_suballoc *sa_bo;
uint32_tlength_dw;
uint64_tgpu_addr;
uint32_t*ptr;
diff --git a/drivers/gpu/drm/radeon/radeon_ib.c 
b/drivers/gpu/drm/radeon/radeon_ib.c
index 62b116727b4f..bca2cbd27abf 100644
--- a/drivers/gpu/drm/radeon/radeon_ib.c
+++ b/drivers/gpu/drm/radeon/radeon_ib.c
@@ -61,7 +61,7 @@ int radeon_ib_get(struct radeon_device *rdev, int ring,
 {
int r;
 
-   r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, &ib->sa_bo, size, 256);
+   r = radeon_sa_bo_new(&rdev->ring_tmp_bo, &ib->sa_bo, size);
if (r) {
dev_err(rdev->dev, "failed to get a new IB (%d)\n", r);
return r;
@@ -97,7 +97,7 @@ int radeon_ib_get(struct radeon_device *rdev, int ring,
 void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib)
 {
radeon_sync_free(rdev, &ib->sync, ib->fence);
-   radeon_sa_bo_free(rdev, &ib->sa_bo, ib->fence);
+   radeon_sa_bo_free(&ib->sa_bo, ib->fence);
radeon_fence_unref(&ib->fence);
 }
 
@@ -201,8 +201,7 @@ int radeon_ib_pool_init(struct radeon_device *rdev)
 
if (rdev->family >= CHIP_BONAIRE) {
r = radeon_sa_bo_manager_init(rdev, &rdev->ring_tmp_bo,
- RADEON_IB_POOL_SIZE*64*1024,
- RADEON_GPU_PAGE_SIZE,
+ RADEON_IB_POOL_SIZE*6

[RFC PATCH 2/3] drm/amd: Convert amdgpu to use suballocation helper.

2022-02-04 Thread Maarten Lankhorst

Now that the suballocation helper is generic, we can use it in amdgpu.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  29 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  21 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c | 320 ++---
 4 files changed, 39 insertions(+), 336 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 9a53a4de2bb7..a8c7a7ef480c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -61,6 +61,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include "dm_pp_interface.h"
@@ -417,29 +418,11 @@ struct amdgpu_clock {
  * alignment).
  */
 
-#define AMDGPU_SA_NUM_FENCE_LISTS  32
-
 struct amdgpu_sa_manager {
-   wait_queue_head_t   wq;
-   struct amdgpu_bo*bo;
-   struct list_head*hole;
-   struct list_headflist[AMDGPU_SA_NUM_FENCE_LISTS];
-   struct list_headolist;
-   unsignedsize;
-   uint64_tgpu_addr;
-   void*cpu_ptr;
-   uint32_tdomain;
-   uint32_talign;
-};
-
-/* sub-allocation buffer */
-struct amdgpu_sa_bo {
-   struct list_headolist;
-   struct list_headflist;
-   struct amdgpu_sa_manager*manager;
-   unsignedsoffset;
-   unsignedeoffset;
-   struct dma_fence*fence;
+   struct drm_suballoc_manager base;
+   struct amdgpu_bo*bo;
+   uint64_tgpu_addr;
+   void*cpu_ptr;
 };
 
 int amdgpu_fence_slab_init(void);
@@ -470,7 +453,7 @@ struct amdgpu_flip_work {
  */
 
 struct amdgpu_ib {
-   struct amdgpu_sa_bo *sa_bo;
+   struct drm_suballoc *sa_bo;
uint32_tlength_dw;
uint64_tgpu_addr;
uint32_t*ptr;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index bc1297dcdf97..883828a4988c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -69,7 +69,7 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
 
if (size) {
r = amdgpu_sa_bo_new(&adev->ib_pools[pool_type],
- &ib->sa_bo, size, 256);
+ &ib->sa_bo, size);
if (r) {
dev_err(adev->dev, "failed to get a new IB (%d)\n", r);
return r;
@@ -307,8 +307,7 @@ int amdgpu_ib_pool_init(struct amdgpu_device *adev)
 
for (i = 0; i < AMDGPU_IB_POOL_MAX; i++) {
r = amdgpu_sa_bo_manager_init(adev, &adev->ib_pools[i],
- AMDGPU_IB_POOL_SIZE,
- AMDGPU_GPU_PAGE_SIZE,
+ AMDGPU_IB_POOL_SIZE, 256,
  AMDGPU_GEM_DOMAIN_GTT);
if (r)
goto error;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 4c9cbdc66995..7db4fe1bc1d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -337,15 +337,20 @@ uint32_t amdgpu_bo_get_preferred_domain(struct 
amdgpu_device *adev,
 /*
  * sub allocation
  */
+static inline struct amdgpu_sa_manager *
+to_amdgpu_sa_manager(struct drm_suballoc_manager *manager)
+{
+   return container_of(manager, struct amdgpu_sa_manager, base);
+}
 
-static inline uint64_t amdgpu_sa_bo_gpu_addr(struct amdgpu_sa_bo *sa_bo)
+static inline uint64_t amdgpu_sa_bo_gpu_addr(struct drm_suballoc *sa_bo)
 {
-   return sa_bo->manager->gpu_addr + sa_bo->soffset;
+   return to_amdgpu_sa_manager(sa_bo->manager)->gpu_addr + sa_bo->soffset;
 }
 
-static inline void * amdgpu_sa_bo_cpu_addr(struct amdgpu_sa_bo *sa_bo)
+static inline void * amdgpu_sa_bo_cpu_addr(struct drm_suballoc *sa_bo)
 {
-   return sa_bo->manager->cpu_ptr + sa_bo->soffset;
+   return to_amdgpu_sa_manager(sa_bo->manager)->cpu_ptr + sa_bo->soffset;
 }
 
 int amdgpu_sa_bo_manager_init(struct amdgpu_device *adev,
@@ -356,11 +361,11 @@ void amdgpu_sa_bo_manager_fini(struct amdgpu_device *adev,
 int amdgpu_sa_bo_manager_start(struct amdgpu_device *adev,
  struct amdgpu_sa_manager *sa_manager);
 int amdgpu_sa_bo_new(struct amdgpu_sa_manager *sa_manager,
-struct amdgpu_sa_bo **sa_bo,
-unsigned size, unsigned align);
+struct drm_suballoc **sa_bo,
+unsigned siz

[RFC PATCH 1/3] drm: Extract amdgpu_sa.c as a generic suballocation helper

2022-02-04 Thread Maarten Lankhorst

Suballocating a buffer object is something that is not driver
generic, and is useful for other drivers as well.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/Makefile   |   4 +-
 drivers/gpu/drm/drm_suballoc.c | 424 +
 include/drm/drm_suballoc.h |  78 ++
 3 files changed, 505 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/drm_suballoc.c
 create mode 100644 include/drm/drm_suballoc.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 8675c2af7ae1..b848bcf8790c 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -57,7 +57,9 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o \
drm_scdc_helper.o drm_gem_atomic_helper.o \
drm_gem_framebuffer_helper.o \
drm_atomic_state_helper.o drm_damage_helper.o \
-   drm_format_helper.o drm_self_refresh_helper.o drm_rect.o
+   drm_format_helper.o drm_self_refresh_helper.o drm_rect.o \
+   drm_suballoc.o
+
 drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
 drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
 
diff --git a/drivers/gpu/drm/drm_suballoc.c b/drivers/gpu/drm/drm_suballoc.c
new file mode 100644
index ..e0bb35367b71
--- /dev/null
+++ b/drivers/gpu/drm/drm_suballoc.c
@@ -0,0 +1,424 @@
+/*
+ * Copyright 2011 Red Hat Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ */
+/*
+ * Authors:
+ *Jerome Glisse 
+ */
+/* Algorithm:
+ *
+ * We store the last allocated bo in "hole", we always try to allocate
+ * after the last allocated bo. Principle is that in a linear GPU ring
+ * progression was is after last is the oldest bo we allocated and thus
+ * the first one that should no longer be in use by the GPU.
+ *
+ * If it's not the case we skip over the bo after last to the closest
+ * done bo if such one exist. If none exist and we are not asked to
+ * block we report failure to allocate.
+ *
+ * If we are asked to block we wait on all the oldest fence of all
+ * rings. We just wait for any of those fence to complete.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void drm_suballoc_remove_locked(struct drm_suballoc *sa);
+static void drm_suballoc_try_free(struct drm_suballoc_manager *sa_manager);
+
+/**
+ * drm_suballoc_manager_init - Initialise the drm_suballoc_manager
+ *
+ * @sa_manager: pointer to the sa_manager
+ * @size: number of bytes we want to suballocate
+ * @align: alignment for each suballocated chunk
+ *
+ * Prepares the suballocation manager for suballocations.
+ */
+void drm_suballoc_manager_init(struct drm_suballoc_manager *sa_manager,
+  u32 size, u32 align)
+{
+   u32 i;
+
+   if (!align)
+   align = 1;
+
+   /* alignment must be a power of 2 */
+   BUG_ON(align & (align - 1));
+
+   init_waitqueue_head(&sa_manager->wq);
+   sa_manager->size = size;
+   sa_manager->align = align;
+   sa_manager->hole = &sa_manager->olist;
+   INIT_LIST_HEAD(&sa_manager->olist);
+   for (i = 0; i < DRM_SUBALLOC_MAX_QUEUES; ++i)
+   INIT_LIST_HEAD(&sa_manager->flist[i]);
+}
+EXPORT_SYMBOL(drm_suballoc_manager_init);
+
+/**
+ * drm_suballoc_manager_fini - Destroy the drm_suballoc_manager
+ *
+ * @sa_manager: pointer to the sa_manager
+ *
+ * Cleans up the suballocation manager after use. All fences added
+ * with drm_suballoc_free() must be signaled, or we cannot clean up
+ * the entire manager.
+ */
+void drm_suballoc_manager_fini(struct drm_suballoc_manager *sa_manager)
+{
+   struct drm_suballoc *sa, *tmp;
+
+   if (!sa_manager->size)
+   return;
+
+   if (!list_empty(&sa_manager->olist)) {
+   sa_manager->hole = &sa_manager->olist,
+   drm

[PATCH] drm/nouveau/backlight: Fix LVDS backlight detection on some laptops

2022-02-04 Thread Lyude Paul

It seems that some laptops will report having both an eDP and LVDS
connector, even though only the LVDS connector is actually hooked up. This
can lead to issues with backlight registration if the eDP connector ends up
getting registered before the LVDS connector, as the backlight device will
then be registered to the eDP connector instead of the LVDS connector.

So, fix this by only registering the backlight on connectors that are
reported as being connected.

Signed-off-by: Lyude Paul 
Fixes: 6eca310e8924 ("drm/nouveau/kms/nv50-: Add basic DPCD backlight support 
for nouveau")
Bugzilla: https://gitlab.freedesktop.org/drm/nouveau/-/issues/137
Cc:  # v5.15+
---
 drivers/gpu/drm/nouveau/nouveau_backlight.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c 
b/drivers/gpu/drm/nouveau/nouveau_backlight.c
index ae2f2abc8f5a..6af12dc99d7f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
+++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
@@ -294,7 +294,8 @@ nv50_backlight_init(struct nouveau_backlight *bl,
struct nouveau_drm *drm = nouveau_drm(nv_encoder->base.base.dev);
struct nvif_object *device = &drm->client.device.object;
 
-   if (!nvif_rd32(device, NV50_PDISP_SOR_PWM_CTL(ffs(nv_encoder->dcb->or) 
- 1)))
+   if (!nvif_rd32(device, NV50_PDISP_SOR_PWM_CTL(ffs(nv_encoder->dcb->or) 
- 1)) ||
+   nv_conn->base.status != connector_status_connected)
return -ENODEV;
 
if (nv_conn->type == DCB_CONNECTOR_eDP) {
-- 
2.34.1

Re: [RFC PATCH 1/3] drm: Extract amdgpu_sa.c as a generic suballocation helper

2022-02-04 Thread Christian König


Oh, that's on my TODO list for years!

Am 04.02.22 um 18:48 schrieb Maarten Lankhorst:

Suballocating a buffer object is something that is not driver
generic, and is useful for other drivers as well.

Signed-off-by: Maarten Lankhorst 
---
  drivers/gpu/drm/Makefile   |   4 +-
  drivers/gpu/drm/drm_suballoc.c | 424 +
  include/drm/drm_suballoc.h |  78 ++
  3 files changed, 505 insertions(+), 1 deletion(-)
  create mode 100644 drivers/gpu/drm/drm_suballoc.c
  create mode 100644 include/drm/drm_suballoc.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 8675c2af7ae1..b848bcf8790c 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -57,7 +57,9 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o \
drm_scdc_helper.o drm_gem_atomic_helper.o \
drm_gem_framebuffer_helper.o \
drm_atomic_state_helper.o drm_damage_helper.o \
-   drm_format_helper.o drm_self_refresh_helper.o drm_rect.o
+   drm_format_helper.o drm_self_refresh_helper.o drm_rect.o \
+   drm_suballoc.o
+


I think we should put that into a separate module like we now do with 
other helpers as well.



  drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
  drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
  
diff --git a/drivers/gpu/drm/drm_suballoc.c b/drivers/gpu/drm/drm_suballoc.c

new file mode 100644
index ..e0bb35367b71
--- /dev/null
+++ b/drivers/gpu/drm/drm_suballoc.c
@@ -0,0 +1,424 @@
+/*
+ * Copyright 2011 Red Hat Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ */
+/*
+ * Authors:
+ *Jerome Glisse 
+ */


That is hopelessly outdated. IIRC I completely rewrote that stuff in ~2012.


+/* Algorithm:
+ *
+ * We store the last allocated bo in "hole", we always try to allocate
+ * after the last allocated bo. Principle is that in a linear GPU ring
+ * progression was is after last is the oldest bo we allocated and thus
+ * the first one that should no longer be in use by the GPU.
+ *
+ * If it's not the case we skip over the bo after last to the closest
+ * done bo if such one exist. If none exist and we are not asked to
+ * block we report failure to allocate.
+ *
+ * If we are asked to block we wait on all the oldest fence of all
+ * rings. We just wait for any of those fence to complete.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void drm_suballoc_remove_locked(struct drm_suballoc *sa);
+static void drm_suballoc_try_free(struct drm_suballoc_manager *sa_manager);
+
+/**
+ * drm_suballoc_manager_init - Initialise the drm_suballoc_manager
+ *
+ * @sa_manager: pointer to the sa_manager
+ * @size: number of bytes we want to suballocate
+ * @align: alignment for each suballocated chunk
+ *
+ * Prepares the suballocation manager for suballocations.
+ */
+void drm_suballoc_manager_init(struct drm_suballoc_manager *sa_manager,
+  u32 size, u32 align)
+{
+   u32 i;
+
+   if (!align)
+   align = 1;
+
+   /* alignment must be a power of 2 */
+   BUG_ON(align & (align - 1));


When we move that I think we should cleanup the code once more, e.g. use 
is_power_of_2() function here for example.


There are also a bunch of places with extra {} and constructs like "if 
() return true; else return false;" which could certainly be simplified.


Apart from that really great idea.

Regards,
Christian.


+
+   init_waitqueue_head(&sa_manager->wq);
+   sa_manager->size = size;
+   sa_manager->align = align;
+   sa_manager->hole = &sa_manager->olist;
+   INIT_LIST_HEAD(&sa_manager->olist);
+   for (i = 0; i < DRM_SUBALLOC_MAX_QUEUES; ++i)
+   INIT_LIST_HEAD(&sa_manager->flist[i]);
+}
+EXPORT_SYMBOL(drm_

Re: [PATCH 02/19] iosys-map: Add offset to iosys_map_memcpy_to()

2022-02-04 Thread Christian König


Am 04.02.22 um 18:44 schrieb Lucas De Marchi:

In certain situations it's useful to be able to write to an
offset of the mapping. Add a dst_offset to iosys_map_memcpy_to().

Cc: Sumit Semwal 
Cc: Christian König 
Cc: Thomas Zimmermann 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/drm_cache.c |  2 +-
  drivers/gpu/drm/drm_fb_helper.c |  2 +-
  include/linux/iosys-map.h   | 17 +
  3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 66597e411764..c3e6e615bf09 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -218,7 +218,7 @@ static void memcpy_fallback(struct iosys_map *dst,
if (!dst->is_iomem && !src->is_iomem) {
memcpy(dst->vaddr, src->vaddr, len);
} else if (!src->is_iomem) {
-   iosys_map_memcpy_to(dst, src->vaddr, len);
+   iosys_map_memcpy_to(dst, 0, src->vaddr, len);
} else if (!dst->is_iomem) {
memcpy_fromio(dst->vaddr, src->vaddr_iomem, len);
} else {
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 238f815cb2a0..bf5cc9a42e5a 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -385,7 +385,7 @@ static void drm_fb_helper_damage_blit_real(struct 
drm_fb_helper *fb_helper,
iosys_map_incr(dst, offset); /* go to first pixel within clip rect */
  
  	for (y = clip->y1; y < clip->y2; y++) {

-   iosys_map_memcpy_to(dst, src, len);
+   iosys_map_memcpy_to(dst, 0, src, len);
iosys_map_incr(dst, fb->pitches[0]);
src += fb->pitches[0];
}
diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index f4186f91caa6..edd7fa3be9e9 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -220,22 +220,23 @@ static inline void iosys_map_clear(struct iosys_map *map)
  }
  
  /**

- * iosys_map_memcpy_to - Memcpy into iosys mapping
+ * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map
   * @dst:  The iosys_map structure
+ * @dst_offset:The offset from which to copy
   * @src:  The source buffer
   * @len:  The number of byte in src
   *
- * Copies data into a iosys mapping. The source buffer is in system
- * memory. Depending on the buffer's location, the helper picks the correct
- * method of accessing the memory.
+ * Copies data into a iosys_map with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
   */
-static inline void iosys_map_memcpy_to(struct iosys_map *dst, const void *src,
-  size_t len)
+static inline void iosys_map_memcpy_to(struct iosys_map *dst, size_t 
dst_offset,
+  const void *src, size_t len)
  {
if (dst->is_iomem)
-   memcpy_toio(dst->vaddr_iomem, src, len);
+   memcpy_toio(dst->vaddr_iomem + dst_offset, src, len);
else
-   memcpy(dst->vaddr, src, len);
+   memcpy(dst->vaddr + dst_offset, src, len);
  }
  
  /**

[PATCH v3 0/3] enable widebus feature base on chip hardware revision

2022-02-04 Thread Kuogee Hsieh

split into 3 patches
1) widebus timing engine programming
2) dsc timing engine
3) enable widebus feature base on chip hardware revision


Kuogee Hsieh (3):
  drm/msm/dp:  revise timing engine programming to support widebus
feature
  drm/msm/dp: revise timing engine programming to support compression
(DSC)
  drm/msm/dp: enable widebus feature for display port

 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c|  14 ++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h|   2 +
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c   |  14 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c| 107 +++--
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h|   5 +
 drivers/gpu/drm/msm/dp/dp_catalog.c|  36 ++-
 drivers/gpu/drm/msm/dp/dp_catalog.h|   3 +-
 drivers/gpu/drm/msm/dp/dp_ctrl.c   |  13 ++-
 drivers/gpu/drm/msm/dp/dp_ctrl.h   |   1 +
 drivers/gpu/drm/msm/dp/dp_display.c|  30 ++
 drivers/gpu/drm/msm/dp/dp_display.h|   2 +
 drivers/gpu/drm/msm/dp/dp_panel.c  |   4 +-
 drivers/gpu/drm/msm/dp/dp_panel.h  |   2 +-
 drivers/gpu/drm/msm/msm_drv.h  |   6 ++
 14 files changed, 197 insertions(+), 42 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

[PATCH v3 1/3] drm/msm/dp: revise timing engine programming to support widebus feature

2022-02-04 Thread Kuogee Hsieh

Widebus feature will transmit two pixel data per pixel clock to interface.
Timing engine provides driving force for this purpose. This patch base
on HPG (Hardware Programming Guide) to revise timing engine register
setting to accommodate both widebus and non widebus application. Also
horizontal width parameters need to be reduced by half since two pixel
data are clocked out per pixel clock when widebus feature enabled.
In addition, revised timing engine function is an generic function and
intend to be shared by all platforms to reduce maintenance efforts.

Changes in v2:
-- remove compression related code from timing
-- remove op_info from  struct msm_drm_private
-- remove unnecessary wide_bus_en variables
-- pass wide_bus_en into timing configuration by struct msm_dp

Changes in v3:
-- split patch into 3 patches

Signed-off-by: Kuogee Hsieh 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c| 10 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h|  2 +
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c   | 14 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c| 99 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h|  2 +
 5 files changed, 93 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 0d315b4..0c22839 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -208,6 +208,8 @@ struct dpu_encoder_virt {
 
u32 idle_timeout;
 
+   bool wide_bus_en;
+
struct msm_dp *dp;
 };
 
@@ -217,6 +219,14 @@ static u32 dither_matrix[DITHER_MATRIX_SZ] = {
15, 7, 13, 5, 3, 11, 1, 9, 12, 4, 14, 6, 0, 8, 2, 10
 };
 
+
+bool dpu_encoder_is_widebus_enabled(struct drm_encoder *drm_enc)
+{
+   struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
+
+   return dpu_enc->wide_bus_en;
+}
+
 static void _dpu_encoder_setup_dither(struct dpu_hw_pingpong *hw_pp, unsigned 
bpc)
 {
struct dpu_hw_dither_cfg dither_cfg = { 0 };
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
index 99a5d73..893d74d 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
@@ -168,4 +168,6 @@ int dpu_encoder_get_linecount(struct drm_encoder *drm_enc);
  */
 int dpu_encoder_get_frame_count(struct drm_encoder *drm_enc);
 
+bool dpu_encoder_is_widebus_enabled(struct drm_encoder *drm_enc);
+
 #endif /* __DPU_ENCODER_H__ */
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index 185379b..3d6c914 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -110,6 +110,20 @@ static void drm_mode_to_intf_timing_params(
timing->v_back_porch += timing->v_front_porch;
timing->v_front_porch = 0;
}
+
+   timing->wide_bus_en = dpu_encoder_is_widebus_enabled(phys_enc->parent);
+
+   /*
+* for DP, divide the horizonal parameters by 2 when
+* widebus is enabled
+*/
+   if (timing->wide_bus_en) {
+   timing->width = timing->width >> 1;
+   timing->xres = timing->xres >> 1;
+   timing->h_back_porch = timing->h_back_porch >> 1;
+   timing->h_front_porch = timing->h_front_porch >> 1;
+   timing->hsync_pulse_width = timing->hsync_pulse_width >> 1;
+   }
 }
 
 static u32 get_horizontal_total(const struct intf_timing_params *timing)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
index 116e2b5..35d4aaa 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
@@ -33,6 +33,7 @@
 #define INTF_TP_COLOR1  0x05C
 #define INTF_CONFIG20x060
 #define INTF_DISPLAY_DATA_HCTL  0x064
+#define INTF_ACTIVE_DATA_HCTL   0x068
 #define INTF_FRAME_LINE_COUNT_EN0x0A8
 #define INTF_FRAME_COUNT0x0AC
 #define   INTF_LINE_COUNT   0x0B0
@@ -90,68 +91,95 @@ static void dpu_hw_intf_setup_timing_engine(struct 
dpu_hw_intf *ctx,
u32 hsync_period, vsync_period;
u32 display_v_start, display_v_end;
u32 hsync_start_x, hsync_end_x;
+   u32 hsync_data_start_x, hsync_data_end_x;
u32 active_h_start, active_h_end;
u32 active_v_start, active_v_end;
u32 active_hctl, display_hctl, hsync_ctl;
u32 polarity_ctl, den_polarity, hsync_polarity, vsync_polarity;
u32 panel_format;
-   u32 intf_cfg, intf_cfg2 = 0, display_data_hctl = 0;
+   u32 intf_cfg, intf_cfg2 = 0;
+   u32 display_data_hctl = 0, active_data_hctl = 0;
+   u32 data_width;
+   bool dp_intf = false;
 
/* read interface_cfg */
intf_cfg = DPU_REG_READ(c, INTF_CO

[PATCH v3 2/3] drm/msm/dp: revise timing engine programming to support compression (DSC)

2022-02-04 Thread Kuogee Hsieh

Divides horizontal width by 3 at timing engine of interface. There are
major part of  compression (DSC) programming have to be done at DSC
controller which is not covered by this patch.

Signed-off-by: Kuogee Hsieh 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c | 22 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h |  3 +++
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
index 35d4aaa..ee7ca34 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
@@ -128,7 +128,7 @@ static void dpu_hw_intf_setup_timing_engine(struct 
dpu_hw_intf *ctx,
 * video timing. It is recommended to enable it for all cases, except
 * if compression is enabled in 1 pixel per clock mode
 */
-   if (p->wide_bus_en)
+   if (!p->compression_en || p->wide_bus_en)
intf_cfg2 |= BIT(4);
 
if (p->wide_bus_en)
@@ -150,10 +150,16 @@ static void dpu_hw_intf_setup_timing_engine(struct 
dpu_hw_intf *ctx,
 */
data_width = p->width;
 
-   if (!dp_intf && p->wide_bus_en)
+   if (p->compression_en) {
+   data_width = DIV_ROUND_UP(p->dce_bytes_per_line, 3);
+
+   if (p->wide_bus_en)
+   data_width >>= 1;
+   } else if (!dp_intf && p->wide_bus_en) {
data_width = p->width >> 1;
-   else
+   } else {
data_width = p->width;
+   }
 
hsync_data_start_x = hsync_start_x;
hsync_data_end_x =  hsync_start_x + data_width - 1;
@@ -178,8 +184,16 @@ static void dpu_hw_intf_setup_timing_engine(struct 
dpu_hw_intf *ctx,
 
active_hctl = (active_h_end << 16) | active_h_start;
 
-   if (dp_intf)
+   if (dp_intf) {
display_hctl = active_hctl;
+   if (p->compression_en) {
+   active_data_hctl = (hsync_start_x +
+   p->extra_dto_cycles) << 16;
+   active_data_hctl += hsync_start_x;
+
+   display_data_hctl = active_data_hctl;
+   }
+   }
 
den_polarity = 0;
if (ctx->cap->type == INTF_HDMI) {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h
index e4a518a..8fc71ce 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h
@@ -32,6 +32,9 @@ struct intf_timing_params {
u32 hsync_skew;
 
bool wide_bus_en;
+   bool compression_en;
+   u32 extra_dto_cycles;   /* for DP only */
+   u32 dce_bytes_per_line;
 };
 
 struct intf_prog_fetch {
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

[PATCH v3 3/3] drm/msm/dp: enable widebus feature for display port

2022-02-04 Thread Kuogee Hsieh

Widebus feature will transmit two pixel data per pixel clock to interface.
This feature now is required to be enabled to easy migrant to higher
resolution applications in future. However since some legacy chipsets
does not support this feature, this feature is enabled base on chip's
hardware revision.

changes in v2:
-- remove compression related code from timing
-- remove op_info from  struct msm_drm_private
-- remove unnecessary wide_bus_en variables
-- pass wide_bus_en into timing configuration by struct msm_dp

Changes in v3:
-- split patch into 3 patches
-- enable widebus feature base on chip hardware revision

Signed-off-by: Kuogee Hsieh 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c |  4 +++-
 drivers/gpu/drm/msm/dp/dp_catalog.c | 36 +++--
 drivers/gpu/drm/msm/dp/dp_catalog.h |  3 ++-
 drivers/gpu/drm/msm/dp/dp_ctrl.c| 13 +++
 drivers/gpu/drm/msm/dp/dp_ctrl.h|  1 +
 drivers/gpu/drm/msm/dp/dp_display.c | 30 
 drivers/gpu/drm/msm/dp/dp_display.h |  2 ++
 drivers/gpu/drm/msm/dp/dp_panel.c   |  4 ++--
 drivers/gpu/drm/msm/dp/dp_panel.h   |  2 +-
 drivers/gpu/drm/msm/msm_drv.h   |  6 +
 10 files changed, 90 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 0c22839..b2d23c2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -2167,8 +2167,10 @@ int dpu_encoder_setup(struct drm_device *dev, struct 
drm_encoder *enc,
timer_setup(&dpu_enc->vsync_event_timer,
dpu_encoder_vsync_event_handler,
0);
-   else if (disp_info->intf_type == DRM_MODE_ENCODER_TMDS)
+   else if (disp_info->intf_type == DRM_MODE_ENCODER_TMDS) {
dpu_enc->dp = priv->dp[disp_info->h_tile_instance[0]];
+   dpu_enc->wide_bus_en = msm_dp_wide_bus_enable(dpu_enc->dp);
+   }
 
INIT_DELAYED_WORK(&dpu_enc->delayed_off_work,
dpu_encoder_off_work);
diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c 
b/drivers/gpu/drm/msm/dp/dp_catalog.c
index 64f0b26..99d087e 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.c
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
@@ -483,6 +483,27 @@ int dp_catalog_ctrl_set_pattern_state_bit(struct 
dp_catalog *dp_catalog,
 }
 
 /**
+ * dp_catalog_hw_revision() - retrieve DP hw revision
+ *
+ * @dp_catalog: DP catalog structure
+ *
+ * return: u32
+ *
+ * This function return the DP controller hw revision
+ *
+ */
+u32 dp_catalog_hw_revision(struct dp_catalog *dp_catalog)
+{
+   u32 revision;
+   struct dp_catalog_private *catalog = container_of(dp_catalog,
+   struct dp_catalog_private, dp_catalog);
+
+   revision = dp_read_ahb(catalog, REG_DP_HW_VERSION);
+
+   return revision;
+}
+
+/**
  * dp_catalog_ctrl_reset() - reset DP controller
  *
  * @dp_catalog: DP catalog structure
@@ -739,10 +760,11 @@ u32 dp_catalog_ctrl_read_phy_pattern(struct dp_catalog 
*dp_catalog)
 }
 
 /* panel related catalog functions */
-int dp_catalog_panel_timing_cfg(struct dp_catalog *dp_catalog)
+int dp_catalog_panel_timing_cfg(struct dp_catalog *dp_catalog, bool 
wide_bus_en)
 {
struct dp_catalog_private *catalog = container_of(dp_catalog,
struct dp_catalog_private, dp_catalog);
+   u32 reg;
 
dp_write_link(catalog, REG_DP_TOTAL_HOR_VER,
dp_catalog->total);
@@ -751,7 +773,17 @@ int dp_catalog_panel_timing_cfg(struct dp_catalog 
*dp_catalog)
dp_write_link(catalog, REG_DP_HSYNC_VSYNC_WIDTH_POLARITY,
dp_catalog->width_blanking);
dp_write_link(catalog, REG_DP_ACTIVE_HOR_VER, dp_catalog->dp_active);
-   dp_write_p0(catalog, MMSS_DP_INTF_CONFIG, 0);
+
+   reg = dp_read_p0(catalog, MMSS_DP_INTF_CONFIG);
+
+   if (wide_bus_en)
+   reg |= BIT(4);  /* DATABUS_WIDEN */
+   else
+   reg &= ~BIT(4);
+
+   DRM_DEBUG_DP("wide_bus_en=%d reg=%x\n", wide_bus_en, reg);
+
+   dp_write_p0(catalog, MMSS_DP_INTF_CONFIG, reg);
return 0;
 }
 
diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.h 
b/drivers/gpu/drm/msm/dp/dp_catalog.h
index 7dea101..a3a0129 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.h
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.h
@@ -95,6 +95,7 @@ void dp_catalog_ctrl_config_misc(struct dp_catalog 
*dp_catalog, u32 cc, u32 tb);
 void dp_catalog_ctrl_config_msa(struct dp_catalog *dp_catalog, u32 rate,
u32 stream_rate_khz, bool fixed_nvid);
 int dp_catalog_ctrl_set_pattern_state_bit(struct dp_catalog *dp_catalog, u32 
pattern);
+u32 dp_catalog_hw_revision(struct dp_catalog *dp_catalog);
 void dp_catalog_ctrl_reset(struct dp_catalog *dp_catalog);
 bool dp_catalog_ctrl_ma

Re: [PATCH 02/19] iosys-map: Add offset to iosys_map_memcpy_to()

2022-02-04 Thread Thomas Zimmermann


Hi

Am 04.02.22 um 18:44 schrieb Lucas De Marchi:

In certain situations it's useful to be able to write to an
offset of the mapping. Add a dst_offset to iosys_map_memcpy_to().

Cc: Sumit Semwal 
Cc: Christian König 
Cc: Thomas Zimmermann 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
  drivers/gpu/drm/drm_cache.c |  2 +-
  drivers/gpu/drm/drm_fb_helper.c |  2 +-
  include/linux/iosys-map.h   | 17 +
  3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 66597e411764..c3e6e615bf09 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -218,7 +218,7 @@ static void memcpy_fallback(struct iosys_map *dst,
if (!dst->is_iomem && !src->is_iomem) {
memcpy(dst->vaddr, src->vaddr, len);
} else if (!src->is_iomem) {
-   iosys_map_memcpy_to(dst, src->vaddr, len);
+   iosys_map_memcpy_to(dst, 0, src->vaddr, len);
} else if (!dst->is_iomem) {
memcpy_fromio(dst->vaddr, src->vaddr_iomem, len);
} else {
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 238f815cb2a0..bf5cc9a42e5a 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -385,7 +385,7 @@ static void drm_fb_helper_damage_blit_real(struct 
drm_fb_helper *fb_helper,
iosys_map_incr(dst, offset); /* go to first pixel within clip rect */
  
  	for (y = clip->y1; y < clip->y2; y++) {

-   iosys_map_memcpy_to(dst, src, len);
+   iosys_map_memcpy_to(dst, 0, src, len);
iosys_map_incr(dst, fb->pitches[0]);
src += fb->pitches[0];
}
diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index f4186f91caa6..edd7fa3be9e9 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -220,22 +220,23 @@ static inline void iosys_map_clear(struct iosys_map *map)
  }
  
  /**

- * iosys_map_memcpy_to - Memcpy into iosys mapping
+ * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map


'iosys_map_memcpy_to'

With that fixed:

Reviewed-by: Thomas Zimmermann 

Best regards
Thomas


   * @dst:  The iosys_map structure
+ * @dst_offset:The offset from which to copy
   * @src:  The source buffer
   * @len:  The number of byte in src
   *
- * Copies data into a iosys mapping. The source buffer is in system
- * memory. Depending on the buffer's location, the helper picks the correct
- * method of accessing the memory.
+ * Copies data into a iosys_map with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
   */
-static inline void iosys_map_memcpy_to(struct iosys_map *dst, const void *src,
-  size_t len)
+static inline void iosys_map_memcpy_to(struct iosys_map *dst, size_t 
dst_offset,
+  const void *src, size_t len)
  {
if (dst->is_iomem)
-   memcpy_toio(dst->vaddr_iomem, src, len);
+   memcpy_toio(dst->vaddr_iomem + dst_offset, src, len);
else
-   memcpy(dst->vaddr, src, len);
+   memcpy(dst->vaddr + dst_offset, src, len);
  }
  
  /**


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH v3 2/2] drm/i915/uapi: Add query for hwconfig table

2022-02-04 Thread John Harrison


On 2/4/2022 01:55, Daniel Vetter wrote:

On Wed, Jan 19, 2022 at 9:35 PM  wrote:

From: Rodrigo Vivi 

GuC contains a consolidated table with a bunch of information about the
current device.

Previously, this information was spread and hardcoded to all the components
including GuC, i915 and various UMDs. The goal here is to consolidate
the data into GuC in a way that all interested components can grab the
very latest and synchronized information using a simple query.

As per most of the other queries, this one can be called twice.
Once with item.length=0 to determine the exact buffer size, then
allocate the user memory and call it again for to retrieve the
table data. For example:
   struct drm_i915_query_item item = {
 .query_id = DRM_I915_QUERY_HWCONCFIG_TABLE;
   };
   query.items_ptr = (int64_t) &item;
   query.num_items = 1;

   ioctl(fd, DRM_IOCTL_I915_QUERY, query, sizeof(query));

   if (item.length <= 0)
 return -ENOENT;

   data = malloc(item.length);
   item.data_ptr = (int64_t) &data;
   ioctl(fd, DRM_IOCTL_I915_QUERY, query, sizeof(query));

   // Parse the data as appropriate...

The returned array is a simple and flexible KLV (Key/Length/Value)
formatted table. For example, it could be just:
   enum device_attr {
  ATTR_SOME_VALUE = 0,
  ATTR_SOME_MASK  = 1,
   };

   static const u32 hwconfig[] = {
   ATTR_SOME_VALUE,
   1, // Value Length in DWords
   8, // Value

   ATTR_SOME_MASK,
   3,
   0x00, 0x, 0xFF00,
   };

The attribute ids are defined in a hardware spec.

Cc: Tvrtko Ursulin 
Cc: Kenneth Graunke 
Cc: Michal Wajdeczko 
Cc: Slawomir Milczarek 
Signed-off-by: Rodrigo Vivi 
Signed-off-by: John Harrison 
Reviewed-by: Matthew Brost 
---
  drivers/gpu/drm/i915/i915_query.c | 23 +++
  include/uapi/drm/i915_drm.h   |  1 +
  2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index 2dfbc22857a3..609e64d5f395 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -479,12 +479,35 @@ static int query_memregion_info(struct drm_i915_private 
*i915,
 return total_length;
  }

+static int query_hwconfig_table(struct drm_i915_private *i915,
+   struct drm_i915_query_item *query_item)
+{
+   struct intel_gt *gt = to_gt(i915);
+   struct intel_guc_hwconfig *hwconfig = >->uc.guc.hwconfig;
+
+   if (!hwconfig->size || !hwconfig->ptr)
+   return -ENODEV;
+
+   if (query_item->length == 0)
+   return hwconfig->size;
+
+   if (query_item->length < hwconfig->size)
+   return -EINVAL;
+
+   if (copy_to_user(u64_to_user_ptr(query_item->data_ptr),
+hwconfig->ptr, hwconfig->size))
+   return -EFAULT;
+
+   return hwconfig->size;
+}
+
  static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
 struct drm_i915_query_item 
*query_item) = {
 query_topology_info,
 query_engine_info,
 query_perf_config,
 query_memregion_info,
+   query_hwconfig_table,
  };

  int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file 
*file)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 914ebd9290e5..132515199f27 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2685,6 +2685,7 @@ struct drm_i915_query_item {
  #define DRM_I915_QUERY_ENGINE_INFO 2
  #define DRM_I915_QUERY_PERF_CONFIG  3
  #define DRM_I915_QUERY_MEMORY_REGIONS   4
+#define DRM_I915_QUERY_HWCONFIG_TABLE   5
  /* Must be kept compact -- no holes and well documented */

New uapi needs kerneldoc in the uapi header, and please fill in any
gaps you have (i.e. if the query uapi this is built on top of isn't
fully documented yet).

Also this holds across the board, so please keep in mind in patch review.
-Daniel

There is no extra documentation to add.

The query interface itself is already documented. This new query does 
not have any kernel defined data structures associated with it. There is 
just 'struct drm_i915_query_item' with a length and a pointer, all of 
which are fully documented.


John.

Re: [PATCH 03/19] iosys-map: Add a few more helpers

2022-02-04 Thread Thomas Zimmermann


Hi

Am 04.02.22 um 18:44 schrieb Lucas De Marchi:

First the simplest ones:

- iosys_map_memset(): when abstracting system and I/O memory,
  just like the memcpy() use case, memset() also has dedicated
  functions to be called for using IO memory.
- iosys_map_memcpy_from(): we may need to copy data from I/O
  memory, not only to.

In certain situations it's useful to be able to read or write to an
offset that is calculated by having the memory layout given by a struct
declaration. Usually we are going to read/write a u8, u16, u32 or u64.

As a pre-requisite for the implementation, add iosys_map_memcpy_from()
to be the equivalent of iosys_map_memcpy_to(), but in the other
direction. Then add 2 pairs of macros:

- iosys_map_rd() / iosys_map_wr()
- iosys_map_rd_field() / iosys_map_wr_field()

The first pair takes the C-type and offset to read/write. The second
pair uses a struct describing the layout of the mapping in order to
calculate the offset and size being read/written.

We could use readb, readw, readl, readq and the write* counterparts,
however due to alignment issues this may not work on all architectures.
If alignment needs to be checked to call the right function, it's not
possible to decide at compile-time which function to call: so just leave
the decision to the memcpy function that will do exactly that.

Finally, in order to use the above macros with a map derived from
another, add another initializer: IOSYS_MAP_INIT_OFFSET().

Cc: Sumit Semwal 
Cc: Christian König 
Cc: Thomas Zimmermann 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
  include/linux/iosys-map.h | 154 +-
  1 file changed, 153 insertions(+), 1 deletion(-)

diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index edd7fa3be9e9..96f8b61ac6fb 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -6,6 +6,7 @@
  #ifndef __IOSYS_MAP_H__
  #define __IOSYS_MAP_H__
  
+#include 

  #include 
  #include 
  
@@ -133,6 +134,45 @@ static inline void iosys_map_set_vaddr(struct iosys_map *map, void *vaddr)

map->is_iomem = false;
  }
  
+/**

+ * IOSYS_MAP_INIT_OFFSET - Initializes struct iosys_map from another iosys_map
+ * @map_:  The dma-buf mapping structure to copy from
+ * @offset_:   Offset to add to the other mapping
+ *
+ * Initializes a new iosys_map struct based on another passed as argument. It
+ * does a shallow copy of the struct so it's possible to update the back 
storage
+ * without changing where the original map points to. It is the equivalent of
+ * doing:
+ *
+ * .. code-block: c
+ *
+ * iosys_map map = other_map;
+ * iosys_map_incr(&map, &offset);
+ *
+ * Example usage:
+ *
+ * .. code-block: c
+ *
+ * void foo(struct device *dev, struct iosys_map *base_map)
+ * {
+ * ...
+ * struct iosys_map map = IOSYS_MAP_INIT_OFFSET(base_map, 
FIELD_OFFSET);
+ * ...
+ * }
+ *
+ * The advantage of using the initializer over just increasing the offset with
+ * ``iosys_map_incr()`` like above is that the new map will always point to the
+ * right place of the buffer during  its scope. It reduces the risk of updating
+ * the wrong part of the buffer and having no compiler warning about that. If
+ * the assignment to IOSYS_MAP_INIT_OFFSET() is forgotten, the compiler can 
warn
+ * using a uninitialized variable.
+ */
+#define IOSYS_MAP_INIT_OFFSET(map_, offset_)   (struct iosys_map)  \
+   {   \
+   .vaddr = (map_)->vaddr + (offset_),  \
+   .is_iomem = (map_)->is_iomem,\
+   }


I already nak'ed this macro. This works because of the aliasing rules 
within the union and the fact that there are only plain pointers. But 
this is fragile. Do anything more complicated and it breaks. There's not 
even a test that would tell you that it failed.


Therefore, struct iosys_map should only be initialized by the code that 
creates the stored pointer.


However, you won't need the offset'ed iosys_map because the 
memcpy_to/from helpers now have the offset parameter.





+
  /**
   * iosys_map_set_vaddr_iomem - Sets a iosys mapping structure to an address 
in I/O memory
   * @map:  The iosys_map structure
@@ -220,7 +260,7 @@ static inline void iosys_map_clear(struct iosys_map *map)
  }
  
  /**

- * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map
+ * iosys_map_memcpy_to - Memcpy into iosys_map


That's the fix for the other patch. :)


   * @dst:  The iosys_map structure
   * @dst_offset:   The offset from which to copy
   * @src:  The source buffer
@@ -239,6 +279,26 @@ static inline void iosys_map_memcpy_to(struct iosys_map 
*dst, size_t dst_offset,
memcpy(dst->vaddr + dst_offset, src, len);
  }
  
+/**

+ *

Re: [PATCH 04/19] drm/i915/gt: Add helper for shmem copy to iosys_map

2022-02-04 Thread Thomas Zimmermann


Hi

Am 04.02.22 um 18:44 schrieb Lucas De Marchi:

Add a variant of shmem_read() that takes a iosys_map pointer rather
than a plain pointer as argument. It's mostly a copy __shmem_rw() but
adapting the api and removing the write support since there's currently
only need to use iosys_map as destination.

Reworking __shmem_rw() to share the implementation was tempting, but
finding a good balance between reuse and clarity pushed towards a little
code duplication. Since the function is small, just add the similar
function with a copy/paste/adapt approach.

Cc: Matt Roper 
Cc: Joonas Lahtinen 
Cc: Tvrtko Ursulin 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Signed-off-by: Lucas De Marchi 
---
  drivers/gpu/drm/i915/gt/shmem_utils.c | 33 +++
  drivers/gpu/drm/i915/gt/shmem_utils.h |  3 +++
  2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c 
b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 0683b27a3890..764adefdb4be 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -3,6 +3,7 @@
   * Copyright © 2020 Intel Corporation
   */
  
+#include 

  #include 
  #include 
  #include 
@@ -123,6 +124,38 @@ static int __shmem_rw(struct file *file, loff_t off,
return 0;
  }
  


Here's a good example of how to avoid iosys_map_incr() and use the 
memcpy offset:



+int shmem_read_to_iosys_map(struct file *file, loff_t off,
+   struct iosys_map *map, size_t len)
+{
+   struct iosys_map map_iter = *map;


Rather replace map_iter with something like

  unsigned long map_off = 0;


+   unsigned long pfn;
+
+   for (pfn = off >> PAGE_SHIFT; len; pfn++) {
+   unsigned int this =
+   min_t(size_t, PAGE_SIZE - offset_in_page(off), len);
+   struct page *page;
+   void *vaddr;
+
+   page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
+  GFP_KERNEL);
+   if (IS_ERR(page))
+   return PTR_ERR(page);
+
+   vaddr = kmap(page);
+   iosys_map_memcpy_to(&map_iter, 0, vaddr + offset_in_page(off),
+   this);


Use map_off to index into map directly.


+   mark_page_accessed(page);
+   kunmap(page);
+   put_page(page);
+
+   len -= this;
+   iosys_map_incr(&map_iter, this);


Raplace iosys_map_incr() with map_off += this;


+   off = 0;


Maybe off += this ?

I think this pattern should be applied to all similar code. As you 
already noted, iosys_map_incr() is problematic.


Best regards
Thomas


+   }
+
+   return 0;
+}
+
  int shmem_read(struct file *file, loff_t off, void *dst, size_t len)
  {
return __shmem_rw(file, off, dst, len, false);
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.h 
b/drivers/gpu/drm/i915/gt/shmem_utils.h
index c1669170c351..e1784999faee 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.h
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.h
@@ -8,6 +8,7 @@
  
  #include 
  
+struct iosys_map;

  struct drm_i915_gem_object;
  struct file;
  
@@ -17,6 +18,8 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj);

  void *shmem_pin_map(struct file *file);
  void shmem_unpin_map(struct file *file, void *ptr);
  
+int shmem_read_to_iosys_map(struct file *file, loff_t off,

+   struct iosys_map *map, size_t len);
  int shmem_read(struct file *file, loff_t off, void *dst, size_t len);
  int shmem_write(struct file *file, loff_t off, void *src, size_t len);
  


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH v2 2/4] drm/tiny: Add driver for Solomon SSD130X OLED displays

2022-02-04 Thread Javier Martinez Canillas

Hello Andy,

Thanks for your feedback.

On 2/4/22 15:26, Andy Shevchenko wrote:
> On Fri, Feb 04, 2022 at 02:43:45PM +0100, Javier Martinez Canillas wrote:
>> Add a DRM driver for SSD1305, SSD1306, SSD1307 and SSD1309 Solomon OLED
>> controllers that can be programmed via an I2C interface. This is a port
>> of the ssd1307fb driver that already supports these devices.
>>
>> A Device Tree binding is not added because the DRM driver is compatible
>> with the existing binding for the ssd1307fb driver.
> 
> ...
> 
>> +/*
>> + * DRM driver for Solomon SSD130X OLED displays
>> + *
>> + * Copyright 2022 Red Hat Inc.
>> + *
>> + * Based on drivers/video/fbdev/ssd1307fb.c
>> + * Copyright 2012 Free Electrons
> 
>> + *
> 
> No need for this blank line.
>

Ok.
 
>> + */
> 
> ...
> 
>> +struct ssd130x_device {
>> +struct drm_device drm;
>> +struct drm_simple_display_pipe pipe;
>> +struct drm_display_mode mode;
>> +struct drm_connector connector;
> 
> 
>> +struct i2c_client *client;
> 
> Can we logically separate hw protocol vs hw interface from day 1, please?
> This will allow to add SPI support for this panel much easier.
> 
> Technically I would like to see here
> 
>   struct device *dev;
>
> and probably (I haven't looked into design)
> 
>   struct ssd130x_ops *ops;
> 
> or something alike.
>

Sure. I wanted to keep the driver simple, making the writes bus agnostic and
adding a level of indirection would make it more complex. But I agree that
it will also make easier to add more buses later. I will do that for v3.

[snip]

> 
>> +static inline int ssd130x_write_value(struct i2c_client *client, u8 value)
> 
> Not sure inline does anything useful here.
> Ditto for the rest similar cases.
>

Ok, I'll drop them.
 
> ...
> 
>> +static inline int ssd130x_write_cmd(struct i2c_client *client, int count,
>> +/* u8 cmd, u8 value, ... */...)
>> +{
>> +va_list ap;
>> +u8 value;
>> +int ret;
>> +
>> +va_start(ap, count);
> 
>> +while (count--) {
>> +value = va_arg(ap, int);
>> +ret = ssd130x_write_value(client, (u8)value);
>> +if (ret)
>> +goto out_end;
>> +}
> 
> I'm wondering if this can be written in a form
> 
>   do {
>   ...
>   } while (--count);
> 
> In this case it will give a hint that count can't be 0.
>

Sure, I don't have a strong preference. I will change it.

[snip]
 
>> +ssd130x->pwm = pwm_get(dev, NULL);
>> +if (IS_ERR(ssd130x->pwm)) {
>> +dev_err(dev, "Could not get PWM from device tree!\n");
> 
> "device tree" is a bit confusing here if I run this on ACPI.
> Maybe something like "firmware description"?
>

Indeed.
 
>> +return PTR_ERR(ssd130x->pwm);
>> +}
> 
> ...
> 
>> +/* Set initial contrast */
>> +ret = ssd130x_write_cmd(ssd130x->client, 2, SSD130X_CONTRAST, 
>> ssd130x->contrast);
> 
> Creating a local variable for client allows to:
> - make lines shorter and might even be less LOCs
> - allow to convert struct device to client in one place
>   (as per my above comment)
> 
> Ditto for other similar cases.
>

Ok.
 
[snip]

>> +/* Switch to horizontal addressing mode */
>> +ret = ssd130x_write_cmd(ssd130x->client, 2, SSD130X_SET_ADDRESS_MODE,
>> +SSD130X_SET_ADDRESS_MODE_HORIZONTAL);
>> +if (ret < 0)
>> +return ret;
>> +
>> +return 0;
> 
> Can it be
> 
>   return ssd130x_write_cmd(...);
> 
> ?
> 
> ...
>

Yes.

>> +unsigned int line_length = DIV_ROUND_UP(width, 8);
>> +unsigned int pages = DIV_ROUND_UP(height, 8);
> 
> For power of two there are more efficient roundup()/rounddown()
> (or with _ in the names, I don't remember by heart).
>

Oh, I didn't know about round_{up,down}(). Thanks a lot for the pointer.

> ...
> 
>> +for (k = 0; k < m; k++) {
> 
>> +u8 byte = buf[(8 * i + k) * line_length +
>> +   j / 8];
> 
> One line?
>

Yes.

>> +u8 bit = (byte >> (j % 8)) & 1;
>> +
>> +data |= bit << k;
>> +}
> 
> ...
> 
>> +static int ssd130x_display_pipe_mode_valid(struct drm_simple_display_pipe 
>> *pipe,
>> +   const struct drm_display_mode *mode)
>> +{
>> +struct ssd130x_device *ssd130x = drm_to_ssd130x(pipe->crtc.dev);
>> +
>> +if (mode->hdisplay != ssd130x->mode.hdisplay &&
>> +mode->vdisplay != ssd130x->mode.vdisplay)
>> +return MODE_ONE_SIZE;
> 
>> +else if (mode->hdisplay != ssd130x->mode.hdisplay)
>> +return MODE_ONE_WIDTH;
>> +else if (mode->vdisplay != ssd130x->mode.vdisplay)
>> +return MODE_ONE_HEIGHT;
> 
> 'else' in both cases is redundant.
>

Indeed.
 
>> +return MODE_OK;
>> +}
> 
> ...
> 
>> +poweroff:
> 
> out_power_off: ?
>

Ok.
 
> ...
> 
>> +if (!fb)
>> +

Re: [PATCH 07/19] drm/i915/guc: Convert golden context init to iosys_map

2022-02-04 Thread Thomas Zimmermann


Hi

Am 04.02.22 um 18:44 schrieb Lucas De Marchi:

Now the map is saved during creation, so use it to initialize the
golden context, reading from shmem and writing to either system or IO
memory.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 +++---
  1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 3a0afce7564e..d32b407a2d25 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -473,18 +473,17 @@ static struct intel_engine_cs *find_engine_state(struct 
intel_gt *gt, u8 engine_
  
  static void guc_init_golden_context(struct intel_guc *guc)

  {
-   struct __guc_ads_blob *blob = guc->ads_blob;
struct intel_engine_cs *engine;
struct intel_gt *gt = guc_to_gt(guc);
+   struct iosys_map golden_context_map;
u32 addr_ggtt, offset;
u32 total_size = 0, alloc_size, real_size;
u8 engine_class, guc_class;
-   u8 *ptr;
  
  	if (!intel_uc_uses_guc_submission(>->uc))

return;
  
-	GEM_BUG_ON(!blob);

+   GEM_BUG_ON(iosys_map_is_null(&guc->ads_map));
  
  	/*

 * Go back and fill in the golden context data now that it is
@@ -492,15 +491,15 @@ static void guc_init_golden_context(struct intel_guc *guc)
 */
offset = guc_ads_golden_ctxt_offset(guc);
addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
-   ptr = ((u8 *)blob) + offset;
+
+   golden_context_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset);
  
  	for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; ++engine_class) {

if (engine_class == OTHER_CLASS)
continue;
  
  		guc_class = engine_class_to_guc_class(engine_class);

-
-   if (!blob->system_info.engine_enabled_masks[guc_class])
+   if (!ads_blob_read(guc, 
system_info.engine_enabled_masks[guc_class]))
continue;
  
  		real_size = intel_engine_context_size(gt, engine_class);

@@ -511,18 +510,20 @@ static void guc_init_golden_context(struct intel_guc *guc)
if (!engine) {
drm_err(>->i915->drm, "No engine state recorded for class 
%d!\n",
engine_class);
-   blob->ads.eng_state_size[guc_class] = 0;
-   blob->ads.golden_context_lrca[guc_class] = 0;
+   ads_blob_write(guc, ads.eng_state_size[guc_class], 0);
+   ads_blob_write(guc, ads.golden_context_lrca[guc_class], 
0);
continue;
}
  
-		GEM_BUG_ON(blob->ads.eng_state_size[guc_class] !=

+   GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) !=
   real_size - LRC_SKIP_SIZE);
-   GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != 
addr_ggtt);
+   GEM_BUG_ON(ads_blob_read(guc, 
ads.golden_context_lrca[guc_class]) != addr_ggtt);
+
addr_ggtt += alloc_size;
  
-		shmem_read(engine->default_state, 0, ptr, real_size);

-   ptr += alloc_size;
+   shmem_read_to_iosys_map(engine->default_state, 0,
+   &golden_context_map, real_size);
+   iosys_map_incr(&golden_context_map, alloc_size);


Use an offset to index into iosys_map. Even if that means to add another 
parameter to shmem_read_to_iosys_map(). This will save you 
IOSYS_MAP_INIT_OFFSET() and iosys_map_incr().


Best regards
Thomas


}
  
  	GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH 09/19] drm/i915/guc: Convert engine record to iosys_map

2022-02-04 Thread Thomas Zimmermann


Hi

Am 04.02.22 um 18:44 schrieb Lucas De Marchi:

Use iosys_map to read fields from the dma_blob so access to IO and
system memory is abstracted away.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  | 14 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h  |  3 ++-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c   | 17 ++---
  3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 6311b9da87e4..1d21a2d457e0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -698,18 +698,16 @@ void intel_guc_ads_reset(struct intel_guc *guc)
  
  u32 intel_guc_engine_usage_offset(struct intel_guc *guc)

  {
-   struct __guc_ads_blob *blob = guc->ads_blob;
-   u32 base = intel_guc_ggtt_offset(guc, guc->ads_vma);
-   u32 offset = base + ptr_offset(blob, engine_usage);
-
-   return offset;
+   return intel_guc_ggtt_offset(guc, guc->ads_vma) +
+   offsetof(struct __guc_ads_blob, engine_usage);
  }
  
-struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs *engine)

+struct iosys_map intel_guc_engine_usage_record_map(struct intel_engine_cs 
*engine)
  {
struct intel_guc *guc = &engine->gt->uc.guc;
-   struct __guc_ads_blob *blob = guc->ads_blob;
u8 guc_class = engine_class_to_guc_class(engine->class);
+   size_t offset = offsetof(struct __guc_ads_blob,
+
engine_usage.engines[guc_class][ilog2(engine->logical_mask)]);
  
-	return &blob->engine_usage.engines[guc_class][ilog2(engine->logical_mask)];

+   return IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset);


Here's one of the few cases where you can legitimately make a copy of an 
iosys_map buffer and call iosys_map_incr() on it. Saves you the 
IOSYS_MAP_INIT_OFFSET().


Best regards
Thomas


  }
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
index e74c110facff..1c64f4d6ea21 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
@@ -7,6 +7,7 @@
  #define _INTEL_GUC_ADS_H_
  
  #include 

+#include 
  
  struct intel_guc;

  struct drm_printer;
@@ -18,7 +19,7 @@ void intel_guc_ads_init_late(struct intel_guc *guc);
  void intel_guc_ads_reset(struct intel_guc *guc);
  void intel_guc_ads_print_policy_info(struct intel_guc *guc,
 struct drm_printer *p);
-struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs 
*engine);
+struct iosys_map intel_guc_engine_usage_record_map(struct intel_engine_cs 
*engine);
  u32 intel_guc_engine_usage_offset(struct intel_guc *guc);
  
  #endif

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b3a429a92c0d..6d34842f68b4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1139,6 +1139,9 @@ __extend_last_switch(struct intel_guc *guc, u64 
*prev_start, u32 new_start)
*prev_start = ((u64)gt_stamp_hi << 32) | new_start;
  }
  
+#define record_read(map_, field_) \

+   iosys_map_rd_field(map_, struct guc_engine_usage_record, field_)
+
  /*
   * GuC updates shared memory and KMD reads it. Since this is not synchronized,
   * we run into a race where the value read is inconsistent. Sometimes the
@@ -1153,17 +1156,17 @@ __extend_last_switch(struct intel_guc *guc, u64 
*prev_start, u32 new_start)
  static void __get_engine_usage_record(struct intel_engine_cs *engine,
  u32 *last_in, u32 *id, u32 *total)
  {
-   struct guc_engine_usage_record *rec = intel_guc_engine_usage(engine);
+   struct iosys_map rec_map = intel_guc_engine_usage_record_map(engine);
int i = 0;
  
  	do {

-   *last_in = READ_ONCE(rec->last_switch_in_stamp);
-   *id = READ_ONCE(rec->current_context_index);
-   *total = READ_ONCE(rec->total_runtime);
+   *last_in = record_read(&rec_map, last_switch_in_stamp);
+   *id = record_read(&rec_map, current_context_index);
+   *total = record_read(&rec_map, total_runtime);
  
-		if (READ_ONCE(rec->last_switch_in_stamp) == *last_in &&

-   READ_ONCE(rec->current_context_index) == *id &&
-   READ_ONCE(rec->total_runtime) == *total)
+   if (record_read(&rec_map, last_switch_in_stamp) == *last_in &&
+   record_read(&rec_map, current_context_index) == *id &&
+   record_read(&rec_map, total_runtime) == *total)
break;
} while (++i < 6);
  }


--
Thomas Zimmermann
Graphi

Re: [PATCH 02/19] iosys-map: Add offset to iosys_map_memcpy_to()

2022-02-04 Thread Lucas De Marchi


On Fri, Feb 04, 2022 at 07:48:10PM +0100, Thomas Zimmermann wrote:

Hi

Am 04.02.22 um 18:44 schrieb Lucas De Marchi:

In certain situations it's useful to be able to write to an
offset of the mapping. Add a dst_offset to iosys_map_memcpy_to().

Cc: Sumit Semwal 
Cc: Christian König 
Cc: Thomas Zimmermann 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/drm_cache.c |  2 +-
 drivers/gpu/drm/drm_fb_helper.c |  2 +-
 include/linux/iosys-map.h   | 17 +
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 66597e411764..c3e6e615bf09 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -218,7 +218,7 @@ static void memcpy_fallback(struct iosys_map *dst,
if (!dst->is_iomem && !src->is_iomem) {
memcpy(dst->vaddr, src->vaddr, len);
} else if (!src->is_iomem) {
-   iosys_map_memcpy_to(dst, src->vaddr, len);
+   iosys_map_memcpy_to(dst, 0, src->vaddr, len);
} else if (!dst->is_iomem) {
memcpy_fromio(dst->vaddr, src->vaddr_iomem, len);
} else {
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 238f815cb2a0..bf5cc9a42e5a 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -385,7 +385,7 @@ static void drm_fb_helper_damage_blit_real(struct 
drm_fb_helper *fb_helper,
iosys_map_incr(dst, offset); /* go to first pixel within clip rect */
for (y = clip->y1; y < clip->y2; y++) {
-   iosys_map_memcpy_to(dst, src, len);
+   iosys_map_memcpy_to(dst, 0, src, len);
iosys_map_incr(dst, fb->pitches[0]);
src += fb->pitches[0];
}
diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h
index f4186f91caa6..edd7fa3be9e9 100644
--- a/include/linux/iosys-map.h
+++ b/include/linux/iosys-map.h
@@ -220,22 +220,23 @@ static inline void iosys_map_clear(struct iosys_map *map)
 }
 /**
- * iosys_map_memcpy_to - Memcpy into iosys mapping
+ * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map


'iosys_map_memcpy_to'

With that fixed:

Reviewed-by: Thomas Zimmermann 


thanks, I noticed that, but looks like I squashed to the wrong patch.

Lucas De Marchi

Re: [PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888,gray8}_to_mono_reversed()

2022-02-04 Thread Javier Martinez Canillas

Hello Thomas,

Thanks a lot for your feedback.

On 2/4/22 16:52, Thomas Zimmermann wrote:

[snip]

>> +static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, 
>> size_t pixels)
>> +{
>> +unsigned int xb, i;
>> +
>> +for (xb = 0; xb < pixels / 8; xb++) {
> 
> In practice, all mode widths are multiples of 8 because VGA mandated it. 
> So it's ok-ish to assume this here. You should probably at least print a 
> warning somewhere if (pixels % 8 != 0)
>

Agreed.
 
[snip]

>> + * DRM doesn't have native monochrome or grayscale support.
>> + * Such drivers can announce the commonly supported XR24 format to userspace
>> + * and use drm_fb_xrgb_to_gray8() to convert to grayscale and then this
>> + * helper function to convert to the native format.
>> + */
>> +void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const 
>> void *src,
>> +   const struct drm_rect *clip)
> 
> There's a bug here. You want to pass in a drm_framebuffer as fourth 
> argument.
>
>> +{
>> +
>> +size_t height = drm_rect_height(clip);
>> +size_t width = drm_rect_width(clip);
>> +unsigned int y;
>> +const u8 *gray8 = src;
>> +u8 *mono = dst;
>> +
>> +if (!dst_pitch)
>> +dst_pitch = width;
> 
> The dst_pitch is given in bytes. You have to device by 8. Here would be 
> a good place to warn if (width % 8 != 0).
>

Ok.
 
>> +
>> +for (y = 0; y < height; y++) {
>> +drm_fb_gray8_to_mono_reversed_line(mono, gray8, dst_pitch);
>> +mono += (dst_pitch / 8);
> 
> The dst_pitch is already given in bytes.
>

Yes, I know but for reversed mono we want only 1/8 of the width since we
are converting from 8 bits per pixel greyscale to 1 bit per pixel mono.

Or am I misunderstanding what you meant ?

>> +gray8 += dst_pitch;
> 
> 'gray8 += fb->pitches[0]' would be correct.
>

Ok.
 
[snip]

>> + */
>> +void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, 
>> const void *src,
>> +  const struct drm_framebuffer *fb,
>> +  const struct drm_rect *clip)
>> +{
>> +if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB))
>> +return;
>> +
>> +if (!dst_pitch)
>> +dst_pitch = drm_rect_width(clip);
>> +
>> +drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip);
>> +drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip);
> 
> Converting from dst into dst can give incorrect results. At some point 
> we probably want to add restrict qualifiers to these pointers, to help 
> the compiler with optimizing.
> 
> A better approach here is to pull the per-line conversion from 
> drm_fb_xrgb_to_gray8() into a separate helper and implement a 
> line-by-line conversion here. something like this:
> 
>drm_fb_xrgb_to_mono_reversed()
>{
>  char *tmp = kmalloc(size of a single line of gray8)
> 
>  for (heigth) {
> drm_fb_xrgb_to_gray8_line(tmp, ..., src, ...);
> drm_fb_gray8_to_mono_reversed(dst, ..., tmp, ...);
> 
> src += fb->pitches[0]
> dst += dst_pitch;
>  }
> 
>  kfree(tmp);
>}
>

I see. Yes, that sounds a much better approach. I'll change it in v3.
 
Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat

[PATCH] drm/nouveau/backlight: Just set all backlight types as RAW

2022-02-04 Thread Lyude Paul

Currently we can get a warning on systems with eDP backlights like so:

  nv_backlight: invalid backlight type
  WARNING: CPU: 4 PID: 454 at drivers/video/backlight/backlight.c:420
backlight_device_register+0x226/0x250

This happens as a result of us not filling out props.type for the eDP
backlight, even though we do it for all other backlight types.

Since nothing in our driver uses anything but BACKLIGHT_RAW, let's take the
props\.type assignments out of the codepaths for individual backlight types
and just set it unconditionally to prevent this from happening again.

Signed-off-by: Lyude Paul 
Fixes: 6eca310e8924 ("drm/nouveau/kms/nv50-: Add basic DPCD backlight support 
for nouveau")
Cc:  # v5.15+
---
 drivers/gpu/drm/nouveau/nouveau_backlight.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c 
b/drivers/gpu/drm/nouveau/nouveau_backlight.c
index 6af12dc99d7f..daf9f87477ba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
+++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
@@ -101,7 +101,6 @@ nv40_backlight_init(struct nouveau_encoder *encoder,
if (!(nvif_rd32(device, NV40_PMC_BACKLIGHT) & NV40_PMC_BACKLIGHT_MASK))
return -ENODEV;
 
-   props->type = BACKLIGHT_RAW;
props->max_brightness = 31;
*ops = &nv40_bl_ops;
return 0;
@@ -343,7 +342,6 @@ nv50_backlight_init(struct nouveau_backlight *bl,
else
*ops = &nva3_bl_ops;
 
-   props->type = BACKLIGHT_RAW;
props->max_brightness = 100;
 
return 0;
@@ -411,6 +409,7 @@ nouveau_backlight_init(struct drm_connector *connector)
goto fail_alloc;
}
 
+   props.type = BACKLIGHT_RAW;
bl->dev = backlight_device_register(backlight_name, connector->kdev,
nv_encoder, ops, &props);
if (IS_ERR(bl->dev)) {
-- 
2.34.1

Re: [PATCH 13/21] fbcon: move more common code into fb_open()

2022-02-04 Thread Sam Ravnborg

On Mon, Jan 31, 2022 at 10:05:44PM +0100, Daniel Vetter wrote:
> No idea why con2fb_acquire_newinfo() initializes much less than
> fbcon_startup(), but so be it. From a quick look most of the
> un-initialized stuff should be fairly harmless, but who knows.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Daniel Vetter 
> Cc: Greg Kroah-Hartman 
> Cc: Tetsuo Handa 
> Cc: Thomas Zimmermann 
> Cc: Claudio Suarez 
> Cc: Du Cheng 
> ---
>  drivers/video/fbdev/core/fbcon.c | 74 +---
>  1 file changed, 31 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/video/fbdev/core/fbcon.c 
> b/drivers/video/fbdev/core/fbcon.c
> index b83a5a77d8a8..5a3391ff038d 100644
> --- a/drivers/video/fbdev/core/fbcon.c
> +++ b/drivers/video/fbdev/core/fbcon.c
> @@ -680,8 +680,18 @@ static int fbcon_invalid_charcount(struct fb_info *info, 
> unsigned charcount)
>  
>  #endif /* CONFIG_MISC_TILEBLITTING */
>  
> +static void fbcon_release(struct fb_info *info)
> +{
> + if (info->fbops->fb_release)
> + info->fbops->fb_release(info, 0);
> +
> + module_put(info->fbops->owner);
> +}
> +
>  static int fbcon_open(struct fb_info *info)
>  {
> + struct fbcon_ops *ops;
> +
>   if (!try_module_get(info->fbops->owner))
>   return -ENODEV;
>  
> @@ -691,19 +701,22 @@ static int fbcon_open(struct fb_info *info)
>   return -ENODEV;
>   }
>  
> - return 0;
> -}
> + ops = kzalloc(sizeof(struct fbcon_ops), GFP_KERNEL);
> + if (!ops) {
> + fbcon_release(info);
> + return -ENOMEM;
> + }
>  
> -static void fbcon_release(struct fb_info *info)
> -{
> - if (info->fbops->fb_release)
> - info->fbops->fb_release(info, 0);
> + INIT_DELAYED_WORK(&ops->cursor_work, fb_flashcursor);
> + ops->info = info;
> + info->fbcon_par = ops;
> + ops->cur_blink_jiffies = HZ / 5;
>  
> - module_put(info->fbops->owner);
> + return 0;
>  }
>  
>  static int con2fb_acquire_newinfo(struct vc_data *vc, struct fb_info *info,
> -   int unit, int oldidx)
> +   int unit)
>  {
>   struct fbcon_ops *ops = NULL;
>   int err;
> @@ -712,27 +725,10 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, 
> struct fb_info *info,
>   if (err)
>   return err;
>  
> - if (!err) {
> - ops = kzalloc(sizeof(struct fbcon_ops), GFP_KERNEL);
> - if (!ops)
> - err = -ENOMEM;
> -
> - INIT_DELAYED_WORK(&ops->cursor_work, fb_flashcursor);
> - }
> -
> - if (!err) {
> - ops->cur_blink_jiffies = HZ / 5;
> - ops->info = info;
> - info->fbcon_par = ops;
> -
> - if (vc)
> - set_blitting_type(vc, info);
> - }
> + ops = info->fbcon_par;
>  
> - if (err) {
> - con2fb_map[unit] = oldidx;
> - fbcon_release(info);
> - }
> + if (vc)
> + set_blitting_type(vc, info);
>  
>   return err;
>  }
> @@ -840,9 +836,11 @@ static int set_con2fb_map(int unit, int newidx, int user)
>  
>   found = search_fb_in_map(newidx);
>  
> - con2fb_map[unit] = newidx;
> - if (!err && !found)
> - err = con2fb_acquire_newinfo(vc, info, unit, oldidx);
> + if (!err && !found) {
> + err = con2fb_acquire_newinfo(vc, info, unit);
> + if (!err)
> + con2fb_map[unit] = newidx;
> + }
This looks like an unintentional change of functionality as con2fb_map[unit] is
only assigned when we do a con2fb_acquire_newinfo().

Staring at the code I could not say it is wrong, but not nice to hide
the change in this patch.

Sam


>  
>   /*
>* If old fb is not mapped to any of the consoles,
> @@ -939,20 +937,10 @@ static const char *fbcon_startup(void)
>   if (fbcon_open(info))
>   return NULL;
>  
> - ops = kzalloc(sizeof(struct fbcon_ops), GFP_KERNEL);
> - if (!ops) {
> - fbcon_release(info);
> - return NULL;
> - }
> -
> - INIT_DELAYED_WORK(&ops->cursor_work, fb_flashcursor);
> -
> + ops = info->fbcon_par;
>   ops->currcon = -1;
>   ops->graphics = 1;
>   ops->cur_rotate = -1;
> - ops->cur_blink_jiffies = HZ / 5;
> - ops->info = info;
> - info->fbcon_par = ops;
>  
>   p->con_rotate = initial_rotation;
>   if (p->con_rotate == -1)
> @@ -1022,7 +1010,7 @@ static void fbcon_init(struct vc_data *vc, int init)
>   return;
>  
>   if (!info->fbcon_par)
> - con2fb_acquire_newinfo(vc, info, vc->vc_num, -1);
> + con2fb_acquire_newinfo(vc, info, vc->vc_num);
>  
>   /* If we are not the first console on this
>  fb, copy the font from that console */
> -- 
> 2.33.0

Re: [RFC PATCH 1/3] drm: Extract amdgpu_sa.c as a generic suballocation helper

2022-02-04 Thread Thomas Zimmermann


Hi

Am 04.02.22 um 19:29 schrieb Christian König:

Oh, that's on my TODO list for years!

Am 04.02.22 um 18:48 schrieb Maarten Lankhorst:

Suballocating a buffer object is something that is not driver
generic, and is useful for other drivers as well.

Signed-off-by: Maarten Lankhorst 
---
  drivers/gpu/drm/Makefile   |   4 +-
  drivers/gpu/drm/drm_suballoc.c | 424 +
  include/drm/drm_suballoc.h |  78 ++
  3 files changed, 505 insertions(+), 1 deletion(-)
  create mode 100644 drivers/gpu/drm/drm_suballoc.c
  create mode 100644 include/drm/drm_suballoc.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 8675c2af7ae1..b848bcf8790c 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -57,7 +57,9 @@ drm_kms_helper-y := drm_bridge_connector.o 
drm_crtc_helper.o \

  drm_scdc_helper.o drm_gem_atomic_helper.o \
  drm_gem_framebuffer_helper.o \
  drm_atomic_state_helper.o drm_damage_helper.o \
-    drm_format_helper.o drm_self_refresh_helper.o drm_rect.o
+    drm_format_helper.o drm_self_refresh_helper.o drm_rect.o \
+    drm_suballoc.o
+


I think we should put that into a separate module like we now do with 
other helpers as well.


Please. KMS helpers are now likely to be linked into the kernel binary. 
I've already spent time to reduce the size of the module.


Best regard
Thomas

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH 14/21] fbcon: use lock_fb_info in fbcon_open/release

2022-02-04 Thread Sam Ravnborg

On Mon, Jan 31, 2022 at 10:05:45PM +0100, Daniel Vetter wrote:
> Now we get to the real motiviation, because fbmem.c insists that
> that's the right lock for these.
> 
> Ofc fbcon.c has a lot more places where it probably should call
> lock_fb_info(). But looking at fbmem.c at least most of these seem to
> be protected by console_lock() too, which is probably what papers over
> any issues.
> 
> Note that this means we're shuffling around a bit the locking sections
> for some of the console takeover and unbind paths, but not all:
> - console binding/unbinding from the console layer never with
> lock_fb_info
> - unbind (as opposed to unlink) never bother with lock_fb_info
> 
> Also the real serialization against set_par and set_pan are still
> doing by wrapping the entire ioctl code in console_lock(). So this
> shuffling shouldn't be worse than what we had from a "can you trigger
> races?" pov, but it's at least clearer.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Daniel Vetter 
> Cc: Claudio Suarez 
> Cc: Tetsuo Handa 
> Cc: Thomas Zimmermann 
> Cc: Greg Kroah-Hartman 
> Cc: Du Cheng 
> Cc: Sam Ravnborg 
> Cc: Matthew Wilcox 
> Cc: William Kucharski 
> Cc: Alex Deucher 
> Cc: Zheyu Ma 
> Cc: Zhen Lei 
> Cc: Xiyu Yang 

Well, the patch does what the commit log says.
Acked-by: Sam Ravnborg

1 2 >

1 - 100 of 155 matches

Mail list logo