Re: [PATCH v2] drm/bridge: dw-hdmi: use safe format when first in bridge chain
Hi, On 19/01/2022 13:28, Neil Armstrong wrote: When the dw-hdmi bridge is in first place of the bridge chain, this means there is no way to select an input format of the dw-hdmi HW component. Since introduction of display-connector, negotiation was broken since the dw-hdmi negotiation code only worked when the dw-hdmi bridge was in last position of the bridge chain or behind another bridge also supporting input & output format negotiation. Commit 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts callbacks") was introduced to make negotiation work again by making display-connector act as a pass-through concerning input & output format negotiation. But in the case where the dw-hdmi is single in the bridge chain, for example on Renesas SoCs, with the display-connector bridge the dw-hdmi is no more single, breaking output format. Reported-by: Biju Das Bisected-by: Kieran Bingham Tested-by: Kieran Bingham Fixes: 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts callbacks"). Signed-off-by: Neil Armstrong --- Changes since v1: - Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts - Fix typos in commit message drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c index 54d8fdad395f..97cdc61b57f6 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c @@ -2551,8 +2551,9 @@ static u32 *dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge, if (!output_fmts) return NULL; - /* If dw-hdmi is the only bridge, avoid negociating with ourselves */ - if (list_is_singular(&bridge->encoder->bridge_chain)) { + /* If dw-hdmi is the first or only bridge, avoid negociating with ourselves */ + if (list_is_singular(&bridge->encoder->bridge_chain) || + list_is_first(&bridge->chain_node, &bridge->encoder->bridge_chain)) { *num_output_fmts = 1; output_fmts[0] = MEDIA_BUS_FMT_FIXED; Can a bridge reviewer get an eye on this ? this actually solves an issue reported on a Renesas platform. Neil
Re: [PATCH 21/21] fbdev: Make registered_fb[] private to fbmem.c
Hi Daniel, Thanks for your patch! On Tue, Feb 1, 2022 at 9:50 PM Daniel Vetter wrote: > Well except when the olpc dcon fbdev driver is enabled, that thing > digs around in there in rather unfixable ways. Can't the actual frame buffer driver (which one?) used on olpc export a pointer to its fb_info? > --- a/drivers/video/fbdev/core/fbmem.c > +++ b/drivers/video/fbdev/core/fbmem.c > @@ -48,10 +48,14 @@ > static DEFINE_MUTEX(registration_lock); > > struct fb_info *registered_fb[FB_MAX] __read_mostly; > -EXPORT_SYMBOL(registered_fb); > - > int num_registered_fb __read_mostly; > +#if IS_ENABLED(CONFIG_OLPC_DCON) CONFIG_FB_OLPC_DCON (everywhere), cfr. the build failure reported by the robot. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH v4 3/3] fbcon: Add option to enable legacy hardware acceleration
Hi Helge, Thanks for your patch! On Wed, Feb 2, 2022 at 8:05 PM Helge Deller wrote: > Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to > enable bitblt and fillrect hardware acceleration in the framebuffer > console. If disabled, such acceleration will not be used, even if it is > supported by the graphics hardware driver. Note that this also applies to vertical panning and wrapping. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH] HPE BMC GXP SUPPORT
On 03/02/2022 18:07, Verdun, Jean-Marie wrote: > >> Maybe it does not look like, but this is actually a v2. Nick was asked >> to change the naming for the nodes already in v1. Unfortunately it did >> not happen, so we have vuart, spifi, vic and more. > >> It is a waste of reviewers' time to ask them to perform the same review >> twice or to ignore their comments. > > Hi Krysztof, > > Accept our apologies if you think you lose your time, it is clearly not > our > intent. > > This is the first time that we (I mean the team) introduce a new arch into > the linux kernel and I must admit that we had hard time to understand > from which angle we needed to start. > > I will probably write a Documentation afterward, as it is easy to find doc > on how to introduce a patch or a driver, but not when you want to > introduce a new chip. > > We are trying to do our best, and clearly want to follow all of your > inputs. > Mistakes happen and they are clearly not intentional and not driven in > a way to make you lose your time. > > Helping others, and teaching something new is definitely a way to > optimize your time and this is what you are currently doing with us. > > We appreciate it and hope you will too. I understand, I also maybe over-reacted on this. Just please go through the comments you got for first submission and either apply them or respond why you disagree. The next submissions (patchset split into several commits) should be a v3, preferably with cover letter (git format-patch --cover-letter -v3 ...) where you can document also changes you did to the patchset. It looks for example like this: https://lore.kernel.org/linux-samsung-soc/31da451b-a36c-74fb-5667-d4193284c...@canonical.com/T/#mf98d2ac27a8481dc69dd110f9861c8318cade252 or like this (where changelogs are in each patch, although ordering is not correct because dt-bindings should be the first in the series): https://lore.kernel.org/all/20220103233948.198119-1-mr.bossman...@gmail.com/ Best regards, Krzysztof
Re: [PATCH v3 2/2] drm/i915/uapi: Add query for hwconfig table
On Wed, Jan 19, 2022 at 9:35 PM wrote: > > From: Rodrigo Vivi > > GuC contains a consolidated table with a bunch of information about the > current device. > > Previously, this information was spread and hardcoded to all the components > including GuC, i915 and various UMDs. The goal here is to consolidate > the data into GuC in a way that all interested components can grab the > very latest and synchronized information using a simple query. > > As per most of the other queries, this one can be called twice. > Once with item.length=0 to determine the exact buffer size, then > allocate the user memory and call it again for to retrieve the > table data. For example: > struct drm_i915_query_item item = { > .query_id = DRM_I915_QUERY_HWCONCFIG_TABLE; > }; > query.items_ptr = (int64_t) &item; > query.num_items = 1; > > ioctl(fd, DRM_IOCTL_I915_QUERY, query, sizeof(query)); > > if (item.length <= 0) > return -ENOENT; > > data = malloc(item.length); > item.data_ptr = (int64_t) &data; > ioctl(fd, DRM_IOCTL_I915_QUERY, query, sizeof(query)); > > // Parse the data as appropriate... > > The returned array is a simple and flexible KLV (Key/Length/Value) > formatted table. For example, it could be just: > enum device_attr { > ATTR_SOME_VALUE = 0, > ATTR_SOME_MASK = 1, > }; > > static const u32 hwconfig[] = { > ATTR_SOME_VALUE, > 1, // Value Length in DWords > 8, // Value > > ATTR_SOME_MASK, > 3, > 0x00, 0x, 0xFF00, > }; > > The attribute ids are defined in a hardware spec. > > Cc: Tvrtko Ursulin > Cc: Kenneth Graunke > Cc: Michal Wajdeczko > Cc: Slawomir Milczarek > Signed-off-by: Rodrigo Vivi > Signed-off-by: John Harrison > Reviewed-by: Matthew Brost > --- > drivers/gpu/drm/i915/i915_query.c | 23 +++ > include/uapi/drm/i915_drm.h | 1 + > 2 files changed, 24 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_query.c > b/drivers/gpu/drm/i915/i915_query.c > index 2dfbc22857a3..609e64d5f395 100644 > --- a/drivers/gpu/drm/i915/i915_query.c > +++ b/drivers/gpu/drm/i915/i915_query.c > @@ -479,12 +479,35 @@ static int query_memregion_info(struct drm_i915_private > *i915, > return total_length; > } > > +static int query_hwconfig_table(struct drm_i915_private *i915, > + struct drm_i915_query_item *query_item) > +{ > + struct intel_gt *gt = to_gt(i915); > + struct intel_guc_hwconfig *hwconfig = >->uc.guc.hwconfig; > + > + if (!hwconfig->size || !hwconfig->ptr) > + return -ENODEV; > + > + if (query_item->length == 0) > + return hwconfig->size; > + > + if (query_item->length < hwconfig->size) > + return -EINVAL; > + > + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), > +hwconfig->ptr, hwconfig->size)) > + return -EFAULT; > + > + return hwconfig->size; > +} > + > static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv, > struct drm_i915_query_item > *query_item) = { > query_topology_info, > query_engine_info, > query_perf_config, > query_memregion_info, > + query_hwconfig_table, > }; > > int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file > *file) > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > index 914ebd9290e5..132515199f27 100644 > --- a/include/uapi/drm/i915_drm.h > +++ b/include/uapi/drm/i915_drm.h > @@ -2685,6 +2685,7 @@ struct drm_i915_query_item { > #define DRM_I915_QUERY_ENGINE_INFO 2 > #define DRM_I915_QUERY_PERF_CONFIG 3 > #define DRM_I915_QUERY_MEMORY_REGIONS 4 > +#define DRM_I915_QUERY_HWCONFIG_TABLE 5 > /* Must be kept compact -- no holes and well documented */ New uapi needs kerneldoc in the uapi header, and please fill in any gaps you have (i.e. if the query uapi this is built on top of isn't fully documented yet). Also this holds across the board, so please keep in mind in patch review. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 04/21] fbcon: delete a few unneeded forward decl
On Tue, Feb 1, 2022 at 9:50 PM Daniel Vetter wrote: > I didn't bother with any code movement to fix the others, these just > got a bit in the way. > > Signed-off-by: Daniel Vetter Reviewed-by: Geert Uytterhoeven Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Add warning for nesting dma_fence containers
Hi everyone, Since some operations can then lead to recursive handling nesting dma_fence containers into each other is only allowed under some restrictions. dma_fence_array containers can be attached to dma_fence_chain containers and dma_fence_chain containers by chaining them together. In all other cases the individual fences should be extracted with the appropriate iterators and added to the new containers individually. I've separated the i915 cleanup from this change since it is generally a different functionality and the build bots complained about some issues with those patches. Most patches are already reviewd, but especially the first one still needs an rb tag. Please review and comment, Christian.
[PATCH 1/6] dma-buf: consolidate dma_fence subclass checking
Consolidate the wrapper functions to check for dma_fence subclasses in the dma_fence header. This makes it easier to document and also check the different requirements for fence containers in the subclasses. Signed-off-by: Christian König --- include/linux/dma-fence-array.h | 15 + include/linux/dma-fence-chain.h | 3 +-- include/linux/dma-fence.h | 38 + 3 files changed, 40 insertions(+), 16 deletions(-) diff --git a/include/linux/dma-fence-array.h b/include/linux/dma-fence-array.h index 303dd712220f..fec374f69e12 100644 --- a/include/linux/dma-fence-array.h +++ b/include/linux/dma-fence-array.h @@ -45,19 +45,6 @@ struct dma_fence_array { struct irq_work work; }; -extern const struct dma_fence_ops dma_fence_array_ops; - -/** - * dma_fence_is_array - check if a fence is from the array subsclass - * @fence: fence to test - * - * Return true if it is a dma_fence_array and false otherwise. - */ -static inline bool dma_fence_is_array(struct dma_fence *fence) -{ - return fence->ops == &dma_fence_array_ops; -} - /** * to_dma_fence_array - cast a fence to a dma_fence_array * @fence: fence to cast to a dma_fence_array @@ -68,7 +55,7 @@ static inline bool dma_fence_is_array(struct dma_fence *fence) static inline struct dma_fence_array * to_dma_fence_array(struct dma_fence *fence) { - if (fence->ops != &dma_fence_array_ops) + if (!fence || !dma_fence_is_array(fence)) return NULL; return container_of(fence, struct dma_fence_array, base); diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma-fence-chain.h index 54fe3443fd2c..ee906b659694 100644 --- a/include/linux/dma-fence-chain.h +++ b/include/linux/dma-fence-chain.h @@ -49,7 +49,6 @@ struct dma_fence_chain { spinlock_t lock; }; -extern const struct dma_fence_ops dma_fence_chain_ops; /** * to_dma_fence_chain - cast a fence to a dma_fence_chain @@ -61,7 +60,7 @@ extern const struct dma_fence_ops dma_fence_chain_ops; static inline struct dma_fence_chain * to_dma_fence_chain(struct dma_fence *fence) { - if (!fence || fence->ops != &dma_fence_chain_ops) + if (!fence || !dma_fence_is_chain(fence)) return NULL; return container_of(fence, struct dma_fence_chain, base); diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 1ea691753bd3..775cdc0b4f24 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -587,4 +587,42 @@ struct dma_fence *dma_fence_get_stub(void); struct dma_fence *dma_fence_allocate_private_stub(void); u64 dma_fence_context_alloc(unsigned num); +extern const struct dma_fence_ops dma_fence_array_ops; +extern const struct dma_fence_ops dma_fence_chain_ops; + +/** + * dma_fence_is_array - check if a fence is from the array subclass + * @fence: the fence to test + * + * Return true if it is a dma_fence_array and false otherwise. + */ +static inline bool dma_fence_is_array(struct dma_fence *fence) +{ + return fence->ops == &dma_fence_array_ops; +} + +/** + * dma_fence_is_chain - check if a fence is from the chain subclass + * @fence: the fence to test + * + * Return true if it is a dma_fence_chain and false otherwise. + */ +static inline bool dma_fence_is_chain(struct dma_fence *fence) +{ + return fence->ops == &dma_fence_chain_ops; +} + +/** + * dma_fence_is_container - check if a fence is a container for other fences + * @fence: the fence to test + * + * Return true if this fence is a container for other fences, false otherwise. + * This is important since we can't build up large fence structure or otherwise + * we run into recursion during operation on those fences. + */ +static inline bool dma_fence_is_container(struct dma_fence *fence) +{ + return dma_fence_is_array(fence) || dma_fence_is_chain(fence); +} + #endif /* __LINUX_DMA_FENCE_H */ -- 2.25.1
[PATCH 2/6] dma-buf: warn about dma_fence_array container rules v2
It's not allowed to nest another dma_fence container into a dma_fence_array or otherwise we can run into recursion. Warn about that when we create a dma_fence_array. v2: fix comment style and typo in the warning pointed out by Thomas Signed-off-by: Christian König Reviewed-by: Daniel Vetter Reviewed-by: Thomas Hellström --- drivers/dma-buf/dma-fence-array.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c index 3e07f961e2f3..cb1bacb5a42b 100644 --- a/drivers/dma-buf/dma-fence-array.c +++ b/drivers/dma-buf/dma-fence-array.c @@ -176,6 +176,20 @@ struct dma_fence_array *dma_fence_array_create(int num_fences, array->base.error = PENDING_ERROR; + /* +* dma_fence_array objects should never contain any other fence +* containers or otherwise we run into recursion and potential kernel +* stack overflow on operations on the dma_fence_array. +* +* The correct way of handling this is to flatten out the array by the +* caller instead. +* +* Enforce this here by checking that we don't create a dma_fence_array +* with any container inside. +*/ + while (num_fences--) + WARN_ON(dma_fence_is_container(fences[num_fences])); + return array; } EXPORT_SYMBOL(dma_fence_array_create); -- 2.25.1
[PATCH 4/6] dma-buf: warn about containers in dma_resv object
Drivers should not add containers as shared fences to the dma_resv object, instead each fence should be added individually. Signed-off-by: Christian König Reviewed-by: Daniel Vetter Reviewed-by: Thomas Hellström --- drivers/dma-buf/dma-resv.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index ee31f15d633a..b51416405e86 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -256,6 +256,11 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence) dma_resv_assert_held(obj); + /* Drivers should not add containers here, instead add each fence +* individually. +*/ + WARN_ON(dma_fence_is_container(fence)); + fobj = dma_resv_shared_list(obj); count = fobj->shared_count; -- 2.25.1
[PATCH 5/6] dma-buf: add dma_fence_chain_contained helper
It's a reoccurring pattern that we need to extract the fence from a dma_fence_chain object. Add a helper for this. Signed-off-by: Christian König --- drivers/dma-buf/dma-fence-chain.c | 6 ++ include/linux/dma-fence-chain.h | 15 +++ 2 files changed, 17 insertions(+), 4 deletions(-) diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c index 084c6927b735..06f8ef97c6e8 100644 --- a/drivers/dma-buf/dma-fence-chain.c +++ b/drivers/dma-buf/dma-fence-chain.c @@ -148,8 +148,7 @@ static bool dma_fence_chain_enable_signaling(struct dma_fence *fence) dma_fence_get(&head->base); dma_fence_chain_for_each(fence, &head->base) { - struct dma_fence_chain *chain = to_dma_fence_chain(fence); - struct dma_fence *f = chain ? chain->fence : fence; + struct dma_fence *f = dma_fence_chain_contained(fence); dma_fence_get(f); if (!dma_fence_add_callback(f, &head->cb, dma_fence_chain_cb)) { @@ -165,8 +164,7 @@ static bool dma_fence_chain_enable_signaling(struct dma_fence *fence) static bool dma_fence_chain_signaled(struct dma_fence *fence) { dma_fence_chain_for_each(fence, fence) { - struct dma_fence_chain *chain = to_dma_fence_chain(fence); - struct dma_fence *f = chain ? chain->fence : fence; + struct dma_fence *f = dma_fence_chain_contained(fence); if (!dma_fence_is_signaled(f)) { dma_fence_put(fence); diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma-fence-chain.h index ee906b659694..10d51bcdf7b7 100644 --- a/include/linux/dma-fence-chain.h +++ b/include/linux/dma-fence-chain.h @@ -66,6 +66,21 @@ to_dma_fence_chain(struct dma_fence *fence) return container_of(fence, struct dma_fence_chain, base); } +/** + * dma_fence_chain_contained - return the contained fence + * @fence: the fence to test + * + * If the fence is a dma_fence_chain the function returns the fence contained + * inside the chain object, otherwise it returns the fence itself. + */ +static inline struct dma_fence * +dma_fence_chain_contained(struct dma_fence *fence) +{ + struct dma_fence_chain *chain = to_dma_fence_chain(fence); + + return chain ? chain->fence : fence; +} + /** * dma_fence_chain_alloc * -- 2.25.1
[PATCH 6/6] drm/amdgpu: use dma_fence_chain_contained
Instead of manually extracting the fence. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c index f7d8487799b2..40e06745fae9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c @@ -261,10 +261,9 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync, dma_resv_for_each_fence(&cursor, resv, true, f) { dma_fence_chain_for_each(f, f) { - struct dma_fence_chain *chain = to_dma_fence_chain(f); + struct dma_fence *tmp = dma_fence_chain_contained(f); - if (amdgpu_sync_test_fence(adev, mode, owner, chain ? - chain->fence : f)) { + if (amdgpu_sync_test_fence(adev, mode, owner, tmp)) { r = amdgpu_sync_fence(sync, f); dma_fence_put(f); if (r) -- 2.25.1
[PATCH 3/6] dma-buf: Warn about dma_fence_chain container rules v2
Chaining of dma_fence_chain objects is only allowed through the prev fence and not through the contained fence. Warn about that when we create a dma_fence_chain. v2: fix comment style Signed-off-by: Christian König Reviewed-by: Daniel Vetter Reviewed-by: Thomas Hellström --- drivers/dma-buf/dma-fence-chain.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c index 1b4cb3e5cec9..084c6927b735 100644 --- a/drivers/dma-buf/dma-fence-chain.c +++ b/drivers/dma-buf/dma-fence-chain.c @@ -254,5 +254,14 @@ void dma_fence_chain_init(struct dma_fence_chain *chain, dma_fence_init(&chain->base, &dma_fence_chain_ops, &chain->lock, context, seqno); + + /* +* Chaining dma_fence_chain container together is only allowed through +* the prev fence and not through the contained fence. +* +* The correct way of handling this is to flatten out the fence +* structure into a dma_fence_array by the caller instead. +*/ + WARN_ON(dma_fence_is_chain(fence)); } EXPORT_SYMBOL(dma_fence_chain_init); -- 2.25.1
Re: [PATCH v4 3/3] fbcon: Add option to enable legacy hardware acceleration
Hello Geert, On 2/4/22 09:37, Geert Uytterhoeven wrote: > On Wed, Feb 2, 2022 at 8:05 PM Helge Deller wrote: >> Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to >> enable bitblt and fillrect hardware acceleration in the framebuffer >> console. If disabled, such acceleration will not be used, even if it is >> supported by the graphics hardware driver. > > Note that this also applies to vertical panning and wrapping. That's correct. Would you mind to send a patch which adds this info? Helge
Re: [PATCH v4 3/3] fbcon: Add option to enable legacy hardware acceleration
Hi Helge, On Fri, Feb 4, 2022 at 11:17 AM Helge Deller wrote: > On 2/4/22 09:37, Geert Uytterhoeven wrote: > > On Wed, Feb 2, 2022 at 8:05 PM Helge Deller wrote: > >> Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to > >> enable bitblt and fillrect hardware acceleration in the framebuffer > >> console. If disabled, such acceleration will not be used, even if it is > >> supported by the graphics hardware driver. > > > > Note that this also applies to vertical panning and wrapping. > > That's correct. > Would you mind to send a patch which adds this info? To add it where? "bitblt and fillrect" are only mentioned in the patch description. The Kconfig help entry just talks about "hardware acceleration", which can mean any trick supported by the hardware. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH 1/6] dma-buf: consolidate dma_fence subclass checking
On Fri, 2022-02-04 at 11:04 +0100, Christian König wrote: > Consolidate the wrapper functions to check for dma_fence > subclasses in the dma_fence header. > > This makes it easier to document and also check the different > requirements for fence containers in the subclasses. > > Signed-off-by: Christian König I'd probably still opt for a fence ops is_container member, but won't insist. Reviewed-by: Thomas Hellström > --- > include/linux/dma-fence-array.h | 15 + > include/linux/dma-fence-chain.h | 3 +-- > include/linux/dma-fence.h | 38 > + > 3 files changed, 40 insertions(+), 16 deletions(-) > > diff --git a/include/linux/dma-fence-array.h b/include/linux/dma- > fence-array.h > index 303dd712220f..fec374f69e12 100644 > --- a/include/linux/dma-fence-array.h > +++ b/include/linux/dma-fence-array.h > @@ -45,19 +45,6 @@ struct dma_fence_array { > struct irq_work work; > }; > > -extern const struct dma_fence_ops dma_fence_array_ops; > - > -/** > - * dma_fence_is_array - check if a fence is from the array subsclass > - * @fence: fence to test > - * > - * Return true if it is a dma_fence_array and false otherwise. > - */ > -static inline bool dma_fence_is_array(struct dma_fence *fence) > -{ > - return fence->ops == &dma_fence_array_ops; > -} > - > /** > * to_dma_fence_array - cast a fence to a dma_fence_array > * @fence: fence to cast to a dma_fence_array > @@ -68,7 +55,7 @@ static inline bool dma_fence_is_array(struct > dma_fence *fence) > static inline struct dma_fence_array * > to_dma_fence_array(struct dma_fence *fence) > { > - if (fence->ops != &dma_fence_array_ops) > + if (!fence || !dma_fence_is_array(fence)) > return NULL; > > return container_of(fence, struct dma_fence_array, base); > diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma- > fence-chain.h > index 54fe3443fd2c..ee906b659694 100644 > --- a/include/linux/dma-fence-chain.h > +++ b/include/linux/dma-fence-chain.h > @@ -49,7 +49,6 @@ struct dma_fence_chain { > spinlock_t lock; > }; > > -extern const struct dma_fence_ops dma_fence_chain_ops; > > /** > * to_dma_fence_chain - cast a fence to a dma_fence_chain > @@ -61,7 +60,7 @@ extern const struct dma_fence_ops > dma_fence_chain_ops; > static inline struct dma_fence_chain * > to_dma_fence_chain(struct dma_fence *fence) > { > - if (!fence || fence->ops != &dma_fence_chain_ops) > + if (!fence || !dma_fence_is_chain(fence)) > return NULL; > > return container_of(fence, struct dma_fence_chain, base); > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h > index 1ea691753bd3..775cdc0b4f24 100644 > --- a/include/linux/dma-fence.h > +++ b/include/linux/dma-fence.h > @@ -587,4 +587,42 @@ struct dma_fence *dma_fence_get_stub(void); > struct dma_fence *dma_fence_allocate_private_stub(void); > u64 dma_fence_context_alloc(unsigned num); > > +extern const struct dma_fence_ops dma_fence_array_ops; > +extern const struct dma_fence_ops dma_fence_chain_ops; > + > +/** > + * dma_fence_is_array - check if a fence is from the array subclass > + * @fence: the fence to test > + * > + * Return true if it is a dma_fence_array and false otherwise. > + */ > +static inline bool dma_fence_is_array(struct dma_fence *fence) > +{ > + return fence->ops == &dma_fence_array_ops; > +} > + > +/** > + * dma_fence_is_chain - check if a fence is from the chain subclass > + * @fence: the fence to test > + * > + * Return true if it is a dma_fence_chain and false otherwise. > + */ > +static inline bool dma_fence_is_chain(struct dma_fence *fence) > +{ > + return fence->ops == &dma_fence_chain_ops; > +} > + > +/** > + * dma_fence_is_container - check if a fence is a container for > other fences > + * @fence: the fence to test > + * > + * Return true if this fence is a container for other fences, false > otherwise. > + * This is important since we can't build up large fence structure > or otherwise > + * we run into recursion during operation on those fences. > + */ > +static inline bool dma_fence_is_container(struct dma_fence *fence) > +{ > + return dma_fence_is_array(fence) || > dma_fence_is_chain(fence); > +} > + > #endif /* __LINUX_DMA_FENCE_H */
Re: [PATCH 5/6] dma-buf: add dma_fence_chain_contained helper
On Fri, 2022-02-04 at 11:04 +0100, Christian König wrote: > It's a reoccurring pattern that we need to extract the fence > from a dma_fence_chain object. Add a helper for this. > > Signed-off-by: Christian König I thought I'd reviewed this one already, but in case I didn't Reviewed-by: Thomas Hellström > --- > drivers/dma-buf/dma-fence-chain.c | 6 ++ > include/linux/dma-fence-chain.h | 15 +++ > 2 files changed, 17 insertions(+), 4 deletions(-) > > diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma- > fence-chain.c > index 084c6927b735..06f8ef97c6e8 100644 > --- a/drivers/dma-buf/dma-fence-chain.c > +++ b/drivers/dma-buf/dma-fence-chain.c > @@ -148,8 +148,7 @@ static bool > dma_fence_chain_enable_signaling(struct dma_fence *fence) > > dma_fence_get(&head->base); > dma_fence_chain_for_each(fence, &head->base) { > - struct dma_fence_chain *chain = > to_dma_fence_chain(fence); > - struct dma_fence *f = chain ? chain->fence : fence; > + struct dma_fence *f = > dma_fence_chain_contained(fence); > > dma_fence_get(f); > if (!dma_fence_add_callback(f, &head->cb, > dma_fence_chain_cb)) { > @@ -165,8 +164,7 @@ static bool > dma_fence_chain_enable_signaling(struct dma_fence *fence) > static bool dma_fence_chain_signaled(struct dma_fence *fence) > { > dma_fence_chain_for_each(fence, fence) { > - struct dma_fence_chain *chain = > to_dma_fence_chain(fence); > - struct dma_fence *f = chain ? chain->fence : fence; > + struct dma_fence *f = > dma_fence_chain_contained(fence); > > if (!dma_fence_is_signaled(f)) { > dma_fence_put(fence); > diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma- > fence-chain.h > index ee906b659694..10d51bcdf7b7 100644 > --- a/include/linux/dma-fence-chain.h > +++ b/include/linux/dma-fence-chain.h > @@ -66,6 +66,21 @@ to_dma_fence_chain(struct dma_fence *fence) > return container_of(fence, struct dma_fence_chain, base); > } > > +/** > + * dma_fence_chain_contained - return the contained fence > + * @fence: the fence to test > + * > + * If the fence is a dma_fence_chain the function returns the fence > contained > + * inside the chain object, otherwise it returns the fence itself. > + */ > +static inline struct dma_fence * > +dma_fence_chain_contained(struct dma_fence *fence) > +{ > + struct dma_fence_chain *chain = to_dma_fence_chain(fence); > + > + return chain ? chain->fence : fence; > +} > + > /** > * dma_fence_chain_alloc > *
Re: Add warning for nesting dma_fence containers
On Fri, 2022-02-04 at 11:04 +0100, Christian König wrote: > Hi everyone, > > Since some operations can then lead to recursive handling nesting > dma_fence containers into each other is only allowed under some > restrictions. > > dma_fence_array containers can be attached to dma_fence_chain > containers and dma_fence_chain containers by chaining them together. > > In all other cases the individual fences should be extracted with > the appropriate iterators and added to the new containers > individually. > > I've separated the i915 cleanup from this change since it is > generally a different functionality and the build bots complained > about some issues with those patches. > > Most patches are already reviewd, but especially the first one still > needs an rb tag. > > Please review and comment, I see you dropped the i915 patch (probably due to lack of reviews?), Got distracted with other things, but I'll see if I can resurrect that and get it reviewed and merged. Thanks, Thomas > Christian. > >
Re: [PATCH v11 5/5] drm/amdgpu: add drm buddy support to amdgpu
On 28/01/22 7:48 pm, Matthew Auld wrote: > On Thu, 27 Jan 2022 at 14:11, Arunpravin > wrote: >> >> - Remove drm_mm references and replace with drm buddy functionalities >> - Add res cursor support for drm buddy >> >> v2(Matthew Auld): >> - replace spinlock with mutex as we call kmem_cache_zalloc >> (..., GFP_KERNEL) in drm_buddy_alloc() function >> >> - lock drm_buddy_block_trim() function as it calls >> mark_free/mark_split are all globally visible >> >> v3(Matthew Auld): >> - remove trim method error handling as we address the failure case >> at drm_buddy_block_trim() function >> >> v4: >> - fix warnings reported by kernel test robot >> >> v5: >> - fix merge conflict issue >> >> v6: >> - fix warnings reported by kernel test robot >> >> Signed-off-by: Arunpravin >> --- >> drivers/gpu/drm/Kconfig | 1 + >> .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 97 +-- >> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 7 +- >> drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 259 ++ >> 4 files changed, 231 insertions(+), 133 deletions(-) > > > >> >> -/** >> - * amdgpu_vram_mgr_virt_start - update virtual start address >> - * >> - * @mem: ttm_resource to update >> - * @node: just allocated node >> - * >> - * Calculate a virtual BO start address to easily check if everything is CPU >> - * accessible. >> - */ >> -static void amdgpu_vram_mgr_virt_start(struct ttm_resource *mem, >> - struct drm_mm_node *node) >> -{ >> - unsigned long start; >> - >> - start = node->start + node->size; >> - if (start > mem->num_pages) >> - start -= mem->num_pages; >> - else >> - start = 0; >> - mem->start = max(mem->start, start); >> -} >> - >> /** >> * amdgpu_vram_mgr_new - allocate new ranges >> * >> @@ -366,13 +357,13 @@ static int amdgpu_vram_mgr_new(struct >> ttm_resource_manager *man, >>const struct ttm_place *place, >>struct ttm_resource **res) >> { >> - unsigned long lpfn, num_nodes, pages_per_node, pages_left, pages; >> + unsigned long lpfn, pages_per_node, pages_left, pages, n_pages; >> + u64 vis_usage = 0, mem_bytes, max_bytes, min_page_size; >> struct amdgpu_vram_mgr *mgr = to_vram_mgr(man); >> struct amdgpu_device *adev = to_amdgpu_device(mgr); >> - uint64_t vis_usage = 0, mem_bytes, max_bytes; >> - struct ttm_range_mgr_node *node; >> - struct drm_mm *mm = &mgr->mm; >> - enum drm_mm_insert_mode mode; >> + struct amdgpu_vram_mgr_node *node; >> + struct drm_buddy *mm = &mgr->mm; >> + struct drm_buddy_block *block; >> unsigned i; >> int r; >> >> @@ -391,10 +382,9 @@ static int amdgpu_vram_mgr_new(struct >> ttm_resource_manager *man, >> goto error_sub; >> } >> >> - if (place->flags & TTM_PL_FLAG_CONTIGUOUS) { >> + if (place->flags & TTM_PL_FLAG_CONTIGUOUS) >> pages_per_node = ~0ul; >> - num_nodes = 1; >> - } else { >> + else { >> #ifdef CONFIG_TRANSPARENT_HUGEPAGE >> pages_per_node = HPAGE_PMD_NR; >> #else >> @@ -403,11 +393,9 @@ static int amdgpu_vram_mgr_new(struct >> ttm_resource_manager *man, >> #endif >> pages_per_node = max_t(uint32_t, pages_per_node, >>tbo->page_alignment); >> - num_nodes = DIV_ROUND_UP_ULL(PFN_UP(mem_bytes), >> pages_per_node); >> } >> >> - node = kvmalloc(struct_size(node, mm_nodes, num_nodes), >> - GFP_KERNEL | __GFP_ZERO); >> + node = kzalloc(sizeof(*node), GFP_KERNEL); >> if (!node) { >> r = -ENOMEM; >> goto error_sub; >> @@ -415,9 +403,17 @@ static int amdgpu_vram_mgr_new(struct >> ttm_resource_manager *man, >> >> ttm_resource_init(tbo, place, &node->base); >> >> - mode = DRM_MM_INSERT_BEST; >> + INIT_LIST_HEAD(&node->blocks); >> + >> if (place->flags & TTM_PL_FLAG_TOPDOWN) >> - mode = DRM_MM_INSERT_HIGH; >> + node->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION; >> + >> + if (place->fpfn || lpfn != man->size) >> + /* Allocate blocks in desired range */ >> + node->flags |= DRM_BUDDY_RANGE_ALLOCATION; >> + >> + min_page_size = mgr->default_page_size; >> + BUG_ON(min_page_size < mm->chunk_size); >> >> pages_left = node->base.num_pages; >> >> @@ -425,36 +421,61 @@ static int amdgpu_vram_mgr_new(struct >> ttm_resource_manager *man, >> pages = min(pages_left, 2UL << (30 - PAGE_SHIFT)); >> >> i = 0; >> - spin_lock(&mgr->lock); >> while (pages_left) { >> - uint32_t alignment = tbo->page_alignment; >> - >> if (pages >= pages_per_node) >> - align
Re: [PATCH v2 5/8] drm/i915/dp: rewrite DP 2.0 128b/132b link training based on errata
On Thu, Feb 03, 2022 at 11:03:54AM +0200, Jani Nikula wrote: > The DP 2.0 errata completely overhauls the 128b/132b link training, with > no provisions for backward compatibility with the original DP 2.0 > specification. > > The changes are too intrusive to consider reusing the same code for both > 8b/10b and 128b/132b, mainly because the LTTPR channel equalisation is > done concurrently instead of serialized. > > NOTES: > > * It's a bit unclear when to wait for DP_INTERLANE_ALIGN_DONE and > per-lane DP_LANE_SYMBOL_LOCKED. Figure xx4 in the SCR implies the > LANEx_CHANNEL_EQ_DONE sequence may end with either 0x77,0x77,0x85 *or* > 0x33,0x33,0x84 (for four lane configuration in DPCD 0x202..0x204) > i.e. without the above bits set. Text elsewhere seems contradictory or > incomplete. > > * We read entire link status (6 bytes) everywhere instead of individual > DPCD addresses. > > * There are some subtle ambiguities or contradictions in the order of > some DPCD access and TPS signal enables/disables. It's also not clear > whether these are significant. > > v2: > - Always try one last time after timeouts to avoid races (Ville) > - Extend timeout to cover the entire LANEx_EQ_DONE sequence (Ville) > - Also check for eq interlane align done in LANEx_CDS_DONE Sequence (Ville) > - Check for Intra-hop status before link training > > Cc: Uma Shankar > Cc: Ville Syrjälä > Signed-off-by: Jani Nikula > --- > .../drm/i915/display/intel_dp_link_training.c | 279 +- > 1 file changed, 278 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c > b/drivers/gpu/drm/i915/display/intel_dp_link_training.c > index 4e507aa75a03..cc2b82d9114c 100644 > --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c > +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c > @@ -1102,6 +1102,277 @@ intel_dp_link_train_all_phys(struct intel_dp > *intel_dp, > return ret; > } > > + > +/* > + * 128b/132b DP LANEx_EQ_DONE Sequence (DP 2.0 E11 3.5.2.16.1) > + */ > +static bool > +intel_dp_128b132b_lane_eq(struct intel_dp *intel_dp, > + const struct intel_crtc_state *crtc_state) > +{ > + struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base; > + struct drm_i915_private *i915 = to_i915(encoder->base.dev); > + u8 link_status[DP_LINK_STATUS_SIZE]; > + int delay_us; > + int try, max_tries = 20; > + unsigned long deadline; > + bool timeout = false; > + > + /* > + * Reset signal levels. Start transmitting 128b/132b TPS1. > + * > + * Put DPRX and LTTPRs (if any) into intra-hop AUX mode by writing TPS1 > + * in DP_TRAINING_PATTERN_SET. > + */ > + if (!intel_dp_reset_link_train(intel_dp, crtc_state, DP_PHY_DPRX, > +DP_TRAINING_PATTERN_1)) { > + drm_err(&i915->drm, > + "[ENCODER:%d:%s] Failed to start 128b/132b TPS1\n", > + encoder->base.base.id, encoder->base.name); > + return false; > + } > + > + delay_us = drm_dp_128b132b_read_aux_rd_interval(&intel_dp->aux); > + > + /* Read the initial TX FFE settings. */ > + if (drm_dp_dpcd_read_link_status(&intel_dp->aux, link_status) < 0) { > + drm_err(&i915->drm, > + "[ENCODER:%d:%s] Failed to read TX FFE presets\n", > + encoder->base.base.id, encoder->base.name); > + return false; > + } > + > + /* Update signal levels and training set as requested. */ > + intel_dp_get_adjust_train(intel_dp, crtc_state, DP_PHY_DPRX, > link_status); > + if (!intel_dp_update_link_train(intel_dp, crtc_state, DP_PHY_DPRX)) { > + drm_err(&i915->drm, > + "[ENCODER:%d:%s] Failed to set initial TX FFE > settings\n", > + encoder->base.base.id, encoder->base.name); > + return false; > + } > + > + /* Start transmitting 128b/132b TPS2. */ > + if (!intel_dp_set_link_train(intel_dp, crtc_state, DP_PHY_DPRX, > + DP_TRAINING_PATTERN_2)) { > + drm_err(&i915->drm, > + "[ENCODER:%d:%s] Failed to start 128b/132b TPS2\n", > + encoder->base.base.id, encoder->base.name); > + return false; > + } > + > + /* Time budget for the LANEx_EQ_DONE Sequence */ > + deadline = jiffies + msecs_to_jiffies(400); Didn't we have a msecs_to_jiffies_timeout() that adds an extra jiffy to make sure we don't bail too early? > + > + for (try = 0; try < max_tries; try++) { > + usleep_range(delay_us, 2 * delay_us); > + > + /* > + * The delay may get updated. The transmitter shall read the > + * delay before link status during link training. > + */ > + delay_us = drm_dp_128b132b_read_aux_rd_interval(&intel_dp->aux); > + > +
Re: [PATCH] HPE BMC GXP SUPPORT
Hi, On Wed, Feb 02, 2022 at 10:52:50AM -0600, nick.hawk...@hpe.com wrote: > diff --git a/arch/arm/mach-hpe/Makefile b/arch/arm/mach-hpe/Makefile > new file mode 100644 > index ..8b0a91234df4 > --- /dev/null > +++ b/arch/arm/mach-hpe/Makefile > @@ -0,0 +1 @@ > +obj-$(CONFIG_ARCH_HPE_GXP) += gxp.o > diff --git a/arch/arm/mach-hpe/gxp.c b/arch/arm/mach-hpe/gxp.c > new file mode 100644 > index ..a37838247948 > --- /dev/null > +++ b/arch/arm/mach-hpe/gxp.c > @@ -0,0 +1,62 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (C) 2022 Hewlett-Packard Enterprise Development Company, L.P. > + * > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. > + */ > + > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include It's normal to list all linux/ includes before asm/ includes. Please rearrange. > + > +#define IOP_REGS_PHYS_BASE 0xc000 > +#define IOP_REGS_VIRT_BASE 0xf000 > +#define IOP_REGS_SIZE (240*SZ_1M) > + > +#define IOP_EHCI_USBCMD 0x0efe0010 > + > +static struct map_desc gxp_io_desc[] __initdata = { > + { > + .virtual= (unsigned long)IOP_REGS_VIRT_BASE, > + .pfn= __phys_to_pfn(IOP_REGS_PHYS_BASE), > + .length = IOP_REGS_SIZE, > + .type = MT_DEVICE, If you keep this, please indent the above four lines by one more tab. > + }, > +}; > + > +void __init gxp_map_io(void) > +{ > + iotable_init(gxp_io_desc, ARRAY_SIZE(gxp_io_desc)); > +} > + > +static void __init gxp_dt_init(void) > +{ > + /*reset EHCI host controller for clear start*/ > + __raw_writel(0x00080002, > + (void __iomem *)(IOP_REGS_VIRT_BASE + IOP_EHCI_USBCMD)); Please consider making IOP_REGS_VIRT_BASE a 'void __iomem' pointer, it being a _virtual_ iomem address. This should save you needing repeated casts except for the initialiser above. > + of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL); > +} > + > +static void gxp_restart(enum reboot_mode mode, const char *cmd) > +{ > + __raw_writel(1, (void __iomem *) IOP_REGS_VIRT_BASE); > +} > + > +static const char * const gxp_board_dt_compat[] = { > + "HPE,GXP", > + NULL, > +}; > + > +DT_MACHINE_START(GXP_DT, "HPE GXP") > + .init_machine = gxp_dt_init, > + .map_io = gxp_map_io, > + .restart= gxp_restart, > + .dt_compat = gxp_board_dt_compat, > +MACHINE_END > diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig > index cfb8ea0df3b1..5916dade7608 100644 > --- a/drivers/clocksource/Kconfig > +++ b/drivers/clocksource/Kconfig > @@ -617,6 +617,14 @@ config CLKSRC_ST_LPC > Enable this option to use the Low Power controller timer > as clocksource. > > +config GXP_TIMER > + bool "GXP timer driver" > + depends on ARCH_HPE > + default y > + help > + Provides a driver for the timer control found on HPE > + GXP SOCs. This is required for all GXP SOCs. > + > config ATCPIT100_TIMER > bool "ATCPIT100 timer driver" > depends on NDS32 || COMPILE_TEST > diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile > index fa5f624eadb6..ffca09ec34de 100644 > --- a/drivers/clocksource/Makefile > +++ b/drivers/clocksource/Makefile > @@ -89,3 +89,4 @@ obj-$(CONFIG_GX6605S_TIMER) += timer-gx6605s.o > obj-$(CONFIG_HYPERV_TIMER) += hyperv_timer.o > obj-$(CONFIG_MICROCHIP_PIT64B) += timer-microchip-pit64b.o > obj-$(CONFIG_MSC313E_TIMER) += timer-msc313e.o > +obj-$(CONFIG_GXP_TIMER) += gxp_timer.o > diff --git a/drivers/clocksource/gxp_timer.c b/drivers/clocksource/gxp_timer.c > new file mode 100644 > index ..e3c617036e0d > --- /dev/null > +++ b/drivers/clocksource/gxp_timer.c > @@ -0,0 +1,158 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (C) 2022 Hewlett-Packard Enterprise Development Company, L.P. > + * > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include Why do you need asm/irq.h ? > + > +#define TIMER0_FREQ 100 > +#define TIMER1_FREQ 100 > + > +#define MASK_TCS_ENABLE 0x01 > +#define MASK_TCS_PERIOD 0x02 > +#define MASK_TCS_RELOAD 0x04 > +#define MASK_TCS_TC 0x80 > + > +struct gxp_timer { > + void __iomem *counter; > + void __iomem *control; > + struct clock_event_device evt; > +}; > + > +static void __iomem *system_clock __read_mostly; > + > +static u64 notrace gxp_sche
Re: [PATCH] HPE BMC GXP SUPPORT
On Fri, 2022-02-04 at 12:05 +, Russell King (Oracle) wrote: > On Wed, Feb 02, 2022 at 10:52:50AM -0600, nick.hawk...@hpe.com wrote: [] > > diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile [] > > +static irqreturn_t gxp_time_interrupt(int irq, void *dev_id) > > +{ > > + struct gxp_timer *timer = dev_id; > > + void (*event_handler)(struct clock_event_device *timer); > > + > > + > > One too many blank lines. > > > + if (readb_relaxed(timer->control) & MASK_TCS_TC) { > > + writeb_relaxed(MASK_TCS_TC, timer->control); > > + > > + event_handler = READ_ONCE(timer->evt.event_handler); > > + if (event_handler) > > + event_handler(&timer->evt); > > + return IRQ_HANDLED; > > + } else { > > + return IRQ_NONE; > > + } > > +} It's also less indented code and perhaps clearer to reverse the test if (!readb_relaxed(timer->control) & MASK_TCS_TC) return IRQ_NONE; writeb_relaxed(MASK_TCS_TC, timer->control); event_handler = READ_ONCE(timer->evt.event_handler); if (event_handler) event_handler(&timer->evt); return IRQ_HANDLED;
Re: [PATCH] HPE BMC GXP SUPPORT
On Fri, Feb 04, 2022 at 04:18:24AM -0800, Joe Perches wrote: > On Fri, 2022-02-04 at 12:05 +, Russell King (Oracle) wrote: > > On Wed, Feb 02, 2022 at 10:52:50AM -0600, nick.hawk...@hpe.com wrote: > > > + if (readb_relaxed(timer->control) & MASK_TCS_TC) { > > > + writeb_relaxed(MASK_TCS_TC, timer->control); > > > + > > > + event_handler = READ_ONCE(timer->evt.event_handler); > > > + if (event_handler) > > > + event_handler(&timer->evt); > > > + return IRQ_HANDLED; > > > + } else { > > > + return IRQ_NONE; > > > + } > > > +} > > It's also less indented code and perhaps clearer to reverse the test > > if (!readb_relaxed(timer->control) & MASK_TCS_TC) This will need to be: if (!(readb_relaxed(timer->control) & MASK_TCS_TC)) > return IRQ_NONE; > > writeb_relaxed(MASK_TCS_TC, timer->control); > > event_handler = READ_ONCE(timer->evt.event_handler); > if (event_handler) > event_handler(&timer->evt); > > return IRQ_HANDLED; > > > -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: Add warning for nesting dma_fence containers
Am 04.02.22 um 11:40 schrieb Thomas Hellström: On Fri, 2022-02-04 at 11:04 +0100, Christian König wrote: Hi everyone, Since some operations can then lead to recursive handling nesting dma_fence containers into each other is only allowed under some restrictions. dma_fence_array containers can be attached to dma_fence_chain containers and dma_fence_chain containers by chaining them together. In all other cases the individual fences should be extracted with the appropriate iterators and added to the new containers individually. I've separated the i915 cleanup from this change since it is generally a different functionality and the build bots complained about some issues with those patches. Most patches are already reviewd, but especially the first one still needs an rb tag. Please review and comment, I see you dropped the i915 patch (probably due to lack of reviews?), Got distracted with other things, but I'll see if I can resurrect that and get it reviewed and merged. I was about to send out the i915 patch when that one here is merged. The CI systems yielded some strange error with that one and I wanted to double check what's that all about. Regards, Christian. Thanks, Thomas Christian.
Re: [PATCH v11 5/5] drm/amdgpu: add drm buddy support to amdgpu
Am 04.02.22 um 12:22 schrieb Arunpravin: On 28/01/22 7:48 pm, Matthew Auld wrote: On Thu, 27 Jan 2022 at 14:11, Arunpravin wrote: - Remove drm_mm references and replace with drm buddy functionalities - Add res cursor support for drm buddy v2(Matthew Auld): - replace spinlock with mutex as we call kmem_cache_zalloc (..., GFP_KERNEL) in drm_buddy_alloc() function - lock drm_buddy_block_trim() function as it calls mark_free/mark_split are all globally visible v3(Matthew Auld): - remove trim method error handling as we address the failure case at drm_buddy_block_trim() function v4: - fix warnings reported by kernel test robot v5: - fix merge conflict issue v6: - fix warnings reported by kernel test robot Signed-off-by: Arunpravin --- drivers/gpu/drm/Kconfig | 1 + .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 97 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 7 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 259 ++ 4 files changed, 231 insertions(+), 133 deletions(-) -/** - * amdgpu_vram_mgr_virt_start - update virtual start address - * - * @mem: ttm_resource to update - * @node: just allocated node - * - * Calculate a virtual BO start address to easily check if everything is CPU - * accessible. - */ -static void amdgpu_vram_mgr_virt_start(struct ttm_resource *mem, - struct drm_mm_node *node) -{ - unsigned long start; - - start = node->start + node->size; - if (start > mem->num_pages) - start -= mem->num_pages; - else - start = 0; - mem->start = max(mem->start, start); -} - /** * amdgpu_vram_mgr_new - allocate new ranges * @@ -366,13 +357,13 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, const struct ttm_place *place, struct ttm_resource **res) { - unsigned long lpfn, num_nodes, pages_per_node, pages_left, pages; + unsigned long lpfn, pages_per_node, pages_left, pages, n_pages; + u64 vis_usage = 0, mem_bytes, max_bytes, min_page_size; struct amdgpu_vram_mgr *mgr = to_vram_mgr(man); struct amdgpu_device *adev = to_amdgpu_device(mgr); - uint64_t vis_usage = 0, mem_bytes, max_bytes; - struct ttm_range_mgr_node *node; - struct drm_mm *mm = &mgr->mm; - enum drm_mm_insert_mode mode; + struct amdgpu_vram_mgr_node *node; + struct drm_buddy *mm = &mgr->mm; + struct drm_buddy_block *block; unsigned i; int r; @@ -391,10 +382,9 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, goto error_sub; } - if (place->flags & TTM_PL_FLAG_CONTIGUOUS) { + if (place->flags & TTM_PL_FLAG_CONTIGUOUS) pages_per_node = ~0ul; - num_nodes = 1; - } else { + else { #ifdef CONFIG_TRANSPARENT_HUGEPAGE pages_per_node = HPAGE_PMD_NR; #else @@ -403,11 +393,9 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, #endif pages_per_node = max_t(uint32_t, pages_per_node, tbo->page_alignment); - num_nodes = DIV_ROUND_UP_ULL(PFN_UP(mem_bytes), pages_per_node); } - node = kvmalloc(struct_size(node, mm_nodes, num_nodes), - GFP_KERNEL | __GFP_ZERO); + node = kzalloc(sizeof(*node), GFP_KERNEL); if (!node) { r = -ENOMEM; goto error_sub; @@ -415,9 +403,17 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, ttm_resource_init(tbo, place, &node->base); - mode = DRM_MM_INSERT_BEST; + INIT_LIST_HEAD(&node->blocks); + if (place->flags & TTM_PL_FLAG_TOPDOWN) - mode = DRM_MM_INSERT_HIGH; + node->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION; + + if (place->fpfn || lpfn != man->size) + /* Allocate blocks in desired range */ + node->flags |= DRM_BUDDY_RANGE_ALLOCATION; + + min_page_size = mgr->default_page_size; + BUG_ON(min_page_size < mm->chunk_size); pages_left = node->base.num_pages; @@ -425,36 +421,61 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, pages = min(pages_left, 2UL << (30 - PAGE_SHIFT)); i = 0; - spin_lock(&mgr->lock); while (pages_left) { - uint32_t alignment = tbo->page_alignment; - if (pages >= pages_per_node) - alignment = pages_per_node; - - r = drm_mm_insert_node_in_range(mm, &node->mm_nodes[i], pages, - alignment, 0, place->fpfn, - lpfn, mode); - if (unlikely(r)) { - if (pages > pages_per_no
Re: [PATCH] HPE BMC GXP SUPPORT
> > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > It's normal to list all linux/ includes before asm/ includes. Please > rearrange. Hi Nick Since you are new to the kernel, please let me point out, you should consider Russell comments for all your code, not just this one file. Many of the comments are generic to code anywhere in the kernel. So it would be good to fix the same issues in the rest of your code base before submitting them. I would also suggest that when you start submitting drivers, submit just one or two to start with. You will learn a lot from the feedback you get, and you can apply what you have learnt to the rest of your code before you post them for review. I would also suggest you spend 30 minutes a day just reading comments other patches receive. You can also learn a lot that way, see if the comments apply to your own code. You will also learn about processes this way, which can be just as challenging to get right as code. Andrew
[PATCH v2 0/4] drm/tiny: Add driver for Solomon SSD1307 OLED displays
This patch series adds a DRM driver for the Solomon OLED SSD1305, SSD1306, SSD1307 and SSD1309 displays. It is a port of the ssd1307fb fbdev driver. Using the DRM fb emulation, all the tests from Geert Uytterhoeven's fbtest (https://git.kernel.org/pub/scm/linux/kernel/git/geert/fbtest.git) passes: ./fbtest -f /dev/fb1 Using drawops cfb32 (32 bpp packed pixels) Available visuals: Monochrome Grayscale 256 Truecolor 8:8:8:0 Using visops truecolor Running all tests test001: PASSED test002: PASSED test003: PASSED test004: PASSED test005: PASSED test006: PASSED test008: PASSED Screen size too small for this test test010: PASSED Benchmarking... 10x10 squares: 414.41 Mpixels/s Benchmarking... 20x20 squares: 858.31 Mpixels/s Benchmarking... 50x50 squares: 1586.33 Mpixels/s test012: PASSED Benchmarking... R5 circles: 234.68 Mpixels/s Benchmarking... R10 circles: 498.24 Mpixels/s Benchmarking... R25 circles: 942.34 Mpixels/s test013: PASSED This is a v2 that addresses all the issues pointed in v1, thanks a lot to everyone that gave me feedback and reviews. I tried to not miss any comment, but there were a lot so forgive me if something is not there. Patch #1 adds two new helpers, drm_fb_gray8_to_mono_reversed() to convert from grayscale to monochrome and a drm_fb_xrgb_to_mono_reversed() to convert from XR24 to monochrome. The latter internally use thes former. Patch #2 adds the driver. The name ssd130x was used instead of ssd1307fb to denote that this driver is not only for SSD1307, but also for other displays from the same chip family. Patch #3 just adds a MAINTAINERS entry for the DRM driver and patch #4 adds myself as a co-maintainer of the existing Device Tree binding for ssd1307fb, since the same is shared between the fbdev and DRM drivers. Best regards, Javier Changes in v2: - Drop patch that was adding a DRM_MODE_CONNECTOR_I2C type. - Invert order of backlight {en,dis}able and display {on,off} (Sam Ravnborg) - Don't clear the screen and turn on display on probe (Sam Ravnborg) - Use backlight_get_brightness() macro to get BL brightness (Sam Ravnborg) - Use dev managed version of devm_backlight_device_register() (Sam Ravnborg) - Use dev_name(dev) for backlight name instead of an array (Sam Ravnborg) - Drop the .get_brightness callback since isn't needed (Sam Ravnborg) - Add myself as co-maintainer of the ssd1370fb DT binding (Sam Ravnborg) - Add Sam Ravnborg's acked-by tag to patch 3/4. - Rename driver to ssd130x since supports a display family (Thomas Zimmermann) - Drop the TINY prefix from the Kconfig symbol (Thomas Zimmermann) - Sort the Kconfig symbol dependencies alphabetically (Thomas Zimmermann) - Rename struct ssd130x_array to struct ssd130x_i2c_msg (Thomas Zimmermann) - Rename struct ssd130x_i2c_msg .type member to .cmd (Thomas Zimmermann) - Use sizeof(*foo) instead of sizeof(struct foo) (Thomas Zimmermann) - Use struct_size() macro to calculate sizeof(*foo) + len (Thomas Zimmermann) - Use kcalloc() instead of kmalloc_array() + memset() (Thomas Zimmermann) - Use shadow plane helpers virtual screen support (Thomas Zimmermann) - Remove unused goto label in ssd1307_fb_blit_rect() (Thomas Zimmermann) - Use drm_set_preferred_mode() inset of manually set (Thomas Zimmermann) - Use shadow plane helpers virtual screen support (Thomas Zimmermann) - Remove unused goto label in ssd1307_fb_blit_rect() (Thomas Zimmermann) - Use drm_set_preferred_mode() inset of manually set (Thomas Zimmermann) - Reorganize code in probe to make it more legible (Thomas Zimmermann) - ssd130x_write_cmd() uses varargs to simplify I2C code (Thomas Zimmermann) - Move regulator/pwm init logic to display pipe enable callback. - Also add a drm_fb_xrgb_to_mono_reversed() helper (Thomas Zimmermann) - Add a drm_fb_gray8_to_mono_reversed_line() helper (Thomas Zimmermann) Javier Martinez Canillas (4): drm/format-helper: Add drm_fb_{xrgb,gray8}_to_mono_reversed() drm/tiny: Add driver for Solomon SSD130X OLED displays MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver dt-bindings: display: ssd1307fb: Add myself as binding co-maintainer .../bindings/display/solomon,ssd1307fb.yaml | 1 + MAINTAINERS | 7 + drivers/gpu/drm/drm_format_helper.c | 80 ++ drivers/gpu/drm/tiny/Kconfig | 12 + drivers/gpu/drm/tiny/Makefile | 1 + drivers/gpu/drm/tiny/ssd130x.c| 971 ++ include/drm/drm_format_helper.h | 7 + 7 files changed, 1079 insertions(+) create mode 100644 drivers/gpu/drm/tiny/ssd130x.c -- 2.34.1
[PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888, gray8}_to_mono_reversed()
Add support to convert XR24 and 8-bit grayscale to reversed monochrome for drivers that control monochromatic panels, that only have 1 bit per pixel. The drm_fb_gray8_to_mono_reversed() helper was based on the function that does the same in the drivers/gpu/drm/tiny/repaper.c driver. Signed-off-by: Javier Martinez Canillas --- (no changes since v1) drivers/gpu/drm/drm_format_helper.c | 80 + include/drm/drm_format_helper.h | 7 +++ 2 files changed, 87 insertions(+) diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c index 0f28dd2bdd72..cdce4b7c25d9 100644 --- a/drivers/gpu/drm/drm_format_helper.c +++ b/drivers/gpu/drm/drm_format_helper.c @@ -584,3 +584,83 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int dst_pitch, uint32_t dst_for return -EINVAL; } EXPORT_SYMBOL(drm_fb_blit_toio); + +static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, size_t pixels) +{ + unsigned int xb, i; + + for (xb = 0; xb < pixels / 8; xb++) { + u8 byte = 0x00; + + for (i = 0; i < 8; i++) { + int x = xb * 8 + i; + + byte >>= 1; + if (src[x] >> 7) + byte |= BIT(7); + } + *dst++ = byte; + } +} + +/** + * drm_fb_gray8_to_mono_reversed - Convert grayscale to reversed monochrome + * @dst: reversed monochrome destination buffer + * @dst_pitch: Number of bytes between two consecutive scanlines within dst + * @src: 8-bit grayscale source buffer + * @clip: Clip rectangle area to copy + * + * DRM doesn't have native monochrome or grayscale support. + * Such drivers can announce the commonly supported XR24 format to userspace + * and use drm_fb_xrgb_to_gray8() to convert to grayscale and then this + * helper function to convert to the native format. + */ +void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_rect *clip) +{ + + size_t height = drm_rect_height(clip); + size_t width = drm_rect_width(clip); + unsigned int y; + const u8 *gray8 = src; + u8 *mono = dst; + + if (!dst_pitch) + dst_pitch = width; + + for (y = 0; y < height; y++) { + drm_fb_gray8_to_mono_reversed_line(mono, gray8, dst_pitch); + mono += (dst_pitch / 8); + gray8 += dst_pitch; + } +} + +/** + * drm_fb_xrgb_to_mono_reversed - Convert XRGB to reversed monochrome + * @dst: reversed monochrome destination buffer + * @dst_pitch: Number of bytes between two consecutive scanlines within dst + * @src: XRGB source buffer + * @fb: DRM framebuffer + * @clip: Clip rectangle area to copy + * + * DRM doesn't have native monochrome support. + * Such drivers can announce the commonly supported XR24 format to userspace + * and use this function to convert to the native format. + * + * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and + * then the result is converted from grayscale to reversed monohrome. + */ +void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_framebuffer *fb, + const struct drm_rect *clip) +{ + if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB)) + return; + + if (!dst_pitch) + dst_pitch = drm_rect_width(clip); + + drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip); + drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip); +} +EXPORT_SYMBOL(drm_fb_xrgb_to_mono_reversed); diff --git a/include/drm/drm_format_helper.h b/include/drm/drm_format_helper.h index b30ed5de0a33..85e551a5cbe6 100644 --- a/include/drm/drm_format_helper.h +++ b/include/drm/drm_format_helper.h @@ -43,4 +43,11 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int dst_pitch, uint32_t dst_for const void *vmap, const struct drm_framebuffer *fb, const struct drm_rect *rect); +void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_rect *clip); + +void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_framebuffer *fb, + const struct drm_rect *clip); + #endif /* __LINUX_DRM_FORMAT_HELPER_H */ -- 2.34.1
[PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver
To make sure that tools like the get_maintainer.pl script will suggest to Cc me if patches are posted for this driver. Also include the Device Tree binding for the old ssd1307fb fbdev driver since the new DRM driver was made compatible with the existing binding. Signed-off-by: Javier Martinez Canillas Acked-by: Sam Ravnborg --- (no changes since v1) MAINTAINERS | 7 +++ drivers/gpu/drm/drm_format_helper.c | 2 +- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index d03ad8da1f36..9061488a4113 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6102,6 +6102,13 @@ T: git git://anongit.freedesktop.org/drm/drm-misc F: Documentation/devicetree/bindings/display/repaper.txt F: drivers/gpu/drm/tiny/repaper.c +DRM DRIVER FOR SOLOMON SSD130X OLED DISPLAYS +M: Javier Martinez Canillas +S: Maintained +T: git git://anongit.freedesktop.org/drm/drm-misc +F: Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml +F: drivers/gpu/drm/tiny/ssd130x.c + DRM DRIVER FOR QEMU'S CIRRUS DEVICE M: Dave Airlie M: Gerd Hoffmann diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c index cdce4b7c25d9..c3c1372fb771 100644 --- a/drivers/gpu/drm/drm_format_helper.c +++ b/drivers/gpu/drm/drm_format_helper.c @@ -661,6 +661,6 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const v dst_pitch = drm_rect_width(clip); drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip); - drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip); + drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, clip); } EXPORT_SYMBOL(drm_fb_xrgb_to_mono_reversed); -- 2.34.1
[PATCH v2 2/4] drm/tiny: Add driver for Solomon SSD130X OLED displays
Add a DRM driver for SSD1305, SSD1306, SSD1307 and SSD1309 Solomon OLED controllers that can be programmed via an I2C interface. This is a port of the ssd1307fb driver that already supports these devices. A Device Tree binding is not added because the DRM driver is compatible with the existing binding for the ssd1307fb driver. Signed-off-by: Javier Martinez Canillas --- (no changes since v1) drivers/gpu/drm/tiny/Kconfig | 12 + drivers/gpu/drm/tiny/Makefile | 1 + drivers/gpu/drm/tiny/ssd130x.c | 971 + 3 files changed, 984 insertions(+) create mode 100644 drivers/gpu/drm/tiny/ssd130x.c diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig index 712e0004e96e..1b6f5aa41d69 100644 --- a/drivers/gpu/drm/tiny/Kconfig +++ b/drivers/gpu/drm/tiny/Kconfig @@ -67,6 +67,18 @@ config DRM_SIMPLEDRM On x86 BIOS or UEFI systems, you should also select SYSFB_SIMPLEFB to use UEFI and VESA framebuffers. +config DRM_SSD130X + tristate "DRM support for Solomon SSD130X OLED displays" + depends on DRM && I2C + select BACKLIGHT_CLASS_DEVICE + select DRM_GEM_SHMEM_HELPER + select DRM_KMS_HELPER + help + DRM driver for the SSD1305, SSD1306, SSD1307 and SSD1309 Solomon + OLED controllers that can be programmed via an I2C interface. + + If M is selected the module will be called ssd130x. + config TINYDRM_HX8357D tristate "DRM support for HX8357D display panels" depends on DRM && SPI diff --git a/drivers/gpu/drm/tiny/Makefile b/drivers/gpu/drm/tiny/Makefile index 5d5505d40e7b..18c3557dcb71 100644 --- a/drivers/gpu/drm/tiny/Makefile +++ b/drivers/gpu/drm/tiny/Makefile @@ -5,6 +5,7 @@ obj-$(CONFIG_DRM_BOCHS) += bochs.o obj-$(CONFIG_DRM_CIRRUS_QEMU) += cirrus.o obj-$(CONFIG_DRM_GM12U320) += gm12u320.o obj-$(CONFIG_DRM_SIMPLEDRM)+= simpledrm.o +obj-$(CONFIG_DRM_SSD130X) += ssd130x.o obj-$(CONFIG_TINYDRM_HX8357D) += hx8357d.o obj-$(CONFIG_TINYDRM_ILI9163) += ili9163.o obj-$(CONFIG_TINYDRM_ILI9225) += ili9225.o diff --git a/drivers/gpu/drm/tiny/ssd130x.c b/drivers/gpu/drm/tiny/ssd130x.c new file mode 100644 index ..b348768529dc --- /dev/null +++ b/drivers/gpu/drm/tiny/ssd130x.c @@ -0,0 +1,971 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * DRM driver for Solomon SSD130X OLED displays + * + * Copyright 2022 Red Hat Inc. + * + * Based on drivers/video/fbdev/ssd1307fb.c + * Copyright 2012 Free Electrons + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define DRIVER_NAME"ssd130x" +#define DRIVER_DESC"DRM driver for Solomon SSD130X OLED displays" +#define DRIVER_DATE"20220131" +#define DRIVER_MAJOR 1 +#define DRIVER_MINOR 0 + +#define SSD130X_DATA 0x40 +#define SSD130X_COMMAND0x80 + +#define SSD130X_SET_ADDRESS_MODE 0x20 +#define SSD130X_SET_ADDRESS_MODE_HORIZONTAL(0x00) +#define SSD130X_SET_ADDRESS_MODE_VERTICAL (0x01) +#define SSD130X_SET_ADDRESS_MODE_PAGE (0x02) +#define SSD130X_SET_COL_RANGE 0x21 +#define SSD130X_SET_PAGE_RANGE 0x22 +#define SSD130X_CONTRAST 0x81 +#define SSD130X_SET_LOOKUP_TABLE 0x91 +#define SSD130X_CHARGE_PUMP0x8d +#define SSD130X_SEG_REMAP_ON 0xa1 +#define SSD130X_DISPLAY_OFF0xae +#define SSD130X_SET_MULTIPLEX_RATIO0xa8 +#define SSD130X_DISPLAY_ON 0xaf +#define SSD130X_START_PAGE_ADDRESS 0xb0 +#define SSD130X_SET_DISPLAY_OFFSET 0xd3 +#define SSD130X_SET_CLOCK_FREQ 0xd5 +#define SSD130X_SET_AREA_COLOR_MODE0xd8 +#define SSD130X_SET_PRECHARGE_PERIOD 0xd9 +#define SSD130X_SET_COM_PINS_CONFIG0xda +#define SSD130X_SET_VCOMH 0xdb + +#define MAX_CONTRAST 255 + +struct ssd130x_deviceinfo { + u32 default_vcomh; + u32 default_dclk_div; + u32 default_dclk_frq; + int need_pwm; + int need_chargepump; +}; + +struct ssd130x_device { + struct drm_device drm; + struct drm_simple_display_pipe pipe; + struct drm_display_mode mode; + struct drm_connector connector; + struct i2c_client *client; + + const struct ssd130x_deviceinfo *device_info; + + unsigned area_color_enable : 1; + unsigned com_invdir : 1; + unsigned com_lrremap : 1; + unsigned com_seq : 1; + unsigned lookup_table_set : 1; + unsigned low_power : 1; + unsigned seg_remap : 1; + u32 com_offset; + u32 contrast; + u32 dclk_div; +
[PATCH v2 4/4] dt-bindings: display: ssd1307fb: Add myself as binding co-maintainer
The ssd130x DRM driver also makes use of this Device Tree binding to allow existing users of the fbdev driver to migrate without the need to change their Device Trees. Add myself as another maintainer of the binding, to make sure that I will be on Cc when patches are proposed for it. Suggested-by: Sam Ravnborg Signed-off-by: Javier Martinez Canillas --- (no changes since v1) Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml b/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml index 2ed2a7d0ca2f..9baafd0c42dd 100644 --- a/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml +++ b/Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml @@ -8,6 +8,7 @@ title: Solomon SSD1307 OLED Controller Framebuffer maintainers: - Maxime Ripard + - Javier Martinez Canillas properties: compatible: -- 2.34.1
Re: [PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver
On Fri, Feb 04, 2022 at 02:43:46PM +0100, Javier Martinez Canillas wrote: > To make sure that tools like the get_maintainer.pl script will suggest > to Cc me if patches are posted for this driver. > > Also include the Device Tree binding for the old ssd1307fb fbdev driver > since the new DRM driver was made compatible with the existing binding. ... > drivers/gpu/drm/drm_format_helper.c | 2 +- Nothing about this in the commit message... Stray change? -- With Best Regards, Andy Shevchenko
Re: [PATCH] HPE BMC GXP SUPPORT
On Fri, 2022-02-04 at 12:31 +, Russell King (Oracle) wrote: > On Fri, Feb 04, 2022 at 04:18:24AM -0800, Joe Perches wrote: > > On Fri, 2022-02-04 at 12:05 +, Russell King (Oracle) wrote: > > > On Wed, Feb 02, 2022 at 10:52:50AM -0600, nick.hawk...@hpe.com wrote: > > > > + if (readb_relaxed(timer->control) & MASK_TCS_TC) { > > > > + writeb_relaxed(MASK_TCS_TC, timer->control); > > > > + > > > > + event_handler = READ_ONCE(timer->evt.event_handler); > > > > + if (event_handler) > > > > + event_handler(&timer->evt); > > > > + return IRQ_HANDLED; > > > > + } else { > > > > + return IRQ_NONE; > > > > + } > > > > +} > > > > It's also less indented code and perhaps clearer to reverse the test > > > > if (!readb_relaxed(timer->control) & MASK_TCS_TC) > > This will need to be: > > if (!(readb_relaxed(timer->control) & MASK_TCS_TC)) right, thanks.
Re: [PATCH v2] drm/bridge: dw-hdmi: use safe format when first in bridge chain
On Wed, 19 Jan 2022 at 13:28, Neil Armstrong wrote: > > When the dw-hdmi bridge is in first place of the bridge chain, this > means there is no way to select an input format of the dw-hdmi HW > component. > > Since introduction of display-connector, negotiation was broken since > the dw-hdmi negotiation code only worked when the dw-hdmi bridge was > in last position of the bridge chain or behind another bridge also > supporting input & output format negotiation. > > Commit 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts > callbacks") I think this is the wrong hash. Is 7cd70656d128 the actual hash? > was introduced to make negotiation work again by making display-connector > act as a pass-through concerning input & output format negotiation. > > But in the case where the dw-hdmi is single in the bridge chain, for > example on Renesas SoCs, with the display-connector bridge the dw-hdmi > is no more single, breaking output format. > > Reported-by: Biju Das > Bisected-by: Kieran Bingham > Tested-by: Kieran Bingham > Fixes: 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts > callbacks"). This hash too. > Signed-off-by: Neil Armstrong > --- > Changes since v1: > - Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts > - Fix typos in commit message > > drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c > b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c > index 54d8fdad395f..97cdc61b57f6 100644 > --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c > +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c > @@ -2551,8 +2551,9 @@ static u32 > *dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge, > if (!output_fmts) > return NULL; > > - /* If dw-hdmi is the only bridge, avoid negociating with ourselves */ > - if (list_is_singular(&bridge->encoder->bridge_chain)) { > + /* If dw-hdmi is the first or only bridge, avoid negociating with > ourselves */ > + if (list_is_singular(&bridge->encoder->bridge_chain) || > + list_is_first(&bridge->chain_node, > &bridge->encoder->bridge_chain)) { > *num_output_fmts = 1; > output_fmts[0] = MEDIA_BUS_FMT_FIXED; > > -- > 2.25.1 > There are two checkstyle issues apart from the above mentioned hash issues, and I think we can ignore those. With the above mentioned issue fixed, feel free to add my r-b. Reviewed-by: Robert Foss
Re: [PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver
Hello Andy, On 2/4/22 14:57, Andy Shevchenko wrote: > On Fri, Feb 04, 2022 at 02:43:46PM +0100, Javier Martinez Canillas wrote: >> To make sure that tools like the get_maintainer.pl script will suggest >> to Cc me if patches are posted for this driver. >> >> Also include the Device Tree binding for the old ssd1307fb fbdev driver >> since the new DRM driver was made compatible with the existing binding. > > ... > >> drivers/gpu/drm/drm_format_helper.c | 2 +- > > Nothing about this in the commit message... > > Stray change? > Sigh, I'm not sure how added that change. Just ignore it, I'll fix it on v3 or when applying if there isn't another revision of this series. Best regards, -- Javier Martinez Canillas Linux Engineering Red Hat
Re: [PATCH v2 2/4] drm/tiny: Add driver for Solomon SSD130X OLED displays
On Fri, Feb 04, 2022 at 02:43:45PM +0100, Javier Martinez Canillas wrote: > Add a DRM driver for SSD1305, SSD1306, SSD1307 and SSD1309 Solomon OLED > controllers that can be programmed via an I2C interface. This is a port > of the ssd1307fb driver that already supports these devices. > > A Device Tree binding is not added because the DRM driver is compatible > with the existing binding for the ssd1307fb driver. ... > +/* > + * DRM driver for Solomon SSD130X OLED displays > + * > + * Copyright 2022 Red Hat Inc. > + * > + * Based on drivers/video/fbdev/ssd1307fb.c > + * Copyright 2012 Free Electrons > + * No need for this blank line. > + */ ... > +struct ssd130x_device { > + struct drm_device drm; > + struct drm_simple_display_pipe pipe; > + struct drm_display_mode mode; > + struct drm_connector connector; > + struct i2c_client *client; Can we logically separate hw protocol vs hw interface from day 1, please? This will allow to add SPI support for this panel much easier. Technically I would like to see here struct device *dev; and probably (I haven't looked into design) struct ssd130x_ops *ops; or something alike. > + const struct ssd130x_deviceinfo *device_info; > + > + unsigned area_color_enable : 1; > + unsigned com_invdir : 1; > + unsigned com_lrremap : 1; > + unsigned com_seq : 1; > + unsigned lookup_table_set : 1; > + unsigned low_power : 1; > + unsigned seg_remap : 1; > + u32 com_offset; > + u32 contrast; > + u32 dclk_div; > + u32 dclk_frq; > + u32 height; > + u8 lookup_table[4]; > + u32 page_offset; > + u32 col_offset; > + u32 prechargep1; > + u32 prechargep2; > + > + struct backlight_device *bl_dev; > + struct pwm_device *pwm; > + struct gpio_desc *reset; > + struct regulator *vbat_reg; > + u32 vcomh; > + u32 width; > + /* Cached address ranges */ > + u8 col_start; > + u8 col_end; > + u8 page_start; > + u8 page_end; > +}; ... > +static inline int ssd130x_write_value(struct i2c_client *client, u8 value) Not sure inline does anything useful here. Ditto for the rest similar cases. ... > +static inline int ssd130x_write_cmd(struct i2c_client *client, int count, > + /* u8 cmd, u8 value, ... */...) > +{ > + va_list ap; > + u8 value; > + int ret; > + > + va_start(ap, count); > + while (count--) { > + value = va_arg(ap, int); > + ret = ssd130x_write_value(client, (u8)value); > + if (ret) > + goto out_end; > + } I'm wondering if this can be written in a form do { ... } while (--count); In this case it will give a hint that count can't be 0. > +out_end: > + va_end(ap); > + > + return ret; > +} ... > + ssd130x->pwm = pwm_get(dev, NULL); > + if (IS_ERR(ssd130x->pwm)) { > + dev_err(dev, "Could not get PWM from device tree!\n"); "device tree" is a bit confusing here if I run this on ACPI. Maybe something like "firmware description"? > + return PTR_ERR(ssd130x->pwm); > + } ... > + /* Set initial contrast */ > + ret = ssd130x_write_cmd(ssd130x->client, 2, SSD130X_CONTRAST, > ssd130x->contrast); Creating a local variable for client allows to: - make lines shorter and might even be less LOCs - allow to convert struct device to client in one place (as per my above comment) Ditto for other similar cases. > + if (ret < 0) > + return ret; ... > + for (i = 0; i < ARRAY_SIZE(ssd130x->lookup_table); ++i) { i++ should work same way. > + } ... > + /* Switch to horizontal addressing mode */ > + ret = ssd130x_write_cmd(ssd130x->client, 2, SSD130X_SET_ADDRESS_MODE, > + SSD130X_SET_ADDRESS_MODE_HORIZONTAL); > + if (ret < 0) > + return ret; > + > + return 0; Can it be return ssd130x_write_cmd(...); ? ... > + unsigned int line_length = DIV_ROUND_UP(width, 8); > + unsigned int pages = DIV_ROUND_UP(height, 8); For power of two there are more efficient roundup()/rounddown() (or with _ in the names, I don't remember by heart). ... > + for (k = 0; k < m; k++) { > + u8 byte = buf[(8 * i + k) * line_length + > +j / 8]; One line? > + u8 bit = (byte >> (j % 8)) & 1; > + > + data |= bit << k; > + } ... > +static int ssd130x_display_pipe_mode_valid(struct drm_simple_display_pipe > *pipe, > +const struct drm_display_mode *mode) > +{ > + struct ssd130x_device *ssd130x = drm_to_ssd130x(pipe->crtc.dev); > + > + if (mode->hdisplay != ssd130x->mode.hdisplay && > + mode->vdisplay != ssd130x->mode.vdisplay) > +
Re: [PATCH v2] drm/bridge: dw-hdmi: use safe format when first in bridge chain
Hi, On 04/02/2022 15:05, Robert Foss wrote: On Wed, 19 Jan 2022 at 13:28, Neil Armstrong wrote: When the dw-hdmi bridge is in first place of the bridge chain, this means there is no way to select an input format of the dw-hdmi HW component. Since introduction of display-connector, negotiation was broken since the dw-hdmi negotiation code only worked when the dw-hdmi bridge was in last position of the bridge chain or behind another bridge also supporting input & output format negotiation. Commit 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts callbacks") I think this is the wrong hash. Is 7cd70656d128 the actual hash? Wow indeed, thanks for checking... was introduced to make negotiation work again by making display-connector act as a pass-through concerning input & output format negotiation. But in the case where the dw-hdmi is single in the bridge chain, for example on Renesas SoCs, with the display-connector bridge the dw-hdmi is no more single, breaking output format. Reported-by: Biju Das Bisected-by: Kieran Bingham Tested-by: Kieran Bingham Fixes: 0656d1285b79 ("drm/bridge: display-connector: implement bus fmts callbacks"). This hash too. Signed-off-by: Neil Armstrong --- Changes since v1: - Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts - Fix typos in commit message drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c index 54d8fdad395f..97cdc61b57f6 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c @@ -2551,8 +2551,9 @@ static u32 *dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge, if (!output_fmts) return NULL; - /* If dw-hdmi is the only bridge, avoid negociating with ourselves */ - if (list_is_singular(&bridge->encoder->bridge_chain)) { + /* If dw-hdmi is the first or only bridge, avoid negociating with ourselves */ + if (list_is_singular(&bridge->encoder->bridge_chain) || + list_is_first(&bridge->chain_node, &bridge->encoder->bridge_chain)) { *num_output_fmts = 1; output_fmts[0] = MEDIA_BUS_FMT_FIXED; -- 2.25.1 There are two checkstyle issues apart from the above mentioned hash issues, and I think we can ignore those. With the above mentioned issue fixed, feel free to add my r-b. Reviewed-by: Robert Foss Thanks, I'll fix & resend with your r-b. Neil
Re: [PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver
On Fri, Feb 04, 2022 at 03:12:17PM +0100, Javier Martinez Canillas wrote: > On 2/4/22 14:57, Andy Shevchenko wrote: > > On Fri, Feb 04, 2022 at 02:43:46PM +0100, Javier Martinez Canillas wrote: ... > > Stray change? > > Sigh, I'm not sure how added that change. Just ignore it, I'll fix it > on v3 or when applying if there isn't another revision of this series. I believe v3 is warranted due to the other patch review. -- With Best Regards, Andy Shevchenko
Re: [PATCH v2 0/4] drm/tiny: Add driver for Solomon SSD1307 OLED displays
Hi Javier, On Fri, Feb 4, 2022 at 2:43 PM Javier Martinez Canillas wrote: > This patch series adds a DRM driver for the Solomon OLED SSD1305, SSD1306, > SSD1307 and SSD1309 displays. It is a port of the ssd1307fb fbdev driver. [...] > This is a v2 that addresses all the issues pointed in v1, thanks a lot > to everyone that gave me feedback and reviews. I tried to not miss any > comment, but there were a lot so forgive me if something is not there. Thanks for the update! > Changes in v2: [...] Note that the individual patches say "(no changes since v1)"? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH v2 3/4] MAINTAINERS: Add entry for Solomon SSD130X OLED displays DRM driver
On 2/4/22 15:28, Andy Shevchenko wrote: > On Fri, Feb 04, 2022 at 03:12:17PM +0100, Javier Martinez Canillas wrote: >> On 2/4/22 14:57, Andy Shevchenko wrote: >>> On Fri, Feb 04, 2022 at 02:43:46PM +0100, Javier Martinez Canillas wrote: > > ... > >>> Stray change? >> >> Sigh, I'm not sure how added that change. Just ignore it, I'll fix it >> on v3 or when applying if there isn't another revision of this series. > > I believe v3 is warranted due to the other patch review. > Agreed. Thanks a lot for your feedback and comments. Best regards, -- Javier Martinez Canillas Linux Engineering Red Hat
[PATCH v3] drm/bridge: dw-hdmi: use safe format when first in bridge chain
When the dw-hdmi bridge is in first place of the bridge chain, this means there is no way to select an input format of the dw-hdmi HW component. Since introduction of display-connector, negotiation was broken since the dw-hdmi negotiation code only worked when the dw-hdmi bridge was in last position of the bridge chain or behind another bridge also supporting input & output format negotiation. Commit 7cd70656d128 ("drm/bridge: display-connector: implement bus fmts callbacks") was introduced to make negotiation work again by making display-connector act as a pass-through concerning input & output format negotiation. But in the case where the dw-hdmi is single in the bridge chain, for example on Renesas SoCs, with the display-connector bridge the dw-hdmi is no more single, breaking output format. Reported-by: Biju Das Bisected-by: Kieran Bingham Tested-by: Kieran Bingham Fixes: 7cd70656d128 ("drm/bridge: display-connector: implement bus fmts callbacks"). Signed-off-by: Neil Armstrong Reviewed-by: Robert Foss --- Changes since v2: - Add rob's r-b - Fix invalid Fixes commit hash Changes since v1: - Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts - Fix typos in commit message drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c index 54d8fdad395f..97cdc61b57f6 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c @@ -2551,8 +2551,9 @@ static u32 *dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge, if (!output_fmts) return NULL; - /* If dw-hdmi is the only bridge, avoid negociating with ourselves */ - if (list_is_singular(&bridge->encoder->bridge_chain)) { + /* If dw-hdmi is the first or only bridge, avoid negociating with ourselves */ + if (list_is_singular(&bridge->encoder->bridge_chain) || + list_is_first(&bridge->chain_node, &bridge->encoder->bridge_chain)) { *num_output_fmts = 1; output_fmts[0] = MEDIA_BUS_FMT_FIXED; -- 2.25.1
Re: [PATCH v2 0/4] drm/tiny: Add driver for Solomon SSD1307 OLED displays
Hello Geert, On 2/4/22 15:31, Geert Uytterhoeven wrote: > Hi Javier, > > On Fri, Feb 4, 2022 at 2:43 PM Javier Martinez Canillas > wrote: >> This patch series adds a DRM driver for the Solomon OLED SSD1305, SSD1306, >> SSD1307 and SSD1309 displays. It is a port of the ssd1307fb fbdev driver. > > [...] > >> This is a v2 that addresses all the issues pointed in v1, thanks a lot >> to everyone that gave me feedback and reviews. I tried to not miss any >> comment, but there were a lot so forgive me if something is not there. > > Thanks for the update! > You are welcome! >> Changes in v2: > > [...] > > Note that the individual patches say "(no changes since v1)"? > That's due patman (the tool I use to post patches) not being that smart. I only added the v2 changelog in the cover letter and not the individual patches to avoid adding noise, since there are a lot of changes since v1. But patman then thought that means individual patches had no changes... Best regards, -- Javier Martinez Canillas Linux Engineering Red Hat
Re: [PATCH 6/6] drm/amdgpu: use dma_fence_chain_contained
On Fri, Feb 4, 2022 at 5:04 AM Christian König wrote: > > Instead of manually extracting the fence. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c > index f7d8487799b2..40e06745fae9 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c > @@ -261,10 +261,9 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct > amdgpu_sync *sync, > > dma_resv_for_each_fence(&cursor, resv, true, f) { > dma_fence_chain_for_each(f, f) { > - struct dma_fence_chain *chain = to_dma_fence_chain(f); > + struct dma_fence *tmp = dma_fence_chain_contained(f); > > - if (amdgpu_sync_test_fence(adev, mode, owner, chain ? > - chain->fence : f)) { > + if (amdgpu_sync_test_fence(adev, mode, owner, tmp)) { > r = amdgpu_sync_fence(sync, f); > dma_fence_put(f); > if (r) > -- > 2.25.1 >
Re: [PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888,gray8}_to_mono_reversed()
Hi Am 04.02.22 um 14:43 schrieb Javier Martinez Canillas: Add support to convert XR24 and 8-bit grayscale to reversed monochrome for drivers that control monochromatic panels, that only have 1 bit per pixel. The drm_fb_gray8_to_mono_reversed() helper was based on the function that does the same in the drivers/gpu/drm/tiny/repaper.c driver. Signed-off-by: Javier Martinez Canillas --- (no changes since v1) drivers/gpu/drm/drm_format_helper.c | 80 + include/drm/drm_format_helper.h | 7 +++ 2 files changed, 87 insertions(+) diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c index 0f28dd2bdd72..cdce4b7c25d9 100644 --- a/drivers/gpu/drm/drm_format_helper.c +++ b/drivers/gpu/drm/drm_format_helper.c @@ -584,3 +584,83 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int dst_pitch, uint32_t dst_for return -EINVAL; } EXPORT_SYMBOL(drm_fb_blit_toio); + +static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, size_t pixels) +{ + unsigned int xb, i; + + for (xb = 0; xb < pixels / 8; xb++) { In practice, all mode widths are multiples of 8 because VGA mandated it. So it's ok-ish to assume this here. You should probably at least print a warning somewhere if (pixels % 8 != 0) + u8 byte = 0x00; + + for (i = 0; i < 8; i++) { + int x = xb * 8 + i; + + byte >>= 1; + if (src[x] >> 7) + byte |= BIT(7); + } + *dst++ = byte; + } +} + +/** + * drm_fb_gray8_to_mono_reversed - Convert grayscale to reversed monochrome + * @dst: reversed monochrome destination buffer + * @dst_pitch: Number of bytes between two consecutive scanlines within dst + * @src: 8-bit grayscale source buffer + * @clip: Clip rectangle area to copy + * + * DRM doesn't have native monochrome or grayscale support. + * Such drivers can announce the commonly supported XR24 format to userspace + * and use drm_fb_xrgb_to_gray8() to convert to grayscale and then this + * helper function to convert to the native format. + */ +void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_rect *clip) There's a bug here. You want to pass in a drm_framebuffer as fourth argument. +{ + + size_t height = drm_rect_height(clip); + size_t width = drm_rect_width(clip); + unsigned int y; + const u8 *gray8 = src; + u8 *mono = dst; + + if (!dst_pitch) + dst_pitch = width; The dst_pitch is given in bytes. You have to device by 8. Here would be a good place to warn if (width % 8 != 0). + + for (y = 0; y < height; y++) { + drm_fb_gray8_to_mono_reversed_line(mono, gray8, dst_pitch); + mono += (dst_pitch / 8); The dst_pitch is already given in bytes. + gray8 += dst_pitch; 'gray8 += fb->pitches[0]' would be correct. + } +} + +/** + * drm_fb_xrgb_to_mono_reversed - Convert XRGB to reversed monochrome + * @dst: reversed monochrome destination buffer + * @dst_pitch: Number of bytes between two consecutive scanlines within dst + * @src: XRGB source buffer + * @fb: DRM framebuffer + * @clip: Clip rectangle area to copy + * + * DRM doesn't have native monochrome support. + * Such drivers can announce the commonly supported XR24 format to userspace + * and use this function to convert to the native format. + * + * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and + * then the result is converted from grayscale to reversed monohrome. + */ +void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_framebuffer *fb, + const struct drm_rect *clip) +{ + if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB)) + return; + + if (!dst_pitch) + dst_pitch = drm_rect_width(clip); + + drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip); + drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip); Converting from dst into dst can give incorrect results. At some point we probably want to add restrict qualifiers to these pointers, to help the compiler with optimizing. A better approach here is to pull the per-line conversion from drm_fb_xrgb_to_gray8() into a separate helper and implement a line-by-line conversion here. something like this: drm_fb_xrgb_to_mono_reversed() { char *tmp = kmalloc(size of a single line of gray8) for (heigth) { drm_fb_xrgb_to_gray8_line(tmp, ..., src, ...); drm_fb_gray8_to_mono_reversed(dst, ..., tmp, ...); src += fb->pitches[0] dst += dst_pitch; } kfree(tmp); } Best regards Thomas +} +EXPORT_SYMBOL(drm_fb_
Re: [PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888,gray8}_to_mono_reversed()
Am 04.02.22 um 16:52 schrieb Thomas Zimmermann: [...] +/** + * drm_fb_xrgb_to_mono_reversed - Convert XRGB to reversed monochrome + * @dst: reversed monochrome destination buffer + * @dst_pitch: Number of bytes between two consecutive scanlines within dst + * @src: XRGB source buffer + * @fb: DRM framebuffer + * @clip: Clip rectangle area to copy + * + * DRM doesn't have native monochrome support. + * Such drivers can announce the commonly supported XR24 format to userspace + * and use this function to convert to the native format. + * + * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and + * then the result is converted from grayscale to reversed monohrome. + */ +void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_framebuffer *fb, + const struct drm_rect *clip) +{ + if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB)) + return; + + if (!dst_pitch) + dst_pitch = drm_rect_width(clip); + + drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip); + drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip); Converting from dst into dst can give incorrect results. At some point we probably want to add restrict qualifiers to these pointers, to help the compiler with optimizing. A better approach here is to pull the per-line conversion from drm_fb_xrgb_to_gray8() into a separate helper and implement a line-by-line conversion here. something like this: drm_fb_xrgb_to_mono_reversed() { char *tmp = kmalloc(size of a single line of gray8) for (heigth) { drm_fb_xrgb_to_gray8_line(tmp, ..., src, ...); drm_fb_gray8_to_mono_reversed(dst, ..., tmp, ...); Here, I meant 'drm_fb_gray8_to_mono_reversed_line()' src += fb->pitches[0] dst += dst_pitch; } kfree(tmp); } To elaborate, this is an example of what I meant by composable. In the future, we can generalize this function and hand-in 2 function pointers the do the conversion with tmp as intermediate buffer. That would work for any combination of formats that have a common intermediate format. Best regards Thomas +} +EXPORT_SYMBOL(drm_fb_xrgb_to_mono_reversed); diff --git a/include/drm/drm_format_helper.h b/include/drm/drm_format_helper.h index b30ed5de0a33..85e551a5cbe6 100644 --- a/include/drm/drm_format_helper.h +++ b/include/drm/drm_format_helper.h @@ -43,4 +43,11 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int dst_pitch, uint32_t dst_for const void *vmap, const struct drm_framebuffer *fb, const struct drm_rect *rect); +void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_rect *clip); + +void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src, + const struct drm_framebuffer *fb, + const struct drm_rect *clip); + #endif /* __LINUX_DRM_FORMAT_HELPER_H */ -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH] drm/amdgpu: Fix recursive locking warning
Am 2022-02-04 um 02:13 schrieb Christian König: Am 04.02.22 um 04:11 schrieb Rajneesh Bhardwaj: Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.03] WARNING: possible recursive locking detected [ +0.03] 5.13.0-kfd-rajneesh #1030 Not tainted [ +0.04] [ +0.02] python/4822 is trying to acquire lock: [ +0.04] 932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000203] but task is already holding lock: [ +0.03] 932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.17] other info that might help us debug this: [ +0.02] Possible unsafe locking scenario: [ +0.03] CPU0 [ +0.02] [ +0.02] lock(reservation_ww_class_mutex); [ +0.04] lock(reservation_ww_class_mutex); [ +0.03] *** DEADLOCK *** [ +0.02] May be due to missing lock nesting notation [ +0.03] 7 locks held by python/4822: [ +0.03] #0: 932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at: kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu] [ +0.000232] #1: 932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu] [ +0.000241] #2: 932cc45b5e68 (&(*mem)->lock){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu] [ +0.000236] #3: b2b35606fd28 (reservation_ww_class_acquire){+.+.}-{0:0}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu] [ +0.000235] #4: 932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.15] #5: c045f700 (*(sspp++)){}-{0:0}, at: drm_dev_enter+0x5/0xa0 [drm] [ +0.38] #6: 932c52da7078 (&vm->eviction_lock){+.+.}-{3:3}, at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu] [ +0.000195] stack backtrace: [ +0.03] CPU: 11 PID: 4822 Comm: python Not tainted 5.13.0-kfd-rajneesh #1030 [ +0.05] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02 08/29/2018 [ +0.03] Call Trace: [ +0.03] dump_stack+0x6d/0x89 [ +0.10] __lock_acquire+0xb93/0x1a90 [ +0.09] lock_acquire+0x25d/0x2d0 [ +0.05] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] ? lock_is_held_type+0xa2/0x110 [ +0.06] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] __ww_mutex_lock.constprop.17+0xca/0x1060 [ +0.07] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] ? lock_release+0x13f/0x270 [ +0.05] ? lock_is_held_type+0xa2/0x110 [ +0.06] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000185] ttm_bo_release+0x4c6/0x580 [ttm] [ +0.10] amdgpu_bo_unref+0x1a/0x30 [amdgpu] [ +0.000183] amdgpu_vm_free_table+0x76/0xa0 [amdgpu] [ +0.000189] amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu] [ +0.000189] amdgpu_vm_update_ptes+0x411/0x770 [amdgpu] [ +0.000191] amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu] [ +0.000191] amdgpu_vm_bo_update+0x251/0x610 [amdgpu] [ +0.000191] update_gpuvm_pte+0xcc/0x290 [amdgpu] [ +0.000229] ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu] [ +0.000190] amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60 [amdgpu] [ +0.000234] kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu] [ +0.000218] kfd_ioctl+0x2b9/0x600 [amdgpu] [ +0.000216] ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu] [ +0.000216] ? lock_release+0x13f/0x270 [ +0.06] ? __fget_files+0x107/0x1e0 [ +0.07] __x64_sys_ioctl+0x8b/0xd0 [ +0.07] do_syscall_64+0x36/0x70 [ +0.04] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.07] RIP: 0033:0x7fbff90a7317 [ +0.04] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 [ +0.05] RSP: 002b:7fbe301fe648 EFLAGS: 0246 ORIG_RAX: 0010 [ +0.06] RAX: ffda RBX: 7fbcc402d820 RCX: 7fbff90a7317 [ +0.03] RDX: 7fbe301fe690 RSI: c0184b18 RDI: 0004 [ +0.03] RBP: 7fbe301fe690 R08: R09: 7fbcc402d880 [ +0.03] R10: 02001000 R11: 0246 R12: c0184b18 [ +0.03] R13: 0004 R14: 7fbf689593a0 R15: 7fbcc402d820 Cc: Christian König Cc: Felix Kuehling Cc: Alex Deucher Fixes: 627b92ef9d7c ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled") Signed-off-by: Rajneesh Bhardwaj The fixes tag is not necessarily correct, I would remove that. But apart from that the patch is Reviewed-by: Christian König . I suggested the Fixes tag since it was my patch that introduced the problem. W
Re: [PATCH] drm/amdgpu: Fix recursive locking warning
Am 04.02.22 um 17:23 schrieb Felix Kuehling: Am 2022-02-04 um 02:13 schrieb Christian König: Am 04.02.22 um 04:11 schrieb Rajneesh Bhardwaj: Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.03] WARNING: possible recursive locking detected [ +0.03] 5.13.0-kfd-rajneesh #1030 Not tainted [ +0.04] [ +0.02] python/4822 is trying to acquire lock: [ +0.04] 932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000203] but task is already holding lock: [ +0.03] 932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.17] other info that might help us debug this: [ +0.02] Possible unsafe locking scenario: [ +0.03] CPU0 [ +0.02] [ +0.02] lock(reservation_ww_class_mutex); [ +0.04] lock(reservation_ww_class_mutex); [ +0.03] *** DEADLOCK *** [ +0.02] May be due to missing lock nesting notation [ +0.03] 7 locks held by python/4822: [ +0.03] #0: 932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at: kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu] [ +0.000232] #1: 932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu] [ +0.000241] #2: 932cc45b5e68 (&(*mem)->lock){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu] [ +0.000236] #3: b2b35606fd28 (reservation_ww_class_acquire){+.+.}-{0:0}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu] [ +0.000235] #4: 932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.15] #5: c045f700 (*(sspp++)){}-{0:0}, at: drm_dev_enter+0x5/0xa0 [drm] [ +0.38] #6: 932c52da7078 (&vm->eviction_lock){+.+.}-{3:3}, at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu] [ +0.000195] stack backtrace: [ +0.03] CPU: 11 PID: 4822 Comm: python Not tainted 5.13.0-kfd-rajneesh #1030 [ +0.05] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02 08/29/2018 [ +0.03] Call Trace: [ +0.03] dump_stack+0x6d/0x89 [ +0.10] __lock_acquire+0xb93/0x1a90 [ +0.09] lock_acquire+0x25d/0x2d0 [ +0.05] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] ? lock_is_held_type+0xa2/0x110 [ +0.06] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] __ww_mutex_lock.constprop.17+0xca/0x1060 [ +0.07] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] ? lock_release+0x13f/0x270 [ +0.05] ? lock_is_held_type+0xa2/0x110 [ +0.06] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000185] ttm_bo_release+0x4c6/0x580 [ttm] [ +0.10] amdgpu_bo_unref+0x1a/0x30 [amdgpu] [ +0.000183] amdgpu_vm_free_table+0x76/0xa0 [amdgpu] [ +0.000189] amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu] [ +0.000189] amdgpu_vm_update_ptes+0x411/0x770 [amdgpu] [ +0.000191] amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu] [ +0.000191] amdgpu_vm_bo_update+0x251/0x610 [amdgpu] [ +0.000191] update_gpuvm_pte+0xcc/0x290 [amdgpu] [ +0.000229] ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu] [ +0.000190] amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60 [amdgpu] [ +0.000234] kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu] [ +0.000218] kfd_ioctl+0x2b9/0x600 [amdgpu] [ +0.000216] ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu] [ +0.000216] ? lock_release+0x13f/0x270 [ +0.06] ? __fget_files+0x107/0x1e0 [ +0.07] __x64_sys_ioctl+0x8b/0xd0 [ +0.07] do_syscall_64+0x36/0x70 [ +0.04] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.07] RIP: 0033:0x7fbff90a7317 [ +0.04] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 [ +0.05] RSP: 002b:7fbe301fe648 EFLAGS: 0246 ORIG_RAX: 0010 [ +0.06] RAX: ffda RBX: 7fbcc402d820 RCX: 7fbff90a7317 [ +0.03] RDX: 7fbe301fe690 RSI: c0184b18 RDI: 0004 [ +0.03] RBP: 7fbe301fe690 R08: R09: 7fbcc402d880 [ +0.03] R10: 02001000 R11: 0246 R12: c0184b18 [ +0.03] R13: 0004 R14: 7fbf689593a0 R15: 7fbcc402d820 Cc: Christian König Cc: Felix Kuehling Cc: Alex Deucher Fixes: 627b92ef9d7c ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled") Signed-off-by: Rajneesh Bhardwaj The fixes tag is not necessarily correct, I would remove that. But apart from that the patch is Reviewed-by: Christian König . I suggested the Fixes tag sin
[PATCH v5 2/5] drm/i915/gt: Drop invalidate_csb_entries
Drop invalidate_csb_entries and directly call drm_clflush_virt_range. This allows for one less function call, and prevent complier errors when building for non-x86 architectures. v2(Michael Cheng): Drop invalidate_csb_entries function and directly invoke drm_clflush_virt_range. Thanks to Tvrtko for the sugguestion. Signed-off-by: Michael Cheng --- drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 9bb7c863172f..7500c06562da 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -1646,12 +1646,6 @@ cancel_port_requests(struct intel_engine_execlists * const execlists, return inactive; } -static void invalidate_csb_entries(const u64 *first, const u64 *last) -{ - clflush((void *)first); - clflush((void *)last); -} - /* * Starting with Gen12, the status has a new format: * @@ -1999,7 +1993,7 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive) * the wash as hardware, working or not, will need to do the * invalidation before. */ - invalidate_csb_entries(&buf[0], &buf[num_entries - 1]); + drm_clflush_virt_range(&buf[0], num_entries * sizeof(buf[0])); /* * We assume that any event reflects a change in context flow @@ -2783,8 +2777,8 @@ static void reset_csb_pointers(struct intel_engine_cs *engine) /* Check that the GPU does indeed update the CSB entries! */ memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64)); - invalidate_csb_entries(&execlists->csb_status[0], - &execlists->csb_status[reset_value]); + drm_clflush_virt_range(&execlists->csb_status[0], + sizeof(&execlists->csb_status[reset_value])); /* Once more for luck and our trusty paranoia */ ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR, -- 2.25.1
[PATCH v5 5/5] drm/i915/gt: replace cache_clflush_range
Replace all occurance of cache_clflush_range with drm_clflush_virt_range. This will prevent compile errors on non-x86 platforms. Signed-off-by: Michael Cheng --- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 12 ++-- drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +- drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_ppgtt.c| 2 +- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c| 2 +- 5 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index c43e724afa9f..d0999e92621b 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -444,11 +444,11 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt, pd = pdp->entry[gen8_pd_index(idx, 2)]; } - clflush_cache_range(vaddr, PAGE_SIZE); + drm_clflush_virt_range(vaddr, PAGE_SIZE); vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1))); } } while (1); - clflush_cache_range(vaddr, PAGE_SIZE); + drm_clflush_virt_range(vaddr, PAGE_SIZE); return idx; } @@ -532,7 +532,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm, } } while (rem >= page_size && index < I915_PDES); - clflush_cache_range(vaddr, PAGE_SIZE); + drm_clflush_virt_range(vaddr, PAGE_SIZE); /* * Is it safe to mark the 2M block as 64K? -- Either we have @@ -548,7 +548,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm, I915_GTT_PAGE_SIZE_2M { vaddr = px_vaddr(pd); vaddr[maybe_64K] |= GEN8_PDE_IPS_64K; - clflush_cache_range(vaddr, PAGE_SIZE); + drm_clflush_virt_range(vaddr, PAGE_SIZE); page_size = I915_GTT_PAGE_SIZE_64K; /* @@ -569,7 +569,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm, for (i = 1; i < index; i += 16) memset64(vaddr + i, encode, 15); - clflush_cache_range(vaddr, PAGE_SIZE); + drm_clflush_virt_range(vaddr, PAGE_SIZE); } } @@ -617,7 +617,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm, vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1))); vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags); - clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr)); + drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr)); } static int gen8_init_scratch(struct i915_address_space *vm) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 22505aa428d9..5e74c99d5929 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -2822,7 +2822,7 @@ static void execlists_sanitize(struct intel_engine_cs *engine) sanitize_hwsp(engine); /* And scrub the dirty cachelines for the HWSP */ - clflush_cache_range(engine->status_page.addr, PAGE_SIZE); + drm_clflush_virt_range(engine->status_page.addr, PAGE_SIZE); intel_engine_reset_pinned_contexts(engine); } diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 0d6bbc8c57f2..9b594be9102f 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -255,7 +255,7 @@ fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count) void *vaddr = __px_vaddr(p); memset64(vaddr, val, count); - clflush_cache_range(vaddr, PAGE_SIZE); + drm_clflush_virt_range(vaddr, PAGE_SIZE); } static void poison_scratch_page(struct drm_i915_gem_object *scratch) diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c index 48e6e2f87700..bd474a5123cb 100644 --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c @@ -90,7 +90,7 @@ write_dma_entry(struct drm_i915_gem_object * const pdma, u64 * const vaddr = __px_vaddr(pdma); vaddr[idx] = encoded_entry; - clflush_cache_range(&vaddr[idx], sizeof(u64)); + drm_clflush_virt_range(&vaddr[idx], sizeof(u64)); } void diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index b3a429a92c0d..89020706adc4 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915
[PATCH v5 3/5] drm/i915/gt: Re-work reset_csb
Use drm_clflush_virt_range instead of directly invoking clflush. This will prevent compiler errors when building for non-x86 architectures. v2(Michael Cheng): Remove extra clflush v3(Michael Cheng): Remove memory barrier since drm_clflush_virt_range takes care of it. Signed-off-by: Michael Cheng --- drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 7500c06562da..22505aa428d9 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -2944,9 +2944,8 @@ reset_csb(struct intel_engine_cs *engine, struct i915_request **inactive) { struct intel_engine_execlists * const execlists = &engine->execlists; - mb(); /* paranoia: read the CSB pointers from after the reset */ - clflush(execlists->csb_write); - mb(); + drm_clflush_virt_range(execlists->csb_write, + sizeof(execlists->csb_write)); inactive = process_csb(engine, inactive); /* drain preemption events */ -- 2.25.1
[PATCH v5 4/5] drm/i915/: Re-work clflush_write32
Use drm_clflush_virt_range instead of clflushopt and remove the memory barrier, since drm_clflush_virt_range takes care of that. Signed-off-by: Michael Cheng --- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 498b458fd784..0854276ff7ba 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1332,10 +1332,8 @@ static void *reloc_vaddr(struct i915_vma *vma, static void clflush_write32(u32 *addr, u32 value, unsigned int flushes) { if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) { - if (flushes & CLFLUSH_BEFORE) { - clflushopt(addr); - mb(); - } + if (flushes & CLFLUSH_BEFORE) + drm_clflush_virt_range(addr, sizeof(addr)); *addr = value; @@ -1347,7 +1345,7 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes) * to ensure ordering of clflush wrt to the system. */ if (flushes & CLFLUSH_AFTER) - clflushopt(addr); + drm_clflush_virt_range(addr, sizeof(addr)); } else *addr = value; } -- 2.25.1
[PATCH v5 0/5] Use drm_clflush* instead of clflush
This patch series re-work a few i915 functions to use drm_clflush_virt_range instead of calling clflush or clflushopt directly. This will prevent errors when building for non-x86 architectures. v2: s/PAGE_SIZE/sizeof(value) for Re-work intel_write_status_page and added more patches to convert additional clflush/clflushopt to use drm_clflush*. (Michael Cheng) v3: Drop invalidate_csb_entries and directly invoke drm_clflush_virt_ran v4: Remove extra memory barriers v5: s/cache_clflush_range/drm_clflush_virt_range Michael Cheng (5): drm/i915/gt: Re-work intel_write_status_page drm/i915/gt: Drop invalidate_csb_entries drm/i915/gt: Re-work reset_csb drm/i915/: Re-work clflush_write32 drm/i915/gt: replace cache_clflush_range .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 8 +++- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 12 ++-- drivers/gpu/drm/i915/gt/intel_engine.h| 13 - .../drm/i915/gt/intel_execlists_submission.c | 19 ++- drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_ppgtt.c | 2 +- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +- 7 files changed, 22 insertions(+), 36 deletions(-) -- 2.25.1
[PATCH v5 1/5] drm/i915/gt: Re-work intel_write_status_page
Re-work intel_write_status_page to use drm_clflush_virt_range. This will prevent compiler errors when building for non-x86 architectures. Signed-off-by: Michael Cheng --- drivers/gpu/drm/i915/gt/intel_engine.h | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 0e353d8c2bc8..986777c2430d 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -4,6 +4,7 @@ #include #include +#include #include #include @@ -143,15 +144,9 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value) * of extra paranoia to try and ensure that the HWS takes the value * we give and that it doesn't end up trapped inside the CPU! */ - if (static_cpu_has(X86_FEATURE_CLFLUSH)) { - mb(); - clflush(&engine->status_page.addr[reg]); - engine->status_page.addr[reg] = value; - clflush(&engine->status_page.addr[reg]); - mb(); - } else { - WRITE_ONCE(engine->status_page.addr[reg], value); - } + drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value)); + WRITE_ONCE(engine->status_page.addr[reg], value); + drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value)); } /* -- 2.25.1
[PATCH v4] dma-buf-map: Rename to iosys-map
Rename struct dma_buf_map to struct iosys_map and corresponding APIs. Over time dma-buf-map grew up to more functionality than the one used by dma-buf: in fact it's just a shim layer to abstract system memory, that can be accessed via regular load and store, from IO memory that needs to be acessed via arch helpers. The idea is to extend this API so it can fulfill other needs, internal to a single driver. Example: in the i915 driver it's desired to share the implementation for integrated graphics, which uses mostly system memory, with discrete graphics, which may need to access IO memory. The conversion was mostly done with the following semantic patch: @r1@ @@ - struct dma_buf_map + struct iosys_map @r2@ @@ ( - DMA_BUF_MAP_INIT_VADDR + IOSYS_MAP_INIT_VADDR | - dma_buf_map_set_vaddr + iosys_map_set_vaddr | - dma_buf_map_set_vaddr_iomem + iosys_map_set_vaddr_iomem | - dma_buf_map_is_equal + iosys_map_is_equal | - dma_buf_map_is_null + iosys_map_is_null | - dma_buf_map_is_set + iosys_map_is_set | - dma_buf_map_clear + iosys_map_clear | - dma_buf_map_memcpy_to + iosys_map_memcpy_to | - dma_buf_map_incr + iosys_map_incr ) @@ @@ - #include + #include Then some files had their includes adjusted and some comments were update to remove mentions to dma-buf-map. Since this is not specific to dma-buf anymore, move the documentation to the "Bus-Independent Device Accesses" section. v2: - Squash patches v3: - Fix wrong removal of dma-buf.h from MAINTAINERS - Move documentation from dma-buf.rst to device-io.rst v4: - Change documentation tile and level Signed-off-by: Lucas De Marchi Acked-by: Christian König Acked-by: Sumit Semwal Acked-by: Thomas Zimmermann --- Documentation/driver-api/device-io.rst| 9 + Documentation/driver-api/dma-buf.rst | 9 - Documentation/gpu/todo.rst| 20 +- MAINTAINERS | 9 +- drivers/dma-buf/dma-buf.c | 22 +- drivers/dma-buf/heaps/cma_heap.c | 10 +- drivers/dma-buf/heaps/system_heap.c | 10 +- drivers/gpu/drm/ast/ast_drv.h | 2 +- drivers/gpu/drm/ast/ast_mode.c| 8 +- drivers/gpu/drm/drm_cache.c | 18 +- drivers/gpu/drm/drm_client.c | 9 +- drivers/gpu/drm/drm_fb_helper.c | 12 +- drivers/gpu/drm/drm_gem.c | 12 +- drivers/gpu/drm/drm_gem_cma_helper.c | 9 +- drivers/gpu/drm/drm_gem_framebuffer_helper.c | 16 +- drivers/gpu/drm/drm_gem_shmem_helper.c| 15 +- drivers/gpu/drm/drm_gem_ttm_helper.c | 4 +- drivers/gpu/drm/drm_gem_vram_helper.c | 25 +- drivers/gpu/drm/drm_internal.h| 6 +- drivers/gpu/drm/drm_mipi_dbi.c| 8 +- drivers/gpu/drm/drm_prime.c | 4 +- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 8 +- drivers/gpu/drm/gud/gud_pipe.c| 4 +- drivers/gpu/drm/hyperv/hyperv_drm_modeset.c | 5 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 8 +- .../drm/i915/gem/selftests/i915_gem_dmabuf.c | 6 +- .../gpu/drm/i915/gem/selftests/mock_dmabuf.c | 6 +- drivers/gpu/drm/lima/lima_gem.c | 3 +- drivers/gpu/drm/lima/lima_sched.c | 4 +- drivers/gpu/drm/mediatek/mtk_drm_gem.c| 7 +- drivers/gpu/drm/mediatek/mtk_drm_gem.h| 5 +- drivers/gpu/drm/mgag200/mgag200_mode.c| 4 +- drivers/gpu/drm/msm/msm_drv.h | 4 +- drivers/gpu/drm/msm/msm_gem_prime.c | 6 +- drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 13 +- drivers/gpu/drm/qxl/qxl_display.c | 8 +- drivers/gpu/drm/qxl/qxl_draw.c| 6 +- drivers/gpu/drm/qxl/qxl_drv.h | 10 +- drivers/gpu/drm/qxl/qxl_object.c | 8 +- drivers/gpu/drm/qxl/qxl_object.h | 4 +- drivers/gpu/drm/qxl/qxl_prime.c | 4 +- drivers/gpu/drm/radeon/radeon_gem.c | 1 + drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 9 +- drivers/gpu/drm/rockchip/rockchip_drm_gem.h | 5 +- drivers/gpu/drm/tegra/gem.c | 10 +- drivers/gpu/drm/tiny/cirrus.c | 8 +- drivers/gpu/drm/tiny/gm12u320.c | 7 +- drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +- drivers/gpu/drm/ttm/ttm_resource.c| 42 +-- drivers/gpu/drm/ttm/ttm_tt.c | 8 +- drivers/gpu/drm/udl/udl_modeset.c | 3 +- drivers/gpu/drm/vboxvideo/vbox_mode.c | 4 +- drivers/gpu/drm/vkms/vkms_c
Re: [PATCH v4 3/3] fbcon: Add option to enable legacy hardware acceleration
Hi Geert, On 2/4/22 11:24, Geert Uytterhoeven wrote: > On Fri, Feb 4, 2022 at 11:17 AM Helge Deller wrote: >> On 2/4/22 09:37, Geert Uytterhoeven wrote: >>> On Wed, Feb 2, 2022 at 8:05 PM Helge Deller wrote: Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to enable bitblt and fillrect hardware acceleration in the framebuffer console. If disabled, such acceleration will not be used, even if it is supported by the graphics hardware driver. >>> >>> Note that this also applies to vertical panning and wrapping. >> >> That's correct. >> Would you mind to send a patch which adds this info? > > To add it where? "bitblt and fillrect" are only mentioned in the patch > description. > > The Kconfig help entry just talks about "hardware acceleration", > which can mean any trick supported by the hardware. Yes, I thought about extending the Kconfig help entry text to mention vertical panning and wrapping as well. If you have an idea how that should be phrased, I think it makes sense to add it. Helge
[Important!] 2022 X.Org Foundation Membership deadline for voting in the election
The 2022 X.Org Foundation elections are rapidly approaching. We will be forwarding instructions on the nomination process to membership in the near future. Please note that only current members can vote in the upcoming election, and that the deadline for new memberships or renewals to vote in the upcoming election is March 17th 2022 at 23:59 UTC. If you are interested in joining the X.Org Foundation or in renewing your membership, please visit the membership system site at: https://members.x.org/ You can find the current election schedule here: https://www.x.org/wiki/BoardOfDirectors/Elections/2022/ Lyude Paul, On behalf of the X.Org elections committee
[PATCH 00/19] drm/i915/guc: Refactor ADS access to use iosys_map
2nd version of https://patchwork.freedesktop.org/series/99378/ As first patch I'm including the dma-buf-map rename to iosys-map for completeness and to allow the other patches to be reviewed. However the first patch was also sent by itself. I think all the feedback from v1 was incorporated in this version. All the new helpers in iosys-map were squashed in a single patch as requested. Original cover letter: While porting i915 to arm64 we noticed some issues accessing lmem. Some writes were getting corrupted and the final state of the buffer didn't have exactly what we wrote. This became evident when enabling GuC submission: depending on the number of engines the ADS struct was being corrupted and GuC would reject it, refusin to initialize. >From Documentation/core-api/bus-virt-phys-mapping.rst: This memory is called "PCI memory" or "shared memory" or "IO memory" or whatever, and there is only one way to access it: the readb/writeb and related functions. You should never take the address of such memory, because there is really nothing you can do with such an address: it's not conceptually in the same memory space as "real memory" at all, so you cannot just dereference a pointer. (Sadly, on x86 it **is** in the same memory space, so on x86 it actually works to just deference a pointer, but it's not portable). When reading or writing words directly to IO memory, in order to be portable the Linux kernel provides the abstraction detailed in section "Differences between I/O access functions" of Documentation/driver-api/device-io.rst. This limits our ability to simply overlay our structs on top a buffer and directly access it since that buffer may come from IO memory rather than system memory. Hence the approach taken in intel_guc_ads.c needs to be refactored. This is not the only place in i915 that neeed to be changed, but the one causing the most problems, with a real reproducer. This first set of patch focuses on fixing the gem object to pass the ADS After the addition of a few helpers in the dma_buf_map API, most of intel_guc_ads.c can be converted to use it. The exception is the regset initialization: we'd incur into a lot of extra indirection when reading/writting each register. So the regset is converted to use a temporary buffer allocated on probe, which is then copied to its final location when finishing the initialization or on gt reset. Testing on some discrete cards, after this change we can correctly pass the ADS struct to GuC and have it initialized correctly. thanks, Lucas De Marchi Lucas De Marchi (19): dma-buf-map: Rename to iosys-map iosys-map: Add offset to iosys_map_memcpy_to() iosys-map: Add a few more helpers drm/i915/gt: Add helper for shmem copy to iosys_map drm/i915/guc: Keep iosys_map of ads_blob around drm/i915/guc: Add read/write helpers for ADS blob drm/i915/guc: Convert golden context init to iosys_map drm/i915/guc: Convert policies update to iosys_map drm/i915/guc: Convert engine record to iosys_map drm/i915/guc: Convert guc_ads_private_data_reset to iosys_map drm/i915/guc: Convert golden context prep to iosys_map drm/i915/guc: Replace check for golden context size drm/i915/guc: Convert mapping table to iosys_map drm/i915/guc: Convert capture list to iosys_map drm/i915/guc: Prepare for error propagation drm/i915/guc: Use a single pass to calculate regset drm/i915/guc: Convert guc_mmio_reg_state_init to iosys_map drm/i915/guc: Convert __guc_ads_init to iosys_map drm/i915/guc: Remove plain ads_blob pointer Documentation/driver-api/device-io.rst| 9 + Documentation/driver-api/dma-buf.rst | 9 - Documentation/gpu/todo.rst| 20 +- MAINTAINERS | 9 +- drivers/dma-buf/dma-buf.c | 22 +- drivers/dma-buf/heaps/cma_heap.c | 10 +- drivers/dma-buf/heaps/system_heap.c | 10 +- drivers/gpu/drm/ast/ast_drv.h | 2 +- drivers/gpu/drm/ast/ast_mode.c| 8 +- drivers/gpu/drm/drm_cache.c | 18 +- drivers/gpu/drm/drm_client.c | 9 +- drivers/gpu/drm/drm_fb_helper.c | 12 +- drivers/gpu/drm/drm_gem.c | 12 +- drivers/gpu/drm/drm_gem_cma_helper.c | 9 +- drivers/gpu/drm/drm_gem_framebuffer_helper.c | 16 +- drivers/gpu/drm/drm_gem_shmem_helper.c| 15 +- drivers/gpu/drm/drm_gem_ttm_helper.c | 4 +- drivers/gpu/drm/drm_gem_vram_helper.c | 25 +- drivers/gpu/drm/drm_internal.h| 6 +- drivers/gpu/drm/drm_mipi_dbi.c| 8 +- drivers/gpu/drm/drm_prime.c | 4 +- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 8 +- drivers/gpu/drm/gud/gud_pipe.c| 4 +- drivers/gpu/drm/hyperv/hyperv_drm_modeset.c |
[PATCH 03/19] iosys-map: Add a few more helpers
First the simplest ones: - iosys_map_memset(): when abstracting system and I/O memory, just like the memcpy() use case, memset() also has dedicated functions to be called for using IO memory. - iosys_map_memcpy_from(): we may need to copy data from I/O memory, not only to. In certain situations it's useful to be able to read or write to an offset that is calculated by having the memory layout given by a struct declaration. Usually we are going to read/write a u8, u16, u32 or u64. As a pre-requisite for the implementation, add iosys_map_memcpy_from() to be the equivalent of iosys_map_memcpy_to(), but in the other direction. Then add 2 pairs of macros: - iosys_map_rd() / iosys_map_wr() - iosys_map_rd_field() / iosys_map_wr_field() The first pair takes the C-type and offset to read/write. The second pair uses a struct describing the layout of the mapping in order to calculate the offset and size being read/written. We could use readb, readw, readl, readq and the write* counterparts, however due to alignment issues this may not work on all architectures. If alignment needs to be checked to call the right function, it's not possible to decide at compile-time which function to call: so just leave the decision to the memcpy function that will do exactly that. Finally, in order to use the above macros with a map derived from another, add another initializer: IOSYS_MAP_INIT_OFFSET(). Cc: Sumit Semwal Cc: Christian König Cc: Thomas Zimmermann Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Lucas De Marchi --- include/linux/iosys-map.h | 154 +- 1 file changed, 153 insertions(+), 1 deletion(-) diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h index edd7fa3be9e9..96f8b61ac6fb 100644 --- a/include/linux/iosys-map.h +++ b/include/linux/iosys-map.h @@ -6,6 +6,7 @@ #ifndef __IOSYS_MAP_H__ #define __IOSYS_MAP_H__ +#include #include #include @@ -133,6 +134,45 @@ static inline void iosys_map_set_vaddr(struct iosys_map *map, void *vaddr) map->is_iomem = false; } +/** + * IOSYS_MAP_INIT_OFFSET - Initializes struct iosys_map from another iosys_map + * @map_: The dma-buf mapping structure to copy from + * @offset_: Offset to add to the other mapping + * + * Initializes a new iosys_map struct based on another passed as argument. It + * does a shallow copy of the struct so it's possible to update the back storage + * without changing where the original map points to. It is the equivalent of + * doing: + * + * .. code-block: c + * + * iosys_map map = other_map; + * iosys_map_incr(&map, &offset); + * + * Example usage: + * + * .. code-block: c + * + * void foo(struct device *dev, struct iosys_map *base_map) + * { + * ... + * struct iosys_map map = IOSYS_MAP_INIT_OFFSET(base_map, FIELD_OFFSET); + * ... + * } + * + * The advantage of using the initializer over just increasing the offset with + * ``iosys_map_incr()`` like above is that the new map will always point to the + * right place of the buffer during its scope. It reduces the risk of updating + * the wrong part of the buffer and having no compiler warning about that. If + * the assignment to IOSYS_MAP_INIT_OFFSET() is forgotten, the compiler can warn + * using a uninitialized variable. + */ +#define IOSYS_MAP_INIT_OFFSET(map_, offset_) (struct iosys_map) \ + { \ + .vaddr = (map_)->vaddr + (offset_), \ + .is_iomem = (map_)->is_iomem, \ + } + /** * iosys_map_set_vaddr_iomem - Sets a iosys mapping structure to an address in I/O memory * @map: The iosys_map structure @@ -220,7 +260,7 @@ static inline void iosys_map_clear(struct iosys_map *map) } /** - * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map + * iosys_map_memcpy_to - Memcpy into iosys_map * @dst: The iosys_map structure * @dst_offset:The offset from which to copy * @src: The source buffer @@ -239,6 +279,26 @@ static inline void iosys_map_memcpy_to(struct iosys_map *dst, size_t dst_offset, memcpy(dst->vaddr + dst_offset, src, len); } +/** + * iosys_map_memcpy_from - Memcpy from iosys_map into system memory + * @dst: Destination in system memory + * @src: The iosys_map structure + * @src_offset:The offset from which to copy + * @len: The number of byte in src + * + * Copies data from a iosys_map with an offset. The dest buffer is in + * system memory. Depending on the mapping location, the helper picks the + * correct method of accessing the memory. + */ +static inline void iosys_map_memcpy_from(void *dst, const struct iosys_map *src, +size_t src_offset, size_t len) +{ +
[PATCH 02/19] iosys-map: Add offset to iosys_map_memcpy_to()
In certain situations it's useful to be able to write to an offset of the mapping. Add a dst_offset to iosys_map_memcpy_to(). Cc: Sumit Semwal Cc: Christian König Cc: Thomas Zimmermann Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/drm_cache.c | 2 +- drivers/gpu/drm/drm_fb_helper.c | 2 +- include/linux/iosys-map.h | 17 + 3 files changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c index 66597e411764..c3e6e615bf09 100644 --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -218,7 +218,7 @@ static void memcpy_fallback(struct iosys_map *dst, if (!dst->is_iomem && !src->is_iomem) { memcpy(dst->vaddr, src->vaddr, len); } else if (!src->is_iomem) { - iosys_map_memcpy_to(dst, src->vaddr, len); + iosys_map_memcpy_to(dst, 0, src->vaddr, len); } else if (!dst->is_iomem) { memcpy_fromio(dst->vaddr, src->vaddr_iomem, len); } else { diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 238f815cb2a0..bf5cc9a42e5a 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -385,7 +385,7 @@ static void drm_fb_helper_damage_blit_real(struct drm_fb_helper *fb_helper, iosys_map_incr(dst, offset); /* go to first pixel within clip rect */ for (y = clip->y1; y < clip->y2; y++) { - iosys_map_memcpy_to(dst, src, len); + iosys_map_memcpy_to(dst, 0, src, len); iosys_map_incr(dst, fb->pitches[0]); src += fb->pitches[0]; } diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h index f4186f91caa6..edd7fa3be9e9 100644 --- a/include/linux/iosys-map.h +++ b/include/linux/iosys-map.h @@ -220,22 +220,23 @@ static inline void iosys_map_clear(struct iosys_map *map) } /** - * iosys_map_memcpy_to - Memcpy into iosys mapping + * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map * @dst: The iosys_map structure + * @dst_offset:The offset from which to copy * @src: The source buffer * @len: The number of byte in src * - * Copies data into a iosys mapping. The source buffer is in system - * memory. Depending on the buffer's location, the helper picks the correct - * method of accessing the memory. + * Copies data into a iosys_map with an offset. The source buffer is in + * system memory. Depending on the buffer's location, the helper picks the + * correct method of accessing the memory. */ -static inline void iosys_map_memcpy_to(struct iosys_map *dst, const void *src, - size_t len) +static inline void iosys_map_memcpy_to(struct iosys_map *dst, size_t dst_offset, + const void *src, size_t len) { if (dst->is_iomem) - memcpy_toio(dst->vaddr_iomem, src, len); + memcpy_toio(dst->vaddr_iomem + dst_offset, src, len); else - memcpy(dst->vaddr, src, len); + memcpy(dst->vaddr + dst_offset, src, len); } /** -- 2.35.1
[PATCH 04/19] drm/i915/gt: Add helper for shmem copy to iosys_map
Add a variant of shmem_read() that takes a iosys_map pointer rather than a plain pointer as argument. It's mostly a copy __shmem_rw() but adapting the api and removing the write support since there's currently only need to use iosys_map as destination. Reworking __shmem_rw() to share the implementation was tempting, but finding a good balance between reuse and clarity pushed towards a little code duplication. Since the function is small, just add the similar function with a copy/paste/adapt approach. Cc: Matt Roper Cc: Joonas Lahtinen Cc: Tvrtko Ursulin Cc: David Airlie Cc: Daniel Vetter Cc: Matthew Auld Cc: Thomas Hellström Cc: Maarten Lankhorst Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/shmem_utils.c | 33 +++ drivers/gpu/drm/i915/gt/shmem_utils.h | 3 +++ 2 files changed, 36 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c index 0683b27a3890..764adefdb4be 100644 --- a/drivers/gpu/drm/i915/gt/shmem_utils.c +++ b/drivers/gpu/drm/i915/gt/shmem_utils.c @@ -3,6 +3,7 @@ * Copyright © 2020 Intel Corporation */ +#include #include #include #include @@ -123,6 +124,38 @@ static int __shmem_rw(struct file *file, loff_t off, return 0; } +int shmem_read_to_iosys_map(struct file *file, loff_t off, + struct iosys_map *map, size_t len) +{ + struct iosys_map map_iter = *map; + unsigned long pfn; + + for (pfn = off >> PAGE_SHIFT; len; pfn++) { + unsigned int this = + min_t(size_t, PAGE_SIZE - offset_in_page(off), len); + struct page *page; + void *vaddr; + + page = shmem_read_mapping_page_gfp(file->f_mapping, pfn, + GFP_KERNEL); + if (IS_ERR(page)) + return PTR_ERR(page); + + vaddr = kmap(page); + iosys_map_memcpy_to(&map_iter, 0, vaddr + offset_in_page(off), + this); + mark_page_accessed(page); + kunmap(page); + put_page(page); + + len -= this; + iosys_map_incr(&map_iter, this); + off = 0; + } + + return 0; +} + int shmem_read(struct file *file, loff_t off, void *dst, size_t len) { return __shmem_rw(file, off, dst, len, false); diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.h b/drivers/gpu/drm/i915/gt/shmem_utils.h index c1669170c351..e1784999faee 100644 --- a/drivers/gpu/drm/i915/gt/shmem_utils.h +++ b/drivers/gpu/drm/i915/gt/shmem_utils.h @@ -8,6 +8,7 @@ #include +struct iosys_map; struct drm_i915_gem_object; struct file; @@ -17,6 +18,8 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj); void *shmem_pin_map(struct file *file); void shmem_unpin_map(struct file *file, void *ptr); +int shmem_read_to_iosys_map(struct file *file, loff_t off, + struct iosys_map *map, size_t len); int shmem_read(struct file *file, loff_t off, void *dst, size_t len); int shmem_write(struct file *file, loff_t off, void *src, size_t len); -- 2.35.1
[PATCH 05/19] drm/i915/guc: Keep iosys_map of ads_blob around
Convert intel_guc_ads_create() and initialization to use iosys_map rather than plain pointer and save it in the guc struct. This will help with additional updates to the ads_blob after the creation/initialization by abstracting the IO vs system memory. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 4 +++- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 6 ++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 697d9d66acef..9b9ba79f7594 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -6,8 +6,9 @@ #ifndef _INTEL_GUC_H_ #define _INTEL_GUC_H_ -#include #include +#include +#include #include "intel_uncore.h" #include "intel_guc_fw.h" @@ -148,6 +149,7 @@ struct intel_guc { struct i915_vma *ads_vma; /** @ads_blob: contents of the GuC ADS */ struct __guc_ads_blob *ads_blob; + struct iosys_map ads_map; /** @ads_regset_size: size of the save/restore regsets in the ADS */ u32 ads_regset_size; /** @ads_golden_ctxt_size: size of the golden contexts in the ADS */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index e61150adcbe9..13671b186908 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -624,6 +624,11 @@ int intel_guc_ads_create(struct intel_guc *guc) if (ret) return ret; + if (i915_gem_object_is_lmem(guc->ads_vma->obj)) + iosys_map_set_vaddr_iomem(&guc->ads_map, (void __iomem *)guc->ads_blob); + else + iosys_map_set_vaddr(&guc->ads_map, guc->ads_blob); + __guc_ads_init(guc); return 0; @@ -645,6 +650,7 @@ void intel_guc_ads_destroy(struct intel_guc *guc) { i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP); guc->ads_blob = NULL; + iosys_map_clear(&guc->ads_map); } static void guc_ads_private_data_reset(struct intel_guc *guc) -- 2.35.1
[PATCH 01/19] dma-buf-map: Rename to iosys-map
Rename struct dma_buf_map to struct iosys_map and corresponding APIs. Over time dma-buf-map grew up to more functionality than the one used by dma-buf: in fact it's just a shim layer to abstract system memory, that can be accessed via regular load and store, from IO memory that needs to be acessed via arch helpers. The idea is to extend this API so it can fulfill other needs, internal to a single driver. Example: in the i915 driver it's desired to share the implementation for integrated graphics, which uses mostly system memory, with discrete graphics, which may need to access IO memory. The conversion was mostly done with the following semantic patch: @r1@ @@ - struct dma_buf_map + struct iosys_map @r2@ @@ ( - DMA_BUF_MAP_INIT_VADDR + IOSYS_MAP_INIT_VADDR | - dma_buf_map_set_vaddr + iosys_map_set_vaddr | - dma_buf_map_set_vaddr_iomem + iosys_map_set_vaddr_iomem | - dma_buf_map_is_equal + iosys_map_is_equal | - dma_buf_map_is_null + iosys_map_is_null | - dma_buf_map_is_set + iosys_map_is_set | - dma_buf_map_clear + iosys_map_clear | - dma_buf_map_memcpy_to + iosys_map_memcpy_to | - dma_buf_map_incr + iosys_map_incr ) @@ @@ - #include + #include Then some files had their includes adjusted and some comments were update to remove mentions to dma-buf-map. Since this is not specific to dma-buf anymore, move the documentation to the "Bus-Independent Device Accesses" section. v2: - Squash patches v3: - Fix wrong removal of dma-buf.h from MAINTAINERS - Move documentation from dma-buf.rst to device-io.rst v4: - Change documentation tile and level Signed-off-by: Lucas De Marchi Acked-by: Christian König Acked-by: Sumit Semwal Acked-by: Thomas Zimmermann --- Documentation/driver-api/device-io.rst| 9 + Documentation/driver-api/dma-buf.rst | 9 - Documentation/gpu/todo.rst| 20 +- MAINTAINERS | 9 +- drivers/dma-buf/dma-buf.c | 22 +- drivers/dma-buf/heaps/cma_heap.c | 10 +- drivers/dma-buf/heaps/system_heap.c | 10 +- drivers/gpu/drm/ast/ast_drv.h | 2 +- drivers/gpu/drm/ast/ast_mode.c| 8 +- drivers/gpu/drm/drm_cache.c | 18 +- drivers/gpu/drm/drm_client.c | 9 +- drivers/gpu/drm/drm_fb_helper.c | 12 +- drivers/gpu/drm/drm_gem.c | 12 +- drivers/gpu/drm/drm_gem_cma_helper.c | 9 +- drivers/gpu/drm/drm_gem_framebuffer_helper.c | 16 +- drivers/gpu/drm/drm_gem_shmem_helper.c| 15 +- drivers/gpu/drm/drm_gem_ttm_helper.c | 4 +- drivers/gpu/drm/drm_gem_vram_helper.c | 25 +- drivers/gpu/drm/drm_internal.h| 6 +- drivers/gpu/drm/drm_mipi_dbi.c| 8 +- drivers/gpu/drm/drm_prime.c | 4 +- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 8 +- drivers/gpu/drm/gud/gud_pipe.c| 4 +- drivers/gpu/drm/hyperv/hyperv_drm_modeset.c | 5 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 8 +- .../drm/i915/gem/selftests/i915_gem_dmabuf.c | 6 +- .../gpu/drm/i915/gem/selftests/mock_dmabuf.c | 6 +- drivers/gpu/drm/lima/lima_gem.c | 3 +- drivers/gpu/drm/lima/lima_sched.c | 4 +- drivers/gpu/drm/mediatek/mtk_drm_gem.c| 7 +- drivers/gpu/drm/mediatek/mtk_drm_gem.h| 5 +- drivers/gpu/drm/mgag200/mgag200_mode.c| 4 +- drivers/gpu/drm/msm/msm_drv.h | 4 +- drivers/gpu/drm/msm/msm_gem_prime.c | 6 +- drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 13 +- drivers/gpu/drm/qxl/qxl_display.c | 8 +- drivers/gpu/drm/qxl/qxl_draw.c| 6 +- drivers/gpu/drm/qxl/qxl_drv.h | 10 +- drivers/gpu/drm/qxl/qxl_object.c | 8 +- drivers/gpu/drm/qxl/qxl_object.h | 4 +- drivers/gpu/drm/qxl/qxl_prime.c | 4 +- drivers/gpu/drm/radeon/radeon_gem.c | 1 + drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 9 +- drivers/gpu/drm/rockchip/rockchip_drm_gem.h | 5 +- drivers/gpu/drm/tegra/gem.c | 10 +- drivers/gpu/drm/tiny/cirrus.c | 8 +- drivers/gpu/drm/tiny/gm12u320.c | 7 +- drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +- drivers/gpu/drm/ttm/ttm_resource.c| 42 +-- drivers/gpu/drm/ttm/ttm_tt.c | 8 +- drivers/gpu/drm/udl/udl_modeset.c | 3 +- drivers/gpu/drm/vboxvideo/vbox_mode.c | 4 +- drivers/gpu/drm/vkms/vkms_c
[PATCH 06/19] drm/i915/guc: Add read/write helpers for ADS blob
Add helpers on top of iosys_map_read_field() / iosys_map_write_field() functions so they always use the right arguments and make code easier to read. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 13671b186908..3a0afce7564e 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -60,6 +60,13 @@ struct __guc_ads_blob { struct guc_mmio_reg regset[0]; } __packed; +#define ads_blob_read(guc_, field_)\ + iosys_map_rd_field(&(guc_)->ads_map, struct __guc_ads_blob, field_) + +#define ads_blob_write(guc_, field_, val_) \ + iosys_map_wr_field(&(guc_)->ads_map, struct __guc_ads_blob, \ + field_, val_) + static u32 guc_ads_regset_size(struct intel_guc *guc) { GEM_BUG_ON(!guc->ads_regset_size); -- 2.35.1
[PATCH 08/19] drm/i915/guc: Convert policies update to iosys_map
Use iosys_map to write the policies update so access to IO and system memory is abstracted away. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 41 -- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index d32b407a2d25..6311b9da87e4 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -130,33 +130,37 @@ static u32 guc_ads_blob_size(struct intel_guc *guc) guc_ads_private_data_size(guc); } -static void guc_policies_init(struct intel_guc *guc, struct guc_policies *policies) +static void guc_policies_init(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); struct drm_i915_private *i915 = gt->i915; + u32 global_flags = 0; - policies->dpc_promote_time = GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US; - policies->max_num_work_items = GLOBAL_POLICY_MAX_NUM_WI; + ads_blob_write(guc, policies.dpc_promote_time, + GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US); + ads_blob_write(guc, policies.max_num_work_items, + GLOBAL_POLICY_MAX_NUM_WI); - policies->global_flags = 0; if (i915->params.reset < 2) - policies->global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET; + global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET; - policies->is_valid = 1; + ads_blob_write(guc, policies.global_flags, global_flags); + ads_blob_write(guc, policies.is_valid, 1); } void intel_guc_ads_print_policy_info(struct intel_guc *guc, struct drm_printer *dp) { - struct __guc_ads_blob *blob = guc->ads_blob; - - if (unlikely(!blob)) + if (unlikely(iosys_map_is_null(&guc->ads_map))) return; drm_printf(dp, "Global scheduling policies:\n"); - drm_printf(dp, " DPC promote time = %u\n", blob->policies.dpc_promote_time); - drm_printf(dp, " Max num work items = %u\n", blob->policies.max_num_work_items); - drm_printf(dp, " Flags = %u\n", blob->policies.global_flags); + drm_printf(dp, " DPC promote time = %u\n", + ads_blob_read(guc, policies.dpc_promote_time)); + drm_printf(dp, " Max num work items = %u\n", + ads_blob_read(guc, policies.max_num_work_items)); + drm_printf(dp, " Flags = %u\n", + ads_blob_read(guc, policies.global_flags)); } static int guc_action_policies_update(struct intel_guc *guc, u32 policy_offset) @@ -171,23 +175,24 @@ static int guc_action_policies_update(struct intel_guc *guc, u32 policy_offset) int intel_guc_global_policies_update(struct intel_guc *guc) { - struct __guc_ads_blob *blob = guc->ads_blob; struct intel_gt *gt = guc_to_gt(guc); + u32 scheduler_policies; intel_wakeref_t wakeref; int ret; - if (!blob) + if (iosys_map_is_null(&guc->ads_map)) return -EOPNOTSUPP; - GEM_BUG_ON(!blob->ads.scheduler_policies); + scheduler_policies = ads_blob_read(guc, ads.scheduler_policies); + GEM_BUG_ON(!scheduler_policies); - guc_policies_init(guc, &blob->policies); + guc_policies_init(guc); if (!intel_guc_is_ready(guc)) return 0; with_intel_runtime_pm(>->i915->runtime_pm, wakeref) - ret = guc_action_policies_update(guc, blob->ads.scheduler_policies); + ret = guc_action_policies_update(guc, scheduler_policies); return ret; } @@ -557,7 +562,7 @@ static void __guc_ads_init(struct intel_guc *guc) u32 base; /* GuC scheduling policies */ - guc_policies_init(guc, &blob->policies); + guc_policies_init(guc); /* System info */ fill_engine_enable_masks(gt, &blob->system_info); -- 2.35.1
[PATCH 07/19] drm/i915/guc: Convert golden context init to iosys_map
Now the map is saved during creation, so use it to initialize the golden context, reading from shmem and writing to either system or IO memory. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 +++--- 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 3a0afce7564e..d32b407a2d25 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -473,18 +473,17 @@ static struct intel_engine_cs *find_engine_state(struct intel_gt *gt, u8 engine_ static void guc_init_golden_context(struct intel_guc *guc) { - struct __guc_ads_blob *blob = guc->ads_blob; struct intel_engine_cs *engine; struct intel_gt *gt = guc_to_gt(guc); + struct iosys_map golden_context_map; u32 addr_ggtt, offset; u32 total_size = 0, alloc_size, real_size; u8 engine_class, guc_class; - u8 *ptr; if (!intel_uc_uses_guc_submission(>->uc)) return; - GEM_BUG_ON(!blob); + GEM_BUG_ON(iosys_map_is_null(&guc->ads_map)); /* * Go back and fill in the golden context data now that it is @@ -492,15 +491,15 @@ static void guc_init_golden_context(struct intel_guc *guc) */ offset = guc_ads_golden_ctxt_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; - ptr = ((u8 *)blob) + offset; + + golden_context_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset); for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; ++engine_class) { if (engine_class == OTHER_CLASS) continue; guc_class = engine_class_to_guc_class(engine_class); - - if (!blob->system_info.engine_enabled_masks[guc_class]) + if (!ads_blob_read(guc, system_info.engine_enabled_masks[guc_class])) continue; real_size = intel_engine_context_size(gt, engine_class); @@ -511,18 +510,20 @@ static void guc_init_golden_context(struct intel_guc *guc) if (!engine) { drm_err(>->i915->drm, "No engine state recorded for class %d!\n", engine_class); - blob->ads.eng_state_size[guc_class] = 0; - blob->ads.golden_context_lrca[guc_class] = 0; + ads_blob_write(guc, ads.eng_state_size[guc_class], 0); + ads_blob_write(guc, ads.golden_context_lrca[guc_class], 0); continue; } - GEM_BUG_ON(blob->ads.eng_state_size[guc_class] != + GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) != real_size - LRC_SKIP_SIZE); - GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != addr_ggtt); + GEM_BUG_ON(ads_blob_read(guc, ads.golden_context_lrca[guc_class]) != addr_ggtt); + addr_ggtt += alloc_size; - shmem_read(engine->default_state, 0, ptr, real_size); - ptr += alloc_size; + shmem_read_to_iosys_map(engine->default_state, 0, + &golden_context_map, real_size); + iosys_map_incr(&golden_context_map, alloc_size); } GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); -- 2.35.1
[PATCH 13/19] drm/i915/guc: Convert mapping table to iosys_map
Use iosys_map to write the fields system_info.mapping_table[][]. Since we already have the info_map around where needed, just use it instead of going through guc->ads_map. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 098a4756e8c5..9e96a44a6bbc 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -204,7 +204,7 @@ int intel_guc_global_policies_update(struct intel_guc *guc) } static void guc_mapping_table_init(struct intel_gt *gt, - struct guc_gt_system_info *system_info) + struct iosys_map *info_map) { unsigned int i, j; struct intel_engine_cs *engine; @@ -213,14 +213,14 @@ static void guc_mapping_table_init(struct intel_gt *gt, /* Table must be set to invalid values for entries not used */ for (i = 0; i < GUC_MAX_ENGINE_CLASSES; ++i) for (j = 0; j < GUC_MAX_INSTANCES_PER_CLASS; ++j) - system_info->mapping_table[i][j] = - GUC_MAX_INSTANCES_PER_CLASS; + info_map_write(info_map, mapping_table[i][j], + GUC_MAX_INSTANCES_PER_CLASS); for_each_engine(engine, gt, id) { u8 guc_class = engine_class_to_guc_class(engine->class); - system_info->mapping_table[guc_class][ilog2(engine->logical_mask)] = - engine->instance; + info_map_write(info_map, mapping_table[guc_class][ilog2(engine->logical_mask)], + engine->instance); } } @@ -595,7 +595,7 @@ static void __guc_ads_init(struct intel_guc *guc) /* Golden contexts for re-initialising after a watchdog reset */ guc_prep_golden_context(guc); - guc_mapping_table_init(guc_to_gt(guc), &blob->system_info); + guc_mapping_table_init(guc_to_gt(guc), &info_map); base = intel_guc_ggtt_offset(guc, guc->ads_vma); -- 2.35.1
[PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset
The ADS initialitazion was using 2 passes to calculate the regset sent to GuC to initialize each engine: the first pass to just have the final object size and the second to set each register in place in the final gem object. However in order to maintain an ordered set of registers to pass to guc, each register needs to be added and moved in the final array. The second phase may actually happen in IO memory rather than system memory and accessing IO memory by simply dereferencing the pointer doesn't work on all architectures. Other places of the ADS initializaition were converted to use the iosys_map API, but here there may be a lot more accesses to IO memory. So, instead of following that same approach, convert the regset initialization to calculate the final array in 1 pass and in the second pass that array is just copied to its final location, updating the pointers for each engine written to the ADS blob. One important thing is that struct temp_regset now have different semantics: `registers` continues to track the registers of a single engine, however the other fields are updated together, according to the newly added `storage`, which tracks the memory allocated for all the registers. So rename some of these fields and add a __mmio_reg_add(): this function (possibly) allocates memory and operates on the storage pointer while guc_mmio_reg_add() continues to manage the registers pointer. On a Tiger Lake system using enable_guc=3, the following log message is now seen: [ 187.334310] i915 :00:02.0: [drm:intel_guc_ads_create [i915]] Used 4 KB for temporary ADS regset This change has also been tested on an ARM64 host with DG2 and other discrete graphics cards. v2 (Daniele): - Fix leaking tempset on error path - Add comments on struct temp_regset to document the meaning of each field Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 7 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 128 + 2 files changed, 90 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 9b9ba79f7594..f857e9190750 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -152,6 +152,13 @@ struct intel_guc { struct iosys_map ads_map; /** @ads_regset_size: size of the save/restore regsets in the ADS */ u32 ads_regset_size; + /** +* @ads_regset_count: number of save/restore registers in the ADS for +* each engine +*/ + u32 ads_regset_count[I915_NUM_ENGINES]; + /** @ads_regset: save/restore regsets in the ADS */ + struct guc_mmio_reg *ads_regset; /** @ads_golden_ctxt_size: size of the golden contexts in the ADS */ u32 ads_golden_ctxt_size; /** @ads_engine_usage_size: size of engine usage in the ADS */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 3a4558948c31..c040d8d8d7a4 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -226,14 +226,18 @@ static void guc_mapping_table_init(struct intel_gt *gt, /* * The save/restore register list must be pre-calculated to a temporary - * buffer of driver defined size before it can be generated in place - * inside the ADS. + * buffer before it can be copied inside the ADS. */ -#define MAX_MMIO_REGS 128 /* Arbitrary size, increase as needed */ struct temp_regset { + /* +* ptr to the section of the storage for the engine currently being +* worked on +*/ struct guc_mmio_reg *registers; - u32 used; - u32 size; + /* ptr to the base of the allocated storage for all engines */ + struct guc_mmio_reg *storage; + u32 storage_used; + u32 storage_max; }; static int guc_mmio_reg_cmp(const void *a, const void *b) @@ -244,18 +248,44 @@ static int guc_mmio_reg_cmp(const void *a, const void *b) return (int)ra->offset - (int)rb->offset; } +static struct guc_mmio_reg * __must_check +__mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg) +{ + u32 pos = regset->storage_used; + struct guc_mmio_reg *slot; + + if (pos >= regset->storage_max) { + size_t size = ALIGN((pos + 1) * sizeof(*slot), PAGE_SIZE); + struct guc_mmio_reg *r = krealloc(regset->storage, + size, GFP_KERNEL); + if (!r) { + WARN_ONCE(1, "Incomplete regset list: can't add register (%d)\n", + -ENOMEM); + return ERR_PTR(-ENOMEM); + } + + regset->registers = r + (regset->registers - regset->storage); +
[PATCH 09/19] drm/i915/guc: Convert engine record to iosys_map
Use iosys_map to read fields from the dma_blob so access to IO and system memory is abstracted away. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 14 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h | 3 ++- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 17 ++--- 3 files changed, 18 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 6311b9da87e4..1d21a2d457e0 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -698,18 +698,16 @@ void intel_guc_ads_reset(struct intel_guc *guc) u32 intel_guc_engine_usage_offset(struct intel_guc *guc) { - struct __guc_ads_blob *blob = guc->ads_blob; - u32 base = intel_guc_ggtt_offset(guc, guc->ads_vma); - u32 offset = base + ptr_offset(blob, engine_usage); - - return offset; + return intel_guc_ggtt_offset(guc, guc->ads_vma) + + offsetof(struct __guc_ads_blob, engine_usage); } -struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs *engine) +struct iosys_map intel_guc_engine_usage_record_map(struct intel_engine_cs *engine) { struct intel_guc *guc = &engine->gt->uc.guc; - struct __guc_ads_blob *blob = guc->ads_blob; u8 guc_class = engine_class_to_guc_class(engine->class); + size_t offset = offsetof(struct __guc_ads_blob, + engine_usage.engines[guc_class][ilog2(engine->logical_mask)]); - return &blob->engine_usage.engines[guc_class][ilog2(engine->logical_mask)]; + return IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset); } diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h index e74c110facff..1c64f4d6ea21 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h @@ -7,6 +7,7 @@ #define _INTEL_GUC_ADS_H_ #include +#include struct intel_guc; struct drm_printer; @@ -18,7 +19,7 @@ void intel_guc_ads_init_late(struct intel_guc *guc); void intel_guc_ads_reset(struct intel_guc *guc); void intel_guc_ads_print_policy_info(struct intel_guc *guc, struct drm_printer *p); -struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs *engine); +struct iosys_map intel_guc_engine_usage_record_map(struct intel_engine_cs *engine); u32 intel_guc_engine_usage_offset(struct intel_guc *guc); #endif diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index b3a429a92c0d..6d34842f68b4 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1139,6 +1139,9 @@ __extend_last_switch(struct intel_guc *guc, u64 *prev_start, u32 new_start) *prev_start = ((u64)gt_stamp_hi << 32) | new_start; } +#define record_read(map_, field_) \ + iosys_map_rd_field(map_, struct guc_engine_usage_record, field_) + /* * GuC updates shared memory and KMD reads it. Since this is not synchronized, * we run into a race where the value read is inconsistent. Sometimes the @@ -1153,17 +1156,17 @@ __extend_last_switch(struct intel_guc *guc, u64 *prev_start, u32 new_start) static void __get_engine_usage_record(struct intel_engine_cs *engine, u32 *last_in, u32 *id, u32 *total) { - struct guc_engine_usage_record *rec = intel_guc_engine_usage(engine); + struct iosys_map rec_map = intel_guc_engine_usage_record_map(engine); int i = 0; do { - *last_in = READ_ONCE(rec->last_switch_in_stamp); - *id = READ_ONCE(rec->current_context_index); - *total = READ_ONCE(rec->total_runtime); + *last_in = record_read(&rec_map, last_switch_in_stamp); + *id = record_read(&rec_map, current_context_index); + *total = record_read(&rec_map, total_runtime); - if (READ_ONCE(rec->last_switch_in_stamp) == *last_in && - READ_ONCE(rec->current_context_index) == *id && - READ_ONCE(rec->total_runtime) == *total) + if (record_read(&rec_map, last_switch_in_stamp) == *last_in && + record_read(&rec_map, current_context_index) == *id && + record_read(&rec_map, total_runtime) == *total) break; } while (++i < 6); } -- 2.35.1
[PATCH 12/19] drm/i915/guc: Replace check for golden context size
In the other places in this function, guc->ads_map is being protected from access when it's not yet set. However the last check is actually about guc->ads_golden_ctxt_size been set before. These checks should always match as the size is initialized on the first call to guc_prep_golden_context(), but it's clearer if we have a single return and check for guc->ads_golden_ctxt_size. This is just a readability improvement, no change in behavior. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 80fbab831536..098a4756e8c5 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -461,10 +461,10 @@ static int guc_prep_golden_context(struct intel_guc *guc) addr_ggtt += alloc_size; } - if (iosys_map_is_null(&guc->ads_map)) - return total_size; + /* Make sure current size matches what we calculated previously */ + if (guc->ads_golden_ctxt_size) + GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); - GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); return total_size; } -- 2.35.1
[PATCH 19/19] drm/i915/guc: Remove plain ads_blob pointer
Now we have the access to content of GuC ADS either using iosys_map API or using a temporary buffer. Remove guc->ads_blob as there shouldn't be updates using the bare pointer anymore. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 3 +-- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 8 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index f857e9190750..bf7079480d47 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -147,8 +147,7 @@ struct intel_guc { /** @ads_vma: object allocated to hold the GuC ADS */ struct i915_vma *ads_vma; - /** @ads_blob: contents of the GuC ADS */ - struct __guc_ads_blob *ads_blob; + /** @ads_map: contents of the GuC ADS */ struct iosys_map ads_map; /** @ads_regset_size: size of the save/restore regsets in the ADS */ u32 ads_regset_size; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 6262fd4e0d4a..7dd44b6d76da 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -672,6 +672,7 @@ static void __guc_ads_init(struct intel_guc *guc) */ int intel_guc_ads_create(struct intel_guc *guc) { + void *ads_blob; u32 size; int ret; @@ -696,14 +697,14 @@ int intel_guc_ads_create(struct intel_guc *guc) size = guc_ads_blob_size(guc); ret = intel_guc_allocate_and_map_vma(guc, size, &guc->ads_vma, -(void **)&guc->ads_blob); +&ads_blob); if (ret) return ret; if (i915_gem_object_is_lmem(guc->ads_vma->obj)) - iosys_map_set_vaddr_iomem(&guc->ads_map, (void __iomem *)guc->ads_blob); + iosys_map_set_vaddr_iomem(&guc->ads_map, (void __iomem *)ads_blob); else - iosys_map_set_vaddr(&guc->ads_map, guc->ads_blob); + iosys_map_set_vaddr(&guc->ads_map, ads_blob); __guc_ads_init(guc); @@ -725,7 +726,6 @@ void intel_guc_ads_init_late(struct intel_guc *guc) void intel_guc_ads_destroy(struct intel_guc *guc) { i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP); - guc->ads_blob = NULL; iosys_map_clear(&guc->ads_map); kfree(guc->ads_regset); } -- 2.35.1
[PATCH 10/19] drm/i915/guc: Convert guc_ads_private_data_reset to iosys_map
Use iosys_map_memset() to zero the private data as ADS may be either on system or IO memory. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 1d21a2d457e0..973762ce2196 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -674,8 +674,8 @@ static void guc_ads_private_data_reset(struct intel_guc *guc) if (!size) return; - memset((void *)guc->ads_blob + guc_ads_private_data_offset(guc), 0, - size); + iosys_map_memset(&guc->ads_map, guc_ads_private_data_offset(guc), +0, size); } /** -- 2.35.1
[PATCH 17/19] drm/i915/guc: Convert guc_mmio_reg_state_init to iosys_map
Now that the regset list is prepared, convert guc_mmio_reg_state_init() to use iosys_map to copy the array to the final location and initialize additional fields in ads.reg_state_list. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 30 +- 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index c040d8d8d7a4..cf6fafa1024c 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -383,40 +383,46 @@ static long guc_mmio_reg_state_create(struct intel_guc *guc) return ret; } -static void guc_mmio_reg_state_init(struct intel_guc *guc, - struct __guc_ads_blob *blob) +static void guc_mmio_reg_state_init(struct intel_guc *guc) { + struct iosys_map ads_regset_map; struct intel_gt *gt = guc_to_gt(guc); struct intel_engine_cs *engine; - struct guc_mmio_reg *ads_registers; enum intel_engine_id id; u32 addr_ggtt, offset; offset = guc_ads_regset_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; - ads_registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset); + ads_regset_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset); - memcpy(ads_registers, guc->ads_regset, guc->ads_regset_size); + iosys_map_memcpy_to(&ads_regset_map, 0, guc->ads_regset, + guc->ads_regset_size); for_each_engine(engine, gt, id) { u32 count = guc->ads_regset_count[id]; - struct guc_mmio_reg_set *ads_reg_set; u8 guc_class; /* Class index is checked in class converter */ GEM_BUG_ON(engine->instance >= GUC_MAX_INSTANCES_PER_CLASS); guc_class = engine_class_to_guc_class(engine->class); - ads_reg_set = &blob->ads.reg_state_list[guc_class][engine->instance]; if (!count) { - ads_reg_set->address = 0; - ads_reg_set->count = 0; + ads_blob_write(guc, + ads.reg_state_list[guc_class][engine->instance].address, + 0); + ads_blob_write(guc, + ads.reg_state_list[guc_class][engine->instance].count, + 0); continue; } - ads_reg_set->address = addr_ggtt; - ads_reg_set->count = count; + ads_blob_write(guc, + ads.reg_state_list[guc_class][engine->instance].address, + addr_ggtt); + ads_blob_write(guc, + ads.reg_state_list[guc_class][engine->instance].count, + count); addr_ggtt += count * sizeof(struct guc_mmio_reg); } @@ -646,7 +652,7 @@ static void __guc_ads_init(struct intel_guc *guc) blob->ads.gt_system_info = base + ptr_offset(blob, system_info); /* MMIO save/restore list */ - guc_mmio_reg_state_init(guc, blob); + guc_mmio_reg_state_init(guc); /* Private Data */ blob->ads.private_data = base + guc_ads_private_data_offset(guc); -- 2.35.1
[PATCH 14/19] drm/i915/guc: Convert capture list to iosys_map
Use iosys_map to write the fields ads.capture_*. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 9e96a44a6bbc..245b703568ff 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -544,7 +544,7 @@ static void guc_init_golden_context(struct intel_guc *guc) GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); } -static void guc_capture_list_init(struct intel_guc *guc, struct __guc_ads_blob *blob) +static void guc_capture_list_init(struct intel_guc *guc) { int i, j; u32 addr_ggtt, offset; @@ -556,11 +556,11 @@ static void guc_capture_list_init(struct intel_guc *guc, struct __guc_ads_blob * for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) { for (j = 0; j < GUC_MAX_ENGINE_CLASSES; j++) { - blob->ads.capture_instance[i][j] = addr_ggtt; - blob->ads.capture_class[i][j] = addr_ggtt; + ads_blob_write(guc, ads.capture_instance[i][j], addr_ggtt); + ads_blob_write(guc, ads.capture_class[i][j], addr_ggtt); } - blob->ads.capture_global[i] = addr_ggtt; + ads_blob_write(guc, ads.capture_global[i], addr_ggtt); } } @@ -600,7 +600,7 @@ static void __guc_ads_init(struct intel_guc *guc) base = intel_guc_ggtt_offset(guc, guc->ads_vma); /* Capture list for hang debug */ - guc_capture_list_init(guc, blob); + guc_capture_list_init(guc); /* ADS */ blob->ads.scheduler_policies = base + ptr_offset(blob, policies); -- 2.35.1
[PATCH 15/19] drm/i915/guc: Prepare for error propagation
Currently guc_mmio_reg_add() relies on having enough memory available in the array to add a new slot. It uses `GEM_BUG_ON(count >= regset->size);` to protect going above the threshold. In order to allow guc_mmio_reg_add() to handle the memory allocation by itself, it must return an error in case of failures. Adjust return code so this error can be propagated to the callers of guc_mmio_reg_add() and guc_mmio_regset_init(). No intended change in behavior. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 31 +- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 245b703568ff..3a4558948c31 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -244,8 +244,8 @@ static int guc_mmio_reg_cmp(const void *a, const void *b) return (int)ra->offset - (int)rb->offset; } -static void guc_mmio_reg_add(struct temp_regset *regset, -u32 offset, u32 flags) +static long __must_check guc_mmio_reg_add(struct temp_regset *regset, + u32 offset, u32 flags) { u32 count = regset->used; struct guc_mmio_reg reg = { @@ -264,7 +264,7 @@ static void guc_mmio_reg_add(struct temp_regset *regset, */ if (bsearch(®, regset->registers, count, sizeof(reg), guc_mmio_reg_cmp)) - return; + return 0; slot = ®set->registers[count]; regset->used++; @@ -277,6 +277,8 @@ static void guc_mmio_reg_add(struct temp_regset *regset, swap(slot[1], slot[0]); } + + return 0; } #define GUC_MMIO_REG_ADD(regset, reg, masked) \ @@ -284,32 +286,35 @@ static void guc_mmio_reg_add(struct temp_regset *regset, i915_mmio_reg_offset((reg)), \ (masked) ? GUC_REGSET_MASKED : 0) -static void guc_mmio_regset_init(struct temp_regset *regset, -struct intel_engine_cs *engine) +static int guc_mmio_regset_init(struct temp_regset *regset, + struct intel_engine_cs *engine) { const u32 base = engine->mmio_base; struct i915_wa_list *wal = &engine->wa_list; struct i915_wa *wa; unsigned int i; + int ret = 0; regset->used = 0; - GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true); - GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false); - GUC_MMIO_REG_ADD(regset, RING_IMR(base), false); + ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true); + ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false); + ret |= GUC_MMIO_REG_ADD(regset, RING_IMR(base), false); for (i = 0, wa = wal->list; i < wal->count; i++, wa++) - GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg); + ret |= GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg); /* Be extra paranoid and include all whitelist registers. */ for (i = 0; i < RING_MAX_NONPRIV_SLOTS; i++) - GUC_MMIO_REG_ADD(regset, -RING_FORCE_TO_NONPRIV(base, i), -false); + ret |= GUC_MMIO_REG_ADD(regset, + RING_FORCE_TO_NONPRIV(base, i), + false); /* add in local MOCS registers */ for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++) - GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false); + ret |= GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false); + + return ret ? -1 : 0; } static int guc_mmio_reg_state_query(struct intel_guc *guc) -- 2.35.1
[PATCH 18/19] drm/i915/guc: Convert __guc_ads_init to iosys_map
Now that all the called functions from __guc_ads_init() are converted to use ads_map, stop using ads_blob in __guc_ads_init(). Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 -- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index cf6fafa1024c..6262fd4e0d4a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -613,7 +613,6 @@ static void __guc_ads_init(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); struct drm_i915_private *i915 = gt->i915; - struct __guc_ads_blob *blob = guc->ads_blob; struct iosys_map info_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offsetof(struct __guc_ads_blob, system_info)); u32 base; @@ -624,17 +623,18 @@ static void __guc_ads_init(struct intel_guc *guc) /* System info */ fill_engine_enable_masks(gt, &info_map); - blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED] = - hweight8(gt->info.sseu.slice_mask); - blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK] = - gt->info.vdbox_sfc_access; + ads_blob_write(guc, system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED], + hweight8(gt->info.sseu.slice_mask)); + ads_blob_write(guc, system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK], + gt->info.vdbox_sfc_access); if (GRAPHICS_VER(i915) >= 12 && !IS_DGFX(i915)) { u32 distdbreg = intel_uncore_read(gt->uncore, GEN12_DIST_DBS_POPULATED); - blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI] = - ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT) & -GEN12_DOORBELLS_PER_SQIDI) + 1; + ads_blob_write(guc, + system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI], + ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT) + & GEN12_DOORBELLS_PER_SQIDI) + 1); } /* Golden contexts for re-initialising after a watchdog reset */ @@ -648,14 +648,17 @@ static void __guc_ads_init(struct intel_guc *guc) guc_capture_list_init(guc); /* ADS */ - blob->ads.scheduler_policies = base + ptr_offset(blob, policies); - blob->ads.gt_system_info = base + ptr_offset(blob, system_info); + ads_blob_write(guc, ads.scheduler_policies, base + + offsetof(struct __guc_ads_blob, policies)); + ads_blob_write(guc, ads.gt_system_info, base + + offsetof(struct __guc_ads_blob, system_info)); /* MMIO save/restore list */ guc_mmio_reg_state_init(guc); /* Private Data */ - blob->ads.private_data = base + guc_ads_private_data_offset(guc); + ads_blob_write(guc, ads.private_data, base + + guc_ads_private_data_offset(guc)); i915_gem_object_flush_map(guc->ads_vma->obj); } -- 2.35.1
[PATCH 11/19] drm/i915/guc: Convert golden context prep to iosys_map
Use the saved ads_map to prepare the golden context. One difference from the init context is that this function can be called before there is a gem object (and thus the guc->ads_map) to calculare the size of the golden context that should be allocated for that object. So in this case the function needs to be prepared for not having the system_info with enabled engines filled out. To accomplish that an info_map is prepared on the side to point either to the gem object or the local variable on the stack. This allows making fill_engine_enable_masks() operate always with a iosys_map argument. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 52 +- 1 file changed, 32 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 973762ce2196..80fbab831536 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -67,6 +67,12 @@ struct __guc_ads_blob { iosys_map_wr_field(&(guc_)->ads_map, struct __guc_ads_blob, \ field_, val_) +#define info_map_write(map_, field_, val_) \ + iosys_map_wr_field(map_, struct guc_gt_system_info, field_, val_) + +#define info_map_read(map_, field_) \ + iosys_map_rd_field(map_, struct guc_gt_system_info, field_) + static u32 guc_ads_regset_size(struct intel_guc *guc) { GEM_BUG_ON(!guc->ads_regset_size); @@ -378,24 +384,24 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc, } static void fill_engine_enable_masks(struct intel_gt *gt, -struct guc_gt_system_info *info) +struct iosys_map *info_map) { - info->engine_enabled_masks[GUC_RENDER_CLASS] = 1; - info->engine_enabled_masks[GUC_BLITTER_CLASS] = 1; - info->engine_enabled_masks[GUC_VIDEO_CLASS] = VDBOX_MASK(gt); - info->engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = VEBOX_MASK(gt); + info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1); + info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1); + info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], VDBOX_MASK(gt)); + info_map_write(info_map, engine_enabled_masks[GUC_VIDEOENHANCE_CLASS], VEBOX_MASK(gt)); } #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32)) #define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE) -static int guc_prep_golden_context(struct intel_guc *guc, - struct __guc_ads_blob *blob) +static int guc_prep_golden_context(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); u32 addr_ggtt, offset; u32 total_size = 0, alloc_size, real_size; u8 engine_class, guc_class; - struct guc_gt_system_info *info, local_info; + struct guc_gt_system_info local_info; + struct iosys_map info_map; /* * Reserve the memory for the golden contexts and point GuC at it but @@ -409,14 +415,15 @@ static int guc_prep_golden_context(struct intel_guc *guc, * GuC will also validate that the LRC base + size fall within the * allowed GGTT range. */ - if (blob) { + if (!iosys_map_is_null(&guc->ads_map)) { offset = guc_ads_golden_ctxt_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; - info = &blob->system_info; + info_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map, +offsetof(struct __guc_ads_blob, system_info)); } else { memset(&local_info, 0, sizeof(local_info)); - info = &local_info; - fill_engine_enable_masks(gt, info); + iosys_map_set_vaddr(&info_map, &local_info); + fill_engine_enable_masks(gt, &info_map); } for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; ++engine_class) { @@ -425,14 +432,14 @@ static int guc_prep_golden_context(struct intel_guc *guc, guc_class = engine_class_to_guc_class(engine_class); - if (!info->engine_enabled_masks[guc_class]) + if (!info_map_read(&info_map, engine_enabled_masks[guc_class])) continue; real_size = intel_engine_context_size(gt, engine_class); alloc_size = PAGE_ALIGN(real_size); total_size += alloc_size; - if (!blob) + if (iosys_map_is_null(&guc->ads_map)) continue; /* @@ -446,12 +453,15 @@ static int guc_prep_golden_context(struct intel_guc *guc, * what comes before it in the context image (which is identical
[RFC PATCH 0/3] drm/helpers: Make the suballocation manager drm generic.
The suballocation manager itself is not dependent on any implementation detail, and can be made generic. I want to potentially use it inside i915, as it looks like a clean implementation to do so. :) Looking for feedback and some testing, as I don't have a amdgpu/radeon myself. Only compile tested so far, so some stupid bugs may remain. Maarten Lankhorst (3): drm: Extract amdgpu_sa.c as a generic suballocation helper drm/amd: Convert amdgpu to use suballocation helper. drm/radeon: Use the drm suballocation manager implementation. drivers/gpu/drm/Makefile | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 29 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 21 +- drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c | 320 +--- drivers/gpu/drm/drm_suballoc.c | 424 + drivers/gpu/drm/radeon/radeon.h| 55 +-- drivers/gpu/drm/radeon/radeon_ib.c | 10 +- drivers/gpu/drm/radeon/radeon_object.h | 23 +- drivers/gpu/drm/radeon/radeon_sa.c | 314 ++- drivers/gpu/drm/radeon/radeon_semaphore.c | 6 +- include/drm/drm_suballoc.h | 78 12 files changed, 595 insertions(+), 694 deletions(-) create mode 100644 drivers/gpu/drm/drm_suballoc.c create mode 100644 include/drm/drm_suballoc.h -- 2.34.1
[RFC PATCH 3/3] drm/radeon: Use the drm suballocation manager implementation.
Use the generic suballocation helper lifted from amdgpu. Note that the generic suballocator only allows a single alignment, so we may waste a few more bytes for radeon_semaphore, shouldn't be a big deal, could be re-added if needed. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/radeon/radeon.h | 55 +--- drivers/gpu/drm/radeon/radeon_ib.c| 10 +- drivers/gpu/drm/radeon/radeon_object.h| 23 +- drivers/gpu/drm/radeon/radeon_sa.c| 314 ++ drivers/gpu/drm/radeon/radeon_semaphore.c | 6 +- 5 files changed, 51 insertions(+), 357 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 895776c421d4..a6339c9e7c47 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -79,6 +79,7 @@ #include #include +#include #include "radeon_family.h" #include "radeon_mode.h" @@ -512,52 +513,12 @@ struct radeon_bo { }; #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, tbo.base) -/* sub-allocation manager, it has to be protected by another lock. - * By conception this is an helper for other part of the driver - * like the indirect buffer or semaphore, which both have their - * locking. - * - * Principe is simple, we keep a list of sub allocation in offset - * order (first entry has offset == 0, last entry has the highest - * offset). - * - * When allocating new object we first check if there is room at - * the end total_size - (last_object_offset + last_object_size) >= - * alloc_size. If so we allocate new object there. - * - * When there is not enough room at the end, we start waiting for - * each sub object until we reach object_offset+object_size >= - * alloc_size, this object then become the sub object we return. - * - * Alignment can't be bigger than page size. - * - * Hole are not considered for allocation to keep things simple. - * Assumption is that there won't be hole (all object on same - * alignment). - */ struct radeon_sa_manager { - wait_queue_head_t wq; - struct radeon_bo*bo; - struct list_head*hole; - struct list_headflist[RADEON_NUM_RINGS]; - struct list_headolist; - unsignedsize; - uint64_tgpu_addr; - void*cpu_ptr; - uint32_tdomain; - uint32_talign; -}; - -struct radeon_sa_bo; - -/* sub-allocation buffer */ -struct radeon_sa_bo { - struct list_headolist; - struct list_headflist; - struct radeon_sa_manager*manager; - unsignedsoffset; - unsignedeoffset; - struct radeon_fence *fence; + struct drm_suballoc_manager base; + struct radeon_bo*bo; + uint64_tgpu_addr; + void*cpu_ptr; + u32 domain; }; /* @@ -588,7 +549,7 @@ int radeon_mode_dumb_mmap(struct drm_file *filp, * Semaphores. */ struct radeon_semaphore { - struct radeon_sa_bo *sa_bo; + struct drm_suballoc *sa_bo; signed waiters; uint64_tgpu_addr; }; @@ -817,7 +778,7 @@ void radeon_irq_kms_disable_hpd(struct radeon_device *rdev, unsigned hpd_mask); */ struct radeon_ib { - struct radeon_sa_bo *sa_bo; + struct drm_suballoc *sa_bo; uint32_tlength_dw; uint64_tgpu_addr; uint32_t*ptr; diff --git a/drivers/gpu/drm/radeon/radeon_ib.c b/drivers/gpu/drm/radeon/radeon_ib.c index 62b116727b4f..bca2cbd27abf 100644 --- a/drivers/gpu/drm/radeon/radeon_ib.c +++ b/drivers/gpu/drm/radeon/radeon_ib.c @@ -61,7 +61,7 @@ int radeon_ib_get(struct radeon_device *rdev, int ring, { int r; - r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, &ib->sa_bo, size, 256); + r = radeon_sa_bo_new(&rdev->ring_tmp_bo, &ib->sa_bo, size); if (r) { dev_err(rdev->dev, "failed to get a new IB (%d)\n", r); return r; @@ -97,7 +97,7 @@ int radeon_ib_get(struct radeon_device *rdev, int ring, void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib) { radeon_sync_free(rdev, &ib->sync, ib->fence); - radeon_sa_bo_free(rdev, &ib->sa_bo, ib->fence); + radeon_sa_bo_free(&ib->sa_bo, ib->fence); radeon_fence_unref(&ib->fence); } @@ -201,8 +201,7 @@ int radeon_ib_pool_init(struct radeon_device *rdev) if (rdev->family >= CHIP_BONAIRE) { r = radeon_sa_bo_manager_init(rdev, &rdev->ring_tmp_bo, - RADEON_IB_POOL_SIZE*64*1024, - RADEON_GPU_PAGE_SIZE, + RADEON_IB_POOL_SIZE*6
[RFC PATCH 2/3] drm/amd: Convert amdgpu to use suballocation helper.
Now that the suballocation helper is generic, we can use it in amdgpu. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 29 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 21 +- drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c | 320 ++--- 4 files changed, 39 insertions(+), 336 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 9a53a4de2bb7..a8c7a7ef480c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -61,6 +61,7 @@ #include #include #include +#include #include #include "dm_pp_interface.h" @@ -417,29 +418,11 @@ struct amdgpu_clock { * alignment). */ -#define AMDGPU_SA_NUM_FENCE_LISTS 32 - struct amdgpu_sa_manager { - wait_queue_head_t wq; - struct amdgpu_bo*bo; - struct list_head*hole; - struct list_headflist[AMDGPU_SA_NUM_FENCE_LISTS]; - struct list_headolist; - unsignedsize; - uint64_tgpu_addr; - void*cpu_ptr; - uint32_tdomain; - uint32_talign; -}; - -/* sub-allocation buffer */ -struct amdgpu_sa_bo { - struct list_headolist; - struct list_headflist; - struct amdgpu_sa_manager*manager; - unsignedsoffset; - unsignedeoffset; - struct dma_fence*fence; + struct drm_suballoc_manager base; + struct amdgpu_bo*bo; + uint64_tgpu_addr; + void*cpu_ptr; }; int amdgpu_fence_slab_init(void); @@ -470,7 +453,7 @@ struct amdgpu_flip_work { */ struct amdgpu_ib { - struct amdgpu_sa_bo *sa_bo; + struct drm_suballoc *sa_bo; uint32_tlength_dw; uint64_tgpu_addr; uint32_t*ptr; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c index bc1297dcdf97..883828a4988c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c @@ -69,7 +69,7 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm, if (size) { r = amdgpu_sa_bo_new(&adev->ib_pools[pool_type], - &ib->sa_bo, size, 256); + &ib->sa_bo, size); if (r) { dev_err(adev->dev, "failed to get a new IB (%d)\n", r); return r; @@ -307,8 +307,7 @@ int amdgpu_ib_pool_init(struct amdgpu_device *adev) for (i = 0; i < AMDGPU_IB_POOL_MAX; i++) { r = amdgpu_sa_bo_manager_init(adev, &adev->ib_pools[i], - AMDGPU_IB_POOL_SIZE, - AMDGPU_GPU_PAGE_SIZE, + AMDGPU_IB_POOL_SIZE, 256, AMDGPU_GEM_DOMAIN_GTT); if (r) goto error; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h index 4c9cbdc66995..7db4fe1bc1d6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h @@ -337,15 +337,20 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev, /* * sub allocation */ +static inline struct amdgpu_sa_manager * +to_amdgpu_sa_manager(struct drm_suballoc_manager *manager) +{ + return container_of(manager, struct amdgpu_sa_manager, base); +} -static inline uint64_t amdgpu_sa_bo_gpu_addr(struct amdgpu_sa_bo *sa_bo) +static inline uint64_t amdgpu_sa_bo_gpu_addr(struct drm_suballoc *sa_bo) { - return sa_bo->manager->gpu_addr + sa_bo->soffset; + return to_amdgpu_sa_manager(sa_bo->manager)->gpu_addr + sa_bo->soffset; } -static inline void * amdgpu_sa_bo_cpu_addr(struct amdgpu_sa_bo *sa_bo) +static inline void * amdgpu_sa_bo_cpu_addr(struct drm_suballoc *sa_bo) { - return sa_bo->manager->cpu_ptr + sa_bo->soffset; + return to_amdgpu_sa_manager(sa_bo->manager)->cpu_ptr + sa_bo->soffset; } int amdgpu_sa_bo_manager_init(struct amdgpu_device *adev, @@ -356,11 +361,11 @@ void amdgpu_sa_bo_manager_fini(struct amdgpu_device *adev, int amdgpu_sa_bo_manager_start(struct amdgpu_device *adev, struct amdgpu_sa_manager *sa_manager); int amdgpu_sa_bo_new(struct amdgpu_sa_manager *sa_manager, -struct amdgpu_sa_bo **sa_bo, -unsigned size, unsigned align); +struct drm_suballoc **sa_bo, +unsigned siz
[RFC PATCH 1/3] drm: Extract amdgpu_sa.c as a generic suballocation helper
Suballocating a buffer object is something that is not driver generic, and is useful for other drivers as well. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/Makefile | 4 +- drivers/gpu/drm/drm_suballoc.c | 424 + include/drm/drm_suballoc.h | 78 ++ 3 files changed, 505 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/drm_suballoc.c create mode 100644 include/drm/drm_suballoc.h diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 8675c2af7ae1..b848bcf8790c 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -57,7 +57,9 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o \ drm_scdc_helper.o drm_gem_atomic_helper.o \ drm_gem_framebuffer_helper.o \ drm_atomic_state_helper.o drm_damage_helper.o \ - drm_format_helper.o drm_self_refresh_helper.o drm_rect.o + drm_format_helper.o drm_self_refresh_helper.o drm_rect.o \ + drm_suballoc.o + drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o diff --git a/drivers/gpu/drm/drm_suballoc.c b/drivers/gpu/drm/drm_suballoc.c new file mode 100644 index ..e0bb35367b71 --- /dev/null +++ b/drivers/gpu/drm/drm_suballoc.c @@ -0,0 +1,424 @@ +/* + * Copyright 2011 Red Hat Inc. + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE + * USE OR OTHER DEALINGS IN THE SOFTWARE. + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + */ +/* + * Authors: + *Jerome Glisse + */ +/* Algorithm: + * + * We store the last allocated bo in "hole", we always try to allocate + * after the last allocated bo. Principle is that in a linear GPU ring + * progression was is after last is the oldest bo we allocated and thus + * the first one that should no longer be in use by the GPU. + * + * If it's not the case we skip over the bo after last to the closest + * done bo if such one exist. If none exist and we are not asked to + * block we report failure to allocate. + * + * If we are asked to block we wait on all the oldest fence of all + * rings. We just wait for any of those fence to complete. + */ + +#include +#include +#include +#include +#include +#include + +static void drm_suballoc_remove_locked(struct drm_suballoc *sa); +static void drm_suballoc_try_free(struct drm_suballoc_manager *sa_manager); + +/** + * drm_suballoc_manager_init - Initialise the drm_suballoc_manager + * + * @sa_manager: pointer to the sa_manager + * @size: number of bytes we want to suballocate + * @align: alignment for each suballocated chunk + * + * Prepares the suballocation manager for suballocations. + */ +void drm_suballoc_manager_init(struct drm_suballoc_manager *sa_manager, + u32 size, u32 align) +{ + u32 i; + + if (!align) + align = 1; + + /* alignment must be a power of 2 */ + BUG_ON(align & (align - 1)); + + init_waitqueue_head(&sa_manager->wq); + sa_manager->size = size; + sa_manager->align = align; + sa_manager->hole = &sa_manager->olist; + INIT_LIST_HEAD(&sa_manager->olist); + for (i = 0; i < DRM_SUBALLOC_MAX_QUEUES; ++i) + INIT_LIST_HEAD(&sa_manager->flist[i]); +} +EXPORT_SYMBOL(drm_suballoc_manager_init); + +/** + * drm_suballoc_manager_fini - Destroy the drm_suballoc_manager + * + * @sa_manager: pointer to the sa_manager + * + * Cleans up the suballocation manager after use. All fences added + * with drm_suballoc_free() must be signaled, or we cannot clean up + * the entire manager. + */ +void drm_suballoc_manager_fini(struct drm_suballoc_manager *sa_manager) +{ + struct drm_suballoc *sa, *tmp; + + if (!sa_manager->size) + return; + + if (!list_empty(&sa_manager->olist)) { + sa_manager->hole = &sa_manager->olist, + drm
[PATCH] drm/nouveau/backlight: Fix LVDS backlight detection on some laptops
It seems that some laptops will report having both an eDP and LVDS connector, even though only the LVDS connector is actually hooked up. This can lead to issues with backlight registration if the eDP connector ends up getting registered before the LVDS connector, as the backlight device will then be registered to the eDP connector instead of the LVDS connector. So, fix this by only registering the backlight on connectors that are reported as being connected. Signed-off-by: Lyude Paul Fixes: 6eca310e8924 ("drm/nouveau/kms/nv50-: Add basic DPCD backlight support for nouveau") Bugzilla: https://gitlab.freedesktop.org/drm/nouveau/-/issues/137 Cc: # v5.15+ --- drivers/gpu/drm/nouveau/nouveau_backlight.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c b/drivers/gpu/drm/nouveau/nouveau_backlight.c index ae2f2abc8f5a..6af12dc99d7f 100644 --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c @@ -294,7 +294,8 @@ nv50_backlight_init(struct nouveau_backlight *bl, struct nouveau_drm *drm = nouveau_drm(nv_encoder->base.base.dev); struct nvif_object *device = &drm->client.device.object; - if (!nvif_rd32(device, NV50_PDISP_SOR_PWM_CTL(ffs(nv_encoder->dcb->or) - 1))) + if (!nvif_rd32(device, NV50_PDISP_SOR_PWM_CTL(ffs(nv_encoder->dcb->or) - 1)) || + nv_conn->base.status != connector_status_connected) return -ENODEV; if (nv_conn->type == DCB_CONNECTOR_eDP) { -- 2.34.1
Re: [RFC PATCH 1/3] drm: Extract amdgpu_sa.c as a generic suballocation helper
Oh, that's on my TODO list for years! Am 04.02.22 um 18:48 schrieb Maarten Lankhorst: Suballocating a buffer object is something that is not driver generic, and is useful for other drivers as well. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/Makefile | 4 +- drivers/gpu/drm/drm_suballoc.c | 424 + include/drm/drm_suballoc.h | 78 ++ 3 files changed, 505 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/drm_suballoc.c create mode 100644 include/drm/drm_suballoc.h diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 8675c2af7ae1..b848bcf8790c 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -57,7 +57,9 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o \ drm_scdc_helper.o drm_gem_atomic_helper.o \ drm_gem_framebuffer_helper.o \ drm_atomic_state_helper.o drm_damage_helper.o \ - drm_format_helper.o drm_self_refresh_helper.o drm_rect.o + drm_format_helper.o drm_self_refresh_helper.o drm_rect.o \ + drm_suballoc.o + I think we should put that into a separate module like we now do with other helpers as well. drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o diff --git a/drivers/gpu/drm/drm_suballoc.c b/drivers/gpu/drm/drm_suballoc.c new file mode 100644 index ..e0bb35367b71 --- /dev/null +++ b/drivers/gpu/drm/drm_suballoc.c @@ -0,0 +1,424 @@ +/* + * Copyright 2011 Red Hat Inc. + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE + * USE OR OTHER DEALINGS IN THE SOFTWARE. + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + */ +/* + * Authors: + *Jerome Glisse + */ That is hopelessly outdated. IIRC I completely rewrote that stuff in ~2012. +/* Algorithm: + * + * We store the last allocated bo in "hole", we always try to allocate + * after the last allocated bo. Principle is that in a linear GPU ring + * progression was is after last is the oldest bo we allocated and thus + * the first one that should no longer be in use by the GPU. + * + * If it's not the case we skip over the bo after last to the closest + * done bo if such one exist. If none exist and we are not asked to + * block we report failure to allocate. + * + * If we are asked to block we wait on all the oldest fence of all + * rings. We just wait for any of those fence to complete. + */ + +#include +#include +#include +#include +#include +#include + +static void drm_suballoc_remove_locked(struct drm_suballoc *sa); +static void drm_suballoc_try_free(struct drm_suballoc_manager *sa_manager); + +/** + * drm_suballoc_manager_init - Initialise the drm_suballoc_manager + * + * @sa_manager: pointer to the sa_manager + * @size: number of bytes we want to suballocate + * @align: alignment for each suballocated chunk + * + * Prepares the suballocation manager for suballocations. + */ +void drm_suballoc_manager_init(struct drm_suballoc_manager *sa_manager, + u32 size, u32 align) +{ + u32 i; + + if (!align) + align = 1; + + /* alignment must be a power of 2 */ + BUG_ON(align & (align - 1)); When we move that I think we should cleanup the code once more, e.g. use is_power_of_2() function here for example. There are also a bunch of places with extra {} and constructs like "if () return true; else return false;" which could certainly be simplified. Apart from that really great idea. Regards, Christian. + + init_waitqueue_head(&sa_manager->wq); + sa_manager->size = size; + sa_manager->align = align; + sa_manager->hole = &sa_manager->olist; + INIT_LIST_HEAD(&sa_manager->olist); + for (i = 0; i < DRM_SUBALLOC_MAX_QUEUES; ++i) + INIT_LIST_HEAD(&sa_manager->flist[i]); +} +EXPORT_SYMBOL(drm_
Re: [PATCH 02/19] iosys-map: Add offset to iosys_map_memcpy_to()
Am 04.02.22 um 18:44 schrieb Lucas De Marchi: In certain situations it's useful to be able to write to an offset of the mapping. Add a dst_offset to iosys_map_memcpy_to(). Cc: Sumit Semwal Cc: Christian König Cc: Thomas Zimmermann Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Lucas De Marchi Reviewed-by: Christian König --- drivers/gpu/drm/drm_cache.c | 2 +- drivers/gpu/drm/drm_fb_helper.c | 2 +- include/linux/iosys-map.h | 17 + 3 files changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c index 66597e411764..c3e6e615bf09 100644 --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -218,7 +218,7 @@ static void memcpy_fallback(struct iosys_map *dst, if (!dst->is_iomem && !src->is_iomem) { memcpy(dst->vaddr, src->vaddr, len); } else if (!src->is_iomem) { - iosys_map_memcpy_to(dst, src->vaddr, len); + iosys_map_memcpy_to(dst, 0, src->vaddr, len); } else if (!dst->is_iomem) { memcpy_fromio(dst->vaddr, src->vaddr_iomem, len); } else { diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 238f815cb2a0..bf5cc9a42e5a 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -385,7 +385,7 @@ static void drm_fb_helper_damage_blit_real(struct drm_fb_helper *fb_helper, iosys_map_incr(dst, offset); /* go to first pixel within clip rect */ for (y = clip->y1; y < clip->y2; y++) { - iosys_map_memcpy_to(dst, src, len); + iosys_map_memcpy_to(dst, 0, src, len); iosys_map_incr(dst, fb->pitches[0]); src += fb->pitches[0]; } diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h index f4186f91caa6..edd7fa3be9e9 100644 --- a/include/linux/iosys-map.h +++ b/include/linux/iosys-map.h @@ -220,22 +220,23 @@ static inline void iosys_map_clear(struct iosys_map *map) } /** - * iosys_map_memcpy_to - Memcpy into iosys mapping + * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map * @dst: The iosys_map structure + * @dst_offset:The offset from which to copy * @src: The source buffer * @len: The number of byte in src * - * Copies data into a iosys mapping. The source buffer is in system - * memory. Depending on the buffer's location, the helper picks the correct - * method of accessing the memory. + * Copies data into a iosys_map with an offset. The source buffer is in + * system memory. Depending on the buffer's location, the helper picks the + * correct method of accessing the memory. */ -static inline void iosys_map_memcpy_to(struct iosys_map *dst, const void *src, - size_t len) +static inline void iosys_map_memcpy_to(struct iosys_map *dst, size_t dst_offset, + const void *src, size_t len) { if (dst->is_iomem) - memcpy_toio(dst->vaddr_iomem, src, len); + memcpy_toio(dst->vaddr_iomem + dst_offset, src, len); else - memcpy(dst->vaddr, src, len); + memcpy(dst->vaddr + dst_offset, src, len); } /**
[PATCH v3 0/3] enable widebus feature base on chip hardware revision
split into 3 patches 1) widebus timing engine programming 2) dsc timing engine 3) enable widebus feature base on chip hardware revision Kuogee Hsieh (3): drm/msm/dp: revise timing engine programming to support widebus feature drm/msm/dp: revise timing engine programming to support compression (DSC) drm/msm/dp: enable widebus feature for display port drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c| 14 ++- drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h| 2 + .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 14 +++ drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c| 107 +++-- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h| 5 + drivers/gpu/drm/msm/dp/dp_catalog.c| 36 ++- drivers/gpu/drm/msm/dp/dp_catalog.h| 3 +- drivers/gpu/drm/msm/dp/dp_ctrl.c | 13 ++- drivers/gpu/drm/msm/dp/dp_ctrl.h | 1 + drivers/gpu/drm/msm/dp/dp_display.c| 30 ++ drivers/gpu/drm/msm/dp/dp_display.h| 2 + drivers/gpu/drm/msm/dp/dp_panel.c | 4 +- drivers/gpu/drm/msm/dp/dp_panel.h | 2 +- drivers/gpu/drm/msm/msm_drv.h | 6 ++ 14 files changed, 197 insertions(+), 42 deletions(-) -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v3 1/3] drm/msm/dp: revise timing engine programming to support widebus feature
Widebus feature will transmit two pixel data per pixel clock to interface. Timing engine provides driving force for this purpose. This patch base on HPG (Hardware Programming Guide) to revise timing engine register setting to accommodate both widebus and non widebus application. Also horizontal width parameters need to be reduced by half since two pixel data are clocked out per pixel clock when widebus feature enabled. In addition, revised timing engine function is an generic function and intend to be shared by all platforms to reduce maintenance efforts. Changes in v2: -- remove compression related code from timing -- remove op_info from struct msm_drm_private -- remove unnecessary wide_bus_en variables -- pass wide_bus_en into timing configuration by struct msm_dp Changes in v3: -- split patch into 3 patches Signed-off-by: Kuogee Hsieh --- drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c| 10 +++ drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h| 2 + .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 14 +++ drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c| 99 ++ drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h| 2 + 5 files changed, 93 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c index 0d315b4..0c22839 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c @@ -208,6 +208,8 @@ struct dpu_encoder_virt { u32 idle_timeout; + bool wide_bus_en; + struct msm_dp *dp; }; @@ -217,6 +219,14 @@ static u32 dither_matrix[DITHER_MATRIX_SZ] = { 15, 7, 13, 5, 3, 11, 1, 9, 12, 4, 14, 6, 0, 8, 2, 10 }; + +bool dpu_encoder_is_widebus_enabled(struct drm_encoder *drm_enc) +{ + struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc); + + return dpu_enc->wide_bus_en; +} + static void _dpu_encoder_setup_dither(struct dpu_hw_pingpong *hw_pp, unsigned bpc) { struct dpu_hw_dither_cfg dither_cfg = { 0 }; diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h index 99a5d73..893d74d 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h @@ -168,4 +168,6 @@ int dpu_encoder_get_linecount(struct drm_encoder *drm_enc); */ int dpu_encoder_get_frame_count(struct drm_encoder *drm_enc); +bool dpu_encoder_is_widebus_enabled(struct drm_encoder *drm_enc); + #endif /* __DPU_ENCODER_H__ */ diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c index 185379b..3d6c914 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c @@ -110,6 +110,20 @@ static void drm_mode_to_intf_timing_params( timing->v_back_porch += timing->v_front_porch; timing->v_front_porch = 0; } + + timing->wide_bus_en = dpu_encoder_is_widebus_enabled(phys_enc->parent); + + /* +* for DP, divide the horizonal parameters by 2 when +* widebus is enabled +*/ + if (timing->wide_bus_en) { + timing->width = timing->width >> 1; + timing->xres = timing->xres >> 1; + timing->h_back_porch = timing->h_back_porch >> 1; + timing->h_front_porch = timing->h_front_porch >> 1; + timing->hsync_pulse_width = timing->hsync_pulse_width >> 1; + } } static u32 get_horizontal_total(const struct intf_timing_params *timing) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c index 116e2b5..35d4aaa 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c @@ -33,6 +33,7 @@ #define INTF_TP_COLOR1 0x05C #define INTF_CONFIG20x060 #define INTF_DISPLAY_DATA_HCTL 0x064 +#define INTF_ACTIVE_DATA_HCTL 0x068 #define INTF_FRAME_LINE_COUNT_EN0x0A8 #define INTF_FRAME_COUNT0x0AC #define INTF_LINE_COUNT 0x0B0 @@ -90,68 +91,95 @@ static void dpu_hw_intf_setup_timing_engine(struct dpu_hw_intf *ctx, u32 hsync_period, vsync_period; u32 display_v_start, display_v_end; u32 hsync_start_x, hsync_end_x; + u32 hsync_data_start_x, hsync_data_end_x; u32 active_h_start, active_h_end; u32 active_v_start, active_v_end; u32 active_hctl, display_hctl, hsync_ctl; u32 polarity_ctl, den_polarity, hsync_polarity, vsync_polarity; u32 panel_format; - u32 intf_cfg, intf_cfg2 = 0, display_data_hctl = 0; + u32 intf_cfg, intf_cfg2 = 0; + u32 display_data_hctl = 0, active_data_hctl = 0; + u32 data_width; + bool dp_intf = false; /* read interface_cfg */ intf_cfg = DPU_REG_READ(c, INTF_CO
[PATCH v3 2/3] drm/msm/dp: revise timing engine programming to support compression (DSC)
Divides horizontal width by 3 at timing engine of interface. There are major part of compression (DSC) programming have to be done at DSC controller which is not covered by this patch. Signed-off-by: Kuogee Hsieh --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c | 22 ++ drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h | 3 +++ 2 files changed, 21 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c index 35d4aaa..ee7ca34 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c @@ -128,7 +128,7 @@ static void dpu_hw_intf_setup_timing_engine(struct dpu_hw_intf *ctx, * video timing. It is recommended to enable it for all cases, except * if compression is enabled in 1 pixel per clock mode */ - if (p->wide_bus_en) + if (!p->compression_en || p->wide_bus_en) intf_cfg2 |= BIT(4); if (p->wide_bus_en) @@ -150,10 +150,16 @@ static void dpu_hw_intf_setup_timing_engine(struct dpu_hw_intf *ctx, */ data_width = p->width; - if (!dp_intf && p->wide_bus_en) + if (p->compression_en) { + data_width = DIV_ROUND_UP(p->dce_bytes_per_line, 3); + + if (p->wide_bus_en) + data_width >>= 1; + } else if (!dp_intf && p->wide_bus_en) { data_width = p->width >> 1; - else + } else { data_width = p->width; + } hsync_data_start_x = hsync_start_x; hsync_data_end_x = hsync_start_x + data_width - 1; @@ -178,8 +184,16 @@ static void dpu_hw_intf_setup_timing_engine(struct dpu_hw_intf *ctx, active_hctl = (active_h_end << 16) | active_h_start; - if (dp_intf) + if (dp_intf) { display_hctl = active_hctl; + if (p->compression_en) { + active_data_hctl = (hsync_start_x + + p->extra_dto_cycles) << 16; + active_data_hctl += hsync_start_x; + + display_data_hctl = active_data_hctl; + } + } den_polarity = 0; if (ctx->cap->type == INTF_HDMI) { diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h index e4a518a..8fc71ce 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h @@ -32,6 +32,9 @@ struct intf_timing_params { u32 hsync_skew; bool wide_bus_en; + bool compression_en; + u32 extra_dto_cycles; /* for DP only */ + u32 dce_bytes_per_line; }; struct intf_prog_fetch { -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v3 3/3] drm/msm/dp: enable widebus feature for display port
Widebus feature will transmit two pixel data per pixel clock to interface. This feature now is required to be enabled to easy migrant to higher resolution applications in future. However since some legacy chipsets does not support this feature, this feature is enabled base on chip's hardware revision. changes in v2: -- remove compression related code from timing -- remove op_info from struct msm_drm_private -- remove unnecessary wide_bus_en variables -- pass wide_bus_en into timing configuration by struct msm_dp Changes in v3: -- split patch into 3 patches -- enable widebus feature base on chip hardware revision Signed-off-by: Kuogee Hsieh --- drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 4 +++- drivers/gpu/drm/msm/dp/dp_catalog.c | 36 +++-- drivers/gpu/drm/msm/dp/dp_catalog.h | 3 ++- drivers/gpu/drm/msm/dp/dp_ctrl.c| 13 +++ drivers/gpu/drm/msm/dp/dp_ctrl.h| 1 + drivers/gpu/drm/msm/dp/dp_display.c | 30 drivers/gpu/drm/msm/dp/dp_display.h | 2 ++ drivers/gpu/drm/msm/dp/dp_panel.c | 4 ++-- drivers/gpu/drm/msm/dp/dp_panel.h | 2 +- drivers/gpu/drm/msm/msm_drv.h | 6 + 10 files changed, 90 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c index 0c22839..b2d23c2 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c @@ -2167,8 +2167,10 @@ int dpu_encoder_setup(struct drm_device *dev, struct drm_encoder *enc, timer_setup(&dpu_enc->vsync_event_timer, dpu_encoder_vsync_event_handler, 0); - else if (disp_info->intf_type == DRM_MODE_ENCODER_TMDS) + else if (disp_info->intf_type == DRM_MODE_ENCODER_TMDS) { dpu_enc->dp = priv->dp[disp_info->h_tile_instance[0]]; + dpu_enc->wide_bus_en = msm_dp_wide_bus_enable(dpu_enc->dp); + } INIT_DELAYED_WORK(&dpu_enc->delayed_off_work, dpu_encoder_off_work); diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c b/drivers/gpu/drm/msm/dp/dp_catalog.c index 64f0b26..99d087e 100644 --- a/drivers/gpu/drm/msm/dp/dp_catalog.c +++ b/drivers/gpu/drm/msm/dp/dp_catalog.c @@ -483,6 +483,27 @@ int dp_catalog_ctrl_set_pattern_state_bit(struct dp_catalog *dp_catalog, } /** + * dp_catalog_hw_revision() - retrieve DP hw revision + * + * @dp_catalog: DP catalog structure + * + * return: u32 + * + * This function return the DP controller hw revision + * + */ +u32 dp_catalog_hw_revision(struct dp_catalog *dp_catalog) +{ + u32 revision; + struct dp_catalog_private *catalog = container_of(dp_catalog, + struct dp_catalog_private, dp_catalog); + + revision = dp_read_ahb(catalog, REG_DP_HW_VERSION); + + return revision; +} + +/** * dp_catalog_ctrl_reset() - reset DP controller * * @dp_catalog: DP catalog structure @@ -739,10 +760,11 @@ u32 dp_catalog_ctrl_read_phy_pattern(struct dp_catalog *dp_catalog) } /* panel related catalog functions */ -int dp_catalog_panel_timing_cfg(struct dp_catalog *dp_catalog) +int dp_catalog_panel_timing_cfg(struct dp_catalog *dp_catalog, bool wide_bus_en) { struct dp_catalog_private *catalog = container_of(dp_catalog, struct dp_catalog_private, dp_catalog); + u32 reg; dp_write_link(catalog, REG_DP_TOTAL_HOR_VER, dp_catalog->total); @@ -751,7 +773,17 @@ int dp_catalog_panel_timing_cfg(struct dp_catalog *dp_catalog) dp_write_link(catalog, REG_DP_HSYNC_VSYNC_WIDTH_POLARITY, dp_catalog->width_blanking); dp_write_link(catalog, REG_DP_ACTIVE_HOR_VER, dp_catalog->dp_active); - dp_write_p0(catalog, MMSS_DP_INTF_CONFIG, 0); + + reg = dp_read_p0(catalog, MMSS_DP_INTF_CONFIG); + + if (wide_bus_en) + reg |= BIT(4); /* DATABUS_WIDEN */ + else + reg &= ~BIT(4); + + DRM_DEBUG_DP("wide_bus_en=%d reg=%x\n", wide_bus_en, reg); + + dp_write_p0(catalog, MMSS_DP_INTF_CONFIG, reg); return 0; } diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.h b/drivers/gpu/drm/msm/dp/dp_catalog.h index 7dea101..a3a0129 100644 --- a/drivers/gpu/drm/msm/dp/dp_catalog.h +++ b/drivers/gpu/drm/msm/dp/dp_catalog.h @@ -95,6 +95,7 @@ void dp_catalog_ctrl_config_misc(struct dp_catalog *dp_catalog, u32 cc, u32 tb); void dp_catalog_ctrl_config_msa(struct dp_catalog *dp_catalog, u32 rate, u32 stream_rate_khz, bool fixed_nvid); int dp_catalog_ctrl_set_pattern_state_bit(struct dp_catalog *dp_catalog, u32 pattern); +u32 dp_catalog_hw_revision(struct dp_catalog *dp_catalog); void dp_catalog_ctrl_reset(struct dp_catalog *dp_catalog); bool dp_catalog_ctrl_ma
Re: [PATCH 02/19] iosys-map: Add offset to iosys_map_memcpy_to()
Hi Am 04.02.22 um 18:44 schrieb Lucas De Marchi: In certain situations it's useful to be able to write to an offset of the mapping. Add a dst_offset to iosys_map_memcpy_to(). Cc: Sumit Semwal Cc: Christian König Cc: Thomas Zimmermann Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/drm_cache.c | 2 +- drivers/gpu/drm/drm_fb_helper.c | 2 +- include/linux/iosys-map.h | 17 + 3 files changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c index 66597e411764..c3e6e615bf09 100644 --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -218,7 +218,7 @@ static void memcpy_fallback(struct iosys_map *dst, if (!dst->is_iomem && !src->is_iomem) { memcpy(dst->vaddr, src->vaddr, len); } else if (!src->is_iomem) { - iosys_map_memcpy_to(dst, src->vaddr, len); + iosys_map_memcpy_to(dst, 0, src->vaddr, len); } else if (!dst->is_iomem) { memcpy_fromio(dst->vaddr, src->vaddr_iomem, len); } else { diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 238f815cb2a0..bf5cc9a42e5a 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -385,7 +385,7 @@ static void drm_fb_helper_damage_blit_real(struct drm_fb_helper *fb_helper, iosys_map_incr(dst, offset); /* go to first pixel within clip rect */ for (y = clip->y1; y < clip->y2; y++) { - iosys_map_memcpy_to(dst, src, len); + iosys_map_memcpy_to(dst, 0, src, len); iosys_map_incr(dst, fb->pitches[0]); src += fb->pitches[0]; } diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h index f4186f91caa6..edd7fa3be9e9 100644 --- a/include/linux/iosys-map.h +++ b/include/linux/iosys-map.h @@ -220,22 +220,23 @@ static inline void iosys_map_clear(struct iosys_map *map) } /** - * iosys_map_memcpy_to - Memcpy into iosys mapping + * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map 'iosys_map_memcpy_to' With that fixed: Reviewed-by: Thomas Zimmermann Best regards Thomas * @dst: The iosys_map structure + * @dst_offset:The offset from which to copy * @src: The source buffer * @len: The number of byte in src * - * Copies data into a iosys mapping. The source buffer is in system - * memory. Depending on the buffer's location, the helper picks the correct - * method of accessing the memory. + * Copies data into a iosys_map with an offset. The source buffer is in + * system memory. Depending on the buffer's location, the helper picks the + * correct method of accessing the memory. */ -static inline void iosys_map_memcpy_to(struct iosys_map *dst, const void *src, - size_t len) +static inline void iosys_map_memcpy_to(struct iosys_map *dst, size_t dst_offset, + const void *src, size_t len) { if (dst->is_iomem) - memcpy_toio(dst->vaddr_iomem, src, len); + memcpy_toio(dst->vaddr_iomem + dst_offset, src, len); else - memcpy(dst->vaddr, src, len); + memcpy(dst->vaddr + dst_offset, src, len); } /** -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v3 2/2] drm/i915/uapi: Add query for hwconfig table
On 2/4/2022 01:55, Daniel Vetter wrote: On Wed, Jan 19, 2022 at 9:35 PM wrote: From: Rodrigo Vivi GuC contains a consolidated table with a bunch of information about the current device. Previously, this information was spread and hardcoded to all the components including GuC, i915 and various UMDs. The goal here is to consolidate the data into GuC in a way that all interested components can grab the very latest and synchronized information using a simple query. As per most of the other queries, this one can be called twice. Once with item.length=0 to determine the exact buffer size, then allocate the user memory and call it again for to retrieve the table data. For example: struct drm_i915_query_item item = { .query_id = DRM_I915_QUERY_HWCONCFIG_TABLE; }; query.items_ptr = (int64_t) &item; query.num_items = 1; ioctl(fd, DRM_IOCTL_I915_QUERY, query, sizeof(query)); if (item.length <= 0) return -ENOENT; data = malloc(item.length); item.data_ptr = (int64_t) &data; ioctl(fd, DRM_IOCTL_I915_QUERY, query, sizeof(query)); // Parse the data as appropriate... The returned array is a simple and flexible KLV (Key/Length/Value) formatted table. For example, it could be just: enum device_attr { ATTR_SOME_VALUE = 0, ATTR_SOME_MASK = 1, }; static const u32 hwconfig[] = { ATTR_SOME_VALUE, 1, // Value Length in DWords 8, // Value ATTR_SOME_MASK, 3, 0x00, 0x, 0xFF00, }; The attribute ids are defined in a hardware spec. Cc: Tvrtko Ursulin Cc: Kenneth Graunke Cc: Michal Wajdeczko Cc: Slawomir Milczarek Signed-off-by: Rodrigo Vivi Signed-off-by: John Harrison Reviewed-by: Matthew Brost --- drivers/gpu/drm/i915/i915_query.c | 23 +++ include/uapi/drm/i915_drm.h | 1 + 2 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index 2dfbc22857a3..609e64d5f395 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -479,12 +479,35 @@ static int query_memregion_info(struct drm_i915_private *i915, return total_length; } +static int query_hwconfig_table(struct drm_i915_private *i915, + struct drm_i915_query_item *query_item) +{ + struct intel_gt *gt = to_gt(i915); + struct intel_guc_hwconfig *hwconfig = >->uc.guc.hwconfig; + + if (!hwconfig->size || !hwconfig->ptr) + return -ENODEV; + + if (query_item->length == 0) + return hwconfig->size; + + if (query_item->length < hwconfig->size) + return -EINVAL; + + if (copy_to_user(u64_to_user_ptr(query_item->data_ptr), +hwconfig->ptr, hwconfig->size)) + return -EFAULT; + + return hwconfig->size; +} + static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv, struct drm_i915_query_item *query_item) = { query_topology_info, query_engine_info, query_perf_config, query_memregion_info, + query_hwconfig_table, }; int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 914ebd9290e5..132515199f27 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -2685,6 +2685,7 @@ struct drm_i915_query_item { #define DRM_I915_QUERY_ENGINE_INFO 2 #define DRM_I915_QUERY_PERF_CONFIG 3 #define DRM_I915_QUERY_MEMORY_REGIONS 4 +#define DRM_I915_QUERY_HWCONFIG_TABLE 5 /* Must be kept compact -- no holes and well documented */ New uapi needs kerneldoc in the uapi header, and please fill in any gaps you have (i.e. if the query uapi this is built on top of isn't fully documented yet). Also this holds across the board, so please keep in mind in patch review. -Daniel There is no extra documentation to add. The query interface itself is already documented. This new query does not have any kernel defined data structures associated with it. There is just 'struct drm_i915_query_item' with a length and a pointer, all of which are fully documented. John.
Re: [PATCH 03/19] iosys-map: Add a few more helpers
Hi Am 04.02.22 um 18:44 schrieb Lucas De Marchi: First the simplest ones: - iosys_map_memset(): when abstracting system and I/O memory, just like the memcpy() use case, memset() also has dedicated functions to be called for using IO memory. - iosys_map_memcpy_from(): we may need to copy data from I/O memory, not only to. In certain situations it's useful to be able to read or write to an offset that is calculated by having the memory layout given by a struct declaration. Usually we are going to read/write a u8, u16, u32 or u64. As a pre-requisite for the implementation, add iosys_map_memcpy_from() to be the equivalent of iosys_map_memcpy_to(), but in the other direction. Then add 2 pairs of macros: - iosys_map_rd() / iosys_map_wr() - iosys_map_rd_field() / iosys_map_wr_field() The first pair takes the C-type and offset to read/write. The second pair uses a struct describing the layout of the mapping in order to calculate the offset and size being read/written. We could use readb, readw, readl, readq and the write* counterparts, however due to alignment issues this may not work on all architectures. If alignment needs to be checked to call the right function, it's not possible to decide at compile-time which function to call: so just leave the decision to the memcpy function that will do exactly that. Finally, in order to use the above macros with a map derived from another, add another initializer: IOSYS_MAP_INIT_OFFSET(). Cc: Sumit Semwal Cc: Christian König Cc: Thomas Zimmermann Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Lucas De Marchi --- include/linux/iosys-map.h | 154 +- 1 file changed, 153 insertions(+), 1 deletion(-) diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h index edd7fa3be9e9..96f8b61ac6fb 100644 --- a/include/linux/iosys-map.h +++ b/include/linux/iosys-map.h @@ -6,6 +6,7 @@ #ifndef __IOSYS_MAP_H__ #define __IOSYS_MAP_H__ +#include #include #include @@ -133,6 +134,45 @@ static inline void iosys_map_set_vaddr(struct iosys_map *map, void *vaddr) map->is_iomem = false; } +/** + * IOSYS_MAP_INIT_OFFSET - Initializes struct iosys_map from another iosys_map + * @map_: The dma-buf mapping structure to copy from + * @offset_: Offset to add to the other mapping + * + * Initializes a new iosys_map struct based on another passed as argument. It + * does a shallow copy of the struct so it's possible to update the back storage + * without changing where the original map points to. It is the equivalent of + * doing: + * + * .. code-block: c + * + * iosys_map map = other_map; + * iosys_map_incr(&map, &offset); + * + * Example usage: + * + * .. code-block: c + * + * void foo(struct device *dev, struct iosys_map *base_map) + * { + * ... + * struct iosys_map map = IOSYS_MAP_INIT_OFFSET(base_map, FIELD_OFFSET); + * ... + * } + * + * The advantage of using the initializer over just increasing the offset with + * ``iosys_map_incr()`` like above is that the new map will always point to the + * right place of the buffer during its scope. It reduces the risk of updating + * the wrong part of the buffer and having no compiler warning about that. If + * the assignment to IOSYS_MAP_INIT_OFFSET() is forgotten, the compiler can warn + * using a uninitialized variable. + */ +#define IOSYS_MAP_INIT_OFFSET(map_, offset_) (struct iosys_map) \ + { \ + .vaddr = (map_)->vaddr + (offset_), \ + .is_iomem = (map_)->is_iomem,\ + } I already nak'ed this macro. This works because of the aliasing rules within the union and the fact that there are only plain pointers. But this is fragile. Do anything more complicated and it breaks. There's not even a test that would tell you that it failed. Therefore, struct iosys_map should only be initialized by the code that creates the stored pointer. However, you won't need the offset'ed iosys_map because the memcpy_to/from helpers now have the offset parameter. + /** * iosys_map_set_vaddr_iomem - Sets a iosys mapping structure to an address in I/O memory * @map: The iosys_map structure @@ -220,7 +260,7 @@ static inline void iosys_map_clear(struct iosys_map *map) } /** - * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map + * iosys_map_memcpy_to - Memcpy into iosys_map That's the fix for the other patch. :) * @dst: The iosys_map structure * @dst_offset: The offset from which to copy * @src: The source buffer @@ -239,6 +279,26 @@ static inline void iosys_map_memcpy_to(struct iosys_map *dst, size_t dst_offset, memcpy(dst->vaddr + dst_offset, src, len); } +/** + *
Re: [PATCH 04/19] drm/i915/gt: Add helper for shmem copy to iosys_map
Hi Am 04.02.22 um 18:44 schrieb Lucas De Marchi: Add a variant of shmem_read() that takes a iosys_map pointer rather than a plain pointer as argument. It's mostly a copy __shmem_rw() but adapting the api and removing the write support since there's currently only need to use iosys_map as destination. Reworking __shmem_rw() to share the implementation was tempting, but finding a good balance between reuse and clarity pushed towards a little code duplication. Since the function is small, just add the similar function with a copy/paste/adapt approach. Cc: Matt Roper Cc: Joonas Lahtinen Cc: Tvrtko Ursulin Cc: David Airlie Cc: Daniel Vetter Cc: Matthew Auld Cc: Thomas Hellström Cc: Maarten Lankhorst Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/shmem_utils.c | 33 +++ drivers/gpu/drm/i915/gt/shmem_utils.h | 3 +++ 2 files changed, 36 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c index 0683b27a3890..764adefdb4be 100644 --- a/drivers/gpu/drm/i915/gt/shmem_utils.c +++ b/drivers/gpu/drm/i915/gt/shmem_utils.c @@ -3,6 +3,7 @@ * Copyright © 2020 Intel Corporation */ +#include #include #include #include @@ -123,6 +124,38 @@ static int __shmem_rw(struct file *file, loff_t off, return 0; } Here's a good example of how to avoid iosys_map_incr() and use the memcpy offset: +int shmem_read_to_iosys_map(struct file *file, loff_t off, + struct iosys_map *map, size_t len) +{ + struct iosys_map map_iter = *map; Rather replace map_iter with something like unsigned long map_off = 0; + unsigned long pfn; + + for (pfn = off >> PAGE_SHIFT; len; pfn++) { + unsigned int this = + min_t(size_t, PAGE_SIZE - offset_in_page(off), len); + struct page *page; + void *vaddr; + + page = shmem_read_mapping_page_gfp(file->f_mapping, pfn, + GFP_KERNEL); + if (IS_ERR(page)) + return PTR_ERR(page); + + vaddr = kmap(page); + iosys_map_memcpy_to(&map_iter, 0, vaddr + offset_in_page(off), + this); Use map_off to index into map directly. + mark_page_accessed(page); + kunmap(page); + put_page(page); + + len -= this; + iosys_map_incr(&map_iter, this); Raplace iosys_map_incr() with map_off += this; + off = 0; Maybe off += this ? I think this pattern should be applied to all similar code. As you already noted, iosys_map_incr() is problematic. Best regards Thomas + } + + return 0; +} + int shmem_read(struct file *file, loff_t off, void *dst, size_t len) { return __shmem_rw(file, off, dst, len, false); diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.h b/drivers/gpu/drm/i915/gt/shmem_utils.h index c1669170c351..e1784999faee 100644 --- a/drivers/gpu/drm/i915/gt/shmem_utils.h +++ b/drivers/gpu/drm/i915/gt/shmem_utils.h @@ -8,6 +8,7 @@ #include +struct iosys_map; struct drm_i915_gem_object; struct file; @@ -17,6 +18,8 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj); void *shmem_pin_map(struct file *file); void shmem_unpin_map(struct file *file, void *ptr); +int shmem_read_to_iosys_map(struct file *file, loff_t off, + struct iosys_map *map, size_t len); int shmem_read(struct file *file, loff_t off, void *dst, size_t len); int shmem_write(struct file *file, loff_t off, void *src, size_t len); -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v2 2/4] drm/tiny: Add driver for Solomon SSD130X OLED displays
Hello Andy, Thanks for your feedback. On 2/4/22 15:26, Andy Shevchenko wrote: > On Fri, Feb 04, 2022 at 02:43:45PM +0100, Javier Martinez Canillas wrote: >> Add a DRM driver for SSD1305, SSD1306, SSD1307 and SSD1309 Solomon OLED >> controllers that can be programmed via an I2C interface. This is a port >> of the ssd1307fb driver that already supports these devices. >> >> A Device Tree binding is not added because the DRM driver is compatible >> with the existing binding for the ssd1307fb driver. > > ... > >> +/* >> + * DRM driver for Solomon SSD130X OLED displays >> + * >> + * Copyright 2022 Red Hat Inc. >> + * >> + * Based on drivers/video/fbdev/ssd1307fb.c >> + * Copyright 2012 Free Electrons > >> + * > > No need for this blank line. > Ok. >> + */ > > ... > >> +struct ssd130x_device { >> +struct drm_device drm; >> +struct drm_simple_display_pipe pipe; >> +struct drm_display_mode mode; >> +struct drm_connector connector; > > >> +struct i2c_client *client; > > Can we logically separate hw protocol vs hw interface from day 1, please? > This will allow to add SPI support for this panel much easier. > > Technically I would like to see here > > struct device *dev; > > and probably (I haven't looked into design) > > struct ssd130x_ops *ops; > > or something alike. > Sure. I wanted to keep the driver simple, making the writes bus agnostic and adding a level of indirection would make it more complex. But I agree that it will also make easier to add more buses later. I will do that for v3. [snip] > >> +static inline int ssd130x_write_value(struct i2c_client *client, u8 value) > > Not sure inline does anything useful here. > Ditto for the rest similar cases. > Ok, I'll drop them. > ... > >> +static inline int ssd130x_write_cmd(struct i2c_client *client, int count, >> +/* u8 cmd, u8 value, ... */...) >> +{ >> +va_list ap; >> +u8 value; >> +int ret; >> + >> +va_start(ap, count); > >> +while (count--) { >> +value = va_arg(ap, int); >> +ret = ssd130x_write_value(client, (u8)value); >> +if (ret) >> +goto out_end; >> +} > > I'm wondering if this can be written in a form > > do { > ... > } while (--count); > > In this case it will give a hint that count can't be 0. > Sure, I don't have a strong preference. I will change it. [snip] >> +ssd130x->pwm = pwm_get(dev, NULL); >> +if (IS_ERR(ssd130x->pwm)) { >> +dev_err(dev, "Could not get PWM from device tree!\n"); > > "device tree" is a bit confusing here if I run this on ACPI. > Maybe something like "firmware description"? > Indeed. >> +return PTR_ERR(ssd130x->pwm); >> +} > > ... > >> +/* Set initial contrast */ >> +ret = ssd130x_write_cmd(ssd130x->client, 2, SSD130X_CONTRAST, >> ssd130x->contrast); > > Creating a local variable for client allows to: > - make lines shorter and might even be less LOCs > - allow to convert struct device to client in one place > (as per my above comment) > > Ditto for other similar cases. > Ok. [snip] >> +/* Switch to horizontal addressing mode */ >> +ret = ssd130x_write_cmd(ssd130x->client, 2, SSD130X_SET_ADDRESS_MODE, >> +SSD130X_SET_ADDRESS_MODE_HORIZONTAL); >> +if (ret < 0) >> +return ret; >> + >> +return 0; > > Can it be > > return ssd130x_write_cmd(...); > > ? > > ... > Yes. >> +unsigned int line_length = DIV_ROUND_UP(width, 8); >> +unsigned int pages = DIV_ROUND_UP(height, 8); > > For power of two there are more efficient roundup()/rounddown() > (or with _ in the names, I don't remember by heart). > Oh, I didn't know about round_{up,down}(). Thanks a lot for the pointer. > ... > >> +for (k = 0; k < m; k++) { > >> +u8 byte = buf[(8 * i + k) * line_length + >> + j / 8]; > > One line? > Yes. >> +u8 bit = (byte >> (j % 8)) & 1; >> + >> +data |= bit << k; >> +} > > ... > >> +static int ssd130x_display_pipe_mode_valid(struct drm_simple_display_pipe >> *pipe, >> + const struct drm_display_mode *mode) >> +{ >> +struct ssd130x_device *ssd130x = drm_to_ssd130x(pipe->crtc.dev); >> + >> +if (mode->hdisplay != ssd130x->mode.hdisplay && >> +mode->vdisplay != ssd130x->mode.vdisplay) >> +return MODE_ONE_SIZE; > >> +else if (mode->hdisplay != ssd130x->mode.hdisplay) >> +return MODE_ONE_WIDTH; >> +else if (mode->vdisplay != ssd130x->mode.vdisplay) >> +return MODE_ONE_HEIGHT; > > 'else' in both cases is redundant. > Indeed. >> +return MODE_OK; >> +} > > ... > >> +poweroff: > > out_power_off: ? > Ok. > ... > >> +if (!fb) >> +
Re: [PATCH 07/19] drm/i915/guc: Convert golden context init to iosys_map
Hi Am 04.02.22 um 18:44 schrieb Lucas De Marchi: Now the map is saved during creation, so use it to initialize the golden context, reading from shmem and writing to either system or IO memory. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 +++--- 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 3a0afce7564e..d32b407a2d25 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -473,18 +473,17 @@ static struct intel_engine_cs *find_engine_state(struct intel_gt *gt, u8 engine_ static void guc_init_golden_context(struct intel_guc *guc) { - struct __guc_ads_blob *blob = guc->ads_blob; struct intel_engine_cs *engine; struct intel_gt *gt = guc_to_gt(guc); + struct iosys_map golden_context_map; u32 addr_ggtt, offset; u32 total_size = 0, alloc_size, real_size; u8 engine_class, guc_class; - u8 *ptr; if (!intel_uc_uses_guc_submission(>->uc)) return; - GEM_BUG_ON(!blob); + GEM_BUG_ON(iosys_map_is_null(&guc->ads_map)); /* * Go back and fill in the golden context data now that it is @@ -492,15 +491,15 @@ static void guc_init_golden_context(struct intel_guc *guc) */ offset = guc_ads_golden_ctxt_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; - ptr = ((u8 *)blob) + offset; + + golden_context_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset); for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; ++engine_class) { if (engine_class == OTHER_CLASS) continue; guc_class = engine_class_to_guc_class(engine_class); - - if (!blob->system_info.engine_enabled_masks[guc_class]) + if (!ads_blob_read(guc, system_info.engine_enabled_masks[guc_class])) continue; real_size = intel_engine_context_size(gt, engine_class); @@ -511,18 +510,20 @@ static void guc_init_golden_context(struct intel_guc *guc) if (!engine) { drm_err(>->i915->drm, "No engine state recorded for class %d!\n", engine_class); - blob->ads.eng_state_size[guc_class] = 0; - blob->ads.golden_context_lrca[guc_class] = 0; + ads_blob_write(guc, ads.eng_state_size[guc_class], 0); + ads_blob_write(guc, ads.golden_context_lrca[guc_class], 0); continue; } - GEM_BUG_ON(blob->ads.eng_state_size[guc_class] != + GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) != real_size - LRC_SKIP_SIZE); - GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != addr_ggtt); + GEM_BUG_ON(ads_blob_read(guc, ads.golden_context_lrca[guc_class]) != addr_ggtt); + addr_ggtt += alloc_size; - shmem_read(engine->default_state, 0, ptr, real_size); - ptr += alloc_size; + shmem_read_to_iosys_map(engine->default_state, 0, + &golden_context_map, real_size); + iosys_map_incr(&golden_context_map, alloc_size); Use an offset to index into iosys_map. Even if that means to add another parameter to shmem_read_to_iosys_map(). This will save you IOSYS_MAP_INIT_OFFSET() and iosys_map_incr(). Best regards Thomas } GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH 09/19] drm/i915/guc: Convert engine record to iosys_map
Hi Am 04.02.22 um 18:44 schrieb Lucas De Marchi: Use iosys_map to read fields from the dma_blob so access to IO and system memory is abstracted away. Cc: Matt Roper Cc: Thomas Hellström Cc: Daniel Vetter Cc: John Harrison Cc: Matthew Brost Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 14 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h | 3 ++- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 17 ++--- 3 files changed, 18 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 6311b9da87e4..1d21a2d457e0 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -698,18 +698,16 @@ void intel_guc_ads_reset(struct intel_guc *guc) u32 intel_guc_engine_usage_offset(struct intel_guc *guc) { - struct __guc_ads_blob *blob = guc->ads_blob; - u32 base = intel_guc_ggtt_offset(guc, guc->ads_vma); - u32 offset = base + ptr_offset(blob, engine_usage); - - return offset; + return intel_guc_ggtt_offset(guc, guc->ads_vma) + + offsetof(struct __guc_ads_blob, engine_usage); } -struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs *engine) +struct iosys_map intel_guc_engine_usage_record_map(struct intel_engine_cs *engine) { struct intel_guc *guc = &engine->gt->uc.guc; - struct __guc_ads_blob *blob = guc->ads_blob; u8 guc_class = engine_class_to_guc_class(engine->class); + size_t offset = offsetof(struct __guc_ads_blob, + engine_usage.engines[guc_class][ilog2(engine->logical_mask)]); - return &blob->engine_usage.engines[guc_class][ilog2(engine->logical_mask)]; + return IOSYS_MAP_INIT_OFFSET(&guc->ads_map, offset); Here's one of the few cases where you can legitimately make a copy of an iosys_map buffer and call iosys_map_incr() on it. Saves you the IOSYS_MAP_INIT_OFFSET(). Best regards Thomas } diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h index e74c110facff..1c64f4d6ea21 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h @@ -7,6 +7,7 @@ #define _INTEL_GUC_ADS_H_ #include +#include struct intel_guc; struct drm_printer; @@ -18,7 +19,7 @@ void intel_guc_ads_init_late(struct intel_guc *guc); void intel_guc_ads_reset(struct intel_guc *guc); void intel_guc_ads_print_policy_info(struct intel_guc *guc, struct drm_printer *p); -struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs *engine); +struct iosys_map intel_guc_engine_usage_record_map(struct intel_engine_cs *engine); u32 intel_guc_engine_usage_offset(struct intel_guc *guc); #endif diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index b3a429a92c0d..6d34842f68b4 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1139,6 +1139,9 @@ __extend_last_switch(struct intel_guc *guc, u64 *prev_start, u32 new_start) *prev_start = ((u64)gt_stamp_hi << 32) | new_start; } +#define record_read(map_, field_) \ + iosys_map_rd_field(map_, struct guc_engine_usage_record, field_) + /* * GuC updates shared memory and KMD reads it. Since this is not synchronized, * we run into a race where the value read is inconsistent. Sometimes the @@ -1153,17 +1156,17 @@ __extend_last_switch(struct intel_guc *guc, u64 *prev_start, u32 new_start) static void __get_engine_usage_record(struct intel_engine_cs *engine, u32 *last_in, u32 *id, u32 *total) { - struct guc_engine_usage_record *rec = intel_guc_engine_usage(engine); + struct iosys_map rec_map = intel_guc_engine_usage_record_map(engine); int i = 0; do { - *last_in = READ_ONCE(rec->last_switch_in_stamp); - *id = READ_ONCE(rec->current_context_index); - *total = READ_ONCE(rec->total_runtime); + *last_in = record_read(&rec_map, last_switch_in_stamp); + *id = record_read(&rec_map, current_context_index); + *total = record_read(&rec_map, total_runtime); - if (READ_ONCE(rec->last_switch_in_stamp) == *last_in && - READ_ONCE(rec->current_context_index) == *id && - READ_ONCE(rec->total_runtime) == *total) + if (record_read(&rec_map, last_switch_in_stamp) == *last_in && + record_read(&rec_map, current_context_index) == *id && + record_read(&rec_map, total_runtime) == *total) break; } while (++i < 6); } -- Thomas Zimmermann Graphi
Re: [PATCH 02/19] iosys-map: Add offset to iosys_map_memcpy_to()
On Fri, Feb 04, 2022 at 07:48:10PM +0100, Thomas Zimmermann wrote: Hi Am 04.02.22 um 18:44 schrieb Lucas De Marchi: In certain situations it's useful to be able to write to an offset of the mapping. Add a dst_offset to iosys_map_memcpy_to(). Cc: Sumit Semwal Cc: Christian König Cc: Thomas Zimmermann Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/drm_cache.c | 2 +- drivers/gpu/drm/drm_fb_helper.c | 2 +- include/linux/iosys-map.h | 17 + 3 files changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c index 66597e411764..c3e6e615bf09 100644 --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -218,7 +218,7 @@ static void memcpy_fallback(struct iosys_map *dst, if (!dst->is_iomem && !src->is_iomem) { memcpy(dst->vaddr, src->vaddr, len); } else if (!src->is_iomem) { - iosys_map_memcpy_to(dst, src->vaddr, len); + iosys_map_memcpy_to(dst, 0, src->vaddr, len); } else if (!dst->is_iomem) { memcpy_fromio(dst->vaddr, src->vaddr_iomem, len); } else { diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 238f815cb2a0..bf5cc9a42e5a 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -385,7 +385,7 @@ static void drm_fb_helper_damage_blit_real(struct drm_fb_helper *fb_helper, iosys_map_incr(dst, offset); /* go to first pixel within clip rect */ for (y = clip->y1; y < clip->y2; y++) { - iosys_map_memcpy_to(dst, src, len); + iosys_map_memcpy_to(dst, 0, src, len); iosys_map_incr(dst, fb->pitches[0]); src += fb->pitches[0]; } diff --git a/include/linux/iosys-map.h b/include/linux/iosys-map.h index f4186f91caa6..edd7fa3be9e9 100644 --- a/include/linux/iosys-map.h +++ b/include/linux/iosys-map.h @@ -220,22 +220,23 @@ static inline void iosys_map_clear(struct iosys_map *map) } /** - * iosys_map_memcpy_to - Memcpy into iosys mapping + * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map 'iosys_map_memcpy_to' With that fixed: Reviewed-by: Thomas Zimmermann thanks, I noticed that, but looks like I squashed to the wrong patch. Lucas De Marchi
Re: [PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888,gray8}_to_mono_reversed()
Hello Thomas, Thanks a lot for your feedback. On 2/4/22 16:52, Thomas Zimmermann wrote: [snip] >> +static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, >> size_t pixels) >> +{ >> +unsigned int xb, i; >> + >> +for (xb = 0; xb < pixels / 8; xb++) { > > In practice, all mode widths are multiples of 8 because VGA mandated it. > So it's ok-ish to assume this here. You should probably at least print a > warning somewhere if (pixels % 8 != 0) > Agreed. [snip] >> + * DRM doesn't have native monochrome or grayscale support. >> + * Such drivers can announce the commonly supported XR24 format to userspace >> + * and use drm_fb_xrgb_to_gray8() to convert to grayscale and then this >> + * helper function to convert to the native format. >> + */ >> +void drm_fb_gray8_to_mono_reversed(void *dst, unsigned int dst_pitch, const >> void *src, >> + const struct drm_rect *clip) > > There's a bug here. You want to pass in a drm_framebuffer as fourth > argument. > >> +{ >> + >> +size_t height = drm_rect_height(clip); >> +size_t width = drm_rect_width(clip); >> +unsigned int y; >> +const u8 *gray8 = src; >> +u8 *mono = dst; >> + >> +if (!dst_pitch) >> +dst_pitch = width; > > The dst_pitch is given in bytes. You have to device by 8. Here would be > a good place to warn if (width % 8 != 0). > Ok. >> + >> +for (y = 0; y < height; y++) { >> +drm_fb_gray8_to_mono_reversed_line(mono, gray8, dst_pitch); >> +mono += (dst_pitch / 8); > > The dst_pitch is already given in bytes. > Yes, I know but for reversed mono we want only 1/8 of the width since we are converting from 8 bits per pixel greyscale to 1 bit per pixel mono. Or am I misunderstanding what you meant ? >> +gray8 += dst_pitch; > > 'gray8 += fb->pitches[0]' would be correct. > Ok. [snip] >> + */ >> +void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, >> const void *src, >> + const struct drm_framebuffer *fb, >> + const struct drm_rect *clip) >> +{ >> +if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB)) >> +return; >> + >> +if (!dst_pitch) >> +dst_pitch = drm_rect_width(clip); >> + >> +drm_fb_xrgb_to_gray8(dst, dst_pitch, src, fb, clip); >> +drm_fb_gray8_to_mono_reversed(dst, dst_pitch, dst, fb, clip); > > Converting from dst into dst can give incorrect results. At some point > we probably want to add restrict qualifiers to these pointers, to help > the compiler with optimizing. > > A better approach here is to pull the per-line conversion from > drm_fb_xrgb_to_gray8() into a separate helper and implement a > line-by-line conversion here. something like this: > >drm_fb_xrgb_to_mono_reversed() >{ > char *tmp = kmalloc(size of a single line of gray8) > > for (heigth) { > drm_fb_xrgb_to_gray8_line(tmp, ..., src, ...); > drm_fb_gray8_to_mono_reversed(dst, ..., tmp, ...); > > src += fb->pitches[0] > dst += dst_pitch; > } > > kfree(tmp); >} > I see. Yes, that sounds a much better approach. I'll change it in v3. Best regards, -- Javier Martinez Canillas Linux Engineering Red Hat
[PATCH] drm/nouveau/backlight: Just set all backlight types as RAW
Currently we can get a warning on systems with eDP backlights like so: nv_backlight: invalid backlight type WARNING: CPU: 4 PID: 454 at drivers/video/backlight/backlight.c:420 backlight_device_register+0x226/0x250 This happens as a result of us not filling out props.type for the eDP backlight, even though we do it for all other backlight types. Since nothing in our driver uses anything but BACKLIGHT_RAW, let's take the props\.type assignments out of the codepaths for individual backlight types and just set it unconditionally to prevent this from happening again. Signed-off-by: Lyude Paul Fixes: 6eca310e8924 ("drm/nouveau/kms/nv50-: Add basic DPCD backlight support for nouveau") Cc: # v5.15+ --- drivers/gpu/drm/nouveau/nouveau_backlight.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c b/drivers/gpu/drm/nouveau/nouveau_backlight.c index 6af12dc99d7f..daf9f87477ba 100644 --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c @@ -101,7 +101,6 @@ nv40_backlight_init(struct nouveau_encoder *encoder, if (!(nvif_rd32(device, NV40_PMC_BACKLIGHT) & NV40_PMC_BACKLIGHT_MASK)) return -ENODEV; - props->type = BACKLIGHT_RAW; props->max_brightness = 31; *ops = &nv40_bl_ops; return 0; @@ -343,7 +342,6 @@ nv50_backlight_init(struct nouveau_backlight *bl, else *ops = &nva3_bl_ops; - props->type = BACKLIGHT_RAW; props->max_brightness = 100; return 0; @@ -411,6 +409,7 @@ nouveau_backlight_init(struct drm_connector *connector) goto fail_alloc; } + props.type = BACKLIGHT_RAW; bl->dev = backlight_device_register(backlight_name, connector->kdev, nv_encoder, ops, &props); if (IS_ERR(bl->dev)) { -- 2.34.1
Re: [PATCH 13/21] fbcon: move more common code into fb_open()
On Mon, Jan 31, 2022 at 10:05:44PM +0100, Daniel Vetter wrote: > No idea why con2fb_acquire_newinfo() initializes much less than > fbcon_startup(), but so be it. From a quick look most of the > un-initialized stuff should be fairly harmless, but who knows. > > Signed-off-by: Daniel Vetter > Cc: Daniel Vetter > Cc: Greg Kroah-Hartman > Cc: Tetsuo Handa > Cc: Thomas Zimmermann > Cc: Claudio Suarez > Cc: Du Cheng > --- > drivers/video/fbdev/core/fbcon.c | 74 +--- > 1 file changed, 31 insertions(+), 43 deletions(-) > > diff --git a/drivers/video/fbdev/core/fbcon.c > b/drivers/video/fbdev/core/fbcon.c > index b83a5a77d8a8..5a3391ff038d 100644 > --- a/drivers/video/fbdev/core/fbcon.c > +++ b/drivers/video/fbdev/core/fbcon.c > @@ -680,8 +680,18 @@ static int fbcon_invalid_charcount(struct fb_info *info, > unsigned charcount) > > #endif /* CONFIG_MISC_TILEBLITTING */ > > +static void fbcon_release(struct fb_info *info) > +{ > + if (info->fbops->fb_release) > + info->fbops->fb_release(info, 0); > + > + module_put(info->fbops->owner); > +} > + > static int fbcon_open(struct fb_info *info) > { > + struct fbcon_ops *ops; > + > if (!try_module_get(info->fbops->owner)) > return -ENODEV; > > @@ -691,19 +701,22 @@ static int fbcon_open(struct fb_info *info) > return -ENODEV; > } > > - return 0; > -} > + ops = kzalloc(sizeof(struct fbcon_ops), GFP_KERNEL); > + if (!ops) { > + fbcon_release(info); > + return -ENOMEM; > + } > > -static void fbcon_release(struct fb_info *info) > -{ > - if (info->fbops->fb_release) > - info->fbops->fb_release(info, 0); > + INIT_DELAYED_WORK(&ops->cursor_work, fb_flashcursor); > + ops->info = info; > + info->fbcon_par = ops; > + ops->cur_blink_jiffies = HZ / 5; > > - module_put(info->fbops->owner); > + return 0; > } > > static int con2fb_acquire_newinfo(struct vc_data *vc, struct fb_info *info, > - int unit, int oldidx) > + int unit) > { > struct fbcon_ops *ops = NULL; > int err; > @@ -712,27 +725,10 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, > struct fb_info *info, > if (err) > return err; > > - if (!err) { > - ops = kzalloc(sizeof(struct fbcon_ops), GFP_KERNEL); > - if (!ops) > - err = -ENOMEM; > - > - INIT_DELAYED_WORK(&ops->cursor_work, fb_flashcursor); > - } > - > - if (!err) { > - ops->cur_blink_jiffies = HZ / 5; > - ops->info = info; > - info->fbcon_par = ops; > - > - if (vc) > - set_blitting_type(vc, info); > - } > + ops = info->fbcon_par; > > - if (err) { > - con2fb_map[unit] = oldidx; > - fbcon_release(info); > - } > + if (vc) > + set_blitting_type(vc, info); > > return err; > } > @@ -840,9 +836,11 @@ static int set_con2fb_map(int unit, int newidx, int user) > > found = search_fb_in_map(newidx); > > - con2fb_map[unit] = newidx; > - if (!err && !found) > - err = con2fb_acquire_newinfo(vc, info, unit, oldidx); > + if (!err && !found) { > + err = con2fb_acquire_newinfo(vc, info, unit); > + if (!err) > + con2fb_map[unit] = newidx; > + } This looks like an unintentional change of functionality as con2fb_map[unit] is only assigned when we do a con2fb_acquire_newinfo(). Staring at the code I could not say it is wrong, but not nice to hide the change in this patch. Sam > > /* >* If old fb is not mapped to any of the consoles, > @@ -939,20 +937,10 @@ static const char *fbcon_startup(void) > if (fbcon_open(info)) > return NULL; > > - ops = kzalloc(sizeof(struct fbcon_ops), GFP_KERNEL); > - if (!ops) { > - fbcon_release(info); > - return NULL; > - } > - > - INIT_DELAYED_WORK(&ops->cursor_work, fb_flashcursor); > - > + ops = info->fbcon_par; > ops->currcon = -1; > ops->graphics = 1; > ops->cur_rotate = -1; > - ops->cur_blink_jiffies = HZ / 5; > - ops->info = info; > - info->fbcon_par = ops; > > p->con_rotate = initial_rotation; > if (p->con_rotate == -1) > @@ -1022,7 +1010,7 @@ static void fbcon_init(struct vc_data *vc, int init) > return; > > if (!info->fbcon_par) > - con2fb_acquire_newinfo(vc, info, vc->vc_num, -1); > + con2fb_acquire_newinfo(vc, info, vc->vc_num); > > /* If we are not the first console on this > fb, copy the font from that console */ > -- > 2.33.0
Re: [RFC PATCH 1/3] drm: Extract amdgpu_sa.c as a generic suballocation helper
Hi Am 04.02.22 um 19:29 schrieb Christian König: Oh, that's on my TODO list for years! Am 04.02.22 um 18:48 schrieb Maarten Lankhorst: Suballocating a buffer object is something that is not driver generic, and is useful for other drivers as well. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/Makefile | 4 +- drivers/gpu/drm/drm_suballoc.c | 424 + include/drm/drm_suballoc.h | 78 ++ 3 files changed, 505 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/drm_suballoc.c create mode 100644 include/drm/drm_suballoc.h diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 8675c2af7ae1..b848bcf8790c 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -57,7 +57,9 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o \ drm_scdc_helper.o drm_gem_atomic_helper.o \ drm_gem_framebuffer_helper.o \ drm_atomic_state_helper.o drm_damage_helper.o \ - drm_format_helper.o drm_self_refresh_helper.o drm_rect.o + drm_format_helper.o drm_self_refresh_helper.o drm_rect.o \ + drm_suballoc.o + I think we should put that into a separate module like we now do with other helpers as well. Please. KMS helpers are now likely to be linked into the kernel binary. I've already spent time to reduce the size of the module. Best regard Thomas -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH 14/21] fbcon: use lock_fb_info in fbcon_open/release
On Mon, Jan 31, 2022 at 10:05:45PM +0100, Daniel Vetter wrote: > Now we get to the real motiviation, because fbmem.c insists that > that's the right lock for these. > > Ofc fbcon.c has a lot more places where it probably should call > lock_fb_info(). But looking at fbmem.c at least most of these seem to > be protected by console_lock() too, which is probably what papers over > any issues. > > Note that this means we're shuffling around a bit the locking sections > for some of the console takeover and unbind paths, but not all: > - console binding/unbinding from the console layer never with > lock_fb_info > - unbind (as opposed to unlink) never bother with lock_fb_info > > Also the real serialization against set_par and set_pan are still > doing by wrapping the entire ioctl code in console_lock(). So this > shuffling shouldn't be worse than what we had from a "can you trigger > races?" pov, but it's at least clearer. > > Signed-off-by: Daniel Vetter > Cc: Daniel Vetter > Cc: Claudio Suarez > Cc: Tetsuo Handa > Cc: Thomas Zimmermann > Cc: Greg Kroah-Hartman > Cc: Du Cheng > Cc: Sam Ravnborg > Cc: Matthew Wilcox > Cc: William Kucharski > Cc: Alex Deucher > Cc: Zheyu Ma > Cc: Zhen Lei > Cc: Xiyu Yang Well, the patch does what the commit log says. Acked-by: Sam Ravnborg