date:20250612

Re: [PATCH v4 04/20] rust: add new `num` module with useful integer operations

2025-06-12 Thread Alexandre Courbot

On Thu Jun 12, 2025 at 11:49 PM JST, Benno Lossin wrote:
> On Thu Jun 12, 2025 at 3:27 PM CEST, Alexandre Courbot wrote:
>> On Thu Jun 12, 2025 at 10:17 PM JST, Alexandre Courbot wrote:
>>> On Wed Jun 4, 2025 at 4:18 PM JST, Benno Lossin wrote:
>>>> On Wed Jun 4, 2025 at 2:05 AM CEST, Alexandre Courbot wrote:
>>>>> On Wed Jun 4, 2025 at 8:02 AM JST, Benno Lossin wrote:
>>>>>> On Mon Jun 2, 2025 at 3:09 PM CEST, Alexandre Courbot wrote:
>>>>>>> On Thu May 29, 2025 at 4:27 PM JST, Benno Lossin wrote:
>>>>>>>> On Thu May 29, 2025 at 3:18 AM CEST, Alexandre Courbot wrote:
>>>>>>>>> On Thu May 29, 2025 at 5:17 AM JST, Benno Lossin wrote:
>>>>>>>>>> On Wed May 21, 2025 at 8:44 AM CEST, Alexandre Courbot wrote:
>>>>>>>>>>> +/// Align `self` up to `alignment`.
>>>>>>>>>>> +///
>>>>>>>>>>> +/// `alignment` must be a power of 2 for accurate results.
>>>>>>>>>>> +///
>>>>>>>>>>> +/// Wraps around to `0` if the requested alignment pushes the result above the type's limits.
>>>>>>>>>>> +///
>>>>>>>>>>> +/// # Examples
>>>>>>>>>>> +///
>>>>>>>>>>> +/// ```
>>>>>>>>>>> +/// use kernel::num::NumExt;
>>>>>>>>>>> +///
>>>>>>>>>>> +/// assert_eq!(0x4fffu32.align_up(0x1000), 0x5000);
>>>>>>>>>>> +/// assert_eq!(0x4000u32.align_up(0x1000), 0x4000);
>>>>>>>>>>> +/// assert_eq!(0x0u32.align_up(0x1000), 0x0);
>>>>>>>>>>> +/// assert_eq!(0xffffu16.align_up(0x100), 0x0);
>>>>>>>>>>> +/// assert_eq!(0x4fffu32.align_up(0x0), 0x0);
>>>>>>>>>>> +/// ```
>>>>>>>>>>> +fn align_up(self, alignment: Self) -> Self;
>>>>>>>>>>
>>>>>>>>>> Isn't this `next_multiple_of` [1] (it also allows non power of 2
>>>>>>>>>> inputs).
>>>>>>>>>>
>>>>>>>>>> [1]: https://doc.rust-lang.org/std/primitive.u32.html#method.next_multiple_of
>>>>>>>>>
>>>>>>>>> It is, however the fact that `next_multiple_of` works with non powers of
>>>>>>>>> two also means it needs to perform a modulo operation. That operation
>>>>>>>>> might well be optimized away by the compiler, but AFAICT we have no way
>>>>>>>>> of proving it will always be the case, hence the always-optimal
>>>>>>>>> implementation here.
>>>>>>>>
>>>>>>>> When you use a power of 2 constant, then I'm very sure that it will get
>>>>>>>> optimized [1]. Even with non-powers of 2, you don't get a division [2].
>>>>>>>> If you find some code that is not optimized, then sure add a custom
>>>>>>>> function.
>>>>>>>>
>>>>>>>> [1]: https://godbolt.org/z/57M9e36T3
>>>>>>>> [2]: https://godbolt.org/z/9P4P8zExh
>>>>>>>
>>>>>>> That's impressive and would definitely work well with a constant. But
>>>>>>> when the value is not known at compile-time, the division does occur
>>>>>>> unfortunately: https://godbolt.org/z/WK1bPMeEx
>>>>>>>
>>>>>>> So I think we will still need a kernel-optimized version of these
>>>>>>> alignment functions.
>>>>>>
>>>>>> Hmm what exactly is the use-case for a variable align amount? Could you
>>>>>> store it in const generics?
>>>>>
>>>>> Say you have an IOMMU with support for different pages sizes, the size
>>>>> of a particular page can be decided at runtime.
>>>>>
>>>>>>
>>>>>> If not, there are also these two variants that are more efficient:
>>>>>>
>>>>>> * option: https://godbolt.org/z/ecnb19zaM
>>>>>> * unsafe: https://godbolt.org/z/EqTaGov71
>>>>>>
>>>>>> So if the compiler can infer it from context it still optimizes it :)
>>>>>
>>>>> I think the `Option` (and subsequent `unwrap`) is something we want to
>>>>> avoid on such a common operation.
>>>>
>>>> Makes sense.
>>>>
>>>>>> But yeah to be extra sure, you need your version. By the way, what
>>>>>> happens if `align` is not a power of 2 in your version?
>>>>>
>>>>> It will just return `(self + (self - 1)) & (alignment - 1)`, which will
>>>>> likely be a value you don't want.
>>>>
>>>> So wouldn't it be better to make users validate that they gave a
>>>> power-of-2 alignment?
>>>>
>>>>> So yes, for this particular operation we would prefer to only use powers
>>>>> of 2 as inputs - if we can ensure that then it solves most of our
>>>>> problems (can use `next_multiple_of`, no `Option`, etc).
>>>>>
>>>>> Maybe we can introduce a new integer type that, similarly to `NonZero`,
>>>>> guarantees that the value it stores is a power of 2? Users with const
>>>>> values (90+% of uses) won't see any difference, and if working with a
>>>>> runtime-generated value we will want to validate it anyway...
>>>>
>>>> I like this idea. But it will mean that we have to have a custom
>>>> function that is either standalone and const or in an extension trait :(
>>>> But for this one we can use the name `align_up` :)
>>>>
>>>> Here is a cool idea for the implementation: https://godbolt.org/z/x6navM5WK
>>>
>>> Yeah that's close to what I had in mind. Actually, we can also define
>>> `align_up` and `align_down` within this new type, and these methods can
>>> now be const since they are not implemented via a trait!
>
> That sounds like a good id
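
A minimal C sketch of the trade-off argued in this thread, written in C rather than Rust to match the kernel code elsewhere in this digest. It is not the `kernel::num` API: `align_up_mask`, `next_multiple_of`, `struct pow2_u32` and its helpers are hypothetical names, and the mask expression is the conventional power-of-two round-up, assumed here because it reproduces the u32 examples from the doc comment. It shows that the mask form needs no division but quietly returns a wrong value for a non-power-of-two alignment, which is the case a validated power-of-two type would rule out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when x has exactly one bit set. */
static bool is_power_of_two(uint32_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Mask-based round-up: no division, but only meaningful for power-of-two
 * alignments. Unsigned arithmetic wraps, so an overflowing result becomes 0.
 */
static uint32_t align_up_mask(uint32_t value, uint32_t alignment)
{
	return (value + (alignment - 1)) & ~(alignment - 1);
}

/*
 * Modulo-based round-up: correct for any non-zero alignment, at the cost of
 * a division when the alignment is not a compile-time constant.
 */
static uint32_t next_multiple_of(uint32_t value, uint32_t alignment)
{
	uint32_t rem = value % alignment;

	return rem ? value + (alignment - rem) : value;
}

/* Hypothetical wrapper: holds a value that was checked to be a power of two. */
struct pow2_u32 {
	uint32_t value;
};

/* Validate once at construction; *out is only written on success. */
static bool pow2_u32_new(uint32_t raw, struct pow2_u32 *out)
{
	if (!is_power_of_two(raw))
		return false;
	out->value = raw;
	return true;
}

/* With a validated alignment, the cheap mask path is always correct. */
static uint32_t pow2_align_up(uint32_t value, struct pow2_u32 alignment)
{
	return align_up_mask(value, alignment.value);
}
```

With these definitions, align_up_mask(10, 6) yields 10, which is not a multiple of 6, while next_multiple_of(10, 6) yields 12. pow2_u32_new(6, ...) rejects the value up front, so the mask path is never reached with it.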

[RFC PATCH 4/4] drm/i915/writeback: Init writeback connector

2025-06-12 Thread Suraj Kandpal

Initialize the writeback connector by initialising the virtual encoder
and the intel connector. We also allocate memory for the
drm_writeback_connector, but not for the drm_connector within it, because
of the constraint that all connectors need to be an intel_connector.
The writeback_formats array is used to tell the user which
DRM formats we support.

Bspec: 49275
Signed-off-by: Suraj Kandpal 

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 13d4a16f7d33..0748edae8aa9 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -294,6 +294,7 @@ i915-y += \
display/intel_vblank.o \
display/intel_vga.o \
display/intel_wm.o \
+   display/intel_writeback.o \
display/skl_scaler.o \
display/skl_universal_plane.o \
display/skl_watermark.o
diff --git a/drivers/gpu/drm/i915/display/intel_writeback.c b/drivers/gpu/drm/i915/display/intel_writeback.c
new file mode 100644
index 000000000000..7be2c24c530f
--- /dev/null
+++ b/drivers/gpu/drm/i915/display/intel_writeback.c
@@ -0,0 +1,131 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "i915_drv.h"
+#include "intel_de.h"
+#include "intel_display_types.h"
+#include "intel_display_driver.h"
+#include "intel_connector.h"
+#include "intel_writeback.h"
+
+struct intel_writeback_connector {
+   struct drm_writeback_connector base;
+   struct intel_encoder encoder;
+   struct intel_connector connector;
+};
+
+static const u32 writeback_formats[] = {
+   DRM_FORMAT_XYUV8888,
+   DRM_FORMAT_YUYV,
+   DRM_FORMAT_XBGR8888,
+   DRM_FORMAT_XVYU2101010,
+   DRM_FORMAT_VYUY,
+   DRM_FORMAT_XBGR2101010,
+};
+
+static int intel_writeback_connector_init(struct intel_connector *connector)
+{
+   struct intel_digital_connector_state *conn_state;
+
+   conn_state = kzalloc(sizeof(*conn_state), GFP_KERNEL);
+   if (!conn_state)
+   return -ENOMEM;
+
+   __drm_atomic_helper_connector_reset(&connector->base,
+   &conn_state->base);
+
+   return 0;
+}
+
+static struct
+intel_connector *intel_writeback_connector_alloc(struct intel_connector *connector)
+{
+   connector = kzalloc(sizeof(*connector), GFP_KERNEL);
+   if (!connector)
+   return NULL;
+
+   if (intel_writeback_connector_init(connector) < 0) {
+   kfree(connector);
+   return NULL;
+   }
+
+   return connector;
+}
+
+static const struct drm_encoder_helper_funcs enc_helper_funcs = {
+};
+
+static const struct drm_encoder_funcs drm_writeback_encoder_funcs = {
+   .destroy = drm_encoder_cleanup,
+};
+
+const struct drm_connector_funcs conn_funcs = {
+   .fill_modes = drm_helper_probe_single_connector_modes,
+   .reset = drm_atomic_helper_connector_reset,
+   .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
+   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+};
+
+static const struct drm_connector_helper_funcs intel_writeback_conn_helper_funcs = {
+};
+
+int intel_writeback_init(struct intel_display *display)
+{
+   struct intel_encoder *encoder;
+   struct intel_writeback_connector *writeback_conn;
+   struct intel_connector *connector;
+   int ret;
+
+   writeback_conn = kzalloc(sizeof(*writeback_conn), GFP_KERNEL);
+   if (!writeback_conn)
+   return -ENOMEM;
+
+   connector = &writeback_conn->connector;
+   intel_writeback_connector_alloc(connector);
+
+   encoder = &writeback_conn->encoder;
+   drm_encoder_helper_add(&encoder->base, &enc_helper_funcs);
+   ret = drm_encoder_init(display->drm, &encoder->base,
+  &drm_writeback_encoder_funcs,
+  DRM_MODE_ENCODER_VIRTUAL, NULL);
+   if (ret) {
+   intel_connector_free(connector);
+   kfree(writeback_conn);
+   return ret;
+   }
+
+   encoder->base.possible_crtcs = 0xf;
+   encoder->type = INTEL_OUTPUT_WRITEBACK;
+   encoder->pipe_mask = ~0;
+
+   connector->base.interlace_allowed = 0;
+   drm_connector_helper_add(&connector->base, &intel_writeback_conn_helper_funcs);
+   ret = drm_writeback_connector_init_with_conn(display->drm, &connector->base,
+                                                &writeback_conn->base,
+                                                &encoder->base, &conn_funcs,
+                                                writeback_formats,
+                                                ARRAY_SIZE(writeback_formats));
+   if (ret) {
+   intel_connector_free(connector);
+   drm_encoder_cleanup(&encoder->base);
+   kfree(&writeback_conn->encoder);
+   kfree(writeback_conn);
+
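
kzalloc() calls like the ones in intel_writeback_connector_init() and intel_writeback_connector_alloc() have to be sized from the pointed-to structure, not from the pointer variable itself. A minimal userspace sketch of the difference, assuming calloc() as a stand-in for kzalloc(); `struct conn_state_model` and the helper names are hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver state structure. */
struct conn_state_model {
	int base;
	char payload[64];
};

/* Size taken from the pointer variable: only the size of a pointer. */
static size_t size_from_pointer(void)
{
	struct conn_state_model *state = NULL;

	return sizeof(state);
}

/* Size taken from the pointee: the whole structure. */
static size_t size_from_pointee(void)
{
	struct conn_state_model *state = NULL;

	/* sizeof does not evaluate its operand, so nothing is dereferenced. */
	return sizeof(*state);
}

/* Zeroed allocation sized from the pointee; calloc() models kzalloc(). */
static struct conn_state_model *conn_state_model_alloc(void)
{
	struct conn_state_model *state = calloc(1, sizeof(*state));

	return state;	/* NULL on failure; a kernel caller would return -ENOMEM */
}
```

Sizing from `*state` also keeps the allocation correct if the variable's type changes later, which is why kernel code usually prefers that form.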

Re: [PATCH] drm/bridge: ti-sn65dsi86: fix REFCLK setting

2025-06-12 Thread Lucas Stach

Am Donnerstag, dem 12.06.2025 um 15:31 -0700 schrieb Doug Anderson:
> Hi,
> 
> On Thu, Jun 12, 2025 at 10:52 AM Doug Anderson  wrote:
> > 
> > Hi,
> > 
> > On Thu, Jun 12, 2025 at 12:35 AM Jayesh Choudhary wrote:
> > > 
> > > > > If refclk is described in devicetree node, then I see that
> > > > > the driver modifies it in every resume call based solely on the
> > > > > clock value in dts.
> > > > 
> > > > Exactly. But that is racy with what the chip itself is doing. I.e.
> > > > if you don't have that usleep() above, the chip will win the race
> > > > and the refclk frequency setting will be set according to the
> > > > external GPIOs (which is poorly described in the datasheet, btw),
> > > > regardless what the linux driver is setting (because that I2C write
> > > > happens too early).
> > > 
> > > I am a little confused here.
> > > Won't it be opposite?
> > > If we have this delay here, GPIO will stabilize and set the register
> > > accordingly?
> > > 
> > > In the driver, I came across the case when we do not have refclk.
> > > (My platform does have a refclk, I am just removing the property from
> > > > the dts node to check the effect of GPIO[3:1] in question because clock
> > > is not a required property for the bridge as per the bindings)
> > > 
> > > In the ti_sn65dsi86_probe(), before we read SN_DEVICE_ID_REGS,
> > > when we go to resume(), we do not do enable_comms() that calls
> > > ti_sn_bridge_set_refclk_freq() to set SN_DPPLL_SRC_REG.
> > > I see that register read for SN_DEVICE_ID_REGS fails in that case.
> > > 
> > > Adding this delay fixes that issue. This made me think that we need
> > > the delay for GPIO to stabilize and set the refclk.
> > 
> > FWIW, it's been on my plate for a while to delete the "no refclk"
> > support. The chip is really hard to use properly without a refclk and
> > I'm not at all convinced that the current code actually works properly
> > without a refclk. I'm not aware of any current hardware working this
> > way. I know we had some very early prototype hardware ages ago that
> > tried it and we got it limping along at one point, but the driver
> > looked _very_ different then. I believe someone on the lists once
> > mentioned trying to do something without a refclk and it didn't work
> > and I strongly encouraged them to add a refclk.
> 
> Actually, I may have to eat my words here. I double-checked the dts
> and I see there's at least two mainline users
> ("meson-g12b-bananapi-cm4-mnt-reform2.dts" and
> "/imx8mq-mnt-reform2.dts") that don't seem to be specifying a `refclk`
> to `ti,sn65dsi86`.
> 
> Neil / Lucas: is that correct? ...and it actually works?
> 
The description is correct, the refclock is not connected on the reform
baseboard.

It sort of works, as in AUX channel is not working before the display
link is up to provide a reference clock and I guess that also means HPD
is broken. On the reform the connected panel is described as a simple
panel with a fixed mode, not using the EDID from panel.

Regards,
Lucas

Re: [PATCH v3 3/8] drm/imagination: Use pwrseq for TH1520 GPU power management

2025-06-12 Thread Krzysztof Kozlowski

On 11/06/2025 14:32, Bartosz Golaszewski wrote:
> On Wed, Jun 11, 2025 at 2:01 PM Michal Wilczynski
>  wrote:
>>
>>
>>
>> On 6/5/25 10:10, Bartosz Golaszewski wrote:
>>> On Thu, Jun 5, 2025 at 9:47 AM Michal Wilczynski wrote:
>>>>
>>>>
>>>>
>>>> On 6/4/25 14:07, Krzysztof Kozlowski wrote:
>>>>> On 04/06/2025 13:53, Michal Wilczynski wrote:
>>>>>>>>
>>>>>>>> The GPU node will depend on the AON node, which will be the sole
>>>>>>>> provider for the 'gpu-power' sequencer (based on the discussion in patch
>>>>>>>> 1).
>>>>>>>>
>>>>>>>> Therefore, if the AON/pwrseq driver has already completed its probe, and
>>>>>>>> devm_pwrseq_get() in the GPU driver subsequently returns -EPROBE_DEFER
>>>>>>>> (because pwrseq_get found 'no match' on the bus for 'gpu-power'), the
>>>>>>>> interpretation is that the AON driver did not register this optional
>>>>>>>> sequencer. Since AON is the only anticipated source, it implies the
>>>>>>>> sequencer won't become available later from its designated provider.
>>>>>>>
>>>>>>> I don't understand why you made this assumption. AON could be a module
>>>>>>> and this driver built-in. AON will likely probe later.
>>>>>>
>>>>>> You're absolutely right that AON could be a module and would generally
>>>>>> probe later in that scenario. However, the GPU device also has a
>>>>>> 'power-domains = <&aon TH1520_GPU_PD>' dependency. If the AON driver (as
>>>>>> the PM domain provider) were a late probing module, the GPU driver's
>>>>>> probe would hit -EPROBE_DEFER when its power domain is requested
>>>>>> which happens before attempting to get other resources like a power
>>>>>> sequencer.
>>>>>
>>>>> Huh, so basically you imply certain hardware design and certain DTS
>>>>> description in your driver code. Well, that's clearly fragile design to
>>>>> me, because you should not rely how hardware properties are presented in
>>>>> DTS. Will work here on th1520 with this DTS, won't work with something
>>>>> else.
>>>>>
>>>>> Especially that this looks like generic Imagination GPU code, common to
>>>>> multiple devices, not TH1520 only specific.
>>>>>
>>>>>>
>>>>>> So, if the GPU driver's code does reach the devm_pwrseq_get(dev,
>>>>>> "gpu-power") call, it strongly implies the AON driver has already
>>>>>> successfully probed.
>>>>>>
>>>>>> This leads to the core challenge with the optional 'gpu-power'
>>>>>> sequencer: Even if the AON driver has already probed, if it then chooses
>>>>>> not to register the "gpu-power" sequence (because it's an optional
>>>>>> feature), pwrseq_get() will still find "no device matched" on the
>>>>>> pwrseq_bus and return EPROBE_DEFER.
>>>>>>
>>>>>> If the GPU driver defers here, as it normally should for -EPROBE_DEFER,
>>>>>> it could wait indefinitely for an optional sequence that its
>>>>>> already probed AON provider will not supply.
>>>>>>
>>>>>> Anyway I think you're right, that this is probably confusing and we
>>>>>> shouldn't rely on this behavior.
>>>>>>
>>>>>> To solve this, and to allow the GPU driver to correctly handle
>>>>>> -EPROBE_DEFER when a sequencer is genuinely expected, I propose using a
>>>>>> boolean property on the GPU's DT node, e.g.
>>>>>> img,gpu-expects-power-sequencer. If the GPU node provides this property
>>>>>> it means the pwrseq 'gpu-power' is required.
>>>>>
>>>>> No, that would be driver design in DTS.
>>>>>
>>>>> I think the main problem is the pwrseq API: you should get via phandle,
>>>>> not name of the pwrseq controller. That's how all producer-consumer
>>>>> relationships are done in OF platforms.
>>>>
>>>> Bart,
>>>> Given Krzysztof's valid concerns about the current name based
>>>> lookup in pwrseq_get() and the benefits of phandle based resource
>>>> linking in OF platforms: Would you be open to a proposal for extending
>>>> the pwrseq API to allow consumers to obtain a sequencer (or a specific
>>>> target sequence) via a phandle defined in their Device Tree node? For
>>>> instance, a consumer device could specify power-sequencer =
>>>> <&aon> and a new API variant could resolve this.
>>>>
>>>
>>> I can be open to it all I want, but I bet Krzysztof won't be open to
>>> introducing anything like a power-sequencer device property in DT
>>> bindings. Simply because there's no such thing in the physical world.
>>> The concept behind the power sequencing framework was to bind
>>> providers to consumers based on existing links modelling real device
>>> properties (which a "power-sequencer" is not). I commented on it under
>>> another email saying that you already have a link here - the
>>> power-domains property taking the aon phandle. In your pwrseq
>>> provider's match() callback you can parse and resolve it back to the
>>> aon node thus making sure you're matching the consumer with the
>>> correct provider.
>>>
>>> Please take a look at the existing wcn pwrseq driver which does a
>>> similar thing but parses the regulator properties of the power
>>> management unit (in the pwrseq_qcom_wcn_match()

Re: [PATCH v3 3/8] drm/imagination: Use pwrseq for TH1520 GPU power management

2025-06-12 Thread Krzysztof Kozlowski

On 11/06/2025 14:01, Michal Wilczynski wrote:
> 
> However, this leads me back to a fundamental issue with the
> consumer side implementation in the generic pvr_device.c driver. The
> current fallback code is:
> 
> /*
>  * If the error is -EPROBE_DEFER, it's because the
>  * optional sequencer provider is not present
>  * and it's safe to fall back on manual power-up.
>  */
> if (pwrseq_err == -EPROBE_DEFER)
> pvr_dev->pwrseq = NULL;
> 
> As Krzysztof noted, simply ignoring -EPROBE_DEFER is not ideal. But if I
> change this to a standard deferred probe, the pvr_device.c driver will

Why? You have specific compatible for executing such quirks only for
given platform.

> break on all other supported SoCs. It would wait indefinitely for a
> pwrseq-thead-gpu provider that will never appear on those platforms.
> 



Best regards,
Krzysztof
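
One possible reading of the suggestion above, sketched as a userspace C model rather than driver code: the decision whether a power sequencer is required hangs off the SoC-specific compatible, so -EPROBE_DEFER is propagated only on platforms that need the sequencer and the lookup is skipped everywhere else. The compatible strings, `struct gpu_match_data`, and both helpers are hypothetical; `lookup_err` stands in for whatever the sequencer lookup returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace headers */
#endif

/* Hypothetical per-compatible data: does this SoC require the sequencer? */
struct gpu_match_data {
	const char *compatible;
	bool needs_pwrseq;
};

/* Placeholder compatible strings, not taken from any binding. */
static const struct gpu_match_data gpu_match_table[] = {
	{ "vendor,soc-with-pwrseq-gpu", true },
	{ "vendor,other-gpu", false },
};

/* Find the entry for a compatible string, or NULL if it is unknown. */
static const struct gpu_match_data *gpu_match(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(gpu_match_table) / sizeof(gpu_match_table[0]); i++) {
		if (strcmp(gpu_match_table[i].compatible, compatible) == 0)
			return &gpu_match_table[i];
	}
	return NULL;
}

/*
 * lookup_err models the result of the sequencer lookup: 0 or a negative errno.
 * Returns 0 when probing may continue (*use_pwrseq says how), or a negative
 * errno that the caller should propagate, including -EPROBE_DEFER.
 */
static int gpu_resolve_pwrseq(const struct gpu_match_data *data, int lookup_err,
			      bool *use_pwrseq)
{
	*use_pwrseq = false;

	if (!data->needs_pwrseq)
		return 0;		/* this platform never asks for a sequencer */
	if (lookup_err)
		return lookup_err;	/* required but not ready: defer or fail */

	*use_pwrseq = true;
	return 0;
}
```

In this model no platform treats -EPROBE_DEFER as "provider absent", so the generic driver neither ignores the deferral nor waits forever on SoCs that have no sequencer.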

RE: [PATCH v6 1/4] clk: renesas: rzv2h-cpg: Add support for DSI clocks

2025-06-12 Thread Biju Das

Hi Prabhakar,

> -----Original Message-----
> From: Prabhakar 
> Sent: 30 May 2025 18:19
.castro...@renesas.com>; Prabhakar Mahadev Lad  lad...@bp.renesas.com>
> Subject: [PATCH v6 1/4] clk: renesas: rzv2h-cpg: Add support for DSI clocks
> 
> From: Lad Prabhakar 
> 
> Add support for PLLDSI and PLLDSI divider clocks.
> 
> Introduce the `renesas-rzv2h-dsi.h` header to centralize and share PLLDSI-related data structures,
> limits, and algorithms between the RZ/V2H CPG and DSI drivers.
> 
> The DSI PLL is functionally similar to the CPG's PLLDSI, but has slightly different parameter limits
> and omits the programmable divider present in CPG. To ensure precise frequency calculations, especially
> for the milliHz-level accuracy needed by the DSI driver, the shared algorithm allows both drivers to
> compute PLL parameters consistently using the same logic and input clock.
> 
> Co-developed-by: Fabrizio Castro 
> Signed-off-by: Fabrizio Castro 
> Signed-off-by: Lad Prabhakar 
> ---
> v5->v6:
> - Renamed CPG_PLL_STBY_SSCGEN_WEN to CPG_PLL_STBY_SSC_EN_WEN
> - Updated CPG_PLL_CLK1_DIV_K, CPG_PLL_CLK1_DIV_M, and
>   CPG_PLL_CLK1_DIV_P macros to use GENMASK
> - Updated req->rate in rzv2h_cpg_plldsi_div_determine_rate()
> - Dropped the cast in rzv2h_cpg_plldsi_div_set_rate()
> - Dropped rzv2h_cpg_plldsi_round_rate() and implemented
>   rzv2h_cpg_plldsi_determine_rate() instead
> - Made use of FIELD_PREP()
> - Moved CPG_CSDIV1 macro in patch 2/4
> - Dropped two_pow_s in rzv2h_dsi_get_pll_parameters_values()
> - Used mul_u32_u32() while calculating output_m and output_k_range
> - Used div_s64() instead of div64_s64() while calculating
>   pll_k
> - Used mul_u32_u32() while calculating fvco and fvco checks
> - Rounded the final output using DIV_U64_ROUND_CLOSEST()
> 
> v4->v5:
> - No changes
> 
> v3->v4:
> - Corrected parameter name in rzv2h_dsi_get_pll_parameters_values()
>   description freq_millihz
> 
> v2->v3:
> - Update the commit message to clarify the purpose of `renesas-rzv2h-dsi.h`
>   header
> - Used mul_u32_u32() in rzv2h_cpg_plldsi_div_determine_rate()
> - Replaced *_mhz to *_millihz for clarity
> - Updated u64->u32 for fvco limits
> - Initialized the members in declaration order for
>   RZV2H_CPG_PLL_DSI_LIMITS() macro
> - Used clk_div_mask() in rzv2h_cpg_plldsi_div_recalc_rate()
> - Replaced `unsigned long long` with u64
> - Dropped rzv2h_cpg_plldsi_clk_recalc_rate() and reused
>   rzv2h_cpg_pll_clk_recalc_rate() instead
> - In rzv2h_cpg_plldsi_div_set_rate() followed the same style
>   of RMW-operation as done in the other functions
> - Renamed rzv2h_cpg_plldsi_set_rate() to rzv2h_cpg_pll_set_rate()
> - Dropped rzv2h_cpg_plldsi_clk_register() and reused
>   rzv2h_cpg_pll_clk_register() instead
> - Added a guard in renesas-rzv2h-dsi.h header
> 
> v1->v2:
> - No changes
> ---
>  drivers/clk/renesas/rzv2h-cpg.c   | 278 +-
>  drivers/clk/renesas/rzv2h-cpg.h   |  13 ++
>  include/linux/clk/renesas-rzv2h-dsi.h | 210 +++
>  3 files changed, 492 insertions(+), 9 deletions(-)
>  create mode 100644 include/linux/clk/renesas-rzv2h-dsi.h
> 
> diff --git a/drivers/clk/renesas/rzv2h-cpg.c b/drivers/clk/renesas/rzv2h-cpg.c
> index 761da3bf77ce..d590f9f47371 100644
> --- a/drivers/clk/renesas/rzv2h-cpg.c
> +++ b/drivers/clk/renesas/rzv2h-cpg.c
> @@ -14,9 +14,13 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -26,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
> 
> @@ -48,12 +53,13 @@
>  #define CPG_PLL_STBY(x)  ((x))
>  #define CPG_PLL_STBY_RESETB  BIT(0)
>  #define CPG_PLL_STBY_RESETB_WEN  BIT(16)
> +#define CPG_PLL_STBY_SSC_EN_WEN BIT(18)
>  #define CPG_PLL_CLK1(x)  ((x) + 0x004)
> -#define CPG_PLL_CLK1_KDIV(x) ((s16)FIELD_GET(GENMASK(31, 16), (x)))
> -#define CPG_PLL_CLK1_MDIV(x) FIELD_GET(GENMASK(15, 6), (x))
> -#define CPG_PLL_CLK1_PDIV(x) FIELD_GET(GENMASK(5, 0), (x))
> +#define CPG_PLL_CLK1_KDIV  GENMASK(31, 16)
> +#define CPG_PLL_CLK1_MDIV  GENMASK(15, 6)
> +#define CPG_PLL_CLK1_PDIV  GENMASK(5, 0)
>  #define CPG_PLL_CLK2(x)  ((x) + 0x008)
> -#define CPG_PLL_CLK2_SDIV(x) FIELD_GET(GENMASK(2, 0), (x))
> +#define CPG_PLL_CLK2_SDIV  GENMASK(2, 0)
>  #define CPG_PLL_MON(x)   ((x) + 0x010)
>  #define CPG_PLL_MON_RESETB   BIT(0)
>  #define CPG_PLL_MON_LOCK BIT(4)
> @@ -79,6 +85,8 @@
>   * @last_dt_core_clk: ID of the last Core Clock exported to DT
>   * @mstop_count: Array of mstop values
>   * @rcdev: Reset controller entity
> + * @dsi_limits: PLL DSI parameters limits
> + * @plldsi_div_parameters: PLL DSI and divider parameters configuration
>   */
>  struct rzv2h_cpg_priv {
>   struct device *dev;
> @@ -95,6 +103,9 @@ struct rzv2h_cpg_priv {
>   atomic_t *mstop_count;
> 
>   struct reset_controller_dev rcdev;
> +
> + const

[RFC PATCH 2/4] drm/i915/writeback: Add writeback registers

2025-06-12 Thread Suraj Kandpal

Add writeback registers to its own file.

Signed-off-by: Suraj Kandpal 

diff --git a/drivers/gpu/drm/i915/display/intel_writeback_reg.h b/drivers/gpu/drm/i915/display/intel_writeback_reg.h
new file mode 100644
index 000000000000..dd872b6f8103
--- /dev/null
+++ b/drivers/gpu/drm/i915/display/intel_writeback_reg.h
@@ -0,0 +1,134 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#ifndef __INTEL_WRITEBACK_REGS_H__
+#define __INTEL_WRITEBACK_REGS_H__
+
+#include "intel_display_reg_defs.h"
+
+/* WD 0 and 1 */
+#define TRANSCODER_WD0_OFFSET  0x6e000
+#define TRANSCODER_WD1_OFFSET  0x6d800
+
+/* WD 0 and 1 */
+#define PIPE_WD0_OFFSET   0x7e000
+#define PIPE_WD1_OFFSET   0x7d000
+
+/* Gen12 WD */
+#define _MMIO_WD(tc, wd0, wd1) _MMIO_TRANS((tc) - TRANSCODER_WD_0, wd0, wd1)
+
+#define WD_TRANS_ENABLE   REG_BIT(31)
+#define WD_TRANS_STATE REG_BIT(30)
+
+/* WD transcoder control */
+#define _WD_TRANS_FUNC_CTL_0   0x6e400
+#define _WD_TRANS_FUNC_CTL_1   0x6ec00
+#define WD_TRANS_FUNC_CTL(tc)  _MMIO_WD(tc,\
+   _WD_TRANS_FUNC_CTL_0,\
+   _WD_TRANS_FUNC_CTL_1)
+
+#define TRANS_WD_FUNC_ENABLE   REG_BIT(31)
+#define WD_TRIGGERED_CAP_MODE_ENABLE   REG_BIT(30)
+#define START_TRIGGER_FRAME   REG_BIT(29)
+#define STOP_TRIGGER_FRAME   REG_BIT(28)
+#define WD_INPUT_SELECT_MASK   REG_GENMASK(14, 12)
+#define WD_INPUT_PIPE_A   REG_FIELD_PREP(WD_INPUT_SELECT_MASK, 0)
+#define WD_INPUT_PIPE_B   REG_FIELD_PREP(WD_INPUT_SELECT_MASK, 5)
+#define WD_INPUT_PIPE_C   REG_FIELD_PREP(WD_INPUT_SELECT_MASK, 6)
+#define WD_INPUT_PIPE_D   REG_FIELD_PREP(WD_INPUT_SELECT_MASK, 7)
+#define WD_COLOR_MODE_MASK   REG_GENMASK(22, 20)
+#define WD_PIX_FMT_YUYV   REG_FIELD_PREP(WD_COLOR_MODE_MASK, 1)
+#define WD_PIX_FMT_XYUV   REG_FIELD_PREP(WD_COLOR_MODE_MASK, 2)
+#define WD_PIX_FMT_XBGR   REG_FIELD_PREP(WD_COLOR_MODE_MASK, 3)
+#define WD_PIX_FMT_Y410   REG_FIELD_PREP(WD_COLOR_MODE_MASK, 4)
+#define WD_PIX_FMT_YUV422   REG_FIELD_PREP(WD_COLOR_MODE_MASK, 5)
+#define WD_PIX_FMT_XBGR2101010   REG_FIELD_PREP(WD_COLOR_MODE_MASK, 6)
+#define WD_PIX_FMT_RGB565   REG_FIELD_PREP(WD_COLOR_MODE_MASK, 7)
+#define WD_FRAME_NUMBER_MASK   REG_GENMASK(3, 0)
+#define WD_FRAME_NUMBER(n) REG_FIELD_PREP(WD_FRAME_NUMBER_MASK, n)
+
+#define _WD_STRIDE_0   0x6e510
+#define _WD_STRIDE_1   0x6ed10
+#define WD_STRIDE(tc)  _MMIO_WD(tc,\
+   _WD_STRIDE_0,\
+   _WD_STRIDE_1)
+#define WD_STRIDE_MASK REG_GENMASK(15, 6)
+
+#define _WD_STREAMCAP_CTL0 0x6e590
+#define _WD_STREAMCAP_CTL1 0x6ed90
+#define WD_STREAMCAP_CTL(tc)   _MMIO_WD(tc,\
+   _WD_STREAMCAP_CTL0,\
+   _WD_STREAMCAP_CTL1)
+
+#define WD_STREAM_CAP_MODE_EN  REG_BIT(31)
+#define WD_SLICING_STRAT_MASK  REG_GENMASK(25, 24)
+#define WD_SLICING_STRAT_1_1   REG_FIELD_PREP(WD_SLICING_STRAT_MASK, 0)
+#define WD_SLICING_STRAT_2_1   REG_FIELD_PREP(WD_SLICING_STRAT_MASK, 1)
+#define WD_SLICING_STRAT_4_1   REG_FIELD_PREP(WD_SLICING_STRAT_MASK, 2)
+#define WD_SLICING_STRAT_8_1   REG_FIELD_PREP(WD_SLICING_STRAT_MASK, 3)
+#define WD_STREAM_OVERRUN_STATUS   1
+
+#define _WD_SURF_0 0x6e514
+#define _WD_SURF_1 0x6ed14
+#define WD_SURF(tc)   _MMIO_WD(tc,\
+   _WD_SURF_0,\
+   _WD_SURF_1)
+
+#define _WD_IMR_0  0x6e560
+#define _WD_IMR_1  0x6ed60
+#define WD_IMR(tc) _MMIO_WD(tc,\
+   _WD_IMR_0,\
+   _WD_IMR_1)
+#define WD_FRAME_COMPLETE_INT  REG_BIT(7)
+#define WD_GTT_FAULT_INT   REG_BIT(6)
+#define WD_VBLANK_INT  REG_BIT(5)
+#define WD_OVERRUN_INT REG_BIT(4)
+#define WD_CAPTURING_INT   REG_BIT(3)
+#define WD_WRITE_COMPLETE_INT  REG_BIT(2)
+
+#define _WD_IIR_0  0x6e564
+#define _WD_IIR_1  0x6ed64
+#define WD_IIR(tc) _MMIO_WD(tc,\
+   _WD_IIR_0,\
+   _WD_IIR_1)
+
+#define _WD_FRAME_STATUS_0 0x6e56b
+#define _WD_FRAME_STATUS_1 0x6ed6b
+#define WD_FRAME_STATUS(tc)   _MMIO_WD(tc,\
+   _WD_FRAME_STATUS_0,\
+   _WD_F
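
A minimal userspace sketch of how the per-transcoder register macros above resolve to an offset, and how a control value can be composed from the bit definitions. It assumes that _MMIO_TRANS() picks between two evenly spaced instances as `a + index * (b - a)`; the enum values and the helpers `pick_even`, `wd_trans_func_ctl` and `wd_func_ctl_value` are hypothetical and only model the header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum values; only the WD_0/WD_1 ordering matters here. */
enum transcoder_model {
	TRANSCODER_WD_0 = 8,
	TRANSCODER_WD_1,
};

/* Evenly spaced register instances: first offset plus index times stride. */
static uint32_t pick_even(unsigned int index, uint32_t a, uint32_t b)
{
	return a + index * (b - a);
}

/* Models WD_TRANS_FUNC_CTL(tc) with the two offsets from the header. */
static uint32_t wd_trans_func_ctl(enum transcoder_model tc)
{
	return pick_even((unsigned int)(tc - TRANSCODER_WD_0), 0x6e400, 0x6ec00);
}

/*
 * Compose a control value: enable in bit 31, input select in bits 14:12,
 * colour mode in bits 22:20, as laid out by the masks in the header.
 */
static uint32_t wd_func_ctl_value(uint32_t input_select, uint32_t color_mode)
{
	return (UINT32_C(1) << 31) |
	       ((input_select << 12) & UINT32_C(0x00007000)) |
	       ((color_mode << 20) & UINT32_C(0x00700000));
}
```

Subtracting TRANSCODER_WD_0 first is what lets the WD registers reuse a two-entry lookup even though the WD transcoders are not the first entries of the transcoder enum.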

[RFC PATCH 0/4] New Helper to Initialise writeback connector

2025-06-12 Thread Suraj Kandpal

This series is for review comments only and is not tested.
It adds a helper that initialises a writeback connector in a way
that lets drivers pass in their own connector and encoder.

Signed-off-by: Suraj Kandpal 

Suraj Kandpal (4):
  drm/writeback: Add function that takes preallocated connector
  drm/i915/writeback: Add writeback registers
  drm/i915/writeback: Add some preliminary writeback definitions
  drm/i915/writeback: Init writeback connector

 drivers/gpu/drm/drm_writeback.c   |  83 +++
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/display/intel_display.h  |   4 +
 .../drm/i915/display/intel_display_device.c   |  26 +++-
 .../drm/i915/display/intel_display_device.h   |   2 +-
 .../drm/i915/display/intel_display_limits.h   |   2 +
 .../drm/i915/display/intel_display_types.h|   1 +
 .../gpu/drm/i915/display/intel_writeback.c| 131 +
 .../gpu/drm/i915/display/intel_writeback.h|  17 +++
 .../drm/i915/display/intel_writeback_reg.h| 134 ++
 include/drm/drm_writeback.h   |   7 +
 11 files changed, 405 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/display/intel_writeback.c
 create mode 100644 drivers/gpu/drm/i915/display/intel_writeback.h
 create mode 100644 drivers/gpu/drm/i915/display/intel_writeback_reg.h

-- 
2.34.1

[RFC PATCH 1/4] drm/writeback: Add function that takes preallocated connector

2025-06-12 Thread Suraj Kandpal

Write a function that takes a preallocated drm_connector instead of
using the one allocated inside the drm writeback connector init
function.

Signed-off-by: Suraj Kandpal 

diff --git a/drivers/gpu/drm/drm_writeback.c b/drivers/gpu/drm/drm_writeback.c
index edbeab88ff2b..0d287ab9bded 100644
--- a/drivers/gpu/drm/drm_writeback.c
+++ b/drivers/gpu/drm/drm_writeback.c
@@ -414,6 +414,89 @@ int drmm_writeback_connector_init(struct drm_device *dev,
 }
 EXPORT_SYMBOL(drmm_writeback_connector_init);
 
+/*
+ * drm_writeback_connector_init_with_conn - Initialize a writeback connector with
+ * custom encoder and connector
+ *
+ * @enc: handle to the already initialized drm encoder
+ * @con_funcs: Connector funcs vtable
+ * @formats: Array of supported pixel formats for the writeback engine
+ * @n_formats: Length of the formats array
+ *
+ * This function assumes that the drm_writeback_connector's encoder has already been
+ * created and initialized before invoking this function.
+ *
+ * In addition, this function also assumes that callers of this API will manage
+ * assigning the encoder helper functions, possible_crtcs and any other encoder
+ * specific operation.
+ *
+ * Drivers should always use this function instead of drm_connector_init() to
+ * set up writeback connectors if they want to manage themselves the lifetime of the
+ * associated encoder.
+ *
+ * Returns: 0 on success, or a negative error code
+ */
+int
+drm_writeback_connector_init_with_conn(struct drm_device *dev, struct drm_connector *connector,
+                                       struct drm_writeback_connector *wb_connector,
+                                       struct drm_encoder *enc,
+                                       const struct drm_connector_funcs *con_funcs,
+                                       const u32 *formats, int n_formats)
+{
+   struct drm_property_blob *blob;
+   struct drm_mode_config *config = &dev->mode_config;
+   int ret = create_writeback_properties(dev);
+
+   if (ret != 0)
+   return ret;
+
+   blob = drm_property_create_blob(dev, n_formats * sizeof(*formats),
+   formats);
+   if (IS_ERR(blob))
+   return PTR_ERR(blob);
+
+
+   connector->interlace_allowed = 0;
+
+   ret = drm_connector_init(dev, connector, con_funcs,
+DRM_MODE_CONNECTOR_WRITEBACK);
+   if (ret)
+   goto connector_fail;
+
+   ret = drm_connector_attach_encoder(connector, enc);
+   if (ret)
+   goto attach_fail;
+
+   INIT_LIST_HEAD(&wb_connector->job_queue);
+   spin_lock_init(&wb_connector->job_lock);
+
+   wb_connector->fence_context = dma_fence_context_alloc(1);
+   spin_lock_init(&wb_connector->fence_lock);
+   snprintf(wb_connector->timeline_name,
+sizeof(wb_connector->timeline_name),
+"CONNECTOR:%d-%s", connector->base.id, connector->name);
+
+   drm_object_attach_property(&connector->base,
+  config->writeback_out_fence_ptr_property, 0);
+
+   drm_object_attach_property(&connector->base,
+  config->writeback_fb_id_property, 0);
+
+   drm_object_attach_property(&connector->base,
+  config->writeback_pixel_formats_property,
+  blob->base.id);
+   wb_connector->pixel_formats_blob_ptr = blob;
+
+   return 0;
+
+attach_fail:
+   drm_connector_cleanup(connector);
+connector_fail:
+   drm_property_blob_put(blob);
+   return ret;
+}
+EXPORT_SYMBOL(drm_writeback_connector_init_with_conn);
+
 int drm_writeback_set_fb(struct drm_connector_state *conn_state,
 struct drm_framebuffer *fb)
 {
diff --git a/include/drm/drm_writeback.h b/include/drm/drm_writeback.h
index c380a7b8f55a..149744dbeef0 100644
--- a/include/drm/drm_writeback.h
+++ b/include/drm/drm_writeback.h
@@ -167,6 +167,13 @@ int drmm_writeback_connector_init(struct drm_device *dev,
  struct drm_encoder *enc,
  const u32 *formats, int n_formats);
 
+int
+drm_writeback_connector_init_with_conn(struct drm_device *dev, struct drm_connector *connector,
+                                       struct drm_writeback_connector *wb_connector,
+                                       struct drm_encoder *enc,
+                                       const struct drm_connector_funcs *con_funcs,
+                                       const u32 *formats, int n_formats);
+
 int drm_writeback_set_fb(struct drm_connector_state *conn_state,
 struct drm_framebuffer *fb);
 
-- 
2.34.1
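
The error handling in drm_writeback_connector_init_with_conn() follows the usual goto-unwind pattern: each failure jumps to a label that undoes only the steps that already succeeded, in reverse order. A minimal userspace model of that control flow, assuming boolean flags in place of the real DRM objects; `struct wb_model` and `wb_model_init` are hypothetical names:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Flags standing in for the resources the real function acquires. */
struct wb_model {
	bool blob_held;
	bool connector_inited;
	bool encoder_attached;
};

/* fail_connector / fail_attach inject a failure at the matching step. */
static int wb_model_init(struct wb_model *wb, bool fail_connector, bool fail_attach)
{
	int ret;

	wb->blob_held = true;			/* models drm_property_create_blob() */

	ret = fail_connector ? -ENOMEM : 0;	/* models drm_connector_init() */
	if (ret)
		goto connector_fail;
	wb->connector_inited = true;

	ret = fail_attach ? -EINVAL : 0;	/* models drm_connector_attach_encoder() */
	if (ret)
		goto attach_fail;
	wb->encoder_attached = true;

	return 0;

attach_fail:
	wb->connector_inited = false;		/* models drm_connector_cleanup() */
connector_fail:
	wb->blob_held = false;			/* models drm_property_blob_put() */
	return ret;
}
```

Because the labels fall through, a failure at the attach step releases both the connector and the blob, while a failure at connector init releases only the blob.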

[RFC PATCH 3/4] drm/i915/writeback: Add some preliminary writeback definitions

2025-06-12 Thread Suraj Kandpal

Add some preliminary definitions, such as the output type and
transcoders, related to the writeback functionality.

Signed-off-by: Suraj Kandpal 

diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index 3b54a62c290a..ae474cbeb791 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -82,6 +82,10 @@ static inline const char *transcoder_name(enum transcoder transcoder)
return "DSI A";
case TRANSCODER_DSI_C:
return "DSI C";
+   case TRANSCODER_WD_0:
+   return "WD 0";
+   case TRANSCODER_WD_1:
+   return "WD 1";
default:
return "";
}
diff --git a/drivers/gpu/drm/i915/display/intel_display_device.c b/drivers/gpu/drm/i915/display/intel_display_device.c
index 90d714598664..2b187472e752 100644
--- a/drivers/gpu/drm/i915/display/intel_display_device.c
+++ b/drivers/gpu/drm/i915/display/intel_display_device.c
@@ -21,6 +21,7 @@
 #include "intel_display_types.h"
 #include "intel_fbc.h"
 #include "intel_step.h"
+#include "intel_writeback_reg.h"
 
 __diag_push();
 __diag_ignore_all("-Woverride-init", "Allow field initialization overrides for display info");
@@ -144,12 +145,16 @@ static const struct intel_display_device_info no_display = {};
[TRANSCODER_B] = PIPE_B_OFFSET, \
[TRANSCODER_C] = PIPE_C_OFFSET, \
[TRANSCODER_EDP] = PIPE_EDP_OFFSET, \
+   [TRANSCODER_WD_0] = PIPE_WD0_OFFSET, \
+   [TRANSCODER_WD_1] = PIPE_WD1_OFFSET, \
}, \
.trans_offsets = { \
[TRANSCODER_A] = TRANSCODER_A_OFFSET, \
[TRANSCODER_B] = TRANSCODER_B_OFFSET, \
[TRANSCODER_C] = TRANSCODER_C_OFFSET, \
[TRANSCODER_EDP] = TRANSCODER_EDP_OFFSET, \
+   [TRANSCODER_WD_0] = TRANSCODER_WD0_OFFSET, \
+   [TRANSCODER_WD_1] = TRANSCODER_WD1_OFFSET, \
}
 
 #define CHV_PIPE_OFFSETS \
@@ -677,7 +682,8 @@ static const struct intel_display_device_info skl_display = {
.__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
.__runtime_defaults.cpu_transcoder_mask =
BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
-   BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP),
+   BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP) |
+   BIT(TRANSCODER_WD_0) | BIT(TRANSCODER_WD_1),
.__runtime_defaults.port_mask = BIT(PORT_A) | BIT(PORT_B) | BIT(PORT_C) | BIT(PORT_D) | BIT(PORT_E),
.__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
 };
@@ -829,6 +835,7 @@ static const struct platform_desc cml_desc = {
BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP) | \
BIT(TRANSCODER_DSI_A) | BIT(TRANSCODER_DSI_C), \
+   BIT(TRANSCODER_WD_0) | BIT(TRANSCODER_WD_1), \
.__runtime_defaults.port_mask = BIT(PORT_A) | BIT(PORT_B) | BIT(PORT_C)
 
 static const enum intel_step bxt_steppings[] = {
@@ -883,6 +890,8 @@ static const struct platform_desc glk_desc = {
[TRANSCODER_EDP] = PIPE_EDP_OFFSET, \
[TRANSCODER_DSI_0] = PIPE_DSI0_OFFSET, \
[TRANSCODER_DSI_1] = PIPE_DSI1_OFFSET, \
+   [TRANSCODER_WD_0] = PIPE_WD0_OFFSET, \
+   [TRANSCODER_WD_1] = PIPE_WD1_OFFSET, \
}, \
.trans_offsets = { \
[TRANSCODER_A] = TRANSCODER_A_OFFSET, \
@@ -891,6 +900,8 @@ static const struct platform_desc glk_desc = {
[TRANSCODER_EDP] = TRANSCODER_EDP_OFFSET, \
[TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \
[TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
+   [TRANSCODER_WD_0] = TRANSCODER_WD0_OFFSET, \
+   [TRANSCODER_WD_1] = TRANSCODER_WD1_OFFSET, \
}, \
IVB_CURSOR_OFFSETS, \
ICL_COLORS, \
@@ -904,6 +915,7 @@ static const struct platform_desc glk_desc = {
BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP) | \
BIT(TRANSCODER_DSI_0) | BIT(TRANSCODER_DSI_1), \
+   BIT(TRANSCODER_WD_0) | BIT(TRANSCODER_WD_1), \
.__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A)
 
 static const u16 icl_port_f_ids[] = {
@@ -974,6 +986,8 @@ static const struct platform_desc ehl_desc = {
[TRANSCODER_D] = PIPE_D_OFFSET, \
[TRANSCODER_DSI_0] = PIPE_DSI0_OFFSET, \
[TRANSCODER_DSI_1] = PIPE_DSI1_OFFSET, \
+   [TRANSCODER_WD_0] = PIPE_WD0_OFFSET, \
+   [TRANSCODER_WD_1] = PIPE_WD1_OFFSET, \
}, \
.trans_offsets = { \
[TRANSCODER_A] = TRANSCODER_A_OFFSET, \
@@ -982,6 +996,8 @@ static const struct platform_desc ehl_desc = {
[TRANSCODER_D] = TRANSCODER_D_OFFSET, \
[TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET

RE: [PATCH v6 4/4] drm: renesas: rz-du: mipi_dsi: Add support for RZ/V2H(P) SoC

2025-06-12 Thread Biju Das

Hi Prabhakar,

> -----Original Message-----
> From: Prabhakar 
> Sent: 30 May 2025 18:19
> Subject: [PATCH v6 4/4] drm: renesas: rz-du: mipi_dsi: Add support for 
> RZ/V2H(P) SoC
> 
> From: Lad Prabhakar 
> 
> Add DSI support for Renesas RZ/V2H(P) SoC.
> 
> Co-developed-by: Fabrizio Castro 
> Signed-off-by: Fabrizio Castro 
> Signed-off-by: Lad Prabhakar 
> ---
> v5->v6:
> - Made use of GENMASK() macro for PLLCLKSET0R_PLL_*,
>   PHYTCLKSETR_* and PHYTHSSETR_* macros.
> - Replaced 1000UL with 10 * MEGA
> - Renamed mode_freq_hz to mode_freq_khz in rzv2h_dsi_mode_calc
> - Replaced `i -= 1;` with `i--;`
> - Renamed RZV2H_MIPI_DPHY_FOUT_MIN_IN_MEGA to
>   RZV2H_MIPI_DPHY_FOUT_MIN_IN_MHZ and
>   RZV2H_MIPI_DPHY_FOUT_MAX_IN_MEGA to
>   RZV2H_MIPI_DPHY_FOUT_MAX_IN_MHZ.
> 
> v4->v5:
> - No changes
> 
> v3->v4:
> - In rzv2h_dphy_find_ulpsexit() made the array static const.
> 
> v2->v3:
> - Simplified V2H DSI timings array to save space
> - Switched to use fsleep() instead of udelay()
> 
> v1->v2:
> - Dropped unused macros
> - Added missing LPCLK flag to rzv2h info
> ---
>  .../gpu/drm/renesas/rz-du/rzg2l_mipi_dsi.c| 345 ++
>  .../drm/renesas/rz-du/rzg2l_mipi_dsi_regs.h   |  34 ++
>  2 files changed, 379 insertions(+)
> 
> diff --git a/drivers/gpu/drm/renesas/rz-du/rzg2l_mipi_dsi.c b/drivers/gpu/drm/renesas/rz-du/rzg2l_mipi_dsi.c
> index a31f9b6aa920..ea554ced6713 100644
> --- a/drivers/gpu/drm/renesas/rz-du/rzg2l_mipi_dsi.c
> +++ b/drivers/gpu/drm/renesas/rz-du/rzg2l_mipi_dsi.c
> @@ -5,6 +5,7 @@
>   * Copyright (C) 2022 Renesas Electronics Corporation
>   */
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -30,6 +31,9 @@
> 
>  #define RZ_MIPI_DSI_FEATURE_16BPP	BIT(0)
> 
> +#define RZV2H_MIPI_DPHY_FOUT_MIN_IN_MHZ  (80 * MEGA)
> +#define RZV2H_MIPI_DPHY_FOUT_MAX_IN_MHZ  (1500 * MEGA)
> +
>  struct rzg2l_mipi_dsi;
> 
>  struct rzg2l_mipi_dsi_hw_info {
> @@ -40,6 +44,7 @@ struct rzg2l_mipi_dsi_hw_info {
> u64 *hsfreq_millihz);
>   unsigned int (*dphy_mode_clk_check)(struct rzg2l_mipi_dsi *dsi,
>   unsigned long mode_freq);
> + const struct rzv2h_pll_div_limits *cpg_dsi_limits;
>   u32 phy_reg_offset;
>   u32 link_reg_offset;
>   unsigned long min_dclk;
> @@ -47,6 +52,11 @@ struct rzg2l_mipi_dsi_hw_info {
>   u8 features;
>  };
> 
> +struct rzv2h_dsi_mode_calc {
> + unsigned long mode_freq_khz;
> + u64 mode_freq_hz;
> +};
> +
>  struct rzg2l_mipi_dsi {
>   struct device *dev;
>   void __iomem *mmio;
> @@ -68,6 +78,18 @@ struct rzg2l_mipi_dsi {
>   unsigned int num_data_lanes;
>   unsigned int lanes;
>   unsigned long mode_flags;
> +
> + struct rzv2h_dsi_mode_calc mode_calc;
> + struct rzv2h_plldsi_parameters dsi_parameters; };
> +
> +static const struct rzv2h_pll_div_limits rzv2h_plldsi_div_limits = {
> + .fvco = { .min = 1050 * MEGA, .max = 2100 * MEGA },
> + .m = { .min = 64, .max = 1023 },
> + .p = { .min = 1, .max = 4 },
> + .s = { .min = 0, .max = 5 },
> + .k = { .min = -32768, .max = 32767 },
> + .csdiv = { .min = 1, .max = 1 },
>  };
> 
>  static inline struct rzg2l_mipi_dsi *
> @@ -184,6 +206,155 @@ static const struct rzg2l_mipi_dsi_timings rzg2l_mipi_dsi_global_timings[] = {
>   },
>  };
> 
> +struct rzv2h_mipi_dsi_timings {
> + const u8 *hsfreq;
> + u8 len;
> + u8 start_index;
> +};
> +
> +enum {
> + TCLKPRPRCTL,
> + TCLKZEROCTL,
> + TCLKPOSTCTL,
> + TCLKTRAILCTL,
> + THSPRPRCTL,
> + THSZEROCTL,
> + THSTRAILCTL,
> + TLPXCTL,
> + THSEXITCTL,
> +};
> +
> +static const u8 tclkprprctl[] = {
> + 15, 26, 37, 47, 58, 69, 79, 90, 101, 111, 122, 133, 143, 150, };
> +
> +static const u8 tclkzeroctl[] = {
> + 9, 11, 13, 15, 18, 21, 23, 24, 25, 27, 29, 31, 34, 36, 38,
> + 41, 43, 45, 47, 50, 52, 54, 57, 59, 61, 63, 66, 68, 70, 73,
> + 75, 77, 79, 82, 84, 86, 89, 91, 93, 95, 98, 100, 102, 105,
> + 107, 109, 111, 114, 116, 118, 121, 123, 125, 127, 130, 132,
> + 134, 137, 139, 141, 143, 146, 148, 150, };
> +
> +static const u8 tclkpostctl[] = {
> + 8, 21, 34, 48, 61, 74, 88, 101, 114, 128, 141, 150, };
> +
> +static const u8 tclktrailctl[] = {
> + 14, 25, 37, 48, 59, 71, 82, 94, 105, 117, 128, 139, 150, };
> +
> +static const u8 thsprprctl[] = {
> + 11, 19, 29, 40, 50, 61, 72, 82, 93, 103, 114, 125, 135, 146, 150, };
> +
> +static const u8 thszeroctl[] = {
> + 18, 24, 29, 35, 40, 46, 51, 57, 62, 68, 73, 79, 84, 90,
> + 95, 101, 106, 112, 117, 123, 128, 134, 139, 145, 150, };
> +
> +static const u8 thstrailctl[] = {
> + 10, 21, 32, 42, 53, 64, 75, 85, 96, 107, 118, 128, 139, 150, };
> +
> +static const u8 tlpxctl[] = {
> + 13, 26, 39, 53, 66, 79, 93, 106, 119, 133, 146, 150,
> +};
> +
> +static const u8 thsexitctl[] = {
> + 15, 23, 31, 39, 47, 55, 63, 71, 79, 87,
> + 95, 103, 111, 119, 127, 135, 143, 150,

Re: [PATCH v3 0/5] drm/dp: Limit the DPCD probe quirk to the affected monitor

2025-06-12 Thread Imre Deak

Hi,

On Tue, Jun 10, 2025 at 06:42:04PM +0300, Imre Deak wrote:
> Hi Maxim, Thomas, Maarten,
> 
> could you please ack merging this patchset via drm-intel?

Any objection to merging the patchset via drm-intel? If not, could
someone ack it?

Patches 1-4 could also be merged to drm-misc-next instead, but then
patch 5 would need to wait until drm-misc-next is merged into
drm-intel.

Thanks,
Imre

> On Thu, Jun 05, 2025 at 11:28:45AM +0300, Imre Deak wrote:
> > This is v3 of [1], with the following changes requested by Jani:
> > 
> > - Convert the internal quirk list to an enum list.
> > - Track both the internal and global quirks on a single list.
> > - Drop the change to support panel name specific quirks for now.
> > 
> > [1] https://lore.kernel.org/all/20250603121543.17842-1-imre.d...@intel.com
> > 
> > Cc: Ville Syrjälä 
> > Cc: Jani Nikula 
> > 
> > Imre Deak (5):
> >   drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS
> >   drm/edid: Define the quirks in an enum list
> >   drm/edid: Add support for quirks visible to DRM core and drivers
> >   drm/dp: Add an EDID quirk for the DPCD register access probe
> >   drm/i915/dp: Disable the AUX DPCD probe quirk if it's not required
> > 
> >  drivers/gpu/drm/display/drm_dp_helper.c  |  44 ++--
> >  drivers/gpu/drm/drm_edid.c   | 227 ++-
> >  drivers/gpu/drm/i915/display/intel_dp.c  |  11 +-
> >  drivers/gpu/drm/i915/display/intel_dp_aux.c  |   2 +
> >  drivers/gpu/drm/i915/display/intel_hotplug.c |  10 +
> >  include/drm/display/drm_dp_helper.h  |   6 +
> >  include/drm/drm_connector.h  |   4 +-
> >  include/drm/drm_edid.h   |   8 +
> >  8 files changed, 189 insertions(+), 123 deletions(-)
> > 
> > -- 
> > 2.44.2
> >

Re: [PATCH v4 04/20] rust: add new `num` module with useful integer operations

2025-06-12 Thread Alexandre Courbot

On Thu Jun 12, 2025 at 10:17 PM JST, Alexandre Courbot wrote:
> On Wed Jun 4, 2025 at 4:18 PM JST, Benno Lossin wrote:
>> On Wed Jun 4, 2025 at 2:05 AM CEST, Alexandre Courbot wrote:
>>> On Wed Jun 4, 2025 at 8:02 AM JST, Benno Lossin wrote:
>>>> On Mon Jun 2, 2025 at 3:09 PM CEST, Alexandre Courbot wrote:
>>>>> On Thu May 29, 2025 at 4:27 PM JST, Benno Lossin wrote:
>>>>>> On Thu May 29, 2025 at 3:18 AM CEST, Alexandre Courbot wrote:
>>>>>>> On Thu May 29, 2025 at 5:17 AM JST, Benno Lossin wrote:
>>>>>>>> On Wed May 21, 2025 at 8:44 AM CEST, Alexandre Courbot wrote:
>>>>>>>>> +/// Align `self` up to `alignment`.
>>>>>>>>> +///
>>>>>>>>> +/// `alignment` must be a power of 2 for accurate results.
>>>>>>>>> +///
>>>>>>>>> +/// Wraps around to `0` if the requested alignment pushes the result above the type's limits.
>>>>>>>>> +///
>>>>>>>>> +/// # Examples
>>>>>>>>> +///
>>>>>>>>> +/// ```
>>>>>>>>> +/// use kernel::num::NumExt;
>>>>>>>>> +///
>>>>>>>>> +/// assert_eq!(0x4fffu32.align_up(0x1000), 0x5000);
>>>>>>>>> +/// assert_eq!(0x4000u32.align_up(0x1000), 0x4000);
>>>>>>>>> +/// assert_eq!(0x0u32.align_up(0x1000), 0x0);
>>>>>>>>> +/// assert_eq!(0xu16.align_up(0x100), 0x0);
>>>>>>>>> +/// assert_eq!(0x4fffu32.align_up(0x0), 0x0);
>>>>>>>>> +/// ```
>>>>>>>>> +fn align_up(self, alignment: Self) -> Self;
>>>>>>>>
>>>>>>>> Isn't this `next_multiple_of` [1] (it also allows non power of 2
>>>>>>>> inputs).
>>>>>>>>
>>>>>>>> [1]: https://doc.rust-lang.org/std/primitive.u32.html#method.next_multiple_of
>>>>>>>
>>>>>>> It is, however the fact that `next_multiple_of` works with non powers of
>>>>>>> two also means it needs to perform a modulo operation. That operation
>>>>>>> might well be optimized away by the compiler, but AFAICT we have no way
>>>>>>> of proving it will always be the case, hence the always-optimal
>>>>>>> implementation here.
>>>>>>
>>>>>> When you use a power of 2 constant, then I'm very sure that it will get
>>>>>> optimized [1]. Even with non-powers of 2, you don't get a division [2].
>>>>>> If you find some code that is not optimized, then sure add a custom
>>>>>> function.
>>>>>>
>>>>>> [1]: https://godbolt.org/z/57M9e36T3
>>>>>> [2]: https://godbolt.org/z/9P4P8zExh
>>>>>
>>>>> That's impressive and would definitely work well with a constant. But
>>>>> when the value is not known at compile-time, the division does occur
>>>>> unfortunately: https://godbolt.org/z/WK1bPMeEx
>>>>>
>>>>> So I think we will still need a kernel-optimized version of these
>>>>> alignment functions.
>>>>
>>>> Hmm what exactly is the use-case for a variable align amount? Could you
>>>> store it in const generics?
>>>
>>> Say you have an IOMMU with support for different pages sizes, the size
>>> of a particular page can be decided at runtime.
>>>

>>>> If not, there are also these two variants that are more efficient:
>>>>
>>>> * option: https://godbolt.org/z/ecnb19zaM
>>>> * unsafe: https://godbolt.org/z/EqTaGov71
>>>>
>>>> So if the compiler can infer it from context it still optimizes it :)
>>>
>>> I think the `Option` (and subsequent `unwrap`) is something we want to
>>> avoid on such a common operation.
>>
>> Makes sense.
>>
>>>> But yeah to be extra sure, you need your version. By the way, what
>>>> happens if `align` is not a power of 2 in your version?
>>>
>>> It will just return `(self + (self - 1)) & (alignment - 1)`, which will
>>> likely be a value you don't want.
>>
>> So wouldn't it be better to make users validate that they gave a
>> power-of-2 alignment?
>>
>>> So yes, for this particular operation we would prefer to only use powers
>>> of 2 as inputs - if we can ensure that then it solves most of our
>>> problems (can use `next_multiple_of`, no `Option`, etc).
>>>
>>> Maybe we can introduce a new integer type that, similarly to `NonZero`,
>>> guarantees that the value it stores is a power of 2? Users with const
>>> values (90+% of uses) won't see any difference, and if working with a
>>> runtime-generated value we will want to validate it anyway...
>>
>> I like this idea. But it will mean that we have to have a custom
>> function that is either standalone and const or in an extension trait :(
>> But for this one we can use the name `align_up` :)
>>
>> Here is a cool idea for the implementation: https://godbolt.org/z/x6navM5WK
>
> Yeah that's close to what I had in mind.

... with one difference though: I would like to avoid the use of
`unsafe` for something so basic, so the implementation is close to the C
one (using masks and logical operations). I think it's a great
demonstration of the compiler's abilities that we can generate an
always-optimized version of `next_multiple_of`, but for our use-case it
feels like jumping through hoops just to show that we can jump through
these hoops. I'll reconsider if there is pushback on v5 though. :)
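
For readers following along, the mask-based approach mentioned above can be sketched in C roughly as follows. This is only an illustration of the general technique, not the code from the series: the names `is_power_of_two` and `align_up_pow2` are made up for this example, and the helper assumes (and asserts) that the caller passes a power-of-two alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if exactly one bit is set in x. */
static bool is_power_of_two(uint32_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical helper: round value up to the next multiple of alignment
 * using only an addition and a mask, with no division or modulo.
 * Unsigned arithmetic wraps, so a result that would not fit in 32 bits
 * comes back as 0.
 */
static uint32_t align_up_pow2(uint32_t value, uint32_t alignment)
{
	uint32_t mask;

	/* The mask trick is only valid for power-of-two alignments. */
	assert(is_power_of_two(alignment));

	mask = alignment - 1;
	return (value + mask) & ~mask;
}
```

Validating the alignment once, at the point where it is created, is what a dedicated power-of-two type would buy: the check could then be dropped from the hot path.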

Re: [PATCH v2 05/10] drm/xe/xe_late_bind_fw: Load late binding firmware

2025-06-12 Thread Nilawar, Badal




On 12-06-2025 17:24, Usyskin, Alexander wrote:

Subject: Re: [PATCH v2 05/10] drm/xe/xe_late_bind_fw: Load late binding firmware



On 6/6/2025 10:57 AM, Badal Nilawar wrote:

Load late binding firmware

v2:
   - s/EAGAIN/EBUSY/
   - Flush worker in suspend and driver unload (Daniele)

Signed-off-by: Badal Nilawar 
---
   drivers/gpu/drm/xe/xe_late_bind_fw.c   | 121 -
   drivers/gpu/drm/xe/xe_late_bind_fw.h   |   1 +
   drivers/gpu/drm/xe/xe_late_bind_fw_types.h |   5 +
   3 files changed, 126 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c

index 0231f3dcfc18..7fe304c54374 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
@@ -16,6 +16,16 @@
   #include "xe_late_bind_fw.h"
   #include "xe_pcode.h"
   #include "xe_pcode_api.h"
+#include "xe_pm.h"
+
+/*
+ * The component should load quite quickly in most cases, but it could take
+ * a bit. Using a very big timeout just to cover the worst case scenario
+ */
+#define LB_INIT_TIMEOUT_MS 2
+
+#define LB_FW_LOAD_RETRY_MAXCOUNT 40
+#define LB_FW_LOAD_RETRY_PAUSE_MS 50

Are those retry values spec'd anywhere? For GSC we use those because the
GSC specs say to retry in 50ms intervals for up to 2 secs to give time
for the GSC to do proxy handling. Does it make sense to do the same in
this case, given that there is no proxy involved?


Here 50ms is too small; we are waiting for other OS components to release
the handle.
We usually do 3 retries of 2 sec in user-space, but that is too big for the
kernel, so let's use 200ms steps for up to 6 sec.


Sure, I will change the intervals.

Regards,
Badal
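
As a rough illustration of the retry policy discussed above (200ms steps for up to 6 sec, i.e. 30 attempts), a bounded retry loop could look like the user-space sketch below. All names here (`demo_retry_while_busy`, `demo_busy_then_ok`, the `DEMO_*` macros) are hypothetical and exist only for this example; the pause is passed in as a callback so the sketch does not depend on any particular sleep primitive.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values from the suggestion above: 200 ms steps for up to 6 s. */
#define DEMO_RETRY_PAUSE_MS	200
#define DEMO_RETRY_TOTAL_MS	6000
#define DEMO_RETRY_MAXCOUNT	(DEMO_RETRY_TOTAL_MS / DEMO_RETRY_PAUSE_MS)

typedef int (*demo_op_fn)(void *ctx);
typedef void (*demo_pause_fn)(unsigned int ms);

/*
 * Hypothetical helper: call op() until it returns something other than
 * -EBUSY or the attempt budget is used up, pausing between attempts.
 * Returns the last value returned by op().
 */
static int demo_retry_while_busy(demo_op_fn op, void *ctx,
				 unsigned int max_tries,
				 unsigned int pause_ms, demo_pause_fn pause)
{
	int ret = -EBUSY;

	assert(op != NULL && max_tries > 0);

	while (max_tries--) {
		ret = op(ctx);
		if (ret != -EBUSY)
			break;
		/* No point in pausing after the final attempt. */
		if (max_tries && pause)
			pause(pause_ms);
	}

	return ret;
}

/* Hypothetical stand-in for the firmware push: busy for *ctx calls, then OK. */
static int demo_busy_then_ok(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EBUSY;
	}
	return 0;
}
```

Unlike the loop in the patch, this sketch stops immediately on any error other than -EBUSY and skips the pause after the last attempt; whether that is desirable for the real driver is a separate question.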



- -
Thanks,
Sasha

Re: [PATCH] accel/ivpu: Add turbo flag to the DRM_IVPU_CMDQ_CREATE ioctl

2025-06-12 Thread Falkowski, Maciej


On 6/6/2025 6:30 PM, Jeff Hugo wrote:


On 6/5/2025 10:20 AM, Maciej Falkowski wrote:

From: Andrzej Kacprowski 

Introduces a new parameter to the DRM_IVPU_CMDQ_CREATE ioctl,


Introduce

Ack, thanks.



enabling turbo mode for jobs submitted via the command queue.
Turbo mode allows jobs to run at higher frequencies,
potentially improving performance for demanding workloads.

The change also adds the IVPU_TEST_MODE_TURBO_DISABLE flag


"This change" is redundant. Just start with "Also add the..."

Ack, thanks.



to allow test mode to explicitly disable turbo mode
requested by the application.
The IVPU_TEST_MODE_TURBO mode has been renamed to
IVPU_TEST_MODE_TURBO_ENABLE for clarity and consistency.

+/* Command queue flags */
+#define DRM_IVPU_CMDQ_FLAG_TURBO 0x0001
+
  /**
   * struct drm_ivpu_cmdq_create - Create command queue for job 
submission

   */
@@ -462,6 +465,17 @@ struct drm_ivpu_cmdq_create {
   * %DRM_IVPU_JOB_PRIORITY_REALTIME
   */
  __u32 priority;
+    /**
+ * @flags:
+ *
+ * Supported flags:
+ *
+ * %DRM_IVPU_CMDQ_FLAG_TURBO
+ *
+ * Enable low-latency mode for the command queue. The NPU will maximize performance
+ * when executing jobs from such queue at the cost of increased power usage.
+ */
+    __u32 flags;


This is going to break the struct size on compat.  You probably need a 
__u32 reserved to maintain 64-bit alignment. 


Thank you for the suggestion.
I think compat is preserved here, as u32 imposes only 4-byte alignment on 64-bit,
so the struct size is going to be 12 bytes on both 32-bit and 64-bit. I
tested this manually.

Please correct me if I am wrong.

Best regards,
Maciej

Re: [PATCH v2 05/10] drm/xe/xe_late_bind_fw: Load late binding firmware

2025-06-12 Thread Nilawar, Badal




On 11-06-2025 05:47, Daniele Ceraolo Spurio wrote:



On 6/6/2025 10:57 AM, Badal Nilawar wrote:

Load late binding firmware

v2:
  - s/EAGAIN/EBUSY/
  - Flush worker in suspend and driver unload (Daniele)

Signed-off-by: Badal Nilawar 
---
  drivers/gpu/drm/xe/xe_late_bind_fw.c   | 121 -
  drivers/gpu/drm/xe/xe_late_bind_fw.h   |   1 +
  drivers/gpu/drm/xe/xe_late_bind_fw_types.h |   5 +
  3 files changed, 126 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
index 0231f3dcfc18..7fe304c54374 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
@@ -16,6 +16,16 @@
  #include "xe_late_bind_fw.h"
  #include "xe_pcode.h"
  #include "xe_pcode_api.h"
+#include "xe_pm.h"
+
+/*
+ * The component should load quite quickly in most cases, but it could take
+ * a bit. Using a very big timeout just to cover the worst case scenario
+ */
+#define LB_INIT_TIMEOUT_MS 2
+
+#define LB_FW_LOAD_RETRY_MAXCOUNT 40
+#define LB_FW_LOAD_RETRY_PAUSE_MS 50


Are those retry values spec'd anywhere? For GSC we use those because 
the GSC specs say to retry in 50ms intervals for up to 2 secs to give 
time for the GSC to do proxy handling. Does it make sense to do the 
same in this case, given that there is no proxy involved?



    static const char * const fw_type_to_name[] = {
  [CSC_LATE_BINDING_TYPE_FAN_CONTROL] = "fan_control",
@@ -39,6 +49,95 @@ static int late_bind_fw_num_fans(struct xe_late_bind *late_bind)
  return 0;
  }
+static void xe_late_bind_wait_for_worker_completion(struct xe_late_bind *late_bind)
+{
+    struct xe_device *xe = late_bind_to_xe(late_bind);
+    struct xe_late_bind_fw *lbfw;
+
+    lbfw = &late_bind->late_bind_fw;
+    if (lbfw->valid && late_bind->wq) {
+    drm_dbg(&xe->drm, "Flush work: load %s firmware\n",
+    fw_type_to_name[lbfw->type]);
+    flush_work(&lbfw->work);
+    }
+}
+
+static void late_bind_work(struct work_struct *work)
+{
+    struct xe_late_bind_fw *lbfw = container_of(work, struct xe_late_bind_fw, work);
+    struct xe_late_bind *late_bind = container_of(lbfw, struct xe_late_bind,
+  late_bind_fw);
+    struct xe_device *xe = late_bind_to_xe(late_bind);
+    int retry = LB_FW_LOAD_RETRY_MAXCOUNT;
+    int ret;
+    int slept;
+
+    if (!late_bind->component_added)
+    return;
+
+    if (!lbfw->valid)
+    return;


The first check is redundant because lbfw->valid can't be true if
late_bind->component_added is false with the current code.

I will remove this check, since lbfw->valid is already checked when
scheduling the work. Shall I also remove the check for
late_bind->component_added, as it is also checked before scheduling the work?



+
+    /* we can queue this before the component is bound */
+    for (slept = 0; slept < LB_INIT_TIMEOUT_MS; slept += 100) {
+    if (late_bind->component.ops)
+    break;
+    msleep(100);
+    }
+
+    xe_pm_runtime_get(xe);
+    mutex_lock(&late_bind->mutex);
+
+    if (!late_bind->component.ops) {
+    drm_err(&xe->drm, "Late bind component not bound\n");
+    goto out;
+    }
+
+    drm_dbg(&xe->drm, "Load %s firmware\n", fw_type_to_name[lbfw->type]);
+
+    do {
+    ret = late_bind->component.ops->push_config(late_bind->component.mei_dev,
+    lbfw->type, lbfw->flags,
+    lbfw->payload, lbfw->payload_size);
+    if (!ret)
+    break;
+    msleep(LB_FW_LOAD_RETRY_PAUSE_MS);
+    } while (--retry && ret == -EBUSY);
+
+    if (ret)
+    drm_err(&xe->drm, "Load %s firmware failed with err %d\n",
+    fw_type_to_name[lbfw->type], ret);
+    else
+    drm_dbg(&xe->drm, "Load %s firmware successful\n",
+    fw_type_to_name[lbfw->type]);
+out:
+    mutex_unlock(&late_bind->mutex);
+    xe_pm_runtime_put(xe);
+}
+
+int xe_late_bind_fw_load(struct xe_late_bind *late_bind)
+{
+    struct xe_device *xe = late_bind_to_xe(late_bind);
+    struct xe_late_bind_fw *lbfw;
+
+    if (!late_bind->component_added)
+    return -EINVAL;
+
+    lbfw = &late_bind->late_bind_fw;
+    if (lbfw->valid) {
+    drm_dbg(&xe->drm, "Queue work: to load %s firmware\n",
+    fw_type_to_name[lbfw->type]);


This log seems a bit too specific, also given that you also have logs 
inside the work


Will remove this log.

Thanks,
Badal



Daniele


+    queue_work(late_bind->wq, &lbfw->work);
+    }
+
+    return 0;
+}
+
+/**
+ * late_bind_fw_init() - initialize late bind firmware
+ *
+ * Return: 0 if the initialization was successful, a negative errno otherwise.
+ */
  static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 type)
  {
  struct xe_device *xe = late_bind_to_xe(late_bind);
@@ -87,6 +186,7 @@ static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 type)

Re: [PATCH 1/3] drm/tests: Do not use drm_fb_blit() in format-helper tests

2025-06-12 Thread Thomas Zimmermann


Hi

Am 06.06.25 um 09:41 schrieb Maxime Ripard:

On Wed, Jun 04, 2025 at 05:45:42PM +0200, Thomas Zimmermann wrote:

Export additional helpers from the format-helper library and open-code
drm_fb_blit() in tests. Prepares for the removal of drm_fb_blit(). Only
sysfb drivers use drm_fb_blit(). The function will soon be removed from
format helpers and be refactored within sysfb helpers.

Signed-off-by: Thomas Zimmermann 

Alternatively, we have VISIBLE_IF_KUNIT and EXPORT_SYMBOL_IF_KUNIT, or
EXPORT_SYMBOL_FOR_TESTS_ONLY if you don't want to make it part of the
API.


Once drm_fb_blit() has been merged into the sysfb helpers, these symbols 
would be exported anyway.




Either way,

Acked-by: Maxime Ripard 


Thanks for looking over the series.

Best regards
Thomas



Maxime


--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)

Re: [PATCH] drm/i915/panel: make panel funcs static

2025-06-12 Thread Gustavo Sousa

Quoting Jani Nikula (2025-06-12 09:46:17-03:00)
>The drm panel funcs should be static, fix it.
>
>Fixes: 3fdd5bfbd638 ("drm/i915/panel: register drm_panel and call prepare/unprepare for ICL+ DSI")
>Signed-off-by: Jani Nikula 

Reviewed-by: Gustavo Sousa 

>---
> drivers/gpu/drm/i915/display/intel_panel.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/i915/display/intel_panel.c b/drivers/gpu/drm/i915/display/intel_panel.c
>index f956919dc648..e28ad72c4f2b 100644
>--- a/drivers/gpu/drm/i915/display/intel_panel.c
>+++ b/drivers/gpu/drm/i915/display/intel_panel.c
>@@ -502,7 +502,7 @@ static void intel_panel_sync_state(struct intel_connector *connector)
> drm_modeset_unlock(&display->drm->mode_config.connection_mutex);
> }
> 
>-const struct drm_panel_funcs dummy_panel_funcs = {
>+static const struct drm_panel_funcs dummy_panel_funcs = {
> };
> 
> int intel_panel_register(struct intel_connector *connector)
>-- 
>2.39.5
>

Re: [PATCH] accel/ivpu: Add turbo flag to the DRM_IVPU_CMDQ_CREATE ioctl

2025-06-12 Thread Jeff Hugo


On 6/12/2025 7:31 AM, Falkowski, Maciej wrote:

On 6/6/2025 6:30 PM, Jeff Hugo wrote:


On 6/5/2025 10:20 AM, Maciej Falkowski wrote:

From: Andrzej Kacprowski 

Introduces a new parameter to the DRM_IVPU_CMDQ_CREATE ioctl,


Introduce

Ack, thanks.



enabling turbo mode for jobs submitted via the command queue.
Turbo mode allows jobs to run at higher frequencies,
potentially improving performance for demanding workloads.

The change also adds the IVPU_TEST_MODE_TURBO_DISABLE flag


"This change" is redundant. Just start with "Also add the..."

Ack, thanks.



to allow test mode to explicitly disable turbo mode
requested by the application.
The IVPU_TEST_MODE_TURBO mode has been renamed to
IVPU_TEST_MODE_TURBO_ENABLE for clarity and consistency.

+/* Command queue flags */
+#define DRM_IVPU_CMDQ_FLAG_TURBO 0x0001
+
  /**
   * struct drm_ivpu_cmdq_create - Create command queue for job 
submission

   */
@@ -462,6 +465,17 @@ struct drm_ivpu_cmdq_create {
   * %DRM_IVPU_JOB_PRIORITY_REALTIME
   */
  __u32 priority;
+    /**
+ * @flags:
+ *
+ * Supported flags:
+ *
+ * %DRM_IVPU_CMDQ_FLAG_TURBO
+ *
+ * Enable low-latency mode for the command queue. The NPU will maximize performance
+ * when executing jobs from such queue at the cost of increased power usage.
+ */
+    __u32 flags;


This is going to break the struct size on compat.  You probably need a 
__u32 reserved to maintain 64-bit alignment. 


Thank you for the suggestion.
I think compat is preserved here, as u32 imposes only 4-byte alignment on 64-bit,
so the struct size is going to be 12 bytes on both 32-bit and 64-bit. I
tested this manually.

Please correct me if I am wrong.


Looks like I'm wrong. The majority of the structures have 64-bit values,
and I didn't notice that this specific one has only 32-bit values.


My initial comment was based on 
https://docs.kernel.org/process/botching-up-ioctls.html - specifically:


Pad the entire struct to a multiple of 64-bits if the structure contains 
64-bit types - the structure size will otherwise differ on 32-bit versus 
64-bit. Having a different structure size hurts when passing arrays of 
structures to the kernel, or if the kernel checks the structure size, 
which e.g. the drm core does.


Ok. This was the only functional comment, and it is resolved. The other 
two are trivial fixups, so I think with those -


Reviewed-by: Jeff Hugo
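
To make the size argument above concrete, here is a small user-space sketch. The struct names and fields are invented for illustration and are not the real uapi definitions. It only shows that a struct made purely of 32-bit members has the same size on 32-bit and 64-bit ABIs, and how an explicit reserved member keeps a struct that does contain a 64-bit member at a multiple of 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ioctl argument with only 32-bit members: 4-byte alignment. */
struct demo_cmdq_create {
	uint32_t cmdq_id;
	uint32_t priority;
	uint32_t flags;
};

/* Hypothetical ioctl argument with a 64-bit member and explicit padding. */
struct demo_with_u64 {
	uint64_t user_ptr;
	uint32_t flags;
	uint32_t reserved;	/* pads the struct to a multiple of 64 bits */
};

/* Compile-time checks, so a layout change breaks the build, not user space. */
static_assert(sizeof(struct demo_cmdq_create) == 12,
	      "three u32 members, no padding needed");
static_assert(sizeof(struct demo_with_u64) == 16,
	      "explicit padding keeps the size identical on 32-bit and 64-bit");
```

Without the `reserved` member, the second struct would be 12 bytes on ABIs where 64-bit integers are only 4-byte aligned (e.g. 32-bit x86) and 16 bytes elsewhere, which is the mismatch the quoted documentation warns about.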

Re: [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job

2025-06-12 Thread Tvrtko Ursulin




On 11/06/2025 22:21, Danilo Krummrich wrote:

On Tue, 2025-06-03 at 13:27 +0100, Tvrtko Ursulin wrote:

On 03/06/2025 10:31, Philipp Stanner wrote:
What I am not that ecstatic about is only getting the Suggested-by
credit in 1/6, given it is basically my patch with some cosmetic
changes like the kernel doc and the cancel loop extracted to a helper.


Sign the patch off and I give you the authorship if you want.


AFAICS, the proposal of having cancel_job() has been a review comment which has
been clarified with a reference patch.


Right, this one:

https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursu...@igalia.com/


IMO, the fact that after some discussion Philipp decided to go with this
suggestion and implement the suggestion in his patch series does not result in
an obligation for him to hand over authorship of the patch he wrote to the
person who suggested the change in the context of the code review.


It is fine. Just that instead of rewriting we could have also said
something along the lines of "Okay, let's go with your version after all,
just please tweak this or that". Which in my experience would have been
more typical.



Anyways, it seems that Philipp did offer it however, so this seems to be
resolved?


At the end of the day the very fact a neater solution is going in is the 
main thing for me. Authorship is not that important, only that the way 
of working I follow, both as a maintainer and a colleague, aspires to be 
more like what I described in the previous paragraph.


I am not sure I can review this version though. It feels it would be too 
much like reviewing my own code so wouldn't carry the fully weight of 
review. Technically I probably could, but in reality someone else should 
probably better do it.


Regards,

Tvrtko

Re: [PATCH v3 0/5] drm/dp: Limit the DPCD probe quirk to the affected monitor

2025-06-12 Thread Thomas Zimmermann


Hi

Am 12.06.25 um 15:29 schrieb Imre Deak:

Hi,

On Tue, Jun 10, 2025 at 06:42:04PM +0300, Imre Deak wrote:

Hi Maxim, Thomas, Maarten,

could you please ack merging this patchset via drm-intel?

Any objection to merging the patchset via drm-intel? If not, could
someone ack it?


Sorry for missing that. I'm OK with merging it through Intel trees. Go 
ahead.


Best regards
Thomas



Patches 1-4 could also be merged to drm-misc-next instead, but then
patch 5 would need to wait until drm-misc-next is merged into
drm-intel.

Thanks,
Imre


On Thu, Jun 05, 2025 at 11:28:45AM +0300, Imre Deak wrote:

This is v3 of [1], with the following changes requested by Jani:

- Convert the internal quirk list to an enum list.
- Track both the internal and global quirks on a single list.
- Drop the change to support panel name specific quirks for now.

[1] https://lore.kernel.org/all/20250603121543.17842-1-imre.d...@intel.com

Cc: Ville Syrjälä 
Cc: Jani Nikula 

Imre Deak (5):
   drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS
   drm/edid: Define the quirks in an enum list
   drm/edid: Add support for quirks visible to DRM core and drivers
   drm/dp: Add an EDID quirk for the DPCD register access probe
   drm/i915/dp: Disable the AUX DPCD probe quirk if it's not required

  drivers/gpu/drm/display/drm_dp_helper.c  |  44 ++--
  drivers/gpu/drm/drm_edid.c   | 227 ++-
  drivers/gpu/drm/i915/display/intel_dp.c  |  11 +-
  drivers/gpu/drm/i915/display/intel_dp_aux.c  |   2 +
  drivers/gpu/drm/i915/display/intel_hotplug.c |  10 +
  include/drm/display/drm_dp_helper.h  |   6 +
  include/drm/drm_connector.h  |   4 +-
  include/drm/drm_edid.h   |   8 +
  8 files changed, 189 insertions(+), 123 deletions(-)

--
2.44.2



--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)

[PATCH v5 02/23] rust: make ETIMEDOUT error available

2025-06-12 Thread Alexandre Courbot

We will use this error in the nova-core driver.

Reviewed-by: Benno Lossin 
Signed-off-by: Alexandre Courbot 
---
 rust/kernel/error.rs | 1 +
 1 file changed, 1 insertion(+)

diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index 
3dee3139fcd4379b94748c0ba1965f4e1865b633..083c7b068cf4e185100de96e520c54437898ee72
 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -65,6 +65,7 @@ macro_rules! declare_err {
 declare_err!(EDOM, "Math argument out of domain of func.");
 declare_err!(ERANGE, "Math result not representable.");
 declare_err!(EOVERFLOW, "Value too large for defined data type.");
+declare_err!(ETIMEDOUT, "Connection timed out.");
 declare_err!(ERESTARTSYS, "Restart the system call.");
 declare_err!(ERESTARTNOINTR, "System call was interrupted by a signal and 
will be restarted.");
 declare_err!(ERESTARTNOHAND, "Restart if no handler.");

-- 
2.49.0

[PATCH v5 00/23] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization

2025-06-12 Thread Alexandre Courbot

Hi everyone,

The feedback on v4 has been (hopefully) addressed. I guess the main
remaining unknown is the direction of the `num` module; for this
iteration, following the received feedback I have eschewed the extension
trait and implemented the alignment functions as methods of the new
`PowerOfTwo` type. This has the benefit of making it impossible to call
them with undesirable (i.e. non-power of two) values. The `fls` function
is now provided as a series of const functions for each supported type,
generated by a macro.
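
To make the idea concrete, here is a minimal userspace sketch of the approach described above. It is not the code from the series: the `u64`-only `PowerOfTwo`, the `try_new` constructor, and the `fls_u32`/`fls_u64` names are assumptions chosen for illustration.

```rust
/// Hypothetical stand-in for a `PowerOfTwo` type, specialized to `u64`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PowerOfTwo(u64);

impl PowerOfTwo {
    /// Returns `None` unless `value` is a power of two (0 is rejected).
    pub const fn try_new(value: u64) -> Option<Self> {
        if value.is_power_of_two() {
            Some(Self(value))
        } else {
            None
        }
    }

    /// Rounds `value` down to the nearest multiple of `self`.
    pub const fn align_down(self, value: u64) -> u64 {
        value & !(self.0 - 1)
    }

    /// Rounds `value` up to the nearest multiple of `self`, or returns
    /// `None` if the result does not fit in a `u64`.
    pub const fn align_up(self, value: u64) -> Option<u64> {
        match value.checked_add(self.0 - 1) {
            Some(v) => Some(v & !(self.0 - 1)),
            None => None,
        }
    }
}

/// Generates a const "find last set" function for one integer type.
/// The result is the 1-based index of the highest set bit, or 0 for input 0.
macro_rules! impl_fls {
    ($name:ident, $t:ty) => {
        pub const fn $name(v: $t) -> u32 {
            <$t>::BITS - v.leading_zeros()
        }
    };
}

impl_fls!(fls_u32, u32);
impl_fls!(fls_u64, u64);
```

Since the alignment methods hang off the type rather than off the raw integer, a caller has to go through `try_new` first, which is where a non-power-of-two value gets rejected.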

It feels like the `num` module could be its own series though, so if
there is still discussion about it, I can also extract it and implement
the functionality we need in nova-core as local helper functions until
it gets merged at its own pace.

As previously, this series only successfully probes Ampere GPUs, but
support for other generations is on the way.

Upon successful probe, the driver will display the range of the WPR2
region constructed by FWSEC-FRTS with debug priority:

  [   95.436000] NovaCore :01:00.0: WPR2: 0xffc0-0xffce
  [   95.436002] NovaCore :01:00.0: GPU instance built

This series is based on v6.16-rc1 with no other dependencies.

There are bits of documentation still missing; these are addressed by
Joel in his own documentation patch series [1]. I'll also double-check
and send follow-up patches if anything is still missing after that.

[1] 
https://lore.kernel.org/rust-for-linux/20250503040802.1411285-1-joelagn...@nvidia.com/

Signed-off-by: Alexandre Courbot 
---
Changes in v5:
- Rebased on top of 6.16-rc1.
- Improve invariants of CoherentAllocation related to the new `size`
  method.
- Use SZ_* consts when redefining BAR0 size.
- Split VBIOS patch into 3 patches (Joel)
- Convert all `Result<()>` into `Result`.
- Use `::cast()` instead of ` as ` to convert pointer types.
- Use `KBox` instead of `Arc` for falcon HALs.
- Do not use `get_` prefix on methods that do not increase reference
  count.
- Replace arbitrary immediate values with proper constants.
- Use EIO to indicate firmware errors.
- Use inspect_err to be more verbose on which step of the FWSEC setup
  failed.
- Move sysmem flush page into its own type and add its registration to
  the FB HAL.
- Turn HAL getters into standalone functions.
- Patch FWSEC command at construction time.
- Force the signing stage (or an explicit non-signing state transition)
  on the firmware DMA objects.
- Link to v4: 
https://lore.kernel.org/r/20250521-nova-frts-v4-0-05dfd4f39...@nvidia.com

Changes in v4:
- Improve documentation of falcon security modes (thanks Joel!)
- Add the definition of the size of CoherentAllocation as one of its
  invariants.
- Better document GFW boot progress, registers and use wait_on() helper,
  and move it to `gfw` module instead of `devinit`.
- Add missing TODOs for workarounds waiting to be replaced by in-flight
  R4L features.
- Register macro: add the offset of the register as a type constant, and
  allow register aliases for registers which can be interpreted
  differently depending on context.
- Rework the `num` module using only macros (to allow use of overflowing
  ops), and add the `PowerOfTwo` type.
- Add a proper HAL to the `fb` module.
- Move HAL builders to impl blocks of Chipset.
- Add proper types and traits for signatures.
- Proactively split FalconFirmware into distinct traits to ease
  management of v2 vs v3 FWSEC headers that will be needed for Turing
  support.
- Link to v3:
  https://lore.kernel.org/r/20250507-nova-frts-v3-0-fcb027497...@nvidia.com

Changes in v3:
- Rebased on top of latest nova-next.
- Use the new Devres::access() and remove the now unneeded with_bar!()
  macro.
- Dropped `rust: devres: allow to borrow a reference to the resource's
  Device` as it is not needed anymore.
- Fixed more erroneous uses of `ERANGE` error.
- Optimized alignment computations of the FB layout a bit.
- Link to v2: 
https://lore.kernel.org/r/20250501-nova-frts-v2-0-b4a137175...@nvidia.com

Changes in v2:
- Rebased on latest nova-next.
- Fixed all clippy warnings.
- Added `count` and `size` methods to `CoherentAllocation`.
- Added method to obtain a reference to the `Device` from a `Devres`
  (this is super convenient).
- Split `DmaObject` into its own patch and added `Deref` implementation.
- Squashed field names from [3] into "extract FWSEC from BIOS".
- Fixed erroneous use of `ERANGE` error.
- Reworked `register!()` macro towards a more intuitive syntax, moved
  its helper macros into internal rules to avoid polluting the macro
  namespace.
- Renamed all registers to capital snake case to better match OpenRM.
- Removed declarations for registers that are not used yet.
- Added more documentation for items not covered by Joel's documentation
  patches.
- Removed timer device and replaced it with a helper function using
  `Ktime`. This also made [4] unneeded so it is dropped.
- Unregister the sysmem flush page upon device destruction.
- ... probably more that I forgot. >_<

[PATCH v5 23/23] gpu: nova-core: load and run FWSEC-FRTS

2025-06-12 Thread Alexandre Courbot

With all the required pieces in place, load FWSEC-FRTS onto the GSP
falcon, run it, and check that it successfully carved the WPR2 region
out of framebuffer memory.
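
For readers who want to follow the final WPR2 check outside of the driver, the logic can be sketched in plain Rust as below. This is an illustration only: `Wpr2Error`, `wpr2_addr` and `check_wpr2` are hypothetical names, and the raw `u32` arguments stand in for the values read from the two WPR2 address registers (field in bits 31:4, shifted left by 12 as in the patch).

```rust
/// Outcomes of the WPR2 check (hypothetical type, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum Wpr2Error {
    /// The HI register still reads zero: no region was created.
    NotCreated,
    /// The region starts somewhere other than the requested address.
    UnexpectedStart { found: u64, expected: u64 },
}

/// Converts a raw WPR2 address register value into a byte address:
/// take the field in bits 31:4, then shift it left by 12.
pub const fn wpr2_addr(raw: u32) -> u64 {
    ((raw >> 4) as u64) << 12
}

/// Mirrors the checks done after FWSEC-FRTS has run: the region must exist
/// and must start at the address that was requested.
pub fn check_wpr2(raw_lo: u32, raw_hi: u32, expected_start: u64) -> Result<(u64, u64), Wpr2Error> {
    let (lo, hi) = (wpr2_addr(raw_lo), wpr2_addr(raw_hi));

    if hi == 0 {
        Err(Wpr2Error::NotCreated)
    } else if lo != expected_start {
        Err(Wpr2Error::UnexpectedStart {
            found: lo,
            expected: expected_start,
        })
    } else {
        Ok((lo, hi))
    }
}
```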

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/falcon.rs |  3 --
 drivers/gpu/nova-core/gpu.rs| 63 -
 drivers/gpu/nova-core/regs.rs   | 15 ++
 3 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 
25ed8ee30def3abcc43bcba965eb62f49d532604..486be64895a0250ae4263de708784a8fdf1d54b5
 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -2,9 +2,6 @@
 
 //! Falcon microprocessor base support
 
-// To be removed when all code is used.
-#![expect(dead_code)]
-
 use core::ops::Deref;
 use core::time::Duration;
 use hal::FalconHal;
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 
b0bc390b972b5e75538797acd6abffd013a8a159..7af35ffa1d2f900e0117a55ec41312d16d718f67
 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -226,7 +226,7 @@ pub(crate) fn new(
 
 let bios = Vbios::new(pdev, bar)?;
 
-let _fwsec_frts = FwsecFirmware::new(
+let fwsec_frts = FwsecFirmware::new(
 &gsp_falcon,
 pdev.as_ref(),
 bar,
@@ -237,6 +237,67 @@ pub(crate) fn new(
 },
 )?;
 
+// Check that the WPR2 region does not already exist - if it does, 
the GPU needs to be
+// reset.
+if regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() != 0 {
+dev_err!(
+pdev.as_ref(),
+"WPR2 region already exists - GPU needs to be reset to 
proceed\n"
+);
+return Err(EBUSY);
+}
+
+// Reset falcon, load FWSEC-FRTS, and run it.
+gsp_falcon
+.reset(bar)
+.inspect_err(|e| dev_err!(pdev.as_ref(), "Failed to reset GSP 
falcon: {:?}\n", e))?;
+gsp_falcon
+.dma_load(bar, &fwsec_frts)
+.inspect_err(|e| dev_err!(pdev.as_ref(), "Failed to load 
FWSEC-FRTS: {:?}\n", e))?;
+let (mbox0, _) = gsp_falcon
+.boot(bar, Some(0), None)
+.inspect_err(|e| dev_err!(pdev.as_ref(), "Failed to boot 
FWSEC-FRTS: {:?}\n", e))?;
+if mbox0 != 0 {
+dev_err!(pdev.as_ref(), "FWSEC firmware returned error {}\n", 
mbox0);
+return Err(EIO);
+}
+
+// SCRATCH_E contains FWSEC-FRTS' error code, if any.
+let frts_status = 
regs::NV_PBUS_SW_SCRATCH_0E::read(bar).frts_err_code();
+if frts_status != 0 {
+dev_err!(
+pdev.as_ref(),
+"FWSEC-FRTS returned with error code {:#x}",
+frts_status
+);
+return Err(EIO);
+}
+
+// Check the WPR2 has been created as we requested.
+let (wpr2_lo, wpr2_hi) = (
+(regs::NV_PFB_PRI_MMU_WPR2_ADDR_LO::read(bar).lo_val() as u64) << 
12,
+(regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() as u64) << 
12,
+);
+if wpr2_hi == 0 {
+dev_err!(
+pdev.as_ref(),
+"WPR2 region not created after running FWSEC-FRTS\n"
+);
+
+return Err(EIO);
+} else if wpr2_lo != fb_layout.frts.start {
+dev_err!(
+pdev.as_ref(),
+"WPR2 region created at unexpected address {:#x}; expected 
{:#x}\n",
+wpr2_lo,
+fb_layout.frts.start,
+);
+return Err(EIO);
+}
+
+dev_dbg!(pdev.as_ref(), "WPR2: {:#x}-{:#x}\n", wpr2_lo, wpr2_hi);
+dev_dbg!(pdev.as_ref(), "GPU instance built\n");
+
 Ok(pin_init!(Self {
 spec,
 bar: devres_bar,
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 
54d4d37d6bf2c31947b965258d2733009c293a18..2a2d5610e552780957bcf00e0da1ec4cd3ac85d2
 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -42,6 +42,13 @@ pub(crate) fn chipset(self) -> Result {
 }
 }
 
+/* PBUS */
+
+// TODO: this is an array of registers.
+register!(NV_PBUS_SW_SCRATCH_0E@0x1438  {
+31:16   frts_err_code as u16;
+});
+
 /* PFB */
 
 register!(NV_PFB_NISO_FLUSH_SYSMEM_ADDR @ 0x00100c10 {
@@ -73,6 +80,14 @@ pub(crate) fn usable_fb_size(self) -> u64 {
 }
 }
 
+register!(NV_PFB_PRI_MMU_WPR2_ADDR_LO@0x001fa824  {
+31:4    lo_val as u32;
+});
+
+register!(NV_PFB_PRI_MMU_WPR2_ADDR_HI@0x001fa828  {
+31:4    hi_val as u32;
+});
+
 /* PGC6 */
 
 register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 {

-- 
2.49.0

[PATCH v5 20/23] gpu: nova-core: compute layout of the FRTS region

2025-06-12 Thread Alexandre Courbot

FWSEC-FRTS is run with the desired address of the FRTS region as
parameter, which we need to compute depending on some hardware
parameters.

Do this in a `FbLayout` structure that will later be extended to
describe more memory regions used to boot the GSP.
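
The arithmetic is easier to follow with concrete numbers, so here is a standalone sketch of the same computation. It is an approximation for illustration: `FbLayoutSketch` and `compute_layout` are made-up names, the framebuffer size and VGA workspace address are passed in directly instead of being read from the hardware, and `None` for the workspace address covers both "no display" and "no valid address".

```rust
use std::ops::Range;

const SZ_128K: u64 = 128 * 1024;
const SZ_1M: u64 = 1024 * 1024;

/// Simplified stand-in for the layout structure (hypothetical name).
#[derive(Debug, PartialEq, Eq)]
pub struct FbLayoutSketch {
    pub fb: Range<u64>,
    pub vga_workspace: Range<u64>,
    pub frts: Range<u64>,
}

/// Computes the layout from the framebuffer size and an optional VGA
/// workspace address. Returns `None` if the framebuffer is too small.
pub fn compute_layout(fb_size: u64, vga_workspace_addr: Option<u64>) -> Option<FbLayoutSketch> {
    let fb = 0..fb_size;

    // Default workspace base: 1 MiB below the end of the framebuffer.
    let base = fb.end.checked_sub(SZ_1M)?;
    let vga_base = match vga_workspace_addr {
        // Address below the default base: use the last 128 KiB instead.
        Some(addr) if addr < base => fb.end - SZ_128K,
        Some(addr) => addr,
        None => base,
    };
    let vga_workspace = vga_base..fb.end;

    // FRTS: 1 MiB ending at the workspace start aligned down to 128 KiB.
    let frts_end = vga_workspace.start & !(SZ_128K - 1);
    let frts_base = frts_end.checked_sub(SZ_1M)?;
    let frts = frts_base..frts_base + SZ_1M;

    Some(FbLayoutSketch {
        fb,
        vga_workspace,
        frts,
    })
}
```

With a 4 GiB framebuffer and no workspace address, this places the workspace at 0xfff00000 and the FRTS region in the 1 MiB just below it.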

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/fb.rs   | 70 
 drivers/gpu/nova-core/fb/hal.rs   | 12 +-
 drivers/gpu/nova-core/fb/hal/ga100.rs | 12 ++
 drivers/gpu/nova-core/fb/hal/ga102.rs | 36 +
 drivers/gpu/nova-core/fb/hal/tu102.rs | 16 
 drivers/gpu/nova-core/gpu.rs  |  4 ++
 drivers/gpu/nova-core/regs.rs | 76 +++
 7 files changed, 224 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 
308cd76edfee5a2e8a4cd979c20da2ce51cb16a5..39c7a7c506dd83776eb2b23f0bfb5c57a4d3f84f
 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -1,12 +1,17 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::ops::Range;
+
+use kernel::num::PowerOfTwo;
 use kernel::prelude::*;
+use kernel::sizes::*;
 use kernel::types::ARef;
 use kernel::{dev_warn, device};
 
 use crate::dma::DmaObject;
 use crate::driver::Bar0;
 use crate::gpu::Chipset;
+use crate::regs;
 
 mod hal;
 
@@ -64,3 +69,68 @@ pub(crate) fn unregister(self, bar: &Bar0) {
 }
 }
 }
+
+/// Layout of the GPU framebuffer memory.
+///
+/// Contains ranges of GPU memory reserved for a given purpose during the GSP 
bootup process.
+#[derive(Debug)]
+#[expect(dead_code)]
+pub(crate) struct FbLayout {
+pub fb: Range,
+pub vga_workspace: Range,
+pub frts: Range,
+}
+
+impl FbLayout {
+/// Computes the FB layout.
+pub(crate) fn new(chipset: Chipset, bar: &Bar0) -> Result {
+let hal = hal::fb_hal(chipset);
+
+let fb = {
+let fb_size = hal.vidmem_size(bar);
+
+0..fb_size
+};
+
+let vga_workspace = {
+let vga_base = {
+const NV_PRAMIN_SIZE: u64 = SZ_1M as u64;
+let base = fb.end - NV_PRAMIN_SIZE;
+
+if hal.supports_display(bar) {
+match 
regs::NV_PDISP_VGA_WORKSPACE_BASE::read(bar).vga_workspace_addr() {
+Some(addr) => {
+if addr < base {
+const VBIOS_WORKSPACE_SIZE: u64 = SZ_128K as 
u64;
+
+// Point workspace address to end of 
framebuffer.
+fb.end - VBIOS_WORKSPACE_SIZE
+} else {
+addr
+}
+}
+None => base,
+}
+} else {
+base
+}
+};
+
+vga_base..fb.end
+};
+
+let frts = {
+const FRTS_DOWN_ALIGN: PowerOfTwo<u64> = 
PowerOfTwo::<u64>::new(SZ_128K as u64);
+const FRTS_SIZE: u64 = SZ_1M as u64;
+let frts_base = FRTS_DOWN_ALIGN.align_down(vga_workspace.start) - 
FRTS_SIZE;
+
+frts_base..frts_base + FRTS_SIZE
+};
+
+Ok(Self {
+fb,
+vga_workspace,
+frts,
+})
+}
+}
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index 
23eab57eec9f524e066d3324eb7f5f2bf78481d2..2f914948bb9a9842fd00a4c6381420b74de81c3f
 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -6,6 +6,7 @@
 use crate::gpu::Chipset;
 
 mod ga100;
+mod ga102;
 mod tu102;
 
 pub(crate) trait FbHal {
@@ -16,6 +17,12 @@ pub(crate) trait FbHal {
 ///
 /// This might fail if the address is too large for the receiving register.
 fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result;
+
+/// Returns `true` if display is supported.
+fn supports_display(&self, bar: &Bar0) -> bool;
+
+/// Returns the VRAM size, in bytes.
+fn vidmem_size(&self, bar: &Bar0) -> u64;
 }
 
 /// Returns the HAL corresponding to `chipset`.
@@ -24,8 +31,9 @@ pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
 
 match chipset {
 TU102 | TU104 | TU106 | TU117 | TU116 => tu102::TU102_HAL,
-GA100 | GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 
| AD106 | AD107 => {
-ga100::GA100_HAL
+GA100 => ga100::GA100_HAL,
+GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 
| AD107 => {
+ga102::GA102_HAL
 }
 }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/ga100.rs 
b/drivers/gpu/nova-core/fb/hal/ga100.rs
index 
7c10436c1c590d9b767c399b69370697fdf8d239..4827721c9860649601b274c3986470096e1fe9bc
 100644
--- a/drivers/gpu/nova-core/fb/hal/ga100.rs
+++ b/drivers/gpu/nova-core/fb/hal/ga100.rs
@@ -25,6 +25,10 @@ pub(sup

[PATCH v5 19/23] gpu: nova-core: vbios: Add support for FWSEC ucode extraction

2025-06-12 Thread Alexandre Courbot

From: Joel Fernandes 

Using the support for navigating the VBIOS, add support to extract vBIOS
ucode data required for GSP to boot. The main data extracted from the
vBIOS is the FWSEC-FRTS firmware which runs on the GSP processor. This
firmware runs in high secure mode, and sets up the WPR2 (Write protected
region) before the Booter runs on the SEC2 processor.

Tested on my Ampere GA102 and boot is successful.

[applied changes by Alex Courbot for fwsec signatures]
[acour...@nvidia.com: remove now-unneeded Devres acquisition]

Cc: Alexandre Courbot 
Cc: John Hubbard 
Cc: Shirish Baskaran 
Cc: Alistair Popple 
Cc: Timur Tabi 
Cc: Ben Skeggs 
Signed-off-by: Alexandre Courbot 
Signed-off-by: Joel Fernandes 
---
 drivers/gpu/nova-core/firmware.rs |   2 -
 drivers/gpu/nova-core/vbios.rs| 307 --
 2 files changed, 298 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs 
b/drivers/gpu/nova-core/firmware.rs
index 
41f43a729ad3bf2c4acb6108f41e0905a6fac0df..e5583925cb3b4353b521c68175f8cf0c2d6ce830
 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -44,7 +44,6 @@ pub(crate) fn new(dev: &device::Device, chipset: Chipset, 
ver: &str) -> Result usize {
 const HDR_SIZE_SHIFT: u32 = 16;
 const HDR_SIZE_MASK: u32 = 0xffff0000;
diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
index 
312caf82d14588e21e0fa2bae0f8954d0efe3479..032ee510646af21f26f3f46c2d54a0f812c25978
 100644
--- a/drivers/gpu/nova-core/vbios.rs
+++ b/drivers/gpu/nova-core/vbios.rs
@@ -6,7 +6,9 @@
 #![expect(dead_code)]
 
 use crate::driver::Bar0;
+use crate::firmware::FalconUCodeDescV3;
 use core::convert::TryFrom;
+use kernel::device;
 use kernel::error::Result;
 use kernel::num::PowerOfTwo;
 use kernel::pci;
@@ -192,8 +194,8 @@ impl Vbios {
 pub(crate) fn new(pdev: &pci::Device, bar0: &Bar0) -> Result {
 // Images to extract from iteration
 let mut pci_at_image: Option = None;
-let mut first_fwsec_image: Option = None;
-let mut second_fwsec_image: Option = None;
+let mut first_fwsec_image: Option = None;
+let mut second_fwsec_image: Option = None;
 
 // Parse all VBIOS images in the ROM
 for image_result in VbiosIterator::new(pdev, bar0)? {
@@ -227,12 +229,14 @@ pub(crate) fn new(pdev: &pci::Device, bar0: &Bar0) -> 
Result {
 }
 
 // Using all the images, setup the falcon data pointer in Fwsec.
-// These are temporarily unused images and will be used in later 
patches.
-if let (Some(second), Some(_first), Some(_pci_at)) =
+if let (Some(mut second), Some(first), Some(pci_at)) =
 (second_fwsec_image, first_fwsec_image, pci_at_image)
 {
+second
+.setup_falcon_data(pdev, &pci_at, &first)
+.inspect_err(|e| dev_err!(pdev.as_ref(), "Falcon data setup 
failed: {:?}\n", e))?;
 Ok(Vbios {
-fwsec_image: second,
+fwsec_image: second.build(pdev)?,
 })
 } else {
 dev_err!(
@@ -242,6 +246,10 @@ pub(crate) fn new(pdev: &pci::Device, bar0: &Bar0) -> 
Result {
 Err(EINVAL)
 }
 }
+
+pub(crate) fn fwsec_image(&self) -> &FwSecBiosImage {
+&self.fwsec_image
+}
 }
 
 /// PCI Data Structure as defined in PCI Firmware Specification
@@ -675,7 +683,7 @@ fn new(pdev: &pci::Device, data: &[u8]) -> Result {
 PciAt: PciAtBiosImage,   // PCI-AT compatible BIOS image
 Efi: EfiBiosImage,   // EFI (Extensible Firmware Interface)
 Nbsi: NbsiBiosImage, // NBSI (Nvidia Bios System Interface)
-FwSec: FwSecBiosImage,   // FWSEC (Firmware Security)
+FwSec: FwSecBiosBuilder, // FWSEC (Firmware Security)
 }
 
 struct PciAtBiosImage {
@@ -694,9 +702,24 @@ struct NbsiBiosImage {
 // NBSI-specific fields can be added here in the future.
 }
 
-struct FwSecBiosImage {
+struct FwSecBiosBuilder {
 base: BiosImageBase,
-// FWSEC-specific fields can be added here in the future.
+/// These are temporary fields that are used during the construction of
+/// the FwSecBiosBuilder. Once FwSecBiosBuilder is constructed, the
+/// falcon_ucode_offset will be copied into a new FwSecBiosImage.
+///
+/// The offset of the Falcon data from the start of Fwsec image
+falcon_data_offset: Option,
+/// The PmuLookupTable starts at the offset of the falcon data pointer
+pmu_lookup_table: Option,
+/// The offset of the Falcon ucode
+falcon_ucode_offset: Option,
+}
+
+pub(crate) struct FwSecBiosImage {
+base: BiosImageBase,
+/// The offset of the Falcon ucode
+falcon_ucode_offset: usize,
 }
 
 // Convert from BiosImageBase to BiosImage
@@ -708,7 +731,12 @@ fn try_from(base: BiosImageBase) -> Result {
 0x00 => Ok(BiosImage::PciAt(base.try_into()?)),
 0x03 => Ok(BiosImage::Efi(EfiBiosIma

[PATCH v5 13/23] gpu: nova-core: add DMA object struct

2025-06-12 Thread Alexandre Courbot

Since we will need to allocate lots of distinct memory chunks to be
shared between GPU and CPU, introduce a type dedicated to that. It is a
light wrapper around CoherentAllocation.
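
One detail worth spelling out is that the requested length gets rounded up to a whole number of pages before allocating. A userspace sketch of that rounding, using the standard library's `Layout`, could look like this. The 4 KiB `PAGE_SIZE` and the `page_aligned_len` name are assumptions made for the example.

```rust
use std::alloc::Layout;

/// Page size assumed for this example only.
const PAGE_SIZE: usize = 4096;

/// Rounds `len` up to a multiple of `PAGE_SIZE`, or returns `None` if the
/// padded size cannot be represented by a valid `Layout`.
pub fn page_aligned_len(len: usize) -> Option<usize> {
    Layout::from_size_align(len, PAGE_SIZE)
        .ok()
        .map(|layout| layout.pad_to_align().size())
}
```

Going through `Layout` rather than open-coding the mask means the overflow case is reported as an error instead of silently wrapping.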

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/dma.rs   | 61 ++
 drivers/gpu/nova-core/nova_core.rs |  1 +
 2 files changed, 62 insertions(+)

diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
new file mode 100644
index 
..4b063aaef65ec4e2f476fc5ce9dc25341b6660ca
--- /dev/null
+++ b/drivers/gpu/nova-core/dma.rs
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Simple DMA object wrapper.
+
+// To be removed when all code is used.
+#![expect(dead_code)]
+
+use core::ops::{Deref, DerefMut};
+
+use kernel::device;
+use kernel::dma::CoherentAllocation;
+use kernel::page::PAGE_SIZE;
+use kernel::prelude::*;
+
+pub(crate) struct DmaObject {
+dma: CoherentAllocation,
+}
+
+impl DmaObject {
+pub(crate) fn new(dev: &device::Device, len: usize) -> 
Result {
+let len = core::alloc::Layout::from_size_align(len, PAGE_SIZE)
+.map_err(|_| EINVAL)?
+.pad_to_align()
+.size();
+let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | 
__GFP_ZERO)?;
+
+Ok(Self { dma })
+}
+
+pub(crate) fn from_data(dev: &device::Device, data: &[u8]) 
-> Result {
+Self::new(dev, data.len()).map(|mut dma_obj| {
+// TODO: replace with `CoherentAllocation::write()` once available.
+// SAFETY:
+// - `dma_obj`'s size is at least `data.len()`.
+// - We have just created this object and there is no other user 
at this stage.
+unsafe {
+core::ptr::copy_nonoverlapping(
+data.as_ptr(),
+dma_obj.dma.start_ptr_mut(),
+data.len(),
+);
+}
+
+dma_obj
+})
+}
+}
+
+impl Deref for DmaObject {
+type Target = CoherentAllocation;
+
+fn deref(&self) -> &Self::Target {
+&self.dma
+}
+}
+
+impl DerefMut for DmaObject {
+fn deref_mut(&mut self) -> &mut Self::Target {
+&mut self.dma
+}
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs 
b/drivers/gpu/nova-core/nova_core.rs
index 
c3fde3e132ea65851137ab47fcb7b3637577..121fe5c11044a192212d0a64353b7acad58c796a
 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -2,6 +2,7 @@
 
 //! Nova Core GPU Driver
 
+mod dma;
 mod driver;
 mod firmware;
 mod gfw;

-- 
2.49.0

[PATCH v5 16/23] gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS

2025-06-12 Thread Alexandre Courbot

FWSEC-FRTS is the first firmware we need to run on the GSP falcon in
order to initiate the GSP boot process. Introduce the structure that
describes it.
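
As a small illustration of how a packed header word of this kind is typically decoded, here is a standalone sketch of a size accessor. The `UcodeDescHeader` type is hypothetical, and the layout (size in the upper 16 bits of `hdr`) is an assumption based on the 16-bit shift used in the patch.

```rust
/// Hypothetical, reduced descriptor holding only the packed header word.
pub struct UcodeDescHeader {
    pub hdr: u32,
}

impl UcodeDescHeader {
    /// Returns the size stored in the upper 16 bits of `hdr` (assumed layout).
    pub const fn size(&self) -> usize {
        const HDR_SIZE_SHIFT: u32 = 16;
        const HDR_SIZE_MASK: u32 = 0xffff_0000;

        ((self.hdr & HDR_SIZE_MASK) >> HDR_SIZE_SHIFT) as usize
    }
}
```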

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/firmware.rs | 45 +++
 1 file changed, 45 insertions(+)

diff --git a/drivers/gpu/nova-core/firmware.rs 
b/drivers/gpu/nova-core/firmware.rs
index 
4b8a38358a4f6da2a4d57f8db50ea9e788c3e4b5..2f4f5c7c7902a386a44bc9cf5eb6d46375fe0e5a
 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -41,6 +41,51 @@ pub(crate) fn new(dev: &device::Device, chipset: Chipset, 
ver: &str) -> Result usize {
+const HDR_SIZE_SHIFT: u32 = 16;
+const HDR_SIZE_MASK: u32 = 0xffff0000;
+
+((self.hdr & HDR_SIZE_MASK) >> HDR_SIZE_SHIFT) as usize
+}
+}
+
 pub(crate) struct ModInfoBuilder(firmware::ModInfoBuilder);
 
 impl ModInfoBuilder {

-- 
2.49.0

[PATCH v5 21/23] gpu: nova-core: add types for patching firmware binaries

2025-06-12 Thread Alexandre Courbot

Some of the firmwares need to be patched at load-time with a signature.
Add a couple of types and traits that sub-modules can use to implement
this behavior, while ensuring that the correct kind of signature is
applied to the firmware.
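
The underlying technique is the usual typestate pattern. A minimal userspace sketch, with a `Vec<u8>` standing in for the DMA object and a single state parameter, might look like the following. `Firmware`, `OutOfBounds` and `loadable_bytes` are names invented for this example and do not come from the patch.

```rust
use std::marker::PhantomData;

/// Marker trait for the signed state of a firmware.
pub trait SignedState {}

/// The firmware still needs a signature (or an explicit waiver).
pub struct Unsigned;
impl SignedState for Unsigned {}

/// The firmware is ready to be loaded.
pub struct Signed;
impl SignedState for Signed {}

/// Firmware bytes tagged with their signed state at the type level.
pub struct Firmware<S: SignedState> {
    data: Vec<u8>,
    _state: PhantomData<S>,
}

/// Error returned when the signature does not fit in the image.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfBounds;

impl Firmware<Unsigned> {
    pub fn new(data: Vec<u8>) -> Self {
        Firmware { data, _state: PhantomData }
    }

    /// Copies `signature` into the image at `offset` and marks it signed.
    pub fn patch_signature(mut self, signature: &[u8], offset: usize) -> Result<Firmware<Signed>, OutOfBounds> {
        let end = offset.checked_add(signature.len()).ok_or(OutOfBounds)?;
        let dst = self.data.get_mut(offset..end).ok_or(OutOfBounds)?;
        dst.copy_from_slice(signature);

        Ok(Firmware { data: self.data, _state: PhantomData })
    }

    /// Marks the image as signed without patching it.
    pub fn no_patch_signature(self) -> Firmware<Signed> {
        Firmware { data: self.data, _state: PhantomData }
    }
}

impl Firmware<Signed> {
    /// Only a signed firmware exposes the bytes meant for loading.
    pub fn loadable_bytes(&self) -> &[u8] {
        &self.data
    }
}
```

Because `loadable_bytes` only exists on `Firmware<Signed>`, forgetting the signing step shows up as a compile error rather than as a runtime failure.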

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/firmware.rs | 64 +++
 1 file changed, 64 insertions(+)

diff --git a/drivers/gpu/nova-core/firmware.rs 
b/drivers/gpu/nova-core/firmware.rs
index 
e5583925cb3b4353b521c68175f8cf0c2d6ce830..32553b5142d6623bdaaa9d480fbff11069198606
 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -3,11 +3,15 @@
 //! Contains structures and functions dedicated to the parsing, building and 
patching of firmwares
 //! to be loaded into a given execution unit.
 
+use core::marker::PhantomData;
+
 use kernel::device;
 use kernel::firmware;
 use kernel::prelude::*;
 use kernel::str::CString;
 
+use crate::dma::DmaObject;
+use crate::falcon::FalconFirmware;
 use crate::gpu;
 use crate::gpu::Chipset;
 
@@ -84,6 +88,66 @@ pub(crate) fn size(&self) -> usize {
 }
 }
 
+/// Trait implemented by types defining the signed state of a firmware.
+trait SignedState {}
+
+/// Type indicating that the firmware must be signed before it can be used.
+struct Unsigned;
+impl SignedState for Unsigned {}
+
+/// Type indicating that the firmware is signed and ready to be loaded.
+struct Signed;
+impl SignedState for Signed {}
+
+/// A [`DmaObject`] containing a specific microcode ready to be loaded into a 
falcon.
+///
+/// This is module-local and meant for sub-modules to use internally.
+///
+/// After construction, a firmware is [`Unsigned`], and must generally be 
patched with a signature
+/// before it can be loaded (with an exception for development hardware). The
+/// [`Self::patch_signature`] and [`Self::no_patch_signature`] methods are 
used to transition the
+/// firmware to its [`Signed`] state.
+struct FirmwareDmaObject(DmaObject, 
PhantomData<(F, S)>);
+
+/// Trait for signatures to be patched directly into a given firmware.
+///
+/// This is module-local and meant for sub-modules to use internally.
+trait FirmwareSignature: AsRef<[u8]> {}
+
+#[expect(unused)]
+impl FirmwareDmaObject {
+/// Patches the firmware at offset `sig_base_img` with `signature`.
+fn patch_signature>(
+mut self,
+signature: &S,
+sig_base_img: usize,
+) -> Result> {
+let signature_bytes = signature.as_ref();
+if sig_base_img + signature_bytes.len() > self.0.size() {
+return Err(EINVAL);
+}
+
+// SAFETY: we are the only user of this object, so there cannot be any 
race.
+let dst = unsafe { self.0.start_ptr_mut().add(sig_base_img) };
+
+// SAFETY: `signature` and `dst` are valid, properly aligned, and do 
not overlap.
+unsafe {
+core::ptr::copy_nonoverlapping(signature_bytes.as_ptr(), dst, 
signature_bytes.len())
+};
+
+Ok(FirmwareDmaObject(self.0, PhantomData))
+}
+
+/// Mark the firmware as signed without patching it.
+///
+/// This method is used to explicitly confirm that we do not need to sign 
the firmware, while
+/// allowing us to continue as if it was. This is typically only needed 
for development
+/// hardware.
+fn no_patch_signature(self) -> FirmwareDmaObject {
+FirmwareDmaObject(self.0, PhantomData)
+}
+}
+
 pub(crate) struct ModInfoBuilder(firmware::ModInfoBuilder);
 
 impl ModInfoBuilder {

-- 
2.49.0

[PATCH v2 3/3] drm/format-helper: Move drm_fb_build_fourcc_list() to sysfb helpers

2025-06-12 Thread Thomas Zimmermann

Only sysfb drivers use drm_fb_build_fourcc_list(). Move the function
to sysfb helpers and rename it accordingly. Update drivers and tests.

v2:
- select DRM_SYSFB_HELPER (kernel test robot)

Signed-off-by: Thomas Zimmermann 
Acked-by: Maxime Ripard 
Acked-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/Kconfig.debug |   1 +
 drivers/gpu/drm/drm_format_helper.c   | 138 --
 drivers/gpu/drm/sysfb/drm_sysfb_helper.h  |   4 +
 drivers/gpu/drm/sysfb/drm_sysfb_modeset.c | 138 ++
 drivers/gpu/drm/sysfb/efidrm.c|   4 +-
 drivers/gpu/drm/sysfb/ofdrm.c |   5 +-
 drivers/gpu/drm/sysfb/simpledrm.c |   5 +-
 drivers/gpu/drm/sysfb/vesadrm.c   |   4 +-
 .../gpu/drm/tests/drm_sysfb_modeset_test.c|   9 +-
 include/drm/drm_format_helper.h   |   4 -
 10 files changed, 156 insertions(+), 156 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig.debug b/drivers/gpu/drm/Kconfig.debug
index fa6ee76f4d3c..05dc43c0b8c5 100644
--- a/drivers/gpu/drm/Kconfig.debug
+++ b/drivers/gpu/drm/Kconfig.debug
@@ -70,6 +70,7 @@ config DRM_KUNIT_TEST
select DRM_GEM_SHMEM_HELPER
select DRM_KUNIT_TEST_HELPERS
select DRM_LIB_RANDOM
+   select DRM_SYSFB_HELPER
select PRIME_NUMBERS
default KUNIT_ALL_TESTS
help
diff --git a/drivers/gpu/drm/drm_format_helper.c 
b/drivers/gpu/drm/drm_format_helper.c
index 73b5a80771cc..da79100895ff 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -1338,141 +1338,3 @@ void drm_fb_xrgb8888_to_mono(struct iosys_map *dst, 
const unsigned int *dst_pitc
}
 }
 EXPORT_SYMBOL(drm_fb_xrgb8888_to_mono);
-
-static uint32_t drm_fb_nonalpha_fourcc(uint32_t fourcc)
-{
-   /* only handle formats with depth != 0 and alpha channel */
-   switch (fourcc) {
-   case DRM_FORMAT_ARGB1555:
-   return DRM_FORMAT_XRGB1555;
-   case DRM_FORMAT_ABGR1555:
-   return DRM_FORMAT_XBGR1555;
-   case DRM_FORMAT_RGBA5551:
-   return DRM_FORMAT_RGBX5551;
-   case DRM_FORMAT_BGRA5551:
-   return DRM_FORMAT_BGRX5551;
-   case DRM_FORMAT_ARGB8888:
-   return DRM_FORMAT_XRGB8888;
-   case DRM_FORMAT_ABGR8888:
-   return DRM_FORMAT_XBGR8888;
-   case DRM_FORMAT_RGBA8888:
-   return DRM_FORMAT_RGBX8888;
-   case DRM_FORMAT_BGRA8888:
-   return DRM_FORMAT_BGRX8888;
-   case DRM_FORMAT_ARGB2101010:
-   return DRM_FORMAT_XRGB2101010;
-   case DRM_FORMAT_ABGR2101010:
-   return DRM_FORMAT_XBGR2101010;
-   case DRM_FORMAT_RGBA1010102:
-   return DRM_FORMAT_RGBX1010102;
-   case DRM_FORMAT_BGRA1010102:
-   return DRM_FORMAT_BGRX1010102;
-   }
-
-   return fourcc;
-}
-
-static bool is_listed_fourcc(const uint32_t *fourccs, size_t nfourccs, 
uint32_t fourcc)
-{
-   const uint32_t *fourccs_end = fourccs + nfourccs;
-
-   while (fourccs < fourccs_end) {
-   if (*fourccs == fourcc)
-   return true;
-   ++fourccs;
-   }
-   return false;
-}
-
-/**
- * drm_fb_build_fourcc_list - Filters a list of supported color formats against
- *the device's native formats
- * @dev: DRM device
- * @native_fourccs: 4CC codes of natively supported color formats
- * @native_nfourccs: The number of entries in @native_fourccs
- * @fourccs_out: Returns 4CC codes of supported color formats
- * @nfourccs_out: The number of available entries in @fourccs_out
- *
- * This function creates a list of supported color formats from natively
- * supported formats and additional emulated formats.
- * At a minimum, most userspace programs expect at least support for
- * XRGB8888 on the primary plane. Devices that have to emulate the
- * format, and possibly others, can use drm_fb_build_fourcc_list() to
- * create a list of supported color formats. The returned list can
- * be handed over to drm_universal_plane_init() et al. Native formats
- * will go before emulated formats. Native formats with alpha channel
- * will be replaced by such without, as primary planes usually don't
- * support alpha. Other heuristics might be applied
- * to optimize the order. Formats near the beginning of the list are
- * usually preferred over formats near the end of the list.
- *
- * Returns:
- * The number of color-formats 4CC codes returned in @fourccs_out.
- */
-size_t drm_fb_build_fourcc_list(struct drm_device *dev,
-   const u32 *native_fourccs, size_t 
native_nfourccs,
-   u32 *fourccs_out, size_t nfourccs_out)
-{
-   /*
-* XRGB8888 is the default fallback format for most of userspace
-* and it's currently the only format that should be emulated for
-* the primary plane. Only if there's ever another default fallb

[PATCH v2 1/3] drm/tests: Do not use drm_fb_blit() in format-helper tests

2025-06-12 Thread Thomas Zimmermann

Export additional helpers from the format-helper library and open-code
drm_fb_blit() in tests. Prepares for the removal of drm_fb_blit(). Only
sysfb drivers use drm_fb_blit(). The function will soon be removed from
format helpers and be refactored within sysfb helpers.

Signed-off-by: Thomas Zimmermann 
Acked-by: Maxime Ripard 
---
 drivers/gpu/drm/drm_format_helper.c   | 108 --
 drivers/gpu/drm/drm_format_internal.h |   8 ++
 .../gpu/drm/tests/drm_format_helper_test.c| 108 +++---
 include/drm/drm_format_helper.h   |   9 ++
 4 files changed, 131 insertions(+), 102 deletions(-)

diff --git a/drivers/gpu/drm/drm_format_helper.c 
b/drivers/gpu/drm/drm_format_helper.c
index d36e6cacc575..73b5a80771cc 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -857,11 +857,33 @@ static void drm_fb_xrgb8888_to_abgr8888_line(void *dbuf, 
const void *sbuf, unsig
drm_fb_xfrm_line_32to32(dbuf, sbuf, pixels, 
drm_pixel_xrgb8888_to_abgr8888);
 }
 
-static void drm_fb_xrgb8888_to_abgr8888(struct iosys_map *dst, const unsigned 
int *dst_pitch,
-   const struct iosys_map *src,
-   const struct drm_framebuffer *fb,
-   const struct drm_rect *clip,
-   struct drm_format_conv_state *state)
+/**
+ * drm_fb_xrgb_to_abgr - Convert XRGB to ABGR clip buffer
+ * @dst: Array of ABGR destination buffers
+ * @dst_pitch: Array of numbers of bytes between the start of two consecutive 
scanlines
+ * within @dst; can be NULL if scanlines are stored next to each 
other.
+ * @src: Array of XRGB source buffer
+ * @fb: DRM framebuffer
+ * @clip: Clip rectangle area to copy
+ * @state: Transform and conversion state
+ *
+ * This function copies parts of a framebuffer to display memory and converts 
the
+ * color format during the process. The parameters @dst, @dst_pitch and @src 
refer
+ * to arrays. Each array must have at least as many entries as there are 
planes in
+ * @fb's format. Each entry stores the value for the format's respective color 
plane
+ * at the same index.
+ *
+ * This function does not apply clipping on @dst (i.e. the destination is at 
the
+ * top-left corner).
+ *
+ * Drivers can use this function for ABGR devices that don't support 
XRGB
+ * natively. It sets an opaque alpha channel as part of the conversion.
+ */
+void drm_fb_xrgb_to_abgr(struct iosys_map *dst, const unsigned int 
*dst_pitch,
+const struct iosys_map *src,
+const struct drm_framebuffer *fb,
+const struct drm_rect *clip,
+struct drm_format_conv_state *state)
 {
static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = {
4,
@@ -870,17 +892,40 @@ static void drm_fb_xrgb_to_abgr(struct iosys_map 
*dst, const unsigned in
drm_fb_xfrm(dst, dst_pitch, dst_pixsize, src, fb, clip, false, state,
drm_fb_xrgb_to_abgr_line);
 }
+EXPORT_SYMBOL(drm_fb_xrgb_to_abgr);
 
 static void drm_fb_xrgb_to_xbgr_line(void *dbuf, const void *sbuf, 
unsigned int pixels)
 {
drm_fb_xfrm_line_32to32(dbuf, sbuf, pixels, 
drm_pixel_xrgb_to_xbgr);
 }
 
-static void drm_fb_xrgb_to_xbgr(struct iosys_map *dst, const unsigned 
int *dst_pitch,
-   const struct iosys_map *src,
-   const struct drm_framebuffer *fb,
-   const struct drm_rect *clip,
-   struct drm_format_conv_state *state)
+/**
+ * drm_fb_xrgb_to_xbgr - Convert XRGB to XBGR clip buffer
+ * @dst: Array of XBGR destination buffers
+ * @dst_pitch: Array of numbers of bytes between the start of two consecutive 
scanlines
+ * within @dst; can be NULL if scanlines are stored next to each 
other.
+ * @src: Array of XRGB source buffer
+ * @fb: DRM framebuffer
+ * @clip: Clip rectangle area to copy
+ * @state: Transform and conversion state
+ *
+ * This function copies parts of a framebuffer to display memory and converts 
the
+ * color format during the process. The parameters @dst, @dst_pitch and @src 
refer
+ * to arrays. Each array must have at least as many entries as there are 
planes in
+ * @fb's format. Each entry stores the value for the format's respective color 
plane
+ * at the same index.
+ *
+ * This function does not apply clipping on @dst (i.e. the destination is at 
the
+ * top-left corner).
+ *
+ * Drivers can use this function for XBGR devices that don't support 
XRGB
+ * natively.
+ */
+void drm_fb_xrgb_to_xbgr(struct iosys_map *dst, const unsigned int 
*dst_pitch,
+const st

[PATCH v2 2/3] drm/tests: Test drm_fb_build_fourcc_list() in separate test suite

2025-06-12 Thread Thomas Zimmermann

Only sysfb drivers use drm_fb_build_fourcc_list(). The helper will
be moved from format helpers to sysfb helpers. Move the related
tests to their own test suite.

v2:
- rename filename to match tested code (Maxime)

Signed-off-by: Thomas Zimmermann 
Acked-by: Maxime Ripard 
---
 drivers/gpu/drm/tests/Makefile|   3 +-
 .../gpu/drm/tests/drm_format_helper_test.c| 142 ---
 .../gpu/drm/tests/drm_sysfb_modeset_test.c| 166 ++
 3 files changed, 168 insertions(+), 143 deletions(-)
 create mode 100644 drivers/gpu/drm/tests/drm_sysfb_modeset_test.c

diff --git a/drivers/gpu/drm/tests/Makefile b/drivers/gpu/drm/tests/Makefile
index 3afd6587df08..c0e952293ad0 100644
--- a/drivers/gpu/drm/tests/Makefile
+++ b/drivers/gpu/drm/tests/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_DRM_KUNIT_TEST) += \
drm_modes_test.o \
drm_plane_helper_test.o \
drm_probe_helper_test.o \
-   drm_rect_test.o
+   drm_rect_test.o \
+   drm_sysfb_modeset_test.o
 
 CFLAGS_drm_mm_test.o := $(DISABLE_STRUCTLEAK_PLUGIN)
diff --git a/drivers/gpu/drm/tests/drm_format_helper_test.c 
b/drivers/gpu/drm/tests/drm_format_helper_test.c
index 8aacc1ffa93a..ef1cc3b28f1b 100644
--- a/drivers/gpu/drm/tests/drm_format_helper_test.c
+++ b/drivers/gpu/drm/tests/drm_format_helper_test.c
@@ -1335,147 +1335,6 @@ static void drm_test_fb_clip_offset(struct kunit *test)
KUNIT_EXPECT_EQ(test, offset, params->expected_offset);
 }
 
-struct fb_build_fourcc_list_case {
-   const char *name;
-   u32 native_fourccs[TEST_BUF_SIZE];
-   size_t native_fourccs_size;
-   u32 expected[TEST_BUF_SIZE];
-   size_t expected_fourccs_size;
-};
-
-static struct fb_build_fourcc_list_case fb_build_fourcc_list_cases[] = {
-   {
-   .name = "no native formats",
-   .native_fourccs = { },
-   .native_fourccs_size = 0,
-   .expected = { DRM_FORMAT_XRGB },
-   .expected_fourccs_size = 1,
-   },
-   {
-   .name = "XRGB as native format",
-   .native_fourccs = { DRM_FORMAT_XRGB },
-   .native_fourccs_size = 1,
-   .expected = { DRM_FORMAT_XRGB },
-   .expected_fourccs_size = 1,
-   },
-   {
-   .name = "remove duplicates",
-   .native_fourccs = {
-   DRM_FORMAT_XRGB,
-   DRM_FORMAT_XRGB,
-   DRM_FORMAT_RGB888,
-   DRM_FORMAT_RGB888,
-   DRM_FORMAT_RGB888,
-   DRM_FORMAT_XRGB,
-   DRM_FORMAT_RGB888,
-   DRM_FORMAT_RGB565,
-   DRM_FORMAT_RGB888,
-   DRM_FORMAT_XRGB,
-   DRM_FORMAT_RGB565,
-   DRM_FORMAT_RGB565,
-   DRM_FORMAT_XRGB,
-   },
-   .native_fourccs_size = 11,
-   .expected = {
-   DRM_FORMAT_XRGB,
-   DRM_FORMAT_RGB888,
-   DRM_FORMAT_RGB565,
-   },
-   .expected_fourccs_size = 3,
-   },
-   {
-   .name = "convert alpha formats",
-   .native_fourccs = {
-   DRM_FORMAT_ARGB1555,
-   DRM_FORMAT_ABGR1555,
-   DRM_FORMAT_RGBA5551,
-   DRM_FORMAT_BGRA5551,
-   DRM_FORMAT_ARGB,
-   DRM_FORMAT_ABGR,
-   DRM_FORMAT_RGBA,
-   DRM_FORMAT_BGRA,
-   DRM_FORMAT_ARGB2101010,
-   DRM_FORMAT_ABGR2101010,
-   DRM_FORMAT_RGBA1010102,
-   DRM_FORMAT_BGRA1010102,
-   },
-   .native_fourccs_size = 12,
-   .expected = {
-   DRM_FORMAT_XRGB1555,
-   DRM_FORMAT_XBGR1555,
-   DRM_FORMAT_RGBX5551,
-   DRM_FORMAT_BGRX5551,
-   DRM_FORMAT_XRGB,
-   DRM_FORMAT_XBGR,
-   DRM_FORMAT_RGBX,
-   DRM_FORMAT_BGRX,
-   DRM_FORMAT_XRGB2101010,
-   DRM_FORMAT_XBGR2101010,
-   DRM_FORMAT_RGBX1010102,
-   DRM_FORMAT_BGRX1010102,
-   },
-   .expected_fourccs_size = 12,
-   },
-   {
-   .name = "random formats",
-   .native_fourccs = {
-   DRM_FORMAT_Y212,
-   DRM_FORMAT_ARGB1555,
-   DRM_FORMAT_ABGR16161616F,
-   DRM_FORMAT_C8,
-   DRM_FORMAT_BGR888,
-   DRM_FORMAT_XRGB1555,
-   DRM_FOR

[PATCH v2 0/3] drm/tests: Update format-helper tests for sysfb

2025-06-12 Thread Thomas Zimmermann

The helpers drm_fb_blit() and drm_fb_build_fourcc_list() will be
integrated into sysfb helpers. Update the DRM format-helper tests
accordingly in patches 1 and 2.

The change to drm_fb_build_fourcc_list() is simple enough that we
can apply it here in patch 3.

v2:
- fix test filename (Maxime)
- fix dependencies (kernel test robot)

Thomas Zimmermann (3):
  drm/tests: Do not use drm_fb_blit() in format-helper tests
  drm/tests: Test drm_fb_build_fourcc_list() in separate test suite
  drm/format-helper: Move drm_fb_build_fourcc_list() to sysfb helpers

 drivers/gpu/drm/Kconfig.debug |   1 +
 drivers/gpu/drm/drm_format_helper.c   | 246 +++--
 drivers/gpu/drm/drm_format_internal.h |   8 +
 drivers/gpu/drm/sysfb/drm_sysfb_helper.h  |   4 +
 drivers/gpu/drm/sysfb/drm_sysfb_modeset.c | 138 ++
 drivers/gpu/drm/sysfb/efidrm.c|   4 +-
 drivers/gpu/drm/sysfb/ofdrm.c |   5 +-
 drivers/gpu/drm/sysfb/simpledrm.c |   5 +-
 drivers/gpu/drm/sysfb/vesadrm.c   |   4 +-
 drivers/gpu/drm/tests/Makefile|   3 +-
 .../gpu/drm/tests/drm_format_helper_test.c| 250 ++
 .../gpu/drm/tests/drm_sysfb_modeset_test.c| 167 
 include/drm/drm_format_helper.h   |  13 +-
 13 files changed, 451 insertions(+), 397 deletions(-)
 create mode 100644 drivers/gpu/drm/tests/drm_sysfb_modeset_test.c

-- 
2.49.0

[PATCH v5 17/23] gpu: nova-core: vbios: Add base support for VBIOS construction and iteration

2025-06-12 Thread Alexandre Courbot

From: Joel Fernandes 

Add support for navigating the VBIOS images required for extracting
ucode data for GSP to boot. Later patches will build on this.

Debug log messages will show the BIOS images:

[102141.013287] NovaCore: Found BIOS image at offset 0x0, size: 0xfe00, type: PciAt
[102141.080692] NovaCore: Found BIOS image at offset 0xfe00, size: 0x14800, type: Efi
[102141.098443] NovaCore: Found BIOS image at offset 0x24600, size: 0x5600, type: FwSec
[102141.415095] NovaCore: Found BIOS image at offset 0x29c00, size: 0x60800, type: FwSec
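
For readers who want to see how such a list of images can be produced, here is a
minimal userspace sketch of walking a chain of ROM images. It assumes the generic
PCI expansion ROM layout (0x55 0xAA header signature, a 16-bit pointer to the
"PCIR" data structure at 0x18, image length in 512-byte units at PCIR+0x10, code
type at PCIR+0x14, indicator byte at PCIR+0x15). `ImageInfo` and `scan_images`
are hypothetical names, and this is not the reader implemented by the patch,
which also has to deal with reading through BAR0 and NVIDIA-specific images.

```rust
/// Bit 7 of the PCIR indicator byte marks the last image in the ROM.
const LAST_IMAGE_BIT_MASK: u8 = 0x80;

/// Summary of one image found in the ROM (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ImageInfo {
    pub offset: usize,
    pub size: usize,
    pub code_type: u8,
    pub last: bool,
}

/// Walks the image chain in `rom`, stopping at the last-image bit or at the
/// first header that does not look valid.
pub fn scan_images(rom: &[u8]) -> Vec<ImageInfo> {
    let mut images = Vec::new();
    let mut offset = 0usize;

    // The ROM header must at least cover the PCIR pointer at 0x18..0x1a.
    while let Some(hdr) = rom.get(offset..offset + 0x1a) {
        if hdr[0] != 0x55 || hdr[1] != 0xAA {
            break;
        }
        let pcir_off = offset + u16::from_le_bytes([hdr[0x18], hdr[0x19]]) as usize;
        let Some(pcir) = rom.get(pcir_off..pcir_off + 0x16) else {
            break;
        };
        if &pcir[0..4] != b"PCIR" {
            break;
        }
        // Image length is stored in units of 512 bytes.
        let size = u16::from_le_bytes([pcir[0x10], pcir[0x11]]) as usize * 512;
        if size == 0 {
            break;
        }
        let last = pcir[0x15] & LAST_IMAGE_BIT_MASK != 0;
        images.push(ImageInfo {
            offset,
            size,
            code_type: pcir[0x14],
            last,
        });
        if last {
            break;
        }
        offset += size;
    }
    images
}
```

The sizes in the log above are consistent with this unit: 0xfe00 and 0x14800 are
both multiples of 512.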

[applied feedback from Alex Courbot and Timur Tabi]
[applied changes related to code reorg, prints etc from Danilo Krummrich]
[acour...@nvidia.com: fix clippy warnings, read_more() function]

Cc: Alexandre Courbot 
Cc: John Hubbard 
Cc: Shirish Baskaran 
Cc: Alistair Popple 
Cc: Timur Tabi 
Cc: Ben Skeggs 
Signed-off-by: Alexandre Courbot 
Signed-off-by: Joel Fernandes 
---
 drivers/gpu/nova-core/firmware.rs  |   4 +-
 drivers/gpu/nova-core/gpu.rs   |   4 +
 drivers/gpu/nova-core/nova_core.rs |   1 +
 drivers/gpu/nova-core/vbios.rs | 681 +
 4 files changed, 688 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs 
b/drivers/gpu/nova-core/firmware.rs
index 
2f4f5c7c7902a386a44bc9cf5eb6d46375fe0e5a..41f43a729ad3bf2c4acb6108f41e0905a6fac0df
 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -44,6 +44,7 @@ pub(crate) fn new(dev: &device::Device, chipset: Chipset, 
ver: &str) -> Result usize {
 const HDR_SIZE_SHIFT: u32 = 16;
 const HDR_SIZE_MASK: u32 = 0x;
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 
c9f7f604a5de6ea4eb85f061cae826302c1902c3..1c577d3eff8b32bbc45d7d2302c3e2246bef3b44
 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -9,6 +9,7 @@
 use crate::gfw;
 use crate::regs;
 use crate::util;
+use crate::vbios::Vbios;
 use core::fmt;
 
 macro_rules! define_chipset {
@@ -218,6 +219,9 @@ pub(crate) fn new(
 
 let _sec2_falcon = Falconnew(pdev.as_ref(), spec.chipset, 
bar, true)?;
 
+// Will be used in a later patch when fwsec firmware is needed.
+let _bios = Vbios::new(pdev, bar)?;
+
 Ok(pin_init!(Self {
 spec,
 bar: devres_bar,
diff --git a/drivers/gpu/nova-core/nova_core.rs 
b/drivers/gpu/nova-core/nova_core.rs
index 
808997bbe36d2fa1dc8b8940c1f9373d9bdbfb69..de14f2e926361a4f954b1a8d0b95b0e985e54eec
 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -11,6 +11,7 @@
 mod gpu;
 mod regs;
 mod util;
+mod vbios;
 
 pub(crate) const MODULE_NAME: &kernel::str::CStr = ::NAME;
 
diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
new file mode 100644
index 
..aa6f19ddd51752ba453a1600ea002a198e27af5d
--- /dev/null
+++ b/drivers/gpu/nova-core/vbios.rs
@@ -0,0 +1,681 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! VBIOS extraction and parsing.
+
+// To be removed when all code is used.
+#![expect(dead_code)]
+
+use crate::driver::Bar0;
+use core::convert::TryFrom;
+use kernel::error::Result;
+use kernel::num::PowerOfTwo;
+use kernel::pci;
+use kernel::prelude::*;
+
+/// The offset of the VBIOS ROM in the BAR0 space.
+const ROM_OFFSET: usize = 0x30;
+/// The maximum length of the VBIOS ROM to scan into.
+const BIOS_MAX_SCAN_LEN: usize = 0x10;
+/// The size to read ahead when parsing initial BIOS image headers.
+const BIOS_READ_AHEAD_SIZE: usize = 1024;
+/// The bit in the last image indicator byte for the PCI Data Structure that
+/// indicates the last image. Bits 0-6 are reserved, bit 7 is the last image bit.
+const LAST_IMAGE_BIT_MASK: u8 = 0x80;
+
+// PMU lookup table entry types. Used to locate PMU table entries
+// in the Fwsec image, corresponding to falcon ucodes.
+#[expect(dead_code)]
+const FALCON_UCODE_ENTRY_APPID_FIRMWARE_SEC_LIC: u8 = 0x05;
+#[expect(dead_code)]
+const FALCON_UCODE_ENTRY_APPID_FWSEC_DBG: u8 = 0x45;
+const FALCON_UCODE_ENTRY_APPID_FWSEC_PROD: u8 = 0x85;
+
+/// Vbios Reader for constructing the VBIOS data
+struct VbiosIterator<'a> {
+pdev: &'a pci::Device,
+bar0: &'a Bar0,
+// VBIOS data vector: As BIOS images are scanned, they are added to this 
vector
+// for reference or copying into other data structures. It is the entire
+// scanned contents of the VBIOS which progressively extends. It is used
+// so that we do not re-read any contents that are already read as we use
+// the cumulative length read so far, and re-read any gaps as we extend
+// the length.
+data: KVec,
+current_offset: usize, // Current offset for iterator
+last_found: bool,  // Whether the last image has been found
+}
+
+impl<'a> VbiosIterator<'a> {
+fn new(pdev: &'a pci::Device, bar0: &'a Bar0) -> Result {
+Ok(Self {
+pdev,
+bar0,
+

[PATCH v5 15/23] gpu: nova-core: add falcon register definitions and base code

2025-06-12 Thread Alexandre Courbot

Booting the GSP on Ampere requires an intricate dance between the GSP
and SEC2 falcons, where the GSP starts by running the FWSEC firmware to
create the WPR2 region, and then SEC2 loads the actual RISC-V firmware
into the GSP.

Add the common Falcon code and HAL for Ampere GPUs, and instantiate the
GSP and SEC2 Falcons that will be required to perform that dance and
boot the GSP.

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/falcon.rs   | 560 ++
 drivers/gpu/nova-core/falcon/gsp.rs   |  24 ++
 drivers/gpu/nova-core/falcon/hal.rs   |  54 +++
 drivers/gpu/nova-core/falcon/hal/ga102.rs | 117 +++
 drivers/gpu/nova-core/falcon/sec2.rs  |  10 +
 drivers/gpu/nova-core/gpu.rs  |  11 +
 drivers/gpu/nova-core/nova_core.rs|   1 +
 drivers/gpu/nova-core/regs.rs | 139 
 8 files changed, 916 insertions(+)

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
new file mode 100644
index 
..25ed8ee30def3abcc43bcba965eb62f49d532604
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -0,0 +1,560 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Falcon microprocessor base support
+
+// To be removed when all code is used.
+#![expect(dead_code)]
+
+use core::ops::Deref;
+use core::time::Duration;
+use hal::FalconHal;
+use kernel::bindings;
+use kernel::device;
+use kernel::prelude::*;
+use kernel::types::ARef;
+
+use crate::dma::DmaObject;
+use crate::driver::Bar0;
+use crate::gpu::Chipset;
+use crate::regs;
+use crate::util;
+
+pub(crate) mod gsp;
+mod hal;
+pub(crate) mod sec2;
+
+/// Revision number of a falcon core, used in the 
[`crate::regs::NV_PFALCON_FALCON_HWCFG1`]
+/// register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
+pub(crate) enum FalconCoreRev {
+#[default]
+Rev1 = 1,
+Rev2 = 2,
+Rev3 = 3,
+Rev4 = 4,
+Rev5 = 5,
+Rev6 = 6,
+Rev7 = 7,
+}
+
+impl TryFrom<u8> for FalconCoreRev {
+type Error = Error;
+
+fn try_from(value: u8) -> Result<Self> {
+use FalconCoreRev::*;
+
+let rev = match value {
+1 => Rev1,
+2 => Rev2,
+3 => Rev3,
+4 => Rev4,
+5 => Rev5,
+6 => Rev6,
+7 => Rev7,
+_ => return Err(EINVAL),
+};
+
+Ok(rev)
+}
+}
+
+/// Revision subversion number of a falcon core, used in the
+/// [`crate::regs::NV_PFALCON_FALCON_HWCFG1`] register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
+pub(crate) enum FalconCoreRevSubversion {
+#[default]
+Subversion0 = 0,
+Subversion1 = 1,
+Subversion2 = 2,
+Subversion3 = 3,
+}
+
+impl TryFrom<u8> for FalconCoreRevSubversion {
+type Error = Error;
+
+fn try_from(value: u8) -> Result<Self> {
+use FalconCoreRevSubversion::*;
+
+let sub_version = match value & 0b11 {
+0 => Subversion0,
+1 => Subversion1,
+2 => Subversion2,
+3 => Subversion3,
+_ => return Err(EINVAL),
+};
+
+Ok(sub_version)
+}
+}
+
+/// Security model of a falcon core, used in the 
[`crate::regs::NV_PFALCON_FALCON_HWCFG1`]
+/// register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone)]
+pub(crate) enum FalconSecurityModel {
+/// Non-Secure: runs unsigned code without privileges.
+#[default]
+None = 0,
+/// Low-Secure: runs code with some privileges. Can only be entered from 
`Heavy` mode, which
+/// will typically validate the LS code through some signature.
+Light = 2,
+/// High-Secure: runs signed code with full privileges. Signature is 
validated by boot ROM.
+Heavy = 3,
+}
+
+impl TryFrom<u8> for FalconSecurityModel {
+type Error = Error;
+
+fn try_from(value: u8) -> Result<Self> {
+use FalconSecurityModel::*;
+
+let sec_model = match value {
+0 => None,
+2 => Light,
+3 => Heavy,
+_ => return Err(EINVAL),
+};
+
+Ok(sec_model)
+}
+}
+
+/// Signing algorithm for a given firmware, used in the 
[`crate::regs::NV_PFALCON2_FALCON_MOD_SEL`]
+/// register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)]
+pub(crate) enum FalconModSelAlgo {
+/// RSA3K.
+#[default]
+Rsa3k = 1,
+}
+
+impl TryFrom<u8> for FalconModSelAlgo {
+type Error = Error;
+
+fn try_from(value: u8) -> Result<Self> {
+match value {
+1 => Ok(FalconModSelAlgo::Rsa3k),
+_ => Err(EINVAL),
+}
+}
+}
+
+/// Valid values for the `size` field of the 
[`crate::regs::NV_PFALCON_FALCON_DMATRFCMD`] register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)]
+pub(crate) enum DmaTrfCmdSize {
+/// 256 bytes transfer.
+#[default]
+Size256B = 0x6,
+}
+
+impl TryFrom for DmaTrfCmdSize {
+type Error = Error

[PATCH v5 18/23] gpu: nova-core: vbios: Add support to look up PMU table in FWSEC

2025-06-12 Thread Alexandre Courbot

From: Joel Fernandes 

The PMU table in the FWSEC image must be located in order to find the start
of the Falcon ucode, which may reside in the same or another FWSEC image. Add
support for looking it up.
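
As a reading aid for the diff below, the lookup can be sketched in userspace as
follows. The header layout (12 bytes, id 0xB8FF, signature "BIT\0") and the
6-byte token decoding follow the patch; the error type `BitError`, the function
name `find_bit_token` and the flat `data` slice are assumptions made only for
this sketch, since the patch returns kernel error codes and works on
`PciAtBiosImage`.

```rust
/// Errors for this sketch (the patch itself returns kernel `Error` codes).
#[derive(Debug, PartialEq, Eq)]
pub enum BitError {
    OutOfBounds,
    Malformed,
    NotFound,
}

/// Fields of a BIT token entry, mirroring `BitToken` in the patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BitToken {
    pub id: u8,
    pub data_version: u8,
    pub data_size: u16,
    pub data_offset: u16,
}

/// Looks up `token_id` in the BIT table that starts at `bit_offset` in `data`.
pub fn find_bit_token(data: &[u8], bit_offset: usize, token_id: u8) -> Result<BitToken, BitError> {
    // The header is 12 bytes: id (2), signature (4), BCD version (2), then
    // header size, token size, token count and checksum (1 byte each).
    let hdr = bit_offset
        .checked_add(12)
        .and_then(|end| data.get(bit_offset..end))
        .ok_or(BitError::OutOfBounds)?;
    let id = u16::from_le_bytes([hdr[0], hdr[1]]);
    if id != 0xB8FF || &hdr[2..6] != b"BIT\0" {
        return Err(BitError::Malformed);
    }
    let header_size = hdr[8] as usize;
    let token_size = hdr[9] as usize;
    let token_entries = hdr[10] as usize;

    // Each entry must hold at least the 6 bytes decoded below.
    if token_size < 6 {
        return Err(BitError::Malformed);
    }

    let tokens_start = bit_offset + header_size;
    for i in 0..token_entries {
        let start = tokens_start + i * token_size;
        let entry = data
            .get(start..start + token_size)
            .ok_or(BitError::OutOfBounds)?;
        if entry[0] == token_id {
            return Ok(BitToken {
                id: entry[0],
                data_version: entry[1],
                data_size: u16::from_le_bytes([entry[2], entry[3]]),
                data_offset: u16::from_le_bytes([entry[4], entry[5]]),
            });
        }
    }
    Err(BitError::NotFound)
}
```

Using `slice::get` for every access keeps the bounds check and the read in one
place, which is one way to avoid indexing past the end of a truncated image.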

Signed-off-by: Joel Fernandes 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/vbios.rs | 179 -
 1 file changed, 177 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
index 
aa6f19ddd51752ba453a1600ea002a198e27af5d..312caf82d14588e21e0fa2bae0f8954d0efe3479
 100644
--- a/drivers/gpu/nova-core/vbios.rs
+++ b/drivers/gpu/nova-core/vbios.rs
@@ -330,6 +330,111 @@ fn image_size_bytes(&self) -> usize {
 }
 }
 
+/// BIOS Information Table (BIT) Header.
+/// This is the head of the BIT table, which is used to locate the Falcon data.
+/// The BIT table (with its header) is in the PciAtBiosImage and the Falcon data
+/// it points to is in the FwSecBiosImage.
+#[derive(Debug, Clone, Copy)]
+#[expect(dead_code)]
+struct BitHeader {
+/// 0h: BIT Header Identifier (BMP=0x7FFF/BIT=0xB8FF)
+id: u16,
+/// 2h: BIT Header Signature ("BIT\0")
+signature: [u8; 4],
+/// 6h: Binary Coded Decimal Version, ex: 0x0100 is 1.00.
+bcd_version: u16,
+/// 8h: Size of BIT Header (in bytes)
+header_size: u8,
+/// 9h: Size of BIT Tokens (in bytes)
+token_size: u8,
+/// Ah: Number of token entries that follow
+token_entries: u8,
+/// Bh: BIT Header Checksum
+checksum: u8,
+}
+
+impl BitHeader {
+fn new(data: &[u8]) -> Result<Self> {
+if data.len() < 12 {
+return Err(EINVAL);
+}
+
+let mut signature = [0u8; 4];
+signature.copy_from_slice(&data[2..6]);
+
+// Check header ID and signature
+let id = u16::from_le_bytes([data[0], data[1]]);
+if id != 0xB8FF || &signature != b"BIT\0" {
+return Err(EINVAL);
+}
+
+Ok(BitHeader {
+id,
+signature,
+bcd_version: u16::from_le_bytes([data[6], data[7]]),
+header_size: data[8],
+token_size: data[9],
+token_entries: data[10],
+checksum: data[11],
+})
+}
+}
+
+/// BIT Token Entry: Records in the BIT table that follow the BIT header
+#[derive(Debug, Clone, Copy)]
+#[expect(dead_code)]
+struct BitToken {
+/// 00h: Token identifier
+id: u8,
+/// 01h: Version of the token data
+data_version: u8,
+/// 02h: Size of token data in bytes
+data_size: u16,
+/// 04h: Offset to the token data
+data_offset: u16,
+}
+
+// Define the token ID for the Falcon data
+const BIT_TOKEN_ID_FALCON_DATA: u8 = 0x70;
+
+impl BitToken {
+/// Find a BIT token entry by BIT ID in a PciAtBiosImage
+fn from_id(image: &PciAtBiosImage, token_id: u8) -> Result<Self> {
+let header = &image.bit_header;
+
+// Offset to the first token entry
+let tokens_start = image.bit_offset + header.header_size as usize;
+
+for i in 0..header.token_entries as usize {
+let entry_offset = tokens_start + (i * header.token_size as usize);
+
+// Make sure we don't go out of bounds
+if entry_offset + header.token_size as usize > 
image.base.data.len() {
+return Err(EINVAL);
+}
+
+// Check if this token has the requested ID
+if image.base.data[entry_offset] == token_id {
+return Ok(BitToken {
+id: image.base.data[entry_offset],
+data_version: image.base.data[entry_offset + 1],
+data_size: u16::from_le_bytes([
+image.base.data[entry_offset + 2],
+image.base.data[entry_offset + 3],
+]),
+data_offset: u16::from_le_bytes([
+image.base.data[entry_offset + 4],
+image.base.data[entry_offset + 5],
+]),
+});
+}
+}
+
+// Token not found
+Err(ENOENT)
+}
+}
+
 /// PCI ROM Expansion Header as defined in PCI Firmware Specification.
 /// This is header is at the beginning of every image in the set of
 /// images in the ROM. It contains a pointer to the PCI Data Structure
@@ -575,7 +680,8 @@ fn new(pdev: &pci::Device, data: &[u8]) -> Result {
 
 struct PciAtBiosImage {
 base: BiosImageBase,
-// PCI-AT-specific fields can be added here in the future.
+bit_header: BitHeader,
+bit_offset: usize,
 }
 
 struct EfiBiosImage {
@@ -599,7 +705,7 @@ impl TryFrom for BiosImage {
 
 fn try_from(base: BiosImageBase) -> Result {
 match base.pcir.code_type {
-0x00 => Ok(BiosImage::PciAt(PciAtBiosImage { base })),
+0x00 => Ok(BiosImage::PciAt(base.try_into()?)),
 0x03 => Ok(BiosImage::Efi(EfiBiosImage { base })),
 0x70 => Ok(BiosImage::Nbsi(NbsiBiosImage { ba

[PATCH v5 14/23] gpu: nova-core: register sysmem flush page

2025-06-12 Thread Alexandre Courbot

Reserve a page of system memory so sysmembar can perform a read on it if
a system write occurred since the last flush. Do this early as it can be
required to e.g. reset the GPU falcons.

Chipset capabilities differ in that respect, so this commit also
introduces the FB HAL.

Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/fb.rs   | 66 +++
 drivers/gpu/nova-core/fb/hal.rs   | 31 
 drivers/gpu/nova-core/fb/hal/ga100.rs | 45 
 drivers/gpu/nova-core/fb/hal/tu102.rs | 42 ++
 drivers/gpu/nova-core/gpu.rs  | 25 +++--
 drivers/gpu/nova-core/nova_core.rs|  1 +
 drivers/gpu/nova-core/regs.rs | 10 ++
 7 files changed, 218 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
new file mode 100644
index 
..308cd76edfee5a2e8a4cd979c20da2ce51cb16a5
--- /dev/null
+++ b/drivers/gpu/nova-core/fb.rs
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+use kernel::types::ARef;
+use kernel::{dev_warn, device};
+
+use crate::dma::DmaObject;
+use crate::driver::Bar0;
+use crate::gpu::Chipset;
+
+mod hal;
+
+/// Type holding the sysmem flush memory page, a page of memory to be written 
into the
+/// `NV_PFB_NISO_FLUSH_SYSMEM_ADDR*` registers and used to maintain memory 
coherency.
+///
+/// Users are responsible for manually calling [`Self::unregister`] before 
dropping this object, or
+/// the page might remain in use even after it has been freed.
+pub(crate) struct SysmemFlush {
+/// Chipset we are operating on.
+chipset: Chipset,
+device: ARef<device::Device>,
+/// Keep the page alive as long as we need it.
+page: DmaObject,
+}
+
+impl SysmemFlush {
+/// Allocate a memory page and register it as the sysmem flush page.
+pub(crate) fn register(
+dev: &device::Device,
+bar: &Bar0,
+chipset: Chipset,
+) -> Result<Self> {
+let page = DmaObject::new(dev, kernel::bindings::PAGE_SIZE)?;
+
+hal::fb_hal(chipset).write_sysmem_flush_page(bar, page.dma_handle())?;
+
+Ok(Self {
+chipset,
+device: dev.into(),
+page,
+})
+}
+
+/// Unregister the managed sysmem flush page.
+///
+/// Users must make sure to call this method before dropping the object.
+pub(crate) fn unregister(self, bar: &Bar0) {
+let hal = hal::fb_hal(self.chipset);
+
+if hal.read_sysmem_flush_page(bar) == self.page.dma_handle() {
+let _ = hal.write_sysmem_flush_page(bar, 0).inspect_err(|e| {
+dev_warn!(
+&self.device,
+"failed to unregister sysmem flush page: {:?}",
+e
+)
+});
+} else {
+// Another page has been registered after us for some reason - 
warn as this is a bug.
+dev_warn!(
+&self.device,
+"attempt to unregister a sysmem flush page that is not 
active\n"
+);
+}
+}
+}
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
new file mode 100644
index 
..23eab57eec9f524e066d3324eb7f5f2bf78481d2
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+
+use crate::driver::Bar0;
+use crate::gpu::Chipset;
+
+mod ga100;
+mod tu102;
+
+pub(crate) trait FbHal {
+/// Returns the address of the currently-registered sysmem flush page.
+fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64;
+
+/// Register `addr` as the address of the sysmem flush page.
+///
+/// This might fail if the address is too large for the receiving register.
+fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result;
+}
+
+/// Returns the HAL corresponding to `chipset`.
+pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
+use Chipset::*;
+
+match chipset {
+TU102 | TU104 | TU106 | TU117 | TU116 => tu102::TU102_HAL,
+GA100 | GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 
| AD106 | AD107 => {
+ga100::GA100_HAL
+}
+}
+}
diff --git a/drivers/gpu/nova-core/fb/hal/ga100.rs 
b/drivers/gpu/nova-core/fb/hal/ga100.rs
new file mode 100644
index 
..7c10436c1c590d9b767c399b69370697fdf8d239
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/ga100.rs
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0
+
+struct Ga100;
+
+use kernel::prelude::*;
+
+use crate::driver::Bar0;
+use crate::fb::hal::FbHal;
+use crate::regs;
+
+use super::tu102::FLUSH_SYSMEM_ADDR_SHIFT;
+
+pub(super) fn read_sysmem_flush_page_ga100(bar: &Bar0) -> u64 {
+(regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::read(bar).adr_39_08() as u64) << 
FLUSH_SYSMEM_ADDR_SHIFT
+|

[PATCH v5 09/23] gpu: nova-core: allow register aliases

2025-06-12 Thread Alexandre Courbot

Some registers (notably scratch registers) don't have a definitive
purpose, but need to be interpreted differently depending on context.

Expand the register!() macro to support a syntax indicating that a
register type should be at the same offset as another one, but under a
different name, and with different fields and documentation.
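
To illustrate the idea outside of the macro, here is a small hand-written
userspace sketch of what an alias amounts to: two wrapper types that share one
offset but expose different accessors. `FakeBar`, `Scratch0` and
`Scratch0BootStatus` are hypothetical stand-ins written for this sketch; the
real types are generated by `register!()` and read through `Bar0`.

```rust
/// Stand-in for a mapped BAR: a flat array of 32-bit registers.
pub struct FakeBar {
    pub regs: [u32; 0x80],
}

impl FakeBar {
    /// Reads the 32-bit register at byte offset `offset`.
    pub fn read32(&self, offset: usize) -> u32 {
        self.regs[offset / 4]
    }
}

/// Plain view of the scratch register: one 32-bit raw value.
pub struct Scratch0(u32);

impl Scratch0 {
    pub const OFFSET: usize = 0x100;

    pub fn read(bar: &FakeBar) -> Self {
        Self(bar.read32(Self::OFFSET))
    }

    /// Bits 31:0, the raw value.
    pub fn value(&self) -> u32 {
        self.0
    }
}

/// Alias of `Scratch0`: same offset, different interpretation of the bits.
pub struct Scratch0BootStatus(u32);

impl Scratch0BootStatus {
    /// The alias takes its offset from the aliased register.
    pub const OFFSET: usize = Scratch0::OFFSET;

    pub fn read(bar: &FakeBar) -> Self {
        Self(bar.read32(Self::OFFSET))
    }

    /// Bit 0:0, whether the firmware has completed booting.
    pub fn completed(&self) -> bool {
        self.0 & 0x1 != 0
    }
}
```

Deriving the alias offset from `Scratch0::OFFSET` rather than repeating the
literal is the reason the helper rules need to accept an expression for the
offset instead of a literal.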

Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/regs/macros.rs | 40 ++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/regs/macros.rs 
b/drivers/gpu/nova-core/regs/macros.rs
index 
7cd013f3c90bbd8ca437d4072cae8f11d7946fcd..e0e6fef3796f9dd2ce4e0223444a05bcc53075a6
 100644
--- a/drivers/gpu/nova-core/regs/macros.rs
+++ b/drivers/gpu/nova-core/regs/macros.rs
@@ -71,6 +71,20 @@
 /// pr_info!("CPU CTL: {:#x}", cpuctl);
 /// cpuctl.set_start(true).write(&bar, CPU_BASE);
 /// ```
+///
+/// It is also possible to create an alias register by using the `=> ALIAS` syntax. This is useful
+/// for cases where a register's interpretation depends on the context:
+///
+/// ```no_run
+/// register!(SCRATCH_0 @ 0x100, "Scratch register 0" {
+///    31:0     value as u32, "Raw value";
+/// });
+///
+/// register!(SCRATCH_0_BOOT_STATUS => SCRATCH_0, "Boot status of the firmware" {
+///     0:0     completed as bool, "Whether the firmware has completed booting";
+/// });
+/// ```
+///
+/// In this example, `SCRATCH_0_BOOT_STATUS` uses the same I/O address as 
`SCRATCH_0`, while also
+/// providing its own `completed` method.
 macro_rules! register {
 // Creates a register at a fixed offset of the MMIO space.
 (
@@ -83,6 +97,17 @@ macro_rules! register {
 register!(@io $name @ $offset);
 };
 
+// Creates an alias register of fixed offset register `alias` with its own fields.
+(
+$name:ident => $alias:ident $(, $comment:literal)? {
+$($fields:tt)*
+}
+) => {
+register!(@common $name @ $alias::OFFSET $(, $comment)?);
+register!(@field_accessors $name { $($fields)* });
+register!(@io $name @ $alias::OFFSET);
+};
+
 // Creates a register at a relative offset from a base address.
 (
 $name:ident @ + $offset:literal $(, $comment:literal)? {
@@ -94,11 +119,22 @@ macro_rules! register {
 register!(@io$name @ + $offset);
 };
 
+// Creates an alias register of relative offset register `alias` with its own fields.
+(
+$name:ident => + $alias:ident $(, $comment:literal)? {
+$($fields:tt)*
+}
+) => {
+register!(@common $name @ $alias::OFFSET $(, $comment)?);
+register!(@field_accessors $name { $($fields)* });
+register!(@io $name @ + $alias::OFFSET);
+};
+
 // All rules below are helpers.
 
 // Defines the wrapper `$name` type, as well as its relevant 
implementations (`Debug`, `BitOr`,
 // and conversion to regular `u32`).
-(@common $name:ident @ $offset:literal $(, $comment:literal)?) => {
+(@common $name:ident @ $offset:expr $(, $comment:literal)?) => {
 $(
 #[doc=$comment]
 )?
@@ -280,7 +316,7 @@ pub(crate) fn [](mut self, value: $to_type) -> 
Self {
 };
 
 // Creates the IO accessors for a fixed offset register.
-(@io $name:ident @ $offset:literal) => {
+(@io $name:ident @ $offset:expr) => {
 #[allow(dead_code)]
 impl $name {
 #[inline]

-- 
2.49.0

[PATCH v5 04/23] rust: add new `num` module with `PowerOfTwo` type

2025-06-12 Thread Alexandre Courbot

Introduce the `num` module, featuring the `PowerOfTwo` unsigned wrapper
that guarantees (at build-time or runtime) that a value is a power of
two.

Such a property is often useful to maintain. In the context of the
kernel, powers of two are often used to align addresses or sizes up and
down, or to create masks. These operations are provided by this type.

It is introduced to be first used by the nova-core driver.
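
For illustration, the operations described above can be sketched in plain
userspace Rust as follows. This is a simplified stand-in for `u32` only, with a
hypothetical name `PowerOfTwoU32`; it has only the runtime-checked constructor,
because the build-time check in the patch relies on the kernel's
`build_assert!`. The `align_up` body is an assumption that matches the
documented wrap-around behavior, not necessarily the patch's exact code.

```rust
/// Userspace sketch of a power-of-two wrapper, `u32` only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PowerOfTwoU32(u32);

impl PowerOfTwoU32 {
    /// Runtime-checked constructor: `None` unless exactly one bit is set.
    pub const fn try_new(v: u32) -> Option<Self> {
        match v.count_ones() {
            1 => Some(Self(v)),
            _ => None,
        }
    }

    /// Returns the wrapped value, which is always a power of two.
    pub const fn value(self) -> u32 {
        self.0
    }

    /// Mask covering the bits below the alignment, i.e. `value - 1`.
    pub const fn mask(self) -> u32 {
        self.0.wrapping_sub(1)
    }

    /// Aligns `value` down to `self` by clearing the low bits.
    pub const fn align_down(self, value: u32) -> u32 {
        value & !self.mask()
    }

    /// Aligns `value` up to `self`; wraps around to 0 on overflow.
    pub const fn align_up(self, value: u32) -> u32 {
        self.align_down(value.wrapping_add(self.mask()))
    }
}
```

With a 0x1000 alignment, 0x4fff aligns down to 0x4000 and up to 0x5000, and no
division or modulo is involved, only a mask.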

Signed-off-by: Alexandre Courbot 
---
 rust/kernel/lib.rs |   1 +
 rust/kernel/num.rs | 173 +
 2 files changed, 174 insertions(+)

diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 
6b4774b2b1c37f4da1866e993be6230bc6715841..2955f65da1278dd4cba1e4272ff178b8211a892c
 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -89,6 +89,7 @@
 pub mod mm;
 #[cfg(CONFIG_NET)]
 pub mod net;
+pub mod num;
 pub mod of;
 #[cfg(CONFIG_PM_OPP)]
 pub mod opp;
diff --git a/rust/kernel/num.rs b/rust/kernel/num.rs
new file mode 100644
index 
..ee0f67ad1a89e69f5f8d2077eba5541b472e7d8a
--- /dev/null
+++ b/rust/kernel/num.rs
@@ -0,0 +1,173 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Numerical and binary utilities for primitive types.
+
+use crate::build_assert;
+use core::borrow::Borrow;
+use core::fmt::Debug;
+use core::hash::Hash;
+use core::ops::Deref;
+
+/// An unsigned integer which is guaranteed to be a power of 2.
+#[derive(Debug, Clone, Copy)]
+#[repr(transparent)]
+pub struct PowerOfTwo<T>(T);
+
+macro_rules! power_of_two_impl {
+($($t:ty),+) => {
+$(
+impl PowerOfTwo<$t> {
+/// Validates that `v` is a power of two at build-time, and 
returns it wrapped into
+/// `PowerOfTwo`.
+///
+/// A build error is triggered if `v` cannot be asserted to be 
a power of two.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::num::PowerOfTwo;
+///
+/// let v = PowerOfTwonew(256);
+/// assert_eq!(v.value(), 256);
+/// ```
+#[inline(always)]
+pub const fn new(v: $t) -> Self {
+build_assert!(v.count_ones() == 1);
+Self(v)
+}
+
+/// Validates that `v` is a power of two at runtime, and 
returns it wrapped into
+/// `PowerOfTwo`.
+///
+/// `None` is returned if `v` was not a power of two.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::num::PowerOfTwo;
+///
+/// 
assert_eq!(PowerOfTwotry_new(16).unwrap().value(), 16);
+/// assert_eq!(PowerOfTwotry_new(15), None);
+/// ```
+#[inline(always)]
+pub const fn try_new(v: $t) -> Option<Self> {
+match v.count_ones() {
+1 => Some(Self(v)),
+_ => None,
+}
+}
+
+/// Returns the value of this instance.
+///
+/// It is guaranteed to be a power of two.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::num::PowerOfTwo;
+///
+/// let v = PowerOfTwo::<u32>::new(256);
+/// assert_eq!(v.value(), 256);
+/// ```
+#[inline(always)]
+pub const fn value(&self) -> $t {
+self.0
+}
+
+/// Returns the mask corresponding to `self.value() - 1`.
+#[inline(always)]
+pub const fn mask(&self) -> $t {
+self.0.wrapping_sub(1)
+}
+
+/// Aligns `value` down to `self`.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::num::PowerOfTwo;
+///
+/// assert_eq!(PowerOfTwo::<u32>::new(0x1000).align_down(0x4fff), 0x4000);
+/// ```
+#[inline(always)]
+pub const fn align_down(self, value: $t) -> $t {
+value & !self.mask()
+}
+
+/// Aligns `value` up to `self`.
+///
+/// Wraps around to `0` if the requested alignment pushes the result above the
+/// type's limits.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::num::PowerOfTwo;
+///
+/// assert_eq!(PowerOfTwo::<u32>::new(0x1000).align_up(0x4fff), 0x5000);
+/// 
assert_eq!(PowerOfT
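
For readers without a kernel tree at hand, the `PowerOfTwo` helpers documented above can be tried in plain userspace Rust. This is only a minimal sketch under stated assumptions, not the patch's code: the type is fixed to `u32` instead of being generated by the macro, the `build_assert!`-based `new()` is left out, and the body of `align_up` is a guess based solely on the documented wrap-around behaviour.

```rust
/// Userspace stand-in for the kernel's `PowerOfTwo<u32>` (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PowerOfTwo(u32);

impl PowerOfTwo {
    /// Runtime validation: a power of two has exactly one bit set.
    pub const fn try_new(v: u32) -> Option<Self> {
        match v.count_ones() {
            1 => Some(Self(v)),
            _ => None,
        }
    }

    /// Returns the wrapped value.
    pub const fn value(self) -> u32 {
        self.0
    }

    /// Returns `value() - 1`, i.e. all the bits below the single set bit.
    pub const fn mask(self) -> u32 {
        self.0.wrapping_sub(1)
    }

    /// Aligns `value` down to `self` by clearing the low bits.
    pub const fn align_down(self, value: u32) -> u32 {
        value & !self.mask()
    }

    /// Aligns `value` up to `self`.
    ///
    /// ASSUMPTION: implemented as "add the mask, then align down", with a
    /// wrapping add so that an overflowing request wraps around to 0.
    pub const fn align_up(self, value: u32) -> u32 {
        self.align_down(value.wrapping_add(self.mask()))
    }
}
```

Since no modulo is involved, both alignment operations reduce to an add and a mask, which is the property argued for earlier in the thread.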

[PATCH v5 05/23] rust: num: add the `fls` operation

2025-06-12 Thread Alexandre Courbot

Add an equivalent to the `fls` (Find Last Set bit) C function to Rust
unsigned types.

It is to be first used by the nova-core driver.

Signed-off-by: Alexandre Courbot 
---
 rust/kernel/num.rs | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/rust/kernel/num.rs b/rust/kernel/num.rs
index 
ee0f67ad1a89e69f5f8d2077eba5541b472e7d8a..934afe17719f789c569dbd54534adc2e26fe59f2
 100644
--- a/rust/kernel/num.rs
+++ b/rust/kernel/num.rs
@@ -171,3 +171,34 @@ fn borrow(&self) -> &T {
 &self.0
 }
 }
+
+macro_rules! impl_fls {
+($($t:ty),+) => {
+$(
+::kernel::macros::paste! {
+/// Find Last Set Bit: return the 1-based index of the last (i.e. most significant) set
+/// bit in `v`.
+///
+/// Equivalent to the C `fls` function.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::num::fls_u32;
+///
+/// assert_eq!(fls_u32(0x0), 0);
+/// assert_eq!(fls_u32(0x1), 1);
+/// assert_eq!(fls_u32(0x10), 5);
+/// assert_eq!(fls_u32(0xffff), 16);
+/// assert_eq!(fls_u32(0x8000_0000), 32);
+/// ```
+#[inline(always)]
+pub const fn [<fls_ $t>](v: $t) -> u32 {
+$t::BITS - v.leading_zeros()
+}
+}
+)+
+};
+}
+
+impl_fls!(usize, u8, u16, u32, u64, u128);

-- 
2.49.0
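
As a quick sanity check outside the kernel, the `fls` semantics described in this patch can be reproduced in userspace Rust. The sketch below hand-writes the `u32` and `u64` variants instead of generating them with `paste!`; only the `BITS - leading_zeros()` expression is taken from the patch, everything else is illustrative.

```rust
/// Find Last Set bit for `u32`: 1-based index of the most significant set
/// bit, or 0 when no bit is set.
pub const fn fls_u32(v: u32) -> u32 {
    u32::BITS - v.leading_zeros()
}

/// Same operation for `u64`; the result is still a `u32` bit index.
pub const fn fls_u64(v: u64) -> u32 {
    u64::BITS - v.leading_zeros()
}
```

For a zero input `leading_zeros()` equals `BITS`, so the subtraction yields 0 without any special casing, matching the C `fls()` convention.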

[PATCH v5 07/23] gpu: nova-core: add delimiter for helper rules in register!() macro

2025-06-12 Thread Alexandre Courbot

This macro is pretty complex, and most rules are just helpers, so add a
delimiter to indicate where users who are only interested in using it can
stop reading.

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/regs/macros.rs | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/nova-core/regs/macros.rs 
b/drivers/gpu/nova-core/regs/macros.rs
index 
40bf9346cd0699ede05cfddff5d39822c696c164..d7f09026390b4ccb1c969f2b29caf07fa9204a77
 100644
--- a/drivers/gpu/nova-core/regs/macros.rs
+++ b/drivers/gpu/nova-core/regs/macros.rs
@@ -94,6 +94,8 @@ macro_rules! register {
 register!(@io$name @ + $offset);
 };
 
+// All rules below are helpers.
+
 // Defines the wrapper `$name` type, as well as its relevant 
implementations (`Debug`, `BitOr`,
 // and conversion to regular `u32`).
 (@common $name:ident $(, $comment:literal)?) => {

-- 
2.49.0

[PATCH v5 11/23] gpu: nova-core: add helper function to wait on condition

2025-06-12 Thread Alexandre Courbot

While programming the hardware, we frequently need to busy-wait until
a condition (like a given bit of a register to switch value) happens.

Add a basic `wait_on` helper function to wait on such conditions
expressed as a closure, with a timeout argument.

This is temporary as we will switch to `read_poll_timeout` [1] once it
is available.

[1] 
https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomon...@gmail.com/

Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/util.rs | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs
index 
332a64cfc6a9d7d787fbdc228887c0be53a97160..c50bfa5ab7fe385fae26c8909ae5984b96af618a
 100644
--- a/drivers/gpu/nova-core/util.rs
+++ b/drivers/gpu/nova-core/util.rs
@@ -1,5 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::time::Duration;
+
+use kernel::prelude::*;
+use kernel::time::Instant;
+
 pub(crate) const fn to_lowercase_bytes<const N: usize>(s: &str) -> [u8; N] {
 let src = s.as_bytes();
 let mut dst = [0; N];
@@ -19,3 +24,27 @@ pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str {
 Err(_) => kernel::build_error!("Bytes are not valid UTF-8."),
 }
 }
+
+/// Wait until `cond` is true or `timeout` elapsed.
+///
+/// When `cond` evaluates to `Some`, its return value is returned.
+///
+/// `Err(ETIMEDOUT)` is returned if `timeout` has been reached without `cond` evaluating to
+/// `Some`.
+///
+/// TODO: replace with `read_poll_timeout` once it is available.
+/// 
(https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomon...@gmail.com/)
+#[expect(dead_code)]
+pub(crate) fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R> {
+let start_time = Instant::now();
+
+loop {
+if let Some(ret) = cond() {
+return Ok(ret);
+}
+
+if start_time.elapsed().as_nanos() > timeout.as_nanos() as i64 {
+return Err(ETIMEDOUT);
+}
+}
+}

-- 
2.49.0
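
To experiment with the polling pattern of `wait_on` outside the kernel, a userspace approximation might look like the following. This is a sketch under assumptions: it uses `std::time::Instant` rather than `kernel::time::Instant`, and the `TimedOut` error type is a hypothetical stand-in for `ETIMEDOUT`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical userspace replacement for the kernel's `ETIMEDOUT` error.
#[derive(Debug, PartialEq, Eq)]
pub struct TimedOut;

/// Polls `cond` until it returns `Some`, or until `timeout` has elapsed.
///
/// `cond` is always evaluated at least once, even with a zero timeout.
pub fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R, TimedOut> {
    let start = Instant::now();

    loop {
        if let Some(ret) = cond() {
            return Ok(ret);
        }

        if start.elapsed() > timeout {
            return Err(TimedOut);
        }

        // Hint to the CPU that we are busy-waiting.
        std::hint::spin_loop();
    }
}
```

Because the condition is checked before the deadline, a caller whose condition is already true never observes a timeout, whatever the `timeout` value.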

[PATCH v5 01/23] rust: dma: expose the count and size of CoherentAllocation

2025-06-12 Thread Alexandre Courbot

These properties are very useful to have (and to be used by nova-core)
and should be accessible.

Signed-off-by: Alexandre Courbot 
---
 rust/kernel/dma.rs | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index 
a33261c62e0c2d3c2c9e92a4c058faab594e5355..1a6fc800256500ae04099fbf4f9a1bd1115ce202
 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -114,9 +114,11 @@ pub mod attrs {
 ///
 /// # Invariants
 ///
-/// For the lifetime of an instance of [`CoherentAllocation`], the `cpu_addr` is a valid pointer
-/// to an allocated region of consistent memory and `dma_handle` is the DMA address base of
-/// the region.
+/// - For the lifetime of an instance of [`CoherentAllocation`], the `cpu_addr` is a valid pointer
+///   to an allocated region of consistent memory and `dma_handle` is the DMA address base of the
+///   region.
+/// - The size in bytes of the allocation is equal to `size_of::<T> * count`.
+/// - `size_of::<T> * count` fits into a `usize`.
 // TODO
 //
 // DMA allocations potentially carry device resources (e.g.IOMMU mappings), 
hence for soundness
@@ -179,9 +181,12 @@ pub fn alloc_attrs(
 if ret.is_null() {
 return Err(ENOMEM);
 }
-// INVARIANT: We just successfully allocated a coherent region which is accessible for
-// `count` elements, hence the cpu address is valid. We also hold a refcounted reference
-// to the device.
+// INVARIANT:
+// - We just successfully allocated a coherent region which is accessible for
+//   `count` elements, hence the cpu address is valid. We also hold a refcounted reference
+//   to the device.
+// - The allocated `size` is equal to `size_of::<T> * count`.
+// - The allocated `size` fits into a `usize`.
 Ok(Self {
 dev: dev.into(),
 dma_handle,
@@ -201,6 +206,21 @@ pub fn alloc_coherent(
 CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
 }
 
+/// Returns the number of elements `T` in this allocation.
+///
+/// Note that this is not the size of the allocation in bytes, which is provided by
+/// [`Self::size`].
+pub fn count(&self) -> usize {
+self.count
+}
+
+/// Returns the size in bytes of this allocation.
+pub fn size(&self) -> usize {
+// INVARIANT: The type invariant of `Self` guarantees that `size_of::<T> * count` fits into
+// a `usize`.
+self.count * core::mem::size_of::<T>()
+}
+
 /// Returns the base address to the allocated region in the CPU's virtual 
address space.
 pub fn start_ptr(&self) -> *const T {
 self.cpu_addr

-- 
2.49.0
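
The reason `size()` can multiply without an overflow check is that the invariant is established once, at allocation time. A userspace sketch of that idea follows; `Allocation` is a hypothetical stand-in that performs no real DMA allocation, and the `checked_mul` in `new()` is an assumption about how such an invariant could be enforced, not a copy of `alloc_attrs`.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical stand-in for `CoherentAllocation<T>`: it only tracks the
/// element count and does not allocate anything.
pub struct Allocation<T> {
    count: usize,
    _marker: PhantomData<T>,
}

impl<T> Allocation<T> {
    /// Refuses element counts whose byte size would overflow `usize`,
    /// establishing the invariant that `size()` relies on.
    pub fn new(count: usize) -> Option<Self> {
        size_of::<T>().checked_mul(count)?;
        Some(Self {
            count,
            _marker: PhantomData,
        })
    }

    /// Number of `T` elements, not bytes.
    pub fn count(&self) -> usize {
        self.count
    }

    /// Size in bytes; cannot overflow thanks to the check in `new()`.
    pub fn size(&self) -> usize {
        self.count * size_of::<T>()
    }
}
```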

[PATCH v5 12/23] gpu: nova-core: wait for GFW_BOOT completion

2025-06-12 Thread Alexandre Courbot

Upon reset, the GPU executes the GFW (GPU Firmware) in order to
initialize its base parameters such as clocks. The driver must ensure
that this step is completed before using the hardware.

Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/gfw.rs   | 39 ++
 drivers/gpu/nova-core/gpu.rs   |  5 +
 drivers/gpu/nova-core/nova_core.rs |  1 +
 drivers/gpu/nova-core/regs.rs  | 25 
 drivers/gpu/nova-core/util.rs  |  1 -
 5 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/gfw.rs b/drivers/gpu/nova-core/gfw.rs
new file mode 100644
index 
..911338660f9774d74c71c090517b220b64989bf6
--- /dev/null
+++ b/drivers/gpu/nova-core/gfw.rs
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! GPU Firmware (GFW) support.
+//!
+//! Upon reset, the GPU runs some firmware code from the BIOS to setup its core parameters. Most of
+//! the GPU is considered unusable until this step is completed, so we must wait on it before
+//! performing driver initialization.
+
+use core::time::Duration;
+
+use kernel::bindings;
+use kernel::prelude::*;
+
+use crate::driver::Bar0;
+use crate::regs;
+use crate::util;
+
+/// Wait until GFW (GPU Firmware) completes, or a 4 seconds timeout elapses.
+pub(crate) fn wait_gfw_boot_completion(bar: &Bar0) -> Result {
+util::wait_on(Duration::from_secs(4), || {
+// Check that FWSEC has lowered its protection level before reading the GFW_BOOT
+// status.
+let gfw_booted = regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK::read(bar)
+.read_protection_level0()
+&& regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_0_GFW_BOOT::read(bar).completed();
+
+if gfw_booted {
+Some(())
+} else {
+// Avoid busy-looping.
+// SAFETY: msleep should be safe to call with any parameter.
+// TODO: replace with [1] once it merges.
+// [1] https://lore.kernel.org/rust-for-linux/20250423192857.199712-6-fujita.tomon...@gmail.com/
+unsafe { bindings::msleep(1) };
+
+None
+}
+})
+}
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 
60b86f3702842dc2c8b06f092250a5bad3b97bf4..e44ff6fa07147c6dd1515c2c6c0df927a2257c85
 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -4,6 +4,7 @@
 
 use crate::driver::Bar0;
 use crate::firmware::{Firmware, FIRMWARE_VERSION};
+use crate::gfw;
 use crate::regs;
 use crate::util;
 use core::fmt;
@@ -182,6 +183,10 @@ pub(crate) fn new(
 spec.revision
 );
 
+// We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
+gfw::wait_gfw_boot_completion(bar)
+.inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
+
 Ok(pin_init!(Self {
 spec,
 bar: devres_bar,
diff --git a/drivers/gpu/nova-core/nova_core.rs 
b/drivers/gpu/nova-core/nova_core.rs
index 
618632f0abcc8f5ef6945a04fc084acc4ecbf20b..c3fde3e132ea65851137ab47fcb7b3637577
 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -4,6 +4,7 @@
 
 mod driver;
 mod firmware;
+mod gfw;
 mod gpu;
 mod regs;
 mod util;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 
5a12732303066f78b8ec5745096cef632ff3bfba..cba442da51181971f209b338249307c11ac481e3
 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -37,3 +37,28 @@ pub(crate) fn chipset(self) -> Result<Chipset> {
 .and_then(Chipset::try_from)
 }
 }
+
+/* PGC6 */
+
+register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 {
+0:0 read_protection_level0 as bool, "Set after FWSEC lowers its protection level";
+});
+
+// TODO: This is an array of registers.
+register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05 @ 0x00118234 {
+31:0 value as u32;
+});
+
+register!(
+NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_0_GFW_BOOT => NV_PGC6_AON_SECURE_SCRATCH_GROUP_05,
+"Scratch group 05 register 0 used as GFW boot progress indicator" {
+7:0 progress as u8, "Progress of GFW boot (0xff means completed)";
+}
+);
+
+impl NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_0_GFW_BOOT {
+/// Returns `true` if GFW boot is completed.
+pub(crate) fn completed(self) -> bool {
+self.progress() == 0xff
+}
+}
diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs
index 
c50bfa5ab7fe385fae26c8909ae5984b96af618a..69f29238b25ed949b00def1b748df3ff7567d83c
 100644
--- a/drivers/gpu/nova-core/util.rs
+++ b/drivers/gpu/nova-core/util.rs
@@ -34,7 +34,6 @@ pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str {
 ///
 /// TODO: replace with `read_poll_timeout` once it is available.
 /// 
(https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomon...@
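
The completion check in this patch combines two register reads: bit 0 of the privilege-level mask and the low byte of the scratch register. Stripped of the `register!` machinery, the decoding can be sketched in plain Rust as below. The type and function names are illustrative only; the bit positions (0:0 and 7:0) and the `0xff` completion value are the ones given in the patch.

```rust
/// Illustrative wrapper around the raw value of the GFW boot scratch register.
pub struct GfwBootStatus(pub u32);

impl GfwBootStatus {
    /// Bits 7:0 hold the boot progress.
    pub const fn progress(self) -> u8 {
        (self.0 & 0xff) as u8
    }

    /// A progress value of 0xff means that GFW boot has completed.
    pub const fn completed(self) -> bool {
        self.progress() == 0xff
    }
}

/// Mirrors the condition polled in `wait_gfw_boot_completion`: the read
/// protection must have been lowered (bit 0 set) before the progress value
/// can be trusted.
pub const fn gfw_booted(priv_level_mask: u32, scratch: u32) -> bool {
    (priv_level_mask & 1) != 0 && GfwBootStatus(scratch).completed()
}
```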

[PATCH v5 10/23] gpu: nova-core: increase BAR0 size to 16MB

2025-06-12 Thread Alexandre Courbot

The Turing+ register address space spans over that range, so increase it
as future patches will access more registers.

Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/driver.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 
8c86101c26cb5fe5eb9a3d03268338c6b58baef7..ffe25c7a2fdad289549460f7fd87d6e09299a35c
 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use kernel::{auxiliary, bindings, c_str, device::Core, pci, prelude::*};
+use kernel::{auxiliary, bindings, c_str, device::Core, pci, prelude::*, sizes::SZ_16M};
 
 use crate::gpu::Gpu;
 
@@ -11,7 +11,7 @@ pub(crate) struct NovaCore {
 _reg: auxiliary::Registration,
 }
 
-const BAR0_SIZE: usize = 8;
+const BAR0_SIZE: usize = SZ_16M;
 pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
 
 kernel::pci_device_table!(

-- 
2.49.0
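
To make the effect of the larger window concrete, here is a small userspace sketch of a bounds check against a BAR size. The helper `fits_in_bar` is hypothetical and is not the kernel's `pci::Bar` API; only the 16 MiB size and the example register offset (0x00118128, from the GFW_BOOT patch of this series) come from the thread.

```rust
/// 16 MiB, the new BAR0 window size.
pub const SZ_16M: usize = 16 * 1024 * 1024;

/// Hypothetical helper: returns `true` if an access of `width` bytes at
/// `offset` lies entirely within a BAR of `bar_size` bytes.
pub const fn fits_in_bar(offset: usize, width: usize, bar_size: usize) -> bool {
    match offset.checked_add(width) {
        Some(end) => end <= bar_size,
        None => false,
    }
}
```

With the previous size of 8, a 32-bit access at offset 0x00118128 would fall outside the window, while it is comfortably inside the 16 MiB one.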

[PATCH v5 08/23] gpu: nova-core: expose the offset of each register as a type constant

2025-06-12 Thread Alexandre Courbot

Although we want to access registers using the provided methods, it is
sometimes needed to use their raw offset, for instance when working with
a register array.

Expose the offset of each register using a type constant to avoid
resorting to hardcoded values.

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/regs/macros.rs | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/regs/macros.rs 
b/drivers/gpu/nova-core/regs/macros.rs
index 
d7f09026390b4ccb1c969f2b29caf07fa9204a77..7cd013f3c90bbd8ca437d4072cae8f11d7946fcd
 100644
--- a/drivers/gpu/nova-core/regs/macros.rs
+++ b/drivers/gpu/nova-core/regs/macros.rs
@@ -78,7 +78,7 @@ macro_rules! register {
 $($fields:tt)*
 }
 ) => {
-register!(@common $name $(, $comment)?);
+register!(@common $name @ $offset $(, $comment)?);
 register!(@field_accessors $name { $($fields)* });
 register!(@io $name @ $offset);
 };
@@ -89,7 +89,7 @@ macro_rules! register {
 $($fields:tt)*
 }
 ) => {
-register!(@common $name $(, $comment)?);
+register!(@common $name @ $offset $(, $comment)?);
 register!(@field_accessors $name { $($fields)* });
 register!(@io$name @ + $offset);
 };
@@ -98,7 +98,7 @@ macro_rules! register {
 
 // Defines the wrapper `$name` type, as well as its relevant 
implementations (`Debug`, `BitOr`,
 // and conversion to regular `u32`).
-(@common $name:ident $(, $comment:literal)?) => {
+(@common $name:ident @ $offset:literal $(, $comment:literal)?) => {
 $(
 #[doc=$comment]
 )?
@@ -106,6 +106,11 @@ macro_rules! register {
 #[derive(Clone, Copy, Default)]
 pub(crate) struct $name(u32);
 
+#[allow(dead_code)]
+impl $name {
+pub(crate) const OFFSET: usize = $offset;
+}
+
 // TODO: display the raw hex value, then the value of all the fields. 
This requires
 // matching the fields, which will complexify the syntax 
considerably...
 impl ::core::fmt::Debug for $name {

-- 
2.49.0
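
The idea of carrying the offset as a type constant can be shown with a much-reduced macro. This is a sketch, not the nova-core `register!` macro: it only accepts `Name @ offset`, the register name `ScratchGroup05` and the helper `array_element_offset` are made up for the example, and the 4-byte stride assumes an array of 32-bit registers.

```rust
/// Heavily simplified stand-in for `register!`: defines a `u32` wrapper type
/// and exposes its MMIO offset as an associated constant.
macro_rules! register {
    ($name:ident @ $offset:literal) => {
        #[derive(Clone, Copy, Default)]
        #[allow(dead_code)]
        pub struct $name(u32);

        #[allow(dead_code)]
        impl $name {
            pub const OFFSET: usize = $offset;
        }
    };
}

// Offset taken from the scratch register definition in this series.
register!(ScratchGroup05 @ 0x0011_8234);

/// Hypothetical helper: offset of element `index` in an array of 32-bit
/// registers starting at `base`.
pub const fn array_element_offset(base: usize, index: usize) -> usize {
    base + index * std::mem::size_of::<u32>()
}
```

Call sites can then write `array_element_offset(ScratchGroup05::OFFSET, 1)` instead of repeating the raw address.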

Re: [PATCH v3] drm/amd/display: Fix exception handling in dm_validate_stream_and_context()

2025-06-12 Thread Melissa Wen

On 06/10, Markus Elfring wrote:
> From: Markus Elfring 
> Date: Tue, 10 Jun 2025 07:42:40 +0200
> 
> The label “cleanup” was used to jump to another pointer check despite of
> the detail in the implementation of the function 
> “dm_validate_stream_and_context”
> that it was determined already that corresponding variables contained
> still null pointers.
> 
> 1. Thus return directly if
>* a null pointer was passed for the function parameter “stream”
>  or
>* a call of the function “dc_create_plane_state” failed.
> 
> 2. Use a more appropriate label instead.
> 
> 3. Delete two questionable checks.
> 
> 4. Omit extra initialisations (for the variables “dc_state” and 
> “dc_plane_state”)
>which became unnecessary with this refactoring.
> 
> 
> This issue was detected by using the Coccinelle software.
> 

Hi Markus,

Thanks for working on this improvement.
Overall, LGTM. Some nits below.

> Reported-by: kernel test robot 
> Closes: 
> https://lore.kernel.org/oe-kbuild-all/202506100312.ms4xgazw-...@intel.com/

As the patch wasn't merged yet, don't add these two kernel-bot-related lines.

You only need to add these lines "If you fix the issue in a separate
patch/commit (i.e. not just a new version of the same patch/commit)"

> Fixes: 5468c36d6285 ("drm/amd/display: Filter Invalid 420 Modes for HDMI 
> TMDS")
> Signed-off-by: Markus Elfring 
> ---
> 
> V3:
> * Another function call was renamed.
> 
> * Recipient lists were adjusted once more.
> 
> V2:
> * The change suggestion was rebased on source files of
>   the software “Linux next-20250606”.
> 
> * Recipient lists were adjusted accordingly.
> 
> 
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 ---
>  1 file changed, 8 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 78816712afbb..7dc80b2fbd30 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -7473,19 +7473,19 @@ static enum dc_status 
> dm_validate_stream_and_context(struct dc *dc,
>   struct dc_stream_state *stream)
>  {
>   enum dc_status dc_result = DC_ERROR_UNEXPECTED;
> - struct dc_plane_state *dc_plane_state = NULL;
> - struct dc_state *dc_state = NULL;
> + struct dc_plane_state *dc_plane_state;
> + struct dc_state *dc_state;
>  
>   if (!stream)
> - goto cleanup;
> + return dc_result;
>  
>   dc_plane_state = dc_create_plane_state(dc);
>   if (!dc_plane_state)
> - goto cleanup;
> + return dc_result;
>  
>   dc_state = dc_state_create(dc, NULL);
>   if (!dc_state)
> - goto cleanup;
> + goto release_plane_state;
>  
>   /* populate stream to plane */
>   dc_plane_state->src_rect.height  = stream->src.height;
> @@ -7522,13 +7522,9 @@ static enum dc_status 
> dm_validate_stream_and_context(struct dc *dc,
>   if (dc_result == DC_OK)
>   dc_result = dc_validate_global_state(dc, dc_state, 
> DC_VALIDATE_MODE_ONLY);
>  
> -cleanup:
> - if (dc_state)
> - dc_state_release(dc_state);
> -
> - if (dc_plane_state)
> - dc_plane_state_release(dc_plane_state);
> -
> + dc_state_release(dc_state);

For readability, I would add an extra line here...

> +release_plane_state:
> + dc_plane_state_release(dc_plane_state);

and here.

With that, you can add my

Reviewed-by: Melissa Wen 

>   return dc_result;
>  }
>  
> -- 
> 2.49.0
>

[PATCH v5 06/23] gpu: nova-core: use absolute paths in register!() macro

2025-06-12 Thread Alexandre Courbot

Fix the paths that were not absolute to prevent a potential local module
from being picked up.

Reviewed-by: Lyude Paul 
Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/regs/macros.rs | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/nova-core/regs/macros.rs 
b/drivers/gpu/nova-core/regs/macros.rs
index 
7ecc70efb3cd723b673cd72915e72b8a4a009f06..40bf9346cd0699ede05cfddff5d39822c696c164
 100644
--- a/drivers/gpu/nova-core/regs/macros.rs
+++ b/drivers/gpu/nova-core/regs/macros.rs
@@ -114,7 +114,7 @@ fn fmt(&self, f: &mut ::core::fmt::Formatter<'_>) -> 
::core::fmt::Result {
 }
 }
 
-impl core::ops::BitOr for $name {
+impl ::core::ops::BitOr for $name {
 type Output = Self;
 
 fn bitor(self, rhs: Self) -> Self::Output {
@@ -161,7 +161,7 @@ impl $name {
 (@check_field_bounds $hi:tt:$lo:tt $field:ident as bool) => {
 #[allow(clippy::eq_op)]
 const _: () = {
-kernel::build_assert!(
+::kernel::build_assert!(
 $hi == $lo,
 concat!("boolean field `", stringify!($field), "` covers more 
than one bit")
 );
@@ -172,7 +172,7 @@ impl $name {
 (@check_field_bounds $hi:tt:$lo:tt $field:ident as $type:tt) => {
 #[allow(clippy::eq_op)]
 const _: () = {
-kernel::build_assert!(
+::kernel::build_assert!(
 $hi >= $lo,
 concat!("field `", stringify!($field), "`'s MSB is smaller 
than its LSB")
 );
@@ -234,7 +234,7 @@ impl $name {
 @leaf_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:ty
 { $process:expr } $to_type:ty => $res_type:ty $(, 
$comment:literal)?;
 ) => {
-kernel::macros::paste!(
+::kernel::macros::paste!(
 const [<$field:upper>]: ::core::ops::RangeInclusive<u8> = $lo..=$hi;
 const [<$field:upper _MASK>]: u32 = ((((1 << $hi) - 1) << 1) + 1) - ((1 << $lo) - 1);
 const [<$field:upper _SHIFT>]: u32 = Self::[<$field:upper _MASK>].trailing_zeros();
@@ -246,7 +246,7 @@ impl $name {
 )?
 #[inline]
 pub(crate) fn $field(self) -> $res_type {
-kernel::macros::paste!(
+::kernel::macros::paste!(
 const MASK: u32 = $name::[<$field:upper _MASK>];
 const SHIFT: u32 = $name::[<$field:upper _SHIFT>];
 );
@@ -255,7 +255,7 @@ pub(crate) fn $field(self) -> $res_type {
 $process(field)
 }
 
-kernel::macros::paste!(
+::kernel::macros::paste!(
 $(
 #[doc="Sets the value of this field:"]
 #[doc=$comment]

-- 
2.49.0
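
The `_MASK` expression visible in the context lines of this diff is written the way it is so that `$hi == 31` does not overflow: `1 << 32` would be out of range for a `u32`, whereas `(((1 << 31) - 1) << 1) + 1` builds the same all-ones value in steps. A standalone sketch of the same arithmetic, with illustrative function names that are not part of the driver:

```rust
/// Mask covering bits `hi..=lo` of a `u32`, using the same arithmetic as the
/// `_MASK` constant generated by `register!`.
pub const fn field_mask(hi: u32, lo: u32) -> u32 {
    ((((1u32 << hi) - 1) << 1) + 1) - ((1u32 << lo) - 1)
}

/// Shift of the field, derived from the mask like the `_SHIFT` constant.
pub const fn field_shift(hi: u32, lo: u32) -> u32 {
    field_mask(hi, lo).trailing_zeros()
}

/// Extracts the field `hi..=lo` from a raw register value.
pub const fn field_get(raw: u32, hi: u32, lo: u32) -> u32 {
    (raw & field_mask(hi, lo)) >> field_shift(hi, lo)
}
```

Worked through for the 15:8 field: `(1 << 15) - 1` is 0x7fff, shifting left and adding one gives 0xffff, and subtracting `(1 << 8) - 1` (0xff) leaves 0xff00.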

RE: [PATCH v6 01/11] mtd: core: always create master device

2025-06-12 Thread Usyskin, Alexander

> Subject: Re: [PATCH v6 01/11] mtd: core: always create master device
> 
> Hello,
> 
> On 11/06/2025 at 10:52:36 GMT, "Usyskin, Alexander"
>  wrote:
> 
> >> Subject: Re: [PATCH v6 01/11] mtd: core: always create master device
> >>
> >> - Original Message -
> >> > From: "Miquel Raynal" 
> >> >> On 6/10/25 05:54, Richard Weinberger wrote:
> >> >>> - Original Message -
> >>  From: "Alexander Usyskin" 
> >>  Richard, I've reproduced your setup (modulo that I must load mtdram
> >> manually)
> >>  and patch provided in this thread helps to fix the issue.
> >>  Can you apply and confirm?
> >> >>> Yes, it fixes the issue here! :-)
> >> >>>
> >> >>
> >> >> It doesn't seem to fix the issue if the partition data is in
> >> >> devicetree.
> >> >
> >> > I had a look at the patch again. The whole mtd core makes assumptions
> on
> >> > parenting, which is totally changed with this patch. There are so many
> >> > creative ways this can break, I don't believe we are going to continue
> >> > this route. I propose to revert the patch entirely for now. We need to
> >> > find another approach, I'm sorry.
> >>
> >> I think reverting is a valid option to consider if the issue turns out to 
> >> be
> >> a "back to the drawing board" problem.
> >>
> >> > Alexander, can you please remind me what was your initial problem? I
> >> > believe you needed to anchor runtime PM on the master device. Can you
> >> > please elaborate again? Why taking the controller as source (the
> >> > default, before your change) did not work? Also why was selecting
> >> > MTD_PARTITIONED_MASTER not an option for you? I'm trying to get to
> the
> >> > root of this change again, so we can find a solution fixing "the world"
> >> > (fast) and in a second time a way to address your problem.
> >>
> >> IIRC the problem is that depending on
> CONFIG_MTD_PARTITIONED_MASTER
> >> won't fly as PM needs to work with any configuration.
> >> And enforcing CONFIG_MTD_PARTITIONED_MASTER will break existing
> >> setups because mtd id's will change.
> >>
> >> On the other hand, how about placing the master device at the end
> >> of the available mtd id space if CONFIG_MTD_PARTITIONED_MASTER=n?
> >> A bit hacky but IMHO worth a thought.
> >>
> >> Thanks,
> >> //Richard
> >
> > The original problem was that general purpose OS never set
> > CONFIG_MTD_PARTITIONED_MASTER and we need valid device tree
> > to power management to work.
> >
> > We can return to V7 of this patch that only creates dummy master if
> > CONFIG_MTD_PARTITIONED_MASTER is off.
> > In this case the hierarchy remains the same.
> >
> > Miquel, can you re-review v7 and say if it worth to revert current version 
> > and
> > put v7 instead?
> 
> After taking inspiration from Richard's wisdom on IRC, we have another
> proposal. Let's drop the mtd_master class. We need an mtd device to be
> the master device, we already have one but we cannot keep *at the
> beginning* of the ID space under the CONFIG_MTD_PARTITIONED_MASTER=n
> configuration to avoid breaking userspace. So let's keep the master
> anyway, with the following specificities in the problematic case:
> - id is allocated from the max value downwards (avoids messing with
>   numbering)
> - mtd device is simply hidden (same user experience as before)
> 
> Apparently this second point, while not natively supported, is something
> the block world already does:
> https://elixir.bootlin.com/linux/v6.15.1/source/include/linux/blkdev.h#L88
> 
> What do you think?
> 
> Thanks,
> Miquèl

In general, this is fine with me: we have the parent mtd initialized and
participating in power management.

However, I can't see how to bend idr_alloc to allocate from the end, and the
corner case of a full idr range will also be problematic.

- - 
Thanks,
Sasha
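
To make the numbering proposal easier to reason about, here is a toy model of an ID space where regular devices take the lowest free ID and the hidden master takes the highest free one. It is a userspace sketch only: `IdSpace` and its methods are hypothetical, it does not model the kernel's IDR API, and it says nothing about whether `idr_alloc` can be made to behave this way.

```rust
use std::collections::BTreeSet;

/// Toy ID space covering `0..=max`.
pub struct IdSpace {
    max: u32,
    used: BTreeSet<u32>,
}

impl IdSpace {
    pub fn new(max: u32) -> Self {
        Self {
            max,
            used: BTreeSet::new(),
        }
    }

    /// Regular allocation: lowest free ID, so visible numbering is unchanged.
    pub fn alloc_lowest(&mut self) -> Option<u32> {
        let id = (0..=self.max).find(|id| !self.used.contains(id))?;
        self.used.insert(id);
        Some(id)
    }

    /// Hidden-master allocation: highest free ID, counted down from `max`.
    pub fn alloc_highest(&mut self) -> Option<u32> {
        let id = (0..=self.max).rev().find(|id| !self.used.contains(id))?;
        self.used.insert(id);
        Some(id)
    }
}
```

In this model the full-range corner case simply shows up as `None` from either method, so both kinds of allocation compete for the same last free slot.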

RE: [PATCH 03/13] drm/dp: Add argument for luminance range info in drm_edp_backlight_init

2025-06-12 Thread Kandpal, Suraj




> -Original Message-
> From: Murthy, Arun R 
> Sent: Thursday, June 12, 2025 11:45 AM
> To: Kandpal, Suraj ;
> nouv...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; intel-
> x...@lists.freedesktop.org; intel-...@lists.freedesktop.org
> Cc: Nautiyal, Ankit K 
> Subject: RE: [PATCH 03/13] drm/dp: Add argument for luminance range info in
> drm_edp_backlight_init
> 
> > -Original Message-
> > From: Kandpal, Suraj 
> > Sent: Monday, April 14, 2025 9:46 AM
> > To: nouv...@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> > intel- x...@lists.freedesktop.org; intel-...@lists.freedesktop.org
> > Cc: Nautiyal, Ankit K ; Murthy, Arun R
> > ; Kandpal, Suraj 
> > Subject: [PATCH 03/13] drm/dp: Add argument for luminance range info
> > in drm_edp_backlight_init
> >
> > Add new argument to drm_edp_backlight_init which gives the
> > drm_luminance_range_info struct which will be needed to set the min
> > and max values for backlight.
> >
> > Signed-off-by: Suraj Kandpal 
> > ---
> >  drivers/gpu/drm/display/drm_dp_helper.c   | 5 -
> >  drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c | 5 +++--
> >  drivers/gpu/drm/nouveau/nouveau_backlight.c   | 5 -
> >  include/drm/display/drm_dp_helper.h   | 1 +
> >  4 files changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/display/drm_dp_helper.c
> > b/drivers/gpu/drm/display/drm_dp_helper.c
> > index 99b27e5e3365..3b309ac5190b 100644
> > --- a/drivers/gpu/drm/display/drm_dp_helper.c
> > +++ b/drivers/gpu/drm/display/drm_dp_helper.c
> > @@ -4227,6 +4227,8 @@ drm_edp_backlight_probe_state(struct
> drm_dp_aux
> > *aux, struct drm_edp_backlight_i
> >   * interface.
> >   * @aux: The DP aux device to use for probing
> >   * @bl: The &drm_edp_backlight_info struct to fill out with
> > information on the backlight
> > + * @lr: The &drm_luminance_range_info struct which is used to get the
> > + min max when using *luminance override
> >   * @driver_pwm_freq_hz: Optional PWM frequency from the driver in hz
> >   * @edp_dpcd: A cached copy of the eDP DPCD
> >   * @current_level: Where to store the probed brightness level, if any
> > @@ -
> > 4243,6 +4245,7 @@ drm_edp_backlight_probe_state(struct drm_dp_aux
> > *aux, struct drm_edp_backlight_i
> >   */
> >  int
> >  drm_edp_backlight_init(struct drm_dp_aux *aux, struct
> > drm_edp_backlight_info *bl,
> > +  struct drm_luminance_range_info *lr,
> Would it be better to have this drm_luminance_range_info inside the
> drm_edp_backlight_info?

The thing is we fill drm_edp_backlight_info struct in drm_edp_backlight_init
Which means we would have to pass it anyways. So having a reference of this in
drm_edp_backlight_info didn't make sense.

Regards,
Suraj Kandpal

> 
> Thanks and Regards,
> Arun R Murthy
> 
> 
> >u16 driver_pwm_freq_hz, const u8
> > edp_dpcd[EDP_DISPLAY_CTL_CAP_SIZE],
> >u16 *current_level, u8 *current_mode, bool
> > need_luminance)  { @@ -4372,7 +4375,7 @@ int
> > drm_panel_dp_aux_backlight(struct drm_panel *panel, struct drm_dp_aux
> > *aux)
> >
> > bl->aux = aux;
> >
> > -   ret = drm_edp_backlight_init(aux, &bl->info, 0, edp_dpcd,
> > +   ret = drm_edp_backlight_init(aux, &bl->info, NULL, 0, edp_dpcd,
> >  ¤t_level, ¤t_mode, false);
> > if (ret < 0)
> > return ret;
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> > b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> > index d658e77b43d8..abb5ad4eef5f 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> > @@ -600,8 +600,9 @@ static int
> > intel_dp_aux_vesa_setup_backlight(struct
> > intel_connector *connector,
> > connector->base.base.id, connector->base.name);
> > } else {
> > ret = drm_edp_backlight_init(&intel_dp->aux, &panel-
> > >backlight.edp.vesa.info,
> > -panel->vbt.backlight.pwm_freq_hz,
> > intel_dp->edp_dpcd,
> > -&current_level, &current_mode,
> > false);
> > +luminance_range, panel-
> > >vbt.backlight.pwm_freq_hz,
> > +intel_dp->edp_dpcd,
> > &current_level, &current_mode,
> > +false);
> > if (ret < 0)
> > return ret;
> >
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c
> > b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> > index b938684a9422..a3681e101d56 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> > @@ -234,6 +234,8 @@ nv50_backlight_init(struct nouveau_backlight *bl,
> > const struct backlight_ops **ops)  {
> > struct nouveau_drm *drm = nouveau_drm(nv_encoder-
> > >base.

Re: [PATCH 1/1] drm/arm/malidp: Silence informational message

2025-06-12 Thread Liviu Dudau

Hi,

On Fri, May 23, 2025 at 08:40:41AM +0200, Alexander Stein wrote:
> When checking for unsupported expect an error is printed every time.
> This spams the log for platforms where this is expected, e.g. ls1028a
> having a Vivante (etnaviv) GPU and Mali display processor.
> 
> Signed-off-by: Alexander Stein 

Sorry for the delay in replying, I was on holiday when you've sent the patch and
I've only found it today.

Patch looks sensible, so Reviewed-by: Liviu Dudau 

I will push the patch today into drm-misc-next.

Best regards,
Liviu

> ---
> Every time glmark2-es2-wayland is started on a downstream kernel raises the 
> error:
> > [drm:malidp_format_mod_supported [mali_dp]] *ERROR* Unknown modifier (not 
> > Arm)
> 
>  drivers/gpu/drm/arm/malidp_planes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/arm/malidp_planes.c 
> b/drivers/gpu/drm/arm/malidp_planes.c
> index 34547edf1ee3c..87f2e5ee87907 100644
> --- a/drivers/gpu/drm/arm/malidp_planes.c
> +++ b/drivers/gpu/drm/arm/malidp_planes.c
> @@ -159,7 +159,7 @@ bool malidp_format_mod_supported(struct drm_device *drm,
>   }
>  
>   if (!fourcc_mod_is_vendor(modifier, ARM)) {
> - DRM_ERROR("Unknown modifier (not Arm)\n");
> + DRM_DEBUG_KMS("Unknown modifier (not Arm)\n");
>   return false;
>   }
>  
> -- 
> 2.43.0
> 

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯

Re: [PATCH v1 5/5] misc: fastrpc: Add missing unmapping user-requested remote heap

2025-06-12 Thread Ekansh Gupta




On 6/12/2025 1:35 PM, Dmitry Baryshkov wrote:
> On Thu, Jun 12, 2025 at 10:50:10AM +0530, Ekansh Gupta wrote:
>>
>> On 5/22/2025 5:43 PM, Dmitry Baryshkov wrote:
>>> On Thu, 22 May 2025 at 08:01, Ekansh Gupta wrote:
>>>>
>>>> On 5/19/2025 7:04 PM, Dmitry Baryshkov wrote:
>>>>> On Mon, May 19, 2025 at 04:28:34PM +0530, Ekansh Gupta wrote:
>>>>>> On 5/19/2025 4:22 PM, Dmitry Baryshkov wrote:
>>>>>>> On Tue, May 13, 2025 at 09:58:25AM +0530, Ekansh Gupta wrote:
>>>>>>>> User request for remote heap allocation is supported using ioctl
>>>>>>>> interface but support for unmap is missing. This could result in
>>>>>>>> memory leak issues. Add unmap user request support for remote heap.
>>>>>>> Can this memory be in use by the remote proc?
>>>>>> Remote heap allocation request is only intended for audioPD. Other PDs
>>>>>> running on DSP are not intended to use this request.
>>>>> 'Intended'. That's fine. I asked a different question: _can_ it be in
>>>>> use? What happens if userspace by mistake tries to unmap memory too
>>>>> early? Or if it happens intentionally, at some specific time during
>>>>> work.
>>>> If the unmap is restricted to audio daemon, then the unmap will only
>>>> happen if the remoteproc is no longer using this memory.
>>>>
>>>> But without this restriction, yes it is possible that some userspace process
>>>> calls unmap which tries to move the ownership back to HLOS while the
>>>> remoteproc is still using the memory. This might lead to memory access
>>>> problems.
>>> This needs to be fixed in the driver. We need to track which memory is
>>> being used by the remoteproc and unmap it once remoteproc stops using
>>> it, without additional userspace intervention.
>> If it's the audio daemon which is requesting for unmap then it basically means that
>> the remoteproc is no longer using the memory. Audio PD can request for both grow
>> and shrink operations for its dedicated heap. The case of grow is already supported
>> from fastrpc_req_mmap but the case of shrink (when remoteproc is no longer using the
>> memory) is not yet available. This memory is more specific to audio PD rather than
>> complete remoteproc.
>>
>> If we have to control this completely from driver then I see a problem in freeing/unmapping
>> the memory when the PD is no longer using the memory.
> What happens if userspace requests to free the memory that is still in
> use by the PD?
I understand your point, for this I was thinking to limit the unmap functionality to the process
that is already attached to audio PD on DSP, no other process will be able to map/unmap this
memory from userspace.

>
> How does PD signal the memory is no longer in use?
PD makes a reverse fastrpc request[1] to unmap the memory when it is no longer used.

[1] https://github.com/quic/fastrpc/blob/development/src/apps_mem_imp.c#L231
>

RE: [PATCH 03/13] drm/dp: Add argument for luminance range info in drm_edp_backlight_init

2025-06-12 Thread Kandpal, Suraj




> -----Original Message-----
> From: Murthy, Arun R 
> Sent: Thursday, June 12, 2025 4:43 PM
> To: Kandpal, Suraj ;
> nouv...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; intel-
> x...@lists.freedesktop.org; intel-...@lists.freedesktop.org
> Cc: Nautiyal, Ankit K 
> Subject: RE: [PATCH 03/13] drm/dp: Add argument for luminance range info in
> drm_edp_backlight_init
> 
> > > > -----Original Message-----
> > > > From: Kandpal, Suraj 
> > > > Sent: Monday, April 14, 2025 9:46 AM
> > > > To: nouv...@lists.freedesktop.org;
> > > > dri-devel@lists.freedesktop.org;
> > > > intel- x...@lists.freedesktop.org; intel-...@lists.freedesktop.org
> > > > Cc: Nautiyal, Ankit K ; Murthy, Arun R
> > > > ; Kandpal, Suraj
> > > > 
> > > > Subject: [PATCH 03/13] drm/dp: Add argument for luminance range
> > > > info in drm_edp_backlight_init
> > > >
> > > > Add new argument to drm_edp_backlight_init which gives the
> > > > drm_luminance_range_info struct which will be needed to set the
> > > > min and max values for backlight.
> > > >
> > > > Signed-off-by: Suraj Kandpal 
> > > > ---
> > > >  drivers/gpu/drm/display/drm_dp_helper.c   | 5 -
> > > >  drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c | 5 +++--
> > > >  drivers/gpu/drm/nouveau/nouveau_backlight.c   | 5 -
> > > >  include/drm/display/drm_dp_helper.h   | 1 +
> > > >  4 files changed, 12 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/display/drm_dp_helper.c
> > > > b/drivers/gpu/drm/display/drm_dp_helper.c
> > > > index 99b27e5e3365..3b309ac5190b 100644
> > > > --- a/drivers/gpu/drm/display/drm_dp_helper.c
> > > > +++ b/drivers/gpu/drm/display/drm_dp_helper.c
> > > > @@ -4227,6 +4227,8 @@ drm_edp_backlight_probe_state(struct
> > > drm_dp_aux
> > > > *aux, struct drm_edp_backlight_i
> > > >   * interface.
> > > >   * @aux: The DP aux device to use for probing
> > > >   * @bl: The &drm_edp_backlight_info struct to fill out with
> > > > information on the backlight
> > > > + * @lr: The &drm_luminance_range_info struct which is used to get
> > > > + the min max when using *luminance override
> > > >   * @driver_pwm_freq_hz: Optional PWM frequency from the driver in
> hz
> > > >   * @edp_dpcd: A cached copy of the eDP DPCD
> > > >   * @current_level: Where to store the probed brightness level, if
> > > > any @@ -
> > > > 4243,6 +4245,7 @@ drm_edp_backlight_probe_state(struct
> drm_dp_aux
> > > > *aux, struct drm_edp_backlight_i
> > > >   */
> > > >  int
> > > >  drm_edp_backlight_init(struct drm_dp_aux *aux, struct
> > > > drm_edp_backlight_info *bl,
> > > > +  struct drm_luminance_range_info *lr,
> > > Would it be better to have this drm_luminance_range_info inside the
> > > drm_edp_backlight_info?
> >
> > The thing is we fill drm_edp_backlight_info struct in
> > drm_edp_backlight_init Which means we would have to pass it anyways.
> > So having a reference of this in drm_edp_backlight_info didn't make sense.
> >
> The main intention for this ask is the two xx_info structs passed as arguments.
> Moreover luminance is part of backlight and this new element is _info and
> there already exists backlight_info. So wondering if luminance can be put
> inside backlight_info. The caller of this function can fill the luminance part
> and then make a call.
> 

I see your point, but the thing is that the luminance range is not something we
will be using later; it is only used to set the max level of brightness that can be set.
That being said, I do get your point on sending two xx_info structs here. I was
thinking we send only the u32 max luminance here since that's the only one we
actually use. Drivers can send the max luminance they like.
What do you think?

Regards,
Suraj Kandpal

> Thanks and Regards,
> Arun R Murthy
>

Re: [PATCH v5 6/6] drm/syncobj: Add a fast path to drm_syncobj_array_find

2025-06-12 Thread Christian König

On 6/12/25 12:58, Tvrtko Ursulin wrote:
> 
> On 12/06/2025 08:21, Christian König wrote:
>> On 6/11/25 17:29, Tvrtko Ursulin wrote:
>>>
>>> On 11/06/2025 15:21, Christian König wrote:
>>>> On 6/11/25 16:00, Tvrtko Ursulin wrote:
>>>>> Running the Cyberpunk 2077 benchmark we can observe that the lookup helper
>>>>> is relatively hot, but 97% of the calls are for a single object. (~3%
>>>>> for two points, and never more than three points. While a more trivial
>>>>> workload like vkmark under Plasma is even more skewed to single point
>>>>> lookups.)
>>>>>
>>>>> Therefore let's add a fast path to bypass the kmalloc_array/kfree and use a
>>>>> pre-allocated stack array for those cases.
>>>>
>>>> Have you considered using memdup_user()? That's using a separate bucket
>>>> IIRC and might give similar performance.
>>>
>>> I haven't but I can try it. I would be surprised if it made a (positive)
>>> difference though.
>>
>> Yeah, it's mostly for extra security I think.
> 
> On this topic, this discussion prompted me to quickly cook up some trivial 
> cleanups for amdgpu to use memdup_user & co where it was easy. Series is on 
> the mailing list but I did not copy you explicitly giving chance for someone 
> else to notice it and off load you a bit.

Yeah, I know. I always wanted to give that task to a student or intern :)

Alex is the one usually picking up amdgpu patches from the mailing list, but
I'm happy to add an rb if necessary.

>>> And I realised I need to repeat the benchmarks anyway, since in v4 I had to
>>> stop doing access_ok+__get_user, after kernel test robot let me know 64-bit
>>> get_user is not a thing on all platforms. I thought the gains are from
>>> avoiding allocations but, as you say, now I need to see if copy_from_user
>>> doesn't nullify them..
>>>
>>>> If that is still not sufficient I'm really wondering if we shouldn't have
>>>> a macro for doing this. It's a really common use case as far as I can see.
>>>
>>> Hmm macro for what exactly?
>>
>> Like a macro which uses an array on the stack for small (<4) number of 
>> values and k(v)malloc() for large ones.
>>
>> IIRC there is also a relatively new functionality which allows releasing the 
>> memory automatically when we leave the function.
> 
> Okay I will have a look at all those options. But it's going to the bottom of 
> my priority pile so it might be a while.

I'm also perfectly fine with the solution you came up with in those patches here
for now if that improves the performance at hand.

Just wanted to point out that it is possible that somebody has a use case
where X sync_obj handles are asked for timeline fences and that now becomes
slower because of that here.

Regards,
Christian.
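
For readers following along, the "stack array for small counts, heap allocation for large ones" idea discussed above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the DRM code: STACK_SLOTS, slots_get(), slots_put() and sum_first() are hypothetical names, and calloc()/free() stand in for kmalloc_array()/kfree().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical size of the on-stack fast-path array. */
#define STACK_SLOTS 4

/*
 * Return a buffer with room for `count` values: the caller's stack array when
 * it is large enough, a heap allocation otherwise. `*on_heap` records which
 * one was handed out so the caller knows whether it must be freed.
 */
uint32_t *slots_get(uint32_t *stack_buf, size_t stack_count,
		    size_t count, bool *on_heap)
{
	uint32_t *buf;

	*on_heap = false;
	if (count <= stack_count)
		return stack_buf;

	buf = calloc(count, sizeof(*buf));
	if (buf)
		*on_heap = true;
	return buf;
}

/* Release a buffer obtained from slots_get(); a no-op for the stack array. */
void slots_put(uint32_t *buf, bool on_heap)
{
	if (on_heap)
		free(buf);
}

/* Sum 1..count through the helper; returns -1 if the allocation failed. */
long sum_first(size_t count)
{
	uint32_t stack_buf[STACK_SLOTS];
	bool on_heap;
	uint32_t *buf = slots_get(stack_buf, STACK_SLOTS, count, &on_heap);
	long sum = 0;
	size_t i;

	if (!buf)
		return -1;
	for (i = 0; i < count; i++)
		buf[i] = (uint32_t)(i + 1);
	for (i = 0; i < count; i++)
		sum += buf[i];
	slots_put(buf, on_heap);
	return sum;
}
```

Using an explicit boolean instead of comparing pointers mirrors the v2 changelog note quoted below ("Make container freeing criteria clearer by using a boolean").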

> 
> Regards,
> 
> Tvrtko
> 
> Signed-off-by: Tvrtko Ursulin 
> Reviewed-by: Maíra Canal 
> ---
> v2:
>    * Added comments describing how the fast path arrays were sized.
>    * Make container freeing criteria clearer by using a boolean.
> ---
>    drivers/gpu/drm/drm_syncobj.c | 56 +++
>    1 file changed, 44 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index be5905dca87f..65c301852f0d 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -1259,6 +1259,8 @@ EXPORT_SYMBOL(drm_timeout_abs_to_jiffies);
>    static int drm_syncobj_array_find(struct drm_file *file_private,
>  u32 __user *handles,
>  uint32_t count,
> +  struct drm_syncobj **stack_syncobjs,
> +  u32 stack_count,
>  struct drm_syncobj ***syncobjs_out)
>    {
>    struct drm_syncobj **syncobjs;
> @@ -1268,9 +1270,13 @@ static int drm_syncobj_array_find(struct drm_file 
> *file_private,
>    if (!access_ok(handles, count * sizeof(*handles)))
>    return -EFAULT;
>    -    syncobjs = kmalloc_array(count, sizeof(*syncobjs), GFP_KERNEL);
> -    if (!syncobjs)
> -    return -ENOMEM;
> +    if (count > stack_count) {
> +    syncobjs = kmalloc_array(count, sizeof(*syncobjs), GFP_KERNEL);
> +    if (!syncobjs)
> +    return -ENOMEM;
> +    } else {
> +    syncobjs = stack_syncobjs;
> +    }
>      for (i = 0; i < count; i++) {
>    u32 handle;
> @@ -1292,25 +1298,31 @@ static int drm_syncobj_array_find(struct drm_file 
> *file_private,
>    err_put_syncobjs:
>    while (i-- > 0)
>    drm_syncobj_put(syncobjs[i]);
> -    kfree(syncobjs);
> +
> +    if (syncobjs != stack_syncobjs)
> +    kfree(syncobjs);
>      return ret;
>    }
>      static void drm_syncobj_array_free(struct drm_syncobj **syncobjs,
> -   uint32_t count)
> +   uint32_t count,
> +

RE: [PATCH v2 05/10] drm/xe/xe_late_bind_fw: Load late binding firmware

2025-06-12 Thread Usyskin, Alexander

> Subject: Re: [PATCH v2 05/10] drm/xe/xe_late_bind_fw: Load late binding
> firmware
> 
> 
> 
> On 6/6/2025 10:57 AM, Badal Nilawar wrote:
> > Load late binding firmware
> >
> > v2:
> >   - s/EAGAIN/EBUSY/
> >   - Flush worker in suspend and driver unload (Daniele)
> >
> > Signed-off-by: Badal Nilawar 
> > ---
> >   drivers/gpu/drm/xe/xe_late_bind_fw.c   | 121
> -
> >   drivers/gpu/drm/xe/xe_late_bind_fw.h   |   1 +
> >   drivers/gpu/drm/xe/xe_late_bind_fw_types.h |   5 +
> >   3 files changed, 126 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c
> b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> > index 0231f3dcfc18..7fe304c54374 100644
> > --- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
> > +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> > @@ -16,6 +16,16 @@
> >   #include "xe_late_bind_fw.h"
> >   #include "xe_pcode.h"
> >   #include "xe_pcode_api.h"
> > +#include "xe_pm.h"
> > +
> > +/*
> > + * The component should load quite quickly in most cases, but it could take
> > + * a bit. Using a very big timeout just to cover the worst case scenario
> > + */
> > +#define LB_INIT_TIMEOUT_MS 2
> > +
> > +#define LB_FW_LOAD_RETRY_MAXCOUNT 40
> > +#define LB_FW_LOAD_RETRY_PAUSE_MS 50
> 
> Are those retry values spec'd anywhere? For GSC we use those because the
> GSC specs say to retry in 50ms intervals for up to 2 secs to give time
> for the GSC to do proxy handling. Does it make sense to do the same in
> this case, given that there is no proxy involved?
> 
Here 50ms is too small, as we are waiting for other OS components to release the handle.
We usually have 3 times 2 sec in user-space, but that is too big for the kernel;
let's do 200ms steps, up to 6 sec.

-- 
Thanks,
Sasha
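
As a rough sketch of what "200ms steps, up to 6 sec" would mean for the retry loop (6000 / 200 = 30 attempts), here is a userspace C model. Everything in it is an assumption for illustration: the macro names, struct fake_dev and fake_push() are hypothetical stand-ins and not the xe driver's API, and the sleep is only accounted for rather than performed.

```c
#include <stdbool.h>

/* Values from the proposal above; the macro names are hypothetical. */
#define RETRY_PAUSE_MS	200
#define RETRY_TOTAL_MS	6000
#define RETRY_MAXCOUNT	(RETRY_TOTAL_MS / RETRY_PAUSE_MS)	/* 30 attempts */

/* Stand-in for the device: reports "busy" for the first busy_calls attempts. */
struct fake_dev {
	unsigned int busy_calls;
	unsigned int slept_ms;
};

/* Hypothetical push operation: false means "busy, try again later". */
bool fake_push(struct fake_dev *dev)
{
	if (dev->busy_calls) {
		dev->busy_calls--;
		return false;
	}
	return true;
}

/* Returns 0 on success, -1 once the whole retry budget has been used up. */
int push_with_retry(struct fake_dev *dev)
{
	unsigned int attempt;

	for (attempt = 0; attempt < RETRY_MAXCOUNT; attempt++) {
		if (fake_push(dev))
			return 0;
		/* A real driver would sleep here; the model only counts the time. */
		dev->slept_ms += RETRY_PAUSE_MS;
	}
	return -1;
}
```

With these numbers the loop gives up after 30 busy responses, i.e. after 6000 ms of accumulated pauses.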

Re: [PATCH] drm/bridge: ti-sn65dsi86: fix REFCLK setting

2025-06-12 Thread Michael Walle

Hi Jayesh,

> >>>> +   /*
> >>>> +    * After EN is deasserted and an external clock is detected, the bridge
> >>>> +    * will sample GPIO3:1 to determine its frequency. The driver will
> >>>> +    * overwrite this setting. But this is racy. Thus we have to wait a
> >>>> +    * couple of us. According to the datasheet the GPIO lines has to be
> >>>> +    * stable at least 5 us (td5) but it seems that is not enough and the
> >>>> +    * refclk frequency value is lost/overwritten by the bridge itself.
> >>>> +    * Waiting for 20us seems to work.
> >>>> +    */
> >>>> +   usleep_range(20, 30);
> >>>
> >>> It might be worth pointing at _where_ the driver overwrites this
> >>> setting, or maybe at least pointing to something that makes it easy to
> >>> find which exact bits you're talking about.
> > 
> > Yeah, Jayesh just pointed that out below. I'll add it to the comment.
> > 
> >>> This looks reasonable to me, though.
> >>
> >> I think we are talking about SN_DPPLL_SRC_REG[3:1] bits?
> > 
> > Yes.
> > 
> >> What exact mismatch are you observing in register value?
> > 
> > The one set by the chip itself vs the one from the driver, see below.
> > 
> >> I am assuming that you have a clock at REFCLK pin. For that:
> > 
> > Yes, I'm using an external clock.
> > 
> >> If refclk is described in devicetree node, then I see that
> >> the driver modifies it in every resume call based solely on the
> >> clock value in dts.
> > 
> > Exactly. But that is racy with what the chip itself is doing. I.e.
> > if you don't have that usleep() above, the chip will win the race
> > and the refclk frequency setting will be set according to the
> > external GPIOs (which is poorly described in the datasheet, btw),
> > regardless what the linux driver is setting (because that I2C write
> > happens too early).
>
> I am a little confused here.
> Won't it be opposite?
> If we have this delay here, GPIO will stabilize and set the register
> accordingly?

What do you mean by GPIO? Maybe we are talking about the very same
thing. From my understanding there are two "parties" involved here:

(1) the linux driver
(2) the bridge IC that comes out of reset when EN is asserted

And both are trying to write to the same setting.

From what I understand, (2) is running some kind of state
machine or even firmware that will figure out if there is a refclk
present. If so it will sample the GPIOs and set the refclk frequency
setting accordingly. This happens autonomously after EN is asserted.

Now there is also (1) which will assert the EN signal and shortly
afterwards try to write the refclk frequency setting.

With this patch we will delay the register write from (1) to a point
after (2) updated its refclk setting. Thus (1) will win.
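
To make the "last writer wins" reasoning concrete, here is a toy C model of the two writers. It is only a sketch under assumptions: struct refclk_reg and replay_writes() are hypothetical names, and the time at which the bridge performs its own update is not given in the datasheet, so any value passed for chip_write_us is purely illustrative.

```c
#include <stdbool.h>

/* Hypothetical register model: the final value and who wrote it last. */
struct refclk_reg {
	unsigned int value;
	bool written_by_driver;
};

/*
 * Replay the two writes in time order and return the final register state.
 * chip_write_us: when the bridge's autonomous, strap-based update lands
 * drv_write_us:  when the driver's I2C write lands
 * Both times are in microseconds after EN is asserted.
 */
struct refclk_reg replay_writes(unsigned int chip_write_us, unsigned int chip_val,
				unsigned int drv_write_us, unsigned int drv_val)
{
	struct refclk_reg reg = { 0, false };

	if (drv_write_us <= chip_write_us) {
		/* Driver wrote first (a tie is modelled pessimistically): chip overwrites it. */
		reg.value = chip_val;
		reg.written_by_driver = false;
	} else {
		/* Driver wrote after the chip's update, so its value sticks. */
		reg.value = drv_val;
		reg.written_by_driver = true;
	}
	return reg;
}
```

In this model, delaying the driver's write (the usleep_range() in the patch) is what moves it from the first branch to the second.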

> In the driver, I came across the case when we do not have refclk.
> (My platform does have a refclk, I am just removing the property from
> the dts node to check the effect of GPIO[3:1] in question because clock
> is not a required property for the bridge as per the bindings)

I'd expect that in this case the refclk is set according to the GPIO
strapping. Correct?

> In the ti_sn65dsi86_probe(), before we read SN_DEVICE_ID_REGS,
> when we go to resume(), we do not do enable_comms() that calls
> ti_sn_bridge_set_refclk_freq() to set SN_DPPLL_SRC_REG.
> I see that register read for SN_DEVICE_ID_REGS fails in that case.

Does it work with the property still in the device tree? I might try
that on my board later.

> Adding this delay fixes that issue. This made me think that we need
> the delay for GPIO to stabilize and set the refclk.
>
> Is my understanding incorrect?

Unfortunately, the datasheet is really sparse on details here, but
maybe the bridge needs some time after EN is asserted to respond on
the I2C bus in general. I'm basing my guesswork on the td5 timing
with the vague description "GPIO[3:1] stable after EN assertion". I
assume that somewhere during that time the chip will sample the
GPIOs and do something with that setting (presumably setting its
internal refclk frequency setting). FWIW there is also a td4
("GPIO[3:1] stable before EN assertion"). Both td4 and td5 make
me believe that this is not some setting which is sampled (and held)
at reset, otherwise td5 wouldn't make much sense.

> I am totally on board with your change especially after observing the
> above but is my understanding incorrect somewhere?
>
> Warm Regards,
> Jayesh
>
> > 
> >> If refclk is not described in dts, then this register is modified by the
> >> driver only when pre_enable() calls enable_comms(). Here also, the
> >> value depends on crtc_mode and the refclk_rate often would not be equal
> >> to the values in "ti_sn_bridge_dsiclk_lut" (supported frequencies), and
> >> you would fallback to "001" register value.
> > 
> >> Rest of time, I guess it depends on reading the status from GPIO and
> >> changing the register.
> > 
> > Not "re

Re: [PATCH v2 0/4] Support for Adreno X1-45 GPU

2025-06-12 Thread Jens Glathe


On 6/11/25 13:15, Akhil P Oommen wrote:


Add support for X1-45 GPU found in X1P41200 chipset (8 cpu core
version). X1-45 is a smaller version of X1-85 with lower core count and
smaller memories. From UMD perspective, this is similar to "FD735"
present in Mesa.


Hi Akhil,

when loading the driver (still without firmware files) I'm getting a 
speedbin warning:


[    3.318341] adreno 3d0.gpu: [drm:a6xx_gpu_init [msm]] *ERROR* missing support for speed-bin: 233. Some OPPs may not be supported by hardware


I've seen that there is a table for speed bins, this one is not there. 
Tested on a Lenovo ThinkBook 16 G7 QOY.


with best regards

Jens

Re: [PATCH] drm/ssd130x: fix ssd132x_clear_screen() columns

2025-06-12 Thread Javier Martinez Canillas

John Keeping  writes:

Hello John,

> The number of columns relates to the width, not the height.  Use the
> correct variable.
>
> Signed-off-by: John Keeping 
> ---

Pushed to drm-misc (drm-misc-fixes). Thanks!

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [PATCH 0/3] arm64: dts: rockchip: Fix HDMI output on RK3576

2025-06-12 Thread Nicolas Frattaroli

On Wednesday, 11 June 2025 23:47:46 Central European Summer Time Cristian 
Ciocaltea wrote:
> Since commit c871a311edf0 ("phy: rockchip: samsung-hdptx: Setup TMDS
> char rate via phy_configure_opts_hdmi"), the workaround of passing the
> PHY rate from DW HDMI QP bridge driver via phy_set_bus_width() became
> partially broken, unless the rate adjustment is done as with RK3588,
> i.e. by CCF from VOP2.
> 
> Attempting to fix this up at PHY level would not only introduce
> additional hacks, but it would also fail to adequately resolve the
> display issues that are a consequence of the system CRU limitations.
> 
> Therefore, let's proceed with the solution already implemented for
> RK3588, that is to make use of the HDMI PHY PLL as a more accurate DCLK
> source in VOP2.
> 
> It's worth noting a follow-up patch is going to drop the hack from the
> bridge driver altogether, while switching to HDMI PHY configuration API
> for setting up the TMDS character rate.
> 
> Signed-off-by: Cristian Ciocaltea 
> ---
> Cristian Ciocaltea (3):
>   dt-bindings: display: vop2: Add optional PLL clock property for rk3576
>   arm64: dts: rockchip: Enable HDMI PHY clk provider on rk3576
>   arm64: dts: rockchip: Add HDMI PHY PLL clock source to VOP2 on rk3576
> 
>  .../bindings/display/rockchip/rockchip-vop2.yaml   | 56 
> +-
>  arch/arm64/boot/dts/rockchip/rk3576.dtsi   |  7 ++-
>  2 files changed, 49 insertions(+), 14 deletions(-)
> ---
> base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
> change-id: 20250611-rk3576-hdmitx-fix-e030fbdb0d17
> 
> 
> ___
> Linux-rockchip mailing list
> linux-rockc...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
> 

For the whole series:

Tested-by: Nicolas Frattaroli 

This fixes HDMI output for 4K resolutions on my RK3576 ArmSoM Sige5.
The DTB checks and bindings checks pass as well.

Kind regards,
Nicolas Frattaroli

Re: [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback

2025-06-12 Thread Tvrtko Ursulin




On 03/06/2025 10:31, Philipp Stanner wrote:

> Since its inception, the GPU scheduler can leak memory if the driver
> calls drm_sched_fini() while there are still jobs in flight.
>
> The simplest way to solve this in a backwards compatible manner is by
> adding a new callback, drm_sched_backend_ops.cancel_job(), which
> instructs the driver to signal the hardware fence associated with the
> job. Afterwards, the scheduler can safely use the established free_job()
> callback for freeing the job.
>
> Implement the new backend_ops callback cancel_job().
>
> Suggested-by: Tvrtko Ursulin 


Please just add the link to the patch here (it is only in the cover letter):

Link: https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursu...@igalia.com/


And you probably want to take the unit test modifications from the same 
patch too. You could put them in the same patch or separate.


Regards,

Tvrtko


Signed-off-by: Philipp Stanner 
---
  drivers/gpu/drm/scheduler/sched_main.c | 34 --
  include/drm/gpu_scheduler.h|  9 +++
  2 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index d20726d7adf0..3f14f1e151fa 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
  }
  EXPORT_SYMBOL(drm_sched_init);
  
+static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler *sched)
+{
+   struct drm_sched_job *job, *tmp;
+
+   /* All other accessors are stopped. No locking necessary. */
+   list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
+   sched->ops->cancel_job(job);
+   list_del(&job->list);
+   sched->ops->free_job(job);
+   }
+}
+
  /**
   * drm_sched_fini - Destroy a gpu scheduler
   *
@@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
   *
   * Tears down and cleans up the scheduler.
   *
- * This stops submission of new jobs to the hardware through
- * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
- * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
- * There is no solution for this currently. Thus, it is up to the driver to make
- * sure that:
- *
- *  a) drm_sched_fini() is only called after for all submitted jobs
- * drm_sched_backend_ops.free_job() has been called or that
- *  b) the jobs for which drm_sched_backend_ops.free_job() has not been called
- * after drm_sched_fini() ran are freed manually.
- *
- * FIXME: Take care of the above problem and prevent this function from leaking
- * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
+ * This stops submission of new jobs to the hardware through &struct
+ * drm_sched_backend_ops.run_job. If &struct drm_sched_backend_ops.cancel_job
+ * is implemented, all jobs will be canceled through it and afterwards cleaned
+ * up through &struct drm_sched_backend_ops.free_job. If cancel_job is not
+ * implemented, memory could leak.
   */
  void drm_sched_fini(struct drm_gpu_scheduler *sched)
  {
@@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
/* Confirm no work left behind accessing device structures */
cancel_delayed_work_sync(&sched->work_tdr);
  
+	/* Avoid memory leaks if supported by the driver. */
+   if (sched->ops->cancel_job)
+   drm_sched_kill_remaining_jobs(sched);
+
if (sched->own_submit_wq)
destroy_workqueue(sched->submit_wq);
sched->ready = false;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index e62a7214e052..81dcbfc8c223 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
   * and it's time to clean it up.
 */
void (*free_job)(struct drm_sched_job *sched_job);
+
+   /**
+* @cancel_job: Used by the scheduler to guarantee remaining jobs' fences
+* get signaled in drm_sched_fini().
+*
+* Drivers need to signal the passed job's hardware fence with
+* -ECANCELED in this callback. They must not free the job.
+*/
+   void (*cancel_job)(struct drm_sched_job *sched_job);
  };
  
  /**
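
The teardown order described in the patch above (cancel each pending job so its fence gets signaled, then free it) can be modelled in userspace C roughly as follows. This is a sketch under assumptions, not the scheduler code: struct job, struct sched, demo_cancel_job(), demo_free_job() and sched_push() are hypothetical, a singly linked list walked front to back replaces the kernel's pending_list (which the patch walks in reverse), and a boolean plus error code stand in for the hardware fence.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical job: a boolean and an error code stand in for the HW fence. */
struct job {
	struct job *next;
	bool fence_signaled;
	int fence_error;
};

struct sched_ops {
	void (*cancel_job)(struct job *job);	/* optional, may be NULL */
	void (*free_job)(struct job *job);
};

struct sched {
	const struct sched_ops *ops;
	struct job *pending;
};

unsigned int jobs_freed;

/* Driver side: signal the fence with -ECANCELED, but do not free the job. */
void demo_cancel_job(struct job *job)
{
	if (!job->fence_signaled) {
		job->fence_error = -ECANCELED;
		job->fence_signaled = true;
	}
}

/* Driver side: freeing is only legal once the fence has been signaled. */
void demo_free_job(struct job *job)
{
	assert(job->fence_signaled);
	jobs_freed++;
	free(job);
}

/* Cancel and then free every job that is still pending. */
void sched_kill_remaining_jobs(struct sched *s)
{
	struct job *job, *next;

	for (job = s->pending; job; job = next) {
		next = job->next;	/* save the link before the job is freed */
		s->ops->cancel_job(job);
		s->ops->free_job(job);
	}
	s->pending = NULL;
}

/* Teardown: without a cancel_job callback the pending jobs are left alone. */
void sched_fini(struct sched *s)
{
	if (s->ops->cancel_job)
		sched_kill_remaining_jobs(s);
}

/* Test helper: queue one unsignaled job; returns -1 if allocation fails. */
int sched_push(struct sched *s)
{
	struct job *job = calloc(1, sizeof(*job));

	if (!job)
		return -1;
	job->next = s->pending;
	s->pending = job;
	return 0;
}
```

The optional-callback check in sched_fini() reflects the backwards-compatible design: drivers that do not implement cancel_job keep the old behaviour, including the possible leak.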

[PATCH v5 22/23] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS

2025-06-12 Thread Alexandre Courbot

The FWSEC firmware needs to be extracted from the VBIOS and patched with
the desired command, as well as the right signature. Do this so we are
ready to load and run this firmware into the GSP falcon and create the
FRTS region.

[joelagn...@nvidia.com: give better names to FalconAppifHdrV1's fields]

Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/nova-core/dma.rs|   3 -
 drivers/gpu/nova-core/firmware.rs   |   3 +-
 drivers/gpu/nova-core/firmware/fwsec.rs | 395 
 drivers/gpu/nova-core/gpu.rs|  15 +-
 drivers/gpu/nova-core/vbios.rs  |  30 ++-
 5 files changed, 431 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
index 4b063aaef65ec4e2f476fc5ce9dc25341b6660ca..1f1f8c378d8e2cf51edc772e7afe392e9c9c8831 100644
--- a/drivers/gpu/nova-core/dma.rs
+++ b/drivers/gpu/nova-core/dma.rs
@@ -2,9 +2,6 @@
 
 //! Simple DMA object wrapper.
 
-// To be removed when all code is used.
-#![expect(dead_code)]
-
 use core::ops::{Deref, DerefMut};
 
 use kernel::device;
diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 32553b5142d6623bdaaa9d480fbff11069198606..ae449a98dffb51e400db058c7368f0632b62f147 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -15,6 +15,8 @@
 use crate::gpu;
 use crate::gpu::Chipset;
 
+pub(crate) mod fwsec;
+
 pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
 
 /// Structure encapsulating the firmware blobs required for the GPU to operate.
@@ -114,7 +116,6 @@ impl SignedState for Signed {}
 /// This is module-local and meant for sub-modules to use internally.
 trait FirmwareSignature: AsRef<[u8]> {}
 
-#[expect(unused)]
 impl FirmwareDmaObject {
 /// Patches the firmware at offset `sig_base_img` with `signature`.
 fn patch_signature>(
diff --git a/drivers/gpu/nova-core/firmware/fwsec.rs b/drivers/gpu/nova-core/firmware/fwsec.rs
new file mode 100644
index 
..e02c051a682b790b1627ace42c7aaa214b8903df
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/fwsec.rs
@@ -0,0 +1,395 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FWSEC is a High Secure firmware that is extracted from the BIOS and performs the first step of
+//! the GSP startup by creating the WPR2 memory region and copying critical areas of the VBIOS into
+//! it after authenticating them, ensuring they haven't been tampered with. It runs on the GSP
+//! falcon.
+//!
+//! Before being run, it needs to be patched in two areas:
+//!
+//! - The command to be run, as this firmware can perform several tasks ;
+//! - The ucode signature, so the GSP falcon can run FWSEC in HS mode.
+
+use core::marker::PhantomData;
+use core::ops::Deref;
+
+use kernel::device::{self, Device};
+use kernel::num::PowerOfTwo;
+use kernel::prelude::*;
+use kernel::transmute::FromBytes;
+
+use crate::dma::DmaObject;
+use crate::driver::Bar0;
+use crate::falcon::gsp::Gsp;
+use crate::falcon::{Falcon, FalconBromParams, FalconFirmware, FalconLoadParams, FalconLoadTarget};
+use crate::firmware::{FalconUCodeDescV3, FirmwareDmaObject, FirmwareSignature, Signed, Unsigned};
+use crate::vbios::Vbios;
+
+const NVFW_FALCON_APPIF_ID_DMEMMAPPER: u32 = 0x4;
+
+#[repr(C)]
+#[derive(Debug)]
+struct FalconAppifHdrV1 {
+version: u8,
+header_size: u8,
+entry_size: u8,
+entry_count: u8,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FalconAppifHdrV1 {}
+
+#[repr(C, packed)]
+#[derive(Debug)]
+struct FalconAppifV1 {
+id: u32,
+dmem_base: u32,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FalconAppifV1 {}
+
+#[derive(Debug)]
+#[repr(C, packed)]
+struct FalconAppifDmemmapperV3 {
+signature: u32,
+version: u16,
+size: u16,
+cmd_in_buffer_offset: u32,
+cmd_in_buffer_size: u32,
+cmd_out_buffer_offset: u32,
+cmd_out_buffer_size: u32,
+nvf_img_data_buffer_offset: u32,
+nvf_img_data_buffer_size: u32,
+printf_buffer_hdr: u32,
+ucode_build_time_stamp: u32,
+ucode_signature: u32,
+init_cmd: u32,
+ucode_feature: u32,
+ucode_cmd_mask0: u32,
+ucode_cmd_mask1: u32,
+multi_tgt_tbl: u32,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FalconAppifDmemmapperV3 {}
+
+#[derive(Debug)]
+#[repr(C, packed)]
+struct ReadVbios {
+ver: u32,
+hdr: u32,
+addr: u64,
+size: u32,
+flags: u32,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for ReadVbios {}
+
+#[derive(Debug)]
+#[repr(C, packed)]
+struct FrtsRegion {
+ver: u32,
+hdr: u32,
+addr: u32,
+size: u32,
+ftype: u32,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FrtsRegion {}
+
+const NVFW_FRTS_CMD_REGION_TYPE_FB: u32 = 2;
+
+#[repr(C, packed)]
+struct FrtsCmd {
+read_vbios: ReadVbios,
+

Re: [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback

2025-06-12 Thread Philipp Stanner

On Thu, 2025-06-12 at 15:17 +0100, Tvrtko Ursulin wrote:
> 
> On 03/06/2025 10:31, Philipp Stanner wrote:
> > Since its inception, the GPU scheduler can leak memory if the
> > driver
> > calls drm_sched_fini() while there are still jobs in flight.
> > 
> > The simplest way to solve this in a backwards compatible manner is
> > by
> > adding a new callback, drm_sched_backend_ops.cancel_job(), which
> > instructs the driver to signal the hardware fence associated with
> > the
> > job. Afterwards, the scheduler can savely use the established
> > free_job()
> > callback for freeing the job.
> > 
> > Implement the new backend_ops callback cancel_job().
> > 
> > Suggested-by: Tvrtko Ursulin 
> 
> Please just add the link to the patch here (it is only in the cover
> letter):
> 
> Link: 
> https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursu...@igalia.com/

That I can do, sure

> 
> And you probably want to take the unit test modifications from the
> same 
> patch too. You could put them in the same patch or separate.

Necessary adjustments for the unit tests are already implemented and
are waiting for review separately, since this can be done independently
from this entire series:

https://lore.kernel.org/dri-devel/20250605134154.191764-2-pha...@kernel.org/


Thx
P.

> 
> Regards,
> 
> Tvrtko
> 
> > Signed-off-by: Philipp Stanner 
> > ---
> >   drivers/gpu/drm/scheduler/sched_main.c | 34 -
> > -
> >   include/drm/gpu_scheduler.h    |  9 +++
> >   2 files changed, 30 insertions(+), 13 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index d20726d7adf0..3f14f1e151fa 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler
> > *sched, const struct drm_sched_init_
> >   }
> >   EXPORT_SYMBOL(drm_sched_init);
> >   
> > +static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler
> > *sched)
> > +{
> > +   struct drm_sched_job *job, *tmp;
> > +
> > +   /* All other accessors are stopped. No locking necessary.
> > */
> > +   list_for_each_entry_safe_reverse(job, tmp, &sched-
> > >pending_list, list) {
> > +   sched->ops->cancel_job(job);
> > +   list_del(&job->list);
> > +   sched->ops->free_job(job);
> > +   }
> > +}
> > +
> >   /**
> >    * drm_sched_fini - Destroy a gpu scheduler
> >    *
> > @@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
> >    *
> >    * Tears down and cleans up the scheduler.
> >    *
> > - * This stops submission of new jobs to the hardware through
> > - * drm_sched_backend_ops.run_job(). Consequently,
> > drm_sched_backend_ops.free_job()
> > - * will not be called for all jobs still in
> > drm_gpu_scheduler.pending_list.
> > - * There is no solution for this currently. Thus, it is up to the
> > driver to make
> > - * sure that:
> > - *
> > - *  a) drm_sched_fini() is only called after for all submitted
> > jobs
> > - * drm_sched_backend_ops.free_job() has been called or that
> > - *  b) the jobs for which drm_sched_backend_ops.free_job() has not
> > been called
> > - * after drm_sched_fini() ran are freed manually.
> > - *
> > - * FIXME: Take care of the above problem and prevent this function
> > from leaking
> > - * the jobs in drm_gpu_scheduler.pending_list under any
> > circumstances.
> > + * This stops submission of new jobs to the hardware through
> > &struct
> > + * drm_sched_backend_ops.run_job. If &struct
> > drm_sched_backend_ops.cancel_job
> > + * is implemented, all jobs will be canceled through it and
> > afterwards cleaned
> > + * up through &struct drm_sched_backend_ops.free_job. If
> > cancel_job is not
> > + * implemented, memory could leak.
> >    */
> >   void drm_sched_fini(struct drm_gpu_scheduler *sched)
> >   {
> > @@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler
> > *sched)
> >     /* Confirm no work left behind accessing device structures
> > */
> >     cancel_delayed_work_sync(&sched->work_tdr);
> >   
> > +   /* Avoid memory leaks if supported by the driver. */
> > +   if (sched->ops->cancel_job)
> > +   drm_sched_kill_remaining_jobs(sched);
> > +
> >     if (sched->own_submit_wq)
> >     destroy_workqueue(sched->submit_wq);
> >     sched->ready = false;
> > diff --git a/include/drm/gpu_scheduler.h
> > b/include/drm/gpu_scheduler.h
> > index e62a7214e052..81dcbfc8c223 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
> >    * and it's time to clean it up.
> >      */
> >     void (*free_job)(struct drm_sched_job *sched_job);
> > +
> > +   /**
> > +* @cancel_job: Used by the scheduler to guarantee
> > remaining jobs' fences
> > +* get signaled in drm_sched_fini().
> > +*
> > +* Drivers need to signal the passed job's hardware fence
> > wi
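
Editor's sketch: the teardown order described in the quoted patch (cancel the job so its fence gets signaled, unlink it from the pending list, then free it, walking the list newest-first) can be modelled in plain userspace C. Everything below is a simplified stand-in under stated assumptions: `struct job`, `struct sched`, `sched_push()` and the `demo_*` callbacks are hypothetical names, not the DRM scheduler API, and no locking is modelled because the patch states that all other accessors are already stopped at this point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct drm_sched_job and drm_sched_backend_ops. */
struct job {
	struct job *prev, *next;
	int fence_signaled;
	int freed;
};

struct sched_ops {
	void (*cancel_job)(struct job *job);	/* optional, may be NULL */
	void (*free_job)(struct job *job);
};

struct sched {
	struct job *head, *tail;	/* pending list, oldest job first */
	const struct sched_ops *ops;
};

/* Append a job to the tail of the pending list. */
static void sched_push(struct sched *s, struct job *j)
{
	j->next = NULL;
	j->prev = s->tail;
	if (s->tail)
		s->tail->next = j;
	else
		s->head = j;
	s->tail = j;
}

/*
 * Walk the pending list newest-first. For each job: cancel it (which must
 * signal its fence), unlink it, then free it. Returns the number of jobs
 * cleaned up, or 0 if the driver provides no cancel_job callback, in which
 * case the jobs are left alone (and, as the patch notes, may leak).
 */
static int sched_kill_remaining_jobs(struct sched *s)
{
	struct job *job = s->tail, *tmp;
	int n = 0;

	if (!s->ops->cancel_job)
		return 0;

	while (job) {
		tmp = job->prev;	/* remember the next victim before unlinking */
		s->ops->cancel_job(job);
		s->tail = tmp;
		if (tmp)
			tmp->next = NULL;
		else
			s->head = NULL;
		job->prev = job->next = NULL;
		s->ops->free_job(job);
		job = tmp;
		n++;
	}
	return n;
}

/* Demo callbacks: free_job checks that cancel_job ran first. */
static void demo_cancel(struct job *job) { job->fence_signaled = 1; }
static void demo_free(struct job *job)
{
	assert(job->fence_signaled);
	job->freed = 1;
}

static const struct sched_ops demo_ops = { demo_cancel, demo_free };
```

Saving the predecessor before unlinking plays the role that the `_safe` list iterator plays in the quoted kernel code: the current entry can be removed without losing the position in the walk.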

[PULL] drm-misc-next

2025-06-12 Thread Maxime Ripard

Hi,

Here's the first drm-misc-next PR for 6.17.

Maxime

drm-misc-next-2025-06-12:
drm-misc-next for 6.17:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
 - atomic-helpers: Tune the enable / disable sequence
 - bridge: Add destroy hook
 - color management: Add helpers for hardware gamma LUT handling
 - HDMI: Add CEC handling, YUV420 output support
 - sched: tracing improvements

Driver Changes:
 - hyperv: Move out of simple-kms, drm_panic support
 - i915: drm_panel_follower support
 - imx: Add IMX8qxq Display Controller Support
 - lima: Add Rockchip RK3528 GPU Support
 - nouveau: fence handling cleanup
 - panfrost: Add BO labeling, 64-bit registers access
 - qaic: Add RAS Support
 - rz-du: Add RZ/V2H(P) Support, MIPI-DSI DCS Support
 - sun4i: Add H616 Support
 - tidss: Add TI AM62L Support
 - vkms: YUV and R* formats support

 - bridges:
   - Switched to reference counted drm_bridge allocations

 - panels:
   - Switched to reference counted drm_panel allocations
   - Add support for fwnode-based panel lookup
   - himax-hx8394: Support for Huiling hl055fhv028c
   - ilitek-ili9881c: Support for 7" Raspberry Pi 720x1280
   - panel-edp: Support for KDC KD116N3730A05, N160JCE-ELL CMN,
   - panel-simple: Support for AUO P238HAN01
   - st7701: Support for Winstar wf40eswaa6mnn0
   - visionox-rm69299: Support for rm69299-shift
   - New panels: Renesas R61307, Renesas R69328

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/misc/kernel.git 
tags/drm-misc-next-2025-06-12

for you to fetch changes up to c5b4393c5492555e35c08677a326c9c53b275abd:

  drm/file: add client id to drm_file_error (2025-06-12 14:33:51 +0200)




Adrián Larumbe (5):
  drm/panfrost: Add BO labelling to Panfrost
  drm/panfrost: Internally label some BOs
  drm/panfrost: Add driver IOCTL for setting BO labels
  drm/panfrost: show device-wide list of DRM GEM objects over DebugFS
  drm/panfrost: Fix panfrost device variable name in devfreq

André Almeida (1):
  drm: drm_auth: Convert mutex usage to guard(mutex)

Andy Shevchenko (2):
  accel/habanalabs: Switch to use %ptTs
  drm/panel: ili9341: Remove unused member from struct ili9341

Andy Yan (2):
  drm/rockchip: cleanup fb when drm_gem_fb_afbc_init failed
  drm/gem-framebuffer: log errors when gem size < afbc_size

Anusha Srivatsa (76):
  panel/panel-elida-kd35t133: Use refcounted allocation in place of 
devm_kzalloc()
  panel/feixin-k101-im2ba02: Use refcounted allocation in place of 
devm_kzalloc()
  panel/fy07024di26a30d: Use refcounted allocation in place of 
devm_kzalloc()
  panel/himax-hx83112a: Use refcounted allocation in place of devm_kzalloc()
  panel/himax-hx8394: Use refcounted allocation in place of devm_kzalloc()
  panel/ilitek-ili9322: Use refcounted allocation in place of devm_kzalloc()
  panel/ilitek-ili9341: Use refcounted allocation in place of devm_kzalloc()
  panel/panel-ili9805: Use refcounted allocation in place of devm_kzalloc()
  panel/ilitek-ili9806e: Use refcounted allocation in place of 
devm_kzalloc()
  panel/ilitek-ili9881c: Use refcounted allocation in place of 
devm_kzalloc()
  panel/innolux-ej030na: Use refcounted allocation in place of 
devm_kzalloc()
  panel/innolux-p079zca: Use refcounted allocation in place of 
devm_kzalloc()
  panel/jadard-jd9365da-h3: Use refcounted allocation in place of 
devm_

Re: [PATCH 03/12] mm/pagewalk: Skip dax pages in pagewalk

2025-06-12 Thread Lorenzo Stoakes

On Thu, May 29, 2025 at 04:32:04PM +1000, Alistair Popple wrote:
> Previously dax pages were skipped by the pagewalk code as pud_special() or
> vm_normal_page{_pmd}() would be false for DAX pages. Now that dax pages are
> refcounted normally that is no longer the case, so add explicit checks to
> skip them.
>
> Signed-off-by: Alistair Popple 
> ---
>  include/linux/memremap.h | 11 +++
>  mm/pagewalk.c| 12 ++--
>  2 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index 4aa1519..54e8b57 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -198,6 +198,17 @@ static inline bool folio_is_fsdax(const struct folio 
> *folio)
>   return is_fsdax_page(&folio->page);
>  }
>
> +static inline bool is_devdax_page(const struct page *page)
> +{
> + return is_zone_device_page(page) &&
> + page_pgmap(page)->type == MEMORY_DEVICE_GENERIC;
> +}
> +
> +static inline bool folio_is_devdax(const struct folio *folio)
> +{
> + return is_devdax_page(&folio->page);
> +}
> +
>  #ifdef CONFIG_ZONE_DEVICE
>  void zone_device_page_init(struct page *page);
>  void *memremap_pages(struct dev_pagemap *pgmap, int nid);
> diff --git a/mm/pagewalk.c b/mm/pagewalk.c
> index e478777..0dfb9c2 100644
> --- a/mm/pagewalk.c
> +++ b/mm/pagewalk.c
> @@ -884,6 +884,12 @@ struct folio *folio_walk_start(struct folio_walk *fw,
>* support PUD mappings in VM_PFNMAP|VM_MIXEDMAP VMAs.
>*/
>   page = pud_page(pud);
> +
> + if (is_devdax_page(page)) {

Is it only devdax that can exist at PUD leaf level, not fsdax?

> + spin_unlock(ptl);
> + goto not_found;
> + }
> +
>   goto found;
>   }
>
> @@ -911,7 +917,8 @@ struct folio *folio_walk_start(struct folio_walk *fw,
>   goto pte_table;
>   } else if (pmd_present(pmd)) {
>   page = vm_normal_page_pmd(vma, addr, pmd);
> - if (page) {
> + if (page && !is_devdax_page(page) &&
> + !is_fsdax_page(page)) {
>   goto found;
>   } else if ((flags & FW_ZEROPAGE) &&
>   is_huge_zero_pmd(pmd)) {
> @@ -945,7 +952,8 @@ struct folio *folio_walk_start(struct folio_walk *fw,
>
>   if (pte_present(pte)) {
>   page = vm_normal_page(vma, addr, pte);
> - if (page)
> + if (page && !is_devdax_page(page) &&
> + !is_fsdax_page(page))
>   goto found;
>   if ((flags & FW_ZEROPAGE) &&
>   is_zero_pfn(pte_pfn(pte))) {

I'm probably echoing others here (and I particularly like Dan's
suggestion of a helper function here, and Jason's suggestion of explanatory
comments), but it would also be nice to not have to do this separately at each
page table level and instead have something where you can say 'get me normal
non-dax page at page table level '.

> --
> git-series 0.9.1
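
Editor's sketch: the review comment above asks for a single helper instead of repeating the `!is_devdax_page() && !is_fsdax_page()` test at every page table level. A minimal userspace model of that idea follows. All names here (`struct mock_page`, `enum mock_memory_type`, `normal_non_dax_page()`) are hypothetical and exist only for illustration; this is not a proposal for the actual kernel helper, which would need to work on real `struct page` and per-level lookups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page model: only the memory type matters here. */
enum mock_memory_type {
	MOCK_MEM_NORMAL,
	MOCK_MEM_DEVICE_FS_DAX,
	MOCK_MEM_DEVICE_GENERIC,	/* device DAX in the quoted patch */
};

struct mock_page {
	enum mock_memory_type type;
};

static bool mock_is_fsdax_page(const struct mock_page *page)
{
	return page->type == MOCK_MEM_DEVICE_FS_DAX;
}

static bool mock_is_devdax_page(const struct mock_page *page)
{
	return page->type == MOCK_MEM_DEVICE_GENERIC;
}

/*
 * One place for the filter: returns the page only if it exists and is
 * neither fsdax nor devdax, otherwise NULL. Callers at each level then
 * test a single result instead of repeating the three-way condition.
 */
static const struct mock_page *normal_non_dax_page(const struct mock_page *page)
{
	if (!page || mock_is_devdax_page(page) || mock_is_fsdax_page(page))
		return NULL;
	return page;
}
```

With such a helper, the PMD and PTE branches in the quoted diff would each reduce to one NULL check, and the explanatory comment requested in the thread would live in one place.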

[PATCH v5 03/23] rust: sizes: add constants up to SZ_2G

2025-06-12 Thread Alexandre Courbot

nova-core will need to use SZ_1M, so make the remaining constants
available.

Reviewed-by: Boqun Feng 
Signed-off-by: Alexandre Courbot 
---
 rust/kernel/sizes.rs | 24 
 1 file changed, 24 insertions(+)

diff --git a/rust/kernel/sizes.rs b/rust/kernel/sizes.rs
index 
834c343e4170f507821b870e77afd08e2392911f..661e680d9330616478513a19fe2f87f9521516d7
 100644
--- a/rust/kernel/sizes.rs
+++ b/rust/kernel/sizes.rs
@@ -24,3 +24,27 @@
 pub const SZ_256K: usize = bindings::SZ_256K as usize;
 /// 0x00080000
 pub const SZ_512K: usize = bindings::SZ_512K as usize;
+/// 0x00100000
+pub const SZ_1M: usize = bindings::SZ_1M as usize;
+/// 0x00200000
+pub const SZ_2M: usize = bindings::SZ_2M as usize;
+/// 0x00400000
+pub const SZ_4M: usize = bindings::SZ_4M as usize;
+/// 0x00800000
+pub const SZ_8M: usize = bindings::SZ_8M as usize;
+/// 0x01000000
+pub const SZ_16M: usize = bindings::SZ_16M as usize;
+/// 0x02000000
+pub const SZ_32M: usize = bindings::SZ_32M as usize;
+/// 0x04000000
+pub const SZ_64M: usize = bindings::SZ_64M as usize;
+/// 0x08000000
+pub const SZ_128M: usize = bindings::SZ_128M as usize;
+/// 0x10000000
+pub const SZ_256M: usize = bindings::SZ_256M as usize;
+/// 0x20000000
+pub const SZ_512M: usize = bindings::SZ_512M as usize;
+/// 0x40000000
+pub const SZ_1G: usize = bindings::SZ_1G as usize;
+/// 0x80000000
+pub const SZ_2G: usize = bindings::SZ_2G as usize;

-- 
2.49.0
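
Editor's sketch: each constant in the patch above is a power of two, doubling from one line to the next (SZ_1M is 2^20 bytes, SZ_2G is 2^31 bytes). The small C helper below makes that arithmetic checkable; `sz_pow2()` is a hypothetical name used only here and is not part of the kernel or the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: the size in bytes of 2^shift.
 * The caller must keep shift below the bit width of size_t.
 */
static size_t sz_pow2(unsigned int shift)
{
	return (size_t)1 << shift;
}
```

For instance, `sz_pow2(20)` corresponds to the value documented for `SZ_1M` and `sz_pow2(31)` to the value documented for `SZ_2G`. The latter does not fit in a signed 32-bit integer, which is one reason an unsigned size type is the natural home for these constants.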

Re: [PATCH v5 2/2] drm/panthor: Make the timeout per-queue instead of per-job

2025-06-12 Thread Liviu Dudau

On Tue, Jun 03, 2025 at 10:49:32AM +0100, Ashley Smith wrote:
> The timeout logic provided by drm_sched leads to races when we try
> to suspend it while the drm_sched workqueue queues more jobs. Let's
> overhaul the timeout handling in panthor to have our own delayed work
> that's resumed/suspended when a group is resumed/suspended. When an
> actual timeout occurs, we call drm_sched_fault() to report it
> through drm_sched, still. But otherwise, the drm_sched timeout is
> disabled (set to MAX_SCHEDULE_TIMEOUT), which leaves us in control of
> how we protect modifications on the timer.
> 
> One issue seems to be when we call drm_sched_suspend_timeout() from
> both queue_run_job() and tick_work() which could lead to races due to
> drm_sched_suspend_timeout() not having a lock. Another issue seems to
> be in queue_run_job() if the group is not scheduled, we suspend the
> timeout again which undoes what drm_sched_job_begin() did when calling
> drm_sched_start_timeout(). So the timeout does not reset when a job
> is finished.
> 
> Co-developed-by: Boris Brezillon 
> Signed-off-by: Boris Brezillon 
> Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
> ---
I don't know why you have the dashes on the line above in the patch; they will
make git skip your S-o-b.


> Signed-off-by: Ashley Smith 

Reviewed-by: Liviu Dudau 

Best regards,
Liviu

> ---
>  drivers/gpu/drm/panthor/panthor_sched.c | 233 +---
>  1 file changed, 167 insertions(+), 66 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c 
> b/drivers/gpu/drm/panthor/panthor_sched.c
> index 65d8ae3dcac1..fb5a66ca5384 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -360,17 +360,20 @@ struct panthor_queue {
>   /** @entity: DRM scheduling entity used for this queue. */
>   struct drm_sched_entity entity;
>  
> - /**
> -  * @remaining_time: Time remaining before the job timeout expires.
> -  *
> -  * The job timeout is suspended when the queue is not scheduled by the
> -  * FW. Every time we suspend the timer, we need to save the remaining
> -  * time so we can restore it later on.
> -  */
> - unsigned long remaining_time;
> + /** @timeout: Queue timeout related fields. */
> + struct {
> + /** @timeout.work: Work executed when a queue timeout occurs. */
> + struct delayed_work work;
>  
> - /** @timeout_suspended: True if the job timeout was suspended. */
> - bool timeout_suspended;
> + /**
> +  * @timeout.remaining: Time remaining before a queue timeout.
> +  *
> +  * When the timer is running, this value is set to 
> MAX_SCHEDULE_TIMEOUT.
> +  * When the timer is suspended, it's set to the time remaining 
> when the
> +  * timer was suspended.
> +  */
> + unsigned long remaining;
> + } timeout;
>  
>   /**
>* @doorbell_id: Doorbell assigned to this queue.
> @@ -1031,6 +1034,82 @@ group_unbind_locked(struct panthor_group *group)
>   return 0;
>  }
>  
> +static bool
> +group_is_idle(struct panthor_group *group)
> +{
> + struct panthor_device *ptdev = group->ptdev;
> + u32 inactive_queues;
> +
> + if (group->csg_id >= 0)
> + return ptdev->scheduler->csg_slots[group->csg_id].idle;
> +
> + inactive_queues = group->idle_queues | group->blocked_queues;
> + return hweight32(inactive_queues) == group->queue_count;
> +}
> +
> +static void
> +queue_suspend_timeout(struct panthor_queue *queue)
> +{
> + unsigned long qtimeout, now;
> + struct panthor_group *group;
> + struct panthor_job *job;
> + bool timer_was_active;
> +
> + spin_lock(&queue->fence_ctx.lock);
> +
> + /* Already suspended, nothing to do. */
> + if (queue->timeout.remaining != MAX_SCHEDULE_TIMEOUT)
> + goto out_unlock;
> +
> + job = list_first_entry_or_null(&queue->fence_ctx.in_flight_jobs,
> +struct panthor_job, node);
> + group = job ? job->group : NULL;
> +
> + /* If the queue is blocked and the group is idle, we want the timer to
> +  * keep running because the group can't be unblocked by other queues,
> +  * so it has to come from an external source, and we want to timebox
> +  * this external signalling.
> +  */
> + if (group && (group->blocked_queues & BIT(job->queue_idx)) &&
> + group_is_idle(group))
> + goto out_unlock;
> +
> + now = jiffies;
> + qtimeout = queue->timeout.work.timer.expires;
> +
> + /* Cancel the timer. */
> + timer_was_active = cancel_delayed_work(&queue->timeout.work);
> + if (!timer_was_active || !job)
> + queue->timeout.remaining = msecs_to_jiffies(JOB_TIMEOUT_MS);
> + else if (time_after(qtimeout, now))
> + queue->timeout.remaining = qtimeout - now;
> + else
> +
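
Editor's sketch: the quoted `queue_suspend_timeout()` saves how much time is left on the timer, using `time_after()` so that the comparison stays correct when the jiffies counter wraps around. The userspace model below reimplements that comparison and the two branches visible in the quote. Assumptions are marked in the code: `SKETCH_TIMEOUT_TICKS` is a made-up value standing in for `msecs_to_jiffies(JOB_TIMEOUT_MS)`, and the final `return 0` is a placeholder, because the quoted patch is cut off before its last `else` branch.

```c
#include <assert.h>
#include <limits.h>

/* ASSUMPTION: arbitrary stand-in for msecs_to_jiffies(JOB_TIMEOUT_MS). */
#define SKETCH_TIMEOUT_TICKS 5000UL

/*
 * Wraparound-safe "a is later than b" for a free-running tick counter,
 * written the same way as the kernel's time_after(): the unsigned
 * difference is reinterpreted as signed, so the result is correct as
 * long as the two values are less than half the counter range apart.
 */
static int time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Remaining ticks to store when the timer is suspended. */
static unsigned long remaining_on_suspend(int timer_was_active, int have_job,
					  unsigned long expires,
					  unsigned long now)
{
	/* Timer not running or nothing in flight: keep a full timeout. */
	if (!timer_was_active || !have_job)
		return SKETCH_TIMEOUT_TICKS;

	/* Expiry still in the future: store the difference. */
	if (time_after_ul(expires, now))
		return expires - now;

	/* ASSUMPTION: the source is truncated here; 0 is only a placeholder. */
	return 0;
}
```

A plain `expires > now` comparison would give the wrong answer for an expiry that lies just past the wrap point of the counter, which is why the subtraction-based form is used.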

[PULL] drm-xe-fixes

2025-06-12 Thread Thomas Hellstrom

Hi Dave, Simona

Two fixes for 6.16-rc2.

Thanks,
Thomas

drm-xe-fixes-2025-06-12:
Driver Changes:
- Fix regression disallowing 64K SVM migration (Maarten)
- Use a bounce buffer for WA BB (Lucas)

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2025-06-12

for you to fetch changes up to 9c7632faad434c98f1f2cc06f3647a5a5d05ddbf:

  drm/xe/lrc: Use a temporary buffer for WA BB (2025-06-12 18:09:50 +0200)




Lucas De Marchi (1):
  drm/xe/lrc: Use a temporary buffer for WA BB

Maarten Lankhorst (1):
  drm/xe/svm: Fix regression disallowing 64K SVM migration

 drivers/gpu/drm/xe/xe_lrc.c | 24 
 drivers/gpu/drm/xe/xe_svm.c |  2 +-
 2 files changed, 21 insertions(+), 5 deletions(-)

Re: [PATCH 16/20] PCI: rockchip: switch to HWORD_UPDATE* macros

2025-06-12 Thread Bjorn Helgaas

On Thu, Jun 12, 2025 at 08:56:18PM +0200, Nicolas Frattaroli wrote:
> The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
> drivers that use constant masks.
> 
> The Rockchip PCI driver, like many other Rockchip drivers, has its very
> own definition of HIWORD_UPDATE.
> 
> Remove it, and replace its usage with either HWORD_UPDATE, or two new
> header local macros for setting/clearing a bit with the high mask, which
> use HWORD_UPDATE_CONST internally. In the process, ENCODE_LANES needed
> to be adjusted, as HWORD_UPDATE* shifts the value for us.
> 
> That this is equivalent was verified by first making all HWORD_UPDATE
> instances HWORD_UPDATE_CONST, then doing a static_assert() comparing it
> to the old macro (and for those with parameters, static_asserting for
> the full range of possible values with the old encode macro).
> 
> What we get out of this is compile time error checking to make sure the
> value actually fits in the mask, and that the mask fits in the register,
> and also generally less icky code that writes shifted values when it
> actually just meant to set and clear a handful of bits.
> 
> Signed-off-by: Nicolas Frattaroli 

Looks good to me.  I assume you want to merge these via a non-PCI tree
since this depends on patch 01/20.  PCI subject convention would
capitalize "Switch":

  PCI: rockchip: Switch to HWORD_UPDATE* macros

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/controller/pcie-rockchip.h | 35 
> +-
>  1 file changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-rockchip.h 
> b/drivers/pci/controller/pcie-rockchip.h
> index 
> 5864a20323f21a004bfee4ac6d3a1328c4ab4d8a..5f2e45f062d94cd75983f7ad0c5b708e5b4cfb6f
>  100644
> --- a/drivers/pci/controller/pcie-rockchip.h
> +++ b/drivers/pci/controller/pcie-rockchip.h
> @@ -11,6 +11,7 @@
>  #ifndef _PCIE_ROCKCHIP_H
>  #define _PCIE_ROCKCHIP_H
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -21,10 +22,10 @@
>   * The upper 16 bits of PCIE_CLIENT_CONFIG are a write mask for the lower 16
>   * bits.  This allows atomic updates of the register without locking.
>   */
> -#define HIWORD_UPDATE(mask, val) (((mask) << 16) | (val))
> -#define HIWORD_UPDATE_BIT(val)   HIWORD_UPDATE(val, val)
> +#define HWORD_SET_BIT(val)   (HWORD_UPDATE_CONST((val), 1))
> +#define HWORD_CLR_BIT(val)   (HWORD_UPDATE_CONST((val), 0))
>  
> -#define ENCODE_LANES(x)  ((((x) >> 1) & 3) << 4)
> +#define ENCODE_LANES(x)  ((((x) >> 1) & 3))
>  #define MAX_LANE_NUM 4
>  #define MAX_REGION_LIMIT 32
>  #define MIN_EP_APERTURE  28
> @@ -32,21 +33,21 @@
>  
>  #define PCIE_CLIENT_BASE 0x0
>  #define PCIE_CLIENT_CONFIG   (PCIE_CLIENT_BASE + 0x00)
> -#define   PCIE_CLIENT_CONF_ENABLE         HIWORD_UPDATE_BIT(0x0001)
> -#define   PCIE_CLIENT_CONF_DISABLE        HIWORD_UPDATE(0x0001, 0)
> -#define   PCIE_CLIENT_LINK_TRAIN_ENABLE   HIWORD_UPDATE_BIT(0x0002)
> -#define   PCIE_CLIENT_LINK_TRAIN_DISABLE  HIWORD_UPDATE(0x0002, 0)
> -#define   PCIE_CLIENT_ARI_ENABLE          HIWORD_UPDATE_BIT(0x0008)
> -#define   PCIE_CLIENT_CONF_LANE_NUM(x)    HIWORD_UPDATE(0x0030, ENCODE_LANES(x))
> -#define   PCIE_CLIENT_MODE_RC             HIWORD_UPDATE_BIT(0x0040)
> -#define   PCIE_CLIENT_MODE_EP             HIWORD_UPDATE(0x0040, 0)
> -#define   PCIE_CLIENT_GEN_SEL_1           HIWORD_UPDATE(0x0080, 0)
> -#define   PCIE_CLIENT_GEN_SEL_2           HIWORD_UPDATE_BIT(0x0080)
> +#define   PCIE_CLIENT_CONF_ENABLE         HWORD_SET_BIT(0x0001)
> +#define   PCIE_CLIENT_CONF_DISABLE        HWORD_CLR_BIT(0x0001)
> +#define   PCIE_CLIENT_LINK_TRAIN_ENABLE   HWORD_SET_BIT(0x0002)
> +#define   PCIE_CLIENT_LINK_TRAIN_DISABLE  HWORD_CLR_BIT(0x0002)
> +#define   PCIE_CLIENT_ARI_ENABLE          HWORD_SET_BIT(0x0008)
> +#define   PCIE_CLIENT_CONF_LANE_NUM(x)    HWORD_UPDATE(0x0030, ENCODE_LANES(x))
> +#define   PCIE_CLIENT_MODE_RC             HWORD_SET_BIT(0x0040)
> +#define   PCIE_CLIENT_MODE_EP             HWORD_CLR_BIT(0x0040)
> +#define   PCIE_CLIENT_GEN_SEL_1           HWORD_CLR_BIT(0x0080)
> +#define   PCIE_CLIENT_GEN_SEL_2           HWORD_SET_BIT(0x0080)
>  #define PCIE_CLIENT_LEGACY_INT_CTRL  (PCIE_CLIENT_BASE + 0x0c)
> -#define   PCIE_CLIENT_INT_IN_ASSERT       HIWORD_UPDATE_BIT(0x0002)
> -#define   PCIE_CLIENT_INT_IN_DEASSERT     HIWORD_UPDATE(0x0002, 0)
> -#define   PCIE_CLIENT_INT_PEND_ST_PEND    HIWORD_UPDATE_BIT(0x0001)
> -#define   PCIE_CLIENT_INT_PEND_ST_NORMAL  HIWORD_UPDATE(0x0001, 0)
> +#define   PCIE_CLIENT_INT_IN_ASSERT       HWORD_SET_BIT(0x0002)
> +#define   PCIE_CLIENT_INT_IN_DEASSERT     HWORD_CLR_BIT(0x0002)
> +#define   PCIE_CLIENT_INT_PEND_ST_PEND    HWORD_SET_BIT(0x0001)
> +#define   PCIE_CLIENT_
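
Editor's sketch: the commit message above says that `ENCODE_LANES` lost its `<< 4` because the new macros shift the value into the mask position themselves. The userspace model below checks that claim for the lane field. `OLD_HIWORD_UPDATE` and `OLD_ENCODE_LANES` copy the driver's removed definitions as quoted in the diff. `hiword_update()` and `reg_write()` are hypothetical helpers written for this note: the first only assumes the behaviour the commit message describes (value shifted to the lowest set bit of the mask, mask copied to the upper 16 bits), and the second models a register whose upper half is a write-enable mask for its lower half. `__builtin_ctz` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* The driver's removed hand-rolled macros, as quoted in the diff. */
#define OLD_HIWORD_UPDATE(mask, val)	(((mask) << 16) | (val))
#define OLD_ENCODE_LANES(x)		((((x) >> 1) & 3) << 4)

/*
 * Hypothetical model of a shifting hiword helper: moves val to the position
 * of the mask's lowest set bit and puts the mask into the upper 16 bits as
 * the write-enable pattern. mask must be non-zero and fit in 16 bits.
 */
static uint32_t hiword_update(uint32_t mask, uint32_t val)
{
	unsigned int shift = (unsigned int)__builtin_ctz(mask);

	return (mask << 16) | ((val << shift) & mask);
}

/*
 * Model of the hardware side: only the low-half bits whose write-enable
 * bit is set in the upper half of the written word are changed.
 */
static uint32_t reg_write(uint32_t reg, uint32_t word)
{
	uint32_t wmask = word >> 16;

	return ((reg & ~wmask) | (word & wmask)) & 0xffffu;
}
```

Because the write-enable mask travels with the data, one store changes only the selected bits and leaves the rest of the register untouched, which is what the header comment in the diff means by atomic updates without locking.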

Re: [PATCH 17/20] PCI: dw-rockchip: switch to HWORD_UPDATE macro

2025-06-12 Thread Bjorn Helgaas

On Thu, Jun 12, 2025 at 08:56:19PM +0200, Nicolas Frattaroli wrote:
> The era of hand-rolled HIWORD_UPDATE macros is over.
> 
> Like many other Rockchip drivers, pcie-dw-rockchip brings with it its
> very own flavour of HIWORD_UPDATE. It's occasionally used without a
> constant mask, which complicates matters. HIWORD_UPDATE_BIT is a
> confusingly named addition, as it doesn't update the bit, it actually
> sets all bits in the value to 1. HIWORD_DISABLE_BIT is similarly
> confusing; it disables several bits at once by using the value as a mask
> and the inverse of value as the value, and the "disabling only these"
> effect comes from the hardware actually using the mask. The more obvious
> approach would've been HIWORD_UPDATE(val, 0) in my opinion.
> 
> This is part of the motivation why this patch uses bitfield.h's
> HWORD_UPDATE instead, where possible. HWORD_UPDATE requires a constant
> bit mask, which isn't possible where the irq number is used to generate
> a bit mask. For that purpose, we replace it with a more robust macro
> than what was there but that should also bring close to zero runtime
> overhead: we actually mask the IRQ number to make sure we're not writing
> garbage.
> 
> For the remaining bits, there also are some caveats. For starters, the
> PCIE_CLIENT_ENABLE_LTSSM and PCIE_CLIENT_DISABLE_LTSSM were named in a
> manner that isn't quite truthful to what they do. Their modification
> actually spans not just the LTSSM bit but also another bit, flipping
> only the LTSSM one, but keeping the other (which according to the TRM
> has a reset value of 0) always enabled. This other bit is reserved as of
> the IP version RK3588 uses at least, and I have my doubts as to whether
> it was meant to be set, and whether it was meant to be set in that code
> path. Either way, it's confusing.
> 
> Replace it with just writing either 1 or 0 to the LTSSM bit, using the
> new HWORD_UPDATE macro from bitfield.h, which grants us the benefit of
> better compile-time error checking.
> 
> The change of no longer setting the reserved bit doesn't appear to
> change the behaviour on RK3568 in RC mode, where it's not marked as
> reserved.
> 
> PCIE_CLIENT_RC_MODE/PCIE_CLIENT_EP_MODE was another field that wasn't
> super clear on what the bit field modification actually is. As far as I
> can tell, switching to RC mode doesn't actually write the correct value
> to the field if any of its bits have been set previously, as it only
> updates one bit of a 4 bit field.
> 
> Replace it by actually writing the full values to the field, using the
> new HWORD_UPDATE macro, which grants us the benefit of better
> compile-time error checking.
> 
> This patch was tested on RK3588 (PCIe3 x4 controller), RK3576 (PCIe2 x1
> controller) and RK3568 (PCIe x2 controller), all in RC mode.
> 
> Signed-off-by: Nicolas Frattaroli 

  PCI: dw-rockchip: Switch to HWORD_UPDATE macro

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/controller/dwc/pcie-dw-rockchip.c | 39 
> ---
>  1 file changed, 24 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c 
> b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> index 
> 93171a3928794915ad1e8c03c017ce0afc1f9169..29363346f2cd9774d8d2e06cd76f7f82e6a7fecf
>  100644
> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> @@ -29,18 +29,19 @@
>   * The upper 16 bits of PCIE_CLIENT_CONFIG are a write
>   * mask for the lower 16 bits.
>   */
> -#define HIWORD_UPDATE(mask, val) (((mask) << 16) | (val))
> -#define HIWORD_UPDATE_BIT(val)   HIWORD_UPDATE(val, val)
> -#define HIWORD_DISABLE_BIT(val)  HIWORD_UPDATE(val, ~val)
>  
>  #define to_rockchip_pcie(x) dev_get_drvdata((x)->dev)
>  
>  /* General Control Register */
>  #define PCIE_CLIENT_GENERAL_CON  0x0
> -#define  PCIE_CLIENT_RC_MODE         HIWORD_UPDATE_BIT(0x40)
> -#define  PCIE_CLIENT_EP_MODE         HIWORD_UPDATE(0xf0, 0x0)
> -#define  PCIE_CLIENT_ENABLE_LTSSM    HIWORD_UPDATE_BIT(0xc)
> -#define  PCIE_CLIENT_DISABLE_LTSSM   HIWORD_UPDATE(0x0c, 0x8)
> +#define  PCIE_CLIENT_MODE_MASK       GENMASK(7, 4)
> +#define  PCIE_CLIENT_MODE_EP         0x0U
> +#define  PCIE_CLIENT_MODE_LEGACY     0x1U
> +#define  PCIE_CLIENT_MODE_RC         0x4U
> +#define  PCIE_CLIENT_SET_MODE(x)     HWORD_UPDATE(PCIE_CLIENT_MODE_MASK, (x))
> +#define  PCIE_CLIENT_LD_RQ_RST_GRT   HWORD_UPDATE(BIT(3), 1)
> +#define  PCIE_CLIENT_ENABLE_LTSSM    HWORD_UPDATE(BIT(2), 1)
> +#define  PCIE_CLIENT_DISABLE_LTSSM   HWORD_UPDATE(BIT(2), 0)
>  
>  /* Interrupt Status Register Related to Legacy Interrupt */
>  #define PCIE_CLIENT_INTR_STATUS_LEGACY   0x8
> @@ -52,6 +53,11 @@
>  
>  /* Interrupt Mask Register Related to Legacy Interrupt */
>  #define PCIE_CLIENT_INTR_MASK_LEGACY 0x1c
> +#define  PCIE_INTR_MASK  GENMASK(7, 0)
> +#define  PCIE_INTR_CLAMP(_x) ((BIT((_x)) & PCIE_INTR_MASK))
> +#define  PCIE_INTR_LEGACY_MASK
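
Editor's sketch: two observations from the commit message above can be checked with a small register model. First, the old `PCIE_CLIENT_RC_MODE` write-enabled only bit 6 of the 4-bit mode field, so a previously set bit in that field would survive the write. Second, the old LTSSM disable value also set bit 3 while clearing bit 2. The constants below are taken from the quoted diff; `reg_write()` and `set_mode()` are hypothetical helpers for this note, and the starting register values are invented for the demonstration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted diff. */
#define MODE_MASK		0x00f0u		/* GENMASK(7, 4) */
#define MODE_LEGACY		0x1u
#define MODE_RC			0x4u
#define OLD_RC_MODE		((0x40u << 16) | 0x40u)	/* HIWORD_UPDATE_BIT(0x40) */
#define OLD_DISABLE_LTSSM	((0x0cu << 16) | 0x08u)	/* HIWORD_UPDATE(0x0c, 0x8) */
#define NEW_DISABLE_LTSSM	((0x04u << 16) | 0x00u)	/* only bit 2, written as 0 */

/* Hardware model: upper 16 bits of the word write-enable the lower 16. */
static uint32_t reg_write(uint32_t reg, uint32_t word)
{
	uint32_t wmask = word >> 16;

	return ((reg & ~wmask) | (word & wmask)) & 0xffffu;
}

/* Hypothetical full-field mode write: enables and writes all four bits. */
static uint32_t set_mode(uint32_t mode)
{
	return (MODE_MASK << 16) | ((mode << 4) & MODE_MASK);
}

/* Extract the mode field from a register value. */
static uint32_t get_mode(uint32_t reg)
{
	return (reg & MODE_MASK) >> 4;
}
```

Starting from a register whose mode field holds the legacy value 0x1, the old single-bit write leaves the field at 0x5, while the full-field write yields 0x4. This only illustrates the bit arithmetic; whether hardware ever reaches such a starting state is not something the sketch can show.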

Re: [PATCH 15/20] net: stmmac: dwmac-rk: switch to HWORD_UPDATE macro

2025-06-12 Thread Andrew Lunn

On Thu, Jun 12, 2025 at 08:56:17PM +0200, Nicolas Frattaroli wrote:
> The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
> drivers that use constant masks.
> 
> Like many other Rockchip drivers, dwmac-rk has its own HIWORD_UPDATE
> macro. Its semantics allow us to redefine it as a wrapper to the shared
> bitfield.h HWORD_UPDATE macros though.
> 
> Replace the implementation of this driver's very own HIWORD_UPDATE macro
> with an instance of HWORD_UPDATE from bitfield.h. This keeps the diff
> easily reviewable, while giving us more compile-time error checking.
> 
> The related GRF_BIT macro is left alone for now; any attempt to rework
> the code to not use its own solution here would likely end up harder to
> review and less pretty for the time being.
> 
> Signed-off-by: Nicolas Frattaroli 

Please split this out into a patch for net-next. Also, Russell King
has just posted a number of patches for this driver, so you will
probably want to wait for them to be merged, so you post something
which will be merged without any fuzz.

Andrew

Re: [PATCH 01/20] bitfield: introduce HWORD_UPDATE bitfield macros

2025-06-12 Thread Jakub Kicinski

On Thu, 12 Jun 2025 20:56:03 +0200 Nicolas Frattaroli wrote:
> Hardware of various vendors, but very notably Rockchip, often uses
> 32-bit registers where the upper 16-bit half of the register is a
> write-enable mask for the lower half.

Please limit the spread of this weirdness to a rockchip or "hiword"
specific header. To a normal reader of bitfield.h these macros will
be equally confusing and useless.

Re: [PATCH 01/20] bitfield: introduce HWORD_UPDATE bitfield macros

2025-06-12 Thread Nicolas Frattaroli

On Thursday, 12 June 2025 21:44:15 Central European Summer Time Jakub Kicinski 
wrote:
> On Thu, 12 Jun 2025 20:56:03 +0200 Nicolas Frattaroli wrote:
> > Hardware of various vendors, but very notably Rockchip, often uses
> > 32-bit registers where the upper 16-bit half of the register is a
> > write-enable mask for the lower half.
> 
> Please limit the spread of this weirdness to a rockchip or "hiword"
> specific header. To a normal reader of bitfield.h these macros will
> be equally confusing and useless.
> 

That is how this change started out, and then a different maintainer told
me that this is a commonly used thing (see: the sunplus patch), and
Rockchip just happens to have a lot of these with consistent naming.

I believe normal readers of bitfield.h will be much more confused by the
undocumented concatenating macro soup at the end, but maybe that's just
me.

Best regards,
Nicolas Frattaroli

Re: [PATCH 16/20] PCI: rockchip: switch to HWORD_UPDATE* macros

2025-06-12 Thread Yury Norov

On Thu, Jun 12, 2025 at 02:37:28PM -0500, Bjorn Helgaas wrote:
> On Thu, Jun 12, 2025 at 08:56:18PM +0200, Nicolas Frattaroli wrote:
> > The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
> > drivers that use constant masks.
> > 
> > The Rockchip PCI driver, like many other Rockchip drivers, has its very
> > own definition of HIWORD_UPDATE.
> > 
> > Remove it, and replace its usage with either HWORD_UPDATE, or two new
> > header local macros for setting/clearing a bit with the high mask, which
> > use HWORD_UPDATE_CONST internally. In the process, ENCODE_LANES needed
> > to be adjusted, as HWORD_UPDATE* shifts the value for us.
> > 
> > That this is equivalent was verified by first making all HWORD_UPDATE
> > instances HWORD_UPDATE_CONST, then doing a static_assert() comparing it
> > to the old macro (and for those with parameters, static_asserting for
> > the full range of possible values with the old encode macro).
> > 
> > What we get out of this is compile time error checking to make sure the
> > value actually fits in the mask, and that the mask fits in the register,
> > and also generally less icky code that writes shifted values when it
> > actually just meant to set and clear a handful of bits.
> > 
> > Signed-off-by: Nicolas Frattaroli 
> 
> Looks good to me.  I assume you want to merge these via a non-PCI tree
> since this depends on patch 01/20.  PCI subject convention would
> capitalize "Switch":

Hi,

I'd like to take patch #1 and the explicitly acked following patches in
my bitmap-for-next. Those who would prefer to move the material in their
per-driver branches (like net, as mentioned by Andrew Lunn) can wait
until the end of the next merge window, and then apply the patches cleanly.

Thanks,
Yury

>   PCI: rockchip: Switch to HWORD_UPDATE* macros
> 
> Acked-by: Bjorn Helgaas 
> 
> > ---
> >  drivers/pci/controller/pcie-rockchip.h | 35 
> > +-
> >  1 file changed, 18 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/pci/controller/pcie-rockchip.h 
> > b/drivers/pci/controller/pcie-rockchip.h
> > index 
> > 5864a20323f21a004bfee4ac6d3a1328c4ab4d8a..5f2e45f062d94cd75983f7ad0c5b708e5b4cfb6f
> >  100644
> > --- a/drivers/pci/controller/pcie-rockchip.h
> > +++ b/drivers/pci/controller/pcie-rockchip.h
> > @@ -11,6 +11,7 @@
> >  #ifndef _PCIE_ROCKCHIP_H
> >  #define _PCIE_ROCKCHIP_H
> >  
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -21,10 +22,10 @@
> >   * The upper 16 bits of PCIE_CLIENT_CONFIG are a write mask for the lower 
> > 16
> >   * bits.  This allows atomic updates of the register without locking.
> >   */
> > -#define HIWORD_UPDATE(mask, val)   (((mask) << 16) | (val))
> > -#define HIWORD_UPDATE_BIT(val) HIWORD_UPDATE(val, val)
> > +#define HWORD_SET_BIT(val) (HWORD_UPDATE_CONST((val), 1))
> > +#define HWORD_CLR_BIT(val) (HWORD_UPDATE_CONST((val), 0))
> >  
> > -#define ENCODE_LANES(x)    ((((x) >> 1) & 3) << 4)
> > +#define ENCODE_LANES(x)    ((((x) >> 1) & 3))
> >  #define MAX_LANE_NUM   4
> >  #define MAX_REGION_LIMIT   32
> >  #define MIN_EP_APERTURE    28
> > @@ -32,21 +33,21 @@
> >  
> >  #define PCIE_CLIENT_BASE   0x0
> >  #define PCIE_CLIENT_CONFIG (PCIE_CLIENT_BASE + 0x00)
> > -#define   PCIE_CLIENT_CONF_ENABLE    HIWORD_UPDATE_BIT(0x0001)
> > -#define   PCIE_CLIENT_CONF_DISABLE   HIWORD_UPDATE(0x0001, 0)
> > -#define   PCIE_CLIENT_LINK_TRAIN_ENABLE  HIWORD_UPDATE_BIT(0x0002)
> > -#define   PCIE_CLIENT_LINK_TRAIN_DISABLE  HIWORD_UPDATE(0x0002, 0)
> > -#define   PCIE_CLIENT_ARI_ENABLE HIWORD_UPDATE_BIT(0x0008)
> > -#define   PCIE_CLIENT_CONF_LANE_NUM(x)   HIWORD_UPDATE(0x0030, 
> > ENCODE_LANES(x))
> > -#define   PCIE_CLIENT_MODE_RC    HIWORD_UPDATE_BIT(0x0040)
> > -#define   PCIE_CLIENT_MODE_EP    HIWORD_UPDATE(0x0040, 0)
> > -#define   PCIE_CLIENT_GEN_SEL_1  HIWORD_UPDATE(0x0080, 0)
> > -#define   PCIE_CLIENT_GEN_SEL_2  HIWORD_UPDATE_BIT(0x0080)
> > +#define   PCIE_CLIENT_CONF_ENABLE  HWORD_SET_BIT(0x0001)
> > +#define   PCIE_CLIENT_CONF_DISABLE HWORD_CLR_BIT(0x0001)
> > +#define   PCIE_CLIENT_LINK_TRAIN_ENABLE    HWORD_SET_BIT(0x0002)
> > +#define   PCIE_CLIENT_LINK_TRAIN_DISABLE   HWORD_CLR_BIT(0x0002)
> > +#define   PCIE_CLIENT_ARI_ENABLE   HWORD_SET_BIT(0x0008)
> > +#define   PCIE_CLIENT_CONF_LANE_NUM(x) HWORD_UPDATE(0x0030, 
> > ENCODE_LANES(x))
> > +#define   PCIE_CLIENT_MODE_RC  HWORD_SET_BIT(0x0040)
> > +#define   PCIE_CLIENT_MODE_EP  HWORD_CLR_BIT(0x0040)
> > +#define   PCIE_CLIENT_GEN_SEL_1    HWORD_CLR_BIT(0x0080)
> > +#define   PCIE_CLIENT_GEN_SEL_2    HWORD_SET_BIT(0x0080)
> >  #define PCIE_CLIENT_LEGACY_INT_CTRL    (PCIE_CLIENT_BASE + 0x0c)
> > -#define   PCIE_CLIENT_INT_IN_ASSERT

Re: [PATCH 01/20] bitfield: introduce HWORD_UPDATE bitfield macros

2025-06-12 Thread Jakub Kicinski

On Thu, 12 Jun 2025 16:10:37 -0400 Yury Norov wrote:
> I don't think that having HWORD_UPDATE() in bitfield.h is a wrong
> thing. Jakub, if you do, we can just create a new header for it.

Yes, I'd prefer to contain it. This looks very much like a CSR tooling
convention of Rockchip's ASIC developers. IOW this is really about how
CSRs are accessed for a specific vendor, not a generic bitfield operator.

Re: [PATCH] drm/bridge: ti-sn65dsi86: fix REFCLK setting

2025-06-12 Thread Doug Anderson

Hi,

On Thu, Jun 12, 2025 at 10:52 AM Doug Anderson  wrote:
>
> Hi,
>
> On Thu, Jun 12, 2025 at 12:35 AM Jayesh Choudhary  wrote:
> >
> > >> If refclk is described in devicetree node, then I see that
> > >> the driver modifies it in every resume call based solely on the
> > >> clock value in dts.
> > >
> > > Exactly. But that is racy with what the chip itself is doing. I.e.
> > > if you don't have that usleep() above, the chip will win the race
> > > and the refclk frequency setting will be set according to the
> > > external GPIOs (which is poorly described in the datasheet, btw),
> > > regardless what the linux driver is setting (because that I2C write
> > > happens too early).
> >
> > I am a little confused here.
> > Won't it be opposite?
> > If we have this delay here, GPIO will stabilize and set the register
> > accordingly?
> >
> > In the driver, I came across the case when we do not have refclk.
> > (My platform does have a refclk, I am just removing the property from
> > > the dts node to check the effect of GPIO[3:1] in question because clock
> > is not a required property for the bridge as per the bindings)
> >
> > In the ti_sn65dsi86_probe(), before we read SN_DEVICE_ID_REGS,
> > when we go to resume(), we do not do enable_comms() that calls
> > ti_sn_bridge_set_refclk_freq() to set SN_DPPLL_SRC_REG.
> > I see that register read for SN_DEVICE_ID_REGS fails in that case.
> >
> > Adding this delay fixes that issue. This made me think that we need
> > the delay for GPIO to stabilize and set the refclk.
>
> FWIW, it's been on my plate for a while to delete the "no refclk"
> support. The chip is really hard to use properly without a refclk and
> I'm not at all convinced that the current code actually works properly
> without a refclk. I'm not aware of any current hardware working this
> way. I know we had some very early prototype hardware ages ago that
> tried it and we got it limping along at one point, but the driver
> looked _very_ different then. I believe someone on the lists once
> mentioned trying to do something without a refclk and it didn't work
> and I strongly encouraged them to add a refclk.

Actually, I may have to eat my words here. I double-checked the dts
and I see there's at least two mainline users
("meson-g12b-bananapi-cm4-mnt-reform2.dts" and
"/imx8mq-mnt-reform2.dts") that don't seem to be specifying a `refclk`
to `ti,sn65dsi86`.

Neil / Lucas: is that correct? ...and it actually works?

-Doug

Re: [PATCH 03/12] mm/pagewalk: Skip dax pages in pagewalk

2025-06-12 Thread Alistair Popple

On Thu, Jun 12, 2025 at 03:15:31PM +0100, Lorenzo Stoakes wrote:
> On Thu, May 29, 2025 at 04:32:04PM +1000, Alistair Popple wrote:
> > Previously dax pages were skipped by the pagewalk code as pud_special() or
> > vm_normal_page{_pmd}() would be false for DAX pages. Now that dax pages are
> > refcounted normally that is no longer the case, so add explicit checks to
> > skip them.
> >
> > Signed-off-by: Alistair Popple 
> > ---
> >  include/linux/memremap.h | 11 +++
> >  mm/pagewalk.c| 12 ++--
> >  2 files changed, 21 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> > index 4aa1519..54e8b57 100644
> > --- a/include/linux/memremap.h
> > +++ b/include/linux/memremap.h
> > @@ -198,6 +198,17 @@ static inline bool folio_is_fsdax(const struct folio 
> > *folio)
> > return is_fsdax_page(&folio->page);
> >  }
> >
> > +static inline bool is_devdax_page(const struct page *page)
> > +{
> > +   return is_zone_device_page(page) &&
> > +   page_pgmap(page)->type == MEMORY_DEVICE_GENERIC;
> > +}
> > +
> > +static inline bool folio_is_devdax(const struct folio *folio)
> > +{
> > +   return is_devdax_page(&folio->page);
> > +}
> > +
> >  #ifdef CONFIG_ZONE_DEVICE
> >  void zone_device_page_init(struct page *page);
> >  void *memremap_pages(struct dev_pagemap *pgmap, int nid);
> > diff --git a/mm/pagewalk.c b/mm/pagewalk.c
> > index e478777..0dfb9c2 100644
> > --- a/mm/pagewalk.c
> > +++ b/mm/pagewalk.c
> > @@ -884,6 +884,12 @@ struct folio *folio_walk_start(struct folio_walk *fw,
> >  * support PUD mappings in VM_PFNMAP|VM_MIXEDMAP VMAs.
> >  */
> > page = pud_page(pud);
> > +
> > +   if (is_devdax_page(page)) {
> 
> Is it only devdax that can exist at PUD leaf level, not fsdax?

Correct.

> > +   spin_unlock(ptl);
> > +   goto not_found;
> > +   }
> > +
> > goto found;
> > }
> >
> > @@ -911,7 +917,8 @@ struct folio *folio_walk_start(struct folio_walk *fw,
> > goto pte_table;
> > } else if (pmd_present(pmd)) {
> > page = vm_normal_page_pmd(vma, addr, pmd);
> > -   if (page) {
> > +   if (page && !is_devdax_page(page) &&
> > +   !is_fsdax_page(page)) {
> > goto found;
> > } else if ((flags & FW_ZEROPAGE) &&
> > is_huge_zero_pmd(pmd)) {
> > @@ -945,7 +952,8 @@ struct folio *folio_walk_start(struct folio_walk *fw,
> >
> > if (pte_present(pte)) {
> > page = vm_normal_page(vma, addr, pte);
> > -   if (page)
> > +   if (page && !is_devdax_page(page) &&
> > +   !is_fsdax_page(page))
> > goto found;
> > if ((flags & FW_ZEROPAGE) &&
> > is_zero_pfn(pte_pfn(pte))) {
> 
> I'm probably echoing others here (and I definitely particularly like Dan's
> suggestion of a helper function here, and Jason's suggestion of explanatory
> comments), but would also be nice to not have to do this separately at each 
> page
> table level and instead have something that you can say 'get me normal non-dax
> page at page table level '.

I did the filtering here because I was trying to avoid unintended behavioural
changes and was being lazy by not auditing the callers. Turns out naming is
harder than doing this properly so I'm going to go with Jason and David's
suggestion and drop the filtering entirely. It will then be up to callers to
define what is "normal" for them by filtering out folio types they don't care
about. Most already do filter out zone device folios or DAX VMA's anyway, and I
will add some commentary to this effect in the respin.

> > --
> > git-series 0.9.1
>

Re: [PATCH v1] drm/amdgpu: give each kernel job a unique id

2025-06-12 Thread Danilo Krummrich

On Thu, Jun 12, 2025 at 09:00:34AM +0200, Christian König wrote:
> On 6/11/25 17:11, Danilo Krummrich wrote:
>  Mhm, reiterating our internal discussion on the mailing list.
> 
>  I think it would be nicer if we could use negative values for the kernel 
>  submissions and positive for userspace. But as discussed internally we 
>  would need to adjust the scheduler trace points for that once more.
> 
>  @Philip and @Danilo any opinion on that?
> >>>
> >>> Both, the U64_MAX and the positive-negative approach, are a bit hacky. I 
> >>> wonder
> >>> why we need client_id to be a u64, wouldn't a u32 not be enough?
> >>
> >> That can trivially overflow on long running boxes.
> > 
> > I don't know if "trivially" is the word of choice given that the number is
> > 4,294,967,295.
> > 
> > But I did indeed miss that this is a for ever increasing atomic. Why is it 
> > an
> > atomic? Why is it not an IDA?
> 
> Well IDA has some extra overhead compared to an ever increasing atomic, 
> additional to that it might not be the best choice to re-use numbers for 
> clients in a trace log.

I think the overhead is not relevant at all, this is called from
drm_file_alloc(). The only path I can see where this is called is
drm_client_init(), which isn't a high-frequency path at all, is it?

It seems to me that we should probably use IDA here.

> On the other hand using smaller numbers is usually nicer for manual 
> inspection.

Another option is to just add an interface to get a kernel client_id from the
same atomic / IDA.
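
For a sense of scale on the u32 question quoted above, here is a minimal
sketch of how long an ever-increasing 32-bit id counter lasts before it
wraps. The function name and the allocation rates are assumptions picked
purely for illustration; nothing here is taken from the DRM code.

```c
#include <assert.h>
#include <stdint.h>

/* Whole days until a counter starting at 0 has handed out 2^32 ids. */
static uint64_t days_until_u32_wrap(uint64_t ids_per_second)
{
	const uint64_t id_space = UINT64_C(1) << 32;
	const uint64_t seconds_per_day = 24 * 60 * 60;

	/* A rate of zero never wraps and would divide by zero below. */
	assert(ids_per_second > 0);
	return id_space / ids_per_second / seconds_per_day;
}
```

At a hypothetical one new client per second that is 49710 days, roughly 136
years; at a hypothetical 1000 per second it drops to 49 days.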

Re: [PATCH 1/2] dt-bindings: display: panel: document Samsung S6E8AA5X01 panel driver

2025-06-12 Thread Conor Dooley

On Thu, Jun 12, 2025 at 05:10:23PM +, Kaustabh Chakraborty wrote:
> On 2025-06-12 15:30, Conor Dooley wrote:
> > On Thu, Jun 12, 2025 at 08:22:41PM +0530, Kaustabh Chakraborty wrote:
> >> Samsung S6E8AA5X01 is an AMOLED MIPI DSI panel controller. Document the
> >> compatible and devicetree properties of this panel driver. Timings are
> >> provided through the devicetree node as panels are available in
> >> different sizes.
> >> 
> >> Signed-off-by: Kaustabh Chakraborty 
> > 
> > Acked-by: Conor Dooley 
> 
> Okay no, even this one has the ID wrong, ugh :(
> 
> >> +$id: http://devicetree.org/schemas/display/panel/samsung,s6e8aa0.yaml#
> 
> Will apply tag after fixing it.

Thanks, I didn't spot it here either.



Re: [PATCH v6 4/4] drm/xe: Make dma-fences compliant with the safe access rules

2025-06-12 Thread Lucas De Marchi


On Tue, Jun 10, 2025 at 05:42:26PM +0100, Tvrtko Ursulin wrote:

Xe can free some of the data pointed to by the dma-fences it exports. Most
notably the timeline name can get freed if userspace closes the associated
submit queue. At the same time the fence could have been exported to a
third party (for example a sync_fence fd) which will then cause a
use-after-free on subsequent access.

To make this safe we need to make the driver compliant with the newly
documented dma-fence rules. The driver has to ensure an RCU grace period
between signalling a fence and freeing any data pointed to by said fence.

For the timeline name we simply make the queue be freed via kfree_rcu and
for the shared lock associated with multiple queues we add an RCU grace
period before freeing the per GT structure holding the lock.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Matthew Brost 



Acked-by: Lucas De Marchi 

for merging this through drm-misc tree.

Lucas De Marchi

Re: [PATCH v3 0/5] drm/dp: Limit the DPCD probe quirk to the affected monitor

2025-06-12 Thread Imre Deak

On Thu, Jun 12, 2025 at 03:54:51PM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 12.06.25 um 15:29 schrieb Imre Deak:
> > Hi,
> > 
> > On Tue, Jun 10, 2025 at 06:42:04PM +0300, Imre Deak wrote:
> > > Hi Maxim, Thomas, Maarten,
> > > 
> > > could you please ack merging this patchset via drm-intel?
> > any objection to merge the patchset via drm-intel? If not, could
> > someone ack it?
> 
> Sorry for missing that. I'm OK with merging it through Intel trees. Go
> ahead.

Ok, thanks for the follow-up, acks and reviews, patchset is pushed to
drm-intel-next.

> Best regards
> Thomas
> 
> > 
> > Patches 1-4 could be also merged to drm-misc-next instead, but then
> > would need to wait with patch 5 until drm-misc-next is merged to
> > drm-intel.
> > 
> > Thanks,
> > Imre
> > 
> > > On Thu, Jun 05, 2025 at 11:28:45AM +0300, Imre Deak wrote:
> > > > This is v3 of [1], with the following changes requested by Jani:
> > > > 
> > > > - Convert the internal quirk list to an enum list.
> > > > - Track both the internal and global quirks on a single list.
> > > > - Drop the change to support panel name specific quirks for now.
> > > > 
> > > > [1] 
> > > > https://lore.kernel.org/all/20250603121543.17842-1-imre.d...@intel.com
> > > > 
> > > > Cc: Ville Syrjälä 
> > > > Cc: Jani Nikula 
> > > > 
> > > > Imre Deak (5):
> > > >drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS
> > > >drm/edid: Define the quirks in an enum list
> > > >drm/edid: Add support for quirks visible to DRM core and drivers
> > > >drm/dp: Add an EDID quirk for the DPCD register access probe
> > > >drm/i915/dp: Disable the AUX DPCD probe quirk if it's not required
> > > > 
> > > >   drivers/gpu/drm/display/drm_dp_helper.c  |  44 ++--
> > > >   drivers/gpu/drm/drm_edid.c   | 227 ++-
> > > >   drivers/gpu/drm/i915/display/intel_dp.c  |  11 +-
> > > >   drivers/gpu/drm/i915/display/intel_dp_aux.c  |   2 +
> > > >   drivers/gpu/drm/i915/display/intel_hotplug.c |  10 +
> > > >   include/drm/display/drm_dp_helper.h  |   6 +
> > > >   include/drm/drm_connector.h  |   4 +-
> > > >   include/drm/drm_edid.h   |   8 +
> > > >   8 files changed, 189 insertions(+), 123 deletions(-)
> > > > 
> > > > -- 
> > > > 2.44.2
> > > > 
> 
> -- 
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)
>

Re: [PATCH] drm/bridge: ti-sn65dsi86: fix REFCLK setting

2025-06-12 Thread Doug Anderson

Hi,

On Thu, Jun 12, 2025 at 12:35 AM Jayesh Choudhary  wrote:
>
> >> If refclk is described in devicetree node, then I see that
> >> the driver modifies it in every resume call based solely on the
> >> clock value in dts.
> >
> > Exactly. But that is racy with what the chip itself is doing. I.e.
> > if you don't have that usleep() above, the chip will win the race
> > and the refclk frequency setting will be set according to the
> > external GPIOs (which is poorly described in the datasheet, btw),
> > regardless what the linux driver is setting (because that I2C write
> > happens too early).
>
> I am a little confused here.
> Won't it be opposite?
> If we have this delay here, GPIO will stabilize and set the register
> accordingly?
>
> In the driver, I came across the case when we do not have refclk.
> (My platform does have a refclk, I am just removing the property from
> the dts node to check the effect of GPIO[3:1] in question because clock
> is not a required property for the bridge as per the bindings)
>
> In the ti_sn65dsi86_probe(), before we read SN_DEVICE_ID_REGS,
> when we go to resume(), we do not do enable_comms() that calls
> ti_sn_bridge_set_refclk_freq() to set SN_DPPLL_SRC_REG.
> I see that register read for SN_DEVICE_ID_REGS fails in that case.
>
> Adding this delay fixes that issue. This made me think that we need
> the delay for GPIO to stabilize and set the refclk.

FWIW, it's been on my plate for a while to delete the "no refclk"
support. The chip is really hard to use properly without a refclk and
I'm not at all convinced that the current code actually works properly
without a refclk. I'm not aware of any current hardware working this
way. I know we had some very early prototype hardware ages ago that
tried it and we got it limping along at one point, but the driver
looked _very_ different then. I believe someone on the lists once
mentioned trying to do something without a refclk and it didn't work
and I strongly encouraged them to add a refclk.

-Doug

Re: [PATCH v4 2/2] drm/xe/bo: add GPU memory trace points

2025-06-12 Thread Lucas De Marchi


On Thu, Jun 12, 2025 at 05:46:52PM +0100, Tvrtko Ursulin wrote:


On 12/06/2025 06:40, Lucas De Marchi wrote:

On Wed, Jun 11, 2025 at 03:51:24PM -0700, Juston Li wrote:

Add TRACE_GPU_MEM tracepoints for tracking global and per-process GPU
memory usage.

These are required by VSR on Android 12+ for reporting GPU driver memory
allocations.

v3:
- Use now configurable CONFIG_TRACE_GPU_MEM instead of adding a
  per-driver Kconfig (Lucas)

v2:
- Use u64 as preferred by checkpatch (Tvrtko)
- Fix errors in comments/Kconfig description (Tvrtko)
- drop redundant "CONFIG" in Kconfig

Signed-off-by: Juston Li 
Reviewed-by: Tvrtko Ursulin 
---
drivers/gpu/drm/xe/xe_bo.c   | 47 
drivers/gpu/drm/xe/xe_device_types.h | 16 ++
2 files changed, 63 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4e39188a021ab..89a3d23e3b800 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -19,6 +19,8 @@

#include 

+#include 
+
#include "xe_device.h"
#include "xe_dma_buf.h"
#include "xe_drm_client.h"
@@ -418,6 +420,35 @@ static void xe_ttm_tt_account_subtract(struct 
xe_device *xe, struct ttm_tt *tt)

    xe_shrinker_mod_pages(xe->mem.shrinker, -(long)tt->num_pages, 0);
}

+#if IS_ENABLED(CONFIG_TRACE_GPU_MEM)
+static void update_global_total_pages(struct ttm_device *ttm_dev, 
long num_pages)

+{
+    struct xe_device *xe = ttm_to_xe_device(ttm_dev);
+    u64 global_total_pages =
+    atomic64_add_return(num_pages, &xe->global_total_pages);
+
+    trace_gpu_mem_total(0, 0, global_total_pages << PAGE_SHIFT);
+}
+
+static void update_process_mem(struct drm_file *file, ssize_t size)
+{
+    struct xe_file *xef = to_xe_file(file);
+    u64 process_mem = atomic64_add_return(size, &xef->process_mem);
+
+    rcu_read_lock(); /* Locks file->pid! */
+    trace_gpu_mem_total(0, pid_nr(rcu_dereference(file->pid)), 
process_mem);


Isn't the first arg supposed to be the gpu id? Doesn't this become
invalid when I have e.g. LNL + BMG and the trace is enabled?


Good point.

u32 gpu_id does not seem possible to map to anything useful.


maybe minor_id? although I'm not sure if the intention is to share this
outside drm as seems the case.



Shall we replace it with a string obtained from dev_name(struct device 
*) ? As only Android parses them I think we can get still away with 
changing the tracepoints ABI.


works for me too. Is Android actually parsing it or just ignoring?
Because afaics it's always 0 in msm.

Lucas De Marchi

[PATCH 00/20] BYEWORD_UPDATE: unifying (most) HIWORD_UPDATE macros

2025-06-12 Thread Nicolas Frattaroli

This series was spawned by [1], where I was asked to move every instance
of HIWORD_UPDATE et al that I could find to a common macro in the same
series that I am introducing said common macro.

The first patch of the series introduces the two new macros in
bitfield.h, called HWORD_UPDATE and HWORD_UPDATE_CONST. The latter can
be used in initializers.

This macro definition checks that the mask fits, and that the value fits
in the mask. Like FIELD_PREP, it also shifts the value up to the mask,
so turning off a bit does not require using the mask as a value. Masks
are also required to be contiguous, like with FIELD_PREP.
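
The semantics described above can be sketched in plain userspace C. This is
only a model under stated assumptions: hword_update_model() is a hypothetical
name, and it reports bad arguments at run time, whereas the macros in the
series are described as catching them at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the HWORD_UPDATE() semantics from the
 * cover letter. It is NOT the kernel macro: it returns false for bad
 * arguments instead of failing the build.
 */
static bool hword_update_model(uint16_t mask, uint16_t val, uint32_t *out)
{
	unsigned int shift = 0;
	uint16_t field;

	assert(out);

	if (mask == 0)
		return false;		/* an empty mask selects nothing */

	/* Find the position of the lowest set bit of the mask. */
	while (!((mask >> shift) & 1))
		shift++;

	/* Shifted down, a contiguous mask looks like 0b0...01...1. */
	field = (uint16_t)(mask >> shift);
	if (field & (field + 1))
		return false;		/* mask has holes */

	if (val & ~field)
		return false;		/* value does not fit in the field */

	/* Lower half: value shifted into place. Upper half: write-enable bits. */
	*out = ((uint32_t)val << shift) | ((uint32_t)mask << 16);
	return true;
}
```

With mask 0x0030 and value 2 the model yields 0x00300020, and clearing bit 0
is simply mask 0x0001 with value 0, giving 0x00010000. A value of 4 for the
two-bit mask 0x0030, or the non-contiguous mask 0x0050, is rejected.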

For each definition of such a macro, the driver(s) that used it were
evaluated for three different treatments:
 - full conversion to the new macro, for cases where replacing the
   implementation of the old macro wouldn't have worked, or where the
   conversion was trivial. These are the most complex patches in this
   series, as they sometimes have to pull apart definitions of masks
   and values due to the new semantics, which require a contiguous
   mask and shift the value for us.
 - replacing the implementation of the old macro with an instance of the
   new macro, done where I felt it made the patch much easier to review
   because I didn't want to drop a big diff on people.
 - skipping conversion entirely, usually because the mask is
   non-constant and it's not trivial to make it constant. Sometimes an
   added complication is that said non-constant mask is either used in a
   path where runtime overhead may not be desirable, or in an
   initializer.

Left out of conversion:
 - drivers/mmc/host/sdhci-of-arasan.c: mask is non-constant.
 - drivers/phy/rockchip/phy-rockchip-inno-csidphy.c: mask is
   non-constant likely by way of runtime pointer dereferencing, even if
   struct and members are made const.
 - drivers/clk/rockchip/clk.h: way too many clock drivers use non-const
   masks in the context of an initializer.

I will not be addressing these 3 remaining users in this series, as
implementing a runtime checked version on top of this and verifying that
it doesn't cause undue overhead just for 3 stragglers is a bit outside
the scope of wanting to get my RK3576 PWM series unblocked. Please have
mercy.

In total, I count 19 different occurrences of such a macro fixed out of
22 I found. The vast majority of these patches have either undergone
static testing to ensure the values end up the same during development,
or have been verified to not break the device the driver is for at
runtime. Only a handful are just compile-tested, and the individual
patches remark which ones those are.

This took a lot of manual work as this wasn't really something that
could be automated: code had to be refactored to ensure masks were
contiguous, made sense to how the hardware actually works and to human
readers, were constant, and that the code uses unshifted values.

https://lore.kernel.org/all/aD8hB-qJ4Qm6IFuS@yury/ [1]

Signed-off-by: Nicolas Frattaroli 
---
Nicolas Frattaroli (20):
  bitfield: introduce HWORD_UPDATE bitfield macros
  mmc: dw_mmc-rockchip: switch to HWORD_UPDATE macro
  soc: rockchip: grf: switch to HWORD_UPDATE_CONST macro
  media: synopsys: hdmirx: replace macros with bitfield variants
  drm/rockchip: lvds: switch to HWORD_UPDATE macro
  phy: rockchip-emmc: switch to HWORD_UPDATE macro
  drm/rockchip: dsi: switch to HWORD_UPDATE* macros
  drm/rockchip: vop2: switch to HWORD_UPDATE macro
  phy: rockchip-samsung-dcphy: switch to HWORD_UPDATE macro
  drm/rockchip: dw_hdmi_qp: switch to HWORD_UPDATE macro
  drm/rockchip: inno-hdmi: switch to HWORD_UPDATE macro
  phy: rockchip-usb: switch to HWORD_UPDATE macro
  drm/rockchip: dw_hdmi: switch to HWORD_UPDATE* macros
  ASoC: rockchip: i2s-tdm: switch to HWORD_UPDATE_CONST macro
  net: stmmac: dwmac-rk: switch to HWORD_UPDATE macro
  PCI: rockchip: switch to HWORD_UPDATE* macros
  PCI: dw-rockchip: switch to HWORD_UPDATE macro
  PM / devfreq: rockchip-dfi: switch to HWORD_UPDATE macro
  clk: sp7021: switch to HWORD_UPDATE macro
  phy: rockchip-pcie: switch to HWORD_UPDATE macro

 drivers/clk/clk-sp7021.c   |  21 +--
 drivers/devfreq/event/rockchip-dfi.c   |  26 ++--
 drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c| 142 ++---
 drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c|  80 ++--
 drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c |  68 +-
 drivers/gpu/drm/rockchip/inno_hdmi.c   |  11 +-
 drivers/gpu/drm/rockchip/rockchip_drm_vop2.h   |   1 -
 drivers/gpu/drm/rockchip/rockchip_lvds.h   |  21 +--
 drivers/gpu/drm/rockchip/rockchip_vop2_reg.c   |  14 +-
 .../media/platform/synopsys/hdmirx/snps_hdmirx.h   |   5 +-
 drivers/mmc/host/dw_mmc-rockchip.c |   7 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c |   3 +-
 drivers/pci/controller/dw

[PATCH 04/20] media: synopsys: hdmirx: replace macros with bitfield variants

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Replace the UPDATE macro with bitfield.h's FIELD_PREP, to give us
additional error checking.

Also, replace the HIWORD_UPDATE macro at the same time with bitfield.h's
new HWORD_UPDATE macro, which also gives us additional error checking.

The UPDATE/HIWORD_UPDATE macros are left as wrappers around the
bitfield.h macros, in order to not rock the boat too much, and keep the
changes easy to review.
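
As a rough userspace illustration of why the wrappers keep existing callers
working, the sketch below compares the old definitions with stand-ins for the
new ones. GENMASK_U32, LOWBIT_U32, SKETCH_FIELD_PREP and SKETCH_HWORD_UPDATE
are names made up for this example and omit the build-time checks of the
bitfield.h macros; only the two OLD_* bodies mirror the lines removed by the
diff.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins; names and definitions are illustrative only. */
#define GENMASK_U32(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))
/* Lowest set bit of a mask, e.g. 0x30 -> 0x10. */
#define LOWBIT_U32(m)	((uint32_t)(m) & (~(uint32_t)(m) + 1u))

/* The old driver-local macros, as removed by the patch. */
#define OLD_UPDATE(x, h, l)	(((x) << (l)) & GENMASK_U32((h), (l)))
#define OLD_HIWORD_UPDATE(v, h, l) \
	(((v) << (l)) | (GENMASK_U32((h), (l)) << 16))

/* Sketches of the replacements, without the kernel's build-time checks. */
#define SKETCH_FIELD_PREP(mask, val) \
	(((uint32_t)(val) * LOWBIT_U32(mask)) & (uint32_t)(mask))
#define SKETCH_HWORD_UPDATE(mask, val) \
	(SKETCH_FIELD_PREP(mask, val) | ((uint32_t)(mask) << 16))

/* In-range values: old and new encodings agree. */
static_assert(OLD_UPDATE(2u, 5, 4) ==
	      SKETCH_FIELD_PREP(GENMASK_U32(5, 4), 2u), "UPDATE differs");
static_assert(OLD_HIWORD_UPDATE(2u, 5, 4) ==
	      SKETCH_HWORD_UPDATE(GENMASK_U32(5, 4), 2u),
	      "HIWORD_UPDATE differs");

/* Out-of-range value: the old HIWORD_UPDATE sets bit 6, outside mask 5:4. */
static_assert(OLD_HIWORD_UPDATE(4u, 5, 4) == 0x00300040u,
	      "unexpected old encoding");
```

The last assertion shows the kind of mistake the old macro let through
silently: the value was never masked, so an oversized constant spilled into a
neighbouring bit. That is the case the additional error checking is meant to
reject.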

Signed-off-by: Nicolas Frattaroli 
---
 drivers/media/platform/synopsys/hdmirx/snps_hdmirx.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/synopsys/hdmirx/snps_hdmirx.h 
b/drivers/media/platform/synopsys/hdmirx/snps_hdmirx.h
index 
220ab99ca61152b36b0a08b398ddefdb985709a5..cd5250e282a5c9de9a75ea73f26496ed53766dff
 100644
--- a/drivers/media/platform/synopsys/hdmirx/snps_hdmirx.h
+++ b/drivers/media/platform/synopsys/hdmirx/snps_hdmirx.h
@@ -8,10 +8,11 @@
 #ifndef DW_HDMIRX_H
 #define DW_HDMIRX_H
 
+#include 
 #include 
 
-#define UPDATE(x, h, l)    (((x) << (l)) & GENMASK((h), (l)))
-#define HIWORD_UPDATE(v, h, l) (((v) << (l)) | (GENMASK((h), (l)) << 16))
+#define UPDATE(x, h, l)    (FIELD_PREP(GENMASK((h), (l)), (x)))
+#define HIWORD_UPDATE(v, h, l) (HWORD_UPDATE(GENMASK((h), (l)), (v)))
 
 /* SYS_GRF */
 #define SYS_GRF_SOC_CON1   0x0304

-- 
2.49.0

[PATCH 01/20] bitfield: introduce HWORD_UPDATE bitfield macros

2025-06-12 Thread Nicolas Frattaroli

Hardware of various vendors, but very notably Rockchip, often uses
32-bit registers where the upper 16-bit half of the register is a
write-enable mask for the lower half.

This type of hardware setup allows for more granular concurrent register
write access.
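
To make the register behaviour concrete, here is a toy software model of such
a register. It assumes the simplest form of the scheme (a low bit changes
only when its write-enable bit in the upper half is set); hiword_reg_write()
and hiword_reg_demo() are hypothetical names, and real hardware does this in
silicon rather than in code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of a 16-bit register behind a 32-bit write port whose upper
 * half is a write-enable mask. Returns the register contents after the
 * write.
 */
static uint16_t hiword_reg_write(uint16_t current, uint32_t written)
{
	uint32_t enable = written >> 16;	/* which low bits may change */
	uint32_t data = written & 0xffffu;	/* proposed new low bits */

	return (uint16_t)((current & ~enable) | (data & enable));
}

/* Two writers touch different fields with no read-modify-write cycle. */
void hiword_reg_demo(void)
{
	uint16_t reg = 0x00ff;

	reg = hiword_reg_write(reg, 0x00300020u);	/* bits 5:4 := 0b10 */
	assert(reg == 0x00ef);

	reg = hiword_reg_write(reg, 0x00010000u);	/* bit 0 := 0 */
	assert(reg == 0x00ee);
}
```

Because each write names the bits it wants to touch, neither writer needs to
read the register first or hold a lock around the update, which is where the
more granular concurrent access comes from.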

Over the years, many drivers have hand-rolled their own version of this
macro, usually without any checks, often called something like
HIWORD_UPDATE or FIELD_PREP_HIWORD, commonly with slightly different
semantics between them.

Clearly there is a demand for such a macro, and thus the demand should
be satisfied in a common header file.

Add two macros: HWORD_UPDATE, and HWORD_UPDATE_CONST. The latter is a
version that can be used in initializers, like FIELD_PREP_CONST. The
macro names are chosen to not clash with any potential other macros that
drivers may already have implemented themselves, while retaining a
familiar name.

Signed-off-by: Nicolas Frattaroli 
---
 include/linux/bitfield.h | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h
index 
6d9a53db54b66c0833973c880444bd289d9667b1..b90d88db7405f95b78cdd6f3426263086bab5aa6
 100644
--- a/include/linux/bitfield.h
+++ b/include/linux/bitfield.h
@@ -8,6 +8,7 @@
 #define _LINUX_BITFIELD_H
 
 #include 
+#include 
 #include 
 #include 
 
@@ -142,6 +143,52 @@
(((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask))  \
)
 
+/**
+ * HWORD_UPDATE() - prepare a bitfield element with a mask in the upper half
+ * @_mask: shifted mask defining the field's length and position
+ * @_val:  value to put in the field
+ *
+ * HWORD_UPDATE() masks and shifts up the value, as well as bitwise ORs the
+ * result with the mask shifted up by 16.
+ *
+ * This is useful for a common design of hardware registers where the upper
+ * 16-bit half of a 32-bit register is used as a write-enable mask. In such a
+ * register, a bit in the lower half is only updated if the corresponding bit
+ * in the upper half is high.
+ */
+#define HWORD_UPDATE(_mask, _val)   \
+   ({   \
+   __BF_FIELD_CHECK(_mask, ((u16) 0U), _val,\
+"HWORD_UPDATE: ");  \
+   (((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask)) | \
+   ((_mask) << 16); \
+   })
+
+/**
+ * HWORD_UPDATE_CONST() - prepare a constant bitfield element with a mask in
+ *the upper half
+ * @_mask: shifted mask defining the field's length and position
+ * @_val:  value to put in the field
+ *
+ * HWORD_UPDATE_CONST() masks and shifts up the value, as well as bitwise ORs
+ * the result with the mask shifted up by 16.
+ *
+ * This is useful for a common design of hardware registers where the upper
+ * 16-bit half of a 32-bit register is used as a write-enable mask. In such a
+ * register, a bit in the lower half is only updated if the corresponding bit
+ * in the upper half is high.
+ *
+ * Unlike HWORD_UPDATE(), this is a constant expression and can therefore
+ * be used in initializers. Error checking is less comfortable for this
+ * version.
+ */
+#define HWORD_UPDATE_CONST(_mask, _val)                           \
+   ( \
+   FIELD_PREP_CONST(_mask, _val) |   \
+   (BUILD_BUG_ON_ZERO(const_true((u64) (_mask) > U16_MAX)) + \
+((_mask) << 16)) \
+   )
+
 /**
  * FIELD_GET() - extract a bitfield element
  * @_mask: shifted mask defining the field's length and position

-- 
2.49.0

[PATCH 03/20] soc: rockchip: grf: switch to HWORD_UPDATE_CONST macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Switch the rockchip grf driver to the HWORD_UPDATE_CONST macro, which
brings with it more error checking while still being able to be used in
initializers.

All HIWORD_UPDATE instances and its definition are removed from the
driver, as the conversion here is obvious, and static_asserts were used
during development to make sure the ones greater than one bit in width
were really equivalent.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/soc/rockchip/grf.c | 35 +--
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/soc/rockchip/grf.c b/drivers/soc/rockchip/grf.c
index 
1eab4bb0eacffe19a8f0af0b71bdaa5c0b506629..a4a075ec98309cfcf7fc0bbbd310678ffcbe45da
 100644
--- a/drivers/soc/rockchip/grf.c
+++ b/drivers/soc/rockchip/grf.c
@@ -5,14 +5,13 @@
  * Copyright (c) 2016 Heiko Stuebner 
  */
 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 
-#define HIWORD_UPDATE(val, mask, shift) \
-   ((val) << (shift) | (mask) << ((shift) + 16))
 
 struct rockchip_grf_value {
const char *desc;
@@ -32,7 +31,7 @@ static const struct rockchip_grf_value rk3036_defaults[] 
__initconst = {
 * Disable auto jtag/sdmmc switching that causes issues with the
 * clock-framework and the mmc controllers making them unreliable.
 */
-   { "jtag switching", RK3036_GRF_SOC_CON0, HIWORD_UPDATE(0, 1, 11) },
+   { "jtag switching", RK3036_GRF_SOC_CON0, HWORD_UPDATE_CONST(BIT(11), 0) 
},
 };
 
 static const struct rockchip_grf_info rk3036_grf __initconst = {
@@ -44,8 +43,8 @@ static const struct rockchip_grf_info rk3036_grf __initconst 
= {
 #define RK3128_GRF_SOC_CON1    0x144
 
 static const struct rockchip_grf_value rk3128_defaults[] __initconst = {
-   { "jtag switching", RK3128_GRF_SOC_CON0, HIWORD_UPDATE(0, 1, 8) },
-   { "vpu main clock", RK3128_GRF_SOC_CON1, HIWORD_UPDATE(0, 1, 10) },
+   { "jtag switching", RK3128_GRF_SOC_CON0, HWORD_UPDATE_CONST(BIT(8), 0) 
},
+   { "vpu main clock", RK3128_GRF_SOC_CON1, HWORD_UPDATE_CONST(BIT(10), 0) 
},
 };
 
 static const struct rockchip_grf_info rk3128_grf __initconst = {
@@ -56,7 +55,7 @@ static const struct rockchip_grf_info rk3128_grf __initconst 
= {
 #define RK3228_GRF_SOC_CON6   0x418
 
 static const struct rockchip_grf_value rk3228_defaults[] __initconst = {
-   { "jtag switching", RK3228_GRF_SOC_CON6, HIWORD_UPDATE(0, 1, 8) },
+   { "jtag switching", RK3228_GRF_SOC_CON6, HWORD_UPDATE_CONST(BIT(8), 0) 
},
 };
 
 static const struct rockchip_grf_info rk3228_grf __initconst = {
@@ -68,8 +67,8 @@ static const struct rockchip_grf_info rk3228_grf __initconst 
= {
 #define RK3288_GRF_SOC_CON2   0x24c
 
 static const struct rockchip_grf_value rk3288_defaults[] __initconst = {
-   { "jtag switching", RK3288_GRF_SOC_CON0, HIWORD_UPDATE(0, 1, 12) },
-   { "pwm select", RK3288_GRF_SOC_CON2, HIWORD_UPDATE(1, 1, 0) },
+   { "jtag switching", RK3288_GRF_SOC_CON0, HWORD_UPDATE_CONST(BIT(12), 0) 
},
+   { "pwm select", RK3288_GRF_SOC_CON2, HWORD_UPDATE_CONST(BIT(0), 1) },
 };
 
 static const struct rockchip_grf_info rk3288_grf __initconst = {
@@ -80,7 +79,7 @@ static const struct rockchip_grf_info rk3288_grf __initconst 
= {
 #define RK3328_GRF_SOC_CON4   0x410
 
 static const struct rockchip_grf_value rk3328_defaults[] __initconst = {
-   { "jtag switching", RK3328_GRF_SOC_CON4, HIWORD_UPDATE(0, 1, 12) },
+   { "jtag switching", RK3328_GRF_SOC_CON4, HWORD_UPDATE_CONST(BIT(12), 0) 
},
 };
 
 static const struct rockchip_grf_info rk3328_grf __initconst = {
@@ -91,7 +90,7 @@ static const struct rockchip_grf_info rk3328_grf __initconst 
= {
 #define RK3368_GRF_SOC_CON15   0x43c
 
 static const struct rockchip_grf_value rk3368_defaults[] __initconst = {
-   { "jtag switching", RK3368_GRF_SOC_CON15, HIWORD_UPDATE(0, 1, 13) },
+   { "jtag switching", RK3368_GRF_SOC_CON15, HWORD_UPDATE_CONST(BIT(13), 
0) },
 };
 
 static const struct rockchip_grf_info rk3368_grf __initconst = {
@@ -102,7 +101,7 @@ static const struct rockchip_grf_info rk3368_grf 
__initconst = {
 #define RK3399_GRF_SOC_CON7   0xe21c
 
 static const struct rockchip_grf_value rk3399_defaults[] __initconst = {
-   { "jtag switching", RK3399_GRF_SOC_CON7, HIWORD_UPDATE(0, 1, 12) },
+   { "jtag switching", RK3399_GRF_SOC_CON7, HWORD_UPDATE_CONST(BIT(12), 0) 
},
 };
 
 static const struct rockchip_grf_info rk3399_grf __initconst = {
@@ -113,9 +112,9 @@ static const struct rockchip_grf_info rk3399_grf 
__initconst = {
 #define RK3566_GRF_USB3OTG0_CON1   0x0104
 
 static const struct rockchip_grf_value rk3566_defaults[] __initconst = {
-   { "usb3otg port switch", RK3566_GRF_USB3OTG0_CON1, HIWORD_UPDATE(0, 1, 
12) },
-   { "usb3otg clock switch", RK3566_GRF_USB3OTG0_CON1, HIWORD_UPDATE(1, 1, 
7) },
-

[PATCH 02/20] mmc: dw_mmc-rockchip: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Switch to the new HWORD_UPDATE macro in bitfield.h, which has error
checking. Instead of redefining the driver's HIWORD_UPDATE macro in this
case, replace the only two instances of it with the new macro, as I
could test that they result in an equivalent value.
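
One way such an equivalence test can be done for a runtime value is an
exhaustive loop over the 11-bit field, sketched below in user space.
sketch_hword_update and SKETCH_GENMASK are hypothetical stand-ins for
this example, not the bitfield.h macros, and the stand-in simply masks
the value instead of performing the real macro's checks.

```c
#include <stdint.h>

/* Old driver macro, as removed by this patch. */
#define HIWORD_UPDATE(val, mask, shift) \
	((val) << (shift) | (mask) << ((shift) + 16))

/* Hypothetical GENMASK stand-in for 32-bit values: bits h down to l set. */
#define SKETCH_GENMASK(h, l) \
	((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Hypothetical stand-in: shift val to the mask, set write-enable bits. */
static inline uint32_t sketch_hword_update(uint32_t mask, uint32_t val)
{
	uint32_t low_bit = mask & (~mask + 1u);	/* lowest set bit of mask */

	return ((val * low_bit) & mask) | (mask << 16);
}

/* Count the raw values for which old and new encodings differ. */
unsigned int sketch_count_mismatches(void)
{
	unsigned int mismatches = 0;

	for (uint32_t raw = 0; raw <= 0x07ff; raw++) {
		uint32_t old_word = HIWORD_UPDATE(raw, 0x07ffu, 1);
		uint32_t new_word =
			sketch_hword_update(SKETCH_GENMASK(11, 1), raw);

		if (old_word != new_word)
			mismatches++;
	}

	return mismatches;
}
```

The old mask 0x07ff shifted left by 1 covers bits 11 down to 1, which
is why GENMASK(11, 1) is the matching mask in the new form.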

Signed-off-by: Nicolas Frattaroli 
---
 drivers/mmc/host/dw_mmc-rockchip.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/host/dw_mmc-rockchip.c 
b/drivers/mmc/host/dw_mmc-rockchip.c
index 
baa23b51773127b4137f472581259b61649273a5..9e3d17becf65ffb60fe3d32d2cdec341fbd30b1e
 100644
--- a/drivers/mmc/host/dw_mmc-rockchip.c
+++ b/drivers/mmc/host/dw_mmc-rockchip.c
@@ -5,6 +5,7 @@
 
 #include 
 #include 
+#include <linux/bitfield.h>
 #include 
 #include 
 #include 
@@ -24,8 +25,6 @@
 #define ROCKCHIP_MMC_DELAYNUM_OFFSET   2
 #define ROCKCHIP_MMC_DELAYNUM_MASK (0xff << ROCKCHIP_MMC_DELAYNUM_OFFSET)
 #define ROCKCHIP_MMC_DELAY_ELEMENT_PSEC   60
-#define HIWORD_UPDATE(val, mask, shift) \
-   ((val) << (shift) | (mask) << ((shift) + 16))
 
 static const unsigned int freqs[] = { 10, 20, 30, 40 };
 
@@ -148,9 +147,9 @@ static int rockchip_mmc_set_internal_phase(struct dw_mci 
*host, bool sample, int
raw_value |= nineties;
 
if (sample)
-   mci_writel(host, TIMING_CON1, HIWORD_UPDATE(raw_value, 0x07ff, 
1));
+   mci_writel(host, TIMING_CON1, HWORD_UPDATE(GENMASK(11, 1), 
raw_value));
else
-   mci_writel(host, TIMING_CON0, HIWORD_UPDATE(raw_value, 0x07ff, 
1));
+   mci_writel(host, TIMING_CON0, HWORD_UPDATE(GENMASK(11, 1), 
raw_value));
 
dev_dbg(host->dev, "set %s_phase(%d) delay_nums=%u actual_degrees=%d\n",
sample ? "sample" : "drv", degrees, delay_num,

-- 
2.49.0

[PATCH 16/20] PCI: rockchip: switch to HWORD_UPDATE* macros

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

The Rockchip PCI driver, like many other Rockchip drivers, has its very
own definition of HIWORD_UPDATE.

Remove it, and replace its usage with either HWORD_UPDATE, or two new
header-local macros for setting/clearing a bit with the high mask, which
use HWORD_UPDATE_CONST internally. In the process, ENCODE_LANES needed
to be adjusted, as HWORD_UPDATE* shifts the value for us.

That this is equivalent was verified by first making all HWORD_UPDATE
instances HWORD_UPDATE_CONST, then doing a static_assert() comparing it
to the old macro (and for those with parameters, static_asserting for
the full range of possible values with the old encode macro).

What we get out of this is compile time error checking to make sure the
value actually fits in the mask, and that the mask fits in the register,
and also generally less icky code that writes shifted values when it
actually just meant to set and clear a handful of bits.
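
The ENCODE_LANES adjustment can be illustrated with a small user-space
sketch. The old macros are reconstructed from the removed lines of this
patch; sketch_hword_update is a hypothetical stand-in for HWORD_UPDATE
written for this example, and the function names are made up.

```c
#include <stdint.h>

/* Old driver-local macros, reconstructed from the removed lines. */
#define HIWORD_UPDATE(mask, val)	(((mask) << 16) | (val))
#define OLD_ENCODE_LANES(x)		((((x) >> 1) & 3) << 4)

/* New encoding: the field value is no longer pre-shifted by 4. */
#define NEW_ENCODE_LANES(x)		((((x) >> 1) & 3))

/* Hypothetical stand-in: shift val to the mask, set write-enable bits. */
static inline uint32_t sketch_hword_update(uint32_t mask, uint32_t val)
{
	uint32_t low_bit = mask & (~mask + 1u);	/* lowest set bit of mask */

	return ((val * low_bit) & mask) | (mask << 16);
}

/* Old form: the caller pre-shifts the lane encoding into bits 5:4. */
uint32_t sketch_old_conf_lane_num(uint32_t lanes)
{
	return HIWORD_UPDATE(0x0030u, OLD_ENCODE_LANES(lanes));
}

/* New form: the shift is derived from the 0x0030 mask. */
uint32_t sketch_new_conf_lane_num(uint32_t lanes)
{
	return sketch_hword_update(0x0030u, NEW_ENCODE_LANES(lanes));
}
```

For lane counts 1, 2 and 4 both functions produce the same register
word, which is the property the static_assert() comparison relies on.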

Signed-off-by: Nicolas Frattaroli 
---
 drivers/pci/controller/pcie-rockchip.h | 35 +-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/controller/pcie-rockchip.h 
b/drivers/pci/controller/pcie-rockchip.h
index 
5864a20323f21a004bfee4ac6d3a1328c4ab4d8a..5f2e45f062d94cd75983f7ad0c5b708e5b4cfb6f
 100644
--- a/drivers/pci/controller/pcie-rockchip.h
+++ b/drivers/pci/controller/pcie-rockchip.h
@@ -11,6 +11,7 @@
 #ifndef _PCIE_ROCKCHIP_H
 #define _PCIE_ROCKCHIP_H
 
+#include 
 #include 
 #include 
 #include 
@@ -21,10 +22,10 @@
  * The upper 16 bits of PCIE_CLIENT_CONFIG are a write mask for the lower 16
  * bits.  This allows atomic updates of the register without locking.
  */
-#define HIWORD_UPDATE(mask, val)   (((mask) << 16) | (val))
-#define HIWORD_UPDATE_BIT(val) HIWORD_UPDATE(val, val)
+#define HWORD_SET_BIT(val) (HWORD_UPDATE_CONST((val), 1))
+#define HWORD_CLR_BIT(val) (HWORD_UPDATE_CONST((val), 0))
 
-#define ENCODE_LANES(x)   ((((x) >> 1) & 3) << 4)
+#define ENCODE_LANES(x)   ((((x) >> 1) & 3))
 #define MAX_LANE_NUM   4
 #define MAX_REGION_LIMIT   32
 #define MIN_EP_APERTURE   28
@@ -32,21 +33,21 @@
 
 #define PCIE_CLIENT_BASE   0x0
 #define PCIE_CLIENT_CONFIG (PCIE_CLIENT_BASE + 0x00)
-#define   PCIE_CLIENT_CONF_ENABLE          HIWORD_UPDATE_BIT(0x0001)
-#define   PCIE_CLIENT_CONF_DISABLE         HIWORD_UPDATE(0x0001, 0)
-#define   PCIE_CLIENT_LINK_TRAIN_ENABLE    HIWORD_UPDATE_BIT(0x0002)
-#define   PCIE_CLIENT_LINK_TRAIN_DISABLE   HIWORD_UPDATE(0x0002, 0)
-#define   PCIE_CLIENT_ARI_ENABLE           HIWORD_UPDATE_BIT(0x0008)
-#define   PCIE_CLIENT_CONF_LANE_NUM(x)     HIWORD_UPDATE(0x0030, ENCODE_LANES(x))
-#define   PCIE_CLIENT_MODE_RC              HIWORD_UPDATE_BIT(0x0040)
-#define   PCIE_CLIENT_MODE_EP              HIWORD_UPDATE(0x0040, 0)
-#define   PCIE_CLIENT_GEN_SEL_1            HIWORD_UPDATE(0x0080, 0)
-#define   PCIE_CLIENT_GEN_SEL_2            HIWORD_UPDATE_BIT(0x0080)
+#define   PCIE_CLIENT_CONF_ENABLE          HWORD_SET_BIT(0x0001)
+#define   PCIE_CLIENT_CONF_DISABLE         HWORD_CLR_BIT(0x0001)
+#define   PCIE_CLIENT_LINK_TRAIN_ENABLE    HWORD_SET_BIT(0x0002)
+#define   PCIE_CLIENT_LINK_TRAIN_DISABLE   HWORD_CLR_BIT(0x0002)
+#define   PCIE_CLIENT_ARI_ENABLE           HWORD_SET_BIT(0x0008)
+#define   PCIE_CLIENT_CONF_LANE_NUM(x)     HWORD_UPDATE(0x0030, ENCODE_LANES(x))
+#define   PCIE_CLIENT_MODE_RC              HWORD_SET_BIT(0x0040)
+#define   PCIE_CLIENT_MODE_EP              HWORD_CLR_BIT(0x0040)
+#define   PCIE_CLIENT_GEN_SEL_1            HWORD_CLR_BIT(0x0080)
+#define   PCIE_CLIENT_GEN_SEL_2            HWORD_SET_BIT(0x0080)
 #define PCIE_CLIENT_LEGACY_INT_CTRL        (PCIE_CLIENT_BASE + 0x0c)
-#define   PCIE_CLIENT_INT_IN_ASSERT        HIWORD_UPDATE_BIT(0x0002)
-#define   PCIE_CLIENT_INT_IN_DEASSERT      HIWORD_UPDATE(0x0002, 0)
-#define   PCIE_CLIENT_INT_PEND_ST_PEND     HIWORD_UPDATE_BIT(0x0001)
-#define   PCIE_CLIENT_INT_PEND_ST_NORMAL   HIWORD_UPDATE(0x0001, 0)
+#define   PCIE_CLIENT_INT_IN_ASSERT        HWORD_SET_BIT(0x0002)
+#define   PCIE_CLIENT_INT_IN_DEASSERT      HWORD_CLR_BIT(0x0002)
+#define   PCIE_CLIENT_INT_PEND_ST_PEND     HWORD_SET_BIT(0x0001)
+#define   PCIE_CLIENT_INT_PEND_ST_NORMAL   HWORD_CLR_BIT(0x0001)
 #define PCIE_CLIENT_SIDE_BAND_STATUS       (PCIE_CLIENT_BASE + 0x20)
 #define   PCIE_CLIENT_PHY_ST               BIT(12)
 #define PCIE_CLIENT_DEBUG_OUT_0            (PCIE_CLIENT_BASE + 0x3c)

-- 
2.49.0

[PATCH 13/20] drm/rockchip: dw_hdmi: switch to HWORD_UPDATE* macros

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Remove this driver's very own HIWORD_UPDATE macro, and replace all
instances of it with equivalent instantiations of HWORD_UPDATE or
HWORD_UPDATE_CONST, depending on whether it's in an initializer.

This gives us better error checking, and a centrally agreed upon
signature for this macro, to ease code comprehension.

Because HWORD_UPDATE/HWORD_UPDATE_CONST shifts the value to the mask
(like FIELD_PREP et al do), a lot of macro instantiations get easier to
read.

This was tested on an RK3568 ODROID M1, as well as an RK3399 ROCKPro64.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c | 80 +
 1 file changed, 36 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c 
b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
index 
f737e7d46e667f2411a77aa8d1004637c50fbc5c..e8cb7fae6c22903db32f498459b22372a131963d
 100644
--- a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2014, Rockchip Electronics Co., Ltd.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -54,8 +55,6 @@
 #define RK3568_HDMI_SDAIN_MSK  BIT(15)
 #define RK3568_HDMI_SCLIN_MSK  BIT(14)
 
-#define HIWORD_UPDATE(val, mask)   (val | (mask) << 16)
-
 /**
  * struct rockchip_hdmi_chip_data - splite the grf setting of kind of chips
  * @lcdsel_grf_reg: grf register offset of lcdc select
@@ -359,17 +358,14 @@ static void dw_hdmi_rk3228_setup_hpd(struct dw_hdmi 
*dw_hdmi, void *data)
 
dw_hdmi_phy_setup_hpd(dw_hdmi, data);
 
-   regmap_write(hdmi->regmap,
-   RK3228_GRF_SOC_CON6,
-   HIWORD_UPDATE(RK3228_HDMI_HPD_VSEL | RK3228_HDMI_SDA_VSEL |
- RK3228_HDMI_SCL_VSEL,
- RK3228_HDMI_HPD_VSEL | RK3228_HDMI_SDA_VSEL |
- RK3228_HDMI_SCL_VSEL));
-
-   regmap_write(hdmi->regmap,
-   RK3228_GRF_SOC_CON2,
-   HIWORD_UPDATE(RK3228_HDMI_SDAIN_MSK | RK3228_HDMI_SCLIN_MSK,
- RK3228_HDMI_SDAIN_MSK | RK3228_HDMI_SCLIN_MSK));
+   regmap_write(hdmi->regmap, RK3228_GRF_SOC_CON6,
+HWORD_UPDATE(RK3228_HDMI_HPD_VSEL, 1) |
+HWORD_UPDATE(RK3228_HDMI_SDA_VSEL, 1) |
+HWORD_UPDATE(RK3228_HDMI_SCL_VSEL, 1));
+
+   regmap_write(hdmi->regmap, RK3228_GRF_SOC_CON2,
+HWORD_UPDATE(RK3228_HDMI_SDAIN_MSK, 1) |
+HWORD_UPDATE(RK3228_HDMI_SCLIN_MSK, 1));
 }
 
 static enum drm_connector_status
@@ -381,15 +377,13 @@ dw_hdmi_rk3328_read_hpd(struct dw_hdmi *dw_hdmi, void 
*data)
status = dw_hdmi_phy_read_hpd(dw_hdmi, data);
 
if (status == connector_status_connected)
-   regmap_write(hdmi->regmap,
-   RK3328_GRF_SOC_CON4,
-   HIWORD_UPDATE(RK3328_HDMI_SDA_5V | RK3328_HDMI_SCL_5V,
- RK3328_HDMI_SDA_5V | RK3328_HDMI_SCL_5V));
+   regmap_write(hdmi->regmap, RK3328_GRF_SOC_CON4,
+HWORD_UPDATE(RK3328_HDMI_SDA_5V, 1) |
+HWORD_UPDATE(RK3328_HDMI_SCL_5V, 1));
else
-   regmap_write(hdmi->regmap,
-   RK3328_GRF_SOC_CON4,
-   HIWORD_UPDATE(0, RK3328_HDMI_SDA_5V |
-RK3328_HDMI_SCL_5V));
+   regmap_write(hdmi->regmap, RK3328_GRF_SOC_CON4,
+HWORD_UPDATE(RK3328_HDMI_SDA_5V, 0) |
+HWORD_UPDATE(RK3328_HDMI_SCL_5V, 0));
return status;
 }
 
@@ -400,21 +394,21 @@ static void dw_hdmi_rk3328_setup_hpd(struct dw_hdmi 
*dw_hdmi, void *data)
dw_hdmi_phy_setup_hpd(dw_hdmi, data);
 
/* Enable and map pins to 3V grf-controlled io-voltage */
-   regmap_write(hdmi->regmap,
-   RK3328_GRF_SOC_CON4,
-   HIWORD_UPDATE(0, RK3328_HDMI_HPD_SARADC | RK3328_HDMI_CEC_5V |
-RK3328_HDMI_SDA_5V | RK3328_HDMI_SCL_5V |
-RK3328_HDMI_HPD_5V));
-   regmap_write(hdmi->regmap,
-   RK3328_GRF_SOC_CON3,
-   HIWORD_UPDATE(0, RK3328_HDMI_SDA5V_GRF | RK3328_HDMI_SCL5V_GRF |
-RK3328_HDMI_HPD5V_GRF |
-RK3328_HDMI_CEC5V_GRF));
-   regmap_write(hdmi->regmap,
-   RK3328_GRF_SOC_CON2,
-   HIWORD_UPDATE(RK3328_HDMI_SDAIN_MSK | RK3328_HDMI_SCLIN_MSK,
- RK3328_HDMI_SDAIN_MSK | RK3328_HDMI_SCLIN_MSK |
- RK3328_HDMI_HPD_IOE));
+   regmap_write(hdmi->regmap, RK3328_GRF_SOC_CON4,
+HWORD_UPDATE(RK3328_HDMI_HPD_SARADC, 0) |
+HWO

[PATCH 15/20] net: stmmac: dwmac-rk: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Like many other Rockchip drivers, dwmac-rk has its own HIWORD_UPDATE
macro. Its semantics allow us to redefine it as a wrapper around the
shared bitfield.h HWORD_UPDATE macro, though.

Replace the implementation of this driver's very own HIWORD_UPDATE macro
with an instance of HWORD_UPDATE from bitfield.h. This keeps the diff
easily reviewable, while giving us more compile-time error checking.

The related GRF_BIT macro is left alone for now; any attempt to rework
the code to not use its own solution here would likely end up harder to
review and less pretty for the time being.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
index 
700858ff6f7c33fdca08100dd7406aedeff0fc41..38a15aaf7846dc16e5e3f2ff91be0b5e81d29dba
 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
@@ -8,6 +8,7 @@
  */
 
 #include 
+#include <linux/bitfield.h>
 #include 
 #include 
 #include 
@@ -84,7 +85,7 @@ struct rk_priv_data {
 };
 
 #define HIWORD_UPDATE(val, mask, shift) \
-   ((val) << (shift) | (mask) << ((shift) + 16))
+   (HWORD_UPDATE((mask) << (shift), (val)))
 
 #define GRF_BIT(nr)        (BIT(nr) | BIT(nr+16))
 #define GRF_CLR_BIT(nr)    (BIT(nr+16))

-- 
2.49.0

[PATCH 14/20] ASoC: rockchip: i2s-tdm: switch to HWORD_UPDATE_CONST macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Replace the implementation of this driver's HIWORD_UPDATE macro with an
instance of HWORD_UPDATE_CONST. The const variant is chosen here because
some of the header defines are then used in initializers.

This gives us some compile-time error checking, while keeping the diff
very small and easy to review.
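
To show why the const variant matters for header defines, here is a
minimal user-space sketch of a (v, h, l) wrapper used in a file-scope
initializer. SKETCH_GENMASK, SKETCH_LOW_BIT and
SKETCH_HWORD_UPDATE_CONST are hypothetical stand-ins written for this
example, not the kernel's macros, and the variable name is made up.

```c
#include <stdint.h>

/* Hypothetical GENMASK stand-in for 32-bit values: bits h down to l set. */
#define SKETCH_GENMASK(h, l) \
	((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Hypothetical constant-expression stand-in for HWORD_UPDATE_CONST(). */
#define SKETCH_LOW_BIT(mask)	((uint32_t)(mask) & (~(uint32_t)(mask) + 1u))
#define SKETCH_HWORD_UPDATE_CONST(mask, val)				\
	((((uint32_t)(val) * SKETCH_LOW_BIT(mask)) & (uint32_t)(mask)) |	\
	 ((uint32_t)(mask) << 16))

/* Old driver-local definition, from the removed line. */
#define OLD_HIWORD_UPDATE(v, h, l) \
	(((v) << (l)) | (SKETCH_GENMASK((h), (l)) << 16))

/* New wrapper keeping the same (v, h, l) call signature. */
#define HIWORD_UPDATE(v, h, l) \
	(SKETCH_HWORD_UPDATE_CONST(SKETCH_GENMASK((h), (l)), (v)))

/* A file-scope initializer needs a constant expression. */
static const uint32_t sketch_px30_clk_in_src_from_tx =
	HIWORD_UPDATE(1, 13, 12);
```

Because callers keep passing (v, h, l), none of the existing defines in
the header have to change, which keeps the diff to a single line.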

Signed-off-by: Nicolas Frattaroli 
---
 sound/soc/rockchip/rockchip_i2s_tdm.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sound/soc/rockchip/rockchip_i2s_tdm.h 
b/sound/soc/rockchip/rockchip_i2s_tdm.h
index 
0aa1c6da1e2c0ebb70473b1bcd1f6e0c1fb90df3..6efb76fbff9c158b79a87cdea02ef9db335cf700
 100644
--- a/sound/soc/rockchip/rockchip_i2s_tdm.h
+++ b/sound/soc/rockchip/rockchip_i2s_tdm.h
@@ -10,6 +10,8 @@
 #ifndef _ROCKCHIP_I2S_TDM_H
 #define _ROCKCHIP_I2S_TDM_H
 
+#include 
+
 /*
  * TXCR
  * transmit operation control register
@@ -285,7 +287,7 @@ enum {
 #define I2S_TDM_RXCR   (0x0034)
 #define I2S_CLKDIV (0x0038)
 
-#define HIWORD_UPDATE(v, h, l) (((v) << (l)) | (GENMASK((h), (l)) << 16))
+#define HIWORD_UPDATE(v, h, l) (HWORD_UPDATE_CONST(GENMASK((h), (l)), (v)))
 
 /* PX30 GRF CONFIGS */
 #define PX30_I2S0_CLK_IN_SRC_FROM_TX   HIWORD_UPDATE(1, 13, 12)

-- 
2.49.0

[PATCH 20/20] phy: rockchip-pcie: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

The Rockchip PCIe PHY driver, used on the RK3399, has its own definition
of HIWORD_UPDATE.

Remove it, and replace instances of it with bitfield.h's HWORD_UPDATE.
To achieve this, some mask defines are reshuffled, as HWORD_UPDATE uses
the mask as both the mask of bits to write and to derive the shift
amount from in order to shift the value.

In order to ensure that the mask is always a constant, the inst->index
shift is performed after the HWORD_UPDATE, as this is a runtime value.

From this, we gain compile-time error checking, and in my humble opinion
nicer code, as well as a single definition of this macro across the
entire codebase to aid in code comprehension.

Tested on a RK3399 ROCKPro64, where PCIe still works as expected when
accessing an NVMe drive.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/phy/rockchip/phy-rockchip-pcie.c | 72 ++--
 1 file changed, 21 insertions(+), 51 deletions(-)

diff --git a/drivers/phy/rockchip/phy-rockchip-pcie.c 
b/drivers/phy/rockchip/phy-rockchip-pcie.c
index 
bd44af36c67a5a504801275c1b0384d373fe7ec7..7c486ecb96ffe1589fa077d7d2b079e02f4f6769
 100644
--- a/drivers/phy/rockchip/phy-rockchip-pcie.c
+++ b/drivers/phy/rockchip/phy-rockchip-pcie.c
@@ -6,6 +6,7 @@
  * Copyright (C) 2016 ROCKCHIP, Inc.
  */
 
+#include <linux/bitfield.h>
 #include 
 #include 
 #include 
@@ -18,23 +19,14 @@
 #include 
 #include 
 
-/*
- * The higher 16-bit of this register is used for write protection
- * only if BIT(x + 16) set to 1 the BIT(x) can be written.
- */
-#define HIWORD_UPDATE(val, mask, shift) \
-   ((val) << (shift) | (mask) << ((shift) + 16))
 
 #define PHY_MAX_LANE_NUM  4
-#define PHY_CFG_DATA_SHIFT    7
-#define PHY_CFG_ADDR_SHIFT    1
-#define PHY_CFG_DATA_MASK 0xf
-#define PHY_CFG_ADDR_MASK 0x3f
-#define PHY_CFG_RD_MASK   0x3ff
+#define PHY_CFG_DATA_MASK GENMASK(10, 7)
+#define PHY_CFG_ADDR_MASK GENMASK(6, 1)
+#define PHY_CFG_RD_MASK   GENMASK(9, 0)
 #define PHY_CFG_WR_ENABLE 1
 #define PHY_CFG_WR_DISABLE    1
-#define PHY_CFG_WR_SHIFT  0
-#define PHY_CFG_WR_MASK   1
+#define PHY_CFG_WR_MASK   BIT(0)
 #define PHY_CFG_PLL_LOCK  0x10
 #define PHY_CFG_CLK_TEST  0x10
 #define PHY_CFG_CLK_SCC   0x12
@@ -49,11 +41,7 @@
 #define PHY_LANE_RX_DET_SHIFT 11
 #define PHY_LANE_RX_DET_TH    0x1
 #define PHY_LANE_IDLE_OFF 0x1
-#define PHY_LANE_IDLE_MASK    0x1
-#define PHY_LANE_IDLE_A_SHIFT 3
-#define PHY_LANE_IDLE_B_SHIFT 4
-#define PHY_LANE_IDLE_C_SHIFT 5
-#define PHY_LANE_IDLE_D_SHIFT 6
+#define PHY_LANE_IDLE_MASK    BIT(3)
 
 struct rockchip_pcie_data {
unsigned int pcie_conf;
@@ -100,22 +88,14 @@ static inline void phy_wr_cfg(struct rockchip_pcie_phy 
*rk_phy,
  u32 addr, u32 data)
 {
regmap_write(rk_phy->reg_base, rk_phy->phy_data->pcie_conf,
-HIWORD_UPDATE(data,
-  PHY_CFG_DATA_MASK,
-  PHY_CFG_DATA_SHIFT) |
-HIWORD_UPDATE(addr,
-  PHY_CFG_ADDR_MASK,
-  PHY_CFG_ADDR_SHIFT));
+HWORD_UPDATE(PHY_CFG_DATA_MASK, data) |
+HWORD_UPDATE(PHY_CFG_ADDR_MASK, addr));
udelay(1);
regmap_write(rk_phy->reg_base, rk_phy->phy_data->pcie_conf,
-HIWORD_UPDATE(PHY_CFG_WR_ENABLE,
-  PHY_CFG_WR_MASK,
-  PHY_CFG_WR_SHIFT));
+HWORD_UPDATE(PHY_CFG_WR_MASK, PHY_CFG_WR_ENABLE));
udelay(1);
regmap_write(rk_phy->reg_base, rk_phy->phy_data->pcie_conf,
-HIWORD_UPDATE(PHY_CFG_WR_DISABLE,
-  PHY_CFG_WR_MASK,
-  PHY_CFG_WR_SHIFT));
+HWORD_UPDATE(PHY_CFG_WR_MASK, PHY_CFG_WR_DISABLE));
 }
 
 static int rockchip_pcie_phy_power_off(struct phy *phy)
@@ -126,11 +106,9 @@ static int rockchip_pcie_phy_power_off(struct phy *phy)
 
guard(mutex)(&rk_phy->pcie_mutex);
 
-   regmap_write(rk_phy->reg_base,
-rk_phy->phy_data->pcie_laneoff,
-HIWORD_UPDATE(PHY_LANE_IDLE_OFF,
-  PHY_LANE_IDLE_MASK,
-  PHY_LANE_IDLE_A_SHIFT + inst->index));
+   regmap_write(rk_phy->reg_base, rk_phy->phy_data->pcie_laneoff,
+HWORD_UPDATE(PHY_LANE_IDLE_MASK,
+ PHY_LANE_IDLE_OFF) << inst->index);
 
if (--rk_phy->pwr_cnt) {
return 0;
@@ -140,11 +118,9 @@ static int rockchip_pcie_phy_power_off(struct phy *phy)
if (err) {
dev_err(&phy->dev, "assert phy_rst err %d\n", err);
rk_phy->pwr_cnt++;
-   regmap_write(rk_phy->reg_base,
-

[PATCH 07/20] drm/rockchip: dsi: switch to HWORD_UPDATE* macros

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Remove this driver's HIWORD_UPDATE macro, and replace instances of it
with either HWORD_UPDATE or HWORD_UPDATE_CONST, depending on whether
they're in an initializer. This gives us better error checking, which
already saved me some trouble during this refactor.

The driver's HIWORD_UPDATE macro doesn't shift up the value, but expects
a pre-shifted value. Meanwhile, HWORD_UPDATE and HWORD_UPDATE_CONST will
shift the value for us, based on the given mask. So a few things that
used to be a HIWORD_UPDATE(VERY_LONG_FOO, VERY_LONG_FOO) are now a
somewhat more pleasant HWORD_UPDATE(VERY_LONG_FOO, 1).

There are some non-trivial refactors here. A few literals needed a U
suffix added to stop them from unintentionally overflowing as a signed
long. To make sure all of these cases are caught, and not just the ones
where the HWORD_UPDATE* macros use such a value as a mask, just mark
every literal that's used as a mask as unsigned.

Non-contiguous masks also have to be split into multiple HWORD_UPDATE*
instances, as the macro's checks and shifting logic rely on contiguous
masks.
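
As an illustration of the contiguity requirement, the following
user-space sketch tests whether a mask is contiguous and shows a
two-bit, non-contiguous update split into two single-bit updates. All
sketch_* names are hypothetical helpers for this example, not kernel
APIs, and the bit positions (6 and 0) are chosen only for illustration.
All arithmetic is done on uint32_t, so the shift by 16 cannot overflow
a signed type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old driver macro, as removed by this patch (value is pre-shifted). */
#define HIWORD_UPDATE(val, mask)	(val | (mask) << 16)

/* Lowest set bit of a mask (0 for an empty mask). */
static inline uint32_t sketch_low_bit(uint32_t mask)
{
	return mask & (~mask + 1u);
}

/*
 * A non-zero mask is contiguous when adding its lowest set bit carries
 * through the whole run of ones, leaving no bit in common with the mask.
 */
bool sketch_mask_is_contiguous(uint32_t mask)
{
	return mask != 0 && ((mask + sketch_low_bit(mask)) & mask) == 0;
}

/* Hypothetical stand-in: shift val to the mask, set write-enable bits. */
uint32_t sketch_hword_update(uint32_t mask, uint32_t val)
{
	return ((val * sketch_low_bit(mask)) & mask) | (mask << 16);
}

/* Set two separate bits by OR-ing two single-bit updates. */
uint32_t sketch_set_bits_split(uint32_t bit_a, uint32_t bit_b)
{
	return sketch_hword_update(bit_a, 1) | sketch_hword_update(bit_b, 1);
}
```

With bits 6 and 0, the split form yields the same word as the old
HIWORD_UPDATE(0x41, 0x41), even though 0x41 itself is not a contiguous
mask and could not be passed as a single mask.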

This is compile-tested only.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 142 
 1 file changed, 68 insertions(+), 74 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c 
b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
index 
3398160ad75e4a9629082bc47491eab473caecc0..930bd412904cb244ca0d14e89f5b5d2af3e570ba
 100644
--- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
@@ -6,6 +6,7 @@
  *  Nickey Yang 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -148,7 +149,7 @@
 #define DW_MIPI_NEEDS_GRF_CLK  BIT(1)
 
 #define PX30_GRF_PD_VO_CON1   0x0438
-#define PX30_DSI_FORCETXSTOPMODE   (0xf << 7)
+#define PX30_DSI_FORCETXSTOPMODE   (0xfU << 7)
 #define PX30_DSI_FORCERXMODE   BIT(6)
 #define PX30_DSI_TURNDISABLE   BIT(5)
 #define PX30_DSI_LCDC_SEL  BIT(0)
@@ -167,16 +168,16 @@
 #define RK3399_DSI1_LCDC_SEL   BIT(4)
 
 #define RK3399_GRF_SOC_CON22   0x6258
-#define RK3399_DSI0_TURNREQUEST        (0xf << 12)
-#define RK3399_DSI0_TURNDISABLE        (0xf << 8)
-#define RK3399_DSI0_FORCETXSTOPMODE    (0xf << 4)
-#define RK3399_DSI0_FORCERXMODE        (0xf << 0)
+#define RK3399_DSI0_TURNREQUEST        (0xfU << 12)
+#define RK3399_DSI0_TURNDISABLE        (0xfU << 8)
+#define RK3399_DSI0_FORCETXSTOPMODE    (0xfU << 4)
+#define RK3399_DSI0_FORCERXMODE        (0xfU << 0)
 
 #define RK3399_GRF_SOC_CON23   0x625c
-#define RK3399_DSI1_TURNDISABLE        (0xf << 12)
-#define RK3399_DSI1_FORCETXSTOPMODE    (0xf << 8)
-#define RK3399_DSI1_FORCERXMODE        (0xf << 4)
-#define RK3399_DSI1_ENABLE             (0xf << 0)
+#define RK3399_DSI1_TURNDISABLE        (0xfU << 12)
+#define RK3399_DSI1_FORCETXSTOPMODE    (0xfU << 8)
+#define RK3399_DSI1_FORCERXMODE        (0xfU << 4)
+#define RK3399_DSI1_ENABLE             (0xfU << 0)
 
 #define RK3399_GRF_SOC_CON24   0x6260
 #define RK3399_TXRX_MASTERSLAVEZ   BIT(7)
@@ -186,8 +187,8 @@
 #define RK3399_TXRX_TURNREQUEST   GENMASK(3, 0)
 
 #define RK3568_GRF_VO_CON2 0x0368
-#define RK3568_DSI0_SKEWCALHS          (0x1f << 11)
-#define RK3568_DSI0_FORCETXSTOPMODE    (0xf << 4)
+#define RK3568_DSI0_SKEWCALHS          (0x1fU << 11)
+#define RK3568_DSI0_FORCETXSTOPMODE    (0xfU << 4)
 #define RK3568_DSI0_TURNDISABLE        BIT(2)
 #define RK3568_DSI0_FORCERXMODE        BIT(0)
 
@@ -197,18 +198,16 @@
  * come from. Name GRF_VO_CON3 is assumed.
  */
 #define RK3568_GRF_VO_CON3 0x36c
-#define RK3568_DSI1_SKEWCALHS          (0x1f << 11)
-#define RK3568_DSI1_FORCETXSTOPMODE    (0xf << 4)
+#define RK3568_DSI1_SKEWCALHS          (0x1fU << 11)
+#define RK3568_DSI1_FORCETXSTOPMODE    (0xfU << 4)
 #define RK3568_DSI1_TURNDISABLE        BIT(2)
 #define RK3568_DSI1_FORCERXMODE        BIT(0)
 
 #define RV1126_GRF_DSIPHY_CON  0x10220
-#define RV1126_DSI_FORCETXSTOPMODE (0xf << 4)
+#define RV1126_DSI_FORCETXSTOPMODE (0xfU << 4)
 #define RV1126_DSI_TURNDISABLE BIT(2)
 #define RV1126_DSI_FORCERXMODE BIT(0)
 
-#define HIWORD_UPDATE(val, mask)   (val | (mask) << 16)
-
 enum {
DW_DSI_USAGE_IDLE,
DW_DSI_USAGE_DSI,
@@ -1484,14 +1483,13 @@ static const struct rockchip_dw_dsi_chip_data 
px30_chip_data[] = {
{
.reg = 0xff45,
.lcdsel_grf_reg = PX30_GRF_PD_VO_CON1,
-   .lcdsel_big = HIWORD_UPDATE(0, PX30_DSI_LCDC_SEL),
-   .lcdsel_lit = HIWORD_UPDATE(PX30_DSI_LCDC_SEL,
-   PX30_DSI_LCDC_SEL),
+   .lcdsel

[PATCH 08/20] drm/rockchip: vop2: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Remove VOP2's HIWORD_UPDATE macro from the vop2 header file, and replace
all instances in rockchip_vop2_reg.c (the only user of this particular
HIWORD_UPDATE definition) with equivalent HWORD_UPDATE instances. This
gives us better error checking.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/gpu/drm/rockchip/rockchip_drm_vop2.h |  1 -
 drivers/gpu/drm/rockchip/rockchip_vop2_reg.c | 14 --
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.h 
b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.h
index 
fc3ecb9fcd9576d20c0fdfa8df469dfbff6605da..757232de41f609917aca679c17623c80879f3593
 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.h
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.h
@@ -33,7 +33,6 @@
 #define WIN_FEATURE_AFBDC  BIT(0)
 #define WIN_FEATURE_CLUSTER    BIT(1)
 
-#define HIWORD_UPDATE(v, h, l)  ((GENMASK(h, l) << 16) | ((v) << (l)))
 /*
  *  the delay number of a window in different mode.
  */
diff --git a/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c 
b/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c
index 
32c4ed6857395a953bef8cd800b510fbdf7d9cec..ff1f3eabd1bc2cdb0b7b2aac2ca55ac9b7989d71
 100644
--- a/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c
+++ b/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c
@@ -1695,8 +1695,9 @@ static unsigned long rk3588_set_intf_mux(struct 
vop2_video_port *vp, int id, u32
die |= RK3588_SYS_DSP_INFACE_EN_HDMI0 |
FIELD_PREP(RK3588_SYS_DSP_INFACE_EN_EDP_HDMI0_MUX, 
vp->id);
val = rk3588_get_hdmi_pol(polflags);
-   regmap_write(vop2->vop_grf, RK3588_GRF_VOP_CON2, 
HIWORD_UPDATE(1, 1, 1));
-   regmap_write(vop2->vo1_grf, RK3588_GRF_VO1_CON0, 
HIWORD_UPDATE(val, 6, 5));
+   regmap_write(vop2->vop_grf, RK3588_GRF_VOP_CON2, 
HWORD_UPDATE(BIT(1), 1));
+   regmap_write(vop2->vo1_grf, RK3588_GRF_VO1_CON0,
+HWORD_UPDATE(GENMASK(6, 5), val));
break;
case ROCKCHIP_VOP2_EP_HDMI1:
div &= ~RK3588_DSP_IF_EDP_HDMI1_DCLK_DIV;
@@ -1707,8 +1708,9 @@ static unsigned long rk3588_set_intf_mux(struct 
vop2_video_port *vp, int id, u32
die |= RK3588_SYS_DSP_INFACE_EN_HDMI1 |
FIELD_PREP(RK3588_SYS_DSP_INFACE_EN_EDP_HDMI1_MUX, 
vp->id);
val = rk3588_get_hdmi_pol(polflags);
-   regmap_write(vop2->vop_grf, RK3588_GRF_VOP_CON2, 
HIWORD_UPDATE(1, 4, 4));
-   regmap_write(vop2->vo1_grf, RK3588_GRF_VO1_CON0, 
HIWORD_UPDATE(val, 8, 7));
+   regmap_write(vop2->vop_grf, RK3588_GRF_VOP_CON2, 
HWORD_UPDATE(BIT(4), 1));
+   regmap_write(vop2->vo1_grf, RK3588_GRF_VO1_CON0,
+HWORD_UPDATE(GENMASK(8, 7), val));
break;
case ROCKCHIP_VOP2_EP_EDP0:
div &= ~RK3588_DSP_IF_EDP_HDMI0_DCLK_DIV;
@@ -1718,7 +1720,7 @@ static unsigned long rk3588_set_intf_mux(struct 
vop2_video_port *vp, int id, u32
die &= ~RK3588_SYS_DSP_INFACE_EN_EDP_HDMI0_MUX;
die |= RK3588_SYS_DSP_INFACE_EN_EDP0 |
   FIELD_PREP(RK3588_SYS_DSP_INFACE_EN_EDP_HDMI0_MUX, 
vp->id);
-   regmap_write(vop2->vop_grf, RK3588_GRF_VOP_CON2, 
HIWORD_UPDATE(1, 0, 0));
+   regmap_write(vop2->vop_grf, RK3588_GRF_VOP_CON2, 
HWORD_UPDATE(BIT(0), 1));
break;
case ROCKCHIP_VOP2_EP_EDP1:
div &= ~RK3588_DSP_IF_EDP_HDMI1_DCLK_DIV;
@@ -1728,7 +1730,7 @@ static unsigned long rk3588_set_intf_mux(struct 
vop2_video_port *vp, int id, u32
die &= ~RK3588_SYS_DSP_INFACE_EN_EDP_HDMI1_MUX;
die |= RK3588_SYS_DSP_INFACE_EN_EDP1 |
   FIELD_PREP(RK3588_SYS_DSP_INFACE_EN_EDP_HDMI1_MUX, 
vp->id);
-   regmap_write(vop2->vop_grf, RK3588_GRF_VOP_CON2, 
HIWORD_UPDATE(1, 3, 3));
+   regmap_write(vop2->vop_grf, RK3588_GRF_VOP_CON2, 
HWORD_UPDATE(BIT(3), 1));
break;
case ROCKCHIP_VOP2_EP_MIPI0:
div &= ~RK3588_DSP_IF_MIPI0_PCLK_DIV;

-- 
2.49.0

[PATCH 05/20] drm/rockchip: lvds: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Remove rockchip_lvds.h's own HIWORD_UPDATE macro, and replace all
instances of it with bitfield.h's HWORD_UPDATE macro, which gives us
more error checking.

For the slightly-less-trivial case of the 2-bit width instance, the
results were checked during development to match all possible input
values (0 to 3, inclusive).

Signed-off-by: Nicolas Frattaroli 
---
 drivers/gpu/drm/rockchip/rockchip_lvds.h | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_lvds.h 
b/drivers/gpu/drm/rockchip/rockchip_lvds.h
index 
ca83d7b6bea733588849d3ff379cf8540405462b..568fe8d7918586581a461493d57d7b95f4c9eebc
 100644
--- a/drivers/gpu/drm/rockchip/rockchip_lvds.h
+++ b/drivers/gpu/drm/rockchip/rockchip_lvds.h
@@ -9,6 +9,9 @@
 #ifndef _ROCKCHIP_LVDS_
 #define _ROCKCHIP_LVDS_
 
+#include 
+#include 
+
 #define RK3288_LVDS_CH0_REG0   0x00
 #define RK3288_LVDS_CH0_REG0_LVDS_EN   BIT(7)
 #define RK3288_LVDS_CH0_REG0_TTL_EN    BIT(6)
@@ -106,18 +109,16 @@
 #define LVDS_VESA_18   2
 #define LVDS_JEIDA_18  3
 
-#define HIWORD_UPDATE(v, h, l)  ((GENMASK(h, l) << 16) | ((v) << (l)))
-
 #define PX30_LVDS_GRF_PD_VO_CON0   0x434
-#define   PX30_LVDS_TIE_CLKS(val)  HIWORD_UPDATE(val,  8,  8)
-#define   PX30_LVDS_INVERT_CLKS(val)   HIWORD_UPDATE(val,  9,  9)
-#define   PX30_LVDS_INVERT_DCLK(val)   HIWORD_UPDATE(val,  5,  5)
+#define   PX30_LVDS_TIE_CLKS(val)  HWORD_UPDATE(BIT(8), (val))
+#define   PX30_LVDS_INVERT_CLKS(val)   HWORD_UPDATE(BIT(9), (val))
+#define   PX30_LVDS_INVERT_DCLK(val)   HWORD_UPDATE(BIT(5), (val))
 
 #define PX30_LVDS_GRF_PD_VO_CON1   0x438
-#define   PX30_LVDS_FORMAT(val)    HIWORD_UPDATE(val, 14, 13)
-#define   PX30_LVDS_MODE_EN(val)   HIWORD_UPDATE(val, 12, 12)
-#define   PX30_LVDS_MSBSEL(val)    HIWORD_UPDATE(val, 11, 11)
-#define   PX30_LVDS_P2S_EN(val)    HIWORD_UPDATE(val,  6,  6)
-#define   PX30_LVDS_VOP_SEL(val)   HIWORD_UPDATE(val,  1,  1)
+#define   PX30_LVDS_FORMAT(val)    HWORD_UPDATE(GENMASK(14, 13), (val))
+#define   PX30_LVDS_MODE_EN(val)   HWORD_UPDATE(BIT(12), (val))
+#define   PX30_LVDS_MSBSEL(val)    HWORD_UPDATE(BIT(11), (val))
+#define   PX30_LVDS_P2S_EN(val)    HWORD_UPDATE(BIT(6), (val))
+#define   PX30_LVDS_VOP_SEL(val)   HWORD_UPDATE(BIT(1), (val))
 
 #endif /* _ROCKCHIP_LVDS_ */

-- 
2.49.0

[PATCH 06/20] phy: rockchip-emmc: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Replace the implementation of the rockchip eMMC PHY driver's
HIWORD_UPDATE macro with bitfield.h's HWORD_UPDATE. This makes the
change more easily reviewable.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/phy/rockchip/phy-rockchip-emmc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/phy/rockchip/phy-rockchip-emmc.c 
b/drivers/phy/rockchip/phy-rockchip-emmc.c
index 
20023f6eb9944eaab505101d57e806476ecfac71..42423d4cd1811cd039a701895758050483d1c959
 100644
--- a/drivers/phy/rockchip/phy-rockchip-emmc.c
+++ b/drivers/phy/rockchip/phy-rockchip-emmc.c
@@ -6,6 +6,7 @@
  * Copyright (C) 2016 ROCKCHIP, Inc.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -21,7 +22,7 @@
  * only if BIT(x + 16) set to 1 the BIT(x) can be written.
  */
 #define HIWORD_UPDATE(val, mask, shift) \
-   ((val) << (shift) | (mask) << ((shift) + 16))
+   (HWORD_UPDATE((mask) << (shift), (val)))
 
 /* Register definition */
 #define GRF_EMMCPHY_CON0   0x0

-- 
2.49.0

[PATCH 12/20] phy: rockchip-usb: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Remove this driver's HIWORD_UPDATE macro, and replace all instances of
it with (hopefully) equivalent HWORD_UPDATE instances. To do this, a few
of the defines are being adjusted, as HWORD_UPDATE shifts up the value
for us. This gets rid of the icky update(mask, mask) shenanigans.

The benefit of using HWORD_UPDATE is that it does more checking of the
input, hopefully catching errors. In practice, a shared definition makes
code more readable than several different flavours of the same macro,
and the shifted value helps as well.

I do not have the hardware that uses this particular driver, so it's
compile-tested only as far as my own testing goes.
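
As a sanity check of the "(hopefully) equivalent" claim for the UOC_CON3 write, here is a small userspace sketch. `model_hword_update()` is a hypothetical model written for this note, not the bitfield.h macro; it assumes HWORD_UPDATE(mask, val) shifts `val` up to the lowest set bit of `mask` and sets `mask` in the upper 16 bits. The field names are shortened and spelled as plain constants so the block builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of HWORD_UPDATE(mask, val); NOT the
 * bitfield.h implementation. Assumes a non-zero mask within bits 15:0.
 */
static inline uint32_t model_hword_update(uint32_t mask, uint32_t val)
{
	unsigned int shift = 0;

	/* Find the position of the lowest set bit of the mask. */
	while (!((mask >> shift) & 1u))
		shift++;

	/* Value in the low half, write-enable bits in the high half. */
	return ((val << shift) & mask) | (mask << 16);
}

/* The macro this patch removes from the driver. */
#define OLD_HIWORD_UPDATE(val, mask)	((val) | (mask) << 16)

/* UOC_CON3 fields as plain constants (shortened, illustrative names). */
#define SUSPENDN		(1u << 0)
#define OPMODE_MASK		(3u << 1)	/* GENMASK(2, 1) */
#define XCVRSELECT_MASK		(3u << 3)	/* GENMASK(4, 3) */
#define TERMSEL_FULLSPEED	(1u << 5)

static inline void check_uoc_con3(void)
{
	/* Old style: pre-shifted values and one combined mask. */
	uint32_t old_val = OLD_HIWORD_UPDATE(
		(1u << 1) | (1u << 3) | TERMSEL_FULLSPEED,
		SUSPENDN | OPMODE_MASK | XCVRSELECT_MASK | TERMSEL_FULLSPEED);

	/* New style: one update per field, with unshifted field values. */
	uint32_t new_val = model_hword_update(SUSPENDN, 0) |
			   model_hword_update(OPMODE_MASK, 1) |
			   model_hword_update(XCVRSELECT_MASK, 1) |
			   model_hword_update(TERMSEL_FULLSPEED, 1);

	assert(old_val == new_val);
	assert(new_val == 0x003f002au);
}
```

Under that model, ORing the per-field results gives the same register word as the old combined call, which is why the value defines can drop their shifts.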

Signed-off-by: Nicolas Frattaroli 
---
 drivers/phy/rockchip/phy-rockchip-usb.c | 51 +
 1 file changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/phy/rockchip/phy-rockchip-usb.c b/drivers/phy/rockchip/phy-rockchip-usb.c
index 666a896c8f0a08443228914a039b95974e15ba58..23c9885ec717ffeb6e4589dd0851b0307366738c 100644
--- a/drivers/phy/rockchip/phy-rockchip-usb.c
+++ b/drivers/phy/rockchip/phy-rockchip-usb.c
@@ -6,6 +6,7 @@
  * Copyright (C) 2014 ROCKCHIP, Inc.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -24,9 +25,6 @@
 
 static int enable_usb_uart;
 
-#define HIWORD_UPDATE(val, mask) \
-   ((val) | (mask) << 16)
-
 #define UOC_CON0   0x00
 #define UOC_CON0_SIDDQ BIT(13)
 #define UOC_CON0_DISABLE   BIT(4)
@@ -38,10 +36,10 @@ static int enable_usb_uart;
 #define UOC_CON3   0x0c
 /* bits present on rk3188 and rk3288 phys */
 #define UOC_CON3_UTMI_TERMSEL_FULLSPEED    BIT(5)
-#define UOC_CON3_UTMI_XCVRSEELCT_FSTRANSC  (1 << 3)
-#define UOC_CON3_UTMI_XCVRSEELCT_MASK  (3 << 3)
-#define UOC_CON3_UTMI_OPMODE_NODRIVING (1 << 1)
-#define UOC_CON3_UTMI_OPMODE_MASK  (3 << 1)
+#define UOC_CON3_UTMI_XCVRSEELCT_FSTRANSC  1U
+#define UOC_CON3_UTMI_XCVRSEELCT_MASK  GENMASK(4, 3)
+#define UOC_CON3_UTMI_OPMODE_NODRIVING 1U
+#define UOC_CON3_UTMI_OPMODE_MASK  GENMASK(2, 1)
 #define UOC_CON3_UTMI_SUSPENDN BIT(0)
 
 struct rockchip_usb_phys {
@@ -79,7 +77,7 @@ struct rockchip_usb_phy {
 static int rockchip_usb_phy_power(struct rockchip_usb_phy *phy,
   bool siddq)
 {
-   u32 val = HIWORD_UPDATE(siddq ? UOC_CON0_SIDDQ : 0, UOC_CON0_SIDDQ);
+   u32 val = HWORD_UPDATE(UOC_CON0_SIDDQ, siddq);
 
return regmap_write(phy->base->reg_base, phy->reg_offset, val);
 }
@@ -332,29 +330,24 @@ static int __init rockchip_init_usb_uart_common(struct regmap *grf,
 * but were not present in the original code.
 * Also disable the analog phy components to save power.
 */
-   val = HIWORD_UPDATE(UOC_CON0_COMMON_ON_N
-   | UOC_CON0_DISABLE
-   | UOC_CON0_SIDDQ,
-   UOC_CON0_COMMON_ON_N
-   | UOC_CON0_DISABLE
-   | UOC_CON0_SIDDQ);
+   val = HWORD_UPDATE(UOC_CON0_COMMON_ON_N, 1) |
+ HWORD_UPDATE(UOC_CON0_DISABLE, 1) |
+ HWORD_UPDATE(UOC_CON0_SIDDQ, 1);
ret = regmap_write(grf, regoffs + UOC_CON0, val);
if (ret)
return ret;
 
-   val = HIWORD_UPDATE(UOC_CON2_SOFT_CON_SEL,
-   UOC_CON2_SOFT_CON_SEL);
+   val = HWORD_UPDATE(UOC_CON2_SOFT_CON_SEL, 1);
ret = regmap_write(grf, regoffs + UOC_CON2, val);
if (ret)
return ret;
 
-   val = HIWORD_UPDATE(UOC_CON3_UTMI_OPMODE_NODRIVING
-   | UOC_CON3_UTMI_XCVRSEELCT_FSTRANSC
-   | UOC_CON3_UTMI_TERMSEL_FULLSPEED,
-   UOC_CON3_UTMI_SUSPENDN
-   | UOC_CON3_UTMI_OPMODE_MASK
-   | UOC_CON3_UTMI_XCVRSEELCT_MASK
-   | UOC_CON3_UTMI_TERMSEL_FULLSPEED);
+   val = HWORD_UPDATE(UOC_CON3_UTMI_SUSPENDN, 0) |
+ HWORD_UPDATE(UOC_CON3_UTMI_OPMODE_MASK,
+  UOC_CON3_UTMI_OPMODE_NODRIVING) |
+ HWORD_UPDATE(UOC_CON3_UTMI_XCVRSEELCT_MASK,
+  UOC_CON3_UTMI_XCVRSEELCT_FSTRANSC) |
+ HWORD_UPDATE(UOC_CON3_UTMI_TERMSEL_FULLSPEED, 1);
ret = regmap_write(grf, UOC_CON3, val);
if (ret)
return ret;
@@ -380,10 +373,8 @@ static int __init rk3188_init_usb_uart(struct regmap *grf,
if (ret)
return ret;
 
-   val = HIWORD_UPDATE(RK3188_UOC0_CON0_BYPASSSEL
-   | RK3

[PATCH 10/20] drm/rockchip: dw_hdmi_qp: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

Replace this driver's HIWORD_UPDATE with the HWORD_UPDATE from
bitfield.h. While at it, disambiguate the GRF write to SOC_CON7 by
splitting the definition into the individual bitflags. This is done
because HWORD_UPDATE shifts the value for us according to the mask, so
writing the mask to itself to enable two bits is no longer something
that can be done. It should also not be done, because it hides the true
meaning of those two individual bit flags.

HDMI output with this patch has been tested on both RK3588 and RK3576;
on the former, with both of its HDMI connectors.
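
To illustrate why the combined SOC_CON7 mask had to be split, here is a minimal userspace sketch. `model_hword_update()` is a hypothetical model written for this note, not the bitfield.h macro; it assumes HWORD_UPDATE(mask, val) shifts `val` to the lowest set bit of `mask` and sets `mask` in the upper 16 bits. The constants are shortened stand-ins for the RK3588 defines touched by this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of HWORD_UPDATE(mask, val); NOT the
 * bitfield.h implementation. Assumes a non-zero mask within bits 15:0.
 */
static inline uint32_t model_hword_update(uint32_t mask, uint32_t val)
{
	unsigned int shift = 0;

	/* Find the position of the lowest set bit of the mask. */
	while (!((mask >> shift) & 1u))
		shift++;

	/* Value in the low half, write-enable bits in the high half. */
	return ((val << shift) & mask) | (mask << 16);
}

#define SET_HPD_PATH_MASK	(3u << 12)	/* old define, GENMASK(13, 12) */
#define HPD_HDMI0_IO_EN_MASK	(1u << 12)	/* new define, BIT(12) */
#define HPD_HDMI1_IO_EN_MASK	(1u << 13)	/* new define, BIT(13) */

static inline void check_hpd_path(void)
{
	/* Old idiom: write the mask to itself to set both bits. */
	uint32_t old_val = SET_HPD_PATH_MASK | (SET_HPD_PATH_MASK << 16);

	/* New idiom: one update per flag, each with the value 1. */
	uint32_t new_val = model_hword_update(HPD_HDMI0_IO_EN_MASK, 1) |
			   model_hword_update(HPD_HDMI1_IO_EN_MASK, 1);

	assert(old_val == 0x30003000u);
	assert(new_val == old_val);

	/*
	 * Passing the mask as the value no longer works: the model shifts
	 * it out of the field, leaving only the write-enable bits set.
	 */
	assert(model_hword_update(SET_HPD_PATH_MASK, SET_HPD_PATH_MASK) ==
	       0x30000000u);
}
```

The model silently masks the out-of-range value in the last assertion; the real macro may instead reject it at build time, which is not modelled here.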

Signed-off-by: Nicolas Frattaroli 
---
 drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c | 68 +-
 1 file changed, 33 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c b/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c
index 7d531b6f4c098c6c548788dad487ce4613a2f32b..0431913c2f71893638d1824d52836cc095e04551 100644
--- a/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c
@@ -7,6 +7,7 @@
  * Author: Cristian Ciocaltea 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -66,7 +67,8 @@
 #define RK3588_HDMI1_HPD_INT_MSK   BIT(15)
 #define RK3588_HDMI1_HPD_INT_CLR   BIT(14)
 #define RK3588_GRF_SOC_CON7            0x031c
-#define RK3588_SET_HPD_PATH_MASK   GENMASK(13, 12)
+#define RK3588_HPD_HDMI0_IO_EN_MASK    BIT(12)
+#define RK3588_HPD_HDMI1_IO_EN_MASK    BIT(13)
 #define RK3588_GRF_SOC_STATUS1 0x0384
 #define RK3588_HDMI0_LEVEL_INT BIT(16)
 #define RK3588_HDMI1_LEVEL_INT BIT(24)
@@ -80,7 +82,6 @@
 #define RK3588_HDMI0_GRANT_SEL BIT(10)
 #define RK3588_HDMI1_GRANT_SEL BIT(12)
 
-#define HIWORD_UPDATE(val, mask)   ((val) | (mask) << 16)
 #define HOTPLUG_DEBOUNCE_MS    150
 #define MAX_HDMI_PORT_NUM  2
 
@@ -185,11 +186,11 @@ static void dw_hdmi_qp_rk3588_setup_hpd(struct dw_hdmi_qp *dw_hdmi, void *data)
u32 val;
 
if (hdmi->port_id)
-   val = HIWORD_UPDATE(RK3588_HDMI1_HPD_INT_CLR,
-   RK3588_HDMI1_HPD_INT_CLR | RK3588_HDMI1_HPD_INT_MSK);
+   val = (HWORD_UPDATE(RK3588_HDMI1_HPD_INT_CLR, 1) |
+  HWORD_UPDATE(RK3588_HDMI1_HPD_INT_MSK, 0));
else
-   val = HIWORD_UPDATE(RK3588_HDMI0_HPD_INT_CLR,
-   RK3588_HDMI0_HPD_INT_CLR | RK3588_HDMI0_HPD_INT_MSK);
+   val = (HWORD_UPDATE(RK3588_HDMI0_HPD_INT_CLR, 1) |
+  HWORD_UPDATE(RK3588_HDMI0_HPD_INT_MSK, 0));
 
regmap_write(hdmi->regmap, RK3588_GRF_SOC_CON2, val);
 }
@@ -218,8 +219,8 @@ static void dw_hdmi_qp_rk3576_setup_hpd(struct dw_hdmi_qp *dw_hdmi, void *data)
struct rockchip_hdmi_qp *hdmi = (struct rockchip_hdmi_qp *)data;
u32 val;
 
-   val = HIWORD_UPDATE(RK3576_HDMI_HPD_INT_CLR,
-   RK3576_HDMI_HPD_INT_CLR | RK3576_HDMI_HPD_INT_MSK);
+   val = (HWORD_UPDATE(RK3576_HDMI_HPD_INT_CLR, 1) |
+  HWORD_UPDATE(RK3576_HDMI_HPD_INT_MSK, 0));
 
regmap_write(hdmi->regmap, RK3576_IOC_MISC_CON0, val);
regmap_write(hdmi->regmap, 0xa404, 0x0102);
@@ -254,7 +255,7 @@ static irqreturn_t dw_hdmi_qp_rk3576_hardirq(int irq, void *dev_id)
 
regmap_read(hdmi->regmap, RK3576_IOC_HDMI_HPD_STATUS, &intr_stat);
if (intr_stat) {
-   val = HIWORD_UPDATE(RK3576_HDMI_HPD_INT_MSK, RK3576_HDMI_HPD_INT_MSK);
+   val = HWORD_UPDATE(RK3576_HDMI_HPD_INT_MSK, 1);
 
regmap_write(hdmi->regmap, RK3576_IOC_MISC_CON0, val);
return IRQ_WAKE_THREAD;
@@ -273,12 +274,12 @@ static irqreturn_t dw_hdmi_qp_rk3576_irq(int irq, void *dev_id)
if (!intr_stat)
return IRQ_NONE;
 
-   val = HIWORD_UPDATE(RK3576_HDMI_HPD_INT_CLR, RK3576_HDMI_HPD_INT_CLR);
+   val = HWORD_UPDATE(RK3576_HDMI_HPD_INT_CLR, 1);
regmap_write(hdmi->regmap, RK3576_IOC_MISC_CON0, val);
mod_delayed_work(system_wq, &hdmi->hpd_work,
 msecs_to_jiffies(HOTPLUG_DEBOUNCE_MS));
 
-   val = HIWORD_UPDATE(0, RK3576_HDMI_HPD_INT_MSK);
+   val = HWORD_UPDATE(RK3576_HDMI_HPD_INT_MSK, 0);
regmap_write(hdmi->regmap, RK3576_IOC_MISC_CON0, val);
 
return IRQ_HANDLED;
@@ -293,11 +294,9 @@ static irqreturn_t dw_hdmi_qp_rk3588_hardirq(int irq, void *dev_id)
 
if (intr_stat) {
if (hdmi->port_id)
-   val = HIWORD_UPDATE(RK3588_HDMI1_HPD_INT_MSK,
-   RK3588_HDMI1_HPD_INT_MSK);
+   val = HWORD_UPDATE(RK3588_HDMI1_HPD_INT_MSK, 1);
else
-   val = HIWORD_UPDATE(RK3588_HDMI0_HPD_INT_MSK,
-   RK3588_HDMI

[PATCH 11/20] drm/rockchip: inno-hdmi: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

The inno-hdmi driver's own HIWORD_UPDATE macro is instantiated only
twice. Remove it, and replace its uses with HWORD_UPDATE. Since
HWORD_UPDATE shifts the value for us, we replace using the mask as the
value by simply using 1 instead.

With the new HWORD_UPDATE macro, we gain better error checking and a
central shared definition.

This has been compile-tested only as I lack hardware this old, but the
change is trivial enough that I am fairly certain it's equivalent.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/gpu/drm/rockchip/inno_hdmi.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/inno_hdmi.c b/drivers/gpu/drm/rockchip/inno_hdmi.c
index db4b4038e51d5a963f9ddad568282485ed355040..ab6b1d91127885afe0f5e0feb265d6b7b02d88a7 100644
--- a/drivers/gpu/drm/rockchip/inno_hdmi.c
+++ b/drivers/gpu/drm/rockchip/inno_hdmi.c
@@ -6,6 +6,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -31,8 +32,6 @@
 
 #include "inno_hdmi.h"
 
-#define HIWORD_UPDATE(val, mask)   ((val) | (mask) << 16)
-
 #define INNO_HDMI_MIN_TMDS_CLOCK  2500U
 
 #define RK3036_GRF_SOC_CON2    0x148
@@ -392,10 +391,10 @@ static int inno_hdmi_config_video_timing(struct inno_hdmi *hdmi,
int value, psync;
 
if (hdmi->variant->dev_type == RK3036_HDMI) {
-   psync = mode->flags & DRM_MODE_FLAG_PHSYNC ? RK3036_HDMI_PHSYNC : 0;
-   value = HIWORD_UPDATE(psync, RK3036_HDMI_PHSYNC);
-   psync = mode->flags & DRM_MODE_FLAG_PVSYNC ? RK3036_HDMI_PVSYNC : 0;
-   value |= HIWORD_UPDATE(psync, RK3036_HDMI_PVSYNC);
+   psync = mode->flags & DRM_MODE_FLAG_PHSYNC ? 1 : 0;
+   value = HWORD_UPDATE(RK3036_HDMI_PHSYNC, psync);
+   psync = mode->flags & DRM_MODE_FLAG_PVSYNC ? 1 : 0;
+   value |= HWORD_UPDATE(RK3036_HDMI_PVSYNC, psync);
regmap_write(hdmi->grf, RK3036_GRF_SOC_CON2, value);
}
 

-- 
2.49.0

[PATCH 09/20] phy: rockchip-samsung-dcphy: switch to HWORD_UPDATE macro

2025-06-12 Thread Nicolas Frattaroli

The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.

phy-rockchip-samsung-dcphy is actually an exemplary case, where the
similarities to FIELD_PREP were spotted and the driver local macro has
the same semantics as the new HWORD_UPDATE bitfield.h macro.

Still, get rid of FIELD_PREP_HIWORD now that a shared implementation
exists, replacing the two instances of it with HWORD_UPDATE. This gives
us slightly better error checking; the value is now checked to fit in 16
bits.

Signed-off-by: Nicolas Frattaroli 
---
 drivers/phy/rockchip/phy-rockchip-samsung-dcphy.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/phy/rockchip/phy-rockchip-samsung-dcphy.c b/drivers/phy/rockchip/phy-rockchip-samsung-dcphy.c
index 28a052e17366516d5a99988bec9a52e3f0f09101..71e88635c95371fcc6f0f7227954e1f34dd97fc6 100644
--- a/drivers/phy/rockchip/phy-rockchip-samsung-dcphy.c
+++ b/drivers/phy/rockchip/phy-rockchip-samsung-dcphy.c
@@ -20,12 +20,6 @@
 #include 
 #include 
 
-#define FIELD_PREP_HIWORD(_mask, _val) \
-   (   \
-   FIELD_PREP((_mask), (_val)) |   \
-   ((_mask) << 16) \
-   )
-
 #define BIAS_CON0  0x
 #define I_RES_CNTL_MASK        GENMASK(6, 4)
 #define I_RES_CNTL(x)  FIELD_PREP(I_RES_CNTL_MASK, x)
@@ -252,8 +246,8 @@
 
 /* MIPI_CDPHY_GRF registers */
 #define MIPI_DCPHY_GRF_CON0    0x
-#define S_CPHY_MODE            FIELD_PREP_HIWORD(BIT(3), 1)
-#define M_CPHY_MODE            FIELD_PREP_HIWORD(BIT(0), 1)
+#define S_CPHY_MODE            HWORD_UPDATE(BIT(3), 1)
+#define M_CPHY_MODE            HWORD_UPDATE(BIT(0), 1)
 
 enum hs_drv_res_ohm {
STRENGTH_30_OHM = 0x8,

-- 
2.49.0
