Re: [PATCH v1] drm/ttm: Clean up page shift operation

2022-11-22 Thread Christian König

Am 22.11.22 um 08:51 schrieb Somalapuram Amaranath:

Remove page shift operations as ttm_resource moved
from num_pages to size_t size in bytes.
v1 -> v2: fix missing page shift to fpfn and lpfn

Signed-off-by: Somalapuram Amaranath 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  4 +---
  drivers/gpu/drm/ttm/ttm_range_manager.c| 13 ++---
  2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 974e85d8b6cc..19ad365dc159 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -541,12 +541,10 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) {
/* GWS and OA don't need any alignment. */
page_align = bp->byte_align;
-   size <<= PAGE_SHIFT;
-
} else if (bp->domain & AMDGPU_GEM_DOMAIN_GDS) {
/* Both size and alignment must be a multiple of 4. */
page_align = ALIGN(bp->byte_align, 4);
-   size = ALIGN(size, 4) << PAGE_SHIFT;
+   size = ALIGN(size, 4);


The amdgpu changes should probably be a separate patch.


} else {
/* Memory should be aligned at least to a page size. */
page_align = ALIGN(bp->byte_align, PAGE_SIZE) >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c 
b/drivers/gpu/drm/ttm/ttm_range_manager.c
index 0a8bc0b7f380..6ac38092dd2a 100644
--- a/drivers/gpu/drm/ttm/ttm_range_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
@@ -83,9 +83,10 @@ static int ttm_range_man_alloc(struct ttm_resource_manager 
*man,
  
  	spin_lock(&rman->lock);

ret = drm_mm_insert_node_in_range(mm, &node->mm_nodes[0],
- PFN_UP(node->base.size),
+ node->base.size,
  bo->page_alignment, 0,
- place->fpfn, lpfn, mode);
+ place->fpfn << PAGE_SHIFT,
+ lpfn << PAGE_SHIFT, mode);
spin_unlock(&rman->lock);
  
  	if (unlikely(ret)) {

@@ -119,11 +120,10 @@ static bool ttm_range_man_intersects(struct 
ttm_resource_manager *man,
 size_t size)
  {
struct drm_mm_node *node = &to_ttm_range_mgr_node(res)->mm_nodes[0];
-   u32 num_pages = PFN_UP(size);
  
  	/* Don't evict BOs outside of the requested placement range */

-   if (place->fpfn >= (node->start + num_pages) ||
-   (place->lpfn && place->lpfn <= node->start))
+   if ((place->fpfn << PAGE_SHIFT) >= (node->start + size) ||
+   (place->lpfn && (place->lpfn << PAGE_SHIFT) <= node->start))
return false;
  
  	return true;

@@ -135,10 +135,9 @@ static bool ttm_range_man_compatible(struct 
ttm_resource_manager *man,
 size_t size)
  {
struct drm_mm_node *node = &to_ttm_range_mgr_node(res)->mm_nodes[0];
-   u32 num_pages = PFN_UP(size);
  
  	if (node->start < place->fpfn ||

-   (place->lpfn && (node->start + num_pages) > place->lpfn))
+   (place->lpfn && (node->start + size) > place->lpfn << PAGE_SHIFT))
return false;


This looks good but can't be complete. We have a couple of place where 
node->start and node->size is used outside of TTM.


See drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h and 
drivers/gpu/drm/i915/intel_region_ttm.c for example.


Those need to be updated as well.

Regards,
Christian.


  
  	return true;




Re: [Intel-gfx] [PATCH v2 0/5] Add module oriented dmesg output

2022-11-22 Thread Tvrtko Ursulin



On 21/11/2022 18:21, John Harrison wrote:

On 11/18/2022 02:52, Jani Nikula wrote:

On Thu, 17 Nov 2022, john.c.harri...@intel.com wrote:

From: John Harrison 

When trying to analyse bug reports from CI, customers, etc. it can be
difficult to work out exactly what is happening on which GT in a
multi-GT system. So add GT oriented debug/error message wrappers. If
used instead of the drm_ equivalents, you get the same output but with
a GT# prefix on it.

It was also requested to extend this further to submodules in order to
factor out the repeated structure accessing constructs and common
string prefixes. So, add versions for GuC, HuC and GuC CTB as well.

This patch set updates all the gt/uc files to use the new helpers as a
first step. The intention would be to convert all output messages that
have access to a GT structure.

v2: Go back to using lower case names, add more wrapper sets (combined
review feedback). Also, wrap up probe injection and WARN entries.

Signed-off-by: John Harrison 

For adding the wrappers in general, I'm going to disagree and
commit. I'll leave it up to Tvrtko and Joonas.

Regarding the placement of the macros, I insist you add individual
header files for the wrappers and include them only where needed.

We have a fairly serious problem with everything including everything in
i915 that I've been slowly trying to tackle. Touch one thing, rebuild
everything. About a third of our headers cause the rebuild of the entire
driver when modified. We need to reduce the surface of things that cause
rebuilds.

For example, intel_gt.h is included by 97 files, intel_guc.h by 332
files, and intel_huc.h by 329 files (counting recursively).

There's absolutely no reason any of the display code, for example, needs
to have these logging macros in their build. Long term, the headers
should be reorganized to reduce the interdependencies, and this is what
I've been doing in i915_drv.h and display/ in general. But the least we
can do is not make the problem worse.
@Tvrtko/@Michal W, any other review comments or feedback? I'd rather not 
spend time fixing up the header file issue and reposting only to have 
someone point out another issue that could have been resolved at the 
same time.


I read through the patches when you posted them and it looked nice and 
clean to me. I think I spotted one instance of a debug build only 
message getting upgraded to production build, and one loss of stack 
trace on a warning, but it wasn't a concern to me AFAIR.


Regards,

Tvrtko



John.


BR,
Jani.



John Harrison (5):
   drm/i915/gt: Start adding module oriented dmesg output
   drm/i915/huc: Add HuC specific debug print wrappers
   drm/i915/guc: Add GuC specific debug print wrappers
   drm/i915/guc: Add GuC CT specific debug print wrappers
   drm/i915/uc: Update the gt/uc code to use gt_err and friends

  drivers/gpu/drm/i915/gt/intel_gt.c    |  96 
  drivers/gpu/drm/i915/gt/intel_gt.h    |  35 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc.c    |  32 +--
  drivers/gpu/drm/i915/gt/uc/intel_guc.h    |  35 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |   8 +-
  .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  48 ++--
  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 222 +-
  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c |  19 +-
  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c    |  37 ++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c |   7 +-
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   |  55 ++---
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  62 +++--
  drivers/gpu/drm/i915/gt/uc/intel_huc.c    |  31 +--
  drivers/gpu/drm/i915/gt/uc/intel_huc.h    |  23 ++
  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 108 -
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |  98 
  drivers/gpu/drm/i915/gt/uc/selftest_guc.c |  34 +--
  .../drm/i915/gt/uc/selftest_guc_hangcheck.c   |  22 +-
  .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   |  10 +-
  19 files changed, 507 insertions(+), 475 deletions(-)




Re: [PATCH v1 2/8] dt-bindings: display: bridge: renesas, dsi-csi2-tx: Add r8a779g0

2022-11-22 Thread Tomi Valkeinen

On 17/11/2022 17:14, Geert Uytterhoeven wrote:

Hi Tomi,

On Thu, Nov 17, 2022 at 1:26 PM Tomi Valkeinen
 wrote:

From: Tomi Valkeinen 

Extend the Renesas DSI display bindings to support the r8a779g0 V4H.

Signed-off-by: Tomi Valkeinen 
---
  .../bindings/display/bridge/renesas,dsi-csi2-tx.yaml   | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git 
a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml 
b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml
index afeeb967393d..bc3101f77e5a 100644
--- a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml
+++ b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml
@@ -11,13 +11,14 @@ maintainers:

  description: |
This binding describes the MIPI DSI/CSI-2 encoder embedded in the Renesas
-  R-Car V3U SoC. The encoder can operate in either DSI or CSI-2 mode, with up
+  R-Car V3U/V4H SoC. The encoder can operate in either DSI or CSI-2 mode, with 
up


Perhaps "R-Car Gen4 SoCs", so we stay within 80 chars, and don't have
to update this when the next member of the family is around the block?


Is V3U gen 4? Or do you mean "R-Car V3U and Gen 4 SoCs"?


Is there anything that might be SoC-specific?
If not, perhaps the time is ripe for a family-specific compatible value?


At least v3u and v4h DSIs are slightly different. Well, the DSI IP block 
itself looks the same, but the PLL and PHY are different.


 Tomi



Re: [PATCH] drm/msm/dpu: Print interrupt index in addition to the mask

2022-11-22 Thread Martin Botka



On November 21, 2022 11:24:55 PM GMT+01:00, Marijn Suijten 
 wrote:
>The mask only describes the `irq_idx % 32` part, making it generally
>impossible to deduce what interrupt is being enabled/disabled.  Since
>`debug/core_irq` in debugfs (and other prints) also include the full
>`DPU_IRQ_IDX()` value, print the same full value here for easier
>correlation instead of only adding the `irq_idx / 32` part.
>
>Furthermore, make the dbgstr messages more consistent.
>
>Signed-off-by: Marijn Suijten 
>---
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 12 ++--
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c 
>b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
>index cf1b6d84c18a..64589a9c2c51 100644
>--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
>+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
>@@ -252,9 +252,9 @@ static int dpu_hw_intr_enable_irq_locked(struct 
>dpu_hw_intr *intr, int irq_idx)
> 
>   cache_irq_mask = intr->cache_irq_mask[reg_idx];
>   if (cache_irq_mask & DPU_IRQ_MASK(irq_idx)) {
>-  dbgstr = "DPU IRQ already set:";
>+  dbgstr = "already ";
>   } else {
>-  dbgstr = "DPU IRQ enabled:";
>+  dbgstr = "";
> 
>   cache_irq_mask |= DPU_IRQ_MASK(irq_idx);
>   /* Cleaning any pending interrupt */
>@@ -268,7 +268,7 @@ static int dpu_hw_intr_enable_irq_locked(struct 
>dpu_hw_intr *intr, int irq_idx)
>   intr->cache_irq_mask[reg_idx] = cache_irq_mask;
>   }
> 
>-  pr_debug("%s MASK:0x%.8lx, CACHE-MASK:0x%.8x\n", dbgstr,
>+  pr_debug("DPU IRQ %d %senabled: MASK:0x%.8lx, CACHE-MASK:0x%.8x\n", 
>irq_idx, dbgstr,
>   DPU_IRQ_MASK(irq_idx), cache_irq_mask);
> 
>   return 0;
>@@ -301,9 +301,9 @@ static int dpu_hw_intr_disable_irq_locked(struct 
>dpu_hw_intr *intr, int irq_idx)
> 
>   cache_irq_mask = intr->cache_irq_mask[reg_idx];
>   if ((cache_irq_mask & DPU_IRQ_MASK(irq_idx)) == 0) {
>-  dbgstr = "DPU IRQ is already cleared:";
>+  dbgstr = "already ";
>   } else {
>-  dbgstr = "DPU IRQ mask disable:";
>+  dbgstr = "";
> 
>   cache_irq_mask &= ~DPU_IRQ_MASK(irq_idx);
>   /* Disable interrupts based on the new mask */
>@@ -317,7 +317,7 @@ static int dpu_hw_intr_disable_irq_locked(struct 
>dpu_hw_intr *intr, int irq_idx)
>   intr->cache_irq_mask[reg_idx] = cache_irq_mask;
>   }
> 
>-  pr_debug("%s MASK:0x%.8lx, CACHE-MASK:0x%.8x\n", dbgstr,
>+  pr_debug("DPU IRQ %d %sdisabled: MASK:0x%.8lx, CACHE-MASK:0x%.8x\n", 
>irq_idx, dbgstr,
>   DPU_IRQ_MASK(irq_idx), cache_irq_mask);
> 
>   return 0;

Looks good to me.

Reviewed-by: Martin Botka 


Re: ast: resolutions that require single-buffering (due to VRAM limitations) are unavailable

2022-11-22 Thread Jeremy Rand

Thomas Zimmermann:

Hi

Am 25.10.22 um 09:12 schrieb Jeremy Rand:

Hi dri-devel,

I have two machines with ASPEED GPU's (ast Linux driver).  One machine 
is x86_64, running an ASRock Rack Tommy 90-SC02P1-00UBNZ GPU (AST2510 
chipset) with KDE Plasma Wayland; the other is ppc64le, running an 
integrated AST2500 GPU with KDE Plasma X11.  Both the AST2510 and 
AST2500 have 16 MiB VRAM according to lspci.  Both ASPEED GPU's are 
advertised as supporting up to 1920x1200 resolution, but KDE only 
detects a maximum resolution of 1920x1080.


Some additional information about this bug can be found at 
https://forums.raptorcs.com/index.php/topic,31.0.html .


I believe this is a Linux bug, because it is solely dependent on the 
Linux version.  The following Linux versions are confirmed to have the 
bug:


Debian:
5.6.0-1 (ppc64el)
Fedora:
5.6.0-1.fc33.x86_64
5.6.0-1.fc33.ppc64le
5.17.5-300.fc36.x86_64
5.18.6-200.fc36.ppc64le
6.1.0-0.rc0.20221007git4c86114194e6.5.fc38.ppc64le

Whereas the following Linux versions are confirmed to work fine (max 
resolution detected by KDE is 1920x1200 as it should be, and that 
resolution works fine when selected):


Debian:
5.5.0-2 (ppc64el)
Fedora:
5.5.17-200.fc31.x86_64
5.5.17-200.fc31.ppc64le

I believe the bug was introduced by Linux commit 
9253f830c9166bfa6cc07d5ed59e174e9d5ec6ca, which adds a VRAM size check 
that assumes double-buffering.  1920x1080 resolution at 4 bytes per 
pixel with 2 buffers is 16.6 MB, while bumping that to 1920x1200 
results in 18.4 MB.  Since the VRAM size is 16 MiB == 16.8 MB, that 
explains the issue.


Thanks for reporting. It's been a known issue for a while.

But in the most recent devel tree, we have replaced ast memory 
management, so that it can now use the full vram size for scanout 
buffers. See



https://cgit.freedesktop.org/drm/drm-tip/commit/drivers/gpu/drm/ast/ast_mode.c?id=f2fa5a99ca81ce1056539e83c705f3d6bec62e31 



To test, get the latest drm-tip from

   git://anongit.freedesktop.org/drm/drm-tip

and try on your machine.

The updated driver should become available in Linux v6.3.

Best regards
Thomas


Hi Thomas!

Thanks for pointing me to that branch.  It took me some weeks to get it 
to build due to my lack of experience building kernels from source 
(hence the delayed reply), but I can confirm that the issue is fixed in 
drm-tip (1920x1200 works as it should on my ppc64le system; I didn't 
test on my x86_64 system).  Additionally, drm-tip fixes a different AST 
bug that was causing certain Wayland ppc64le systems (mainly GNOME, KDE, 
and GDM) to hang on a black screen (for which I was halfway through 
preparing a bug report), so that is a pleasant surprise.


Looking forward to Linux v6.3.

Cheers,
--
-Jeremy Rand
Lead Application Engineer at Namecoin


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH v4 02/10] dt-bindings: display: bridge: Add MHDP DP for i.MX8MQ

2022-11-22 Thread Krzysztof Kozlowski
On 21/11/2022 08:23, Sandor Yu wrote:
> Add bindings for i.MX8MQ MHDP DisplayPort.
> 
> Signed-off-by: Sandor Yu 
> ---
>  .../display/bridge/cdns,mhdp-imx8mq-dp.yaml   | 93 +++
>  1 file changed, 93 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.yaml
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.yaml 
> b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.yaml
> new file mode 100644
> index ..d82f3ceddaa8
> --- /dev/null
> +++ 
> b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.yaml
> @@ -0,0 +1,93 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/bridge/cdns,mhdp-imx8mq-dp.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Cadence MHDP Displayport bridge
> +
> +maintainers:
> +  - Sandor Yu 
> +
> +description:
> +  The Cadence MHDP Displayport TX interface.
> +
> +properties:
> +  compatible:
> +enum:
> +  - cdns,mhdp-imx8mq-dp
> +
> +  reg:
> +maxItems: 1
> +
> +  clocks:
> +maxItems: 1
> +description: MHDP DP APB clock.
> +
> +  phys:
> +maxItems: 1
> +
> +  interrupts:
> +items:
> +  - description: Hotplug cable plugin.
> +  - description: Hotplug cable plugout.
> +
> +  interrupt-names:
> +items:
> +  - const: plug_in
> +  - const: plug_out
> +
> +  ports:
> +$ref: /schemas/graph.yaml#/properties/ports
> +
> +properties:
> +  port@0:
> +$ref: /schemas/graph.yaml#/properties/port
> +description:
> +  Input port from display controller output.
> +  port@1:
> +$ref: /schemas/graph.yaml#/properties/port
> +description:
> +  Output port to DP connector.
> +
> +required:
> +  - port@0
> +
> +required:
> +  - compatible
> +  - reg
> +  - clocks
> +  - interrupts
> +  - interrupt-names
> +  - phys
> +  - ports
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +#include 
> +#include 
> +
> +mhdp_dp: dp-bridge@32c0 {
> +compatible = "cdns,mhdp-imx8mq-dp";
> +reg = <0x32c0 0x10>;
> +interrupts = ,
> + ;
> +interrupt-names = "plug_in", "plug_out";
> +clocks = <&clk IMX8MQ_CLK_DISP_APB_ROOT>;
> +phys = <&dp_phy>;
> +
> +ports {
> +#address-cells = <1>;
> +#size-cells = <0>;
> +
> +port@0 {
> +reg = <0>;
> +
> +mhdp_in: endpoint {
> +remote-endpoint = <&dcss_out>;
> +};

As Rob suggested, you allowed property for output to DP port. However it
is not in the example. If this is a bridge, what does it bridge if there
is no output connector?


Best regards,
Krzysztof



Re: [PATCH v1 4/8] arm64: dts: renesas: r8a779g0: Add display related data

2022-11-22 Thread Tomi Valkeinen

On 17/11/2022 17:03, Kieran Bingham wrote:

Quoting Tomi Valkeinen (2022-11-17 12:25:43)

From: Tomi Valkeinen 

Add DT nodes for components needed to get the DSI output working:
- FCPv
- VSPd
- DU
- DSI

Signed-off-by: Tomi Valkeinen 
---
  arch/arm64/boot/dts/renesas/r8a779g0.dtsi | 129 ++
  1 file changed, 129 insertions(+)

diff --git a/arch/arm64/boot/dts/renesas/r8a779g0.dtsi 
b/arch/arm64/boot/dts/renesas/r8a779g0.dtsi
index 45d8d927ad26..31d4930c5adc 100644
--- a/arch/arm64/boot/dts/renesas/r8a779g0.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a779g0.dtsi
@@ -1207,6 +1207,135 @@ prr: chipid@fff00044 {
 compatible = "renesas,prr";
 reg = <0 0xfff00044 0 4>;
 };


I think these nodes are supposed to be in sort order based on the
register address in memory.


Ah, I didn't realize that.


Disregarding sort order, I'll review the node contents.

I would probably s/data/nodes/ in $SUBJECT too.



+
+   fcpvd0: fcp@fea1 {
+   compatible = "renesas,fcpv";
+   reg = <0 0xfea1 0 0x200>;
+   clocks = <&cpg CPG_MOD 508>;
+   power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>;
+   resets = <&cpg 508>;
+   };
+
+   fcpvd1: fcp@fea11000 {
+   compatible = "renesas,fcpv";
+   reg = <0 0xfea11000 0 0x200>;
+   clocks = <&cpg CPG_MOD 509>;
+   power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>;
+   resets = <&cpg 509>;
+   };


I'm intrigued at the length of 0x200 as I only see 3 registers up to
0x0018 ..

But all existing platforms with fcpv* set 0x200 ... so lets cargo cult it up... 
:-)


+
+   vspd0: vsp@fea2 {
+   compatible = "renesas,vsp2";
+   reg = <0 0xfea2 0 0x5000>;


"""
Below are the base addresses of each VSP unit. VSPX has 32Kbyte address
space. VSPD has 28Kbyte address space.
"""

Hrm : 28K is 0x7000

RPf n OSD CLUT Table: H’4000 + H’0400*n to H’43fc + H’0400*n

  0x43fc+(0x400*5)
22524   [0x57fc]

So this needs to be /at least/ 0x6000 (Would 0x5800 be odd?) and perhaps as it 
clearly states
28k, we should just set it to 0x7000.


Ok. These are identical to v3u, and I was just copying v3u's dts, 
assuming they're correct. We probably should fix the v3u dts files too.





+   interrupts = ;
+   clocks = <&cpg CPG_MOD 830>;
+   power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>;
+   resets = <&cpg 830>;
+
+   renesas,fcp = <&fcpvd0>;
+   };
+
+   vspd1: vsp@fea28000 {
+   compatible = "renesas,vsp2";
+   reg = <0 0xfea28000 0 0x5000>;


Same here of course (reg = <0 0xfea28000 0 0x7000>)



+   interrupts = ;
+   clocks = <&cpg CPG_MOD 831>;
+   power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>;
+   resets = <&cpg 831>;
+
+   renesas,fcp = <&fcpvd1>;
+   };
+
+   du: display@feb0 {
+   compatible = "renesas,du-r8a779g0";
+   reg = <0 0xfeb0 0 0x4>;
+   interrupts = ,
+;
+   clocks = <&cpg CPG_MOD 411>;
+   clock-names = "du.0";
+   power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>;
+   resets = <&cpg 411>;
+   reset-names = "du.0";
+   renesas,vsps = <&vspd0 0>, <&vspd1 0>;
+
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+   du_out_dsi0: endpoint {
+   remote-endpoint = <&dsi0_in>;
+   };
+   };
+
+   port@1 {
+   reg = <1>;
+   du_out_dsi1: endpoint {
+   remote-endpoint = <&dsi1_in>;
+   };
+   };
+   };
+   };
+
+   dsi0: dsi-encoder@fed8 {
+   compatible = "renesas,r8a779g0-dsi-csi2-tx";
+   reg = <0 0xfed8 0 0x1>;
+   power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>;
+   clocks = <&cpg CPG_MOD 415>,
+   

Re: [PATCH v1 6/8] drm: rcar-du: Add r8a779g0 support

2022-11-22 Thread Tomi Valkeinen

On 22/11/2022 05:05, Laurent Pinchart wrote:

On Thu, Nov 17, 2022 at 03:08:38PM +, Kieran Bingham wrote:

Quoting Tomi Valkeinen (2022-11-17 12:25:45)

From: Tomi Valkeinen 

Add support for DU on r8a779g0, which is identical to DU on r8a779a0.

Signed-off-by: Tomi Valkeinen 
---
  drivers/gpu/drm/rcar-du/rcar_du_drv.c | 22 ++
  1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
index d003e8d9e7a2..b1761d4ec4e5 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
@@ -524,6 +524,27 @@ static const struct rcar_du_device_info 
rcar_du_r8a779a0_info = {
 .dsi_clk_mask =  BIT(1) | BIT(0),
  };
  
+static const struct rcar_du_device_info rcar_du_r8a779g0_info = {

+   .gen = 3,


Given that this is the V4H ... I wonder if this should be bumped
already. I guess that has knock on effects through the driver though...


rcar_du_group_setup_didsr() would need to be fixed to test gen >= 3
instead of gen == 3. That seems to be the only problematic location. It
could thus fairly easily be done in v2, but we can also delay it.


Ok, I can fix that here.

 Tomi



Re: [PATCH v1 7/8] drm: rcar-du: dsi: Add r8A779g0 support

2022-11-22 Thread Tomi Valkeinen

On 17/11/2022 17:46, Kieran Bingham wrote:

Quoting Tomi Valkeinen (2022-11-17 12:25:46)

From: Tomi Valkeinen 

Add DSI support for r8a779g0. The main differences to r8a779a0 are in
the PLL and PHTW setups.

Signed-off-by: Tomi Valkeinen 
---
  drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c  | 484 +++
  drivers/gpu/drm/rcar-du/rcar_mipi_dsi_regs.h |   6 +-
  2 files changed, 384 insertions(+), 106 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c 
b/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c
index a7f2b7f66a17..723c35726c38 100644
--- a/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c
+++ b/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c
@@ -9,6 +9,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -28,6 +29,20 @@
  #include "rcar_mipi_dsi.h"
  #include "rcar_mipi_dsi_regs.h"
  
+#define MHZ(v) ((v) * 100u)

+
+enum rcar_mipi_dsi_hw_model {
+   RCAR_DSI_R8A779A0,
+   RCAR_DSI_R8A779G0,
+};
+
+struct rcar_mipi_dsi_device_info {
+   enum rcar_mipi_dsi_hw_model model;
+   const struct dsi_clk_config *clk_cfg;
+   u8 clockset2_m_offset;
+   u8 clockset2_n_offset;
+};
+
  struct rcar_mipi_dsi {
 struct device *dev;
 const struct rcar_mipi_dsi_device_info *info;
@@ -50,6 +65,17 @@ struct rcar_mipi_dsi {
 unsigned int lanes;
  };
  
+struct dsi_setup_info {

+   unsigned long hsfreq;
+   u16 hsfreqrange;
+
+   unsigned long fout;
+   u16 m;
+   u16 n;
+   u16 vclk_divider;
+   const struct dsi_clk_config *clkset;
+};
+
  static inline struct rcar_mipi_dsi *
  bridge_to_rcar_mipi_dsi(struct drm_bridge *bridge)
  {
@@ -62,22 +88,6 @@ host_to_rcar_mipi_dsi(struct mipi_dsi_host *host)
 return container_of(host, struct rcar_mipi_dsi, host);
  }
  
-static const u32 phtw[] = {

-   0x01020114, 0x01600115, /* General testing */
-   0x01030116, 0x0102011d, /* General testing */
-   0x011101a4, 0x018601a4, /* 1Gbps testing */
-   0x014201a0, 0x010001a3, /* 1Gbps testing */
-   0x0101011f, /* 1Gbps testing */
-};
-
-static const u32 phtw2[] = {
-   0x010c0130, 0x010c0140, /* General testing */
-   0x010c0150, 0x010c0180, /* General testing */
-   0x010c0190,
-   0x010a0160, 0x010a0170,
-   0x01800164, 0x01800174, /* 1Gbps testing */
-};
-
  static const u32 hsfreqrange_table[][2] = {
 { 8000U,   0x00 }, { 9000U,   0x10 }, { 1U,  0x20 },
 { 11000U,  0x30 }, { 12000U,  0x01 }, { 13000U,  0x11 },
@@ -103,24 +113,53 @@ static const u32 hsfreqrange_table[][2] = {
 { /* sentinel */ },
  };
  
-struct vco_cntrl_value {

+struct dsi_clk_config {
 u32 min_freq;
 u32 max_freq;
-   u16 value;
+   u8 vco_cntrl;
+   u8 cpbias_cntrl;
+   u8 gmp_cntrl;
+   u8 int_cntrl;
+   u8 prop_cntrl;
  };
  
-static const struct vco_cntrl_value vco_cntrl_table[] = {

-   { .min_freq = 4000U,   .max_freq = 5500U,   .value = 0x3f },
-   { .min_freq = 5250U,   .max_freq = 8000U,   .value = 0x39 },
-   { .min_freq = 8000U,   .max_freq = 11000U,  .value = 0x2f },
-   { .min_freq = 10500U,  .max_freq = 16000U,  .value = 0x29 },
-   { .min_freq = 16000U,  .max_freq = 22000U,  .value = 0x1f },
-   { .min_freq = 21000U,  .max_freq = 32000U,  .value = 0x19 },
-   { .min_freq = 32000U,  .max_freq = 44000U,  .value = 0x0f },
-   { .min_freq = 42000U,  .max_freq = 66000U,  .value = 0x09 },
-   { .min_freq = 63000U,  .max_freq = 114900U, .value = 0x03 },
-   { .min_freq = 11U, .max_freq = 115200U, .value = 0x01 },
-   { .min_freq = 115000U, .max_freq = 125000U, .value = 0x01 },
+static const struct dsi_clk_config dsi_clk_cfg_r8a779a0[] = {
+   {   4000u,   5500u, 0x3f, 0x10, 0x01, 0x00, 0x0b },
+   {   5250u,   8000u, 0x39, 0x10, 0x01, 0x00, 0x0b },
+   {   8000u,  11000u, 0x2f, 0x10, 0x01, 0x00, 0x0b },
+   {  10500u,  16000u, 0x29, 0x10, 0x01, 0x00, 0x0b },
+   {  16000u,  22000u, 0x1f, 0x10, 0x01, 0x00, 0x0b },
+   {  21000u,  32000u, 0x19, 0x10, 0x01, 0x00, 0x0b },
+   {  32000u,  44000u, 0x0f, 0x10, 0x01, 0x00, 0x0b },
+   {  42000u,  66000u, 0x09, 0x10, 0x01, 0x00, 0x0b },
+   {  63000u, 114900u, 0x03, 0x10, 0x01, 0x00, 0x0b },
+   { 11u, 115200u, 0x01, 0x10, 0x01, 0x00, 0x0b },
+   { 115000u, 125000u, 0x01, 0x10, 0x01, 0x00, 0x0c },


Sigh ... is it this one 0x0c value that means we need to keep all these
entries repeated ? :-S

If it weren't for that, it seems we could keep just two sets of

+   u8 cpbias_cntrl;
+   u8 gmp_cntrl;
+   u8 int_cntrl;
+   u8 prop_cntrl;


One for each of the 9a0, and the 9g0...

But this is fine, and I guess the implication is there may be other
future differences 

Re: [PATCH v1 2/8] dt-bindings: display: bridge: renesas, dsi-csi2-tx: Add r8a779g0

2022-11-22 Thread Geert Uytterhoeven
Hi Tomi,

On Tue, Nov 22, 2022 at 9:20 AM Tomi Valkeinen
 wrote:
> On 17/11/2022 17:14, Geert Uytterhoeven wrote:
> > On Thu, Nov 17, 2022 at 1:26 PM Tomi Valkeinen
> >  wrote:
> >> From: Tomi Valkeinen 
> >>
> >> Extend the Renesas DSI display bindings to support the r8a779g0 V4H.
> >>
> >> Signed-off-by: Tomi Valkeinen 
> >> ---
> >>   .../bindings/display/bridge/renesas,dsi-csi2-tx.yaml   | 3 ++-
> >>   1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git 
> >> a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml
> >>  
> >> b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml
> >> index afeeb967393d..bc3101f77e5a 100644
> >> --- 
> >> a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml
> >> +++ 
> >> b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml
> >> @@ -11,13 +11,14 @@ maintainers:
> >>
> >>   description: |
> >> This binding describes the MIPI DSI/CSI-2 encoder embedded in the 
> >> Renesas
> >> -  R-Car V3U SoC. The encoder can operate in either DSI or CSI-2 mode, 
> >> with up
> >> +  R-Car V3U/V4H SoC. The encoder can operate in either DSI or CSI-2 mode, 
> >> with up
> >
> > Perhaps "R-Car Gen4 SoCs", so we stay within 80 chars, and don't have
> > to update this when the next member of the family is around the block?
>
> Is V3U gen 4? Or do you mean "R-Car V3U and Gen 4 SoCs"?

Despite the name, R-Car V3U is the first member of the R-Car Gen4 family...
https://www.renesas.com/us/en/products/automotive-products/automotive-system-chips-socs/r-car-v3u-best-class-r-car-v3u-asil-d-system-chip-automated-driving

> > Is there anything that might be SoC-specific?
> > If not, perhaps the time is ripe for a family-specific compatible value?
>
> At least v3u and v4h DSIs are slightly different. Well, the DSI IP block
> itself looks the same, but the PLL and PHY are different.

I noticed, when I saw the dsi-csi2 driver changes.
So no family-specific compatible value is needed.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [Intel-gfx] [PATCH 1/6] drm/i915/uc: Introduce GSC FW

2022-11-22 Thread Jani Nikula
On Mon, 21 Nov 2022, Daniele Ceraolo Spurio  
wrote:
> On MTL the GSC FW needs to be loaded on the media GT by the graphics
> driver. We're going to treat it like a new uc_fw, so add the initial
> defs and init/fini functions for it.
>
> Similarly to the other FWs, the GSC FW path can be overriden via
> modparam. The modparam can also be used to disable the GSC FW loading by
> setting it to an empty string.
>
> Note that the new structure has been called intel_gsc_uc to avoid
> confusion with the existing intel_gsc, which instead represents the heci
> gsc interfaces.
>
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Alan Previn 
> Cc: John Harrison 
> ---
>  drivers/gpu/drm/i915/Makefile |  3 +-
>  drivers/gpu/drm/i915/gt/intel_gt.h|  5 ++
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c | 70 +++
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h | 36 
>  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 17 ++
>  drivers/gpu/drm/i915/gt/uc/intel_uc.h |  3 +
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 25 +++-
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h  |  7 ++-
>  drivers/gpu/drm/i915/i915_params.c|  3 +
>  drivers/gpu/drm/i915/i915_params.h|  1 +
>  10 files changed, 164 insertions(+), 6 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
>  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 01974b82d205..92d37cf71e16 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -205,7 +205,8 @@ i915-y += gt/uc/intel_uc.o \
> gt/uc/intel_guc_submission.o \
> gt/uc/intel_huc.o \
> gt/uc/intel_huc_debugfs.o \
> -   gt/uc/intel_huc_fw.o
> +   gt/uc/intel_huc_fw.o \
> +   gt/uc/intel_gsc_uc.o

Comment near the top of the file:

# Please keep these build lists sorted!

>  
>  # graphics system controller (GSC) support
>  i915-y += gt/intel_gsc.o
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
> b/drivers/gpu/drm/i915/gt/intel_gt.h
> index e0365d556248..d2f4fbde5f9f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.h
> @@ -39,6 +39,11 @@ static inline struct intel_gt *huc_to_gt(struct intel_huc 
> *huc)
>   return container_of(huc, struct intel_gt, uc.huc);
>  }
>  
> +static inline struct intel_gt *gsc_uc_to_gt(struct intel_gsc_uc *gsc_uc)
> +{
> + return container_of(gsc_uc, struct intel_gt, uc.gsc);
> +}
> +
>  static inline struct intel_gt *gsc_to_gt(struct intel_gsc *gsc)
>  {
>   return container_of(gsc, struct intel_gt, gsc);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
> new file mode 100644
> index ..65cbf1ce9fa1
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
> @@ -0,0 +1,70 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +#include 
> +
> +#include "gt/intel_gt.h"
> +#include "intel_gsc_uc.h"
> +#include "i915_drv.h"
> +
> +static bool gsc_engine_supported(struct intel_gt *gt)
> +{
> + intel_engine_mask_t mask;
> +
> + /*
> +  * We reach here from i915_driver_early_probe for the primary GT before
> +  * its engine mask is set, so we use the device info engine mask for it.
> +  * For other GTs we expect the GT-specific mask to be set before we
> +  * call this function.
> +  */
> + GEM_BUG_ON(!gt_is_root(gt) && !gt->info.engine_mask);
> +
> + if (gt_is_root(gt))
> + mask = RUNTIME_INFO(gt->i915)->platform_engine_mask;
> + else
> + mask = gt->info.engine_mask;
> +
> + return __HAS_ENGINE(mask, GSC0);
> +}
> +
> +void intel_gsc_uc_init_early(struct intel_gsc_uc *gsc)
> +{
> + intel_uc_fw_init_early(&gsc->fw, INTEL_UC_FW_TYPE_GSC);
> +
> + /* we can arrive here from i915_driver_early_probe for primary
> +  * GT with it being not fully setup hence check device info's
> +  * engine mask
> +  */
> + if (!gsc_engine_supported(gsc_uc_to_gt(gsc))){
> + intel_uc_fw_change_status(&gsc->fw, 
> INTEL_UC_FIRMWARE_NOT_SUPPORTED);
> + return;
> + }
> +}
> +
> +int intel_gsc_uc_init(struct intel_gsc_uc *gsc)
> +{
> + struct drm_i915_private *i915 = gsc_uc_to_gt(gsc)->i915;
> + int err;
> +
> + err = intel_uc_fw_init(&gsc->fw);
> + if (err)
> + goto out;
> +
> + intel_uc_fw_change_status(&gsc->fw, INTEL_UC_FIRMWARE_LOADABLE);
> +
> + return 0;
> +
> +out:
> + i915_probe_error(i915, "failed with %d\n", err);
> + return err;
> +}
> +
> +void intel_gsc_uc_fini(struct intel_gsc_uc *gsc)
> +{
> + if (!intel_uc_fw_is_loadable(&gsc->fw))
> + return;
> +
> + intel_uc_fw_fini(&gsc->fw);
> +}
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h
> new file mode 10

Re: [PATCH] drm/mediatek: Clean dangling pointer on bind error path

2022-11-22 Thread AngeloGioacchino Del Regno

Il 21/11/22 23:37, Nícolas F. R. A. Prado ha scritto:

mtk_drm_bind() can fail, in which case drm_dev_put() is called,
destroying the drm_device object. However a pointer to it was still
being held in the private object, and that pointer would be passed along
to DRM in mtk_drm_sys_prepare() if a suspend were triggered at that
point, resulting in a panic. Clean the pointer when destroying the
object in the error path to prevent this from happening.

Signed-off-by: Nícolas F. R. A. Prado 



Fixes tag please! :-)

Cheers,
Angelo


---

  drivers/gpu/drm/mediatek/mtk_drm_drv.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 39a42dc8fb85..a21ff1b3258c 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -514,6 +514,7 @@ static int mtk_drm_bind(struct device *dev)
  err_deinit:
mtk_drm_kms_deinit(drm);
  err_free:
+   private->drm = NULL;
drm_dev_put(drm);
return ret;
  }





[PATCH] drm/bridge: ti-sn65dsi83: Fix delay after reset deassert to match spec

2022-11-22 Thread Frieder Schrempf
From: Frieder Schrempf 

The datasheet specifies a delay of 10 milliseconds, but the current
driver only waits for 1 ms. Fix this to make sure the initialization
sequence meets the spec.

Fixes: ceb515ba29ba ("drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and SN65DSI84 
driver")
Signed-off-by: Frieder Schrempf 
---
 drivers/gpu/drm/bridge/ti-sn65dsi83.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index 7ba9467fff12..047c14ddbbf1 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -346,7 +346,7 @@ static void sn65dsi83_atomic_enable(struct drm_bridge 
*bridge,
 
/* Deassert reset */
gpiod_set_value_cansleep(ctx->enable_gpio, 1);
-   usleep_range(1000, 1100);
+   usleep_range(1, 11000);
 
/* Get the LVDS format from the bridge state. */
bridge_state = drm_atomic_get_new_bridge_state(state, bridge);
-- 
2.38.1



[Bug 216727] New: Failure to wake up from suspend to RAM

2022-11-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216727

Bug ID: 216727
   Summary: Failure to wake up from suspend to RAM
   Product: Drivers
   Version: 2.5
Kernel Version: 6.0.9
  Hardware: AMD
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: martin...@gmx.com
Regression: No

Created attachment 303261
  --> https://bugzilla.kernel.org/attachment.cgi?id=303261&action=edit
errors and tracess logged when the issue occured

I've had this issue at least twice so far. The previous time it was on kernel
version 5.19.17.

When I tried waking up my computer from sleep I was greeted with corrupt
graphics. Was able to login via ssh and reboot.

I'm on Slackware-current. CPU is AMD Phenom II X4 965 Black Edition. GPU is AMD
Radeon RX 550 / 550 Series

Both time this happened I had a process running and using up all the memory and
eventually being killed by "out of memory". Though first time I experienced
this issue the out of memory happend the day beofre. The issue occured only on
wake up the next day.

I'm attaching some logs (some minor comments in the attached log file), I hope
it's not too much but I didn't want to miss any relevant information. Apologies
in advance.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v4 0/4] new subsystem for compute accelerator devices

2022-11-22 Thread Jacek Lawrynowicz
Hi,

On 11/19/2022 9:44 PM, Oded Gabbay wrote:
> This is the fourth (and hopefully last) version of the patch-set to add the
> new subsystem for compute accelerators. I removed the RFC headline as
> I believe it is now ready for merging.

Looks good and works without issues.
I will rebase next version of Intel VPU patchest on top of this.

Acked-by: Jacek Lawrynowicz 
Tested-by: Jacek Lawrynowicz  

Regards,
Jacek


Re: [PATCH v4] dma-buf: fix racing conflict of dma_heap_add()

2022-11-22 Thread Sumit Semwal
Hi Dawei Li,

On Mon, 21 Nov 2022 at 23:53, Christian König  wrote:
>
> Hi Dawei,
>
> from the technical description, coding style etc.. it looks clean to me,
> but I'm the completely wrong person to ask for a background check.
>
> I have a high level understanding of how dma-heaps work, but not a
> single line of this code is from me.
>
> Feel free to add my Acked-by, but Laura, John and others do you have any
> opinion?
>
> Regards,
> Christian.
>
> Am 21.11.22 um 16:28 schrieb Dawei Li:
> > On Sat, Nov 05, 2022 at 12:05:36AM +0800, Dawei Li wrote:
> >
> > Hi Christian,
> > May I have your opinion on this change?
> >
> > Thanks,
> > Dawei
> >
> >> Racing conflict could be:
> >> task A task B
> >> list_for_each_entry
> >> strcmp(h->name))
> >> list_for_each_entry
> >> strcmp(h->name)
> >> kzallockzalloc
> >> .. .
> >> device_create  device_create
> >> list_add
> >> list_add
> >>
> >> The root cause is that task B has no idea about the fact someone
> >> else(A) has inserted heap with same name when it calls list_add,
> >> so a potential collision occurs.
> >>
> >> Fixes: c02a81fba74f ("dma-buf: Add dma-buf heaps framework")
> >> Signed-off-by: Dawei Li 

Looks good to me as well. I will apply it over on drm-misc.

Best,
Sumit.
> >> ---
> >> v1: 
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FTYCP286MB2323950197F60FC3473123B7CA349%40TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM%2F&data=05%7C01%7Cchristian.koenig%40amd.com%7C1989f31257fc458a27c508dacbd5078e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638046413707443203%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=OWPUPrIHGnzwXyQE4WlIthThwSuSK2y3Eq2Wq5zAzbo%3D&reserved=0
> >> v1->v2: Narrow down locking scope, check the existence of heap before
> >> insertion, as suggested by Andrew Davis.
> >> v2->v3: Remove double checking.
> >> v3->v4: Minor coding style and patch formatting adjustment.
> >> ---
> >>   drivers/dma-buf/dma-heap.c | 28 +++-
> >>   1 file changed, 15 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
> >> index 8f5848aa144f..59d158873f4c 100644
> >> --- a/drivers/dma-buf/dma-heap.c
> >> +++ b/drivers/dma-buf/dma-heap.c
> >> @@ -233,18 +233,6 @@ struct dma_heap *dma_heap_add(const struct 
> >> dma_heap_export_info *exp_info)
> >>  return ERR_PTR(-EINVAL);
> >>  }
> >>
> >> -/* check the name is unique */
> >> -mutex_lock(&heap_list_lock);
> >> -list_for_each_entry(h, &heap_list, list) {
> >> -if (!strcmp(h->name, exp_info->name)) {
> >> -mutex_unlock(&heap_list_lock);
> >> -pr_err("dma_heap: Already registered heap named %s\n",
> >> -   exp_info->name);
> >> -return ERR_PTR(-EINVAL);
> >> -}
> >> -}
> >> -mutex_unlock(&heap_list_lock);
> >> -
> >>  heap = kzalloc(sizeof(*heap), GFP_KERNEL);
> >>  if (!heap)
> >>  return ERR_PTR(-ENOMEM);
> >> @@ -283,13 +271,27 @@ struct dma_heap *dma_heap_add(const struct 
> >> dma_heap_export_info *exp_info)
> >>  err_ret = ERR_CAST(dev_ret);
> >>  goto err2;
> >>  }
> >> -/* Add heap to the list */
> >> +
> >>  mutex_lock(&heap_list_lock);
> >> +/* check the name is unique */
> >> +list_for_each_entry(h, &heap_list, list) {
> >> +if (!strcmp(h->name, exp_info->name)) {
> >> +mutex_unlock(&heap_list_lock);
> >> +pr_err("dma_heap: Already registered heap named %s\n",
> >> +   exp_info->name);
> >> +err_ret = ERR_PTR(-EINVAL);
> >> +goto err3;
> >> +}
> >> +}
> >> +
> >> +/* Add heap to the list */
> >>  list_add(&heap->list, &heap_list);
> >>  mutex_unlock(&heap_list_lock);
> >>
> >>  return heap;
> >>
> >> +err3:
> >> +device_destroy(dma_heap_class, heap->heap_devt);
> >>   err2:
> >>  cdev_del(&heap->heap_cdev);
> >>   err1:
> >> --
> >> 2.25.1
> >>
>


-- 
Thanks and regards,

Sumit Semwal (he / him)
Tech Lead - LCG, Vertical Technologies
Linaro.org │ Open source software for ARM SoCs


Re: [PATCH 6/6] drm/i915: Bpp/timeslot calculation fixes for DP MST DSC

2022-11-22 Thread Lisovskiy, Stanislav
On Thu, Nov 10, 2022 at 02:23:53PM -0800, Navare, Manasi wrote:
> On Thu, Nov 03, 2022 at 03:23:00PM +0200, Stanislav Lisovskiy wrote:
> > Fix intel_dp_dsc_compute_config, previously timeslots parameter
> > was used in fact not as a timeslots, but more like a ratio
> > timeslots/64, which of course didn't have any effect for SST DSC,
> > but causes now issues for MST DSC.
> > Secondly we need to calculate pipe_bpp using intel_dp_dsc_compute_bpp
> > only for SST DSC case, while for MST case it has been calculated
> > earlier already with intel_dp_dsc_mst_compute_link_config.
> > Third we also were wrongly determining sink min bpp/max bpp, those
> > limites should be intersected with our limits to find common
> > acceptable bpp's, plus on top of that we should align those with
> > VESA bpps and only then calculate required timeslots amount.
> > Some MST hubs started to work only after third change was made.
> > 
> > v2: Make kernel test robot happy(claimed there was unitialzed use,
> > while there is none)
> > 
> > Signed-off-by: Stanislav Lisovskiy 
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 69 ++---
> >  drivers/gpu/drm/i915/display/intel_dp.h |  3 +-
> >  drivers/gpu/drm/i915/display/intel_dp_mst.c | 69 +
> >  3 files changed, 106 insertions(+), 35 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> > b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 8288a30dbd51..82752b696498 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -716,9 +716,14 @@ u16 intel_dp_dsc_get_output_bpp(struct 
> > drm_i915_private *i915,
> >  * for SST -> TimeSlotsPerMTP is 1,
> >  * for MST -> TimeSlotsPerMTP has to be calculated
> >  */
> > -   bits_per_pixel = (link_clock * lane_count * 8) * timeslots /
> > -intel_dp_mode_to_fec_clock(mode_clock);
> > -   drm_dbg_kms(&i915->drm, "Max link bpp: %u\n", bits_per_pixel);
> > +   bits_per_pixel = DIV_ROUND_UP((link_clock * lane_count) * timeslots,
> > + intel_dp_mode_to_fec_clock(mode_clock) * 
> > 8);
> 
> Why did we remove the *8 in the numerator for the total bandwidth
> link_clock * lane_count * 8 ? 
> 
> Other than this clarification, all changes look good
> 
> Manasi

Hi Manasi,

Because previously this function was actually confusing the ratio timeslots/64,
with the timeslots number.

It was actually expecting a ratio timeslots/64, rather than the timeslots 
number.

For SST it didn't matter as timeslots were always 1, but for MST case if we 
multiply 
that by number number of timeslots, this formula will return some big bogus 
bits_per_pixel number(checked that). 
Of course we can pass a ratio timeslots/64 here, but it isn't very convenient 
and intuitive
to manipulate.
So I made it to use a "timeslots" parameter as timeslots number, 
so that the ratio is calculated as part of the formula i.e:

((link_clock * lane_count * 8) * (timeslots / 64)) /  
intel_dp_mode_to_fec_clock(mode_clock);

which can be simplified as

((link_clock * lane_count * timeslots) / 8) / 
intel_dp_mode_to_fec_clock(mode_clock);

the whole formula comes from that
pipe_bpp * crtc_clock should be equal to link_total_bw * (timeslots / 64), i.e
timeslots/64 ratio defines, how much of the link_total_bw(link_clock * 
lane_count * 8)
we have for those pipe_bpp * crtc_clock, which we want to accomodate there.

Obviously if we just multiplied link_total_bw by timeslots, we would get a 
situation that
the more timeslots we allocate, the more total bw we get, which is wrong and 
will
result in some bogus huge pipe_bpp numbers.

Stan

> 
> > +
> > +   drm_dbg_kms(&i915->drm, "Max link bpp is %u for %u timeslots "
> > +   "total bw %u pixel clock %u\n",
> > +   bits_per_pixel, timeslots,
> > +   (link_clock * lane_count * 8),
> > +   intel_dp_mode_to_fec_clock(mode_clock));
> >  
> > /* Small Joiner Check: output bpp <= joiner RAM (bits) / Horiz. width */
> > max_bpp_small_joiner_ram = small_joiner_ram_size_bits(i915) /
> > @@ -1047,7 +1052,7 @@ intel_dp_mode_valid(struct drm_connector *_connector,
> > target_clock,
> > mode->hdisplay,
> > bigjoiner,
> > -   pipe_bpp, 1) >> 4;
> > +   pipe_bpp, 64) >> 4;
> > dsc_slice_count =
> > intel_dp_dsc_get_slice_count(intel_dp,
> >  target_clock,
> > @@ -1481,7 +1486,8 @@ int intel_dp_dsc_compute_config(struct intel_dp 
> > *intel_dp,
> > struct intel_crtc_state *pip

Re: [Intel-gfx] [PATCH v3 1/2] drm/i915: Fix negative value passed as remaining time

2022-11-22 Thread Tvrtko Ursulin



On 21/11/2022 23:19, Janusz Krzysztofik wrote:

Hi Andrzej,

Thanks for providing your R-b.

On Monday, 21 November 2022 18:40:51 CET Andrzej Hajda wrote:

On 21.11.2022 15:56, Janusz Krzysztofik wrote:

Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned.  However, when
request retirement happens to succeed despite an error returned by a call
to dma_fence_wait_timeout(), that error code (a negative value) is passed
back instead of remaining time.  If we then pass that negative value
forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
will be triggered.


Right, AFAICT a GEM_BUG_ON in debug builds, but in production builds 
negative timeout will get passed along all the way to schedule_timeout 
where error and call trace will be dumped. So fix appears warranted indeed.



If request retirement succeeds but an error code is passed back via
remaininig_timeout, we may have no clue on how much of the initial timeout
might have been left for spending it on waiting for GuC to become idle.
OTOH, since all pending requests have been successfully retired, that
error code has been already ignored by intel_gt_retire_requests_timeout(),
then we shouldn't fail.

Assume no more time has been left on error and pass 0 timeout value to
intel_uc_wait_for_idle() to give it a chance to return success if GuC is
already idle.

v3: Don't fail on any error passed back via remaining_timeout.

v2: Fix the issue on the caller side, not the provider.

Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work

with GuC")

Signed-off-by: Janusz Krzysztofik 
Cc: sta...@vger.kernel.org # v5.15+


Reviewed-by: Andrzej Hajda 


While still open for comments from others, I'm now looking for potential
committer.


Both patches are considered good to go?

Regards,

Tvrtko



Thanks,
Janusz




Regards
Andrzej


---
   drivers/gpu/drm/i915/gt/intel_gt.c | 9 +++--
   1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/

intel_gt.c

index b5ad9caa55372..7ef0edb2e37cd 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -677,8 +677,13 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long

timeout)

return -EINTR;
}
   
-	return timeout ? timeout : intel_uc_wait_for_idle(>->uc,

-   

remaining_timeout);

+   if (timeout)
+   return timeout;
+
+   if (remaining_timeout < 0)
+   remaining_timeout = 0;
+
+   return intel_uc_wait_for_idle(>->uc, remaining_timeout);
   }
   
   int intel_gt_init(struct intel_gt *gt)










[git pull] new subsystem for compute accelerator devices

2022-11-22 Thread Oded Gabbay
Hi Dave,

Here is the pull request for creating the compute acceleration (accel)
subsystem. It includes the fixes to the few comments given in the latest
patch-set review.

Highlights of the patch-set contents are in the signed tag.

Thanks,
Oded

The following changes since commit fc58764bbf602b65a6f63c53e5fd6feae76c510c:

  Merge tag 'amd-drm-next-6.2-2022-11-18' of 
https://gitlab.freedesktop.org/agd5f/linux into drm-next (2022-11-22 13:41:11 
+1000)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git 
tags/drm-accel-2022-11-22

for you to fetch changes up to cadbcc80e81800a3189f9fed23407fa014b8e3c3:

  doc: add documentation for accel subsystem (2022-11-22 12:41:08 +0200)


This tag contains the patches that add the new compute acceleration
subsystem, which is part of the DRM subsystem.

The patches:
- Add a new directory at drivers/accel.
- Add a new major (261) for compute accelerators.
- Add a new DRM minor type for compute accelerators.
- Integrate the accel core code with DRM core code.
- Add documentation for the accel subsystem.


Oded Gabbay (4):
  drivers/accel: define kconfig and register a new major
  accel: add dedicated minor for accelerator devices
  drm: initialize accel framework
  doc: add documentation for accel subsystem

 Documentation/accel/index.rst |  17 ++
 Documentation/accel/introduction.rst  | 110 
 Documentation/admin-guide/devices.txt |   5 +
 Documentation/subsystem-apis.rst  |   1 +
 MAINTAINERS   |   9 +
 drivers/Kconfig   |   2 +
 drivers/accel/Kconfig |  24 +++
 drivers/accel/drm_accel.c | 323 ++
 drivers/gpu/drm/Makefile  |   1 +
 drivers/gpu/drm/drm_drv.c | 101 ---
 drivers/gpu/drm/drm_file.c|   2 +-
 drivers/gpu/drm/drm_sysfs.c   |  24 ++-
 include/drm/drm_accel.h   |  97 ++
 include/drm/drm_device.h  |   3 +
 include/drm/drm_drv.h |   8 +
 include/drm/drm_file.h|  21 ++-
 16 files changed, 711 insertions(+), 37 deletions(-)
 create mode 100644 Documentation/accel/index.rst
 create mode 100644 Documentation/accel/introduction.rst
 create mode 100644 drivers/accel/Kconfig
 create mode 100644 drivers/accel/drm_accel.c
 create mode 100644 include/drm/drm_accel.h


Re: [PATCH v3 2/2] drm/i915: Never return 0 if not all requests retired

2022-11-22 Thread Tvrtko Ursulin



On 21/11/2022 14:56, Janusz Krzysztofik wrote:

Users of intel_gt_retire_requests_timeout() expect 0 return value on
success.  However, we have no protection from passing back 0 potentially
returned by a call to dma_fence_wait_timeout() when it succedes right
after its timeout has expired.


Is this talking about a potential weakness, or ambiguous kerneldoc, of 
dma_fence_wait_timeout, dma_fence_default_wait and 
i915_request_wait_timeout? They appear to say 0 return means timeout, 
implying unsignaled fence. In other words signaled must return positive 
remaining timeout. Implementations seems to allow a race which indeed 
appears that return 0 and signaled fence is possible.


If dma_fence_wait can indeed return 0 even when a request is signaled, 
then how is timeout ?: -ETIME below correct? It isn't a chance for false 
negative in its' callers?


Regards,

Tvrtko


Replace 0 with -ETIME before potentially using the timeout value as return
code, so -ETIME is returned if there are still some requests not retired
after timeout, 0 otherwise.

v3: Use conditional expression, more compact but also better reflecting
 intention standing behind the change.

v2: Move the added lines down so flush_submission() is not affected.

Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Janusz Krzysztofik 
Reviewed-by: Andrzej Hajda 
Cc: sta...@vger.kernel.org # v5.5+
---
  drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c 
b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index edb881d756309..1dfd01668c79c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -199,7 +199,7 @@ out_active: spin_lock(&timelines->lock);
if (remaining_timeout)
*remaining_timeout = timeout;
  
-	return active_count ? timeout : 0;

+   return active_count ? timeout ?: -ETIME : 0;
  }
  
  static void retire_work_handler(struct work_struct *work)


Re: [PATCH v28 01/11] dt-bindings: arm: mediatek: mmsys: add vdosys1 compatible for MT8195

2022-11-22 Thread 林欣螢


Re: Build regressions/improvements in v6.1-rc6

2022-11-22 Thread Geert Uytterhoeven

On Tue, 22 Nov 2022, Geert Uytterhoeven wrote:

JFYI, when comparing v6.1-rc6[1] to v6.1-rc5[3], the summaries are:
 - build errors: +6/-0


  + /kisskb/src/arch/sh/include/asm/io.h: error: cast to pointer from integer of 
different size [-Werror=int-to-pointer-cast]:  => 239:34

sh4-gcc11/sh-allmodconfig (in cvm_oct_free_hw_memory())

  + /kisskb/src/arch/um/include/asm/processor-generic.h: error: called object is 
not a function or function pointer:  => 94:18
  + /kisskb/src/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c: error: 
control reaches end of non-void function [-Werror=return-type]:  => 1934:1

um-x86_64/um-all{mod,yes}config (in kfd_cpumask_to_apic_id())

  + /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: 'X86_VENDOR_AMD' 
undeclared (first use in this function):  => 149:37
  + /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: 'struct 
cpuinfo_um' has no member named 'x86_vendor':  => 149:22
  + /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: control reaches 
end of non-void function [-Werror=return-type]:  => 150:1

um-x86_64/um-allyesconfig


[1] 
http://kisskb.ellerman.id.au/kisskb/branch/linus/head/eb7081409f94a9a8608593d0fb63a1aa3d6f95d8/
 (all 149 configs)
[3] 
http://kisskb.ellerman.id.au/kisskb/branch/linus/head/094226ad94f471a9f19e8f8e7140a09c2625abaa/
 (all 149 configs)


Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v4 3/4] drm: initialize accel framework

2022-11-22 Thread Melissa Wen
On 11/19, Oded Gabbay wrote:
> Now that we have the accel framework code ready, let's call the
> accel functions from all the appropriate places. These places are the
> drm module init/exit functions, and all the drm_minor handling
> functions.

Hi Oded,

The proposal overall LGTM, but I didn't manage to compile the kernel
with your patch series when DRM is enabled but accel support doesn't.
Multiple missing link errors...

Am I missing something? Can you take a look at it from this patch 3/4?

Thanks,

Melissa
> 
> Signed-off-by: Oded Gabbay 
> ---
>  drivers/gpu/drm/drm_drv.c   | 102 ++--
>  drivers/gpu/drm/drm_sysfs.c |  24 ++---
>  2 files changed, 91 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 8214a0b1ab7f..1aec019f6389 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -90,6 +91,8 @@ static struct drm_minor **drm_minor_get_slot(struct 
> drm_device *dev,
>   return &dev->primary;
>   case DRM_MINOR_RENDER:
>   return &dev->render;
> + case DRM_MINOR_ACCEL:
> + return &dev->accel;
>   default:
>   BUG();
>   }
> @@ -104,9 +107,13 @@ static void drm_minor_alloc_release(struct drm_device 
> *dev, void *data)
>  
>   put_device(minor->kdev);
>  
> - spin_lock_irqsave(&drm_minor_lock, flags);
> - idr_remove(&drm_minors_idr, minor->index);
> - spin_unlock_irqrestore(&drm_minor_lock, flags);
> + if (minor->type == DRM_MINOR_ACCEL) {
> + accel_minor_remove(minor->index);
> + } else {
> + spin_lock_irqsave(&drm_minor_lock, flags);
> + idr_remove(&drm_minors_idr, minor->index);
> + spin_unlock_irqrestore(&drm_minor_lock, flags);
> + }
>  }
>  
>  static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> @@ -123,13 +130,17 @@ static int drm_minor_alloc(struct drm_device *dev, 
> unsigned int type)
>   minor->dev = dev;
>  
>   idr_preload(GFP_KERNEL);
> - spin_lock_irqsave(&drm_minor_lock, flags);
> - r = idr_alloc(&drm_minors_idr,
> -   NULL,
> -   64 * type,
> -   64 * (type + 1),
> -   GFP_NOWAIT);
> - spin_unlock_irqrestore(&drm_minor_lock, flags);
> + if (type == DRM_MINOR_ACCEL) {
> + r = accel_minor_alloc();
> + } else {
> + spin_lock_irqsave(&drm_minor_lock, flags);
> + r = idr_alloc(&drm_minors_idr,
> + NULL,
> + 64 * type,
> + 64 * (type + 1),
> + GFP_NOWAIT);
> + spin_unlock_irqrestore(&drm_minor_lock, flags);
> + }
>   idr_preload_end();
>  
>   if (r < 0)
> @@ -161,10 +172,14 @@ static int drm_minor_register(struct drm_device *dev, 
> unsigned int type)
>   if (!minor)
>   return 0;
>  
> - ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
> - if (ret) {
> - DRM_ERROR("DRM: Failed to initialize /sys/kernel/debug/dri.\n");
> - goto err_debugfs;
> + if (minor->type == DRM_MINOR_ACCEL) {
> + accel_debugfs_init(minor, minor->index);
> + } else {
> + ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
> + if (ret) {
> + DRM_ERROR("DRM: Failed to initialize 
> /sys/kernel/debug/dri.\n");
> + goto err_debugfs;
> + }
>   }
>  
>   ret = device_add(minor->kdev);
> @@ -172,9 +187,13 @@ static int drm_minor_register(struct drm_device *dev, 
> unsigned int type)
>   goto err_debugfs;
>  
>   /* replace NULL with @minor so lookups will succeed from now on */
> - spin_lock_irqsave(&drm_minor_lock, flags);
> - idr_replace(&drm_minors_idr, minor, minor->index);
> - spin_unlock_irqrestore(&drm_minor_lock, flags);
> + if (minor->type == DRM_MINOR_ACCEL) {
> + accel_minor_replace(minor, minor->index);
> + } else {
> + spin_lock_irqsave(&drm_minor_lock, flags);
> + idr_replace(&drm_minors_idr, minor, minor->index);
> + spin_unlock_irqrestore(&drm_minor_lock, flags);
> + }
>  
>   DRM_DEBUG("new minor registered %d\n", minor->index);
>   return 0;
> @@ -194,9 +213,13 @@ static void drm_minor_unregister(struct drm_device *dev, 
> unsigned int type)
>   return;
>  
>   /* replace @minor with NULL so lookups will fail from now on */
> - spin_lock_irqsave(&drm_minor_lock, flags);
> - idr_replace(&drm_minors_idr, NULL, minor->index);
> - spin_unlock_irqrestore(&drm_minor_lock, flags);
> + if (minor->type == DRM_MINOR_ACCEL) {
> + accel_minor_replace(NULL, minor->index);
> + } else {
> + spi

Re: [PATCH v4 3/4] drm: initialize accel framework

2022-11-22 Thread Oded Gabbay
On Tue, Nov 22, 2022 at 12:56 PM Melissa Wen  wrote:
>
> On 11/19, Oded Gabbay wrote:
> > Now that we have the accel framework code ready, let's call the
> > accel functions from all the appropriate places. These places are the
> > drm module init/exit functions, and all the drm_minor handling
> > functions.
>
> Hi Oded,
>
> The proposal overall LGTM, but I didn't manage to compile the kernel
> with your patch series when DRM is enabled but accel support doesn't.
> Multiple missing link errors...
>
> Am I missing something? Can you take a look at it from this patch 3/4?
>
> Thanks,
>
> Melissa
Looking at it now, thanks for letting me know.
Oded

> >
> > Signed-off-by: Oded Gabbay 
> > ---
> >  drivers/gpu/drm/drm_drv.c   | 102 ++--
> >  drivers/gpu/drm/drm_sysfs.c |  24 ++---
> >  2 files changed, 91 insertions(+), 35 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 8214a0b1ab7f..1aec019f6389 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -35,6 +35,7 @@
> >  #include 
> >  #include 
> >
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -90,6 +91,8 @@ static struct drm_minor **drm_minor_get_slot(struct 
> > drm_device *dev,
> >   return &dev->primary;
> >   case DRM_MINOR_RENDER:
> >   return &dev->render;
> > + case DRM_MINOR_ACCEL:
> > + return &dev->accel;
> >   default:
> >   BUG();
> >   }
> > @@ -104,9 +107,13 @@ static void drm_minor_alloc_release(struct drm_device 
> > *dev, void *data)
> >
> >   put_device(minor->kdev);
> >
> > - spin_lock_irqsave(&drm_minor_lock, flags);
> > - idr_remove(&drm_minors_idr, minor->index);
> > - spin_unlock_irqrestore(&drm_minor_lock, flags);
> > + if (minor->type == DRM_MINOR_ACCEL) {
> > + accel_minor_remove(minor->index);
> > + } else {
> > + spin_lock_irqsave(&drm_minor_lock, flags);
> > + idr_remove(&drm_minors_idr, minor->index);
> > + spin_unlock_irqrestore(&drm_minor_lock, flags);
> > + }
> >  }
> >
> >  static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> > @@ -123,13 +130,17 @@ static int drm_minor_alloc(struct drm_device *dev, 
> > unsigned int type)
> >   minor->dev = dev;
> >
> >   idr_preload(GFP_KERNEL);
> > - spin_lock_irqsave(&drm_minor_lock, flags);
> > - r = idr_alloc(&drm_minors_idr,
> > -   NULL,
> > -   64 * type,
> > -   64 * (type + 1),
> > -   GFP_NOWAIT);
> > - spin_unlock_irqrestore(&drm_minor_lock, flags);
> > + if (type == DRM_MINOR_ACCEL) {
> > + r = accel_minor_alloc();
> > + } else {
> > + spin_lock_irqsave(&drm_minor_lock, flags);
> > + r = idr_alloc(&drm_minors_idr,
> > + NULL,
> > + 64 * type,
> > + 64 * (type + 1),
> > + GFP_NOWAIT);
> > + spin_unlock_irqrestore(&drm_minor_lock, flags);
> > + }
> >   idr_preload_end();
> >
> >   if (r < 0)
> > @@ -161,10 +172,14 @@ static int drm_minor_register(struct drm_device *dev, 
> > unsigned int type)
> >   if (!minor)
> >   return 0;
> >
> > - ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
> > - if (ret) {
> > - DRM_ERROR("DRM: Failed to initialize 
> > /sys/kernel/debug/dri.\n");
> > - goto err_debugfs;
> > + if (minor->type == DRM_MINOR_ACCEL) {
> > + accel_debugfs_init(minor, minor->index);
> > + } else {
> > + ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
> > + if (ret) {
> > + DRM_ERROR("DRM: Failed to initialize 
> > /sys/kernel/debug/dri.\n");
> > + goto err_debugfs;
> > + }
> >   }
> >
> >   ret = device_add(minor->kdev);
> > @@ -172,9 +187,13 @@ static int drm_minor_register(struct drm_device *dev, 
> > unsigned int type)
> >   goto err_debugfs;
> >
> >   /* replace NULL with @minor so lookups will succeed from now on */
> > - spin_lock_irqsave(&drm_minor_lock, flags);
> > - idr_replace(&drm_minors_idr, minor, minor->index);
> > - spin_unlock_irqrestore(&drm_minor_lock, flags);
> > + if (minor->type == DRM_MINOR_ACCEL) {
> > + accel_minor_replace(minor, minor->index);
> > + } else {
> > + spin_lock_irqsave(&drm_minor_lock, flags);
> > + idr_replace(&drm_minors_idr, minor, minor->index);
> > + spin_unlock_irqrestore(&drm_minor_lock, flags);
> > + }
> >
> >   DRM_DEBUG("new minor registered %d\n", minor->index);
> >   return 0;
> > @@ -194,9 +213,13 @@ static void drm_minor_unregister(struct drm_device 
> > *dev, unsigned int type)
> >   return;
> >
> >   /* replace @mino

Re: [PATCH v4 3/4] drm: initialize accel framework

2022-11-22 Thread Oded Gabbay
On Tue, Nov 22, 2022 at 12:59 PM Oded Gabbay  wrote:
>
> On Tue, Nov 22, 2022 at 12:56 PM Melissa Wen  wrote:
> >
> > On 11/19, Oded Gabbay wrote:
> > > Now that we have the accel framework code ready, let's call the
> > > accel functions from all the appropriate places. These places are the
> > > drm module init/exit functions, and all the drm_minor handling
> > > functions.
> >
> > Hi Oded,
> >
> > The proposal overall LGTM, but I didn't manage to compile the kernel
> > with your patch series when DRM is enabled but accel support doesn't.
> > Multiple missing link errors...
> >
> > Am I missing something? Can you take a look at it from this patch 3/4?
> >
> > Thanks,
> >
> > Melissa
> Looking at it now, thanks for letting me know.
> Oded
Found the issue, missing } at end of accel_debugfs_init() in
drm_accel.h. Only compiles when accel support isn't compiled in.
I'll fix it before sending the PR to Dave.
Much appreciated :)
Oded

>
> > >
> > > Signed-off-by: Oded Gabbay 
> > > ---
> > >  drivers/gpu/drm/drm_drv.c   | 102 ++--
> > >  drivers/gpu/drm/drm_sysfs.c |  24 ++---
> > >  2 files changed, 91 insertions(+), 35 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > index 8214a0b1ab7f..1aec019f6389 100644
> > > --- a/drivers/gpu/drm/drm_drv.c
> > > +++ b/drivers/gpu/drm/drm_drv.c
> > > @@ -35,6 +35,7 @@
> > >  #include 
> > >  #include 
> > >
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > > @@ -90,6 +91,8 @@ static struct drm_minor **drm_minor_get_slot(struct 
> > > drm_device *dev,
> > >   return &dev->primary;
> > >   case DRM_MINOR_RENDER:
> > >   return &dev->render;
> > > + case DRM_MINOR_ACCEL:
> > > + return &dev->accel;
> > >   default:
> > >   BUG();
> > >   }
> > > @@ -104,9 +107,13 @@ static void drm_minor_alloc_release(struct 
> > > drm_device *dev, void *data)
> > >
> > >   put_device(minor->kdev);
> > >
> > > - spin_lock_irqsave(&drm_minor_lock, flags);
> > > - idr_remove(&drm_minors_idr, minor->index);
> > > - spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > + if (minor->type == DRM_MINOR_ACCEL) {
> > > + accel_minor_remove(minor->index);
> > > + } else {
> > > + spin_lock_irqsave(&drm_minor_lock, flags);
> > > + idr_remove(&drm_minors_idr, minor->index);
> > > + spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > + }
> > >  }
> > >
> > >  static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> > > @@ -123,13 +130,17 @@ static int drm_minor_alloc(struct drm_device *dev, 
> > > unsigned int type)
> > >   minor->dev = dev;
> > >
> > >   idr_preload(GFP_KERNEL);
> > > - spin_lock_irqsave(&drm_minor_lock, flags);
> > > - r = idr_alloc(&drm_minors_idr,
> > > -   NULL,
> > > -   64 * type,
> > > -   64 * (type + 1),
> > > -   GFP_NOWAIT);
> > > - spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > + if (type == DRM_MINOR_ACCEL) {
> > > + r = accel_minor_alloc();
> > > + } else {
> > > + spin_lock_irqsave(&drm_minor_lock, flags);
> > > + r = idr_alloc(&drm_minors_idr,
> > > + NULL,
> > > + 64 * type,
> > > + 64 * (type + 1),
> > > + GFP_NOWAIT);
> > > + spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > + }
> > >   idr_preload_end();
> > >
> > >   if (r < 0)
> > > @@ -161,10 +172,14 @@ static int drm_minor_register(struct drm_device 
> > > *dev, unsigned int type)
> > >   if (!minor)
> > >   return 0;
> > >
> > > - ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
> > > - if (ret) {
> > > - DRM_ERROR("DRM: Failed to initialize 
> > > /sys/kernel/debug/dri.\n");
> > > - goto err_debugfs;
> > > + if (minor->type == DRM_MINOR_ACCEL) {
> > > + accel_debugfs_init(minor, minor->index);
> > > + } else {
> > > + ret = drm_debugfs_init(minor, minor->index, 
> > > drm_debugfs_root);
> > > + if (ret) {
> > > + DRM_ERROR("DRM: Failed to initialize 
> > > /sys/kernel/debug/dri.\n");
> > > + goto err_debugfs;
> > > + }
> > >   }
> > >
> > >   ret = device_add(minor->kdev);
> > > @@ -172,9 +187,13 @@ static int drm_minor_register(struct drm_device 
> > > *dev, unsigned int type)
> > >   goto err_debugfs;
> > >
> > >   /* replace NULL with @minor so lookups will succeed from now on */
> > > - spin_lock_irqsave(&drm_minor_lock, flags);
> > > - idr_replace(&drm_minors_idr, minor, minor->index);
> > > - spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > + if (minor->type == DRM_MINOR_ACCEL) {
> > > + accel_min

Re: [PATCH v4 07/10] dt-bindings: phy: Add Cadence HDP-TX DP PHY

2022-11-22 Thread Krzysztof Kozlowski
On 21/11/2022 08:23, Sandor Yu wrote:
> Add bindings for Cadence HDP-TX DisplayPort PHY.
> 
> Signed-off-by: Sandor Yu 
> --->  .../bindings/phy/cdns,hdptx-dp-phy.yaml   | 68 +++
>  1 file changed, 68 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/phy/cdns,hdptx-dp-phy.yaml
> 
> diff --git a/Documentation/devicetree/bindings/phy/cdns,hdptx-dp-phy.yaml 
> b/Documentation/devicetree/bindings/phy/cdns,hdptx-dp-phy.yaml
> new file mode 100644
> index ..b997c15ff0bb
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/phy/cdns,hdptx-dp-phy.yaml
> @@ -0,0 +1,68 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/phy/cdns,hdptx-dp-phy.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Cadence HDP-TX(HDMI/DisplayPort) PHY for DisplayPort protocol
> +
> +maintainers:
> +  - Sandor Yu 
> +
> +properties:
> +  compatible:
> +enum:
> +  - cdns,hdptx-dp-phy
> +
> +  reg:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: PHY reference clock.
> +  - description: APB clock.
> +
> +  clock-names:
> +items:
> +  - const: refclk
> +  - const: apbclk

Drop "clk" suffix.


Best regards,
Krzysztof



Re: [PATCH v4 09/10] dt-bindings: phy: Add Cadence HDP-TX HDMI PHY

2022-11-22 Thread Krzysztof Kozlowski
On 21/11/2022 08:23, Sandor Yu wrote:
> Add bindings for Cadence HDP-TX HDMI PHY.
> 
> Signed-off-by: Sandor Yu 
> ---
>  .../bindings/phy/cdns,hdptx-hdmi-phy.yaml | 52 +++
>  1 file changed, 52 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/phy/cdns,hdptx-hdmi-phy.yaml
> 
> diff --git a/Documentation/devicetree/bindings/phy/cdns,hdptx-hdmi-phy.yaml 
> b/Documentation/devicetree/bindings/phy/cdns,hdptx-hdmi-phy.yaml
> new file mode 100644
> index ..99352e655eec
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/phy/cdns,hdptx-hdmi-phy.yaml
> @@ -0,0 +1,52 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/phy/cdns,hdptx-hdmi-phy.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Cadence HDP-TX(HDMI/DisplayPort) PHY for HDMI protocol
> +
> +maintainers:
> +  - Sandor Yu 
> +
> +properties:
> +  compatible:
> +enum:
> +  - cdns,hdptx-hdmi-phy
> +
> +  reg:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: PHY reference clock.
> +  - description: APB clock.
> +
> +  clock-names:
> +items:
> +  - const: refclk
> +  - const: apbclk

Drop "clk" suffix.

Best regards,
Krzysztof



Re: [PATCH v4 3/4] drm: initialize accel framework

2022-11-22 Thread Melissa Wen
11/22, Oded Gabbay wrote:
> On Tue, Nov 22, 2022 at 12:59 PM Oded Gabbay  wrote:
> >
> > On Tue, Nov 22, 2022 at 12:56 PM Melissa Wen  wrote:
> > >
> > > On 11/19, Oded Gabbay wrote:
> > > > Now that we have the accel framework code ready, let's call the
> > > > accel functions from all the appropriate places. These places are the
> > > > drm module init/exit functions, and all the drm_minor handling
> > > > functions.
> > >
> > > Hi Oded,
> > >
> > > The proposal overall LGTM, but I didn't manage to compile the kernel
> > > with your patch series when DRM is enabled but accel support doesn't.
> > > Multiple missing link errors...
> > >
> > > Am I missing something? Can you take a look at it from this patch 3/4?
> > >
> > > Thanks,
> > >
> > > Melissa
> > Looking at it now, thanks for letting me know.
> > Oded
> Found the issue, missing } at end of accel_debugfs_init() in
> drm_accel.h. Only compiles when accel support isn't compiled in.
> I'll fix it before sending the PR to Dave.
> Much appreciated :)
> Oded

Oh, great! Just checked here and everything is fine now.
With that, you can add my r-b for the entire series too

Reviewed-by: Melissa Wen 

Thanks for working on it,

Melissa

> 
> >
> > > >
> > > > Signed-off-by: Oded Gabbay 
> > > > ---
> > > >  drivers/gpu/drm/drm_drv.c   | 102 ++--
> > > >  drivers/gpu/drm/drm_sysfs.c |  24 ++---
> > > >  2 files changed, 91 insertions(+), 35 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > > index 8214a0b1ab7f..1aec019f6389 100644
> > > > --- a/drivers/gpu/drm/drm_drv.c
> > > > +++ b/drivers/gpu/drm/drm_drv.c
> > > > @@ -35,6 +35,7 @@
> > > >  #include 
> > > >  #include 
> > > >
> > > > +#include 
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > @@ -90,6 +91,8 @@ static struct drm_minor **drm_minor_get_slot(struct 
> > > > drm_device *dev,
> > > >   return &dev->primary;
> > > >   case DRM_MINOR_RENDER:
> > > >   return &dev->render;
> > > > + case DRM_MINOR_ACCEL:
> > > > + return &dev->accel;
> > > >   default:
> > > >   BUG();
> > > >   }
> > > > @@ -104,9 +107,13 @@ static void drm_minor_alloc_release(struct 
> > > > drm_device *dev, void *data)
> > > >
> > > >   put_device(minor->kdev);
> > > >
> > > > - spin_lock_irqsave(&drm_minor_lock, flags);
> > > > - idr_remove(&drm_minors_idr, minor->index);
> > > > - spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > > + if (minor->type == DRM_MINOR_ACCEL) {
> > > > + accel_minor_remove(minor->index);
> > > > + } else {
> > > > + spin_lock_irqsave(&drm_minor_lock, flags);
> > > > + idr_remove(&drm_minors_idr, minor->index);
> > > > + spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > > + }
> > > >  }
> > > >
> > > >  static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> > > > @@ -123,13 +130,17 @@ static int drm_minor_alloc(struct drm_device 
> > > > *dev, unsigned int type)
> > > >   minor->dev = dev;
> > > >
> > > >   idr_preload(GFP_KERNEL);
> > > > - spin_lock_irqsave(&drm_minor_lock, flags);
> > > > - r = idr_alloc(&drm_minors_idr,
> > > > -   NULL,
> > > > -   64 * type,
> > > > -   64 * (type + 1),
> > > > -   GFP_NOWAIT);
> > > > - spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > > + if (type == DRM_MINOR_ACCEL) {
> > > > + r = accel_minor_alloc();
> > > > + } else {
> > > > + spin_lock_irqsave(&drm_minor_lock, flags);
> > > > + r = idr_alloc(&drm_minors_idr,
> > > > + NULL,
> > > > + 64 * type,
> > > > + 64 * (type + 1),
> > > > + GFP_NOWAIT);
> > > > + spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > > + }
> > > >   idr_preload_end();
> > > >
> > > >   if (r < 0)
> > > > @@ -161,10 +172,14 @@ static int drm_minor_register(struct drm_device 
> > > > *dev, unsigned int type)
> > > >   if (!minor)
> > > >   return 0;
> > > >
> > > > - ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
> > > > - if (ret) {
> > > > - DRM_ERROR("DRM: Failed to initialize 
> > > > /sys/kernel/debug/dri.\n");
> > > > - goto err_debugfs;
> > > > + if (minor->type == DRM_MINOR_ACCEL) {
> > > > + accel_debugfs_init(minor, minor->index);
> > > > + } else {
> > > > + ret = drm_debugfs_init(minor, minor->index, 
> > > > drm_debugfs_root);
> > > > + if (ret) {
> > > > + DRM_ERROR("DRM: Failed to initialize 
> > > > /sys/kernel/debug/dri.\n");
> > > > + goto err_debugfs;
> > > > + }
> > > >   }
> > > >
> > > >   ret = device_add(minor->kdev);
> > > > @@ -172,9 +187,13 @@ st

[PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()

2022-11-22 Thread Xiongfeng Wang
As comment of pci_get_class() says, it returns a pci_device with its
refcount increased and decreased the refcount for the input parameter
@from if it is not NULL.

If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we
need to call pci_dev_put() to decrease the refcount. Add the missing
pci_dev_put() to avoid refcount leak.

Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with ATRM")
Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX handler 
(v3)")
Signed-off-by: Xiongfeng Wang 
---
 drivers/gpu/drm/radeon/radeon_bios.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/radeon/radeon_bios.c 
b/drivers/gpu/drm/radeon/radeon_bios.c
index 33121655d50b..2df6ce3e32cb 100644
--- a/drivers/gpu/drm/radeon/radeon_bios.c
+++ b/drivers/gpu/drm/radeon/radeon_bios.c
@@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device *rdev)
 
if (!found)
return false;
+   pci_dev_put(pdev);
 
rdev->bios = kmalloc(size, GFP_KERNEL);
if (!rdev->bios) {
-- 
2.20.1



[PATCH 0/2] drm: Fix PCI device refcount leak

2022-11-22 Thread Xiongfeng Wang
As comment of pci_get_class() says, it returns a pci_device with its
refcount increased and decreased the refcount for the input parameter
@from if it is not NULL.

If we use pci_get_class() to iterate all the PCI device of a certain
class, when we find the expected device and break the iteration loop,
the refcount of the found PCI device is increased. When finish using
the PCI device, we need to call pci_dev_put() to decrease the refcount.


Xiongfeng Wang (2):
  drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()
  drm/amdgpu: Fix PCI device refcount leak in amdgpu_atrm_get_bios()

 drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 1 +
 drivers/gpu/drm/radeon/radeon_bios.c | 1 +
 2 files changed, 2 insertions(+)

-- 
2.20.1



[PATCH 2/2] drm/amdgpu: Fix PCI device refcount leak in amdgpu_atrm_get_bios()

2022-11-22 Thread Xiongfeng Wang
As comment of pci_get_class() says, it returns a pci_device with its
refcount increased and decreased the refcount for the input parameter
@from if it is not NULL.

If we break the loop in amdgpu_atrm_get_bios() with 'pdev' not NULL, we
need to call pci_dev_put() to decrease the refcount. Add the missing
pci_dev_put() to avoid refcount leak.

Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Xiongfeng Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
index e363f56c72af..30c28a69e847 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
@@ -317,6 +317,7 @@ static bool amdgpu_atrm_get_bios(struct amdgpu_device *adev)
 
if (!found)
return false;
+   pci_dev_put(pdev);
 
adev->bios = kmalloc(size, GFP_KERNEL);
if (!adev->bios) {
-- 
2.20.1



[git pull v2] new subsystem for compute accelerator devices

2022-11-22 Thread Oded Gabbay
Hi Dave,

Resending the pull-request with the fix of the build when CONFIG_DRM_ACCEL is
set to N.

Thanks,
Oded

The following changes since commit fc58764bbf602b65a6f63c53e5fd6feae76c510c:

  Merge tag 'amd-drm-next-6.2-2022-11-18' of 
https://gitlab.freedesktop.org/agd5f/linux into drm-next (2022-11-22 13:41:11 
+1000)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git 
tags/drm-accel-2022-11-22

for you to fetch changes up to 8c5577a5ccc632685e65168fc6890b72a779f93a:

  doc: add documentation for accel subsystem (2022-11-22 13:14:52 +0200)


This tag contains the patches that add the new compute acceleration
subsystem, which is part of the DRM subsystem.

The patches:
- Add a new directory at drivers/accel.
- Add a new major (261) for compute accelerators.
- Add a new DRM minor type for compute accelerators.
- Integrate the accel core code with DRM core code.
- Add documentation for the accel subsystem.


Oded Gabbay (4):
  drivers/accel: define kconfig and register a new major
  accel: add dedicated minor for accelerator devices
  drm: initialize accel framework
  doc: add documentation for accel subsystem

 Documentation/accel/index.rst |  17 ++
 Documentation/accel/introduction.rst  | 110 
 Documentation/admin-guide/devices.txt |   5 +
 Documentation/subsystem-apis.rst  |   1 +
 MAINTAINERS   |   9 +
 drivers/Kconfig   |   2 +
 drivers/accel/Kconfig |  24 +++
 drivers/accel/drm_accel.c | 323 ++
 drivers/gpu/drm/Makefile  |   1 +
 drivers/gpu/drm/drm_drv.c | 101 ---
 drivers/gpu/drm/drm_file.c|   2 +-
 drivers/gpu/drm/drm_sysfs.c   |  24 ++-
 include/drm/drm_accel.h   |  97 ++
 include/drm/drm_device.h  |   3 +
 include/drm/drm_drv.h |   8 +
 include/drm/drm_file.h|  21 ++-
 16 files changed, 711 insertions(+), 37 deletions(-)
 create mode 100644 Documentation/accel/index.rst
 create mode 100644 Documentation/accel/introduction.rst
 create mode 100644 drivers/accel/Kconfig
 create mode 100644 drivers/accel/drm_accel.c
 create mode 100644 include/drm/drm_accel.h


Re: [regression][6.0] After commit b261509952bc19d1012cf732f853659be6ebc61e I see WARNING message at drivers/gpu/drm/drm_modeset_lock.c:276 drm_modeset_drop_locks+0x63/0x70

2022-11-22 Thread Mikhail Gavrilov
On Thu, Oct 13, 2022 at 6:36 PM Mikhail Gavrilov
 wrote:
>
> Hi!
> I bisected an issue of the 6.0 kernel which started happening after
> 6.0-rc7 on all my machines.
>
> Backtrace of this issue looks like as:
>
> [ 2807.339439] [ cut here ]
> [ 2807.339445] WARNING: CPU: 11 PID: 2061 at
> drivers/gpu/drm/drm_modeset_lock.c:276
> drm_modeset_drop_locks+0x63/0x70
>
> bisect points to this commit: b261509952bc19d1012cf732f853659be6ebc61e.
>
> After reverting this commit the WARNING messages described here disappeared.
>

Hi Harry, Christian says that you can help with it.

Thanks.

-- 
Best Regards,
Mike Gavrilov.


[PATCH] drm/amd/display: Fix set scaling doesn's work

2022-11-22 Thread hongao
[Why]
Setting scaling does not correctly update CRTC state. As a result
dc stream state's src (composition area) && dest (addressable area)
was not calculated as expected. This causes set scaling doesn's work.

[How]
Correctly update CRTC state when setting scaling property.

Signed-off-by: hongao 

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 3e1ecca72430..a88a6f758748 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9386,8 +9386,8 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
goto fail;
}
 
-   if (dm_old_con_state->abm_level !=
-   dm_new_con_state->abm_level)
+   if (dm_old_con_state->abm_level != dm_new_con_state->abm_level 
||
+   dm_old_con_state->scaling != dm_new_con_state->scaling)
new_crtc_state->connectors_changed = true;
}
 
-- 
2.20.1



[PATCH linux-next] drm/i915/gvt: use sysfs_streq() instead of strncmp()

2022-11-22 Thread yang.yang29
From: Xu Panda 

Replace the open-code with sysfs_streq().

Signed-off-by: Xu Panda 
Signed-off-by: Yang Yang 
---
 drivers/gpu/drm/i915/gvt/cmd_parser.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c 
b/drivers/gpu/drm/i915/gvt/cmd_parser.c
index 0ebf5fbf0e39..7b9a5f7f5363 100644
--- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
+++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
@@ -917,8 +917,8 @@ static int cmd_reg_handler(struct parser_exec_state *s,
return -EBADRQC;
}

-   if (!strncmp(cmd, "srm", 3) ||
-   !strncmp(cmd, "lrm", 3)) {
+   if (sysfs_streq(cmd, "srm") ||
+   sysfs_streq(cmd, "lrm")) {
if (offset == i915_mmio_reg_offset(GEN8_L3SQCREG4) ||
offset == 0x21f0 ||
(IS_BROADWELL(gvt->gt->i915) &&
@@ -931,8 +931,8 @@ static int cmd_reg_handler(struct parser_exec_state *s,
}
}

-   if (!strncmp(cmd, "lrr-src", 7) ||
-   !strncmp(cmd, "lrr-dst", 7)) {
+   if (sysfs_streq(cmd, "lrr-src") ||
+   sysfs_streq(cmd, "lrr-dst")) {
if (IS_BROADWELL(gvt->gt->i915) && offset == 0x215c)
return 0;
else {
@@ -941,12 +941,12 @@ static int cmd_reg_handler(struct parser_exec_state *s,
}
}

-   if (!strncmp(cmd, "pipe_ctrl", 9)) {
+   if (sysfs_streq(cmd, "pipe_ctrl")) {
/* TODO: add LRI POST logic here */
return 0;
}

-   if (strncmp(cmd, "lri", 3))
+   if (!sysfs_streq(cmd, "lri"))
return -EPERM;

/* below are all lri handlers */
@@ -1011,7 +1011,7 @@ static int cmd_reg_handler(struct parser_exec_state *s,
 */
if (GRAPHICS_VER(s->engine->i915) == 9 &&
intel_gvt_mmio_is_sr_in_ctx(gvt, offset) &&
-   !strncmp(cmd, "lri", 3)) {
+   sysfs_streq(cmd, "lri")) {
intel_gvt_read_gpa(s->vgpu,
s->workload->ring_context_gpa + 12, &ctx_sr_ctl, 4);
/* check inhibit context */
-- 
2.15.2


Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage

2022-11-22 Thread Hans Verkuil
Hi Tomasz, David,

On 11/8/22 05:45, Tomasz Figa wrote:
> Hi David,
> 
> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand  wrote:
>>
>> FOLL_FORCE is really only for debugger access. According to commit
>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
>> writable"), the pinned pages are always writable.
> 
> Actually that patch is only a workaround to temporarily disable
> support for read-only pages as they seemed to suffer from some
> corruption issues in the retrieved user pages. We expect to support
> read-only pages as hardware input after. That said, FOLL_FORCE doesn't
> sound like the right thing even in that case, but I don't know the
> background behind it being added here in the first place. +Hans
> Verkuil +Marek Szyprowski do you happen to remember anything about it?

I tracked the use of 'force' all the way back to the first git commit
(2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the
reason is lost in the mists of time.

I'm not sure if the 'force' argument of get_user_pages() at that time
even meant the same as FOLL_FORCE today. From what I can tell it has just
been faithfully used ever since, but I have my doubt that anyone understands
the reason behind it since it was never explained.

Looking at this old LWN article https://lwn.net/Articles/28548/ suggests
that it might be related to calling get_user_pages for write buffers
(non-zero write argument) where you also want to be able to read from the
buffer. That is certainly something that some drivers need to do post-capture
fixups.

But 'force' was also always set for read buffers, and I don't know if that
was something that was actually needed, or just laziness.

I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
allow drivers to read from the buffer?

Regards,

Hans

> 
> Best regards,
> Tomasz
> 
>>
>> FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove
>> it.
>>
>> Cc: Tomasz Figa 
>> Cc: Marek Szyprowski 
>> Cc: Mauro Carvalho Chehab 
>> Signed-off-by: David Hildenbrand 
>> ---
>>  drivers/media/common/videobuf2/frame_vector.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/media/common/videobuf2/frame_vector.c 
>> b/drivers/media/common/videobuf2/frame_vector.c
>> index 542dde9d2609..062e98148c53 100644
>> --- a/drivers/media/common/videobuf2/frame_vector.c
>> +++ b/drivers/media/common/videobuf2/frame_vector.c
>> @@ -50,7 +50,7 @@ int get_vaddr_frames(unsigned long start, unsigned int 
>> nr_frames,
>> start = untagged_addr(start);
>>
>> ret = pin_user_pages_fast(start, nr_frames,
>> - FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
>> + FOLL_WRITE | FOLL_LONGTERM,
>>   (struct page **)(vec->ptrs));
>> if (ret > 0) {
>> vec->got_ref = true;
>> --
>> 2.38.1
>>



Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage

2022-11-22 Thread David Hildenbrand

On 22.11.22 13:25, Hans Verkuil wrote:

Hi Tomasz, David,

On 11/8/22 05:45, Tomasz Figa wrote:

Hi David,

On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand  wrote:


FOLL_FORCE is really only for debugger access. According to commit
707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
writable"), the pinned pages are always writable.


Actually that patch is only a workaround to temporarily disable
support for read-only pages as they seemed to suffer from some
corruption issues in the retrieved user pages. We expect to support
read-only pages as hardware input after. That said, FOLL_FORCE doesn't
sound like the right thing even in that case, but I don't know the
background behind it being added here in the first place. +Hans
Verkuil +Marek Szyprowski do you happen to remember anything about it?


I tracked the use of 'force' all the way back to the first git commit
(2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the
reason is lost in the mists of time.

I'm not sure if the 'force' argument of get_user_pages() at that time
even meant the same as FOLL_FORCE today. From what I can tell it has just
been faithfully used ever since, but I have my doubt that anyone understands
the reason behind it since it was never explained.

Looking at this old LWN article https://lwn.net/Articles/28548/ suggests
that it might be related to calling get_user_pages for write buffers
(non-zero write argument) where you also want to be able to read from the
buffer. That is certainly something that some drivers need to do post-capture
fixups.

But 'force' was also always set for read buffers, and I don't know if that
was something that was actually needed, or just laziness.

I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
allow drivers to read from the buffer?


Yes. The only problematic corner case I can imagine is if someone has a 
VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin 
user space pages as a read buffer. We'd specify now FOLL_WRITE without 
FOLL_FORCE and GUP would reject that: write access without write 
permissions is invalid.


There would be no way around "fixing" this implementation to not specify 
FOLL_WRITE when only reading from user-space pages. Not sure what the 
implications are regarding that corruption that was mentioned in 
707947247e95.


Having said that, I assume such a scenario is unlikely -- but you might 
know better how user space usually uses this interface. There would be 
three options:


1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid.
2) Remove FOLL_FORCE and see if anybody even notices (this patch) and
   leave the implementation as is for now.
3) Remove FOLL_FORCE and fixup the implementation to only specify
   FOLL_WRITE if the pages will actually get written to.

3) would most probably ideal, however, I am no expert on that code and 
can't do it (707947247e95 confuses me). So naive me would go with 2) first.


--
Thanks,

David / dhildenb



Re: Coverity: intel_hti_uses_phy(): Integer handling issues

2022-11-22 Thread Jani Nikula
On Fri, 18 Nov 2022, coverity-bot  wrote:
> Hello!
>
> This is an experimental semi-automated report about issues detected by
> Coverity from a scan of next-20221118 as part of the linux-next scan project:
> https://scan.coverity.com/projects/linux-next-weekly-scan
>
> You're getting this email because you were associated with the identified
> lines of code (noted below) that were touched by commits:
>
>   Thu Nov 17 16:12:56 2022 +0200
> 62749912540b ("drm/i915/display: move hti under display sub-struct")
>
> Coverity reported the following:
>
> *** CID 1527374:  Integer handling issues  (BAD_SHIFT)
> drivers/gpu/drm/i915/display/intel_hti.c:24 in intel_hti_uses_phy()
> 18if (INTEL_INFO(i915)->display.has_hti)
> 19i915->display.hti.state = intel_de_read(i915, 
> HDPORT_STATE);
> 20 }
> 21
> 22 bool intel_hti_uses_phy(struct drm_i915_private *i915, enum phy phy)
> 23 {
> vvv CID 1527374:  Integer handling issues  (BAD_SHIFT)
> vvv In expression "1UL << 2 * phy + 1", shifting by a negative amount has 
> undefined behavior.  The shift amount, "2 * phy + 1", is as little as -1.
> 24return i915->display.hti.state & HDPORT_ENABLED &&
> 25i915->display.hti.state & HDPORT_DDI_USED(phy);
> 26 }
> 27
> 28 u32 intel_hti_dpll_mask(struct drm_i915_private *i915)
> 29 {
>
> If this is a false positive, please let us know so we can mark it as
> such, or teach the Coverity rules to be smarter. If not, please make
> sure fixes get into linux-next. :) For patches fixing this, please
> include these lines (but double-check the "Fixes" first):
>
> Reported-by: coverity-bot 
> Addresses-Coverity-ID: 1527374 ("Integer handling issues")
> Fixes: 62749912540b ("drm/i915/display: move hti under display sub-struct")

Thanks for the report, fix at [1].

I realize I didn't use the suggested tags above. For one thing, we've
never really logged any proprietary tools used. Looks like
"Addresses-Coverity-ID:" is growing in popularity though.

The Fixes: tag points at code refactoring, it was a pre-existing
condition.

BR,
Jani.


[1] 
https://patchwork.freedesktop.org/patch/msgid/20221122120948.3436180-1-jani.nik...@intel.com


>
> This code appears to be safe currently (intel_hti_uses_phy() is never
> called with PHY_NONE), but perhaps add an explicit check?
>
>   if (WARN_ON(phy == PHY_NONE))
>   return false;
>
> Thanks for your attention!

-- 
Jani Nikula, Intel Open Source Graphics Center


[no subject]

2022-11-22 Thread Denis Arefev
Date: Tue, 22 Nov 2022 15:51:44 +0300
Subject: [PATCH] powerplay: hwmgr: added result check

The return value of the 'div64_s64'
function called in ppevvmath.h:371
was not checked.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Denis Arefev 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h
index dac29fe6cfc6..82aa7130d143 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h
@@ -357,6 +357,7 @@ static fInt fDivide (fInt X, fInt Y)
 {
fInt fZERO, fQuotient;
int64_t longlongX, longlongY;
+   int64_t result;
 
fZERO = ConvertToFraction(0);
 
@@ -367,10 +368,11 @@ static fInt fDivide (fInt X, fInt Y)
longlongY = (int64_t)Y.full;
 
longlongX = longlongX << 16; /*Q(16,16) -> Q(32,32) */
+   longlongY = longlongY << 16;
 
-   div64_s64(longlongX, longlongY); /*Q(32,32) divided by Q(16,16) = 
Q(16,16) Back to original format */
+   result = div64_s64(longlongX, longlongY);
 
-   fQuotient.full = (int)longlongX;
+   fQuotient = ConvertToFraction((int)result);
return fQuotient;
 }
 
-- 
2.25.1



[MAINTAINER TOOLS] docs: updated rules for topic/core-for-CI commit management

2022-11-22 Thread Jani Nikula
Introduce stricter rules for topic/core-for-CI management. Way too many
commits have been added over the years, with insufficient rationale
recorded in the commit message, and insufficient follow-up with removing
the commits from the topic branch.

New rules:

1. Require maintainer ack for rebase. Have better gating on when rebases
   happen and on which baselines.

2. Require maintainer/committer ack for adding/removing commits. No
   single individual should decide.

3. Require gitlab issues for new commits added. Improve tracking for
   removing the commits.

Also use the stronger "must" for commit message requiring the
justification for the commit being in topic/core-for-CI.

Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: Tvrtko Ursulin 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: intel-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: dim-to...@lists.freedesktop.org
Signed-off-by: Jani Nikula 
---
 drm-tip.rst | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drm-tip.rst b/drm-tip.rst
index deae95cdd2fe..24036e2ef576 100644
--- a/drm-tip.rst
+++ b/drm-tip.rst
@@ -203,11 +203,13 @@ justified exception. The primary goal is to fix issues 
originating from Linus'
 tree. Issues that would need drm-next or other DRM subsystem tree as baseline
 should be fixed in the offending DRM subsystem tree.
 
-Only rebase the branch if you really know what you're doing. When in doubt, ask
-the maintainers. You'll need to be able to handle any conflicts in non-drm code
-while rebasing.
+Only rebase the branch if you really know what you're doing. You'll need to be
+able to handle any conflicts in non-drm code while rebasing.
 
-Simply drop fixes that are already available in the new baseline.
+Always ask for maintainer ack before rebasing. IRC ack is sufficient.
+
+Simply drop fixes that are already available in the new baseline. Close the
+associated gitlab issue when removing commits.
 
 Force pushing a rebased topic/core-for-CI requires passing the ``--force``
 parameter to git::
@@ -225,11 +227,22 @@ judgement call.
 Only add or remove commits if you really know what you're doing. When in doubt,
 ask the maintainers.
 
-Apply new commits on top with regular push. The commit message needs to explain
-why the patch has been applied to topic/core-for-CI. If it's a cherry-pick from
+Always ask for maintainer/committer ack before adding/removing commits. IRC ack
+is sufficient. Record the ``Acked-by:`` in commits being added.
+
+Apply new commits on top with regular push. The commit message must explain why
+the patch has been applied to topic/core-for-CI. If it's a cherry-pick from
 another subsystem, please reference the commit with ``git cherry-pick -x``
 option. If it's a patch from another subsystem, please reference the patch on
 the mailing list with ``Link:`` tag.
 
+New commits always need an associated gitlab issue for tracking purposes. The
+goal is to have as few commits in topic/core-for-CI as possible, and we need to
+be able to track the progress in making that happen. Reference the issue with
+``References:`` tag. Add the ``core-for-CI`` label to the issue. (Note: Do not
+use ``Closes:`` because the logic here is backwards; the issue is having the
+commit in the branch in the first place.)
+
 Instead of applying reverts, just remove the commit. This implies ``git rebase
--i`` on the current baseline; see directions above.
+-i`` on the current baseline; see directions above. Close the associated gitlab
+issue when removing commits.
-- 
2.34.1



Re: Build regressions/improvements in v6.1-rc6

2022-11-22 Thread Alex Deucher
On Tue, Nov 22, 2022 at 5:56 AM Geert Uytterhoeven  wrote:
>
> On Tue, 22 Nov 2022, Geert Uytterhoeven wrote:
> > JFYI, when comparing v6.1-rc6[1] to v6.1-rc5[3], the summaries are:
> >  - build errors: +6/-0
>
>+ /kisskb/src/arch/sh/include/asm/io.h: error: cast to pointer from 
> integer of different size [-Werror=int-to-pointer-cast]:  => 239:34
>
> sh4-gcc11/sh-allmodconfig (in cvm_oct_free_hw_memory())
>
>+ /kisskb/src/arch/um/include/asm/processor-generic.h: error: called 
> object is not a function or function pointer:  => 94:18
>+ /kisskb/src/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c: error: 
> control reaches end of non-void function [-Werror=return-type]:  => 1934:1
>
> um-x86_64/um-all{mod,yes}config (in kfd_cpumask_to_apic_id())

Presumably cpu_data is not defined on um-x86_64?  Does it even make
sense to build drivers on um-x86_64?

Alex

>
>+ /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: 
> 'X86_VENDOR_AMD' undeclared (first use in this function):  => 149:37
>+ /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: 'struct 
> cpuinfo_um' has no member named 'x86_vendor':  => 149:22
>+ /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: control 
> reaches end of non-void function [-Werror=return-type]:  => 150:1
>
> um-x86_64/um-allyesconfig
>
> > [1] 
> > http://kisskb.ellerman.id.au/kisskb/branch/linus/head/eb7081409f94a9a8608593d0fb63a1aa3d6f95d8/
> >  (all 149 configs)
> > [3] 
> > http://kisskb.ellerman.id.au/kisskb/branch/linus/head/094226ad94f471a9f19e8f8e7140a09c2625abaa/
> >  (all 149 configs)
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds


Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage

2022-11-22 Thread Hans Verkuil
On 11/22/22 13:38, David Hildenbrand wrote:
> On 22.11.22 13:25, Hans Verkuil wrote:
>> Hi Tomasz, David,
>>
>> On 11/8/22 05:45, Tomasz Figa wrote:
>>> Hi David,
>>>
>>> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand  wrote:

 FOLL_FORCE is really only for debugger access. According to commit
 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
 writable"), the pinned pages are always writable.
>>>
>>> Actually that patch is only a workaround to temporarily disable
>>> support for read-only pages as they seemed to suffer from some
>>> corruption issues in the retrieved user pages. We expect to support
>>> read-only pages as hardware input after. That said, FOLL_FORCE doesn't
>>> sound like the right thing even in that case, but I don't know the
>>> background behind it being added here in the first place. +Hans
>>> Verkuil +Marek Szyprowski do you happen to remember anything about it?
>>
>> I tracked the use of 'force' all the way back to the first git commit
>> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the
>> reason is lost in the mists of time.
>>
>> I'm not sure if the 'force' argument of get_user_pages() at that time
>> even meant the same as FOLL_FORCE today. From what I can tell it has just
>> been faithfully used ever since, but I have my doubt that anyone understands
>> the reason behind it since it was never explained.
>>
>> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests
>> that it might be related to calling get_user_pages for write buffers
>> (non-zero write argument) where you also want to be able to read from the
>> buffer. That is certainly something that some drivers need to do post-capture
>> fixups.
>>
>> But 'force' was also always set for read buffers, and I don't know if that
>> was something that was actually needed, or just laziness.
>>
>> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
>> allow drivers to read from the buffer?
> 
> Yes. The only problematic corner case I can imagine is if someone has a 
> VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin 
> user space pages as a read buffer. We'd specify now FOLL_WRITE without 
> FOLL_FORCE and GUP would reject that: write access without write 
> permissions is invalid.

I do not believe this will be an issue.

> 
> There would be no way around "fixing" this implementation to not specify 
> FOLL_WRITE when only reading from user-space pages. Not sure what the 
> implications are regarding that corruption that was mentioned in 
> 707947247e95.

Before 707947247e95 the FOLL_WRITE flag was only set for write buffers
(i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output,
DMA_TO_DEVICE). In the video output case there should never be any need
for drivers to write to the buffer to the best of my knowledge.

But I have had some complaints about that commit that it causes problems
in some scenarios, and it has been on my todo list for quite some time now
to dig deeper into this. I probably should prioritize this for this or
next week.

> 
> Having said that, I assume such a scenario is unlikely -- but you might 
> know better how user space usually uses this interface. There would be 
> three options:
> 
> 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid.
> 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and
> leave the implementation as is for now.
> 3) Remove FOLL_FORCE and fixup the implementation to only specify
> FOLL_WRITE if the pages will actually get written to.
> 
> 3) would most probably ideal, however, I am no expert on that code and 
> can't do it (707947247e95 confuses me). So naive me would go with 2) first.
> 

Option 3 would be best. And 707947247e95 confuses me as well, and I actually
wrote it :-) I am wondering whether it was addressed at the right level, but
as I said, I need to dig a bit deeper into this.

Regards,

Hans


Re: [PATCH mm-unstable v1 06/20] mm: rework handling in do_wp_page() based on private vs. shared mappings

2022-11-22 Thread Vlastimil Babka
On 11/16/22 11:26, David Hildenbrand wrote:
> We want to extent FAULT_FLAG_UNSHARE support to anything mapped into a
> COW mapping (pagecache page, zeropage, PFN, ...), not just anonymous pages.
> Let's prepare for that by handling shared mappings first such that we can
> handle private mappings last.
> 
> While at it, use folio-based functions instead of page-based functions
> where we touch the code either way.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Vlastimil Babka 



Re: Try to address the DMA-buf coherency problem

2022-11-22 Thread Daniel Vetter
On Fri, Nov 18, 2022 at 11:32:19AM -0800, Rob Clark wrote:
> On Thu, Nov 17, 2022 at 7:38 AM Nicolas Dufresne  wrote:
> >
> > Le jeudi 17 novembre 2022 à 13:10 +0100, Christian König a écrit :
> > > > > DMA-Buf let's the exporter setup the DMA addresses the importer uses 
> > > > > to
> > > > > be able to directly decided where a certain operation should go. E.g. 
> > > > > we
> > > > > have cases where for example a P2P write doesn't even go to memory, 
> > > > > but
> > > > > rather a doorbell BAR to trigger another operation. Throwing in CPU
> > > > > round trips for explicit ownership transfer completely breaks that
> > > > > concept.
> > > > It sounds like we should have a dma_dev_is_coherent_with_dev() which
> > > > accepts two (or an array?) of devices and tells the caller whether the
> > > > devices need explicit ownership transfer.
> > >
> > > No, exactly that's the concept I'm pushing back on very hard here.
> > >
> > > In other words explicit ownership transfer is not something we would
> > > want as requirement in the framework, cause otherwise we break tons of
> > > use cases which require concurrent access to the underlying buffer.
> >
> > I'm not pushing for this solution, but really felt the need to correct you 
> > here.
> > I have quite some experience with ownership transfer mechanism, as this is 
> > how
> > GStreamer framework works since 2000. Concurrent access is a really common 
> > use
> > cases and it is quite well defined in that context. The bracketing system 
> > (in
> > this case called map() unmap(), with flag stating the usage intention like 
> > reads
> > and write) is combined the the refcount. The basic rules are simple:
> 
> This is all CPU oriented, I think Christian is talking about the case
> where ownership transfer happens without CPU involvement, such as via
> GPU waiting on a fence

Yeah for pure device2device handover the rule pretty much has to be that
any coherency management that needs to be done must be done from the
device side (flushing device side caches and stuff like that) only. But
under the assumption that _all_ cpu side management has been done already
before the first device access started.

And then the map/unmap respectively begin/end_cpu_access can be used what
it was meant for with cpu side invalidation/flushing and stuff like that,
while having pretty clear handover/ownership rules and hopefully not doing
no unecessary flushes. And all that while allowing device acces to be
pipelined. Worst case the exporter has to insert some pipelined cache
flushes as a dma_fence pipelined work of its own between the device access
when moving from one device to the other. That last part sucks a bit right
now, because we don't have any dma_buf_attachment method which does this
syncing without recreating the mapping, but in reality this is solved by
caching mappings in the exporter (well dma-buf layer) nowadays.

True concurrent access like vk/compute expects is still a model that
dma-buf needs to support on top, but that's a special case and pretty much
needs hw that supports such concurrent access without explicit handover
and fencing.

Aside from some historical accidents and still a few warts, I do think
dma-buf does support both of these models. Of course in the case of
gpu/drm drivers, userspace must know whats possible and act accordingly,
otherwise you just get to keep all the pieces.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH v2] drm/mediatek: Clean dangling pointer on bind error path

2022-11-22 Thread Nícolas F . R . A . Prado
mtk_drm_bind() can fail, in which case drm_dev_put() is called,
destroying the drm_device object. However a pointer to it was still
being held in the private object, and that pointer would be passed along
to DRM in mtk_drm_sys_prepare() if a suspend were triggered at that
point, resulting in a panic. Clean the pointer when destroying the
object in the error path to prevent this from happening.

Fixes: 119f5173628a ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.")
Signed-off-by: Nícolas F. R. A. Prado 

---

Changes in v2:
- Added Fixes tag

 drivers/gpu/drm/mediatek/mtk_drm_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 39a42dc8fb85..a21ff1b3258c 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -514,6 +514,7 @@ static int mtk_drm_bind(struct device *dev)
 err_deinit:
mtk_drm_kms_deinit(drm);
 err_free:
+   private->drm = NULL;
drm_dev_put(drm);
return ret;
 }
-- 
2.38.1



Re: [PATCH 7/7] drm/fb-helper: Don't use the preferred depth for the BPP default

2022-11-22 Thread Daniel Vetter
On Wed, Nov 16, 2022 at 05:09:17PM +0100, Thomas Zimmermann wrote:
> If no preferred value for bits-per-pixel has been given, fall back
> to 32. Never use the preferred depth. The color depth is the number
> of color/alpha bits per pixel, while bpp is the overall number of
> bits in most cases.
> 
> Most noteworthy, XRGB has a depth of 24 and a bpp value of 32.
> Using depth for bpp would make the value 24 as well and format
> selection in fbdev helpers fails. Unfortunately XRGB is the most
> common format and the old heuristic therefore fails for most of
> the drivers (unless they implement the 24-bit RGB888 format).
> 
> Picking a bpp of 32 will lateron result in a default depth of 24
> and the format XRGB. As XRGB is the default format for most
> of the current and legacy graphics stack, all drivers must support
> it. So it is the safe choice.
> 
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/drm_fbdev_generic.c | 15 +--
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c 
> b/drivers/gpu/drm/drm_fbdev_generic.c
> index ab86956692795..0a4c160e0e58a 100644
> --- a/drivers/gpu/drm/drm_fbdev_generic.c
> +++ b/drivers/gpu/drm/drm_fbdev_generic.c
> @@ -431,7 +431,6 @@ static const struct drm_client_funcs 
> drm_fbdev_client_funcs = {
>   * drm_fbdev_generic_setup() - Setup generic fbdev emulation
>   * @dev: DRM device
>   * @preferred_bpp: Preferred bits per pixel for the device.
> - * @dev->mode_config.preferred_depth is used if this is zero.
>   *
>   * This function sets up generic fbdev emulation for drivers that supports
>   * dumb buffers with a virtual address and that can be mmap'ed.
> @@ -475,12 +474,16 @@ void drm_fbdev_generic_setup(struct drm_device *dev,
>   }
>  
>   /*
> -  * FIXME: This mixes up depth with bpp, which results in a glorious
> -  * mess, resulting in some drivers picking wrong fbdev defaults and
> -  * others wrong preferred_depth defaults.
> +  * Pick a preferred bpp of 32 if no value has been given. This
> +  * will select XRGB for the framebuffer formats. All drivers
> +  * have to support XRGB for backwards compatibility with legacy
> +  * userspace, so it's the safe choice here.
> +  *
> +  * TODO: Replace struct drm_mode_config.preferred_depth and this
> +  *   bpp value with a preferred format that is given as struct
> +  *   drm_format_info. Then derive all other values from the
> +  *   format.

I concur on this being the right design. What I've attempted years ago,
but never managed to finish, is sort the formats list on the primary plane
in preference order (since that seems useful for other reasons), and then
let everyone derive the preferred_whatever from the first format of the
first primary plane automatically.

But doing that is a pretty huge refactor, since you get to audit every
driver. So I kinda gave up on that. But I still thing something in that
direction would be a good design overall, since then userspace could also
use the same trick to infer format preferences ...

Anyway on the series, since it pushes in a direction I wanted to fix years
ago but gave up because too ambitious :-)

Acked-by: Daniel Vetter 

>*/
> - if (!preferred_bpp)
> - preferred_bpp = dev->mode_config.preferred_depth;
>   if (!preferred_bpp)
>   preferred_bpp = 32;
>   fb_helper->preferred_bpp = preferred_bpp;
> -- 
> 2.38.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/doc: make drm-uapi igt-tests more readable

2022-11-22 Thread Daniel Vetter
On Fri, Nov 18, 2022 at 03:51:37PM -0800, Randy Dunlap wrote:
> Correct grammar and make the use of the igt-tests more readable.
> 
> Signed-off-by: Randy Dunlap 
> Cc: Gabriela Bittencourt 
> Cc: Rodrigo Siqueira 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: dri-devel@lists.freedesktop.org
> Cc: Jonathan Corbet 

Pushed to drm-misc-next, thanks for your patch.
-Daniel

> ---
>  Documentation/gpu/drm-uapi.rst |   12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff -- a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -402,19 +402,19 @@ It's possible to run the IGT-tests in a
>   1. Use IGT inside a VM
>   2. Use IGT from the host machine and write the results in a shared 
> directory.
>  
> -As follow, there is an example of using a VM with a shared directory with
> -the host machine to run igt-tests. As an example it's used virtme::
> +Following is an example of using a VM with a shared directory with
> +the host machine to run igt-tests. This example uses virtme::
>  
>   $ virtme-run --rwdir /path/for/shared_dir 
> --kdir=path/for/kernel/directory --mods=auto
>  
> -Run the igt-tests in the guest machine, as example it's ran the 'kms_flip'
> +Run the igt-tests in the guest machine. This example runs the 'kms_flip'
>  tests::
>  
>   $ /path/for/igt-gpu-tools/scripts/run-tests.sh -p -s -t "kms_flip.*" -v
>  
> -In this example, instead of build the igt_runner, Piglit is used
> -(-p option); it's created html summary of the tests results and it's saved
> -in the folder "igt-gpu-tools/results"; it's executed only the igt-tests
> +In this example, instead of building the igt_runner, Piglit is used
> +(-p option). It creates an HTML summary of the test results and saves
> +them in the folder "igt-gpu-tools/results". It executes only the igt-tests
>  matching the -t option.
>  
>  Display CRC Support

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()

2022-11-22 Thread Alex Deucher
On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang
 wrote:
>
> As comment of pci_get_class() says, it returns a pci_device with its
> refcount increased and decreased the refcount for the input parameter
> @from if it is not NULL.
>
> If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we
> need to call pci_dev_put() to decrease the refcount. Add the missing
> pci_dev_put() to avoid refcount leak.

For both patches, I think pci_dev_put() needs to go into the loops.
There are 2 or more GPUs on the systems where this is relevant.

Alex

>
> Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with ATRM")
> Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX 
> handler (v3)")
> Signed-off-by: Xiongfeng Wang 
> ---
>  drivers/gpu/drm/radeon/radeon_bios.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_bios.c 
> b/drivers/gpu/drm/radeon/radeon_bios.c
> index 33121655d50b..2df6ce3e32cb 100644
> --- a/drivers/gpu/drm/radeon/radeon_bios.c
> +++ b/drivers/gpu/drm/radeon/radeon_bios.c
> @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device 
> *rdev)
>
> if (!found)
> return false;
> +   pci_dev_put(pdev);
>
> rdev->bios = kmalloc(size, GFP_KERNEL);
> if (!rdev->bios) {
> --
> 2.20.1
>


Re: [PATCH mm-unstable v1 07/20] mm: don't call vm_ops->huge_fault() in wp_huge_pmd()/wp_huge_pud() for private mappings

2022-11-22 Thread Vlastimil Babka
On 11/16/22 11:26, David Hildenbrand wrote:
> If we already have a PMD/PUD mapped write-protected in a private mapping
> and we want to break COW either due to FAULT_FLAG_WRITE or
> FAULT_FLAG_UNSHARE, there is no need to inform the file system just like on
> the PTE path.
> 
> Let's just split (->zap) + fallback in that case.
> 
> This is a preparation for more generic FAULT_FLAG_UNSHARE support in
> COW mappings.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Vlastimil Babka 

Nits:

> ---
>  mm/memory.c | 24 +++-
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index c35e6cd32b6a..d47ad33c6487 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4802,6 +4802,7 @@ static inline vm_fault_t create_huge_pmd(struct 
> vm_fault *vmf)
>  static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf)
>  {
>   const bool unshare = vmf->flags & FAULT_FLAG_UNSHARE;
> + vm_fault_t ret;
>  
>   if (vma_is_anonymous(vmf->vma)) {
>   if (likely(!unshare) &&
> @@ -4809,11 +4810,13 @@ static inline vm_fault_t wp_huge_pmd(struct vm_fault 
> *vmf)
>   return handle_userfault(vmf, VM_UFFD_WP);
>   return do_huge_pmd_wp_page(vmf);
>   }
> - if (vmf->vma->vm_ops->huge_fault) {
> - vm_fault_t ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);
>  
> - if (!(ret & VM_FAULT_FALLBACK))
> - return ret;
> + if (vmf->vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) {
> + if (vmf->vma->vm_ops->huge_fault) {

I guess it could have been a single if with && and the reduced identation
could fit keeping 'ret' declaration inside.
AFAICS the later patches don't build more on top of this anyway.
But also fine keeping as is.

(the hunk below same)

> + ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);
> + if (!(ret & VM_FAULT_FALLBACK))
> + return ret;
> + }
>   }
>  
>   /* COW or write-notify handled on pte level: split pmd. */
> @@ -4839,14 +4842,17 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, 
> pud_t orig_pud)
>  {
>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) &&  \
>   defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
> + vm_fault_t ret;
> +
>   /* No support for anonymous transparent PUD pages yet */
>   if (vma_is_anonymous(vmf->vma))
>   goto split;
> - if (vmf->vma->vm_ops->huge_fault) {
> - vm_fault_t ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD);
> -
> - if (!(ret & VM_FAULT_FALLBACK))
> - return ret;
> + if (vmf->vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) {
> + if (vmf->vma->vm_ops->huge_fault) {
> + ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD);
> + if (!(ret & VM_FAULT_FALLBACK))
> + return ret;
> + }
>   }
>  split:
>   /* COW or write-notify not handled on PUD level: split pud.*/



Re: [syzbot] inconsistent lock state in sync_info_debugfs_show

2022-11-22 Thread Daniel Vetter
On Fri, Nov 18, 2022 at 06:46:38PM -0800, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:84368d882b96 Merge tag 'soc-fixes-6.1-3' of git://git.kern..
> git tree:   upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=1670fb6588
> kernel config:  https://syzkaller.appspot.com/x/.config?x=6f4e5e9899396248
> dashboard link: https://syzkaller.appspot.com/bug?extid=007bfe0f3330f6e1e7d1
> compiler:   gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils 
> for Debian) 2.35.2
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=164376f988
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16cf096588
> 
> Downloadable assets:
> disk image: 
> https://storage.googleapis.com/syzbot-assets/031b6e68785d/disk-84368d88.raw.xz
> vmlinux: 
> https://storage.googleapis.com/syzbot-assets/cff5e66b90e8/vmlinux-84368d88.xz
> kernel image: 
> https://storage.googleapis.com/syzbot-assets/e75525784a66/bzImage-84368d88.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+007bfe0f3330f6e1e...@syzkaller.appspotmail.com
> 
> 
> WARNING: inconsistent lock state
> 6.1.0-rc5-syzkaller-00144-g84368d882b96 #0 Not tainted
> 
> inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> syz-executor333/3645 [HC0[0]:SC0[0]:HE0:SE1] takes:
> 8d295c38 (sync_timeline_list_lock){?...}-{2:2}, at: spin_lock_irq 
> include/linux/spinlock.h:375 [inline]
> 8d295c38 (sync_timeline_list_lock){?...}-{2:2}, at: 
> sync_info_debugfs_show+0x31/0x200 drivers/dma-buf/sync_debug.c:147
> {IN-HARDIRQ-W} state was registered at:
>   lock_acquire kernel/locking/lockdep.c:5668 [inline]
>   lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
>   __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>   _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
>   sync_timeline_debug_remove+0x29/0x1a0 drivers/dma-buf/sync_debug.c:31
>   sync_timeline_free drivers/dma-buf/sw_sync.c:104 [inline]
>   kref_put include/linux/kref.h:65 [inline]
>   sync_timeline_put drivers/dma-buf/sw_sync.c:116 [inline]
>   timeline_fence_release+0x267/0x340 drivers/dma-buf/sw_sync.c:144
>   dma_fence_release+0x14b/0x690 drivers/dma-buf/dma-fence.c:559

Do we need to just generally push all dma_fence_release finalization to an
irq work to untangle this mess once and for all? Or is this not just plain
old recursion? Christian, any good ideas?
-Daniel

>   kref_put include/linux/kref.h:65 [inline]
>   dma_fence_put include/linux/dma-fence.h:276 [inline]
>   dma_fence_array_release+0x1fa/0x2d0 drivers/dma-buf/dma-fence-array.c:120
>   dma_fence_release+0x14b/0x690 drivers/dma-buf/dma-fence.c:559
>   kref_put include/linux/kref.h:65 [inline]
>   dma_fence_put include/linux/dma-fence.h:276 [inline]
>   irq_dma_fence_array_work+0xa9/0xd0 drivers/dma-buf/dma-fence-array.c:52
>   irq_work_single+0x124/0x260 kernel/irq_work.c:211
>   irq_work_run_list kernel/irq_work.c:242 [inline]
>   irq_work_run_list+0x91/0xc0 kernel/irq_work.c:225
>   irq_work_run+0x58/0xd0 kernel/irq_work.c:251
>   __sysvec_irq_work+0xce/0x4e0 arch/x86/kernel/irq_work.c:22
>   sysvec_irq_work+0x92/0xc0 arch/x86/kernel/irq_work.c:17
>   asm_sysvec_irq_work+0x1a/0x20 arch/x86/include/asm/idtentry.h:675
>   __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:160 [inline]
>   _raw_spin_unlock_irq+0x29/0x50 kernel/locking/spinlock.c:202
>   spin_unlock_irq include/linux/spinlock.h:400 [inline]
>   sw_sync_debugfs_release+0x162/0x240 drivers/dma-buf/sw_sync.c:321
>   __fput+0x27c/0xa90 fs/file_table.c:320
>   task_work_run+0x16f/0x270 kernel/task_work.c:179
>   ptrace_notify+0x118/0x140 kernel/signal.c:2354
>   ptrace_report_syscall include/linux/ptrace.h:420 [inline]
>   ptrace_report_syscall_exit include/linux/ptrace.h:482 [inline]
>   syscall_exit_work kernel/entry/common.c:251 [inline]
>   syscall_exit_to_user_mode_prepare+0x129/0x280 kernel/entry/common.c:278
>   __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
>   syscall_exit_to_user_mode+0xd/0x50 kernel/entry/common.c:296
>   do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> irq event stamp: 638
> hardirqs last  enabled at (637): [] 
> ___slab_alloc+0xca0/0x1400 mm/slub.c:3132
> hardirqs last disabled at (638): [] __raw_spin_lock_irq 
> include/linux/spinlock_api_smp.h:117 [inline]
> hardirqs last disabled at (638): [] 
> _raw_spin_lock_irq+0x45/0x50 kernel/locking/spinlock.c:170
> softirqs last  enabled at (538): [] invoke_softirq 
> kernel/softirq.c:445 [inline]
> softirqs last  enabled at (538): [] 
> __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
> softirqs last disabled at (505): [] invoke_softirq 
> kernel/softirq.c:445 [inline]
> softirqs last disabled at (505): [] 
> __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
> 
> other info that migh

[Bug 216727] Failure to wake up from suspend to RAM

2022-11-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216727

Alex Deucher (alexdeuc...@gmail.com) changed:

   What|Removed |Added

 CC||alexdeuc...@gmail.com

--- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) ---
Does this patch help?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d4de331f1b24a22d18e3c6116aa25228cf54854

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v4 0/4] new subsystem for compute accelerator devices

2022-11-22 Thread Daniel Vetter
On Tue, Nov 22, 2022 at 03:46:25PM +1000, Dave Airlie wrote:
> On Tue, 22 Nov 2022 at 09:06, Sonal Santan  wrote:
> >
> > On 11/19/22 12:44, Oded Gabbay wrote:
> > > This is the fourth (and hopefully last) version of the patch-set to add 
> > > the
> > > new subsystem for compute accelerators. I removed the RFC headline as
> > > I believe it is now ready for merging.
> > >
> > > Compare to v3, this patch-set contains one additional patch that adds
> > > documentation regarding the accel subsystem. I hope it's good enough for
> > > this stage. In addition, there were few very minor fixes according to
> > > comments received on v3.
> > >
> > > The patches are in the following repo:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v4
> > >
> > > As in v3, The HEAD of that branch is a commit adding a dummy driver that
> > > registers an accel device using the new framework. This can be served
> > > as a simple reference.
> > >
> > > v1 cover letter:
> > > https://lkml.org/lkml/2022/10/22/544
> > >
> > > v2 cover letter:
> > > https://lore.kernel.org/lkml/20221102203405.1797491-1-ogab...@kernel.org/T/
> > >
> > > v3 cover letter:
> > > https://lore.kernel.org/lkml/20221106210225.2065371-1-ogab...@kernel.org/T/
> >
> > Thanks for defining the new accel subsystem. We are currently working on
> > DRM based drivers for unannounced acceleration devices. I am fine with
> > these changes with the assumption that the choice of using classic DRM
> > or accel is left up to the individual driver.
> 
> I don't think that decision should be up to any individual driver
> author. It will have to be consensus with me/Daniel/Oded and the
> driver authors.

Plus the entire point of this is that it's _still_ a drm based driver. So
aside from changing a flag in the kernel driver and adjusting userspace to
find the right chardev, there should be zero changes need for an existing
drm based driver stack that gets ported to drivers/accel.

And of course if we realize there's issues as we add drivers, we can fix
things up. This is just to kick things off, not something that's going to
be cast in stone for all eternity.

Sonal, with that clarification/explanation, is this entire thing
reasonable in principal and you can drop an Ack onto the series?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[Bug 216727] Failure to wake up from suspend to RAM

2022-11-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216727

--- Comment #2 from Martin (martin...@gmx.com) ---
(In reply to Alex Deucher from comment #1)
> Does this patch help?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/
> ?id=8d4de331f1b24a22d18e3c6116aa25228cf54854

Thanks, I'll try and let you know tomorrow.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [MAINTAINER TOOLS] docs: updated rules for topic/core-for-CI commit management

2022-11-22 Thread Daniel Vetter
On Tue, Nov 22, 2022 at 03:17:14PM +0200, Jani Nikula wrote:
> Introduce stricter rules for topic/core-for-CI management. Way too many
> commits have been added over the years, with insufficient rationale
> recorded in the commit message, and insufficient follow-up with removing
> the commits from the topic branch.
> 
> New rules:
> 
> 1. Require maintainer ack for rebase. Have better gating on when rebases
>happen and on which baselines.
> 
> 2. Require maintainer/committer ack for adding/removing commits. No
>single individual should decide.
> 
> 3. Require gitlab issues for new commits added. Improve tracking for
>removing the commits.
> 
> Also use the stronger "must" for commit message requiring the
> justification for the commit being in topic/core-for-CI.
> 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 
> Cc: Tvrtko Ursulin 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: intel-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: dim-to...@lists.freedesktop.org
> Signed-off-by: Jani Nikula 

Reviewed-by: Daniel Vetter 

> ---
>  drm-tip.rst | 27 ---
>  1 file changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/drm-tip.rst b/drm-tip.rst
> index deae95cdd2fe..24036e2ef576 100644
> --- a/drm-tip.rst
> +++ b/drm-tip.rst
> @@ -203,11 +203,13 @@ justified exception. The primary goal is to fix issues 
> originating from Linus'
>  tree. Issues that would need drm-next or other DRM subsystem tree as baseline
>  should be fixed in the offending DRM subsystem tree.
>  
> -Only rebase the branch if you really know what you're doing. When in doubt, 
> ask
> -the maintainers. You'll need to be able to handle any conflicts in non-drm 
> code
> -while rebasing.
> +Only rebase the branch if you really know what you're doing. You'll need to 
> be
> +able to handle any conflicts in non-drm code while rebasing.
>  
> -Simply drop fixes that are already available in the new baseline.
> +Always ask for maintainer ack before rebasing. IRC ack is sufficient.
> +
> +Simply drop fixes that are already available in the new baseline. Close the
> +associated gitlab issue when removing commits.
>  
>  Force pushing a rebased topic/core-for-CI requires passing the ``--force``
>  parameter to git::
> @@ -225,11 +227,22 @@ judgement call.
>  Only add or remove commits if you really know what you're doing. When in 
> doubt,
>  ask the maintainers.
>  
> -Apply new commits on top with regular push. The commit message needs to 
> explain
> -why the patch has been applied to topic/core-for-CI. If it's a cherry-pick 
> from
> +Always ask for maintainer/committer ack before adding/removing commits. IRC 
> ack
> +is sufficient. Record the ``Acked-by:`` in commits being added.
> +
> +Apply new commits on top with regular push. The commit message must explain 
> why
> +the patch has been applied to topic/core-for-CI. If it's a cherry-pick from
>  another subsystem, please reference the commit with ``git cherry-pick -x``
>  option. If it's a patch from another subsystem, please reference the patch on
>  the mailing list with ``Link:`` tag.
>  
> +New commits always need an associated gitlab issue for tracking purposes. The
> +goal is to have as few commits in topic/core-for-CI as possible, and we need 
> to
> +be able to track the progress in making that happen. Reference the issue with
> +``References:`` tag. Add the ``core-for-CI`` label to the issue. (Note: Do 
> not
> +use ``Closes:`` because the logic here is backwards; the issue is having the
> +commit in the branch in the first place.)
> +
>  Instead of applying reverts, just remove the commit. This implies ``git 
> rebase
> --i`` on the current baseline; see directions above.
> +-i`` on the current baseline; see directions above. Close the associated 
> gitlab
> +issue when removing commits.
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()

2022-11-22 Thread Lazar, Lijo




On 11/22/2022 8:19 PM, Alex Deucher wrote:

On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang
 wrote:


As comment of pci_get_class() says, it returns a pci_device with its
refcount increased and decreased the refcount for the input parameter
@from if it is not NULL.

If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we
need to call pci_dev_put() to decrease the refcount. Add the missing
pci_dev_put() to avoid refcount leak.


For both patches, I think pci_dev_put() needs to go into the loops.
There are 2 or more GPUs on the systems where this is relevant.



As per the logic, the loop breaks when it finds a valid ATRM handle. So 
dev_put is required only for that device.


When inside the loop this happens -  "decreased the refcount for the 
input parameter @from if it is not NULL"


Thanks,
Lijo


Alex



Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with ATRM")
Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX handler 
(v3)")
Signed-off-by: Xiongfeng Wang 
---
  drivers/gpu/drm/radeon/radeon_bios.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/radeon/radeon_bios.c 
b/drivers/gpu/drm/radeon/radeon_bios.c
index 33121655d50b..2df6ce3e32cb 100644
--- a/drivers/gpu/drm/radeon/radeon_bios.c
+++ b/drivers/gpu/drm/radeon/radeon_bios.c
@@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device *rdev)

 if (!found)
 return false;
+   pci_dev_put(pdev);

 rdev->bios = kmalloc(size, GFP_KERNEL);
 if (!rdev->bios) {
--
2.20.1



Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage

2022-11-22 Thread David Hildenbrand

On 22.11.22 15:07, Hans Verkuil wrote:

On 11/22/22 13:38, David Hildenbrand wrote:

On 22.11.22 13:25, Hans Verkuil wrote:

Hi Tomasz, David,

On 11/8/22 05:45, Tomasz Figa wrote:

Hi David,

On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand  wrote:


FOLL_FORCE is really only for debugger access. According to commit
707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
writable"), the pinned pages are always writable.


Actually that patch is only a workaround to temporarily disable
support for read-only pages as they seemed to suffer from some
corruption issues in the retrieved user pages. We expect to support
read-only pages as hardware input after. That said, FOLL_FORCE doesn't
sound like the right thing even in that case, but I don't know the
background behind it being added here in the first place. +Hans
Verkuil +Marek Szyprowski do you happen to remember anything about it?


I tracked the use of 'force' all the way back to the first git commit
(2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the
reason is lost in the mists of time.

I'm not sure if the 'force' argument of get_user_pages() at that time
even meant the same as FOLL_FORCE today. From what I can tell it has just
been faithfully used ever since, but I have my doubt that anyone understands
the reason behind it since it was never explained.

Looking at this old LWN article https://lwn.net/Articles/28548/ suggests
that it might be related to calling get_user_pages for write buffers
(non-zero write argument) where you also want to be able to read from the
buffer. That is certainly something that some drivers need to do post-capture
fixups.

But 'force' was also always set for read buffers, and I don't know if that
was something that was actually needed, or just laziness.

I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
allow drivers to read from the buffer?


Yes. The only problematic corner case I can imagine is if someone has a
VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin
user space pages as a read buffer. We'd specify now FOLL_WRITE without
FOLL_FORCE and GUP would reject that: write access without write
permissions is invalid.


I do not believe this will be an issue.



There would be no way around "fixing" this implementation to not specify
FOLL_WRITE when only reading from user-space pages. Not sure what the
implications are regarding that corruption that was mentioned in
707947247e95.


Before 707947247e95 the FOLL_WRITE flag was only set for write buffers
(i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output,
DMA_TO_DEVICE). In the video output case there should never be any need
for drivers to write to the buffer to the best of my knowledge.

But I have had some complaints about that commit that it causes problems
in some scenarios, and it has been on my todo list for quite some time now
to dig deeper into this. I probably should prioritize this for this or
next week.



Having said that, I assume such a scenario is unlikely -- but you might
know better how user space usually uses this interface. There would be
three options:

1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid.
2) Remove FOLL_FORCE and see if anybody even notices (this patch) and
 leave the implementation as is for now.
3) Remove FOLL_FORCE and fixup the implementation to only specify
 FOLL_WRITE if the pages will actually get written to.

3) would most probably ideal, however, I am no expert on that code and
can't do it (707947247e95 confuses me). So naive me would go with 2) first.



Option 3 would be best. And 707947247e95 confuses me as well, and I actually
wrote it :-) I am wondering whether it was addressed at the right level, but
as I said, I need to dig a bit deeper into this.


Cool, let me know if I can help!

--
Thanks,

David / dhildenb



Re: [PATCH v2 2/4] usb: gadget: hid: Convert to use list_count()

2022-11-22 Thread Andy Shevchenko
On Mon, Nov 14, 2022 at 04:15:35PM -0300, Fabio Estevam wrote:
> On Mon, Nov 14, 2022 at 1:22 PM Andy Shevchenko
>  wrote:
> >
> > The list API now provides the list_count() to help with counting
> > existing nodes in the list. Uilise it.
> 
> s/Uilise/Utilise

Fixed for v3, thanks!

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH v2 1/4] i915: Move list_count() to list.h for broader use

2022-11-22 Thread Andy Shevchenko
On Tue, Nov 15, 2022 at 05:46:28PM +0200, Jani Nikula wrote:
> On Mon, 14 Nov 2022, Andy Shevchenko  
> wrote:
> > Some of the existing users, and definitely will be new ones, want to
> > count existing nodes in the list. Provide a generic API for that by
> > moving code from i915 to list.h.
> 
> I think I'd find list_length() a much more natural name for this.

i915 suggests my variant :-)

> *shrug*
> 
> Acked-by: Jani Nikula 
> 
> regardless of what you decide to do with name or static inline etc.

Thanks! I will check which one looks and feels better and update for v3.

-- 
With Best Regards,
Andy Shevchenko




RE: [EXT] Re: [PATCH v4 02/10] dt-bindings: display: bridge: Add MHDP DP for i.MX8MQ

2022-11-22 Thread Sandor Yu
Hi Krzysztof,

Thanks your comments,

> -Original Message-
> From: Krzysztof Kozlowski 
> Sent: 2022年11月22日 16:28
> To: Sandor Yu ; andrzej.ha...@intel.com;
> neil.armstr...@linaro.org; robert.f...@linaro.org;
> laurent.pinch...@ideasonboard.com; jo...@kwiboo.se;
> jernej.skra...@gmail.com; airl...@gmail.com; dan...@ffwll.ch;
> robh...@kernel.org; krzysztof.kozlowski...@linaro.org;
> shawn...@kernel.org; s.ha...@pengutronix.de; feste...@gmail.com;
> kis...@ti.com; vk...@kernel.org; dri-devel@lists.freedesktop.org;
> devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> linux-ker...@vger.kernel.org; linux-...@lists.infradead.org;
> alexander.st...@ew.tq-group.com
> Cc: ker...@pengutronix.de; dl-linux-imx ; Oliver Brown
> 
> Subject: [EXT] Re: [PATCH v4 02/10] dt-bindings: display: bridge: Add MHDP
> DP for i.MX8MQ
> 
> Caution: EXT Email
> 
> On 21/11/2022 08:23, Sandor Yu wrote:
> > Add bindings for i.MX8MQ MHDP DisplayPort.
> >
> > Signed-off-by: Sandor Yu 
> > ---
> >  .../display/bridge/cdns,mhdp-imx8mq-dp.yaml   | 93
> +++
> >  1 file changed, 93 insertions(+)
> >  create mode 100644
> >
> Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.y
> > aml
> >
> > diff --git
> >
> a/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp
> > .yaml
> >
> b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp
> > .yaml
> > new file mode 100644
> > index ..d82f3ceddaa8
> > --- /dev/null
> > +++
> b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8m
> > +++ q-dp.yaml
> > @@ -0,0 +1,93 @@
> > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) %YAML 1.2
> > +---
> > +$id:
> > +https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdevi
> >
> +cetree.org%2Fschemas%2Fdisplay%2Fbridge%2Fcdns%2Cmhdp-imx8mq-dp.
> yaml%
> >
> +23&data=05%7C01%7CSandor.yu%40nxp.com%7C163690c8a8ab4f7a6
> c9208dac
> >
> +c6373c4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C6380470
> 247542869
> >
> +46%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luM
> zIiLCJBTi
> >
> +I6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Eei5Nkp0
> Hl8SHpBLZU
> > +1HsJDnWQujHvmPh2XMuC%2BSZ58%3D&reserved=0
> > +$schema:
> > +https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdevi
> >
> +cetree.org%2Fmeta-schemas%2Fcore.yaml%23&data=05%7C01%7CSa
> ndor.yu
> >
> +%40nxp.com%7C163690c8a8ab4f7a6c9208dacc6373c4%7C686ea1d3bc2b4
> c6fa92cd
> >
> +99c5c301635%7C0%7C0%7C638047024754286946%7CUnknown%7CTWFp
> bGZsb3d8eyJW
> >
> +IjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7
> C3000
> >
> +%7C%7C%7C&sdata=eZUnepnQHS9ewyWb9AHMht%2F%2BevutmOt
> mTajL%2B5Ewwbs
> > +%3D&reserved=0
> > +
> > +title: Cadence MHDP Displayport bridge
> > +
> > +maintainers:
> > +  - Sandor Yu 
> > +
> > +description:
> > +  The Cadence MHDP Displayport TX interface.
> > +
> > +properties:
> > +  compatible:
> > +enum:
> > +  - cdns,mhdp-imx8mq-dp
> > +
> > +  reg:
> > +maxItems: 1
> > +
> > +  clocks:
> > +maxItems: 1
> > +description: MHDP DP APB clock.
> > +
> > +  phys:
> > +maxItems: 1
> > +
> > +  interrupts:
> > +items:
> > +  - description: Hotplug cable plugin.
> > +  - description: Hotplug cable plugout.
> > +
> > +  interrupt-names:
> > +items:
> > +  - const: plug_in
> > +  - const: plug_out
> > +
> > +  ports:
> > +$ref: /schemas/graph.yaml#/properties/ports
> > +
> > +properties:
> > +  port@0:
> > +$ref: /schemas/graph.yaml#/properties/port
> > +description:
> > +  Input port from display controller output.
> > +  port@1:
> > +$ref: /schemas/graph.yaml#/properties/port
> > +description:
> > +  Output port to DP connector.
> > +
> > +required:
> > +  - port@0
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - clocks
> > +  - interrupts
> > +  - interrupt-names
> > +  - phys
> > +  - ports
> > +
> > +additionalProperties: false
> > +
> > +examples:
> > +  - |
> > +#include 
> > +#include 
> > +
> > +mhdp_dp: dp-bridge@32c0 {
> > +compatible = "cdns,mhdp-imx8mq-dp";
> > +reg = <0x32c0 0x10>;
> > +interrupts = ,
> > + ;
> > +interrupt-names = "plug_in", "plug_out";
> > +clocks = <&clk IMX8MQ_CLK_DISP_APB_ROOT>;
> > +phys = <&dp_phy>;
> > +
> > +ports {
> > +#address-cells = <1>;
> > +#size-cells = <0>;
> > +
> > +port@0 {
> > +reg = <0>;
> > +
> > +mhdp_in: endpoint {
> > +remote-endpoint = <&dcss_out>;
> > +};
> 
> As Rob suggested, you allowed property for output to DP port. However it is
> not in the example. If this is a bridge, what does it bridge if there is no 
> output
> connector?
> 
OK, I got it. Output connector will be added to the example in the next version.
Thanks,
Sandor
> 
> Best r

[PATCH v3 1/4] i915: Move list_count() to list.h for broader use

2022-11-22 Thread Andy Shevchenko
Some of the existing users, and definitely will be new ones, want to
count existing nodes in the list. Provide a generic API for that by
moving code from i915 to list.h.

Signed-off-by: Andy Shevchenko 
Acked-by: Jani Nikula 
---
v3: added tag (Jani), changed to be static inline (Mike)
v2: dropped the duplicate code in i915 (LKP)
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +
 include/linux/list.h  | 15 +++
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 6ae8b07cfaa1..b5d474be564d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -2085,17 +2085,6 @@ static void print_request_ring(struct drm_printer *m, 
struct i915_request *rq)
}
 }
 
-static unsigned long list_count(struct list_head *list)
-{
-   struct list_head *pos;
-   unsigned long count = 0;
-
-   list_for_each(pos, list)
-   count++;
-
-   return count;
-}
-
 static unsigned long read_ul(void *p, size_t x)
 {
return *(unsigned long *)(p + x);
@@ -2270,7 +2259,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
spin_lock_irqsave(&engine->sched_engine->lock, flags);
engine_dump_active_requests(engine, m);
 
-   drm_printf(m, "\tOn hold?: %lu\n",
+   drm_printf(m, "\tOn hold?: %zu\n",
   list_count(&engine->sched_engine->hold));
spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
 
diff --git a/include/linux/list.h b/include/linux/list.h
index 61762054b4be..65aab596ae33 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -655,6 +655,21 @@ static inline void list_splice_tail_init(struct list_head 
*list,
 !list_is_head(pos, (head)); \
 pos = n, n = pos->prev)
 
+/**
+ * list_count - count nodes in the list
+ * @head:  the head for your list.
+ */
+size_t list_count(struct list_head *head)
+{
+   struct list_head *pos;
+   size_t count = 0;
+
+   list_for_each(pos, head)
+   count++;
+
+   return count;
+}
+
 /**
  * list_entry_is_head - test if the entry points to the head of the list
  * @pos:   the type * to cursor
-- 
2.35.1



[PATCH v3 3/4] usb: gadget: udc: bcm63xx: Convert to use list_count()

2022-11-22 Thread Andy Shevchenko
The list API now provides the list_count() to help with counting
existing nodes in the list. Utilise it.

Signed-off-by: Andy Shevchenko 
---
v3: no change
v2: no change
 drivers/usb/gadget/udc/bcm63xx_udc.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/gadget/udc/bcm63xx_udc.c 
b/drivers/usb/gadget/udc/bcm63xx_udc.c
index 2cdb07905bde..0762e49e85f8 100644
--- a/drivers/usb/gadget/udc/bcm63xx_udc.c
+++ b/drivers/usb/gadget/udc/bcm63xx_udc.c
@@ -2172,7 +2172,6 @@ static int bcm63xx_iudma_dbg_show(struct seq_file *s, 
void *p)
 
for (ch_idx = 0; ch_idx < BCM63XX_NUM_IUDMA; ch_idx++) {
struct iudma_ch *iudma = &udc->iudma[ch_idx];
-   struct list_head *pos;
 
seq_printf(s, "IUDMA channel %d -- ", ch_idx);
switch (iudma_defaults[ch_idx].ep_type) {
@@ -2205,14 +2204,10 @@ static int bcm63xx_iudma_dbg_show(struct seq_file *s, 
void *p)
seq_printf(s, "  desc: %d/%d used", iudma->n_bds_used,
   iudma->n_bds);
 
-   if (iudma->bep) {
-   i = 0;
-   list_for_each(pos, &iudma->bep->queue)
-   i++;
-   seq_printf(s, "; %d queued\n", i);
-   } else {
+   if (iudma->bep)
+   seq_printf(s, "; %zu queued\n", 
list_count(&iudma->bep->queue));
+   else
seq_printf(s, "\n");
-   }
 
for (i = 0; i < iudma->n_bds; i++) {
struct bcm_enet_desc *d = &iudma->bd_ring[i];
-- 
2.35.1



[PATCH v3 2/4] usb: gadget: hid: Convert to use list_count()

2022-11-22 Thread Andy Shevchenko
The list API now provides the list_count() to help with counting
existing nodes in the list. Utilise it.

Signed-off-by: Andy Shevchenko 
---
v3: fixed typo in the commit message (Fabio)
v2: no change
 drivers/usb/gadget/legacy/hid.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/gadget/legacy/hid.c b/drivers/usb/gadget/legacy/hid.c
index 1187ee4f316a..6196c3456e0b 100644
--- a/drivers/usb/gadget/legacy/hid.c
+++ b/drivers/usb/gadget/legacy/hid.c
@@ -133,14 +133,11 @@ static struct usb_configuration config_driver = {
 static int hid_bind(struct usb_composite_dev *cdev)
 {
struct usb_gadget *gadget = cdev->gadget;
-   struct list_head *tmp;
struct hidg_func_node *n = NULL, *m, *iter_n;
struct f_hid_opts *hid_opts;
-   int status, funcs = 0;
-
-   list_for_each(tmp, &hidg_func_list)
-   funcs++;
+   int status, funcs;
 
+   funcs = list_count(&hidg_func_list);
if (!funcs)
return -ENODEV;
 
-- 
2.35.1



[PATCH v3 4/4] xhci: Convert to use list_count()

2022-11-22 Thread Andy Shevchenko
The list API now provides the list_count() to help with counting
existing nodes in the list. Utilise it.

Signed-off-by: Andy Shevchenko 
---
v3: no change
v2: no change
 drivers/usb/host/xhci-ring.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index ad81e9a508b1..817c31e3b0c8 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2532,7 +2532,6 @@ static int handle_tx_event(struct xhci_hcd *xhci,
union xhci_trb *ep_trb;
int status = -EINPROGRESS;
struct xhci_ep_ctx *ep_ctx;
-   struct list_head *tmp;
u32 trb_comp_code;
int td_num = 0;
bool handling_skipped_tds = false;
@@ -2580,10 +2579,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
}
 
/* Count current td numbers if ep->skip is set */
-   if (ep->skip) {
-   list_for_each(tmp, &ep_ring->td_list)
-   td_num++;
-   }
+   if (ep->skip)
+   td_num += list_count(&ep_ring->td_list);
 
/* Look for common error cases */
switch (trb_comp_code) {
-- 
2.35.1



Re: [PATCH mm-unstable v1 08/20] mm: extend FAULT_FLAG_UNSHARE support to anything in a COW mapping

2022-11-22 Thread Vlastimil Babka
On 11/16/22 11:26, David Hildenbrand wrote:
> Extend FAULT_FLAG_UNSHARE to break COW on anything mapped into a
> COW (i.e., private writable) mapping and adjust the documentation
> accordingly.
> 
> FAULT_FLAG_UNSHARE will now also break COW when encountering the shared
> zeropage, a pagecache page, a PFNMAP, ... inside a COW mapping, by
> properly replacing the mapped page/pfn by a private copy (an exclusive
> anonymous page).
> 
> Note that only do_wp_page() needs care: hugetlb_wp() already handles
> FAULT_FLAG_UNSHARE correctly. wp_huge_pmd()/wp_huge_pud() also handles it
> correctly, for example, splitting the huge zeropage on FAULT_FLAG_UNSHARE
> such that we can handle FAULT_FLAG_UNSHARE on the PTE level.
> 
> This change is a requirement for reliable long-term R/O pinning in
> COW mappings.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Vlastimil Babka 

> ---
>  include/linux/mm_types.h | 8 
>  mm/memory.c  | 4 
>  2 files changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5e7f4fac1e78..5e9aaad8c7b2 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1037,9 +1037,9 @@ typedef struct {
>   * @FAULT_FLAG_REMOTE: The fault is not for current task/mm.
>   * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch.
>   * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal 
> signals.
> - * @FAULT_FLAG_UNSHARE: The fault is an unsharing request to unshare (and 
> mark
> - *  exclusive) a possibly shared anonymous page that is
> - *  mapped R/O.
> + * @FAULT_FLAG_UNSHARE: The fault is an unsharing request to break COW in a
> + *  COW mapping, making sure that an exclusive anon page 
> is
> + *  mapped after the fault.
>   * @FAULT_FLAG_ORIG_PTE_VALID: whether the fault has vmf->orig_pte cached.
>   *We should only access orig_pte if this flag set.
>   *
> @@ -1064,7 +1064,7 @@ typedef struct {
>   *
>   * The combination FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE is illegal.
>   * FAULT_FLAG_UNSHARE is ignored and treated like an ordinary read fault when
> - * no existing R/O-mapped anonymous page is encountered.
> + * applied to mappings that are not COW mappings.
>   */
>  enum fault_flag {
>   FAULT_FLAG_WRITE =  1 << 0,
> diff --git a/mm/memory.c b/mm/memory.c
> index d47ad33c6487..56b21ab1e4d2 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3432,10 +3432,6 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
>   }
>   wp_page_reuse(vmf);
>   return 0;
> - } else if (unshare) {
> - /* No anonymous page -> nothing to do. */
> - pte_unmap_unlock(vmf->pte, vmf->ptl);
> - return 0;
>   }
>  copy:
>   /*



[PATCH] drm/amd/pm: added result check

2022-11-22 Thread Denis Arefev
The return value of the 'div64_s64'
function called in ppevvmath.h:371
was not checked.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Denis Arefev 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h
index dac29fe6cfc6..82aa7130d143 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h
@@ -357,6 +357,7 @@ static fInt fDivide (fInt X, fInt Y)
 {
fInt fZERO, fQuotient;
int64_t longlongX, longlongY;
+   int64_t result;
 
fZERO = ConvertToFraction(0);
 
@@ -367,10 +368,11 @@ static fInt fDivide (fInt X, fInt Y)
longlongY = (int64_t)Y.full;
 
longlongX = longlongX << 16; /*Q(16,16) -> Q(32,32) */
+   longlongY = longlongY << 16;
 
-   div64_s64(longlongX, longlongY); /*Q(32,32) divided by Q(16,16) = 
Q(16,16) Back to original format */
+   result = div64_s64(longlongX, longlongY);
 
-   fQuotient.full = (int)longlongX;
+   fQuotient = ConvertToFraction((int)result);
return fQuotient;
 }
 
-- 
2.25.1



Re: Build regressions/improvements in v6.1-rc6

2022-11-22 Thread Johannes Berg
On Tue, 2022-11-22 at 08:55 -0500, Alex Deucher wrote:
> > 
> >+ /kisskb/src/arch/um/include/asm/processor-generic.h: error: called 
> > object is not a function or function pointer:  => 94:18
> >+ /kisskb/src/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c: 
> > error: control reaches end of non-void function [-Werror=return-type]:  => 
> > 1934:1
> > 
> > um-x86_64/um-all{mod,yes}config (in kfd_cpumask_to_apic_id())
> 
> Presumably cpu_data is not defined on um-x86_64?  Does it even make
> sense to build drivers on um-x86_64?

Drivers in general yes ;-)

This driver, probably not.

But the issue is that a lot of drivers "depends on X86_64" or such,
where only "X86" is the arch symbol. You could add "X86 && X86_64" to
really build on x86 64-bit only.

I didn't check this driver, but this has mostly popped up since UM got
PCI support some time ago (which I added.)

johannes




Re: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use

2022-11-22 Thread Andy Shevchenko
On Tue, Nov 22, 2022 at 05:35:13PM +0200, Andy Shevchenko wrote:
> Some of the existing users, and definitely will be new ones, want to
> count existing nodes in the list. Provide a generic API for that by
> moving code from i915 to list.h.

Sorry for the noise, forgot to squash one part, so it get's not compilable.

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH v28 01/11] dt-bindings: arm: mediatek: mmsys: add vdosys1 compatible for MT8195

2022-11-22 Thread Matthias Brugger




On 22/11/2022 11:51, Nancy Lin (林欣螢) wrote:

Dear Matthias,

Thanks for the review.

On Thu, 2022-11-10 at 14:10 +0100, Matthias Brugger wrote:


On 09/11/2022 06:10, Jason-JH Lin (林睿祥) wrote:
> On Tue, 2022-11-08 at 18:46 +0100, Matthias Brugger wrote:
> > 
> > On 07/11/2022 08:22, Nancy.Lin wrote:

> > > Add vdosys1 mmsys compatible for MT8195 platform.
> > > 
> > > For MT8195, VDOSYS0 and VDOSYS1 are 2 display HW pipelines

> > > binding
> > > to
> > > 2 different power domains, different clock drivers and
> > > different
> > > mediatek-drm drivers.
> > > 
> > > Signed-off-by: Nancy.Lin 

> > > Reviewed-by: Nícolas F. R. A. Prado 
> > > ---
> > >  
> > > .../devicetree/bindings/arm/mediatek/mediatek,mmsys.yaml  |

> > > 4
> > > +++-
> > >   1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git

> > > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys
> > > .yam
> > > l
> > > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys
> > > .yam
> > > l
> > > index 0711f1834fbd..aaabe2196185 100644
> > > ---
> > > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys
> > > .yam
> > > l
> > > +++
> > > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys
> > > .yam
> > > l
> > > @@ -48,7 +48,9 @@ properties:
> > > - const: syscon
> > >   
> > > - items:

> > > -  - const: mediatek,mt8195-vdosys0
> > > +  - enum:
> > > +  - mediatek,mt8195-vdosys0
> > > +  - mediatek,mt8195-vdosys1
> > > - const: mediatek,mt8195-mmsys
> > > - const: syscon
> > >   
> > 
> > I think we had that several times already:
> > 
> 
> 

https://lore.kernel.org/all/6bbe9527-ae48-30e0-fb45-519223a74...@linaro.org/
> > 
> > We will something like this, but please check that this does not

> > give
> > any 
> > errors/warnings:
> > 
> > diff --git

> > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.y
> > aml 
> > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.y

> > aml
> > index eb451bec23d3d..8e9c4f4d7c389 100644
> > ---
> > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.y
> > aml
> > +++
> > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.y
> > aml
> > @@ -32,13 +32,22 @@ properties:
> > - mediatek,mt8183-mmsys
> > - mediatek,mt8186-mmsys
> > - mediatek,mt8192-mmsys
> > -  - mediatek,mt8195-mmsys
> > - mediatek,mt8365-mmsys
> > - const: syscon
> > - items:
> > - const: mediatek,mt7623-mmsys
> > - const: mediatek,mt2701-mmsys
> > - const: syscon
> > +  - items:
> > +  - const: mediatek,mt8195-vdosys0
> > +  - const: syscon
> > +  - items:
> > +  - const: mediatek,mt8195-vdosys1
> > +  - const: syscon
> > +  - items:
> > +  - const: mediatek,mt8195-mmsys
> > +  - const: syscon
> > +  deprecated: true
> > 
> > reg:

> >   maxItems: 1
> 
> Hi Matthias,
> 
> As the vdosys0 previous reviewed patch:
> 
> 

https://urldefense.com/v3/__https://patchwork.kernel.org/project/linux-mediatek/patch/20220927152704.12018-2-jason-jh@mediatek.com/__;!!CTRNKA9wMg0ARbw!zRdbIyyAsfqob2kapMAcKYATAguhEV0x0qE5cTOAcWUNhzeAbMHzZoos_2QzUCxS$
>  
> Should I modify the vdosys0 items format like your example?
> 
> Or should vdosys1 add items format like vdosys0's previous patch?

>  - items:
>  - const: mediatek,mt8195-vdosys1
>  - const: mediatek,mt8195-mmsys
>  - const: syscon
> 


No, vdosys1 must not have mediatek,mt8195-mmsys fallback.

Regards,
Matthias



I will fix it and add the following vdosys1 binding base on [1].

   - description: vdosys0 and vdosys1 are 2 display HW pipelines,
  so mt8195 binding should be deprecated.
 deprecated: true
 items:
   - const: mediatek,mt8195-mmsys
   - const: syscon
   - items:
   - const: mediatek,mt7623-mmsys
   - const: mediatek,mt2701-mmsys
   - const: syscon
   - items:
   - const: mediatek,mt8195-vdosys0
   - const: mediatek,mt8195-mmsys
   - const: syscon
  +- items:
  +- const: mediatek,mt8195-vdosys1
  +- const: syscon



Looks good, thanks
Matthias


[1]
https://kernel.googlesource.com/pub/scm/linux/kernel/git/matthias.bgg/linux/+/b237efd47df7d25b78c306e90b97c5aa0ff4c4fc/Documentation/devicetree/bindings/arm/mediatek/mediatek%2Cmmsys.yaml

Regards,
Nancy



> Regards,
> Jason-JH.Lin
> 
> 
> 
> * MEDIATEK Confidentiality Notice

>   
> The information contained in this e-mail message (including any
> attachments) may be confidential, proprietary, privileged, or
> otherwise
> exempt from disclosure under applicable laws. It is intended to be
> conveyed only to the designated recipient(s). Any use,
> disseminat

Re: git send-email friendly smtp provider anyone?

2022-11-22 Thread Konstantin Ryabitsev
On Mon, Nov 21, 2022 at 07:13:28PM +0100, Noralf Trønnes wrote:
> > Otherwise, you might consider using:
> > https://b4.docs.kernel.org/en/latest/contributor/send.html#authenticating-with-the-web-submission-endpoint
> > 
> 
> That's an interesting option. I did briefly look at b4 a few months back
> but it looked like it was under heavy development so I figured I'd wait
> before trying it out. I think I'll give b4 a spin to see how it works, I
> wonder how it handles patch changelogs.

I'd be happy to help set this up for you -- to date, this service hasn't been
used beyond a few test posts.

-K


Re: [PATCH v4 0/4] new subsystem for compute accelerator devices

2022-11-22 Thread Jeffrey Hugo

On 11/20/2022 3:04 PM, Jeffrey Hugo wrote:

On 11/19/2022 1:44 PM, Oded Gabbay wrote:
This is the fourth (and hopefully last) version of the patch-set to 
add the

new subsystem for compute accelerators. I removed the RFC headline as
I believe it is now ready for merging.

Compare to v3, this patch-set contains one additional patch that adds
documentation regarding the accel subsystem. I hope it's good enough for
this stage. In addition, there were few very minor fixes according to
comments received on v3.

The patches are in the following repo:
https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v4 



As in v3, The HEAD of that branch is a commit adding a dummy driver that
registers an accel device using the new framework. This can be served
as a simple reference.

v1 cover letter:
https://lkml.org/lkml/2022/10/22/544

v2 cover letter:
https://lore.kernel.org/lkml/20221102203405.1797491-1-ogab...@kernel.org/T/ 



v3 cover letter:
https://lore.kernel.org/lkml/20221106210225.2065371-1-ogab...@kernel.org/T/ 



Thanks,
Oded.


Reviewed-by: Jeffrey Hugo 

I have some nits.  Nothing that I think should be a blocker for this 
series.


I don't recall if I previously mentioned this.  I'm planning on updating 
the QAIC series to be an accel driver.  Therefore there should be 
at-least 1 user (probably several) for this subsystem in short order.


Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()

2022-11-22 Thread Alex Deucher
On Tue, Nov 22, 2022 at 9:59 AM Lazar, Lijo  wrote:
>
>
>
> On 11/22/2022 8:19 PM, Alex Deucher wrote:
> > On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang
> >  wrote:
> >>
> >> As comment of pci_get_class() says, it returns a pci_device with its
> >> refcount increased and decreased the refcount for the input parameter
> >> @from if it is not NULL.
> >>
> >> If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we
> >> need to call pci_dev_put() to decrease the refcount. Add the missing
> >> pci_dev_put() to avoid refcount leak.
> >
> > For both patches, I think pci_dev_put() needs to go into the loops.
> > There are 2 or more GPUs on the systems where this is relevant.
> >
>
> As per the logic, the loop breaks when it finds a valid ATRM handle. So
> dev_put is required only for that device.

Sure, but what if the handle is on the second DISPLAY_VGA or
DISPLAY_OTHER class PCI device on the system?  We've already called
pci_get_class() for the first PCI device that matched.

Alex

>
> When inside the loop this happens -  "decreased the refcount for the
> input parameter @from if it is not NULL"
>
> Thanks,
> Lijo
>
> > Alex
> >
> >>
> >> Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with 
> >> ATRM")
> >> Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX 
> >> handler (v3)")
> >> Signed-off-by: Xiongfeng Wang 
> >> ---
> >>   drivers/gpu/drm/radeon/radeon_bios.c | 1 +
> >>   1 file changed, 1 insertion(+)
> >>
> >> diff --git a/drivers/gpu/drm/radeon/radeon_bios.c 
> >> b/drivers/gpu/drm/radeon/radeon_bios.c
> >> index 33121655d50b..2df6ce3e32cb 100644
> >> --- a/drivers/gpu/drm/radeon/radeon_bios.c
> >> +++ b/drivers/gpu/drm/radeon/radeon_bios.c
> >> @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device 
> >> *rdev)
> >>
> >>  if (!found)
> >>  return false;
> >> +   pci_dev_put(pdev);
> >>
> >>  rdev->bios = kmalloc(size, GFP_KERNEL);
> >>  if (!rdev->bios) {
> >> --
> >> 2.20.1
> >>


Re: [PATCH mm-unstable v1 09/20] mm/gup: reliable R/O long-term pinning in COW mappings

2022-11-22 Thread Vlastimil Babka
On 11/16/22 11:26, David Hildenbrand wrote:
> We already support reliable R/O pinning of anonymous memory. However,
> assume we end up pinning (R/O long-term) a pagecache page or the shared
> zeropage inside a writable private ("COW") mapping. The next write access
> will trigger a write-fault and replace the pinned page by an exclusive
> anonymous page in the process page tables to break COW: the pinned page no
> longer corresponds to the page mapped into the process' page table.
> 
> Now that FAULT_FLAG_UNSHARE can break COW on anything mapped into a
> COW mapping, let's properly break COW first before R/O long-term
> pinning something that's not an exclusive anon page inside a COW
> mapping. FAULT_FLAG_UNSHARE will break COW and map an exclusive anon page
> instead that can get pinned safely.
> 
> With this change, we can stop using FOLL_FORCE|FOLL_WRITE for reliable
> R/O long-term pinning in COW mappings.
> 
> With this change, the new R/O long-term pinning tests for non-anonymous
> memory succeed:
>   # [RUN] R/O longterm GUP pin ... with shared zeropage
>   ok 151 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP pin ... with memfd
>   ok 152 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP pin ... with tmpfile
>   ok 153 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP pin ... with huge zeropage
>   ok 154 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP pin ... with memfd hugetlb (2048 kB)
>   ok 155 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP pin ... with memfd hugetlb (1048576 kB)
>   ok 156 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP-fast pin ... with shared zeropage
>   ok 157 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP-fast pin ... with memfd
>   ok 158 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP-fast pin ... with tmpfile
>   ok 159 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP-fast pin ... with huge zeropage
>   ok 160 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP-fast pin ... with memfd hugetlb (2048 kB)
>   ok 161 Longterm R/O pin is reliable
>   # [RUN] R/O longterm GUP-fast pin ... with memfd hugetlb (1048576 kB)
>   ok 162 Longterm R/O pin is reliable
> 
> Note 1: We don't care about short-term R/O-pinning, because they have
> snapshot semantics: they are not supposed to observe modifications that
> happen after pinning.
> 
> As one example, assume we start direct I/O to read from a page and store
> page content into a file: modifications to page content after starting
> direct I/O are not guaranteed to end up in the file. So even if we'd pin
> the shared zeropage, the end result would be as expected -- getting zeroes
> stored to the file.
> 
> Note 2: For shared mappings we'll now always fallback to the slow path to
> lookup the VMA when R/O long-term pining. While that's the necessary price
> we have to pay right now, it's actually not that bad in practice: most
> FOLL_LONGTERM users already specify FOLL_WRITE, for example, along with
> FOLL_FORCE because they tried dealing with COW mappings correctly ...
> 
> Note 3: For users that use FOLL_LONGTERM right now without FOLL_WRITE,
> such as VFIO, we'd now no longer pin the shared zeropage. Instead, we'd
> populate exclusive anon pages that we can pin. There was a concern that
> this could affect the memlock limit of existing setups.
> 
> For example, a VM running with VFIO could run into the memlock limit and
> fail to run. However, we essentially had the same behavior already in
> commit 17839856fd58 ("gup: document and work around "COW can break either
> way" issue") which got merged into some enterprise distros, and there were
> not any such complaints. So most probably, we're fine.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Vlastimil Babka 



Re: [Intel-gfx] [PATCH v2 0/5] Add module oriented dmesg output

2022-11-22 Thread Michal Wajdeczko



On 18.11.2022 11:52, Jani Nikula wrote:
> On Thu, 17 Nov 2022, john.c.harri...@intel.com wrote:
>> From: John Harrison 
>>
>> When trying to analyse bug reports from CI, customers, etc. it can be
>> difficult to work out exactly what is happening on which GT in a
>> multi-GT system. So add GT oriented debug/error message wrappers. If
>> used instead of the drm_ equivalents, you get the same output but with
>> a GT# prefix on it.
>>
>> It was also requested to extend this further to submodules in order to
>> factor out the repeated structure accessing constructs and common
>> string prefixes. So, add versions for GuC, HuC and GuC CTB as well.
>>
>> This patch set updates all the gt/uc files to use the new helpers as a
>> first step. The intention would be to convert all output messages that
>> have access to a GT structure.
>>
>> v2: Go back to using lower case names, add more wrapper sets (combined
>> review feedback). Also, wrap up probe injection and WARN entries.
>>
>> Signed-off-by: John Harrison 
> 
> For adding the wrappers in general, I'm going to disagree and
> commit. I'll leave it up to Tvrtko and Joonas.
> 
> Regarding the placement of the macros, I insist you add individual
> header files for the wrappers and include them only where needed.

do you mean:

intel_gt_print.h
intel_guc_print.h
intel_huc_print.h

with just macros or also with all functions that work with drm_printer?

> 
> We have a fairly serious problem with everything including everything in
> i915 that I've been slowly trying to tackle. Touch one thing, rebuild
> everything. About a third of our headers cause the rebuild of the entire
> driver when modified. We need to reduce the surface of things that cause
> rebuilds.
> 
> For example, intel_gt.h is included by 97 files, intel_guc.h by 332
> files, and intel_huc.h by 329 files (counting recursively).
> 
> There's absolutely no reason any of the display code, for example, needs
> to have these logging macros in their build. Long term, the headers
> should be reorganized to reduce the interdependencies, and this is what
> I've been doing in i915_drv.h and display/ in general. But the least we
> can do is not make the problem worse.

to solve this we should really consider splitting out GuC and HuC
definitions to dedicated _types.h files and only include them in
i915_drv.h (and print macros are orthogonal for this problem)

Michal

> 
> BR,
> Jani.
> 
>>
>>
>> John Harrison (5):
>>   drm/i915/gt: Start adding module oriented dmesg output
>>   drm/i915/huc: Add HuC specific debug print wrappers
>>   drm/i915/guc: Add GuC specific debug print wrappers
>>   drm/i915/guc: Add GuC CT specific debug print wrappers
>>   drm/i915/uc: Update the gt/uc code to use gt_err and friends
>>
>>  drivers/gpu/drm/i915/gt/intel_gt.c|  96 
>>  drivers/gpu/drm/i915/gt/intel_gt.h|  35 +++
>>  drivers/gpu/drm/i915/gt/uc/intel_guc.c|  32 +--
>>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  35 +++
>>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|   8 +-
>>  .../gpu/drm/i915/gt/uc/intel_guc_capture.c|  48 ++--
>>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 222 +-
>>  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c |  19 +-
>>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c|  37 ++-
>>  drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c |   7 +-
>>  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   |  55 ++---
>>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  62 +++--
>>  drivers/gpu/drm/i915/gt/uc/intel_huc.c|  31 +--
>>  drivers/gpu/drm/i915/gt/uc/intel_huc.h|  23 ++
>>  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 108 -
>>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |  98 
>>  drivers/gpu/drm/i915/gt/uc/selftest_guc.c |  34 +--
>>  .../drm/i915/gt/uc/selftest_guc_hangcheck.c   |  22 +-
>>  .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   |  10 +-
>>  19 files changed, 507 insertions(+), 475 deletions(-)
> 


Re: [PATCH printk v5 00/40] reduce console_lock scope

2022-11-22 Thread Greg Kroah-Hartman
On Wed, Nov 16, 2022 at 05:27:12PM +0106, John Ogness wrote:
> This is v5 of a series to prepare for threaded/atomic
> printing. v4 is here [0]. This series focuses on reducing the
> scope of the BKL console_lock. It achieves this by switching to
> SRCU and a dedicated mutex for console list iteration and
> modification, respectively. The console_lock will no longer
> offer this protection.
> 
> Also, during the review of v2 it came to our attention that
> many console drivers are checking CON_ENABLED to see if they
> are registered. Because this flag can change without
> unregistering and because this flag does not represent an
> atomic point when an (un)registration process is complete,
> a new console_is_registered() function is introduced. This
> function uses the console_list_lock to synchronize with the
> (un)registration process to provide a reliable status.
> 
> All users of the console_lock for list iteration have been
> modified. For the call sites where the console_lock is still
> needed (for other reasons), comments are added to explain
> exactly why the console_lock is needed.
> 
> All users of CON_ENABLED for registration status have been
> modified to use console_is_registered(). Note that there are
> still users of CON_ENABLED, but this is for legitimate purposes
> about a registered console being able to print.
> 
> The base commit for this series is from Paul McKenney's RCU tree
> and provides an NMI-safe SRCU implementation [1]. Without the
> NMI-safe SRCU implementation, this series is not less safe than
> mainline. But we will need the NMI-safe SRCU implementation for
> atomic consoles anyway, so we might as well get it in
> now. Especially since it _does_ increase the reliability for
> mainline in the panic path.
> 
> Changes since v4:
> 
> printk:
> 
> - Introduce console_init_seq() to handle the now rather complex
>   procedure to find an appropriate start sequence number for a
>   new console upon registration.
> 
> - When registering a non-boot console and boot consoles are
>   registered, try to flush all the consoles to get the next @seq
>   value before falling back to use the @seq of the enabled boot
>   console that is furthest behind.
> 
> - For console_force_preferred_locked(), make the console the
>   head of the console list.
> 


Reviewed-by: Greg Kroah-Hartman 


Re: [PATCH v2 1/5] drm/i915/gt: Start adding module oriented dmesg output

2022-11-22 Thread Michal Wajdeczko



On 18.11.2022 02:58, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> When trying to analyse bug reports from CI, customers, etc. it can be
> difficult to work out exactly what is happening on which GT in a
> multi-GT system. So add GT oriented debug/error message wrappers. If
> used instead of the drm_ equivalents, you get the same output but with
> a GT# prefix on it.
> 
> v2: Go back to using lower case names (combined review feedback).
> Convert intel_gt.c as a first step.
> 
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c | 96 ++
>  drivers/gpu/drm/i915/gt/intel_gt.h | 35 +++
>  2 files changed, 81 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
> b/drivers/gpu/drm/i915/gt/intel_gt.c
> index 0325f071046ca..349fcfdd14a6d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -90,9 +90,8 @@ static int intel_gt_probe_lmem(struct intel_gt *gt)
>   if (err == -ENODEV)
>   return 0;
>  
> - drm_err(&i915->drm,
> - "Failed to setup region(%d) type=%d\n",
> - err, INTEL_MEMORY_LOCAL);
> + gt_err(gt, "Failed to setup region(%d) type=%d\n",
> +err, INTEL_MEMORY_LOCAL);
>   return err;
>   }
>  
> @@ -192,14 +191,14 @@ int intel_gt_init_hw(struct intel_gt *gt)
>  
>   ret = i915_ppgtt_init_hw(gt);
>   if (ret) {
> - drm_err(&i915->drm, "Enabling PPGTT failed (%d)\n", ret);
> + gt_err(gt, "Enabling PPGTT failed (%d)\n", ret);
>   goto out;
>   }
>  
>   /* We can't enable contexts until all firmware is loaded */
>   ret = intel_uc_init_hw(>->uc);
>   if (ret) {
> - i915_probe_error(i915, "Enabling uc failed (%d)\n", ret);
> + gt_probe_error(gt, "Enabling uc failed (%d)\n", ret);
>   goto out;
>   }
>  
> @@ -264,7 +263,7 @@ intel_gt_clear_error_registers(struct intel_gt *gt,
>* some errors might have become stuck,
>* mask them.
>*/
> - drm_dbg(>->i915->drm, "EIR stuck: 0x%08x, masking\n", eir);
> + gt_dbg(gt, "EIR stuck: 0x%08x, masking\n", eir);
>   rmw_set(uncore, EMR, eir);
>   intel_uncore_write(uncore, GEN2_IIR,
>  I915_MASTER_ERROR_INTERRUPT);
> @@ -298,16 +297,16 @@ static void gen6_check_faults(struct intel_gt *gt)
>   for_each_engine(engine, gt, id) {
>   fault = GEN6_RING_FAULT_REG_READ(engine);
>   if (fault & RING_FAULT_VALID) {
> - drm_dbg(&engine->i915->drm, "Unexpected fault\n"
> - "\tAddr: 0x%08lx\n"
> - "\tAddress space: %s\n"
> - "\tSource ID: %d\n"
> - "\tType: %d\n",
> - fault & PAGE_MASK,
> - fault & RING_FAULT_GTTSEL_MASK ?
> - "GGTT" : "PPGTT",
> - RING_FAULT_SRCID(fault),
> - RING_FAULT_FAULT_TYPE(fault));
> + gt_dbg(gt, "Unexpected fault\n"
> +"\tAddr: 0x%08lx\n"
> +"\tAddress space: %s\n"
> +"\tSource ID: %d\n"
> +"\tType: %d\n",
> +fault & PAGE_MASK,
> +fault & RING_FAULT_GTTSEL_MASK ?
> +"GGTT" : "PPGTT",
> +RING_FAULT_SRCID(fault),
> +RING_FAULT_FAULT_TYPE(fault));
>   }
>   }
>  }
> @@ -334,17 +333,17 @@ static void xehp_check_faults(struct intel_gt *gt)
>   fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) |
>((u64)fault_data0 << 12);
>  
> - drm_dbg(>->i915->drm, "Unexpected fault\n"
> - "\tAddr: 0x%08x_%08x\n"
> - "\tAddress space: %s\n"
> - "\tEngine ID: %d\n"
> - "\tSource ID: %d\n"
> - "\tType: %d\n",
> - upper_32_bits(fault_addr), lower_32_bits(fault_addr),
> - fault_data1 & FAULT_GTT_SEL ? "GGTT" : "PPGTT",
> - GEN8_RING_FAULT_ENGINE_ID(fault),
> - RING_FAULT_SRCID(fault),
> - RING_FAULT_FAULT_TYPE(fault));
> + gt_dbg(gt, "Unexpected fault\n"
> +"\tAddr: 0x%08x_%08x\n"
> +"\tAddress space: %s\n"
> +"\tEngine ID: %d\n"
> +"\tSource ID: %d\n"
> +"\tType: %d\n",
> +upper_32_bits(fault_addr), lower_32_bits(

Re: Build regressions/improvements in v6.1-rc6

2022-11-22 Thread Randy Dunlap



On 11/22/22 07:00, Johannes Berg wrote:
> On Tue, 2022-11-22 at 08:55 -0500, Alex Deucher wrote:
>>>
>>>+ /kisskb/src/arch/um/include/asm/processor-generic.h: error: called 
>>> object is not a function or function pointer:  => 94:18
>>>+ /kisskb/src/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c: 
>>> error: control reaches end of non-void function [-Werror=return-type]:  => 
>>> 1934:1
>>>
>>> um-x86_64/um-all{mod,yes}config (in kfd_cpumask_to_apic_id())
>>
>> Presumably cpu_data is not defined on um-x86_64?  Does it even make
>> sense to build drivers on um-x86_64?
> 
> Drivers in general yes ;-)
> 
> This driver, probably not.
> 
> But the issue is that a lot of drivers "depends on X86_64" or such,
> where only "X86" is the arch symbol. You could add "X86 && X86_64" to
> really build on x86 64-bit only.
> 
> I didn't check this driver, but this has mostly popped up since UM got
> PCI support some time ago (which I added.)

I have patches for lots of these issues, but some people said that they
would want to build DRM drivers for use with KUNIT (i.e. UML), so I thought
that meant my patches were not needed.

-- 
~Randy


[PATCH] dma-buf: Require VM_PFNMAP vma for mmap

2022-11-22 Thread Daniel Vetter
tldr; DMA buffers aren't normal memory, expecting that you can use
them like that (like calling get_user_pages works, or that they're
accounting like any other normal memory) cannot be guaranteed.

Since some userspace only runs on integrated devices, where all
buffers are actually all resident system memory, there's a huge
temptation to assume that a struct page is always present and useable
like for any more pagecache backed mmap. This has the potential to
result in a uapi nightmare.

To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
blocks get_user_pages and all the other struct page based
infrastructure for everyone. In spirit this is the uapi counterpart to
the kernel-internal CONFIG_DMABUF_DEBUG.

Motivated by a recent patch which wanted to swich the system dma-buf
heap to vm_insert_page instead of vm_insert_pfn.

v2:

Jason brought up that we also want to guarantee that all ptes have the
pte_special flag set, to catch fast get_user_pages (on architectures
that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.

>From auditing the various functions to insert pfn pte entires
(vm_insert_pfn_prot, remap_pfn_range and all it's callers like
dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
this should be the correct flag to check for.

v3: Change to WARN_ON_ONCE (Thomas Zimmermann)

References: 
https://lore.kernel.org/lkml/cakmk7uhi+mg0z0humnt13qccvuturvjpcr0njrl12k-wbwz...@mail.gmail.com/
Acked-by: Christian König 
Acked-by: Thomas Zimmermann 
Cc: Thomas Zimmermann 
Cc: Jason Gunthorpe 
Cc: Suren Baghdasaryan 
Cc: Matthew Wilcox 
Cc: John Stultz 
Signed-off-by: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
--
Ok I entirely forgot about this patch but stumbled over it and checked
what's up with it no. I think it's ready now for merging:
- shmem helper patches to fix up vgem landed
- ttm has been fixed since a while
- I don't think we've had any other open issues

Time to lock down this uapi contract for real?
-Daniel
---
 drivers/dma-buf/dma-buf.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index b6c36914e7c6..88718665c3c3 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -150,6 +150,8 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
ret = dmabuf->ops->mmap(dmabuf, vma);
dma_resv_unlock(dmabuf->resv);
 
+   WARN_ON_ONCE(!(vma->vm_flags & VM_PFNMAP));
+
return ret;
 }
 
@@ -1495,6 +1497,8 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct 
vm_area_struct *vma,
ret = dmabuf->ops->mmap(dmabuf, vma);
dma_resv_unlock(dmabuf->resv);
 
+   WARN_ON_ONCE(!(vma->vm_flags & VM_PFNMAP));
+
return ret;
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, DMA_BUF);
-- 
2.37.2



Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()

2022-11-22 Thread Lazar, Lijo
[AMD Official Use Only - General]

When only second GPU has valid ATRM handle -
then it stays inside the loop and in the next call to pci_get_class(), it 
passes pdev reference to first GPU as the "from" param. That time it drops the 
reference count of "from" device.

Thanks,
Lijo

From: Alex Deucher 
Sent: Tuesday, November 22, 2022 9:55:33 PM
To: Lazar, Lijo 
Cc: Xiongfeng Wang ; Deucher, Alexander 
; Koenig, Christian ; Pan, 
Xinhui ; airl...@gmail.com ; 
dan...@ffwll.ch ; Zhang, Hawking ; 
dri-devel@lists.freedesktop.org ; 
amd-...@lists.freedesktop.org ; 
yangyingli...@huawei.com 
Subject: Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in 
radeon_atrm_get_bios()

On Tue, Nov 22, 2022 at 9:59 AM Lazar, Lijo  wrote:
>
>
>
> On 11/22/2022 8:19 PM, Alex Deucher wrote:
> > On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang
> >  wrote:
> >>
> >> As comment of pci_get_class() says, it returns a pci_device with its
> >> refcount increased and decreased the refcount for the input parameter
> >> @from if it is not NULL.
> >>
> >> If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we
> >> need to call pci_dev_put() to decrease the refcount. Add the missing
> >> pci_dev_put() to avoid refcount leak.
> >
> > For both patches, I think pci_dev_put() needs to go into the loops.
> > There are 2 or more GPUs on the systems where this is relevant.
> >
>
> As per the logic, the loop breaks when it finds a valid ATRM handle. So
> dev_put is required only for that device.

Sure, but what if the handle is on the second DISPLAY_VGA or
DISPLAY_OTHER class PCI device on the system?  We've already called
pci_get_class() for the first PCI device that matched.

Alex

>
> When inside the loop this happens -  "decreased the refcount for the
> input parameter @from if it is not NULL"
>
> Thanks,
> Lijo
>
> > Alex
> >
> >>
> >> Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with 
> >> ATRM")
> >> Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX 
> >> handler (v3)")
> >> Signed-off-by: Xiongfeng Wang 
> >> ---
> >>   drivers/gpu/drm/radeon/radeon_bios.c | 1 +
> >>   1 file changed, 1 insertion(+)
> >>
> >> diff --git a/drivers/gpu/drm/radeon/radeon_bios.c 
> >> b/drivers/gpu/drm/radeon/radeon_bios.c
> >> index 33121655d50b..2df6ce3e32cb 100644
> >> --- a/drivers/gpu/drm/radeon/radeon_bios.c
> >> +++ b/drivers/gpu/drm/radeon/radeon_bios.c
> >> @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device 
> >> *rdev)
> >>
> >>  if (!found)
> >>  return false;
> >> +   pci_dev_put(pdev);
> >>
> >>  rdev->bios = kmalloc(size, GFP_KERNEL);
> >>  if (!rdev->bios) {
> >> --
> >> 2.20.1
> >>
<>

Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()

2022-11-22 Thread Alex Deucher
On Tue, Nov 22, 2022 at 12:10 PM Lazar, Lijo  wrote:
>
> [AMD Official Use Only - General]
>
> When only second GPU has valid ATRM handle -
> then it stays inside the loop and in the next call to pci_get_class(), it 
> passes pdev reference to first GPU as the "from" param. That time it drops 
> the reference count of "from" device.

ah, right, that was the part I missed.  Thanks.

Alex


>
> Thanks,
> Lijo
> 
> From: Alex Deucher 
> Sent: Tuesday, November 22, 2022 9:55:33 PM
> To: Lazar, Lijo 
> Cc: Xiongfeng Wang ; Deucher, Alexander 
> ; Koenig, Christian ; 
> Pan, Xinhui ; airl...@gmail.com ; 
> dan...@ffwll.ch ; Zhang, Hawking ; 
> dri-devel@lists.freedesktop.org ; 
> amd-...@lists.freedesktop.org ; 
> yangyingli...@huawei.com 
> Subject: Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in 
> radeon_atrm_get_bios()
>
> On Tue, Nov 22, 2022 at 9:59 AM Lazar, Lijo  wrote:
> >
> >
> >
> > On 11/22/2022 8:19 PM, Alex Deucher wrote:
> > > On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang
> > >  wrote:
> > >>
> > >> As comment of pci_get_class() says, it returns a pci_device with its
> > >> refcount increased and decreased the refcount for the input parameter
> > >> @from if it is not NULL.
> > >>
> > >> If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we
> > >> need to call pci_dev_put() to decrease the refcount. Add the missing
> > >> pci_dev_put() to avoid refcount leak.
> > >
> > > For both patches, I think pci_dev_put() needs to go into the loops.
> > > There are 2 or more GPUs on the systems where this is relevant.
> > >
> >
> > As per the logic, the loop breaks when it finds a valid ATRM handle. So
> > dev_put is required only for that device.
>
> Sure, but what if the handle is on the second DISPLAY_VGA or
> DISPLAY_OTHER class PCI device on the system?  We've already called
> pci_get_class() for the first PCI device that matched.
>
> Alex
>
> >
> > When inside the loop this happens -  "decreased the refcount for the
> > input parameter @from if it is not NULL"
> >
> > Thanks,
> > Lijo
> >
> > > Alex
> > >
> > >>
> > >> Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with 
> > >> ATRM")
> > >> Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX 
> > >> handler (v3)")
> > >> Signed-off-by: Xiongfeng Wang 
> > >> ---
> > >>   drivers/gpu/drm/radeon/radeon_bios.c | 1 +
> > >>   1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/drivers/gpu/drm/radeon/radeon_bios.c 
> > >> b/drivers/gpu/drm/radeon/radeon_bios.c
> > >> index 33121655d50b..2df6ce3e32cb 100644
> > >> --- a/drivers/gpu/drm/radeon/radeon_bios.c
> > >> +++ b/drivers/gpu/drm/radeon/radeon_bios.c
> > >> @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct 
> > >> radeon_device *rdev)
> > >>
> > >>  if (!found)
> > >>  return false;
> > >> +   pci_dev_put(pdev);
> > >>
> > >>  rdev->bios = kmalloc(size, GFP_KERNEL);
> > >>  if (!rdev->bios) {
> > >> --
> > >> 2.20.1
> > >>


Re: [PATCH v2 2/5] drm/i915/huc: Add HuC specific debug print wrappers

2022-11-22 Thread Michal Wajdeczko



On 18.11.2022 02:58, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> Create a set of HuC printers and start using them.
> 
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_huc.c | 31 ++
>  drivers/gpu/drm/i915/gt/uc/intel_huc.h | 23 +++
>  2 files changed, 35 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> index be855811d85df..0bbbc7192da63 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> @@ -107,11 +107,9 @@ static enum hrtimer_restart 
> huc_delayed_load_timer_callback(struct hrtimer *hrti
>  
>   if (!intel_huc_is_authenticated(huc)) {
>   if (huc->delayed_load.status == INTEL_HUC_WAITING_ON_GSC)
> - drm_notice(&huc_to_gt(huc)->i915->drm,
> -"timed out waiting for MEI GSC init to load 
> HuC\n");
> + huc_notice(huc, "Timed out waiting for MEI GSC init to 
> load FW\n");
>   else if (huc->delayed_load.status == INTEL_HUC_WAITING_ON_PXP)
> - drm_notice(&huc_to_gt(huc)->i915->drm,
> -"timed out waiting for MEI PXP init to load 
> HuC\n");
> + huc_notice(huc, "Timed out waiting for MEI PXP init to 
> load FW\n");
>   else
>   MISSING_CASE(huc->delayed_load.status);
>  
> @@ -174,8 +172,7 @@ static int gsc_notifier(struct notifier_block *nb, 
> unsigned long action, void *d
>  
>   case BUS_NOTIFY_DRIVER_NOT_BOUND: /* mei driver fails to be bound */
>   case BUS_NOTIFY_UNBIND_DRIVER: /* mei driver about to be unbound */
> - drm_info(&huc_to_gt(huc)->i915->drm,
> -  "mei driver not bound, disabling HuC load\n");
> + huc_info(huc, "- mei driver not bound, disabling HuC load\n");
>   gsc_init_error(huc);
>   break;
>   }
> @@ -193,8 +190,7 @@ void intel_huc_register_gsc_notifier(struct intel_huc 
> *huc, struct bus_type *bus
>   huc->delayed_load.nb.notifier_call = gsc_notifier;
>   ret = bus_register_notifier(bus, &huc->delayed_load.nb);
>   if (ret) {
> - drm_err(&huc_to_gt(huc)->i915->drm,
> - "failed to register GSC notifier\n");
> + huc_err(huc, "Failed to register GSC notifier\n");
>   huc->delayed_load.nb.notifier_call = NULL;
>   gsc_init_error(huc);
>   }
> @@ -284,8 +280,7 @@ static int check_huc_loading_mode(struct intel_huc *huc)
> GSC_LOADS_HUC;
>  
>   if (fw_needs_gsc != hw_uses_gsc) {
> - drm_err(>->i915->drm,
> - "mismatch between HuC FW (%s) and HW (%s) load modes\n",
> + huc_err(huc, "Mismatch between FW (%s) and HW (%s) load 
> modes\n",
>   HUC_LOAD_MODE_STRING(fw_needs_gsc),
>   HUC_LOAD_MODE_STRING(hw_uses_gsc));
>   return -ENOEXEC;
> @@ -294,19 +289,17 @@ static int check_huc_loading_mode(struct intel_huc *huc)
>   /* make sure we can access the GSC via the mei driver if we need it */
>   if (!(IS_ENABLED(CONFIG_INTEL_MEI_PXP) && 
> IS_ENABLED(CONFIG_INTEL_MEI_GSC)) &&
>   fw_needs_gsc) {
> - drm_info(>->i915->drm,
> -  "Can't load HuC due to missing MEI modules\n");
> + huc_info(huc, "Can't load due to missing MEI modules\n");
>   return -EIO;
>   }
>  
> - drm_dbg(>->i915->drm, "GSC loads huc=%s\n", str_yes_no(fw_needs_gsc));
> + huc_dbg(huc, "GSC loads huc=%s\n", str_yes_no(fw_needs_gsc));

this will give:

"GT0: HuC GSC loads huc=yes"

but maybe better to change that to get:

"GT0: HuC loaded by GSC=yes"

so this should be:

huc_dbg(huc, "loaded by GSC=%s\n", str_yes_no(fw_needs_gsc));

>  
>   return 0;
>  }
>  
>  int intel_huc_init(struct intel_huc *huc)
>  {
> - struct drm_i915_private *i915 = huc_to_gt(huc)->i915;
>   int err;
>  
>   err = check_huc_loading_mode(huc);
> @@ -323,7 +316,7 @@ int intel_huc_init(struct intel_huc *huc)
>  
>  out:
>   intel_uc_fw_change_status(&huc->fw, INTEL_UC_FIRMWARE_INIT_FAIL);
> - drm_info(&i915->drm, "HuC init failed with %d\n", err);
> + huc_info(huc, "init failed with %d\n", err);
>   return err;
>  }
>  
> @@ -366,13 +359,13 @@ int intel_huc_wait_for_auth_complete(struct intel_huc 
> *huc)
>   delayed_huc_load_complete(huc);
>  
>   if (ret) {
> - drm_err(>->i915->drm, "HuC: Firmware not verified %d\n", ret);
> + huc_err(huc, "firmware not verified %d\n", ret);
>   intel_uc_fw_change_status(&huc->fw, 
> INTEL_UC_FIRMWARE_LOAD_FAIL);
>   return ret;
>   }
>  
>   intel_uc_fw_change_status(&huc->fw, INTEL_UC_FIRMWARE_RUNNING);
> - dr

Re: [syzbot] inconsistent lock state in sync_info_debugfs_show

2022-11-22 Thread Christian König

Am 22.11.22 um 15:51 schrieb Daniel Vetter:

On Fri, Nov 18, 2022 at 06:46:38PM -0800, syzbot wrote:

syzbot has found a reproducer for the following issue on:

HEAD commit:84368d882b96 Merge tag 'soc-fixes-6.1-3' of git://git.kern..
git tree:   upstream
console+strace: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fx%2Flog.txt%3Fx%3D1670fb6588&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=%2BNj%2B3vZRTi6DhdOdNlQDG7GJN9u3IZy8%2FZxqHScu2Fs%3D&reserved=0
kernel config:  
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fx%2F.config%3Fx%3D6f4e5e9899396248&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=qZpvAr44nbuFs3jvd9fggeVnMgAt7J5Izxa1GvoMDjA%3D&reserved=0
dashboard link: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fbug%3Fextid%3D007bfe0f3330f6e1e7d1&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=tTHNWwrjD%2F%2FddfRi%2Fe3AbwMw%2B7MPWoolWeZt%2FgnBa4g%3D&reserved=0
compiler:   gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for 
Debian) 2.35.2
syz repro:  
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fx%2Frepro.syz%3Fx%3D164376f988&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=bay5gePJjT3Nm4J%2BINtso1hKNPMUmfP6xActclfXqa8%3D&reserved=0
C reproducer:   
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fx%2Frepro.c%3Fx%3D16cf096588&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=o5JlI1IDcmP%2Bd08wLRhyD5gUIjuPioMEyK%2FkH6m%2FNIQ%3D&reserved=0

Downloadable assets:
disk image: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstorage.googleapis.com%2Fsyzbot-assets%2F031b6e68785d%2Fdisk-84368d88.raw.xz&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dAT5hGTqrR01QiyGbOHyWkBIvTYvsmGqLiPqXuw7vK4%3D&reserved=0
vmlinux: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstorage.googleapis.com%2Fsyzbot-assets%2Fcff5e66b90e8%2Fvmlinux-84368d88.xz&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=bHMhF3kWMS3ZVbPs8z6mWewUlNBTJPabhhI58e77KuI%3D&reserved=0
kernel image: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstorage.googleapis.com%2Fsyzbot-assets%2Fe75525784a66%2FbzImage-84368d88.xz&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=KFJD3fDW1Ppqla%2BPA6wY6NC3qSNrlCpnkhdqIDZGxoY%3D&reserved=0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+007bfe0f3330f6e1e...@syzkaller.appspotmail.com


WARNING: inconsistent lock state
6.1.0-rc5-syzkaller-00144-g84368d882b96 #0 Not tainted

inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
syz-executor333/3645 [HC0[0]:SC0[0]:HE0:SE1] takes:
8d295c38 (sync_timeline_list_lock){?...}-{2:2}, at: spin_lock_irq 
include/linux/spinlock.h:375 [inline]
8d295c38 (sync_timeline_list_lock){?...}-{2:2}, at: 
sync_info_debugfs_show+0x31/0x200 drivers/dma-buf/sync_debug.c:147
{IN-HARDIRQ-W} state was registered at:
   lock_acquire kernel/locking/lockdep.c:5668 [inline]
   lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
   __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
   _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/s

Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage

2022-11-22 Thread Linus Torvalds
On Tue, Nov 22, 2022 at 4:25 AM Hans Verkuil  wrote:
>
> I tracked the use of 'force' all the way back to the first git commit
> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the
> reason is lost in the mists of time.

Well, not entirely.

For archaeology reasons, I went back to the old BK history, which
exists as a git conversion in

https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/

and there you can actually find it.

Not with a lot of explanations, though - it's commit b7649ef789
("[PATCH] videobuf update"):

This updates the video-buf.c module (helper module for video buffer
management).  Some memory management fixes, also some adaptions to the
final v4l2 api.

but it went from

 err = get_user_pages(current,current->mm,
-data, dma->nr_pages,
-rw == READ, 0, /* don't force */
+data & PAGE_MASK, dma->nr_pages,
+rw == READ, 1, /* force */
 dma->pages, NULL);

in that commit.

So it goes back to October 2002.

> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests
> that it might be related to calling get_user_pages for write buffers

The timing roughly matches.

> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
> allow drivers to read from the buffer?

The issue with some of the driver hacks has been that

 - they only want to read, and the buffer may be read-only

 - they then used FOLL_WRITE despite that, because they want to break
COW (due to the issue that David is now fixing with his series)

 - but that means that the VM layer says "nope, you can't write to
this read-only user mapping"

 - ... and then they use FOLL_FORCE to say "yes, I can".

iOW, the FOLL_FORCE may be entirely due to an (incorrect, but
historically needed) FOLL_WRITE.

 Linus


Re: Try to address the DMA-buf coherency problem

2022-11-22 Thread Christian König

Am 22.11.22 um 15:36 schrieb Daniel Vetter:

On Fri, Nov 18, 2022 at 11:32:19AM -0800, Rob Clark wrote:

On Thu, Nov 17, 2022 at 7:38 AM Nicolas Dufresne  wrote:

Le jeudi 17 novembre 2022 à 13:10 +0100, Christian König a écrit :

DMA-Buf let's the exporter setup the DMA addresses the importer uses to
be able to directly decided where a certain operation should go. E.g. we
have cases where for example a P2P write doesn't even go to memory, but
rather a doorbell BAR to trigger another operation. Throwing in CPU
round trips for explicit ownership transfer completely breaks that
concept.

It sounds like we should have a dma_dev_is_coherent_with_dev() which
accepts two (or an array?) of devices and tells the caller whether the
devices need explicit ownership transfer.

No, exactly that's the concept I'm pushing back on very hard here.

In other words explicit ownership transfer is not something we would
want as requirement in the framework, cause otherwise we break tons of
use cases which require concurrent access to the underlying buffer.

I'm not pushing for this solution, but really felt the need to correct you here.
I have quite some experience with ownership transfer mechanism, as this is how
GStreamer framework works since 2000. Concurrent access is a really common use
cases and it is quite well defined in that context. The bracketing system (in
this case called map() unmap(), with flag stating the usage intention like reads
and write) is combined the the refcount. The basic rules are simple:

This is all CPU oriented, I think Christian is talking about the case
where ownership transfer happens without CPU involvement, such as via
GPU waiting on a fence

Yeah for pure device2device handover the rule pretty much has to be that
any coherency management that needs to be done must be done from the
device side (flushing device side caches and stuff like that) only. But
under the assumption that _all_ cpu side management has been done already
before the first device access started.

And then the map/unmap respectively begin/end_cpu_access can be used what
it was meant for with cpu side invalidation/flushing and stuff like that,
while having pretty clear handover/ownership rules and hopefully not doing
no unecessary flushes. And all that while allowing device acces to be
pipelined. Worst case the exporter has to insert some pipelined cache
flushes as a dma_fence pipelined work of its own between the device access
when moving from one device to the other. That last part sucks a bit right
now, because we don't have any dma_buf_attachment method which does this
syncing without recreating the mapping, but in reality this is solved by
caching mappings in the exporter (well dma-buf layer) nowadays.

True concurrent access like vk/compute expects is still a model that
dma-buf needs to support on top, but that's a special case and pretty much
needs hw that supports such concurrent access without explicit handover
and fencing.

Aside from some historical accidents and still a few warts, I do think
dma-buf does support both of these models.


We should have come up with dma-heaps earlier and make it clear that 
exporting a DMA-buf from a device gives you something device specific 
which might or might not work with others.


Apart from that I agree, DMA-buf should be capable of handling this. 
Question left is what documentation is missing to make it clear how 
things are supposed to work?


Regards,
Christian.


  Of course in the case of
gpu/drm drivers, userspace must know whats possible and act accordingly,
otherwise you just get to keep all the pieces.
-Daniel




Re: git send-email friendly smtp provider anyone?

2022-11-22 Thread Noralf Trønnes



Den 22.11.2022 16.51, skrev Konstantin Ryabitsev:
> On Mon, Nov 21, 2022 at 07:13:28PM +0100, Noralf Trønnes wrote:
>>> Otherwise, you might consider using:
>>> https://b4.docs.kernel.org/en/latest/contributor/send.html#authenticating-with-the-web-submission-endpoint
>>>
>>
>> That's an interesting option. I did briefly look at b4 a few months back
>> but it looked like it was under heavy development so I figured I'd wait
>> before trying it out. I think I'll give b4 a spin to see how it works, I
>> wonder how it handles patch changelogs.
> 
> I'd be happy to help set this up for you -- to date, this service hasn't been
> used beyond a few test posts.
> 

Ok, I'll give it a try.

I have now prepared the patchset, generated a key and can now do:
b4 send -o

The first thing that strikes me is that everyone mentioned in one of the
patches get the entire patchset, even sta...@vger.kernel.org (cc'ed in a
fixes patch). The first patch touches a core file and as a result a few
drivers, so I've cc'ed the driver maintainers in that patch, but now
they get the entire patchset where 5 of 6 patches is about a driver that
I maintain. So from their point of view, they see a patchset about a
driver they don't care about and a patch touching a core file, but from
the subject it's not apparent that it touches their driver. I'm afraid
that this might result in none of them looking at that patch. In this
particular case it's not that important, but in another case it might be.

As for the setting up the web endpoint, should I just follow the b4 docs
on that?

I use b4 version 0.10.1, is that recent enough?

Noralf.


Re: [PATCH v2 3/5] drm/i915/guc: Add GuC specific debug print wrappers

2022-11-22 Thread Michal Wajdeczko



On 18.11.2022 02:58, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> Create a set of GuC printers and start using them.
> 
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc.c| 32 --
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h| 35 +++
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  8 +--
>  .../gpu/drm/i915/gt/uc/intel_guc_capture.c| 48 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 19 +++---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c| 37 ++-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c |  7 +--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   | 55 +++-
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 62 +--
>  drivers/gpu/drm/i915/gt/uc/selftest_guc.c | 34 +-
>  .../drm/i915/gt/uc/selftest_guc_hangcheck.c   | 22 +++
>  .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   | 10 +--
>  12 files changed, 179 insertions(+), 190 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> index 52aede324788e..d9972510ee29b 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> @@ -94,8 +94,8 @@ static void gen9_enable_guc_interrupts(struct intel_guc 
> *guc)
>   assert_rpm_wakelock_held(>->i915->runtime_pm);
>  
>   spin_lock_irq(gt->irq_lock);
> - WARN_ON_ONCE(intel_uncore_read(gt->uncore, GEN8_GT_IIR(2)) &
> -  gt->pm_guc_events);
> + guc_WARN_ON_ONCE(guc, intel_uncore_read(gt->uncore, GEN8_GT_IIR(2)) &
> +  gt->pm_guc_events);
>   gen6_gt_pm_enable_irq(gt, gt->pm_guc_events);
>   spin_unlock_irq(gt->irq_lock);
>  
> @@ -339,7 +339,7 @@ static void guc_init_params(struct intel_guc *guc)
>   params[GUC_CTL_DEVID] = guc_ctl_devid(guc);
>  
>   for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
> - DRM_DEBUG_DRIVER("param[%2d] = %#x\n", i, params[i]);
> + guc_dbg(guc, "init param[%2d] = %#x\n", i, params[i]);
>  }
>  
>  /*
> @@ -451,7 +451,7 @@ int intel_guc_init(struct intel_guc *guc)
>   intel_uc_fw_fini(&guc->fw);
>  out:
>   intel_uc_fw_change_status(&guc->fw, INTEL_UC_FIRMWARE_INIT_FAIL);
> - i915_probe_error(gt->i915, "failed with %d\n", ret);
> + guc_probe_error(guc, "init failed with %d\n", ret);
>   return ret;
>  }
>  
> @@ -484,7 +484,6 @@ void intel_guc_fini(struct intel_guc *guc)
>  int intel_guc_send_mmio(struct intel_guc *guc, const u32 *request, u32 len,
>   u32 *response_buf, u32 response_buf_size)
>  {
> - struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
>   struct intel_uncore *uncore = guc_to_gt(guc)->uncore;
>   u32 header;
>   int i;
> @@ -519,8 +518,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 
> *request, u32 len,
>  10, 10, &header);
>   if (unlikely(ret)) {
>  timeout:
> - drm_err(&i915->drm, "mmio request %#x: no reply %x\n",
> - request[0], header);
> + guc_err(guc, "mmio request %#x: no reply %x\n", request[0], 
> header);
>   goto out;
>   }
>  
> @@ -541,8 +539,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 
> *request, u32 len,
>   if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) == 
> GUC_HXG_TYPE_NO_RESPONSE_RETRY) {
>   u32 reason = FIELD_GET(GUC_HXG_RETRY_MSG_0_REASON, header);
>  
> - drm_dbg(&i915->drm, "mmio request %#x: retrying, reason %u\n",
> - request[0], reason);
> + guc_dbg(guc, "mmio request %#x: retrying, reason %u\n", 
> request[0], reason);
>   goto retry;
>   }
>  
> @@ -550,16 +547,14 @@ int intel_guc_send_mmio(struct intel_guc *guc, const 
> u32 *request, u32 len,
>   u32 hint = FIELD_GET(GUC_HXG_FAILURE_MSG_0_HINT, header);
>   u32 error = FIELD_GET(GUC_HXG_FAILURE_MSG_0_ERROR, header);
>  
> - drm_err(&i915->drm, "mmio request %#x: failure %x/%u\n",
> - request[0], error, hint);
> + guc_err(guc, "mmio request %#x: failure %x/%u\n", request[0], 
> error, hint);
>   ret = -ENXIO;
>   goto out;
>   }
>  
>   if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) != 
> GUC_HXG_TYPE_RESPONSE_SUCCESS) {
>  proto:
> - drm_err(&i915->drm, "mmio request %#x: unexpected reply %#x\n",
> - request[0], header);
> + guc_err(guc, "mmio request %#x: unexpected reply %#x\n", 
> request[0], header);
>   ret = -EPROTO;
>   goto out;
>   }
> @@ -601,9 +596,9 @@ int intel_guc_to_host_process_recv_msg(struct intel_guc 
> *guc,
>   msg = payload[0] & guc->msg_enabled_mask;
>  
>   if (msg & INTEL_GUC_RECV_MSG_CRASH_DUMP_POSTED)
> - drm_err(&guc_to_gt(guc)->i915->drm, "Received early GuC crash 
> dump notifi

Re: [Intel-gfx] [MAINTAINER TOOLS] docs: updated rules for topic/core-for-CI commit management

2022-11-22 Thread Rodrigo Vivi
On Tue, Nov 22, 2022 at 03:59:10PM +0100, Daniel Vetter wrote:
> On Tue, Nov 22, 2022 at 03:17:14PM +0200, Jani Nikula wrote:
> > Introduce stricter rules for topic/core-for-CI management. Way too many
> > commits have been added over the years, with insufficient rationale
> > recorded in the commit message, and insufficient follow-up with removing
> > the commits from the topic branch.
> > 
> > New rules:
> > 
> > 1. Require maintainer ack for rebase. Have better gating on when rebases
> >happen and on which baselines.
> > 
> > 2. Require maintainer/committer ack for adding/removing commits. No
> >single individual should decide.
> > 
> > 3. Require gitlab issues for new commits added. Improve tracking for
> >removing the commits.
> > 
> > Also use the stronger "must" for commit message requiring the
> > justification for the commit being in topic/core-for-CI.
> > 
> > Cc: Joonas Lahtinen 
> > Cc: Rodrigo Vivi 
> > Cc: Tvrtko Ursulin 
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> > Cc: intel-...@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: dim-to...@lists.freedesktop.org
> > Signed-off-by: Jani Nikula 
> 
> Reviewed-by: Daniel Vetter 

Acked-by: Rodrigo Vivi 

> 
> > ---
> >  drm-tip.rst | 27 ---
> >  1 file changed, 20 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drm-tip.rst b/drm-tip.rst
> > index deae95cdd2fe..24036e2ef576 100644
> > --- a/drm-tip.rst
> > +++ b/drm-tip.rst
> > @@ -203,11 +203,13 @@ justified exception. The primary goal is to fix 
> > issues originating from Linus'
> >  tree. Issues that would need drm-next or other DRM subsystem tree as 
> > baseline
> >  should be fixed in the offending DRM subsystem tree.
> >  
> > -Only rebase the branch if you really know what you're doing. When in 
> > doubt, ask
> > -the maintainers. You'll need to be able to handle any conflicts in non-drm 
> > code
> > -while rebasing.
> > +Only rebase the branch if you really know what you're doing. You'll need 
> > to be
> > +able to handle any conflicts in non-drm code while rebasing.
> >  
> > -Simply drop fixes that are already available in the new baseline.
> > +Always ask for maintainer ack before rebasing. IRC ack is sufficient.
> > +
> > +Simply drop fixes that are already available in the new baseline. Close the
> > +associated gitlab issue when removing commits.
> >  
> >  Force pushing a rebased topic/core-for-CI requires passing the ``--force``
> >  parameter to git::
> > @@ -225,11 +227,22 @@ judgement call.
> >  Only add or remove commits if you really know what you're doing. When in 
> > doubt,
> >  ask the maintainers.
> >  
> > -Apply new commits on top with regular push. The commit message needs to 
> > explain
> > -why the patch has been applied to topic/core-for-CI. If it's a cherry-pick 
> > from
> > +Always ask for maintainer/committer ack before adding/removing commits. 
> > IRC ack
> > +is sufficient. Record the ``Acked-by:`` in commits being added.
> > +
> > +Apply new commits on top with regular push. The commit message must 
> > explain why
> > +the patch has been applied to topic/core-for-CI. If it's a cherry-pick from
> >  another subsystem, please reference the commit with ``git cherry-pick -x``
> >  option. If it's a patch from another subsystem, please reference the patch 
> > on
> >  the mailing list with ``Link:`` tag.
> >  
> > +New commits always need an associated gitlab issue for tracking purposes. 
> > The
> > +goal is to have as few commits in topic/core-for-CI as possible, and we 
> > need to
> > +be able to track the progress in making that happen. Reference the issue 
> > with
> > +``References:`` tag. Add the ``core-for-CI`` label to the issue. (Note: Do 
> > not
> > +use ``Closes:`` because the logic here is backwards; the issue is having 
> > the
> > +commit in the branch in the first place.)
> > +
> >  Instead of applying reverts, just remove the commit. This implies ``git 
> > rebase
> > --i`` on the current baseline; see directions above.
> > +-i`` on the current baseline; see directions above. Close the associated 
> > gitlab
> > +issue when removing commits.
> > -- 
> > 2.34.1
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v2 4/5] drm/i915/guc: Add GuC CT specific debug print wrappers

2022-11-22 Thread Michal Wajdeczko



On 18.11.2022 02:58, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> Re-work the existing GuC CT printers and extend as required to match
> the new wrapping scheme.
> 
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 222 +++---
>  1 file changed, 113 insertions(+), 109 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index 2b22065e87bf9..9d404fb377637 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -18,31 +18,49 @@ static inline struct intel_guc *ct_to_guc(struct 
> intel_guc_ct *ct)
>   return container_of(ct, struct intel_guc, ct);
>  }
>  
> -static inline struct intel_gt *ct_to_gt(struct intel_guc_ct *ct)
> -{
> - return guc_to_gt(ct_to_guc(ct));
> -}
> -
>  static inline struct drm_i915_private *ct_to_i915(struct intel_guc_ct *ct)
>  {
> - return ct_to_gt(ct)->i915;
> -}
> + struct intel_guc *guc = ct_to_guc(ct);
> + struct intel_gt *gt = guc_to_gt(guc);
>  
> -static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
> -{
> - return &ct_to_i915(ct)->drm;
> + return gt->i915;
>  }
>  
> -#define CT_ERROR(_ct, _fmt, ...) \
> - drm_err(ct_to_drm(_ct), "CT: " _fmt, ##__VA_ARGS__)
> +#define ct_err(_ct, _fmt, ...) \
> + guc_err(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
> +
> +#define ct_warn(_ct, _fmt, ...) \
> + guc_warn(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
> +
> +#define ct_notice(_ct, _fmt, ...) \
> + guc_notice(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
> +
> +#define ct_info(_ct, _fmt, ...) \
> + guc_info(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
> +
>  #ifdef CONFIG_DRM_I915_DEBUG_GUC
> -#define CT_DEBUG(_ct, _fmt, ...) \
> - drm_dbg(ct_to_drm(_ct), "CT: " _fmt, ##__VA_ARGS__)
> +#define ct_dbg(_ct, _fmt, ...) \
> + guc_dbg(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__)
>  #else
> -#define CT_DEBUG(...)do { } while (0)
> +#define ct_dbg(...)  do { } while (0)
>  #endif
> -#define CT_PROBE_ERROR(_ct, _fmt, ...) \
> - i915_probe_error(ct_to_i915(ct), "CT: " _fmt, ##__VA_ARGS__)
> +
> +#define ct_probe_error(_ct, _fmt, ...) \
> + do { \
> + if (i915_error_injected()) \
> + ct_dbg(_ct, _fmt, ##__VA_ARGS__); \
> + else \
> + ct_err(_ct, _fmt, ##__VA_ARGS__); \
> + } while (0)

guc_probe_error ?

> +
> +#define ct_WARN_ON(_ct, _condition) \
> + ct_WARN(_ct, _condition, "%s", "ct_WARN_ON(" __stringify(_condition) 
> ")")
> +
> +#define ct_WARN(_ct, _condition, _fmt, ...) \
> + guc_WARN(ct_to_guc(_ct), _condition, "CT " _fmt, ##__VA_ARGS__)
> +
> +#define ct_WARN_ONCE(_ct, _condition, _fmt, ...) \
> + guc_WARN_ONCE(ct_to_guc(_ct), _condition, "CT " _fmt, ##__VA_ARGS__)
>  
>  /**
>   * DOC: CTB Blob
> @@ -170,7 +188,7 @@ static int ct_control_enable(struct intel_guc_ct *ct, 
> bool enable)
>   err = guc_action_control_ctb(ct_to_guc(ct), enable ?
>GUC_CTB_CONTROL_ENABLE : 
> GUC_CTB_CONTROL_DISABLE);
>   if (unlikely(err))
> - CT_PROBE_ERROR(ct, "Failed to control/%s CTB (%pe)\n",
> + ct_probe_error(ct, "Failed to control/%s CTB (%pe)\n",
>  str_enable_disable(enable), ERR_PTR(err));

btw, shouldn't we change all messages to start with lowercase ?

was:
"CT0: Failed to control/%s CTB (%pe)"
is:
"GT0: GuC CT Failed to control/%s CTB (%pe)"

unless we keep colon (as suggested by Tvrtko) as then:

"GT0: GuC CT: Failed to control/%s CTB (%pe)"

Michal

>  
>   return err;
> @@ -201,7 +219,7 @@ static int ct_register_buffer(struct intel_guc_ct *ct, 
> bool send,
>  size);
>   if (unlikely(err))
>  failed:
> - CT_PROBE_ERROR(ct, "Failed to register %s buffer (%pe)\n",
> + ct_probe_error(ct, "Failed to register %s buffer (%pe)\n",
>  send ? "SEND" : "RECV", ERR_PTR(err));
>  
>   return err;
> @@ -235,21 +253,21 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
>   blob_size = 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + 
> CTB_G2H_BUFFER_SIZE;
>   err = intel_guc_allocate_and_map_vma(guc, blob_size, &ct->vma, &blob);
>   if (unlikely(err)) {
> - CT_PROBE_ERROR(ct, "Failed to allocate %u for CTB data (%pe)\n",
> + ct_probe_error(ct, "Failed to allocate %u for CTB data (%pe)\n",
>  blob_size, ERR_PTR(err));
>   return err;
>   }
>  
> - CT_DEBUG(ct, "base=%#x size=%u\n", intel_guc_ggtt_offset(guc, ct->vma), 
> blob_size);
> + ct_dbg(ct, "base=%#x size=%u\n", intel_guc_ggtt_offset(guc, ct->vma), 
> blob_size);
>  
>   /* store pointers to desc and cmds for send ctb */
>   desc = blob;
>   cmds = blob + 2 * CTB_DESC_SIZE;
>   cmds_size = CTB_H

Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

2022-11-22 Thread Jason Gunthorpe
On Tue, Nov 22, 2022 at 06:08:00PM +0100, Daniel Vetter wrote:
> tldr; DMA buffers aren't normal memory, expecting that you can use
> them like that (like calling get_user_pages works, or that they're
> accounting like any other normal memory) cannot be guaranteed.
> 
> Since some userspace only runs on integrated devices, where all
> buffers are actually all resident system memory, there's a huge
> temptation to assume that a struct page is always present and useable
> like for any more pagecache backed mmap. This has the potential to
> result in a uapi nightmare.
> 
> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> blocks get_user_pages and all the other struct page based
> infrastructure for everyone. In spirit this is the uapi counterpart to
> the kernel-internal CONFIG_DMABUF_DEBUG.
> 
> Motivated by a recent patch which wanted to swich the system dma-buf
> heap to vm_insert_page instead of vm_insert_pfn.
> 
> v2:
> 
> Jason brought up that we also want to guarantee that all ptes have the
> pte_special flag set, to catch fast get_user_pages (on architectures
> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> 
> From auditing the various functions to insert pfn pte entires
> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> this should be the correct flag to check for.

I didn't look at how this actually gets used, but it is a bit of a
pain to insert a lifetime controlled object like a struct page as a
special PTE/VM_PFNMAP

How is the lifetime model implemented here? How do you know when
userspace has finally unmapped the page?

Jason


Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

2022-11-22 Thread Daniel Vetter
On Tue, 22 Nov 2022 at 19:04, Jason Gunthorpe  wrote:
>
> On Tue, Nov 22, 2022 at 06:08:00PM +0100, Daniel Vetter wrote:
> > tldr; DMA buffers aren't normal memory, expecting that you can use
> > them like that (like calling get_user_pages works, or that they're
> > accounting like any other normal memory) cannot be guaranteed.
> >
> > Since some userspace only runs on integrated devices, where all
> > buffers are actually all resident system memory, there's a huge
> > temptation to assume that a struct page is always present and useable
> > like for any more pagecache backed mmap. This has the potential to
> > result in a uapi nightmare.
> >
> > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> > blocks get_user_pages and all the other struct page based
> > infrastructure for everyone. In spirit this is the uapi counterpart to
> > the kernel-internal CONFIG_DMABUF_DEBUG.
> >
> > Motivated by a recent patch which wanted to swich the system dma-buf
> > heap to vm_insert_page instead of vm_insert_pfn.
> >
> > v2:
> >
> > Jason brought up that we also want to guarantee that all ptes have the
> > pte_special flag set, to catch fast get_user_pages (on architectures
> > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> >
> > From auditing the various functions to insert pfn pte entires
> > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> > this should be the correct flag to check for.
>
> I didn't look at how this actually gets used, but it is a bit of a
> pain to insert a lifetime controlled object like a struct page as a
> special PTE/VM_PFNMAP
>
> How is the lifetime model implemented here? How do you know when
> userspace has finally unmapped the page?

The vma has a filp which is the refcounted dma_buf. With dma_buf you
never get an individual page it's always the entire object. And it's
up to the allocator how exactly it wants to use or not use the page's
refcount. So if gup goes in and elevates the refcount, you can break
stuff, which is why I'm doing this.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 01/10] drm: bridge: cadence: convert mailbox functions to macro functions

2022-11-22 Thread Fabio Estevam
Hi Sandor,

On Mon, Nov 21, 2022 at 4:27 AM Sandor Yu  wrote:
>
> Mailbox access functions could be share to other mhdp driver and
> HDP-TX HDMI/DP PHY drivers, move those functions to head file
> include/drm/bridge/cdns-mhdp-mailbox.h and convert them to
> macro functions.

What is the reason for converting the functions to macro?


Re: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use

2022-11-22 Thread kernel test robot
Hi Andy,

I love your patch! Yet something to improve:

[auto build test ERROR on usb/usb-testing]
[also build test ERROR on usb/usb-next usb/usb-linus drm-intel/for-linux-next 
drm-intel/for-linux-next-fixes linus/master v6.1-rc6 next-20221122]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Andy-Shevchenko/i915-Move-list_count-to-list-h-for-broader-use/20221122-233657
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git 
usb-testing
patch link:
https://lore.kernel.org/r/20221122153516.52577-1-andriy.shevchenko%40linux.intel.com
patch subject: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use
config: um-i386_defconfig
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# 
https://github.com/intel-lab-lkp/linux/commit/3d217ef769536f160f5c20224aeb0f9070bdfdcf
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Andy-Shevchenko/i915-Move-list_count-to-list-h-for-broader-use/20221122-233657
git checkout 3d217ef769536f160f5c20224aeb0f9070bdfdcf
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=um SUBARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   ld: fs/isofs/inode.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/isofs/namei.o:include/linux/list.h:663: first defined here
   ld: fs/isofs/dir.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/isofs/namei.o:include/linux/list.h:663: first defined here
   ld: fs/isofs/util.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/isofs/namei.o:include/linux/list.h:663: first defined here
   ld: fs/isofs/rock.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/isofs/namei.o:include/linux/list.h:663: first defined here
   ld: fs/isofs/export.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/isofs/namei.o:include/linux/list.h:663: first defined here
   ld: fs/isofs/joliet.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/isofs/namei.o:include/linux/list.h:663: first defined here
--
   ld: fs/autofs/inode.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/autofs/init.o:include/linux/list.h:663: first defined here
   ld: fs/autofs/root.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/autofs/init.o:include/linux/list.h:663: first defined here
   ld: fs/autofs/symlink.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/autofs/init.o:include/linux/list.h:663: first defined here
   ld: fs/autofs/waitq.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/autofs/init.o:include/linux/list.h:663: first defined here
   ld: fs/autofs/expire.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/autofs/init.o:include/linux/list.h:663: first defined here
   ld: fs/autofs/dev-ioctl.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> fs/autofs/init.o:include/linux/list.h:663: first defined here
--
   ld: block/bfq-wf2q.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> block/bfq-iosched.o:include/linux/list.h:663: first defined here
   ld: block/bfq-cgroup.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> block/bfq-iosched.o:include/linux/list.h:663: first defined here


vim +663 include/linux/list.h

   557  
   558  /**
   559   * list_next_entry - get the next element in list
   560   * @pos:the type * to cursor
   561   * @member: the name of the list_head within the struct.
   562   */
   563  #define list_next_entry(pos, member) \
   564  list_entry((pos)->member.next, typeof(*(pos)), member)
   565  
   566  /**
   567   * list_next_entry_circular - get the next element in list
   568   * @pos

Re: [Intel-gfx] [PATCH v2 0/5] Add module oriented dmesg output

2022-11-22 Thread Jani Nikula
On Tue, 22 Nov 2022, Michal Wajdeczko  wrote:
> On 18.11.2022 11:52, Jani Nikula wrote:
>> On Thu, 17 Nov 2022, john.c.harri...@intel.com wrote:
>>> From: John Harrison 
>>>
>>> When trying to analyse bug reports from CI, customers, etc. it can be
>>> difficult to work out exactly what is happening on which GT in a
>>> multi-GT system. So add GT oriented debug/error message wrappers. If
>>> used instead of the drm_ equivalents, you get the same output but with
>>> a GT# prefix on it.
>>>
>>> It was also requested to extend this further to submodules in order to
>>> factor out the repeated structure accessing constructs and common
>>> string prefixes. So, add versions for GuC, HuC and GuC CTB as well.
>>>
>>> This patch set updates all the gt/uc files to use the new helpers as a
>>> first step. The intention would be to convert all output messages that
>>> have access to a GT structure.
>>>
>>> v2: Go back to using lower case names, add more wrapper sets (combined
>>> review feedback). Also, wrap up probe injection and WARN entries.
>>>
>>> Signed-off-by: John Harrison 
>> 
>> For adding the wrappers in general, I'm going to disagree and
>> commit. I'll leave it up to Tvrtko and Joonas.
>> 
>> Regarding the placement of the macros, I insist you add individual
>> header files for the wrappers and include them only where needed.
>
> do you mean:
>
>   intel_gt_print.h
>   intel_guc_print.h
>   intel_huc_print.h
>
> with just macros or also with all functions that work with drm_printer?

At least for the macros being added now. If adding others does not
require you to pull in a bunch of additional header dependencies, you
can add more. And that can be separate patches.

>
>> 
>> We have a fairly serious problem with everything including everything in
>> i915 that I've been slowly trying to tackle. Touch one thing, rebuild
>> everything. About a third of our headers cause the rebuild of the entire
>> driver when modified. We need to reduce the surface of things that cause
>> rebuilds.
>> 
>> For example, intel_gt.h is included by 97 files, intel_guc.h by 332
>> files, and intel_huc.h by 329 files (counting recursively).
>> 
>> There's absolutely no reason any of the display code, for example, needs
>> to have these logging macros in their build. Long term, the headers
>> should be reorganized to reduce the interdependencies, and this is what
>> I've been doing in i915_drv.h and display/ in general. But the least we
>> can do is not make the problem worse.
>
> to solve this we should really consider splitting out GuC and HuC
> definitions to dedicated _types.h files and only include them in
> i915_drv.h (and print macros are orthogonal for this problem)

It's an orthogonal problem, but IMO with the current headers there's no
reason to make the problem worse by adding somewhat independent new
stuff to the headers.

---

I've looked at untangling this a bunch of times, but I've always felt
that it's really not my area of expertise, and it would inevitably
conflict with someone else's work in progress and someone else's idea of
how the headers should be refactored.

There are chains like this:

i915_drv.h:47:#include "gt/intel_gt_types.h"
gt/intel_gt_types.h:19:#include "uc/intel_uc.h"
gt/uc/intel_uc.h:9:#include "intel_guc.h"
gt/uc/intel_guc.h:15:#include "intel_guc_fwif.h"
gt/uc/intel_guc_fwif.h:14:#include "abi/guc_actions_abi.h"
gt/uc/intel_guc_fwif.h:15:#include "abi/guc_actions_slpc_abi.h"
gt/uc/intel_guc_fwif.h:16:#include "abi/guc_errors_abi.h"
gt/uc/intel_guc_fwif.h:17:#include "abi/guc_communication_mmio_abi.h"
gt/uc/intel_guc_fwif.h:18:#include "abi/guc_communication_ctb_abi.h"
gt/uc/intel_guc_fwif.h:19:#include "abi/guc_klvs_abi.h"
gt/uc/intel_guc_fwif.h:20:#include "abi/guc_messages_abi.h"

They need to be broken up at some point. There are a bunch of headers
where only minimal amount of info is actually needed in other headers,
and the rest is used in a limited number of .c files only.

It's a lot of tedious work to refactor and nobody's going to notice the
impact directly, they'll just be less grumpy about the build being slow
and the organization of the headers being messy. And if they don't build
the driver a lot (like me) or don't refactor the driver a lot (like me)
maybe they'll never notice.


BR,
Jani.


>
> Michal
>
>> 
>> BR,
>> Jani.
>> 
>>>
>>>
>>> John Harrison (5):
>>>   drm/i915/gt: Start adding module oriented dmesg output
>>>   drm/i915/huc: Add HuC specific debug print wrappers
>>>   drm/i915/guc: Add GuC specific debug print wrappers
>>>   drm/i915/guc: Add GuC CT specific debug print wrappers
>>>   drm/i915/uc: Update the gt/uc code to use gt_err and friends
>>>
>>>  drivers/gpu/drm/i915/gt/intel_gt.c|  96 
>>>  drivers/gpu/drm/i915/gt/intel_gt.h|  35 +++
>>>  drivers/gpu/drm/i915/gt/uc/intel_guc.c|  32 +--
>>>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  35 +++
>>>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|   8 +-
>>>  

Re: Try to address the DMA-buf coherency problem

2022-11-22 Thread Daniel Vetter
On Tue, 22 Nov 2022 at 18:34, Christian König  wrote:
> Am 22.11.22 um 15:36 schrieb Daniel Vetter:
> > On Fri, Nov 18, 2022 at 11:32:19AM -0800, Rob Clark wrote:
> >> On Thu, Nov 17, 2022 at 7:38 AM Nicolas Dufresne  
> >> wrote:
> >>> Le jeudi 17 novembre 2022 à 13:10 +0100, Christian König a écrit :
> >> DMA-Buf let's the exporter setup the DMA addresses the importer uses to
> >> be able to directly decided where a certain operation should go. E.g. 
> >> we
> >> have cases where for example a P2P write doesn't even go to memory, but
> >> rather a doorbell BAR to trigger another operation. Throwing in CPU
> >> round trips for explicit ownership transfer completely breaks that
> >> concept.
> > It sounds like we should have a dma_dev_is_coherent_with_dev() which
> > accepts two (or an array?) of devices and tells the caller whether the
> > devices need explicit ownership transfer.
>  No, exactly that's the concept I'm pushing back on very hard here.
> 
>  In other words explicit ownership transfer is not something we would
>  want as requirement in the framework, cause otherwise we break tons of
>  use cases which require concurrent access to the underlying buffer.
> >>> I'm not pushing for this solution, but really felt the need to correct 
> >>> you here.
> >>> I have quite some experience with ownership transfer mechanism, as this 
> >>> is how
> >>> GStreamer framework works since 2000. Concurrent access is a really 
> >>> common use
> >>> cases and it is quite well defined in that context. The bracketing system 
> >>> (in
> >>> this case called map() unmap(), with flag stating the usage intention 
> >>> like reads
> >>> and write) is combined the the refcount. The basic rules are simple:
> >> This is all CPU oriented, I think Christian is talking about the case
> >> where ownership transfer happens without CPU involvement, such as via
> >> GPU waiting on a fence
> > Yeah for pure device2device handover the rule pretty much has to be that
> > any coherency management that needs to be done must be done from the
> > device side (flushing device side caches and stuff like that) only. But
> > under the assumption that _all_ cpu side management has been done already
> > before the first device access started.
> >
> > And then the map/unmap respectively begin/end_cpu_access can be used what
> > it was meant for with cpu side invalidation/flushing and stuff like that,
> > while having pretty clear handover/ownership rules and hopefully not doing
> > no unecessary flushes. And all that while allowing device acces to be
> > pipelined. Worst case the exporter has to insert some pipelined cache
> > flushes as a dma_fence pipelined work of its own between the device access
> > when moving from one device to the other. That last part sucks a bit right
> > now, because we don't have any dma_buf_attachment method which does this
> > syncing without recreating the mapping, but in reality this is solved by
> > caching mappings in the exporter (well dma-buf layer) nowadays.
> >
> > True concurrent access like vk/compute expects is still a model that
> > dma-buf needs to support on top, but that's a special case and pretty much
> > needs hw that supports such concurrent access without explicit handover
> > and fencing.
> >
> > Aside from some historical accidents and still a few warts, I do think
> > dma-buf does support both of these models.
>
> We should have come up with dma-heaps earlier and make it clear that
> exporting a DMA-buf from a device gives you something device specific
> which might or might not work with others.

Yeah, but engineering practicalities were pretty clear that no one
would rewrite the entire Xorg stack and all the drivers just to make
that happen for prime.

> Apart from that I agree, DMA-buf should be capable of handling this.
> Question left is what documentation is missing to make it clear how
> things are supposed to work?

Given the historical baggage of existing use-case, I think the only
way out is that we look at concrete examples from real world users
that break, and figure out how to fix them. Without breaking any of
the existing mess.

One idea might be that we have a per-platform
dma_buf_legacy_coherency_mode(), which tells you what mode (cpu cache
snooping or uncached memory) you need to use to make sure that all
devices agree. On x86 the rule might be that it's cpu cache snooping
by default, but if you have an integrated gpu then everyone needs to
use uncached. That sucks, but at least we could keep the existing mess
going and clean it up. Everyone else would be uncached, except maybe
arm64 servers with pcie connectors. Essentially least common
denominator to make this work. Note that uncached actually means
device access doesn't snoop, the cpu side you can handle with either
uc/wc mappings or explicit flushing.

Then once we have that we could implement the coherency negotiation
protocol on top as an explicit opt-in, 

Running into problems with UDMABUF

2022-11-22 Thread Paul Cercueil

Hi Gerd,

I'm running into a problem trying to use UDMABUF and I'm not sure if 
it's a bug in udmabuf, or I'm just using it wrong.


I create a UDMABUF in userspace, enable CPU access to it, copy some 
data, disable CPU access, then register it with the IIO subsystem using 
an ioctl I'm working on (not yet upstream).


In udmabuf.c, the "get_sg_table" function is first called by the 
"begin_cpu_udmabuf", and then called a second time by "map_udmabuf" 
when the IIO subsystem creates a dma_buf_attachment then maps it.


Now the behaviour I'm seeing, is that the first call to 
"dma_map_sgtable" in get_sg_table sets sg->nents == 1 which is equal to 
sg_nents(sg->sgl) and as such is the expected value, but the second 
call to "dma_map_sgtable" will set sg->nents == 0 and I don't 
understand why.


I wonder if you are seeing this behaviour as well?

Cheers,
-Paul




Re: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use

2022-11-22 Thread kernel test robot
Hi Andy,

I love your patch! Yet something to improve:

[auto build test ERROR on usb/usb-testing]
[also build test ERROR on usb/usb-next usb/usb-linus drm-intel/for-linux-next 
drm-intel/for-linux-next-fixes linus/master v6.1-rc6 next-20221122]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Andy-Shevchenko/i915-Move-list_count-to-list-h-for-broader-use/20221122-233657
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git 
usb-testing
patch link:
https://lore.kernel.org/r/20221122153516.52577-1-andriy.shevchenko%40linux.intel.com
patch subject: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use
config: s390-defconfig
compiler: s390-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/intel-lab-lkp/linux/commit/3d217ef769536f160f5c20224aeb0f9070bdfdcf
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Andy-Shevchenko/i915-Move-list_count-to-list-h-for-broader-use/20221122-233657
git checkout 3d217ef769536f160f5c20224aeb0f9070bdfdcf
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 
O=build_dir ARCH=s390 prepare

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr]
   scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
   scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to 
generate conflict counterexamples
   In file included from include/linux/preempt.h:11,
from include/linux/percpu.h:6,
from include/linux/context_tracking_state.h:5,
from include/linux/hardirq.h:5,
from include/linux/kvm_host.h:7,
from arch/s390/kernel/asm-offsets.c:11:
   include/linux/list.h:662:8: warning: no previous prototype for 'list_count' 
[-Wmissing-prototypes]
 662 | size_t list_count(struct list_head *head)
 |^~
   In file included from include/linux/preempt.h:11,
from arch/s390/include/asm/timex.h:13,
from arch/s390/kernel/vdso64/getcpu.c:6:
   include/linux/list.h:662:8: warning: no previous prototype for 'list_count' 
[-Wmissing-prototypes]
 662 | size_t list_count(struct list_head *head)
 |^~
   In file included from include/linux/rculist.h:10,
from include/linux/pid.h:5,
from include/linux/sched.h:14,
from arch/s390/include/asm/syscall.h:13,
from arch/s390/include/asm/vdso/gettimeofday.h:9,
from include/vdso/datapage.h:137,
from 
arch/s390/kernel/vdso64/../../../../lib/vdso/gettimeofday.c:5,
from arch/s390/kernel/vdso64/vdso64_generic.c:2:
   include/linux/list.h:662:8: warning: no previous prototype for 'list_count' 
[-Wmissing-prototypes]
 662 | size_t list_count(struct list_head *head)
 |^~
   s390-linux-ld: arch/s390/kernel/vdso64/getcpu.o: in function `list_count':
>> include/linux/list.h:663: multiple definition of `list_count'; 
>> arch/s390/kernel/vdso64/vdso64_generic.o:include/linux/list.h:663: first 
>> defined here
   make[2]: *** [arch/s390/kernel/vdso64/Makefile:50: 
arch/s390/kernel/vdso64/vdso64.so.dbg] Error 1
   make[2]: Target 'include/generated/vdso64-offsets.h' not remade because of 
errors.
   make[1]: *** [arch/s390/Makefile:159: vdso_prepare] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:231: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +663 include/linux/list.h

   557  
   558  /**
   559   * list_next_entry - get the next element in list
   560   * @pos:the type * to cursor
   561   * @member: the name of the list_head within the struct.
   562   */
   563  #define list_next_entry(pos, member) \
   564  list_entry((pos)->member.next, typeof(*(pos)), member)
   565  
   566  /**
   567   * list_next_entry_circular - get the next element in list
   568   * @pos:the type * to cursor.
   569   * @head:   the list head to take the element from.
   570   * @member: the name of the list_head wit

Re: git send-email friendly smtp provider anyone?

2022-11-22 Thread Konstantin Ryabitsev
On Tue, Nov 22, 2022 at 06:42:19PM +0100, Noralf Trønnes wrote:
> The first thing that strikes me is that everyone mentioned in one of the
> patches get the entire patchset, even sta...@vger.kernel.org (cc'ed in a
> fixes patch). The first patch touches a core file and as a result a few
> drivers, so I've cc'ed the driver maintainers in that patch, but now
> they get the entire patchset where 5 of 6 patches is about a driver that
> I maintain. So from their point of view, they see a patchset about a
> driver they don't care about and a patch touching a core file, but from
> the subject it's not apparent that it touches their driver. I'm afraid
> that this might result in none of them looking at that patch. In this
> particular case it's not that important, but in another case it might be.

I did some (unscientific) polling among kernel maintainers and, by a vast
margin, they always prefer to receive the entire series instead of
cherry-picked patches -- having the entire series helps provide important
context for the change they are looking at.

So, this is deliberate and, for now at least, not configurable. Unless you're
sending 100+ patch series, I doubt anyone will have any problem with receiving
the whole series instead of individual patches.

> As for the setting up the web endpoint, should I just follow the b4 docs
> on that?
> 
> I use b4 version 0.10.1, is that recent enough?

Yes. There will be a 0.10.2 in the near future, but the incoming fixes
shouldn't make much difference for the b4 send code.

-K


Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

2022-11-22 Thread Jason Gunthorpe
On Tue, Nov 22, 2022 at 07:08:25PM +0100, Daniel Vetter wrote:
> On Tue, 22 Nov 2022 at 19:04, Jason Gunthorpe  wrote:
> >
> > On Tue, Nov 22, 2022 at 06:08:00PM +0100, Daniel Vetter wrote:
> > > tldr; DMA buffers aren't normal memory, expecting that you can use
> > > them like that (like calling get_user_pages works, or that they're
> > > accounting like any other normal memory) cannot be guaranteed.
> > >
> > > Since some userspace only runs on integrated devices, where all
> > > buffers are actually all resident system memory, there's a huge
> > > temptation to assume that a struct page is always present and useable
> > > like for any more pagecache backed mmap. This has the potential to
> > > result in a uapi nightmare.
> > >
> > > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> > > blocks get_user_pages and all the other struct page based
> > > infrastructure for everyone. In spirit this is the uapi counterpart to
> > > the kernel-internal CONFIG_DMABUF_DEBUG.
> > >
> > > Motivated by a recent patch which wanted to swich the system dma-buf
> > > heap to vm_insert_page instead of vm_insert_pfn.
> > >
> > > v2:
> > >
> > > Jason brought up that we also want to guarantee that all ptes have the
> > > pte_special flag set, to catch fast get_user_pages (on architectures
> > > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> > > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> > >
> > > From auditing the various functions to insert pfn pte entires
> > > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> > > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> > > this should be the correct flag to check for.
> >
> > I didn't look at how this actually gets used, but it is a bit of a
> > pain to insert a lifetime controlled object like a struct page as a
> > special PTE/VM_PFNMAP
> >
> > How is the lifetime model implemented here? How do you know when
> > userspace has finally unmapped the page?
> 
> The vma has a filp which is the refcounted dma_buf. With dma_buf you
> never get an individual page it's always the entire object. And it's
> up to the allocator how exactly it wants to use or not use the page's
> refcount. So if gup goes in and elevates the refcount, you can break
> stuff, which is why I'm doing this.

But how does move work?

Jason


Re: [Intel-gfx] [PATCH 2/6] drm/i915/gsc: Skip the version check when fetching the GSC FW

2022-11-22 Thread Rodrigo Vivi
On Mon, Nov 21, 2022 at 03:16:13PM -0800, Daniele Ceraolo Spurio wrote:
> The current exectation from the FW side is that the driver will query
> the GSC FW version after the FW is loaded, similarly to what the mei
> driver does on DG2. However, we're discussing with the FW team if there
> is a way to extract the version from the bin file before loading, so we
> can keep the code the same as for older FWs.
> 
> Since the GSC FW version is not currently required for functionality and
> is only needed for debug purposes, we can skip the FW version for now at
> fetch time and add it later on when we've agreed on the approach.
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Alan Previn 
> Cc: John Harrison 

Reviewed-by: Rodrigo Vivi 

> ---
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 29 +++-
>  1 file changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> index 5b2e4503aee7..3df52fd59edc 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> @@ -564,6 +564,26 @@ static int check_ccs_header(struct intel_gt *gt,
>   return 0;
>  }
>  
> +static int check_fw_header(struct intel_gt *gt,
> +const struct firmware *fw,
> +struct intel_uc_fw *uc_fw)
> +{
> + int err = 0;
> +
> + /* GSC FW version is queried after the FW is loaded */
> + if (uc_fw->type == INTEL_UC_FW_TYPE_GSC)
> + return 0;
> +
> + if (uc_fw->loaded_via_gsc)
> + err = check_gsc_manifest(fw, uc_fw);
> + else
> + err = check_ccs_header(gt, fw, uc_fw);
> + if (err)
> + return err;
> +
> + return 0;
> +}
> +
>  /**
>   * intel_uc_fw_fetch - fetch uC firmware
>   * @uc_fw: uC firmware
> @@ -633,14 +653,11 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
>   if (err)
>   goto fail;
>  
> - if (uc_fw->loaded_via_gsc)
> - err = check_gsc_manifest(fw, uc_fw);
> - else
> - err = check_ccs_header(gt, fw, uc_fw);
> + err = check_fw_header(gt, fw, uc_fw);
>   if (err)
>   goto fail;
>  
> - if (uc_fw->file_wanted.major_ver) {
> + if (uc_fw->file_wanted.major_ver && uc_fw->file_selected.major_ver) {
>   /* Check the file's major version was as it claimed */
>   if (uc_fw->file_selected.major_ver != 
> uc_fw->file_wanted.major_ver) {
>   drm_notice(&i915->drm, "%s firmware %s: unexpected 
> version: %u.%u != %u.%u\n",
> @@ -657,7 +674,7 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
>   }
>   }
>  
> - if (old_ver) {
> + if (old_ver && uc_fw->file_selected.major_ver) {
>   /* Preserve the version that was really wanted */
>   memcpy(&uc_fw->file_wanted, &file_ideal, 
> sizeof(uc_fw->file_wanted));
>  
> -- 
> 2.37.3
> 


[PATCH v2 0/4] Add guard padding around i915_vma

2022-11-22 Thread Andi Shyti
Hi,

This series adds guards around vma's but setting a pages at the
beginning and at the end that work as padding.

The first user of the vma guard are scanout objects which don't
need anymore to add scratch to all the unused ggtt's and speeding
up up considerably the boot and resume by several hundreds of
milliseconds up to over a full second in slower machines.

Because of this we don't need anymore 2ef6efa79fec ("drm/i915:
Improve on suspend / resume time with VT-d enabled") which gets
reverted.

Changelog
=
v1 -> v2:
 - Revert 2ef6efa79fec ("drm/i915: Improve on suspend / resume
   time with VT-d enabled") 

Andi

Andi Shyti (1):
  Revert "drm/i915: Improve on suspend / resume time with VT-d enabled"

Chris Wilson (3):
  drm/i915: Wrap all access to i915_vma.node.start|size
  drm/i915: Introduce guard pages to i915_vma
  drm/i915: Refine VT-d scanout workaround

 drivers/gpu/drm/i915/display/intel_fbdev.c|   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c|  13 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  33 +++---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_tiling.c|   4 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |   2 +-
 .../i915/gem/selftests/i915_gem_client_blt.c  |  23 ++--
 .../drm/i915/gem/selftests/i915_gem_context.c |  15 ++-
 .../drm/i915/gem/selftests/i915_gem_mman.c|   2 +-
 .../drm/i915/gem/selftests/igt_gem_utils.c|   7 +-
 drivers/gpu/drm/i915/gt/gen7_renderclear.c|   2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 108 --
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c  |   3 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  24 
 drivers/gpu/drm/i915/gt/intel_renderstate.c   |   2 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |   2 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |   8 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  18 +--
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  15 +--
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  16 +--
 .../drm/i915/gt/selftest_ring_submission.c|   2 +-
 drivers/gpu/drm/i915/gt/selftest_rps.c|  12 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c|   8 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c|   4 +-
 drivers/gpu/drm/i915/i915_debugfs.c   |   2 +-
 drivers/gpu/drm/i915/i915_driver.c|  16 ---
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   3 +-
 drivers/gpu/drm/i915/i915_perf.c  |   2 +-
 drivers/gpu/drm/i915/i915_vma.c   |  59 +++---
 drivers/gpu/drm/i915/i915_vma.h   |  52 -
 drivers/gpu/drm/i915/i915_vma_resource.c  |   4 +-
 drivers/gpu/drm/i915/i915_vma_resource.h  |  17 ++-
 drivers/gpu/drm/i915/i915_vma_types.h |   3 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |  20 ++--
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |   8 +-
 36 files changed, 259 insertions(+), 256 deletions(-)

-- 
2.38.1



[PATCH v2 1/4] drm/i915: Wrap all access to i915_vma.node.start|size

2022-11-22 Thread Andi Shyti
From: Chris Wilson 

We already wrap i915_vma.node.start for use with the GGTT, as there we
can perform additional sanity checks that the node belongs to the GGTT
and fits within the 32b registers. In the next couple of patches, we
will introduce guard pages around the objects _inside_ the drm_mm_node
allocation. That is we will offset the vma->pages so that the first page
is at drm_mm_node.start + vma->guard (not 0 as is currently the case).
All users must then not use i915_vma.node.start directly, but compute
the guard offset, thus all users are converted to use a
i915_vma_offset() wrapper.

The notable exceptions are the selftests that are testing exact
behaviour of i915_vma_pin/i915_vma_insert.

Signed-off-by: Chris Wilson 
Signed-off-by: Tejas Upadhyay 
Co-developed-by: Thomas Hellström 
Signed-off-by: Thomas Hellström 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/display/intel_fbdev.c|  2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 33 ++--
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_tiling.c|  4 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../i915/gem/selftests/i915_gem_client_blt.c  | 23 +
 .../drm/i915/gem/selftests/i915_gem_context.c | 15 +++---
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 .../drm/i915/gem/selftests/igt_gem_utils.c|  7 +--
 drivers/gpu/drm/i915/gt/gen7_renderclear.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c  |  3 +-
 drivers/gpu/drm/i915/gt/intel_renderstate.c   |  2 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |  8 +--
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 18 +++
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  | 15 +++---
 drivers/gpu/drm/i915/gt/selftest_lrc.c| 16 +++---
 .../drm/i915/gt/selftest_ring_submission.c|  2 +-
 drivers/gpu/drm/i915/gt/selftest_rps.c| 12 ++---
 .../gpu/drm/i915/gt/selftest_workarounds.c|  8 +--
 drivers/gpu/drm/i915/i915_cmd_parser.c|  4 +-
 drivers/gpu/drm/i915/i915_debugfs.c   |  2 +-
 drivers/gpu/drm/i915/i915_perf.c  |  2 +-
 drivers/gpu/drm/i915/i915_vma.c   | 25 -
 drivers/gpu/drm/i915/i915_vma.h   | 51 +--
 drivers/gpu/drm/i915/i915_vma_resource.h  | 10 ++--
 drivers/gpu/drm/i915/selftests/i915_request.c | 20 
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  8 +--
 29 files changed, 180 insertions(+), 122 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c 
b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 5575d7abdc092..03ed4607a46d2 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -286,7 +286,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 
/* Our framebuffer is the entirety of fbdev's system memory */
info->fix.smem_start =
-   (unsigned long)(ggtt->gmadr.start + vma->node.start);
+   (unsigned long)(ggtt->gmadr.start + 
i915_ggtt_offset(vma));
info->fix.smem_len = vma->size;
}
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 29e9e8d5b6fec..86956b902c978 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -379,22 +379,25 @@ eb_vma_misplaced(const struct drm_i915_gem_exec_object2 
*entry,
 const struct i915_vma *vma,
 unsigned int flags)
 {
-   if (vma->node.size < entry->pad_to_size)
+   const u64 start = i915_vma_offset(vma);
+   const u64 size = i915_vma_size(vma);
+
+   if (size < entry->pad_to_size)
return true;
 
-   if (entry->alignment && !IS_ALIGNED(vma->node.start, entry->alignment))
+   if (entry->alignment && !IS_ALIGNED(start, entry->alignment))
return true;
 
if (flags & EXEC_OBJECT_PINNED &&
-   vma->node.start != entry->offset)
+   start != entry->offset)
return true;
 
if (flags & __EXEC_OBJECT_NEEDS_BIAS &&
-   vma->node.start < BATCH_OFFSET_BIAS)
+   start < BATCH_OFFSET_BIAS)
return true;
 
if (!(flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
-   (vma->node.start + vma->node.size + 4095) >> 32)
+   (start + size + 4095) >> 32)
return true;
 
if (flags & __EXEC_OBJECT_NEEDS_MAP &&
@@ -440,7 +443,7 @@ eb_pin_vma(struct i915_execbuffer *eb,
int err;
 
if (vma->node.size)
-   pin_flags = vma->node.start;
+   pin_flags =  __i915_vma_offset(vma);
else
pin_flags = entry->offset & PIN_OFFSET_MASK;
 
@@ -663,8 +666,8 @@ static int eb_reserve_vma(str

  1   2   3   >