Re: [PATCH v1] drm/ttm: Clean up page shift operation
Am 22.11.22 um 08:51 schrieb Somalapuram Amaranath: Remove page shift operations as ttm_resource moved from num_pages to size_t size in bytes. v1 -> v2: fix missing page shift to fpfn and lpfn Signed-off-by: Somalapuram Amaranath --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 +--- drivers/gpu/drm/ttm/ttm_range_manager.c| 13 ++--- 2 files changed, 7 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 974e85d8b6cc..19ad365dc159 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -541,12 +541,10 @@ int amdgpu_bo_create(struct amdgpu_device *adev, if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) { /* GWS and OA don't need any alignment. */ page_align = bp->byte_align; - size <<= PAGE_SHIFT; - } else if (bp->domain & AMDGPU_GEM_DOMAIN_GDS) { /* Both size and alignment must be a multiple of 4. */ page_align = ALIGN(bp->byte_align, 4); - size = ALIGN(size, 4) << PAGE_SHIFT; + size = ALIGN(size, 4); The amdgpu changes should probably be a separate patch. } else { /* Memory should be aligned at least to a page size. */ page_align = ALIGN(bp->byte_align, PAGE_SIZE) >> PAGE_SHIFT; diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c index 0a8bc0b7f380..6ac38092dd2a 100644 --- a/drivers/gpu/drm/ttm/ttm_range_manager.c +++ b/drivers/gpu/drm/ttm/ttm_range_manager.c @@ -83,9 +83,10 @@ static int ttm_range_man_alloc(struct ttm_resource_manager *man, spin_lock(&rman->lock); ret = drm_mm_insert_node_in_range(mm, &node->mm_nodes[0], - PFN_UP(node->base.size), + node->base.size, bo->page_alignment, 0, - place->fpfn, lpfn, mode); + place->fpfn << PAGE_SHIFT, + lpfn << PAGE_SHIFT, mode); spin_unlock(&rman->lock); if (unlikely(ret)) { @@ -119,11 +120,10 @@ static bool ttm_range_man_intersects(struct ttm_resource_manager *man, size_t size) { struct drm_mm_node *node = &to_ttm_range_mgr_node(res)->mm_nodes[0]; - u32 num_pages = PFN_UP(size); /* Don't evict BOs outside of the requested placement range */ - if (place->fpfn >= (node->start + num_pages) || - (place->lpfn && place->lpfn <= node->start)) + if ((place->fpfn << PAGE_SHIFT) >= (node->start + size) || + (place->lpfn && (place->lpfn << PAGE_SHIFT) <= node->start)) return false; return true; @@ -135,10 +135,9 @@ static bool ttm_range_man_compatible(struct ttm_resource_manager *man, size_t size) { struct drm_mm_node *node = &to_ttm_range_mgr_node(res)->mm_nodes[0]; - u32 num_pages = PFN_UP(size); if (node->start < place->fpfn || - (place->lpfn && (node->start + num_pages) > place->lpfn)) + (place->lpfn && (node->start + size) > place->lpfn << PAGE_SHIFT)) return false; This looks good but can't be complete. We have a couple of place where node->start and node->size is used outside of TTM. See drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h and drivers/gpu/drm/i915/intel_region_ttm.c for example. Those need to be updated as well. Regards, Christian. return true;
Re: [Intel-gfx] [PATCH v2 0/5] Add module oriented dmesg output
On 21/11/2022 18:21, John Harrison wrote: On 11/18/2022 02:52, Jani Nikula wrote: On Thu, 17 Nov 2022, john.c.harri...@intel.com wrote: From: John Harrison When trying to analyse bug reports from CI, customers, etc. it can be difficult to work out exactly what is happening on which GT in a multi-GT system. So add GT oriented debug/error message wrappers. If used instead of the drm_ equivalents, you get the same output but with a GT# prefix on it. It was also requested to extend this further to submodules in order to factor out the repeated structure accessing constructs and common string prefixes. So, add versions for GuC, HuC and GuC CTB as well. This patch set updates all the gt/uc files to use the new helpers as a first step. The intention would be to convert all output messages that have access to a GT structure. v2: Go back to using lower case names, add more wrapper sets (combined review feedback). Also, wrap up probe injection and WARN entries. Signed-off-by: John Harrison For adding the wrappers in general, I'm going to disagree and commit. I'll leave it up to Tvrtko and Joonas. Regarding the placement of the macros, I insist you add individual header files for the wrappers and include them only where needed. We have a fairly serious problem with everything including everything in i915 that I've been slowly trying to tackle. Touch one thing, rebuild everything. About a third of our headers cause the rebuild of the entire driver when modified. We need to reduce the surface of things that cause rebuilds. For example, intel_gt.h is included by 97 files, intel_guc.h by 332 files, and intel_huc.h by 329 files (counting recursively). There's absolutely no reason any of the display code, for example, needs to have these logging macros in their build. Long term, the headers should be reorganized to reduce the interdependencies, and this is what I've been doing in i915_drv.h and display/ in general. But the least we can do is not make the problem worse. @Tvrtko/@Michal W, any other review comments or feedback? I'd rather not spend time fixing up the header file issue and reposting only to have someone point out another issue that could have been resolved at the same time. I read through the patches when you posted them and it looked nice and clean to me. I think I spotted one instance of a debug build only message getting upgraded to production build, and one loss of stack trace on a warning, but it wasn't a concern to me AFAIR. Regards, Tvrtko John. BR, Jani. John Harrison (5): drm/i915/gt: Start adding module oriented dmesg output drm/i915/huc: Add HuC specific debug print wrappers drm/i915/guc: Add GuC specific debug print wrappers drm/i915/guc: Add GuC CT specific debug print wrappers drm/i915/uc: Update the gt/uc code to use gt_err and friends drivers/gpu/drm/i915/gt/intel_gt.c | 96 drivers/gpu/drm/i915/gt/intel_gt.h | 35 +++ drivers/gpu/drm/i915/gt/uc/intel_guc.c | 32 +-- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 35 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 8 +- .../gpu/drm/i915/gt/uc/intel_guc_capture.c | 48 ++-- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 222 +- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 19 +- drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 37 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c | 7 +- drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 55 ++--- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 62 +++-- drivers/gpu/drm/i915/gt/uc/intel_huc.c | 31 +-- drivers/gpu/drm/i915/gt/uc/intel_huc.h | 23 ++ drivers/gpu/drm/i915/gt/uc/intel_uc.c | 108 - drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 98 drivers/gpu/drm/i915/gt/uc/selftest_guc.c | 34 +-- .../drm/i915/gt/uc/selftest_guc_hangcheck.c | 22 +- .../drm/i915/gt/uc/selftest_guc_multi_lrc.c | 10 +- 19 files changed, 507 insertions(+), 475 deletions(-)
Re: [PATCH v1 2/8] dt-bindings: display: bridge: renesas, dsi-csi2-tx: Add r8a779g0
On 17/11/2022 17:14, Geert Uytterhoeven wrote: Hi Tomi, On Thu, Nov 17, 2022 at 1:26 PM Tomi Valkeinen wrote: From: Tomi Valkeinen Extend the Renesas DSI display bindings to support the r8a779g0 V4H. Signed-off-by: Tomi Valkeinen --- .../bindings/display/bridge/renesas,dsi-csi2-tx.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml index afeeb967393d..bc3101f77e5a 100644 --- a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml +++ b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml @@ -11,13 +11,14 @@ maintainers: description: | This binding describes the MIPI DSI/CSI-2 encoder embedded in the Renesas - R-Car V3U SoC. The encoder can operate in either DSI or CSI-2 mode, with up + R-Car V3U/V4H SoC. The encoder can operate in either DSI or CSI-2 mode, with up Perhaps "R-Car Gen4 SoCs", so we stay within 80 chars, and don't have to update this when the next member of the family is around the block? Is V3U gen 4? Or do you mean "R-Car V3U and Gen 4 SoCs"? Is there anything that might be SoC-specific? If not, perhaps the time is ripe for a family-specific compatible value? At least v3u and v4h DSIs are slightly different. Well, the DSI IP block itself looks the same, but the PLL and PHY are different. Tomi
Re: [PATCH] drm/msm/dpu: Print interrupt index in addition to the mask
On November 21, 2022 11:24:55 PM GMT+01:00, Marijn Suijten wrote: >The mask only describes the `irq_idx % 32` part, making it generally >impossible to deduce what interrupt is being enabled/disabled. Since >`debug/core_irq` in debugfs (and other prints) also include the full >`DPU_IRQ_IDX()` value, print the same full value here for easier >correlation instead of only adding the `irq_idx / 32` part. > >Furthermore, make the dbgstr messages more consistent. > >Signed-off-by: Marijn Suijten >--- > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > >diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c >b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c >index cf1b6d84c18a..64589a9c2c51 100644 >--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c >+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c >@@ -252,9 +252,9 @@ static int dpu_hw_intr_enable_irq_locked(struct >dpu_hw_intr *intr, int irq_idx) > > cache_irq_mask = intr->cache_irq_mask[reg_idx]; > if (cache_irq_mask & DPU_IRQ_MASK(irq_idx)) { >- dbgstr = "DPU IRQ already set:"; >+ dbgstr = "already "; > } else { >- dbgstr = "DPU IRQ enabled:"; >+ dbgstr = ""; > > cache_irq_mask |= DPU_IRQ_MASK(irq_idx); > /* Cleaning any pending interrupt */ >@@ -268,7 +268,7 @@ static int dpu_hw_intr_enable_irq_locked(struct >dpu_hw_intr *intr, int irq_idx) > intr->cache_irq_mask[reg_idx] = cache_irq_mask; > } > >- pr_debug("%s MASK:0x%.8lx, CACHE-MASK:0x%.8x\n", dbgstr, >+ pr_debug("DPU IRQ %d %senabled: MASK:0x%.8lx, CACHE-MASK:0x%.8x\n", >irq_idx, dbgstr, > DPU_IRQ_MASK(irq_idx), cache_irq_mask); > > return 0; >@@ -301,9 +301,9 @@ static int dpu_hw_intr_disable_irq_locked(struct >dpu_hw_intr *intr, int irq_idx) > > cache_irq_mask = intr->cache_irq_mask[reg_idx]; > if ((cache_irq_mask & DPU_IRQ_MASK(irq_idx)) == 0) { >- dbgstr = "DPU IRQ is already cleared:"; >+ dbgstr = "already "; > } else { >- dbgstr = "DPU IRQ mask disable:"; >+ dbgstr = ""; > > cache_irq_mask &= ~DPU_IRQ_MASK(irq_idx); > /* Disable interrupts based on the new mask */ >@@ -317,7 +317,7 @@ static int dpu_hw_intr_disable_irq_locked(struct >dpu_hw_intr *intr, int irq_idx) > intr->cache_irq_mask[reg_idx] = cache_irq_mask; > } > >- pr_debug("%s MASK:0x%.8lx, CACHE-MASK:0x%.8x\n", dbgstr, >+ pr_debug("DPU IRQ %d %sdisabled: MASK:0x%.8lx, CACHE-MASK:0x%.8x\n", >irq_idx, dbgstr, > DPU_IRQ_MASK(irq_idx), cache_irq_mask); > > return 0; Looks good to me. Reviewed-by: Martin Botka
Re: ast: resolutions that require single-buffering (due to VRAM limitations) are unavailable
Thomas Zimmermann: Hi Am 25.10.22 um 09:12 schrieb Jeremy Rand: Hi dri-devel, I have two machines with ASPEED GPU's (ast Linux driver). One machine is x86_64, running an ASRock Rack Tommy 90-SC02P1-00UBNZ GPU (AST2510 chipset) with KDE Plasma Wayland; the other is ppc64le, running an integrated AST2500 GPU with KDE Plasma X11. Both the AST2510 and AST2500 have 16 MiB VRAM according to lspci. Both ASPEED GPU's are advertised as supporting up to 1920x1200 resolution, but KDE only detects a maximum resolution of 1920x1080. Some additional information about this bug can be found at https://forums.raptorcs.com/index.php/topic,31.0.html . I believe this is a Linux bug, because it is solely dependent on the Linux version. The following Linux versions are confirmed to have the bug: Debian: 5.6.0-1 (ppc64el) Fedora: 5.6.0-1.fc33.x86_64 5.6.0-1.fc33.ppc64le 5.17.5-300.fc36.x86_64 5.18.6-200.fc36.ppc64le 6.1.0-0.rc0.20221007git4c86114194e6.5.fc38.ppc64le Whereas the following Linux versions are confirmed to work fine (max resolution detected by KDE is 1920x1200 as it should be, and that resolution works fine when selected): Debian: 5.5.0-2 (ppc64el) Fedora: 5.5.17-200.fc31.x86_64 5.5.17-200.fc31.ppc64le I believe the bug was introduced by Linux commit 9253f830c9166bfa6cc07d5ed59e174e9d5ec6ca, which adds a VRAM size check that assumes double-buffering. 1920x1080 resolution at 4 bytes per pixel with 2 buffers is 16.6 MB, while bumping that to 1920x1200 results in 18.4 MB. Since the VRAM size is 16 MiB == 16.8 MB, that explains the issue. Thanks for reporting. It's been a known issue for a while. But in the most recent devel tree, we have replaced ast memory management, so that it can now use the full vram size for scanout buffers. See https://cgit.freedesktop.org/drm/drm-tip/commit/drivers/gpu/drm/ast/ast_mode.c?id=f2fa5a99ca81ce1056539e83c705f3d6bec62e31 To test, get the latest drm-tip from git://anongit.freedesktop.org/drm/drm-tip and try on your machine. The updated driver should become available in Linux v6.3. Best regards Thomas Hi Thomas! Thanks for pointing me to that branch. It took me some weeks to get it to build due to my lack of experience building kernels from source (hence the delayed reply), but I can confirm that the issue is fixed in drm-tip (1920x1200 works as it should on my ppc64le system; I didn't test on my x86_64 system). Additionally, drm-tip fixes a different AST bug that was causing certain Wayland ppc64le systems (mainly GNOME, KDE, and GDM) to hang on a black screen (for which I was halfway through preparing a bug report), so that is a pleasant surprise. Looking forward to Linux v6.3. Cheers, -- -Jeremy Rand Lead Application Engineer at Namecoin OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v4 02/10] dt-bindings: display: bridge: Add MHDP DP for i.MX8MQ
On 21/11/2022 08:23, Sandor Yu wrote: > Add bindings for i.MX8MQ MHDP DisplayPort. > > Signed-off-by: Sandor Yu > --- > .../display/bridge/cdns,mhdp-imx8mq-dp.yaml | 93 +++ > 1 file changed, 93 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.yaml > > diff --git > a/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.yaml > b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.yaml > new file mode 100644 > index ..d82f3ceddaa8 > --- /dev/null > +++ > b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.yaml > @@ -0,0 +1,93 @@ > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/display/bridge/cdns,mhdp-imx8mq-dp.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Cadence MHDP Displayport bridge > + > +maintainers: > + - Sandor Yu > + > +description: > + The Cadence MHDP Displayport TX interface. > + > +properties: > + compatible: > +enum: > + - cdns,mhdp-imx8mq-dp > + > + reg: > +maxItems: 1 > + > + clocks: > +maxItems: 1 > +description: MHDP DP APB clock. > + > + phys: > +maxItems: 1 > + > + interrupts: > +items: > + - description: Hotplug cable plugin. > + - description: Hotplug cable plugout. > + > + interrupt-names: > +items: > + - const: plug_in > + - const: plug_out > + > + ports: > +$ref: /schemas/graph.yaml#/properties/ports > + > +properties: > + port@0: > +$ref: /schemas/graph.yaml#/properties/port > +description: > + Input port from display controller output. > + port@1: > +$ref: /schemas/graph.yaml#/properties/port > +description: > + Output port to DP connector. > + > +required: > + - port@0 > + > +required: > + - compatible > + - reg > + - clocks > + - interrupts > + - interrupt-names > + - phys > + - ports > + > +additionalProperties: false > + > +examples: > + - | > +#include > +#include > + > +mhdp_dp: dp-bridge@32c0 { > +compatible = "cdns,mhdp-imx8mq-dp"; > +reg = <0x32c0 0x10>; > +interrupts = , > + ; > +interrupt-names = "plug_in", "plug_out"; > +clocks = <&clk IMX8MQ_CLK_DISP_APB_ROOT>; > +phys = <&dp_phy>; > + > +ports { > +#address-cells = <1>; > +#size-cells = <0>; > + > +port@0 { > +reg = <0>; > + > +mhdp_in: endpoint { > +remote-endpoint = <&dcss_out>; > +}; As Rob suggested, you allowed property for output to DP port. However it is not in the example. If this is a bridge, what does it bridge if there is no output connector? Best regards, Krzysztof
Re: [PATCH v1 4/8] arm64: dts: renesas: r8a779g0: Add display related data
On 17/11/2022 17:03, Kieran Bingham wrote: Quoting Tomi Valkeinen (2022-11-17 12:25:43) From: Tomi Valkeinen Add DT nodes for components needed to get the DSI output working: - FCPv - VSPd - DU - DSI Signed-off-by: Tomi Valkeinen --- arch/arm64/boot/dts/renesas/r8a779g0.dtsi | 129 ++ 1 file changed, 129 insertions(+) diff --git a/arch/arm64/boot/dts/renesas/r8a779g0.dtsi b/arch/arm64/boot/dts/renesas/r8a779g0.dtsi index 45d8d927ad26..31d4930c5adc 100644 --- a/arch/arm64/boot/dts/renesas/r8a779g0.dtsi +++ b/arch/arm64/boot/dts/renesas/r8a779g0.dtsi @@ -1207,6 +1207,135 @@ prr: chipid@fff00044 { compatible = "renesas,prr"; reg = <0 0xfff00044 0 4>; }; I think these nodes are supposed to be in sort order based on the register address in memory. Ah, I didn't realize that. Disregarding sort order, I'll review the node contents. I would probably s/data/nodes/ in $SUBJECT too. + + fcpvd0: fcp@fea1 { + compatible = "renesas,fcpv"; + reg = <0 0xfea1 0 0x200>; + clocks = <&cpg CPG_MOD 508>; + power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>; + resets = <&cpg 508>; + }; + + fcpvd1: fcp@fea11000 { + compatible = "renesas,fcpv"; + reg = <0 0xfea11000 0 0x200>; + clocks = <&cpg CPG_MOD 509>; + power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>; + resets = <&cpg 509>; + }; I'm intrigued at the length of 0x200 as I only see 3 registers up to 0x0018 .. But all existing platforms with fcpv* set 0x200 ... so lets cargo cult it up... :-) + + vspd0: vsp@fea2 { + compatible = "renesas,vsp2"; + reg = <0 0xfea2 0 0x5000>; """ Below are the base addresses of each VSP unit. VSPX has 32Kbyte address space. VSPD has 28Kbyte address space. """ Hrm : 28K is 0x7000 RPf n OSD CLUT Table: H’4000 + H’0400*n to H’43fc + H’0400*n 0x43fc+(0x400*5) 22524 [0x57fc] So this needs to be /at least/ 0x6000 (Would 0x5800 be odd?) and perhaps as it clearly states 28k, we should just set it to 0x7000. Ok. These are identical to v3u, and I was just copying v3u's dts, assuming they're correct. We probably should fix the v3u dts files too. + interrupts = ; + clocks = <&cpg CPG_MOD 830>; + power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>; + resets = <&cpg 830>; + + renesas,fcp = <&fcpvd0>; + }; + + vspd1: vsp@fea28000 { + compatible = "renesas,vsp2"; + reg = <0 0xfea28000 0 0x5000>; Same here of course (reg = <0 0xfea28000 0 0x7000>) + interrupts = ; + clocks = <&cpg CPG_MOD 831>; + power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>; + resets = <&cpg 831>; + + renesas,fcp = <&fcpvd1>; + }; + + du: display@feb0 { + compatible = "renesas,du-r8a779g0"; + reg = <0 0xfeb0 0 0x4>; + interrupts = , +; + clocks = <&cpg CPG_MOD 411>; + clock-names = "du.0"; + power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>; + resets = <&cpg 411>; + reset-names = "du.0"; + renesas,vsps = <&vspd0 0>, <&vspd1 0>; + + status = "disabled"; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + du_out_dsi0: endpoint { + remote-endpoint = <&dsi0_in>; + }; + }; + + port@1 { + reg = <1>; + du_out_dsi1: endpoint { + remote-endpoint = <&dsi1_in>; + }; + }; + }; + }; + + dsi0: dsi-encoder@fed8 { + compatible = "renesas,r8a779g0-dsi-csi2-tx"; + reg = <0 0xfed8 0 0x1>; + power-domains = <&sysc R8A779G0_PD_ALWAYS_ON>; + clocks = <&cpg CPG_MOD 415>, +
Re: [PATCH v1 6/8] drm: rcar-du: Add r8a779g0 support
On 22/11/2022 05:05, Laurent Pinchart wrote: On Thu, Nov 17, 2022 at 03:08:38PM +, Kieran Bingham wrote: Quoting Tomi Valkeinen (2022-11-17 12:25:45) From: Tomi Valkeinen Add support for DU on r8a779g0, which is identical to DU on r8a779a0. Signed-off-by: Tomi Valkeinen --- drivers/gpu/drm/rcar-du/rcar_du_drv.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c b/drivers/gpu/drm/rcar-du/rcar_du_drv.c index d003e8d9e7a2..b1761d4ec4e5 100644 --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c @@ -524,6 +524,27 @@ static const struct rcar_du_device_info rcar_du_r8a779a0_info = { .dsi_clk_mask = BIT(1) | BIT(0), }; +static const struct rcar_du_device_info rcar_du_r8a779g0_info = { + .gen = 3, Given that this is the V4H ... I wonder if this should be bumped already. I guess that has knock on effects through the driver though... rcar_du_group_setup_didsr() would need to be fixed to test gen >= 3 instead of gen == 3. That seems to be the only problematic location. It could thus fairly easily be done in v2, but we can also delay it. Ok, I can fix that here. Tomi
Re: [PATCH v1 7/8] drm: rcar-du: dsi: Add r8A779g0 support
On 17/11/2022 17:46, Kieran Bingham wrote: Quoting Tomi Valkeinen (2022-11-17 12:25:46) From: Tomi Valkeinen Add DSI support for r8a779g0. The main differences to r8a779a0 are in the PLL and PHTW setups. Signed-off-by: Tomi Valkeinen --- drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c | 484 +++ drivers/gpu/drm/rcar-du/rcar_mipi_dsi_regs.h | 6 +- 2 files changed, 384 insertions(+), 106 deletions(-) diff --git a/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c b/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c index a7f2b7f66a17..723c35726c38 100644 --- a/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c +++ b/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include #include @@ -28,6 +29,20 @@ #include "rcar_mipi_dsi.h" #include "rcar_mipi_dsi_regs.h" +#define MHZ(v) ((v) * 100u) + +enum rcar_mipi_dsi_hw_model { + RCAR_DSI_R8A779A0, + RCAR_DSI_R8A779G0, +}; + +struct rcar_mipi_dsi_device_info { + enum rcar_mipi_dsi_hw_model model; + const struct dsi_clk_config *clk_cfg; + u8 clockset2_m_offset; + u8 clockset2_n_offset; +}; + struct rcar_mipi_dsi { struct device *dev; const struct rcar_mipi_dsi_device_info *info; @@ -50,6 +65,17 @@ struct rcar_mipi_dsi { unsigned int lanes; }; +struct dsi_setup_info { + unsigned long hsfreq; + u16 hsfreqrange; + + unsigned long fout; + u16 m; + u16 n; + u16 vclk_divider; + const struct dsi_clk_config *clkset; +}; + static inline struct rcar_mipi_dsi * bridge_to_rcar_mipi_dsi(struct drm_bridge *bridge) { @@ -62,22 +88,6 @@ host_to_rcar_mipi_dsi(struct mipi_dsi_host *host) return container_of(host, struct rcar_mipi_dsi, host); } -static const u32 phtw[] = { - 0x01020114, 0x01600115, /* General testing */ - 0x01030116, 0x0102011d, /* General testing */ - 0x011101a4, 0x018601a4, /* 1Gbps testing */ - 0x014201a0, 0x010001a3, /* 1Gbps testing */ - 0x0101011f, /* 1Gbps testing */ -}; - -static const u32 phtw2[] = { - 0x010c0130, 0x010c0140, /* General testing */ - 0x010c0150, 0x010c0180, /* General testing */ - 0x010c0190, - 0x010a0160, 0x010a0170, - 0x01800164, 0x01800174, /* 1Gbps testing */ -}; - static const u32 hsfreqrange_table[][2] = { { 8000U, 0x00 }, { 9000U, 0x10 }, { 1U, 0x20 }, { 11000U, 0x30 }, { 12000U, 0x01 }, { 13000U, 0x11 }, @@ -103,24 +113,53 @@ static const u32 hsfreqrange_table[][2] = { { /* sentinel */ }, }; -struct vco_cntrl_value { +struct dsi_clk_config { u32 min_freq; u32 max_freq; - u16 value; + u8 vco_cntrl; + u8 cpbias_cntrl; + u8 gmp_cntrl; + u8 int_cntrl; + u8 prop_cntrl; }; -static const struct vco_cntrl_value vco_cntrl_table[] = { - { .min_freq = 4000U, .max_freq = 5500U, .value = 0x3f }, - { .min_freq = 5250U, .max_freq = 8000U, .value = 0x39 }, - { .min_freq = 8000U, .max_freq = 11000U, .value = 0x2f }, - { .min_freq = 10500U, .max_freq = 16000U, .value = 0x29 }, - { .min_freq = 16000U, .max_freq = 22000U, .value = 0x1f }, - { .min_freq = 21000U, .max_freq = 32000U, .value = 0x19 }, - { .min_freq = 32000U, .max_freq = 44000U, .value = 0x0f }, - { .min_freq = 42000U, .max_freq = 66000U, .value = 0x09 }, - { .min_freq = 63000U, .max_freq = 114900U, .value = 0x03 }, - { .min_freq = 11U, .max_freq = 115200U, .value = 0x01 }, - { .min_freq = 115000U, .max_freq = 125000U, .value = 0x01 }, +static const struct dsi_clk_config dsi_clk_cfg_r8a779a0[] = { + { 4000u, 5500u, 0x3f, 0x10, 0x01, 0x00, 0x0b }, + { 5250u, 8000u, 0x39, 0x10, 0x01, 0x00, 0x0b }, + { 8000u, 11000u, 0x2f, 0x10, 0x01, 0x00, 0x0b }, + { 10500u, 16000u, 0x29, 0x10, 0x01, 0x00, 0x0b }, + { 16000u, 22000u, 0x1f, 0x10, 0x01, 0x00, 0x0b }, + { 21000u, 32000u, 0x19, 0x10, 0x01, 0x00, 0x0b }, + { 32000u, 44000u, 0x0f, 0x10, 0x01, 0x00, 0x0b }, + { 42000u, 66000u, 0x09, 0x10, 0x01, 0x00, 0x0b }, + { 63000u, 114900u, 0x03, 0x10, 0x01, 0x00, 0x0b }, + { 11u, 115200u, 0x01, 0x10, 0x01, 0x00, 0x0b }, + { 115000u, 125000u, 0x01, 0x10, 0x01, 0x00, 0x0c }, Sigh ... is it this one 0x0c value that means we need to keep all these entries repeated ? :-S If it weren't for that, it seems we could keep just two sets of + u8 cpbias_cntrl; + u8 gmp_cntrl; + u8 int_cntrl; + u8 prop_cntrl; One for each of the 9a0, and the 9g0... But this is fine, and I guess the implication is there may be other future differences
Re: [PATCH v1 2/8] dt-bindings: display: bridge: renesas, dsi-csi2-tx: Add r8a779g0
Hi Tomi, On Tue, Nov 22, 2022 at 9:20 AM Tomi Valkeinen wrote: > On 17/11/2022 17:14, Geert Uytterhoeven wrote: > > On Thu, Nov 17, 2022 at 1:26 PM Tomi Valkeinen > > wrote: > >> From: Tomi Valkeinen > >> > >> Extend the Renesas DSI display bindings to support the r8a779g0 V4H. > >> > >> Signed-off-by: Tomi Valkeinen > >> --- > >> .../bindings/display/bridge/renesas,dsi-csi2-tx.yaml | 3 ++- > >> 1 file changed, 2 insertions(+), 1 deletion(-) > >> > >> diff --git > >> a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml > >> > >> b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml > >> index afeeb967393d..bc3101f77e5a 100644 > >> --- > >> a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml > >> +++ > >> b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml > >> @@ -11,13 +11,14 @@ maintainers: > >> > >> description: | > >> This binding describes the MIPI DSI/CSI-2 encoder embedded in the > >> Renesas > >> - R-Car V3U SoC. The encoder can operate in either DSI or CSI-2 mode, > >> with up > >> + R-Car V3U/V4H SoC. The encoder can operate in either DSI or CSI-2 mode, > >> with up > > > > Perhaps "R-Car Gen4 SoCs", so we stay within 80 chars, and don't have > > to update this when the next member of the family is around the block? > > Is V3U gen 4? Or do you mean "R-Car V3U and Gen 4 SoCs"? Despite the name, R-Car V3U is the first member of the R-Car Gen4 family... https://www.renesas.com/us/en/products/automotive-products/automotive-system-chips-socs/r-car-v3u-best-class-r-car-v3u-asil-d-system-chip-automated-driving > > Is there anything that might be SoC-specific? > > If not, perhaps the time is ripe for a family-specific compatible value? > > At least v3u and v4h DSIs are slightly different. Well, the DSI IP block > itself looks the same, but the PLL and PHY are different. I noticed, when I saw the dsi-csi2 driver changes. So no family-specific compatible value is needed. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [Intel-gfx] [PATCH 1/6] drm/i915/uc: Introduce GSC FW
On Mon, 21 Nov 2022, Daniele Ceraolo Spurio wrote: > On MTL the GSC FW needs to be loaded on the media GT by the graphics > driver. We're going to treat it like a new uc_fw, so add the initial > defs and init/fini functions for it. > > Similarly to the other FWs, the GSC FW path can be overriden via > modparam. The modparam can also be used to disable the GSC FW loading by > setting it to an empty string. > > Note that the new structure has been called intel_gsc_uc to avoid > confusion with the existing intel_gsc, which instead represents the heci > gsc interfaces. > > Signed-off-by: Daniele Ceraolo Spurio > Cc: Alan Previn > Cc: John Harrison > --- > drivers/gpu/drm/i915/Makefile | 3 +- > drivers/gpu/drm/i915/gt/intel_gt.h| 5 ++ > drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c | 70 +++ > drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h | 36 > drivers/gpu/drm/i915/gt/uc/intel_uc.c | 17 ++ > drivers/gpu/drm/i915/gt/uc/intel_uc.h | 3 + > drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 25 +++- > drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h | 7 ++- > drivers/gpu/drm/i915/i915_params.c| 3 + > drivers/gpu/drm/i915/i915_params.h| 1 + > 10 files changed, 164 insertions(+), 6 deletions(-) > create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c > create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index 01974b82d205..92d37cf71e16 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -205,7 +205,8 @@ i915-y += gt/uc/intel_uc.o \ > gt/uc/intel_guc_submission.o \ > gt/uc/intel_huc.o \ > gt/uc/intel_huc_debugfs.o \ > - gt/uc/intel_huc_fw.o > + gt/uc/intel_huc_fw.o \ > + gt/uc/intel_gsc_uc.o Comment near the top of the file: # Please keep these build lists sorted! > > # graphics system controller (GSC) support > i915-y += gt/intel_gsc.o > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h > b/drivers/gpu/drm/i915/gt/intel_gt.h > index e0365d556248..d2f4fbde5f9f 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt.h > +++ b/drivers/gpu/drm/i915/gt/intel_gt.h > @@ -39,6 +39,11 @@ static inline struct intel_gt *huc_to_gt(struct intel_huc > *huc) > return container_of(huc, struct intel_gt, uc.huc); > } > > +static inline struct intel_gt *gsc_uc_to_gt(struct intel_gsc_uc *gsc_uc) > +{ > + return container_of(gsc_uc, struct intel_gt, uc.gsc); > +} > + > static inline struct intel_gt *gsc_to_gt(struct intel_gsc *gsc) > { > return container_of(gsc, struct intel_gt, gsc); > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c > b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c > new file mode 100644 > index ..65cbf1ce9fa1 > --- /dev/null > +++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c > @@ -0,0 +1,70 @@ > +// SPDX-License-Identifier: MIT > +/* > + * Copyright © 2022 Intel Corporation > + */ > + > +#include > + > +#include "gt/intel_gt.h" > +#include "intel_gsc_uc.h" > +#include "i915_drv.h" > + > +static bool gsc_engine_supported(struct intel_gt *gt) > +{ > + intel_engine_mask_t mask; > + > + /* > + * We reach here from i915_driver_early_probe for the primary GT before > + * its engine mask is set, so we use the device info engine mask for it. > + * For other GTs we expect the GT-specific mask to be set before we > + * call this function. > + */ > + GEM_BUG_ON(!gt_is_root(gt) && !gt->info.engine_mask); > + > + if (gt_is_root(gt)) > + mask = RUNTIME_INFO(gt->i915)->platform_engine_mask; > + else > + mask = gt->info.engine_mask; > + > + return __HAS_ENGINE(mask, GSC0); > +} > + > +void intel_gsc_uc_init_early(struct intel_gsc_uc *gsc) > +{ > + intel_uc_fw_init_early(&gsc->fw, INTEL_UC_FW_TYPE_GSC); > + > + /* we can arrive here from i915_driver_early_probe for primary > + * GT with it being not fully setup hence check device info's > + * engine mask > + */ > + if (!gsc_engine_supported(gsc_uc_to_gt(gsc))){ > + intel_uc_fw_change_status(&gsc->fw, > INTEL_UC_FIRMWARE_NOT_SUPPORTED); > + return; > + } > +} > + > +int intel_gsc_uc_init(struct intel_gsc_uc *gsc) > +{ > + struct drm_i915_private *i915 = gsc_uc_to_gt(gsc)->i915; > + int err; > + > + err = intel_uc_fw_init(&gsc->fw); > + if (err) > + goto out; > + > + intel_uc_fw_change_status(&gsc->fw, INTEL_UC_FIRMWARE_LOADABLE); > + > + return 0; > + > +out: > + i915_probe_error(i915, "failed with %d\n", err); > + return err; > +} > + > +void intel_gsc_uc_fini(struct intel_gsc_uc *gsc) > +{ > + if (!intel_uc_fw_is_loadable(&gsc->fw)) > + return; > + > + intel_uc_fw_fini(&gsc->fw); > +} > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h > b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h > new file mode 10
Re: [PATCH] drm/mediatek: Clean dangling pointer on bind error path
Il 21/11/22 23:37, Nícolas F. R. A. Prado ha scritto: mtk_drm_bind() can fail, in which case drm_dev_put() is called, destroying the drm_device object. However a pointer to it was still being held in the private object, and that pointer would be passed along to DRM in mtk_drm_sys_prepare() if a suspend were triggered at that point, resulting in a panic. Clean the pointer when destroying the object in the error path to prevent this from happening. Signed-off-by: Nícolas F. R. A. Prado Fixes tag please! :-) Cheers, Angelo --- drivers/gpu/drm/mediatek/mtk_drm_drv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c index 39a42dc8fb85..a21ff1b3258c 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c @@ -514,6 +514,7 @@ static int mtk_drm_bind(struct device *dev) err_deinit: mtk_drm_kms_deinit(drm); err_free: + private->drm = NULL; drm_dev_put(drm); return ret; }
[PATCH] drm/bridge: ti-sn65dsi83: Fix delay after reset deassert to match spec
From: Frieder Schrempf The datasheet specifies a delay of 10 milliseconds, but the current driver only waits for 1 ms. Fix this to make sure the initialization sequence meets the spec. Fixes: ceb515ba29ba ("drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and SN65DSI84 driver") Signed-off-by: Frieder Schrempf --- drivers/gpu/drm/bridge/ti-sn65dsi83.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c index 7ba9467fff12..047c14ddbbf1 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c @@ -346,7 +346,7 @@ static void sn65dsi83_atomic_enable(struct drm_bridge *bridge, /* Deassert reset */ gpiod_set_value_cansleep(ctx->enable_gpio, 1); - usleep_range(1000, 1100); + usleep_range(1, 11000); /* Get the LVDS format from the bridge state. */ bridge_state = drm_atomic_get_new_bridge_state(state, bridge); -- 2.38.1
[Bug 216727] New: Failure to wake up from suspend to RAM
https://bugzilla.kernel.org/show_bug.cgi?id=216727 Bug ID: 216727 Summary: Failure to wake up from suspend to RAM Product: Drivers Version: 2.5 Kernel Version: 6.0.9 Hardware: AMD OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: martin...@gmx.com Regression: No Created attachment 303261 --> https://bugzilla.kernel.org/attachment.cgi?id=303261&action=edit errors and tracess logged when the issue occured I've had this issue at least twice so far. The previous time it was on kernel version 5.19.17. When I tried waking up my computer from sleep I was greeted with corrupt graphics. Was able to login via ssh and reboot. I'm on Slackware-current. CPU is AMD Phenom II X4 965 Black Edition. GPU is AMD Radeon RX 550 / 550 Series Both time this happened I had a process running and using up all the memory and eventually being killed by "out of memory". Though first time I experienced this issue the out of memory happend the day beofre. The issue occured only on wake up the next day. I'm attaching some logs (some minor comments in the attached log file), I hope it's not too much but I didn't want to miss any relevant information. Apologies in advance. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH v4 0/4] new subsystem for compute accelerator devices
Hi, On 11/19/2022 9:44 PM, Oded Gabbay wrote: > This is the fourth (and hopefully last) version of the patch-set to add the > new subsystem for compute accelerators. I removed the RFC headline as > I believe it is now ready for merging. Looks good and works without issues. I will rebase next version of Intel VPU patchest on top of this. Acked-by: Jacek Lawrynowicz Tested-by: Jacek Lawrynowicz Regards, Jacek
Re: [PATCH v4] dma-buf: fix racing conflict of dma_heap_add()
Hi Dawei Li, On Mon, 21 Nov 2022 at 23:53, Christian König wrote: > > Hi Dawei, > > from the technical description, coding style etc.. it looks clean to me, > but I'm the completely wrong person to ask for a background check. > > I have a high level understanding of how dma-heaps work, but not a > single line of this code is from me. > > Feel free to add my Acked-by, but Laura, John and others do you have any > opinion? > > Regards, > Christian. > > Am 21.11.22 um 16:28 schrieb Dawei Li: > > On Sat, Nov 05, 2022 at 12:05:36AM +0800, Dawei Li wrote: > > > > Hi Christian, > > May I have your opinion on this change? > > > > Thanks, > > Dawei > > > >> Racing conflict could be: > >> task A task B > >> list_for_each_entry > >> strcmp(h->name)) > >> list_for_each_entry > >> strcmp(h->name) > >> kzallockzalloc > >> .. . > >> device_create device_create > >> list_add > >> list_add > >> > >> The root cause is that task B has no idea about the fact someone > >> else(A) has inserted heap with same name when it calls list_add, > >> so a potential collision occurs. > >> > >> Fixes: c02a81fba74f ("dma-buf: Add dma-buf heaps framework") > >> Signed-off-by: Dawei Li Looks good to me as well. I will apply it over on drm-misc. Best, Sumit. > >> --- > >> v1: > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FTYCP286MB2323950197F60FC3473123B7CA349%40TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM%2F&data=05%7C01%7Cchristian.koenig%40amd.com%7C1989f31257fc458a27c508dacbd5078e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638046413707443203%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=OWPUPrIHGnzwXyQE4WlIthThwSuSK2y3Eq2Wq5zAzbo%3D&reserved=0 > >> v1->v2: Narrow down locking scope, check the existence of heap before > >> insertion, as suggested by Andrew Davis. > >> v2->v3: Remove double checking. > >> v3->v4: Minor coding style and patch formatting adjustment. > >> --- > >> drivers/dma-buf/dma-heap.c | 28 +++- > >> 1 file changed, 15 insertions(+), 13 deletions(-) > >> > >> diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c > >> index 8f5848aa144f..59d158873f4c 100644 > >> --- a/drivers/dma-buf/dma-heap.c > >> +++ b/drivers/dma-buf/dma-heap.c > >> @@ -233,18 +233,6 @@ struct dma_heap *dma_heap_add(const struct > >> dma_heap_export_info *exp_info) > >> return ERR_PTR(-EINVAL); > >> } > >> > >> -/* check the name is unique */ > >> -mutex_lock(&heap_list_lock); > >> -list_for_each_entry(h, &heap_list, list) { > >> -if (!strcmp(h->name, exp_info->name)) { > >> -mutex_unlock(&heap_list_lock); > >> -pr_err("dma_heap: Already registered heap named %s\n", > >> - exp_info->name); > >> -return ERR_PTR(-EINVAL); > >> -} > >> -} > >> -mutex_unlock(&heap_list_lock); > >> - > >> heap = kzalloc(sizeof(*heap), GFP_KERNEL); > >> if (!heap) > >> return ERR_PTR(-ENOMEM); > >> @@ -283,13 +271,27 @@ struct dma_heap *dma_heap_add(const struct > >> dma_heap_export_info *exp_info) > >> err_ret = ERR_CAST(dev_ret); > >> goto err2; > >> } > >> -/* Add heap to the list */ > >> + > >> mutex_lock(&heap_list_lock); > >> +/* check the name is unique */ > >> +list_for_each_entry(h, &heap_list, list) { > >> +if (!strcmp(h->name, exp_info->name)) { > >> +mutex_unlock(&heap_list_lock); > >> +pr_err("dma_heap: Already registered heap named %s\n", > >> + exp_info->name); > >> +err_ret = ERR_PTR(-EINVAL); > >> +goto err3; > >> +} > >> +} > >> + > >> +/* Add heap to the list */ > >> list_add(&heap->list, &heap_list); > >> mutex_unlock(&heap_list_lock); > >> > >> return heap; > >> > >> +err3: > >> +device_destroy(dma_heap_class, heap->heap_devt); > >> err2: > >> cdev_del(&heap->heap_cdev); > >> err1: > >> -- > >> 2.25.1 > >> > -- Thanks and regards, Sumit Semwal (he / him) Tech Lead - LCG, Vertical Technologies Linaro.org │ Open source software for ARM SoCs
Re: [PATCH 6/6] drm/i915: Bpp/timeslot calculation fixes for DP MST DSC
On Thu, Nov 10, 2022 at 02:23:53PM -0800, Navare, Manasi wrote: > On Thu, Nov 03, 2022 at 03:23:00PM +0200, Stanislav Lisovskiy wrote: > > Fix intel_dp_dsc_compute_config, previously timeslots parameter > > was used in fact not as a timeslots, but more like a ratio > > timeslots/64, which of course didn't have any effect for SST DSC, > > but causes now issues for MST DSC. > > Secondly we need to calculate pipe_bpp using intel_dp_dsc_compute_bpp > > only for SST DSC case, while for MST case it has been calculated > > earlier already with intel_dp_dsc_mst_compute_link_config. > > Third we also were wrongly determining sink min bpp/max bpp, those > > limites should be intersected with our limits to find common > > acceptable bpp's, plus on top of that we should align those with > > VESA bpps and only then calculate required timeslots amount. > > Some MST hubs started to work only after third change was made. > > > > v2: Make kernel test robot happy(claimed there was unitialzed use, > > while there is none) > > > > Signed-off-by: Stanislav Lisovskiy > > --- > > drivers/gpu/drm/i915/display/intel_dp.c | 69 ++--- > > drivers/gpu/drm/i915/display/intel_dp.h | 3 +- > > drivers/gpu/drm/i915/display/intel_dp_mst.c | 69 + > > 3 files changed, 106 insertions(+), 35 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c > > b/drivers/gpu/drm/i915/display/intel_dp.c > > index 8288a30dbd51..82752b696498 100644 > > --- a/drivers/gpu/drm/i915/display/intel_dp.c > > +++ b/drivers/gpu/drm/i915/display/intel_dp.c > > @@ -716,9 +716,14 @@ u16 intel_dp_dsc_get_output_bpp(struct > > drm_i915_private *i915, > > * for SST -> TimeSlotsPerMTP is 1, > > * for MST -> TimeSlotsPerMTP has to be calculated > > */ > > - bits_per_pixel = (link_clock * lane_count * 8) * timeslots / > > -intel_dp_mode_to_fec_clock(mode_clock); > > - drm_dbg_kms(&i915->drm, "Max link bpp: %u\n", bits_per_pixel); > > + bits_per_pixel = DIV_ROUND_UP((link_clock * lane_count) * timeslots, > > + intel_dp_mode_to_fec_clock(mode_clock) * > > 8); > > Why did we remove the *8 in the numerator for the total bandwidth > link_clock * lane_count * 8 ? > > Other than this clarification, all changes look good > > Manasi Hi Manasi, Because previously this function was actually confusing the ratio timeslots/64, with the timeslots number. It was actually expecting a ratio timeslots/64, rather than the timeslots number. For SST it didn't matter as timeslots were always 1, but for MST case if we multiply that by number number of timeslots, this formula will return some big bogus bits_per_pixel number(checked that). Of course we can pass a ratio timeslots/64 here, but it isn't very convenient and intuitive to manipulate. So I made it to use a "timeslots" parameter as timeslots number, so that the ratio is calculated as part of the formula i.e: ((link_clock * lane_count * 8) * (timeslots / 64)) / intel_dp_mode_to_fec_clock(mode_clock); which can be simplified as ((link_clock * lane_count * timeslots) / 8) / intel_dp_mode_to_fec_clock(mode_clock); the whole formula comes from that pipe_bpp * crtc_clock should be equal to link_total_bw * (timeslots / 64), i.e timeslots/64 ratio defines, how much of the link_total_bw(link_clock * lane_count * 8) we have for those pipe_bpp * crtc_clock, which we want to accomodate there. Obviously if we just multiplied link_total_bw by timeslots, we would get a situation that the more timeslots we allocate, the more total bw we get, which is wrong and will result in some bogus huge pipe_bpp numbers. Stan > > > + > > + drm_dbg_kms(&i915->drm, "Max link bpp is %u for %u timeslots " > > + "total bw %u pixel clock %u\n", > > + bits_per_pixel, timeslots, > > + (link_clock * lane_count * 8), > > + intel_dp_mode_to_fec_clock(mode_clock)); > > > > /* Small Joiner Check: output bpp <= joiner RAM (bits) / Horiz. width */ > > max_bpp_small_joiner_ram = small_joiner_ram_size_bits(i915) / > > @@ -1047,7 +1052,7 @@ intel_dp_mode_valid(struct drm_connector *_connector, > > target_clock, > > mode->hdisplay, > > bigjoiner, > > - pipe_bpp, 1) >> 4; > > + pipe_bpp, 64) >> 4; > > dsc_slice_count = > > intel_dp_dsc_get_slice_count(intel_dp, > > target_clock, > > @@ -1481,7 +1486,8 @@ int intel_dp_dsc_compute_config(struct intel_dp > > *intel_dp, > > struct intel_crtc_state *pip
Re: [Intel-gfx] [PATCH v3 1/2] drm/i915: Fix negative value passed as remaining time
On 21/11/2022 23:19, Janusz Krzysztofik wrote: Hi Andrzej, Thanks for providing your R-b. On Monday, 21 November 2022 18:40:51 CET Andrzej Hajda wrote: On 21.11.2022 15:56, Janusz Krzysztofik wrote: Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC") extended the API of intel_gt_retire_requests_timeout() with an extra argument 'remaining_timeout', intended for passing back unconsumed portion of requested timeout when 0 (success) is returned. However, when request retirement happens to succeed despite an error returned by a call to dma_fence_wait_timeout(), that error code (a negative value) is passed back instead of remaining time. If we then pass that negative value forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG will be triggered. Right, AFAICT a GEM_BUG_ON in debug builds, but in production builds negative timeout will get passed along all the way to schedule_timeout where error and call trace will be dumped. So fix appears warranted indeed. If request retirement succeeds but an error code is passed back via remaininig_timeout, we may have no clue on how much of the initial timeout might have been left for spending it on waiting for GuC to become idle. OTOH, since all pending requests have been successfully retired, that error code has been already ignored by intel_gt_retire_requests_timeout(), then we shouldn't fail. Assume no more time has been left on error and pass 0 timeout value to intel_uc_wait_for_idle() to give it a chance to return success if GuC is already idle. v3: Don't fail on any error passed back via remaining_timeout. v2: Fix the issue on the caller side, not the provider. Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC") Signed-off-by: Janusz Krzysztofik Cc: sta...@vger.kernel.org # v5.15+ Reviewed-by: Andrzej Hajda While still open for comments from others, I'm now looking for potential committer. Both patches are considered good to go? Regards, Tvrtko Thanks, Janusz Regards Andrzej --- drivers/gpu/drm/i915/gt/intel_gt.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/ intel_gt.c index b5ad9caa55372..7ef0edb2e37cd 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -677,8 +677,13 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout) return -EINTR; } - return timeout ? timeout : intel_uc_wait_for_idle(>->uc, - remaining_timeout); + if (timeout) + return timeout; + + if (remaining_timeout < 0) + remaining_timeout = 0; + + return intel_uc_wait_for_idle(>->uc, remaining_timeout); } int intel_gt_init(struct intel_gt *gt)
[git pull] new subsystem for compute accelerator devices
Hi Dave, Here is the pull request for creating the compute acceleration (accel) subsystem. It includes the fixes to the few comments given in the latest patch-set review. Highlights of the patch-set contents are in the signed tag. Thanks, Oded The following changes since commit fc58764bbf602b65a6f63c53e5fd6feae76c510c: Merge tag 'amd-drm-next-6.2-2022-11-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next (2022-11-22 13:41:11 +1000) are available in the Git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git tags/drm-accel-2022-11-22 for you to fetch changes up to cadbcc80e81800a3189f9fed23407fa014b8e3c3: doc: add documentation for accel subsystem (2022-11-22 12:41:08 +0200) This tag contains the patches that add the new compute acceleration subsystem, which is part of the DRM subsystem. The patches: - Add a new directory at drivers/accel. - Add a new major (261) for compute accelerators. - Add a new DRM minor type for compute accelerators. - Integrate the accel core code with DRM core code. - Add documentation for the accel subsystem. Oded Gabbay (4): drivers/accel: define kconfig and register a new major accel: add dedicated minor for accelerator devices drm: initialize accel framework doc: add documentation for accel subsystem Documentation/accel/index.rst | 17 ++ Documentation/accel/introduction.rst | 110 Documentation/admin-guide/devices.txt | 5 + Documentation/subsystem-apis.rst | 1 + MAINTAINERS | 9 + drivers/Kconfig | 2 + drivers/accel/Kconfig | 24 +++ drivers/accel/drm_accel.c | 323 ++ drivers/gpu/drm/Makefile | 1 + drivers/gpu/drm/drm_drv.c | 101 --- drivers/gpu/drm/drm_file.c| 2 +- drivers/gpu/drm/drm_sysfs.c | 24 ++- include/drm/drm_accel.h | 97 ++ include/drm/drm_device.h | 3 + include/drm/drm_drv.h | 8 + include/drm/drm_file.h| 21 ++- 16 files changed, 711 insertions(+), 37 deletions(-) create mode 100644 Documentation/accel/index.rst create mode 100644 Documentation/accel/introduction.rst create mode 100644 drivers/accel/Kconfig create mode 100644 drivers/accel/drm_accel.c create mode 100644 include/drm/drm_accel.h
Re: [PATCH v3 2/2] drm/i915: Never return 0 if not all requests retired
On 21/11/2022 14:56, Janusz Krzysztofik wrote: Users of intel_gt_retire_requests_timeout() expect 0 return value on success. However, we have no protection from passing back 0 potentially returned by a call to dma_fence_wait_timeout() when it succedes right after its timeout has expired. Is this talking about a potential weakness, or ambiguous kerneldoc, of dma_fence_wait_timeout, dma_fence_default_wait and i915_request_wait_timeout? They appear to say 0 return means timeout, implying unsignaled fence. In other words signaled must return positive remaining timeout. Implementations seems to allow a race which indeed appears that return 0 and signaled fence is possible. If dma_fence_wait can indeed return 0 even when a request is signaled, then how is timeout ?: -ETIME below correct? It isn't a chance for false negative in its' callers? Regards, Tvrtko Replace 0 with -ETIME before potentially using the timeout value as return code, so -ETIME is returned if there are still some requests not retired after timeout, 0 otherwise. v3: Use conditional expression, more compact but also better reflecting intention standing behind the change. v2: Move the added lines down so flush_submission() is not affected. Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request") Signed-off-by: Janusz Krzysztofik Reviewed-by: Andrzej Hajda Cc: sta...@vger.kernel.org # v5.5+ --- drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c index edb881d756309..1dfd01668c79c 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c @@ -199,7 +199,7 @@ out_active: spin_lock(&timelines->lock); if (remaining_timeout) *remaining_timeout = timeout; - return active_count ? timeout : 0; + return active_count ? timeout ?: -ETIME : 0; } static void retire_work_handler(struct work_struct *work)
Re: [PATCH v28 01/11] dt-bindings: arm: mediatek: mmsys: add vdosys1 compatible for MT8195
Re: Build regressions/improvements in v6.1-rc6
On Tue, 22 Nov 2022, Geert Uytterhoeven wrote: JFYI, when comparing v6.1-rc6[1] to v6.1-rc5[3], the summaries are: - build errors: +6/-0 + /kisskb/src/arch/sh/include/asm/io.h: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]: => 239:34 sh4-gcc11/sh-allmodconfig (in cvm_oct_free_hw_memory()) + /kisskb/src/arch/um/include/asm/processor-generic.h: error: called object is not a function or function pointer: => 94:18 + /kisskb/src/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c: error: control reaches end of non-void function [-Werror=return-type]: => 1934:1 um-x86_64/um-all{mod,yes}config (in kfd_cpumask_to_apic_id()) + /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: 'X86_VENDOR_AMD' undeclared (first use in this function): => 149:37 + /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: 'struct cpuinfo_um' has no member named 'x86_vendor': => 149:22 + /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: control reaches end of non-void function [-Werror=return-type]: => 150:1 um-x86_64/um-allyesconfig [1] http://kisskb.ellerman.id.au/kisskb/branch/linus/head/eb7081409f94a9a8608593d0fb63a1aa3d6f95d8/ (all 149 configs) [3] http://kisskb.ellerman.id.au/kisskb/branch/linus/head/094226ad94f471a9f19e8f8e7140a09c2625abaa/ (all 149 configs) Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH v4 3/4] drm: initialize accel framework
On 11/19, Oded Gabbay wrote: > Now that we have the accel framework code ready, let's call the > accel functions from all the appropriate places. These places are the > drm module init/exit functions, and all the drm_minor handling > functions. Hi Oded, The proposal overall LGTM, but I didn't manage to compile the kernel with your patch series when DRM is enabled but accel support doesn't. Multiple missing link errors... Am I missing something? Can you take a look at it from this patch 3/4? Thanks, Melissa > > Signed-off-by: Oded Gabbay > --- > drivers/gpu/drm/drm_drv.c | 102 ++-- > drivers/gpu/drm/drm_sysfs.c | 24 ++--- > 2 files changed, 91 insertions(+), 35 deletions(-) > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > index 8214a0b1ab7f..1aec019f6389 100644 > --- a/drivers/gpu/drm/drm_drv.c > +++ b/drivers/gpu/drm/drm_drv.c > @@ -35,6 +35,7 @@ > #include > #include > > +#include > #include > #include > #include > @@ -90,6 +91,8 @@ static struct drm_minor **drm_minor_get_slot(struct > drm_device *dev, > return &dev->primary; > case DRM_MINOR_RENDER: > return &dev->render; > + case DRM_MINOR_ACCEL: > + return &dev->accel; > default: > BUG(); > } > @@ -104,9 +107,13 @@ static void drm_minor_alloc_release(struct drm_device > *dev, void *data) > > put_device(minor->kdev); > > - spin_lock_irqsave(&drm_minor_lock, flags); > - idr_remove(&drm_minors_idr, minor->index); > - spin_unlock_irqrestore(&drm_minor_lock, flags); > + if (minor->type == DRM_MINOR_ACCEL) { > + accel_minor_remove(minor->index); > + } else { > + spin_lock_irqsave(&drm_minor_lock, flags); > + idr_remove(&drm_minors_idr, minor->index); > + spin_unlock_irqrestore(&drm_minor_lock, flags); > + } > } > > static int drm_minor_alloc(struct drm_device *dev, unsigned int type) > @@ -123,13 +130,17 @@ static int drm_minor_alloc(struct drm_device *dev, > unsigned int type) > minor->dev = dev; > > idr_preload(GFP_KERNEL); > - spin_lock_irqsave(&drm_minor_lock, flags); > - r = idr_alloc(&drm_minors_idr, > - NULL, > - 64 * type, > - 64 * (type + 1), > - GFP_NOWAIT); > - spin_unlock_irqrestore(&drm_minor_lock, flags); > + if (type == DRM_MINOR_ACCEL) { > + r = accel_minor_alloc(); > + } else { > + spin_lock_irqsave(&drm_minor_lock, flags); > + r = idr_alloc(&drm_minors_idr, > + NULL, > + 64 * type, > + 64 * (type + 1), > + GFP_NOWAIT); > + spin_unlock_irqrestore(&drm_minor_lock, flags); > + } > idr_preload_end(); > > if (r < 0) > @@ -161,10 +172,14 @@ static int drm_minor_register(struct drm_device *dev, > unsigned int type) > if (!minor) > return 0; > > - ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root); > - if (ret) { > - DRM_ERROR("DRM: Failed to initialize /sys/kernel/debug/dri.\n"); > - goto err_debugfs; > + if (minor->type == DRM_MINOR_ACCEL) { > + accel_debugfs_init(minor, minor->index); > + } else { > + ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root); > + if (ret) { > + DRM_ERROR("DRM: Failed to initialize > /sys/kernel/debug/dri.\n"); > + goto err_debugfs; > + } > } > > ret = device_add(minor->kdev); > @@ -172,9 +187,13 @@ static int drm_minor_register(struct drm_device *dev, > unsigned int type) > goto err_debugfs; > > /* replace NULL with @minor so lookups will succeed from now on */ > - spin_lock_irqsave(&drm_minor_lock, flags); > - idr_replace(&drm_minors_idr, minor, minor->index); > - spin_unlock_irqrestore(&drm_minor_lock, flags); > + if (minor->type == DRM_MINOR_ACCEL) { > + accel_minor_replace(minor, minor->index); > + } else { > + spin_lock_irqsave(&drm_minor_lock, flags); > + idr_replace(&drm_minors_idr, minor, minor->index); > + spin_unlock_irqrestore(&drm_minor_lock, flags); > + } > > DRM_DEBUG("new minor registered %d\n", minor->index); > return 0; > @@ -194,9 +213,13 @@ static void drm_minor_unregister(struct drm_device *dev, > unsigned int type) > return; > > /* replace @minor with NULL so lookups will fail from now on */ > - spin_lock_irqsave(&drm_minor_lock, flags); > - idr_replace(&drm_minors_idr, NULL, minor->index); > - spin_unlock_irqrestore(&drm_minor_lock, flags); > + if (minor->type == DRM_MINOR_ACCEL) { > + accel_minor_replace(NULL, minor->index); > + } else { > + spi
Re: [PATCH v4 3/4] drm: initialize accel framework
On Tue, Nov 22, 2022 at 12:56 PM Melissa Wen wrote: > > On 11/19, Oded Gabbay wrote: > > Now that we have the accel framework code ready, let's call the > > accel functions from all the appropriate places. These places are the > > drm module init/exit functions, and all the drm_minor handling > > functions. > > Hi Oded, > > The proposal overall LGTM, but I didn't manage to compile the kernel > with your patch series when DRM is enabled but accel support doesn't. > Multiple missing link errors... > > Am I missing something? Can you take a look at it from this patch 3/4? > > Thanks, > > Melissa Looking at it now, thanks for letting me know. Oded > > > > Signed-off-by: Oded Gabbay > > --- > > drivers/gpu/drm/drm_drv.c | 102 ++-- > > drivers/gpu/drm/drm_sysfs.c | 24 ++--- > > 2 files changed, 91 insertions(+), 35 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > > index 8214a0b1ab7f..1aec019f6389 100644 > > --- a/drivers/gpu/drm/drm_drv.c > > +++ b/drivers/gpu/drm/drm_drv.c > > @@ -35,6 +35,7 @@ > > #include > > #include > > > > +#include > > #include > > #include > > #include > > @@ -90,6 +91,8 @@ static struct drm_minor **drm_minor_get_slot(struct > > drm_device *dev, > > return &dev->primary; > > case DRM_MINOR_RENDER: > > return &dev->render; > > + case DRM_MINOR_ACCEL: > > + return &dev->accel; > > default: > > BUG(); > > } > > @@ -104,9 +107,13 @@ static void drm_minor_alloc_release(struct drm_device > > *dev, void *data) > > > > put_device(minor->kdev); > > > > - spin_lock_irqsave(&drm_minor_lock, flags); > > - idr_remove(&drm_minors_idr, minor->index); > > - spin_unlock_irqrestore(&drm_minor_lock, flags); > > + if (minor->type == DRM_MINOR_ACCEL) { > > + accel_minor_remove(minor->index); > > + } else { > > + spin_lock_irqsave(&drm_minor_lock, flags); > > + idr_remove(&drm_minors_idr, minor->index); > > + spin_unlock_irqrestore(&drm_minor_lock, flags); > > + } > > } > > > > static int drm_minor_alloc(struct drm_device *dev, unsigned int type) > > @@ -123,13 +130,17 @@ static int drm_minor_alloc(struct drm_device *dev, > > unsigned int type) > > minor->dev = dev; > > > > idr_preload(GFP_KERNEL); > > - spin_lock_irqsave(&drm_minor_lock, flags); > > - r = idr_alloc(&drm_minors_idr, > > - NULL, > > - 64 * type, > > - 64 * (type + 1), > > - GFP_NOWAIT); > > - spin_unlock_irqrestore(&drm_minor_lock, flags); > > + if (type == DRM_MINOR_ACCEL) { > > + r = accel_minor_alloc(); > > + } else { > > + spin_lock_irqsave(&drm_minor_lock, flags); > > + r = idr_alloc(&drm_minors_idr, > > + NULL, > > + 64 * type, > > + 64 * (type + 1), > > + GFP_NOWAIT); > > + spin_unlock_irqrestore(&drm_minor_lock, flags); > > + } > > idr_preload_end(); > > > > if (r < 0) > > @@ -161,10 +172,14 @@ static int drm_minor_register(struct drm_device *dev, > > unsigned int type) > > if (!minor) > > return 0; > > > > - ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root); > > - if (ret) { > > - DRM_ERROR("DRM: Failed to initialize > > /sys/kernel/debug/dri.\n"); > > - goto err_debugfs; > > + if (minor->type == DRM_MINOR_ACCEL) { > > + accel_debugfs_init(minor, minor->index); > > + } else { > > + ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root); > > + if (ret) { > > + DRM_ERROR("DRM: Failed to initialize > > /sys/kernel/debug/dri.\n"); > > + goto err_debugfs; > > + } > > } > > > > ret = device_add(minor->kdev); > > @@ -172,9 +187,13 @@ static int drm_minor_register(struct drm_device *dev, > > unsigned int type) > > goto err_debugfs; > > > > /* replace NULL with @minor so lookups will succeed from now on */ > > - spin_lock_irqsave(&drm_minor_lock, flags); > > - idr_replace(&drm_minors_idr, minor, minor->index); > > - spin_unlock_irqrestore(&drm_minor_lock, flags); > > + if (minor->type == DRM_MINOR_ACCEL) { > > + accel_minor_replace(minor, minor->index); > > + } else { > > + spin_lock_irqsave(&drm_minor_lock, flags); > > + idr_replace(&drm_minors_idr, minor, minor->index); > > + spin_unlock_irqrestore(&drm_minor_lock, flags); > > + } > > > > DRM_DEBUG("new minor registered %d\n", minor->index); > > return 0; > > @@ -194,9 +213,13 @@ static void drm_minor_unregister(struct drm_device > > *dev, unsigned int type) > > return; > > > > /* replace @mino
Re: [PATCH v4 3/4] drm: initialize accel framework
On Tue, Nov 22, 2022 at 12:59 PM Oded Gabbay wrote: > > On Tue, Nov 22, 2022 at 12:56 PM Melissa Wen wrote: > > > > On 11/19, Oded Gabbay wrote: > > > Now that we have the accel framework code ready, let's call the > > > accel functions from all the appropriate places. These places are the > > > drm module init/exit functions, and all the drm_minor handling > > > functions. > > > > Hi Oded, > > > > The proposal overall LGTM, but I didn't manage to compile the kernel > > with your patch series when DRM is enabled but accel support doesn't. > > Multiple missing link errors... > > > > Am I missing something? Can you take a look at it from this patch 3/4? > > > > Thanks, > > > > Melissa > Looking at it now, thanks for letting me know. > Oded Found the issue, missing } at end of accel_debugfs_init() in drm_accel.h. Only compiles when accel support isn't compiled in. I'll fix it before sending the PR to Dave. Much appreciated :) Oded > > > > > > > Signed-off-by: Oded Gabbay > > > --- > > > drivers/gpu/drm/drm_drv.c | 102 ++-- > > > drivers/gpu/drm/drm_sysfs.c | 24 ++--- > > > 2 files changed, 91 insertions(+), 35 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > > > index 8214a0b1ab7f..1aec019f6389 100644 > > > --- a/drivers/gpu/drm/drm_drv.c > > > +++ b/drivers/gpu/drm/drm_drv.c > > > @@ -35,6 +35,7 @@ > > > #include > > > #include > > > > > > +#include > > > #include > > > #include > > > #include > > > @@ -90,6 +91,8 @@ static struct drm_minor **drm_minor_get_slot(struct > > > drm_device *dev, > > > return &dev->primary; > > > case DRM_MINOR_RENDER: > > > return &dev->render; > > > + case DRM_MINOR_ACCEL: > > > + return &dev->accel; > > > default: > > > BUG(); > > > } > > > @@ -104,9 +107,13 @@ static void drm_minor_alloc_release(struct > > > drm_device *dev, void *data) > > > > > > put_device(minor->kdev); > > > > > > - spin_lock_irqsave(&drm_minor_lock, flags); > > > - idr_remove(&drm_minors_idr, minor->index); > > > - spin_unlock_irqrestore(&drm_minor_lock, flags); > > > + if (minor->type == DRM_MINOR_ACCEL) { > > > + accel_minor_remove(minor->index); > > > + } else { > > > + spin_lock_irqsave(&drm_minor_lock, flags); > > > + idr_remove(&drm_minors_idr, minor->index); > > > + spin_unlock_irqrestore(&drm_minor_lock, flags); > > > + } > > > } > > > > > > static int drm_minor_alloc(struct drm_device *dev, unsigned int type) > > > @@ -123,13 +130,17 @@ static int drm_minor_alloc(struct drm_device *dev, > > > unsigned int type) > > > minor->dev = dev; > > > > > > idr_preload(GFP_KERNEL); > > > - spin_lock_irqsave(&drm_minor_lock, flags); > > > - r = idr_alloc(&drm_minors_idr, > > > - NULL, > > > - 64 * type, > > > - 64 * (type + 1), > > > - GFP_NOWAIT); > > > - spin_unlock_irqrestore(&drm_minor_lock, flags); > > > + if (type == DRM_MINOR_ACCEL) { > > > + r = accel_minor_alloc(); > > > + } else { > > > + spin_lock_irqsave(&drm_minor_lock, flags); > > > + r = idr_alloc(&drm_minors_idr, > > > + NULL, > > > + 64 * type, > > > + 64 * (type + 1), > > > + GFP_NOWAIT); > > > + spin_unlock_irqrestore(&drm_minor_lock, flags); > > > + } > > > idr_preload_end(); > > > > > > if (r < 0) > > > @@ -161,10 +172,14 @@ static int drm_minor_register(struct drm_device > > > *dev, unsigned int type) > > > if (!minor) > > > return 0; > > > > > > - ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root); > > > - if (ret) { > > > - DRM_ERROR("DRM: Failed to initialize > > > /sys/kernel/debug/dri.\n"); > > > - goto err_debugfs; > > > + if (minor->type == DRM_MINOR_ACCEL) { > > > + accel_debugfs_init(minor, minor->index); > > > + } else { > > > + ret = drm_debugfs_init(minor, minor->index, > > > drm_debugfs_root); > > > + if (ret) { > > > + DRM_ERROR("DRM: Failed to initialize > > > /sys/kernel/debug/dri.\n"); > > > + goto err_debugfs; > > > + } > > > } > > > > > > ret = device_add(minor->kdev); > > > @@ -172,9 +187,13 @@ static int drm_minor_register(struct drm_device > > > *dev, unsigned int type) > > > goto err_debugfs; > > > > > > /* replace NULL with @minor so lookups will succeed from now on */ > > > - spin_lock_irqsave(&drm_minor_lock, flags); > > > - idr_replace(&drm_minors_idr, minor, minor->index); > > > - spin_unlock_irqrestore(&drm_minor_lock, flags); > > > + if (minor->type == DRM_MINOR_ACCEL) { > > > + accel_min
Re: [PATCH v4 07/10] dt-bindings: phy: Add Cadence HDP-TX DP PHY
On 21/11/2022 08:23, Sandor Yu wrote: > Add bindings for Cadence HDP-TX DisplayPort PHY. > > Signed-off-by: Sandor Yu > ---> .../bindings/phy/cdns,hdptx-dp-phy.yaml | 68 +++ > 1 file changed, 68 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/phy/cdns,hdptx-dp-phy.yaml > > diff --git a/Documentation/devicetree/bindings/phy/cdns,hdptx-dp-phy.yaml > b/Documentation/devicetree/bindings/phy/cdns,hdptx-dp-phy.yaml > new file mode 100644 > index ..b997c15ff0bb > --- /dev/null > +++ b/Documentation/devicetree/bindings/phy/cdns,hdptx-dp-phy.yaml > @@ -0,0 +1,68 @@ > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/phy/cdns,hdptx-dp-phy.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Cadence HDP-TX(HDMI/DisplayPort) PHY for DisplayPort protocol > + > +maintainers: > + - Sandor Yu > + > +properties: > + compatible: > +enum: > + - cdns,hdptx-dp-phy > + > + reg: > +maxItems: 1 > + > + clocks: > +items: > + - description: PHY reference clock. > + - description: APB clock. > + > + clock-names: > +items: > + - const: refclk > + - const: apbclk Drop "clk" suffix. Best regards, Krzysztof
Re: [PATCH v4 09/10] dt-bindings: phy: Add Cadence HDP-TX HDMI PHY
On 21/11/2022 08:23, Sandor Yu wrote: > Add bindings for Cadence HDP-TX HDMI PHY. > > Signed-off-by: Sandor Yu > --- > .../bindings/phy/cdns,hdptx-hdmi-phy.yaml | 52 +++ > 1 file changed, 52 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/phy/cdns,hdptx-hdmi-phy.yaml > > diff --git a/Documentation/devicetree/bindings/phy/cdns,hdptx-hdmi-phy.yaml > b/Documentation/devicetree/bindings/phy/cdns,hdptx-hdmi-phy.yaml > new file mode 100644 > index ..99352e655eec > --- /dev/null > +++ b/Documentation/devicetree/bindings/phy/cdns,hdptx-hdmi-phy.yaml > @@ -0,0 +1,52 @@ > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/phy/cdns,hdptx-hdmi-phy.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Cadence HDP-TX(HDMI/DisplayPort) PHY for HDMI protocol > + > +maintainers: > + - Sandor Yu > + > +properties: > + compatible: > +enum: > + - cdns,hdptx-hdmi-phy > + > + reg: > +maxItems: 1 > + > + clocks: > +items: > + - description: PHY reference clock. > + - description: APB clock. > + > + clock-names: > +items: > + - const: refclk > + - const: apbclk Drop "clk" suffix. Best regards, Krzysztof
Re: [PATCH v4 3/4] drm: initialize accel framework
11/22, Oded Gabbay wrote: > On Tue, Nov 22, 2022 at 12:59 PM Oded Gabbay wrote: > > > > On Tue, Nov 22, 2022 at 12:56 PM Melissa Wen wrote: > > > > > > On 11/19, Oded Gabbay wrote: > > > > Now that we have the accel framework code ready, let's call the > > > > accel functions from all the appropriate places. These places are the > > > > drm module init/exit functions, and all the drm_minor handling > > > > functions. > > > > > > Hi Oded, > > > > > > The proposal overall LGTM, but I didn't manage to compile the kernel > > > with your patch series when DRM is enabled but accel support doesn't. > > > Multiple missing link errors... > > > > > > Am I missing something? Can you take a look at it from this patch 3/4? > > > > > > Thanks, > > > > > > Melissa > > Looking at it now, thanks for letting me know. > > Oded > Found the issue, missing } at end of accel_debugfs_init() in > drm_accel.h. Only compiles when accel support isn't compiled in. > I'll fix it before sending the PR to Dave. > Much appreciated :) > Oded Oh, great! Just checked here and everything is fine now. With that, you can add my r-b for the entire series too Reviewed-by: Melissa Wen Thanks for working on it, Melissa > > > > > > > > > > > Signed-off-by: Oded Gabbay > > > > --- > > > > drivers/gpu/drm/drm_drv.c | 102 ++-- > > > > drivers/gpu/drm/drm_sysfs.c | 24 ++--- > > > > 2 files changed, 91 insertions(+), 35 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > > > > index 8214a0b1ab7f..1aec019f6389 100644 > > > > --- a/drivers/gpu/drm/drm_drv.c > > > > +++ b/drivers/gpu/drm/drm_drv.c > > > > @@ -35,6 +35,7 @@ > > > > #include > > > > #include > > > > > > > > +#include > > > > #include > > > > #include > > > > #include > > > > @@ -90,6 +91,8 @@ static struct drm_minor **drm_minor_get_slot(struct > > > > drm_device *dev, > > > > return &dev->primary; > > > > case DRM_MINOR_RENDER: > > > > return &dev->render; > > > > + case DRM_MINOR_ACCEL: > > > > + return &dev->accel; > > > > default: > > > > BUG(); > > > > } > > > > @@ -104,9 +107,13 @@ static void drm_minor_alloc_release(struct > > > > drm_device *dev, void *data) > > > > > > > > put_device(minor->kdev); > > > > > > > > - spin_lock_irqsave(&drm_minor_lock, flags); > > > > - idr_remove(&drm_minors_idr, minor->index); > > > > - spin_unlock_irqrestore(&drm_minor_lock, flags); > > > > + if (minor->type == DRM_MINOR_ACCEL) { > > > > + accel_minor_remove(minor->index); > > > > + } else { > > > > + spin_lock_irqsave(&drm_minor_lock, flags); > > > > + idr_remove(&drm_minors_idr, minor->index); > > > > + spin_unlock_irqrestore(&drm_minor_lock, flags); > > > > + } > > > > } > > > > > > > > static int drm_minor_alloc(struct drm_device *dev, unsigned int type) > > > > @@ -123,13 +130,17 @@ static int drm_minor_alloc(struct drm_device > > > > *dev, unsigned int type) > > > > minor->dev = dev; > > > > > > > > idr_preload(GFP_KERNEL); > > > > - spin_lock_irqsave(&drm_minor_lock, flags); > > > > - r = idr_alloc(&drm_minors_idr, > > > > - NULL, > > > > - 64 * type, > > > > - 64 * (type + 1), > > > > - GFP_NOWAIT); > > > > - spin_unlock_irqrestore(&drm_minor_lock, flags); > > > > + if (type == DRM_MINOR_ACCEL) { > > > > + r = accel_minor_alloc(); > > > > + } else { > > > > + spin_lock_irqsave(&drm_minor_lock, flags); > > > > + r = idr_alloc(&drm_minors_idr, > > > > + NULL, > > > > + 64 * type, > > > > + 64 * (type + 1), > > > > + GFP_NOWAIT); > > > > + spin_unlock_irqrestore(&drm_minor_lock, flags); > > > > + } > > > > idr_preload_end(); > > > > > > > > if (r < 0) > > > > @@ -161,10 +172,14 @@ static int drm_minor_register(struct drm_device > > > > *dev, unsigned int type) > > > > if (!minor) > > > > return 0; > > > > > > > > - ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root); > > > > - if (ret) { > > > > - DRM_ERROR("DRM: Failed to initialize > > > > /sys/kernel/debug/dri.\n"); > > > > - goto err_debugfs; > > > > + if (minor->type == DRM_MINOR_ACCEL) { > > > > + accel_debugfs_init(minor, minor->index); > > > > + } else { > > > > + ret = drm_debugfs_init(minor, minor->index, > > > > drm_debugfs_root); > > > > + if (ret) { > > > > + DRM_ERROR("DRM: Failed to initialize > > > > /sys/kernel/debug/dri.\n"); > > > > + goto err_debugfs; > > > > + } > > > > } > > > > > > > > ret = device_add(minor->kdev); > > > > @@ -172,9 +187,13 @@ st
[PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()
As comment of pci_get_class() says, it returns a pci_device with its refcount increased and decreased the refcount for the input parameter @from if it is not NULL. If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we need to call pci_dev_put() to decrease the refcount. Add the missing pci_dev_put() to avoid refcount leak. Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with ATRM") Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX handler (v3)") Signed-off-by: Xiongfeng Wang --- drivers/gpu/drm/radeon/radeon_bios.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/radeon/radeon_bios.c b/drivers/gpu/drm/radeon/radeon_bios.c index 33121655d50b..2df6ce3e32cb 100644 --- a/drivers/gpu/drm/radeon/radeon_bios.c +++ b/drivers/gpu/drm/radeon/radeon_bios.c @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device *rdev) if (!found) return false; + pci_dev_put(pdev); rdev->bios = kmalloc(size, GFP_KERNEL); if (!rdev->bios) { -- 2.20.1
[PATCH 0/2] drm: Fix PCI device refcount leak
As comment of pci_get_class() says, it returns a pci_device with its refcount increased and decreased the refcount for the input parameter @from if it is not NULL. If we use pci_get_class() to iterate all the PCI device of a certain class, when we find the expected device and break the iteration loop, the refcount of the found PCI device is increased. When finish using the PCI device, we need to call pci_dev_put() to decrease the refcount. Xiongfeng Wang (2): drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios() drm/amdgpu: Fix PCI device refcount leak in amdgpu_atrm_get_bios() drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 1 + drivers/gpu/drm/radeon/radeon_bios.c | 1 + 2 files changed, 2 insertions(+) -- 2.20.1
[PATCH 2/2] drm/amdgpu: Fix PCI device refcount leak in amdgpu_atrm_get_bios()
As comment of pci_get_class() says, it returns a pci_device with its refcount increased and decreased the refcount for the input parameter @from if it is not NULL. If we break the loop in amdgpu_atrm_get_bios() with 'pdev' not NULL, we need to call pci_dev_put() to decrease the refcount. Add the missing pci_dev_put() to avoid refcount leak. Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)") Signed-off-by: Xiongfeng Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c index e363f56c72af..30c28a69e847 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c @@ -317,6 +317,7 @@ static bool amdgpu_atrm_get_bios(struct amdgpu_device *adev) if (!found) return false; + pci_dev_put(pdev); adev->bios = kmalloc(size, GFP_KERNEL); if (!adev->bios) { -- 2.20.1
[git pull v2] new subsystem for compute accelerator devices
Hi Dave, Resending the pull-request with the fix of the build when CONFIG_DRM_ACCEL is set to N. Thanks, Oded The following changes since commit fc58764bbf602b65a6f63c53e5fd6feae76c510c: Merge tag 'amd-drm-next-6.2-2022-11-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next (2022-11-22 13:41:11 +1000) are available in the Git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git tags/drm-accel-2022-11-22 for you to fetch changes up to 8c5577a5ccc632685e65168fc6890b72a779f93a: doc: add documentation for accel subsystem (2022-11-22 13:14:52 +0200) This tag contains the patches that add the new compute acceleration subsystem, which is part of the DRM subsystem. The patches: - Add a new directory at drivers/accel. - Add a new major (261) for compute accelerators. - Add a new DRM minor type for compute accelerators. - Integrate the accel core code with DRM core code. - Add documentation for the accel subsystem. Oded Gabbay (4): drivers/accel: define kconfig and register a new major accel: add dedicated minor for accelerator devices drm: initialize accel framework doc: add documentation for accel subsystem Documentation/accel/index.rst | 17 ++ Documentation/accel/introduction.rst | 110 Documentation/admin-guide/devices.txt | 5 + Documentation/subsystem-apis.rst | 1 + MAINTAINERS | 9 + drivers/Kconfig | 2 + drivers/accel/Kconfig | 24 +++ drivers/accel/drm_accel.c | 323 ++ drivers/gpu/drm/Makefile | 1 + drivers/gpu/drm/drm_drv.c | 101 --- drivers/gpu/drm/drm_file.c| 2 +- drivers/gpu/drm/drm_sysfs.c | 24 ++- include/drm/drm_accel.h | 97 ++ include/drm/drm_device.h | 3 + include/drm/drm_drv.h | 8 + include/drm/drm_file.h| 21 ++- 16 files changed, 711 insertions(+), 37 deletions(-) create mode 100644 Documentation/accel/index.rst create mode 100644 Documentation/accel/introduction.rst create mode 100644 drivers/accel/Kconfig create mode 100644 drivers/accel/drm_accel.c create mode 100644 include/drm/drm_accel.h
Re: [regression][6.0] After commit b261509952bc19d1012cf732f853659be6ebc61e I see WARNING message at drivers/gpu/drm/drm_modeset_lock.c:276 drm_modeset_drop_locks+0x63/0x70
On Thu, Oct 13, 2022 at 6:36 PM Mikhail Gavrilov wrote: > > Hi! > I bisected an issue of the 6.0 kernel which started happening after > 6.0-rc7 on all my machines. > > Backtrace of this issue looks like as: > > [ 2807.339439] [ cut here ] > [ 2807.339445] WARNING: CPU: 11 PID: 2061 at > drivers/gpu/drm/drm_modeset_lock.c:276 > drm_modeset_drop_locks+0x63/0x70 > > bisect points to this commit: b261509952bc19d1012cf732f853659be6ebc61e. > > After reverting this commit the WARNING messages described here disappeared. > Hi Harry, Christian says that you can help with it. Thanks. -- Best Regards, Mike Gavrilov.
[PATCH] drm/amd/display: Fix set scaling doesn's work
[Why] Setting scaling does not correctly update CRTC state. As a result dc stream state's src (composition area) && dest (addressable area) was not calculated as expected. This causes set scaling doesn's work. [How] Correctly update CRTC state when setting scaling property. Signed-off-by: hongao diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 3e1ecca72430..a88a6f758748 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -9386,8 +9386,8 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, goto fail; } - if (dm_old_con_state->abm_level != - dm_new_con_state->abm_level) + if (dm_old_con_state->abm_level != dm_new_con_state->abm_level || + dm_old_con_state->scaling != dm_new_con_state->scaling) new_crtc_state->connectors_changed = true; } -- 2.20.1
[PATCH linux-next] drm/i915/gvt: use sysfs_streq() instead of strncmp()
From: Xu Panda Replace the open-code with sysfs_streq(). Signed-off-by: Xu Panda Signed-off-by: Yang Yang --- drivers/gpu/drm/i915/gvt/cmd_parser.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c index 0ebf5fbf0e39..7b9a5f7f5363 100644 --- a/drivers/gpu/drm/i915/gvt/cmd_parser.c +++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c @@ -917,8 +917,8 @@ static int cmd_reg_handler(struct parser_exec_state *s, return -EBADRQC; } - if (!strncmp(cmd, "srm", 3) || - !strncmp(cmd, "lrm", 3)) { + if (sysfs_streq(cmd, "srm") || + sysfs_streq(cmd, "lrm")) { if (offset == i915_mmio_reg_offset(GEN8_L3SQCREG4) || offset == 0x21f0 || (IS_BROADWELL(gvt->gt->i915) && @@ -931,8 +931,8 @@ static int cmd_reg_handler(struct parser_exec_state *s, } } - if (!strncmp(cmd, "lrr-src", 7) || - !strncmp(cmd, "lrr-dst", 7)) { + if (sysfs_streq(cmd, "lrr-src") || + sysfs_streq(cmd, "lrr-dst")) { if (IS_BROADWELL(gvt->gt->i915) && offset == 0x215c) return 0; else { @@ -941,12 +941,12 @@ static int cmd_reg_handler(struct parser_exec_state *s, } } - if (!strncmp(cmd, "pipe_ctrl", 9)) { + if (sysfs_streq(cmd, "pipe_ctrl")) { /* TODO: add LRI POST logic here */ return 0; } - if (strncmp(cmd, "lri", 3)) + if (!sysfs_streq(cmd, "lri")) return -EPERM; /* below are all lri handlers */ @@ -1011,7 +1011,7 @@ static int cmd_reg_handler(struct parser_exec_state *s, */ if (GRAPHICS_VER(s->engine->i915) == 9 && intel_gvt_mmio_is_sr_in_ctx(gvt, offset) && - !strncmp(cmd, "lri", 3)) { + sysfs_streq(cmd, "lri")) { intel_gvt_read_gpa(s->vgpu, s->workload->ring_context_gpa + 12, &ctx_sr_ctl, 4); /* check inhibit context */ -- 2.15.2
Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage
Hi Tomasz, David, On 11/8/22 05:45, Tomasz Figa wrote: > Hi David, > > On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand wrote: >> >> FOLL_FORCE is really only for debugger access. According to commit >> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >> writable"), the pinned pages are always writable. > > Actually that patch is only a workaround to temporarily disable > support for read-only pages as they seemed to suffer from some > corruption issues in the retrieved user pages. We expect to support > read-only pages as hardware input after. That said, FOLL_FORCE doesn't > sound like the right thing even in that case, but I don't know the > background behind it being added here in the first place. +Hans > Verkuil +Marek Szyprowski do you happen to remember anything about it? I tracked the use of 'force' all the way back to the first git commit (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the reason is lost in the mists of time. I'm not sure if the 'force' argument of get_user_pages() at that time even meant the same as FOLL_FORCE today. From what I can tell it has just been faithfully used ever since, but I have my doubt that anyone understands the reason behind it since it was never explained. Looking at this old LWN article https://lwn.net/Articles/28548/ suggests that it might be related to calling get_user_pages for write buffers (non-zero write argument) where you also want to be able to read from the buffer. That is certainly something that some drivers need to do post-capture fixups. But 'force' was also always set for read buffers, and I don't know if that was something that was actually needed, or just laziness. I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still allow drivers to read from the buffer? Regards, Hans > > Best regards, > Tomasz > >> >> FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove >> it. >> >> Cc: Tomasz Figa >> Cc: Marek Szyprowski >> Cc: Mauro Carvalho Chehab >> Signed-off-by: David Hildenbrand >> --- >> drivers/media/common/videobuf2/frame_vector.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/media/common/videobuf2/frame_vector.c >> b/drivers/media/common/videobuf2/frame_vector.c >> index 542dde9d2609..062e98148c53 100644 >> --- a/drivers/media/common/videobuf2/frame_vector.c >> +++ b/drivers/media/common/videobuf2/frame_vector.c >> @@ -50,7 +50,7 @@ int get_vaddr_frames(unsigned long start, unsigned int >> nr_frames, >> start = untagged_addr(start); >> >> ret = pin_user_pages_fast(start, nr_frames, >> - FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, >> + FOLL_WRITE | FOLL_LONGTERM, >> (struct page **)(vec->ptrs)); >> if (ret > 0) { >> vec->got_ref = true; >> -- >> 2.38.1 >>
Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage
On 22.11.22 13:25, Hans Verkuil wrote: Hi Tomasz, David, On 11/8/22 05:45, Tomasz Figa wrote: Hi David, On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand wrote: FOLL_FORCE is really only for debugger access. According to commit 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always writable"), the pinned pages are always writable. Actually that patch is only a workaround to temporarily disable support for read-only pages as they seemed to suffer from some corruption issues in the retrieved user pages. We expect to support read-only pages as hardware input after. That said, FOLL_FORCE doesn't sound like the right thing even in that case, but I don't know the background behind it being added here in the first place. +Hans Verkuil +Marek Szyprowski do you happen to remember anything about it? I tracked the use of 'force' all the way back to the first git commit (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the reason is lost in the mists of time. I'm not sure if the 'force' argument of get_user_pages() at that time even meant the same as FOLL_FORCE today. From what I can tell it has just been faithfully used ever since, but I have my doubt that anyone understands the reason behind it since it was never explained. Looking at this old LWN article https://lwn.net/Articles/28548/ suggests that it might be related to calling get_user_pages for write buffers (non-zero write argument) where you also want to be able to read from the buffer. That is certainly something that some drivers need to do post-capture fixups. But 'force' was also always set for read buffers, and I don't know if that was something that was actually needed, or just laziness. I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still allow drivers to read from the buffer? Yes. The only problematic corner case I can imagine is if someone has a VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin user space pages as a read buffer. We'd specify now FOLL_WRITE without FOLL_FORCE and GUP would reject that: write access without write permissions is invalid. There would be no way around "fixing" this implementation to not specify FOLL_WRITE when only reading from user-space pages. Not sure what the implications are regarding that corruption that was mentioned in 707947247e95. Having said that, I assume such a scenario is unlikely -- but you might know better how user space usually uses this interface. There would be three options: 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and leave the implementation as is for now. 3) Remove FOLL_FORCE and fixup the implementation to only specify FOLL_WRITE if the pages will actually get written to. 3) would most probably ideal, however, I am no expert on that code and can't do it (707947247e95 confuses me). So naive me would go with 2) first. -- Thanks, David / dhildenb
Re: Coverity: intel_hti_uses_phy(): Integer handling issues
On Fri, 18 Nov 2022, coverity-bot wrote: > Hello! > > This is an experimental semi-automated report about issues detected by > Coverity from a scan of next-20221118 as part of the linux-next scan project: > https://scan.coverity.com/projects/linux-next-weekly-scan > > You're getting this email because you were associated with the identified > lines of code (noted below) that were touched by commits: > > Thu Nov 17 16:12:56 2022 +0200 > 62749912540b ("drm/i915/display: move hti under display sub-struct") > > Coverity reported the following: > > *** CID 1527374: Integer handling issues (BAD_SHIFT) > drivers/gpu/drm/i915/display/intel_hti.c:24 in intel_hti_uses_phy() > 18if (INTEL_INFO(i915)->display.has_hti) > 19i915->display.hti.state = intel_de_read(i915, > HDPORT_STATE); > 20 } > 21 > 22 bool intel_hti_uses_phy(struct drm_i915_private *i915, enum phy phy) > 23 { > vvv CID 1527374: Integer handling issues (BAD_SHIFT) > vvv In expression "1UL << 2 * phy + 1", shifting by a negative amount has > undefined behavior. The shift amount, "2 * phy + 1", is as little as -1. > 24return i915->display.hti.state & HDPORT_ENABLED && > 25i915->display.hti.state & HDPORT_DDI_USED(phy); > 26 } > 27 > 28 u32 intel_hti_dpll_mask(struct drm_i915_private *i915) > 29 { > > If this is a false positive, please let us know so we can mark it as > such, or teach the Coverity rules to be smarter. If not, please make > sure fixes get into linux-next. :) For patches fixing this, please > include these lines (but double-check the "Fixes" first): > > Reported-by: coverity-bot > Addresses-Coverity-ID: 1527374 ("Integer handling issues") > Fixes: 62749912540b ("drm/i915/display: move hti under display sub-struct") Thanks for the report, fix at [1]. I realize I didn't use the suggested tags above. For one thing, we've never really logged any proprietary tools used. Looks like "Addresses-Coverity-ID:" is growing in popularity though. The Fixes: tag points at code refactoring, it was a pre-existing condition. BR, Jani. [1] https://patchwork.freedesktop.org/patch/msgid/20221122120948.3436180-1-jani.nik...@intel.com > > This code appears to be safe currently (intel_hti_uses_phy() is never > called with PHY_NONE), but perhaps add an explicit check? > > if (WARN_ON(phy == PHY_NONE)) > return false; > > Thanks for your attention! -- Jani Nikula, Intel Open Source Graphics Center
[no subject]
Date: Tue, 22 Nov 2022 15:51:44 +0300 Subject: [PATCH] powerplay: hwmgr: added result check The return value of the 'div64_s64' function called in ppevvmath.h:371 was not checked. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Denis Arefev --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h index dac29fe6cfc6..82aa7130d143 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h @@ -357,6 +357,7 @@ static fInt fDivide (fInt X, fInt Y) { fInt fZERO, fQuotient; int64_t longlongX, longlongY; + int64_t result; fZERO = ConvertToFraction(0); @@ -367,10 +368,11 @@ static fInt fDivide (fInt X, fInt Y) longlongY = (int64_t)Y.full; longlongX = longlongX << 16; /*Q(16,16) -> Q(32,32) */ + longlongY = longlongY << 16; - div64_s64(longlongX, longlongY); /*Q(32,32) divided by Q(16,16) = Q(16,16) Back to original format */ + result = div64_s64(longlongX, longlongY); - fQuotient.full = (int)longlongX; + fQuotient = ConvertToFraction((int)result); return fQuotient; } -- 2.25.1
[MAINTAINER TOOLS] docs: updated rules for topic/core-for-CI commit management
Introduce stricter rules for topic/core-for-CI management. Way too many commits have been added over the years, with insufficient rationale recorded in the commit message, and insufficient follow-up with removing the commits from the topic branch. New rules: 1. Require maintainer ack for rebase. Have better gating on when rebases happen and on which baselines. 2. Require maintainer/committer ack for adding/removing commits. No single individual should decide. 3. Require gitlab issues for new commits added. Improve tracking for removing the commits. Also use the stronger "must" for commit message requiring the justification for the commit being in topic/core-for-CI. Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: Tvrtko Ursulin Cc: David Airlie Cc: Daniel Vetter Cc: intel-...@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: dim-to...@lists.freedesktop.org Signed-off-by: Jani Nikula --- drm-tip.rst | 27 --- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/drm-tip.rst b/drm-tip.rst index deae95cdd2fe..24036e2ef576 100644 --- a/drm-tip.rst +++ b/drm-tip.rst @@ -203,11 +203,13 @@ justified exception. The primary goal is to fix issues originating from Linus' tree. Issues that would need drm-next or other DRM subsystem tree as baseline should be fixed in the offending DRM subsystem tree. -Only rebase the branch if you really know what you're doing. When in doubt, ask -the maintainers. You'll need to be able to handle any conflicts in non-drm code -while rebasing. +Only rebase the branch if you really know what you're doing. You'll need to be +able to handle any conflicts in non-drm code while rebasing. -Simply drop fixes that are already available in the new baseline. +Always ask for maintainer ack before rebasing. IRC ack is sufficient. + +Simply drop fixes that are already available in the new baseline. Close the +associated gitlab issue when removing commits. Force pushing a rebased topic/core-for-CI requires passing the ``--force`` parameter to git:: @@ -225,11 +227,22 @@ judgement call. Only add or remove commits if you really know what you're doing. When in doubt, ask the maintainers. -Apply new commits on top with regular push. The commit message needs to explain -why the patch has been applied to topic/core-for-CI. If it's a cherry-pick from +Always ask for maintainer/committer ack before adding/removing commits. IRC ack +is sufficient. Record the ``Acked-by:`` in commits being added. + +Apply new commits on top with regular push. The commit message must explain why +the patch has been applied to topic/core-for-CI. If it's a cherry-pick from another subsystem, please reference the commit with ``git cherry-pick -x`` option. If it's a patch from another subsystem, please reference the patch on the mailing list with ``Link:`` tag. +New commits always need an associated gitlab issue for tracking purposes. The +goal is to have as few commits in topic/core-for-CI as possible, and we need to +be able to track the progress in making that happen. Reference the issue with +``References:`` tag. Add the ``core-for-CI`` label to the issue. (Note: Do not +use ``Closes:`` because the logic here is backwards; the issue is having the +commit in the branch in the first place.) + Instead of applying reverts, just remove the commit. This implies ``git rebase --i`` on the current baseline; see directions above. +-i`` on the current baseline; see directions above. Close the associated gitlab +issue when removing commits. -- 2.34.1
Re: Build regressions/improvements in v6.1-rc6
On Tue, Nov 22, 2022 at 5:56 AM Geert Uytterhoeven wrote: > > On Tue, 22 Nov 2022, Geert Uytterhoeven wrote: > > JFYI, when comparing v6.1-rc6[1] to v6.1-rc5[3], the summaries are: > > - build errors: +6/-0 > >+ /kisskb/src/arch/sh/include/asm/io.h: error: cast to pointer from > integer of different size [-Werror=int-to-pointer-cast]: => 239:34 > > sh4-gcc11/sh-allmodconfig (in cvm_oct_free_hw_memory()) > >+ /kisskb/src/arch/um/include/asm/processor-generic.h: error: called > object is not a function or function pointer: => 94:18 >+ /kisskb/src/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c: error: > control reaches end of non-void function [-Werror=return-type]: => 1934:1 > > um-x86_64/um-all{mod,yes}config (in kfd_cpumask_to_apic_id()) Presumably cpu_data is not defined on um-x86_64? Does it even make sense to build drivers on um-x86_64? Alex > >+ /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: > 'X86_VENDOR_AMD' undeclared (first use in this function): => 149:37 >+ /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: 'struct > cpuinfo_um' has no member named 'x86_vendor': => 149:22 >+ /kisskb/src/drivers/infiniband/hw/qib/qib_wc_x86_64.c: error: control > reaches end of non-void function [-Werror=return-type]: => 150:1 > > um-x86_64/um-allyesconfig > > > [1] > > http://kisskb.ellerman.id.au/kisskb/branch/linus/head/eb7081409f94a9a8608593d0fb63a1aa3d6f95d8/ > > (all 149 configs) > > [3] > > http://kisskb.ellerman.id.au/kisskb/branch/linus/head/094226ad94f471a9f19e8f8e7140a09c2625abaa/ > > (all 149 configs) > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > ge...@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like > that. > -- Linus Torvalds
Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage
On 11/22/22 13:38, David Hildenbrand wrote: > On 22.11.22 13:25, Hans Verkuil wrote: >> Hi Tomasz, David, >> >> On 11/8/22 05:45, Tomasz Figa wrote: >>> Hi David, >>> >>> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand wrote: FOLL_FORCE is really only for debugger access. According to commit 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always writable"), the pinned pages are always writable. >>> >>> Actually that patch is only a workaround to temporarily disable >>> support for read-only pages as they seemed to suffer from some >>> corruption issues in the retrieved user pages. We expect to support >>> read-only pages as hardware input after. That said, FOLL_FORCE doesn't >>> sound like the right thing even in that case, but I don't know the >>> background behind it being added here in the first place. +Hans >>> Verkuil +Marek Szyprowski do you happen to remember anything about it? >> >> I tracked the use of 'force' all the way back to the first git commit >> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the >> reason is lost in the mists of time. >> >> I'm not sure if the 'force' argument of get_user_pages() at that time >> even meant the same as FOLL_FORCE today. From what I can tell it has just >> been faithfully used ever since, but I have my doubt that anyone understands >> the reason behind it since it was never explained. >> >> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests >> that it might be related to calling get_user_pages for write buffers >> (non-zero write argument) where you also want to be able to read from the >> buffer. That is certainly something that some drivers need to do post-capture >> fixups. >> >> But 'force' was also always set for read buffers, and I don't know if that >> was something that was actually needed, or just laziness. >> >> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still >> allow drivers to read from the buffer? > > Yes. The only problematic corner case I can imagine is if someone has a > VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin > user space pages as a read buffer. We'd specify now FOLL_WRITE without > FOLL_FORCE and GUP would reject that: write access without write > permissions is invalid. I do not believe this will be an issue. > > There would be no way around "fixing" this implementation to not specify > FOLL_WRITE when only reading from user-space pages. Not sure what the > implications are regarding that corruption that was mentioned in > 707947247e95. Before 707947247e95 the FOLL_WRITE flag was only set for write buffers (i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output, DMA_TO_DEVICE). In the video output case there should never be any need for drivers to write to the buffer to the best of my knowledge. But I have had some complaints about that commit that it causes problems in some scenarios, and it has been on my todo list for quite some time now to dig deeper into this. I probably should prioritize this for this or next week. > > Having said that, I assume such a scenario is unlikely -- but you might > know better how user space usually uses this interface. There would be > three options: > > 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. > 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and > leave the implementation as is for now. > 3) Remove FOLL_FORCE and fixup the implementation to only specify > FOLL_WRITE if the pages will actually get written to. > > 3) would most probably ideal, however, I am no expert on that code and > can't do it (707947247e95 confuses me). So naive me would go with 2) first. > Option 3 would be best. And 707947247e95 confuses me as well, and I actually wrote it :-) I am wondering whether it was addressed at the right level, but as I said, I need to dig a bit deeper into this. Regards, Hans
Re: [PATCH mm-unstable v1 06/20] mm: rework handling in do_wp_page() based on private vs. shared mappings
On 11/16/22 11:26, David Hildenbrand wrote: > We want to extent FAULT_FLAG_UNSHARE support to anything mapped into a > COW mapping (pagecache page, zeropage, PFN, ...), not just anonymous pages. > Let's prepare for that by handling shared mappings first such that we can > handle private mappings last. > > While at it, use folio-based functions instead of page-based functions > where we touch the code either way. > > Signed-off-by: David Hildenbrand Reviewed-by: Vlastimil Babka
Re: Try to address the DMA-buf coherency problem
On Fri, Nov 18, 2022 at 11:32:19AM -0800, Rob Clark wrote: > On Thu, Nov 17, 2022 at 7:38 AM Nicolas Dufresne wrote: > > > > Le jeudi 17 novembre 2022 à 13:10 +0100, Christian König a écrit : > > > > > DMA-Buf let's the exporter setup the DMA addresses the importer uses > > > > > to > > > > > be able to directly decided where a certain operation should go. E.g. > > > > > we > > > > > have cases where for example a P2P write doesn't even go to memory, > > > > > but > > > > > rather a doorbell BAR to trigger another operation. Throwing in CPU > > > > > round trips for explicit ownership transfer completely breaks that > > > > > concept. > > > > It sounds like we should have a dma_dev_is_coherent_with_dev() which > > > > accepts two (or an array?) of devices and tells the caller whether the > > > > devices need explicit ownership transfer. > > > > > > No, exactly that's the concept I'm pushing back on very hard here. > > > > > > In other words explicit ownership transfer is not something we would > > > want as requirement in the framework, cause otherwise we break tons of > > > use cases which require concurrent access to the underlying buffer. > > > > I'm not pushing for this solution, but really felt the need to correct you > > here. > > I have quite some experience with ownership transfer mechanism, as this is > > how > > GStreamer framework works since 2000. Concurrent access is a really common > > use > > cases and it is quite well defined in that context. The bracketing system > > (in > > this case called map() unmap(), with flag stating the usage intention like > > reads > > and write) is combined the the refcount. The basic rules are simple: > > This is all CPU oriented, I think Christian is talking about the case > where ownership transfer happens without CPU involvement, such as via > GPU waiting on a fence Yeah for pure device2device handover the rule pretty much has to be that any coherency management that needs to be done must be done from the device side (flushing device side caches and stuff like that) only. But under the assumption that _all_ cpu side management has been done already before the first device access started. And then the map/unmap respectively begin/end_cpu_access can be used what it was meant for with cpu side invalidation/flushing and stuff like that, while having pretty clear handover/ownership rules and hopefully not doing no unecessary flushes. And all that while allowing device acces to be pipelined. Worst case the exporter has to insert some pipelined cache flushes as a dma_fence pipelined work of its own between the device access when moving from one device to the other. That last part sucks a bit right now, because we don't have any dma_buf_attachment method which does this syncing without recreating the mapping, but in reality this is solved by caching mappings in the exporter (well dma-buf layer) nowadays. True concurrent access like vk/compute expects is still a model that dma-buf needs to support on top, but that's a special case and pretty much needs hw that supports such concurrent access without explicit handover and fencing. Aside from some historical accidents and still a few warts, I do think dma-buf does support both of these models. Of course in the case of gpu/drm drivers, userspace must know whats possible and act accordingly, otherwise you just get to keep all the pieces. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
[PATCH v2] drm/mediatek: Clean dangling pointer on bind error path
mtk_drm_bind() can fail, in which case drm_dev_put() is called, destroying the drm_device object. However a pointer to it was still being held in the private object, and that pointer would be passed along to DRM in mtk_drm_sys_prepare() if a suspend were triggered at that point, resulting in a panic. Clean the pointer when destroying the object in the error path to prevent this from happening. Fixes: 119f5173628a ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.") Signed-off-by: Nícolas F. R. A. Prado --- Changes in v2: - Added Fixes tag drivers/gpu/drm/mediatek/mtk_drm_drv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c index 39a42dc8fb85..a21ff1b3258c 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c @@ -514,6 +514,7 @@ static int mtk_drm_bind(struct device *dev) err_deinit: mtk_drm_kms_deinit(drm); err_free: + private->drm = NULL; drm_dev_put(drm); return ret; } -- 2.38.1
Re: [PATCH 7/7] drm/fb-helper: Don't use the preferred depth for the BPP default
On Wed, Nov 16, 2022 at 05:09:17PM +0100, Thomas Zimmermann wrote: > If no preferred value for bits-per-pixel has been given, fall back > to 32. Never use the preferred depth. The color depth is the number > of color/alpha bits per pixel, while bpp is the overall number of > bits in most cases. > > Most noteworthy, XRGB has a depth of 24 and a bpp value of 32. > Using depth for bpp would make the value 24 as well and format > selection in fbdev helpers fails. Unfortunately XRGB is the most > common format and the old heuristic therefore fails for most of > the drivers (unless they implement the 24-bit RGB888 format). > > Picking a bpp of 32 will lateron result in a default depth of 24 > and the format XRGB. As XRGB is the default format for most > of the current and legacy graphics stack, all drivers must support > it. So it is the safe choice. > > Signed-off-by: Thomas Zimmermann > --- > drivers/gpu/drm/drm_fbdev_generic.c | 15 +-- > 1 file changed, 9 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/drm_fbdev_generic.c > b/drivers/gpu/drm/drm_fbdev_generic.c > index ab86956692795..0a4c160e0e58a 100644 > --- a/drivers/gpu/drm/drm_fbdev_generic.c > +++ b/drivers/gpu/drm/drm_fbdev_generic.c > @@ -431,7 +431,6 @@ static const struct drm_client_funcs > drm_fbdev_client_funcs = { > * drm_fbdev_generic_setup() - Setup generic fbdev emulation > * @dev: DRM device > * @preferred_bpp: Preferred bits per pixel for the device. > - * @dev->mode_config.preferred_depth is used if this is zero. > * > * This function sets up generic fbdev emulation for drivers that supports > * dumb buffers with a virtual address and that can be mmap'ed. > @@ -475,12 +474,16 @@ void drm_fbdev_generic_setup(struct drm_device *dev, > } > > /* > - * FIXME: This mixes up depth with bpp, which results in a glorious > - * mess, resulting in some drivers picking wrong fbdev defaults and > - * others wrong preferred_depth defaults. > + * Pick a preferred bpp of 32 if no value has been given. This > + * will select XRGB for the framebuffer formats. All drivers > + * have to support XRGB for backwards compatibility with legacy > + * userspace, so it's the safe choice here. > + * > + * TODO: Replace struct drm_mode_config.preferred_depth and this > + * bpp value with a preferred format that is given as struct > + * drm_format_info. Then derive all other values from the > + * format. I concur on this being the right design. What I've attempted years ago, but never managed to finish, is sort the formats list on the primary plane in preference order (since that seems useful for other reasons), and then let everyone derive the preferred_whatever from the first format of the first primary plane automatically. But doing that is a pretty huge refactor, since you get to audit every driver. So I kinda gave up on that. But I still thing something in that direction would be a good design overall, since then userspace could also use the same trick to infer format preferences ... Anyway on the series, since it pushes in a direction I wanted to fix years ago but gave up because too ambitious :-) Acked-by: Daniel Vetter >*/ > - if (!preferred_bpp) > - preferred_bpp = dev->mode_config.preferred_depth; > if (!preferred_bpp) > preferred_bpp = 32; > fb_helper->preferred_bpp = preferred_bpp; > -- > 2.38.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/doc: make drm-uapi igt-tests more readable
On Fri, Nov 18, 2022 at 03:51:37PM -0800, Randy Dunlap wrote: > Correct grammar and make the use of the igt-tests more readable. > > Signed-off-by: Randy Dunlap > Cc: Gabriela Bittencourt > Cc: Rodrigo Siqueira > Cc: David Airlie > Cc: Daniel Vetter > Cc: Maarten Lankhorst > Cc: Maxime Ripard > Cc: Thomas Zimmermann > Cc: dri-devel@lists.freedesktop.org > Cc: Jonathan Corbet Pushed to drm-misc-next, thanks for your patch. -Daniel > --- > Documentation/gpu/drm-uapi.rst | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff -- a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst > --- a/Documentation/gpu/drm-uapi.rst > +++ b/Documentation/gpu/drm-uapi.rst > @@ -402,19 +402,19 @@ It's possible to run the IGT-tests in a > 1. Use IGT inside a VM > 2. Use IGT from the host machine and write the results in a shared > directory. > > -As follow, there is an example of using a VM with a shared directory with > -the host machine to run igt-tests. As an example it's used virtme:: > +Following is an example of using a VM with a shared directory with > +the host machine to run igt-tests. This example uses virtme:: > > $ virtme-run --rwdir /path/for/shared_dir > --kdir=path/for/kernel/directory --mods=auto > > -Run the igt-tests in the guest machine, as example it's ran the 'kms_flip' > +Run the igt-tests in the guest machine. This example runs the 'kms_flip' > tests:: > > $ /path/for/igt-gpu-tools/scripts/run-tests.sh -p -s -t "kms_flip.*" -v > > -In this example, instead of build the igt_runner, Piglit is used > -(-p option); it's created html summary of the tests results and it's saved > -in the folder "igt-gpu-tools/results"; it's executed only the igt-tests > +In this example, instead of building the igt_runner, Piglit is used > +(-p option). It creates an HTML summary of the test results and saves > +them in the folder "igt-gpu-tools/results". It executes only the igt-tests > matching the -t option. > > Display CRC Support -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()
On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang wrote: > > As comment of pci_get_class() says, it returns a pci_device with its > refcount increased and decreased the refcount for the input parameter > @from if it is not NULL. > > If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we > need to call pci_dev_put() to decrease the refcount. Add the missing > pci_dev_put() to avoid refcount leak. For both patches, I think pci_dev_put() needs to go into the loops. There are 2 or more GPUs on the systems where this is relevant. Alex > > Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with ATRM") > Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX > handler (v3)") > Signed-off-by: Xiongfeng Wang > --- > drivers/gpu/drm/radeon/radeon_bios.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/radeon/radeon_bios.c > b/drivers/gpu/drm/radeon/radeon_bios.c > index 33121655d50b..2df6ce3e32cb 100644 > --- a/drivers/gpu/drm/radeon/radeon_bios.c > +++ b/drivers/gpu/drm/radeon/radeon_bios.c > @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device > *rdev) > > if (!found) > return false; > + pci_dev_put(pdev); > > rdev->bios = kmalloc(size, GFP_KERNEL); > if (!rdev->bios) { > -- > 2.20.1 >
Re: [PATCH mm-unstable v1 07/20] mm: don't call vm_ops->huge_fault() in wp_huge_pmd()/wp_huge_pud() for private mappings
On 11/16/22 11:26, David Hildenbrand wrote: > If we already have a PMD/PUD mapped write-protected in a private mapping > and we want to break COW either due to FAULT_FLAG_WRITE or > FAULT_FLAG_UNSHARE, there is no need to inform the file system just like on > the PTE path. > > Let's just split (->zap) + fallback in that case. > > This is a preparation for more generic FAULT_FLAG_UNSHARE support in > COW mappings. > > Signed-off-by: David Hildenbrand Reviewed-by: Vlastimil Babka Nits: > --- > mm/memory.c | 24 +++- > 1 file changed, 15 insertions(+), 9 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index c35e6cd32b6a..d47ad33c6487 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -4802,6 +4802,7 @@ static inline vm_fault_t create_huge_pmd(struct > vm_fault *vmf) > static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf) > { > const bool unshare = vmf->flags & FAULT_FLAG_UNSHARE; > + vm_fault_t ret; > > if (vma_is_anonymous(vmf->vma)) { > if (likely(!unshare) && > @@ -4809,11 +4810,13 @@ static inline vm_fault_t wp_huge_pmd(struct vm_fault > *vmf) > return handle_userfault(vmf, VM_UFFD_WP); > return do_huge_pmd_wp_page(vmf); > } > - if (vmf->vma->vm_ops->huge_fault) { > - vm_fault_t ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD); > > - if (!(ret & VM_FAULT_FALLBACK)) > - return ret; > + if (vmf->vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) { > + if (vmf->vma->vm_ops->huge_fault) { I guess it could have been a single if with && and the reduced identation could fit keeping 'ret' declaration inside. AFAICS the later patches don't build more on top of this anyway. But also fine keeping as is. (the hunk below same) > + ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD); > + if (!(ret & VM_FAULT_FALLBACK)) > + return ret; > + } > } > > /* COW or write-notify handled on pte level: split pmd. */ > @@ -4839,14 +4842,17 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, > pud_t orig_pud) > { > #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \ > defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) > + vm_fault_t ret; > + > /* No support for anonymous transparent PUD pages yet */ > if (vma_is_anonymous(vmf->vma)) > goto split; > - if (vmf->vma->vm_ops->huge_fault) { > - vm_fault_t ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD); > - > - if (!(ret & VM_FAULT_FALLBACK)) > - return ret; > + if (vmf->vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) { > + if (vmf->vma->vm_ops->huge_fault) { > + ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD); > + if (!(ret & VM_FAULT_FALLBACK)) > + return ret; > + } > } > split: > /* COW or write-notify not handled on PUD level: split pud.*/
Re: [syzbot] inconsistent lock state in sync_info_debugfs_show
On Fri, Nov 18, 2022 at 06:46:38PM -0800, syzbot wrote: > syzbot has found a reproducer for the following issue on: > > HEAD commit:84368d882b96 Merge tag 'soc-fixes-6.1-3' of git://git.kern.. > git tree: upstream > console+strace: https://syzkaller.appspot.com/x/log.txt?x=1670fb6588 > kernel config: https://syzkaller.appspot.com/x/.config?x=6f4e5e9899396248 > dashboard link: https://syzkaller.appspot.com/bug?extid=007bfe0f3330f6e1e7d1 > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils > for Debian) 2.35.2 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=164376f988 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16cf096588 > > Downloadable assets: > disk image: > https://storage.googleapis.com/syzbot-assets/031b6e68785d/disk-84368d88.raw.xz > vmlinux: > https://storage.googleapis.com/syzbot-assets/cff5e66b90e8/vmlinux-84368d88.xz > kernel image: > https://storage.googleapis.com/syzbot-assets/e75525784a66/bzImage-84368d88.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+007bfe0f3330f6e1e...@syzkaller.appspotmail.com > > > WARNING: inconsistent lock state > 6.1.0-rc5-syzkaller-00144-g84368d882b96 #0 Not tainted > > inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. > syz-executor333/3645 [HC0[0]:SC0[0]:HE0:SE1] takes: > 8d295c38 (sync_timeline_list_lock){?...}-{2:2}, at: spin_lock_irq > include/linux/spinlock.h:375 [inline] > 8d295c38 (sync_timeline_list_lock){?...}-{2:2}, at: > sync_info_debugfs_show+0x31/0x200 drivers/dma-buf/sync_debug.c:147 > {IN-HARDIRQ-W} state was registered at: > lock_acquire kernel/locking/lockdep.c:5668 [inline] > lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633 > __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] > _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162 > sync_timeline_debug_remove+0x29/0x1a0 drivers/dma-buf/sync_debug.c:31 > sync_timeline_free drivers/dma-buf/sw_sync.c:104 [inline] > kref_put include/linux/kref.h:65 [inline] > sync_timeline_put drivers/dma-buf/sw_sync.c:116 [inline] > timeline_fence_release+0x267/0x340 drivers/dma-buf/sw_sync.c:144 > dma_fence_release+0x14b/0x690 drivers/dma-buf/dma-fence.c:559 Do we need to just generally push all dma_fence_release finalization to an irq work to untangle this mess once and for all? Or is this not just plain old recursion? Christian, any good ideas? -Daniel > kref_put include/linux/kref.h:65 [inline] > dma_fence_put include/linux/dma-fence.h:276 [inline] > dma_fence_array_release+0x1fa/0x2d0 drivers/dma-buf/dma-fence-array.c:120 > dma_fence_release+0x14b/0x690 drivers/dma-buf/dma-fence.c:559 > kref_put include/linux/kref.h:65 [inline] > dma_fence_put include/linux/dma-fence.h:276 [inline] > irq_dma_fence_array_work+0xa9/0xd0 drivers/dma-buf/dma-fence-array.c:52 > irq_work_single+0x124/0x260 kernel/irq_work.c:211 > irq_work_run_list kernel/irq_work.c:242 [inline] > irq_work_run_list+0x91/0xc0 kernel/irq_work.c:225 > irq_work_run+0x58/0xd0 kernel/irq_work.c:251 > __sysvec_irq_work+0xce/0x4e0 arch/x86/kernel/irq_work.c:22 > sysvec_irq_work+0x92/0xc0 arch/x86/kernel/irq_work.c:17 > asm_sysvec_irq_work+0x1a/0x20 arch/x86/include/asm/idtentry.h:675 > __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:160 [inline] > _raw_spin_unlock_irq+0x29/0x50 kernel/locking/spinlock.c:202 > spin_unlock_irq include/linux/spinlock.h:400 [inline] > sw_sync_debugfs_release+0x162/0x240 drivers/dma-buf/sw_sync.c:321 > __fput+0x27c/0xa90 fs/file_table.c:320 > task_work_run+0x16f/0x270 kernel/task_work.c:179 > ptrace_notify+0x118/0x140 kernel/signal.c:2354 > ptrace_report_syscall include/linux/ptrace.h:420 [inline] > ptrace_report_syscall_exit include/linux/ptrace.h:482 [inline] > syscall_exit_work kernel/entry/common.c:251 [inline] > syscall_exit_to_user_mode_prepare+0x129/0x280 kernel/entry/common.c:278 > __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline] > syscall_exit_to_user_mode+0xd/0x50 kernel/entry/common.c:296 > do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > irq event stamp: 638 > hardirqs last enabled at (637): [] > ___slab_alloc+0xca0/0x1400 mm/slub.c:3132 > hardirqs last disabled at (638): [] __raw_spin_lock_irq > include/linux/spinlock_api_smp.h:117 [inline] > hardirqs last disabled at (638): [] > _raw_spin_lock_irq+0x45/0x50 kernel/locking/spinlock.c:170 > softirqs last enabled at (538): [] invoke_softirq > kernel/softirq.c:445 [inline] > softirqs last enabled at (538): [] > __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650 > softirqs last disabled at (505): [] invoke_softirq > kernel/softirq.c:445 [inline] > softirqs last disabled at (505): [] > __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650 > > other info that migh
[Bug 216727] Failure to wake up from suspend to RAM
https://bugzilla.kernel.org/show_bug.cgi?id=216727 Alex Deucher (alexdeuc...@gmail.com) changed: What|Removed |Added CC||alexdeuc...@gmail.com --- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) --- Does this patch help? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d4de331f1b24a22d18e3c6116aa25228cf54854 -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH v4 0/4] new subsystem for compute accelerator devices
On Tue, Nov 22, 2022 at 03:46:25PM +1000, Dave Airlie wrote: > On Tue, 22 Nov 2022 at 09:06, Sonal Santan wrote: > > > > On 11/19/22 12:44, Oded Gabbay wrote: > > > This is the fourth (and hopefully last) version of the patch-set to add > > > the > > > new subsystem for compute accelerators. I removed the RFC headline as > > > I believe it is now ready for merging. > > > > > > Compare to v3, this patch-set contains one additional patch that adds > > > documentation regarding the accel subsystem. I hope it's good enough for > > > this stage. In addition, there were few very minor fixes according to > > > comments received on v3. > > > > > > The patches are in the following repo: > > > https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v4 > > > > > > As in v3, The HEAD of that branch is a commit adding a dummy driver that > > > registers an accel device using the new framework. This can be served > > > as a simple reference. > > > > > > v1 cover letter: > > > https://lkml.org/lkml/2022/10/22/544 > > > > > > v2 cover letter: > > > https://lore.kernel.org/lkml/20221102203405.1797491-1-ogab...@kernel.org/T/ > > > > > > v3 cover letter: > > > https://lore.kernel.org/lkml/20221106210225.2065371-1-ogab...@kernel.org/T/ > > > > Thanks for defining the new accel subsystem. We are currently working on > > DRM based drivers for unannounced acceleration devices. I am fine with > > these changes with the assumption that the choice of using classic DRM > > or accel is left up to the individual driver. > > I don't think that decision should be up to any individual driver > author. It will have to be consensus with me/Daniel/Oded and the > driver authors. Plus the entire point of this is that it's _still_ a drm based driver. So aside from changing a flag in the kernel driver and adjusting userspace to find the right chardev, there should be zero changes need for an existing drm based driver stack that gets ported to drivers/accel. And of course if we realize there's issues as we add drivers, we can fix things up. This is just to kick things off, not something that's going to be cast in stone for all eternity. Sonal, with that clarification/explanation, is this entire thing reasonable in principal and you can drop an Ack onto the series? Thanks, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
[Bug 216727] Failure to wake up from suspend to RAM
https://bugzilla.kernel.org/show_bug.cgi?id=216727 --- Comment #2 from Martin (martin...@gmx.com) --- (In reply to Alex Deucher from comment #1) > Does this patch help? > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/ > ?id=8d4de331f1b24a22d18e3c6116aa25228cf54854 Thanks, I'll try and let you know tomorrow. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: [MAINTAINER TOOLS] docs: updated rules for topic/core-for-CI commit management
On Tue, Nov 22, 2022 at 03:17:14PM +0200, Jani Nikula wrote: > Introduce stricter rules for topic/core-for-CI management. Way too many > commits have been added over the years, with insufficient rationale > recorded in the commit message, and insufficient follow-up with removing > the commits from the topic branch. > > New rules: > > 1. Require maintainer ack for rebase. Have better gating on when rebases >happen and on which baselines. > > 2. Require maintainer/committer ack for adding/removing commits. No >single individual should decide. > > 3. Require gitlab issues for new commits added. Improve tracking for >removing the commits. > > Also use the stronger "must" for commit message requiring the > justification for the commit being in topic/core-for-CI. > > Cc: Joonas Lahtinen > Cc: Rodrigo Vivi > Cc: Tvrtko Ursulin > Cc: David Airlie > Cc: Daniel Vetter > Cc: intel-...@lists.freedesktop.org > Cc: dri-devel@lists.freedesktop.org > Cc: dim-to...@lists.freedesktop.org > Signed-off-by: Jani Nikula Reviewed-by: Daniel Vetter > --- > drm-tip.rst | 27 --- > 1 file changed, 20 insertions(+), 7 deletions(-) > > diff --git a/drm-tip.rst b/drm-tip.rst > index deae95cdd2fe..24036e2ef576 100644 > --- a/drm-tip.rst > +++ b/drm-tip.rst > @@ -203,11 +203,13 @@ justified exception. The primary goal is to fix issues > originating from Linus' > tree. Issues that would need drm-next or other DRM subsystem tree as baseline > should be fixed in the offending DRM subsystem tree. > > -Only rebase the branch if you really know what you're doing. When in doubt, > ask > -the maintainers. You'll need to be able to handle any conflicts in non-drm > code > -while rebasing. > +Only rebase the branch if you really know what you're doing. You'll need to > be > +able to handle any conflicts in non-drm code while rebasing. > > -Simply drop fixes that are already available in the new baseline. > +Always ask for maintainer ack before rebasing. IRC ack is sufficient. > + > +Simply drop fixes that are already available in the new baseline. Close the > +associated gitlab issue when removing commits. > > Force pushing a rebased topic/core-for-CI requires passing the ``--force`` > parameter to git:: > @@ -225,11 +227,22 @@ judgement call. > Only add or remove commits if you really know what you're doing. When in > doubt, > ask the maintainers. > > -Apply new commits on top with regular push. The commit message needs to > explain > -why the patch has been applied to topic/core-for-CI. If it's a cherry-pick > from > +Always ask for maintainer/committer ack before adding/removing commits. IRC > ack > +is sufficient. Record the ``Acked-by:`` in commits being added. > + > +Apply new commits on top with regular push. The commit message must explain > why > +the patch has been applied to topic/core-for-CI. If it's a cherry-pick from > another subsystem, please reference the commit with ``git cherry-pick -x`` > option. If it's a patch from another subsystem, please reference the patch on > the mailing list with ``Link:`` tag. > > +New commits always need an associated gitlab issue for tracking purposes. The > +goal is to have as few commits in topic/core-for-CI as possible, and we need > to > +be able to track the progress in making that happen. Reference the issue with > +``References:`` tag. Add the ``core-for-CI`` label to the issue. (Note: Do > not > +use ``Closes:`` because the logic here is backwards; the issue is having the > +commit in the branch in the first place.) > + > Instead of applying reverts, just remove the commit. This implies ``git > rebase > --i`` on the current baseline; see directions above. > +-i`` on the current baseline; see directions above. Close the associated > gitlab > +issue when removing commits. > -- > 2.34.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()
On 11/22/2022 8:19 PM, Alex Deucher wrote: On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang wrote: As comment of pci_get_class() says, it returns a pci_device with its refcount increased and decreased the refcount for the input parameter @from if it is not NULL. If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we need to call pci_dev_put() to decrease the refcount. Add the missing pci_dev_put() to avoid refcount leak. For both patches, I think pci_dev_put() needs to go into the loops. There are 2 or more GPUs on the systems where this is relevant. As per the logic, the loop breaks when it finds a valid ATRM handle. So dev_put is required only for that device. When inside the loop this happens - "decreased the refcount for the input parameter @from if it is not NULL" Thanks, Lijo Alex Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with ATRM") Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX handler (v3)") Signed-off-by: Xiongfeng Wang --- drivers/gpu/drm/radeon/radeon_bios.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/radeon/radeon_bios.c b/drivers/gpu/drm/radeon/radeon_bios.c index 33121655d50b..2df6ce3e32cb 100644 --- a/drivers/gpu/drm/radeon/radeon_bios.c +++ b/drivers/gpu/drm/radeon/radeon_bios.c @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device *rdev) if (!found) return false; + pci_dev_put(pdev); rdev->bios = kmalloc(size, GFP_KERNEL); if (!rdev->bios) { -- 2.20.1
Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage
On 22.11.22 15:07, Hans Verkuil wrote: On 11/22/22 13:38, David Hildenbrand wrote: On 22.11.22 13:25, Hans Verkuil wrote: Hi Tomasz, David, On 11/8/22 05:45, Tomasz Figa wrote: Hi David, On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand wrote: FOLL_FORCE is really only for debugger access. According to commit 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always writable"), the pinned pages are always writable. Actually that patch is only a workaround to temporarily disable support for read-only pages as they seemed to suffer from some corruption issues in the retrieved user pages. We expect to support read-only pages as hardware input after. That said, FOLL_FORCE doesn't sound like the right thing even in that case, but I don't know the background behind it being added here in the first place. +Hans Verkuil +Marek Szyprowski do you happen to remember anything about it? I tracked the use of 'force' all the way back to the first git commit (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the reason is lost in the mists of time. I'm not sure if the 'force' argument of get_user_pages() at that time even meant the same as FOLL_FORCE today. From what I can tell it has just been faithfully used ever since, but I have my doubt that anyone understands the reason behind it since it was never explained. Looking at this old LWN article https://lwn.net/Articles/28548/ suggests that it might be related to calling get_user_pages for write buffers (non-zero write argument) where you also want to be able to read from the buffer. That is certainly something that some drivers need to do post-capture fixups. But 'force' was also always set for read buffers, and I don't know if that was something that was actually needed, or just laziness. I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still allow drivers to read from the buffer? Yes. The only problematic corner case I can imagine is if someone has a VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin user space pages as a read buffer. We'd specify now FOLL_WRITE without FOLL_FORCE and GUP would reject that: write access without write permissions is invalid. I do not believe this will be an issue. There would be no way around "fixing" this implementation to not specify FOLL_WRITE when only reading from user-space pages. Not sure what the implications are regarding that corruption that was mentioned in 707947247e95. Before 707947247e95 the FOLL_WRITE flag was only set for write buffers (i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output, DMA_TO_DEVICE). In the video output case there should never be any need for drivers to write to the buffer to the best of my knowledge. But I have had some complaints about that commit that it causes problems in some scenarios, and it has been on my todo list for quite some time now to dig deeper into this. I probably should prioritize this for this or next week. Having said that, I assume such a scenario is unlikely -- but you might know better how user space usually uses this interface. There would be three options: 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and leave the implementation as is for now. 3) Remove FOLL_FORCE and fixup the implementation to only specify FOLL_WRITE if the pages will actually get written to. 3) would most probably ideal, however, I am no expert on that code and can't do it (707947247e95 confuses me). So naive me would go with 2) first. Option 3 would be best. And 707947247e95 confuses me as well, and I actually wrote it :-) I am wondering whether it was addressed at the right level, but as I said, I need to dig a bit deeper into this. Cool, let me know if I can help! -- Thanks, David / dhildenb
Re: [PATCH v2 2/4] usb: gadget: hid: Convert to use list_count()
On Mon, Nov 14, 2022 at 04:15:35PM -0300, Fabio Estevam wrote: > On Mon, Nov 14, 2022 at 1:22 PM Andy Shevchenko > wrote: > > > > The list API now provides the list_count() to help with counting > > existing nodes in the list. Uilise it. > > s/Uilise/Utilise Fixed for v3, thanks! -- With Best Regards, Andy Shevchenko
Re: [PATCH v2 1/4] i915: Move list_count() to list.h for broader use
On Tue, Nov 15, 2022 at 05:46:28PM +0200, Jani Nikula wrote: > On Mon, 14 Nov 2022, Andy Shevchenko > wrote: > > Some of the existing users, and definitely will be new ones, want to > > count existing nodes in the list. Provide a generic API for that by > > moving code from i915 to list.h. > > I think I'd find list_length() a much more natural name for this. i915 suggests my variant :-) > *shrug* > > Acked-by: Jani Nikula > > regardless of what you decide to do with name or static inline etc. Thanks! I will check which one looks and feels better and update for v3. -- With Best Regards, Andy Shevchenko
RE: [EXT] Re: [PATCH v4 02/10] dt-bindings: display: bridge: Add MHDP DP for i.MX8MQ
Hi Krzysztof, Thanks your comments, > -Original Message- > From: Krzysztof Kozlowski > Sent: 2022年11月22日 16:28 > To: Sandor Yu ; andrzej.ha...@intel.com; > neil.armstr...@linaro.org; robert.f...@linaro.org; > laurent.pinch...@ideasonboard.com; jo...@kwiboo.se; > jernej.skra...@gmail.com; airl...@gmail.com; dan...@ffwll.ch; > robh...@kernel.org; krzysztof.kozlowski...@linaro.org; > shawn...@kernel.org; s.ha...@pengutronix.de; feste...@gmail.com; > kis...@ti.com; vk...@kernel.org; dri-devel@lists.freedesktop.org; > devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; > linux-ker...@vger.kernel.org; linux-...@lists.infradead.org; > alexander.st...@ew.tq-group.com > Cc: ker...@pengutronix.de; dl-linux-imx ; Oliver Brown > > Subject: [EXT] Re: [PATCH v4 02/10] dt-bindings: display: bridge: Add MHDP > DP for i.MX8MQ > > Caution: EXT Email > > On 21/11/2022 08:23, Sandor Yu wrote: > > Add bindings for i.MX8MQ MHDP DisplayPort. > > > > Signed-off-by: Sandor Yu > > --- > > .../display/bridge/cdns,mhdp-imx8mq-dp.yaml | 93 > +++ > > 1 file changed, 93 insertions(+) > > create mode 100644 > > > Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp.y > > aml > > > > diff --git > > > a/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp > > .yaml > > > b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8mq-dp > > .yaml > > new file mode 100644 > > index ..d82f3ceddaa8 > > --- /dev/null > > +++ > b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp-imx8m > > +++ q-dp.yaml > > @@ -0,0 +1,93 @@ > > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) %YAML 1.2 > > +--- > > +$id: > > +https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdevi > > > +cetree.org%2Fschemas%2Fdisplay%2Fbridge%2Fcdns%2Cmhdp-imx8mq-dp. > yaml% > > > +23&data=05%7C01%7CSandor.yu%40nxp.com%7C163690c8a8ab4f7a6 > c9208dac > > > +c6373c4%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C6380470 > 247542869 > > > +46%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luM > zIiLCJBTi > > > +I6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Eei5Nkp0 > Hl8SHpBLZU > > +1HsJDnWQujHvmPh2XMuC%2BSZ58%3D&reserved=0 > > +$schema: > > +https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdevi > > > +cetree.org%2Fmeta-schemas%2Fcore.yaml%23&data=05%7C01%7CSa > ndor.yu > > > +%40nxp.com%7C163690c8a8ab4f7a6c9208dacc6373c4%7C686ea1d3bc2b4 > c6fa92cd > > > +99c5c301635%7C0%7C0%7C638047024754286946%7CUnknown%7CTWFp > bGZsb3d8eyJW > > > +IjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7 > C3000 > > > +%7C%7C%7C&sdata=eZUnepnQHS9ewyWb9AHMht%2F%2BevutmOt > mTajL%2B5Ewwbs > > +%3D&reserved=0 > > + > > +title: Cadence MHDP Displayport bridge > > + > > +maintainers: > > + - Sandor Yu > > + > > +description: > > + The Cadence MHDP Displayport TX interface. > > + > > +properties: > > + compatible: > > +enum: > > + - cdns,mhdp-imx8mq-dp > > + > > + reg: > > +maxItems: 1 > > + > > + clocks: > > +maxItems: 1 > > +description: MHDP DP APB clock. > > + > > + phys: > > +maxItems: 1 > > + > > + interrupts: > > +items: > > + - description: Hotplug cable plugin. > > + - description: Hotplug cable plugout. > > + > > + interrupt-names: > > +items: > > + - const: plug_in > > + - const: plug_out > > + > > + ports: > > +$ref: /schemas/graph.yaml#/properties/ports > > + > > +properties: > > + port@0: > > +$ref: /schemas/graph.yaml#/properties/port > > +description: > > + Input port from display controller output. > > + port@1: > > +$ref: /schemas/graph.yaml#/properties/port > > +description: > > + Output port to DP connector. > > + > > +required: > > + - port@0 > > + > > +required: > > + - compatible > > + - reg > > + - clocks > > + - interrupts > > + - interrupt-names > > + - phys > > + - ports > > + > > +additionalProperties: false > > + > > +examples: > > + - | > > +#include > > +#include > > + > > +mhdp_dp: dp-bridge@32c0 { > > +compatible = "cdns,mhdp-imx8mq-dp"; > > +reg = <0x32c0 0x10>; > > +interrupts = , > > + ; > > +interrupt-names = "plug_in", "plug_out"; > > +clocks = <&clk IMX8MQ_CLK_DISP_APB_ROOT>; > > +phys = <&dp_phy>; > > + > > +ports { > > +#address-cells = <1>; > > +#size-cells = <0>; > > + > > +port@0 { > > +reg = <0>; > > + > > +mhdp_in: endpoint { > > +remote-endpoint = <&dcss_out>; > > +}; > > As Rob suggested, you allowed property for output to DP port. However it is > not in the example. If this is a bridge, what does it bridge if there is no > output > connector? > OK, I got it. Output connector will be added to the example in the next version. Thanks, Sandor > > Best r
[PATCH v3 1/4] i915: Move list_count() to list.h for broader use
Some of the existing users, and definitely will be new ones, want to count existing nodes in the list. Provide a generic API for that by moving code from i915 to list.h. Signed-off-by: Andy Shevchenko Acked-by: Jani Nikula --- v3: added tag (Jani), changed to be static inline (Mike) v2: dropped the duplicate code in i915 (LKP) drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 + include/linux/list.h | 15 +++ 2 files changed, 16 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 6ae8b07cfaa1..b5d474be564d 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -2085,17 +2085,6 @@ static void print_request_ring(struct drm_printer *m, struct i915_request *rq) } } -static unsigned long list_count(struct list_head *list) -{ - struct list_head *pos; - unsigned long count = 0; - - list_for_each(pos, list) - count++; - - return count; -} - static unsigned long read_ul(void *p, size_t x) { return *(unsigned long *)(p + x); @@ -2270,7 +2259,7 @@ void intel_engine_dump(struct intel_engine_cs *engine, spin_lock_irqsave(&engine->sched_engine->lock, flags); engine_dump_active_requests(engine, m); - drm_printf(m, "\tOn hold?: %lu\n", + drm_printf(m, "\tOn hold?: %zu\n", list_count(&engine->sched_engine->hold)); spin_unlock_irqrestore(&engine->sched_engine->lock, flags); diff --git a/include/linux/list.h b/include/linux/list.h index 61762054b4be..65aab596ae33 100644 --- a/include/linux/list.h +++ b/include/linux/list.h @@ -655,6 +655,21 @@ static inline void list_splice_tail_init(struct list_head *list, !list_is_head(pos, (head)); \ pos = n, n = pos->prev) +/** + * list_count - count nodes in the list + * @head: the head for your list. + */ +size_t list_count(struct list_head *head) +{ + struct list_head *pos; + size_t count = 0; + + list_for_each(pos, head) + count++; + + return count; +} + /** * list_entry_is_head - test if the entry points to the head of the list * @pos: the type * to cursor -- 2.35.1
[PATCH v3 3/4] usb: gadget: udc: bcm63xx: Convert to use list_count()
The list API now provides the list_count() to help with counting existing nodes in the list. Utilise it. Signed-off-by: Andy Shevchenko --- v3: no change v2: no change drivers/usb/gadget/udc/bcm63xx_udc.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/usb/gadget/udc/bcm63xx_udc.c b/drivers/usb/gadget/udc/bcm63xx_udc.c index 2cdb07905bde..0762e49e85f8 100644 --- a/drivers/usb/gadget/udc/bcm63xx_udc.c +++ b/drivers/usb/gadget/udc/bcm63xx_udc.c @@ -2172,7 +2172,6 @@ static int bcm63xx_iudma_dbg_show(struct seq_file *s, void *p) for (ch_idx = 0; ch_idx < BCM63XX_NUM_IUDMA; ch_idx++) { struct iudma_ch *iudma = &udc->iudma[ch_idx]; - struct list_head *pos; seq_printf(s, "IUDMA channel %d -- ", ch_idx); switch (iudma_defaults[ch_idx].ep_type) { @@ -2205,14 +2204,10 @@ static int bcm63xx_iudma_dbg_show(struct seq_file *s, void *p) seq_printf(s, " desc: %d/%d used", iudma->n_bds_used, iudma->n_bds); - if (iudma->bep) { - i = 0; - list_for_each(pos, &iudma->bep->queue) - i++; - seq_printf(s, "; %d queued\n", i); - } else { + if (iudma->bep) + seq_printf(s, "; %zu queued\n", list_count(&iudma->bep->queue)); + else seq_printf(s, "\n"); - } for (i = 0; i < iudma->n_bds; i++) { struct bcm_enet_desc *d = &iudma->bd_ring[i]; -- 2.35.1
[PATCH v3 2/4] usb: gadget: hid: Convert to use list_count()
The list API now provides the list_count() to help with counting existing nodes in the list. Utilise it. Signed-off-by: Andy Shevchenko --- v3: fixed typo in the commit message (Fabio) v2: no change drivers/usb/gadget/legacy/hid.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/usb/gadget/legacy/hid.c b/drivers/usb/gadget/legacy/hid.c index 1187ee4f316a..6196c3456e0b 100644 --- a/drivers/usb/gadget/legacy/hid.c +++ b/drivers/usb/gadget/legacy/hid.c @@ -133,14 +133,11 @@ static struct usb_configuration config_driver = { static int hid_bind(struct usb_composite_dev *cdev) { struct usb_gadget *gadget = cdev->gadget; - struct list_head *tmp; struct hidg_func_node *n = NULL, *m, *iter_n; struct f_hid_opts *hid_opts; - int status, funcs = 0; - - list_for_each(tmp, &hidg_func_list) - funcs++; + int status, funcs; + funcs = list_count(&hidg_func_list); if (!funcs) return -ENODEV; -- 2.35.1
[PATCH v3 4/4] xhci: Convert to use list_count()
The list API now provides the list_count() to help with counting existing nodes in the list. Utilise it. Signed-off-by: Andy Shevchenko --- v3: no change v2: no change drivers/usb/host/xhci-ring.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index ad81e9a508b1..817c31e3b0c8 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -2532,7 +2532,6 @@ static int handle_tx_event(struct xhci_hcd *xhci, union xhci_trb *ep_trb; int status = -EINPROGRESS; struct xhci_ep_ctx *ep_ctx; - struct list_head *tmp; u32 trb_comp_code; int td_num = 0; bool handling_skipped_tds = false; @@ -2580,10 +2579,8 @@ static int handle_tx_event(struct xhci_hcd *xhci, } /* Count current td numbers if ep->skip is set */ - if (ep->skip) { - list_for_each(tmp, &ep_ring->td_list) - td_num++; - } + if (ep->skip) + td_num += list_count(&ep_ring->td_list); /* Look for common error cases */ switch (trb_comp_code) { -- 2.35.1
Re: [PATCH mm-unstable v1 08/20] mm: extend FAULT_FLAG_UNSHARE support to anything in a COW mapping
On 11/16/22 11:26, David Hildenbrand wrote: > Extend FAULT_FLAG_UNSHARE to break COW on anything mapped into a > COW (i.e., private writable) mapping and adjust the documentation > accordingly. > > FAULT_FLAG_UNSHARE will now also break COW when encountering the shared > zeropage, a pagecache page, a PFNMAP, ... inside a COW mapping, by > properly replacing the mapped page/pfn by a private copy (an exclusive > anonymous page). > > Note that only do_wp_page() needs care: hugetlb_wp() already handles > FAULT_FLAG_UNSHARE correctly. wp_huge_pmd()/wp_huge_pud() also handles it > correctly, for example, splitting the huge zeropage on FAULT_FLAG_UNSHARE > such that we can handle FAULT_FLAG_UNSHARE on the PTE level. > > This change is a requirement for reliable long-term R/O pinning in > COW mappings. > > Signed-off-by: David Hildenbrand Reviewed-by: Vlastimil Babka > --- > include/linux/mm_types.h | 8 > mm/memory.c | 4 > 2 files changed, 4 insertions(+), 8 deletions(-) > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > index 5e7f4fac1e78..5e9aaad8c7b2 100644 > --- a/include/linux/mm_types.h > +++ b/include/linux/mm_types.h > @@ -1037,9 +1037,9 @@ typedef struct { > * @FAULT_FLAG_REMOTE: The fault is not for current task/mm. > * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch. > * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal > signals. > - * @FAULT_FLAG_UNSHARE: The fault is an unsharing request to unshare (and > mark > - * exclusive) a possibly shared anonymous page that is > - * mapped R/O. > + * @FAULT_FLAG_UNSHARE: The fault is an unsharing request to break COW in a > + * COW mapping, making sure that an exclusive anon page > is > + * mapped after the fault. > * @FAULT_FLAG_ORIG_PTE_VALID: whether the fault has vmf->orig_pte cached. > *We should only access orig_pte if this flag set. > * > @@ -1064,7 +1064,7 @@ typedef struct { > * > * The combination FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE is illegal. > * FAULT_FLAG_UNSHARE is ignored and treated like an ordinary read fault when > - * no existing R/O-mapped anonymous page is encountered. > + * applied to mappings that are not COW mappings. > */ > enum fault_flag { > FAULT_FLAG_WRITE = 1 << 0, > diff --git a/mm/memory.c b/mm/memory.c > index d47ad33c6487..56b21ab1e4d2 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3432,10 +3432,6 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) > } > wp_page_reuse(vmf); > return 0; > - } else if (unshare) { > - /* No anonymous page -> nothing to do. */ > - pte_unmap_unlock(vmf->pte, vmf->ptl); > - return 0; > } > copy: > /*
[PATCH] drm/amd/pm: added result check
The return value of the 'div64_s64' function called in ppevvmath.h:371 was not checked. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Denis Arefev --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h index dac29fe6cfc6..82aa7130d143 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h @@ -357,6 +357,7 @@ static fInt fDivide (fInt X, fInt Y) { fInt fZERO, fQuotient; int64_t longlongX, longlongY; + int64_t result; fZERO = ConvertToFraction(0); @@ -367,10 +368,11 @@ static fInt fDivide (fInt X, fInt Y) longlongY = (int64_t)Y.full; longlongX = longlongX << 16; /*Q(16,16) -> Q(32,32) */ + longlongY = longlongY << 16; - div64_s64(longlongX, longlongY); /*Q(32,32) divided by Q(16,16) = Q(16,16) Back to original format */ + result = div64_s64(longlongX, longlongY); - fQuotient.full = (int)longlongX; + fQuotient = ConvertToFraction((int)result); return fQuotient; } -- 2.25.1
Re: Build regressions/improvements in v6.1-rc6
On Tue, 2022-11-22 at 08:55 -0500, Alex Deucher wrote: > > > >+ /kisskb/src/arch/um/include/asm/processor-generic.h: error: called > > object is not a function or function pointer: => 94:18 > >+ /kisskb/src/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c: > > error: control reaches end of non-void function [-Werror=return-type]: => > > 1934:1 > > > > um-x86_64/um-all{mod,yes}config (in kfd_cpumask_to_apic_id()) > > Presumably cpu_data is not defined on um-x86_64? Does it even make > sense to build drivers on um-x86_64? Drivers in general yes ;-) This driver, probably not. But the issue is that a lot of drivers "depends on X86_64" or such, where only "X86" is the arch symbol. You could add "X86 && X86_64" to really build on x86 64-bit only. I didn't check this driver, but this has mostly popped up since UM got PCI support some time ago (which I added.) johannes
Re: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use
On Tue, Nov 22, 2022 at 05:35:13PM +0200, Andy Shevchenko wrote: > Some of the existing users, and definitely will be new ones, want to > count existing nodes in the list. Provide a generic API for that by > moving code from i915 to list.h. Sorry for the noise, forgot to squash one part, so it get's not compilable. -- With Best Regards, Andy Shevchenko
Re: [PATCH v28 01/11] dt-bindings: arm: mediatek: mmsys: add vdosys1 compatible for MT8195
On 22/11/2022 11:51, Nancy Lin (林欣螢) wrote: Dear Matthias, Thanks for the review. On Thu, 2022-11-10 at 14:10 +0100, Matthias Brugger wrote: On 09/11/2022 06:10, Jason-JH Lin (林睿祥) wrote: > On Tue, 2022-11-08 at 18:46 +0100, Matthias Brugger wrote: > > > > On 07/11/2022 08:22, Nancy.Lin wrote: > > > Add vdosys1 mmsys compatible for MT8195 platform. > > > > > > For MT8195, VDOSYS0 and VDOSYS1 are 2 display HW pipelines > > > binding > > > to > > > 2 different power domains, different clock drivers and > > > different > > > mediatek-drm drivers. > > > > > > Signed-off-by: Nancy.Lin > > > Reviewed-by: Nícolas F. R. A. Prado > > > --- > > > > > > .../devicetree/bindings/arm/mediatek/mediatek,mmsys.yaml | > > > 4 > > > +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > diff --git > > > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys > > > .yam > > > l > > > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys > > > .yam > > > l > > > index 0711f1834fbd..aaabe2196185 100644 > > > --- > > > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys > > > .yam > > > l > > > +++ > > > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys > > > .yam > > > l > > > @@ -48,7 +48,9 @@ properties: > > > - const: syscon > > > > > > - items: > > > - - const: mediatek,mt8195-vdosys0 > > > + - enum: > > > + - mediatek,mt8195-vdosys0 > > > + - mediatek,mt8195-vdosys1 > > > - const: mediatek,mt8195-mmsys > > > - const: syscon > > > > > > > I think we had that several times already: > > > > https://lore.kernel.org/all/6bbe9527-ae48-30e0-fb45-519223a74...@linaro.org/ > > > > We will something like this, but please check that this does not > > give > > any > > errors/warnings: > > > > diff --git > > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.y > > aml > > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.y > > aml > > index eb451bec23d3d..8e9c4f4d7c389 100644 > > --- > > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.y > > aml > > +++ > > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.y > > aml > > @@ -32,13 +32,22 @@ properties: > > - mediatek,mt8183-mmsys > > - mediatek,mt8186-mmsys > > - mediatek,mt8192-mmsys > > - - mediatek,mt8195-mmsys > > - mediatek,mt8365-mmsys > > - const: syscon > > - items: > > - const: mediatek,mt7623-mmsys > > - const: mediatek,mt2701-mmsys > > - const: syscon > > + - items: > > + - const: mediatek,mt8195-vdosys0 > > + - const: syscon > > + - items: > > + - const: mediatek,mt8195-vdosys1 > > + - const: syscon > > + - items: > > + - const: mediatek,mt8195-mmsys > > + - const: syscon > > + deprecated: true > > > > reg: > > maxItems: 1 > > Hi Matthias, > > As the vdosys0 previous reviewed patch: > > https://urldefense.com/v3/__https://patchwork.kernel.org/project/linux-mediatek/patch/20220927152704.12018-2-jason-jh@mediatek.com/__;!!CTRNKA9wMg0ARbw!zRdbIyyAsfqob2kapMAcKYATAguhEV0x0qE5cTOAcWUNhzeAbMHzZoos_2QzUCxS$ > > Should I modify the vdosys0 items format like your example? > > Or should vdosys1 add items format like vdosys0's previous patch? > - items: > - const: mediatek,mt8195-vdosys1 > - const: mediatek,mt8195-mmsys > - const: syscon > No, vdosys1 must not have mediatek,mt8195-mmsys fallback. Regards, Matthias I will fix it and add the following vdosys1 binding base on [1]. - description: vdosys0 and vdosys1 are 2 display HW pipelines, so mt8195 binding should be deprecated. deprecated: true items: - const: mediatek,mt8195-mmsys - const: syscon - items: - const: mediatek,mt7623-mmsys - const: mediatek,mt2701-mmsys - const: syscon - items: - const: mediatek,mt8195-vdosys0 - const: mediatek,mt8195-mmsys - const: syscon +- items: +- const: mediatek,mt8195-vdosys1 +- const: syscon Looks good, thanks Matthias [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/matthias.bgg/linux/+/b237efd47df7d25b78c306e90b97c5aa0ff4c4fc/Documentation/devicetree/bindings/arm/mediatek/mediatek%2Cmmsys.yaml Regards, Nancy > Regards, > Jason-JH.Lin > > > > * MEDIATEK Confidentiality Notice > > The information contained in this e-mail message (including any > attachments) may be confidential, proprietary, privileged, or > otherwise > exempt from disclosure under applicable laws. It is intended to be > conveyed only to the designated recipient(s). Any use, > disseminat
Re: git send-email friendly smtp provider anyone?
On Mon, Nov 21, 2022 at 07:13:28PM +0100, Noralf Trønnes wrote: > > Otherwise, you might consider using: > > https://b4.docs.kernel.org/en/latest/contributor/send.html#authenticating-with-the-web-submission-endpoint > > > > That's an interesting option. I did briefly look at b4 a few months back > but it looked like it was under heavy development so I figured I'd wait > before trying it out. I think I'll give b4 a spin to see how it works, I > wonder how it handles patch changelogs. I'd be happy to help set this up for you -- to date, this service hasn't been used beyond a few test posts. -K
Re: [PATCH v4 0/4] new subsystem for compute accelerator devices
On 11/20/2022 3:04 PM, Jeffrey Hugo wrote: On 11/19/2022 1:44 PM, Oded Gabbay wrote: This is the fourth (and hopefully last) version of the patch-set to add the new subsystem for compute accelerators. I removed the RFC headline as I believe it is now ready for merging. Compare to v3, this patch-set contains one additional patch that adds documentation regarding the accel subsystem. I hope it's good enough for this stage. In addition, there were few very minor fixes according to comments received on v3. The patches are in the following repo: https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v4 As in v3, The HEAD of that branch is a commit adding a dummy driver that registers an accel device using the new framework. This can be served as a simple reference. v1 cover letter: https://lkml.org/lkml/2022/10/22/544 v2 cover letter: https://lore.kernel.org/lkml/20221102203405.1797491-1-ogab...@kernel.org/T/ v3 cover letter: https://lore.kernel.org/lkml/20221106210225.2065371-1-ogab...@kernel.org/T/ Thanks, Oded. Reviewed-by: Jeffrey Hugo I have some nits. Nothing that I think should be a blocker for this series. I don't recall if I previously mentioned this. I'm planning on updating the QAIC series to be an accel driver. Therefore there should be at-least 1 user (probably several) for this subsystem in short order.
Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()
On Tue, Nov 22, 2022 at 9:59 AM Lazar, Lijo wrote: > > > > On 11/22/2022 8:19 PM, Alex Deucher wrote: > > On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang > > wrote: > >> > >> As comment of pci_get_class() says, it returns a pci_device with its > >> refcount increased and decreased the refcount for the input parameter > >> @from if it is not NULL. > >> > >> If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we > >> need to call pci_dev_put() to decrease the refcount. Add the missing > >> pci_dev_put() to avoid refcount leak. > > > > For both patches, I think pci_dev_put() needs to go into the loops. > > There are 2 or more GPUs on the systems where this is relevant. > > > > As per the logic, the loop breaks when it finds a valid ATRM handle. So > dev_put is required only for that device. Sure, but what if the handle is on the second DISPLAY_VGA or DISPLAY_OTHER class PCI device on the system? We've already called pci_get_class() for the first PCI device that matched. Alex > > When inside the loop this happens - "decreased the refcount for the > input parameter @from if it is not NULL" > > Thanks, > Lijo > > > Alex > > > >> > >> Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with > >> ATRM") > >> Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX > >> handler (v3)") > >> Signed-off-by: Xiongfeng Wang > >> --- > >> drivers/gpu/drm/radeon/radeon_bios.c | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/drivers/gpu/drm/radeon/radeon_bios.c > >> b/drivers/gpu/drm/radeon/radeon_bios.c > >> index 33121655d50b..2df6ce3e32cb 100644 > >> --- a/drivers/gpu/drm/radeon/radeon_bios.c > >> +++ b/drivers/gpu/drm/radeon/radeon_bios.c > >> @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device > >> *rdev) > >> > >> if (!found) > >> return false; > >> + pci_dev_put(pdev); > >> > >> rdev->bios = kmalloc(size, GFP_KERNEL); > >> if (!rdev->bios) { > >> -- > >> 2.20.1 > >>
Re: [PATCH mm-unstable v1 09/20] mm/gup: reliable R/O long-term pinning in COW mappings
On 11/16/22 11:26, David Hildenbrand wrote: > We already support reliable R/O pinning of anonymous memory. However, > assume we end up pinning (R/O long-term) a pagecache page or the shared > zeropage inside a writable private ("COW") mapping. The next write access > will trigger a write-fault and replace the pinned page by an exclusive > anonymous page in the process page tables to break COW: the pinned page no > longer corresponds to the page mapped into the process' page table. > > Now that FAULT_FLAG_UNSHARE can break COW on anything mapped into a > COW mapping, let's properly break COW first before R/O long-term > pinning something that's not an exclusive anon page inside a COW > mapping. FAULT_FLAG_UNSHARE will break COW and map an exclusive anon page > instead that can get pinned safely. > > With this change, we can stop using FOLL_FORCE|FOLL_WRITE for reliable > R/O long-term pinning in COW mappings. > > With this change, the new R/O long-term pinning tests for non-anonymous > memory succeed: > # [RUN] R/O longterm GUP pin ... with shared zeropage > ok 151 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP pin ... with memfd > ok 152 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP pin ... with tmpfile > ok 153 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP pin ... with huge zeropage > ok 154 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP pin ... with memfd hugetlb (2048 kB) > ok 155 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP pin ... with memfd hugetlb (1048576 kB) > ok 156 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP-fast pin ... with shared zeropage > ok 157 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP-fast pin ... with memfd > ok 158 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP-fast pin ... with tmpfile > ok 159 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP-fast pin ... with huge zeropage > ok 160 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP-fast pin ... with memfd hugetlb (2048 kB) > ok 161 Longterm R/O pin is reliable > # [RUN] R/O longterm GUP-fast pin ... with memfd hugetlb (1048576 kB) > ok 162 Longterm R/O pin is reliable > > Note 1: We don't care about short-term R/O-pinning, because they have > snapshot semantics: they are not supposed to observe modifications that > happen after pinning. > > As one example, assume we start direct I/O to read from a page and store > page content into a file: modifications to page content after starting > direct I/O are not guaranteed to end up in the file. So even if we'd pin > the shared zeropage, the end result would be as expected -- getting zeroes > stored to the file. > > Note 2: For shared mappings we'll now always fallback to the slow path to > lookup the VMA when R/O long-term pining. While that's the necessary price > we have to pay right now, it's actually not that bad in practice: most > FOLL_LONGTERM users already specify FOLL_WRITE, for example, along with > FOLL_FORCE because they tried dealing with COW mappings correctly ... > > Note 3: For users that use FOLL_LONGTERM right now without FOLL_WRITE, > such as VFIO, we'd now no longer pin the shared zeropage. Instead, we'd > populate exclusive anon pages that we can pin. There was a concern that > this could affect the memlock limit of existing setups. > > For example, a VM running with VFIO could run into the memlock limit and > fail to run. However, we essentially had the same behavior already in > commit 17839856fd58 ("gup: document and work around "COW can break either > way" issue") which got merged into some enterprise distros, and there were > not any such complaints. So most probably, we're fine. > > Signed-off-by: David Hildenbrand Reviewed-by: Vlastimil Babka
Re: [Intel-gfx] [PATCH v2 0/5] Add module oriented dmesg output
On 18.11.2022 11:52, Jani Nikula wrote: > On Thu, 17 Nov 2022, john.c.harri...@intel.com wrote: >> From: John Harrison >> >> When trying to analyse bug reports from CI, customers, etc. it can be >> difficult to work out exactly what is happening on which GT in a >> multi-GT system. So add GT oriented debug/error message wrappers. If >> used instead of the drm_ equivalents, you get the same output but with >> a GT# prefix on it. >> >> It was also requested to extend this further to submodules in order to >> factor out the repeated structure accessing constructs and common >> string prefixes. So, add versions for GuC, HuC and GuC CTB as well. >> >> This patch set updates all the gt/uc files to use the new helpers as a >> first step. The intention would be to convert all output messages that >> have access to a GT structure. >> >> v2: Go back to using lower case names, add more wrapper sets (combined >> review feedback). Also, wrap up probe injection and WARN entries. >> >> Signed-off-by: John Harrison > > For adding the wrappers in general, I'm going to disagree and > commit. I'll leave it up to Tvrtko and Joonas. > > Regarding the placement of the macros, I insist you add individual > header files for the wrappers and include them only where needed. do you mean: intel_gt_print.h intel_guc_print.h intel_huc_print.h with just macros or also with all functions that work with drm_printer? > > We have a fairly serious problem with everything including everything in > i915 that I've been slowly trying to tackle. Touch one thing, rebuild > everything. About a third of our headers cause the rebuild of the entire > driver when modified. We need to reduce the surface of things that cause > rebuilds. > > For example, intel_gt.h is included by 97 files, intel_guc.h by 332 > files, and intel_huc.h by 329 files (counting recursively). > > There's absolutely no reason any of the display code, for example, needs > to have these logging macros in their build. Long term, the headers > should be reorganized to reduce the interdependencies, and this is what > I've been doing in i915_drv.h and display/ in general. But the least we > can do is not make the problem worse. to solve this we should really consider splitting out GuC and HuC definitions to dedicated _types.h files and only include them in i915_drv.h (and print macros are orthogonal for this problem) Michal > > BR, > Jani. > >> >> >> John Harrison (5): >> drm/i915/gt: Start adding module oriented dmesg output >> drm/i915/huc: Add HuC specific debug print wrappers >> drm/i915/guc: Add GuC specific debug print wrappers >> drm/i915/guc: Add GuC CT specific debug print wrappers >> drm/i915/uc: Update the gt/uc code to use gt_err and friends >> >> drivers/gpu/drm/i915/gt/intel_gt.c| 96 >> drivers/gpu/drm/i915/gt/intel_gt.h| 35 +++ >> drivers/gpu/drm/i915/gt/uc/intel_guc.c| 32 +-- >> drivers/gpu/drm/i915/gt/uc/intel_guc.h| 35 +++ >> drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 8 +- >> .../gpu/drm/i915/gt/uc/intel_guc_capture.c| 48 ++-- >> drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 222 +- >> drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 19 +- >> drivers/gpu/drm/i915/gt/uc/intel_guc_log.c| 37 ++- >> drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c | 7 +- >> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 55 ++--- >> .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 62 +++-- >> drivers/gpu/drm/i915/gt/uc/intel_huc.c| 31 +-- >> drivers/gpu/drm/i915/gt/uc/intel_huc.h| 23 ++ >> drivers/gpu/drm/i915/gt/uc/intel_uc.c | 108 - >> drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 98 >> drivers/gpu/drm/i915/gt/uc/selftest_guc.c | 34 +-- >> .../drm/i915/gt/uc/selftest_guc_hangcheck.c | 22 +- >> .../drm/i915/gt/uc/selftest_guc_multi_lrc.c | 10 +- >> 19 files changed, 507 insertions(+), 475 deletions(-) >
Re: [PATCH printk v5 00/40] reduce console_lock scope
On Wed, Nov 16, 2022 at 05:27:12PM +0106, John Ogness wrote: > This is v5 of a series to prepare for threaded/atomic > printing. v4 is here [0]. This series focuses on reducing the > scope of the BKL console_lock. It achieves this by switching to > SRCU and a dedicated mutex for console list iteration and > modification, respectively. The console_lock will no longer > offer this protection. > > Also, during the review of v2 it came to our attention that > many console drivers are checking CON_ENABLED to see if they > are registered. Because this flag can change without > unregistering and because this flag does not represent an > atomic point when an (un)registration process is complete, > a new console_is_registered() function is introduced. This > function uses the console_list_lock to synchronize with the > (un)registration process to provide a reliable status. > > All users of the console_lock for list iteration have been > modified. For the call sites where the console_lock is still > needed (for other reasons), comments are added to explain > exactly why the console_lock is needed. > > All users of CON_ENABLED for registration status have been > modified to use console_is_registered(). Note that there are > still users of CON_ENABLED, but this is for legitimate purposes > about a registered console being able to print. > > The base commit for this series is from Paul McKenney's RCU tree > and provides an NMI-safe SRCU implementation [1]. Without the > NMI-safe SRCU implementation, this series is not less safe than > mainline. But we will need the NMI-safe SRCU implementation for > atomic consoles anyway, so we might as well get it in > now. Especially since it _does_ increase the reliability for > mainline in the panic path. > > Changes since v4: > > printk: > > - Introduce console_init_seq() to handle the now rather complex > procedure to find an appropriate start sequence number for a > new console upon registration. > > - When registering a non-boot console and boot consoles are > registered, try to flush all the consoles to get the next @seq > value before falling back to use the @seq of the enabled boot > console that is furthest behind. > > - For console_force_preferred_locked(), make the console the > head of the console list. > Reviewed-by: Greg Kroah-Hartman
Re: [PATCH v2 1/5] drm/i915/gt: Start adding module oriented dmesg output
On 18.11.2022 02:58, john.c.harri...@intel.com wrote: > From: John Harrison > > When trying to analyse bug reports from CI, customers, etc. it can be > difficult to work out exactly what is happening on which GT in a > multi-GT system. So add GT oriented debug/error message wrappers. If > used instead of the drm_ equivalents, you get the same output but with > a GT# prefix on it. > > v2: Go back to using lower case names (combined review feedback). > Convert intel_gt.c as a first step. > > Signed-off-by: John Harrison > --- > drivers/gpu/drm/i915/gt/intel_gt.c | 96 ++ > drivers/gpu/drm/i915/gt/intel_gt.h | 35 +++ > 2 files changed, 81 insertions(+), 50 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c > b/drivers/gpu/drm/i915/gt/intel_gt.c > index 0325f071046ca..349fcfdd14a6d 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt.c > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c > @@ -90,9 +90,8 @@ static int intel_gt_probe_lmem(struct intel_gt *gt) > if (err == -ENODEV) > return 0; > > - drm_err(&i915->drm, > - "Failed to setup region(%d) type=%d\n", > - err, INTEL_MEMORY_LOCAL); > + gt_err(gt, "Failed to setup region(%d) type=%d\n", > +err, INTEL_MEMORY_LOCAL); > return err; > } > > @@ -192,14 +191,14 @@ int intel_gt_init_hw(struct intel_gt *gt) > > ret = i915_ppgtt_init_hw(gt); > if (ret) { > - drm_err(&i915->drm, "Enabling PPGTT failed (%d)\n", ret); > + gt_err(gt, "Enabling PPGTT failed (%d)\n", ret); > goto out; > } > > /* We can't enable contexts until all firmware is loaded */ > ret = intel_uc_init_hw(>->uc); > if (ret) { > - i915_probe_error(i915, "Enabling uc failed (%d)\n", ret); > + gt_probe_error(gt, "Enabling uc failed (%d)\n", ret); > goto out; > } > > @@ -264,7 +263,7 @@ intel_gt_clear_error_registers(struct intel_gt *gt, >* some errors might have become stuck, >* mask them. >*/ > - drm_dbg(>->i915->drm, "EIR stuck: 0x%08x, masking\n", eir); > + gt_dbg(gt, "EIR stuck: 0x%08x, masking\n", eir); > rmw_set(uncore, EMR, eir); > intel_uncore_write(uncore, GEN2_IIR, > I915_MASTER_ERROR_INTERRUPT); > @@ -298,16 +297,16 @@ static void gen6_check_faults(struct intel_gt *gt) > for_each_engine(engine, gt, id) { > fault = GEN6_RING_FAULT_REG_READ(engine); > if (fault & RING_FAULT_VALID) { > - drm_dbg(&engine->i915->drm, "Unexpected fault\n" > - "\tAddr: 0x%08lx\n" > - "\tAddress space: %s\n" > - "\tSource ID: %d\n" > - "\tType: %d\n", > - fault & PAGE_MASK, > - fault & RING_FAULT_GTTSEL_MASK ? > - "GGTT" : "PPGTT", > - RING_FAULT_SRCID(fault), > - RING_FAULT_FAULT_TYPE(fault)); > + gt_dbg(gt, "Unexpected fault\n" > +"\tAddr: 0x%08lx\n" > +"\tAddress space: %s\n" > +"\tSource ID: %d\n" > +"\tType: %d\n", > +fault & PAGE_MASK, > +fault & RING_FAULT_GTTSEL_MASK ? > +"GGTT" : "PPGTT", > +RING_FAULT_SRCID(fault), > +RING_FAULT_FAULT_TYPE(fault)); > } > } > } > @@ -334,17 +333,17 @@ static void xehp_check_faults(struct intel_gt *gt) > fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) | >((u64)fault_data0 << 12); > > - drm_dbg(>->i915->drm, "Unexpected fault\n" > - "\tAddr: 0x%08x_%08x\n" > - "\tAddress space: %s\n" > - "\tEngine ID: %d\n" > - "\tSource ID: %d\n" > - "\tType: %d\n", > - upper_32_bits(fault_addr), lower_32_bits(fault_addr), > - fault_data1 & FAULT_GTT_SEL ? "GGTT" : "PPGTT", > - GEN8_RING_FAULT_ENGINE_ID(fault), > - RING_FAULT_SRCID(fault), > - RING_FAULT_FAULT_TYPE(fault)); > + gt_dbg(gt, "Unexpected fault\n" > +"\tAddr: 0x%08x_%08x\n" > +"\tAddress space: %s\n" > +"\tEngine ID: %d\n" > +"\tSource ID: %d\n" > +"\tType: %d\n", > +upper_32_bits(fault_addr), lower_32_bits(
Re: Build regressions/improvements in v6.1-rc6
On 11/22/22 07:00, Johannes Berg wrote: > On Tue, 2022-11-22 at 08:55 -0500, Alex Deucher wrote: >>> >>>+ /kisskb/src/arch/um/include/asm/processor-generic.h: error: called >>> object is not a function or function pointer: => 94:18 >>>+ /kisskb/src/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c: >>> error: control reaches end of non-void function [-Werror=return-type]: => >>> 1934:1 >>> >>> um-x86_64/um-all{mod,yes}config (in kfd_cpumask_to_apic_id()) >> >> Presumably cpu_data is not defined on um-x86_64? Does it even make >> sense to build drivers on um-x86_64? > > Drivers in general yes ;-) > > This driver, probably not. > > But the issue is that a lot of drivers "depends on X86_64" or such, > where only "X86" is the arch symbol. You could add "X86 && X86_64" to > really build on x86 64-bit only. > > I didn't check this driver, but this has mostly popped up since UM got > PCI support some time ago (which I added.) I have patches for lots of these issues, but some people said that they would want to build DRM drivers for use with KUNIT (i.e. UML), so I thought that meant my patches were not needed. -- ~Randy
[PATCH] dma-buf: Require VM_PFNMAP vma for mmap
tldr; DMA buffers aren't normal memory, expecting that you can use them like that (like calling get_user_pages works, or that they're accounting like any other normal memory) cannot be guaranteed. Since some userspace only runs on integrated devices, where all buffers are actually all resident system memory, there's a huge temptation to assume that a struct page is always present and useable like for any more pagecache backed mmap. This has the potential to result in a uapi nightmare. To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which blocks get_user_pages and all the other struct page based infrastructure for everyone. In spirit this is the uapi counterpart to the kernel-internal CONFIG_DMABUF_DEBUG. Motivated by a recent patch which wanted to swich the system dma-buf heap to vm_insert_page instead of vm_insert_pfn. v2: Jason brought up that we also want to guarantee that all ptes have the pte_special flag set, to catch fast get_user_pages (on architectures that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. >From auditing the various functions to insert pfn pte entires (vm_insert_pfn_prot, remap_pfn_range and all it's callers like dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so this should be the correct flag to check for. v3: Change to WARN_ON_ONCE (Thomas Zimmermann) References: https://lore.kernel.org/lkml/cakmk7uhi+mg0z0humnt13qccvuturvjpcr0njrl12k-wbwz...@mail.gmail.com/ Acked-by: Christian König Acked-by: Thomas Zimmermann Cc: Thomas Zimmermann Cc: Jason Gunthorpe Cc: Suren Baghdasaryan Cc: Matthew Wilcox Cc: John Stultz Signed-off-by: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org -- Ok I entirely forgot about this patch but stumbled over it and checked what's up with it no. I think it's ready now for merging: - shmem helper patches to fix up vgem landed - ttm has been fixed since a while - I don't think we've had any other open issues Time to lock down this uapi contract for real? -Daniel --- drivers/dma-buf/dma-buf.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index b6c36914e7c6..88718665c3c3 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -150,6 +150,8 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma) ret = dmabuf->ops->mmap(dmabuf, vma); dma_resv_unlock(dmabuf->resv); + WARN_ON_ONCE(!(vma->vm_flags & VM_PFNMAP)); + return ret; } @@ -1495,6 +1497,8 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma, ret = dmabuf->ops->mmap(dmabuf, vma); dma_resv_unlock(dmabuf->resv); + WARN_ON_ONCE(!(vma->vm_flags & VM_PFNMAP)); + return ret; } EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, DMA_BUF); -- 2.37.2
Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()
[AMD Official Use Only - General] When only second GPU has valid ATRM handle - then it stays inside the loop and in the next call to pci_get_class(), it passes pdev reference to first GPU as the "from" param. That time it drops the reference count of "from" device. Thanks, Lijo From: Alex Deucher Sent: Tuesday, November 22, 2022 9:55:33 PM To: Lazar, Lijo Cc: Xiongfeng Wang ; Deucher, Alexander ; Koenig, Christian ; Pan, Xinhui ; airl...@gmail.com ; dan...@ffwll.ch ; Zhang, Hawking ; dri-devel@lists.freedesktop.org ; amd-...@lists.freedesktop.org ; yangyingli...@huawei.com Subject: Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios() On Tue, Nov 22, 2022 at 9:59 AM Lazar, Lijo wrote: > > > > On 11/22/2022 8:19 PM, Alex Deucher wrote: > > On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang > > wrote: > >> > >> As comment of pci_get_class() says, it returns a pci_device with its > >> refcount increased and decreased the refcount for the input parameter > >> @from if it is not NULL. > >> > >> If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we > >> need to call pci_dev_put() to decrease the refcount. Add the missing > >> pci_dev_put() to avoid refcount leak. > > > > For both patches, I think pci_dev_put() needs to go into the loops. > > There are 2 or more GPUs on the systems where this is relevant. > > > > As per the logic, the loop breaks when it finds a valid ATRM handle. So > dev_put is required only for that device. Sure, but what if the handle is on the second DISPLAY_VGA or DISPLAY_OTHER class PCI device on the system? We've already called pci_get_class() for the first PCI device that matched. Alex > > When inside the loop this happens - "decreased the refcount for the > input parameter @from if it is not NULL" > > Thanks, > Lijo > > > Alex > > > >> > >> Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with > >> ATRM") > >> Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX > >> handler (v3)") > >> Signed-off-by: Xiongfeng Wang > >> --- > >> drivers/gpu/drm/radeon/radeon_bios.c | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/drivers/gpu/drm/radeon/radeon_bios.c > >> b/drivers/gpu/drm/radeon/radeon_bios.c > >> index 33121655d50b..2df6ce3e32cb 100644 > >> --- a/drivers/gpu/drm/radeon/radeon_bios.c > >> +++ b/drivers/gpu/drm/radeon/radeon_bios.c > >> @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct radeon_device > >> *rdev) > >> > >> if (!found) > >> return false; > >> + pci_dev_put(pdev); > >> > >> rdev->bios = kmalloc(size, GFP_KERNEL); > >> if (!rdev->bios) { > >> -- > >> 2.20.1 > >> <>
Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()
On Tue, Nov 22, 2022 at 12:10 PM Lazar, Lijo wrote: > > [AMD Official Use Only - General] > > When only second GPU has valid ATRM handle - > then it stays inside the loop and in the next call to pci_get_class(), it > passes pdev reference to first GPU as the "from" param. That time it drops > the reference count of "from" device. ah, right, that was the part I missed. Thanks. Alex > > Thanks, > Lijo > > From: Alex Deucher > Sent: Tuesday, November 22, 2022 9:55:33 PM > To: Lazar, Lijo > Cc: Xiongfeng Wang ; Deucher, Alexander > ; Koenig, Christian ; > Pan, Xinhui ; airl...@gmail.com ; > dan...@ffwll.ch ; Zhang, Hawking ; > dri-devel@lists.freedesktop.org ; > amd-...@lists.freedesktop.org ; > yangyingli...@huawei.com > Subject: Re: [PATCH 1/2] drm/radeon: Fix PCI device refcount leak in > radeon_atrm_get_bios() > > On Tue, Nov 22, 2022 at 9:59 AM Lazar, Lijo wrote: > > > > > > > > On 11/22/2022 8:19 PM, Alex Deucher wrote: > > > On Tue, Nov 22, 2022 at 6:12 AM Xiongfeng Wang > > > wrote: > > >> > > >> As comment of pci_get_class() says, it returns a pci_device with its > > >> refcount increased and decreased the refcount for the input parameter > > >> @from if it is not NULL. > > >> > > >> If we break the loop in radeon_atrm_get_bios() with 'pdev' not NULL, we > > >> need to call pci_dev_put() to decrease the refcount. Add the missing > > >> pci_dev_put() to avoid refcount leak. > > > > > > For both patches, I think pci_dev_put() needs to go into the loops. > > > There are 2 or more GPUs on the systems where this is relevant. > > > > > > > As per the logic, the loop breaks when it finds a valid ATRM handle. So > > dev_put is required only for that device. > > Sure, but what if the handle is on the second DISPLAY_VGA or > DISPLAY_OTHER class PCI device on the system? We've already called > pci_get_class() for the first PCI device that matched. > > Alex > > > > > When inside the loop this happens - "decreased the refcount for the > > input parameter @from if it is not NULL" > > > > Thanks, > > Lijo > > > > > Alex > > > > > >> > > >> Fixes: d8ade3526b2a ("drm/radeon: handle non-VGA class pci devices with > > >> ATRM") > > >> Fixes: c61e2775873f ("drm/radeon: split ATRM support out from the ATPX > > >> handler (v3)") > > >> Signed-off-by: Xiongfeng Wang > > >> --- > > >> drivers/gpu/drm/radeon/radeon_bios.c | 1 + > > >> 1 file changed, 1 insertion(+) > > >> > > >> diff --git a/drivers/gpu/drm/radeon/radeon_bios.c > > >> b/drivers/gpu/drm/radeon/radeon_bios.c > > >> index 33121655d50b..2df6ce3e32cb 100644 > > >> --- a/drivers/gpu/drm/radeon/radeon_bios.c > > >> +++ b/drivers/gpu/drm/radeon/radeon_bios.c > > >> @@ -227,6 +227,7 @@ static bool radeon_atrm_get_bios(struct > > >> radeon_device *rdev) > > >> > > >> if (!found) > > >> return false; > > >> + pci_dev_put(pdev); > > >> > > >> rdev->bios = kmalloc(size, GFP_KERNEL); > > >> if (!rdev->bios) { > > >> -- > > >> 2.20.1 > > >>
Re: [PATCH v2 2/5] drm/i915/huc: Add HuC specific debug print wrappers
On 18.11.2022 02:58, john.c.harri...@intel.com wrote: > From: John Harrison > > Create a set of HuC printers and start using them. > > Signed-off-by: John Harrison > --- > drivers/gpu/drm/i915/gt/uc/intel_huc.c | 31 ++ > drivers/gpu/drm/i915/gt/uc/intel_huc.h | 23 +++ > 2 files changed, 35 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c > b/drivers/gpu/drm/i915/gt/uc/intel_huc.c > index be855811d85df..0bbbc7192da63 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c > @@ -107,11 +107,9 @@ static enum hrtimer_restart > huc_delayed_load_timer_callback(struct hrtimer *hrti > > if (!intel_huc_is_authenticated(huc)) { > if (huc->delayed_load.status == INTEL_HUC_WAITING_ON_GSC) > - drm_notice(&huc_to_gt(huc)->i915->drm, > -"timed out waiting for MEI GSC init to load > HuC\n"); > + huc_notice(huc, "Timed out waiting for MEI GSC init to > load FW\n"); > else if (huc->delayed_load.status == INTEL_HUC_WAITING_ON_PXP) > - drm_notice(&huc_to_gt(huc)->i915->drm, > -"timed out waiting for MEI PXP init to load > HuC\n"); > + huc_notice(huc, "Timed out waiting for MEI PXP init to > load FW\n"); > else > MISSING_CASE(huc->delayed_load.status); > > @@ -174,8 +172,7 @@ static int gsc_notifier(struct notifier_block *nb, > unsigned long action, void *d > > case BUS_NOTIFY_DRIVER_NOT_BOUND: /* mei driver fails to be bound */ > case BUS_NOTIFY_UNBIND_DRIVER: /* mei driver about to be unbound */ > - drm_info(&huc_to_gt(huc)->i915->drm, > - "mei driver not bound, disabling HuC load\n"); > + huc_info(huc, "- mei driver not bound, disabling HuC load\n"); > gsc_init_error(huc); > break; > } > @@ -193,8 +190,7 @@ void intel_huc_register_gsc_notifier(struct intel_huc > *huc, struct bus_type *bus > huc->delayed_load.nb.notifier_call = gsc_notifier; > ret = bus_register_notifier(bus, &huc->delayed_load.nb); > if (ret) { > - drm_err(&huc_to_gt(huc)->i915->drm, > - "failed to register GSC notifier\n"); > + huc_err(huc, "Failed to register GSC notifier\n"); > huc->delayed_load.nb.notifier_call = NULL; > gsc_init_error(huc); > } > @@ -284,8 +280,7 @@ static int check_huc_loading_mode(struct intel_huc *huc) > GSC_LOADS_HUC; > > if (fw_needs_gsc != hw_uses_gsc) { > - drm_err(>->i915->drm, > - "mismatch between HuC FW (%s) and HW (%s) load modes\n", > + huc_err(huc, "Mismatch between FW (%s) and HW (%s) load > modes\n", > HUC_LOAD_MODE_STRING(fw_needs_gsc), > HUC_LOAD_MODE_STRING(hw_uses_gsc)); > return -ENOEXEC; > @@ -294,19 +289,17 @@ static int check_huc_loading_mode(struct intel_huc *huc) > /* make sure we can access the GSC via the mei driver if we need it */ > if (!(IS_ENABLED(CONFIG_INTEL_MEI_PXP) && > IS_ENABLED(CONFIG_INTEL_MEI_GSC)) && > fw_needs_gsc) { > - drm_info(>->i915->drm, > - "Can't load HuC due to missing MEI modules\n"); > + huc_info(huc, "Can't load due to missing MEI modules\n"); > return -EIO; > } > > - drm_dbg(>->i915->drm, "GSC loads huc=%s\n", str_yes_no(fw_needs_gsc)); > + huc_dbg(huc, "GSC loads huc=%s\n", str_yes_no(fw_needs_gsc)); this will give: "GT0: HuC GSC loads huc=yes" but maybe better to change that to get: "GT0: HuC loaded by GSC=yes" so this should be: huc_dbg(huc, "loaded by GSC=%s\n", str_yes_no(fw_needs_gsc)); > > return 0; > } > > int intel_huc_init(struct intel_huc *huc) > { > - struct drm_i915_private *i915 = huc_to_gt(huc)->i915; > int err; > > err = check_huc_loading_mode(huc); > @@ -323,7 +316,7 @@ int intel_huc_init(struct intel_huc *huc) > > out: > intel_uc_fw_change_status(&huc->fw, INTEL_UC_FIRMWARE_INIT_FAIL); > - drm_info(&i915->drm, "HuC init failed with %d\n", err); > + huc_info(huc, "init failed with %d\n", err); > return err; > } > > @@ -366,13 +359,13 @@ int intel_huc_wait_for_auth_complete(struct intel_huc > *huc) > delayed_huc_load_complete(huc); > > if (ret) { > - drm_err(>->i915->drm, "HuC: Firmware not verified %d\n", ret); > + huc_err(huc, "firmware not verified %d\n", ret); > intel_uc_fw_change_status(&huc->fw, > INTEL_UC_FIRMWARE_LOAD_FAIL); > return ret; > } > > intel_uc_fw_change_status(&huc->fw, INTEL_UC_FIRMWARE_RUNNING); > - dr
Re: [syzbot] inconsistent lock state in sync_info_debugfs_show
Am 22.11.22 um 15:51 schrieb Daniel Vetter: On Fri, Nov 18, 2022 at 06:46:38PM -0800, syzbot wrote: syzbot has found a reproducer for the following issue on: HEAD commit:84368d882b96 Merge tag 'soc-fixes-6.1-3' of git://git.kern.. git tree: upstream console+strace: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fx%2Flog.txt%3Fx%3D1670fb6588&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=%2BNj%2B3vZRTi6DhdOdNlQDG7GJN9u3IZy8%2FZxqHScu2Fs%3D&reserved=0 kernel config: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fx%2F.config%3Fx%3D6f4e5e9899396248&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=qZpvAr44nbuFs3jvd9fggeVnMgAt7J5Izxa1GvoMDjA%3D&reserved=0 dashboard link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fbug%3Fextid%3D007bfe0f3330f6e1e7d1&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=tTHNWwrjD%2F%2FddfRi%2Fe3AbwMw%2B7MPWoolWeZt%2FgnBa4g%3D&reserved=0 compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 syz repro: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fx%2Frepro.syz%3Fx%3D164376f988&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=bay5gePJjT3Nm4J%2BINtso1hKNPMUmfP6xActclfXqa8%3D&reserved=0 C reproducer: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fx%2Frepro.c%3Fx%3D16cf096588&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=o5JlI1IDcmP%2Bd08wLRhyD5gUIjuPioMEyK%2FkH6m%2FNIQ%3D&reserved=0 Downloadable assets: disk image: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstorage.googleapis.com%2Fsyzbot-assets%2F031b6e68785d%2Fdisk-84368d88.raw.xz&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dAT5hGTqrR01QiyGbOHyWkBIvTYvsmGqLiPqXuw7vK4%3D&reserved=0 vmlinux: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstorage.googleapis.com%2Fsyzbot-assets%2Fcff5e66b90e8%2Fvmlinux-84368d88.xz&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=bHMhF3kWMS3ZVbPs8z6mWewUlNBTJPabhhI58e77KuI%3D&reserved=0 kernel image: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstorage.googleapis.com%2Fsyzbot-assets%2Fe75525784a66%2FbzImage-84368d88.xz&data=05%7C01%7Cchristian.koenig%40amd.com%7C93106507f7994c3175f408dacc99034c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047254866307874%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=KFJD3fDW1Ppqla%2BPA6wY6NC3qSNrlCpnkhdqIDZGxoY%3D&reserved=0 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+007bfe0f3330f6e1e...@syzkaller.appspotmail.com WARNING: inconsistent lock state 6.1.0-rc5-syzkaller-00144-g84368d882b96 #0 Not tainted inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. syz-executor333/3645 [HC0[0]:SC0[0]:HE0:SE1] takes: 8d295c38 (sync_timeline_list_lock){?...}-{2:2}, at: spin_lock_irq include/linux/spinlock.h:375 [inline] 8d295c38 (sync_timeline_list_lock){?...}-{2:2}, at: sync_info_debugfs_show+0x31/0x200 drivers/dma-buf/sync_debug.c:147 {IN-HARDIRQ-W} state was registered at: lock_acquire kernel/locking/lockdep.c:5668 [inline] lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/s
Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage
On Tue, Nov 22, 2022 at 4:25 AM Hans Verkuil wrote: > > I tracked the use of 'force' all the way back to the first git commit > (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the > reason is lost in the mists of time. Well, not entirely. For archaeology reasons, I went back to the old BK history, which exists as a git conversion in https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/ and there you can actually find it. Not with a lot of explanations, though - it's commit b7649ef789 ("[PATCH] videobuf update"): This updates the video-buf.c module (helper module for video buffer management). Some memory management fixes, also some adaptions to the final v4l2 api. but it went from err = get_user_pages(current,current->mm, -data, dma->nr_pages, -rw == READ, 0, /* don't force */ +data & PAGE_MASK, dma->nr_pages, +rw == READ, 1, /* force */ dma->pages, NULL); in that commit. So it goes back to October 2002. > Looking at this old LWN article https://lwn.net/Articles/28548/ suggests > that it might be related to calling get_user_pages for write buffers The timing roughly matches. > I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still > allow drivers to read from the buffer? The issue with some of the driver hacks has been that - they only want to read, and the buffer may be read-only - they then used FOLL_WRITE despite that, because they want to break COW (due to the issue that David is now fixing with his series) - but that means that the VM layer says "nope, you can't write to this read-only user mapping" - ... and then they use FOLL_FORCE to say "yes, I can". iOW, the FOLL_FORCE may be entirely due to an (incorrect, but historically needed) FOLL_WRITE. Linus
Re: Try to address the DMA-buf coherency problem
Am 22.11.22 um 15:36 schrieb Daniel Vetter: On Fri, Nov 18, 2022 at 11:32:19AM -0800, Rob Clark wrote: On Thu, Nov 17, 2022 at 7:38 AM Nicolas Dufresne wrote: Le jeudi 17 novembre 2022 à 13:10 +0100, Christian König a écrit : DMA-Buf let's the exporter setup the DMA addresses the importer uses to be able to directly decided where a certain operation should go. E.g. we have cases where for example a P2P write doesn't even go to memory, but rather a doorbell BAR to trigger another operation. Throwing in CPU round trips for explicit ownership transfer completely breaks that concept. It sounds like we should have a dma_dev_is_coherent_with_dev() which accepts two (or an array?) of devices and tells the caller whether the devices need explicit ownership transfer. No, exactly that's the concept I'm pushing back on very hard here. In other words explicit ownership transfer is not something we would want as requirement in the framework, cause otherwise we break tons of use cases which require concurrent access to the underlying buffer. I'm not pushing for this solution, but really felt the need to correct you here. I have quite some experience with ownership transfer mechanism, as this is how GStreamer framework works since 2000. Concurrent access is a really common use cases and it is quite well defined in that context. The bracketing system (in this case called map() unmap(), with flag stating the usage intention like reads and write) is combined the the refcount. The basic rules are simple: This is all CPU oriented, I think Christian is talking about the case where ownership transfer happens without CPU involvement, such as via GPU waiting on a fence Yeah for pure device2device handover the rule pretty much has to be that any coherency management that needs to be done must be done from the device side (flushing device side caches and stuff like that) only. But under the assumption that _all_ cpu side management has been done already before the first device access started. And then the map/unmap respectively begin/end_cpu_access can be used what it was meant for with cpu side invalidation/flushing and stuff like that, while having pretty clear handover/ownership rules and hopefully not doing no unecessary flushes. And all that while allowing device acces to be pipelined. Worst case the exporter has to insert some pipelined cache flushes as a dma_fence pipelined work of its own between the device access when moving from one device to the other. That last part sucks a bit right now, because we don't have any dma_buf_attachment method which does this syncing without recreating the mapping, but in reality this is solved by caching mappings in the exporter (well dma-buf layer) nowadays. True concurrent access like vk/compute expects is still a model that dma-buf needs to support on top, but that's a special case and pretty much needs hw that supports such concurrent access without explicit handover and fencing. Aside from some historical accidents and still a few warts, I do think dma-buf does support both of these models. We should have come up with dma-heaps earlier and make it clear that exporting a DMA-buf from a device gives you something device specific which might or might not work with others. Apart from that I agree, DMA-buf should be capable of handling this. Question left is what documentation is missing to make it clear how things are supposed to work? Regards, Christian. Of course in the case of gpu/drm drivers, userspace must know whats possible and act accordingly, otherwise you just get to keep all the pieces. -Daniel
Re: git send-email friendly smtp provider anyone?
Den 22.11.2022 16.51, skrev Konstantin Ryabitsev: > On Mon, Nov 21, 2022 at 07:13:28PM +0100, Noralf Trønnes wrote: >>> Otherwise, you might consider using: >>> https://b4.docs.kernel.org/en/latest/contributor/send.html#authenticating-with-the-web-submission-endpoint >>> >> >> That's an interesting option. I did briefly look at b4 a few months back >> but it looked like it was under heavy development so I figured I'd wait >> before trying it out. I think I'll give b4 a spin to see how it works, I >> wonder how it handles patch changelogs. > > I'd be happy to help set this up for you -- to date, this service hasn't been > used beyond a few test posts. > Ok, I'll give it a try. I have now prepared the patchset, generated a key and can now do: b4 send -o The first thing that strikes me is that everyone mentioned in one of the patches get the entire patchset, even sta...@vger.kernel.org (cc'ed in a fixes patch). The first patch touches a core file and as a result a few drivers, so I've cc'ed the driver maintainers in that patch, but now they get the entire patchset where 5 of 6 patches is about a driver that I maintain. So from their point of view, they see a patchset about a driver they don't care about and a patch touching a core file, but from the subject it's not apparent that it touches their driver. I'm afraid that this might result in none of them looking at that patch. In this particular case it's not that important, but in another case it might be. As for the setting up the web endpoint, should I just follow the b4 docs on that? I use b4 version 0.10.1, is that recent enough? Noralf.
Re: [PATCH v2 3/5] drm/i915/guc: Add GuC specific debug print wrappers
On 18.11.2022 02:58, john.c.harri...@intel.com wrote: > From: John Harrison > > Create a set of GuC printers and start using them. > > Signed-off-by: John Harrison > --- > drivers/gpu/drm/i915/gt/uc/intel_guc.c| 32 -- > drivers/gpu/drm/i915/gt/uc/intel_guc.h| 35 +++ > drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 8 +-- > .../gpu/drm/i915/gt/uc/intel_guc_capture.c| 48 +- > drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 19 +++--- > drivers/gpu/drm/i915/gt/uc/intel_guc_log.c| 37 ++- > drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c | 7 +-- > drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 55 +++- > .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 62 +-- > drivers/gpu/drm/i915/gt/uc/selftest_guc.c | 34 +- > .../drm/i915/gt/uc/selftest_guc_hangcheck.c | 22 +++ > .../drm/i915/gt/uc/selftest_guc_multi_lrc.c | 10 +-- > 12 files changed, 179 insertions(+), 190 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc.c > index 52aede324788e..d9972510ee29b 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c > @@ -94,8 +94,8 @@ static void gen9_enable_guc_interrupts(struct intel_guc > *guc) > assert_rpm_wakelock_held(>->i915->runtime_pm); > > spin_lock_irq(gt->irq_lock); > - WARN_ON_ONCE(intel_uncore_read(gt->uncore, GEN8_GT_IIR(2)) & > - gt->pm_guc_events); > + guc_WARN_ON_ONCE(guc, intel_uncore_read(gt->uncore, GEN8_GT_IIR(2)) & > + gt->pm_guc_events); > gen6_gt_pm_enable_irq(gt, gt->pm_guc_events); > spin_unlock_irq(gt->irq_lock); > > @@ -339,7 +339,7 @@ static void guc_init_params(struct intel_guc *guc) > params[GUC_CTL_DEVID] = guc_ctl_devid(guc); > > for (i = 0; i < GUC_CTL_MAX_DWORDS; i++) > - DRM_DEBUG_DRIVER("param[%2d] = %#x\n", i, params[i]); > + guc_dbg(guc, "init param[%2d] = %#x\n", i, params[i]); > } > > /* > @@ -451,7 +451,7 @@ int intel_guc_init(struct intel_guc *guc) > intel_uc_fw_fini(&guc->fw); > out: > intel_uc_fw_change_status(&guc->fw, INTEL_UC_FIRMWARE_INIT_FAIL); > - i915_probe_error(gt->i915, "failed with %d\n", ret); > + guc_probe_error(guc, "init failed with %d\n", ret); > return ret; > } > > @@ -484,7 +484,6 @@ void intel_guc_fini(struct intel_guc *guc) > int intel_guc_send_mmio(struct intel_guc *guc, const u32 *request, u32 len, > u32 *response_buf, u32 response_buf_size) > { > - struct drm_i915_private *i915 = guc_to_gt(guc)->i915; > struct intel_uncore *uncore = guc_to_gt(guc)->uncore; > u32 header; > int i; > @@ -519,8 +518,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 > *request, u32 len, > 10, 10, &header); > if (unlikely(ret)) { > timeout: > - drm_err(&i915->drm, "mmio request %#x: no reply %x\n", > - request[0], header); > + guc_err(guc, "mmio request %#x: no reply %x\n", request[0], > header); > goto out; > } > > @@ -541,8 +539,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 > *request, u32 len, > if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) == > GUC_HXG_TYPE_NO_RESPONSE_RETRY) { > u32 reason = FIELD_GET(GUC_HXG_RETRY_MSG_0_REASON, header); > > - drm_dbg(&i915->drm, "mmio request %#x: retrying, reason %u\n", > - request[0], reason); > + guc_dbg(guc, "mmio request %#x: retrying, reason %u\n", > request[0], reason); > goto retry; > } > > @@ -550,16 +547,14 @@ int intel_guc_send_mmio(struct intel_guc *guc, const > u32 *request, u32 len, > u32 hint = FIELD_GET(GUC_HXG_FAILURE_MSG_0_HINT, header); > u32 error = FIELD_GET(GUC_HXG_FAILURE_MSG_0_ERROR, header); > > - drm_err(&i915->drm, "mmio request %#x: failure %x/%u\n", > - request[0], error, hint); > + guc_err(guc, "mmio request %#x: failure %x/%u\n", request[0], > error, hint); > ret = -ENXIO; > goto out; > } > > if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) != > GUC_HXG_TYPE_RESPONSE_SUCCESS) { > proto: > - drm_err(&i915->drm, "mmio request %#x: unexpected reply %#x\n", > - request[0], header); > + guc_err(guc, "mmio request %#x: unexpected reply %#x\n", > request[0], header); > ret = -EPROTO; > goto out; > } > @@ -601,9 +596,9 @@ int intel_guc_to_host_process_recv_msg(struct intel_guc > *guc, > msg = payload[0] & guc->msg_enabled_mask; > > if (msg & INTEL_GUC_RECV_MSG_CRASH_DUMP_POSTED) > - drm_err(&guc_to_gt(guc)->i915->drm, "Received early GuC crash > dump notifi
Re: [Intel-gfx] [MAINTAINER TOOLS] docs: updated rules for topic/core-for-CI commit management
On Tue, Nov 22, 2022 at 03:59:10PM +0100, Daniel Vetter wrote: > On Tue, Nov 22, 2022 at 03:17:14PM +0200, Jani Nikula wrote: > > Introduce stricter rules for topic/core-for-CI management. Way too many > > commits have been added over the years, with insufficient rationale > > recorded in the commit message, and insufficient follow-up with removing > > the commits from the topic branch. > > > > New rules: > > > > 1. Require maintainer ack for rebase. Have better gating on when rebases > >happen and on which baselines. > > > > 2. Require maintainer/committer ack for adding/removing commits. No > >single individual should decide. > > > > 3. Require gitlab issues for new commits added. Improve tracking for > >removing the commits. > > > > Also use the stronger "must" for commit message requiring the > > justification for the commit being in topic/core-for-CI. > > > > Cc: Joonas Lahtinen > > Cc: Rodrigo Vivi > > Cc: Tvrtko Ursulin > > Cc: David Airlie > > Cc: Daniel Vetter > > Cc: intel-...@lists.freedesktop.org > > Cc: dri-devel@lists.freedesktop.org > > Cc: dim-to...@lists.freedesktop.org > > Signed-off-by: Jani Nikula > > Reviewed-by: Daniel Vetter Acked-by: Rodrigo Vivi > > > --- > > drm-tip.rst | 27 --- > > 1 file changed, 20 insertions(+), 7 deletions(-) > > > > diff --git a/drm-tip.rst b/drm-tip.rst > > index deae95cdd2fe..24036e2ef576 100644 > > --- a/drm-tip.rst > > +++ b/drm-tip.rst > > @@ -203,11 +203,13 @@ justified exception. The primary goal is to fix > > issues originating from Linus' > > tree. Issues that would need drm-next or other DRM subsystem tree as > > baseline > > should be fixed in the offending DRM subsystem tree. > > > > -Only rebase the branch if you really know what you're doing. When in > > doubt, ask > > -the maintainers. You'll need to be able to handle any conflicts in non-drm > > code > > -while rebasing. > > +Only rebase the branch if you really know what you're doing. You'll need > > to be > > +able to handle any conflicts in non-drm code while rebasing. > > > > -Simply drop fixes that are already available in the new baseline. > > +Always ask for maintainer ack before rebasing. IRC ack is sufficient. > > + > > +Simply drop fixes that are already available in the new baseline. Close the > > +associated gitlab issue when removing commits. > > > > Force pushing a rebased topic/core-for-CI requires passing the ``--force`` > > parameter to git:: > > @@ -225,11 +227,22 @@ judgement call. > > Only add or remove commits if you really know what you're doing. When in > > doubt, > > ask the maintainers. > > > > -Apply new commits on top with regular push. The commit message needs to > > explain > > -why the patch has been applied to topic/core-for-CI. If it's a cherry-pick > > from > > +Always ask for maintainer/committer ack before adding/removing commits. > > IRC ack > > +is sufficient. Record the ``Acked-by:`` in commits being added. > > + > > +Apply new commits on top with regular push. The commit message must > > explain why > > +the patch has been applied to topic/core-for-CI. If it's a cherry-pick from > > another subsystem, please reference the commit with ``git cherry-pick -x`` > > option. If it's a patch from another subsystem, please reference the patch > > on > > the mailing list with ``Link:`` tag. > > > > +New commits always need an associated gitlab issue for tracking purposes. > > The > > +goal is to have as few commits in topic/core-for-CI as possible, and we > > need to > > +be able to track the progress in making that happen. Reference the issue > > with > > +``References:`` tag. Add the ``core-for-CI`` label to the issue. (Note: Do > > not > > +use ``Closes:`` because the logic here is backwards; the issue is having > > the > > +commit in the branch in the first place.) > > + > > Instead of applying reverts, just remove the commit. This implies ``git > > rebase > > --i`` on the current baseline; see directions above. > > +-i`` on the current baseline; see directions above. Close the associated > > gitlab > > +issue when removing commits. > > -- > > 2.34.1 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
Re: [Intel-gfx] [PATCH v2 4/5] drm/i915/guc: Add GuC CT specific debug print wrappers
On 18.11.2022 02:58, john.c.harri...@intel.com wrote: > From: John Harrison > > Re-work the existing GuC CT printers and extend as required to match > the new wrapping scheme. > > Signed-off-by: John Harrison > --- > drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 222 +++--- > 1 file changed, 113 insertions(+), 109 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > index 2b22065e87bf9..9d404fb377637 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > @@ -18,31 +18,49 @@ static inline struct intel_guc *ct_to_guc(struct > intel_guc_ct *ct) > return container_of(ct, struct intel_guc, ct); > } > > -static inline struct intel_gt *ct_to_gt(struct intel_guc_ct *ct) > -{ > - return guc_to_gt(ct_to_guc(ct)); > -} > - > static inline struct drm_i915_private *ct_to_i915(struct intel_guc_ct *ct) > { > - return ct_to_gt(ct)->i915; > -} > + struct intel_guc *guc = ct_to_guc(ct); > + struct intel_gt *gt = guc_to_gt(guc); > > -static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct) > -{ > - return &ct_to_i915(ct)->drm; > + return gt->i915; > } > > -#define CT_ERROR(_ct, _fmt, ...) \ > - drm_err(ct_to_drm(_ct), "CT: " _fmt, ##__VA_ARGS__) > +#define ct_err(_ct, _fmt, ...) \ > + guc_err(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__) > + > +#define ct_warn(_ct, _fmt, ...) \ > + guc_warn(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__) > + > +#define ct_notice(_ct, _fmt, ...) \ > + guc_notice(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__) > + > +#define ct_info(_ct, _fmt, ...) \ > + guc_info(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__) > + > #ifdef CONFIG_DRM_I915_DEBUG_GUC > -#define CT_DEBUG(_ct, _fmt, ...) \ > - drm_dbg(ct_to_drm(_ct), "CT: " _fmt, ##__VA_ARGS__) > +#define ct_dbg(_ct, _fmt, ...) \ > + guc_dbg(ct_to_guc(_ct), "CT " _fmt, ##__VA_ARGS__) > #else > -#define CT_DEBUG(...)do { } while (0) > +#define ct_dbg(...) do { } while (0) > #endif > -#define CT_PROBE_ERROR(_ct, _fmt, ...) \ > - i915_probe_error(ct_to_i915(ct), "CT: " _fmt, ##__VA_ARGS__) > + > +#define ct_probe_error(_ct, _fmt, ...) \ > + do { \ > + if (i915_error_injected()) \ > + ct_dbg(_ct, _fmt, ##__VA_ARGS__); \ > + else \ > + ct_err(_ct, _fmt, ##__VA_ARGS__); \ > + } while (0) guc_probe_error ? > + > +#define ct_WARN_ON(_ct, _condition) \ > + ct_WARN(_ct, _condition, "%s", "ct_WARN_ON(" __stringify(_condition) > ")") > + > +#define ct_WARN(_ct, _condition, _fmt, ...) \ > + guc_WARN(ct_to_guc(_ct), _condition, "CT " _fmt, ##__VA_ARGS__) > + > +#define ct_WARN_ONCE(_ct, _condition, _fmt, ...) \ > + guc_WARN_ONCE(ct_to_guc(_ct), _condition, "CT " _fmt, ##__VA_ARGS__) > > /** > * DOC: CTB Blob > @@ -170,7 +188,7 @@ static int ct_control_enable(struct intel_guc_ct *ct, > bool enable) > err = guc_action_control_ctb(ct_to_guc(ct), enable ? >GUC_CTB_CONTROL_ENABLE : > GUC_CTB_CONTROL_DISABLE); > if (unlikely(err)) > - CT_PROBE_ERROR(ct, "Failed to control/%s CTB (%pe)\n", > + ct_probe_error(ct, "Failed to control/%s CTB (%pe)\n", > str_enable_disable(enable), ERR_PTR(err)); btw, shouldn't we change all messages to start with lowercase ? was: "CT0: Failed to control/%s CTB (%pe)" is: "GT0: GuC CT Failed to control/%s CTB (%pe)" unless we keep colon (as suggested by Tvrtko) as then: "GT0: GuC CT: Failed to control/%s CTB (%pe)" Michal > > return err; > @@ -201,7 +219,7 @@ static int ct_register_buffer(struct intel_guc_ct *ct, > bool send, > size); > if (unlikely(err)) > failed: > - CT_PROBE_ERROR(ct, "Failed to register %s buffer (%pe)\n", > + ct_probe_error(ct, "Failed to register %s buffer (%pe)\n", > send ? "SEND" : "RECV", ERR_PTR(err)); > > return err; > @@ -235,21 +253,21 @@ int intel_guc_ct_init(struct intel_guc_ct *ct) > blob_size = 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + > CTB_G2H_BUFFER_SIZE; > err = intel_guc_allocate_and_map_vma(guc, blob_size, &ct->vma, &blob); > if (unlikely(err)) { > - CT_PROBE_ERROR(ct, "Failed to allocate %u for CTB data (%pe)\n", > + ct_probe_error(ct, "Failed to allocate %u for CTB data (%pe)\n", > blob_size, ERR_PTR(err)); > return err; > } > > - CT_DEBUG(ct, "base=%#x size=%u\n", intel_guc_ggtt_offset(guc, ct->vma), > blob_size); > + ct_dbg(ct, "base=%#x size=%u\n", intel_guc_ggtt_offset(guc, ct->vma), > blob_size); > > /* store pointers to desc and cmds for send ctb */ > desc = blob; > cmds = blob + 2 * CTB_DESC_SIZE; > cmds_size = CTB_H
Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap
On Tue, Nov 22, 2022 at 06:08:00PM +0100, Daniel Vetter wrote: > tldr; DMA buffers aren't normal memory, expecting that you can use > them like that (like calling get_user_pages works, or that they're > accounting like any other normal memory) cannot be guaranteed. > > Since some userspace only runs on integrated devices, where all > buffers are actually all resident system memory, there's a huge > temptation to assume that a struct page is always present and useable > like for any more pagecache backed mmap. This has the potential to > result in a uapi nightmare. > > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which > blocks get_user_pages and all the other struct page based > infrastructure for everyone. In spirit this is the uapi counterpart to > the kernel-internal CONFIG_DMABUF_DEBUG. > > Motivated by a recent patch which wanted to swich the system dma-buf > heap to vm_insert_page instead of vm_insert_pfn. > > v2: > > Jason brought up that we also want to guarantee that all ptes have the > pte_special flag set, to catch fast get_user_pages (on architectures > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. > > From auditing the various functions to insert pfn pte entires > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so > this should be the correct flag to check for. I didn't look at how this actually gets used, but it is a bit of a pain to insert a lifetime controlled object like a struct page as a special PTE/VM_PFNMAP How is the lifetime model implemented here? How do you know when userspace has finally unmapped the page? Jason
Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap
On Tue, 22 Nov 2022 at 19:04, Jason Gunthorpe wrote: > > On Tue, Nov 22, 2022 at 06:08:00PM +0100, Daniel Vetter wrote: > > tldr; DMA buffers aren't normal memory, expecting that you can use > > them like that (like calling get_user_pages works, or that they're > > accounting like any other normal memory) cannot be guaranteed. > > > > Since some userspace only runs on integrated devices, where all > > buffers are actually all resident system memory, there's a huge > > temptation to assume that a struct page is always present and useable > > like for any more pagecache backed mmap. This has the potential to > > result in a uapi nightmare. > > > > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which > > blocks get_user_pages and all the other struct page based > > infrastructure for everyone. In spirit this is the uapi counterpart to > > the kernel-internal CONFIG_DMABUF_DEBUG. > > > > Motivated by a recent patch which wanted to swich the system dma-buf > > heap to vm_insert_page instead of vm_insert_pfn. > > > > v2: > > > > Jason brought up that we also want to guarantee that all ptes have the > > pte_special flag set, to catch fast get_user_pages (on architectures > > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would > > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. > > > > From auditing the various functions to insert pfn pte entires > > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like > > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so > > this should be the correct flag to check for. > > I didn't look at how this actually gets used, but it is a bit of a > pain to insert a lifetime controlled object like a struct page as a > special PTE/VM_PFNMAP > > How is the lifetime model implemented here? How do you know when > userspace has finally unmapped the page? The vma has a filp which is the refcounted dma_buf. With dma_buf you never get an individual page it's always the entire object. And it's up to the allocator how exactly it wants to use or not use the page's refcount. So if gup goes in and elevates the refcount, you can break stuff, which is why I'm doing this. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v4 01/10] drm: bridge: cadence: convert mailbox functions to macro functions
Hi Sandor, On Mon, Nov 21, 2022 at 4:27 AM Sandor Yu wrote: > > Mailbox access functions could be share to other mhdp driver and > HDP-TX HDMI/DP PHY drivers, move those functions to head file > include/drm/bridge/cdns-mhdp-mailbox.h and convert them to > macro functions. What is the reason for converting the functions to macro?
Re: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use
Hi Andy, I love your patch! Yet something to improve: [auto build test ERROR on usb/usb-testing] [also build test ERROR on usb/usb-next usb/usb-linus drm-intel/for-linux-next drm-intel/for-linux-next-fixes linus/master v6.1-rc6 next-20221122] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Andy-Shevchenko/i915-Move-list_count-to-list-h-for-broader-use/20221122-233657 base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing patch link: https://lore.kernel.org/r/20221122153516.52577-1-andriy.shevchenko%40linux.intel.com patch subject: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use config: um-i386_defconfig compiler: gcc-11 (Debian 11.3.0-8) 11.3.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/3d217ef769536f160f5c20224aeb0f9070bdfdcf git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Andy-Shevchenko/i915-Move-list_count-to-list-h-for-broader-use/20221122-233657 git checkout 3d217ef769536f160f5c20224aeb0f9070bdfdcf # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=um SUBARCH=i386 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot All errors (new ones prefixed by >>): ld: fs/isofs/inode.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/isofs/namei.o:include/linux/list.h:663: first defined here ld: fs/isofs/dir.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/isofs/namei.o:include/linux/list.h:663: first defined here ld: fs/isofs/util.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/isofs/namei.o:include/linux/list.h:663: first defined here ld: fs/isofs/rock.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/isofs/namei.o:include/linux/list.h:663: first defined here ld: fs/isofs/export.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/isofs/namei.o:include/linux/list.h:663: first defined here ld: fs/isofs/joliet.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/isofs/namei.o:include/linux/list.h:663: first defined here -- ld: fs/autofs/inode.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/autofs/init.o:include/linux/list.h:663: first defined here ld: fs/autofs/root.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/autofs/init.o:include/linux/list.h:663: first defined here ld: fs/autofs/symlink.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/autofs/init.o:include/linux/list.h:663: first defined here ld: fs/autofs/waitq.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/autofs/init.o:include/linux/list.h:663: first defined here ld: fs/autofs/expire.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/autofs/init.o:include/linux/list.h:663: first defined here ld: fs/autofs/dev-ioctl.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> fs/autofs/init.o:include/linux/list.h:663: first defined here -- ld: block/bfq-wf2q.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> block/bfq-iosched.o:include/linux/list.h:663: first defined here ld: block/bfq-cgroup.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> block/bfq-iosched.o:include/linux/list.h:663: first defined here vim +663 include/linux/list.h 557 558 /** 559 * list_next_entry - get the next element in list 560 * @pos:the type * to cursor 561 * @member: the name of the list_head within the struct. 562 */ 563 #define list_next_entry(pos, member) \ 564 list_entry((pos)->member.next, typeof(*(pos)), member) 565 566 /** 567 * list_next_entry_circular - get the next element in list 568 * @pos
Re: [Intel-gfx] [PATCH v2 0/5] Add module oriented dmesg output
On Tue, 22 Nov 2022, Michal Wajdeczko wrote: > On 18.11.2022 11:52, Jani Nikula wrote: >> On Thu, 17 Nov 2022, john.c.harri...@intel.com wrote: >>> From: John Harrison >>> >>> When trying to analyse bug reports from CI, customers, etc. it can be >>> difficult to work out exactly what is happening on which GT in a >>> multi-GT system. So add GT oriented debug/error message wrappers. If >>> used instead of the drm_ equivalents, you get the same output but with >>> a GT# prefix on it. >>> >>> It was also requested to extend this further to submodules in order to >>> factor out the repeated structure accessing constructs and common >>> string prefixes. So, add versions for GuC, HuC and GuC CTB as well. >>> >>> This patch set updates all the gt/uc files to use the new helpers as a >>> first step. The intention would be to convert all output messages that >>> have access to a GT structure. >>> >>> v2: Go back to using lower case names, add more wrapper sets (combined >>> review feedback). Also, wrap up probe injection and WARN entries. >>> >>> Signed-off-by: John Harrison >> >> For adding the wrappers in general, I'm going to disagree and >> commit. I'll leave it up to Tvrtko and Joonas. >> >> Regarding the placement of the macros, I insist you add individual >> header files for the wrappers and include them only where needed. > > do you mean: > > intel_gt_print.h > intel_guc_print.h > intel_huc_print.h > > with just macros or also with all functions that work with drm_printer? At least for the macros being added now. If adding others does not require you to pull in a bunch of additional header dependencies, you can add more. And that can be separate patches. > >> >> We have a fairly serious problem with everything including everything in >> i915 that I've been slowly trying to tackle. Touch one thing, rebuild >> everything. About a third of our headers cause the rebuild of the entire >> driver when modified. We need to reduce the surface of things that cause >> rebuilds. >> >> For example, intel_gt.h is included by 97 files, intel_guc.h by 332 >> files, and intel_huc.h by 329 files (counting recursively). >> >> There's absolutely no reason any of the display code, for example, needs >> to have these logging macros in their build. Long term, the headers >> should be reorganized to reduce the interdependencies, and this is what >> I've been doing in i915_drv.h and display/ in general. But the least we >> can do is not make the problem worse. > > to solve this we should really consider splitting out GuC and HuC > definitions to dedicated _types.h files and only include them in > i915_drv.h (and print macros are orthogonal for this problem) It's an orthogonal problem, but IMO with the current headers there's no reason to make the problem worse by adding somewhat independent new stuff to the headers. --- I've looked at untangling this a bunch of times, but I've always felt that it's really not my area of expertise, and it would inevitably conflict with someone else's work in progress and someone else's idea of how the headers should be refactored. There are chains like this: i915_drv.h:47:#include "gt/intel_gt_types.h" gt/intel_gt_types.h:19:#include "uc/intel_uc.h" gt/uc/intel_uc.h:9:#include "intel_guc.h" gt/uc/intel_guc.h:15:#include "intel_guc_fwif.h" gt/uc/intel_guc_fwif.h:14:#include "abi/guc_actions_abi.h" gt/uc/intel_guc_fwif.h:15:#include "abi/guc_actions_slpc_abi.h" gt/uc/intel_guc_fwif.h:16:#include "abi/guc_errors_abi.h" gt/uc/intel_guc_fwif.h:17:#include "abi/guc_communication_mmio_abi.h" gt/uc/intel_guc_fwif.h:18:#include "abi/guc_communication_ctb_abi.h" gt/uc/intel_guc_fwif.h:19:#include "abi/guc_klvs_abi.h" gt/uc/intel_guc_fwif.h:20:#include "abi/guc_messages_abi.h" They need to be broken up at some point. There are a bunch of headers where only minimal amount of info is actually needed in other headers, and the rest is used in a limited number of .c files only. It's a lot of tedious work to refactor and nobody's going to notice the impact directly, they'll just be less grumpy about the build being slow and the organization of the headers being messy. And if they don't build the driver a lot (like me) or don't refactor the driver a lot (like me) maybe they'll never notice. BR, Jani. > > Michal > >> >> BR, >> Jani. >> >>> >>> >>> John Harrison (5): >>> drm/i915/gt: Start adding module oriented dmesg output >>> drm/i915/huc: Add HuC specific debug print wrappers >>> drm/i915/guc: Add GuC specific debug print wrappers >>> drm/i915/guc: Add GuC CT specific debug print wrappers >>> drm/i915/uc: Update the gt/uc code to use gt_err and friends >>> >>> drivers/gpu/drm/i915/gt/intel_gt.c| 96 >>> drivers/gpu/drm/i915/gt/intel_gt.h| 35 +++ >>> drivers/gpu/drm/i915/gt/uc/intel_guc.c| 32 +-- >>> drivers/gpu/drm/i915/gt/uc/intel_guc.h| 35 +++ >>> drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 8 +- >>>
Re: Try to address the DMA-buf coherency problem
On Tue, 22 Nov 2022 at 18:34, Christian König wrote: > Am 22.11.22 um 15:36 schrieb Daniel Vetter: > > On Fri, Nov 18, 2022 at 11:32:19AM -0800, Rob Clark wrote: > >> On Thu, Nov 17, 2022 at 7:38 AM Nicolas Dufresne > >> wrote: > >>> Le jeudi 17 novembre 2022 à 13:10 +0100, Christian König a écrit : > >> DMA-Buf let's the exporter setup the DMA addresses the importer uses to > >> be able to directly decided where a certain operation should go. E.g. > >> we > >> have cases where for example a P2P write doesn't even go to memory, but > >> rather a doorbell BAR to trigger another operation. Throwing in CPU > >> round trips for explicit ownership transfer completely breaks that > >> concept. > > It sounds like we should have a dma_dev_is_coherent_with_dev() which > > accepts two (or an array?) of devices and tells the caller whether the > > devices need explicit ownership transfer. > No, exactly that's the concept I'm pushing back on very hard here. > > In other words explicit ownership transfer is not something we would > want as requirement in the framework, cause otherwise we break tons of > use cases which require concurrent access to the underlying buffer. > >>> I'm not pushing for this solution, but really felt the need to correct > >>> you here. > >>> I have quite some experience with ownership transfer mechanism, as this > >>> is how > >>> GStreamer framework works since 2000. Concurrent access is a really > >>> common use > >>> cases and it is quite well defined in that context. The bracketing system > >>> (in > >>> this case called map() unmap(), with flag stating the usage intention > >>> like reads > >>> and write) is combined the the refcount. The basic rules are simple: > >> This is all CPU oriented, I think Christian is talking about the case > >> where ownership transfer happens without CPU involvement, such as via > >> GPU waiting on a fence > > Yeah for pure device2device handover the rule pretty much has to be that > > any coherency management that needs to be done must be done from the > > device side (flushing device side caches and stuff like that) only. But > > under the assumption that _all_ cpu side management has been done already > > before the first device access started. > > > > And then the map/unmap respectively begin/end_cpu_access can be used what > > it was meant for with cpu side invalidation/flushing and stuff like that, > > while having pretty clear handover/ownership rules and hopefully not doing > > no unecessary flushes. And all that while allowing device acces to be > > pipelined. Worst case the exporter has to insert some pipelined cache > > flushes as a dma_fence pipelined work of its own between the device access > > when moving from one device to the other. That last part sucks a bit right > > now, because we don't have any dma_buf_attachment method which does this > > syncing without recreating the mapping, but in reality this is solved by > > caching mappings in the exporter (well dma-buf layer) nowadays. > > > > True concurrent access like vk/compute expects is still a model that > > dma-buf needs to support on top, but that's a special case and pretty much > > needs hw that supports such concurrent access without explicit handover > > and fencing. > > > > Aside from some historical accidents and still a few warts, I do think > > dma-buf does support both of these models. > > We should have come up with dma-heaps earlier and make it clear that > exporting a DMA-buf from a device gives you something device specific > which might or might not work with others. Yeah, but engineering practicalities were pretty clear that no one would rewrite the entire Xorg stack and all the drivers just to make that happen for prime. > Apart from that I agree, DMA-buf should be capable of handling this. > Question left is what documentation is missing to make it clear how > things are supposed to work? Given the historical baggage of existing use-case, I think the only way out is that we look at concrete examples from real world users that break, and figure out how to fix them. Without breaking any of the existing mess. One idea might be that we have a per-platform dma_buf_legacy_coherency_mode(), which tells you what mode (cpu cache snooping or uncached memory) you need to use to make sure that all devices agree. On x86 the rule might be that it's cpu cache snooping by default, but if you have an integrated gpu then everyone needs to use uncached. That sucks, but at least we could keep the existing mess going and clean it up. Everyone else would be uncached, except maybe arm64 servers with pcie connectors. Essentially least common denominator to make this work. Note that uncached actually means device access doesn't snoop, the cpu side you can handle with either uc/wc mappings or explicit flushing. Then once we have that we could implement the coherency negotiation protocol on top as an explicit opt-in,
Running into problems with UDMABUF
Hi Gerd, I'm running into a problem trying to use UDMABUF and I'm not sure if it's a bug in udmabuf, or I'm just using it wrong. I create a UDMABUF in userspace, enable CPU access to it, copy some data, disable CPU access, then register it with the IIO subsystem using an ioctl I'm working on (not yet upstream). In udmabuf.c, the "get_sg_table" function is first called by the "begin_cpu_udmabuf", and then called a second time by "map_udmabuf" when the IIO subsystem creates a dma_buf_attachment then maps it. Now the behaviour I'm seeing, is that the first call to "dma_map_sgtable" in get_sg_table sets sg->nents == 1 which is equal to sg_nents(sg->sgl) and as such is the expected value, but the second call to "dma_map_sgtable" will set sg->nents == 0 and I don't understand why. I wonder if you are seeing this behaviour as well? Cheers, -Paul
Re: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use
Hi Andy, I love your patch! Yet something to improve: [auto build test ERROR on usb/usb-testing] [also build test ERROR on usb/usb-next usb/usb-linus drm-intel/for-linux-next drm-intel/for-linux-next-fixes linus/master v6.1-rc6 next-20221122] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Andy-Shevchenko/i915-Move-list_count-to-list-h-for-broader-use/20221122-233657 base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing patch link: https://lore.kernel.org/r/20221122153516.52577-1-andriy.shevchenko%40linux.intel.com patch subject: [PATCH v3 1/4] i915: Move list_count() to list.h for broader use config: s390-defconfig compiler: s390-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/3d217ef769536f160f5c20224aeb0f9070bdfdcf git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Andy-Shevchenko/i915-Move-list_count-to-list-h-for-broader-use/20221122-233657 git checkout 3d217ef769536f160f5c20224aeb0f9070bdfdcf # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=s390 prepare If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot All errors (new ones prefixed by >>): scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr] scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr] scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples In file included from include/linux/preempt.h:11, from include/linux/percpu.h:6, from include/linux/context_tracking_state.h:5, from include/linux/hardirq.h:5, from include/linux/kvm_host.h:7, from arch/s390/kernel/asm-offsets.c:11: include/linux/list.h:662:8: warning: no previous prototype for 'list_count' [-Wmissing-prototypes] 662 | size_t list_count(struct list_head *head) |^~ In file included from include/linux/preempt.h:11, from arch/s390/include/asm/timex.h:13, from arch/s390/kernel/vdso64/getcpu.c:6: include/linux/list.h:662:8: warning: no previous prototype for 'list_count' [-Wmissing-prototypes] 662 | size_t list_count(struct list_head *head) |^~ In file included from include/linux/rculist.h:10, from include/linux/pid.h:5, from include/linux/sched.h:14, from arch/s390/include/asm/syscall.h:13, from arch/s390/include/asm/vdso/gettimeofday.h:9, from include/vdso/datapage.h:137, from arch/s390/kernel/vdso64/../../../../lib/vdso/gettimeofday.c:5, from arch/s390/kernel/vdso64/vdso64_generic.c:2: include/linux/list.h:662:8: warning: no previous prototype for 'list_count' [-Wmissing-prototypes] 662 | size_t list_count(struct list_head *head) |^~ s390-linux-ld: arch/s390/kernel/vdso64/getcpu.o: in function `list_count': >> include/linux/list.h:663: multiple definition of `list_count'; >> arch/s390/kernel/vdso64/vdso64_generic.o:include/linux/list.h:663: first >> defined here make[2]: *** [arch/s390/kernel/vdso64/Makefile:50: arch/s390/kernel/vdso64/vdso64.so.dbg] Error 1 make[2]: Target 'include/generated/vdso64-offsets.h' not remade because of errors. make[1]: *** [arch/s390/Makefile:159: vdso_prepare] Error 2 make[1]: Target 'prepare' not remade because of errors. make: *** [Makefile:231: __sub-make] Error 2 make: Target 'prepare' not remade because of errors. vim +663 include/linux/list.h 557 558 /** 559 * list_next_entry - get the next element in list 560 * @pos:the type * to cursor 561 * @member: the name of the list_head within the struct. 562 */ 563 #define list_next_entry(pos, member) \ 564 list_entry((pos)->member.next, typeof(*(pos)), member) 565 566 /** 567 * list_next_entry_circular - get the next element in list 568 * @pos:the type * to cursor. 569 * @head: the list head to take the element from. 570 * @member: the name of the list_head wit
Re: git send-email friendly smtp provider anyone?
On Tue, Nov 22, 2022 at 06:42:19PM +0100, Noralf Trønnes wrote: > The first thing that strikes me is that everyone mentioned in one of the > patches get the entire patchset, even sta...@vger.kernel.org (cc'ed in a > fixes patch). The first patch touches a core file and as a result a few > drivers, so I've cc'ed the driver maintainers in that patch, but now > they get the entire patchset where 5 of 6 patches is about a driver that > I maintain. So from their point of view, they see a patchset about a > driver they don't care about and a patch touching a core file, but from > the subject it's not apparent that it touches their driver. I'm afraid > that this might result in none of them looking at that patch. In this > particular case it's not that important, but in another case it might be. I did some (unscientific) polling among kernel maintainers and, by a vast margin, they always prefer to receive the entire series instead of cherry-picked patches -- having the entire series helps provide important context for the change they are looking at. So, this is deliberate and, for now at least, not configurable. Unless you're sending 100+ patch series, I doubt anyone will have any problem with receiving the whole series instead of individual patches. > As for the setting up the web endpoint, should I just follow the b4 docs > on that? > > I use b4 version 0.10.1, is that recent enough? Yes. There will be a 0.10.2 in the near future, but the incoming fixes shouldn't make much difference for the b4 send code. -K
Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap
On Tue, Nov 22, 2022 at 07:08:25PM +0100, Daniel Vetter wrote: > On Tue, 22 Nov 2022 at 19:04, Jason Gunthorpe wrote: > > > > On Tue, Nov 22, 2022 at 06:08:00PM +0100, Daniel Vetter wrote: > > > tldr; DMA buffers aren't normal memory, expecting that you can use > > > them like that (like calling get_user_pages works, or that they're > > > accounting like any other normal memory) cannot be guaranteed. > > > > > > Since some userspace only runs on integrated devices, where all > > > buffers are actually all resident system memory, there's a huge > > > temptation to assume that a struct page is always present and useable > > > like for any more pagecache backed mmap. This has the potential to > > > result in a uapi nightmare. > > > > > > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which > > > blocks get_user_pages and all the other struct page based > > > infrastructure for everyone. In spirit this is the uapi counterpart to > > > the kernel-internal CONFIG_DMABUF_DEBUG. > > > > > > Motivated by a recent patch which wanted to swich the system dma-buf > > > heap to vm_insert_page instead of vm_insert_pfn. > > > > > > v2: > > > > > > Jason brought up that we also want to guarantee that all ptes have the > > > pte_special flag set, to catch fast get_user_pages (on architectures > > > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would > > > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. > > > > > > From auditing the various functions to insert pfn pte entires > > > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like > > > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so > > > this should be the correct flag to check for. > > > > I didn't look at how this actually gets used, but it is a bit of a > > pain to insert a lifetime controlled object like a struct page as a > > special PTE/VM_PFNMAP > > > > How is the lifetime model implemented here? How do you know when > > userspace has finally unmapped the page? > > The vma has a filp which is the refcounted dma_buf. With dma_buf you > never get an individual page it's always the entire object. And it's > up to the allocator how exactly it wants to use or not use the page's > refcount. So if gup goes in and elevates the refcount, you can break > stuff, which is why I'm doing this. But how does move work? Jason
Re: [Intel-gfx] [PATCH 2/6] drm/i915/gsc: Skip the version check when fetching the GSC FW
On Mon, Nov 21, 2022 at 03:16:13PM -0800, Daniele Ceraolo Spurio wrote: > The current exectation from the FW side is that the driver will query > the GSC FW version after the FW is loaded, similarly to what the mei > driver does on DG2. However, we're discussing with the FW team if there > is a way to extract the version from the bin file before loading, so we > can keep the code the same as for older FWs. > > Since the GSC FW version is not currently required for functionality and > is only needed for debug purposes, we can skip the FW version for now at > fetch time and add it later on when we've agreed on the approach. > > Signed-off-by: Daniele Ceraolo Spurio > Cc: Alan Previn > Cc: John Harrison Reviewed-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 29 +++- > 1 file changed, 23 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c > b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c > index 5b2e4503aee7..3df52fd59edc 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c > @@ -564,6 +564,26 @@ static int check_ccs_header(struct intel_gt *gt, > return 0; > } > > +static int check_fw_header(struct intel_gt *gt, > +const struct firmware *fw, > +struct intel_uc_fw *uc_fw) > +{ > + int err = 0; > + > + /* GSC FW version is queried after the FW is loaded */ > + if (uc_fw->type == INTEL_UC_FW_TYPE_GSC) > + return 0; > + > + if (uc_fw->loaded_via_gsc) > + err = check_gsc_manifest(fw, uc_fw); > + else > + err = check_ccs_header(gt, fw, uc_fw); > + if (err) > + return err; > + > + return 0; > +} > + > /** > * intel_uc_fw_fetch - fetch uC firmware > * @uc_fw: uC firmware > @@ -633,14 +653,11 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw) > if (err) > goto fail; > > - if (uc_fw->loaded_via_gsc) > - err = check_gsc_manifest(fw, uc_fw); > - else > - err = check_ccs_header(gt, fw, uc_fw); > + err = check_fw_header(gt, fw, uc_fw); > if (err) > goto fail; > > - if (uc_fw->file_wanted.major_ver) { > + if (uc_fw->file_wanted.major_ver && uc_fw->file_selected.major_ver) { > /* Check the file's major version was as it claimed */ > if (uc_fw->file_selected.major_ver != > uc_fw->file_wanted.major_ver) { > drm_notice(&i915->drm, "%s firmware %s: unexpected > version: %u.%u != %u.%u\n", > @@ -657,7 +674,7 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw) > } > } > > - if (old_ver) { > + if (old_ver && uc_fw->file_selected.major_ver) { > /* Preserve the version that was really wanted */ > memcpy(&uc_fw->file_wanted, &file_ideal, > sizeof(uc_fw->file_wanted)); > > -- > 2.37.3 >
[PATCH v2 0/4] Add guard padding around i915_vma
Hi, This series adds guards around vma's but setting a pages at the beginning and at the end that work as padding. The first user of the vma guard are scanout objects which don't need anymore to add scratch to all the unused ggtt's and speeding up up considerably the boot and resume by several hundreds of milliseconds up to over a full second in slower machines. Because of this we don't need anymore 2ef6efa79fec ("drm/i915: Improve on suspend / resume time with VT-d enabled") which gets reverted. Changelog = v1 -> v2: - Revert 2ef6efa79fec ("drm/i915: Improve on suspend / resume time with VT-d enabled") Andi Andi Shyti (1): Revert "drm/i915: Improve on suspend / resume time with VT-d enabled" Chris Wilson (3): drm/i915: Wrap all access to i915_vma.node.start|size drm/i915: Introduce guard pages to i915_vma drm/i915: Refine VT-d scanout workaround drivers/gpu/drm/i915/display/intel_fbdev.c| 2 +- drivers/gpu/drm/i915/gem/i915_gem_domain.c| 13 +++ .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 33 +++--- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_tiling.c| 4 +- .../gpu/drm/i915/gem/selftests/huge_pages.c | 2 +- .../i915/gem/selftests/i915_gem_client_blt.c | 23 ++-- .../drm/i915/gem/selftests/i915_gem_context.c | 15 ++- .../drm/i915/gem/selftests/i915_gem_mman.c| 2 +- .../drm/i915/gem/selftests/igt_gem_utils.c| 7 +- drivers/gpu/drm/i915/gt/gen7_renderclear.c| 2 +- drivers/gpu/drm/i915/gt/intel_ggtt.c | 108 -- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 3 +- drivers/gpu/drm/i915/gt/intel_gtt.h | 24 drivers/gpu/drm/i915/gt/intel_renderstate.c | 2 +- .../gpu/drm/i915/gt/intel_ring_submission.c | 2 +- drivers/gpu/drm/i915/gt/selftest_engine_cs.c | 8 +- drivers/gpu/drm/i915/gt/selftest_execlists.c | 18 +-- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 15 +-- drivers/gpu/drm/i915/gt/selftest_lrc.c| 16 +-- .../drm/i915/gt/selftest_ring_submission.c| 2 +- drivers/gpu/drm/i915/gt/selftest_rps.c| 12 +- .../gpu/drm/i915/gt/selftest_workarounds.c| 8 +- drivers/gpu/drm/i915/i915_cmd_parser.c| 4 +- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_driver.c| 16 --- drivers/gpu/drm/i915/i915_gem_gtt.h | 3 +- drivers/gpu/drm/i915/i915_perf.c | 2 +- drivers/gpu/drm/i915/i915_vma.c | 59 +++--- drivers/gpu/drm/i915/i915_vma.h | 52 - drivers/gpu/drm/i915/i915_vma_resource.c | 4 +- drivers/gpu/drm/i915/i915_vma_resource.h | 17 ++- drivers/gpu/drm/i915/i915_vma_types.h | 3 +- drivers/gpu/drm/i915/selftests/i915_request.c | 20 ++-- drivers/gpu/drm/i915/selftests/igt_spinner.c | 8 +- 36 files changed, 259 insertions(+), 256 deletions(-) -- 2.38.1
[PATCH v2 1/4] drm/i915: Wrap all access to i915_vma.node.start|size
From: Chris Wilson We already wrap i915_vma.node.start for use with the GGTT, as there we can perform additional sanity checks that the node belongs to the GGTT and fits within the 32b registers. In the next couple of patches, we will introduce guard pages around the objects _inside_ the drm_mm_node allocation. That is we will offset the vma->pages so that the first page is at drm_mm_node.start + vma->guard (not 0 as is currently the case). All users must then not use i915_vma.node.start directly, but compute the guard offset, thus all users are converted to use a i915_vma_offset() wrapper. The notable exceptions are the selftests that are testing exact behaviour of i915_vma_pin/i915_vma_insert. Signed-off-by: Chris Wilson Signed-off-by: Tejas Upadhyay Co-developed-by: Thomas Hellström Signed-off-by: Thomas Hellström Signed-off-by: Tvrtko Ursulin Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/display/intel_fbdev.c| 2 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 33 ++-- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_tiling.c| 4 +- .../gpu/drm/i915/gem/selftests/huge_pages.c | 2 +- .../i915/gem/selftests/i915_gem_client_blt.c | 23 + .../drm/i915/gem/selftests/i915_gem_context.c | 15 +++--- .../drm/i915/gem/selftests/i915_gem_mman.c| 2 +- .../drm/i915/gem/selftests/igt_gem_utils.c| 7 +-- drivers/gpu/drm/i915/gt/gen7_renderclear.c| 2 +- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 3 +- drivers/gpu/drm/i915/gt/intel_renderstate.c | 2 +- .../gpu/drm/i915/gt/intel_ring_submission.c | 2 +- drivers/gpu/drm/i915/gt/selftest_engine_cs.c | 8 +-- drivers/gpu/drm/i915/gt/selftest_execlists.c | 18 +++ drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 15 +++--- drivers/gpu/drm/i915/gt/selftest_lrc.c| 16 +++--- .../drm/i915/gt/selftest_ring_submission.c| 2 +- drivers/gpu/drm/i915/gt/selftest_rps.c| 12 ++--- .../gpu/drm/i915/gt/selftest_workarounds.c| 8 +-- drivers/gpu/drm/i915/i915_cmd_parser.c| 4 +- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_perf.c | 2 +- drivers/gpu/drm/i915/i915_vma.c | 25 - drivers/gpu/drm/i915/i915_vma.h | 51 +-- drivers/gpu/drm/i915/i915_vma_resource.h | 10 ++-- drivers/gpu/drm/i915/selftests/i915_request.c | 20 drivers/gpu/drm/i915/selftests/igt_spinner.c | 8 +-- 29 files changed, 180 insertions(+), 122 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c index 5575d7abdc092..03ed4607a46d2 100644 --- a/drivers/gpu/drm/i915/display/intel_fbdev.c +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c @@ -286,7 +286,7 @@ static int intelfb_create(struct drm_fb_helper *helper, /* Our framebuffer is the entirety of fbdev's system memory */ info->fix.smem_start = - (unsigned long)(ggtt->gmadr.start + vma->node.start); + (unsigned long)(ggtt->gmadr.start + i915_ggtt_offset(vma)); info->fix.smem_len = vma->size; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 29e9e8d5b6fec..86956b902c978 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -379,22 +379,25 @@ eb_vma_misplaced(const struct drm_i915_gem_exec_object2 *entry, const struct i915_vma *vma, unsigned int flags) { - if (vma->node.size < entry->pad_to_size) + const u64 start = i915_vma_offset(vma); + const u64 size = i915_vma_size(vma); + + if (size < entry->pad_to_size) return true; - if (entry->alignment && !IS_ALIGNED(vma->node.start, entry->alignment)) + if (entry->alignment && !IS_ALIGNED(start, entry->alignment)) return true; if (flags & EXEC_OBJECT_PINNED && - vma->node.start != entry->offset) + start != entry->offset) return true; if (flags & __EXEC_OBJECT_NEEDS_BIAS && - vma->node.start < BATCH_OFFSET_BIAS) + start < BATCH_OFFSET_BIAS) return true; if (!(flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) && - (vma->node.start + vma->node.size + 4095) >> 32) + (start + size + 4095) >> 32) return true; if (flags & __EXEC_OBJECT_NEEDS_MAP && @@ -440,7 +443,7 @@ eb_pin_vma(struct i915_execbuffer *eb, int err; if (vma->node.size) - pin_flags = vma->node.start; + pin_flags = __i915_vma_offset(vma); else pin_flags = entry->offset & PIN_OFFSET_MASK; @@ -663,8 +666,8 @@ static int eb_reserve_vma(str