Re: [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver
On 7/16/24 16:40, Jason Gunthorpe wrote: > On Sun, Jul 14, 2024 at 10:18:12AM +, Omer Shpigelman wrote: >> On 7/12/24 16:08, Jason Gunthorpe wrote: >>> [You don't often get email from j...@ziepe.ca. Learn why this is important >>> at https://aka.ms/LearnAboutSenderIdentification ] >>> >>> On Fri, Jun 28, 2024 at 10:24:32AM +, Omer Shpigelman wrote: >>> We need the core driver to access the IB driver (and to the ETH driver as well). As you wrote, we can't use exported symbols from our IB driver nor rely on function pointers, but what about providing the core driver an ops structure? meaning exporting a register function from the core driver that should be called by the IB driver during auxiliary device probe. Something like: int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev, struct hbl_ib_ops *ops) { ... } EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev); >>> >>> Definately do not do some kind of double-register like this. >>> >>> The auxiliary_device scheme can already be extended to provide ops for >>> each sub device. >>> >>> Like >>> >>> struct habana_driver { >>>struct auxiliary_driver base; >>>const struct habana_ops *ops; >>> }; >>> >>> If the ops are justified or not is a different question. >>> >> >> Well, I suggested this double-register option because I got a comment that >> the design pattern of embedded ops structure shouldn't be used. >> So I'm confused now... > > Yeah, don't stick ops in random places, but the device_driver is the > right place. > Sorry, let me explain again. My original code has an ops structure exactly like you are suggesting now (see struct hbl_aux_dev in the first patch of the series). But I was instructed not to use this ops structure and to rely on exported symbols for inter-driver communication. I'll be happy to use this ops structure like in your example rather than converting my code to use exported symbols. Leon - am I missing anything? what's the verdict here?
[PATCH] drm/ci: Upgrade setuptools requirement to 70.0.0
GitHub Dependabot has issued the following alert: "Upgrade setuptools to version 70.0.0 or later. A vulnerability in the package_index module of pypa/setuptools versions up to 69.1.1 allows for remote code execution via its download functions. These functions, which are used to download packages from URLs provided by users or retrieved from package index servers, are susceptible to code injection. If these functions are exposed to user-controlled inputs, such as package URLs, they can execute arbitrary commands on the system. The issue is fixed in version 70.0. Severity: 8.8 / 10 (High) Attack vector:Network Attack complexity:Low Privileges required: None User interaction:Required Scope: Unchanged Confidentiality: High Integrity: High Availability:High CVE ID: CVE-2024-6345" To avoid disturbing everyone with the kernel repo hosted on GitHub, I suggest we upgrade our python dependencies once again to appease GitHub Dependabot. Link: https://github.com/dependabot Signed-off-by: WangYuli --- drivers/gpu/drm/ci/xfails/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ci/xfails/requirements.txt b/drivers/gpu/drm/ci/xfails/requirements.txt index e9994c9db799..5e6d48d98e4e 100644 --- a/drivers/gpu/drm/ci/xfails/requirements.txt +++ b/drivers/gpu/drm/ci/xfails/requirements.txt @@ -11,7 +11,7 @@ requests==2.31.0 requests-toolbelt==1.0.0 ruamel.yaml==0.17.32 ruamel.yaml.clib==0.2.7 -setuptools==68.0.0 +setuptools==70.0.0 tenacity==8.2.3 urllib3==2.0.7 wheel==0.41.1 -- 2.43.4
Re: [PATCH] drm/rockchip: dsi: Reset ISP1 DPHY before powering it on
On Wed, Jul 17, 2024 at 08:48:29AM GMT, Dragan Simic wrote: > Hello all, > > On 2024-07-17 08:29, Dragan Simic wrote: > > From: Ondrej Jirman > > > > After a suspend and resume cycle, ISP1 stops receiving data, as observed > > on the Pine64 PinePhone Pro, which is based on the Rockchip RK3399 SoC. > > Re-initializing DPHY during the PHY power-on, if the SoC variant > > supports > > initialization, fixes this issue. > > > > [ dsimic: Added more details to the commit summary and description ] > > > > Signed-off-by: Ondrej Jirman > > Signed-off-by: Dragan Simic > > --- > > drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 8 > > 1 file changed, 8 insertions(+) > > > > diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c > > b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c > > index 4cc8ed8f4fbd..9ad48c6dfac3 100644 > > --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c > > +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c > > @@ -1240,6 +1240,14 @@ static int dw_mipi_dsi_dphy_power_on(struct phy > > *phy) > > goto err_phy_cfg_clk; > > } > > > > + if (dsi->cdata->dphy_rx_init) { > > + ret = dsi->cdata->dphy_rx_init(phy); > > + if (ret < 0) { > > + DRM_DEV_ERROR(dsi->dev, "hardware-specific phy init > > failed: %d\n", > > ret); > > + goto err_pwr_on; > > + } > > + } > > + > > /* do soc-variant specific init */ > > if (dsi->cdata->dphy_rx_power_on) { > > ret = dsi->cdata->dphy_rx_power_on(phy); > > After thinking a bit more about this patch in its original form [1] > that's preserved above, I think it would be better to move the > additional DPHY initialization to dw_mipi_dsi_rockchip_resume(), > because that function seems to be the right place for such fixes. > > Please, let me know your thoughts. That also works (see attachment) to fix the original issue in the commit message, but if you keep the stream on across suspend/resume it does halt so it's not a complete solution either. Kind regards, o. > [1] > https://megous.com/git/linux/commit/?h=orange-pi-6.9&id=ed7992f668a1e529719ee6847ca114f9b67efacb >From 4db7a9a913bb4b78094d7845295d2c3306513c56 Mon Sep 17 00:00:00 2001 From: Ondrej Jirman Date: Wed, 17 Jul 2024 09:30:39 +0200 Subject: [PATCH] drm: rockchip: dw-mipi-dsi-rockchip: Restore DPHY config on resume After a suspend and resume cycle, ISP1 stops receiving data, as observed on the Pine64 PinePhone Pro, which is based on the Rockchip RK3399 SoC. Re-initializing DPHY during resume, if the SoC variant supports initialization, fixes this issue. Signed-off-by: Ondrej Jirman --- .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 51 --- 1 file changed, 34 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index 364b738b6935..c2b58e545080 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -1118,6 +1118,31 @@ static const struct component_ops dw_mipi_dsi_rockchip_dphy_ops = { .unbind = dw_mipi_dsi_rockchip_dphy_unbind, }; +static int dw_mipi_dsi_dphy_rx_init(struct dw_mipi_dsi_rockchip *dsi) +{ + int ret; + + if (dsi->cdata->dphy_rx_init) { + ret = clk_prepare_enable(dsi->pclk); + if (ret < 0) + return ret; + + ret = clk_prepare_enable(dsi->grf_clk); + if (ret) { + clk_disable_unprepare(dsi->pclk); + return ret; + } + + ret = dsi->cdata->dphy_rx_init(dsi->dphy); + clk_disable_unprepare(dsi->grf_clk); + clk_disable_unprepare(dsi->pclk); + if (ret < 0) + return ret; + } + + return 0; +} + static int dw_mipi_dsi_dphy_init(struct phy *phy) { struct dw_mipi_dsi_rockchip *dsi = phy_get_drvdata(phy); @@ -1138,23 +1163,9 @@ static int dw_mipi_dsi_dphy_init(struct phy *phy) if (ret < 0) goto err_graph; - if (dsi->cdata->dphy_rx_init) { - ret = clk_prepare_enable(dsi->pclk); - if (ret < 0) - goto err_init; - - ret = clk_prepare_enable(dsi->grf_clk); - if (ret) { - clk_disable_unprepare(dsi->pclk); - goto err_init; - } - - ret = dsi->cdata->dphy_rx_init(phy); - clk_disable_unprepare(dsi->grf_clk); - clk_disable_unprepare(dsi->pclk); - if (ret < 0) - goto err_init; - } + ret = dw_mipi_dsi_dphy_rx_init(dsi); + if (ret < 0) + goto err_init; return 0; @@ -1337,6 +1348,12 @@ static int __maybe_unused dw_mipi_dsi_rockchip_resume(struct device *dev)
Re: [PATCH 1/2] dma-buf: heaps: DMA_HEAP_IOCTL_ALLOC_READ_FILE framework
在 2024/7/16 20:07, Christian König 写道: Am 16.07.24 um 11:31 schrieb Daniel Vetter: On Tue, Jul 16, 2024 at 10:48:40AM +0800, Huan Yang wrote: I just research the udmabuf, Please correct me if I'm wrong. 在 2024/7/15 20:32, Christian König 写道: Am 15.07.24 um 11:11 schrieb Daniel Vetter: On Thu, Jul 11, 2024 at 11:00:02AM +0200, Christian König wrote: Am 11.07.24 um 09:42 schrieb Huan Yang: Some user may need load file into dma-buf, current way is: 1. allocate a dma-buf, get dma-buf fd 2. mmap dma-buf fd into vaddr 3. read(file_fd, vaddr, fsz) This is too heavy if fsz reached to GB. You need to describe a bit more why that is to heavy. I can only assume you need to save memory bandwidth and avoid the extra copy with the CPU. This patch implement a feature called DMA_HEAP_IOCTL_ALLOC_READ_FILE. User need to offer a file_fd which you want to load into dma-buf, then, it promise if you got a dma-buf fd, it will contains the file content. Interesting idea, that has at least more potential than trying to enable direct I/O on mmap()ed DMA-bufs. The approach with the new IOCTL might not work because it is a very specialized use case. But IIRC there was a copy_file_range callback in the file_operations structure you could use for that. I'm just not sure when and how that's used with the copy_file_range() system call. I'm not sure any of those help, because internally they're all still based on struct page (or maybe in the future on folios). And that's the thing dma-buf can't give you, at least without peaking behind the curtain. I think an entirely different option would be malloc+udmabuf. That essentially handles the impendence-mismatch between direct I/O and dma-buf on the dma-buf side. The downside is that it'll make the permanently pinned memory accounting and tracking issues even more apparent, but I guess eventually we do need to sort that one out. Oh, very good idea! Just one minor correction: it's not malloc+udmabuf, but rather create_memfd()+udmabuf. Hm right, it's create_memfd() + mmap(memfd) + udmabuf And you need to complete your direct I/O before creating the udmabuf since that reference will prevent direct I/O from working. udmabuf will pin all pages, so, if returned fd, can't trigger direct I/O (same as dmabuf). So, must complete read before pin it. Why does pinning prevent direct I/O? I haven't tested, but I'd expect the rdma folks would be really annoyed if that's the case ... I used to believe that a pinned page cannot be re-pinned, so performing direct I/O on it would fail. But I misunderstood, and it doesn't have any impact. dma-buf mmap vaddr can't to trigger direct I/O due to can't pin kernel page(PFN), So, not same. Pinning (or rather taking another page reference) prevents writes from using direct I/O because writes try to find all references and make them read only so that nobody modifies the content while the write is done. As far as I know the same approach is used for NUMA migration and replacing small pages with big ones in THP. But for the read case here it should still work. Hmm, with udmabuf direct I/O test, I find this will not effect it. Test code I set in email tail. Maybe pin only let page can't be reclaimed, rather prevent the write? With mine test, udmabuf direct I/O read 3GB file, average cost 2.2s.(I use ftrace to trace f2fs_direct_IO can make sure direct IO trigger success), Same as mine normal cache file read cost My patchset average is 1.2s,The difference between the two was obvious before. But current way is use `memfd_pin_folios` to boost alloc and pin, so maybe need suit it. I currently doubt that the udmabuf solution is suitable for our gigabyte-level read operations. 1. The current mmap operation uses faulting, so frequent page faults will be triggered during reads, resulting in a lot of context switching overhead. 2. current udmabuf size limit is 64MB, even can change, maybe not good to use in large size? Yeah that's just a figleaf so we don't have to bother about the accounting issue. 3. The migration and adaptation of the driver is also a challenge, and currently, we are unable to control it. Why does a udmabuf fd not work instead of any other dmabuf fd? That shouldn't matter for the consuming driver ... Perhaps implementing `copy_file_range` would be more suitable for us. See my other mail, fundamentally these all rely on struct page being present, and dma-buf doesn't give you that. Which means you need to go below the dma-buf abstraction. And udmabuf is pretty much the thing for that, because it wraps normal struct page memory into a dmabuf. And copy_file_range on the underlying memfd might already work, I haven't checked though. Yeah completely agree. Regards, Christian. Cheers, Sima Test code, if test above 2GB, need this patch: https://lore.kernel.org/all/20240717065444.369876-1-l...@vivo.com/ ```c // SPDX-License-Identifier: GPL-2.0 #define _GNU_SOURCE #define
Re: [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver
On Wed, Jul 17, 2024 at 07:08:59AM +, Omer Shpigelman wrote: > On 7/16/24 16:40, Jason Gunthorpe wrote: > > On Sun, Jul 14, 2024 at 10:18:12AM +, Omer Shpigelman wrote: > >> On 7/12/24 16:08, Jason Gunthorpe wrote: > >>> [You don't often get email from j...@ziepe.ca. Learn why this is > >>> important at https://aka.ms/LearnAboutSenderIdentification ] > >>> > >>> On Fri, Jun 28, 2024 at 10:24:32AM +, Omer Shpigelman wrote: > >>> > We need the core driver to access the IB driver (and to the ETH driver as > well). As you wrote, we can't use exported symbols from our IB driver nor > rely on function pointers, but what about providing the core driver an > ops > structure? meaning exporting a register function from the core driver > that > should be called by the IB driver during auxiliary device probe. > Something like: > > int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev, > struct hbl_ib_ops *ops) > { > ... > } > EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev); > >>> > >>> Definately do not do some kind of double-register like this. > >>> > >>> The auxiliary_device scheme can already be extended to provide ops for > >>> each sub device. > >>> > >>> Like > >>> > >>> struct habana_driver { > >>>struct auxiliary_driver base; > >>>const struct habana_ops *ops; > >>> }; > >>> > >>> If the ops are justified or not is a different question. > >>> > >> > >> Well, I suggested this double-register option because I got a comment that > >> the design pattern of embedded ops structure shouldn't be used. > >> So I'm confused now... > > > > Yeah, don't stick ops in random places, but the device_driver is the > > right place. > > > > Sorry, let me explain again. My original code has an ops structure > exactly like you are suggesting now (see struct hbl_aux_dev in the first > patch of the series). But I was instructed not to use this ops structure > and to rely on exported symbols for inter-driver communication. > I'll be happy to use this ops structure like in your example rather than > converting my code to use exported symbols. > Leon - am I missing anything? what's the verdict here? You are missing the main sentence from Jason's response: "don't stick ops in random places". It is fine to have ops in device driver, so the core driver can call them. However, in your original code, you added ops everywhere. It caused to the need to implement module reference counting and crazy stuff like calls to lock and unlock functions from the aux driver to the core. Verdict is still the same. Core driver should provide EXPORT_SYMBOLs, so the aux driver can call them directly and enjoy from proper module loading and unloading. The aux driver can have ops in the device driver, so the core driver can call them to perform something specific for that aux driver. Calls between aux drivers should be done via the core driver. Thanks
Re: [PATCH] drm/rockchip: dsi: Reset ISP1 DPHY before powering it on
Hello Ondrej, On 2024-07-17 09:32, Ondřej Jirman wrote: On Wed, Jul 17, 2024 at 08:48:29AM GMT, Dragan Simic wrote: Hello all, On 2024-07-17 08:29, Dragan Simic wrote: > From: Ondrej Jirman > > After a suspend and resume cycle, ISP1 stops receiving data, as observed > on the Pine64 PinePhone Pro, which is based on the Rockchip RK3399 SoC. > Re-initializing DPHY during the PHY power-on, if the SoC variant > supports > initialization, fixes this issue. > > [ dsimic: Added more details to the commit summary and description ] > > Signed-off-by: Ondrej Jirman > Signed-off-by: Dragan Simic > --- > drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c > b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c > index 4cc8ed8f4fbd..9ad48c6dfac3 100644 > --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c > +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c > @@ -1240,6 +1240,14 @@ static int dw_mipi_dsi_dphy_power_on(struct phy > *phy) >goto err_phy_cfg_clk; >} > > + if (dsi->cdata->dphy_rx_init) { > + ret = dsi->cdata->dphy_rx_init(phy); > + if (ret < 0) { > + DRM_DEV_ERROR(dsi->dev, "hardware-specific phy init failed: %d\n", > ret); > + goto err_pwr_on; > + } > + } > + >/* do soc-variant specific init */ >if (dsi->cdata->dphy_rx_power_on) { >ret = dsi->cdata->dphy_rx_power_on(phy); After thinking a bit more about this patch in its original form [1] that's preserved above, I think it would be better to move the additional DPHY initialization to dw_mipi_dsi_rockchip_resume(), because that function seems to be the right place for such fixes. Please, let me know your thoughts. That also works (see attachment) to fix the original issue in the commit message, but if you keep the stream on across suspend/resume it does halt so it's not a complete solution either. Great, thanks for the attached patch. I assume that you already have a patch that performs the other required operations on suspend and resume, i.e. stops the stream and restarts it? How about dropping my "handled" variant of your patch and having you submit the patch you sent as attachment, and the additional patch you described as also needed? [1] https://megous.com/git/linux/commit/?h=orange-pi-6.9&id=ed7992f668a1e529719ee6847ca114f9b67efacb
[Bug 201497] [amdgpu]: '*ERROR* No EDID read' is back in 4.19
https://bugzilla.kernel.org/show_bug.cgi?id=201497 Artem S. Tashkinov (a...@gmx.com) changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |ANSWERED --- Comment #32 from Artem S. Tashkinov (a...@gmx.com) --- All the amdgpu hackers are here, please refile and attach dmesg for 6.10 if you're still affected: https://gitlab.freedesktop.org/drm/amd/-/issues -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
[Bug 201497] [amdgpu]: '*ERROR* No EDID read' is back in 4.19
https://bugzilla.kernel.org/show_bug.cgi?id=201497 Artem S. Tashkinov (a...@gmx.com) changed: What|Removed |Added URL||https://gitlab.freedesktop. ||org/drm/amd/-/issues/3494 --- Comment #33 from Artem S. Tashkinov (a...@gmx.com) --- Please take it to here: https://gitlab.freedesktop.org/drm/amd/-/issues/3494 -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
[PATCH 0/3] drm/panic: Remove build time dependency with FRAMEBUFFER_CONSOLE
When proposing to enable DRM_PANIC on Fedora, some users raised concern about the need to disable VT_CONSOLE. So this is my new attempt to avoid fbcon/vt_console to overwrite the panic screen. This time it doesn't involve any locking, so it should be safe. I added a skip_panic option in struct fb_info, and check if this option and the panic_cpu are set in fb_is_inactive(), to prevent any framebuffer operation. Also skip_panic is only true if the drm driver supports drm_panic, so you will still get the VT panic info on drivers that don't have drm_panic support yet. Jocelyn Falempe (3): drm/panic: Add drm_panic_is_enabled() fbcon: Add an option to disable fbcon in panic. drm/panic: Remove build time dependency with FRAMEBUFFER_CONSOLE drivers/gpu/drm/Kconfig | 2 +- drivers/gpu/drm/drm_fb_helper.c | 2 ++ drivers/gpu/drm/drm_panic.c | 20 drivers/video/fbdev/core/fbcon.c | 7 ++- include/drm/drm_panic.h | 2 ++ include/linux/fb.h | 1 + 6 files changed, 32 insertions(+), 2 deletions(-) base-commit: a237f217bad50c381773da5b00442710d1449098 -- 2.45.2
[PATCH 1/3] drm/panic: Add drm_panic_is_enabled()
It allows to check if the drm device supports drm_panic. Prepare the work to have better integration with fbcon and vtconsole. Signed-off-by: Jocelyn Falempe --- drivers/gpu/drm/drm_panic.c | 20 include/drm/drm_panic.h | 2 ++ 2 files changed, 22 insertions(+) diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c index 948aed00595e..d9a25c2d0a65 100644 --- a/drivers/gpu/drm/drm_panic.c +++ b/drivers/gpu/drm/drm_panic.c @@ -703,6 +703,26 @@ static void debugfs_register_plane(struct drm_plane *plane, int index) static void debugfs_register_plane(struct drm_plane *plane, int index) {} #endif /* CONFIG_DRM_PANIC_DEBUG */ +/** + * drm_panic_is_enabled + * @dev: the drm device that may supports drm_panic + * + * returns true if the drm device supports drm_panic + */ +bool drm_panic_is_enabled(struct drm_device *dev) +{ + struct drm_plane *plane; + + if (!dev->mode_config.num_total_plane) + return false; + + drm_for_each_plane(plane, dev) + if (plane->helper_private && plane->helper_private->get_scanout_buffer) + return true; + return false; +} +EXPORT_SYMBOL(drm_panic_is_enabled); + /** * drm_panic_register() - Initialize DRM panic for a device * @dev: the drm device on which the panic screen will be displayed. diff --git a/include/drm/drm_panic.h b/include/drm/drm_panic.h index 73bb3f3d9ed9..c3a358dc3e27 100644 --- a/include/drm/drm_panic.h +++ b/include/drm/drm_panic.h @@ -148,11 +148,13 @@ struct drm_scanout_buffer { #ifdef CONFIG_DRM_PANIC +bool drm_panic_is_enabled(struct drm_device *dev); void drm_panic_register(struct drm_device *dev); void drm_panic_unregister(struct drm_device *dev); #else +bool drm_panic_is_enabled(struct drm_device *dev) {return false; } static inline void drm_panic_register(struct drm_device *dev) {} static inline void drm_panic_unregister(struct drm_device *dev) {} -- 2.45.2
[PATCH 2/3] fbcon: Add an option to disable fbcon in panic.
This is required to avoid conflict between DRM_PANIC, and fbcon. If a drm device already handle panic with drm_panic, it should set the skip_panic field in fb_info, so that fbcon will stay quiet, and not overwrite the panic_screen. Signed-off-by: Jocelyn Falempe --- drivers/gpu/drm/drm_fb_helper.c | 2 ++ drivers/video/fbdev/core/fbcon.c | 7 ++- include/linux/fb.h | 1 + 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index e2e19f49342e..3662d664d8f9 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -40,6 +40,7 @@ #include #include #include +#include #include #include @@ -524,6 +525,7 @@ struct fb_info *drm_fb_helper_alloc_info(struct drm_fb_helper *fb_helper) fb_helper->info = info; info->skip_vt_switch = true; + info->skip_panic = drm_panic_is_enabled(fb_helper->dev); return info; err_release: diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c index 3f7333dca508..498d9c07df80 100644 --- a/drivers/video/fbdev/core/fbcon.c +++ b/drivers/video/fbdev/core/fbcon.c @@ -270,12 +270,17 @@ static int fbcon_get_rotate(struct fb_info *info) return (ops) ? ops->rotate : 0; } +static bool fbcon_skip_panic(struct fb_info *info) +{ + return (info->skip_panic && unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID)); +} + static inline int fbcon_is_inactive(struct vc_data *vc, struct fb_info *info) { struct fbcon_ops *ops = info->fbcon_par; return (info->state != FBINFO_STATE_RUNNING || - vc->vc_mode != KD_TEXT || ops->graphics); + vc->vc_mode != KD_TEXT || ops->graphics || fbcon_skip_panic(info)); } static int get_color(struct vc_data *vc, struct fb_info *info, diff --git a/include/linux/fb.h b/include/linux/fb.h index db7d97b10964..865dad03e73e 100644 --- a/include/linux/fb.h +++ b/include/linux/fb.h @@ -510,6 +510,7 @@ struct fb_info { void *par; bool skip_vt_switch; /* no VT switch on suspend/resume required */ + bool skip_panic; /* Do not write to the fb after a panic */ }; /* This will go away -- 2.45.2
[PATCH 3/3] drm/panic: Remove build time dependency with FRAMEBUFFER_CONSOLE
Now that fbcon has the skip_panic option, there is no more conflicts between drm_panic and fbcon. Remove the build time dependency, so they can be both enabled. Signed-off-by: Jocelyn Falempe --- drivers/gpu/drm/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 6dd0016fc9cd..a22cab218004 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -107,7 +107,7 @@ config DRM_KMS_HELPER config DRM_PANIC bool "Display a user-friendly message when a kernel panic occurs" - depends on DRM && !(FRAMEBUFFER_CONSOLE && VT_CONSOLE) + depends on DRM select FONT_SUPPORT help Enable a drm panic handler, which will display a user-friendly message -- 2.45.2
Re: [PATCH v5 2/2] drm/amdgpu: Add address alignment support to DCC buffers
Well that approach was discussed before and seemed to be to complicated. But I totally agree that the AMDGPU_GEM_CREATE_GFX12_DCC flag is a bad idea. This isn't anything userspace should need to specify in the first place. All we need is a hardware workaround which kicks in all the time while pinning BOs for this specific hw generation and texture channel configuration. Please remove the AMDGPU_GEM_CREATE_GFX12_DCC flag again if possible or specify why it is actually necessary? Regards, Christian. Am 17.07.24 um 05:44 schrieb Marek Olšák: AMDGPU_GEM_CREATE_GFX12_DCC is set on 90% of all memory allocations, and almost all of them are not displayable. Shouldn't we use a different way to indicate that we need a non-power-of-two alignment, such as looking at the alignment field directly? Marek On Tue, Jul 16, 2024, 11:45 Arunpravin Paneer Selvam wrote: Add address alignment support to the DCC VRAM buffers. v2: - adjust size based on the max_texture_channel_caches values only for GFX12 DCC buffers. - used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only for DCC buffers. - roundup non power of two DCC buffer adjusted size to nearest power of two number as the buddy allocator does not support non power of two alignments. This applies only to the contiguous DCC buffers. v3:(Alex) - rewrite the max texture channel caches comparison code in an algorithmic way to determine the alignment size. v4:(Alex) - Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c and add a new gmc func callback for dcc alignment. If the callback is non-NULL, call it to get the alignment, otherwise, use the default. v5:(Alex) - Set the Alignment to a default value if the callback doesn't exist. - Add the callback to amdgpu_gmc_funcs. v6: - Fix checkpatch error reported by Intel CI. Signed-off-by: Arunpravin Paneer Selvam Acked-by: Alex Deucher Acked-by: Christian König Reviewed-by: Frank Min --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 36 ++-- drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c | 15 3 files changed, 55 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index febca3130497..654d0548a3f8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -156,6 +156,8 @@ struct amdgpu_gmc_funcs { uint64_t addr, uint64_t *flags); /* get the amount of memory used by the vbios for pre-OS console */ unsigned int (*get_vbios_fb_size)(struct amdgpu_device *adev); + /* get the DCC buffer alignment */ + u64 (*get_dcc_alignment)(struct amdgpu_device *adev); enum amdgpu_memory_partition (*query_mem_partition_mode)( struct amdgpu_device *adev); @@ -363,6 +365,10 @@ struct amdgpu_gmc { (adev)->gmc.gmc_funcs->override_vm_pte_flags \ ((adev), (vm), (addr), (pte_flags)) #define amdgpu_gmc_get_vbios_fb_size(adev) (adev)->gmc.gmc_funcs->get_vbios_fb_size((adev)) +#define amdgpu_gmc_get_dcc_alignment(_adev) ({ \ + typeof(_adev) (adev) = (_adev); \ + ((adev)->gmc.gmc_funcs->get_dcc_alignment((adev))); \ +}) /** * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible through the BAR diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index f91cc149d06c..aa9dca12371c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -512,6 +512,16 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, vres->flags |= DRM_BUDDY_RANGE_ALLOCATION; remaining_size = (u64)vres->base.size; + if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && + bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) { + u64 adjust_size; + + if (adev->gmc.gmc_funcs->get_dcc_alignment) { + adjust_size = amdgpu_gmc_get_dcc_alignment(adev); + remaining_size = roundup_pow_of_two(remaining_size + adjust_size); + vres->flags |= DRM_BUDDY_TRIM_DISABLE; + } + } mutex_lock(&mgr->lock); while (remaining_size) { @@ -521,8 +531,12 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, min_block_size = mgr->default_page_size; size = remaining_size; - if ((size >= (u64)pag
Re: [PATCH 0/4] fixes for Adreno A5Xx preemption
On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak wrote: > > There are several issues with preemption on Adreno A5XX GPUs which > render system unusable if more than one priority level is used. Those > issues include persistent GPU faults and hangs, full UI lockups with > idling GPU. > > --- > Vladimir Lypak (4): > drm/msm/a5xx: disable preemption in submits by default > drm/msm/a5xx: properly clear preemption records on resume > drm/msm/a5xx: fix races in preemption evaluation stage > drm/msm/a5xx: workaround early ring-buffer emptiness check > > drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 18 ++ > drivers/gpu/drm/msm/adreno/a5xx_gpu.h | 12 ++--- > drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 --- > 3 files changed, 47 insertions(+), 13 deletions(-) > --- > base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88 > -- > 2.45.2 > Hi Vladimir, While looking at preemption on a7xx, where the overall logic is pretty much the same, and I've been seeing the same "soft lockups". However even after porting patch 3, it turns out that's not enough because there's a different race. The sequence of events is something like this: 1. Medium-prio app A submits to ring 2. 2. Ring 0 retires its last job and we start a preemption to ring 2. 3. High-prio app B submits to ring 0. It sees the preemption from step 2 ongoing and doesn't write the WTPR register or try to preempt. 4. The preemption finishes and we update WPTR. 5. App A's submit retires. We try to preempt, but the submit and ring->cur write from step 3 happened on a different CPU and the write hasn't landed yet so we skip it. It's a bit tricky because write reordering is involved, but this seems to be what's happening - everything except my speculation about the delayed write to ring->cur being the problem comes straight from a trace of this happening. Rob suggested on IRC that we make retire handling happen on the same workqueue as submissions, so that preempt_trigger is always serialized, which IIUC would also make patch 3 unnecessary. What do you think? Best regards, Connor
Re: [PATCH] drm/mipi-dsi: Introduce macros to create mipi_dsi_*_multi functions
On 7/16/24 10:31 PM, Dmitry Baryshkov wrote: > On Tue, Jul 16, 2024 at 07:01:17PM GMT, Tejas Vipin wrote: >> Introduce 2 new macros, DSI_CTX_NO_OP and MIPI_DSI_ADD_MULTI_VARIANT. >> >> DSI_CTX_NO_OP calls a function only if the context passed to it hasn't >> encountered any errors. It is a generic form of what mipi_dsi_msleep >> does. >> >> MIPI_DSI_ADD_MULTI_VARIANT defines a multi style function of any >> mipi_dsi function that follows a certain style. This allows us to >> greatly reduce the amount of redundant code written for each multi >> function. It reduces the overhead for a developer introducing new >> mipi_dsi_*_multi functions. >> >> Signed-off-by: Tejas Vipin >> --- >> drivers/gpu/drm/drm_mipi_dsi.c | 286 ++--- >> 1 file changed, 83 insertions(+), 203 deletions(-) >> > > [...] > >> -void mipi_dsi_dcs_set_tear_on_multi(struct mipi_dsi_multi_context *ctx, >> -enum mipi_dsi_dcs_tear_mode mode) >> -{ >> -struct mipi_dsi_device *dsi = ctx->dsi; >> -struct device *dev = &dsi->dev; >> -ssize_t ret; >> - >> -if (ctx->accum_err) >> -return; >> - >> -ret = mipi_dsi_dcs_set_tear_on(dsi, mode); >> -if (ret < 0) { >> -ctx->accum_err = ret; >> -dev_err(dev, "sending DCS SET_TEAR_ON failed: %d\n", >> -ctx->accum_err); >> -} >> -} >> -EXPORT_SYMBOL(mipi_dsi_dcs_set_tear_on_multi); >> +#define MIPI_DSI_ADD_MULTI_VARIANT(proto, err, inner_func, ...) \ >> +proto { \ >> +struct mipi_dsi_device *dsi = ctx->dsi; \ >> +struct device *dev = &dsi->dev; \ >> +int ret;\ >> +\ >> +if (ctx->accum_err) \ >> +return; \ >> +\ >> +ret = inner_func(dsi, ##__VA_ARGS__); \ >> +if (ret < 0) { \ >> +ctx->accum_err = ret; \ >> +dev_err(dev, err, ctx->accum_err); \ >> +} \ >> +} \ >> +EXPORT_SYMBOL(inner_func##_multi); >> + >> +MIPI_DSI_ADD_MULTI_VARIANT( >> +void mipi_dsi_picture_parameter_set_multi( >> +struct mipi_dsi_multi_context *ctx, >> +const struct drm_dsc_picture_parameter_set *pps), >> +"sending PPS failed: %d\n", >> +mipi_dsi_picture_parameter_set, pps); > > I'd say that having everything wrapped in the macro looks completely > unreadable. > > If you really insist, it can become something like: > > MIPI_DSI_ADD_MULTI_VARIANT(mipi_dsi_picture_parameter_set > MULTI_PROTO(const struct drm_dsc_picture_parameter_set *pps), > MULTI_ARGS(pps), > "sending PPS failed"); > > (note, I dropped the obvious parts: that the funciton is foo_multi, its > return type is void, first parameter is context, etc). > > However it might be better to go other way arround. > Do we want for all the drivers to migrate to _multi()-kind of API? If > so, what about renaming the multi and non-multi functions accordingly > and making the old API a wrapper around the multi() function? > I think this would be good. For the wrapper to make a multi() function non-multi, what do you think about a macro that would just pass a default dsi_ctx to the multi() func and return on error? In this case it would also be good to let the code fill inline instead of generating a whole new function imo. So in this scenario all the mipi dsi functions would be multi functions, and a function could be called non-multi like MACRO_NAME(func) at the call site. I also think there is merit in keeping DSI_CTX_NO_OP. What do you think? -- Tejas Vipin
Re: [PATCH 1/3] drm/panic: Add drm_panic_is_enabled()
Jocelyn Falempe writes: Hello Jocelyn, > It allows to check if the drm device supports drm_panic. > Prepare the work to have better integration with fbcon and vtconsole. > > Signed-off-by: Jocelyn Falempe > --- > drivers/gpu/drm/drm_panic.c | 20 > include/drm/drm_panic.h | 2 ++ > 2 files changed, 22 insertions(+) > > diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c > index 948aed00595e..d9a25c2d0a65 100644 > --- a/drivers/gpu/drm/drm_panic.c > +++ b/drivers/gpu/drm/drm_panic.c > @@ -703,6 +703,26 @@ static void debugfs_register_plane(struct drm_plane > *plane, int index) > static void debugfs_register_plane(struct drm_plane *plane, int index) {} > #endif /* CONFIG_DRM_PANIC_DEBUG */ > > +/** > + * drm_panic_is_enabled > + * @dev: the drm device that may supports drm_panic > + * > + * returns true if the drm device supports drm_panic > + */ > +bool drm_panic_is_enabled(struct drm_device *dev) > +{ > + struct drm_plane *plane; > + > + if (!dev->mode_config.num_total_plane) > + return false; > + > + drm_for_each_plane(plane, dev) > + if (plane->helper_private && > plane->helper_private->get_scanout_buffer) > + return true; > + return false; > +} > +EXPORT_SYMBOL(drm_panic_is_enabled); > + Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH] drm/mipi-dsi: Introduce macros to create mipi_dsi_*_multi functions
On Wed, 17 Jul 2024 at 12:58, Tejas Vipin wrote: > > > > On 7/16/24 10:31 PM, Dmitry Baryshkov wrote: > > On Tue, Jul 16, 2024 at 07:01:17PM GMT, Tejas Vipin wrote: > >> Introduce 2 new macros, DSI_CTX_NO_OP and MIPI_DSI_ADD_MULTI_VARIANT. > >> > >> DSI_CTX_NO_OP calls a function only if the context passed to it hasn't > >> encountered any errors. It is a generic form of what mipi_dsi_msleep > >> does. > >> > >> MIPI_DSI_ADD_MULTI_VARIANT defines a multi style function of any > >> mipi_dsi function that follows a certain style. This allows us to > >> greatly reduce the amount of redundant code written for each multi > >> function. It reduces the overhead for a developer introducing new > >> mipi_dsi_*_multi functions. > >> > >> Signed-off-by: Tejas Vipin > >> --- > >> drivers/gpu/drm/drm_mipi_dsi.c | 286 ++--- > >> 1 file changed, 83 insertions(+), 203 deletions(-) > >> > > > > [...] > > > >> -void mipi_dsi_dcs_set_tear_on_multi(struct mipi_dsi_multi_context *ctx, > >> -enum mipi_dsi_dcs_tear_mode mode) > >> -{ > >> -struct mipi_dsi_device *dsi = ctx->dsi; > >> -struct device *dev = &dsi->dev; > >> -ssize_t ret; > >> - > >> -if (ctx->accum_err) > >> -return; > >> - > >> -ret = mipi_dsi_dcs_set_tear_on(dsi, mode); > >> -if (ret < 0) { > >> -ctx->accum_err = ret; > >> -dev_err(dev, "sending DCS SET_TEAR_ON failed: %d\n", > >> -ctx->accum_err); > >> -} > >> -} > >> -EXPORT_SYMBOL(mipi_dsi_dcs_set_tear_on_multi); > >> +#define MIPI_DSI_ADD_MULTI_VARIANT(proto, err, inner_func, ...) \ > >> +proto { \ > >> +struct mipi_dsi_device *dsi = ctx->dsi; \ > >> +struct device *dev = &dsi->dev; \ > >> +int ret;\ > >> +\ > >> +if (ctx->accum_err) \ > >> +return; \ > >> +\ > >> +ret = inner_func(dsi, ##__VA_ARGS__); \ > >> +if (ret < 0) { \ > >> +ctx->accum_err = ret; \ > >> +dev_err(dev, err, ctx->accum_err); \ > >> +} \ > >> +} \ > >> +EXPORT_SYMBOL(inner_func##_multi); > >> + > >> +MIPI_DSI_ADD_MULTI_VARIANT( > >> +void mipi_dsi_picture_parameter_set_multi( > >> +struct mipi_dsi_multi_context *ctx, > >> +const struct drm_dsc_picture_parameter_set *pps), > >> +"sending PPS failed: %d\n", > >> +mipi_dsi_picture_parameter_set, pps); > > > > I'd say that having everything wrapped in the macro looks completely > > unreadable. > > > > If you really insist, it can become something like: > > > > MIPI_DSI_ADD_MULTI_VARIANT(mipi_dsi_picture_parameter_set > > MULTI_PROTO(const struct drm_dsc_picture_parameter_set *pps), > > MULTI_ARGS(pps), > > "sending PPS failed"); > > > > (note, I dropped the obvious parts: that the funciton is foo_multi, its > > return type is void, first parameter is context, etc). > > > > However it might be better to go other way arround. > > Do we want for all the drivers to migrate to _multi()-kind of API? If > > so, what about renaming the multi and non-multi functions accordingly > > and making the old API a wrapper around the multi() function? > > > > I think this would be good. For the wrapper to make a multi() function > non-multi, what do you think about a macro that would just pass a > default dsi_ctx to the multi() func and return on error? In this case > it would also be good to let the code fill inline instead of generating > a whole new function imo. > > So in this scenario all the mipi dsi functions would be multi functions, > and a function could be called non-multi like MACRO_NAME(func) at the > call site. Sounds good to me. I'd suggest to wait for a day or two for further feedback / comments from other developers. > > I also think there is merit in keeping DSI_CTX_NO_OP. > > What do you think? > > -- > Tejas Vipin -- With best wishes Dmitry
Re: [PATCH 2/3] fbcon: Add an option to disable fbcon in panic.
Jocelyn Falempe writes: > This is required to avoid conflict between DRM_PANIC, and fbcon. If > a drm device already handle panic with drm_panic, it should set > the skip_panic field in fb_info, so that fbcon will stay quiet, and > not overwrite the panic_screen. > > Signed-off-by: Jocelyn Falempe > --- This makes sense to me as well. Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH 3/3] drm/panic: Remove build time dependency with FRAMEBUFFER_CONSOLE
Jocelyn Falempe writes: > Now that fbcon has the skip_panic option, there is no more conflicts > between drm_panic and fbcon. > Remove the build time dependency, so they can be both enabled. > > Signed-off-by: Jocelyn Falempe > --- > drivers/gpu/drm/Kconfig | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig > index 6dd0016fc9cd..a22cab218004 100644 > --- a/drivers/gpu/drm/Kconfig > +++ b/drivers/gpu/drm/Kconfig > @@ -107,7 +107,7 @@ config DRM_KMS_HELPER > > config DRM_PANIC > bool "Display a user-friendly message when a kernel panic occurs" > - depends on DRM && !(FRAMEBUFFER_CONSOLE && VT_CONSOLE) > + depends on DRM This is great. Thanks for finding an alternative approach! I don't see any issues this time, because there is no locking involved. But let's see what others think about it. Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH v9 00/53] fix CONFIG_DRM_USE_DYNAMIC_DEBUG=y
On Mon, Jul 15, 2024 at 8:00 PM wrote: > > On Mon, Jul 15, 2024 at 4:05 AM Łukasz Bartosik wrote: > > > > On Sat, Jul 13, 2024 at 11:45 PM wrote: > > > > > > On Fri, Jul 12, 2024 at 9:44 AM Łukasz Bartosik > > > wrote: > > > > > > > > On Wed, Jul 3, 2024 at 12:14 AM wrote: > > > > > > > > > > On Tue, Jul 2, 2024 at 4:01 PM Luis Chamberlain > > > > > wrote: > > > > > > > > > > > > On Tue, Jul 02, 2024 at 03:56:50PM -0600, Jim Cromie wrote: > > > > > > > This fixes dynamic-debug support for DRM.debug, added via > > > > > > > classmaps. > > > > > > > commit bb2ff6c27bc9 (drm: Disable dynamic debug as broken) > > > > > > > > > > > > > > CONFIG_DRM_USE_DYNAMIC_DEBUG=y was marked broken because > > > > > > > drm.debug=val > > > > > > > was applied when drm.ko was modprobed; too early for the > > > > > > > yet-to-load > > > > > > > drivers, which thus missed the enablement. My testing with > > > > > > > /etc/modprobe.d/ entries and modprobes with dyndbg=$querycmd > > > > > > > options > > > > > > > obscured this omission. > > > > > > > > > > > > > > The fix is to replace invocations of DECLARE_DYNDBG_CLASSMAP with > > > > > > > DYNDBG_CLASSMAP_DEFINE for core, and DYNDBG_CLASSMAP_USE for > > > > > > > drivers. > > > > > > > The distinction allows dyndbg to also handle the users properly. > > > > > > > > > > > > > > DRM is the only current classmaps user, and is not really using > > > > > > > it, > > > > > > > so if you think DRM could benefit from zero-off-cost debugs based > > > > > > > on > > > > > > > static-keys, please test. > > > > > > > > > > > > > > HISTORY > > > > > > > > > > > > > > 9/4/22 - ee879be38bc8..ace7c4bbb240 commited - classmaps-v1 > > > > > > > dyndbg parts > > > > > > > 9/11/22 - 0406faf25fb1..16deeb8e18ca commited - classmaps-v1 drm > > > > > > > parts > > > > > > > > > > > > > > https://lore.kernel.org/lkml/y3xurogav4i7b...@kroah.com/ > > > > > > > greg k-h says: > > > > > > > This should go through the drm tree now. The rest probably > > > > > > > should also > > > > > > > go that way and not through my tree as well. > > > > > > > > > > > > Can't this just be defined as a coccinelle smpl patch? Must easier > > > > > > to read than 53 patches? > > > > > > > > > > > > > > > > perhaps it could - Im not sure that would be easier to review > > > > > than a file-scoped struct declaration or reference per driver > > > > > > > > > > Also, I did it hoping to solicit more Tested-by:s with drm.debug=0x1ff > > > > > > > > > > Jim > > > > > > > > > > > > > Jim, > > > > > > > > When testing different combinations of Y/M for TEST_DYNAMIC_DEBUG and > > > > TEST_DYNAMIC_DEBUG_SUBMOD in virtme-ng I spotted test failures: > > > > > > > > When the TEST_DYNAMIC_DEBUG=M and TEST_DYNAMIC_DEBUG_SUBMOD=M - > > > > BASIC_TESTS, COMMA_TERMINATOR_TESTS, TEST_PERCENT_SPLITTING, > > > > TEST_MOD_SUBMOD selftests passed > > > > When the TEST_DYNAMIC_DEBUG=Y and TEST_DYNAMIC_DEBUG_SUBMOD=M - > > > > BASIC_TESTS, COMMA_TERMINATOR_TESTS selftests passed, however > > > > TEST_PERCENT_SPLITTING selftest fails with ": ./dyndbg_selftest.sh:270 > > > > check failed expected 1 on =pf, got 0" > > > > When the TEST_DYNAMIC_DEBUG=Y and TEST_DYNAMIC_DEBUG_SUBMOD=Y - > > > > BASIC_TESTS, COMMA_TERMINATOR_TESTS selftests passed, however > > > > TEST_PERCENT_SPLITTING selftest fails also with ": > > > > ./dyndbg_selftest.sh:270 check failed expected 1 on =pf, got 0" > > > > > > > > Have I missed something ? > > > > > > > > > > I am not seeing those 2 failures on those 2 configs. > > > > > > most of my recent testing has been on x86-defconfig + minimals, > > > built and run using/inside virtme-ng > > > > > > the last kernel I installed on this hw was june 16, I will repeat that, > > > and report soon if I see the failure outside the vm > > > > > > I'll also send you my script, to maybe speed isolation of the differences. > > > > > > > Jim, > > > > I know why I saw these failures. > > I ran dyndbg_selftest.sh directly in thw directory > > tools/testing/selftests/dynamic_debug/. > > thats odd. > I mostly run it from src-root, > also whereever make selftest target is/works (I forgot) > > I went into that subdir and ran it there > I got no test differences / failures. > Jim, The dyndbg_selftest.sh checks the location of kernel .config if it is configured and if not it sets it to the current dir. [ -f "$KCONFIG_CONFIG" ] || KCONFIG_CONFIG=".config" if [ -f "$KCONFIG_CONFIG" ]; then If it does not find the .config it will set the variables to: LACK_DD_BUILTIN=0 LACK_TMOD=0 LACK_TMOD_SUBMOD=0 and run all selftests no matter what the values (Y/M) of TEST_DYNAMIC_DEBUG and TEST_DYNAMIC_DEBUG_SUBMOD are. > IIRC, the failure was on line 270, just after a modprobe. > can you further isolate it ? > > > All works as expected when I run it from the top kernel directory. > > Here are the results: > > > > When the TEST_DYNAMIC_DEBUG=M and TEST_DYNAMIC_DEBUG_SUBMOD=M - > > BASIC_TESTS, COMMA_TERMINATOR_TESTS, TEST
Re: [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver
On 7/17/24 10:36, Leon Romanovsky wrote: > On Wed, Jul 17, 2024 at 07:08:59AM +, Omer Shpigelman wrote: >> On 7/16/24 16:40, Jason Gunthorpe wrote: >>> On Sun, Jul 14, 2024 at 10:18:12AM +, Omer Shpigelman wrote: On 7/12/24 16:08, Jason Gunthorpe wrote: > [You don't often get email from j...@ziepe.ca. Learn why this is > important at https://aka.ms/LearnAboutSenderIdentification ] > > On Fri, Jun 28, 2024 at 10:24:32AM +, Omer Shpigelman wrote: > >> We need the core driver to access the IB driver (and to the ETH driver as >> well). As you wrote, we can't use exported symbols from our IB driver nor >> rely on function pointers, but what about providing the core driver an >> ops >> structure? meaning exporting a register function from the core driver >> that >> should be called by the IB driver during auxiliary device probe. >> Something like: >> >> int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev, >> struct hbl_ib_ops *ops) >> { >> ... >> } >> EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev); > > Definately do not do some kind of double-register like this. > > The auxiliary_device scheme can already be extended to provide ops for > each sub device. > > Like > > struct habana_driver { >struct auxiliary_driver base; >const struct habana_ops *ops; > }; > > If the ops are justified or not is a different question. > Well, I suggested this double-register option because I got a comment that the design pattern of embedded ops structure shouldn't be used. So I'm confused now... >>> >>> Yeah, don't stick ops in random places, but the device_driver is the >>> right place. >>> >> >> Sorry, let me explain again. My original code has an ops structure >> exactly like you are suggesting now (see struct hbl_aux_dev in the first >> patch of the series). But I was instructed not to use this ops structure >> and to rely on exported symbols for inter-driver communication. >> I'll be happy to use this ops structure like in your example rather than >> converting my code to use exported symbols. >> Leon - am I missing anything? what's the verdict here? > > You are missing the main sentence from Jason's response: "don't stick ops in > random places". > > It is fine to have ops in device driver, so the core driver can call them. > However, in your > original code, you added ops everywhere. It caused to the need to implement > module reference > counting and crazy stuff like calls to lock and unlock functions from the aux > driver to the core. > > Verdict is still the same. Core driver should provide EXPORT_SYMBOLs, so the > aux driver can call > them directly and enjoy from proper module loading and unloading. > > The aux driver can have ops in the device driver, so the core driver can call > them to perform something > specific for that aux driver. > > Calls between aux drivers should be done via the core driver. > > Thanks The only place we have an ops structure is in the device driver, similarly to Jason's example. In our code it is struct hbl_aux_dev. What other random places did you see? We have several auxiliary devices so we have several instances of this structure but the definition is in a single place. The module reference counting is unrelated to the ops structure - we used it to block the son driver removal while the parent driver can access it. Even with exported symbols we would use it. Anyway, in v2 we'd like to allow the son driver removal before the parent so this module reference counting will be removed. The lock/unlock functions are also unrelated to the ops structure, we would add these even with exported symbols. The reason is that our NIC drivers are the sons/grandsons of a compute device which can enter a reset flow as part of a TDR mechanism. During this flow we must not access the HW so we need to block a parallel son device probing. In addition, we don't have any direct communication between the aux drivers, everything is done via the parent driver. Given all of the above, what is the problem with our current code? we did exactly what Jason wrote in his example - having an ops structure in the device driver to allow inter-driver communication. The only issue I see here is the question if this ops structure is for unidirectional communication (meaning parent to son only) or for bidirectional communication between the drivers (meaning also son to parent). That's the only point that was not mentioned by Jason while you are clear about the answer. AFAIU EXPORT_SYMBOLs should be used to expose driver level operations, not operations which are device specific (and that's our case). Hence we used this ops structure also for son-to-parent communication, although we can switch them with exported symbols if we have to.
Re: [PATCH] drm/ci: Upgrade setuptools requirement to 70.0.0
On 16/07/2024 05:37, WangYuli wrote: GitHub Dependabot has issued the following alert: "Upgrade setuptools to version 70.0.0 or later. A vulnerability in the package_index module of pypa/setuptools versions up to 69.1.1 allows for remote code execution via its download functions. These functions, which are used to download packages from URLs provided by users or retrieved from package index servers, are susceptible to code injection. If these functions are exposed to user-controlled inputs, such as package URLs, they can execute arbitrary commands on the system. The issue is fixed in version 70.0. Severity: 8.8 / 10 (High) Attack vector:Network Attack complexity:Low Privileges required: None User interaction:Required Scope: Unchanged Confidentiality: High Integrity: High Availability:High CVE ID: CVE-2024-6345" To avoid disturbing everyone with the kernel repo hosted on GitHub, I suggest we upgrade our python dependencies once again to appease GitHub Dependabot. Link: https://github.com/dependabot Signed-off-by: WangYuli Acked-by: Helen Koike Thanks Helen --- drivers/gpu/drm/ci/xfails/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ci/xfails/requirements.txt b/drivers/gpu/drm/ci/xfails/requirements.txt index e9994c9db799..5e6d48d98e4e 100644 --- a/drivers/gpu/drm/ci/xfails/requirements.txt +++ b/drivers/gpu/drm/ci/xfails/requirements.txt @@ -11,7 +11,7 @@ requests==2.31.0 requests-toolbelt==1.0.0 ruamel.yaml==0.17.32 ruamel.yaml.clib==0.2.7 -setuptools==68.0.0 +setuptools==70.0.0 tenacity==8.2.3 urllib3==2.0.7 wheel==0.41.1
Re: [PATCH v2] printk: Add a short description string to kmsg_dump()
On 02/07/2024 14:26, Jocelyn Falempe wrote: kmsg_dump doesn't forward the panic reason string to the kmsg_dumper callback. This patch adds a new struct kmsg_dump_detail, that will hold the reason and description, and pass it to the dump() callback. To avoid updating all kmsg_dump() call, it adds a kmsg_dump_desc() function and a macro for backward compatibility. I've written this for drm_panic, but it can be useful for other kmsg_dumper. It allows to see the panic reason, like "sysrq triggered crash" or "VFS: Unable to mount root fs on " on the drm panic screen. v2: * Use a struct kmsg_dump_detail to hold the reason and description pointer, for more flexibility if we want to add other parameters. (Kees Cook) * Fix powerpc/nvram_64 build, as I didn't update the forward declaration of oops_to_nvram() Signed-off-by: Jocelyn Falempe --- arch/powerpc/kernel/nvram_64.c | 8 arch/powerpc/platforms/powernv/opal-kmsg.c | 4 ++-- arch/um/kernel/kmsg_dump.c | 2 +- drivers/gpu/drm/drm_panic.c| 4 ++-- drivers/hv/hv_common.c | 4 ++-- drivers/mtd/mtdoops.c | 2 +- fs/pstore/platform.c | 10 +- include/linux/kmsg_dump.h | 22 +++--- kernel/panic.c | 2 +- kernel/printk/printk.c | 11 --- 10 files changed, 45 insertions(+), 24 deletions(-) [...] I pushed it to drm-misc-next, with the whitespace change in kmsg_dump.h Thanks all for your reviews and comments. Best regards, -- Jocelyn
Re: [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver
On Wed, Jul 17, 2024 at 10:51:03AM +, Omer Shpigelman wrote: > The only place we have an ops structure is in the device driver, > similarly to Jason's example. In our code it is struct > hbl_aux_dev. What No, hbl_aux_dev is an 'struct auxiliary_device', not a 'struct device_driver', it is different. I did literally mean struct device_driver. Jason
Re: [PATCH RFC 0/3] Implement Qualcomm TEE IPC and ioctl calls
Adding TEE mailing list and maintainers to the CC list. Amirreza, please include them in future even if you are not going to use the framework. On Wed, Jul 10, 2024 at 09:16:48AM GMT, Amirreza Zarrabi wrote: > > > On 7/3/2024 9:36 PM, Dmitry Baryshkov wrote: > > On Tue, Jul 02, 2024 at 10:57:35PM GMT, Amirreza Zarrabi wrote: > >> Qualcomm TEE hosts Trusted Applications (TAs) and services that run in > >> the secure world. Access to these resources is provided using MinkIPC. > >> MinkIPC is a capability-based synchronous message passing facility. It > >> allows code executing in one domain to invoke objects running in other > >> domains. When a process holds a reference to an object that lives in > >> another domain, that object reference is a capability. Capabilities > >> allow us to separate implementation of policies from implementation of > >> the transport. > >> > >> As part of the upstreaming of the object invoke driver (called SMC-Invoke > >> driver), we need to provide a reasonable kernel API and UAPI. The clear > >> option is to use TEE subsystem and write a back-end driver, however the > >> TEE subsystem doesn't fit with the design of Qualcomm TEE. > >> > > To answer your "general comment", maybe a bit of background :). > > Traditionally, policy enforcement is based on access-control models, > either (1) access-control list or (2) capability [0]. A capability is an > opaque ("non-forge-able") object reference that grants the holder the > right to perform certain operations on the object (e.g. Read, Write, > Execute, or Grant). Capabilities are preferred mechanism for representing > a policy, due to their fine-grained representation of access right, inline > with > (P1) the principle of least privilege [1], and > (P2) the ability to avoid the confused deputy problem [2]. > > [0] Jack B. Dennis and Earl C. Van Horn. 1966. Programming Semantics for > Multiprogrammed Computations. Commun. ACM 9 (1966), 143–155. > > [1] Jerome H. Saltzer and Michael D. Schroeder. 1975. The Protection of > Information in Computer Systems. Proc. IEEE 63 (1975), 1278–1308. > > [2] Norm Hardy. 1988. The Confused Deputy (or Why Capabilities Might Have > Been Invented). ACM Operating Systems Review 22, 4 (1988), 36–38. > > For MinkIPC, an object represents a TEE or TA service. The reference to > the object is the "handle" that is returned from TEE (let's call it > TEE-Handle). The supported operations are "service invocation" (similar > to Execute), and "sharing access to a service" (similar to Grant). > Anyone with access to the TEE-Handle can invoke the service or pass the > TEE-Handle to someone else to access the same service. > > The responsibility of the MinkIPC framework is to hide the TEE-Handle, > so that the client can not forge it, and allow the owner of the handle > to transfer it to other clients as it wishes. Using a file descriptor > table we can achieve that. We wrap the TEE-Handle as a FD and let the > client invoke FD (e.g. using IOCTL), or transfer the FD (e.g. using > UNIX socket). > > As a side note, for the sake of completeness, capabilities are fundamentally > a "discretionary mechanism", as the holder of the object reference has the > ability to share it with others. A secure system requires "mandatory > enforcement" (i.e. ability to revoke authority and ability to control > the authority propagation). This is out of scope for the MinkIPC. > MinkIPC is only interested in P1 and P2 (mention above). > > > >> Does TEE subsystem fit requirements of a capability based system? > >> - > >> In TEE subsystem, to invoke a function: > >>- client should open a device file "/dev/teeX", > >>- create a session with a TA, and > >>- invoke the functions in that session. > >> > >> 1. The privilege to invoke a function is determined by a session. If a > >>client has a session, it cannot share it with other clients. Even if > >> it does, it is not fine-grained enough, i.e. either all accessible > >> functions/resources in a session or none. Assume a scenario when a client > >> wants to grant a permission to invoke just a function that it has the > >> rights, > >> to another client. > >> > >> The "all or nothing" for sharing sessions is not in line with our > >> capability system: "if you own a capability, you should be able to grant > >> or share it". > > > > Can you please be more specific here? What kind of sharing is expected > > on the user side of it? > > In MinkIPC, after authenticating a client credential, a TA (or TEE) may > return multiple TEE-Handles, each representing a service that the client > has privilege to access. The client should be able to "individually" > reference each TEE-Handle, e.g. to invoke and share it (as per capability- > based system requirements). > > If we use TEE subsystem, which has a session based design, all TEE-Handles > are meaningful with respect to the session in which they are a
Re: [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver
On Wed, Jul 17, 2024 at 10:51:03AM +, Omer Shpigelman wrote: > On 7/17/24 10:36, Leon Romanovsky wrote: > > On Wed, Jul 17, 2024 at 07:08:59AM +, Omer Shpigelman wrote: > >> On 7/16/24 16:40, Jason Gunthorpe wrote: > >>> On Sun, Jul 14, 2024 at 10:18:12AM +, Omer Shpigelman wrote: > On 7/12/24 16:08, Jason Gunthorpe wrote: > > [You don't often get email from j...@ziepe.ca. Learn why this is > > important at https://aka.ms/LearnAboutSenderIdentification ] > > > > On Fri, Jun 28, 2024 at 10:24:32AM +, Omer Shpigelman wrote: > > > >> We need the core driver to access the IB driver (and to the ETH driver > >> as > >> well). As you wrote, we can't use exported symbols from our IB driver > >> nor > >> rely on function pointers, but what about providing the core driver an > >> ops > >> structure? meaning exporting a register function from the core driver > >> that > >> should be called by the IB driver during auxiliary device probe. > >> Something like: > >> > >> int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev, > >> struct hbl_ib_ops *ops) > >> { > >> ... > >> } > >> EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev); > > > > Definately do not do some kind of double-register like this. > > > > The auxiliary_device scheme can already be extended to provide ops for > > each sub device. > > > > Like > > > > struct habana_driver { > >struct auxiliary_driver base; > >const struct habana_ops *ops; > > }; > > > > If the ops are justified or not is a different question. > > > > Well, I suggested this double-register option because I got a comment > that > the design pattern of embedded ops structure shouldn't be used. > So I'm confused now... > >>> > >>> Yeah, don't stick ops in random places, but the device_driver is the > >>> right place. > >>> > >> > >> Sorry, let me explain again. My original code has an ops structure > >> exactly like you are suggesting now (see struct hbl_aux_dev in the first > >> patch of the series). But I was instructed not to use this ops structure > >> and to rely on exported symbols for inter-driver communication. > >> I'll be happy to use this ops structure like in your example rather than > >> converting my code to use exported symbols. > >> Leon - am I missing anything? what's the verdict here? > > > > You are missing the main sentence from Jason's response: "don't stick ops > > in random places". > > > > It is fine to have ops in device driver, so the core driver can call them. > > However, in your > > original code, you added ops everywhere. It caused to the need to implement > > module reference > > counting and crazy stuff like calls to lock and unlock functions from the > > aux driver to the core. > > > > Verdict is still the same. Core driver should provide EXPORT_SYMBOLs, so > > the aux driver can call > > them directly and enjoy from proper module loading and unloading. > > > > The aux driver can have ops in the device driver, so the core driver can > > call them to perform something > > specific for that aux driver. > > > > Calls between aux drivers should be done via the core driver. > > > > Thanks > > The only place we have an ops structure is in the device driver, > similarly to Jason's example. In our code it is struct hbl_aux_dev. What > other random places did you see? This is exactly random place. I suggest you to take time, learn how existing drivers in netdev and RDMA uses auxbus infrastructure and follow the same pattern. There are many examples already in the kernel. And no, if you do everything right, you won't need custom module reference counting and other hacks. There is nothing special in your device/driver which requires special treatment. Thanks
Re: [PATCH 0/7] drm/probe-helpers: Work around multi-outputs-per-CRTC problem
As discussed on irc, we rather improve the in-kernel DRM client's support for cloned outputs (for fbcon) and then remove the hacks from the ast and mgag200 drivers. Userspace compositors need to support cloned outputs to a minimum. https://dri.freedesktop.org/~cbrill/dri-log/index.php?channel=dri-devel&date=2024-07-17 Best regards Thomas Am 15.07.24 um 11:38 schrieb Thomas Zimmermann: Old or simple hardware only supports a single CRTC with multiple conenctoed outputs. This breaks most userspace compositors, which only support a single output per CRTC. This currently happens with ast and mgag200 drivers. The drivers contain a work around that dynamically disables all but one connected output. Patches 1 and 2 push the workaround into probe helpers and make it configurable in the kernel config. For each connector, the driver needs to specify a bitmask of connectors with higher priority. If one of them is connected, the connector at hand is always reported as disconnected. Connectors without priority bitmask as not affected. Patches 3 to 5 update and simplify the ast drivers. The new workaround now allows to have multiple physical conenctors in ast. So patch 5 finally allows VGA and DisplayPort on the same device. Patches 6 and 7 update mgag200. Any future driver that exposes the same problem as ast and mgag200 can simply hook into the workaround. Hopefully userspace can be fixed at some point. Thomas Zimmermann (7): drm/probe-helper: Call connector detect functions in single helper drm/probe-helper: Optionally report single connected output per CRTC drm/ast: Set connector priorities drm/ast: Remove struct ast_bmc_connector drm/ast: Support ASTDP and VGA at the same time drm/mgag200: Set connector priorities drm/mgag200: Remove struct mgag200_bmc_connector drivers/gpu/drm/Kconfig | 15 +++ drivers/gpu/drm/ast/ast_drv.h | 17 +-- drivers/gpu/drm/ast/ast_main.c| 2 +- drivers/gpu/drm/ast/ast_mode.c| 49 ++-- drivers/gpu/drm/drm_probe_helper.c| 137 +++--- drivers/gpu/drm/mgag200/mgag200_bmc.c | 44 +-- drivers/gpu/drm/mgag200/mgag200_drv.h | 9 +- drivers/gpu/drm/mgag200/mgag200_g200eh.c | 4 +- drivers/gpu/drm/mgag200/mgag200_g200eh3.c | 4 +- drivers/gpu/drm/mgag200/mgag200_g200er.c | 4 +- drivers/gpu/drm/mgag200/mgag200_g200ev.c | 4 +- drivers/gpu/drm/mgag200/mgag200_g200ew3.c | 4 +- drivers/gpu/drm/mgag200/mgag200_g200se.c | 4 +- drivers/gpu/drm/mgag200/mgag200_g200wb.c | 4 +- include/drm/drm_connector.h | 2 + include/drm/drm_probe_helper.h| 2 + 16 files changed, 177 insertions(+), 128 deletions(-) -- -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
Re: [PATCH v4 2/5] drm/drm_property: require DRM_MODE_PROP_IMMUTABLE for single-value props
Hi, On Mon, Jul 15, 2024 at 09:33:02AM GMT, Dmitry Baryshkov wrote: > Document that DRM_MODE_PROP_IMMUTABLE must be set for the properties > that are immutable by definition - e.g. ranges with min == max or enums > with a single value. This matches the behaviour of the IGT tests, see > kms_properties.c / validate_range_prop(), validate_enum_prop(), > validate_bitmask_prop(). > > Signed-off-by: Dmitry Baryshkov We had a discussion yesterday about it on IRC with Sima, Simon and Xaver. https://oftc.irclog.whitequark.org/dri-devel/2024-07-16#33374622; The conclusion was that it would create an inconsistency between drivers on whether a given property is immutable or not, which will lead to more troubles for userspace. It's not clear why Ville added that check in the first place, so the best course of action is to remove the IGT test and get the discussion started there. Maxime signature.asc Description: PGP signature
Re: [PATCH v4 3/5] drm/mediatek: Support "Pre-multiplied" blending in OVL
Il 17/07/24 07:24, Hsiao Chien Sung via B4 Relay ha scritto: From: Hsiao Chien Sung Support "Pre-multiplied" alpha blending mode on in OVL. Before this patch, only the "coverage" mode is supported. As whether OVL_CON_CLRFMT_MAN bit is enabled, (3 << 12) means different formats in the datasheet. To prevent misunderstandings going forward, instead of reusing OVL_CON_CLRFMT_RGBA, we intetionally defined OVL_CON_CLRFMT_PARGB with bit operation again. Reviewed-by: CK Hu Signed-off-by: Hsiao Chien Sung Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH v4 1/5] drm/mediatek: Support "None" blending in OVL
Il 17/07/24 07:24, Hsiao Chien Sung via B4 Relay ha scritto: From: Hsiao Chien Sung Support "None" alpha blending mode on MediaTek's chips. Reviewed-by: CK Hu Signed-off-by: Hsiao Chien Sung Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH] drm/bridge-connector: Fix double free in error handling paths
On Thu, Jul 11, 2024 at 02:26:55PM GMT, Cristian Ciocaltea wrote: > The recent switch to drmm allocation in drm_bridge_connector_init() may > cause double free on bridge_connector in some of the error handling > paths. > > Drop the explicit kfree() calls on bridge_connector. > > Fixes: c12907be57b1 ("drm/bridge-connector: switch to using drmm allocations") > Signed-off-by: Cristian Ciocaltea > --- > drivers/gpu/drm/drm_bridge_connector.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > Reviewed-by: Dmitry Baryshkov -- With best wishes Dmitry
Re: [PATCH v4 2/5] drm/drm_property: require DRM_MODE_PROP_IMMUTABLE for single-value props
On Wed, Jul 17, 2024 at 03:42:46PM GMT, Maxime Ripard wrote: > Hi, > > On Mon, Jul 15, 2024 at 09:33:02AM GMT, Dmitry Baryshkov wrote: > > Document that DRM_MODE_PROP_IMMUTABLE must be set for the properties > > that are immutable by definition - e.g. ranges with min == max or enums > > with a single value. This matches the behaviour of the IGT tests, see > > kms_properties.c / validate_range_prop(), validate_enum_prop(), > > validate_bitmask_prop(). > > > > Signed-off-by: Dmitry Baryshkov > > We had a discussion yesterday about it on IRC with Sima, Simon and > Xaver. > > https://oftc.irclog.whitequark.org/dri-devel/2024-07-16#33374622; > > The conclusion was that it would create an inconsistency between drivers > on whether a given property is immutable or not, which will lead to more > troubles for userspace. > > It's not clear why Ville added that check in the first place, so the > best course of action is to remove the IGT test and get the discussion > started there. Ack, I'll work on removing those tests later today. -- With best wishes Dmitry
[PATCH v4 0/4] drm/panic: Add a QR code panic screen
This series adds a new panic screen, with the kmsg data embedded in a QR code. The main advantage of QR code, is that you can copy/paste the debug data to a bug report. The QR code encoder is written in rust, and is very specific to drm panic. The reason is that it is called in a panic handler, and thus can't allocate memory, or use locking. The rust code uses a few rust core API, and provides only two C entry points. There is no particular reason to do it in rust, I just wanted to learn rust, and see if it can work in the kernel. If you want to see what it looks like, I've put a few screenshots here: https://github.com/kdj0c/panic_report/issues/1 v2: * Rewrite the rust comments with Markdown (Alice Ryhl) * Mark drm_panic_qr_generate() as unsafe (Alice Ryhl) * Use CStr directly, and remove the call to as_str_unchecked() (Alice Ryhl) * Add a check for data_len <= data_size (Greg KH) v3: * Fix all rust comments (typo, punctuation) (Miguel Ojeda) * Change the wording of safety comments (Alice Ryhl) * Add a link to the javascript decoder in the Kconfig (Greg KH) * Fix data_size and tmp_size check in drm_panic_qr_generate() v4: * Fix the logic to find next line and skip the '\n' (Alice Ryhl) * Remove __LOG_PREFIX as it's not used (Alice Ryhl) Jocelyn Falempe (4): drm/panic: Add integer scaling to blit() drm/rect: Add drm_rect_overlap() drm/panic: Simplify logo handling drm/panic: Add a QR code panic screen drivers/gpu/drm/Kconfig | 31 + drivers/gpu/drm/Makefile|1 + drivers/gpu/drm/drm_drv.c |3 + drivers/gpu/drm/drm_panic.c | 340 +-- drivers/gpu/drm/drm_panic_qr.rs | 1003 +++ include/drm/drm_panic.h |4 + include/drm/drm_rect.h | 15 + 7 files changed, 1358 insertions(+), 39 deletions(-) create mode 100644 drivers/gpu/drm/drm_panic_qr.rs base-commit: e1a261ba599eec97e1c5c7760d5c3698fc24e6a6 -- 2.45.2
[PATCH v4 1/4] drm/panic: Add integer scaling to blit()
Add a parameter to the blit function, to upscale the image. This is necessary to draw a QR code, otherwise, the pixels are usually too small to be readable by most QR code reader. It can also be used later for drawing fonts on high DPI display. Signed-off-by: Jocelyn Falempe --- drivers/gpu/drm/drm_panic.c | 33 +++-- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c index 8794c7f6c0ee..fa147542d801 100644 --- a/drivers/gpu/drm/drm_panic.c +++ b/drivers/gpu/drm/drm_panic.c @@ -249,20 +249,20 @@ static bool drm_panic_is_pixel_fg(const u8 *sbuf8, unsigned int spitch, int x, i static void drm_panic_blit16(struct iosys_map *dmap, unsigned int dpitch, const u8 *sbuf8, unsigned int spitch, unsigned int height, unsigned int width, -u16 fg16) +unsigned int scale, u16 fg16) { unsigned int y, x; for (y = 0; y < height; y++) for (x = 0; x < width; x++) - if (drm_panic_is_pixel_fg(sbuf8, spitch, x, y)) + if (drm_panic_is_pixel_fg(sbuf8, spitch, x / scale, y / scale)) iosys_map_wr(dmap, y * dpitch + x * sizeof(u16), u16, fg16); } static void drm_panic_blit24(struct iosys_map *dmap, unsigned int dpitch, const u8 *sbuf8, unsigned int spitch, unsigned int height, unsigned int width, -u32 fg32) +unsigned int scale, u32 fg32) { unsigned int y, x; @@ -270,7 +270,7 @@ static void drm_panic_blit24(struct iosys_map *dmap, unsigned int dpitch, for (x = 0; x < width; x++) { u32 off = y * dpitch + x * 3; - if (drm_panic_is_pixel_fg(sbuf8, spitch, x, y)) { + if (drm_panic_is_pixel_fg(sbuf8, spitch, x / scale, y / scale)) { /* write blue-green-red to output in little endianness */ iosys_map_wr(dmap, off, u8, (fg32 & 0x00FF) >> 0); iosys_map_wr(dmap, off + 1, u8, (fg32 & 0xFF00) >> 8); @@ -283,24 +283,25 @@ static void drm_panic_blit24(struct iosys_map *dmap, unsigned int dpitch, static void drm_panic_blit32(struct iosys_map *dmap, unsigned int dpitch, const u8 *sbuf8, unsigned int spitch, unsigned int height, unsigned int width, -u32 fg32) +unsigned int scale, u32 fg32) { unsigned int y, x; for (y = 0; y < height; y++) for (x = 0; x < width; x++) - if (drm_panic_is_pixel_fg(sbuf8, spitch, x, y)) + if (drm_panic_is_pixel_fg(sbuf8, spitch, x / scale, y / scale)) iosys_map_wr(dmap, y * dpitch + x * sizeof(u32), u32, fg32); } static void drm_panic_blit_pixel(struct drm_scanout_buffer *sb, struct drm_rect *clip, -const u8 *sbuf8, unsigned int spitch, u32 fg_color) +const u8 *sbuf8, unsigned int spitch, unsigned int scale, +u32 fg_color) { unsigned int y, x; for (y = 0; y < drm_rect_height(clip); y++) for (x = 0; x < drm_rect_width(clip); x++) - if (drm_panic_is_pixel_fg(sbuf8, spitch, x, y)) + if (drm_panic_is_pixel_fg(sbuf8, spitch, x / scale, y / scale)) sb->set_pixel(sb, clip->x1 + x, clip->y1 + y, fg_color); } @@ -310,18 +311,22 @@ static void drm_panic_blit_pixel(struct drm_scanout_buffer *sb, struct drm_rect * @clip: destination rectangle * @sbuf8: source buffer, in monochrome format, 8 pixels per byte. * @spitch: source pitch in bytes + * @scale: integer scale, source buffer is scale time smaller than destination + * rectangle * @fg_color: foreground color, in destination format * * This can be used to draw a font character, which is a monochrome image, to a * framebuffer in other supported format. */ static void drm_panic_blit(struct drm_scanout_buffer *sb, struct drm_rect *clip, - const u8 *sbuf8, unsigned int spitch, u32 fg_color) + const u8 *sbuf8, unsigned int spitch, + unsigned int scale, u32 fg_color) + { struct iosys_map map; if (sb->set_pixel) - return drm_panic_blit_pixel(sb, clip, sbuf8, spitch, fg_color); + return drm_panic_blit_pixel(sb, clip, sbuf8, spitch, scale, fg_color); map = sb->map[0]; iosys_map_incr(&map, clip->y1 * sb->pitch[0] + clip->x1 * sb->format->cpp[0])
[PATCH v4 2/4] drm/rect: Add drm_rect_overlap()
Check if two rectangles overlap. It's a bit similar to drm_rect_intersect() but this won't modify the rectangle. Simplifies a bit drm_panic. Signed-off-by: Jocelyn Falempe --- drivers/gpu/drm/drm_panic.c | 3 +-- include/drm/drm_rect.h | 15 +++ 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c index fa147542d801..1693348b19a6 100644 --- a/drivers/gpu/drm/drm_panic.c +++ b/drivers/gpu/drm/drm_panic.c @@ -522,8 +522,7 @@ static void draw_panic_static_user(struct drm_scanout_buffer *sb) /* Fill with the background color, and draw text on top */ drm_panic_fill(sb, &r_screen, bg_color); - if ((r_msg.x1 >= logo_width || r_msg.y1 >= logo_height) && - logo_width <= sb->width && logo_height <= sb->height) { + if (!drm_rect_overlap(&r_logo, &r_msg)) { if (logo_mono) drm_panic_blit(sb, &r_logo, logo_mono->data, DIV_ROUND_UP(logo_width, 8), fg_color); diff --git a/include/drm/drm_rect.h b/include/drm/drm_rect.h index 73fcb899a01d..7bafde747d56 100644 --- a/include/drm/drm_rect.h +++ b/include/drm/drm_rect.h @@ -238,6 +238,21 @@ static inline void drm_rect_fp_to_int(struct drm_rect *dst, drm_rect_height(src) >> 16); } +/** + * drm_rect_overlap - Check if two rectangles overlap + * @r1: first rectangle + * @r2: second rectangle + * + * RETURNS: + * %true if the rectangles overlap, %false otherwise. + */ +static inline bool drm_rect_overlap(const struct drm_rect *r1, + const struct drm_rect *r2) +{ + return (r1->x2 > r2->x1 && r2->x2 > r1->x1 && + r1->y2 > r2->y1 && r2->y2 > r1->y1); +} + bool drm_rect_intersect(struct drm_rect *r, const struct drm_rect *clip); bool drm_rect_clip_scaled(struct drm_rect *src, struct drm_rect *dst, const struct drm_rect *clip); -- 2.45.2
[PATCH v4 4/4] drm/panic: Add a QR code panic screen
This patch adds a new panic screen, with a QR code and the kmsg data embedded. If DRM_PANIC_SCREEN_QR_CODE_URL is set, then the kmsg data will be compressed with zlib and encoded as a numerical segment, and appended to the URL as a URL parameter. This allows to save space, and put about ~7500 bytes of kmsg data, in a V40 QR code. Linux distributions can customize the URL, and put a web frontend to directly open a bug report with the kmsg data. Otherwise the kmsg data will be encoded as a binary segment (ie raw ascii) and only a maximum of 2953 bytes of kmsg data will be available in the QR code. You can also limit the QR code size with DRM_PANIC_SCREEN_QR_VERSION. Signed-off-by: Jocelyn Falempe --- v2: * Rewrite the rust comments with Markdown (Alice Ryhl) * Mark drm_panic_qr_generate() as unsafe (Alice Ryhl) * Use CStr directly, and remove the call to as_str_unchecked() (Alice Ryhl) * Add a check for data_len <= data_size (Greg KH) v3: * Fix all rust comments (typo, punctuation) (Miguel Ojeda) * Change the wording of safety comments (Alice Ryhl) * Add a link to the javascript decoder in the Kconfig (Greg KH) * Fix data_size and tmp_size check in drm_panic_qr_generate() v4: * Fix the logic to find next line and skip the '\n' (Alic Ryhl) * Remove __LOG_PREFIX as it's not used (Alice Ryhl) drivers/gpu/drm/Kconfig | 31 + drivers/gpu/drm/Makefile|1 + drivers/gpu/drm/drm_drv.c |3 + drivers/gpu/drm/drm_panic.c | 249 drivers/gpu/drm/drm_panic_qr.rs | 1003 +++ include/drm/drm_panic.h |4 + 6 files changed, 1291 insertions(+) create mode 100644 drivers/gpu/drm/drm_panic_qr.rs diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 6dd0016fc9cd..50ac967b56be 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -149,6 +149,37 @@ config DRM_PANIC_SCREEN or by writing to /sys/module/drm/parameters/panic_screen sysfs entry Default is "user" +config DRM_PANIC_SCREEN_QR_CODE + bool "Add a panic screen with a QR code" + depends on DRM_PANIC && RUST + help + This option adds a QR code generator, and a panic screen with a QR + code. The QR code will contain the last lines of kmsg and other debug + information. This should be easier for the user to report a kernel + panic, with all debug information available. + To use this panic screen, also set DRM_PANIC_SCREEN to "qr_code" + +config DRM_PANIC_SCREEN_QR_CODE_URL + string "Base URL of the QR code in the panic screen" + depends on DRM_PANIC_SCREEN_QR_CODE + help + This option sets the base URL to report the kernel panic. If it's set + the QR code will contain the URL and the kmsg compressed with zlib as + a URL parameter. If it's empty, the QR code will contain the kmsg as + uncompressed text only. + There is a demo code in javascript, to decode and uncompress the kmsg + data from the URL parameter at https://github.com/kdj0c/panic_report + +config DRM_PANIC_SCREEN_QR_VERSION + int "Maximum version (size) of the QR code." + depends on DRM_PANIC_SCREEN_QR_CODE + default 40 + help + This option limits the version (or size) of the QR code. QR code + version ranges from Version 1 (21x21) to Version 40 (177x177). + Smaller QR code are easier to read, but will contain less debugging + data. Default is 40. + config DRM_DEBUG_DP_MST_TOPOLOGY_REFS bool "Enable refcount backtrace history in the DP MST helpers" depends on STACKTRACE_SUPPORT diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 68cc9258ffc4..c62339b89d46 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -89,6 +89,7 @@ drm-$(CONFIG_DRM_PRIVACY_SCREEN) += \ drm_privacy_screen_x86.o drm-$(CONFIG_DRM_ACCEL) += ../../accel/drm_accel.o drm-$(CONFIG_DRM_PANIC) += drm_panic.o +drm-$(CONFIG_DRM_PANIC_SCREEN_QR_CODE) += drm_panic_qr.o obj-$(CONFIG_DRM) += drm.o obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 93543071a500..27007b53a8c8 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -1067,6 +1067,7 @@ static const struct file_operations drm_stub_fops = { static void drm_core_exit(void) { drm_privacy_screen_lookup_exit(); + drm_panic_exit(); accel_core_exit(); unregister_chrdev(DRM_MAJOR, "drm"); debugfs_remove(drm_debugfs_root); @@ -1099,6 +1100,8 @@ static int __init drm_core_init(void) if (ret < 0) goto error; + drm_panic_init(); + drm_privacy_screen_lookup_init(); drm_core_init_complete = true; diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c index 1fbefb99cf6e..1357d910b742 10
[PATCH v4 3/4] drm/panic: Simplify logo handling
Move logo rectangle initialisation, and logo drawing in separate functions, so they can be re-used by different panic screens. It prepares the introduction of the QR code panic screen. Signed-off-by: Jocelyn Falempe --- drivers/gpu/drm/drm_panic.c | 57 + 1 file changed, 33 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c index 1693348b19a6..1fbefb99cf6e 100644 --- a/drivers/gpu/drm/drm_panic.c +++ b/drivers/gpu/drm/drm_panic.c @@ -80,6 +80,7 @@ static struct drm_panic_line panic_msg[] = { PANIC_LINE(""), PANIC_LINE("Please reboot your computer."), }; +static const int panic_msg_lines = ARRAY_SIZE(panic_msg); static const struct drm_panic_line logo_ascii[] = { PANIC_LINE(" .--._"), @@ -90,6 +91,7 @@ static const struct drm_panic_line logo_ascii[] = { PANIC_LINE(" /'\\_ _/`\\(_)"), PANIC_LINE(" \\___)=(___/"), }; +static const int logo_ascii_lines = ARRAY_SIZE(logo_ascii); #if defined(CONFIG_LOGO) && !defined(MODULE) static const struct linux_logo *logo_mono; @@ -488,33 +490,45 @@ static void draw_txt_rectangle(struct drm_scanout_buffer *sb, } } +static void drm_panic_logo_rect(struct drm_rect *rect, const struct font_desc *font) +{ + if (logo_mono) + drm_rect_init(rect, 0, 0, logo_mono->width, logo_mono->height); + else { + int logo_width = get_max_line_len(logo_ascii, logo_ascii_lines) * font->width; + + drm_rect_init(rect, 0, 0, logo_width, logo_ascii_lines * font->height); + } +} + +static void drm_panic_logo_draw(struct drm_scanout_buffer *sb, struct drm_rect *rect, + const struct font_desc *font, u32 fg_color) +{ + if (logo_mono) + drm_panic_blit(sb, rect, logo_mono->data, + DIV_ROUND_UP(drm_rect_width(rect), 8), 1, fg_color); + else + draw_txt_rectangle(sb, font, logo_ascii, logo_ascii_lines, false, rect, + fg_color); +} + static void draw_panic_static_user(struct drm_scanout_buffer *sb) { - size_t msg_lines = ARRAY_SIZE(panic_msg); - size_t logo_ascii_lines = ARRAY_SIZE(logo_ascii); u32 fg_color = convert_from_xrgb(CONFIG_DRM_PANIC_FOREGROUND_COLOR, sb->format->format); u32 bg_color = convert_from_xrgb(CONFIG_DRM_PANIC_BACKGROUND_COLOR, sb->format->format); const struct font_desc *font = get_default_font(sb->width, sb->height, NULL, NULL); struct drm_rect r_screen, r_logo, r_msg; - unsigned int logo_width, logo_height; + unsigned int panic_msg_width; if (!font) return; r_screen = DRM_RECT_INIT(0, 0, sb->width, sb->height); - - if (logo_mono) { - logo_width = logo_mono->width; - logo_height = logo_mono->height; - } else { - logo_width = get_max_line_len(logo_ascii, logo_ascii_lines) * font->width; - logo_height = logo_ascii_lines * font->height; - } - - r_logo = DRM_RECT_INIT(0, 0, logo_width, logo_height); + drm_panic_logo_rect(&r_logo, font); + panic_msg_width = get_max_line_len(panic_msg, panic_msg_lines) * font->width; r_msg = DRM_RECT_INIT(0, 0, - min(get_max_line_len(panic_msg, msg_lines) * font->width, sb->width), - min(msg_lines * font->height, sb->height)); + min(panic_msg_width, sb->width), + min(panic_msg_lines * font->height, sb->height)); /* Center the panic message */ drm_rect_translate(&r_msg, (sb->width - r_msg.x2) / 2, (sb->height - r_msg.y2) / 2); @@ -522,15 +536,10 @@ static void draw_panic_static_user(struct drm_scanout_buffer *sb) /* Fill with the background color, and draw text on top */ drm_panic_fill(sb, &r_screen, bg_color); - if (!drm_rect_overlap(&r_logo, &r_msg)) { - if (logo_mono) - drm_panic_blit(sb, &r_logo, logo_mono->data, DIV_ROUND_UP(logo_width, 8), - fg_color); - else - draw_txt_rectangle(sb, font, logo_ascii, logo_ascii_lines, false, &r_logo, - fg_color); - } - draw_txt_rectangle(sb, font, panic_msg, msg_lines, true, &r_msg, fg_color); + if (!drm_rect_overlap(&r_logo, &r_msg)) + drm_panic_logo_draw(sb, &r_logo, font, fg_color); + + draw_txt_rectangle(sb, font, panic_msg, panic_msg_lines, true, &r_msg, fg_color); } /* -- 2.45.2
[PATCH 3/5] drm/ast: astdp: Only test HDP state in ast_astdp_is_connected()
The overall control flow of the driver ensures that it never reads EDID or sets display state on unconnected outputs. Therefore remove all tests for Hot Plug Detection from these helpers. Also rename the register constants. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/ast/ast_dp.c | 12 +++- drivers/gpu/drm/ast/ast_reg.h | 3 +-- 2 files changed, 4 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/ast/ast_dp.c b/drivers/gpu/drm/ast/ast_dp.c index 59885d10d308..c45b336baf45 100644 --- a/drivers/gpu/drm/ast/ast_dp.c +++ b/drivers/gpu/drm/ast/ast_dp.c @@ -9,7 +9,7 @@ bool ast_astdp_is_connected(struct ast_device *ast) { - if (!ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD)) + if (!ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, AST_IO_VGACRDF_HPD)) return false; if (!ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS)) return false; @@ -23,11 +23,9 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) /* * CRDC[b0]: DP link success -* CRDF[b0]: DP HPD * CRE5[b0]: Host reading EDID process is done */ if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS) && - ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD) && ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, ASTDP_HOST_EDID_READ_DONE_MASK))) { goto err_astdp_edid_not_ready; @@ -61,8 +59,7 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) mdelay(j+1); if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, - ASTDP_LINK_SUCCESS) && - ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD))) { + ASTDP_LINK_SUCCESS))) { goto err_astdp_jump_out_loop_of_edid; } @@ -111,8 +108,6 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) err_astdp_edid_not_ready: if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS))) return (~0xDC + 1); - if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD))) - return (~0xDF + 1); if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, ASTDP_HOST_EDID_READ_DONE_MASK))) return (~0xE5 + 1); @@ -182,8 +177,7 @@ void ast_dp_set_on_off(struct drm_device *dev, bool on) ast_set_index_reg_mask(ast, AST_IO_VGACRI, 0xE3, (u8) ~AST_DP_VIDEO_ENABLE, on); // If DP plug in and link successful then check video on / off status - if (ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS) && - ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD)) { + if (ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS)) { video_on_off <<= 4; while (ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_MIRROR_VIDEO_ENABLE) != video_on_off) { diff --git a/drivers/gpu/drm/ast/ast_reg.h b/drivers/gpu/drm/ast/ast_reg.h index 569de3188191..e61954dabf1a 100644 --- a/drivers/gpu/drm/ast/ast_reg.h +++ b/drivers/gpu/drm/ast/ast_reg.h @@ -38,6 +38,7 @@ #define AST_IO_VGACRCB_HWC_ENABLED BIT(1) #define AST_IO_VGACRD1_MCU_FW_EXECUTINGBIT(5) +#define AST_IO_VGACRDF_HPD BIT(0) #define AST_IO_VGAIR1_R(0x5A) #define AST_IO_VGAIR1_VREFRESH BIT(3) @@ -70,11 +71,9 @@ /* * CRDC[b0]: DP link success - * CRDF[b0]: DP HPD * CRE5[b0]: Host reading EDID process is done */ #define ASTDP_LINK_SUCCESS BIT(0) -#define ASTDP_HPD BIT(0) #define ASTDP_HOST_EDID_READ_DONE BIT(0) #define ASTDP_HOST_EDID_READ_DONE_MASK GENMASK(0, 0) -- 2.45.2
[PATCH 1/5] drm/ast: astdp: Wake up during connector status detection
Power up the ASTDP connector for connection status detection if the connector is not active. Keep it powered if a display is attached. This fixes a bug where the connector does not come back after disconnecting the display. The encoder's atomic_disable turns off power on the physical connector. Further HPD reads will fail, thus preventing the driver from detecting re-connected displays. For connectors that are actively used, only test the HPD flag without touching power. Fixes: f81bb0ac7872 ("drm/ast: report connection status on Display Port.") Cc: Jocelyn Falempe Cc: Thomas Zimmermann Cc: Dave Airlie Cc: dri-devel@lists.freedesktop.org Cc: # v6.6+ Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/ast/ast_dp.c | 7 +++ drivers/gpu/drm/ast/ast_drv.h | 1 + drivers/gpu/drm/ast/ast_mode.c | 29 +++-- 3 files changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/ast/ast_dp.c b/drivers/gpu/drm/ast/ast_dp.c index 1e9259416980..e6c7f0d64e99 100644 --- a/drivers/gpu/drm/ast/ast_dp.c +++ b/drivers/gpu/drm/ast/ast_dp.c @@ -158,7 +158,14 @@ void ast_dp_launch(struct drm_device *dev) ASTDP_HOST_EDID_READ_DONE); } +bool ast_dp_power_is_on(struct ast_device *ast) +{ + u8 vgacre3; + + vgacre3 = ast_get_index_reg(ast, AST_IO_VGACRI, 0xe3); + return !(vgacre3 & AST_DP_PHY_SLEEP); +} void ast_dp_power_on_off(struct drm_device *dev, bool on) { diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h index ba3d86973995..47bab5596c16 100644 --- a/drivers/gpu/drm/ast/ast_drv.h +++ b/drivers/gpu/drm/ast/ast_drv.h @@ -472,6 +472,7 @@ void ast_init_3rdtx(struct drm_device *dev); bool ast_astdp_is_connected(struct ast_device *ast); int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata); void ast_dp_launch(struct drm_device *dev); +bool ast_dp_power_is_on(struct ast_device *ast); void ast_dp_power_on_off(struct drm_device *dev, bool no); void ast_dp_set_on_off(struct drm_device *dev, bool no); void ast_dp_set_mode(struct drm_crtc *crtc, struct ast_vbios_mode_info *vbios_mode); diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c index dc8f639e82fd..049ee1477c33 100644 --- a/drivers/gpu/drm/ast/ast_mode.c +++ b/drivers/gpu/drm/ast/ast_mode.c @@ -28,6 +28,7 @@ * Authors: Dave Airlie */ +#include #include #include @@ -1687,11 +1688,35 @@ static int ast_astdp_connector_helper_detect_ctx(struct drm_connector *connector struct drm_modeset_acquire_ctx *ctx, bool force) { + struct drm_device *dev = connector->dev; struct ast_device *ast = to_ast_device(connector->dev); + enum drm_connector_status status = connector_status_disconnected; + struct drm_connector_state *connector_state = connector->state; + bool is_active = false; + + mutex_lock(&ast->modeset_lock); + + if (connector_state && connector_state->crtc) { + struct drm_crtc_state *crtc_state = connector_state->crtc->state; + + if (crtc_state && crtc_state->active) + is_active = true; + } + + if (!is_active && !ast_dp_power_is_on(ast)) { + ast_dp_power_on_off(dev, true); + msleep(50); + } if (ast_astdp_is_connected(ast)) - return connector_status_connected; - return connector_status_disconnected; + status = connector_status_connected; + + if (!is_active && status == connector_status_disconnected) + ast_dp_power_on_off(dev, false); + + mutex_unlock(&ast->modeset_lock); + + return status; } static const struct drm_connector_helper_funcs ast_astdp_connector_helper_funcs = { -- 2.45.2
[PATCH 2/5] drm/ast: astdp: Test firmware status once during probing
Test for running ASTDP firmware during probe. Do not bother testing this later. We cannot do much anyway if the firmware fails. Do not initialize the ASTDP conenctor if the test fails during device probing. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/ast/ast_dp.c | 41 +- drivers/gpu/drm/ast/ast_drv.h | 2 +- drivers/gpu/drm/ast/ast_main.c | 6 +++-- drivers/gpu/drm/ast/ast_post.c | 2 +- drivers/gpu/drm/ast/ast_reg.h | 4 ++-- 5 files changed, 23 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/ast/ast_dp.c b/drivers/gpu/drm/ast/ast_dp.c index e6c7f0d64e99..59885d10d308 100644 --- a/drivers/gpu/drm/ast/ast_dp.c +++ b/drivers/gpu/drm/ast/ast_dp.c @@ -9,8 +9,6 @@ bool ast_astdp_is_connected(struct ast_device *ast) { - if (!ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xD1, ASTDP_MCU_FW_EXECUTING)) - return false; if (!ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD)) return false; if (!ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS)) @@ -24,13 +22,11 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) u8 i = 0, j = 0; /* -* CRD1[b5]: DP MCU FW is executing * CRDC[b0]: DP link success * CRDF[b0]: DP HPD * CRE5[b0]: Host reading EDID process is done */ - if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xD1, ASTDP_MCU_FW_EXECUTING) && - ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS) && + if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS) && ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD) && ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, ASTDP_HOST_EDID_READ_DONE_MASK))) { @@ -64,9 +60,7 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) */ mdelay(j+1); - if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xD1, - ASTDP_MCU_FW_EXECUTING) && - ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, + if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS) && ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD))) { goto err_astdp_jump_out_loop_of_edid; @@ -115,8 +109,6 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) return (~(j+256) + 1); err_astdp_edid_not_ready: - if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xD1, ASTDP_MCU_FW_EXECUTING))) - return (~0xD1 + 1); if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS))) return (~0xDC + 1); if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_HPD))) @@ -130,32 +122,29 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) /* * Launch Aspeed DP */ -void ast_dp_launch(struct drm_device *dev) +int ast_dp_launch(struct ast_device *ast) { - u32 i = 0; - u8 bDPExecute = 1; - struct ast_device *ast = to_ast_device(dev); + struct drm_device *dev = &ast->base; + unsigned int i = 10; - // Wait one second then timeout. - while (ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xD1, ASTDP_MCU_FW_EXECUTING) != - ASTDP_MCU_FW_EXECUTING) { - i++; - // wait 100 ms - msleep(100); + while (i) { + u8 vgacrd1 = ast_get_index_reg(ast, AST_IO_VGACRI, 0xd1); - if (i >= 10) { - // DP would not be ready. - bDPExecute = 0; + if (vgacrd1 & AST_IO_VGACRD1_MCU_FW_EXECUTING) break; - } + --i; + msleep(100); } - - if (!bDPExecute) + if (!i) { drm_err(dev, "Wait DPMCU executing timeout\n"); + return -ENODEV; + } ast_set_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, (u8) ~ASTDP_HOST_EDID_READ_DONE_MASK, ASTDP_HOST_EDID_READ_DONE); + + return 0; } bool ast_dp_power_is_on(struct ast_device *ast) diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h index 47bab5596c16..02476eb78119 100644 --- a/drivers/gpu/drm/ast/ast_drv.h +++ b/drivers/gpu/drm/ast/ast_drv.h @@ -471,7 +471,7 @@ void ast_init_3rdtx(struct drm_device *dev); /* aspeed DP */ bool ast_astdp_is_connected(struct ast_device *ast); int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata); -void ast_dp_launch(struct drm_device *dev); +int ast_dp_launch(struct ast_device *a
[PATCH 5/5] drm/ast: astdp: Clean up EDID reading
Simplify ast_astdp_read_edid(). Rename register constants. Drop unnecessary error handling. On success, the helper returns 0; an error code otherwise. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/ast/ast_dp.c | 93 --- drivers/gpu/drm/ast/ast_reg.h | 12 + 2 files changed, 44 insertions(+), 61 deletions(-) diff --git a/drivers/gpu/drm/ast/ast_dp.c b/drivers/gpu/drm/ast/ast_dp.c index 6cbde46f24dc..5d07678b502c 100644 --- a/drivers/gpu/drm/ast/ast_dp.c +++ b/drivers/gpu/drm/ast/ast_dp.c @@ -17,54 +17,55 @@ bool ast_astdp_is_connected(struct ast_device *ast) int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) { struct ast_device *ast = to_ast_device(dev); - u8 i = 0, j = 0; + int ret = 0; + u8 i; - /* -* CRE5[b0]: Host reading EDID process is done -*/ - if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, ASTDP_HOST_EDID_READ_DONE_MASK))) - goto err_astdp_edid_not_ready; - - ast_set_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, (u8) ~ASTDP_HOST_EDID_READ_DONE_MASK, - 0x00); + /* Start reading EDID data */ + ast_set_index_reg_mask(ast, AST_IO_VGACRI, 0xe5, (u8)~AST_IO_VGACRE5_EDID_READ_DONE, 0x00); for (i = 0; i < 32; i++) { + unsigned int j; + /* * CRE4[7:0]: Read-Pointer for EDID (Unit: 4bytes); valid range: 0~64 */ - ast_set_index_reg_mask(ast, AST_IO_VGACRI, 0xE4, - ASTDP_AND_CLEAR_MASK, (u8)i); - j = 0; + ast_set_index_reg(ast, AST_IO_VGACRI, 0xe4, i); /* * CRD7[b0]: valid flag for EDID * CRD6[b0]: mirror read pointer for EDID */ - while ((ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xD7, - ASTDP_EDID_VALID_FLAG_MASK) != 0x01) || - (ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xD6, - ASTDP_EDID_READ_POINTER_MASK) != i)) { + for (j = 0; j < 200; ++j) { + u8 vgacrd7, vgacrd6; + /* * Delay are getting longer with each retry. -* 1. The Delays are often 2 loops when users request "Display Settings" +* +* 1. No delay on first try +* 2. The Delays are often 2 loops when users request "Display Settings" *of right-click of mouse. -* 2. The Delays are often longer a lot when system resume from S3/S4. +* 3. The Delays are often longer a lot when system resume from S3/S4. */ - mdelay(j+1); - - j++; - if (j > 200) - goto err_astdp_jump_out_loop_of_edid; + if (j) + mdelay(j + 1); + + /* Wait for EDID offset to show up in mirror register */ + vgacrd7 = ast_get_index_reg(ast, AST_IO_VGACRI, 0xd7); + if (vgacrd7 & AST_IO_VGACRD7_EDID_VALID_FLAG) { + vgacrd6 = ast_get_index_reg(ast, AST_IO_VGACRI, 0xd6); + if (vgacrd6 == i) + break; + } + } + if (j == 200) { + ret = -EBUSY; + goto out; } - *(ediddata) = ast_get_index_reg_mask(ast, AST_IO_VGACRI, - 0xD8, ASTDP_EDID_READ_DATA_MASK); - *(ediddata + 1) = ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xD9, - ASTDP_EDID_READ_DATA_MASK); - *(ediddata + 2) = ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDA, - ASTDP_EDID_READ_DATA_MASK); - *(ediddata + 3) = ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDB, - ASTDP_EDID_READ_DATA_MASK); + ediddata[0] = ast_get_index_reg(ast, AST_IO_VGACRI, 0xd8); + ediddata[1] = ast_get_index_reg(ast, AST_IO_VGACRI, 0xd9); + ediddata[2] = ast_get_index_reg(ast, AST_IO_VGACRI, 0xda); + ediddata[3] = ast_get_index_reg(ast, AST_IO_VGACRI, 0xdb); if (i == 31) { /* @@ -76,29 +77,19 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) * The Bytes-126 indicates the Number of extensions t
[PATCH 4/5] drm/ast: astdp: Perform link training during atomic_enable
The place for link training is in the encoder's atomic_enable helper. Remove all related tests from other helper ASTDP functions; especially ast_astdp_is_connected(), which tests HPD status. DP link training is controlled by the firmware. A status flag reports success or failure. The process can be fragile on Aspeed hardware. Moving the test from connector detection to the atomic_enable allows for several retries and a longer timeout. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/ast/ast_dp.c | 45 +- drivers/gpu/drm/ast/ast_drv.h | 1 + drivers/gpu/drm/ast/ast_mode.c | 2 ++ drivers/gpu/drm/ast/ast_reg.h | 3 +-- 4 files changed, 26 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/ast/ast_dp.c b/drivers/gpu/drm/ast/ast_dp.c index c45b336baf45..6cbde46f24dc 100644 --- a/drivers/gpu/drm/ast/ast_dp.c +++ b/drivers/gpu/drm/ast/ast_dp.c @@ -11,8 +11,6 @@ bool ast_astdp_is_connected(struct ast_device *ast) { if (!ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, AST_IO_VGACRDF_HPD)) return false; - if (!ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS)) - return false; return true; } @@ -22,14 +20,10 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) u8 i = 0, j = 0; /* -* CRDC[b0]: DP link success * CRE5[b0]: Host reading EDID process is done */ - if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS) && - ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, - ASTDP_HOST_EDID_READ_DONE_MASK))) { + if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, ASTDP_HOST_EDID_READ_DONE_MASK))) goto err_astdp_edid_not_ready; - } ast_set_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, (u8) ~ASTDP_HOST_EDID_READ_DONE_MASK, 0x00); @@ -58,11 +52,6 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) */ mdelay(j+1); - if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, - ASTDP_LINK_SUCCESS))) { - goto err_astdp_jump_out_loop_of_edid; - } - j++; if (j > 200) goto err_astdp_jump_out_loop_of_edid; @@ -106,8 +95,6 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata) return (~(j+256) + 1); err_astdp_edid_not_ready: - if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS))) - return (~0xDC + 1); if (!(ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xE5, ASTDP_HOST_EDID_READ_DONE_MASK))) return (~0xE5 + 1); @@ -165,7 +152,22 @@ void ast_dp_power_on_off(struct drm_device *dev, bool on) ast_set_index_reg_mask(ast, AST_IO_VGACRI, 0xE3, (u8) ~AST_DP_PHY_SLEEP, bE3); } +void ast_dp_link_training(struct ast_device *ast) +{ + struct drm_device *dev = &ast->base; + unsigned int i = 10; + + while (i--) { + u8 vgacrdc = ast_get_index_reg(ast, AST_IO_VGACRI, 0xdc); + if (vgacrdc & AST_IO_VGACRDC_LINK_SUCCESS) + break; + if (i) + msleep(100); + } + if (!i) + drm_err(dev, "Link training failed\n"); +} void ast_dp_set_on_off(struct drm_device *dev, bool on) { @@ -176,16 +178,13 @@ void ast_dp_set_on_off(struct drm_device *dev, bool on) // Video On/Off ast_set_index_reg_mask(ast, AST_IO_VGACRI, 0xE3, (u8) ~AST_DP_VIDEO_ENABLE, on); - // If DP plug in and link successful then check video on / off status - if (ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDC, ASTDP_LINK_SUCCESS)) { - video_on_off <<= 4; - while (ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, + video_on_off <<= 4; + while (ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xDF, ASTDP_MIRROR_VIDEO_ENABLE) != video_on_off) { - // wait 1 ms - mdelay(1); - if (++i > 200) - break; - } + // wait 1 ms + mdelay(1); + if (++i > 200) + break; } } diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h index 02476eb78119..d23b98ce4359 100644 --- a/drivers/gpu/drm/ast/ast_drv.h +++ b/drivers/gpu/drm/ast/ast_drv.h @@ -474,6 +474,7 @@ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata); int ast_dp_launch(struct ast_device *ast); bool ast_dp_power_is_on(struct ast_device *ast); void ast_dp_pow
[PATCH 0/5] drm/ast: Fix DP hotplugging and clean up
Here are a number of updates for ast's ASTDP transmitter code. So far the ast driver required the DisplayPort to be connected at boot. Later detection was not supported. Re-connecting the cable was also not supported. Once atomic_disable powered off the physical ASTDP connector, there was no way of detecting a conencted display. Patch 1 makes Hot Plug Detection work. If ncesessary, the connector's detect helper powers up the physical connector to read the HPD status. That's a good oportunity to clean up ast's whole detection code for ASTDP transmitters. So patch 2 to 4 remove duplicated status tests throughout the ASTDP code. Patch 5 simplified the code for reading the display's EDID data from the firmware. Tested on AST2600 hardware with an ASTDP transmitter. Thomas Zimmermann (5): drm/ast: astdp: Wake up during connector status detection drm/ast: astdp: Test firmware status once during probing drm/ast: astdp: Only test HDP state in ast_astdp_is_connected() drm/ast: astdp: Perform link training during atomic_enable drm/ast: astdp: Clean up EDID reading drivers/gpu/drm/ast/ast_dp.c | 186 +++-- drivers/gpu/drm/ast/ast_drv.h | 4 +- drivers/gpu/drm/ast/ast_main.c | 6 +- drivers/gpu/drm/ast/ast_mode.c | 31 +- drivers/gpu/drm/ast/ast_post.c | 2 +- drivers/gpu/drm/ast/ast_reg.h | 22 ++-- 6 files changed, 126 insertions(+), 125 deletions(-) -- 2.45.2
Re: [PATCH v5 2/2] drm/amdgpu: Add address alignment support to DCC buffers
Hi Christian, Can we use the below combination flags to kick in hardware workaround while pinning BO's for this specific hw generation. if (place->flags & TTM_PL_FLAG_CONTIGUOUS) && (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(12, 0, 0) || amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(12, 0, 1))) { } Regards, Arun. On 7/17/2024 2:38 PM, Christian König wrote: Well that approach was discussed before and seemed to be to complicated. But I totally agree that the AMDGPU_GEM_CREATE_GFX12_DCC flag is a bad idea. This isn't anything userspace should need to specify in the first place. All we need is a hardware workaround which kicks in all the time while pinning BOs for this specific hw generation and texture channel configuration. Please remove the AMDGPU_GEM_CREATE_GFX12_DCC flag again if possible or specify why it is actually necessary? Regards, Christian. Am 17.07.24 um 05:44 schrieb Marek Olšák: AMDGPU_GEM_CREATE_GFX12_DCC is set on 90% of all memory allocations, and almost all of them are not displayable. Shouldn't we use a different way to indicate that we need a non-power-of-two alignment, such as looking at the alignment field directly? Marek On Tue, Jul 16, 2024, 11:45 Arunpravin Paneer Selvam wrote: Add address alignment support to the DCC VRAM buffers. v2: - adjust size based on the max_texture_channel_caches values only for GFX12 DCC buffers. - used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only for DCC buffers. - roundup non power of two DCC buffer adjusted size to nearest power of two number as the buddy allocator does not support non power of two alignments. This applies only to the contiguous DCC buffers. v3:(Alex) - rewrite the max texture channel caches comparison code in an algorithmic way to determine the alignment size. v4:(Alex) - Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c and add a new gmc func callback for dcc alignment. If the callback is non-NULL, call it to get the alignment, otherwise, use the default. v5:(Alex) - Set the Alignment to a default value if the callback doesn't exist. - Add the callback to amdgpu_gmc_funcs. v6: - Fix checkpatch error reported by Intel CI. Signed-off-by: Arunpravin Paneer Selvam Acked-by: Alex Deucher Acked-by: Christian König Reviewed-by: Frank Min --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 36 ++-- drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c | 15 3 files changed, 55 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index febca3130497..654d0548a3f8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -156,6 +156,8 @@ struct amdgpu_gmc_funcs { uint64_t addr, uint64_t *flags); /* get the amount of memory used by the vbios for pre-OS console */ unsigned int (*get_vbios_fb_size)(struct amdgpu_device *adev); + /* get the DCC buffer alignment */ + u64 (*get_dcc_alignment)(struct amdgpu_device *adev); enum amdgpu_memory_partition (*query_mem_partition_mode)( struct amdgpu_device *adev); @@ -363,6 +365,10 @@ struct amdgpu_gmc { (adev)->gmc.gmc_funcs->override_vm_pte_flags \ ((adev), (vm), (addr), (pte_flags)) #define amdgpu_gmc_get_vbios_fb_size(adev) (adev)->gmc.gmc_funcs->get_vbios_fb_size((adev)) +#define amdgpu_gmc_get_dcc_alignment(_adev) ({ \ + typeof(_adev) (adev) = (_adev); \ + ((adev)->gmc.gmc_funcs->get_dcc_alignment((adev))); \ +}) /** * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible through the BAR diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index f91cc149d06c..aa9dca12371c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -512,6 +512,16 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, vres->flags |= DRM_BUDDY_RANGE_ALLOCATION; remaining_size = (u64)vres->base.size; + if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && + bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) { + u64 adjust_size; + + if (adev->gmc.gmc_funcs->get_dcc_alignment) { + adjust_size = amdgpu_gmc_get_dcc_alignment(adev); + remaining_size = roundup_pow_of_two(remaining_size + adjust_size); +
Re: [PATCH v5 2/2] drm/amdgpu: Add address alignment support to DCC buffers
As far as I know, yes. Regards, Christian. Am 17.07.24 um 16:38 schrieb Paneer Selvam, Arunpravin: Hi Christian, Can we use the below combination flags to kick in hardware workaround while pinning BO's for this specific hw generation. if (place->flags & TTM_PL_FLAG_CONTIGUOUS) && (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(12, 0, 0) || amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(12, 0, 1))) { } Regards, Arun. On 7/17/2024 2:38 PM, Christian König wrote: Well that approach was discussed before and seemed to be to complicated. But I totally agree that the AMDGPU_GEM_CREATE_GFX12_DCC flag is a bad idea. This isn't anything userspace should need to specify in the first place. All we need is a hardware workaround which kicks in all the time while pinning BOs for this specific hw generation and texture channel configuration. Please remove the AMDGPU_GEM_CREATE_GFX12_DCC flag again if possible or specify why it is actually necessary? Regards, Christian. Am 17.07.24 um 05:44 schrieb Marek Olšák: AMDGPU_GEM_CREATE_GFX12_DCC is set on 90% of all memory allocations, and almost all of them are not displayable. Shouldn't we use a different way to indicate that we need a non-power-of-two alignment, such as looking at the alignment field directly? Marek On Tue, Jul 16, 2024, 11:45 Arunpravin Paneer Selvam wrote: Add address alignment support to the DCC VRAM buffers. v2: - adjust size based on the max_texture_channel_caches values only for GFX12 DCC buffers. - used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only for DCC buffers. - roundup non power of two DCC buffer adjusted size to nearest power of two number as the buddy allocator does not support non power of two alignments. This applies only to the contiguous DCC buffers. v3:(Alex) - rewrite the max texture channel caches comparison code in an algorithmic way to determine the alignment size. v4:(Alex) - Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c and add a new gmc func callback for dcc alignment. If the callback is non-NULL, call it to get the alignment, otherwise, use the default. v5:(Alex) - Set the Alignment to a default value if the callback doesn't exist. - Add the callback to amdgpu_gmc_funcs. v6: - Fix checkpatch error reported by Intel CI. Signed-off-by: Arunpravin Paneer Selvam Acked-by: Alex Deucher Acked-by: Christian König Reviewed-by: Frank Min --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 36 ++-- drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c | 15 3 files changed, 55 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index febca3130497..654d0548a3f8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -156,6 +156,8 @@ struct amdgpu_gmc_funcs { uint64_t addr, uint64_t *flags); /* get the amount of memory used by the vbios for pre-OS console */ unsigned int (*get_vbios_fb_size)(struct amdgpu_device *adev); + /* get the DCC buffer alignment */ + u64 (*get_dcc_alignment)(struct amdgpu_device *adev); enum amdgpu_memory_partition (*query_mem_partition_mode)( struct amdgpu_device *adev); @@ -363,6 +365,10 @@ struct amdgpu_gmc { (adev)->gmc.gmc_funcs->override_vm_pte_flags \ ((adev), (vm), (addr), (pte_flags)) #define amdgpu_gmc_get_vbios_fb_size(adev) (adev)->gmc.gmc_funcs->get_vbios_fb_size((adev)) +#define amdgpu_gmc_get_dcc_alignment(_adev) ({ \ + typeof(_adev) (adev) = (_adev); \ + ((adev)->gmc.gmc_funcs->get_dcc_alignment((adev))); \ +}) /** * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible through the BAR diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index f91cc149d06c..aa9dca12371c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -512,6 +512,16 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, vres->flags |= DRM_BUDDY_RANGE_ALLOCATION; remaining_size = (u64)vres->base.size; + if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && + bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) { + u64 adjust_size; + + if (adev->gmc.gmc_funcs->get_dcc_alignment) { + adjust_size = amdgpu_gmc_get_dcc_alignment(adev); +
Re: [PATCH RFC 0/3] Implement Qualcomm TEE IPC and ioctl calls
Hi, On Wed, Jul 10, 2024 at 1:17 AM Amirreza Zarrabi wrote: > > > > On 7/3/2024 9:36 PM, Dmitry Baryshkov wrote: > > On Tue, Jul 02, 2024 at 10:57:35PM GMT, Amirreza Zarrabi wrote: > >> Qualcomm TEE hosts Trusted Applications (TAs) and services that run in > >> the secure world. Access to these resources is provided using MinkIPC. > >> MinkIPC is a capability-based synchronous message passing facility. It > >> allows code executing in one domain to invoke objects running in other > >> domains. When a process holds a reference to an object that lives in > >> another domain, that object reference is a capability. Capabilities > >> allow us to separate implementation of policies from implementation of > >> the transport. > >> > >> As part of the upstreaming of the object invoke driver (called SMC-Invoke > >> driver), we need to provide a reasonable kernel API and UAPI. The clear > >> option is to use TEE subsystem and write a back-end driver, however the > >> TEE subsystem doesn't fit with the design of Qualcomm TEE. > >> > > To answer your "general comment", maybe a bit of background :). > > Traditionally, policy enforcement is based on access-control models, > either (1) access-control list or (2) capability [0]. A capability is an > opaque ("non-forge-able") object reference that grants the holder the > right to perform certain operations on the object (e.g. Read, Write, > Execute, or Grant). Capabilities are preferred mechanism for representing > a policy, due to their fine-grained representation of access right, inline > with > (P1) the principle of least privilege [1], and > (P2) the ability to avoid the confused deputy problem [2]. > > [0] Jack B. Dennis and Earl C. Van Horn. 1966. Programming Semantics for > Multiprogrammed Computations. Commun. ACM 9 (1966), 143–155. > > [1] Jerome H. Saltzer and Michael D. Schroeder. 1975. The Protection of > Information in Computer Systems. Proc. IEEE 63 (1975), 1278–1308. > > [2] Norm Hardy. 1988. The Confused Deputy (or Why Capabilities Might Have > Been Invented). ACM Operating Systems Review 22, 4 (1988), 36–38. > > For MinkIPC, an object represents a TEE or TA service. The reference to > the object is the "handle" that is returned from TEE (let's call it > TEE-Handle). The supported operations are "service invocation" (similar > to Execute), and "sharing access to a service" (similar to Grant). > Anyone with access to the TEE-Handle can invoke the service or pass the > TEE-Handle to someone else to access the same service. > > The responsibility of the MinkIPC framework is to hide the TEE-Handle, > so that the client can not forge it, and allow the owner of the handle > to transfer it to other clients as it wishes. Using a file descriptor > table we can achieve that. We wrap the TEE-Handle as a FD and let the > client invoke FD (e.g. using IOCTL), or transfer the FD (e.g. using > UNIX socket). > > As a side note, for the sake of completeness, capabilities are fundamentally > a "discretionary mechanism", as the holder of the object reference has the > ability to share it with others. A secure system requires "mandatory > enforcement" (i.e. ability to revoke authority and ability to control > the authority propagation). This is out of scope for the MinkIPC. > MinkIPC is only interested in P1 and P2 (mention above). This is still quite abstract. We have tried to avoid inventing yet another IPC mechanism in the TEE subsystem. But that's not written in stone if it turns out there's a use case that needs it. > > > >> Does TEE subsystem fit requirements of a capability based system? > >> - > >> In TEE subsystem, to invoke a function: > >>- client should open a device file "/dev/teeX", > >>- create a session with a TA, and > >>- invoke the functions in that session. > >> > >> 1. The privilege to invoke a function is determined by a session. If a > >>client has a session, it cannot share it with other clients. Even if > >> it does, it is not fine-grained enough, i.e. either all accessible > >> functions/resources in a session or none. Assume a scenario when a client > >> wants to grant a permission to invoke just a function that it has the > >> rights, > >> to another client. > >> > >> The "all or nothing" for sharing sessions is not in line with our > >> capability system: "if you own a capability, you should be able to grant > >> or share it". > > > > Can you please be more specific here? What kind of sharing is expected > > on the user side of it? > > In MinkIPC, after authenticating a client credential, a TA (or TEE) may > return multiple TEE-Handles, each representing a service that the client > has privilege to access. The client should be able to "individually" > reference each TEE-Handle, e.g. to invoke and share it (as per capability- > based system requirements). > > If we use TEE subsystem, which has a session based design, all TEE-Handles > are meaningful with respect t
Re: [PATCH 1/9] drm/amdgpu: use GEM references instead of TTMs
Am 16.07.24 um 23:53 schrieb Matthew Brost: On Tue, Jul 16, 2024 at 02:35:11PM +0200, Christian König wrote: Instead of a TTM reference grab a GEM reference whenever necessary. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 8 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 7 ++- 2 files changed, 6 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 67c234bcf89f..6be3d7cd1c51 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -87,11 +87,11 @@ static const struct vm_operations_struct amdgpu_gem_vm_ops = { static void amdgpu_gem_object_free(struct drm_gem_object *gobj) { - struct amdgpu_bo *robj = gem_to_amdgpu_bo(gobj); + struct amdgpu_bo *aobj = gem_to_amdgpu_bo(gobj); - if (robj) { - amdgpu_hmm_unregister(robj); - amdgpu_bo_unref(&robj); + if (aobj) { + amdgpu_hmm_unregister(aobj); + ttm_bo_put(&aobj->tbo); So, this relates to my comment in patch number #9 about dropping the TTM ref count. If TTM only uses the GEM ref count, could we convert this ttm_bo_put to something like ttm_bo_fini here (also in Xe and any other drivers with code like this)? That's exactly what I was planning to do as a follow up. Regards, Christian. I think that might be cleaner. Matt } } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 8d8c39be6129..6c187e310034 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -853,7 +853,7 @@ struct amdgpu_bo *amdgpu_bo_ref(struct amdgpu_bo *bo) if (bo == NULL) return NULL; - ttm_bo_get(&bo->tbo); + drm_gem_object_get(&bo->tbo.base); return bo; } @@ -865,13 +865,10 @@ struct amdgpu_bo *amdgpu_bo_ref(struct amdgpu_bo *bo) */ void amdgpu_bo_unref(struct amdgpu_bo **bo) { - struct ttm_buffer_object *tbo; - if ((*bo) == NULL) return; - tbo = &((*bo)->tbo); - ttm_bo_put(tbo); + drm_gem_object_get(&(*bo)->tbo.base); *bo = NULL; } -- 2.34.1
Re: [PATCH v1] drm/panel-edp: Add panels with conservative timings
Hi, On Wed, Jul 17, 2024 at 2:39 AM Terry Hsiao wrote: > > The 6 panels are used on Mediatek MT8186 Chromebooks > - B116XAT04.1 (06AF/B4C4) > - NV116WHM-A4D (09E5/FA0C) > - N116BCP-EA2 (0DAE/6111) > - B116XTN02.3 (06AF/AA73) > - B116XAN06.1 (06AF/99A1) > - N116BCA-EA2 (0DAE/5D11) > > Signed-off-by: Terry Hsiao > --- > drivers/gpu/drm/panel/panel-edp.c | 6 ++ > 1 file changed, 6 insertions(+) Please resend with a better patch subject, like "drm/panel-edp: Add 6 panels used by MT8186 Chromebooks". Also: are you adding timings based on the datasheets, or are you just guessing here? The previous patches that added "conservative" timings were because the Chromebooks involved were really old and couldn't be tracked down and folks couldn't find the relevant datasheets. In the case of MT8188 I'd expect you to be adding timings based on the datasheets. Please confirm that you are. If possible, it's really nice to have the raw EDIDs for the panels in the commit message in case someone needs it later. > diff --git a/drivers/gpu/drm/panel/panel-edp.c > b/drivers/gpu/drm/panel/panel-edp.c > index f85a6404ba58..ac280607998f 100644 > --- a/drivers/gpu/drm/panel/panel-edp.c > +++ b/drivers/gpu/drm/panel/panel-edp.c > @@ -1845,8 +1845,11 @@ static const struct edp_panel_entry edp_panels[] = { > EDP_PANEL_ENTRY('A', 'U', 'O', 0x635c, &delay_200_500_e50, > "B116XAN06.3"), > EDP_PANEL_ENTRY('A', 'U', 'O', 0x639c, &delay_200_500_e50, > "B140HAK02.7"), > EDP_PANEL_ENTRY('A', 'U', 'O', 0x723c, &delay_200_500_e50, > "B140XTN07.2"), > + EDP_PANEL_ENTRY('A', 'U', 'O', 0x73aa, &delay_200_500_e50, > "B116XTN02.3"), > EDP_PANEL_ENTRY('A', 'U', 'O', 0x8594, &delay_200_500_e50, > "B133UAN01.0"), > EDP_PANEL_ENTRY('A', 'U', 'O', 0xd497, &delay_200_500_e50, > "B120XAN01.0"), > + EDP_PANEL_ENTRY('A', 'U', 'O', 0xa199, &delay_200_500_e50, > "B116XAN06.1"), Please keep this sorted. For instance, 0xa199 should come _before_ 0xd497, right? > + EDP_PANEL_ENTRY('A', 'U', 'O', 0xc4b4, &delay_200_500_e50, > "B116XAT04.1"), > EDP_PANEL_ENTRY('A', 'U', 'O', 0xf390, &delay_200_500_e50, > "B140XTN07.7"), > > EDP_PANEL_ENTRY('B', 'O', 'E', 0x0607, &delay_200_500_e200, > "Unknown"), > @@ -1901,6 +1904,7 @@ static const struct edp_panel_entry edp_panels[] = { > EDP_PANEL_ENTRY('B', 'O', 'E', 0x0b56, &delay_200_500_e80, > "NT140FHM-N47"), > EDP_PANEL_ENTRY('B', 'O', 'E', 0x0c20, &delay_200_500_e80, > "NT140FHM-N47"), > EDP_PANEL_ENTRY('B', 'O', 'E', 0x0cb6, &delay_200_500_e200, > "NT116WHM-N44"), > + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0cfa, &delay_200_500_e50, > "NV116WHM-A4D"), > > EDP_PANEL_ENTRY('C', 'M', 'N', 0x1130, &delay_200_500_e50, > "N116BGE-EB2"), > EDP_PANEL_ENTRY('C', 'M', 'N', 0x1132, &delay_200_500_e80_d50, > "N116BGE-EA2"), > @@ -1916,8 +1920,10 @@ static const struct edp_panel_entry edp_panels[] = { > EDP_PANEL_ENTRY('C', 'M', 'N', 0x1156, &delay_200_500_e80_d50, > "Unknown"), > EDP_PANEL_ENTRY('C', 'M', 'N', 0x1157, &delay_200_500_e80_d50, > "N116BGE-EA2"), > EDP_PANEL_ENTRY('C', 'M', 'N', 0x115b, &delay_200_500_e80_d50, > "N116BCN-EB1"), > + EDP_PANEL_ENTRY('C', 'M', 'N', 0x115d, &delay_200_500_e80_d50, > "N116BCA-EA2"), > EDP_PANEL_ENTRY('C', 'M', 'N', 0x115e, &delay_200_500_e80_d50, > "N116BCA-EA1"), > EDP_PANEL_ENTRY('C', 'M', 'N', 0x1160, &delay_200_500_e80_d50, > "N116BCJ-EAK"), > + EDP_PANEL_ENTRY('C', 'M', 'N', 0x1161, &delay_200_500_e80, > "N116BCP-EA2"), It looks suspicious that all the panels around this one need 50 ms for disable but yours doesn't. While it's certainly possible that things changed for this panel, it's worth double-checking. -Doug
Re: [PATCH] drm/buddy: Add start address support to trim function
On 7/16/2024 3:34 PM, Matthew Auld wrote: On 16/07/2024 10:50, Paneer Selvam, Arunpravin wrote: Hi Matthew, On 7/10/2024 6:20 PM, Matthew Auld wrote: On 10/07/2024 07:03, Paneer Selvam, Arunpravin wrote: Thanks Alex. Hi Matthew, Any comments? Do we not pass the required address alignment when allocating the pages in the first place? If address alignment is really useful, we can add that in the drm_buddy_alloc_blocks() function. I mean don't we already pass the min page size, which should give us matching physical address alignment? I think we don't need to align the address to the passed min_block_size value for all the contiguous buffers, so I thought that decision we can leave it to the drivers and they can achieve that through trim function in this kind of a specific request. https://patchwork.freedesktop.org/series/136150/ We are getting this sparse error from the Intel CI. Do you think these errors are introduced with this patches? Thanks, Arun. Thanks, Arun. Thanks, Arun. On 7/9/2024 1:42 AM, Alex Deucher wrote: On Thu, Jul 4, 2024 at 4:40 AM Arunpravin Paneer Selvam wrote: - Add a new start parameter in trim function to specify exact address from where to start the trimming. This would help us in situations like if drivers would like to do address alignment for specific requirements. - Add a new flag DRM_BUDDY_TRIM_DISABLE. Drivers can use this flag to disable the allocator trimming part. This patch enables the drivers control trimming and they can do it themselves based on the application requirements. v1:(Matthew) - check new_start alignment with min chunk_size - use range_overflows() Signed-off-by: Arunpravin Paneer Selvam Series is: Acked-by: Alex Deucher I'd like to take this series through the amdgpu tree if there are no objections as it's required for display buffers on some chips and I'd like to make sure it lands in 6.11. Thanks, Alex --- drivers/gpu/drm/drm_buddy.c | 25 +++-- drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 2 +- include/drm/drm_buddy.h | 2 ++ 3 files changed, 26 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index 94f8c34fc293..8cebe1fa4e9d 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -851,6 +851,7 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm, * drm_buddy_block_trim - free unused pages * * @mm: DRM buddy manager + * @start: start address to begin the trimming. * @new_size: original size requested * @blocks: Input and output list of allocated blocks. * MUST contain single block as input to be trimmed. @@ -866,11 +867,13 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm, * 0 on success, error code on failure. */ int drm_buddy_block_trim(struct drm_buddy *mm, + u64 *start, u64 new_size, struct list_head *blocks) { struct drm_buddy_block *parent; struct drm_buddy_block *block; + u64 block_start, block_end; LIST_HEAD(dfs); u64 new_start; int err; @@ -882,6 +885,9 @@ int drm_buddy_block_trim(struct drm_buddy *mm, struct drm_buddy_block, link); + block_start = drm_buddy_block_offset(block); + block_end = block_start + drm_buddy_block_size(mm, block); + if (WARN_ON(!drm_buddy_block_is_allocated(block))) return -EINVAL; @@ -894,6 +900,20 @@ int drm_buddy_block_trim(struct drm_buddy *mm, if (new_size == drm_buddy_block_size(mm, block)) return 0; + new_start = block_start; + if (start) { + new_start = *start; + + if (new_start < block_start) + return -EINVAL; + + if (!IS_ALIGNED(new_start, mm->chunk_size)) + return -EINVAL; + + if (range_overflows(new_start, new_size, block_end)) + return -EINVAL; + } + list_del(&block->link); mark_free(mm, block); mm->avail += drm_buddy_block_size(mm, block); @@ -904,7 +924,6 @@ int drm_buddy_block_trim(struct drm_buddy *mm, parent = block->parent; block->parent = NULL; - new_start = drm_buddy_block_offset(block); list_add(&block->tmp_link, &dfs); err = __alloc_range(mm, &dfs, new_start, new_size, blocks, NULL); if (err) { @@ -1066,7 +1085,8 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm, } while (1); /* Trim the allocated block to the required size */ - if (original_size != size) { + if (!(flags & DRM_BUDDY_TRIM_DISABLE) && + original_size != size) { struct list_head *trim_list; LIST_HEAD(temp); u64 trim_size; @@ -1083
Re: [PATCH 2/3] fbcon: Add an option to disable fbcon in panic.
On Wed, Jul 17, 2024 at 10:48:40AM +0200, Jocelyn Falempe wrote: > This is required to avoid conflict between DRM_PANIC, and fbcon. If > a drm device already handle panic with drm_panic, it should set > the skip_panic field in fb_info, so that fbcon will stay quiet, and > not overwrite the panic_screen. > > Signed-off-by: Jocelyn Falempe > --- > drivers/gpu/drm/drm_fb_helper.c | 2 ++ > drivers/video/fbdev/core/fbcon.c | 7 ++- > include/linux/fb.h | 1 + > 3 files changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c > index e2e19f49342e..3662d664d8f9 100644 > --- a/drivers/gpu/drm/drm_fb_helper.c > +++ b/drivers/gpu/drm/drm_fb_helper.c > @@ -40,6 +40,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -524,6 +525,7 @@ struct fb_info *drm_fb_helper_alloc_info(struct > drm_fb_helper *fb_helper) > fb_helper->info = info; > info->skip_vt_switch = true; > > + info->skip_panic = drm_panic_is_enabled(fb_helper->dev); > return info; > > err_release: Bit a bikeshed, but I'd split this patch out since it's for drm's fbdev emulation, not the fbcon core code. With that: Reviewed-by: Daniel Vetter > diff --git a/drivers/video/fbdev/core/fbcon.c > b/drivers/video/fbdev/core/fbcon.c > index 3f7333dca508..498d9c07df80 100644 > --- a/drivers/video/fbdev/core/fbcon.c > +++ b/drivers/video/fbdev/core/fbcon.c > @@ -270,12 +270,17 @@ static int fbcon_get_rotate(struct fb_info *info) > return (ops) ? ops->rotate : 0; > } > > +static bool fbcon_skip_panic(struct fb_info *info) > +{ > + return (info->skip_panic && unlikely(atomic_read(&panic_cpu) != > PANIC_CPU_INVALID)); > +} > + > static inline int fbcon_is_inactive(struct vc_data *vc, struct fb_info *info) > { > struct fbcon_ops *ops = info->fbcon_par; > > return (info->state != FBINFO_STATE_RUNNING || > - vc->vc_mode != KD_TEXT || ops->graphics); > + vc->vc_mode != KD_TEXT || ops->graphics || > fbcon_skip_panic(info)); > } > > static int get_color(struct vc_data *vc, struct fb_info *info, > diff --git a/include/linux/fb.h b/include/linux/fb.h > index db7d97b10964..865dad03e73e 100644 > --- a/include/linux/fb.h > +++ b/include/linux/fb.h > @@ -510,6 +510,7 @@ struct fb_info { > void *par; > > bool skip_vt_switch; /* no VT switch on suspend/resume required */ > + bool skip_panic; /* Do not write to the fb after a panic */ > }; > > /* This will go away > -- > 2.45.2 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 1/3] drm/panic: Add drm_panic_is_enabled()
On Wed, Jul 17, 2024 at 10:48:39AM +0200, Jocelyn Falempe wrote: > It allows to check if the drm device supports drm_panic. > Prepare the work to have better integration with fbcon and vtconsole. > > Signed-off-by: Jocelyn Falempe > --- > drivers/gpu/drm/drm_panic.c | 20 > include/drm/drm_panic.h | 2 ++ > 2 files changed, 22 insertions(+) > > diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c > index 948aed00595e..d9a25c2d0a65 100644 > --- a/drivers/gpu/drm/drm_panic.c > +++ b/drivers/gpu/drm/drm_panic.c > @@ -703,6 +703,26 @@ static void debugfs_register_plane(struct drm_plane > *plane, int index) > static void debugfs_register_plane(struct drm_plane *plane, int index) {} > #endif /* CONFIG_DRM_PANIC_DEBUG */ > > +/** > + * drm_panic_is_enabled > + * @dev: the drm device that may supports drm_panic > + * > + * returns true if the drm device supports drm_panic > + */ > +bool drm_panic_is_enabled(struct drm_device *dev) > +{ > + struct drm_plane *plane; > + > + if (!dev->mode_config.num_total_plane) > + return false; > + > + drm_for_each_plane(plane, dev) > + if (plane->helper_private && > plane->helper_private->get_scanout_buffer) > + return true; > + return false; > +} > +EXPORT_SYMBOL(drm_panic_is_enabled); This feels like overkill since you currently only have one user in the fbdev emulation code, but maybe useful in some other places ... > + > /** > * drm_panic_register() - Initialize DRM panic for a device > * @dev: the drm device on which the panic screen will be displayed. > diff --git a/include/drm/drm_panic.h b/include/drm/drm_panic.h > index 73bb3f3d9ed9..c3a358dc3e27 100644 > --- a/include/drm/drm_panic.h > +++ b/include/drm/drm_panic.h > @@ -148,11 +148,13 @@ struct drm_scanout_buffer { > > #ifdef CONFIG_DRM_PANIC > > +bool drm_panic_is_enabled(struct drm_device *dev); Since it's internal only, this should be in drivers/gpu/drm/drm_crtc_internal.h and not int he include for drivers. With that: Reviewed-by: Daniel Vetter > void drm_panic_register(struct drm_device *dev); > void drm_panic_unregister(struct drm_device *dev); These two are only used in drm.ko. Can you please move them to drm_crtc_internal.h too and drop the EXPORT_SYMBOL in a follow-up patch? We're trying to limit the exported interface and official headers to really only the pieces drivers actually need. Thanks, Sima > > #else > > +bool drm_panic_is_enabled(struct drm_device *dev) {return false; } > static inline void drm_panic_register(struct drm_device *dev) {} > static inline void drm_panic_unregister(struct drm_device *dev) {} > > -- > 2.45.2 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 3/3] drm/panic: Remove build time dependency with FRAMEBUFFER_CONSOLE
On Wed, Jul 17, 2024 at 12:09:46PM +0200, Javier Martinez Canillas wrote: > Jocelyn Falempe writes: > > > Now that fbcon has the skip_panic option, there is no more conflicts > > between drm_panic and fbcon. > > Remove the build time dependency, so they can be both enabled. > > > > Signed-off-by: Jocelyn Falempe > > --- > > drivers/gpu/drm/Kconfig | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig > > index 6dd0016fc9cd..a22cab218004 100644 > > --- a/drivers/gpu/drm/Kconfig > > +++ b/drivers/gpu/drm/Kconfig > > @@ -107,7 +107,7 @@ config DRM_KMS_HELPER > > > > config DRM_PANIC > > bool "Display a user-friendly message when a kernel panic occurs" > > - depends on DRM && !(FRAMEBUFFER_CONSOLE && VT_CONSOLE) > > + depends on DRM > > This is great. Thanks for finding an alternative approach! I don't see any > issues this time, because there is no locking involved. But let's see what > others think about it. Looks like it should work, I did check the history of fbcon_is_active and we've used that to force/disable panic output for fbcon in the past. So I think it's the right tool. > Reviewed-by: Javier Martinez Canillas Reviewed-by: Daniel Vetter Cheers, Sima > > -- > Best regards, > > Javier Martinez Canillas > Core Platforms > Red Hat > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 1/2] dma-buf: heaps: DMA_HEAP_IOCTL_ALLOC_READ_FILE framework
On Tue, Jul 16, 2024 at 06:14:48PM +0800, Huan Yang wrote: > > 在 2024/7/16 17:31, Daniel Vetter 写道: > > [你通常不会收到来自 daniel.vet...@ffwll.ch 的电子邮件。请访问 > > https://aka.ms/LearnAboutSenderIdentification,以了解这一点为什么很重要] > > > > On Tue, Jul 16, 2024 at 10:48:40AM +0800, Huan Yang wrote: > > > I just research the udmabuf, Please correct me if I'm wrong. > > > > > > 在 2024/7/15 20:32, Christian König 写道: > > > > Am 15.07.24 um 11:11 schrieb Daniel Vetter: > > > > > On Thu, Jul 11, 2024 at 11:00:02AM +0200, Christian König wrote: > > > > > > Am 11.07.24 um 09:42 schrieb Huan Yang: > > > > > > > Some user may need load file into dma-buf, current > > > > > > > way is: > > > > > > > 1. allocate a dma-buf, get dma-buf fd > > > > > > > 2. mmap dma-buf fd into vaddr > > > > > > > 3. read(file_fd, vaddr, fsz) > > > > > > > This is too heavy if fsz reached to GB. > > > > > > You need to describe a bit more why that is to heavy. I can only > > > > > > assume you > > > > > > need to save memory bandwidth and avoid the extra copy with the CPU. > > > > > > > > > > > > > This patch implement a feature called > > > > > > > DMA_HEAP_IOCTL_ALLOC_READ_FILE. > > > > > > > User need to offer a file_fd which you want to load into > > > > > > > dma-buf, then, > > > > > > > it promise if you got a dma-buf fd, it will contains the file > > > > > > > content. > > > > > > Interesting idea, that has at least more potential than trying > > > > > > to enable > > > > > > direct I/O on mmap()ed DMA-bufs. > > > > > > > > > > > > The approach with the new IOCTL might not work because it is a very > > > > > > specialized use case. > > > > > > > > > > > > But IIRC there was a copy_file_range callback in the file_operations > > > > > > structure you could use for that. I'm just not sure when and how > > > > > > that's used > > > > > > with the copy_file_range() system call. > > > > > I'm not sure any of those help, because internally they're all still > > > > > based > > > > > on struct page (or maybe in the future on folios). And that's the > > > > > thing > > > > > dma-buf can't give you, at least without peaking behind the curtain. > > > > > > > > > > I think an entirely different option would be malloc+udmabuf. That > > > > > essentially handles the impendence-mismatch between direct I/O and > > > > > dma-buf > > > > > on the dma-buf side. The downside is that it'll make the permanently > > > > > pinned memory accounting and tracking issues even more apparent, but I > > > > > guess eventually we do need to sort that one out. > > > > Oh, very good idea! > > > > Just one minor correction: it's not malloc+udmabuf, but rather > > > > create_memfd()+udmabuf. > > Hm right, it's create_memfd() + mmap(memfd) + udmabuf > > > > > > And you need to complete your direct I/O before creating the udmabuf > > > > since that reference will prevent direct I/O from working. > > > udmabuf will pin all pages, so, if returned fd, can't trigger direct I/O > > > (same as dmabuf). So, must complete read before pin it. > > Why does pinning prevent direct I/O? I haven't tested, but I'd expect the > > rdma folks would be really annoyed if that's the case ... > > > > > But current way is use `memfd_pin_folios` to boost alloc and pin, so maybe > > > need suit it. > > > > > > > > > I currently doubt that the udmabuf solution is suitable for our > > > gigabyte-level read operations. > > > > > > 1. The current mmap operation uses faulting, so frequent page faults will > > > be > > > triggered during reads, resulting in a lot of context switching overhead. > > > > > > 2. current udmabuf size limit is 64MB, even can change, maybe not good to > > > use in large size? > > Yeah that's just a figleaf so we don't have to bother about the accounting > > issue. > > > > > 3. The migration and adaptation of the driver is also a challenge, and > > > currently, we are unable to control it. > > Why does a udmabuf fd not work instead of any other dmabuf fd? That > > shouldn't matter for the consuming driver ... > > Hmm, our production's driver provider by other oem. I see many of they > implement > > their own dma_buf_ops. These may not be generic and may require them to > reimplement. Yeah, for exporting a buffer object allocated by that driver. But any competent gles/vk stack also supports importing dma-buf, and that should work with udmabuf exactly the same way as with a dma-buf allocated from the system heap. > > > Perhaps implementing `copy_file_range` would be more suitable for us. > > See my other mail, fundamentally these all rely on struct page being > > present, and dma-buf doesn't give you that. Which means you need to go > > below the dma-buf abstraction. And udmabuf is pretty much the thing for > > that, because it wraps normal struct page memory into a dmabuf. > Yes, udmabuf give this, I am very interested in whether the page provided by > udmabuf can trigger direct I/O. > > So, I'll give a test and report soon. > > > > And copy_file_range o
Re: [PATCH 1/2] dma-buf: heaps: DMA_HEAP_IOCTL_ALLOC_READ_FILE framework
On Tue, Jul 16, 2024 at 02:07:20PM +0200, Christian König wrote: > Am 16.07.24 um 11:31 schrieb Daniel Vetter: > > On Tue, Jul 16, 2024 at 10:48:40AM +0800, Huan Yang wrote: > > > I just research the udmabuf, Please correct me if I'm wrong. > > > > > > 在 2024/7/15 20:32, Christian König 写道: > > > > Am 15.07.24 um 11:11 schrieb Daniel Vetter: > > > > > On Thu, Jul 11, 2024 at 11:00:02AM +0200, Christian König wrote: > > > > > > Am 11.07.24 um 09:42 schrieb Huan Yang: > > > > > > > Some user may need load file into dma-buf, current > > > > > > > way is: > > > > > > > 1. allocate a dma-buf, get dma-buf fd > > > > > > > 2. mmap dma-buf fd into vaddr > > > > > > > 3. read(file_fd, vaddr, fsz) > > > > > > > This is too heavy if fsz reached to GB. > > > > > > You need to describe a bit more why that is to heavy. I can only > > > > > > assume you > > > > > > need to save memory bandwidth and avoid the extra copy with the CPU. > > > > > > > > > > > > > This patch implement a feature called > > > > > > > DMA_HEAP_IOCTL_ALLOC_READ_FILE. > > > > > > > User need to offer a file_fd which you want to load into > > > > > > > dma-buf, then, > > > > > > > it promise if you got a dma-buf fd, it will contains the file > > > > > > > content. > > > > > > Interesting idea, that has at least more potential than trying > > > > > > to enable > > > > > > direct I/O on mmap()ed DMA-bufs. > > > > > > > > > > > > The approach with the new IOCTL might not work because it is a very > > > > > > specialized use case. > > > > > > > > > > > > But IIRC there was a copy_file_range callback in the file_operations > > > > > > structure you could use for that. I'm just not sure when and how > > > > > > that's used > > > > > > with the copy_file_range() system call. > > > > > I'm not sure any of those help, because internally they're all still > > > > > based > > > > > on struct page (or maybe in the future on folios). And that's the > > > > > thing > > > > > dma-buf can't give you, at least without peaking behind the curtain. > > > > > > > > > > I think an entirely different option would be malloc+udmabuf. That > > > > > essentially handles the impendence-mismatch between direct I/O and > > > > > dma-buf > > > > > on the dma-buf side. The downside is that it'll make the permanently > > > > > pinned memory accounting and tracking issues even more apparent, but I > > > > > guess eventually we do need to sort that one out. > > > > Oh, very good idea! > > > > Just one minor correction: it's not malloc+udmabuf, but rather > > > > create_memfd()+udmabuf. > > Hm right, it's create_memfd() + mmap(memfd) + udmabuf > > > > > > And you need to complete your direct I/O before creating the udmabuf > > > > since that reference will prevent direct I/O from working. > > > udmabuf will pin all pages, so, if returned fd, can't trigger direct I/O > > > (same as dmabuf). So, must complete read before pin it. > > Why does pinning prevent direct I/O? I haven't tested, but I'd expect the > > rdma folks would be really annoyed if that's the case ... > > Pinning (or rather taking another page reference) prevents writes from using > direct I/O because writes try to find all references and make them read only > so that nobody modifies the content while the write is done. Where do you see that happen? That's counter to my understading of what pin_user_page() does, which is what direct I/O uses ... > As far as I know the same approach is used for NUMA migration and replacing > small pages with big ones in THP. But for the read case here it should still > work. Yeah elevated refcount breaks migration, but that's entirely different from the direct I/O use-case. Count me somewhat confused. -Sima -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v4 6/7] drm/mgag200: Add vblank support
Am 05.07.24 um 13:47 schrieb Thomas Zimmermann: There's no VBLANK interrupt on Matrox chipsets. The workaround that is being used here and in other free Matrox drivers is to program to the value of and enable the VLINE interrupt. This triggers an interrupt at the time when VBLANK begins. VLINE uses separate registers for enabling and clearing pending interrupts. No extra synchronization between irq handler and the rest of the driver is required. v4: - recreate patch on latest upstream - use devm_request_irq() for managed cleanup - fail if vblanking cannot be initialized - rename register constants (Sam, Emil) - clear interrupt before registering handler (Ville) - move programming into separate commit - set to - fix typo in commit message v3: - set to + 1 to trigger at VBLANK - expand comment on linecomp v2: - only signal vblank on CRTC 0 - use constants for registers and fields - set VLINECLR before enabling interrupt - test against STATUS and IEN in irq handler - coding-style fixes Signed-off-by: Thomas Zimmermann Acked-by: Gerd Hoffmann Acked-by: Sam Ravnborg Tested-by: Jocelyn Falempe via irc: https://dri.freedesktop.org/~cbrill/dri-log/index.php?channel=dri-devel&date=2024-07-17 --- drivers/gpu/drm/mgag200/mgag200_drv.c | 47 drivers/gpu/drm/mgag200/mgag200_drv.h | 6 ++- drivers/gpu/drm/mgag200/mgag200_g200.c| 5 +++ drivers/gpu/drm/mgag200/mgag200_g200eh.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200eh3.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200er.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200ev.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200ew3.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200se.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200wb.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_mode.c| 54 ++- drivers/gpu/drm/mgag200/mgag200_reg.h | 7 +++ 12 files changed, 152 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c b/drivers/gpu/drm/mgag200/mgag200_drv.c index 62080cf0f2da..62479de9e659 100644 --- a/drivers/gpu/drm/mgag200/mgag200_drv.c +++ b/drivers/gpu/drm/mgag200/mgag200_drv.c @@ -18,6 +18,7 @@ #include #include #include +#include #include "mgag200_drv.h" @@ -84,6 +85,35 @@ resource_size_t mgag200_probe_vram(void __iomem *mem, resource_size_t size) return offset - 65536; } +static irqreturn_t mgag200_irq_handler(int irq, void *arg) +{ + struct drm_device *dev = arg; + struct mga_device *mdev = to_mga_device(dev); + struct drm_crtc *crtc; + u32 status, ien, iclear; + + status = RREG32(MGAREG_STATUS); + + if (status & MGAREG_STATUS_VLINEPEN) { + ien = RREG32(MGAREG_IEN); + if (!(ien & MGAREG_IEN_VLINEIEN)) + goto out; + + crtc = drm_crtc_from_index(dev, 0); + if (WARN_ON_ONCE(!crtc)) + goto out; + drm_crtc_handle_vblank(crtc); + + iclear = RREG32(MGAREG_ICLEAR); + iclear |= MGAREG_ICLEAR_VLINEICLR; + WREG32(MGAREG_ICLEAR, iclear); + return IRQ_HANDLED; + } + +out: + return IRQ_NONE; +} + /* * DRM driver */ @@ -167,7 +197,9 @@ int mgag200_device_init(struct mga_device *mdev, const struct mgag200_device_funcs *funcs) { struct drm_device *dev = &mdev->base; + struct pci_dev *pdev = to_pci_dev(dev->dev); u8 crtcext3, misc; + u32 ien, iclear; int ret; mdev->info = info; @@ -192,6 +224,21 @@ int mgag200_device_init(struct mga_device *mdev, mutex_unlock(&mdev->rmmio_lock); + ien = RREG32(MGAREG_IEN); + ien &= ~(MGAREG_IEN_VLINEIEN); + WREG32(MGAREG_IEN, ien); + + iclear = RREG32(MGAREG_ICLEAR); + iclear |= MGAREG_ICLEAR_VLINEICLR; + WREG32(MGAREG_ICLEAR, iclear); + + ret = devm_request_irq(&pdev->dev, pdev->irq, mgag200_irq_handler, IRQF_SHARED, + dev->driver->name, dev); + if (ret) { + drm_err(dev, "Failed to acquire interrupt, error %d\n", ret); + return ret; + } + return 0; } diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h b/drivers/gpu/drm/mgag200/mgag200_drv.h index 7f7dfbd0f013..f7b22b195016 100644 --- a/drivers/gpu/drm/mgag200/mgag200_drv.h +++ b/drivers/gpu/drm/mgag200/mgag200_drv.h @@ -421,6 +421,8 @@ void mgag200_crtc_helper_atomic_disable(struct drm_crtc *crtc, struct drm_atomic void mgag200_crtc_reset(struct drm_crtc *crtc); struct drm_crtc_state *mgag200_crtc_atomic_duplicate_state(struct drm_crtc *crtc); void mgag200_crtc_atomic_destroy_state(struct drm_crtc *crtc, struct drm_crtc_state *crtc_state); +int mgag200_crtc_enable_vblank(struct drm_crtc *crtc); +void mgag200_crtc_disable_vblank(struct drm_crtc *crtc); #define MGAG200_CRTC_FUNCS \ .reset = mgag200_cr
Re: [PATCH] drm: Fix documentation warning for read_mpcc_state in mpc.h
On Mon, Jul 15, 2024 at 05:46:38PM -0400, Aurabindo Pillai wrote: > > > On 7/12/24 1:45 PM, Abhishek Tamboli wrote: > > Add detail description for the read_mpcc_state function in the > > mpc_funcs struct to resolve the documentation warning. > > > > A kernel-doc warning was addressed: > > ./drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h:547: warning: > > Function parameter or struct member 'read_mpcc_state' not > > described in 'mpc_funcs'. > > > > Signed-off-by: Abhishek Tamboli > > --- > > drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h | 16 > > 1 file changed, 16 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h > > b/drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h > > index 34a398f23fc6..9e65ecf1d3b0 100644 > > --- a/drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h > > +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h > > @@ -282,6 +282,22 @@ struct mpcc_state { > >* struct mpc_funcs - funcs > >*/ > > struct mpc_funcs { > > + /** > > +* @read_mpcc_state: > > +* > > +* Reads the state of a given MPCC instance. > > +* > > +* Parameters: > > +* > > +* - [in/out] mpc - MPC context. > > +* - [in] mpcc_inst - MPCC Instance whose state is to be read. > > +* - [out] mpcc_state - MPCC state structure where the state > > +*of the MPCC instance will be stored. > > +* > > +* Return: > > +* > > +* void > > +*/ > > void (*read_mpcc_state)( > > struct mpc *mpc, > > int mpcc_inst, > > Looks like fix for this has been already merged via a195f08636f9d7 > drm/amd/display: fix documentation warnings for mpc.h > Thanks Aurabindo for pointing this out. Regards, Abhishek > -- > > Thanks & Regards, > Aurabindo Pillai
Re: [PATCH] drm/edid: add a quirk for two 240Hz Samsung monitors
On Thu, Nov 2, 2023 at 8:06 PM Alex Deucher wrote: > > On Thu, Nov 2, 2023 at 3:00 PM Hamza Mahfooz wrote: > > > > On 11/1/23 17:36, Alex Deucher wrote: > > > On Wed, Nov 1, 2023 at 5:01 PM Hamza Mahfooz > > > wrote: > > >> > > >> Without this fix the 5120x1440@240 timing of these monitors > > >> leads to screen flickering. > > >> > > >> Cc: sta...@vger.kernel.org # 6.1+ > > >> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1442 > > >> Co-developed-by: Harry Wentland > > >> Signed-off-by: Harry Wentland > > >> Signed-off-by: Hamza Mahfooz > > >> --- > > >> drivers/gpu/drm/drm_edid.c | 47 +++--- > > >> 1 file changed, 44 insertions(+), 3 deletions(-) > > >> > > >> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > > >> index bca2af4fe1fc..3fdb8907f66b 100644 > > >> --- a/drivers/gpu/drm/drm_edid.c > > >> +++ b/drivers/gpu/drm/drm_edid.c > > >> @@ -89,6 +89,8 @@ static int oui(u8 first, u8 second, u8 third) > > >> #define EDID_QUIRK_NON_DESKTOP (1 << 12) > > >> /* Cap the DSC target bitrate to 15bpp */ > > >> #define EDID_QUIRK_CAP_DSC_15BPP (1 << 13) > > >> +/* Fix up a particular 5120x1440@240Hz timing */ > > >> +#define EDID_QUIRK_FIXUP_5120_1440_240 (1 << 14) > > > > > > What is wrong with the original timing that needs to be fixed? > > > > Apparently, all of timing values for the 5120x1440@240 mode of these > > monitors aren't set correctly (they are all lower than they should be) > > in their EDIDs. For what it's worth, the windows driver has had a quirk > > similar the one proposed in this patch for ~2 years. > > It would be good to at least include the original mode timings from > the EDID and the new ones added by the quirk in the commit message and > a description of why they are problematic and why the new ones work. > > Alex > I think this part what nvidia is doing in their driver is missing: https://github.com/NVIDIA/open-gpu-kernel-modules/blob/main/src/common/modeset/timing/nvt_edidext_861.c#L342 A nouveau user hit this and I think the edid parser in the kernel is just lacking whatever that "RID_MODE" stuff is all about. > > > > > > > > > Alex > > > > > > > > >> > > >> #define MICROSOFT_IEEE_OUI 0xca125c > > >> > > >> @@ -170,6 +172,12 @@ static const struct edid_quirk { > > >> EDID_QUIRK('S', 'A', 'M', 596, EDID_QUIRK_PREFER_LARGE_60), > > >> EDID_QUIRK('S', 'A', 'M', 638, EDID_QUIRK_PREFER_LARGE_60), > > >> > > >> + /* Samsung C49G95T */ > > >> + EDID_QUIRK('S', 'A', 'M', 0x7053, > > >> EDID_QUIRK_FIXUP_5120_1440_240), > > >> + > > >> + /* Samsung S49AG95 */ > > >> + EDID_QUIRK('S', 'A', 'M', 0x71ac, > > >> EDID_QUIRK_FIXUP_5120_1440_240), > > >> + > > >> /* Sony PVM-2541A does up to 12 bpc, but only reports max 8 bpc > > >> */ > > >> EDID_QUIRK('S', 'N', 'Y', 0x2541, EDID_QUIRK_FORCE_12BPC), > > >> > > >> @@ -6586,7 +6594,37 @@ static void update_display_info(struct > > >> drm_connector *connector, > > >> drm_edid_to_eld(connector, drm_edid); > > >> } > > >> > > >> -static struct drm_display_mode *drm_mode_displayid_detailed(struct > > >> drm_device *dev, > > >> +static void drm_mode_displayid_detailed_edid_quirks(struct > > >> drm_connector *connector, > > >> + struct > > >> drm_display_mode *mode) > > >> +{ > > >> + unsigned int hsync_width; > > >> + unsigned int vsync_width; > > >> + > > >> + if (connector->display_info.quirks & > > >> EDID_QUIRK_FIXUP_5120_1440_240) { > > >> + if (mode->hdisplay == 5120 && mode->vdisplay == 1440 && > > >> + mode->clock == 1939490) { > > >> + hsync_width = mode->hsync_end - > > >> mode->hsync_start; > > >> + vsync_width = mode->vsync_end - > > >> mode->vsync_start; > > >> + > > >> + mode->clock = 2018490; > > >> + mode->hdisplay = 5120; > > >> + mode->hsync_start = 5120 + 8; > > >> + mode->hsync_end = 5120 + 8 + hsync_width; > > >> + mode->htotal = 5200; > > >> + > > >> + mode->vdisplay = 1440; > > >> + mode->vsync_start = 1440 + 165; > > >> + mode->vsync_end = 1440 + 165 + vsync_width; > > >> + mode->vtotal = 1619; > > >> + > > >> + drm_dbg_kms(connector->dev, > > >> + "[CONNECTOR:%d:%s] Samsung 240Hz > > >> mode quirk applied\n", > > >> + connector->base.id, connector->name); > > >> + } > > >> + } > > >> +} > > >> + > > >> +static struct drm_display_mode *drm_mode_displayid_detailed(struct > > >> drm_connector *connector, > > >> struct > > >> di
Re: [PATCH 0/4] fixes for Adreno A5Xx preemption
On Wed, Jul 17, 2024 at 10:40:26AM +0100, Connor Abbott wrote: > On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak > wrote: > > > > There are several issues with preemption on Adreno A5XX GPUs which > > render system unusable if more than one priority level is used. Those > > issues include persistent GPU faults and hangs, full UI lockups with > > idling GPU. > > > > --- > > Vladimir Lypak (4): > > drm/msm/a5xx: disable preemption in submits by default > > drm/msm/a5xx: properly clear preemption records on resume > > drm/msm/a5xx: fix races in preemption evaluation stage > > drm/msm/a5xx: workaround early ring-buffer emptiness check > > > > drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 18 ++ > > drivers/gpu/drm/msm/adreno/a5xx_gpu.h | 12 ++--- > > drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 --- > > 3 files changed, 47 insertions(+), 13 deletions(-) > > --- > > base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88 > > -- > > 2.45.2 > > > > Hi Vladimir, Hi Connor! > > While looking at preemption on a7xx, where the overall logic is pretty > much the same, and I've been seeing the same "soft lockups". However > even after porting patch 3, it turns out that's not enough because > there's a different race. The sequence of events is something like > this: > > 1. Medium-prio app A submits to ring 2. > 2. Ring 0 retires its last job and we start a preemption to ring 2. > 3. High-prio app B submits to ring 0. It sees the preemption from step > 2 ongoing and doesn't write the WTPR register or try to preempt. > 4. The preemption finishes and we update WPTR. At this point with patch 3 applied it should switch to ring 0 right away because of trigger call in the end of a5xx_preempt_irq. Didn't you forget it? Downstream has such call too. Even though it makes preemption a little more aggressive it doesn't work without it. > 5. App A's submit retires. We try to preempt, but the submit and > ring->cur write from step 3 happened on a different CPU and the write > hasn't landed yet so we skip it. I don't think this is possible on modern CPUs. Could it be that retire IRQ appeared earlier (while it was switching from 0 to 2) and you are looking at msm_gpu_submit_retired trace event which is called from retire work later. > > It's a bit tricky because write reordering is involved, but this seems > to be what's happening - everything except my speculation about the > delayed write to ring->cur being the problem comes straight from a > trace of this happening. > > Rob suggested on IRC that we make retire handling happen on the same > workqueue as submissions, so that preempt_trigger is always > serialized, which IIUC would also make patch 3 unnecessary. What do > you think? In this patch series i have tried to do least amount of changes so it could be easily back-ported. It isn't pretty but it works reliably for me. Otherwise it would be fine to just disable preemption like it's done on LTS before 5.4 and rework preemption in new kernel releases. Kind regards, Vladimir > > Best regards, > > Connor
[PATCH v6 0/2] io-pgtable-arm + drm/msm: Extend iova fault debugging
From: Rob Clark This series extends io-pgtable-arm with a method to retrieve the page table entries traversed in the process of address translation, and then beefs up drm/msm gpu devcore dump to include this (and additional info) in the devcore dump. This is a respin of https://patchwork.freedesktop.org/series/94968/ (minus a patch that was already merged) v2: Fix an armv7/32b build error in the last patch v3: Incorperate Will Deacon's suggestion to make the interface callback based. v4: Actually wire up the callback v5: Drop the callback approach v6: Make walk-data struct pgtable specific and rename io_pgtable_walk_data to arm_lpae_io_pgtable_walk_data Rob Clark (2): iommu/io-pgtable-arm: Add way to debug pgtable walk drm/msm: Extend gpu devcore dumps with pgtbl info drivers/gpu/drm/msm/adreno/adreno_gpu.c | 10 +++ drivers/gpu/drm/msm/msm_gpu.c | 9 +++ drivers/gpu/drm/msm/msm_gpu.h | 8 ++ drivers/gpu/drm/msm/msm_iommu.c | 22 +++ drivers/gpu/drm/msm/msm_mmu.h | 3 ++- drivers/iommu/io-pgtable-arm.c | 36 ++--- include/linux/io-pgtable.h | 17 7 files changed, 95 insertions(+), 10 deletions(-) -- 2.45.2
[PATCH v6 1/2] iommu/io-pgtable-arm: Add way to debug pgtable walk
From: Rob Clark Add an io-pgtable method to walk the pgtable returning the raw PTEs that would be traversed for a given iova access. Signed-off-by: Rob Clark Acked-by: Joerg Roedel --- drivers/iommu/io-pgtable-arm.c | 36 +- include/linux/io-pgtable.h | 17 2 files changed, 44 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 3d23b924cec1..e70803940b46 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -690,9 +690,11 @@ static size_t arm_lpae_unmap_pages(struct io_pgtable_ops *ops, unsigned long iov data->start_level, ptep); } -static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, -unsigned long iova) +static int arm_lpae_pgtable_walk(struct io_pgtable_ops *ops, +unsigned long iova, +void *_wd) { + struct arm_lpae_io_pgtable_walk_data *wd = _wd; struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); arm_lpae_iopte pte, *ptep = data->pgd; int lvl = data->start_level; @@ -700,7 +702,7 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, do { /* Valid IOPTE pointer? */ if (!ptep) - return 0; + return -ENOENT; /* Grab the IOPTE we're interested in */ ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); @@ -708,22 +710,37 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, /* Valid entry? */ if (!pte) - return 0; + return -ENOENT; - /* Leaf entry? */ + wd->ptes[wd->level++] = pte; + + /* Leaf entry? If so, we've found the translation */ if (iopte_leaf(pte, lvl, data->iop.fmt)) - goto found_translation; + return 0; /* Take it to the next level */ ptep = iopte_deref(pte, data); } while (++lvl < ARM_LPAE_MAX_LEVELS); /* Ran out of page tables to walk */ - return 0; + return -ENOENT; +} + +static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, +unsigned long iova) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + struct arm_lpae_io_pgtable_walk_data wd = {}; + int ret, lvl; + + ret = arm_lpae_pgtable_walk(ops, iova, &wd); + if (ret) + return 0; + + lvl = wd.level + data->start_level; -found_translation: iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1); - return iopte_to_paddr(pte, data) | iova; + return iopte_to_paddr(wd.ptes[wd.level - 1], data) | iova; } static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) @@ -804,6 +821,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .map_pages = arm_lpae_map_pages, .unmap_pages= arm_lpae_unmap_pages, .iova_to_phys = arm_lpae_iova_to_phys, + .pgtable_walk = arm_lpae_pgtable_walk, }; return data; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 86cf1f7ae389..df6f6e58310c 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -171,12 +171,28 @@ struct io_pgtable_cfg { }; }; +/** + * struct arm_lpae_io_pgtable_walk_data - information from a pgtable walk + * + * @ptes: The recorded PTE values from the walk + * @level:The level of the last PTE + * + * @level also specifies the last valid index in @ptes + */ +struct arm_lpae_io_pgtable_walk_data { + u64 ptes[4]; + int level; +}; + /** * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers. * * @map_pages:Map a physically contiguous range of pages of the same size. * @unmap_pages: Unmap a range of virtually contiguous pages of the same size. * @iova_to_phys: Translate iova to physical address. + * @pgtable_walk: (optional) Perform a page table walk for a given iova. The + *type for the wd parameter is specific to pgtable type, as + *the PTE size and number of levels differs per pgtable type. * * These functions map directly onto the iommu_ops member functions with * the same names. @@ -190,6 +206,7 @@ struct io_pgtable_ops { struct iommu_iotlb_gather *gather); phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops, unsigned long iova); + int (*pgtable_walk)(struct io_pgtable_ops *ops, unsigned long iova, void *wd); int (*read_and_clear_dirty)(struct io_pgtable_ops *ops, unsign
[PATCH v6 2/2] drm/msm: Extend gpu devcore dumps with pgtbl info
From: Rob Clark In the case of iova fault triggered devcore dumps, include additional debug information based on what we think is the current page tables, including the TTBR0 value (which should match what we have in adreno_smmu_fault_info unless things have gone horribly wrong), and the pagetable entries traversed in the process of resolving the faulting iova. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 10 ++ drivers/gpu/drm/msm/msm_gpu.c | 9 + drivers/gpu/drm/msm/msm_gpu.h | 8 drivers/gpu/drm/msm/msm_iommu.c | 22 ++ drivers/gpu/drm/msm/msm_mmu.h | 3 ++- 5 files changed, 51 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 99661af8d941..422dae873b6b 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -861,6 +861,16 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state, drm_printf(p, " - dir=%s\n", info->flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ"); drm_printf(p, " - type=%s\n", info->type); drm_printf(p, " - source=%s\n", info->block); + + /* Information extracted from what we think are the current +* pgtables. Hopefully the TTBR0 matches what we've extracted +* from the SMMU registers in smmu_info! +*/ + drm_puts(p, "pgtable-fault-info:\n"); + drm_printf(p, " - ttbr0: %.16llx\n", (u64)info->pgtbl_ttbr0); + drm_printf(p, " - asid: %d\n", info->asid); + drm_printf(p, " - ptes: %.16llx %.16llx %.16llx %.16llx\n", + info->ptes[0], info->ptes[1], info->ptes[2], info->ptes[3]); } drm_printf(p, "rbbm-status: 0x%08x\n", state->rbbm_status); diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 3666b42b4ecd..bf2f8b2a7ccc 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -281,6 +281,15 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, if (submit) { int i; + if (state->fault_info.ttbr0) { + struct msm_gpu_fault_info *info = &state->fault_info; + struct msm_mmu *mmu = submit->aspace->mmu; + + msm_iommu_pagetable_params(mmu, &info->pgtbl_ttbr0, + &info->asid); + msm_iommu_pagetable_walk(mmu, info->iova, info->ptes); + } + state->bos = kcalloc(submit->nr_bos, sizeof(struct msm_gpu_state_bo), GFP_KERNEL); diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 1f02bb9956be..82e838ba8c80 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -101,6 +101,14 @@ struct msm_gpu_fault_info { int flags; const char *type; const char *block; + + /* Information about what we think/expect is the current SMMU state, +* for example expected_ttbr0 should match smmu_info.ttbr0 which +* was read back from SMMU registers. +*/ + phys_addr_t pgtbl_ttbr0; + u64 ptes[4]; + int asid; }; /** diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c index d5512037c38b..0c35a5b597f3 100644 --- a/drivers/gpu/drm/msm/msm_iommu.c +++ b/drivers/gpu/drm/msm/msm_iommu.c @@ -195,6 +195,28 @@ struct iommu_domain_geometry *msm_iommu_get_geometry(struct msm_mmu *mmu) return &iommu->domain->geometry; } +int +msm_iommu_pagetable_walk(struct msm_mmu *mmu, unsigned long iova, uint64_t ptes[4]) +{ + struct msm_iommu_pagetable *pagetable; + struct arm_lpae_io_pgtable_walk_data wd = {}; + + if (mmu->type != MSM_MMU_IOMMU_PAGETABLE) + return -EINVAL; + + pagetable = to_pagetable(mmu); + + if (!pagetable->pgtbl_ops->pgtable_walk) + return -EINVAL; + + pagetable->pgtbl_ops->pgtable_walk(pagetable->pgtbl_ops, iova, &wd); + + for (int i = 0; i < ARRAY_SIZE(wd.ptes); i++) + ptes[i] = wd.ptes[i]; + + return 0; +} + static const struct msm_mmu_funcs pagetable_funcs = { .map = msm_iommu_pagetable_map, .unmap = msm_iommu_pagetable_unmap, diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h index 88af4f490881..96e509bd96a6 100644 --- a/drivers/gpu/drm/msm/msm_mmu.h +++ b/drivers/gpu/drm/msm/msm_mmu.h @@ -53,7 +53,8 @@ static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg, struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent); int msm_iommu_pagetable_params(struct msm_mmu *mmu, phys_addr_t *ttbr, - int *asid); +
Re: [PATCH v4 6/7] drm/mgag200: Add vblank support
Hi Am 08.07.24 um 14:46 schrieb Jocelyn Falempe: On 05/07/2024 13:47, Thomas Zimmermann wrote: There's no VBLANK interrupt on Matrox chipsets. The workaround that is being used here and in other free Matrox drivers is to program to the value of and enable the VLINE interrupt. This triggers an interrupt at the time when VBLANK begins. VLINE uses separate registers for enabling and clearing pending interrupts. No extra synchronization between irq handler and the rest of the driver is required. Thanks for this patch, I have a few comments below: v4: - recreate patch on latest upstream - use devm_request_irq() for managed cleanup - fail if vblanking cannot be initialized - rename register constants (Sam, Emil) - clear interrupt before registering handler (Ville) - move programming into separate commit - set to - fix typo in commit message v3: - set to + 1 to trigger at VBLANK - expand comment on linecomp v2: - only signal vblank on CRTC 0 - use constants for registers and fields - set VLINECLR before enabling interrupt - test against STATUS and IEN in irq handler - coding-style fixes Signed-off-by: Thomas Zimmermann Acked-by: Gerd Hoffmann Acked-by: Sam Ravnborg --- drivers/gpu/drm/mgag200/mgag200_drv.c | 47 drivers/gpu/drm/mgag200/mgag200_drv.h | 6 ++- drivers/gpu/drm/mgag200/mgag200_g200.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200eh.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200eh3.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200er.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200ev.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200ew3.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200se.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_g200wb.c | 5 +++ drivers/gpu/drm/mgag200/mgag200_mode.c | 54 ++- drivers/gpu/drm/mgag200/mgag200_reg.h | 7 +++ 12 files changed, 152 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c b/drivers/gpu/drm/mgag200/mgag200_drv.c index 62080cf0f2da..62479de9e659 100644 --- a/drivers/gpu/drm/mgag200/mgag200_drv.c +++ b/drivers/gpu/drm/mgag200/mgag200_drv.c @@ -18,6 +18,7 @@ #include #include #include +#include #include "mgag200_drv.h" @@ -84,6 +85,35 @@ resource_size_t mgag200_probe_vram(void __iomem *mem, resource_size_t size) return offset - 65536; } +static irqreturn_t mgag200_irq_handler(int irq, void *arg) +{ + struct drm_device *dev = arg; + struct mga_device *mdev = to_mga_device(dev); + struct drm_crtc *crtc; + u32 status, ien, iclear; + + status = RREG32(MGAREG_STATUS); + + if (status & MGAREG_STATUS_VLINEPEN) { + ien = RREG32(MGAREG_IEN); + if (!(ien & MGAREG_IEN_VLINEIEN)) + goto out; + This is to catch a race condition, receiving the interrupt, just after disabling it ? I think you still want to clear it, to avoid having it fire as soon as you re-enable it ? Shared interrupts (with another device) can bring us here. I was thinking that we might have stale state on the Matrox device (i.e., VLINEIEN is off, but the VLINEPEN status bit is stil set). So it checks whether the VLINE interrupt is really on. + crtc = drm_crtc_from_index(dev, 0); + if (WARN_ON_ONCE(!crtc)) + goto out; + drm_crtc_handle_vblank(crtc); + + iclear = RREG32(MGAREG_ICLEAR); iclear is a Write-Only register, according to the documentation. So reading it will always return 0. I would suggest to just write: WREG32(MGAREG_ICLEAR, MGAREG_ICLEAR_VLINEICLR); + iclear |= MGAREG_ICLEAR_VLINEICLR; + WREG32(MGAREG_ICLEAR, iclear); + return IRQ_HANDLED; + } + +out: + return IRQ_NONE; +} + /* * DRM driver */ @@ -167,7 +197,9 @@ int mgag200_device_init(struct mga_device *mdev, const struct mgag200_device_funcs *funcs) { struct drm_device *dev = &mdev->base; + struct pci_dev *pdev = to_pci_dev(dev->dev); u8 crtcext3, misc; + u32 ien, iclear; int ret; mdev->info = info; @@ -192,6 +224,21 @@ int mgag200_device_init(struct mga_device *mdev, mutex_unlock(&mdev->rmmio_lock); + ien = RREG32(MGAREG_IEN); + ien &= ~(MGAREG_IEN_VLINEIEN); + WREG32(MGAREG_IEN, ien); I would suggest to disable all interrupt instead, WREG32(MGAREG_IEN, 0); + + iclear = RREG32(MGAREG_ICLEAR); same here, don't read the iclear register. + iclear |= MGAREG_ICLEAR_VLINEICLR; + WREG32(MGAREG_ICLEAR, iclear); + + ret = devm_request_irq(&pdev->dev, pdev->irq, mgag200_irq_handler, IRQF_SHARED, + dev->driver->name, dev); + if (ret) { + drm_err(dev, "Failed to acquire interrupt, error %d\n", ret); + return ret; + } + return 0; } diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h b/drivers/gpu/drm/mgag200/mgag200_drv.h index 7f7dfbd0f013..f7b22b195016 100644 --- a/drivers/gpu/drm/mgag200/mgag200_drv.h +++ b/drivers/gpu
Re: [PATCH 0/4] fixes for Adreno A5Xx preemption
On Wed, Jul 17, 2024 at 5:33 PM Vladimir Lypak wrote: > > On Wed, Jul 17, 2024 at 10:40:26AM +0100, Connor Abbott wrote: > > On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak > > wrote: > > > > > > There are several issues with preemption on Adreno A5XX GPUs which > > > render system unusable if more than one priority level is used. Those > > > issues include persistent GPU faults and hangs, full UI lockups with > > > idling GPU. > > > > > > --- > > > Vladimir Lypak (4): > > > drm/msm/a5xx: disable preemption in submits by default > > > drm/msm/a5xx: properly clear preemption records on resume > > > drm/msm/a5xx: fix races in preemption evaluation stage > > > drm/msm/a5xx: workaround early ring-buffer emptiness check > > > > > > drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 18 ++ > > > drivers/gpu/drm/msm/adreno/a5xx_gpu.h | 12 ++--- > > > drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 --- > > > 3 files changed, 47 insertions(+), 13 deletions(-) > > > --- > > > base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88 > > > -- > > > 2.45.2 > > > > > > > Hi Vladimir, > > Hi Connor! > > > > > While looking at preemption on a7xx, where the overall logic is pretty > > much the same, and I've been seeing the same "soft lockups". However > > even after porting patch 3, it turns out that's not enough because > > there's a different race. The sequence of events is something like > > this: > > > > 1. Medium-prio app A submits to ring 2. > > 2. Ring 0 retires its last job and we start a preemption to ring 2. > > 3. High-prio app B submits to ring 0. It sees the preemption from step > > 2 ongoing and doesn't write the WTPR register or try to preempt. > > 4. The preemption finishes and we update WPTR. > At this point with patch 3 applied it should switch to ring 0 right away > because of trigger call in the end of a5xx_preempt_irq. Didn't you > forget it? Downstream has such call too. Even though it makes preemption > a little more aggressive it doesn't work without it. Yes, I didn't apply that bit because it didn't seem necessary to fix the original issue you described and it seemed like just an optimization to make preemption more aggressive, however it does seem to fix the issue. I can't verify that the issue you describe (the retire IRQ arriving before preemption IRQ) is what's actually happening because adding a tracepoint on retire seems to change the timing enough so that the lockup doesn't happen, though. So I guess I'll just have to assume that's what it was. Given how subtle this is, enough that I missed it, maybe it's worth a comment and an extra commit. Also, I forgot to mention that while I was reading this over I found another (theoretical) race - we could flush a submit in between calling update_wptr() and set_preempt_state(PREEMPT_NONE) in a5xx_preempt_irq() and never update wptr. I would fix it by renaming PREEMPT_ABORT to PREEMPT_FINISH and pulling out the ABORT -> update_wptr() -> NONE sequence in a5xx_preempt_trigger() into a separate function that also gets called in a5xx_preempt_irq(). Connor > > > 5. App A's submit retires. We try to preempt, but the submit and > > ring->cur write from step 3 happened on a different CPU and the write > > hasn't landed yet so we skip it. > > I don't think this is possible on modern CPUs. Could it be that retire > IRQ appeared earlier (while it was switching from 0 to 2) and you are > looking at msm_gpu_submit_retired trace event which is called from > retire work later. > > > > > It's a bit tricky because write reordering is involved, but this seems > > to be what's happening - everything except my speculation about the > > delayed write to ring->cur being the problem comes straight from a > > trace of this happening. > > > > Rob suggested on IRC that we make retire handling happen on the same > > workqueue as submissions, so that preempt_trigger is always > > serialized, which IIUC would also make patch 3 unnecessary. What do > > you think? > > In this patch series i have tried to do least amount of changes so it > could be easily back-ported. It isn't pretty but it works reliably for > me. Otherwise it would be fine to just disable preemption like it's done > on LTS before 5.4 and rework preemption in new kernel releases. > > Kind regards, > > Vladimir > > > > > Best regards, > > > > Connor
Re: [PATCH v2] printk: Add a short description string to kmsg_dump()
On 7/2/2024 5:26 AM, Jocelyn Falempe wrote: > kmsg_dump doesn't forward the panic reason string to the kmsg_dumper > callback. > This patch adds a new struct kmsg_dump_detail, that will hold the > reason and description, and pass it to the dump() callback. > > To avoid updating all kmsg_dump() call, it adds a kmsg_dump_desc() > function and a macro for backward compatibility. > > I've written this for drm_panic, but it can be useful for other > kmsg_dumper. > It allows to see the panic reason, like "sysrq triggered crash" > or "VFS: Unable to mount root fs on " on the drm panic screen. > > v2: > * Use a struct kmsg_dump_detail to hold the reason and description >pointer, for more flexibility if we want to add other parameters. >(Kees Cook) > * Fix powerpc/nvram_64 build, as I didn't update the forward >declaration of oops_to_nvram() > > Signed-off-by: Jocelyn Falempe > --- > arch/powerpc/kernel/nvram_64.c | 8 > arch/powerpc/platforms/powernv/opal-kmsg.c | 4 ++-- > arch/um/kernel/kmsg_dump.c | 2 +- > drivers/gpu/drm/drm_panic.c| 4 ++-- > drivers/hv/hv_common.c | 4 ++-- Acked-by: Nuno Das Neves (hyperv) LGTM
Re: [PATCH 1/9] drm/amdgpu: use GEM references instead of TTMs
On Wed, Jul 17, 2024 at 04:53:16PM +0200, Christian König wrote: > Am 16.07.24 um 23:53 schrieb Matthew Brost: > > On Tue, Jul 16, 2024 at 02:35:11PM +0200, Christian König wrote: > > > Instead of a TTM reference grab a GEM reference whenever necessary. > > > > > > Signed-off-by: Christian König > > > --- > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 8 > > > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 7 ++- > > > 2 files changed, 6 insertions(+), 9 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > > > index 67c234bcf89f..6be3d7cd1c51 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > > > @@ -87,11 +87,11 @@ static const struct vm_operations_struct > > > amdgpu_gem_vm_ops = { > > > static void amdgpu_gem_object_free(struct drm_gem_object *gobj) > > > { > > > - struct amdgpu_bo *robj = gem_to_amdgpu_bo(gobj); > > > + struct amdgpu_bo *aobj = gem_to_amdgpu_bo(gobj); > > > - if (robj) { > > > - amdgpu_hmm_unregister(robj); > > > - amdgpu_bo_unref(&robj); > > > + if (aobj) { > > > + amdgpu_hmm_unregister(aobj); > > > + ttm_bo_put(&aobj->tbo); > > So, this relates to my comment in patch number #9 about dropping the TTM > > ref count. If TTM only uses the GEM ref count, could we convert this > > ttm_bo_put to something like ttm_bo_fini here (also in Xe and any other > > drivers with code like this)? > > That's exactly what I was planning to do as a follow up. > Cool, glad we are aligned. Matt > Regards, > Christian. > > > > > I think that might be cleaner. > > > > Matt > > > > > } > > > } > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > > > index 8d8c39be6129..6c187e310034 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > > > @@ -853,7 +853,7 @@ struct amdgpu_bo *amdgpu_bo_ref(struct amdgpu_bo *bo) > > > if (bo == NULL) > > > return NULL; > > > - ttm_bo_get(&bo->tbo); > > > + drm_gem_object_get(&bo->tbo.base); > > > return bo; > > > } > > > @@ -865,13 +865,10 @@ struct amdgpu_bo *amdgpu_bo_ref(struct amdgpu_bo > > > *bo) > > >*/ > > > void amdgpu_bo_unref(struct amdgpu_bo **bo) > > > { > > > - struct ttm_buffer_object *tbo; > > > - > > > if ((*bo) == NULL) > > > return; > > > - tbo = &((*bo)->tbo); > > > - ttm_bo_put(tbo); > > > + drm_gem_object_get(&(*bo)->tbo.base); > > > *bo = NULL; > > > } > > > -- > > > 2.34.1 > > > >
Re: [PATCH 9/9] drm/ttm: make ttm_bo_get internal
On Tue, Jul 16, 2024 at 02:35:19PM +0200, Christian König wrote: > Prevent drivers from using this directly. > > Signed-off-by: Christian König Reviewed-by: Matthew Brost > --- > drivers/gpu/drm/ttm/ttm_bo_internal.h | 10 ++ > include/drm/ttm/ttm_bo.h | 10 -- > 2 files changed, 10 insertions(+), 10 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_internal.h > b/drivers/gpu/drm/ttm/ttm_bo_internal.h > index 6a7305efd778..9d8b747a34db 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo_internal.h > +++ b/drivers/gpu/drm/ttm/ttm_bo_internal.h > @@ -27,6 +27,16 @@ > > #include > > +/** > + * ttm_bo_get - reference a struct ttm_buffer_object > + * > + * @bo: The buffer object. > + */ > +static inline void ttm_bo_get(struct ttm_buffer_object *bo) > +{ > + kref_get(&bo->kref); > +} > + > /** > * ttm_bo_get_unless_zero - reference a struct ttm_buffer_object unless > * its refcount has already reached zero. > diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h > index 31ec7fd34eeb..8c1577d8793c 100644 > --- a/include/drm/ttm/ttm_bo.h > +++ b/include/drm/ttm/ttm_bo.h > @@ -229,16 +229,6 @@ struct ttm_lru_walk { > s64 ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct ttm_device > *bdev, > struct ttm_resource_manager *man, s64 target); > > -/** > - * ttm_bo_get - reference a struct ttm_buffer_object > - * > - * @bo: The buffer object. > - */ > -static inline void ttm_bo_get(struct ttm_buffer_object *bo) > -{ > - kref_get(&bo->kref); > -} > - > /** > * ttm_bo_reserve: > * > -- > 2.34.1 >
Re: [PATCH 7/9] drm/ttm: revert "Export ttm_bo_get_unless_zero()"
On Tue, Jul 16, 2024 at 02:35:17PM +0200, Christian König wrote: > This reverts commit 24dc64c1ba5c3ef0463d59fef6df09336754188d. > > Shouldn't be needed by drivers any more. > > Signed-off-by: Christian König Reviewed-by: Matthew Brost > --- > drivers/gpu/drm/ttm/ttm_bo.c | 1 + > drivers/gpu/drm/ttm/ttm_bo_internal.h | 48 +++ > drivers/gpu/drm/ttm/ttm_bo_util.c | 2 ++ > drivers/gpu/drm/ttm/ttm_device.c | 1 + > include/drm/ttm/ttm_bo.h | 18 -- > 5 files changed, 52 insertions(+), 18 deletions(-) > create mode 100644 drivers/gpu/drm/ttm/ttm_bo_internal.h > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > index 0131ec802066..fe4638ec0976 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo.c > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > @@ -45,6 +45,7 @@ > #include > > #include "ttm_module.h" > +#include "ttm_bo_internal.h" > > static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo, > struct ttm_placement *placement) > diff --git a/drivers/gpu/drm/ttm/ttm_bo_internal.h > b/drivers/gpu/drm/ttm/ttm_bo_internal.h > new file mode 100644 > index ..6a7305efd778 > --- /dev/null > +++ b/drivers/gpu/drm/ttm/ttm_bo_internal.h > @@ -0,0 +1,48 @@ > +/* > + * Copyright 2018 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. > + * > + * > + */ > + > +#ifndef _TTM_BO_INTERNAL_H_ > +#define _TTM_BO_INTERNAL_H_ > + > +#include > + > +/** > + * ttm_bo_get_unless_zero - reference a struct ttm_buffer_object unless > + * its refcount has already reached zero. > + * @bo: The buffer object. > + * > + * Used to reference a TTM buffer object in lookups where the object is > removed > + * from the lookup structure during the destructor and for RCU lookups. > + * > + * Returns: @bo if the referencing was successful, NULL otherwise. > + */ > +static inline __must_check struct ttm_buffer_object * > +ttm_bo_get_unless_zero(struct ttm_buffer_object *bo) > +{ > + if (!kref_get_unless_zero(&bo->kref)) > + return NULL; > + return bo; > +} > + > +#endif > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c > b/drivers/gpu/drm/ttm/ttm_bo_util.c > index 3c07f4712d5c..f7143384ef1c 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c > @@ -37,6 +37,8 @@ > > #include > > +#include "ttm_bo_internal.h" > + > struct ttm_transfer_obj { > struct ttm_buffer_object base; > struct ttm_buffer_object *bo; > diff --git a/drivers/gpu/drm/ttm/ttm_device.c > b/drivers/gpu/drm/ttm/ttm_device.c > index e7cc4954c1bc..2e7fa3a11dc0 100644 > --- a/drivers/gpu/drm/ttm/ttm_device.c > +++ b/drivers/gpu/drm/ttm/ttm_device.c > @@ -36,6 +36,7 @@ > #include > > #include "ttm_module.h" > +#include "ttm_bo_internal.h" > > /* > * ttm_global_mutex - protecting the global state > diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h > index d1a732d56259..31ec7fd34eeb 100644 > --- a/include/drm/ttm/ttm_bo.h > +++ b/include/drm/ttm/ttm_bo.h > @@ -239,24 +239,6 @@ static inline void ttm_bo_get(struct ttm_buffer_object > *bo) > kref_get(&bo->kref); > } > > -/** > - * ttm_bo_get_unless_zero - reference a struct ttm_buffer_object unless > - * its refcount has already reached zero. > - * @bo: The buffer object. > - * > - * Used to reference a TTM buffer object in lookups where the object is > removed > - * from the lookup structure during the destructor and for RCU lookups. > - * > - * Returns: @bo if the referencing was successful, NULL otherwise. > - */ > -static inline __must_check struct ttm_buffer_object * > -ttm_bo_get_unless_zero(struct ttm_buffer_object *bo) > -{ > - if (!kref_get_unless_zero(&bo->kref)) > - return NULL; > - return bo; > -} > - > /** > * ttm_bo_reserve: > * > -- > 2.34.1 >
Re: [PATCH 8/9] drm/ttm: use GEM references for VM mappings
On Tue, Jul 16, 2024 at 02:35:18PM +0200, Christian König wrote: > Instead of a TTM reference grab a GEM reference whenever necessary for a > VM mapping. > > Signed-off-by: Christian König Reviewed-by: Matthew Brost > --- > drivers/gpu/drm/ttm/ttm_bo_vm.c | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c > index 4212b8c91dd4..3f283b3433f8 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c > @@ -58,13 +58,13 @@ static vm_fault_t ttm_bo_vm_fault_idle(struct > ttm_buffer_object *bo, > if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT) > return VM_FAULT_RETRY; > > - ttm_bo_get(bo); > + drm_gem_object_get(&bo->base); > mmap_read_unlock(vmf->vma->vm_mm); > (void)dma_resv_wait_timeout(bo->base.resv, > DMA_RESV_USAGE_KERNEL, true, > MAX_SCHEDULE_TIMEOUT); > dma_resv_unlock(bo->base.resv); > - ttm_bo_put(bo); > + drm_gem_object_put(&bo->base); > return VM_FAULT_RETRY; > } > > @@ -130,12 +130,12 @@ vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object > *bo, >*/ > if (fault_flag_allow_retry_first(vmf->flags)) { > if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) { > - ttm_bo_get(bo); > + drm_gem_object_get(&bo->base); > mmap_read_unlock(vmf->vma->vm_mm); > if (!dma_resv_lock_interruptible(bo->base.resv, >NULL)) > dma_resv_unlock(bo->base.resv); > - ttm_bo_put(bo); > + drm_gem_object_put(&bo->base); > } > > return VM_FAULT_RETRY; > @@ -353,7 +353,7 @@ void ttm_bo_vm_open(struct vm_area_struct *vma) > > WARN_ON(bo->bdev->dev_mapping != vma->vm_file->f_mapping); > > - ttm_bo_get(bo); > + drm_gem_object_get(&bo->base); > } > EXPORT_SYMBOL(ttm_bo_vm_open); > > @@ -361,7 +361,7 @@ void ttm_bo_vm_close(struct vm_area_struct *vma) > { > struct ttm_buffer_object *bo = vma->vm_private_data; > > - ttm_bo_put(bo); > + drm_gem_object_put(&bo->base); > vma->vm_private_data = NULL; > } > EXPORT_SYMBOL(ttm_bo_vm_close); > @@ -462,7 +462,7 @@ int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct > ttm_buffer_object *bo) > if (is_cow_mapping(vma->vm_flags)) > return -EINVAL; > > - ttm_bo_get(bo); > + drm_gem_object_get(&bo->base); > > /* >* Drivers may want to override the vm_ops field. Otherwise we > -- > 2.34.1 >
Re: [PATCH v5 08/16] drm/msm/dpu: drop msm_format from struct dpu_hw_fmt_layout
On 6/24/2024 2:13 PM, Dmitry Baryshkov wrote: The struct dpu_hw_fmt_layout defines hardware data layout (addresses, sizes and pitches. Drop format field from this structure as it's not a Missing closing brace ")" here? part of the data layout. Its a bit subjective IMO whether you consider format as part of hardware data layout or not. Registers do have format bitfields too so I am somewhat unsure if this change is really needed. Signed-off-by: Dmitry Baryshkov --- .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c| 31 +++--- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c| 23 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h| 2 -- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c | 4 +-- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h | 3 ++- 5 files changed, 25 insertions(+), 38 deletions(-) @@ -318,15 +318,10 @@ static void dpu_encoder_phys_wb_setup( { struct dpu_hw_wb *hw_wb = phys_enc->hw_wb; struct drm_display_mode mode = phys_enc->cached_mode; - struct drm_framebuffer *fb = NULL; struct dpu_encoder_phys_wb *wb_enc = to_dpu_encoder_phys_wb(phys_enc); - struct drm_writeback_job *wb_job; const struct msm_format *format; - const struct msm_format *dpu_fmt; - wb_job = wb_enc->wb_job; format = msm_framebuffer_format(wb_enc->wb_job->fb); - dpu_fmt = mdp_get_format(&phys_enc->dpu_kms->base, format->pixel_format, wb_job->fb->modifier); This is interesting. I wonder why I just didnt use format directly that time itself :) Maybe I was thinking that mdp_get_format() will also match the modifiers and return the corresponding msm_format. DPU_DEBUG("[mode_set:%d, \"%s\",%d,%d]\n", hw_wb->idx - WB_0, mode.name, @@ -338,9 +333,9 @@ static void dpu_encoder_phys_wb_setup( dpu_encoder_phys_wb_set_qos(phys_enc); - dpu_encoder_phys_wb_setup_fb(phys_enc, fb); + dpu_encoder_phys_wb_setup_fb(phys_enc, format); - dpu_encoder_helper_phys_setup_cdm(phys_enc, dpu_fmt, CDM_CDWN_OUTPUT_WB); + dpu_encoder_helper_phys_setup_cdm(phys_enc, format, CDM_CDWN_OUTPUT_WB); dpu_encoder_phys_wb_setup_ctl(phys_enc); } @@ -584,14 +579,6 @@ static void dpu_encoder_phys_wb_prepare_wb_job(struct dpu_encoder_phys *phys_enc format = msm_framebuffer_format(job->fb); - wb_cfg->dest.format = mdp_get_format(&phys_enc->dpu_kms->base, -format->pixel_format, job->fb->modifier); - if (!wb_cfg->dest.format) { - /* this error should be detected during atomic_check */ - DPU_ERROR("failed to get format %p4cc\n", &format->pixel_format); - return; - } - ret = dpu_format_populate_layout(aspace, job->fb, &wb_cfg->dest); if (ret) { DPU_DEBUG("failed to populate layout %d\n", ret);
Re: [PATCH v5 09/16] drm/msm/dpu: pass drm_framebuffer to _dpu_format_get_plane_sizes()
On 6/24/2024 2:13 PM, Dmitry Baryshkov wrote: Instead of passing width / height / pitches, pass drm_framebuffer directly. This allows us to drop the useless check for !pitches, since an array can not be NULL. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 73 ++--- 1 file changed, 34 insertions(+), 39 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c index 46237a1ca6a5..df046bc88715 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c @@ -95,8 +95,7 @@ static int _dpu_format_get_media_color_ubwc(const struct msm_format *fmt) Reviewed-by: Abhinav Kumar
[RFC] drm/panel/simple-edp: Add Samsung ATNA45DC02
From: Rob Clark Just a guess on the panel timings, but they appear to work. Signed-off-by: Rob Clark --- This adds the panel I have on my lenovo yoga slim 7x laptop. I couldn't find any datasheet so timings is just a guess. But AFAICT everything works fine. drivers/gpu/drm/panel/panel-edp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/panel/panel-edp.c b/drivers/gpu/drm/panel/panel-edp.c index 38e5178f55ac..411b7218af55 100644 --- a/drivers/gpu/drm/panel/panel-edp.c +++ b/drivers/gpu/drm/panel/panel-edp.c @@ -1966,6 +1966,8 @@ static const struct edp_panel_entry edp_panels[] = { EDP_PANEL_ENTRY('S', 'H', 'P', 0x153a, &delay_200_500_e50, "LQ140T1JH01"), EDP_PANEL_ENTRY('S', 'H', 'P', 0x154c, &delay_200_500_p2e100, "LQ116M1JW10"), + EDP_PANEL_ENTRY('S', 'D', 'C', 0x4189, &delay_100_500_e200, "ATNA45DC02-0"), + EDP_PANEL_ENTRY('S', 'T', 'A', 0x0100, &delay_100_500_e200, "2081116HHD028001-51D"), { /* sentinal */ } -- 2.45.2
Re: [PATCH v5 08/16] drm/msm/dpu: drop msm_format from struct dpu_hw_fmt_layout
On Wed, 17 Jul 2024 at 23:15, Abhinav Kumar wrote: > > > > On 6/24/2024 2:13 PM, Dmitry Baryshkov wrote: > > The struct dpu_hw_fmt_layout defines hardware data layout (addresses, > > sizes and pitches. Drop format field from this structure as it's not a > Missing closing brace ")" here? > > > part of the data layout. > > > > Its a bit subjective IMO whether you consider format as part of hardware > data layout or not. Registers do have format bitfields too so I am > somewhat unsure if this change is really needed. It's not a part of the data buffer layout (num_planes, sizes, pitches and offsets) - the items that are defined by struct dpu_hw_fmt_layout. As such there is no need to store it in the structure. When necessary we can always get it from the framebuffer itself. > > > Signed-off-by: Dmitry Baryshkov > > --- > > .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c| 31 > > +++--- > > drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c| 23 > > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h| 2 -- > > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c | 4 +-- > > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h | 3 ++- > > 5 files changed, 25 insertions(+), 38 deletions(-) > > > > > > > @@ -318,15 +318,10 @@ static void dpu_encoder_phys_wb_setup( > > { > > struct dpu_hw_wb *hw_wb = phys_enc->hw_wb; > > struct drm_display_mode mode = phys_enc->cached_mode; > > - struct drm_framebuffer *fb = NULL; > > struct dpu_encoder_phys_wb *wb_enc = to_dpu_encoder_phys_wb(phys_enc); > > - struct drm_writeback_job *wb_job; > > const struct msm_format *format; > > - const struct msm_format *dpu_fmt; > > > > - wb_job = wb_enc->wb_job; > > format = msm_framebuffer_format(wb_enc->wb_job->fb); > > - dpu_fmt = mdp_get_format(&phys_enc->dpu_kms->base, > > format->pixel_format, wb_job->fb->modifier); > > > > This is interesting. I wonder why I just didnt use format directly that > time itself :) > > Maybe I was thinking that mdp_get_format() will also match the modifiers > and return the corresponding msm_format. > > > DPU_DEBUG("[mode_set:%d, \"%s\",%d,%d]\n", > > hw_wb->idx - WB_0, mode.name, > > @@ -338,9 +333,9 @@ static void dpu_encoder_phys_wb_setup( > > > > dpu_encoder_phys_wb_set_qos(phys_enc); > > > > - dpu_encoder_phys_wb_setup_fb(phys_enc, fb); > > + dpu_encoder_phys_wb_setup_fb(phys_enc, format); > > > > - dpu_encoder_helper_phys_setup_cdm(phys_enc, dpu_fmt, > > CDM_CDWN_OUTPUT_WB); > > + dpu_encoder_helper_phys_setup_cdm(phys_enc, format, > > CDM_CDWN_OUTPUT_WB); > > > > dpu_encoder_phys_wb_setup_ctl(phys_enc); > > } > > @@ -584,14 +579,6 @@ static void dpu_encoder_phys_wb_prepare_wb_job(struct > > dpu_encoder_phys *phys_enc > > > > format = msm_framebuffer_format(job->fb); > > > > - wb_cfg->dest.format = mdp_get_format(&phys_enc->dpu_kms->base, > > - format->pixel_format, > > job->fb->modifier); > > - if (!wb_cfg->dest.format) { > > - /* this error should be detected during atomic_check */ > > - DPU_ERROR("failed to get format %p4cc\n", > > &format->pixel_format); > > - return; > > - } > > - > > ret = dpu_format_populate_layout(aspace, job->fb, &wb_cfg->dest); > > if (ret) { > > DPU_DEBUG("failed to populate layout %d\n", ret); -- With best wishes Dmitry
Re: [RFC] drm/panel/simple-edp: Add Samsung ATNA45DC02
On Wed, Jul 17, 2024 at 02:58:46PM GMT, Rob Clark wrote: > From: Rob Clark > > Just a guess on the panel timings, but they appear to work. > > Signed-off-by: Rob Clark > --- > This adds the panel I have on my lenovo yoga slim 7x laptop. I couldn't > find any datasheet so timings is just a guess. But AFAICT everything > works fine. Could you please add EDID to the commit message? > > drivers/gpu/drm/panel/panel-edp.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/panel/panel-edp.c > b/drivers/gpu/drm/panel/panel-edp.c > index 38e5178f55ac..411b7218af55 100644 > --- a/drivers/gpu/drm/panel/panel-edp.c > +++ b/drivers/gpu/drm/panel/panel-edp.c > @@ -1966,6 +1966,8 @@ static const struct edp_panel_entry edp_panels[] = { > EDP_PANEL_ENTRY('S', 'H', 'P', 0x153a, &delay_200_500_e50, > "LQ140T1JH01"), > EDP_PANEL_ENTRY('S', 'H', 'P', 0x154c, &delay_200_500_p2e100, > "LQ116M1JW10"), > > + EDP_PANEL_ENTRY('S', 'D', 'C', 0x4189, &delay_100_500_e200, > "ATNA45DC02-0"), > + > EDP_PANEL_ENTRY('S', 'T', 'A', 0x0100, &delay_100_500_e200, > "2081116HHD028001-51D"), > > { /* sentinal */ } > -- > 2.45.2 > -- With best wishes Dmitry
Re: [PATCH v5 10/16] drm/msm/dpu: move pitch check to _dpu_format_get_plane_sizes_linear()
On 6/24/2024 2:13 PM, Dmitry Baryshkov wrote: The _dpu_format_get_plane_sizes_linear() already compares pitches of the framebuffer with the calculated pitches. Move the check to the same place, demoting DPU_ERROR to DPU_DEBUG to prevent user from spamming the kernel log. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 16 ++-- 1 file changed, 6 insertions(+), 10 deletions(-) Not fully convinced about demoting DPU_ERROR to DPU_DEBUG but I think we have had a similar discussion earlier while logging atomic_check failures in the CDM series, so keeping that in mind, Reviewed-by: Abhinav Kumar
[PATCH] drm/ci: update link to Gitlab server
Before building an image, the build script looks to see if there are fixes to apply from an upstream repository. The link for the upstream repository git://anongit.freedesktop.org/drm/drm became obsolete with the move to Gitlab server in March 2024. Until recently, this obsolete link was harmless because anongit would at least respond that there were no such fixes available. In the last few days anongit has stopped responding to requests causing the build script to hang indefinitely. Update the link from anongit to the Gitlab server to prevent the build script from hanging indefinitely. Signed-off-by: Deborah Brouwer --- Link to pipeline for this change: https://gitlab.freedesktop.org/dbrouwer/kernel/-/pipelines/1226742 drivers/gpu/drm/ci/gitlab-ci.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ci/gitlab-ci.yml b/drivers/gpu/drm/ci/gitlab-ci.yml index b09976c3d2c2..259fb1c9a855 100644 --- a/drivers/gpu/drm/ci/gitlab-ci.yml +++ b/drivers/gpu/drm/ci/gitlab-ci.yml @@ -2,7 +2,7 @@ variables: DRM_CI_PROJECT_PATH: &drm-ci-project-path mesa/mesa DRM_CI_COMMIT_SHA: &drm-ci-commit-sha e2b9c5a9e3e4f9b532067af8022eaef8d6fc6c00 - UPSTREAM_REPO: git://anongit.freedesktop.org/drm/drm + UPSTREAM_REPO: https://gitlab.freedesktop.org/drm/kernel.git TARGET_BRANCH: drm-next IGT_VERSION: f13702b8e4e847c56da3ef6f0969065d686049c5 -- 2.45.2
Re: [RFC] drm/panel/simple-edp: Add Samsung ATNA45DC02
Hi, On Wed, Jul 17, 2024 at 2:58 PM Rob Clark wrote: > > From: Rob Clark > > Just a guess on the panel timings, but they appear to work. > > Signed-off-by: Rob Clark > --- > This adds the panel I have on my lenovo yoga slim 7x laptop. I couldn't > find any datasheet so timings is just a guess. But AFAICT everything > works fine. > > drivers/gpu/drm/panel/panel-edp.c | 2 ++ > 1 file changed, 2 insertions(+) Given that this is a Samsung ATNA, is there any chance it's an OLED panel? Should it be supported with the Samsung OLED panel driver like this: https://lore.kernel.org/r/20240715-x1e80100-crd-backlight-v2-0-31b7f2f65...@linaro.org -Doug
Re: [PATCH v2 1/4] dt-bindings: display: panel: samsung,atna33xc20: Document ATNA45AF01
Conor (and/or) Krzysztof and Rob, On Mon, Jul 15, 2024 at 8:31 AM Conor Dooley wrote: > > On Mon, Jul 15, 2024 at 02:15:37PM +0200, Stephan Gerhold wrote: > > The Samsung ATNA45AF01 panel is an AMOLED eDP panel that has backlight > > control over the DP AUX channel. While it works almost correctly with the > > generic "edp-panel" compatible, the backlight needs special handling to > > work correctly. It is similar to the existing ATNA33XC20 panel, just with > > a larger resolution and size. > > > > Add a new "samsung,atna45af01" compatible to describe this panel in the DT. > > Use the existing "samsung,atna33xc20" as fallback compatible since existing > > drivers should work as-is, given that resolution and size are discoverable > > through the eDP link. > > > > Signed-off-by: Stephan Gerhold > > Acked-by: Conor Dooley Can you comment on whether you would consider this bindings a "Fix" since it's a dependency for later patches in this series (which are "Fix"es) to pass dtbs_check? See: https://lore.kernel.org/r/4bca316a-2334-425b-87a6-e1bb241d2...@linaro.org -Doug
Re: [RFC] drm/panel/simple-edp: Add Samsung ATNA45DC02
On Wed, Jul 17, 2024 at 5:19 PM Doug Anderson wrote: > > Hi, > > On Wed, Jul 17, 2024 at 2:58 PM Rob Clark wrote: > > > > From: Rob Clark > > > > Just a guess on the panel timings, but they appear to work. > > > > Signed-off-by: Rob Clark > > --- > > This adds the panel I have on my lenovo yoga slim 7x laptop. I couldn't > > find any datasheet so timings is just a guess. But AFAICT everything > > works fine. > > > > drivers/gpu/drm/panel/panel-edp.c | 2 ++ > > 1 file changed, 2 insertions(+) > > Given that this is a Samsung ATNA, is there any chance it's an > OLED panel? Should it be supported with the Samsung OLED panel driver > like this: > > https://lore.kernel.org/r/20240715-x1e80100-crd-backlight-v2-0-31b7f2f65...@linaro.org it is an OLED panel, and I did see that patchset and thought that samsung panel driver would work. But changing the compat string on the yoga dts only gave me a black screen (and I didn't see any of the other mentioned problems with bl control or dpms with panel-edp). So :shrug:? It could be I overlooked something else, but it _seems_ like panel-edp is fine for this panel. Plus, it would avoid awkwardness if it turned out there were other non-samsung 2nd sources, but so far with a sample size of two 100% of these laptops have the same panel ;-) But that was the reason for posting this as an RFC. I was hoping someone had some hint about where to find datasheets (my google'ing was not successful), etc. BR, -R
Re: [PATCH 1/2] dma-buf: heaps: DMA_HEAP_IOCTL_ALLOC_READ_FILE framework
在 2024/7/18 1:03, Christoph Hellwig 写道: [Some people who received this message don't often get email from h...@infradead.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] On Wed, Jul 17, 2024 at 05:15:07PM +0200, Daniel Vetter wrote: I'm talking about memfd, not dma-buf here. I think copy_file_range to dma-buf is as architecturally unsound as allowing O_DIRECT on the dma-buf mmap. copy_file_range only work inside the same file system anyway, so it is completely irrelevant here. Yes, actually, if dma-buf want's to copy_file_range from a file, it need change something in vfs_copy_file_range: 1. in generic_file_rw_checks, dma-buf file is not a normal file, but inode_out check it. need bypass 2. file in and out need in the same file system which you point it. So, need bypass it 3. if dma-buf above 2G, need bypass generic_write_check_limits's file O_LARGEFILE check, it only allow copy range below 2G. I feel that the above limitations indicate that copy_file_range is not really suitable for copying between different file systems or unconventional file types.(both shmemfs and other's) Perhaps enabling dma-buf to support copy_file_range is not a good idea? :) What should work just fine is using sendfile (or splice if you like it complicated) to write TO the dma buf. That just iterates over the page OK, I'll research it also. cache on the source file and calls ->write_iter from the page cache pages. Of course that requires that you actually implement ->write_iter, but given that dmabufs support mmaping there I can't see why you should not be able to write to it. Reading FROM the dma buf in that fashion should also work if you provide a ->read_iter wire up ->splice_read to copy_splice_read so that it doesn't require any page cache.
[PATCH] drm/ast: Fix black screen after resume
Suspend will disable pcie device. Thus, resume should do full hw initialization again. Add some APIs to ast_drm_thaw() before ast_post_gpu() to fix the issue. Fixes: 5b71707dd13 ("drm/ast: Enable and unlock device access early during init") Signed-off-by: Jammy Huang --- drivers/gpu/drm/ast/ast_drv.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c index f8c49ba68e78..45a9c7bf49c8 100644 --- a/drivers/gpu/drm/ast/ast_drv.c +++ b/drivers/gpu/drm/ast/ast_drv.c @@ -391,6 +391,11 @@ static int ast_drm_freeze(struct drm_device *dev) static int ast_drm_thaw(struct drm_device *dev) { + struct ast_device *ast = to_ast_device(dev); + + ast_enable_vga(ioregs); + ast_open_key(ioregs); + ast_enable_mmio(dev, ioregs); ast_post_gpu(dev); return drm_mode_config_helper_resume(dev); base-commit: e9d22f7a6655941fc8b2b942ed354ec780936b3e prerequisite-patch-id: a03a33902f33bcc38091e2cdce9d79f630464c30 prerequisite-patch-id: 505779a1e2094f2ee2f2a55ad44aac5cd1d5342f -- 2.25.1
Re: [PATCH 1/2] dma-buf: heaps: DMA_HEAP_IOCTL_ALLOC_READ_FILE framework
在 2024/7/18 11:08, Christoph Hellwig 写道: [Some people who received this message don't often get email from h...@infradead.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] On Thu, Jul 18, 2024 at 09:51:39AM +0800, Huan Yang wrote: Yes, actually, if dma-buf want's to copy_file_range from a file, it need change something in vfs_copy_file_range: No, it doesn't. copy_file_range is specifically designed to copy inside a single file system as already mentioned. The generic offload for OK, got it. Thanks to point this. copying between arbitrary FDs is splice and the sendfile convenience wrapper around it
Re: [PATCH v3 2/3] Subject: [PATCH] drm/mediatek/dp: Add HDCP2.x feature for DisplayPort
Re: [PATCH v2 1/4] dt-bindings: display: panel: samsung,atna33xc20: Document ATNA45AF01
On 18/07/2024 02:21, Doug Anderson wrote: > Conor (and/or) Krzysztof and Rob, > > On Mon, Jul 15, 2024 at 8:31 AM Conor Dooley wrote: >> >> On Mon, Jul 15, 2024 at 02:15:37PM +0200, Stephan Gerhold wrote: >>> The Samsung ATNA45AF01 panel is an AMOLED eDP panel that has backlight >>> control over the DP AUX channel. While it works almost correctly with the >>> generic "edp-panel" compatible, the backlight needs special handling to >>> work correctly. It is similar to the existing ATNA33XC20 panel, just with >>> a larger resolution and size. >>> >>> Add a new "samsung,atna45af01" compatible to describe this panel in the DT. >>> Use the existing "samsung,atna33xc20" as fallback compatible since existing >>> drivers should work as-is, given that resolution and size are discoverable >>> through the eDP link. >>> >>> Signed-off-by: Stephan Gerhold >> >> Acked-by: Conor Dooley > > Can you comment on whether you would consider this bindings a "Fix" > since it's a dependency for later patches in this series (which are > "Fix"es) to pass dtbs_check? See: > > https://lore.kernel.org/r/4bca316a-2334-425b-87a6-e1bb241d2...@linaro.org The patch itself is not a fix, for sure, but it might be a dependency of a fix (which you wrote above), thus could be pulled to stable as a dependency. I do not care about dtbs_check warnings in stable kernels, mostly because dtbs_check warnings depend heavily on dtschema and dtschema follows mainline kernel. Basically if you had warnings-free v6.8 but try to run dtbs_check now with latest dtschema, your results will differ. At some point in the future, I could imagine "no new dtbs_check warnings in stable kernels" requirement or at least preference, but so far I don't think there is any benefit. Best regards, Krzysztof
Re: [PATCH v3 2/3] Subject: [PATCH] drm/mediatek/dp: Add HDCP2.x feature for DisplayPort
Re: [PATCH] drm/ast: Fix black screen after resume
(Cary, this looks like it fixes the problem you reported.) Hi Jammy Am 18.07.24 um 05:03 schrieb Jammy Huang: Suspend will disable pcie device. Thus, resume should do full hw initialization again. Add some APIs to ast_drm_thaw() before ast_post_gpu() to fix the issue. Fixes: 5b71707dd13 ("drm/ast: Enable and unlock device access early during init") Signed-off-by: Jammy Huang Reviewed-by: Thomas Zimmermann Thanks a lot for this fix. Best regards Thomas --- drivers/gpu/drm/ast/ast_drv.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c index f8c49ba68e78..45a9c7bf49c8 100644 --- a/drivers/gpu/drm/ast/ast_drv.c +++ b/drivers/gpu/drm/ast/ast_drv.c @@ -391,6 +391,11 @@ static int ast_drm_freeze(struct drm_device *dev) static int ast_drm_thaw(struct drm_device *dev) { + struct ast_device *ast = to_ast_device(dev); + + ast_enable_vga(ioregs); + ast_open_key(ioregs); + ast_enable_mmio(dev, ioregs); ast_post_gpu(dev); return drm_mode_config_helper_resume(dev); base-commit: e9d22f7a6655941fc8b2b942ed354ec780936b3e prerequisite-patch-id: a03a33902f33bcc38091e2cdce9d79f630464c30 prerequisite-patch-id: 505779a1e2094f2ee2f2a55ad44aac5cd1d5342f -- -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
Re: [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver
On 7/17/24 15:33, Leon Romanovsky wrote: > On Wed, Jul 17, 2024 at 10:51:03AM +, Omer Shpigelman wrote: >> On 7/17/24 10:36, Leon Romanovsky wrote: >>> On Wed, Jul 17, 2024 at 07:08:59AM +, Omer Shpigelman wrote: On 7/16/24 16:40, Jason Gunthorpe wrote: > On Sun, Jul 14, 2024 at 10:18:12AM +, Omer Shpigelman wrote: >> On 7/12/24 16:08, Jason Gunthorpe wrote: >>> [You don't often get email from j...@ziepe.ca. Learn why this is >>> important at https://aka.ms/LearnAboutSenderIdentification ] >>> >>> On Fri, Jun 28, 2024 at 10:24:32AM +, Omer Shpigelman wrote: >>> We need the core driver to access the IB driver (and to the ETH driver as well). As you wrote, we can't use exported symbols from our IB driver nor rely on function pointers, but what about providing the core driver an ops structure? meaning exporting a register function from the core driver that should be called by the IB driver during auxiliary device probe. Something like: int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev, struct hbl_ib_ops *ops) { ... } EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev); >>> >>> Definately do not do some kind of double-register like this. >>> >>> The auxiliary_device scheme can already be extended to provide ops for >>> each sub device. >>> >>> Like >>> >>> struct habana_driver { >>>struct auxiliary_driver base; >>>const struct habana_ops *ops; >>> }; >>> >>> If the ops are justified or not is a different question. >>> >> >> Well, I suggested this double-register option because I got a comment >> that >> the design pattern of embedded ops structure shouldn't be used. >> So I'm confused now... > > Yeah, don't stick ops in random places, but the device_driver is the > right place. > Sorry, let me explain again. My original code has an ops structure exactly like you are suggesting now (see struct hbl_aux_dev in the first patch of the series). But I was instructed not to use this ops structure and to rely on exported symbols for inter-driver communication. I'll be happy to use this ops structure like in your example rather than converting my code to use exported symbols. Leon - am I missing anything? what's the verdict here? >>> >>> You are missing the main sentence from Jason's response: "don't stick ops >>> in random places". >>> >>> It is fine to have ops in device driver, so the core driver can call them. >>> However, in your >>> original code, you added ops everywhere. It caused to the need to implement >>> module reference >>> counting and crazy stuff like calls to lock and unlock functions from the >>> aux driver to the core. >>> >>> Verdict is still the same. Core driver should provide EXPORT_SYMBOLs, so >>> the aux driver can call >>> them directly and enjoy from proper module loading and unloading. >>> >>> The aux driver can have ops in the device driver, so the core driver can >>> call them to perform something >>> specific for that aux driver. >>> >>> Calls between aux drivers should be done via the core driver. >>> >>> Thanks >> >> The only place we have an ops structure is in the device driver, >> similarly to Jason's example. In our code it is struct hbl_aux_dev. What >> other random places did you see? > > This is exactly random place. > > I suggest you to take time, learn how existing drivers in netdev and > RDMA uses auxbus infrastructure and follow the same pattern. There are > many examples already in the kernel. > > And no, if you do everything right, you won't need custom module > reference counting and other hacks. There is nothing special in your > device/driver which requires special treatment. > > Thanks How come it is a random place? Look at irdma/i40e - they have an ops struct (struct i40e_ops) embedded in their shared aux struct (struct i40e_auxiliary_device) which is the wrapper of the base aux struct (struct auxiliary_device). This is very similar to what we have - a pointer to an ops struct (void *aux_ops) embedded in our shared aux struct (struct hbl_aux_dev) which is the wrapper of the base struct (struct auxiliary_device). The only difference is that they put their ops struct inside some info struct (struct i40e_info) and we have a separate pointer for that info (void *aux_data). In addition, they use the ops struct for accessing the net driver from the RDMA driver, meaning son-to-parent communication instead of having an exported symbol e.g. i40e_client_request_reset(). They have only a single operation as EXPORT_SYMBOL function for (un)registering the son - i40e_client_device_register() and i40e_client_device_unregister(). So what is the problem with how we implemented it?