Hi Alper, On Mon, 21 Aug 2023 at 07:51, Alper Nebi Yasak <alpernebiya...@gmail.com> wrote: > > This is a rebase of Alexander Graf's video damage tracking series, with > some tests and other changes. The original cover letter is as follows: > > > This patch set speeds up graphics output on ARM by a factor of 60x. > > > > On most ARM SBCs, we keep the frame buffer in DRAM and map it as cached, > > but need it accessible by the display controller which reads directly > > from a later point of consistency. Hence, we flush the frame buffer to > > DRAM on every change. The full frame buffer.
It should not, see below. > > > > Unfortunately, with the advent of 4k displays, we are seeing frame buffers > > that can take a while to flush out. This was reported by Da Xue with grub, > > which happily print 1000s of spaces on the screen to draw a menu. Every > > printed space triggers a cache flush. That is a bug somewhere in EFI. > > > > This patch set implements the easiest mitigation against this problem: > > Damage tracking. We remember the lowest common denominator region that was > > touched since the last video_sync() call and only flush that. The most > > typical writer to the frame buffer is the video console, which always > > writes rectangles of characters on the screen and syncs afterwards. > > > > With this patch set applied, we reduce drawing a large grub menu (with > > serial console attached for size information) on an RK3399-ROC system > > at 1440p from 55 seconds to less than 1 second. > > > > Version 2 also implements VIDEO_COPY using this mechanism, reducing its > > overhead compared to before as well. So even x86 systems should be faster > > with this now :). > > > > > > Alternatives considered: > > > > 1) Lazy sync - Sandbox does this. It only calls video_sync(true) ever > > so often. We are missing timers to do this generically. > > > > 2) Double buffering - We could try to identify whether anything changed > > at all and only draw to the FB if it did. That would require > > maintaining a second buffer that we need to scan. > > > > 3) Text buffer - Maintain a buffer of all text printed on the screen with > > respective location. Don't write if the old and new character are > > identical. This would limit applicability to text only and is an > > optimization on top of this patch set. > > > > 4) Hash screen lines - Create a hash (sha256?) over every line when it > > changes. Only flush when it does. I'm not sure if this would waste > > more time, memory and cache than the current approach. It would make > > full screen updates much more expensive. 5) Fix the bug mentioned above? > > Changes in v5: > - Add patch "video: test: Split copy frame buffer check into a function" > - Add patch "video: test: Support checking copy frame buffer contents" > - Add patch "video: test: Test partial updates of hardware frame buffer" > - Use xstart, ystart, xend, yend as names for damage region > - Document damage struct and fields in struct video_priv comment > - Return void from video_damage() > - Fix undeclared priv error in video_sync() > - Drop unused headers from video-uclass.c > - Use IS_ENABLED() instead of CONFIG_IS_ENABLED() > - Call video_damage() also in video_fill_part() > - Use met->baseline instead of priv->baseline > - Use fontdata->height/width instead of VIDEO_FONT_HEIGHT/WIDTH > - Update console_rotate.c video_damage() calls to pass video tests > - Remove mention about not having minimal damage for console_rotate.c > - Add patch "video: test: Test video damage tracking via vidconsole" > - Document new vdev field in struct efi_gop_obj comment > - Remove video_sync_copy() also from video_fill(), video_fill_part() > - Fix memmove() calls by removing the extra dev argument > - Call video_sync() before checking copy_fb in video tests > - Imply VIDEO_DAMAGE for video drivers instead of selecting it > - Imply VIDEO_DAMAGE also for VIDEO_TIDSS > > v4: https://lore.kernel.org/all/20230103215004.22646-1-ag...@csgraf.de/ > > Changes in v4: > - Move damage clear to patch "dm: video: Add damage tracking API" > - Simplify first damage logic > - Remove VIDEO_DAMAGE default for ARM > - Skip damage on EfiBltVideoToBltBuffer > - Add patch "video: Always compile cache flushing code" > - Add patch "video: Enable VIDEO_DAMAGE for drivers that need it" > > v3: https://lore.kernel.org/all/20221230195828.88134-1-ag...@csgraf.de/ > > Changes in v3: > - Adapt to always assume DM is used > - Adapt to always assume DM is used > - Make VIDEO_COPY always select VIDEO_DAMAGE > > v2: https://lore.kernel.org/all/20220609225921.62462-1-ag...@csgraf.de/ > > Changes in v2: > - Remove ifdefs > - Fix ranges in truetype target > - Limit rotate to necessary damage > - Remove ifdefs from gop > - Fix dcache range; we were flushing too much before > - Add patch "video: Use VIDEO_DAMAGE for VIDEO_COPY" > > v1: https://lore.kernel.org/all/20220606234336.5021-1-ag...@csgraf.de/ > > Alexander Graf (9): > dm: video: Add damage tracking API > dm: video: Add damage notification on display fills > vidconsole: Add damage notifications to all vidconsole drivers > video: Add damage notification on bmp display > efi_loader: GOP: Add damage notification on BLT > video: Only dcache flush damaged lines > video: Use VIDEO_DAMAGE for VIDEO_COPY > video: Always compile cache flushing code > video: Enable VIDEO_DAMAGE for drivers that need it > > Alper Nebi Yasak (4): > video: test: Split copy frame buffer check into a function > video: test: Support checking copy frame buffer contents > video: test: Test partial updates of hardware frame buffer > video: test: Test video damage tracking via vidconsole > > arch/arm/mach-omap2/omap3/Kconfig | 1 + > arch/arm/mach-sunxi/Kconfig | 1 + > drivers/video/Kconfig | 26 +++ > drivers/video/console_normal.c | 27 ++-- > drivers/video/console_rotate.c | 94 +++++++---- > drivers/video/console_truetype.c | 37 +++-- > drivers/video/exynos/Kconfig | 1 + > drivers/video/imx/Kconfig | 1 + > drivers/video/meson/Kconfig | 1 + > drivers/video/rockchip/Kconfig | 1 + > drivers/video/stm32/Kconfig | 1 + > drivers/video/tegra20/Kconfig | 1 + > drivers/video/tidss/Kconfig | 1 + > drivers/video/vidconsole-uclass.c | 16 -- > drivers/video/video-uclass.c | 190 ++++++++++++---------- > drivers/video/video_bmp.c | 7 +- > include/video.h | 59 +++---- > include/video_console.h | 52 ------ > lib/efi_loader/efi_gop.c | 7 + > test/dm/video.c | 256 ++++++++++++++++++++++++------ > 20 files changed, 483 insertions(+), 297 deletions(-) It is good to see this tidied up into something that can be applied! I am unsure what is going on with the EFI performance, though. It should not flush the cache after every character, only after a new line. Is there something wrong in here? If so, we should fix that bug first and it should be patch 1 of this series. Regards, Simon