Hi Alex, On Mon, 21 Aug 2023 at 13:33, Alexander Graf <ag...@csgraf.de> wrote: > > > On 21.08.23 21:11, Simon Glass wrote: > > Hi Alper, > > > > On Mon, 21 Aug 2023 at 07:51, Alper Nebi Yasak <alpernebiya...@gmail.com> > > wrote: > >> This is a rebase of Alexander Graf's video damage tracking series, with > >> some tests and other changes. The original cover letter is as follows: > >> > >>> This patch set speeds up graphics output on ARM by a factor of 60x. > >>> > >>> On most ARM SBCs, we keep the frame buffer in DRAM and map it as cached, > >>> but need it accessible by the display controller which reads directly > >>> from a later point of consistency. Hence, we flush the frame buffer to > >>> DRAM on every change. The full frame buffer. > > It should not, see below. > > > >>> Unfortunately, with the advent of 4k displays, we are seeing frame buffers > >>> that can take a while to flush out. This was reported by Da Xue with grub, > >>> which happily print 1000s of spaces on the screen to draw a menu. Every > >>> printed space triggers a cache flush. > > That is a bug somewhere in EFI. > > > Unfortunately not :). You may call it a bug in grub: It literally prints > over space characters for every character in its menu that it wants > cleared. On every text screen draw. > > This wouldn't be a big issue if we only flush the reactangle that gets > modified. But without this patch set, we're flushing the full DRAM > buffer on every u-boot text console character write, which means for > every character (as that's the only API UEFI has). > > As a nice side effect, we speed up the normal U-Boot text console as > well with this patch set, because even "normal" text prints that write > for example a single line of text on the screen today flush the full > frame buffer to DRAM.
No, I mean that it is a bug that U-Boot (apparently) flushes the cache after every character. It doesn't do that for normal character output and I don't think it makes sense to do it for EFI either. > > > > > >>> This patch set implements the easiest mitigation against this problem: > >>> Damage tracking. We remember the lowest common denominator region that was > >>> touched since the last video_sync() call and only flush that. The most > >>> typical writer to the frame buffer is the video console, which always > >>> writes rectangles of characters on the screen and syncs afterwards. > >>> > >>> With this patch set applied, we reduce drawing a large grub menu (with > >>> serial console attached for size information) on an RK3399-ROC system > >>> at 1440p from 55 seconds to less than 1 second. > >>> > >>> Version 2 also implements VIDEO_COPY using this mechanism, reducing its > >>> overhead compared to before as well. So even x86 systems should be faster > >>> with this now :). > >>> > >>> > >>> Alternatives considered: > >>> > >>> 1) Lazy sync - Sandbox does this. It only calls video_sync(true) ever > >>> so often. We are missing timers to do this generically. > >>> > >>> 2) Double buffering - We could try to identify whether anything changed > >>> at all and only draw to the FB if it did. That would require > >>> maintaining a second buffer that we need to scan. > >>> > >>> 3) Text buffer - Maintain a buffer of all text printed on the screen > >>> with > >>> respective location. Don't write if the old and new character are > >>> identical. This would limit applicability to text only and is an > >>> optimization on top of this patch set. > >>> > >>> 4) Hash screen lines - Create a hash (sha256?) over every line when it > >>> changes. Only flush when it does. I'm not sure if this would waste > >>> more time, memory and cache than the current approach. It would make > >>> full screen updates much more expensive. > > 5) Fix the bug mentioned above? > > > >> Changes in v5: > >> - Add patch "video: test: Split copy frame buffer check into a function" > >> - Add patch "video: test: Support checking copy frame buffer contents" > >> - Add patch "video: test: Test partial updates of hardware frame buffer" > >> - Use xstart, ystart, xend, yend as names for damage region > >> - Document damage struct and fields in struct video_priv comment > >> - Return void from video_damage() > >> - Fix undeclared priv error in video_sync() > >> - Drop unused headers from video-uclass.c > >> - Use IS_ENABLED() instead of CONFIG_IS_ENABLED() > >> - Call video_damage() also in video_fill_part() > >> - Use met->baseline instead of priv->baseline > >> - Use fontdata->height/width instead of VIDEO_FONT_HEIGHT/WIDTH > >> - Update console_rotate.c video_damage() calls to pass video tests > >> - Remove mention about not having minimal damage for console_rotate.c > >> - Add patch "video: test: Test video damage tracking via vidconsole" > >> - Document new vdev field in struct efi_gop_obj comment > >> - Remove video_sync_copy() also from video_fill(), video_fill_part() > >> - Fix memmove() calls by removing the extra dev argument > >> - Call video_sync() before checking copy_fb in video tests > >> - Imply VIDEO_DAMAGE for video drivers instead of selecting it > >> - Imply VIDEO_DAMAGE also for VIDEO_TIDSS > >> > >> v4: https://lore.kernel.org/all/20230103215004.22646-1-ag...@csgraf.de/ > >> > >> Changes in v4: > >> - Move damage clear to patch "dm: video: Add damage tracking API" > >> - Simplify first damage logic > >> - Remove VIDEO_DAMAGE default for ARM > >> - Skip damage on EfiBltVideoToBltBuffer > >> - Add patch "video: Always compile cache flushing code" > >> - Add patch "video: Enable VIDEO_DAMAGE for drivers that need it" > >> > >> v3: https://lore.kernel.org/all/20221230195828.88134-1-ag...@csgraf.de/ > >> > >> Changes in v3: > >> - Adapt to always assume DM is used > >> - Adapt to always assume DM is used > >> - Make VIDEO_COPY always select VIDEO_DAMAGE > >> > >> v2: https://lore.kernel.org/all/20220609225921.62462-1-ag...@csgraf.de/ > >> > >> Changes in v2: > >> - Remove ifdefs > >> - Fix ranges in truetype target > >> - Limit rotate to necessary damage > >> - Remove ifdefs from gop > >> - Fix dcache range; we were flushing too much before > >> - Add patch "video: Use VIDEO_DAMAGE for VIDEO_COPY" > >> > >> v1: https://lore.kernel.org/all/20220606234336.5021-1-ag...@csgraf.de/ > >> > >> Alexander Graf (9): > >> dm: video: Add damage tracking API > >> dm: video: Add damage notification on display fills > >> vidconsole: Add damage notifications to all vidconsole drivers > >> video: Add damage notification on bmp display > >> efi_loader: GOP: Add damage notification on BLT > >> video: Only dcache flush damaged lines > >> video: Use VIDEO_DAMAGE for VIDEO_COPY > >> video: Always compile cache flushing code > >> video: Enable VIDEO_DAMAGE for drivers that need it > >> > >> Alper Nebi Yasak (4): > >> video: test: Split copy frame buffer check into a function > >> video: test: Support checking copy frame buffer contents > >> video: test: Test partial updates of hardware frame buffer > >> video: test: Test video damage tracking via vidconsole > >> > >> arch/arm/mach-omap2/omap3/Kconfig | 1 + > >> arch/arm/mach-sunxi/Kconfig | 1 + > >> drivers/video/Kconfig | 26 +++ > >> drivers/video/console_normal.c | 27 ++-- > >> drivers/video/console_rotate.c | 94 +++++++---- > >> drivers/video/console_truetype.c | 37 +++-- > >> drivers/video/exynos/Kconfig | 1 + > >> drivers/video/imx/Kconfig | 1 + > >> drivers/video/meson/Kconfig | 1 + > >> drivers/video/rockchip/Kconfig | 1 + > >> drivers/video/stm32/Kconfig | 1 + > >> drivers/video/tegra20/Kconfig | 1 + > >> drivers/video/tidss/Kconfig | 1 + > >> drivers/video/vidconsole-uclass.c | 16 -- > >> drivers/video/video-uclass.c | 190 ++++++++++++---------- > >> drivers/video/video_bmp.c | 7 +- > >> include/video.h | 59 +++---- > >> include/video_console.h | 52 ------ > >> lib/efi_loader/efi_gop.c | 7 + > >> test/dm/video.c | 256 ++++++++++++++++++++++++------ > >> 20 files changed, 483 insertions(+), 297 deletions(-) > > It is good to see this tidied up into something that can be applied! > > > > I am unsure what is going on with the EFI performance, though. It > > should not flush the cache after every character, only after a new > > line. Is there something wrong in here? If so, we should fix that bug > > first and it should be patch 1 of this series. > > > Before I came up with this series, I was trying to identify the UEFI bug > in question as well, because intuition told me surely this is a bug in > UEFI :). Turns out it really isn't this time around. I don't mean a bug in UEFI, I mean a bug in U-Boot's EFI implementation. Where did you look for the bug? Regards, Simon