Hi Jan,
On 27/11/2024 11:09, Jan Beulich wrote:
On 27.11.2024 11:55, Julien Grall wrote:
From: Julien Grall <jgr...@amazon.com>
All the code in arch/arm32/lib/ was copied from Linux 3.16
and has never been re-synced since then.
A few years ago, Linux got rid of __memzero() because the implementation
is very similar to memset(p,0,n) and the current use of __memzero()
interferes with optimization. See full commit message from Linux below.
So it makes sense to get rid of __memzero in Xen as well.
From ff5fdafc9e9702846480e0cea55ba861f72140a2 Mon Sep 17 00:00:00 2001
From: Nicolas Pitre <nicolas.pi...@linaro.org>
Date: Fri, 19 Jan 2018 18:17:46 +0100
Subject: [PATCH] ARM: 8745/1: get rid of __memzero()
The __memzero assembly code is almost identical to memset's except for
two orr instructions. The runtime performance of __memzero(p, n) and
memset(p, 0, n) is accordingly almost identical.
However, the memset() macro used to guard against a zero length and to
call __memzero at compile time when the fill value is a constant zero
interferes with compiler optimizations.
Arnd found that the test against a zero length brings up some new
warnings with gcc v8:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82103
And successively removing the test against a zero length and the call
to __memzero optimization produces the following kernel sizes for
defconfig with gcc 6:
    text     data     bss       dec      hex  filename
12248142  6278960  413588  18940690  1210312  vmlinux.orig
12244474  6278960  413588  18937022  120f4be  vmlinux.no_zero_test
12239160  6278960  413588  18931708  120dffc  vmlinux.no_memzero
So it is probably not worth keeping __memzero around given that the
compiler can do a better job at inlining trivial memset(p,0,n) on its
own. And the memset code already handles a zero length just fine.
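
For illustration only, here is a rough sketch in C of the kind of wrapper
macro described above, next to the plain call that replaces it. This is not
the exact Linux or Xen code: my_memzero(), memset_old() and clear_new() are
hypothetical names, my_memzero() is a C stand-in for the assembly routine,
and the macro relies on the GCC/Clang statement-expression extension.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the assembly __memzero(); counts its calls. */
static int memzero_calls;

static void my_memzero(void *p, size_t n)
{
    unsigned char *b = p;

    memzero_calls++;
    while (n--)
        *b++ = 0;
}

/*
 * Sketch of the old-style wrapper: skip zero-length requests and divert
 * a compile-time constant zero fill to the dedicated zeroing routine.
 */
#define memset_old(p, v, n)                                 \
    ({                                                      \
        void *p_ = (p);                                     \
        size_t n_ = (n);                                    \
        if (n_ != 0) {                                      \
            if (__builtin_constant_p(v) && (v) == 0)        \
                my_memzero(p_, n_);                         \
            else                                            \
                memset(p_, (v), n_);                        \
        }                                                   \
        p_;                                                 \
    })

/*
 * After the cleanup: a plain memset(), which the compiler can inline
 * and which already copes with a zero length.
 */
static void *clear_new(void *p, size_t n)
{
    return memset(p, 0, n);
}
```

Both forms leave the buffer untouched for n == 0 and return the original
pointer; the difference is only that the second one hides nothing from the
compiler.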
Suggested-by: Arnd Bergmann <a...@arndb.de>
Signed-off-by: Nicolas Pitre <n...@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Acked-by: Arnd Bergmann <a...@arndb.de>
Signed-off-by: Russell King <rmk+ker...@armlinux.org.uk>
Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
ff5fdafc9e97
Signed-off-by: Julien Grall <jgr...@amazon.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
Thanks!
with a suggestion:
--- a/xen/arch/arm/README.LinuxPrimitives
+++ b/xen/arch/arm/README.LinuxPrimitives
@@ -108,10 +108,9 @@ linux/arch/arm/lib/memchr.S xen/arch/arm/arm32/lib/memchr.S
linux/arch/arm/lib/memcpy.S xen/arch/arm/arm32/lib/memcpy.S
linux/arch/arm/lib/memmove.S xen/arch/arm/arm32/lib/memmove.S
linux/arch/arm/lib/memset.S xen/arch/arm/arm32/lib/memset.S
-linux/arch/arm/lib/memzero.S xen/arch/arm/arm32/lib/memzero.S
for i in copy_template.S memchr.S memcpy.S memmove.S memset.S \
- memzero.S ; do
+ ; do
diff -u linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i
done
Also do away with the line continuation at the same time? E.g.
for i in copy_template.S memchr.S memcpy.S memmove.S memset.S; do
diff -u linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i
done
I will go with this version because I don't expect the number of files
to increase.
Cheers,
--
Julien Grall