control: clone -1 -2
control: reassign -2 src:libinsane
control: retitle -2 libinsane: please disable valgrind in armhf autopkgtest
control: reassign -1 valgrind
control: retitle -1 valgrind: false memory overlap positive in memcpy on armhf
control: severity -1 important
Hi,
On 2025-03-20 11:18, Emanuele Rocca wrote:
On 2025-03-19 09:45, Aurelien Jarno wrote:
So far my finding are quite limited. It seems the new GCC generates more
optimized code in general, and especially for the memmem() function, and
that changes the location of the memmove() function located just after in
the binary. Aligning it to 6 bits seems to fix the problem with -18 (it is
aligned to 5 bits), but strangely it is only aligned to 4 bits with -17...
Also padding it with the same number of nops at the beginning of the
function doesn't help, so this somehow excludes alignment issues of
functions located after it. Overall it seems that memmove() wrongly jumps to
memcpy() for cases it shouldn't, but I don't really see why and while
alignment would matter.
I am puzzled by all of that I and don't really know how to debug that
further.
It could very well be that this is a bug in valgrind, from what you
describe. My uninformed impression is that upstream support for armhf
isn't great. I did once open an issue asking to support the latest
(2018) Procedure Call Standard and received no replies, which makes me
wonder if there's much upstream interest at all when it comes to 32bit
arm: https://bugs.kde.org/show_bug.cgi?id=479699
On that topic, just as a sanity check I tried rebuilding libinsane
without stack-clash-protection and still could reproduce #1100805.
Here's my 2 cents: if there's no obvious compiler or C library change
that may suggest a regression was introduced, just stop running valgrind
in the libinsane autopkgtest on armhf and move on.
Yes, I confirm that this is a valgrind issue. Or rather two, a false
positive and a check done randomly depending on the code alignment.
On the GCC side, I have found that the issue is trigger by this commit:
commit 95c98c5368aedf2a482bf551cd2573c1961a6823
Author: Richard Biener <rguent...@suse.de>
Date: Wed Jan 15 14:31:57 2025 +0100
tree-optimization/115494 - PRE PHI translation and ranges
It significantly changes the generated code, and it seems more compact
overall. This is especially true for the memset function, which in
turns move the location of the memmove function in the binary. This is
the part I just do not understand, but the behaviour of this code seems
to depends on its location. Aligning it to 6 bits fixes the issue, but
padding the beginning of the function with the equivalent number of nops
doesn't.
Said otherwise, this does trigger the issue:
00071560 <memmove@@GLIBC_2.4>:
71560: e320f000 nop {0}
71564: e320f000 nop {0}
71568: e320f000 nop {0}
7156c: e320f000 nop {0}
71570: e320f000 nop {0}
71574: e320f000 nop {0}
71578: e320f000 nop {0}
7157c: e320f000 nop {0}
71580: e050c001 subs ip, r0, r1
71584: 8152000c cmphi r2, ip
71588: 9afeb686 bls 1efa8 <_dl_audit_preinit@plt+0x20>
7158c: e92d4011 push {r0, r4, lr}
This doesn't trigger the issue:
00071580 <memmove@@GLIBC_2.4>:
71580: e050c001 subs ip, r0, r1
71584: 8152000c cmphi r2, ip
71588: 9afeb686 bls 1efa8 <_dl_audit_preinit@plt+0x20>
7158c: e92d4011 push {r0, r4, lr}
Besides the change of the memmove address, libc.so.6 binaries with
memmove padded or aligned are identical. Using GDB with valgrind and
adding a breakpoint on memmove, it appears that the program just doesn't
stop. This also appears with the small reproducer below. I attribute
this to a valgrind bug, although it's just a guess.
Now about the problem reported by valgrind, this is the call stack
at 0x4881648: memcpy (vg_replace_strmem.c:1150)
by 0x495823B: memmove (string_fortified.h:36)
by 0x495823B: re_string_reconstruct (regex_internal.c:685)
by 0x495C69B: re_search_internal (regexec.c:778)
by 0x49606D1: regexec@@GLIBC_2.4 (regexec.c:216)
by 0x48AAE17: item_filter (source_types.c:91)
by 0x48A4C8F: lis_bw_item_get_children (basewrapper.c:392)
by 0x48A4BF3: lis_bw_item_get_children (basewrapper.c:360)
by 0x108D17: tests_one (in /tmp/autopkgtest/tests_workaround_one_page_flatbed)
In short memmove is called by internal GNU libc code, more precisely the
regex related functions that are called from the libinsane library. The
arm specific memmove assembly code first checks if memcpy can be called
to do the job:
ENTRY(memmove)
subs ip, r0, r1
cmphi r2, ip
#if !IS_IN (libc)
bls memcpy
#else
bls HIDDEN_JUMPTARGET(memcpy)
#endif
This happens in two cases, if forward copy can be used (i.e. dst <= src)
or if the memory addresses do not overlap. This basically relies on the
fact that all the arm memcpy implementations allow overlap in case of
forward copy. Note that the call to memcpy is done through the PLT, and
that different implementations are provided through the IFUNC mechanism.
On Debian armhf this default to the VFP implementation, but the NEON
implementation can be selected if the CPU supports it. This also means
that valgrind will intercept those calls to memcpy to check for overlap,
as if it was called directly by user code. In this specific case this is
perfectly fine, but valgrind believes the contrary. This is a small
reproducer:
#include <string.h>
#include <stdio.h>
int main(int, char **)
{
char buffer[32] = " -- Hello world -- ";
printf("%s\n", buffer);
memmove(buffer, buffer + 4, strlen(buffer) - 4 + 1);
printf("%s\n", buffer);
}
Other architectures are not affected because their memmove and memcpy
implementations do not call each other, and usually generate both
memmove and memcpy functions from the same source code, using
preprocessor directives to handle the differences. The arm memmove and
memcpy assembly codes in glibc would probably benefit from a rewrite, but
nowadays nobody is really interested by arm32 anymore.
Anyway this is therefore not a GCC nor a GLIBC issue, but rather a false
positive from valgrind. One option is to add a corresponding suppression
entry in the default suppression file. The other alternative is to
disable valgrind testing in libinsane on armhf, with the risk that it
pops up at a later point on another package.
Given the state valgrind, especially on 32-bit architectures following
the removal of OpenMPI, it's probably better to handle that at the
libinsane level, removing the valgrind from the autopkgtest on armhf. I
am therefore reassigning the bug there. Neverthless I am cloning also it
to valgrind to avoid having the issue poping up on another package at a
later point.
Regards
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://aurel32.net