control: clone -1 -2
control: reassign -2 src:libinsane
control: retitle -2 libinsane: please disable valgrind in armhf autopkgtest
control: reassign -1 valgrind
control: retitle -1 valgrind: false memory overlap positive in memcpy on armhf
control: severity -1 important

Hi,

On 2025-03-20 11:18, Emanuele Rocca wrote:
On 2025-03-19 09:45, Aurelien Jarno wrote:
So far my finding are quite limited. It seems the new GCC generates more
optimized code in general, and especially for the memmem() function, and
that changes the location of the memmove() function located just after in
the binary. Aligning it to 6 bits seems to fix the problem with -18 (it is
aligned to 5 bits), but strangely it is only aligned to 4 bits with -17...
Also padding it with the same number of nops at the beginning of the
function doesn't help, so this somehow excludes alignment issues of
functions located after it. Overall it seems that memmove() wrongly jumps to
memcpy() for cases it shouldn't, but I don't really see why and while
alignment would matter.

I am puzzled by all of that I and don't really know how to debug that
further.

It could very well be that this is a bug in valgrind, from what you
describe. My uninformed impression is that upstream support for armhf
isn't great. I did once open an issue asking to support the latest
(2018) Procedure Call Standard and received no replies, which makes me
wonder if there's much upstream interest at all when it comes to 32bit
arm: https://bugs.kde.org/show_bug.cgi?id=479699

On that topic, just as a sanity check I tried rebuilding libinsane
without stack-clash-protection and still could reproduce #1100805.

Here's my 2 cents: if there's no obvious compiler or C library change
that may suggest a regression was introduced, just stop running valgrind
in the libinsane autopkgtest on armhf and move on.

Yes, I confirm that this is a valgrind issue. Or rather two, a false positive and a check done randomly depending on the code alignment.

On the GCC side, I have found that the issue is trigger by this commit:

commit 95c98c5368aedf2a482bf551cd2573c1961a6823
Author: Richard Biener <rguent...@suse.de>
Date:   Wed Jan 15 14:31:57 2025 +0100

    tree-optimization/115494 - PRE PHI translation and ranges

It significantly changes the generated code, and it seems more compact
overall. This is especially true for the memset function, which in
turns move the location of the memmove function in the binary. This is
the part I just do not understand, but the behaviour of this code seems
to depends on its location. Aligning it to 6 bits fixes the issue, but
padding the beginning of the function with the equivalent number of nops
doesn't.

Said otherwise, this does trigger the issue:

00071560 <memmove@@GLIBC_2.4>:
   71560:       e320f000        nop     {0}
   71564:       e320f000        nop     {0}
   71568:       e320f000        nop     {0}
   7156c:       e320f000        nop     {0}
   71570:       e320f000        nop     {0}
   71574:       e320f000        nop     {0}
   71578:       e320f000        nop     {0}
   7157c:       e320f000        nop     {0}
   71580:       e050c001        subs    ip, r0, r1
   71584:       8152000c        cmphi   r2, ip
   71588:       9afeb686        bls     1efa8 <_dl_audit_preinit@plt+0x20>
   7158c:       e92d4011        push    {r0, r4, lr}

This doesn't trigger the issue:

00071580 <memmove@@GLIBC_2.4>:
   71580:       e050c001        subs    ip, r0, r1
   71584:       8152000c        cmphi   r2, ip
   71588:       9afeb686        bls     1efa8 <_dl_audit_preinit@plt+0x20>
   7158c:       e92d4011        push    {r0, r4, lr}

Besides the change of the memmove address, libc.so.6 binaries with memmove padded or aligned are identical. Using GDB with valgrind and adding a breakpoint on memmove, it appears that the program just doesn't stop. This also appears with the small reproducer below. I attribute this to a valgrind bug, although it's just a guess.

Now about the problem reported by valgrind, this is the call stack
 at 0x4881648: memcpy (vg_replace_strmem.c:1150)
 by 0x495823B: memmove (string_fortified.h:36)
 by 0x495823B: re_string_reconstruct (regex_internal.c:685)
 by 0x495C69B: re_search_internal (regexec.c:778)
 by 0x49606D1: regexec@@GLIBC_2.4 (regexec.c:216)
 by 0x48AAE17: item_filter (source_types.c:91)
 by 0x48A4C8F: lis_bw_item_get_children (basewrapper.c:392)
 by 0x48A4BF3: lis_bw_item_get_children (basewrapper.c:360)
 by 0x108D17: tests_one (in /tmp/autopkgtest/tests_workaround_one_page_flatbed)

In short memmove is called by internal GNU libc code, more precisely the
regex related functions that are called from the libinsane library. The
arm specific memmove assembly code first checks if memcpy can be called
to do the job:

ENTRY(memmove)
                subs    ip, r0, r1
                cmphi   r2, ip
#if !IS_IN (libc)
                bls     memcpy
#else
                bls     HIDDEN_JUMPTARGET(memcpy)
#endif

This happens in two cases, if forward copy can be used (i.e. dst <= src)
or if the memory addresses do not overlap. This basically relies on the fact that all the arm memcpy implementations allow overlap in case of forward copy. Note that the call to memcpy is done through the PLT, and that different implementations are provided through the IFUNC mechanism. On Debian armhf this default to the VFP implementation, but the NEON implementation can be selected if the CPU supports it. This also means that valgrind will intercept those calls to memcpy to check for overlap, as if it was called directly by user code. In this specific case this is perfectly fine, but valgrind believes the contrary. This is a small reproducer:

#include <string.h>
#include <stdio.h>

int main(int, char **)
{
    char buffer[32] = " -- Hello world -- ";

    printf("%s\n", buffer);
    memmove(buffer, buffer + 4, strlen(buffer) - 4 + 1);
    printf("%s\n", buffer);
}

Other architectures are not affected because their memmove and memcpy
implementations do not call each other, and usually generate both
memmove and memcpy functions from the same source code, using
preprocessor directives to handle the differences. The arm memmove and
memcpy assembly codes in glibc would probably benefit from a rewrite, but
nowadays nobody is really interested by arm32 anymore.

Anyway this is therefore not a GCC nor a GLIBC issue, but rather a false
positive from valgrind. One option is to add a corresponding suppression
entry in the default suppression file. The other alternative is to
disable valgrind testing in libinsane on armhf, with the risk that it
pops up at a later point on another package.

Given the state valgrind, especially on 32-bit architectures following the removal of OpenMPI, it's probably better to handle that at the libinsane level, removing the valgrind from the autopkgtest on armhf. I am therefore reassigning the bug there. Neverthless I am cloning also it to valgrind to avoid having the issue poping up on another package at a later point.

Regards
Aurelien

--
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                     http://aurel32.net

Reply via email to