** Description changed:

[Impact]

valgrind on bionic dumps core and errors out as follows:

ARM64 front end: branch_etc
disInstr(arm64): unhandled instruction 0xD5380000
disInstr(arm64): 1101'0101 0011'1000 0000'0000 0000'0000
==11950== valgrind: Unrecognised instruction at address 0x4014c90.
==11950==    at 0x4014C90: init_cpu_features (cpu-features.c:72)
==11950==    by 0x4014C90: dl_platform_init (dl-machine.h:208)
==11950==    by 0x4014C90: _dl_sysdep_start (dl-sysdep.c:231)
==11950==    by 0x40018C3: _dl_start_final (rtld.c:414)
==11950==    by 0x4001B47: _dl_start (rtld.c:523)
==11950==    by 0x40011C7: ??? (in /lib/aarch64-linux-gnu/ld-2.27.so)
==11950== Your program just tried to execute an instruction that Valgrind
==11950== did not recognise. There are two possible reasons for this.
==11950== 1. Your program has a bug and erroneously jumped to a non-code
==11950==    location. If you are running Memcheck and you just saw a
==11950==    warning about a bad jump, it's probably your program's fault.
==11950== 2. The instruction is legitimate but Valgrind doesn't handle it,
==11950==    i.e. it's Valgrind's fault. If you think this is the case or
==11950==    you are not sure, please let us know and we'll try to fix it.
==11950== Either way, Valgrind will now raise a SIGILL signal which will
==11950== probably kill your program.
==11950==
==11950== Process terminating with default action of signal 4 (SIGILL)
==11950==  Illegal opcode at address 0x4014C90
==11950==    at 0x4014C90: init_cpu_features (cpu-features.c:72)
==11950==    by 0x4014C90: dl_platform_init (dl-machine.h:208)
==11950==    by 0x4014C90: _dl_sysdep_start (dl-sysdep.c:231)
==11950==    by 0x40018C3: _dl_start_final (rtld.c:414)
==11950==    by 0x4001B47: _dl_start (rtld.c:523)
==11950==    by 0x40011C7: ??? (in /lib/aarch64-linux-gnu/ld-2.27.so)

The crash occurs because Valgrind simulates the CPU instructions of the process it is debugging.
Valgrind disassembles every instruction the process executes and inserts its debugging instructions at run time. However, in this case, Valgrind cannot decode the access to the MIDR_EL1 register made by the "mrs %0, midr_el1" instruction. This instruction reads the CPU identification register into the %0 (id) variable:

asm volatile ("mrs %0, midr_el1" : "=r"(id));

So Valgrind cannot recognize what "midr_el1" is, and then crashes.

https://www.kernel.org/doc/Documentation/arm64/cpu-feature-registers.txt
....
d) CPU Identification :
   MIDR_EL1 is exposed to help identify the processor. On a
   heterogeneous system, this could be racy (just like getcpu()). The
   process could be migrated to another CPU by the time it uses the
   register value, unless the CPU affinity is set. Hence, there is no
   guarantee that the value reflects the processor that it is
   currently executing on. The REVIDR is not exposed due to this
   constraint, as REVIDR makes sense only in conjunction with the
   MIDR. Alternately, MIDR_EL1 and REVIDR_EL1 are exposed via sysfs
   at:

   /sys/devices/system/cpu/cpu$ID/regs/identification/
                                                 \- midr
                                                 \- revidr

[Test Case]

1) Write a 'Hello World' program:
----
#include <stdio.h>

int main(void)
{
    printf("Hello World!\n");
    return 0;
}
----

2) Build it:
$ cc -o hello hello.c

3) Then run valgrind on it:
$ valgrind ./hello

[Regression Potential]

The regression potential should be low.

The symptom happens when Valgrind tries to disassemble code inside glibc (sysdeps/unix/sysv/linux/aarch64/cpu-features.c).

Even if HWCAP_CPUID is not supported, the default is to assign 0 to the midr variable. So I think it is not an important feature to support.

As stated in a comment in the fix itself:
++ /* Limit the AT_HWCAP to just those features we explicitly
++ support in VEX. */

- Additionally, the fix is found in Ubuntu already (disco and later).
If a regression does happen for some reason, it will be limited to the ARM architecture and should not affect other CPU architectures.

[Other information]

Upstream fix:
https://sourceware.org/git/?p=valgrind.git;a=commit;h=fbbb696c5d1e93d4ac6cb548c68bb3f443ceef42
+
+ * For some reason, Xenial is not affected:
+ ----
+ # lsb_release -cs
+ xenial
+
+ # lscpu
+ Architecture: aarch64
+
+ # valgrind ./hello
+ ==32367== Memcheck, a memory error detector
+ ==32367== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
+ ==32367== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
+ ==32367== Command: ./hello
+ ==32367==
+ Hello World!
+ ==32367==
+ ==32367== HEAP SUMMARY:
+ ==32367==     in use at exit: 0 bytes in 0 blocks
+ ==32367==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
+ ==32367==
+ ==32367== All heap blocks were freed -- no leaks are possible
+ ==32367==
+ ==32367== For counts of detected and suppressed errors, rerun with: -v
+ ==32367== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
+ ----

* Only affecting Bionic:

# git describe --contains fbbb696c5d1e93d4ac6cb548c68bb3f443ceef42
VALGRIND_3_14_0~96

# rmadison valgrind
=> valgrind | 1:3.13.0-2ubuntu2.1 | bionic-updates
   valgrind | 1:3.14.0-2ubuntu6   | disco
   valgrind | 1:3.15.0-1ubuntu3.1 | eoan-updates
   valgrind | 1:3.15.0-1ubuntu5   | focal

[Original Description]

I'm performing Valgrind testing on an ElPotato running an Ubuntu Bionic Aarch64 image. My program is dying like in https://bugs.kde.org/show_bug.cgi?id=381556 :

```
$ valgrind --track-origins=yes --suppressions=cryptopp.supp ./cryptest.exe v
==12969== Memcheck, a memory error detector
==12969== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12969== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==12969== Command: ./cryptest.exe v
==12969==
ARM64 front end: branch_etc
disInstr(arm64): unhandled instruction 0xD5380000
disInstr(arm64): 1101'0101 0011'1000 0000'0000 0000'0000
==12969== valgrind: Unrecognised instruction at address 0x4014c90.
==12969==    at 0x4014C90: init_cpu_features (cpu-features.c:72)
==12969==    by 0x4014C90: dl_platform_init (dl-machine.h:208)
==12969==    by 0x4014C90: _dl_sysdep_start (dl-sysdep.c:231)
==12969==    by 0x40018C3: _dl_start_final (rtld.c:414)
==12969==    by 0x4001B47: _dl_start (rtld.c:523)
==12969==    by 0x40011C7: ??? (in /lib/aarch64-linux-gnu/ld-2.27.so)
...
```

Here's a similar Red Hat issue report: https://bugzilla.redhat.com/show_bug.cgi?id=1467952 . Please pick up the patch in the 381556 bug report.

-----

$ lsb_release -rd
Description:    Ubuntu 18.04.2 LTS
Release:        18.04

$ apt-cache policy valgrind
valgrind:
  Installed: 1:3.13.0-2ubuntu2.1
  Candidate: 1:3.13.0-2ubuntu2.1
  Version table:
 *** 1:3.13.0-2ubuntu2.1 500
        500 http://ports.ubuntu.com bionic-updates/main arm64 Packages
        100 /var/lib/dpkg/status
     1:3.13.0-2ubuntu2 500
        500 http://ports.ubuntu.com bionic/main arm64 Packages
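
For reference, the unhandled word 0xD5380000 can be decoded by hand to confirm that it is the "mrs x0, midr_el1" read discussed above. The following is only a minimal sketch, assuming the standard A64 layout for MRS (bits 31:20 = 0xD53, then o0, op1, CRn, CRm, op2, Rt). The names sysreg_read, decode_mrs and reads_midr_el1 are made up for illustration and are not taken from Valgrind or glibc:

```c
#include <stdint.h>

/* Illustrative only: these names do not come from Valgrind or glibc. */
struct sysreg_read {
    int is_mrs;                        /* 1 if the word matches the MRS pattern */
    unsigned op0, op1, crn, crm, op2;  /* system register selector fields */
    unsigned rt;                       /* destination register number (Xt) */
};

/* Decode an A64 "MRS Xt, <sysreg>" word.
   Assumed layout: bits 31:20 = 0xD53, bit 19 = o0 (op0 = 2 + o0),
   bits 18:16 = op1, 15:12 = CRn, 11:8 = CRm, 7:5 = op2, 4:0 = Rt. */
static struct sysreg_read decode_mrs(uint32_t insn)
{
    struct sysreg_read r;
    r.is_mrs = (insn & 0xFFF00000u) == 0xD5300000u;
    r.op0 = 2u + ((insn >> 19) & 0x1u);
    r.op1 = (insn >> 16) & 0x7u;
    r.crn = (insn >> 12) & 0xFu;
    r.crm = (insn >> 8) & 0xFu;
    r.op2 = (insn >> 5) & 0x7u;
    r.rt  = insn & 0x1Fu;
    return r;
}

/* MIDR_EL1 is selected by op0=3, op1=0, CRn=0, CRm=0, op2=0. */
static int reads_midr_el1(uint32_t insn)
{
    struct sysreg_read r = decode_mrs(insn);
    return r.is_mrs && r.op0 == 3 && r.op1 == 0 &&
           r.crn == 0 && r.crm == 0 && r.op2 == 0;
}
```

With this sketch, reads_midr_el1(0xD5380000u) returns 1 and the decoded rt field is 0, i.e. the destination is x0. This matches the init_cpu_features frame in the trace and the asm statement quoted under [Impact].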
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1826811

Title:
  Valgrind unhandled instruction 0xD5380000 on Aarch64

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs