On 02/04/2019 14:51, Lluis Campos wrote:
Hi Paul,
On 02.04.2019 14:49, Paul Barker wrote:
On 02/04/2019 12:45, Lluis Campos wrote:
Hi all,
This is my very first question on the Yocto mailing list. Very
excited! Please let me know if I should use another list for this.
I am building an image using musl libc instead of GNU libc. I am not
using the yocto-tiny distro; instead, I achieve this by setting the
following in my local.conf:
TCLIBC = "musl"
My app (mender) segfaults immediately on startup. See the output from strace:
root@raspberrypi3:~# strace mender
execve("/usr/bin/mender", ["mender"], 0x7ee65e10 /* 13 vars */) = 0
set_tls(0x76f1bffc) = 0
set_tid_address(0x76f1bfa0) = 3020
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x530bc8} ---
+++ killed by SIGSEGV +++
Segmentation fault
To be able to debug the process, I added gdb to my image by adding the
following to my local.conf:
CORE_IMAGE_EXTRA_INSTALL += "packagegroup-core-buildessential packagegroup-core-tools-debug"
Then, ironically, gdb itself also segfaults:
root@raspberrypi3:~# strace gdb 2>&1 | tail
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
getdents64(3, /* 6 entries */, 2048) = 144
getdents64(3, /* 0 entries */, 2048) = 0
close(3) = 0
ioctl(0, TIOCGWINSZ, {ws_row=25, ws_col=74, ws_xpixel=0, ws_ypixel=0}) = 0
getcwd("/home/root", 4096) = 11
access("/usr/local/bin/gdb", X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/gdb", X_OK) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7e35aff0} ---
+++ killed by SIGSEGV +++
So, what is going on here? My guess is that some recipes are being
wrongly linked with gnu libc instead of musl, and then cannot run in
my device.
Any ideas on how to debug the issue?
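One way to test that guess is to look at which dynamic loader each binary
requests. The following is only a sketch: it assumes readelf (from binutils)
is available, either on the target or on the build host pointed at the
image's rootfs, and the function names elf_interp and classify_loader are
hypothetical helpers, not part of any Yocto tooling.

```shell
#!/bin/sh
# Print the program interpreter (dynamic loader) an ELF binary requests.
# Prints nothing for static binaries or files readelf cannot parse.
elf_interp() {
    readelf -l "$1" 2>/dev/null |
        sed -n 's/.*Requesting program interpreter: \(.*\)\]$/\1/p'
}

# Classify a loader path by its file name.
# musl loaders are named ld-musl-<arch>.so.1; glibc loaders are ld-linux*.so.*
classify_loader() {
    case "$1" in
        */ld-musl-*) echo musl ;;
        */ld-linux*) echo glibc ;;
        "")          echo "static or not ELF" ;;
        *)           echo unknown ;;
    esac
}

# Example usage (paths taken from the strace output above):
#   for bin in /usr/bin/mender /usr/bin/gdb; do
#       echo "$bin: $(classify_loader "$(elf_interp "$bin")")"
#   done
```

If every binary reports musl, a wrong-libc link is unlikely to be the cause
and the problem is more likely at runtime.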
Hi Lluis,
This is an issue I've seen before with runc and gdb.
In runc we saw SIGILL which I tracked down to some hideous
setjmp/longjmp magic written in C. cgo is used to include this C code
in with the Go code that comprises the rest of the application.
Our application is written in Go and we use CGO as well. So it sounds
quite similar.
Does your application use setjmp/longjmp or C++ exceptions in the
sections built with CGO?
I've dug into the runc disassembly as well as adding extra prints and
can see that it's at calls to those functions that the program counter
jumps off into the weeds resulting in SIGILL. For gdb there's usage of
C++ exception handling around gdb_main() and strace shows that the crash
is very very early in execution so I suspect it's setjmp/longjmp again,
used by C++ exception handling.
In gdb we saw SIGSEGV which is what you've got above.
I think things are being correctly linked against musl but then
there's some runtime issue in recent musl versions, possibly in
conjunction with recent kernel headers.
Are you using the thud or master branch?
I am using the thud branch. I haven't actually tried with master, but I
will do it later today.
I've reproduced the issue on master. I've also confirmed that the sumo
branch is not affected, even if we uprev musl to the same version used on
master.
It's likely a gcc/musl incompatibility of some kind introduced in a
recent gcc version.
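For anyone wanting to compare the musl and gcc versions between branches,
the values can be pulled out of the BitBake environment dump. This is a
rough sketch: extract_var is a hypothetical helper, and it assumes
bitbake -e is run from an initialized build directory and that the
toolchain configuration sets GCCVERSION, as the default oe-core toolchain
config does.

```shell
#!/bin/sh
# Read `bitbake -e` output on stdin and print the value of one variable.
# Matches only plain assignments of the form NAME="value".
extract_var() {
    sed -n "s/^${1}=\"\(.*\)\"\$/\1/p"
}

# Example usage, run once per branch and compared by hand:
#   bitbake -e musl | extract_var PV
#   bitbake -e | extract_var GCCVERSION
#   bitbake -e | extract_var TCLIBC
```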
I'm passing this one to a colleague for now but I'll try to have another
look myself next week. It's captured in our issue tracker for Oryx here:
https://gitlab.com/oryx/oryx/issues/14.
Thanks,
--
Paul Barker
Managing Director & Principal Engineer
Beta Five Ltd