[valgrind] [Bug 377006] valgrind/memcheck segfaults under certain kernel versions (amd64) but not others.

zephyrus00jp Mon, 06 Mar 2017 23:05:42 -0800

https://bugs.kde.org/show_bug.cgi?id=377006


--- Comment #6 from zephyrus00jp <ishik...@yk.rim.or.jp> ---
Created attachment 104426
  --> https://bugs.kde.org/attachment.cgi?id=104426&action=edit
A log to show valgrind with --vex-iropt-register-updates=all still segfaults
(under 4.7.0.1)

Sorry I was not specific.

I DID follow Julian's advice over the last several months, and still no luck.

The following options for building Thunderbird are in my mozconfig
file (on the two PCs where I test):

>  ac_add_options --disable-jemalloc
>  ac_add_options --enable-valgrind

[And I don't see "unhandled syscall: 317" before the segfault
in recent logs, as you can verify. At least, I don't recall seeing it in the
last few months.]

Re the following option:

>  --vex-iropt-register-updates=allregs-at-mem-access

This was added to the valgrind options when I ran the
|make mozmill| test suite on one of the test machines, and
still no luck. Even with this and a few other options Julian suggested, the
combination valgrind+thunderbird runs under 3.19.5 and
segfaults under 4.7.0.1 and the 4.9.x series.
In fact, this option does not make a difference as far as I can tell :-(

> https://bugs.kde.org/show_bug.cgi?id=345414#c3

Yes, that thread was reported by me almost two years ago.

Back then the 4.y series kernel was not available for Debian (it was only
in the testing repository). But it is now. And I want to use the later
kernel versions for obvious reasons.

I have to emphasize that the bug still stands with the suggested option.

I am attaching the segfault case when valgrind is run with the following
parameters under kernel 4.7.0.1 (Debian's distribution). Note the addition of
--vex-iropt-register-updates=allregs-at-mem-access.
(It does not make a difference. valgrind+thunderbird still
segfaults. Sorry I was not specific enough about this in my original
report. I did not want to clutter the bug report with options that
do not seem to have any effect.)

strace -ff valgrind --verbose --trace-syscalls=yes --trace-signals=yes
--show-mismatched-frees=no --trace-children=yes
--vex-iropt-register-updates=allregs-at-mem-access
~ishikawa/objdir-tb3/dist/bin/thunderbird-bin

On this machine with this kernel, original valgrind+thunderbird segfaults AFTER
a child process spawned by thunderbird finishes.
On another machine with 4.9.x kernel, valgrind+thunderbird segfaults way before
the child process fork/exec happens.

It is really frustrating to see the combination of valgrind+thunderbird work
only under certain kernel revisions (in my case, 3.19.5) as noted in
https://bugs.kde.org/show_bug.cgi?id=345414#c6

With a slim hope of success, I tried using the old kernel config for 3.19.5 to
build a 4.9.z kernel (using make oldconfig), but valgrind+thunderbird still
segfaults under the resulting kernel. (That was on a different PC.)

It would be great to find out TO WHERE (if it is meaningful) the stray pointer
reported in SIGSEGV points.

I think a routine that reports the symbols mapped into the anonymous
mapping areas, as viewed by valgrind, could be very useful for this.
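
To make the kind of lookup I mean concrete, here is a minimal sketch. It is
not valgrind code: it only parses text in the /proc/<pid>/maps format and
reports which mapping, if any, contains a given address. The names
parse_maps and find_mapping and the sample addresses are made up for
illustration.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps-style text into (start, end, perms, name) tuples."""
    regions = []
    for line in text.splitlines():
        # Fields: address-range perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        name = fields[5].strip() if len(fields) > 5 else "[anonymous]"
        regions.append((int(start_s, 16), int(end_s, 16), fields[1], name))
    return regions


def find_mapping(regions, addr):
    """Return the region containing addr, or None if addr is unmapped."""
    for start, end, perms, name in regions:
        if start <= addr < end:  # the end address is exclusive
            return (start, end, perms, name)
    return None


# Hypothetical sample data, not taken from any real log.
SAMPLE = """\
00400000-0040c000 r-xp 00000000 08:01 1234 /usr/bin/example
0060c000-0060d000 rw-p 0000c000 08:01 1234 /usr/bin/example
7f0000000000-7f0000021000 rw-p 00000000 00:00 0
7ffc00000000-7ffc00021000 rw-p 00000000 00:00 0 [stack]
"""

if __name__ == "__main__":
    regions = parse_maps(SAMPLE)
    for addr in (0x400010, 0x7f0000000100, 0x10):
        print(hex(addr), find_mapping(regions, addr))
```

An address that falls in no region (such as a very low one) comes back as
None, which would at least tell us that the stray pointer is not inside any
mapping the process knows about.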

I have a few pet theories for possible issues:

1. Given that when the fatal SIGSEGV is received, the stack trace
seems to be messed up somehow (the addresses seem to be too low in
comparison to other values), I suspect that valgrind may be
experiencing a segfault in the code that sets up signal handlers,
including the one for SIGSEGV.  (There were some races in the linux kernel
regarding signal handling before and after a fork(). Maybe valgrind code
inherits similar problems. But do note that under 4.9.x, the
valgrind+thunderbird combination still crashes BEFORE fork() is reached.
So there may be multiple issues here.)

2. I am not sure how valgrind handles this, but, given the different mmap
layout, I wonder if the malloc routine in valgrind may return from malloc()
an area which is at the very end of the sbrk()'ed area. If so, what
happens if a multiple-byte access by x86_64 code for strcmp, etc. goes beyond
the sbrk() value during operation? That is, what happens when an eager access,
which reads extra bytes (8 or 16 octets) to speed up the operation, has the
tail part of the octet array fall beyond the user's valid VM address area?
Does it get caught as SIGSEGV? Or is such an access checked byte by byte
before such an illegal access is attempted?
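
The arithmetic behind question 2 can be sketched as below. This is only an
illustration under stated assumptions: 4096-byte pages, and the helper names
crosses_page and aligned_base are made up. It shows that a naturally aligned
8- or 16-byte load stays within one page, while an unaligned wide load near
the end of a page can reach into the next one. Whether anything like this is
actually involved in the crash is not established.

```python
PAGE_SIZE = 4096  # assumption: typical x86_64 page size


def crosses_page(addr, width, page_size=PAGE_SIZE):
    """Return True if the access [addr, addr + width) touches two pages."""
    first_page = addr // page_size
    last_page = (addr + width - 1) // page_size
    return first_page != last_page


def aligned_base(addr, width):
    """Round addr down to a multiple of width (width must be a power of two)."""
    return addr & ~(width - 1)


if __name__ == "__main__":
    tail = 0x1FFD  # hypothetical address 3 bytes before a page boundary
    for width in (8, 16):
        base = aligned_base(tail, width)
        print(width, "unaligned crosses:", crosses_page(tail, width),
              "aligned crosses:", crosses_page(base, width))
```

If the page after the boundary is unmapped, only the unaligned case could
fault on the hardware; the aligned case reads extra bytes, but all of them
are on the same page as the first byte.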

If the reason for the segfault is not one of the above,
I am at my wit's end.

TIA

