https://bugs.kde.org/show_bug.cgi?id=377006
--- Comment #10 from zephyrus00jp <ishik...@yk.rim.or.jp> ---
(In reply to John Reiser from comment #9)
> It looks to me like some part of the problem arises when memcheck is working
> on the driver for the video graphics card. This suggests a cause for
> non-determinism, and also a reason for different behavior on different Linux
> kernels. At various times over the last few years, different parts of the
> driver have moved between kernel space and user space. So one strategy to
> avoid SIGSEGV might be to choose a video driver that is as simple as
> possible; probably this is "VGA framebuffer" (which does exist, but I don't
> know its actual name.)

Thank you for taking the time to look into the issue.
I believe the SIGSEGV issue is now being reproduced on your PC, in a non-deterministic manner.

>
> Another source of non-determinism is the use of threads. I usually see two
> threads. One of them gets the SIGSEGV, then the other terminates "normally".

OK, I think I will add valgrind's fair-scheduling option (--fair-sched) to see if it makes any difference.

> I ran this group of sessions on :
> Linux deb81p64 4.9.0-1-amd64 #1 SMP Debian 4.9.6-3 (2017-01-28) x86_64
> GNU/Linux
> 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce
> GT 710B] [10de:128b] (rev a1)
> libdrm-nouveau2:amd64 2.4.74-1
> Gnome 1:3.20+3 desktop environment
> Thunderbird is icedove-45.7.1 re-built from source in the usual Debian way.
> THUNDERBIRD_BIN=icedove-45.7.1/obj-thunderbird/dist/bin/thunderbird-bin
>
> After building valgrind from current SVN, I modified vg-in-place so that the
> last command is
> strace -i -e signal=SIGSEGV -e trace=file,memory
> "$vgbasedir/coregrind/valgrind" --run-libc-freeres=no --trace-flags=10000000
> --trace-notbelow=22081 --trace-syscalls=yes $THUNDERBIRD_BIN >foo 2>&1
>
> and then experimented with --trace-notbelow until I got close to just before
> the killing SIGSEGV.
> [The number of basic blocks varied from run to run,
> which I attribute to non-determinism.] The last two basic blocks are below.
> You can see the SIGSEGV in the middle of the last block.

I have never debugged valgrind/memcheck at this level of detail.
Do you think that the SIGSEGV actually occurs inside the valgrind code itself?
(I have a feeling that it may be due to a broken emulation of a very complex instruction that *may* involve certain context-level information: the current emulation may not be fully protected or complete in terms of atomicity, or something like that. Pure guess. But otherwise, I cannot explain why valgrind fails to report a proper memory error on its own.)

>
> I saw the SIGSEGV on every run, usually in about 20 seconds of real time on
> Intel Core 2 Duo @ 3GHz.
>

Actually, I once tried to allocate only a single core to see if the symptom changed, but no luck. Still the same SIGSEGV.

> =====
>
> GuestBytes 1B5BF56B 22 48 8D 3D 1E 33 FF FF 48 89 3C D3 48 63 90 74 02 00
> 00 85 D2 78 0B 00A1B31B
>
> VexExpansionRatio 22 363 165 :10
>
> ==== SB 23786 (evchecks 13409019) [tid 1] 0x1b5bf581 UNKNOWN_FUNCTION
> /usr/lib/x86_64-linux-gnu/dri/nouveau_dri.so+0xa9581
>
> ------------------------ Front end ------------------------
>
> 0x1B5BF581: leaq -52824(%rip), %rcx
>
> ------ IMark(0x1B5BF581, 7, 0) ------
> t0 = Add64(0x1B5BF588:I64,0xFFFFFFFFFFFF31A8:I64)
> PUT(24) = t0
> PUT(184) = 0x1B5BF588:I64
>
> 0x1B5BF588: movq %rcx,(%rbx,%rdx,8)
>
> ------ IMark(0x1B5BF588, 4, 0) ------
> t1 = Add64(GET:I64(40),Shl64(GET:I64(32),0x3:I8))
> STle(t1) = GET:I64(24)
> PUT(184) = 0x1B5BF58C:I64
>
> 0x1B5BF58C: movslq 640(%rax),%rdx
>
> ------ IMark(0x1B5BF58C, 7, 0) ------
> t2 = Add64(GET:I64(16),0x280:I64)
> PUT(32) = 32Sto64(LDle:I32(t2))
> PUT(184) = 0x1B5BF593:I64
>
> 0x1B5BF593: testl %edx,%edx
>
> ------ IMark(0x1B5BF593, 2, 0) ------
> t5 = 64to32(GET:I64(32))
> t4 = 64to32(GET:I64(32))
> t3 = And32(t5,t4)
> PUT(144) = 0x13:I64
> PUT(152) = 32Uto64(t3)
> PUT(160) = 0x0:I64
> PUT(184) = 0x1B5BF595:I64
>
> 0x1B5BF595: js-8 0x1B5BF5A2
>
> ------ IMark(0x1B5BF595, 2, 0) ------
> if
> (64to1(amd64g_calculate_condition[mcx=0x13]{0x3817bec0}(0x8:I64,GET:I64(144),
> GET:I64(152),GET:I64(160),GET:I64(168)):I64)) { PUT(184) = 0x1B5BF5A2:I64;
> exit-Boring }
> PUT(184) = 0x1B5BF597:I64
> PUT(184) = GET:I64(184); exit-Boring
>
> GuestBytes 1B5BF581 22 48 8D 0D A8 31 FF FF 48 89 0C D3 48 63 90 80 02 00
> 00 85 D2 78 0B 03FEC91B
>
> VexExpansionRatio 22 363 165 :10
>
> ==== SB 23787 (evchecks 13409020) [tid 1] 0x1b5bf597 UNKNOWN_FUNCTION
> /usr/lib/x86_64-linux-gnu/dri/nouveau_dri.so+0xa9597
>
> ------------------------ Front end ------------------------
>
> 0x1B5BF597: leaq -53198(%rip), %rsi
>
> ------ IMark(0x1B5BF597, 7, 0) ------
> t0 = Add64(0x1B5BF59E:I64,0xFFFFFFFFFFFF3032:I64)
> PUT(64) = t0
> PUT(184) = 0x1B5BF59E:I64
>
> 0x1B5BF59E: movq %rsi,(%rbx,%rdx,8)
>
> ------ IMark(0x1B5BF59E, 4, 0) ------
> t1 = Add64(GET:I64(40),Shl64(GET:I64(32),0x3:I8))
> STle(t1) = GET:I64(64)
> PUT(184) = 0x1B5BF5A2:I64
>
> 0x1B5BF5A2: movslq 636(%rax),%rdx
>
> ------ IMark(0x1B5BF5A2, 7, 0) ------
> t2[????????????????]
> +++ killed by SIGSEGV +++
> = Add64(GET:I64(16),0x27C:I64)
> PUT(32) = 32Sto64(LDle:I32(t2))
> PUT(184) = 0x1B5BF5A9:I64
>
> 0x1B5BF5A9: testl %edx,%edx
>
> ------ IMark(0x1B5BF5A9, 2, 0) ------
> t5 = 64to32(Segmentation fault
> GET:I64(32))
> t4 = 64to32(GET:I64(32))
> t3 = And32(t5,t4)
> PUT(144) = 0x13:I64
> PUT(152) = 32Uto64(t3)
> PUT(160) = 0x0:I64
> PUT(184) = 0x1B5BF5AB:I64
>
> 0x1B5BF5AB: js-8 0x1B5BF5B8
>
> =====
>
> $ gdb /usr/lib/x86_64-linux-gnu/dri/nouveau_dri.so
>
> (gdb) x/12i 0xa9597
> 0xa9597: lea -0xcfce(%rip),%rsi # 0x9c5d0
> 0xa959e: mov %rsi,(%rbx,%rdx,8)
> 0xa95a2: movslq 0x27c(%rax),%rdx
> 0xa95a9: test %edx,%edx
> 0xa95ab: js 0xa95b8
> 0xa95ad: lea -0xd144(%rip),%rdi # 0x9c470
> 0xa95b4: mov %rdi,(%rbx,%rdx,8)
> 0xa95b8: movslq 0x284(%rax),%rdx
> 0xa95bf: test %edx,%edx
> 0xa95c1: js 0xa95ce
> 0xa95c3: lea -0xd2ba(%rip),%rcx # 0x9c310
> 0xa95ca: mov %rcx,(%rbx,%rdx,8)
>
> (gdb) x/12i 0xa9597-0x20
> 0xa9577: movslq 0x274(%rax),%edx
> 0xa957d: test %edx,%edx
> 0xa957f: js 0xa958c
> 0xa9581: lea -0xce58(%rip),%rcx # 0x9c730
> 0xa9588: mov %rcx,(%rbx,%rdx,8)
> 0xa958c: movslq 0x280(%rax),%rdx
> 0xa9593: test %edx,%edx
> 0xa9595: js 0xa95a2
> 0xa9597: lea -0xcfce(%rip),%rsi # 0x9c5d0
> 0xa959e: mov %rsi,(%rbx,%rdx,8)
> 0xa95a2: movslq 0x27c(%rax),%rdx
> 0xa95a9: test %edx,%edx
>
> =====

I have not seen this kind of debugging output before, but is it basically the case that valgrind expands the instructions by adding memory checks and then executes the expanded instruction stream (in units that it calls basic blocks)?

I will tinker with the options to see if I can gain any insight on my end.

TIA
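
One way to sanity-check the quoted trace is to map the guest addresses in the VEX front-end output back to the file offsets that gdb prints for nouveau_dri.so. The following is only a minimal sketch in Python. It assumes the library's load base is simply the guest address minus the "+0xa9597" offset shown on the "==== SB" line, and the helper names `load_base` and `lea_target` are made up for illustration (they are not part of valgrind or gdb).

```python
MASK64 = (1 << 64) - 1  # VEX Add64 wraps modulo 2**64


def load_base(guest_addr, file_offset):
    """Assumed load base: guest address minus the offset valgrind printed."""
    return guest_addr - file_offset


def lea_target(next_rip, disp64):
    """Evaluate the Add64(next_rip, disp64) that VEX emits for a rip-relative leaq."""
    return (next_rip + disp64) & MASK64


# SB 23787 starts at guest 0x1b5bf597, reported as nouveau_dri.so+0xa9597.
base = load_base(0x1B5BF597, 0xA9597)
print(hex(base))  # 0x1b516000

# t0 = Add64(0x1B5BF59E:I64, 0xFFFFFFFFFFFF3032:I64) from the trace.
target = lea_target(0x1B5BF59E, 0xFFFFFFFFFFFF3032)
print(hex(target - base))  # 0x9c5d0, the same offset gdb annotates on the lea
```

Under that assumption the two views agree: the same calculation for the previous block (0x1B5BF588 plus 0xFFFFFFFFFFFF31A8) gives offset 0x9c730, which is what gdb shows for the lea at 0xa9581. That suggests the front end is at least decoding these instructions as gdb does.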