Hi Ryan,

On 07/11/2017 02:58 AM, Ryan Tandy wrote:
> Today I built Linux 4.12 from upstream source and the test program still
> crashes. I was looking at your fixes to initialize load_{fp,tm,vec} as well
> as someone else fixing the CONFIG_ALIVEC typo but none of those have helped.

Right, I tested it with the pending patches for HTM and the bug is still
there, so I doubt it has been fixed already.

> I did confirm on this kernel that reverting 613036d9 still stops it from
> crashing. Tomorrow I will try to narrow it down to a specific change. There
> are only 4 hunks after all (the addition of msr_tm_active cannot be reverted
> as there are more calls to it now).

In fact I just did it, and I found that the following patch fixes the
problem. I am not able to understand why yet. Working on it right now.

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 9f3e2c932dcc..21bcb3b19758 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -231,7 +231,7 @@ void enable_kernel_fp(void)
 EXPORT_SYMBOL(enable_kernel_fp);
 static int restore_fp(struct task_struct *tsk)
 {
-	if (tsk->thread.load_fp || msr_tm_active(tsk->thread.regs->msr)) {
+	if (tsk->thread.load_fp) {
 		load_fp_state(&current->thread.fp_state);
 		current->thread.load_fp++;
 		return 1;

> It turns out it is _not_ compiler dependent. The test program compiled in a
> jessie chroot succeeds in that chroot and then crashes if I run the same
> binary in a stretch chroot. This also means I was wrong about the m{t,f}vsrd
> instructions being related, as gcc-4.9 doesn't emit them (for this particular
> program, at least).

I understand that glibc itself might use VSX instructions, so even if your
application does not use them, VSX might still be required depending on the
glibc version you are using. That said, the problem seems to be on the
floating point (FP) side.

> objdump -d libpthread.so.0 output apparently lists some tbegin/tend
> instructions, but I suppose those could be selected at runtime?

Correct. I checked and Debian is enabling HTM[1] to do lock elision. It is
not an option that you can change at runtime; we would need to reconfigure
and recompile glibc if we want to disable it.
There is currently an effort to use glibc tunables to enable/disable lock
elision at runtime, but this is still under development.

Out of curiosity, how did you bisect the kernel to find that commit ID? Did
you do it automatically?

[1] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=ppc64el&ver=2.24-12&stamp=1497900384&raw=0
    (Check for --enable-lock-elision)
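
As a rough illustration of what the restore_fp() change in the patch above
does, here is a minimal user-space sketch of the two conditions. It only
shows which inputs are treated differently; it does not explain the crash.
Everything prefixed model_/MODEL_ is a hypothetical stand-in, not kernel
code, and the sketch assumes that msr_tm_active() tests the MSR
transaction-state bits and that those sit at bit positions 33 (suspended)
and 34 (transactional).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed positions of the MSR transaction-state (TS) bits. */
#define MODEL_MSR_TS_S    (1ULL << 33)  /* assumed: transaction suspended */
#define MODEL_MSR_TS_T    (1ULL << 34)  /* assumed: transactional */
#define MODEL_MSR_TS_MASK (MODEL_MSR_TS_S | MODEL_MSR_TS_T)

/* Hypothetical stand-in for the two fields the condition reads. */
struct model_thread {
	unsigned int load_fp;
	uint64_t msr;
};

/* Assumed behaviour of msr_tm_active(): any TS bit set. */
static int model_msr_tm_active(uint64_t msr)
{
	return (msr & MODEL_MSR_TS_MASK) != 0;
}

/* Condition before the patch: load_fp set OR a transaction is active. */
static int model_restore_fp_before(const struct model_thread *t)
{
	return t->load_fp || model_msr_tm_active(t->msr);
}

/* Condition after the patch: load_fp alone decides. */
static int model_restore_fp_after(const struct model_thread *t)
{
	return t->load_fp != 0;
}

/* Count the (load_fp, msr) combinations where the two conditions disagree. */
static int model_count_differences(void)
{
	const uint64_t msrs[] = { 0, MODEL_MSR_TS_S, MODEL_MSR_TS_T };
	int differences = 0;

	for (unsigned int load_fp = 0; load_fp <= 1; load_fp++) {
		for (size_t i = 0; i < sizeof(msrs) / sizeof(msrs[0]); i++) {
			struct model_thread t = { load_fp, msrs[i] };

			if (model_restore_fp_before(&t) != model_restore_fp_after(&t))
				differences++;
		}
	}
	return differences;
}
```

Under these assumptions the two conditions disagree only when load_fp is 0
and a transaction-state bit is set: the old condition reloads the FP state
in that case and the patched one does not.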