On Fri, Aug 28, 2015 at 1:30 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Thu, Aug 27, 2015 at 9:56 PM, Michael Ellerman <m...@ellerman.id.au> wrote:
>> On Thu, 2015-08-27 at 11:31 -0400, Ilia Mirkin wrote:
>>> I've recently come into the possession of a PowerMac7,3 and have been
>>> cross-compiling a chroot for it on my (x86_64) desktop. However
>>> elfutils doesn't cross-compile for ppc64 due to its biarch m4 script
>>> which tries to execute a built program, so I kicked off a build
>>> locally and left for a few minutes.
>>
>> OK, cross compiling how? A bunch of the guys here use buildroot, but maybe
>> they aren't building elfutils?
>
> This is what I get in configure:
>
> checking whether powerpc64-unknown-linux-gnu-gcc -m32 makes
> executables we can run... configure: error: in
> `/usr/powerpc64-unknown-linux-gnu/tmp/portage/dev-libs/elfutils-0.158/work/elfutils-0.158-abi_ppc_64.ppc64':
> configure: error: cannot run test program while cross compiling
>
> and config.log has:
>
> $
> /usr/powerpc64-unknown-linux-gnu/tmp/portage/dev-libs/elfutils-0.158/work/elfutils-0.158/configure
> --prefix=/usr --build=x86_64-pc-linux-gnu
> --host=powerpc64-unknown-linux-gnu --mandir=/usr/share/man
> --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc
> --localstatedir=/var/lib --disable-dependency-tracking
> --libdir=/usr/lib64 --disable-werror --enable-nls
> --disable-thread-safety --program-prefix=eu- --with-zlib --with-bzlib
> --without-lzma
> ...
> configure:6465: checking powerpc64-unknown-linux-gnu-gcc option for
> 32-bit word size
> configure:6478: powerpc64-unknown-linux-gnu-gcc -m32 -c -O2 -pipe
> -mcpu=G5 -mtune=G5 -fomit-frame-pointer conftest.c >&5
> configure:6478: $? = 0
> configure:6486: result: -m32
> configure:6490: checking for 64-bit host
> configure:6511: result: yes
> configure:6538: checking whether powerpc64-unknown-linux-gnu-gcc -m32
> makes executables we can run
> configure:6546: error: in
> `/usr/powerpc64-unknown-linux-gnu/tmp/portage/dev-libs/elfutils-0.158/work/elfutils-0.158-abi_ppc_64.ppc64':
> configure:6548: error: cannot run test program while cross compiling
> See `config.log' for more details
>
> I'm building with the help of gentoo's crossdev scripts, which in
> addition to setting up a crosscompiler, also sets up an easy way to
> "emerge" packages into some chroot.
>
> Looking at https://git.fedorahosted.org/cgit/elfutils.git/tree/m4/biarch.m4
> makes it seem like it runs AC_RUN_IFELSE irrespective of
> cross-compilation. Unfortunately I'm not well-enough versed in m4 or
> how cross-compilation is normally handled to suggest a proper fix. I
> seem to recall it's normally done by just saying "if you're
> cross-compiling, you probably know what you're doing and so let's just
> assume things work as expected".
>
>>
>>> When I came back, I saw the below
>>> through netconsole, the fans were going full blast, and the machine
>>> was unresponsive.
>>
>> Fans going full blast is normal when the kernel crashes, it's just a safety
>> precaution so your machine doesn't melt.
>>
>>> Is this a kernel issue?
>>
>> Probably.
>>
>>> Hardware issue?
>>
>> Unlikely to be a hardware issue.
>>
>>> What do I need to do in order
>>> for the instruction dump to not be XXX's and have a call trace?
>>
>> The XXX's mean that we couldn't read the memory where the instructions were
>> in order to dump them, which is odd. I can't immediately see why that happened
>> here.
>>
>> That's separate to getting a call trace, but possibly the same issue is
>> causing both to not be emitted.
>
> Yeah, after sending the email I took a look at
> arch/powerpc/kernel/process.c which has
>
> show_instructions() { ...
>     if (!__kernel_text_address(pc) ||
>         probe_kernel_address((unsigned int __user *)pc, instr)) {
>             printk(KERN_CONT "XXXXXXXX ");
>
> and has various guards around printing a call trace.
>
>>
>>> (Is this the annoying security stuff in action? I started with the
>>
>> Which stuff? Probably not though.
>
> Oh I just remember a bunch of stuff getting added to the kernel to
> prevent information leaks via dmesg prints, in conjunction with kaslr.
> But you're right, this isn't it.
>
>>
>>> g5_defconfig, perhaps that was a mistake.)
>>
>> That should be a good config, and it booted originally right.
>>
>>> Sorry for the newbie questions, but I'm very new to ppc.
>>
>> No worries, welcome to ppc land! :)
>>
>>
>>> In case it matters, it's booted on an nfsroot, no swap.
>>
>> OK. I don't test nfsroot so that could be the problem.
>>
>> What kernel version, 4.1.6 ?
>
> Yes, 4.1.6 (as one could surmise from the backtrace).
>
>>
>>> Thanks for any help,
>>>
>>> -ilia
>>>
>>> [ 8419.415061] Oops: Kernel access of bad area, sig: 11 [#1]
>>> [ 8419.416338] SMP NR_CPUS=4 PowerMac
>>> [ 8419.417623] Modules linked in: snd_aoa_codec_tas snd_aoa snd
>>> nouveau soundcore btusb btbcm btintel ttm bluetooth drm_kms_helper drm
>>> uninorth_agp agpgart
>>> [ 8419.419138] CPU: 0 PID: 12927 Comm: as Not tainted 4.1.6 #4
>>> [ 8419.420539] task: c0000000573f3520 ti: c000000057698000 task.ti:
>>> c000000057698000
>>> [ 8419.421963] NIP: c00000005769bca8 LR: c00000005769bca8 CTR:
>>> c00000000008a710
>>> [ 8419.423400] REGS: c00000005769b7e0 TRAP: 0400 Not tainted (4.1.6)
>>> [ 8419.424850] MSR: 9000000010001032 <SF,HV,ME,IR,DR,RI> CR: 001048fc
>>> XER: 00000000
>>> [ 8419.426407] SOFTE: 0
>>> GPR00: 00000000ffffffff c00000005769ba60 c000000000b9ac00 c0000000590bb520
>>> GPR04: c0000000573f3ab0 c0000000573f3588 c0000000001048fc c00000005769bca8
>>> GPR08: c00000005769b890 c000000050000000 0000000000000001 c00000005ee0a290
>>> GPR12: 0000000024044048 c00000000ffff000 c00000005769ba20 0000000000000600
>>> GPR16: 0000000000000001 0000000000000000 c00000005bbd8e00 c000000058ccbcb0
>>> GPR20: c00000005769ba50 0000000000000000 c000000000103d60 c00000005bbd8e00
>>> GPR24: c00000005769ba40 0000000000000000 0000000000000001 0000000000000001
>>> GPR28: 000000001007d630 0000000010049d08 c00000005769bc80 c000000058ccbcb0
>>> [ 8419.440558] NIP [c00000005769bca8] 0xc00000005769bca8
>>> [ 8419.442170] LR [c00000005769bca8] 0xc00000005769bca8
>>> [ 8419.443774] Call Trace:
>>> [ 8419.445351] Instruction dump:
>>> [ 8419.446946] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>> XXXXXXXX XXXXXXXX
>>> [ 8419.448659] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>> XXXXXXXX XXXXXXXX
>>> [ 8419.456445] ---[ end trace ad7c77d8920840ff ]---
>>> [ 8419.456511]
>>> [ 8419.456565] Fixing recursive fault but reboot is needed!
>>
>> Is this definitely the first oops?
>>
>> That looks like a pretty standard null pointer deref, or other bad pointer in
>> the kernel. I can't tell exactly without the instruction dump though.
>
> Not *definitely* the first oops, but definitely the first one in
> netconsole. I unfortunately didn't have time to deal with the problem
> when it happened and just shut the system off without looking at the
> console. I'll give it all another shot.
>
> Thanks for the detailed reply!

I've been having lots of general trouble on this machine... like no
older kernels boot, but 4.1.6 and 4.2-rc8 are fine. I suspect that the
toolchain might have some issues :( I've downgraded gcc to 4.8, but
that didn't resolve it, binutils is next.

However on 4.2-rc8 (well, ~airlied/drm-next), I managed to capture on
netconsole the below (although it hangs fairly often, but usually
without any messaging on OFfb or netconsole). By the way, you can tell
if it's a first oops or not based on the taint... first oops will say
'Not tainted', while follow-up ones will have some taint.

[ 247.551040] Oops: Kernel access of bad area, sig: 11 [#1]
[ 247.551215] SMP NR_CPUS=4 PowerMac
[ 247.551323] Modules linked in: cfg80211 snd_aoa_codec_tas snd_aoa snd soundcore uninorth_agp agpgart
[ 247.551655] CPU: 0 PID: 2122 Comm: syslog-ng Not tainted 4.2.0-rc8-01316-g4b9e78b #6
[ 247.551873] task: c000000059b61a90 ti: c000000059bcc000 task.ti: c000000059bcc000
[ 247.552081] NIP: c0000000002d4e14 LR: c0000000002d4df8 CTR: c0000000002ef380
[ 247.552276] REGS: c000000059bcf530 TRAP: 0300 Not tainted (4.2.0-rc8-01316-g4b9e78b)
[ 247.552496] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 42004422 XER: 20000000
[ 247.552808] DAR: 0000000000100108 DSISR: 42000000 SOFTE: 1
GPR00: c0000000002d4f5c c000000059bcf7b0 c000000000b9dc00 c0000000570058a0
GPR04: c0000000580880c8 0000000000000081 0000000000000001 0000000000100100
GPR08: c0000000580880d0 0000000000200200 0000000000100100 7f7f7f7f7f7f7f7f
GPR12: 0000000022004428 c00000000ffff000 0000000044000000 0000000022000000
GPR16: 0000000010042418 00003fffe6383ae6 0000000000000000 ffffffffffffffff
GPR20: 000000000000003a 00003fffe6382b58 0000000010020400 0000000000000000
GPR24: 000000001001db68 fffffffffffffff6 c000000059b4501d 0000000000000081
GPR28: c0000000570058b8 c000000057106d00 c0000000580881b8 c0000000570058a0
[ 247.554756] NIP [c0000000002d4e14] .nfs_do_access+0x3b4/0x410
[ 247.554918] LR [c0000000002d4df8] .nfs_do_access+0x398/0x410
[ 247.555075] Call Trace:
[ 247.555147] [c000000059bcf7b0] [c0000000002d4e44] .nfs_do_access+0x3e4/0x410 (unreliable)
[ 247.563012] [c000000059bcf8a0] [c0000000002d4f5c] .nfs_permission+0xac/0x230
[ 247.567129] [c000000059bcf930] [c000000000171f84] .__inode_permission+0x94/0x100
[ 247.575049] [c000000059bcf9c0] [c00000000017548c] .link_path_walk+0x8c/0x630
[ 247.579155] [c000000059bcfa90] [c000000000175ba8] .path_lookupat+0xb8/0x1b0
[ 247.583183] [c000000059bcfb20] [c00000000017802c] .filename_lookup+0x8c/0x180
[ 247.587134] [c000000059bcfc90] [c00000000016ad68] .vfs_fstatat+0x78/0x130
[ 247.590989] [c000000059bcfd40] [c00000000016b38c] .SyS_newstat+0x1c/0x50
[ 247.594733] [c000000059bcfe30] [c000000000007c98] system_call+0x38/0xd0
[ 247.598380] Instruction dump:
[ 247.601901] 4bfff79d 4bfffd8c 7fe3fb78 389eff10 481656ad 60000000 e91f0020 e8ff0018
[ 247.609048] 3d400010 3d200020 61290200 614a0100 <f9070008> f8e80000 f95f0018 f93f0020
[ 247.616395] ---[ end trace c9fc24592b1a7aba ]---
[ 247.619918]
[ 247.624094] Unable to handle kernel paging request for data at address 0x00000014
[ 247.631108] Faulting instruction address: 0xc0000000004f60d0
[ 247.634787] Oops: Kernel access of bad area, sig: 11 [#2]
[ 247.638480] SMP NR_CPUS=4 PowerMac
[ 247.642140] Modules linked in: cfg80211 snd_aoa_codec_tas snd_aoa snd soundcore uninorth_agp agpgart
[ 247.649826] CPU: 0 PID: 1052 Comm: kwindfarm Tainted: G D 4.2.0-rc8-01316-g4b9e78b #6
[ 247.657608] task: c000000059524fb0 ti: c000000059afc000 task.ti: c000000059afc000
[ 247.665544] NIP: c0000000004f60d0 LR: c0000000004f60c4 CTR: c000000000041770
[ 247.669647] REGS: c000000059aff700 TRAP: 0300 Tainted: G D (4.2.0-rc8-01316-g4b9e78b)
[ 247.677582] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 22022442 XER: 20000000
[ 247.685687] DAR: 0000000000000014 DSISR: 40000000 SOFTE: 1
GPR00: c0000000004f60c4 c000000059aff980 c000000000b9dc00 0000000000000001
GPR04: 00000000025da79d 0000000000000000 c000000059524fb0 0000000000000000
GPR08: 0000000080000000 0000000000000009 0000000000000000 0000000000009324
GPR12: 0000000022022448 c00000000ffff000 c0000000000780a0 c000000059ae0740
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR28: c000000000b66f40 0000000000000000 0000000000000004 c000000059affacc
[ 247.721539] NIP [c0000000004f60d0] .wf_fcu_fan_get_rpm+0x50/0x150
[ 247.725125] LR [c0000000004f60c4] .wf_fcu_fan_get_rpm+0x44/0x150
[ 247.728656] Call Trace:
[ 247.732114] [c000000059aff980] [c0000000004f60c4] .wf_fcu_fan_get_rpm+0x44/0x150 (unreliable)
[ 247.739328] [c000000059affa30] [c0000000004f9204] .pm72_wf_notify+0x784/0x1260
[ 247.746623] [c000000059affb50] [c0000000000794ec] .notifier_call_chain+0x7c/0xf0
[ 247.754143] [c000000059affbf0] [c000000000079954] .__blocking_notifier_call_chain+0x64/0xa0
[ 247.761831] [c000000059affc90] [c0000000004f544c] .wf_thread_func+0x9c/0x170
[ 247.765805] [c000000059affd30] [c0000000000781a4] .kthread+0x104/0x130
[ 247.769715] [c000000059affe30] [c000000000007fa8] .ret_from_kernel_thread+0x58/0xb0
[ 247.777183] Instruction dump:
[ 247.780808] 3880000b f8010010 f821ff51 ebc30048 38a10073 ebfe0020 7fe3fb78 837f0048
[ 247.788221] 4bfffc31 2f830001 409e00b8 89210073 <815e0010> 7d295630 793d07e1 408200d4
[ 247.795659] ---[ end trace c9fc24592b1a7abb ]---
[ 247.799338]
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
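
On the taint remark in the message above: a minimal Python sketch of that check, assuming the "CPU: ... Comm: ... <taint> <version>" line format seen in these two dumps. The function name classify_oopses and the SAMPLE constant are made up for illustration. It looks specifically for the 'D' taint flag (set once the kernel has already oopsed), which is a slightly narrower test than "any taint", since other flags can be set before any oops happens.

```python
import re

# Matches the per-oops "CPU:" line as it appears in the dumps above.
# Assumption: the task name (Comm) contains no spaces.
CPU_LINE = re.compile(
    r"CPU: \d+ PID: (?P<pid>\d+) Comm: (?P<comm>\S+) "
    r"(?P<taint>Not tainted|Tainted: [A-Z ]+?) \d"
)


def classify_oopses(log_text):
    """Return (pid, comm, prior_oops) for every CPU: line in log_text.

    prior_oops is True when the 'D' taint flag is present, i.e. the
    kernel had already died once before this oops was printed.
    """
    results = []
    for match in CPU_LINE.finditer(log_text):
        taint = match.group("taint")
        # "Not tainted" carries no flags; otherwise keep what follows the colon.
        flags = "" if taint == "Not tainted" else taint.split(":", 1)[1]
        results.append((int(match.group("pid")), match.group("comm"), "D" in flags))
    return results


# The two CPU: lines from the 4.2-rc8 capture above.
SAMPLE = (
    "[ 247.551655] CPU: 0 PID: 2122 Comm: syslog-ng Not tainted 4.2.0-rc8-01316-g4b9e78b #6\n"
    "[ 247.649826] CPU: 0 PID: 1052 Comm: kwindfarm Tainted: G D 4.2.0-rc8-01316-g4b9e78b #6\n"
)

for pid, comm, prior_oops in classify_oopses(SAMPLE):
    print(pid, comm, "follow-up oops" if prior_oops else "first oops")
```

Run against a saved netconsole log, this gives a quick way to find which oops to look at first; the later ones are often just fallout from the first.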