Hi Greg

> > Bad mode in data abort handler detected
> > Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP ARM
> > Modules linked in: firq(O) ipv6
> > CPU: 0 PID: 103 Comm: systemd-udevd Tainted: G O 3.14.0 #1
> > task: bf2b9300 ti: bf362000 task.ti: bf362000
> > PC is at 0xffff1240
> > LR is at handle_fasteoi_irq+0x9c/0x13c
> > pc : [<ffff1240>] lr : [<8005cda0>] psr: 600f01d1
> > sp : bf363e70 ip : 07a7e79d fp : 00000000
> > r10: 76f92008 r9 : 80590080 r8 : 76e8e4d0
> > r7 : f8200100 r6 : bf363fb0 r5 : bf008414 r4 : bf0083c0
> > r3 : 80230d04 r2 : 0000002f r1 : 00000000 r0 : bf0083c0
> > Flags: nZCv IRQs off FIQs off Mode FIQ_32 ISA ARM Segment user
>
> It looks like we are in FIQ mode and interrupts have been masked.

Indeed.

> > Control: 10c53c7d Table: 60004059 DAC: 00000015
> > Process systemd-udevd (pid: 103, stack limit = 0xbf362240)
> > Stack: (0xbf363e70 to 0xbf364000)
> > 3e60:                                     bf0083c0 00000000 0000002f 80230d04
> > 3e80: bf0083c0 bf008414 bf363fb0 f8200100 76e8e4d0 80590080 76f92008 00000000
> > 3ea0: 07a7e79d bf363e70 8005cda0 ffff1240 600f01d1 ffffffff 8005cd04 0000002f
> > 3ec0: 0000002f 800598bc 8058cc70 8000ed00 f820010c 8059684c bf363ef8 80008528
> > 3ee0: 80023730 80023744 200f0113 ffffffff bf363f2c 80012180 00000000 805baa00
> > 3f00: 00000000 00000100 00000002 00000022 00000000 bf362000 76e8e4d0 80590080
> > 3f20: 76f92008 00000000 0000000a bf363f40 80023730 80023744 200f0113 ffffffff
> > 3f40: bf007a14 8059ac00 00000000 0000000a ffff8dd7 00400140 bf0079c0 8058cc70
> > 3f60: 00000022 00000000 f8200100 76e8e4d0 76f9201c 76f92008 00000000 80023af0
> > 3f80: 8058cc70 8000ed04 f820010c 8059684c bf363fb0 80008528 00000000 76dd3b44
> > 3fa0: 600f0010 ffffffff 0000000c 8001233c 00000000 00000000 76f93428 76f93428
> > 3fc0: 76f93438 00000000 76f93448 0000000c 76e8e4d0 76f9201c 76f92008 00000000
> > 3fe0: 00000000 7ec115c0 76f60914 76dd3b44 600f0010 ffffffff 9fffd821 9fffdc21
> > [<8005cda0>] (handle_fasteoi_irq) from [<80230d04>] (gic_eoi_irq+0x0/0x4c)
>
> It certainly looks like we are going down the standard IRQ patch as you
> suggested. I'm not a Linux driver guy, but do you see any kind of activity
> (break points, printfs, ...) through your FIQ handler?

I am reaching 0xffff1224, which I believe is the FIQ vector address on the
vexpress.

> > [<80230d04>] (gic_eoi_irq) from [<f8200100>] (0xf8200100)
> > Code: ee02af10 f57ff06f e59d8000 e59d9004 (e599b00c)
> > ---[ end trace 3dc3571209a017e1 ]---
> > Kernel panic - not syncing: Fatal exception in interrupt
>
> It is hard to determine entirely what is happening here based on this
> info. I do have code of my own that routes KGDB interrupts as FIQs and
> with the workaround I see the FIQs handled as expected. Some things we can
> try to get more info in hopes of pinpointing where to look:
>
> 1. At the top of hw/intc/arm_gic.c there is the following commented out
> line:
> //#define DEBUG_GIC
> Uncomment the line, rebuild and rerun. This will give us some trace on
> what is going through the GIC code.

I have commented out some of the debug lines, but I see:

Breakpoint 1, gic_update_with_grouping (s=0x5555564dba80) at hw/intc/arm_gic.c:120
120 DPRINTF("Raised pending FIQ %d (cpu %d)\n", best_irq, cpu);

This is with the expected IRQ number 49 (32+17).

> 2. Run qemu with the "-d int" option which will print a message on each
> interrupt. We should see an FIQ at some point if they are occurring. The
> only issue is that there will be numerous IRQs, so you'll have to parse
> through them to find an "exception 6 [FIQ].

Here is the relevant output when the FIQ hits:

Taking exception 2 [SVC]
Taking exception 2 [SVC]
pml: pml_timer_tick: raise_irq
arm_gic: Raised pending FIQ 49 (cpu 0)
Taking exception 6 [FIQ]
pml: pml_write: update control flags: 1
pml: pml_update: start timer
pml: pml_update: lower irq
pml: pml_read: read magic
pml: pml_write: update control flags: 3
pml: pml_update: start timer
Taking exception 3 [Prefetch Abort]
...with IFSR 0x5 IFAR 0x80221d70
Taking exception 4 [Data Abort]
...with DFSR 0x805 DFAR 0x805c604c
Taking exception 4 [Data Abort]
...with DFSR 0x805 DFAR 0x805c604c
Taking exception 4 [Data Abort]

So the FIQ is hitting, but unfortunately I have no idea where the data
aborts are coming from.
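To narrow down the aborts, the fault status values in the "-d int" output can be decoded by hand. Below is a minimal sketch, assuming the guest uses the ARMv7 short-descriptor translation format (bit 9 of the DFSR is clear here, which is consistent with that). The table is only a partial list of FS encodings, and decode_fsr is a hypothetical helper name, not something from QEMU or the kernel.

```python
# Partial table of ARMv7 short-descriptor fault status (FS) encodings.
# FS is formed from FSR bit 10 (as bit 4) and FSR bits [3:0].
FS_NAMES = {
    0b00001: "alignment fault",
    0b00101: "translation fault, section (first level)",
    0b00111: "translation fault, page (second level)",
    0b00011: "access flag fault, section",
    0b00110: "access flag fault, page",
    0b01001: "domain fault, section",
    0b01011: "domain fault, page",
    0b01101: "permission fault, section",
    0b01111: "permission fault, page",
    0b01000: "synchronous external abort",
}


def decode_fsr(fsr, is_data=True):
    """Decode an IFSR/DFSR value, assuming the short-descriptor format."""
    fs = ((fsr >> 6) & 0x10) | (fsr & 0xF)
    info = {
        "fs": fs,
        "fault": FS_NAMES.get(fs, "not in this partial table"),
        "domain": (fsr >> 4) & 0xF,
    }
    if is_data:
        # WnR (bit 11) is only defined for the DFSR: set means a write.
        info["write"] = bool(fsr & (1 << 11))
    return info


# Values taken from the "-d int" output above.
for label, value, is_data in (("IFSR", 0x5, False), ("DFSR", 0x805, True)):
    print(label, hex(value), decode_fsr(value, is_data))
```

Under that assumption both aborts decode as first-level (section) translation faults, and the data abort is a write. That would mean the table walk found no valid first-level entry for 0x80221d70 and 0x805c604c at the time of the access, rather than a permission or domain problem.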
I have shifted all other IRQs besides 49 to Group 1, so that only IRQ 49 is
a FIQ. Could it be that I am seeing some kind of secure violation? The IFAR
address resolves to __idr_pre_get, which lives in lib/idr.c in the Linux
kernel and appears to implement integer ID management.

> 3. If you set a breakpoint in your driver, is it possible to see that
> FIQs are on from the kernel debugger. Clearly you have to try this from
> a path where interrupts are masked. I see the following on my system
> mentioned above:
> ...
> Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
> ...

So you mean debugging via the QEMU debug port? I have not enabled kgdb.
As stated above, I was not able to catch the FIQ there. But it might be
that I get

I have debugged QEMU to see if the IRQ is routed correctly. The deepest
call I could find is this:

bt
#0  tcg_handle_interrupt (cpu=0x555556450790, mask=16)
    at /home/sander/speedy/soc/qemu/translate-all.c:1503
#1  0x0000555555755323 in cpu_interrupt (cpu=0x555556450790, mask=16)
    at /home/sander/speedy/soc/qemu/include/qom/cpu.h:556
#2  0x00005555557561b7 in arm_cpu_set_irq (opaque=0x555556450790, irq=1, level=1)
    at /home/sander/speedy/soc/qemu/target-arm/cpu.c:261
#3  0x00005555558193ec in qemu_set_irq (irq=0x55555642c840, level=1)
    at hw/core/irq.c:43
#4  0x0000555555879073 in gic_update_with_grouping (s=0x5555564dba80)
    at hw/intc/arm_gic.c:132
#5  0x000055555587936d in gic_update (s=0x5555564dba80)
    at hw/intc/arm_gic.c:180
#6  0x00005555558798a7 in gic_set_irq (opaque=0x5555564dba80, irq=49, level=1)
    at hw/intc/arm_gic.c:264
#7  0x00005555558193ec in qemu_set_irq (irq=0x555556432b00, level=1)
    at hw/core/irq.c:43
#8  0x0000555555661d4d in a9mp_priv_set_irq (opaque=0x5555564d7260, irq=17, level=1)
    at /home/sander/speedy/soc/qemu/hw/cpu/a9mpcore.c:17
#9  0x00005555558193ec in qemu_set_irq (irq=0x5555564f3c00, level=1)
    at hw/core/irq.c:43
#10 0x00005555558f6fed in qemu_irq_raise (irq=0x5555564f3c00)
    at /home/sander/speedy/soc/qemu/include/hw/irq.h:16
#11 0x00005555558f7363 in pml_timer_tick (opaque=0x555556595020)
    at hw/timer/pml.c:95
#12 0x000055555599be6e in aio_bh_poll (ctx=0x5555563fdad0)
    at async.c:82
#13 0x00005555559b2d9f in aio_dispatch (ctx=0x5555563fdad0)
    at aio-posix.c:137
#14 0x000055555599c2cb in aio_ctx_dispatch (source=0x5555563fdad0, callback=0x0, user_data=0x0)
    at async.c:221
#15 0x00007ffff7901e04 in g_main_context_dispatch ()
    from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16 0x00005555559b0a79 in glib_pollfds_poll ()
    at main-loop.c:200
#17 0x00005555559b0b7a in os_host_main_loop_wait (timeout=0)
    at main-loop.c:245
#18 0x00005555559b0c52 in main_loop_wait (nonblocking=1)
    at main-loop.c:494
#19 0x0000555555791d8b in main_loop ()
    at vl.c:1872
#20 0x00005555557998d5 in main (argc=22, argv=0x7fffffffda38, envp=0x7fffffffdaf0)
    at vl.c:4348

I am not sure whether arm_cpu_set_irq(opaque=0x555556450790, irq=1, level=1)
represents a FIQ, or whether mask 16 is the correct mask for the FIQ
request. Frame #6 clearly shows that IRQ 49, which is configured as
Group 0, is triggered. All other interrupts are configured as Group 1 by
my Linux kernel. The call in frame #4, gic_update_with_grouping, shows
that grouping within the GIC is enabled and that the interrupt is raised
as a FIQ within QEMU.

All of this looks good as far as I understand, so I am pretty confident
that QEMU is working correctly (apart from the Prefetch and Data Aborts).

Best regards
Tim
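P.S. The numbers in frames #0, #2, #6 and #8 can be sanity-checked with a few lines. This is only a sketch: the constant values below are what I recall from the QEMU headers of this era (CPU_INTERRUPT_FIQ aliased to CPU_INTERRUPT_TGT_EXT_1, ARM_CPU_FIQ as input line 1), so treat them as assumptions and compare them against target-arm/cpu.h in your own tree. The two helper functions are hypothetical names made up for illustration.

```python
# Assumed constant values, recalled from QEMU headers of this era.
# Verify against your own tree before relying on them.
CPU_INTERRUPT_HARD = 0x0002        # ordinary IRQ request bit
CPU_INTERRUPT_TGT_EXT_1 = 0x0010   # first target-specific external bit
CPU_INTERRUPT_FIQ = CPU_INTERRUPT_TGT_EXT_1
ARM_CPU_IRQ = 0                    # CPU input line numbers
ARM_CPU_FIQ = 1

# GIC interrupt IDs 0-31 are per-CPU (SGIs and PPIs); SPIs start at 32.
GIC_INTERNAL = 32


def mask_for_line(irq):
    """Hypothetical helper: interrupt mask for a CPU input line."""
    if irq == ARM_CPU_IRQ:
        return CPU_INTERRUPT_HARD
    if irq == ARM_CPU_FIQ:
        return CPU_INTERRUPT_FIQ
    raise ValueError("unexpected CPU input line %d" % irq)


def spi_to_gic_id(spi):
    """Hypothetical helper: map a shared peripheral line to its GIC ID."""
    return spi + GIC_INTERNAL


print(mask_for_line(1))    # frame #2 irq=1 versus frame #0 mask
print(spi_to_gic_id(17))   # frame #8 irq=17 versus frame #6 irq
```

If those values hold for this tree, irq=1 with mask=16 is the FIQ line, and line 17 of the A9 MPCore maps to GIC ID 49, both of which match the backtrace.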