On 03/28/2014 02:09 PM, Christoffer Dall wrote:
> On Fri, Mar 28, 2014 at 04:26:59AM -0400, Michael Casadevall wrote:
>> As I've made a fair bit of headway since LinaroConnect, I wanted
>> to drop a line on my current progress with porting TianoCore to
>> KVM.
>>
>> Summary (tl;dr version):
>>
>> KVM can start TianoCore, and boot all the way to shell, and
>> access HDDs via VirtioBlk. We can start grub and successfully
>> retrieve files from ext partitions, load a device tree, and start
>> the kernel. The kernel runs through most of the EFI stub, but
>> falls over during ExitBootServices().
>
> Thanks for providing this status!
>
>> Long Version:
>>
>> So, after much blood, sweat and tears, we're finally at the point
>> of trying to actually start a kernel, though this (for the
>> moment) remains an elusive goal. The current problem is that once
>> we call EBS(), we get an exception from EFI with no Image
>> information, which means the exception handler doesn't know where
>> it came from. After several seconds, we get a second exception
>> from within DxeCore, and then EFI falls over.
>>
>> Debugging EFI is difficult and error prone, combined with
>> limited debug facilities from the gdb-stub in QEMU (no
>> breakpoints), and no decent way to load all of EFI itself (you
>> have to run add-symbol-file manually with the output of commands
>> printed on the console; supposedly it's possible to generate a
>> giant GdbSyms.dll file to import in a single go, but I haven't
>> succeeded at this). This is further complicated by the fact that
>> we appear to be asserting somewhere in a driver, and short of
>> adding printfs to *every* driver, it's impossible to know which
>> one is asserting.
>
> Maybe it's worth adding a hack-support-gdb-in-kvm implementation
> for this. If we go down this road, I can probably find time to
> help you out there.
>
> Can you do some scripting to replace assert statements with "{
> print("%s:%d\n", __FILE__, __LINE__); orig_assert(); }" type hack?
>

That's probably a decent idea if I can find where ASSERT() is defined.
I'll try that in a bit.

>> Previous attempts to debug asserts show that EFI does "odd"
>> things to the stack when we hit an exception, making walking it
>> with GDB impossible. I need to figure out what madness EFI does
>> with my SP so I can get the entire stack on an explosion, but
>> this remains at best hopeful thinking.
>
> This sounds very strange - could it be that because you take an
> exception, you use a SP from a different mode and everything just
> messes up?
>

This could be GDB just being unhappy. I've had issues walking the
stack in KVM in general, but even if I walk the stack by hand, I don't
see a pointer to the next frame when we're in an exception. To my
knowledge, UEFI uses the standard AArch64 C ABI, but this might be a
faulty assumption on my part.

>> Further complicating things is that during EBS, my print
>> debugging goes away. I might just cheat and roll a simple
>> assembly function to bang out messages through serial without
>> calling anything else. Ugly as sin, but this should let me get
>> useful debug output through the EBS framework. Complicating
>> matters is that I need to locate each and all EBS() event
>> functions, which are spread *everywhere* in TianoCore, and then
>> debug them each individually.
>
> I'm a little confused, not knowing UEFI: is EBS() not a single
> function, and what does it matter that it's called from multiple
> places?
>

So, drivers and applications can register to be notified when
ExitBootServices() is called. Registering pushes a pointer to a
function into an array, which is then iterated through, calling each
pointer in turn so drivers can unregister themselves from boot
services, etc.
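A minimal sketch of that mechanism in plain C, for illustration only.
This is not TianoCore code: every name below (register_exit_notify,
signal_exit_notify, example_driver_exit) is hypothetical, and real UEFI
notify functions take an event and a context rather than a context
alone. Logging the slot number before each call means the last line
printed identifies the handler that blew up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NOTIFY 16

/* Hypothetical, simplified callback type (not the UEFI signature). */
typedef void (*notify_fn)(void *context);

struct notify_entry {
    notify_fn fn;
    void *context;
};

static struct notify_entry notify_list[MAX_NOTIFY];
static size_t notify_count;

/* Register a callback to run when boot services are exited. */
void register_exit_notify(notify_fn fn, void *context)
{
    assert(notify_count < MAX_NOTIFY);
    notify_list[notify_count].fn = fn;
    notify_list[notify_count].context = context;
    notify_count++;
}

/* Call every registered callback in registration order, logging the
 * slot number first. Returns the number of callbacks that returned. */
size_t signal_exit_notify(FILE *log)
{
    size_t called = 0;

    for (size_t i = 0; i < notify_count; i++) {
        fprintf(log, "EBS notify %zu of %zu\n", i + 1, notify_count);
        fflush(log);
        notify_list[i].fn(notify_list[i].context);
        called++;
    }
    return called;
}

/* Example callback: counts how many times it was invoked. */
void example_driver_exit(void *context)
{
    int *hits = context;
    (*hits)++;
}
```

To try it on a host, register example_driver_exit a couple of times
with a pointer to an int counter and call signal_exit_notify(stderr).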
Complicating the issue is that I can't use printf once GetMemoryMap()
is called without breaking EBS() (I think this is a bug in UEFI; Leif,
2 cents?), but I think I can twiddle the serial port directly without
breaking shit. Having slept on it, it's probably easy to print out the
pointers as we go through them, so I can get an idea of what's
listening for EBS and try to narrow down my list of candidates.

>> I'm open to ideas on how best to accomplish this.
>>
>> On a larger scale, there are a couple of other bugs and odds and
>> ends which currently affect us:
>>
>> * wfi doesn't work
>>
>> This is probably the biggest w.r.t. functionality that should
>> work, but doesn't. The EFI event loop is built on checking the
>> timer, then calling wfi to check the timer later. The problem
>> here is we call wfi ... and UEFI never comes back despite events
>> firing (I can put print code in the interrupt handler to confirm
>> this). This may be related to the VGIC errors I get running kvm
>> under foundation, but I haven't taken the time to properly nail
>> down the bug here.
>
> So if I understand it, the expected sequence of events is:
>
> 1. check timer (arch timer counter?)
> 2. WFI
> 3. virtual arch timer interrupt, causes wake-up from WFI
> 4. go to 1
>
> But you seem to get stuck at (2)?
>

Exactly.

> When you say "print code in the interrupt handler" is that the
> UEFI interrupt handler? In that case, you do wake up from the
> WFI...?
>

I put a DEBUG print line in the Timer interrupt handler, which prints
out a message every tick letting me know the timer was working. When
we call wfi, the timer ticks still show up (and I can see them through
vgic with debugging enabled there).

> Do you see stuff happening in virt/kvm/arm/arch_timer.c:
> kvm_timer_inject_irq()?
>
> That should call kvm_vgic_inject_irq(), which should call
> vgic_kick_vcpus(kvm), which is what wakes you up from your WFI.
>

Hrm, I need some debug code in vgic_kick_vcpus. Thanks!
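For reference, the four-step sequence above can be sketched as
host-runnable C. This is an illustration, not the TianoCore event
loop: wait_until, fake_read_counter and fake_wait_for_interrupt are
hypothetical names, and the fake counter just advances one tick per
read. On real hardware the idle hook would execute wfi; if that never
returns, the loop hangs at step 2, which is the symptom I'm seeing.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*read_counter_fn)(void);
typedef void (*idle_fn)(void);

/* Host-side stand-in for the arch timer counter: one tick per read. */
static uint64_t fake_ticks;

uint64_t fake_read_counter(void)
{
    return fake_ticks++;
}

/* Host-side stand-in for wfi: returns immediately, as if an
 * interrupt had just woken the CPU. */
void fake_wait_for_interrupt(void)
{
}

/* Loop until the counter reaches the deadline. Passing NULL for idle
 * turns the loop into a busy poll. Returns the number of iterations. */
unsigned wait_until(uint64_t deadline, read_counter_fn read_counter,
                    idle_fn idle)
{
    unsigned iterations = 0;

    while (read_counter() < deadline) {   /* step 1: check timer */
        if (idle != NULL)
            idle();                       /* step 2: WFI; step 3 wakes us */
        iterations++;                     /* step 4: go to 1 */
    }
    return iterations;
}
```

Passing NULL as the idle hook gives the busy-poll variant, which never
depends on an interrupt arriving.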
>>
>> This was worked around by commenting out the wfi, turning the
>> event loop into a busy loop, but this has to be resolved before
>> we can ever consider merging it.
>>
>> * No RTC
>>
>> I looked through virt.c in KVM, and as best I can tell, I've got
>> no RTC at all (no PL031). It also appears that the kernel can't
>> get RTC, as running a kernel gets me a 1970 clock. I'm not sure if
>> this is by design or not, but it causes GetTime() to return
>> EFI_ERROR, and I suspect it may be behind one of the exceptions
>> I'm getting (Shell prints a ton of warnings that GetTime is
>> busted).
>
> The only thing you can use to tell passing of time in mach-virt is
> the arch-timer counter and use a fixed starting point.
>

The problem here is that the spec says THOU SHALL HAVE RTC. We could
fake it by counting up from system start and using the UEFI build time
as a starting point, but this is not what the spec writers had in mind
(nothing says GetTime() has to be accurate :-)). For KVM, I'm
wondering if we should just stick a PL031 on the bus and be done with
it. For Xen, we're going to need a way to do this via xenbus.

> -Christoffer
>

_______________________________________________
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev