-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 03/28/2014 02:09 PM, Christoffer Dall wrote:
> On Fri, Mar 28, 2014 at 04:26:59AM -0400, Michael Casadevall
> wrote:
>> 
>> As I've made a fair bit of headway since LinaroConnect, I wanted
>> to drop a line on my current progress with porting TianoCore to
>> KVM
>> 
>> Summary (tl;dr version):
>> 
>> KVM can start TianoCore, and boot all the way to shell, and
>> access HDDs via VirtioBlk. We can start grub and successfully
>> retrieve files from ext partitions, load a device tree, and start
>> the kernel. The kernel runs through most of the EFI stub, but
>> falls over during ExitBootServices()
> 
> Thanks for providing this status!
> 
>> 
>> Long Version:
>> 
>> So, after much blood, sweat and tears, we're finally at the point
>> of trying to actually start a kernel, though this (for the
>> moment) remains an elusive goal. The current problem is that once
>> we call EBS(), we get an exception from EFI with no Image
>> information, which means the exception handler doesn't know where
>> it came from. After several seconds, we get a second exception
>> from within DxeCore, and then EFI falls over.
>> 
>> Debugging EFI is difficult and error prone, combined with
>> limited debug facilities from the gdb-stub in QEMU (no
>> breakpoints), and no decent way to load all of EFI itself (you
>> have to run add-symbol-file manually with the output of commands
>> printed on the console; supposedly it's possible to generate a
>> giant GdbSyms.dll file to import in a single go, but I haven't
>> succeeded at this). This is further complicated by the fact that
>> we appear to be asserting somewhere in a driver, and short of
>> adding printfs to *every* driver, it's impossible to know which
>> is asserting.
> 
> Maybe it's worth adding a hack-support-gdb-in-kvm implementation
> for this.  If we go down this road, I can probably find time to
> help you out there.
> 
> Can you do some scripting to replace assert statements with "{ 
> print("%s:%d\n", __FILE__, __LINE__); orig_assert(); }" type hack?
> 

That's probably a decent idea if I can find where ASSERT() is defined.
I'll try that in a bit.
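
For reference, a minimal standalone sketch of that kind of hack. The
names TRACED_ASSERT, format_location and orig_assert are hypothetical
stand-ins, not TianoCore's actual macros, and fprintf stands in for
whatever debug output is available:

```c
#include <stdio.h>

/* Hypothetical stand-in for whatever the original assert handler does. */
static int orig_assert_calls;
static void orig_assert(void) { orig_assert_calls++; }

/* Format "file:line" into buf so the location can be sent to any sink
 * (console, serial, log buffer). Returns the number of chars written. */
static int format_location(char *buf, size_t len, const char *file, int line)
{
    return snprintf(buf, len, "%s:%d", file, line);
}

/* Report where the assert fired, then fall through to the original. */
#define TRACED_ASSERT(expr)                                         \
    do {                                                            \
        if (!(expr)) {                                              \
            char loc_[256];                                         \
            format_location(loc_, sizeof loc_, __FILE__, __LINE__); \
            fprintf(stderr, "ASSERT %s: %s\n", loc_, #expr);        \
            orig_assert();                                          \
        }                                                           \
    } while (0)
```

Because __FILE__ and __LINE__ expand at the call site, every driver
that uses the macro identifies itself without per-driver printfs.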

>> 
>> Previous attempts to debug asserts show that EFI does "odd"
>> things to the stack when we hit an exception, making walking it
>> with GDB impossible. I need to figure out what madness EFI does
>> with my SP so I can get the entire stack on an explosion, but
>> this remains at best hopeful thinking.
> 
> This sounds very strange - could it be that because you take an 
> exception, you use a SP from a different mode and everything just
> messes up?
> 

This could be GDB just being unhappy. I've had issues walking the
stack in KVM in general, but even if I walk the stack by hand, I don't
see a pointer to the next frame when we're in an exception. To my
knowledge, UEFI uses the standard AArch64 C ABI, but this might be a
faulty assumption on my part.

>> 
>> Further complicating things is that during EBS, my print
>> debugging goes away. I might just cheat and roll a simple
>> assembly function to bang out messages through serial without
>> calling anything else. Ugly as sin, but this should let me get
>> useful debug output through the EBS framework. Complicating
>> matters is that I need to locate each and all EBS() event
>> functions, which are spread *everywhere* in TianoCore, and then
>> debug them each individually.
> 
> I'm a little confused, not knowing UEFI: is EBS() not a single
> function and what does it matter that it's called from multiple
> places?
> 

So, drivers and applications can enlist to get notified when
ExitBootServices() is called. This pushes a pointer to a function into
an array, which is then iterated through, and each pointer is called
in turn so drivers can unregister themselves from boot services, etc.

Complicating the issue is that I can't use printf once GetMemoryMap()
is called without breaking EBS() (I think this is a bug in UEFI; Leif,
2 cents?), but I think I can twiddle the serial port directly without
breaking shit.
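
A rough sketch of what I mean by twiddling the port directly, in C
rather than assembly. This assumes the serial port is a PL011-style
UART (data register at offset 0x00, flag register at 0x18, TXFF in
bit 5); the base address is passed in rather than hard-coded, since
I'm not assuming any particular memory map here. raw_putc and raw_puts
are made-up names:

```c
#include <stdint.h>

/* PL011 register offsets, expressed as indices into 32-bit words. */
#define UARTDR      (0x00 / 4)   /* data register */
#define UARTFR      (0x18 / 4)   /* flag register */
#define UARTFR_TXFF (1u << 5)    /* transmit FIFO full */

/* Write one character by polling the FIFO; calls nothing else. */
static void raw_putc(volatile uint32_t *uart, char c)
{
    while (uart[UARTFR] & UARTFR_TXFF)
        ;   /* spin until there is room in the transmit FIFO */
    uart[UARTDR] = (uint32_t)(unsigned char)c;
}

/* Write a NUL-terminated string one character at a time. */
static void raw_puts(volatile uint32_t *uart, const char *s)
{
    while (*s)
        raw_putc(uart, *s++);
}
```

Nothing here allocates or goes through boot services, which is the
whole point: it should keep working after GetMemoryMap().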

Having slept on it, it's probably easy to print out the pointers as we
go through them, so I can get an idea of what's listening for EBS and
try to narrow down my list of candidates.
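
Roughly, the mechanism and the pointer logging look like this. This is
a standalone model, not TianoCore code: ebs_register, ebs_signal_all
and sample_notify are hypothetical names, the fixed-size table is an
assumption, and fprintf stands in for whatever output still works at
that point:

```c
#include <stdint.h>
#include <stdio.h>

typedef void (*ebs_notify_fn)(void *context);

#define MAX_EBS_NOTIFY 32

static ebs_notify_fn ebs_table[MAX_EBS_NOTIFY];
static void *ebs_context[MAX_EBS_NOTIFY];
static int ebs_count;

/* A driver "enlists" by pushing its callback into the table.
 * Returns the slot index, or -1 if the table is full. */
static int ebs_register(ebs_notify_fn fn, void *context)
{
    if (ebs_count >= MAX_EBS_NOTIFY)
        return -1;
    ebs_table[ebs_count] = fn;
    ebs_context[ebs_count] = context;
    return ebs_count++;
}

/* Walk the table, logging each pointer *before* calling it, so the
 * last line printed identifies the callback that blew up.
 * Returns the number of callbacks invoked. */
static int ebs_signal_all(void)
{
    for (int i = 0; i < ebs_count; i++) {
        fprintf(stderr, "EBS notify[%d] = %#llx\n", i,
                (unsigned long long)(uintptr_t)ebs_table[i]);
        ebs_table[i](ebs_context[i]);
    }
    return ebs_count;
}

/* Example callback: counts how many times it was notified. */
static void sample_notify(void *context)
{
    (*(int *)context)++;
}
```

The printed addresses can then be matched against the image load
addresses printed on the console to find the owning driver.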

>> 
>> I'm open to ideas on how best to accomplish this.
>> 
>> On a larger scale, there are a couple of other bugs and odds and
>> ends which currently affect us:
>> 
>> * wfi doesn't work
>> 
>> This is probably the biggest w.r.t. functionality that should
>> work, but doesn't. The EFI event loop is built on checking the
>> timer, then calling wfi to check the timer later. The problem
>> here is we call wfi ... and UEFI never comes back despite events
>> firing (I can put print code in the interrupt handler to confirm
>> this). This may be related to the VGIC errors I get running kvm
>> under foundation, but haven't taken the time to properly nail
>> down the bug here.
> 
> So if I understand it, the expected sequence of events are:
> 
> 1. check timer (arch timer counter?)
> 2. WFI
> 3. virtual arch timer interrupt, causes wake-up from WFI
> 4. go to 1
> 
> But you seem to get stuck at (2)?
> 

Exactly.
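
In code, that sequence boils down to something like the following.
This is a sketch, not the actual TianoCore event loop: the struct,
wait_until, and the fake_* host-side stand-ins are all hypothetical,
and the real loop would execute WFI where the callback is invoked:

```c
#include <stdint.h>

struct event_loop_ops {
    uint64_t (*read_counter)(void *ctx);       /* e.g. arch timer counter */
    void     (*wait_for_interrupt)(void *ctx); /* WFI on real hardware */
};

/* Sleep until the counter reaches deadline.
 * Returns the number of WFIs taken along the way. */
static unsigned wait_until(const struct event_loop_ops *ops, void *ctx,
                           uint64_t deadline)
{
    unsigned sleeps = 0;
    while (ops->read_counter(ctx) < deadline) {  /* 1. check timer */
        ops->wait_for_interrupt(ctx);            /* 2. WFI, 3. irq wakes us */
        sleeps++;                                /* 4. go to 1 */
    }
    return sleeps;
}

/* Host-side stand-ins: each "interrupt" advances a fake counter. */
struct fake_timer { uint64_t now; };

static uint64_t fake_read(void *ctx)
{
    return ((struct fake_timer *)ctx)->now;
}

static void fake_wfi(void *ctx)
{
    ((struct fake_timer *)ctx)->now += 10;
}
```

The failure I'm seeing corresponds to wait_for_interrupt() never
returning, so the loop never gets back to step 1.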

> When you say "print code in the interrupt handler" is that the
> UEFI interrupt handler?  In that case, you do wake up from the
> WFI...?
> 

I put a DEBUG print line in the Timer interrupt handler, which prints
out a message every tick letting me know the timer was working. When
we call wfi, the timer ticks still show up (and I can see them through
vgic with debugging enabled there).

> Do you see stuff happening in virt/kvm/arm/arch_timer.c: 
> kvm_timer_inject_irq()?
> 
> That should call kvm_vgic_inject_irq(), which should 
> vgic_kick_vcpus(kvm), which is what wakes you up from your WFI.
> 

Hrm, I need some debug code in vgic_kick_vcpus. Thanks!

>> 
>> This was worked around by commenting out the wfi, turning the
>> event loop into a busy loop, but this has to be resolved before
>> we can ever consider merging it.
>> 
>> * No RTC
>> 
>> I looked through virt.c in QEMU, and as best I can tell, I've got
>> no RTC at all (no PL031). It also appears that the kernel can't
>> get RTC as running a kernel gets me a 1970 clock. I'm not sure if
>> this is by design or not, but it causes GetTime() to return
>> EFI_ERROR, and I suspect may be one of the exceptions I'm getting
>> avoid (Shell prints a ton of warnings that GetTime is busted).
> 
> The only thing you can use to tell passing of time in mach-virt is
> the arch-timer counter and use a fixed starting point.
> 

The problem here is the spec says THOU SHALT HAVE RTC. We could fake it
by counting up from system start and using the UEFI build time as a
starting point, but this is not what the spec writers had in mind
(nothing says GetTime() has to be accurate :-)).
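
The faking itself would be trivial. A sketch, assuming the build time
gets stamped into the image as seconds since the Unix epoch and that
the counter value and frequency come from the arch timer;
fake_rtc_seconds and its parameters are made-up names:

```c
#include <stdint.h>

/* Pretend wall-clock time: build timestamp plus time since boot.
 *   build_epoch       seconds since the Unix epoch at build time
 *   boot_ticks        counter value sampled at firmware start
 *   now_ticks         current counter value
 *   ticks_per_second  counter frequency
 * Falls back to the build time if the frequency is unknown (0). */
static uint64_t fake_rtc_seconds(uint64_t build_epoch, uint64_t boot_ticks,
                                 uint64_t now_ticks, uint64_t ticks_per_second)
{
    if (ticks_per_second == 0)
        return build_epoch;
    return build_epoch + (now_ticks - boot_ticks) / ticks_per_second;
}
```

The result is monotonic and plausible enough to keep GetTime() from
returning an error, but it is wrong by however long ago the firmware
was built.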

For KVM, I'm wondering if we should just stick a PL031 on the bus and
be done with it. For Xen, we're going to need a way to do this via xenbus.


> -Christoffer
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTNc+0AAoJEGKVRgSEnX1Q6/8H/0OJsjz6ovhJcQsa9AMPD69Q
0/qBt5/Xx6KV0aHBMuQzPGHtqsF8qXAK1OzWsJtm8V9PcNeDlhZX1G3cGWcsky6G
bVWrtHvoNvnFzEfQt5xf77u6dbD0xckDnRsT6gCV8RlZq0nIurRzTUyvNRem9rUb
SfrpVGl3mxpcrBSQOQIpwUjYE+1+hM0x5xmjG6z31D28kiPv7KXszTTVwa+9R/HD
0Z5OYEGFiutU86LFotjUa99NnyLysHZiCgMpdkVmcbzwDP9nMJy36ERnaPRSV90E
VE2a5G/ha44UKKJFIaxXAG8TFPWitljRl7tupDFk9ae8NShhxlkpLl+okMTlBnU=
=vQzG
-----END PGP SIGNATURE-----
