On 13 January 2012 16:21, Alexander Graf <ag...@suse.de> wrote: > On 13.01.2012, at 17:16, Peter Maydell wrote: >> On 13 January 2012 15:46, Alexander Graf <ag...@suse.de> wrote: >>> This patch forces unlocking of that lock in the segv handler. I'm not sure >>> this is the right approach though. Maybe we should rather make sure we don't >>> segfault in the code? I would greatly appreciate someone more intelligible >>> than me to look at this :). >> >> A segfault while we're walking the TB chains in QEMU C code? >> That's just a bug (and we know we have one there) -- we should >> fix it rather than papering over it like this. > > Well, we're segfaulting in this exact special case which calls setrlimit() > before an mmap which fails:
So what's actually happening here is not a problem with the failing mmap requested by the guest, but with one we do ourselves slightly later. The guest attempts to do a call to some code that hasn't been translated yet (as it happens it's an attempt to call the tls function in the ARM commpage at 0xffff0fe0, but that's not particularly important). tb_gen_code() generates the code, and then calls tb_link_page() to add it to our data structures. tb_link_page calls tb_alloc_page() which calls page_find_alloc() which then needs to allocate a level 2 page table entry in the l1_map[]. Unfortunately (a) the mmap() it uses to do this fails because of the rlimit (b) we don't check the mmap() return value, so we proceed on with the bogus -1 and quickly segfault. So "unlock and carry on regardless" is definitely wrong. We could do the same as the system-mode (which is using g_malloc0) and abort when the mmap() fails, although that won't help your test program much I suspect. For a proper fix we probably need to handle set/getrlimit for RLIMIT_AS specially so we can apply them ourselves to the guest's mmap/brk usage and don't get spurious allocation failures of our private data structures... -- PMM