On Mon, Oct 9, 2017 at 9:16 PM, Thomas Meyer <tho...@m3y3r.de> wrote:
>> > Hi,
>> >
>> > are you able to shed light on this topic?
>> > Any help is greatly appreciated!
>> >
>> > With kind regards
>> > thomas
>> >
>> > Date: Sun, 8 Oct 2017 13:18:24 +0200
>> > From: Thomas Meyer <tho...@m3y3r.de>
>> > To: Richard Weinberger <rich...@nod.at>
>> > Cc: user-mode-linux-de...@lists.sourceforge.net,
>> >     linux-kernel@vger.kernel.org
>> > Subject: Re: [PATCH] um: Fix kcov crash before kernel is started.
>> > User-Agent: NeoMutt/20170113 (1.7.2)
>> >
>> > On Sun, Oct 08, 2017 at 12:44:12PM +0200, Richard Weinberger wrote:
>> >> On Sunday, 8 October 2017, 12:31:58 CEST, Thomas Meyer wrote:
>> >> > UML's current_thread_info() unconditionally assumes that the top of
>> >> > the stack contains the thread_info structure. But on UML the
>> >> > __sanitizer_cov_trace_pc function is called for *all* functions!
>> >> > This results in an early crash:
>> >> >
>> >> > Prevent kcov from using invalid current_thread_info() data by
>> >> > checking the system_state.
>> >> >
>> >> > Signed-off-by: Thomas Meyer <tho...@m3y3r.de>
>> >> > ---
>> >> >  kernel/kcov.c | 6 ++++++
>> >> >  1 file changed, 6 insertions(+)
>> >> >
>> >> > diff --git a/kernel/kcov.c b/kernel/kcov.c
>> >> > index 3f693a0f6f3e..d601c0e956f6 100644
>> >> > --- a/kernel/kcov.c
>> >> > +++ b/kernel/kcov.c
>> >> > @@ -56,6 +56,12 @@ void notrace __sanitizer_cov_trace_pc(void)
>> >> >      struct task_struct *t;
>> >> >      enum kcov_mode mode;
>> >> >
>> >> > +#ifdef CONFIG_UML
>> >> > +    if(!(system_state == SYSTEM_SCHEDULING ||
>> >> > +         system_state == SYSTEM_RUNNING))
>> >> > +        return;
>> >> > +#endif
>> >>
>> >> Hmm, and why does it work on all other archs then?
>> >
>> > Hi,
>> >
>> > I guess UML is different than other archs! But to be honest I'm not sure
>> > why. I assume that __sanitizer_cov_trace_pc on other archs isn't called
>> > that early, or that current_thread_info returns NULL on other archs when
>> > the first task isn't running yet.
>> >
>> > But as I failed to set up the qemu gdb attachment to debug early x86_64
>> > code, I can't say exactly why.
>> >
>> > Maybe someone who knows the inner workings of x86_64 and/or kcov can
>> > answer this question!
>>
>>
>> Hi,
>
> Hi,
>
>> Yes, kcov can have some issues with early bootstrap code, because it
>> accesses current and it can also conflict with, say, per-cpu setup code
>> (at least it was the case for x86). For x86 and arm64 we just bulk
>> blacklist instrumentation of arch code involved in early bootstrap.
>> See e.g. KCOV_INSTRUMENT in arch/x86/boot/Makefile. I think you need
>> to do the same for um. Start with bulk ignoring as much as possible
>> until you get it booting and then bisect back from there.
>
> oh, arch/um/* already contains the Makefile exception settings!
> I guess CONFIG_KCOV_INSTRUMENT_ALL overrides the Makefile settings?
> Or doesn't it? I looked at scripts/Makefile.lib but failed to understand
> which config option takes precedence in that case.

Then, I guess, boot code calls into some common instrumented code,
which gets into kcov and crashes. This check helps, right?

+#ifdef CONFIG_UML
+    if(!(system_state == SYSTEM_SCHEDULING ||
+         system_state == SYSTEM_RUNNING))
+        return;
+#endif

Which means we somehow get here during boot. Is it possible to get a
stack trace for the return statement?

There is no common recipe. I think x86/arm64 are somewhat fragile in
this aspect as well, but somehow work. First of all we need to
understand how we get into the instrumentation callback during boot.
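
To illustrate what I mean by finding out how we get into the callback, here is a rough user-space model of the guard above, not kernel code. Every name in it (model_trace_pc, model_state, first_early_caller and so on) is made up for this sketch. The only real thing it relies on is the GCC/Clang builtin __builtin_return_address(0), which is the same mechanism the kernel's _RET_IP_ uses. The idea is to remember the first caller that hits the early-return path, so that address can later be resolved to a function name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's system_state values. */
enum model_system_state { MODEL_BOOTING, MODEL_SCHEDULING, MODEL_RUNNING };

static enum model_system_state model_state = MODEL_BOOTING;
static void *first_early_caller;  /* first caller rejected by the guard */
static unsigned long early_hits;  /* calls that took the early return */
static unsigned long traced_hits; /* calls that passed the guard */

/* Advance the modelled boot state; states only move forward. */
void model_set_state(enum model_system_state next)
{
	assert(next >= model_state);
	model_state = next;
}

/* Stand-in for __sanitizer_cov_trace_pc() with the proposed guard. */
__attribute__((noinline)) void model_trace_pc(void)
{
	if (!(model_state == MODEL_SCHEDULING ||
	      model_state == MODEL_RUNNING)) {
		/* Record who called us too early, once. */
		if (!first_early_caller)
			first_early_caller = __builtin_return_address(0);
		early_hits++;
		return;
	}
	traced_hits++;
}

/* Stand-in for any instrumented function reached during boot. */
__attribute__((noinline)) void model_instrumented_fn(void)
{
	model_trace_pc();
}
```

In this model, calling model_instrumented_fn() before model_set_state(MODEL_SCHEDULING) bumps early_hits and stores the caller address; after the state change the same call bumps traced_hits instead. Whether recording the address this way is safe that early in a real UML boot is an assumption I have not checked.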