On Wed, Dec 05, 2018 at 10:03:38PM +0100, Anton Lindqvist wrote:
> On Sat, Dec 01, 2018 at 04:34:57PM +0100, Anton Lindqvist wrote:
> > On Tue, Nov 27, 2018 at 05:52:15PM -0800, Greg Steuck wrote:
> > > I booted the patched kernel and it seems to have gone farther and I 
> > > believe
> > > reached init before crashing.
> > 
> > By performing a semi-automated bisect I was able to identify the source
> > files that are incompatible with tracing. Common for all source files
> > seems to be that they define routines called early on in the boot
> > process before curcpu() is usable.
> > 
> > I do not have any plans on committing the diff below but please give it
> > a try. Instead, I'm working on extending the files.conf(5) grammar in
> > order to infer a different make target for the files.
> > 
> > Index: arch/amd64/conf/Makefile.amd64
> > ===================================================================
> > RCS file: /cvs/src/sys/arch/amd64/conf/Makefile.amd64,v
> > retrieving revision 1.106
> > diff -u -p -r1.106 Makefile.amd64
> > --- arch/amd64/conf/Makefile.amd64  30 Oct 2018 11:08:30 -0000      1.106
> > +++ arch/amd64/conf/Makefile.amd64  1 Dec 2018 15:32:58 -0000
> > @@ -151,7 +151,31 @@ vers.o: ${SYSTEM_DEP:Ngap.o}
> >     ${CC} ${CFLAGS} ${CPPFLAGS} ${PROF} -c vers.c
> >  
> >  .if ${SYSTEM_OBJ:Mkcov.o} && ${COMPILER_VERSION:Mclang}
> > +amd64_mem.o: $S/arch/amd64/amd64/amd64_mem.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +cpu.o: $S/arch/amd64/amd64/cpu.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +fpu.o: $S/arch/amd64/amd64/fpu.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +gdt.o: $S/arch/amd64/amd64/gdt.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +intr.o: $S/arch/amd64/amd64/intr.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +lapic.o: $S/arch/amd64/amd64/lapic.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +machdep.o: $S/arch/amd64/amd64/machdep.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +tsc.o: $S/arch/amd64/amd64/tsc.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> >  kcov.o: $S/dev/kcov.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +pvbus.o: $S/dev/pv/pvbus.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +kern_lock.o: $S/kern/kern_lock.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +kern_sched.o: $S/kern/kern_sched.c
> > +   ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> > +kern_tc.o: $S/kern/kern_tc.c
> >     ${NORMAL_C} -fno-sanitize-coverage=trace-pc
> >  .endif
> >  
> 
> Here's a new diff taking a different approach. Keeping tracing off until
> all secondary CPUs have booted solves the issue of accessing curcpu()
> too early. Another issue was then discovered, curproc can be NULL before
> the idle thread tied the current CPU has started. Currently running with
> this diff applied on my laptop (MP) and positive results from Greg. The
> diff will be further exercised in the actual syzkaller setup before
> committing.

Yet another iteration and hopefully the last one. Idea from visa@ to
delay tracing until /dev/kcov has been successfully opened at least
once. At this point, accessing curcpu() is safe. Sort of a hack but it
removes the need to hook into init_main.c which is favorable.

Comments? OK?

Index: dev/kcov.c
===================================================================
RCS file: /cvs/src/sys/dev/kcov.c,v
retrieving revision 1.4
diff -u -p -r1.4 kcov.c
--- dev/kcov.c  27 Aug 2018 15:57:39 -0000      1.4
+++ dev/kcov.c  8 Dec 2018 16:40:41 -0000
@@ -58,6 +58,8 @@ static inline int inintr(void);
 
 TAILQ_HEAD(, kcov_dev) kd_list = TAILQ_HEAD_INITIALIZER(kd_list);
 
+int kcov_cold = 1;
+
 #ifdef KCOV_DEBUG
 int kcov_debug = 1;
 #endif
@@ -76,12 +78,15 @@ int kcov_debug = 1;
 void
 __sanitizer_cov_trace_pc(void)
 {
-       extern int cold;
        struct kcov_dev *kd;
        uint64_t idx;
 
-       /* Do not trace during boot. */
-       if (cold)
+       /*
+        * Do not trace before kcovopen() has been called at least once.
+        * At this point, all secondary CPUs have booted and accessing curcpu()
+        * is safe.
+        */
+       if (kcov_cold)
                return;
 
        /* Do not trace in interrupts to prevent noisy coverage. */
@@ -111,6 +116,9 @@ kcovopen(dev_t dev, int flag, int mode, 
 
        if (kd_lookup(minor(dev)) != NULL)
                return (EBUSY);
+
+       if (kcov_cold)
+               kcov_cold = 0;
 
        DPRINTF("%s: unit=%d\n", __func__, minor(dev));
 

Reply via email to