On Wed, Dec 05, 2018 at 10:03:38PM +0100, Anton Lindqvist wrote: > On Sat, Dec 01, 2018 at 04:34:57PM +0100, Anton Lindqvist wrote: > > On Tue, Nov 27, 2018 at 05:52:15PM -0800, Greg Steuck wrote: > > > I booted the patched kernel and it seems to have gone farther and I > > > believe > > > reached init before crashing. > > > > By performing a semi-automated bisect I was able to identify the source > > files that are incompatible with tracing. Common for all source files > > seems to be that they define routines called early on in the boot > > process before curcpu() is usable. > > > > I do not have any plans on committing the diff below but please give it > > a try. Instead, I'm working on extending the files.conf(5) grammar in > > order to infer a different make target for the files. > > > > Index: arch/amd64/conf/Makefile.amd64 > > =================================================================== > > RCS file: /cvs/src/sys/arch/amd64/conf/Makefile.amd64,v > > retrieving revision 1.106 > > diff -u -p -r1.106 Makefile.amd64 > > --- arch/amd64/conf/Makefile.amd64 30 Oct 2018 11:08:30 -0000 1.106 > > +++ arch/amd64/conf/Makefile.amd64 1 Dec 2018 15:32:58 -0000 > > @@ -151,7 +151,31 @@ vers.o: ${SYSTEM_DEP:Ngap.o} > > ${CC} ${CFLAGS} ${CPPFLAGS} ${PROF} -c vers.c > > > > .if ${SYSTEM_OBJ:Mkcov.o} && ${COMPILER_VERSION:Mclang} > > +amd64_mem.o: $S/arch/amd64/amd64/amd64_mem.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +cpu.o: $S/arch/amd64/amd64/cpu.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +fpu.o: $S/arch/amd64/amd64/fpu.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +gdt.o: $S/arch/amd64/amd64/gdt.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +intr.o: $S/arch/amd64/amd64/intr.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +lapic.o: $S/arch/amd64/amd64/lapic.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +machdep.o: $S/arch/amd64/amd64/machdep.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +tsc.o: $S/arch/amd64/amd64/tsc.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > kcov.o: $S/dev/kcov.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +pvbus.o: $S/dev/pv/pvbus.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +kern_lock.o: $S/kern/kern_lock.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +kern_sched.o: $S/kern/kern_sched.c > > + ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > +kern_tc.o: $S/kern/kern_tc.c > > ${NORMAL_C} -fno-sanitize-coverage=trace-pc > > .endif > > > > Here's a new diff taking a different approach. Keeping tracing off until > all secondary CPUs have booted solves the issue of accessing curcpu() > too early. Another issue was then discovered, curproc can be NULL before > the idle thread tied the current CPU has started. Currently running with > this diff applied on my laptop (MP) and positive results from Greg. The > diff will be further exercised in the actual syzkaller setup before > committing.
Yet another iteration and hopefully the last one. Idea from visa@ to delay tracing until /dev/kcov has been successfully opened at least once. At this point, accessing curcpu() is safe. Sort of a hack but it removes the need to hook into init_main.c which is favorable. Comments? OK? Index: dev/kcov.c =================================================================== RCS file: /cvs/src/sys/dev/kcov.c,v retrieving revision 1.4 diff -u -p -r1.4 kcov.c --- dev/kcov.c 27 Aug 2018 15:57:39 -0000 1.4 +++ dev/kcov.c 8 Dec 2018 16:40:41 -0000 @@ -58,6 +58,8 @@ static inline int inintr(void); TAILQ_HEAD(, kcov_dev) kd_list = TAILQ_HEAD_INITIALIZER(kd_list); +int kcov_cold = 1; + #ifdef KCOV_DEBUG int kcov_debug = 1; #endif @@ -76,12 +78,15 @@ int kcov_debug = 1; void __sanitizer_cov_trace_pc(void) { - extern int cold; struct kcov_dev *kd; uint64_t idx; - /* Do not trace during boot. */ - if (cold) + /* + * Do not trace before kcovopen() has been called at least once. + * At this point, all secondary CPUs have booted and accessing curcpu() + * is safe. + */ + if (kcov_cold) return; /* Do not trace in interrupts to prevent noisy coverage. */ @@ -111,6 +116,9 @@ kcovopen(dev_t dev, int flag, int mode, if (kd_lookup(minor(dev)) != NULL) return (EBUSY); + + if (kcov_cold) + kcov_cold = 0; DPRINTF("%s: unit=%d\n", __func__, minor(dev));