On Sat, Jul 15, 2017 at 9:21 AM, 吴潍浠(此彼) <weixi....@antfin.com> wrote:
> Hi
>
> Implementing __sanitizer_cov_trace_cmp[1248]_const is OK.
> And I will try to find some reliable way to judge whether a comparison is
> a loop check or not, because all the loops (for() or while()) seem to be
> transformed into "if" and "goto" before the sancov pass runs.
> Is it necessary to include APIs for float and double comparison, like
> __sanitizer_cov_trace_cmpf(float, float)? Why do you not include them?
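For reference, this is roughly what clang-style trace-cmp instrumentation boils down to at each comparison site (a hand-written C sketch, not compiler output; the function, the constants, and the callback bodies are made up for illustration, and the _const variant is only the proposal discussed in this thread, not an already-shipped interface):

  #include <stdint.h>

  /* Callbacks a fuzzing runtime would provide for 4-byte comparisons. */
  void __sanitizer_cov_trace_cmp4(uint32_t arg1, uint32_t arg2);
  void __sanitizer_cov_trace_cmp4_const(uint32_t arg1, uint32_t arg2);

  uint32_t magic;  /* set somewhere else at run time */

  int check(uint32_t x) {
      /* Conceptually, the compiler inserts a call before each comparison:
         __sanitizer_cov_trace_cmp4_const(x, 0xdeadbeef);   one side constant */
      if (x == 0xdeadbeef)
          return 1;
      /* __sanitizer_cov_trace_cmp4(x, magic);               neither side constant */
      if (x == magic)
          return 2;
      return 0;
  }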
Note you don't need to implement any of this right now, since we can make it
[almost] backwards compatible. Unless, of course, it's simple and you find it
useful and want to do this. I think it's better to get a first version of the
change in as is (implement the current clang API), and then do improvements
in subsequent patches.

I would expect that detecting consts should be simple. Re loops, I don't
know; I would expect that gcc has some existing analysis for loops (it does
lots of optimization with counting loops). Re floats/doubles, I am not sure;
we managed to live without them in go-fuzz and then in libFuzzer, and I still
don't see much value in them. Anyway, that should be a separate patch.

> Now I am following some suggestions about my code from Gribov. The final
> patch is still in progress.
> I am also waiting for the FSF copyright assignment which was requested by
> Jeff.
>
> With Regards
> Wish Wu
>
> ------------------------------------------------------------------
> From: Dmitry Vyukov <dvyu...@google.com>
> Time: 2017 Jul 15 (Sat) 13:41
> To: Kostya Serebryany <k...@google.com>
> Cc: Wish Wu <wishwu...@gmail.com>; gcc <g...@gcc.gnu.org>; gcc-patches
> <gcc-patches@gcc.gnu.org>; Wish Wu <weixi....@antfin.com>; Alexander
> Potapenko <gli...@google.com>; andreyknvl <andreyk...@google.com>; Victor
> Chibotaru <tch...@google.com>; Yuri Gribov <tetra2...@gmail.com>
> Subject: Re: Add support to trace comparison instructions and switch statements
>
> On Fri, Jul 14, 2017 at 11:17 PM, Kostya Serebryany <k...@google.com> wrote:
>>>>> > Hi
>>>>> >
>>>>> > I wrote a test for "-fsanitize-coverage=trace-cmp".
>>>>> >
>>>>> > Could anybody tell me whether this code could be merged into gcc?
>>>>>
>>>>> Nice!
>>>>>
>>>>> We are currently working on Linux kernel fuzzing that uses the
>>>>> comparison tracing. We use clang at the moment, but having this
>>>>> support in gcc would be great for kernel land.
>>>>>
>>>>> One concern I have: do we want to do some final refinements to the API
>>>>> before we implement this in both compilers?
>>>>>
>>>>> 2 things we considered from our perspective:
>>>>> - communicating to the runtime which operands are constants
>>>>> - communicating to the runtime which comparisons are counting loop checks
>>>>>
>>>>> The first is useful if you do the "find one operand in the input and
>>>>> replace it with the other one" thing. The second is useful because
>>>>> counting loop checks are usually not useful (at least all but one).
>>>>> In the original Go implementation I also conveyed the signedness of the
>>>>> operands and the exact comparison operation (<, >, etc.):
>>>>> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-defs/defs.go#L13
>>>>> But I did not find any use for that.
>>>>> I also gave all comparisons unique IDs:
>>>>> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-dep/sonar.go#L24
>>>>> That turned out to be useful, and there are chances we will want this
>>>>> for C/C++ as well.
>>>>>
>>>>> Kostya, did anything like this pop up in your work on libFuzzer?
>>>>> Can we still change the clang API? At least add an additional argument
>>>>> to the callbacks?
>>>>
>>>> I'd prefer not to change the API, but extend it (new compiler flag, new
>>>> callbacks), if absolutely needed.
>>>> Probably make it trace-cmp-guard (similar to trace-pc-guard, with an
>>>> extra parameter that has the ID).
>>>> I don't like the approach with compiler-generated constant IDs.
>>>
>>> Yes, if we do it for C/C++, we need to create globals and pass a pointer
>>> to a global to the callbacks.
>>> IDs do not work for C/C++.
>>>
>>>> Yes, it's a bit more efficient, but much less flexible in the presence
>>>> of multiple modules, DSOs, dlopen, etc.
>>>>
>>>> I was also looking at completely inlining this instrumentation because
>>>> it's pretty expensive even in its current form
>>>> (adding more parameters will make things worse).
>>>> This is going to be much less flexible, of course, so I'll attack it
>>>> only once I settle on the algorithm to handle CMPs in libFuzzer.
>>>
>>> This will require a new, completely different API for
>>> compiler<->runtime anyway, so we can put this aside for now.
>>>
>>>>> At the very least I would suggest that we add an additional arg that
>>>>> contains some flags (the 1st/2nd arg is a const, this is a counting
>>>>> loop check, etc.). If we do that we can also have just 1 callback that
>>>>> accepts uint64's for args because we can pass the operand size in the
>>>>> flags:
>>>>
>>>> How many flag combinations do we need and do we *really* need them?
>>>>
>>>> If the number of flag combinations is small, I'd prefer to have separate
>>>> callbacks (__sanitizer_cov_trace_cmp_loop_bound ?)
>>>>
>>>> Do we really need to know that one arg is a const?
>>>> It could well be a constant in fact, but the compiler won't see it:
>>>>
>>>> int foo(int n) { ... if (i < n) ... }
>>>> ...
>>>> foo(42); // not inlined.
>>>>
>>>> We need to handle both cases the same way.
>>>
>>> Well, following this line of reasoning we would need only
>>> __asan_load/storeN callbacks for asan and could remove
>>> __asan_load/store1/2/4/8, because the compiler might not know the size
>>> at compile time. Constant-ness is only an optimization. If the compiler
>>> does not know that something is a const, fine. Based on my experience
>>> with go-fuzz and our early experience with the kernel, we badly need the
>>> const hint.
>>> Otherwise the fuzzer generates gazillions of candidates based on
>>> comparison arguments. Note that the kernel is several orders of magnitude
>>> larger than what people usually fuzz in user space, inputs are more
>>> complex, and at the same time execution speed is several orders of
>>> magnitude lower. We can't rely on raw speed.
>>>
>>> Thinking of this more, I don't think that globals will be useful in
>>> the kernel context (the main problem is that we have multiple
>>> transient isolated kernels). If we track per-comparison-site
>>> information, we will probably use PCs. So I am ready to give up on
>>> this.
>>>
>>> Both of you expressed concerns about performance. Kostya says we
>>> should not break the existing clang API.
>>> If we limit this to only constant-ness, then I think we can make this
>>> both forward and backward compatible, which means we don't need to
>>> handle it now. E.g. we can:
>>> - if both operands are const (if that's possible at all), don't emit any
>>> callback
>>> - if only one is const, emit __sanitizer_cov_trace_cmp_const1 and
>>> pass the const in a known position (i.e. always first/second arg)
>>> - if neither is const, emit the callback __sanitizer_cov_trace_cmp_dyn1
>>> Then the compiler emits weak aliases from
>>> __sanitizer_cov_trace_cmp_const/dyn1 to the old
>>> __sanitizer_cov_trace_cmp1, which makes it backwards compatible.
>>> New runtimes that implement __sanitizer_cov_trace_cmp_const/dyn1 also
>>> need to provide the old __sanitizer_cov_trace_cmp1 for old compilers.
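Concretely, the fallback could look like the following C sketch. It assumes the _const1/_dyn1 names proposed above (not an existing API) and leaves open whether the aliases live in compiler-emitted module code or in the runtime; it only shows the linking behaviour of weak aliases:

  #include <stdint.h>

  /* The old callback that existing runtimes already define. */
  void __sanitizer_cov_trace_cmp1(uint8_t arg1, uint8_t arg2) {
      /* runtime records the two comparison operands here */
      (void)arg1;
      (void)arg2;
  }

  /* Proposed new entry points, provided as weak aliases of the old one.
     A runtime that understands the const/dyn split defines them strongly
     and overrides these; otherwise calls emitted by a new compiler fall
     back to the old __sanitizer_cov_trace_cmp1. */
  void __sanitizer_cov_trace_cmp_const1(uint8_t arg1, uint8_t arg2)
      __attribute__((weak, alias("__sanitizer_cov_trace_cmp1")));
  void __sanitizer_cov_trace_cmp_dyn1(uint8_t arg1, uint8_t arg2)
      __attribute__((weak, alias("__sanitizer_cov_trace_cmp1")));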
>>
>> SGTM++
>>
>> Although, maybe we can actually break the API this way to keep things
>> simpler (no weak aliases):
>> * leave __sanitizer_cov_trace_cmp[1248] for the general case
>> * add __sanitizer_cov_trace_cmp[1248]_const for a constant in the 2nd
>> parameter
>> * add __sanitizer_cov_trace_cmp[1248]_loop for a loop bound in the 2nd
>> parameter
>
> This would work for us.
>
>> * do we need __sanitizer_cov_trace_cmp[1248]_const_loop ?
>
> We don't know yet. Potentially yes, because const is reading/writing
> to a static buffer, while non-const is reading/writing to a buffer
> with a user-provided size.
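Spelled out, the callback family proposed above would look roughly like this (declarations only; the names and the convention of passing the constant or loop bound as the second parameter are taken from this thread, the [1248] suffix selects the operand size, and only the 4-byte variants are written out):

  #include <stdint.h>

  /* General case: neither operand is known to be a compile-time constant. */
  void __sanitizer_cov_trace_cmp4(uint32_t arg1, uint32_t arg2);

  /* One operand is a compile-time constant, passed as the 2nd parameter. */
  void __sanitizer_cov_trace_cmp4_const(uint32_t arg1, uint32_t arg2);

  /* The comparison is a counting-loop bound check; the bound is the 2nd
     parameter. */
  void __sanitizer_cov_trace_cmp4_loop(uint32_t arg1, uint32_t arg2);

  /* Possibly also a combined form for a constant loop bound, still
     undecided in the thread:
     __sanitizer_cov_trace_cmp4_const_loop(uint32_t arg1, uint32_t arg2); */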