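Re: FW: [EXTERNAL] Re: [PATCH v2 0/4] Enable measuring the kernel's Source-based Code Coverage and MC/DC with Clang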

Steve VanderLeest Thu, 05 Sep 2024 05:27:43 -0700
Resending as plain text so that the lists will accept it.

On Thu, Sep 5, 2024 at 8:21 AM Vanderleest (US), Steven H
<steven.h.vanderle...@boeing.com> wrote:
>
> From: Vanderleest (US), Steven H <steven.h.vanderle...@boeing.com>
> Date: Thursday, September 5, 2024 at 8:13 AM
> To: Peter Zijlstra <pet...@infradead.org>, Wentao Zhang 
> <wenta...@illinois.edu>
> Cc: Kelly (US), Matt <matt.kel...@boeing.com>, a...@linux-foundation.org 
> <a...@linux-foundation.org>, Oppelt (US), Andrew J 
> <andrew.j.opp...@boeing.com>, anton.iva...@cambridgegreys.com 
> <anton.iva...@cambridgegreys.com>, a...@kernel.org <a...@kernel.org>, 
> a...@arndb.de <a...@arndb.de>, bhelg...@google.com <bhelg...@google.com>, 
> b...@alien8.de <b...@alien8.de>, Wolber (US), Chuck 
> <chuck.wol...@boeing.com>, dave.han...@linux.intel.com 
> <dave.han...@linux.intel.com>, dvyu...@google.com <dvyu...@google.com>, 
> h...@zytor.com <h...@zytor.com>, jingh...@illinois.edu 
> <jingh...@illinois.edu>, johan...@sipsolutions.net 
> <johan...@sipsolutions.net>, jpoim...@kernel.org <jpoim...@kernel.org>, 
> justinst...@google.com <justinst...@google.com>, k...@kernel.org 
> <k...@kernel.org>, kent.overstr...@linux.dev <kent.overstr...@linux.dev>, 
> linux-a...@vger.kernel.org <linux-a...@vger.kernel.org>, 
> linux-...@vger.kernel.org <linux-...@vger.kernel.org>, 
> linux-kbu...@vger.kernel.org <linux-kbu...@vger.kernel.org>, 
> linux-ker...@vger.kernel.org <linux-ker...@vger.kernel.org>, 
> linux-trace-kernel@vger.kernel.org <linux-trace-kernel@vger.kernel.org>, 
> linux...@lists.infradead.org <linux...@lists.infradead.org>, 
> l...@lists.linux.dev <l...@lists.linux.dev>, l...@kernel.org 
> <l...@kernel.org>, mari...@illinois.edu <mari...@illinois.edu>, 
> masahi...@kernel.org <masahi...@kernel.org>, mask...@google.com 
> <mask...@google.com>, mathieu.desnoy...@efficios.com 
> <mathieu.desnoy...@efficios.com>, Weber (US), Matthew L 
> <matthew.l.web...@boeing.com>, mhira...@kernel.org <mhira...@kernel.org>, 
> mi...@redhat.com <mi...@redhat.com>, mo...@google.com <mo...@google.com>, 
> nat...@kernel.org <nat...@kernel.org>, ndesaulni...@google.com 
> <ndesaulni...@google.com>, ober...@linux.ibm.com <ober...@linux.ibm.com>, 
> paul...@kernel.org <paul...@kernel.org>, rich...@nod.at <rich...@nod.at>, 
> rost...@goodmis.org <rost...@goodmis.org>, samitolva...@google.com 
> <samitolva...@google.com>, Sarkisian (US), Samuel 
> <samuel.sarkis...@boeing.com>, t...@linutronix.de <t...@linutronix.de>, 
> ting...@illinois.edu <ting...@illinois.edu>, t...@illinois.edu 
> <t...@illinois.edu>, x...@kernel.org <x...@kernel.org>
> Subject: Re: [EXTERNAL] Re: [PATCH v2 0/4] Enable measuring the kernel's 
> Source-based Code Coverage and MC/DC with Clang
>
> I’ll answer Peter’s last question: “What is the impact on certification of 
> not covering the noinstr code.”
>
> Any code in the target image that is not executed by a test (and thus not
> covered) must be analyzed and justified as an exception. For example,
> defensive code is often impossible to exercise by test, but it can be
> included in the image with a justification to the regulatory authority,
> such as the Federal Aviation Administration (FAA). In practice, this means
> the number of unique instances of non-instrumented code needs to be
> manageable. I say “unique instances” because there may be many instances
> of a particular category, all justified by the same analysis/rationale.
> Where we specifically mark a section of code with noinstr, it is typically
> because the instrumentation would change the behavior of the code,
> perturbing the test results. With some analysis for each distinct category
> of this issue, we could then write justification(s) to show that the
> overall coverage is sufficient.
>
> Regards,
>
> Steve
>
> From: Peter Zijlstra <pet...@infradead.org>
> Date: Thursday, September 5, 2024 at 7:42 AM
> To: Wentao Zhang <wenta...@illinois.edu>
> Subject: [EXTERNAL] Re: [PATCH v2 0/4] Enable measuring the kernel's 
> Source-based Code Coverage and MC/DC with Clang
>
> On Wed, Sep 04, 2024 at 11:32:41PM -0500, Wentao Zhang wrote:
> > From: Wentao Zhang <zhangwt1...@gmail.com>
> >
> > This series adds support for building x86-64 kernels with Clang's Source-
> > based Code Coverage[1] in order to facilitate Modified Condition/Decision
> > Coverage (MC/DC)[2] that provably correlates to source code for all levels
> > of compiler optimization.
> >
> > The newly added kernel/llvm-cov/ directory complements the existing gcov
> > implementation. Gcov works at the object code level which may better
> > reflect actual execution. However, Gcov lacks the necessary information to
> > correlate coverage measurement with source code location when compiler
> > optimization level is non-zero (which is the default when building the
> > kernel). In addition, gcov reports are occasionally ambiguous when
> > attempting to compare with source code level developer intent.
> >
> > In the following gcov example from drivers/firmware/dmi_scan.c, an
> > expression with four conditions is reported to have six branch outcomes,
> > which is not ideally informative in many safety related use cases, such as
> > automotive, medical, and aerospace.
> >
> >         5: 1068:      if (s == e || *e != '/' || !month || month > 12) {
> > branch  0 taken 5 (fallthrough)
> > branch  1 taken 0
> > branch  2 taken 5 (fallthrough)
> > branch  3 taken 0
> > branch  4 taken 0 (fallthrough)
> > branch  5 taken 5
> >
> > On the other hand, Clang's Source-based Code Coverage instruments at the
> > compiler frontend which maintains an accurate mapping from coverage
> > measurement to source code location. Coverage reports reflect exactly how
> > the code is written regardless of optimization and can present advanced
> > metrics like branch coverage and MC/DC in a clearer way. Coverage report
> > for the same snippet by llvm-cov would look as follows:
> >
> >  1068|      5|        if (s == e || *e != '/' || !month || month > 12) {
> >   ------------------
> >   |  Branch (1068:6): [True: 0, False: 5]
> >   |  Branch (1068:16): [True: 0, False: 5]
> >   |  Branch (1068:29): [True: 0, False: 5]
> >   |  Branch (1068:39): [True: 0, False: 5]
> >   ------------------
> >
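A minimal Python sketch of how per-condition counts like those in the
llvm-cov report above arise under short-circuit evaluation. The function
names and the stand-in inputs (integer positions for s and e, a
one-character separator in place of *e) are assumptions for illustration;
only the shape of the condition comes from the quoted snippet.

```python
from collections import Counter


def evaluate_with_branch_counts(conditions, counts):
    """Evaluate `c1 || c2 || ...` left to right with short-circuiting,
    recording the True/False outcome of every condition that is reached."""
    for index, condition in enumerate(conditions):
        outcome = bool(condition())
        counts[(index, outcome)] += 1
        if outcome:
            return True  # later conditions are never evaluated
    return False


def date_is_malformed(s, e, sep, month, counts):
    # Hypothetical stand-in for: s == e || *e != '/' || !month || month > 12
    conditions = [
        lambda: s == e,
        lambda: sep != "/",
        lambda: not month,
        lambda: month > 12,
    ]
    return evaluate_with_branch_counts(conditions, counts)


counts = Counter()
for month in (1, 3, 6, 9, 12):  # five well-formed inputs
    date_is_malformed(0, 2, "/", month, counts)

# One (true_count, false_count) pair per source-level condition.
report = [(counts[(i, True)], counts[(i, False)]) for i in range(4)]
print(report)  # [(0, 5), (0, 5), (0, 5), (0, 5)]
```

Four conditions give eight possible source-level outcomes, which is what
the llvm-cov listing enumerates; the sketch says nothing about how gcov
arrives at six.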
> > Clang has added MC/DC support as of its 18.1.0 release. MC/DC is a fine-
> > grained coverage metric required by many automotive and aviation industrial
> > standards for certifying mission-critical software [3].
> >
> > In the following example from arch/x86/events/probe.c, llvm-cov gives the
> > MC/DC measurement for the compound logic decision at line 43.
> >
> >    43|     12|                        if (msr[bit].test && !msr[bit].test(bit, data))
> >   ------------------
> >   |---> MC/DC Decision Region (43:8) to (43:50)
> >   |
> >   |  Number of Conditions: 2
> >   |     Condition C1 --> (43:8)
> >   |     Condition C2 --> (43:25)
> >   |
> >   |  Executed MC/DC Test Vectors:
> >   |
> >   |     C1, C2    Result
> >   |  1 { T,  F  = F      }
> >   |  2 { T,  T  = T      }
> >   |
> >   |  C1-Pair: not covered
> >   |  C2-Pair: covered: (1,2)
> >   |  MC/DC Coverage for Decision: 50.00%
> >   |
> >   ------------------
> >    44|      5|                                continue;
> >
> > As the results suggest, during the span of measurement, only condition C2
> > (!msr[bit].test(bit, data)) is covered. That means C2 was evaluated to both
> > true and false, and in those test vectors C2 affected the decision outcome
> > independently. Therefore MC/DC for this decision is 1 out of 2 (50.00%).
> >
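The pair check described above can be sketched in Python. This is an
illustration under stated assumptions, not llvm-cov's implementation: it
treats a condition skipped by short-circuiting as a don't-care (None) and
accepts a pair when the condition of interest differs, every other
condition matches or is a don't-care, and the decision outcome differs.
The function name and data layout are hypothetical.

```python
from itertools import combinations


def independence_pairs(vectors, condition):
    """Return 1-based index pairs of test vectors that show `condition`
    independently affecting the decision outcome."""
    pairs = []
    numbered = list(enumerate(vectors, start=1))
    for (i, (conds_a, result_a)), (j, (conds_b, result_b)) in combinations(numbered, 2):
        if result_a == result_b:
            continue  # the decision outcome must change
        a, b = conds_a[condition], conds_b[condition]
        if a is None or b is None or a == b:
            continue  # the condition itself must be evaluated and must flip
        others_compatible = all(
            x is None or y is None or x == y
            for k, (x, y) in enumerate(zip(conds_a, conds_b))
            if k != condition
        )
        if others_compatible:
            pairs.append((i, j))
    return pairs


# The two executed test vectors from the report: ((C1, C2), result).
vectors = [
    ((True, False), False),
    ((True, True), True),
]
covered = [bool(independence_pairs(vectors, c)) for c in range(2)]
coverage = 100.0 * sum(covered) / len(covered)
print(independence_pairs(vectors, 0))  # []
print(independence_pairs(vectors, 1))  # [(1, 2)]
print(f"{coverage:.2f}%")              # 50.00%
```

Under the same assumptions, a third vector in which C1 is false (so C2 is
never evaluated) would pair with vector 2 and cover C1.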
> > To do a full kernel measurement, instrument the kernel with
> > LLVM_COV_KERNEL_MCDC enabled, and optionally set a
> > LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS value (the default is six). Run the
> > test suites, and collect the raw profile data under
> > /sys/kernel/debug/llvm-cov/profraw. Such raw profile data can be merged and
> > indexed, and used for generating coverage reports in various formats.
> >
> >   $ cp /sys/kernel/debug/llvm-cov/profraw vmlinux.profraw
> >   $ llvm-profdata merge vmlinux.profraw -o vmlinux.profdata
> >   $ llvm-cov show --show-mcdc --show-mcdc-summary                         \
> >              --format=text --use-color=false -output-dir=coverage_reports \
> >              -instr-profile vmlinux.profdata vmlinux
> >
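For scripted runs, the merge and report steps above could be assembled as
argument lists. A small sketch: the function name and its parameters are
hypothetical, while the tool names and flags are copied from the commands
quoted above. It only builds and prints the commands and does not execute
them.

```python
import shlex


def coverage_commands(profraw="vmlinux.profraw", profdata="vmlinux.profdata",
                      binary="vmlinux", output_dir="coverage_reports"):
    """Build the llvm-profdata and llvm-cov invocations as argv lists."""
    merge = ["llvm-profdata", "merge", profraw, "-o", profdata]
    show = [
        "llvm-cov", "show", "--show-mcdc", "--show-mcdc-summary",
        "--format=text", "--use-color=false",
        f"-output-dir={output_dir}",
        "-instr-profile", profdata, binary,
    ]
    return [merge, show]


for argv in coverage_commands():
    # Print a shell-quoted form for review before running anything.
    print(shlex.join(argv))
```

Each list can be handed to subprocess.run(argv, check=True) once the raw
profile has been copied out of debugfs.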
> > The first two patches enable the llvm-cov infrastructure, where the first
> > enables source based code coverage and the second adds MC/DC support. The
> > next patch disables instrumentation for curve25519-x86_64.c for the same
> > reason as gcov. The final patch enables coverage for x86-64.
> >
> > The choice to use a new Makefile variable LLVM_COV_PROFILE, instead of
> > reusing GCOV_PROFILE, was deliberate. More work needs to be done to
> > determine if it is appropriate to reuse the same flag. In addition, given
> > the fundamentally different approaches to instrumentation and the resulting
> > variation in coverage reports, there is a strong likelihood that coverage
> > type will need to be managed separately.
> >
> > This work reuses code from a previous effort by Sami Tolvanen et al. [4].
> > Our aim is for source-based *code coverage* required for high assurance
> > (MC/DC) while [4] focused more on performance optimization.
> >
> > This initial submission is restricted to x86-64. Support for other
> > architectures would need a bit more Makefile & linker script modification.
> > Informally we've confirmed that arm64 works and more are being tested.
> >
> > Note that Source-based Code Coverage is Clang-specific and isn't compatible
> > with Clang's gcov support in kernel/gcov/. Currently, kernel/gcov/ is not
> > able to measure MC/DC without modifying CFLAGS_GCOV and it would face the
> > same issues in terms of source correlation as gcov in general does.
> >
> > A demo and some results can be found in [5]. We will talk about this patch
> > series in the Refereed Track at LPC 2024 [6].
> >
> > Known Limitations:
> >
> > Kernel code with logical expressions exceeding
> > LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS will produce a compiler warning.
> > Expressions with up to 47 conditions are found in the Linux kernel source
> > tree (as of v6.11), but 46 seems to be the max value before the build fails
> > due to kernel size. As of LLVM 19 the max number of conditions possible is
> > 32767.
> >
> > As of LLVM 19, certain expressions are still not covered, and will produce
> > build warnings when they are encountered:
> >
> > "[...] if a boolean expression is embedded in the nest of another boolean
> >  expression but separated by a non-logical operator, this is also not
> >  supported. For example, in x = (a && b && c && func(d && f)), the d && f
> >  case starts a new boolean expression that is separated from the other
> >  conditions by the operator func(). When this is encountered, a warning
> >  will be generated and the boolean expression will not be
> >  instrumented." [7]
> >
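The nesting case in the quoted passage can be illustrated by analogy with
Python's own ast module, writing the C operator && as "and". This is not
Clang's analysis, only a sketch of the idea: conditions joined directly by
logical operators belong to one decision, while a boolean expression
sitting behind a non-logical node such as a call starts a separate one.
The function name is hypothetical; ast.unparse requires Python 3.9 or
later.

```python
import ast


def split_decision(expression):
    """Return (conditions of the outer decision, nested boolean expressions)."""
    root = ast.parse(expression, mode="eval").body
    conditions = []

    def collect(node):
        if isinstance(node, ast.BoolOp):
            # and/or chains stay inside the same decision.
            for value in node.values:
                collect(value)
        elif isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            # Logical NOT does not end the decision either.
            collect(node.operand)
        else:
            conditions.append(node)  # a leaf condition

    collect(root)
    # Any boolean operator found below a leaf is separated from the outer
    # decision by a non-logical node.
    nested = [
        ast.unparse(inner)
        for condition in conditions
        for inner in ast.walk(condition)
        if isinstance(inner, ast.BoolOp)
    ]
    return [ast.unparse(c) for c in conditions], nested


conditions, nested = split_decision("a and b and c and func(d and f)")
print(conditions)  # ['a', 'b', 'c', 'func(d and f)']
print(nested)      # ['d and f']
```

Here the outer decision has four conditions, and 'd and f' is the embedded
expression that the quoted passage says is left uninstrumented.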
>
> What does this actually look like in the generated code?
>
> Where is the modification to noinstr ?
>
> What is the impact on certification of not covering the noinstr code?