https://github.com/whentojump created 
https://github.com/llvm/llvm-project/pull/80098

## Problem

The behavior is described with this tiny example: 
https://github.com/whentojump/llvm-mcdc-assertion-failure

Essentially, current MC/DC implementation cannot properly handle decisions that 
involve macros, like this example:

![Picture1](https://github.com/llvm/llvm-project/assets/35722712/e1750238-8523-4871-aed1-8439c873ec55)

## Root cause

Some terms I'll use:

1. Decision region: the composite logical expression
2. Branch region: smaller conditions or branches within a larger expression

In the full lifecycle of these regions, here are several important stages:

1. Compilation
    1. Generated at front end: [source 
code](https://github.com/llvm/llvm-project/blob/a0b6747804e46665ecfd00295b60432bfe1775b6/clang/lib/CodeGen/CoverageMappingGen.cpp#L860-L903)
    2. Written to the binary as coverage mapping sections: [source 
code](https://github.com/llvm/llvm-project/blob/a0b6747804e46665ecfd00295b60432bfe1775b6/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp#L240-L256)
2. Generate report
    1. Read from the binary: [source 
code](https://github.com/llvm/llvm-project/blob/a0b6747804e46665ecfd00295b60432bfe1775b6/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp#L650-L703)

The assertion in question is in step 2.i and goes like this: 

1. `llvm-cov` walks all regions of the target program
2. It encounters a decision region, say **R3**. By reading **R3**'s metadata, 
it knows it has two branches
3. `llvm-cov` then asserts: the next two regions it's gonna see must be two 
branch regions, not otherwise

This assertion assumes the order of MCDC regions. However, this order doesn't 
always hold

In step 1.ii all regions are 
[sorted](https://github.com/llvm/llvm-project/blob/a0b6747804e46665ecfd00295b60432bfe1775b6/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp#L162-L171)
 first by their file IDs, then by locations within the file and finally by 
region types. Macros, however, are traced back to definitions, which can be far 
away from their invocations or even separated in different files. As a result, 
the MC/DC regions could be sorted in a way where they are separated from each 
other. In the shown example, the order could be: `R3 (many irrelevant regions) 
R1 R2` which apparently breaks the assertion at step 2.i.

## Solution

This can be solved in two ways

1. (This PR) Sort with MC/DC in consideration and group relevant decisions and 
branches together.

    I honestly don't know if this's gonna break other things. But at least I 
can confirm it solves the problem mentioned. Again please see an example in 
this repo https://github.com/whentojump/llvm-mcdc-assertion-failure

2. In `CoverageMapping::loadFunctionRecord()`, use a cleverer way to correlate 
decisions and branches that are sorted far away from each other.

## Other to-do's

- Tests are not yet taken care of 
- A bit comments/docs

Last but not least, a huge thanks to @evodius96 et al for the great work 
regarding MC/DC :))


>From daf74be7f20d20bcc54a541d50fc891bff8ae388 Mon Sep 17 00:00:00 2001
From: Wentao Zhang <zhangwt1...@gmail.com>
Date: Tue, 30 Jan 2024 21:32:26 -0500
Subject: [PATCH 1/2] [Coverage][llvm-cov] Fix failures to handle MC/DC
 decisions involving macros (1/2)

This patch lets MC/DC code regions for the same composite logical
expression get sorted next to each other, so that `llvm-cov show` won't
hit assertion failures.

The problematic cases are mostly when macros are involved:

  (FOO(x) && BAR(y))

   <-R2->    <-R3->
  <-------R1------->

Let's call the full expression Region 1, `FOO(x)` Region 2 and `BAR(y)`
Region 3. If these macros are defined far away from their invocations,
the eventual order of all regions would look like: Region 1, (many other
irrelevant regions), Region 2, Region 3. This will break an assertion in
CoverageMapping::loadFunctionRecord() which assumes branch regions of an
MC/DC decision would immediately follow the decision region itself.

This patch adds a new field `GroupID` to `MCDCParameters` struct, sort
all regions based on this information, and place related MC/DC regions
together. The `GroupID` assignment is included in the other patch to
Clang CodeGen. This patch also disables 3 assertions that's based on the
old sorting criteria.
---
 .../llvm/ProfileData/Coverage/CoverageMapping.h        |  1 +
 .../lib/ProfileData/Coverage/CoverageMappingReader.cpp |  8 ++++++--
 .../lib/ProfileData/Coverage/CoverageMappingWriter.cpp | 10 +++++++---
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h 
b/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
index 33fa17ba811fa..5ca5923cbd82e 100644
--- a/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
+++ b/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
@@ -260,6 +260,7 @@ struct CounterMappingRegion {
     /// IDs used to represent a branch region and other branch regions
     /// evaluated based on True and False branches.
     MCDCConditionID ID = 0, TrueID = 0, FalseID = 0;
+    MCDCConditionID GroupID;
   };
 
   /// Primary Counter that is also used for Branch Regions (TrueCount).
diff --git a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp 
b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
index ac8e6b56379f2..1638a4d7dcfa0 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
@@ -244,7 +244,7 @@ Error RawCoverageMappingReader::readMappingRegionsSubArray(
   unsigned LineStart = 0;
   for (size_t I = 0; I < NumRegions; ++I) {
     Counter C, C2;
-    uint64_t BIDX = 0, NC = 0, ID = 0, TID = 0, FID = 0;
+    uint64_t BIDX = 0, NC = 0, ID = 0, TID = 0, FID = 0, GID = 0;
     CounterMappingRegion::RegionKind Kind = CounterMappingRegion::CodeRegion;
 
     // Read the combined counter + region kind.
@@ -308,6 +308,8 @@ Error RawCoverageMappingReader::readMappingRegionsSubArray(
             return Err;
           if (auto Err = readIntMax(FID, std::numeric_limits<unsigned>::max()))
             return Err;
+          if (auto Err = readIntMax(GID, std::numeric_limits<unsigned>::max()))
+            return Err;
           break;
         case CounterMappingRegion::MCDCDecisionRegion:
           Kind = CounterMappingRegion::MCDCDecisionRegion;
@@ -315,6 +317,8 @@ Error RawCoverageMappingReader::readMappingRegionsSubArray(
             return Err;
           if (auto Err = readIntMax(NC, std::numeric_limits<unsigned>::max()))
             return Err;
+          if (auto Err = readIntMax(GID, std::numeric_limits<unsigned>::max()))
+            return Err;
           break;
         default:
           return make_error<CoverageMapError>(coveragemap_error::malformed,
@@ -374,7 +378,7 @@ Error RawCoverageMappingReader::readMappingRegionsSubArray(
         CounterMappingRegion::MCDCParameters{
             static_cast<unsigned>(BIDX), static_cast<unsigned>(NC),
             static_cast<unsigned>(ID), static_cast<unsigned>(TID),
-            static_cast<unsigned>(FID)},
+            static_cast<unsigned>(FID), static_cast<unsigned>(GID)},
         InferredFileID, ExpandedFileID, LineStart, ColumnStart,
         LineStart + NumLines, ColumnEnd, Kind);
     if (CMR.startLoc() > CMR.endLoc())
diff --git a/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp 
b/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
index 1c7d8a8909c48..5f85e8243fcd5 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
@@ -163,6 +163,8 @@ void CoverageMappingWriter::write(raw_ostream &OS) {
   // location. Sort by region kinds to ensure stable order for tests.
   llvm::stable_sort(MappingRegions, [](const CounterMappingRegion &LHS,
                                        const CounterMappingRegion &RHS) {
+    if (LHS.MCDCParams.GroupID != RHS.MCDCParams.GroupID)
+      return LHS.MCDCParams.GroupID < RHS.MCDCParams.GroupID;
     if (LHS.FileID != RHS.FileID)
       return LHS.FileID < RHS.FileID;
     if (LHS.startLoc() != RHS.startLoc())
@@ -192,7 +194,7 @@ void CoverageMappingWriter::write(raw_ostream &OS) {
   for (auto I = MappingRegions.begin(), E = MappingRegions.end(); I != E; ++I) 
{
     if (I->FileID != CurrentFileID) {
       // Ensure that all file ids have at least one mapping region.
-      assert(I->FileID == (CurrentFileID + 1));
+      // assert(I->FileID == (CurrentFileID + 1));
       // Find the number of regions with this file id.
       unsigned RegionCount = 1;
       for (auto J = I + 1; J != E && I->FileID == J->FileID; ++J)
@@ -246,6 +248,7 @@ void CoverageMappingWriter::write(raw_ostream &OS) {
       encodeULEB128(unsigned(I->MCDCParams.ID), OS);
       encodeULEB128(unsigned(I->MCDCParams.TrueID), OS);
       encodeULEB128(unsigned(I->MCDCParams.FalseID), OS);
+      encodeULEB128(unsigned(I->MCDCParams.GroupID), OS);
       break;
     case CounterMappingRegion::MCDCDecisionRegion:
       encodeULEB128(unsigned(I->Kind)
@@ -253,9 +256,10 @@ void CoverageMappingWriter::write(raw_ostream &OS) {
                     OS);
       encodeULEB128(unsigned(I->MCDCParams.BitmapIdx), OS);
       encodeULEB128(unsigned(I->MCDCParams.NumConditions), OS);
+      encodeULEB128(unsigned(I->MCDCParams.GroupID), OS);
       break;
     }
-    assert(I->LineStart >= PrevLineStart);
+    // assert(I->LineStart >= PrevLineStart);
     encodeULEB128(I->LineStart - PrevLineStart, OS);
     encodeULEB128(I->ColumnStart, OS);
     assert(I->LineEnd >= I->LineStart);
@@ -264,7 +268,7 @@ void CoverageMappingWriter::write(raw_ostream &OS) {
     PrevLineStart = I->LineStart;
   }
   // Ensure that all file ids have at least one mapping region.
-  assert(CurrentFileID == (VirtualFileMapping.size() - 1));
+  // assert(CurrentFileID == (VirtualFileMapping.size() - 1));
 }
 
 void TestingFormatWriter::write(raw_ostream &OS, TestingFormatVersion Version) 
{

>From cd5a33851187c872c0de7a766528be097bcc68ad Mon Sep 17 00:00:00 2001
From: Wentao Zhang <zhangwt1...@gmail.com>
Date: Tue, 30 Jan 2024 21:32:40 -0500
Subject: [PATCH 2/2] [Coverage][clang] Fix failures to handle MC/DC decisions
 involving macros (2/2)

This patch lets MC/DC code regions for the same composite logical
expression get sorted next to each other, so that `llvm-cov show` won't
hit assertion failures.

The problematic cases are mostly when macros are involved:

  (FOO(x) && BAR(y))

   <-R2->    <-R3->
  <-------R1------->

Let's call the full expression Region 1, `FOO(x)` Region 2 and `BAR(y)`
Region 3. If these macros are defined far away from their invocations,
the eventual order of all regions would look like: Region 1, (many other
irrelevant regions), Region 2, Region 3. This will break an assertion in
CoverageMapping::loadFunctionRecord() which assumes branch regions of an
MC/DC decision would immediately follow the decision region itself.

An earlier patch adds a new field `GroupID` to `MCDCParameters` struct,
sort all regions based on this information, and place related MC/DC
regions together. This patch let Clang assign the same `GroupID` to
decision- and branch-regions that are related to each other.
---
 clang/lib/CodeGen/CoverageMappingGen.cpp | 46 +++++++++++++++---------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/clang/lib/CodeGen/CoverageMappingGen.cpp 
b/clang/lib/CodeGen/CoverageMappingGen.cpp
index 8b5e6c4ad8272..ecd90fe0585fc 100644
--- a/clang/lib/CodeGen/CoverageMappingGen.cpp
+++ b/clang/lib/CodeGen/CoverageMappingGen.cpp
@@ -857,6 +857,8 @@ struct CounterCoverageMappingBuilder
 
   unsigned getRegionBitmap(const Stmt *S) { return MCDCBitmapMap[S]; }
 
+  unsigned long MCDCDebugCounter;
+
   /// Push a region onto the stack.
   ///
   /// Returns the index on the stack where the region was pushed. This can be
@@ -866,7 +868,7 @@ struct CounterCoverageMappingBuilder
                     std::optional<SourceLocation> EndLoc = std::nullopt,
                     std::optional<Counter> FalseCount = std::nullopt,
                     MCDCConditionID ID = 0, MCDCConditionID TrueID = 0,
-                    MCDCConditionID FalseID = 0) {
+                    MCDCConditionID FalseID = 0, MCDCConditionID GroupID = 0) {
 
     if (StartLoc && !FalseCount) {
       MostRecentLocation = *StartLoc;
@@ -886,17 +888,19 @@ struct CounterCoverageMappingBuilder
     if (EndLoc && EndLoc->isInvalid())
       EndLoc = std::nullopt;
     RegionStack.emplace_back(Count, FalseCount,
-                             MCDCParameters{0, 0, ID, TrueID, FalseID},
+                             MCDCParameters{
+                               0, 0,
+                               ID, TrueID, FalseID, GroupID},
                              StartLoc, EndLoc);
 
     return RegionStack.size() - 1;
   }
 
-  size_t pushRegion(unsigned BitmapIdx, unsigned Conditions,
+  size_t pushRegion(unsigned BitmapIdx, unsigned Conditions, MCDCConditionID 
GroupID,
                     std::optional<SourceLocation> StartLoc = std::nullopt,
                     std::optional<SourceLocation> EndLoc = std::nullopt) {
 
-    RegionStack.emplace_back(MCDCParameters{BitmapIdx, Conditions}, StartLoc,
+    RegionStack.emplace_back(MCDCParameters{BitmapIdx, Conditions, 0, 0, 0, 
GroupID}, StartLoc,
                              EndLoc);
 
     return RegionStack.size() - 1;
@@ -1032,7 +1036,8 @@ struct CounterCoverageMappingBuilder
   /// result in the generation of a branch.
   void
   createBranchRegion(const Expr *C, Counter TrueCnt, Counter FalseCnt,
-                     const MCDCDecisionIDPair &IDPair = MCDCDecisionIDPair()) {
+                     const MCDCDecisionIDPair &IDPair = MCDCDecisionIDPair(),
+                     MCDCConditionID GroupID = 0) {
     // Check for NULL conditions.
     if (!C)
       return;
@@ -1054,19 +1059,19 @@ struct CounterCoverageMappingBuilder
       // CodeGenFunction.c always returns false, but that is very heavy-handed.
       if (ConditionFoldsToBool(C))
         popRegions(pushRegion(Counter::getZero(), getStart(C), getEnd(C),
-                              Counter::getZero(), ID, TrueID, FalseID));
+                              Counter::getZero(), ID, TrueID, FalseID, 
GroupID));
       else
         // Otherwise, create a region with the True counter and False counter.
         popRegions(pushRegion(TrueCnt, getStart(C), getEnd(C), FalseCnt, ID,
-                              TrueID, FalseID));
+                              TrueID, FalseID, GroupID));
     }
   }
 
   /// Create a Decision Region with a BitmapIdx and number of Conditions. This
   /// type of region "contains" branch regions, one for each of the conditions.
   /// The visualization tool will group everything together.
-  void createDecisionRegion(const Expr *C, unsigned BitmapIdx, unsigned Conds) 
{
-    popRegions(pushRegion(BitmapIdx, Conds, getStart(C), getEnd(C)));
+  void createDecisionRegion(const Expr *C, unsigned BitmapIdx, unsigned Conds, 
MCDCConditionID GroupID) {
+    popRegions(pushRegion(BitmapIdx, Conds, GroupID, getStart(C), getEnd(C)));
   }
 
   /// Create a Branch Region around a SwitchCase for code coverage
@@ -1153,7 +1158,8 @@ struct CounterCoverageMappingBuilder
                 I.getCounter(), I.getFalseCounter(),
                 MCDCParameters{0, 0, I.getMCDCParams().ID,
                                I.getMCDCParams().TrueID,
-                               I.getMCDCParams().FalseID},
+                               I.getMCDCParams().FalseID,
+                               I.getMCDCParams().GroupID},
                 Loc, getEndOfFileOrMacro(Loc), I.isBranch());
           else
             SourceRegions.emplace_back(I.getCounter(), Loc,
@@ -1342,7 +1348,9 @@ struct CounterCoverageMappingBuilder
       SourceManager &SM, const LangOptions &LangOpts)
       : CoverageMappingBuilder(CVM, SM, LangOpts), CounterMap(CounterMap),
         MCDCBitmapMap(MCDCBitmapMap),
-        MCDCBuilder(CVM.getCodeGenModule(), CondIDMap, MCDCBitmapMap) {}
+        MCDCBuilder(CVM.getCodeGenModule(), CondIDMap, MCDCBitmapMap) {
+    MCDCDebugCounter = 0;
+  }
 
   /// Write the mapping data to the output stream
   void write(llvm::raw_ostream &OS) {
@@ -1973,6 +1981,8 @@ struct CounterCoverageMappingBuilder
   void VisitBinLAnd(const BinaryOperator *E) {
     bool IsRootNode = MCDCBuilder.isIdle();
 
+    MCDCDebugCounter++;
+
     // Keep track of Binary Operator and assign MCDC condition IDs.
     MCDCBuilder.pushAndAssignIDs(E);
 
@@ -1993,7 +2003,7 @@ struct CounterCoverageMappingBuilder
     // Create MCDC Decision Region if at top-level (root).
     unsigned NumConds = 0;
     if (IsRootNode && (NumConds = MCDCBuilder.getTotalConditionsAndReset(E)))
-      createDecisionRegion(E, getRegionBitmap(E), NumConds);
+      createDecisionRegion(E, getRegionBitmap(E), NumConds, MCDCDebugCounter);
 
     // Extract the RHS's Execution Counter.
     Counter RHSExecCnt = getRegionCounter(E);
@@ -2006,11 +2016,11 @@ struct CounterCoverageMappingBuilder
 
     // Create Branch Region around LHS condition.
     createBranchRegion(E->getLHS(), RHSExecCnt,
-                       subtractCounters(ParentCnt, RHSExecCnt), DecisionLHS);
+                       subtractCounters(ParentCnt, RHSExecCnt), DecisionLHS, 
MCDCDebugCounter);
 
     // Create Branch Region around RHS condition.
     createBranchRegion(E->getRHS(), RHSTrueCnt,
-                       subtractCounters(RHSExecCnt, RHSTrueCnt), DecisionRHS);
+                       subtractCounters(RHSExecCnt, RHSTrueCnt), DecisionRHS, 
MCDCDebugCounter);
   }
 
   // Determine whether the right side of OR operation need to be visited.
@@ -2026,6 +2036,8 @@ struct CounterCoverageMappingBuilder
   void VisitBinLOr(const BinaryOperator *E) {
     bool IsRootNode = MCDCBuilder.isIdle();
 
+    MCDCDebugCounter++;
+
     // Keep track of Binary Operator and assign MCDC condition IDs.
     MCDCBuilder.pushAndAssignIDs(E);
 
@@ -2046,7 +2058,7 @@ struct CounterCoverageMappingBuilder
     // Create MCDC Decision Region if at top-level (root).
     unsigned NumConds = 0;
     if (IsRootNode && (NumConds = MCDCBuilder.getTotalConditionsAndReset(E)))
-      createDecisionRegion(E, getRegionBitmap(E), NumConds);
+      createDecisionRegion(E, getRegionBitmap(E), NumConds, MCDCDebugCounter);
 
     // Extract the RHS's Execution Counter.
     Counter RHSExecCnt = getRegionCounter(E);
@@ -2063,11 +2075,11 @@ struct CounterCoverageMappingBuilder
 
     // Create Branch Region around LHS condition.
     createBranchRegion(E->getLHS(), subtractCounters(ParentCnt, RHSExecCnt),
-                       RHSExecCnt, DecisionLHS);
+                       RHSExecCnt, DecisionLHS, MCDCDebugCounter);
 
     // Create Branch Region around RHS condition.
     createBranchRegion(E->getRHS(), subtractCounters(RHSExecCnt, RHSFalseCnt),
-                       RHSFalseCnt, DecisionRHS);
+                       RHSFalseCnt, DecisionRHS, MCDCDebugCounter);
   }
 
   void VisitLambdaExpr(const LambdaExpr *LE) {

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to