Issue 148279
Summary Incorrect Coverage Counts When Merging with Same Function Name
Labels new issue
Assignees
Reporter awearden
    If I compile fb.c and mypg.c into one executable and create a .profraw file, then do the same with fb.c and myprogram.c, I get the following output.

`/local/mnt/workspace/awearden/llvm_study/fb.c:
    1|      1|int foo() { return 0; }
    2|       |
    3|      1|int bar() { return 0; }
 4|       |
    5|      1|int bun() { return 0; }

/local/mnt/workspace/awearden/llvm_study/mypg.c:
    1|      0|int baz() { return 0; }
    2|      2|int main() { return 1; }

/local/mnt/workspace/awearden/llvm_study/myprogram.c:
    1| |extern int foo();
    2|       |extern int bar();
    3|       |extern int bun();
    4|       |
    5|      2|int main() {
    6|      2|  return foo() + bar() + bun();
    7|      2|}
`

In this example, there are double counts on the main function. I believe this is an issue that occurs because of the way the functions are hashed. Right now, it only uses the function names to hash so when there are two functions with the same name but different implementations, they will both get hashed to the same hash value and the merging code will add both counts together.

Merging the counts occurs in the InstrProfWriter.cpp:
`void InstrProfWriter::addRecord(StringRef Name, uint64_t Hash,
 InstrProfRecord &&I, uint64_t Weight,
 function_ref<void(Error)> Warn) {
  auto &ProfileDataMap = FunctionData[Name];

  auto [Where, NewFunc] = ProfileDataMap.try_emplace(Hash);
  InstrProfRecord &Dest = Where->second;

  auto MapWarn = [&](instrprof_error E) {
 Warn(make_error<InstrProfError>(E));
  };

  if (NewFunc) {
    // We've never seen a function with this name and hash, add it.
    Dest = std::move(I);
    if (Weight > 1)
      Dest.scale(Weight, 1, MapWarn);
 } else {
    // We're updating a function we've seen before.
 Dest.merge(I, Weight, MapWarn);
  }

  Dest.sortValueData();
}`
It uses the hash to check if you should update some coverage data from a function you've seen before, or if it's a completely new function

And the hashing occurs in the InstrProf.cpp:
`Error addSymbolName(StringRef SymbolName) {
 if (SymbolName.empty())
      return make_error<InstrProfError>(instrprof_error::malformed,
 "symbol name is empty");

    // Insert into NameTab so that MD5NameMap (a vector that will be sorted)
    // won't have duplicated entries in the first place.
    auto Ins = NameTab.insert(SymbolName);
 if (Ins.second) {
      MD5NameMap.push_back(std::make_pair(
 IndexedInstrProf::ComputeHash(SymbolName), Ins.first->getKey()));
 Sorted = false;
    }
    return Error::success();
  }`

SymbolName seems to be just the function name, so IndexedInstrProf::ComputeHash(SymbolName) will only hash the function name.
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to