llvmbot wrote:

<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-clang-format

Author: Ilya Biryukov (ilya-biryukov)

<details>
<summary>Changes</summary>

The `format` API receives a StringRef, but crashes whenever it is 
non-null-terminated with the corresponding assertion:

```
FormatTests: llvm/lib/Support/MemoryBuffer.cpp:53:
void llvm::MemoryBuffer::init(const char *, const char *, bool):
Assertion `(!RequiresNullTerminator || BufEnd[0] == 0) &amp;&amp; "Buffer is 
not null terminated!"' failed.
```

Ensure this does not happen by storing a copy of the inputs in `std::string` 
that does have a null terminator.

This changes requires an extra copy of the `Content` in the 
`SourceManagerForFile` APIs, but the costs of that copy should be negligible in 
practice as the API is designed for convenience rather than performance in the 
first place. E.g. running clang-format over `Content` is much more expensive 
than the copy of the Content itself.

This copy could be avoided in most cases if we provide a constructor that 
accepts `std::string` or null-terminated strings directly, but it does not seem 
worth the effort.

An alternative fix would be to teach `SourceManager` to work with 
non-null-terminated buffers, but given how much it is used, this would be very 
complicated and is likely to incur some performance cost.

---
Full diff: https://github.com/llvm/llvm-project/pull/131299.diff


3 Files Affected:

- (modified) clang/include/clang/Basic/SourceManager.h (+1) 
- (modified) clang/lib/Basic/SourceManager.cpp (+8-2) 
- (modified) clang/unittests/Format/FormatTest.cpp (+11) 


``````````diff
diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index e0f1ea435d54e..ec5803ff46290 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -2031,6 +2031,7 @@ class SourceManagerForFile {
   // The order of these fields are important - they should be in the same order
   // as they are created in `createSourceManagerForFile` so that they can be
   // deleted in the reverse order as they are created.
+  std::string ContentBuffer;
   std::unique_ptr<FileManager> FileMgr;
   std::unique_ptr<DiagnosticsEngine> Diagnostics;
   std::unique_ptr<SourceManager> SourceMgr;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index b1f2180c1d462..4e351ec9089a9 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2382,14 +2382,20 @@ size_t SourceManager::getDataStructureSizes() const {
 
 SourceManagerForFile::SourceManagerForFile(StringRef FileName,
                                            StringRef Content) {
+  // We copy to `std::string` for Context instead of StringRef because the
+  // SourceManager::getBufferData() works only with null-terminated buffers.
+  // And we still want to keep the API convenient.
+  ContentBuffer = Content.str();
+
   // This is referenced by `FileMgr` and will be released by `FileMgr` when it
   // is deleted.
   IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem> InMemoryFileSystem(
       new llvm::vfs::InMemoryFileSystem);
+
   InMemoryFileSystem->addFile(
       FileName, 0,
-      llvm::MemoryBuffer::getMemBuffer(Content, FileName,
-                                       /*RequiresNullTerminator=*/false));
+      llvm::MemoryBuffer::getMemBuffer(ContentBuffer, FileName,
+                                       /*RequiresNullTerminator=*/true));
   // This is passed to `SM` as reference, so the pointer has to be referenced
   // in `Environment` so that `FileMgr` can out-live this function scope.
   FileMgr =
diff --git a/clang/unittests/Format/FormatTest.cpp 
b/clang/unittests/Format/FormatTest.cpp
index 9864e7ec1b2ec..54d0b13ab35c0 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -29096,6 +29096,17 @@ TEST_F(FormatTest, BreakBeforeClassName) {
                "    ArenaSafeUniquePtr {};");
 }
 
+TEST_F(FormatTest, DoesNotCrashOnNonNullTerminatedStringRefs) {
+  llvm::StringRef TwoLines = "namespace foo {}\n"
+                             "namespace bar {}";
+  llvm::StringRef FirstLine =
+      TwoLines.take_until([](char c) { return c == '\n'; });
+
+  // The internal API used to crash when passed a non-null-terminated 
StringRef.
+  // Check this does not happen anymore.
+  verifyFormat(FirstLine);
+}
+
 } // namespace
 } // namespace test
 } // namespace format

``````````

</details>


https://github.com/llvm/llvm-project/pull/131299
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to