dexonsmith added inline comments.
================ Comment at: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp:64-70 MinimizedFileContents.push_back('\0'); - Result.Contents = std::move(MinimizedFileContents); - // Now make the null terminator implicit again, so that Clang's lexer can find - // it right where the buffer ends. - Result.Contents.pop_back(); + Result.Contents = std::make_unique<llvm::SmallVectorMemoryBuffer>( + std::move(MinimizedFileContents), [](SmallVectorImpl<char> &SV) { + // Now make the null terminator implicit again, so that Clang's lexer + // can find it right where the buffer ends. + SV.pop_back(); + }); ---------------- This is a bit simpler: ``` lang=c++ Result.Contents = MinimizedFileContents.isSmall() ? MemoryBuffer::getMemBufferCopy(...) : std::make_unique<llvm::SmallVectorMemoryBuffer>(); ``` and doesn't require changing SmallVectorMemoryBuffer. Another option is: ``` lang=c++ { // Move to the heap and ensure null-terminated to ensure the MemoryBuffer works. SmallVector<char, 0> Heap(std::move(MinimizedFileContents)); Heap.push_back(0); Heap.pop_back(); Result.Contents = std::make_unique<llvm::SmallVectorMemoryBuffer>(std::move(Heap)); } ``` The latter seems like reasonable logic for the SmallVectorMemoryBuffer constructor though. ================ Comment at: llvm/include/llvm/Support/SmallVectorMemoryBuffer.h:54 + /// and invoke the given function right after the move. + SmallVectorMemoryBuffer( + SmallVectorImpl<char> &&SV, ---------------- jansvoboda11 wrote: > I'm not happy with introducing new (hacky) constructor. > > But, the best alternative I could come up with is to avoid using > `SmallVectorMemoryBuffer`, store the `SmallString` with minimized contents > somewhere in the filesystem and refer to it via regular `MemoryBuffer`. This > seems needlessly complicated, so hacky constructor it is... > > Side question: is the hassle with implicit null-termination (expected by > Clang's lexer) worth the complications? What are the benefits anyway? It'd be better to add a `RequiresNullTerminator` argument to the constructor, matching the MemoryBuffer factory functions, which (if set) does the `push_back`/`pop_back` dance, and in either case passes it through to `init()`. ``` lang=c++ SmallVectorMemoryBuffer(SmallVectorImpl<char> &&SV, bool RequiresNullTerminator = true) : SV(std::move(SV)), BufferName("<in-memory object>") { if (RequiresNullTerminator) { this->SV.push_back(0); this->SV.pop_back(0); } init(this->SV.begin(), this->SV.end(), RequiresNullTerminator); } ``` If you do that, you should update the (few) existing callers to explicitly pass `false`. Note that `RequiresNullTerminator` defaults to true in all the other `MemoryBuffer` APIs so it'd be confusing to default it to `false` here. > Side question: is the hassle with implicit null-termination (expected by > Clang's lexer) worth the complications? What are the benefits anyway? Not sure whether it's a micro-optimization for getting better codegen or if it just predates `StringRef`, which made it much easier to reason about a byte range as a single unit (rather than tracking two separate pointers, which is more error-prone). I somewhat doubt it's worth it, which is why when I wrote the minimizer I used `StringRef`, but I don't have data and it'd be a lot of work to change. (The minimizer is fast enough as-is, but it does less work than the full Clang lexer. It's plausible that the optimizations matter there?) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D115043/new/ https://reviews.llvm.org/D115043 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits