https://github.com/DavidSpickett created 
https://github.com/llvm/llvm-project/pull/181165

This PR fixes a few issues including one that prevented the use of 
llvm::StringRef.

Some behaviour for rarely seen inputs has been defined. For example empty 
strings. In normal use with command descriptions we do not expect this to 
happen, but now it's a utility function, it's easier to reason about if we 
cover all possible inputs.

* Empty string in now results in an empty string out. Rather than a single 
newline. This is less surprising, since no lines were split.
* Bugs were fixed in the handling of single word inputs. If a single word 
cannot fit within the column limit we just print it unmodified.
* Leading spaces are trimmed from input and if that results in no text, empty 
string is returned. Another unexpected input, but cheap to handle and makes the 
rest of the code a bit simpler.
* llvm::StringRef is now used for the input text. This was enabled by fixing a 
bug in checking whether end had reached final_end. I think the previous logic 
caused us to read out of bounds by 1 byte sometimes.
* Some buggy test output has changed but this is fine, no one is relying on 
that buggy behaviour.

Ultimately I may wipe out all these changes later with a new implementaion, but 
I think it's a good approach to do this incremenetally at first.

>From 4663e3e740f3c47bd965fb16b1139e6d2c9f5d82 Mon Sep 17 00:00:00 2001
From: David Spickett <[email protected]>
Date: Wed, 11 Feb 2026 16:49:32 +0000
Subject: [PATCH] [lldb] Improve ansi::OutputWordWrappedLines

This PR fixes a few issues including one that prevented the use
of llvm::StringRef.

Some behaviour for rarely seen inputs has been defined. For
example empty strings. In normal use with command descriptions
we do not expect this to happen, but now it's a utility function,
it's easier to reason about if we cover all possible inputs.

* Empty string in now results in an empty string out. Rather
  than a single newline. This is less surprising, since no
  lines were split.
* Bugs were fixed in the handling of single word inputs. If
  a single word cannot fit within the column limit we just
  print it unmodified.
* Leading spaces are trimmed from input and if that results
  in no text, empty string is returned. Another unexpected
  input, but cheap to handle and makes the rest of the
  code a bit simpler.
* llvm::StringRef is now used for the input text. This was
  enabled by fixing a bug in checking whether end had
  reached final_end. I think the previous logic caused
  us to read out of bounds by 1 byte sometimes.
* Some buggy test output has changed but this is fine,
  no one is relying on that buggy behaviour.

Ultimately I may wipe out all these changes later with
a new implementaion, but I think it's a good approach
to do this incremenetally at first.
---
 lldb/include/lldb/Utility/AnsiTerminal.h    | 22 +++++++------
 lldb/unittests/Utility/AnsiTerminalTest.cpp | 34 +++++++++------------
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/lldb/include/lldb/Utility/AnsiTerminal.h 
b/lldb/include/lldb/Utility/AnsiTerminal.h
index 0e696e49eda0b..153602cc08b09 100644
--- a/lldb/include/lldb/Utility/AnsiTerminal.h
+++ b/lldb/include/lldb/Utility/AnsiTerminal.h
@@ -293,16 +293,19 @@ inline size_t ColumnWidth(llvm::StringRef str) {
 // to the given stream. The indent level of the stream is counted towards the
 // output line length.
 // FIXME: This contains several bugs and does not handle unicode.
-inline void OutputWordWrappedLines(Stream &strm,
-                                   // FIXME: should be StringRef, but triggers
-                                   // a crash when changed.
-                                   const std::string &text,
+inline void OutputWordWrappedLines(Stream &strm, llvm::StringRef text,
                                    uint32_t output_max_columns) {
+  // We will indent using the stream, so leading whitespace is not significant.
+  text = text.ltrim();
+  if (text.size() == 0)
+    return;
+
   const size_t visible_length = ansi::ColumnWidth(text);
 
-  // Will it all fit on one line?
+  // Will it all fit on one line, or is it a single word that we must not 
break?
   if (static_cast<uint32_t>(visible_length + strm.GetIndentLevel()) <
-      output_max_columns) {
+          output_max_columns ||
+      text.find_first_of(" \t\n") == llvm::StringRef::npos) {
     // Output it as a single line.
     strm.Indent(text);
     strm.EOL();
@@ -329,9 +332,10 @@ inline void OutputWordWrappedLines(Stream &strm,
       start++;
 
     end = start + max_text_width;
-    if (end > final_end) {
+    if (end > final_end)
       end = final_end;
-    } else {
+
+    if (end != final_end) {
       // If we're not at the end of the text, make sure we break the line on
       // white space.
       while (end > start && text[end] != ' ' && text[end] != '\t' &&
@@ -345,7 +349,7 @@ inline void OutputWordWrappedLines(Stream &strm,
     strm.Indent();
     assert(start < final_end);
     assert(start + sub_len <= final_end);
-    strm.PutCString(llvm::StringRef(text.c_str() + start, sub_len));
+    strm << text.substr(start, sub_len);
     start = end + 1;
   }
   strm.EOL();
diff --git a/lldb/unittests/Utility/AnsiTerminalTest.cpp 
b/lldb/unittests/Utility/AnsiTerminalTest.cpp
index 19c27de4cd886..28fa32461ad5f 100644
--- a/lldb/unittests/Utility/AnsiTerminalTest.cpp
+++ b/lldb/unittests/Utility/AnsiTerminalTest.cpp
@@ -128,28 +128,24 @@ static void TestLines(const std::string &input, int 
indent,
 }
 
 TEST(AnsiTerminal, OutputWordWrappedLines) {
-  TestLines("", 0, 0, "\n");
-  TestLines("", 0, 1, "\n");
-  TestLines("", 2, 1, "\n");
-
-  // FIXME: crashes
-  // TestLines("abc", 0, 1, "\n");
-  // FIXME: crashes
-  // TestLines("abc", 0, 2, "\n");
-  // FIXME: should be "ab\nc\n"
-  TestLines("abc", 0, 3, "\n\nc\n");
+  TestLines("", 0, 0, "");
+  TestLines("", 0, 1, "");
+  TestLines("", 2, 1, "");
+
+  // When it is a single word, we ignore the max columns and do not split it.
+  TestLines("abc", 0, 1, "abc\n");
+  TestLines("abc", 0, 2, "abc\n");
+  TestLines("abc", 0, 3, "abc\n");
   TestLines("abc", 0, 4, "abc\n");
-  // Indent is counted as using up columns.
   TestLines("abc", 1, 5, " abc\n");
-  // FIXME: This output is correctly indented but the content
-  // is mangled.
-  TestLines("abc", 2, 5, "  \n  \n  c\n");
+  TestLines("abc", 2, 5, "  abc\n");
 
-  // FIXME: Should skip leading whitespace and result in "abc\n".
-  TestLines("  abc", 0, 4, "\n\nbc\n");
+  // Leading whitespace is ignored because we're going to indent using the
+  // stream.
+  TestLines("  abc", 0, 4, "abc\n");
+  TestLines("        abc", 2, 6, "  abc\n");
 
-  // FIXME: Should be "abc\ndef\n".
-  TestLines("abc def", 0, 4, "abc\n\nef\n");
+  TestLines("abc def", 0, 4, "abc\ndef\n");
   TestLines("abc def", 0, 5, "abc\ndef\n");
   // Length is 6, 7 required. Has to split at whitespace.
   TestLines("abc def", 0, 6, "abc\ndef\n");
@@ -182,5 +178,5 @@ TEST(AnsiTerminal, OutputWordWrappedLines) {
   // FIXME: Final fox is missing.
   TestLines(fox_str_emoji, 0, 15, "🦊 The quick\nbrown fox. \n");
   // FIXME: should not split the middle of an emoji.
-  TestLines("🦊🦊🦊 🦊🦊", 0, 5, "\n\n\n\n\n\n\n\n\xF0\x9F\xA6\n");
+  TestLines("🦊🦊🦊 🦊🦊", 0, 5, "\n\n\n\n\n\n\n\x8A\xF0\x9F\xA6\n");
 }

_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to