zturner created this revision.
zturner added a reviewer: clayborg.
zturner added subscribers: lldb-commits, LLDB.

In https://reviews.llvm.org/D24952 I tried to refactor the `Args` class to get 
rid of `m_argv`.  The original motivation for this is that I wanted a way to 
implement `GetArgumentAtIndex()` so as to return a `StringRef`.  I could have 
just returned a `StringRef` from the `m_argv` array, but it seems wasteful to 
calculate the length every time when we were already storing a `std::string`.   
Unfortunately the `std::string` was in a `std::list` (required so that it would 
not be invalidated when inserting items into the arg list) so there was no 
random access.  So I wanted to see if I could do better, and that was the 
motivation for the original patch.

However, after fixing up all the places in the code to use the "new" `Args`, I 
wasn't completely satisfied.  I like that the old code could just give you an 
`argv` pointer that you can pass to a system API.  So I started over and ended 
up with this.  Quick overview of changes:

1. Instead of using a `std::list<std::string>` to own the memory, we use a 
`std::vector<std::unique_ptr<char[]>>`.  This provides random access, which is 
something `list` couldn't provide, and it also provides non-const access to the 
underlying buffer, which is something that `std::string` couldn't provide, 
leading to a lot of `const_cast` which are no longer necessary.
2. Each entry's quote char, backing memory (i.e. `std::unique_ptr<char[]>`), 
and `StringRef` with precomputed length are stored together in a single entry.  
This guarantees by design that the various arrays' lengths do not need to stay 
in sync with each other, because there is only one array.
3. Some longstanding undefined behavior is documented and left for someone else 
to fix.
4. Asserts are added to the code to document the pre-conditions.  I know we are 
allergic to asserts, but I want to emphasize that none of these asserts have 
anything to do with the input parameters.  The asserts are ALWAYS on internal 
class state.

With this change it should finally be easy to change `GetArgumentAtIndex()` to 
return a `StringRef`, because it can just return `m_entries[idx].ref`.


https://reviews.llvm.org/D25099

Files:
  include/lldb/Interpreter/Args.h
  source/Core/StringList.cpp
  source/Interpreter/Args.cpp
  unittests/Interpreter/TestArgs.cpp

Index: unittests/Interpreter/TestArgs.cpp
===================================================================
--- unittests/Interpreter/TestArgs.cpp
+++ unittests/Interpreter/TestArgs.cpp
@@ -66,6 +66,109 @@
   EXPECT_STREQ(args.GetArgumentAtIndex(1), "second_arg");
 }
 
+TEST(ArgsTest, TestInsertArg) {
+  Args args;
+  args.AppendArgument("1");
+  args.AppendArgument("2");
+  args.AppendArgument("3");
+  args.InsertArgumentAtIndex(1, "1.5");
+  args.InsertArgumentAtIndex(4, "3.5");
+
+  ASSERT_EQ(5u, args.GetArgumentCount());
+  EXPECT_STREQ("1", args.GetArgumentAtIndex(0));
+  EXPECT_STREQ("1.5", args.GetArgumentAtIndex(1));
+  EXPECT_STREQ("2", args.GetArgumentAtIndex(2));
+  EXPECT_STREQ("3", args.GetArgumentAtIndex(3));
+  EXPECT_STREQ("3.5", args.GetArgumentAtIndex(4));
+}
+
+TEST(ArgsTest, TestArgv) {
+  Args args;
+  EXPECT_EQ(nullptr, args.GetArgumentVector()[0]);
+
+  args.AppendArgument("1");
+  EXPECT_NE(nullptr, args.GetArgumentVector()[0]);
+  EXPECT_EQ(nullptr, args.GetArgumentVector()[1]);
+
+  args.AppendArgument("2");
+  EXPECT_NE(nullptr, args.GetArgumentVector()[0]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[1]);
+  EXPECT_EQ(nullptr, args.GetArgumentVector()[2]);
+
+  args.AppendArgument("3");
+  EXPECT_NE(nullptr, args.GetArgumentVector()[0]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[1]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[2]);
+  EXPECT_EQ(nullptr, args.GetArgumentVector()[3]);
+
+  args.InsertArgumentAtIndex(1, "1.5");
+  EXPECT_NE(nullptr, args.GetArgumentVector()[0]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[1]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[2]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[3]);
+  EXPECT_EQ(nullptr, args.GetArgumentVector()[4]);
+
+  args.InsertArgumentAtIndex(4, "3.5");
+  EXPECT_NE(nullptr, args.GetArgumentVector()[0]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[1]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[2]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[3]);
+  EXPECT_NE(nullptr, args.GetArgumentVector()[4]);
+  EXPECT_EQ(nullptr, args.GetArgumentVector()[5]);
+}
+
+TEST(ArgsTest, GetQuotedCommandString) {
+  Args args;
+  const char *str = "process launch -o stdout.txt -- \"a b c\"";
+  args.SetCommandString(str);
+
+  std::string stdstr;
+  ASSERT_TRUE(args.GetQuotedCommandString(stdstr));
+  EXPECT_EQ(str, stdstr);
+}
+
+TEST(ArgsTest, BareSingleQuote) {
+  Args args;
+  args.SetCommandString("a\\'b");
+  EXPECT_EQ(1u, args.GetArgumentCount());
+
+  EXPECT_STREQ("a'b", args.GetArgumentAtIndex(0));
+}
+
+TEST(ArgsTest, DoubleQuotedItem) {
+  Args args;
+  args.SetCommandString("\"a b c\"");
+  EXPECT_EQ(1u, args.GetArgumentCount());
+
+  EXPECT_STREQ("a b c", args.GetArgumentAtIndex(0));
+}
+
+TEST(ArgsTest, AppendArguments) {
+  Args args;
+  const char *argv[] = {"1", "2", nullptr};
+  const char *argv2[] = {"3", "4", nullptr};
+
+  args.AppendArguments(argv);
+  ASSERT_EQ(2u, args.GetArgumentCount());
+  EXPECT_STREQ("1", args.GetArgumentVector()[0]);
+  EXPECT_STREQ("2", args.GetArgumentVector()[1]);
+  EXPECT_EQ(nullptr, args.GetArgumentVector()[2]);
+  EXPECT_STREQ("1", args.GetArgumentAtIndex(0));
+  EXPECT_STREQ("2", args.GetArgumentAtIndex(1));
+
+  args.AppendArguments(argv2);
+  ASSERT_EQ(4u, args.GetArgumentCount());
+  EXPECT_STREQ("1", args.GetArgumentVector()[0]);
+  EXPECT_STREQ("2", args.GetArgumentVector()[1]);
+  EXPECT_STREQ("3", args.GetArgumentVector()[2]);
+  EXPECT_STREQ("4", args.GetArgumentVector()[3]);
+  EXPECT_EQ(nullptr, args.GetArgumentVector()[4]);
+  EXPECT_STREQ("1", args.GetArgumentAtIndex(0));
+  EXPECT_STREQ("2", args.GetArgumentAtIndex(1));
+  EXPECT_STREQ("3", args.GetArgumentAtIndex(2));
+  EXPECT_STREQ("4", args.GetArgumentAtIndex(3));
+}
+
 TEST(ArgsTest, StringToBoolean) {
   bool success = false;
   EXPECT_TRUE(Args::StringToBoolean(llvm::StringRef("true"), false, nullptr));
Index: source/Interpreter/Args.cpp
===================================================================
--- source/Interpreter/Args.cpp
+++ source/Interpreter/Args.cpp
@@ -25,95 +25,12 @@
 #include "lldb/Target/StackFrame.h"
 #include "lldb/Target/Target.h"
 
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringSwitch.h"
 
 using namespace lldb;
 using namespace lldb_private;
 
-//----------------------------------------------------------------------
-// Args constructor
-//----------------------------------------------------------------------
-Args::Args(llvm::StringRef command) : m_args(), m_argv(), m_args_quote_char() {
-  SetCommandString(command);
-}
-
-//----------------------------------------------------------------------
-// We have to be very careful on the copy constructor of this class
-// to make sure we copy all of the string values, but we can't copy the
-// rhs.m_argv into m_argv since it will point to the "const char *" c
-// strings in rhs.m_args. We need to copy the string list and update our
-// own m_argv appropriately.
-//----------------------------------------------------------------------
-Args::Args(const Args &rhs)
-    : m_args(rhs.m_args), m_argv(), m_args_quote_char(rhs.m_args_quote_char) {
-  UpdateArgvFromArgs();
-}
-
-//----------------------------------------------------------------------
-// We have to be very careful on the copy constructor of this class
-// to make sure we copy all of the string values, but we can't copy the
-// rhs.m_argv into m_argv since it will point to the "const char *" c
-// strings in rhs.m_args. We need to copy the string list and update our
-// own m_argv appropriately.
-//----------------------------------------------------------------------
-const Args &Args::operator=(const Args &rhs) {
-  // Make sure we aren't assigning to self
-  if (this != &rhs) {
-    m_args = rhs.m_args;
-    m_args_quote_char = rhs.m_args_quote_char;
-    UpdateArgvFromArgs();
-  }
-  return *this;
-}
-
-//----------------------------------------------------------------------
-// Destructor
-//----------------------------------------------------------------------
-Args::~Args() {}
-
-void Args::Dump(Stream &s, const char *label_name) const {
-  if (!label_name)
-    return;
-
-  const size_t argc = m_argv.size();
-  for (size_t i = 0; i < argc; ++i) {
-    s.Indent();
-    const char *arg_cstr = m_argv[i];
-    if (arg_cstr)
-      s.Printf("%s[%zi]=\"%s\"\n", label_name, i, arg_cstr);
-    else
-      s.Printf("%s[%zi]=NULL\n", label_name, i);
-  }
-  s.EOL();
-}
-
-bool Args::GetCommandString(std::string &command) const {
-  command.clear();
-  const size_t argc = GetArgumentCount();
-  for (size_t i = 0; i < argc; ++i) {
-    if (i > 0)
-      command += ' ';
-    command += m_argv[i];
-  }
-  return argc > 0;
-}
-
-bool Args::GetQuotedCommandString(std::string &command) const {
-  command.clear();
-  const size_t argc = GetArgumentCount();
-  for (size_t i = 0; i < argc; ++i) {
-    if (i > 0)
-      command.append(1, ' ');
-    char quote_char = GetArgumentQuoteCharAtIndex(i);
-    if (quote_char) {
-      command.append(1, quote_char);
-      command.append(m_argv[i]);
-      command.append(1, quote_char);
-    } else
-      command.append(m_argv[i]);
-  }
-  return argc > 0;
-}
 
 // A helper function for argument parsing.
 // Parses the initial part of the first argument using normal double quote
@@ -159,22 +76,28 @@
   return quoted;
 }
 
-// A helper function for SetCommandString.
-// Parses a single argument from the command string, processing quotes and
-// backslashes in a
-// shell-like manner. The parsed argument is appended to the m_args array. The
-// function returns
-// the unparsed portion of the string, starting at the first unqouted, unescaped
-// whitespace
-// character.
-llvm::StringRef Args::ParseSingleArgument(llvm::StringRef command) {
-  // Argument can be split into multiple discontiguous pieces,
-  // for example:
+static size_t ArgvToArgc(const char **argv) {
+  if (!argv)
+    return 0;
+  size_t count = 0;
+  while (*argv++)
+    ++count;
+  return count;
+}
+
+// A helper function for SetCommandString. Parses a single argument from the
+// command string, processing quotes and backslashes in a shell-like manner.
+// The function returns a tuple consisting of the parsed argument, the quote
+// char used, and the unparsed portion of the string starting at the first
+// unqouted, unescaped whitespace character.
+
+static std::tuple<std::string, char, llvm::StringRef>
+ParseSingleArgument(llvm::StringRef command) {
+  // Argument can be split into multiple discontiguous pieces, // for example:
   //  "Hello ""World"
   // this would result in a single argument "Hello World" (without/
   // the quotes) since the quotes would be removed and there is
   // not space between the strings.
-
   std::string arg;
 
   // Since we can have multiple quotes that form a single command
@@ -246,236 +169,270 @@
     }
   } while (!arg_complete);
 
-  m_args.push_back(arg);
-  m_args_quote_char.push_back(first_quote_char);
-  return command;
+  return std::make_tuple(arg, first_quote_char, command);
 }
 
-void Args::SetCommandString(llvm::StringRef command) {
-  m_args.clear();
+Args::EntryData::EntryData(llvm::StringRef str, char quote) : quote(quote) {
+  size_t size = str.size();
+  ptr.reset(new char[size + 1]);
+
+  ::memcpy(ptr.get(), str.data() ? str.data() : "", size);
+  ptr[size] = 0;
+  ref = llvm::StringRef(ptr.get(), size);
+}
+
+//----------------------------------------------------------------------
+// Args constructor
+//----------------------------------------------------------------------
+Args::Args(llvm::StringRef command) { SetCommandString(command); }
+
+Args::Args(const Args &rhs) { *this = rhs; }
+
+//----------------------------------------------------------------------
+// We have to be very careful on the copy constructor of this class
+// to make sure we copy all of the string values, but we can't copy the
+// rhs.m_argv into m_argv since it will point to the "const char *" c
+// strings in rhs.m_args. We need to copy the string list and update our
+// own m_argv appropriately.
+//----------------------------------------------------------------------
+Args &Args::operator=(const Args &rhs) {
+  Clear();
+
   m_argv.clear();
-  m_args_quote_char.clear();
+  m_entries.clear();
+  for (auto &entry : rhs.m_entries) {
+    m_entries.emplace_back(entry.ref, entry.quote);
+    m_argv.push_back(m_entries.back().ptr.get());
+  }
+  m_argv.push_back(nullptr);
+  return *this;
+}
 
-  static const char *k_space_separators = " \t";
-  command = command.ltrim(k_space_separators);
-  while (!command.empty()) {
-    command = ParseSingleArgument(command);
-    command = command.ltrim(k_space_separators);
+//----------------------------------------------------------------------
+// Destructor
+//----------------------------------------------------------------------
+Args::~Args() { Clear(); }
+
+void Args::Dump(Stream &s, const char *label_name) const {
+  if (!label_name)
+    return;
+
+  int i = 0;
+  for (auto &entry : m_entries) {
+    s.Indent();
+    s.Printf("%s[%zi]=\"%*s\"\n", label_name, i++, entry.ref.size(),
+             entry.ref.data());
   }
+  s.Printf("%s[%zi]=NULL\n", label_name, i);
+  s.EOL();
+}
+
+bool Args::GetCommandString(std::string &command) const {
+  command.clear();
 
-  UpdateArgvFromArgs();
+  for (size_t i = 0; i < m_entries.size(); ++i) {
+    if (i > 0)
+      command += ' ';
+    command += m_entries[i].ref;
+  }
+
+  return !m_entries.empty();
 }
 
-void Args::UpdateArgsAfterOptionParsing() {
-  // Now m_argv might be out of date with m_args, so we need to fix that
-  arg_cstr_collection::const_iterator argv_pos, argv_end = m_argv.end();
-  arg_sstr_collection::iterator args_pos;
-  arg_quote_char_collection::iterator quotes_pos;
-
-  for (argv_pos = m_argv.begin(), args_pos = m_args.begin(),
-      quotes_pos = m_args_quote_char.begin();
-       argv_pos != argv_end && args_pos != m_args.end(); ++argv_pos) {
-    const char *argv_cstr = *argv_pos;
-    if (argv_cstr == nullptr)
-      break;
+bool Args::GetQuotedCommandString(std::string &command) const {
+  command.clear();
 
-    while (args_pos != m_args.end()) {
-      const char *args_cstr = args_pos->c_str();
-      if (args_cstr == argv_cstr) {
-        // We found the argument that matches the C string in the
-        // vector, so we can now look for the next one
-        ++args_pos;
-        ++quotes_pos;
-        break;
-      } else {
-        quotes_pos = m_args_quote_char.erase(quotes_pos);
-        args_pos = m_args.erase(args_pos);
-      }
+  for (size_t i = 0; i < m_entries.size(); ++i) {
+    if (i > 0)
+      command += ' ';
+
+    if (m_entries[i].quote) {
+      command += m_entries[i].quote;
+      command += m_entries[i].ref;
+      command += m_entries[i].quote;
+    } else {
+      command += m_entries[i].ref;
     }
   }
 
-  if (args_pos != m_args.end())
-    m_args.erase(args_pos, m_args.end());
-
-  if (quotes_pos != m_args_quote_char.end())
-    m_args_quote_char.erase(quotes_pos, m_args_quote_char.end());
+  return !m_entries.empty();
 }
 
-void Args::UpdateArgvFromArgs() {
+void Args::SetCommandString(llvm::StringRef command) {
+  Clear();
   m_argv.clear();
-  arg_sstr_collection::const_iterator pos, end = m_args.end();
-  for (pos = m_args.begin(); pos != end; ++pos)
-    m_argv.push_back(pos->c_str());
+
+  static const char *k_space_separators = " \t";
+  command = command.ltrim(k_space_separators);
+  std::string arg;
+  char quote;
+  while (!command.empty()) {
+    std::tie(arg, quote, command) = ParseSingleArgument(command);
+    m_entries.emplace_back(arg, quote);
+    m_argv.push_back(m_entries.back().ptr.get());
+    command = command.ltrim(k_space_separators);
+  }
   m_argv.push_back(nullptr);
-  // Make sure we have enough arg quote chars in the array
-  if (m_args_quote_char.size() < m_args.size())
-    m_args_quote_char.resize(m_argv.size());
 }
 
-size_t Args::GetArgumentCount() const {
-  if (m_argv.empty())
-    return 0;
-  return m_argv.size() - 1;
+void Args::UpdateArgsAfterOptionParsing() {
+  assert(!m_argv.empty());
+  assert(m_argv.back() == nullptr);
+
+  // Now m_argv might be out of date with m_args, so we need to fix that.
+  // This happens because getopt_long_only may permute the order of the
+  // arguments in argv, so we need to re-order the quotes and the refs
+  // array accordingly.
+  for (int i = 0; i < m_argv.size() - 1; ++i) {
+    const char *argv = m_argv[i];
+    auto pos = std::find_if(
+        m_entries.begin() + i, m_entries.end(),
+        [argv](const EntryData &D) { return D.ptr.get() == argv; });
+    assert(pos != m_entries.end());
+    size_t distance = std::distance(m_entries.begin(), pos);
+    if (i == distance)
+      continue;
+
+    assert(distance > i);
+
+    std::swap(m_entries[i], m_entries[distance]);
+    assert(m_entries[i].ref.data() == m_argv[i]);
+  }
+  m_entries.resize(m_argv.size() - 1);
 }
 
+size_t Args::GetArgumentCount() const { return m_entries.size(); }
+
 const char *Args::GetArgumentAtIndex(size_t idx) const {
   if (idx < m_argv.size())
     return m_argv[idx];
   return nullptr;
 }
 
 char Args::GetArgumentQuoteCharAtIndex(size_t idx) const {
-  if (idx < m_args_quote_char.size())
-    return m_args_quote_char[idx];
+  if (idx < m_entries.size())
+    return m_entries[idx].quote;
   return '\0';
 }
 
 char **Args::GetArgumentVector() {
-  if (!m_argv.empty())
-    return const_cast<char **>(&m_argv[0]);
-  return nullptr;
+  assert(!m_argv.empty());
+  // TODO: functions like execve and posix_spawnp exhibit undefined behavior
+  // when argv or envp is null.  So the code below is actually wrong.  However,
+  // other code in LLDB depends on it being null.  The code has been acting this
+  // way for some time, so it makes sense to leave it this way until someone
+  // has the time to come along and fix it.
+  return (m_argv.size() > 1) ? m_argv.data() : nullptr;
 }
 
 const char **Args::GetConstArgumentVector() const {
-  if (!m_argv.empty())
-    return const_cast<const char **>(&m_argv[0]);
-  return nullptr;
+  assert(!m_argv.empty());
+  return (m_argv.size() > 1) ? const_cast<const char **>(m_argv.data())
+                             : nullptr;
 }
 
 void Args::Shift() {
   // Don't pop the last NULL terminator from the argv array
-  if (m_argv.size() > 1) {
-    m_argv.erase(m_argv.begin());
-    m_args.pop_front();
-    if (!m_args_quote_char.empty())
-      m_args_quote_char.erase(m_args_quote_char.begin());
-  }
+  if (m_entries.empty())
+    return;
+  m_argv.erase(m_argv.begin());
+  m_entries.erase(m_entries.begin());
 }
 
 llvm::StringRef Args::Unshift(llvm::StringRef arg_str, char quote_char) {
-  m_args.push_front(arg_str);
-  m_argv.insert(m_argv.begin(), m_args.front().c_str());
-  m_args_quote_char.insert(m_args_quote_char.begin(), quote_char);
-  return llvm::StringRef::withNullAsEmpty(GetArgumentAtIndex(0));
+  return InsertArgumentAtIndex(0, arg_str, quote_char);
 }
 
 void Args::AppendArguments(const Args &rhs) {
-  const size_t rhs_argc = rhs.GetArgumentCount();
-  for (size_t i = 0; i < rhs_argc; ++i)
-    AppendArgument(llvm::StringRef(rhs.GetArgumentAtIndex(i)),
-                   rhs.GetArgumentQuoteCharAtIndex(i));
+  assert(m_argv.size() == m_entries.size() + 1);
+  assert(m_argv.back() == nullptr);
+  m_argv.pop_back();
+  for (auto &entry : rhs.m_entries) {
+    m_entries.emplace_back(entry.ref, entry.quote);
+    m_argv.push_back(m_entries.back().ptr.get());
+  }
+  m_argv.push_back(nullptr);
 }
 
 void Args::AppendArguments(const char **argv) {
-  if (argv) {
-    for (uint32_t i = 0; argv[i]; ++i)
-      AppendArgument(llvm::StringRef::withNullAsEmpty(argv[i]));
+  size_t argc = ArgvToArgc(argv);
+
+  assert(m_argv.size() == m_entries.size() + 1);
+  assert(m_argv.back() == nullptr);
+  m_argv.pop_back();
+  for (int i = 0; i < argc; ++i) {
+    m_entries.emplace_back(argv[i], '\0');
+    m_argv.push_back(m_entries.back().ptr.get());
   }
+
+  m_argv.push_back(nullptr);
 }
 
 llvm::StringRef Args::AppendArgument(llvm::StringRef arg_str, char quote_char) {
   return InsertArgumentAtIndex(GetArgumentCount(), arg_str, quote_char);
 }
 
 llvm::StringRef Args::InsertArgumentAtIndex(size_t idx, llvm::StringRef arg_str,
                                             char quote_char) {
-  // Since we are using a std::list to hold onto the copied C string and
-  // we don't have direct access to the elements, we have to iterate to
-  // find the value.
-  arg_sstr_collection::iterator pos, end = m_args.end();
-  size_t i = idx;
-  for (pos = m_args.begin(); i > 0 && pos != end; ++pos)
-    --i;
-
-  pos = m_args.insert(pos, arg_str);
-
-  if (idx >= m_args_quote_char.size()) {
-    m_args_quote_char.resize(idx + 1);
-    m_args_quote_char[idx] = quote_char;
-  } else
-    m_args_quote_char.insert(m_args_quote_char.begin() + idx, quote_char);
-
-  UpdateArgvFromArgs();
-  return GetArgumentAtIndex(idx);
+  assert(m_argv.size() == m_entries.size() + 1);
+  assert(m_argv.back() == nullptr);
+
+  if (idx > m_entries.size())
+    return llvm::StringRef();
+  m_entries.emplace(m_entries.begin() + idx, arg_str, quote_char);
+  m_argv.insert(m_argv.begin() + idx, m_entries[idx].ptr.get());
+  return m_entries[idx].ref;
 }
 
 llvm::StringRef Args::ReplaceArgumentAtIndex(size_t idx,
                                              llvm::StringRef arg_str,
                                              char quote_char) {
-  // Since we are using a std::list to hold onto the copied C string and
-  // we don't have direct access to the elements, we have to iterate to
-  // find the value.
-  arg_sstr_collection::iterator pos, end = m_args.end();
-  size_t i = idx;
-  for (pos = m_args.begin(); i > 0 && pos != end; ++pos)
-    --i;
-
-  if (pos != end) {
-    pos->assign(arg_str);
-    assert(idx < m_argv.size() - 1);
-    m_argv[idx] = pos->c_str();
-    if (idx >= m_args_quote_char.size())
-      m_args_quote_char.resize(idx + 1);
-    m_args_quote_char[idx] = quote_char;
-    return GetArgumentAtIndex(idx);
+  assert(m_argv.size() == m_entries.size() + 1);
+  assert(m_argv.back() == nullptr);
+
+  if (idx >= m_entries.size())
+    return llvm::StringRef();
+
+  if (arg_str.size() > m_entries[idx].ref.size()) {
+    m_entries[idx] = EntryData(arg_str, quote_char);
+    m_argv[idx] = m_entries[idx].ptr.get();
+  } else {
+    const char *src_data = arg_str.data() ? arg_str.data() : "";
+    ::memcpy(m_entries[idx].ptr.get(), src_data, arg_str.size());
+    m_entries[idx].ptr.get()[arg_str.size()] = 0;
+    m_entries[idx].ref = m_entries[idx].ref.take_front(arg_str.size());
   }
-  return llvm::StringRef();
+
+  return m_entries[idx].ref;
 }
 
 void Args::DeleteArgumentAtIndex(size_t idx) {
-  // Since we are using a std::list to hold onto the copied C string and
-  // we don't have direct access to the elements, we have to iterate to
-  // find the value.
-  arg_sstr_collection::iterator pos, end = m_args.end();
-  size_t i = idx;
-  for (pos = m_args.begin(); i > 0 && pos != end; ++pos)
-    --i;
-
-  if (pos != end) {
-    m_args.erase(pos);
-    assert(idx < m_argv.size() - 1);
-    m_argv.erase(m_argv.begin() + idx);
-    if (idx < m_args_quote_char.size())
-      m_args_quote_char.erase(m_args_quote_char.begin() + idx);
-  }
+  if (idx >= m_entries.size())
+    return;
+
+  m_argv.erase(m_argv.begin() + idx);
+  m_entries.erase(m_entries.begin() + idx);
 }
 
 void Args::SetArguments(size_t argc, const char **argv) {
-  // m_argv will be rebuilt in UpdateArgvFromArgs() below, so there is
-  // no need to clear it here.
-  m_args.clear();
-  m_args_quote_char.clear();
-
-  // First copy each string
-  for (size_t i = 0; i < argc; ++i) {
-    m_args.push_back(argv[i]);
-    if ((argv[i][0] == '\'') || (argv[i][0] == '"') || (argv[i][0] == '`'))
-      m_args_quote_char.push_back(argv[i][0]);
-    else
-      m_args_quote_char.push_back('\0');
+  Clear();
+
+  auto args = llvm::makeArrayRef(argv, argc);
+  m_entries.resize(argc);
+  m_argv.resize(argc + 1);
+  for (int i = 0; i < args.size(); ++i) {
+    char quote =
+        ((args[i][0] == '\'') || (args[i][0] == '"') || (args[i][0] == '`'))
+            ? args[i][0]
+            : '\0';
+
+    m_entries[i] = EntryData(args[i], quote);
+    m_argv[i] = m_entries[i].ptr.get();
   }
-
-  UpdateArgvFromArgs();
 }
 
 void Args::SetArguments(const char **argv) {
-  // m_argv will be rebuilt in UpdateArgvFromArgs() below, so there is
-  // no need to clear it here.
-  m_args.clear();
-  m_args_quote_char.clear();
-
-  if (argv) {
-    // First copy each string
-    for (size_t i = 0; argv[i]; ++i) {
-      m_args.push_back(argv[i]);
-      if ((argv[i][0] == '\'') || (argv[i][0] == '"') || (argv[i][0] == '`'))
-        m_args_quote_char.push_back(argv[i][0]);
-      else
-        m_args_quote_char.push_back('\0');
-    }
-  }
-
-  UpdateArgvFromArgs();
+  SetArguments(ArgvToArgc(argv), argv);
 }
 
 Error Args::ParseOptions(Options &options, ExecutionContext *execution_context,
@@ -598,9 +555,9 @@
 }
 
 void Args::Clear() {
-  m_args.clear();
+  m_entries.clear();
   m_argv.clear();
-  m_args_quote_char.clear();
+  m_argv.push_back(nullptr);
 }
 
 lldb::addr_t Args::StringToAddress(const ExecutionContext *exe_ctx,
@@ -947,36 +904,6 @@
   return result;
 }
 
-void Args::LongestCommonPrefix(std::string &common_prefix) {
-  arg_sstr_collection::iterator pos, end = m_args.end();
-  pos = m_args.begin();
-  if (pos == end)
-    common_prefix.clear();
-  else
-    common_prefix = (*pos);
-
-  for (++pos; pos != end; ++pos) {
-    size_t new_size = (*pos).size();
-
-    // First trim common_prefix if it is longer than the current element:
-    if (common_prefix.size() > new_size)
-      common_prefix.erase(new_size);
-
-    // Then trim it at the first disparity:
-
-    for (size_t i = 0; i < common_prefix.size(); i++) {
-      if ((*pos)[i] != common_prefix[i]) {
-        common_prefix.erase(i);
-        break;
-      }
-    }
-
-    // If we've emptied the common prefix, we're done.
-    if (common_prefix.empty())
-      break;
-  }
-}
-
 void Args::AddOrReplaceEnvironmentVariable(llvm::StringRef env_var_name,
                                            llvm::StringRef new_value) {
   if (env_var_name.empty())
@@ -1415,10 +1342,10 @@
   // it is AT the cursor position.
   // Note, a single quoted dash is not the same as a single dash...
 
+  const EntryData &cursor = m_entries[cursor_index];
   if ((static_cast<int32_t>(dash_dash_pos) == -1 ||
        cursor_index < dash_dash_pos) &&
-      m_args_quote_char[cursor_index] == '\0' &&
-      strcmp(GetArgumentAtIndex(cursor_index), "-") == 0) {
+      cursor.quote == '\0' && cursor.ref == "-") {
     option_element_vector.push_back(
         OptionArgElement(OptionArgElement::eBareDash, cursor_index,
                          OptionArgElement::eBareDash));
Index: source/Core/StringList.cpp
===================================================================
--- source/Core/StringList.cpp
+++ source/Core/StringList.cpp
@@ -103,34 +103,21 @@
 void StringList::Clear() { m_strings.clear(); }
 
 void StringList::LongestCommonPrefix(std::string &common_prefix) {
-  const size_t num_strings = m_strings.size();
-
-  if (num_strings == 0) {
-    common_prefix.clear();
-  } else {
-    common_prefix = m_strings.front();
-
-    for (size_t idx = 1; idx < num_strings; ++idx) {
-      std::string &curr_string = m_strings[idx];
-      size_t new_size = curr_string.size();
-
-      // First trim common_prefix if it is longer than the current element:
-      if (common_prefix.size() > new_size)
-        common_prefix.erase(new_size);
-
-      // Then trim it at the first disparity:
-      for (size_t i = 0; i < common_prefix.size(); i++) {
-        if (curr_string[i] != common_prefix[i]) {
-          common_prefix.erase(i);
-          break;
-        }
-      }
+  common_prefix.clear();
+  if (m_strings.empty())
+    return;
 
-      // If we've emptied the common prefix, we're done.
-      if (common_prefix.empty())
+  auto args = llvm::makeArrayRef(m_strings);
+  llvm::StringRef prefix = args.front();
+  for (auto arg : args.drop_front()) {
+    size_t count = 0;
+    for (count = 0; count < std::min(prefix.size(), arg.size()); ++count) {
+      if (prefix[count] != arg[count])
         break;
     }
+    prefix = prefix.take_front(count);
   }
+  common_prefix = prefix;
 }
 
 void StringList::InsertStringAtIndex(size_t idx, const char *str) {
Index: include/lldb/Interpreter/Args.h
===================================================================
--- include/lldb/Interpreter/Args.h
+++ include/lldb/Interpreter/Args.h
@@ -71,7 +71,7 @@
 
   Args(const Args &rhs);
 
-  const Args &operator=(const Args &rhs);
+  Args &operator=(const Args &rhs);
 
   //------------------------------------------------------------------
   /// Destructor.
@@ -186,50 +186,6 @@
 
   void AppendArguments(const char **argv);
 
-  // Delete const char* versions of StringRef functions.  Normally this would
-  // not be necessary, as const char * is implicitly convertible to StringRef.
-  // However, since the use of const char* is so pervasive, and since StringRef
-  // will assert if you try to construct one from nullptr, this allows the
-  // compiler to catch instances of the function being invoked with a
-  // const char *, allowing us to replace them with explicit conversions at each
-  // call-site.  This ensures that no callsites slip through the cracks where
-  // we would be trying to implicitly convert from nullptr, since it will force
-  // us to evaluate and explicitly convert each one.
-  //
-  // Once StringRef use becomes more pervasive, there will be fewer
-  // implicit conversions because we will be using StringRefs across the whole
-  // pipeline, so we won't have to have this "glue" that converts between the
-  // two, and at that point it becomes easy to just make sure you don't pass
-  // nullptr into the function on the odd occasion that you do pass a
-  // const char *.
-  // Call-site fixing methodology:
-  //   1. If you know the string cannot be null (e.g. it's a const char[], or
-  //      it's been checked for null), use llvm::StringRef(ptr).
-  //   2. If you don't know if it can be null (e.g. it's returned from a
-  //      function whose semantics are unclear), use
-  //      llvm::StringRef::withNullAsEmpty(ptr).
-  //   3. If it's .c_str() of a std::string, just pass the std::string directly.
-  //   4. If it's .str().c_str() of a StringRef, just pass the StringRef
-  //      directly.
-  void ReplaceArgumentAtIndex(size_t, const char *, char = '\0') = delete;
-  void AppendArgument(const char *arg_str, char quote_char = '\0') = delete;
-  void InsertArgumentAtIndex(size_t, const char *, char = '\0') = delete;
-  static bool StringToBoolean(const char *, bool, bool *) = delete;
-  static lldb::ScriptLanguage
-  StringToScriptLanguage(const char *, lldb::ScriptLanguage, bool *) = delete;
-  static lldb::Encoding
-  StringToEncoding(const char *,
-                   lldb::Encoding = lldb::eEncodingInvalid) = delete;
-  static uint32_t StringToGenericRegister(const char *) = delete;
-  static bool StringToVersion(const char *, uint32_t &, uint32_t &,
-                              uint32_t &) = delete;
-  const char *Unshift(const char *, char = '\0') = delete;
-  void AddOrReplaceEnvironmentVariable(const char *, const char *) = delete;
-  bool ContainsEnvironmentVariable(const char *,
-                                   size_t * = nullptr) const = delete;
-  static int64_t StringToOptionEnum(const char *, OptionEnumValueElement *,
-                                    int32_t, Error &) = delete;
-
   //------------------------------------------------------------------
   /// Insert the argument value at index \a idx to \a arg_cstr.
   ///
@@ -456,11 +412,6 @@
   static std::string EscapeLLDBCommandArgument(const std::string &arg,
                                                char quote_char);
 
-  // This one isn't really relevant to Arguments per se, but we're using the
-  // Args as a
-  // general strings container, so...
-  void LongestCommonPrefix(std::string &common_prefix);
-
   //------------------------------------------------------------------
   /// Add or replace an environment variable with the given value.
   ///
@@ -493,22 +444,25 @@
   bool ContainsEnvironmentVariable(llvm::StringRef env_var_name,
                                    size_t *argument_index = nullptr) const;
 
-protected:
+private:
   //------------------------------------------------------------------
   // Classes that inherit from Args can see and modify these
   //------------------------------------------------------------------
-  typedef std::list<std::string> arg_sstr_collection;
-  typedef std::vector<const char *> arg_cstr_collection;
-  typedef std::vector<char> arg_quote_char_collection;
-  arg_sstr_collection m_args;
-  arg_cstr_collection m_argv; ///< The current argument vector.
-  arg_quote_char_collection m_args_quote_char;
+  struct EntryData {
+    EntryData() {}
+    EntryData(llvm::StringRef str, char quote);
 
-  void UpdateArgsAfterOptionParsing();
+    llvm::StringRef ref;
+    char quote;
+    std::unique_ptr<char[]> ptr;
 
-  void UpdateArgvFromArgs();
+    const char *c_str() const { return ptr.get(); }
+  };
 
-  llvm::StringRef ParseSingleArgument(llvm::StringRef command);
+  std::vector<EntryData> m_entries;
+  std::vector<char *> m_argv;
+
+  void UpdateArgsAfterOptionParsing();
 };
 
 } // namespace lldb_private
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to