teemperor marked 2 inline comments as done. teemperor added a comment. In D60667#1467387 <https://reviews.llvm.org/D60667#1467387>, @amccarth wrote:
> I, too, have some concern that this could have unintended side effects. To > make the temporary `StringRef`s from the zero-terminated strings requires a > `strlen` call each time. So we're making two passes over a string each time > (once to measure and once to compare). Granted, these are mostly very short. I think I'm not understanding something here. There is one `strlen` call to make the StringRef from the literal, but I don't see why we need a second one to compare (as StringRef and ConstString both store the string length and don't recompute it when comparing). Simple godbolt test case here: https://godbolt.org/z/9X1RRt So both the old and new version both require one `strlen` call from what I can see (which is anyway optimized out for comparing against literals). > I'd love to see the static const variables for the short constant strings, > just to weigh readability hit. I can maybe make a new revision to show this, but I hope it's obvious from looking at the code that I changed in this revision. Also there are the problems/style difficulties that hopefully don't need a full revision to see: 1. We preferable want to have variables with the same name as the string content, but often the strings contain reserved words or words that don't translate into our lower_case_code_style so that's not possible. E.g. `var == this_` is not really as readable as `var == "this"` (and it looks like a typo). Same with stuff like `if (arg_type.GetTypeName() == "bool")` vs `if (arg_type.GetTypeName() == bool_)` or `image_infos[i].segments[k].name == ConstString("__TEXT"))` vs `image_infos[i].segments[k].name == text)`. 2. We get a bunch of local variables that just bloat the code: if (descriptor->IsCFType()) { ConstString type_name(valobj.GetTypeName()); if (type_name == "__CFMutableBitVector" || type_name == "__CFBitVector" || type_name == "CFMutableBitVectorRef" || type_name == "CFBitVectorRef") { if (valobj.IsPointerType()) is_type_ok = true; } } becomes this if (descriptor->IsCFType()) { ConstString type_name(valobj.GetTypeName()); static const ConstString cf_mutable_bit_vector("__CFMutableBitVector"), cf_bit_vector("__CFBitVector"), cf_mutable_bit_vector_ref("CFMutableBitVectorRef"), cf_bit_vector_ref("CFBitVectorRef") if (type_name == cf_mutable_bit_vector || type_name == cf_bit_vector || type_name == cf_mutable_bit_vector_ref || type_name == cf_bit_vector_ref) { if (valobj.IsPointerType()) is_type_ok = true; } } 3. Code becomes much harder to grep. E.g. if I want to check for the location where we filter the `this` variable from the local variables, I can currently do `git grep "\"this\"" | grep ==` and directly see where we do the comparison. But with local variables I have to do some funky grep with context filtering. In general grepping for these string literals without a few lines of context will be pointless with ConstString variables for literals. ================ Comment at: lldb/include/lldb/Utility/ConstString.h:173 + // StringRef doesn't. Therefore we have to do this check manually now. + if (!m_string ^ !rhs.data()) + return false; ---------------- amccarth wrote: > This is very clever way to express the constraint, but it's quite a bit of > cognitive load for the reader to figure out what's being tested here. The > comment sort of explains, but leaves it up to the reader to see how that maps > to the condition. (It also leaves me wondering whether it's useful to have > ConstString treat empty strings and nullptr as distinct things.) > > A simpler way to communicate the condition to humans might be: > > if (m_string == nullptr && rhs.data() != nullptr) return false; > if (m_string != nullptr && rhs.data() == nullptr) return false; > > The clever expression requires the reader to know or deduce: > > 1. that `m_string` and `rhs.data()` are pointers, > 2. that `!p` is equivalent to `p != 0` and that the `0` will be implicitly > converted to `nullptr`, yielding a `bool` that's `true` if `p` is a nullptr, > 3. that `^` is bitwise exclusive or, > 4. that boolean `!` and bitwise `^` have the appropriate operator > precedence, which is not always obvious when mixing types like this, > 5. that `true` and `false` are guaranteed to have values such that bitwise > `^` will do the expected thing to the promoted types, > 6. and that the resulting `int` will be implicitly compared for inequality > to 0. > > Most of these are reasonable things to expect a C++ programmer to know, but > expecting them to apply all of that knowledge (correctly) to figure out what > the expression does seems like an unnecessarily high cognitive burden. Yeah, that was rather quickly typed down when making this patch. Replaced with your version, thanks! Also, I don't like the whole nullptr/empty thing in ConstString, but I didn't want to change this behavior as a side effect of this patch. ================ Comment at: lldb/source/Core/Disassembler.cpp:1407 return (op.m_type == Instruction::Operand::Type::Register && - (op.m_register == ConstString(info.name) || - op.m_register == ConstString(info.alt_name))); + (op.m_register == info.name || op.m_register == info.alt_name)); }; ---------------- amccarth wrote: > This is _probably_ a performance win here, but it's not obvious. If the > resulting lambda is called many times, and it actually captured > `ConstString(info.name)` and `ConstString(info.alt_name)` rather than > `info.name` and `info.alt_name`, then we're trying off two pointer > comparisons for two StringRef comparisons. Thanks! Repository: rLLDB LLDB CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60667/new/ https://reviews.llvm.org/D60667 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits