hokein added a comment. In https://reviews.llvm.org/D53363#1267628, @sammccall wrote:
> (I think your math is off in the description: 20 bits should be 1M lines, not > 4M) Oops...Update the desccription. > I think this is a win, as I think truncation will be rare and not terrible. > We should document the intentions around truncation though. > > Incidentally, this means replacing just the StringRef in > SymbolLocation::Position with a char* would save another 13%, because that's > also 8 bytes. Yeah, I have a rough patch for it, using char* will save us ~50MB memory, which will lead to ~300 MB memory usage in total. > (Obviously we'd probably replace the other StringRefs too, but it gives a > lower bound). ================ Comment at: clangd/index/Index.h:41 struct Position { - uint32_t Line = 0; // 0-based + static constexpr int LineBits = 20; + static constexpr int ColumnBits = 12; ---------------- sammccall wrote: > (not sure these constants are needed as it stands - YAML shouldn't use them, > see below) I think we need, for testing, for setters, rather than using magic number. ================ Comment at: clangd/index/Index.h:46 // Using UTF-16 code units. - uint32_t Column = 0; // 0-based + uint32_t Column : ColumnBits; // 0-based }; ---------------- sammccall wrote: > If we overflow the columns, it would be much easier to recognize if the > result is always e.g. 255, rather than an essentially random number from > 0-255 (from modulo 256 arithmetic). > > This would mean replacing the raw fields with setters/getters, which is > obviously a more invasive change. What about a compromise: add the setters, > and call them from the most important codepaths: `SymbolCollector` and > `Serialization`. It seems reasonable to use maximum value if we encounter overflows. > This would mean replacing the raw fields with setters/getters, which is > obviously a more invasive change. What about a compromise: add the setters, > and call them from the most important codepaths: SymbolCollector and > Serialization. Actually, there is not too much usage of the raw fields in the source code, I'd prefer to use all setters/getters in all places (and it would make the code consistent and more safe). How about this plan? 1. I update the patch to change all raw fields usage to getters/setters, and keep these raw fields public 2. do a cleanup internally 3. make these raw fields private ================ Comment at: clangd/index/YAMLSerialization.cpp:97 -template <> struct MappingTraits<SymbolLocation::Position> { - static void mapping(IO &IO, SymbolLocation::Position &Value) { - IO.mapRequired("Line", Value.Line); - IO.mapRequired("Column", Value.Column); +struct NormalizedPosition { + using Position = clang::clangd::SymbolLocation::Position; ---------------- sammccall wrote: > I don't think there's any reason to pack the YAML encoding. The YAML here is a bit tricky, the previous code could not compile because we can not bind bit-field to Non-const references, I added another traits to keep the original YAML encoding (which is more readable). Repository: rCTE Clang Tools Extra https://reviews.llvm.org/D53363 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits