hokein added a comment.

In https://reviews.llvm.org/D53363#1267628, @sammccall wrote:

> (I think your math is off in the description: 20 bits should be 1M lines, not 
> 4M)


Oops...Update the desccription.

> I think this is a win, as I think truncation will be rare and not terrible. 
> We should document the intentions around truncation though.
> 
> Incidentally, this means replacing just the StringRef in 
> SymbolLocation::Position with a char* would save another 13%, because that's 
> also 8 bytes.

Yeah, I have a rough patch for it, using char* will save us ~50MB memory, which 
will lead to ~300 MB memory usage in total.

> (Obviously we'd probably replace the other StringRefs too, but it gives a 
> lower bound).





================
Comment at: clangd/index/Index.h:41
   struct Position {
-    uint32_t Line = 0; // 0-based
+    static constexpr int LineBits = 20;
+    static constexpr int ColumnBits = 12;
----------------
sammccall wrote:
> (not sure these constants are needed as it stands - YAML shouldn't use them, 
> see below)
I think we need, for testing, for setters, rather than using magic number.


================
Comment at: clangd/index/Index.h:46
     // Using UTF-16 code units.
-    uint32_t Column = 0; // 0-based
+    uint32_t Column : ColumnBits; // 0-based
   };
----------------
sammccall wrote:
> If we overflow the columns, it would be much easier to recognize if the 
> result is always e.g. 255, rather than an essentially random number from 
> 0-255 (from modulo 256 arithmetic).
> 
> This would mean replacing the raw fields with setters/getters, which is 
> obviously a more invasive change. What about a compromise: add the setters, 
> and call them from the most important codepaths: `SymbolCollector` and 
> `Serialization`.
It seems reasonable to use maximum value if we encounter overflows.

> This would mean replacing the raw fields with setters/getters, which is 
> obviously a more invasive change. What about a compromise: add the setters, 
> and call them from the most important codepaths: SymbolCollector and 
> Serialization.

Actually, there is not too much usage of the raw fields in the source code, I'd 
prefer to use all setters/getters in all places (and it would make the code 
consistent and more safe). How about this plan?

1. I update the patch to change all raw fields usage to getters/setters, and 
keep these raw fields public
2. do a cleanup internally
3. make these raw fields private


================
Comment at: clangd/index/YAMLSerialization.cpp:97
 
-template <> struct MappingTraits<SymbolLocation::Position> {
-  static void mapping(IO &IO, SymbolLocation::Position &Value) {
-    IO.mapRequired("Line", Value.Line);
-    IO.mapRequired("Column", Value.Column);
+struct NormalizedPosition {
+  using Position = clang::clangd::SymbolLocation::Position;
----------------
sammccall wrote:
> I don't think there's any reason to pack the YAML encoding.

The YAML here is a bit tricky, the previous code could not compile because we 
can not bind bit-field to Non-const references, I added another traits to keep 
the original YAML encoding (which is more readable).


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D53363



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to