hokein added a comment. > Hmm, I think this is the same for other symbol payload e.g. definition can be > missing for a symbol. And it seems to me that the concern is on the > SymbolSlab level: if a slab is for a single TU, users should expect missing > information; if a slab is merged from all TUs, then users can expect > "complete" information. I think it's reasonable to assume that users of > SymbolSlab are aware of this. I think it's probably not worth the overhead of > maintaining and using two separate slabs.
My concerns of storing occurrences as an extra payload of `Symbol` are: - `SymbolSlab` is more like an implementation detail. Users of `SymbolIndex` are not aware of it, they only get `Symbol` objects, so it easily confuses users if they see any occurrence-related interface/member in `Symbol`. And we will write a looong comment explaining its correct behavior. It'd be better if we avoid this confusion in the API level. - The fields in `Symbol` structure are symbol properties, and could be stored in memory. However, occurrences are not, we can't guarantee that. - It seems that we are coupling `ID`, `Symbol`, `SymbolOccurrence` together: in the index implementation, we will go through ID=>Symbol=>Occurrences rather than ID=>Occurrences. > I think these options are reasonable if they turn out to be necessary. I think they are necessary. For collecting all occurrences for local symbols from the AST, we only need symbol occurrence information, other information (e.g. declaration&definition location, #include) should be discarded; Index for code completion should not collect symbol occurrences. > And making the SymbolCollector more complicated doesn't seem to be a problem > if we are indeed doing more complicated work, but I don't think this would > turn into a big problem as logic of xrefs seems pretty isolated. If xrefs is quite isolated, I think it is a good signal to have a dedicated class handling it. > I think implementing xrefs in a separate class would likely to cause more > duplicate and maintenance, e.g. two sets of options, two sets of > initializations or life-time tracking of collectors (they look a lot alike), > the same boilerplate factory code in tests, passing around two collectors in > user code. Merging xrefs to `SymbolCollector` couldn't avoid these problems, I think it is a matter of where we put these code: - different initialization of `SymbolCollector` for different use cases (e.g. setting different flags in SymbolCollectorOptions). - for dynamic index, index for xrefs and code completion would be triggered at different point: index for xrefs should happen when AST is ready; index for code completion happens when Preamble is ready; we might end up with two slabs instances in the dynamic index (1 symbol slab + 1 occurrence slab vs. 2 symbol slabs). The duplication is mainly about AST frontend action boilerplate code. To eliminate it, we could do some refactorings: - get rid of the clang ast action code in SymbolCollector, and SymbolOccurrenceCollector - introduce an `IndexSymbol` which is a subclass `index::IndexDataConsumer` - the `IndexSymbol` has two mode (indexing symbol or indexing occurrence), and dispatch ast information to `SymbolCollector`/`SymbolOccurrenceCollector`. Repository: rCTE Clang Tools Extra https://reviews.llvm.org/D50385 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits