AaronBallman wrote: > > We're currently inconsistent regarding attributes; sometimes we include the > > attributes in the range, sometimes we don't. But declaration source ranges > > are tough to reason about in general because there's many different moving > > parts. > > I totally agree. I was asking to change them deliberately and systematically > for the same reason. Such changes, even if they make sense and look good in > isolation, introduce more chaos into the system. As they change one set of > random behavior with another and each of these changes need to be handled by > source tooling consumers.
+1 > > Consider: `#pragma clang section bss = "test" .. static int i = 12;` > > I'd rather not dive into the weeds here and make a general decision around > whether we should refrain from such changes to semantics of source locations. > I am happy to discuss how we can improve things systematically in a separate > medium. I think we need to understand what we want before we can make decisions on what needs changing, though. Are there invariants we want to introduce, like the source range for the AST node should encompass the source locations tracked within the AST node? Or are we fine with the AST node tracking source locations which exist outside of the source range for the node itself? How do we want users of the AST to understand what the source range represents? > > I suspect the least amount of churn for tools is to continue to go with the > > maximal range and improve what we track at the edges. But I also suspect > > that will always be best-effort and tools are going to have to handle those > > on a case-by-case basis sometimes. > > I am also leaning towards this direction, having all that information > available at least gives the tools always something to build on top (rather > than lacking information, which is a lot harder to recover). I think there are separate problems here. We are missing source location information, definitely. For example, we do not track source locations for storage class specifiers. But that's orthogonal to answering the question of whether the source range for a `VarDecl` node should include or exclude storage class specifiers. > But I think we should be tackling that as a separate problem, if we want to > reduce churn and regressions for source tools. E.g. if we decide that we're > going to fix source ranges involving attribute tokens, we should try our best > to cover all the places that parse/contain attributes. Instead of improving > things here and there as we're making other changes. That's just death by > thousand cuts, and most of the time these will be small enough regressions > that no one prioritizes/fixes. +1, that's why I'm trying to understand what the underlying goals are for the source range information itself. Historically, source location tracking has primarily been about diagnostic quality. We've been shifting that focus organically over the years to be about diagnostic quality as well as AST fidelity for tooling and now there's some tension from that shift. So if we're going to add some rigor, we need to figure out what makes sense. I am currently leaning towards the idea that the source range for an AST node should be the union of the ranges tracked by the node and its children. e.g., the declaration `static constexpr int i = 100;` would have the source range covering `int i = 100` but not `static`, `constexpr`, or `;`. Someday, if we start tracking source locations for the storage class specifiers, the range may expand to include those. So we can mechanically build up the source range by using the tracked information from the AST rather than manually deciding the range when parsing and storing it in the AST. But then: macros make things discontiguous, so maybe that's not a good model? https://github.com/llvm/llvm-project/pull/133107 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits