void added inline comments.
================
Comment at: clang/include/clang/AST/Designator.h:88
+  /// An array designator, e.g., "[42] = 0" and "[42 ... 50] = 1".
+  template <typename Ty> struct ArrayDesignatorInfo {
+    /// Location of the first and last index expression within the designated
----------------
rsmith wrote:
> void wrote:
> > rsmith wrote:
> > > void wrote:
> > > > rsmith wrote:
> > > > > void wrote:
> > > > > > void wrote:
> > > > > > > rsmith wrote:
> > > > > > > > Can we move the templating out from here to the whole
> > > > > > > > `Designator` and `Designation` classes? It shouldn't be
> > > > > > > > possible to mix the two kinds in the same `Designation`.
> > > > > > > Grr... My previous comment was eaten.
> > > > > > >
> > > > > > > I'll give it a shot.
> > > > > > >
> > > > > > > However, I'm a bit surprised at how designators are handled by
> > > > > > > Clang. I expected that a `Designation` would be an `Expr` with
> > > > > > > the `Designator`s being L-values (e.g. `MemberExpr`s /
> > > > > > > `ArraySubscriptExpr`s), but instead the `Designation` exists just
> > > > > > > long enough to be turned into an explicit initialization list. Is
> > > > > > > there a reason to do it that way instead of using expressions?
> > > > > > So it looks like moving the template outside of the class won't
> > > > > > work. The ability to switch between `Expr` and `unsigned` while
> > > > > > retaining the same overall type is hardwired into things like the
> > > > > > `ASTImporter`.
> > > > > >
> > > > > > This is kind of a massive mess. Maybe we shouldn't even allow them
> > > > > > to use both `Expr` and `unsigned` but instead require them to use
> > > > > > one or the other? Maybe we could require `unsigned` with the
> > > > > > understanding that the `Expr` can be converted into a constant?
> > > > > I'm not understanding something. Currently the `ASTImporter` only
> > > > > deals with `DesignatedInitExpr::Designator`s, which only ever store
> > > > > integer indexes.
> > > > >
> > > > > Basically, today, we have two different classes:
> > > > > - A class that's specific to `DesignatedInitExpr`, and tracks array
> > > > > index expressions by storing the index of the expression within the
> > > > > `DesignatedInitExpr`'s list of children; this is also what
> > > > > `ASTImporter` can import, because it's the one that's used in the
> > > > > AST's representation.
> > > > > - A class that's specific to `Sema`'s processing that tracks array
> > > > > index expressions as `Expr*` instead.
> > > > >
> > > > > You want to refactor them to share code, which makes sense, because
> > > > > they are basically the same other than how they refer to expressions.
> > > > > (Not quite: `DesignatedInitExpr` can apparently refer to a field
> > > > > either as an `IdentifierInfo*` or as a `FieldDecl*`, whereas the
> > > > > `Sema` version always uses the `IdentifierInfo*` representation.)
> > > > >
> > > > > Each current user of one of these two classes uses only one of the
> > > > > two, which means they're either exclusively using integers to refer
> > > > > to expressions or exclusively using `Expr*`. So it seems to me that
> > > > > you should be able to update each user to use either
> > > > > `Designator<unsigned>` or `Designator<Expr*>`, depending on which
> > > > > class they used before.
> > > > >
> > > > > What am I missing?
> > > > I'm still allowing them to use a `Designator<unsigned>` /
> > > > `Designator<Expr*>` as they see fit, only it's hidden from them via the
> > > > `Create` methods. I personally find the use of two different versions
> > > > (one using `unsigned` and one using `Expr*`) completely baffling. Why
> > > > can't they all use `Expr*`? Also the `ASTImporter` only outputs the
> > > > start of an array init range, which is at the very least
> > > > counter-intuitive. That's one of the issues I'd like to tackle with
> > > > follow-up patches, hopefully getting rid of the need for this template
> > > > altogether.
> > > > This does mean that in the interim a non-array range
> > > > designator will have extra `End` & `EllipsisLoc` fields that aren't
> > > > used, but that shouldn't be too horrible, given that they'd be there
> > > > anyway because of the union.
> > > The reason that's jumping out at me for having separate integer / `Expr*`
> > > implementations here is space-efficiency -- we get to make array range
> > > designators (and hence designators overall) be only 16 bytes rather than
> > > the 32 bytes they occupy in this patch (assuming 64-bit pointers) by
> > > storing indexes instead of pointers.
> > >
> > > If your eventual plan is to remove the children list from
> > > `DesignatedInitExpr`, and store the pointers only in the designators,
> > > that seems to cost 8 bytes per designator in the two common cases:
> > >
> > > - For a field designator: 32 bytes (with 16 bytes of padding) versus 16
> > > bytes + 8 bytes for the child pointer today
> > > - For an array designator: 32 bytes (with 16 bytes of padding) versus 16
> > > bytes + 8 bytes for the child pointer today
> > > - For an array range designator: 32 bytes (4 bytes of padding) versus 16
> > > bytes + 16 bytes for two child pointers today
> > >
> > > ... plus it'll presumably be painful to make the `Stmt` child iterator be
> > > able to handle this.
> > >
> > > If you don't remove the separate children list from `DesignatedInitExpr`,
> > > then it seems like this approach will cost 16 bytes per designator in all
> > > cases, and we'll need to be careful in AST serialization /
> > > deserialization that we don't accidentally duplicate the `Expr`s that now
> > > have two pointers pointing to them instead of one, and likewise anywhere
> > > else that assumes each `Expr` is only reachable by one path through the
> > > AST (e.g., `TreeTransform`, the recursive AST visitor).
> > >
> > > I think some more visibility into the eventual plan would help.
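
As a rough illustration of the 16-byte versus 32-byte figures in the comment above, here is a minimal sketch that assumes a typical 64-bit target and 32-bit source locations. `SourceLoc`, `ArrayRangeInfo`, `CompactArrayRangeInfo`, and `pointerOverheadBytes` are hypothetical names made up for this example; they are not the declarations in `Designator.h` or `DesignatedInitExpr`, and the compact variant assumes a range's second expression can be found next to the first in a child list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration; NOT Clang's real declarations.
struct Expr;                      // opaque expression node
using SourceLoc = std::uint32_t;  // assumed 32-bit encoded source location

// An array-range designator parameterized on how it refers to its index
// expressions: Ty = unsigned (position in a child list) or Ty = Expr *.
template <typename Ty> struct ArrayRangeInfo {
  Ty Start;               // first index expression
  Ty End;                 // last index expression (unused for "[42] = 0")
  SourceLoc LBracketLoc;
  SourceLoc EllipsisLoc;  // unused for a non-range designator
  SourceLoc RBracketLoc;
};

// Compact variant: a single 32-bit index plus three locations. Assumption:
// the end expression of a range sits right after the start expression in
// the child list, so only one index needs to be stored.
struct CompactArrayRangeInfo {
  std::uint32_t Index;
  SourceLoc LBracketLoc;
  SourceLoc EllipsisLoc;
  SourceLoc RBracketLoc;
};

// Four 32-bit members with no padding.
static_assert(sizeof(CompactArrayRangeInfo) == 16,
              "compact designator should be 16 bytes");

// Extra bytes per designator paid for storing pointers instead of an index.
// With 8-byte pointers: 8 + 8 + 12 = 28 bytes of fields, padded to 32.
constexpr std::size_t pointerOverheadBytes() {
  return sizeof(ArrayRangeInfo<Expr *>) - sizeof(CompactArrayRangeInfo);
}
```

The pointer-based instantiation is larger because two pointers replace one 32-bit index, and pointer alignment then adds tail padding. The exact numbers depend on the target ABI, so treat them as indicative only.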
> > The plan isn't detailed, but I basically want to address several of the
> > points you mentioned here. In particular, I think the structure of
> > `DesignatedInitExpr` is backwards from how every other `Expr` is handled in
> > Clang. For instance, the `Expr` for something like `s.t.u` is a
> > `MemberExpr` with a `MemberExpr` as its sub-expression and so on.
> > `DesignatedInitExpr` on the other hand basically has a list of maybe
> > expressions, maybe integers that refer to parts of the structure / array.
> > It seems cleaner to me to use the `MemberExpr` / `ArraySubscriptExpr` way
> > of referring to the member being initialized rather than using a
> > specialized list that has to be handled differently from other `Expr`s.
> >
> > The first step in my evil plot is to do this simple refactoring, so that
> > there's no initial functionality change, before I do the more invasive
> > changes that may break things.
> >
> > I'm doing this because I'm working on a feature that uses the `DIE` syntax,
> > and it would be much simpler to have it be a `MemberExpr`.
> >
> > Am I completely off base here?
> Thanks, that's really helpful.
>
> OK, so the end design would be something like (just building out some details
> here so I can think about this better; I'm not expecting you would
> necessarily do exactly this!):
> - `DesignatedInitExpr` becomes a base class with derived classes for member
> designators, array designators, and array range designators. (And maybe also
> there's a different representation for unresolved versus resolved member
> designators.)
> - These work much like `MemberExpr` / `ArraySubscriptExpr`, except that they
> don't have a "base" object (and instead store an initializer for the
> nominated field / array element(s)), and are "inside-out": for `.t.u = x` we
> want the top-level expression to be a `DesignatedInitMemberExpr` that names
> `t` and whose initializer is a `DesignatedInitMemberExpr` that names `u` and
> whose initializer is `x`, whereas for `s.t.u` the top-level expression is a
> `MemberExpr` that names `u` and its base subexpression is a `MemberExpr` that
> names `t`. That is, while we model `s.t.u = x` as `((s.t).u) = x`, we model
> `.t.u = x` more like `.t = (.u = x)`.
> - We use the same representation both in `DesignatedInitExpr` and in `Sema`
> (and everywhere else that looks at the syntactic form of an initializer list).
> - `Designation` and `Designator` are removed entirely.
>
> If something like that is the plan, then yes, I think that's completely
> reasonable, and seems like a nice improvement -- and I think it's fine if the
> intermediate state increases the AST size for designated initializers, as
> this patch does, especially if you're aiming for this to be completed within
> a single release cycle.
>
> (Just quickly checking the size of the representation: a single-field
> `DesignatedInitExpr` is currently 40 bytes + 16 bytes for the `Designator`
> array, and I think could easily be 40 bytes *total* with the new
> representation; with two fields, it's currently 40 + 32 and could be 40 + 40
> with the new representation. I think that's going to be a win essentially
> always.)

Thanks! I'm going to dig into the DIE stuff more and will have a much more detailed plan that I can give you to see what you think before I go too far down the wrong path. :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140584/new/

https://reviews.llvm.org/D140584

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
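
To make the "inside-out" nesting discussed in the thread above concrete, here is a small self-contained sketch. It is only an illustration under stated assumptions: `Node`, `Name`, `MemberAccess`, `DesignatedMemberInit`, and the two example functions are hypothetical toy types invented here, not Clang classes and not the design either participant committed to. It builds `s.t.u` with the last field outermost and `.t.u = x` with the first field outermost, then prints each tree with explicit parentheses.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical toy AST for illustration only; these are not Clang classes.
struct Node {
  virtual ~Node() = default;
  virtual std::string str() const = 0;
};

// A plain identifier such as `s` or `x`.
struct Name : Node {
  std::string Id;
  explicit Name(std::string I) : Id(std::move(I)) {}
  std::string str() const override { return Id; }
};

// Ordinary member access: the outermost node names the LAST field and
// holds the rest of the chain as its base subexpression.
struct MemberAccess : Node {
  std::unique_ptr<Node> Base;
  std::string Field;
  MemberAccess(std::unique_ptr<Node> B, std::string F)
      : Base(std::move(B)), Field(std::move(F)) {}
  std::string str() const override {
    return "(" + Base->str() + "." + Field + ")";
  }
};

// Designated member initializer: the outermost node names the FIRST field
// and holds an initializer, which may itself be another designated init.
struct DesignatedMemberInit : Node {
  std::string Field;
  std::unique_ptr<Node> Init;
  DesignatedMemberInit(std::string F, std::unique_ptr<Node> I)
      : Field(std::move(F)), Init(std::move(I)) {}
  std::string str() const override {
    // Parenthesize only when the initializer is a nested designated init.
    bool Nested =
        dynamic_cast<const DesignatedMemberInit *>(Init.get()) != nullptr;
    std::string Inner = Init->str();
    return "." + Field + " = " + (Nested ? "(" + Inner + ")" : Inner);
  }
};

// Builds `s.t.u`: outermost node is `u`, its base is `s.t`.
std::string memberAccessExample() {
  auto S = std::make_unique<Name>("s");
  auto ST = std::make_unique<MemberAccess>(std::move(S), "t");
  MemberAccess STU(std::move(ST), "u");
  return STU.str();
}

// Builds `.t.u = x`: outermost node is `t`, its initializer is `.u = x`.
std::string designatedInitExample() {
  auto X = std::make_unique<Name>("x");
  auto U = std::make_unique<DesignatedMemberInit>("u", std::move(X));
  DesignatedMemberInit T("t", std::move(U));
  return T.str();
}
```

In this toy model `memberAccessExample()` yields `((s.t).u)` and `designatedInitExample()` yields `.t = (.u = x)`, which mirrors the two groupings contrasted in the review comment.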