On Mon, 2019-07-01 at 13:07 +0300, Vladislav Ivanishin wrote: > This change is threefold: > - enable pretty printing of vec<>, not just vec<>* > - generalize 'vec<(\S+), (\S+), (\S+)>' regex, which is limiting > - extend to work for vl_ptr layout (only vl_embed was supported) > > The motivating example for all three is a vector of vectors in > tree-ssa-uninit.c: > > typedef vec<pred_info, va_heap, vl_ptr> pred_chain; > typedef vec<pred_chain, va_heap, vl_ptr> pred_chain_union; > > (This predictably expands to > vec<vec<pred_info, va_heap, vl_ptr>, va_heap, vl_ptr> > in gdb, if we use strip_typedefs() on it -- otherwise it's just > pred_chain_union. The first patch of the series takes care of that.) > > This example features "direct" vecs themselves rather than pointers > to > vecs, which is not a rare case; in fact, a quick grep showed that the > number of declarations of direct vecs is about the same as that of > pointers to vec. > > The GDB's Python API seems to do dereferencing automatically (i.e. it > detects whether to use '->', or '.' for field access), allowing to > use > the same code to access fields from a struct directly as well as > through > a pointer. > > This makes it possible to reuse VecPrinter (meant for vec<> *) for > plain > vec<>, but perhaps it would be beneficial to handle all pointers in a > generic way e.g. print the address and then print the dereferenced > object, letting the appropriate pretty printer to pick it up and do > its > thing, if one is provided for the type. > > The output I have for direct vecs now: > > (gdb) p *bb->succs > $94 = @0x7ffff76569b0 = {<edge 0x7ffff7654870 (2 -> 5)>, <edge > 0x7ffff76548a0 (2 -> 3)>} > > Compare that to > > (gdb) p bb->succs > $70 = 0x7ffff76569b0 = {<edge 0x7ffff7654870 (2 -> 5)>, <edge > 0x7ffff76548a0 (2 -> 3)>} > > Hope the choice of the '@' sign to represent the address of an object > taken by the pretty printer is not confusing? (GDB itself uses this > notation for references, like so: (edge_def *&) @0x7ffff7656c38.)
Thanks for the patch. I like the notation idea. [...] > diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py > index b036704b86a..a7e665cadb9 100644 > --- a/gcc/gdbhooks.py > +++ b/gcc/gdbhooks.py > @@ -128,12 +128,18 @@ and here's a length 1 vec: > (gdb) p bb->succs > $19 = 0x7ffff0428bb8 = {<edge 0x7ffff044d3f0 (5 -> EXIT)>} > > -You cannot yet use array notation [] to access the elements within the > -vector: attempting to do so instead gives you the vec itself (for vec[0]), > -or a (probably) invalid cast to vec<> for the memory after the vec (for > -vec[1] onwards). > +Direct instances of vec<> are also supported (their addresses are > +prefixed with the '@' sign). > > -Instead (for now) you must access m_vecdata: > +In case you have a vec<> pointer, to use array notation [] to access the > +elements within the vector you must first dereference the pointer, just > +as you do in C: > + (gdb) p (*bb->succs)[0] > + $50 = (edge_def *&) @0x7ffff7656c38: <edge 0x7ffff7654ab0 (9 -> 10)> > + (gdb) p bb->succs[0][0] > + $51 = (edge_def *&) @0x7ffff7656c38: <edge 0x7ffff7654ab0 (9 -> 10)> That's a nice improvement. If the user erroneously attempts, say: (gdb) p bb->succs[2] are they still accessing whatever's in memory after the vector pointer? If so, I think the comment needs to retain some kind of warning about this, as it's a nasty "gotcha". > +Another option is to access m_vecdata: > (gdb) p bb->preds->m_vecdata[0] > $20 = <edge 0x7ffff044d380 (3 -> 5)> > (gdb) p bb->preds->m_vecdata[1] [...] > @@ -441,6 +447,10 @@ class VecPrinter: > # -ex "up" -ex "p bb->preds" > def __init__(self, gdbval): > self.gdbval = gdbval > + if self.gdbval.type.code == gdb.TYPE_CODE_PTR: > + self.ptr = intptr(self.gdbval) > + else: > + self.ptr = None I think this could use a comment. If we have a NULL "vec *", then presumably self.ptr will be 0... > def display_hint (self): > return 'array' > @@ -448,14 +458,25 @@ class VecPrinter: > def to_string (self): > # A trivial implementation; prettyprinting the contents is done > # by gdb calling the "children" method below. > - return '0x%x' % intptr(self.gdbval) > + if self.ptr: > + return '0x%x' % self.ptr > + else: > + return '@0x%x' % intptr(self.gdbval.address) ...and if we have a NULL vec *, with self.ptr as 0, it will take the "else" path here, which I think is meant for the non-pointer case. Should the condition be "if self.ptr is not None"? > def children (self): > - if intptr(self.gdbval) == 0: > + if self.ptr == 0: > return > - m_vecpfx = self.gdbval['m_vecpfx'] > + m_vec = self.gdbval > + # There are 2 kinds of vec layouts: vl_embed, and vl_ptr. The latter > + # is implemented with the former stored in the m_vec field. Sadly, > + # template_argument(2) won't work: `gdb.error: No type named > vl_embed`. > + try: > + m_vec = m_vec['m_vec'] > + except: > + pass I don't like the use of naked "except"; I always worry it might silently swallow a SyntaxError at some later revision of the code. Can the code be more specific about what exception class(es) to swallow? (probably just gdb.Error) [...] FWIW I'm wishing we had automated test coverage for gdbhooks.py, but that's probably overkill, I'm not sure how to go about doing it. Dave