Hi Enrico,

Any suggestion or example of how to add a data formatter for our own STL string? From the output below I can see we are using our own "*fbstring_core*", so I assume I need to write a type summary for this type:
frame variable corpus -T
(const string &const) corpus = error: summary string parsing error: {
  (std::*fbstring_core*<char>) store_ = {
    (std::*fbstring_core*<char>::(anonymous union)) = {
      (char [24]) small_ = "www"
      (std::fbstring_core<char>::MediumLarge) ml_ = {
        (char *) data_ = 0x0000000000777777 "H\x89U\xa8H\x89M\xa0L\x89E\x98H\x8bE\xa8H\x89��_U��D\x88e�H\x8bE\xa0H\x89��]U��H\x89�H\x8dE�H\x89�H\x89�����L\x8dm�H\x8bE\x98H\x89��IU��\x88]�L\x8be\xb0L\x89��
        (std::size_t) size_ = 0
        (std::size_t) capacity_ = 1441151880758558720
      }
    }
  }
}

Thanks.
Jeffrey

On Mon, Mar 28, 2016 at 11:38 AM, Enrico Granata <egran...@apple.com> wrote:

> This is kind of orthogonal to your problem, but the reason why you are not
> seeing the kind of simplified printing Greg is suggesting, is because your
> std::string doesn’t look like any of the kinds we recognize
>
> Specifically, LLDB data formatters work by matching against type names,
> and once they recognize a typename, then they try to inspect the variable
> in order to grab a summary
> In your example, your std::string exposes a layout that we are not
> handling - hence we bail out of the formatter and we fall back to the raw
> view
>
> If you want pretty printing to work, you’ll need to write a data formatter
>
> There are a few avenues. The obvious easy one is to extend the existing
> std::string formatter to recognize your type’s internal layout.
> If one were signing up for more infrastructure work, they could decide to
> try and detect shared library loads and load formatters that match with
> whatever libraries are being loaded.
>
> On Mar 28, 2016, at 9:47 AM, Greg Clayton via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>
> So you need to be prepared to escape any text that can have special
> characters. A "std::string" or any container can contain special
> characters. If you are encoding stuff into JSON, you will either need to
> escape any special characters, or hex encode the string into ASCII hex
> bytes.
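The two options Greg lists (escape the text, or hex encode it) could look roughly like the sketch below. This is only an illustration in Python 3, not code from LLDB or from the tool being discussed; the helper name `to_json_safe` and the sample bytes are made up for the example.

```python
import binascii
import json


def to_json_safe(raw):
    """Return two JSON-safe renderings of arbitrary bytes (hypothetical helper)."""
    # Option 1: map every byte to the code point with the same value
    # (latin-1 accepts all 256 byte values), then let json.dumps escape
    # quotes, control characters and non-ASCII characters.
    escaped = json.dumps(raw.decode("latin-1"))
    # Option 2: hex encode the bytes so only [0-9a-f] ever reaches JSON.
    hex_encoded = binascii.hexlify(raw).decode("ascii")
    return escaped, hex_encoded


# Bytes a debugger might hand back: text, a NUL, a quote, a high byte, a newline.
sample = b'www\x00"\xc9\n'
escaped, hex_encoded = to_json_safe(sample)
```

Both forms round-trip: `json.loads(escaped).encode("latin-1")` and `binascii.unhexlify(hex_encoded)` give back the original bytes, so the consumer can choose how to present them.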
>
> In debuggers we often get bogus data because variables are not
> initialized, but the compiler tells us that a variable is valid in address
> range [0x1000-0x2000), but it actually is [0x1200-0x2000). If we read a
> variable in this case, a std::string might contain bogus data and the bytes
> might not make sense. So you always have to be prepared for bad data.
>
> If we look at:
>
> store_ = {
>    = {
>     small_ = "www"
>     ml_ = (data_ =
>
> "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
> size_ = 0, capacity_ = 1441151880758558720)
>   }
> }
> }
>
> We can see the "size_" is zero, and capacity_ is 1441151880758558720
> (which is 0x1400000000000000). "data_" seems to be some random pointer.
>
> On MacOSX, we have a special formatting code that displays std::string in
> CPlusPlusLanguage.cpp that gets installed in the LoadLibCxxFormatters() or
> LoadLibStdcppFormatters() functions with code like:
>
> lldb::TypeSummaryImplSP std_string_summary_sp(new CXXFunctionSummaryFormat(
>     stl_summary_flags,
>     lldb_private::formatters::LibcxxStringSummaryProvider,
>     "std::string summary provider"));
>
> cpp_category_sp->GetTypeSummariesContainer()->Add(
>     ConstString("std::__1::string"), std_string_summary_sp);
>
> Special flags are set on std::string to say "don't show children of this
> and just show a summary" So if a std::string contained "hello".
> So for the following code:
>
> std::string h ("hello");
>
> You should just see:
>
> (lldb) fr var h
> (std::__1::string) h = "hello"
>
> If you take a look at the normal value in the raw we see:
>
> (lldb) fr var --raw h
> (std::__1::string) h = {
>   __r_ = {
>     std::__1::__libcpp_compressed_pair_imp<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__rep, std::__1::allocator<char>, 2> = {
>       __first_ = {
>          = {
>           __l = {
>             __cap_ = 122511465736202
>             __size_ = 0
>             __data_ = 0x0000000000000000
>           }
>           __s = {
>              = {
>               __size_ = '\n'
>               __lx = '\n'
>             }
>             __data_ = {
>               [0] = 'h'
>               [1] = 'e'
>               [2] = 'l'
>               [3] = 'l'
>               [4] = 'o'
>               [5] = '\0'
>               [6] = '\0'
>               [7] = '\0'
>               [8] = '\0'
>               [9] = '\0'
>               [10] = '\0'
>               [11] = '\0'
>               [12] = '\0'
>               [13] = '\0'
>               [14] = '\0'
>               [15] = '\0'
>               [16] = '\0'
>               [17] = '\0'
>               [18] = '\0'
>               [19] = '\0'
>               [20] = '\0'
>               [21] = '\0'
>               [22] = '\0'
>             }
>           }
>           __r = {
>             __words = {
>               [0] = 122511465736202
>               [1] = 0
>               [2] = 0
>             }
>           }
>         }
>       }
>     }
>   }
> }
>
> So the main question is why are our "std::string" formatters not kicking
> in for you. That comes down to a typename match, or the format of the
> string isn't what the formatter is expecting.
>
> But again, since your std::string can contain anything, you will need to
> escape any and all text that is encoded into JSON to ensure it doesn't
> contain anything JSON can't deal with.
>
> On Mar 27, 2016, at 9:20 PM, Jeffrey Tan via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>
> Thanks Siva. All the DW_TAG_member related errors seem to go away after
> patching with your fix. The current problem is handling the decoding.
>
> Here is the correct decoding from gdb which might be useful:
> (gdb) p corpus
> $3 = (const std::string &) @0x7fd133cfb888: {
>   static npos = 18446744073709551615, store_ = {
>     static kIsLittleEndian = <optimized out>,
>     static kIsBigEndian = <optimized out>, {
>       small_ = "www", '\000' <repeats 20 times>, "\024", ml_ = {
>         data_ = 0x777777 <std::_Any_data::_M_access<void
> folly::fibers::Baton::waitFiber<folly::fibers::FirstArgOf<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1},
> void>::type::value_type
> folly::fibers::await<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}>(folly::fibers::FirstArgOf&&)::{lambda()#1}>(folly::fibers::FiberManager&,
> folly::fibers::FirstArgOf<folly::fibers::FirstArgOf<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1},
> void>::type::value_type
> folly::fibers::await<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}>(folly::fibers::FirstArgOf&&)::{lambda()#1},
> void>::type::value_type)::{lambda(folly::fibers::Fiber&)#1}*>() const+25>
> "\311\303UH\211\345H\211}\370H\213E\370]ÐUH\211\345H\203\354\020H\211}\370H\213E\370H\211\307\350~\264\312\377\220\311\303UH\211\345SH\203\354\030H\211}\350H\211u\340H\213E\340H\211\307\350\236\377\377\377H\213\030H\213E\350H\211\307\350O\264\312\377H\211ƿ\b",
>         size_ = 0,
>         capacity_ = 1441151880758558720}}}}
>
> Utf-16 does not seem to decode it, while 'latin-1' does:
>
> >>> '\xc9'.decode('utf-16')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/mnt/gvfs/third-party2/python/55c1fd79d91c77c95932db31a4769919611c12bb/2.7.8/centos6-native/da39a3e/lib/python2.7/encodings/utf_16.py", line 16, in decode
>     return codecs.utf_16_decode(input, errors, True)
> UnicodeDecodeError: 'utf16' codec can't decode byte 0xc9 in position 0: truncated data
> >>> '\xc9'.decode('latin-1')
> u'\xc9'
>
> Instead of guessing what kind of decoding I should use, I would use
> 'ensure_ascii=False' to prevent the crash for now.
>
> I tried to reproduce this crash, but it seems that the crash might be
> related to some internal STL implementation we are using. I will see if I
> can narrow it down to a small repro later.
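The decoding behaviour described above can be checked with a small sketch. It is written for Python 3 (the session in the thread is Python 2.7, where the syntax differs slightly), and the helper names `decode_debugger_bytes` and `utf16_can_decode` are invented for this example. The idea is to try UTF-8 first and fall back to latin-1, which maps every byte value to a code point and therefore cannot fail.

```python
def decode_debugger_bytes(raw):
    """Decode bytes read from the debuggee without ever raising (sketch)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps each byte 0x00-0xFF to the code point of the same
        # value, so this fallback always succeeds, even on garbage.
        return raw.decode("latin-1")


def utf16_can_decode(raw):
    """Report whether the bytes form a complete UTF-16 sequence."""
    try:
        raw.decode("utf-16")
        return True
    except UnicodeDecodeError:
        return False


text = decode_debugger_bytes(b"\xc9")   # falls back to latin-1
ok = utf16_can_decode(b"\xc9")          # a single byte is truncated UTF-16
```

A latin-1 fallback does not make the bytes meaningful; it only guarantees a lossless, printable representation that can then be escaped for JSON.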
>
> Thanks
> Jeffrey
>
> On Sun, Mar 27, 2016 at 2:49 PM, Siva Chandra <sivachan...@gmail.com> wrote:
> On Sat, Mar 26, 2016 at 11:58 PM, Jeffrey Tan <jeffrey.fu...@gmail.com> wrote:
>
> Btw: after patching with Siva's fix http://reviews.llvm.org/D18008, the
> first field 'small_' is fixed, however the second field 'ml_' still emits
> garbage:
>
> (lldb) fr v corpus
> (const string &const) corpus = error: summary string parsing error: {
>   store_ = {
>      = {
>       small_ = "www"
>       ml_ = (data_ =
>
> "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
> size_ = 0, capacity_ = 1441151880758558720)
>     }
>   }
> }
>
> Do you still see the DW_TAG_member related error?
>
> A wild (and really wild at that) guess: Is it utf16 data that is being
> decoded as utf8?
>
> As David Blaikie mentioned on the other thread, it would really help
> if you provide us with a minimal example to repro this. At least, repro
> instructions.
>
> _______________________________________________
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>
> Thanks,
> *- Enrico*
> 📩 egranata@.com ☎️ 27683
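Coming back to the question at the top of this message, the core of a type summary for "fbstring_core" would be a function that decodes the 24-byte storage. The sketch below is untested against real LLDB and rests on assumptions: a little-endian 64-bit target, the top two bits of the last byte acting as a category tag (0 meaning the small, inline form), and small strings storing 23 minus their length in that last byte. The gdb dump agrees with that reading ("\024" is 20, which is 23 - 3 for "www"), but the tag values are recalled from folly's fbstring and should be checked against the headers actually in use. `decode_fbstring` and `read_memory` are hypothetical names.

```python
import struct

MAX_SMALL_SIZE = 23    # assumed: sizeof(MediumLarge) - 1 on a 64-bit target
CATEGORY_MASK = 0xC0   # assumed: category tag in the top two bits of byte 23


def decode_fbstring(storage, read_memory=None):
    """Decode a 24-byte fbstring_core<char> image into bytes (sketch).

    storage: the 24 raw bytes of store_.
    read_memory: hypothetical callback (address, size) -> bytes, needed
    only when the string is heap-backed (medium/large).
    """
    if len(storage) != 24:
        raise ValueError("expected 24 bytes of storage")
    last = storage[23]
    if last & CATEGORY_MASK == 0:
        # Small string: characters are inline in small_, and the last
        # byte holds MAX_SMALL_SIZE - size.
        size = MAX_SMALL_SIZE - last
        if size < 0:
            raise ValueError("corrupt small-string length byte")
        return storage[:size]
    # Otherwise read ml_: data_ pointer, size_, capacity_ (three 64-bit words).
    data_ptr, size, _capacity = struct.unpack("<QQQ", storage)
    if read_memory is None:
        raise ValueError("heap-backed string needs a read_memory callback")
    return read_memory(data_ptr, size)


# The storage from the dump: "www", twenty NUL bytes, then 0x14.
image = b"www" + b"\x00" * 20 + b"\x14"
value = decode_fbstring(image)
as_ml = struct.unpack("<QQQ", image)
```

Here `value` is `b"www"`, while `as_ml` reproduces the misleading `ml_` view from the dumps (data_ = 0x777777, size_ = 0, capacity_ = 1441151880758558720). That is why following `data_` prints garbage: for a small string the pointer field is just the first characters reinterpreted as an address. In LLDB such a function could be called from a Python summary registered with `type summary add -F`, with the 24 bytes read from the value's data; that wiring is not shown here.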