yihanaa added inline comments.
================ Comment at: clang/lib/Sema/SemaChecking.cpp:433 + + llvm::StringRef getFormatSpecifier(QualType T) { + if (auto *BT = T->getAs<BuiltinType>()) { ---------------- yihanaa wrote: > erichkeane wrote: > > rsmith wrote: > > > yihanaa wrote: > > > > rsmith wrote: > > > > > rsmith wrote: > > > > > > aaron.ballman wrote: > > > > > > > yihanaa wrote: > > > > > > > > I think this is better maintained in > > > > > > > > "clang/AST/FormatString.h". For example > > > > > > > > analyze_printf::PrintfSpecifier can get format specifier, what > > > > > > > > do you all think about? > > > > > > > +1 to this suggestion -- my hope is that we could generalize it > > > > > > > more then as I notice there are missing specifiers for things > > > > > > > like intmax_t, size_t, ptrdiff_t, _Decimal types, etc. Plus, that > > > > > > > will hopefully make it easier for __builtin_dump_struct to > > > > > > > benefit when new format specifiers are added, such as ones for > > > > > > > printing a _BitInt. > > > > > > I am somewhat uncertain: every one of these is making arbitrary > > > > > > choices about how to format the value, so it's not clear to me that > > > > > > this is general logic rather than something specific to > > > > > > `__builtin_dump_struct`. For example, using `%f` rather than one of > > > > > > the myriad other ways of formatting `double` is somewhat arbitrary. > > > > > > Using `%s` for any `const char*` is *extremely* arbitrary and will > > > > > > be wrong and will cause crashes in some cases, but it may be the > > > > > > pragmatically correct choice for a dumping utility. A > > > > > > general-purpose mechanism would use `%p` for all kinds of pointer. > > > > > > > > > > > > We could perhaps factor out the formatting for cases where there is > > > > > > a clear canonical default formatting, such as for integer types and > > > > > > probably `%p` for pointers, then call that from here with some > > > > > > special-casing, but without a second consumer for that > > > > > > functionality it's really not clear to me what form it should take. > > > > > I went ahead and did this, mostly to match concurrent changes to the > > > > > old implementation. There are a few cases where our existing "guess a > > > > > format specifier" logic does the wrong thing for dumping purposes, > > > > > which I've explicitly handled -- things like wanting to dump a `char` > > > > > / `signed char` / `unsigned char` member as a number rather than as a > > > > > (potentially non-printable or whitespace) character. > > > > When I was patching that old implementation, I found that for uint8_t, > > > > int8_t, Clang's existing "guess a format specifier" logic would treat > > > > the value as an integer, but for unsigned char, signed char, char > > > > types, it would Treat it as a character, please look at this example ( > > > > https://godbolt.org/z/ooqn4468T ), I guess this existing logic may have > > > > made some special judgment. > > > Yeah. I think in the case where we see some random call to `printf`, `%c` > > > is probably the right thing to guess here, but it doesn't seem > > > appropriate to me to use this in a dumping routine. If we could dump as > > > `'x'` for printable characters and as `'\xAB'` for everything else, > > > that'd be great, but `printf` can't do that itself and I'm not sure we > > > want to be injecting calls to `isprint` or whatever to make that work. > > > Dumping as an integer always seems like it's probably the least-bad > > > option. > > > > > > Somewhat related: I wonder if we should use `"\"%s\""` instead of simply > > > `"%s"` when dumping a `const char*`. That's not ideal but probably > > > clearer than the current dump output. > > I see value to having strings with SOME level of delimiter, if at least to > > handle cases when the string itself has a newline in it. > This looks good, but the string may need to be escaped (if there are some > special characters in the original string) Maybe users should wrap the printf function, and then pass the wrapper function as a parameter, so that they can do some custom things Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D124221/new/ https://reviews.llvm.org/D124221 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits