VitaNuo added a comment. Thanks for the comments!
================ Comment at: clang/tools/include-mapping/cppreference_parser.py:196 + "std::remove$": "algorithm", + "std::atomic.*": "atomic", + ---------------- kadircet wrote: > there's no variant of "std::atomic.*" called "atomic", in > https://en.cppreference.com/w/cpp/symbol_index. is it something in 2018 > version of the html book? otherwise there's only the unnamed variant and > std::shared_ptr variant, and we should be preferring unnamed variant. This is obsolete. As discussed offline, I've removed all the special-casing logic. All the variants are skipped now without exceptions. ================ Comment at: clang/tools/include-mapping/cppreference_parser.py:107 + # FIXME: only consider first symbol, assume further symbols are overloads + all_headers = sorted(list(all_headers)) + if len(all_headers) > 0: ---------------- kadircet wrote: > VitaNuo wrote: > > hokein wrote: > > > why sort the headers? > > This was a hack to make the generator return `atomic` rather than `memory` > > for `std::atomic.*` symbols. But I've noticed now that this adds some > > trouble to the C symbol map. I think now that cutting `all_headers` to the > > first element is wrong. > > I will special-case the `std::atomic.*` symbols instead, because so far I > > don't see a way to solve this programmatically without other collateral > > effects. > > > > Note that the problem is not `std::atomic` itself (for which Kadir > > extensively described the solution in some of the other comments), but > > rather `atomic_bool`, `atomic_char` etc. mentioned further in the page. The > > headers for them come from `all_headers` rather than `symbol_headers`, and > > there seems to be no way to limit them to `<atomic>` without other > > collateral effects. > > I think now that cutting all_headers to the first element is wrong. > > What's exactly breaking once you do that? (I guess by `first element` you > still mean "all the headers we've seen until the first declaration in the > page") > > > but rather atomic_bool, atomic_char etc. mentioned further in the page. > > The headers for them come from all_headers rather than symbol_headers, and > > there seems to be no way to limit them to <atomic> without other collateral > > effects. > > I guess this is happening because we're trying to find headers that are for > the declaration block containing the interesting symbol name. Any idea if > that logic is winning us anything? e.g. if we dropped the ` if > symbol_name in found_symbols:` what does change? > I guess by first element you still mean "all the headers we've seen until the > first declaration in the page" No, I mean the first element of the `all_headers` collection. `all_headers` is a collection that collects _all_ the headers in the page and, in case a symbol is not defined in a table with an individual header block, `all_headers` in the page are returned as an approximation. > What's exactly breaking once you do that? It is essentially arbitrary to limit `all_headers` to the first element (at the moment it would be the first element alphabetically). E.g., for the `INT8_C` and other constants [here](https://en.cppreference.com/w/cpp/types/integer), the result would be `inttypes.h` and `stdint.h` would be cut, so this result doesn't make sense. > if we dropped the `if symbol_name in found_symbols` what does change? Then all the headers in the table will be attributed to all the symbols in the table. Consider [this page](https://en.cppreference.com/w/cpp/numeric/math/div): `std::imaxdiv` would be reported as defined both in `cstdlib` and `cinttypes`, which is wrong. > I guess this is happening because we're trying to find headers that are for > the declaration block containing the interesting symbol name. Not really. Headers for `atomic_bool`, `atomic_char` etc. come from `all_headers` rather than `symbol_headers`. The reason for that is they are defined in a table which doesn't have its own header block, so we are trying to approximate the headers by naming all the headers in the file. ================ Comment at: clang/tools/include-mapping/cppreference_parser.py:165 # FIXME: use these as a fallback rather than ignoring entirely. - variants_for_symbol = variants_to_accept.get( - (namespace or "") + symbol_name, ()) - if variant and variant not in variants_for_symbol: + header_to_accept = variants_to_accept.get( + (namespace or "") + symbol_name, "") ---------------- kadircet wrote: > VitaNuo wrote: > > kadircet wrote: > > > `variant` is not necessarily the `header`, eg: > > > ``` > > > acos() > > > acos<>() (std::complex) (since C++11) > > > acos<>() (std::valarray) > > > ``` > > > > > > in this case variants are `std::complex` and `std::valarray`. > > > > > > hence we're not trying to "infer" the header we want to preserve but > > > rather decide on which symbol page we want to parse. we should still > > > accept "all the headers" mentioned in that variant symbol page. > > Ah ok, I was mostly looking at `std::remove (algorithm)`. Ok, I will use > > the term "variant" as before. Thanks. > sorry i wasn't just trying to nitpick on the name of the variable/parameters. > > i was trying to say `_ReadSymbolPage` should not be a function of > `variant_to_accept`. > we should "completely skip" parsing of symbol pages for variants we don't > want to take in (that's what this `continue` here is trying to achieve) > > later on when we're parsing a symbol page, the name of the variant shouldn't > effect the headers we're picking in any way (especially in the way you're > relying right now that says variant name and the header name will be the > same, as i pointed out this might as well be `std::complex`). > > Is this trying to work around a different problem? (especially around > `std::atomic_.*` I assume, i'll follow up on that thread) This is obsolete. As discussed offline, I've removed all the special-casing logic. All the variants are skipped now without exceptions. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142092/new/ https://reviews.llvm.org/D142092 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits