VitaNuo added a comment.

Thanks for the comments!



================
Comment at: clang/tools/include-mapping/cppreference_parser.py:196
+      "std::remove$": "algorithm",
+      "std::atomic.*": "atomic",
+
----------------
kadircet wrote:
> there's no variant of "std::atomic.*" called "atomic", in 
> https://en.cppreference.com/w/cpp/symbol_index. is it something in 2018 
> version of the html book? otherwise there's only the unnamed variant and 
> std::shared_ptr variant, and we should be preferring unnamed variant.
This is obsolete. As discussed offline, I've removed all the special-casing 
logic. All the variants are skipped now without exceptions.


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:107
+  # FIXME: only consider first symbol, assume further symbols are overloads
+  all_headers = sorted(list(all_headers))
+  if len(all_headers) > 0:
----------------
kadircet wrote:
> VitaNuo wrote:
> > hokein wrote:
> > > why sort the headers?
> > This was a hack to make the generator return `atomic` rather than `memory` 
> > for `std::atomic.*` symbols. But I've noticed now that this adds some 
> > trouble to the C symbol map. I think now that cutting `all_headers` to the 
> > first element is wrong.
> > I will special-case the `std::atomic.*` symbols instead, because so far I 
> > don't see a way to solve this programmatically without other collateral 
> > effects.
> > 
> > Note that the problem is not `std::atomic` itself (for which Kadir 
> > extensively described the solution in some of the other comments), but 
> > rather `atomic_bool`, `atomic_char` etc. mentioned further in the page. The 
> > headers for them come from `all_headers` rather than `symbol_headers`, and 
> > there seems to be no way to limit them to `<atomic>` without other 
> > collateral effects.
> > I think now that cutting all_headers to the first element is wrong.
> 
> What's exactly breaking once you do that? (I guess by `first element` you 
> still mean "all the headers we've seen until the first declaration in the 
> page")
> 
> >  but rather atomic_bool, atomic_char etc. mentioned further in the page. 
> > The headers for them come from all_headers rather than symbol_headers, and 
> > there seems to be no way to limit them to <atomic> without other collateral 
> > effects.
> 
> I guess this is happening because we're trying to find headers that are for 
> the declaration block containing the interesting symbol name. Any idea if 
> that logic is winning us anything? e.g. if we dropped the `        if 
> symbol_name in found_symbols:` what does change?
> I guess by first element you still mean "all the headers we've seen until the 
> first declaration in the page"

No, I mean the first element of the `all_headers` collection. `all_headers` is 
a collection that collects _all_ the headers in the page and, in case a symbol 
is not defined in a table with an individual header block, `all_headers` in the 
page are returned as an approximation.

> What's exactly breaking once you do that?
It is essentially arbitrary to limit `all_headers` to the first element (at the 
moment it would be the first element alphabetically). E.g., for the `INT8_C` 
and other constants [here](https://en.cppreference.com/w/cpp/types/integer), 
the result would be `inttypes.h` and `stdint.h` would be cut, so this result 
doesn't make sense.

> if we dropped the  `if symbol_name in found_symbols` what does change?
Then all the headers in the table will be attributed to all the symbols in the 
table. Consider [this 
page](https://en.cppreference.com/w/cpp/numeric/math/div): `std::imaxdiv` would 
be reported as defined both in `cstdlib` and `cinttypes`, which is wrong. 

> I guess this is happening because we're trying to find headers that are for 
> the declaration block containing the interesting symbol name.

Not really.  Headers for `atomic_bool`, `atomic_char` etc. come from 
`all_headers` rather than `symbol_headers`. The reason for that is they are 
defined in a table which doesn't have its own header block, so we are trying to 
approximate the headers by naming all the headers in the file.


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:165
       # FIXME: use these as a fallback rather than ignoring entirely.
-      variants_for_symbol = variants_to_accept.get(
-          (namespace or "") + symbol_name, ())
-      if variant and variant not in variants_for_symbol:
+      header_to_accept = variants_to_accept.get(
+          (namespace or "") + symbol_name, "")
----------------
kadircet wrote:
> VitaNuo wrote:
> > kadircet wrote:
> > > `variant` is not necessarily the `header`, eg:
> > > ```
> > > acos()
> > > acos<>() (std::complex) (since C++11)
> > > acos<>() (std::valarray)
> > > ```
> > > 
> > > in this case variants are `std::complex` and `std::valarray`. 
> > > 
> > > hence we're not trying to "infer" the header we want to preserve but 
> > > rather decide on which symbol page we want to parse. we should still 
> > > accept "all the headers" mentioned in that variant symbol page.
> > Ah ok, I was mostly looking at `std::remove (algorithm)`. Ok, I will use 
> > the term "variant" as before. Thanks.
> sorry i wasn't just trying to nitpick on the name of the variable/parameters.
> 
> i was trying to say `_ReadSymbolPage` should not be a function of 
> `variant_to_accept`.
> we should "completely skip" parsing of symbol pages for variants we don't 
> want to take in (that's what this `continue` here is trying to achieve)
> 
> later on when we're parsing a symbol page, the name of the variant shouldn't 
> effect the headers we're picking in any way (especially in the way you're 
> relying right now that says variant name and the header name will be the 
> same, as i pointed out this might as well be `std::complex`).
> 
> Is this trying to work around a different problem? (especially around 
> `std::atomic_.*` I assume, i'll follow up on that thread)
This is obsolete. As discussed offline, I've removed all the special-casing 
logic. All the variants are skipped now without exceptions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142092/new/

https://reviews.llvm.org/D142092

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to