On Thu, Aug 15, 2024 at 10:07:00PM +0000, Joseph Myers wrote: > On Thu, 15 Aug 2024, Jakub Jelinek wrote: > > > + D(embed, T_EMBED, STDC23, INCL | EXPAND) \ > > I'd like to query the macro expansion handling. My understanding is that > a #embed directive only gets macro-expanded if it fails to match either of > the forms > > # embed < h-char-sequence > embed-parameter-sequenceopt new-line > # embed " q-char-sequence " embed-parameter-sequenceopt new-line > > and while the argument of a limit parameter gets processed (including > macro expansion) like in a #if condition, that processing doesn't apply to > the other standard parameters. (Maybe it *should* apply to prefix / > suffix / if_empty so the pp-tokens in their arguments can include macros > without needing to e.g. use a macro for the name of the file to embed as > well, but that doesn't seem to be what the current wording says. There > was some discussion of this on the reflector last month, ending with > <https://www.open-std.org/jtc1/sc22/wg14//26110>.) > > Apart from any consequences for arguments of prefix/suffix/is_empty (where > there is a plausible argument that the argument should get expanded at > some point and that the current wording is undesirable for usability), > this would also mean that e.g. > > #define LIMIT limit > #embed "file" LIMIT(1) > > isn't valid because LIMIT doesn't get expanded (the syntax for > non-expanded #embed is met, with an unknown parameter LIMIT), while > > #define limit ! > #embed "file" limit(1) > > *is* valid, because limit doesn't get expanded (which may be convenient > for usability - it means headers don't need to use __limit__ if using > #embed, even if files including the header might have defined limit as a > macro).
Is there an agreement on that? Because at least checking on godbolt, none of the existing implementations work that way. #define FILE "/etc/passwd" #define LIMIT limit(1) #define THIS , 1, 2, 3 #define PRE prefix (42, ONE #embed FILE LIMIT suffix(THIS) PRE ) TWO #embed "/etc/passwd" LIMIT suffix(THIS) PRE ) THREE #define limit prefix #embed "/etc/passwd" limit (4) suffix (THIS) is expanded to something like ONE 42,110 , 1, 2, 3 TWO 42,110 , 1, 2, 3 THREE 4 47, ... 104 , 1, 2, 3 (... just not to show all of godbolt's /etc/passwd) by JeanHeyd's branch which was used as a prior art for the paper and on the clang trunk. My reading of it wasn't that whether it is # embed < h-char-sequence > embed-parameter-sequence[opt] new-line or # embed < h-char-sequence > embed-parameter-sequence[opt] new-line or # embed pp-tokens new-line depends solely on the filename part in there, but also whether embed-parameter-sequence is syntactically valid (if specified). Though, guess I'm still not implementing it like that, (and none of the existing implementations do), because the last #embed in that case is treated by all 3 as prefix (4) suffix (, 1, 2, 3) even when it would be valid as limit (4) suffix (THIS) too. In that reading, it would be pretty hard to implement, one would need to check twice, first with disabled expansion read the tokens and see if it matches that way the syntax of embed-parameter-sequence (but guess LIMIT suffix(THIS) would be e.g. valid but the PRE ) in there is not) and if invalid repeat with the expansion enabled. Also, in either case (if it is determined solely based on the header tokens or on valid embed-parameter-sequence), what should happen if the tokens e.g. in limit clause argument is without expansion valid balanced token sequence but with expansion is not and it e.g. provides ) in there? I mean #define ARG 0) prefix ( #embed "file" limit (ARG) Anyway, if there is no preprocessing of the embed-parameter-sequence tokens, then it would make less sense that there are the limit vs. __limit__ etc. variants (though sure, if it is the pp-tokens case it would still matter). In any case, I'd hope that prefix/suffix/if_empty argument is macro expanded at some point, because otherwise it will be a serious usability problem for users. > > + if (CPP_PEDANTIC (pfile)) > > + { > > + if (CPP_OPTION (pfile, cplusplus)) > > + cpp_error (pfile, CPP_DL_PEDWARN, > > + "#%s is a GCC extension", "embed"); > > + else if (!CPP_OPTION (pfile, warning_directive)) > > + cpp_error (pfile, CPP_DL_PEDWARN, > > + "#%s before C23 is a GCC extension", "embed"); > > I don't think warning_directive directive should be used here as the > condition for diagnosing #embed as an extension; adding a separate > embed_directive would be better. (Especially if a future C++ version ends > up adding #embed; you could then use embed_directive as a condition for > the pedwarn for both C and C++, whereas warning_directive wouldn't work as > a condition for C++ since #warning is already in C++23.) Ok, will change that (there are really too many features and the table already needs 147 columns before this change, so wanted to avoid adding new stuff there unless necessary). Jakub