On Thu, Aug 15, 2024 at 10:07:00PM +0000, Joseph Myers wrote:
> On Thu, 15 Aug 2024, Jakub Jelinek wrote:
> 
> > +  D(embed, T_EMBED,        STDC23,    INCL | EXPAND)               \
> 
> I'd like to query the macro expansion handling.  My understanding is that 
> a #embed directive only gets macro-expanded if it fails to match either of 
> the forms
> 
>   # embed < h-char-sequence > embed-parameter-sequenceopt new-line
>   # embed " q-char-sequence " embed-parameter-sequenceopt new-line
> 
> and while the argument of a limit parameter gets processed (including 
> macro expansion) like in a #if condition, that processing doesn't apply to 
> the other standard parameters.  (Maybe it *should* apply to prefix / 
> suffix / if_empty so the pp-tokens in their arguments can include macros 
> without needing to e.g. use a macro for the name of the file to embed as 
> well, but that doesn't seem to be what the current wording says.  There 
> was some discussion of this on the reflector last month, ending with 
> <https://www.open-std.org/jtc1/sc22/wg14//26110>.)
> 
> Apart from any consequences for arguments of prefix/suffix/is_empty (where 
> there is a plausible argument that the argument should get expanded at 
> some point and that the current wording is undesirable for usability), 
> this would also mean that e.g.
> 
> #define LIMIT limit
> #embed "file" LIMIT(1)
> 
> isn't valid because LIMIT doesn't get expanded (the syntax for 
> non-expanded #embed is met, with an unknown parameter LIMIT), while
> 
> #define limit !
> #embed "file" limit(1)
> 
> *is* valid, because limit doesn't get expanded (which may be convenient 
> for usability - it means headers don't need to use __limit__ if using 
> #embed, even if files including the header might have defined limit as a 
> macro).

Is there an agreement on that?
Because at least checking on godbolt, none of the existing implementations
work that way.

#define FILE "/etc/passwd"
#define LIMIT limit(1)
#define THIS , 1, 2, 3
#define PRE prefix (42,
ONE
#embed FILE LIMIT suffix(THIS) PRE )
TWO
#embed "/etc/passwd" LIMIT suffix(THIS) PRE )
THREE
#define limit prefix
#embed "/etc/passwd" limit (4) suffix (THIS)

is expanded to something like
ONE 42,110 , 1, 2, 3 TWO 42,110 , 1, 2, 3 THREE 4 47, ... 104 , 1, 2, 3
(... just not to show all of godbolt's /etc/passwd)
by JeanHeyd's branch which was used as a prior art for the paper
and on the clang trunk.

My reading of it wasn't that whether it is
# embed < h-char-sequence > embed-parameter-sequence[opt] new-line
or
# embed < h-char-sequence > embed-parameter-sequence[opt] new-line
or
# embed pp-tokens new-line
depends solely on the filename part in there, but also whether
embed-parameter-sequence is syntactically valid (if specified).

Though, guess I'm still not implementing it like that,
(and none of the existing implementations do), because
the last #embed in that case is treated by all 3 as
prefix (4) suffix (, 1, 2, 3) even when it would be valid
as limit (4) suffix (THIS) too.
In that reading, it would be pretty hard to implement, one would need to
check twice, first with disabled expansion read the tokens and see if
it matches that way the syntax of embed-parameter-sequence (but guess
LIMIT suffix(THIS)
would be e.g. valid but the PRE ) in there is not) and if invalid repeat
with the expansion enabled.

Also, in either case (if it is determined solely based on the header
tokens or on valid embed-parameter-sequence), what should happen if
the tokens e.g. in limit clause argument is without expansion valid
balanced token sequence but with expansion is not and it e.g. provides
) in there?
I mean
#define ARG 0) prefix (
#embed "file" limit (ARG)

Anyway, if there is no preprocessing of the embed-parameter-sequence tokens,
then it would make less sense that there are the limit vs. __limit__ etc.
variants (though sure, if it is the pp-tokens case it would still matter).

In any case, I'd hope that prefix/suffix/if_empty argument is macro expanded
at some point, because otherwise it will be a serious usability problem
for users.

> > +  if (CPP_PEDANTIC (pfile))
> > +    {
> > +      if (CPP_OPTION (pfile, cplusplus))
> > +   cpp_error (pfile, CPP_DL_PEDWARN,
> > +              "#%s is a GCC extension", "embed");
> > +      else if (!CPP_OPTION (pfile, warning_directive))
> > +   cpp_error (pfile, CPP_DL_PEDWARN,
> > +              "#%s before C23 is a GCC extension", "embed");
> 
> I don't think warning_directive directive should be used here as the 
> condition for diagnosing #embed as an extension; adding a separate 
> embed_directive would be better.  (Especially if a future C++ version ends 
> up adding #embed; you could then use embed_directive as a condition for 
> the pedwarn for both C and C++, whereas warning_directive wouldn't work as 
> a condition for C++ since #warning is already in C++23.)

Ok, will change that (there are really too many features and the table
already needs 147 columns before this change, so wanted to avoid adding new
stuff there unless necessary).

        Jakub

Reply via email to