Georg-Johann Lay <a...@gjlay.de> writes: > So I had that situation where in an insn attribute, providing > a block of code (rather than just an expression) would be > useful. > > Expressions can provided by means of symbol_ref, like in > > (set (attr "length") > (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)"))) > > However providing a block of code gives a syntax error from > the compiler, *NOT* from md_reader: > > (set (attr "length") > (symbol_ref > { > int len = 1; > return len; > })) > > This means such syntax is already supported to some degree, > there's just no semantics assigned to such code. > > Blocks of code are already supported in insn predicates, > like in > > (define_predicate "my_operand" > (match_code "code_label,label_ref,symbol_ref,plus,const") > { > some code... > return true-or-false; > }) > > In the insn attribute case, I hacked a bit and supported > blocks of code like in the example above. The biggest change > is that class attr_desc has to be moved from genattrtab.cc to > read-md.h so that it is a complete type as required by > md_reader::fprint_c_condition(). > > That method prints to code for symbol_ref and some others, and > it has to know the type of the attribute, like "int" for the > "length" attribute. The implementation in fprint_c_condition() > is straight forward: > > When cond (which is the payload string of symbol_ref, including the > '{}'s) starts with '{', the print a lambda that's called in place, > like in > > print "( [&]() -> <return_type> <cond> () )" > > The "&" capture is required so that variables like "insn" are > accessible. "operands[]" and "which_alternative" are global, > thus also accessible. > > Attached is the code I have so far (which is by no means a > proposed patch, so I am posting here on gcc@). > > As far as I can tell, there is no performance penalty, e.g. > in build times, when the feature is not used. Of course instead > of such syntax, a custom function could be used, or the > braces-brackets-parentheses-gibberish could be written out > in the symbol_ref as an expression. Though I think this > could be a nice addition, in particular because the scanning > side in md_reader already supports the syntax.
Looks good to me. I know you said it wasn't a patch submission, but it looks mostly ready to go. Some comments below: > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index 7f4335e0aac..3e46693e8c2 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -10265,6 +10265,56 @@ so there is no need to explicitly convert the > expression into a boolean > (match_test "(x & 2) != 0") > @end smallexample > > +@cindex @code{symbol_ref} and attributes > +@item (symbol_ref "@var{quoted-c-expr}") > + > +Specifies the value of the attribute sub-expression as a C expression, > +where the surrounding quotes are not part of the expression. > +Similar to @code{match_test}, variables @var{insn}, @var{operands[]} > +and @var{which_alternative} are available. Moreover, code and mode > +attributes can be used to compose the resulting C expression, like in > + > +@smallexample > +(set (attr "length") > + (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)"))) > +@end smallexample > + > +where the according insn has exactly one mode iterator. > +See @ref{Mode Iterators} and @ref{Code Iterators}. I got the impression s/See @ref/@xref/ was recommended for sentence references. > + > +@item (symbol_ref "@{ @var{quoted-c-code} @}") > +@itemx (symbol_ref @{ @var{c-code} @}) > + > +The value of this subexpression is determined by running a block > +of C code which returns the desired value. > +The braces are part of the code, whereas the quotes in the quoted form are > not. > + > +This variant of @code{symbol_ref} allows for more comlpex code than > +just a single C expression, like for example: > + > +@smallexample > +(set (attr "length") > + (symbol_ref > + @{ > + int len; > + some_function (insn, <CODE>, <MODE>mode, & len); > + return len; > + @})) > +@end smallexample > + > +for an insn that has one code iterator and one mode iterator. > +Again, variables @var{insn}, @var{operands[]} and @var{which_alternative} > +can be used. The unquoted form only supports a subset of C, > +for example no C comments are supported, and strings that contain > +characters like @samp{@}} are problematic and may need to be escaped > +as @samp{\@}}. By unquoted form, do you mean (symbol_ref { ... })? I'd have expected that to be better than "{ ... }" (or at least, I thought that was the intention when { ... } was added). I was going to suggest not documenting the "{ ... }" form until I saw this. > + > +The return type is @code{int} for the @var{length} attribute, and > +@code{enum attr_@var{name}} for an insn attribute named @var{name}. > +The types and available enum values can be looked up in > +@file{$builddir/gcc/insn-attr-common.h}. > + > + > @cindex @code{le} and attributes > @cindex @code{leu} and attributes > @cindex @code{lt} and attributes > diff --git a/gcc/genattrtab.cc b/gcc/genattrtab.cc > index 03c7d6c74a3..20d45ba88af 100644 > --- a/gcc/genattrtab.cc > +++ b/gcc/genattrtab.cc > @@ -168,22 +168,6 @@ struct attr_value > int has_asm_insn; /* True if this value used for `asm' insns */ > }; > > -/* Structure for each attribute. */ > - > -class attr_desc > -{ > -public: > - char *name; /* Name of attribute. */ > - const char *enum_name; /* Enum name for DEFINE_ENUM_NAME. */ > - class attr_desc *next; /* Next attribute. */ > - struct attr_value *first_value; /* First value of this attribute. */ > - struct attr_value *default_val; /* Default value for this attribute. */ > - file_location loc; /* Where in the .md files it occurs. */ > - unsigned is_numeric : 1; /* Values of this attribute are > numeric. */ > - unsigned is_const : 1; /* Attribute value constant for each run. */ > - unsigned is_special : 1; /* Don't call `write_attr_set'. */ > -}; > - > /* Structure for each DEFINE_DELAY. */ > > class delay_desc > @@ -3368,7 +3352,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int > attrs_cached, int flags, > { > int comparison_operator = 0; > RTX_CODE code; > - class attr_desc *attr; > + class attr_desc *attr = nullptr; > > if (emit_parens) > fprintf (outf, "("); > @@ -3708,7 +3692,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int > attrs_cached, int flags, > > /* A random C expression. */ > case SYMBOL_REF: > - rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0)); > + rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0), attr); > break; > > /* The address of the branch target. */ > @@ -4052,12 +4036,7 @@ write_attr_get (FILE *outf, class attr_desc *attr) > > /* Write out start of function, then all values with explicit `case' lines, > then a `default', then the value with the most uses. */ > - if (attr->enum_name) > - fprintf (outf, "enum %s\n", attr->enum_name); > - else if (!attr->is_numeric) > - fprintf (outf, "enum attr_%s\n", attr->name); > - else > - fprintf (outf, "int\n"); > + attr->fprint_type (outf, "\n"); > > /* If the attribute name starts with a star, the remainder is the name of > the subroutine to use, instead of `get_attr_...'. */ > @@ -4389,7 +4368,7 @@ write_attr_value (FILE *outf, class attr_desc *attr, > rtx value) > break; > > case SYMBOL_REF: > - rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0)); > + rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0), attr); > break; > > case ATTR: > diff --git a/gcc/read-md.cc b/gcc/read-md.cc > index 93d1ea43781..a0025575cb1 100644 > --- a/gcc/read-md.cc > +++ b/gcc/read-md.cc > @@ -170,31 +170,62 @@ md_reader::join_c_conditions (const char *cond1, const > char *cond2) > directive for COND if its original file position is known. */ > > void > -md_reader::fprint_c_condition (FILE *outf, const char *cond) > +md_reader::fprint_c_condition (FILE *outf, const char *cond, > + const attr_desc *attr) > { > const char **halves = (const char **) htab_find (m_joined_conditions, > &cond); > if (halves != 0) > { > fprintf (outf, "("); > - fprint_c_condition (outf, halves[1]); > + fprint_c_condition (outf, halves[1], attr); > fprintf (outf, " && "); > - fprint_c_condition (outf, halves[2]); > + fprint_c_condition (outf, halves[2], attr); > fprintf (outf, ")"); > } > else > { > fputc ('\n', outf); > fprint_md_ptr_loc (outf, cond); > - fprintf (outf, "(%s)", cond); > + if (cond[0] == '{') > + { > + const struct ptr_loc *loc = get_md_ptr_loc (cond); > + > + if (! attr) > + { > + error_at (loc->loc, "TODO md_reader::fprint_c_condition: " > + "const attr_desc *attr is nullptr"); > + exit (11); > + } I think we should just skip the explicit return type if we don't know what it is. Simple cases will still work, and we should get a reasonably sensible error from the compiler for other cases. (And when we do, we can patch the caller to pass an appropriate type.) > + > + // Like "( [&]() -> <return_type> <cond> () )" > + // Where COND is actually the body of a function, including the > + // outer {}'s. Print this as a lambda that's evaluated in place. > + // The capture-all is required to have access to variables > + // like "insn". Objects like "operands[]" and "which_alternative" > + // are accessible since they are global. > + fprintf (outf, "( [&]() -> "); > + attr->fprint_type (outf, " "); > + fprintf (outf, "%s () )", cond); > + > + if (! strstr (cond, "return")) > + { > + error_at (loc->loc, "error: function body needs at least one" > + " 'return' statement"); > + error_at (loc->loc, "this is the body:\n%s\n", cond); > + exit (12); > + } IMO it'd be better to leave the compiler to enforce this. There are many other ways in which the lambda body could be malformed. > + } > + else > + fprintf (outf, "(%s)", cond); > } > } > > /* Special fprint_c_condition for writing to STDOUT. */ > > void > -md_reader::print_c_condition (const char *cond) > +md_reader::print_c_condition (const char *cond, const attr_desc *desc) > { > - fprint_c_condition (stdout, cond); > + fprint_c_condition (stdout, cond, desc); > } > > /* A vfprintf-like function for reporting an error against line LINENO > diff --git a/gcc/read-md.h b/gcc/read-md.h > index 9703551a8fd..ae10b651de1 100644 > --- a/gcc/read-md.h > +++ b/gcc/read-md.h > @@ -132,6 +132,38 @@ struct overloaded_name { > overloaded_instance **next_instance_ptr; > }; > > +/* Structure for each attribute. */ > + > +struct attr_value; > + > +class attr_desc > +{ > +public: > + char *name; /* Name of attribute. */ > + const char *enum_name; /* Enum name for DEFINE_ENUM_NAME. */ > + class attr_desc *next; /* Next attribute. */ > + struct attr_value *first_value; /* First value of this attribute. */ > + struct attr_value *default_val; /* Default value for this attribute. */ > + file_location loc; /* Where in the .md files it occurs. */ > + unsigned is_numeric : 1; /* Values of this attribute are > numeric. */ > + unsigned is_const : 1; /* Attribute value constant for each run. */ > + unsigned is_special : 1; /* Don't call `write_attr_set'. */ > + > + // Print the return type for functions like get_attr_<attribute-name> > + // to stream OUTF, followed by SUFFIX which should be white-space(s). > + void fprint_type (FILE *outf, const char *suffix) const > + { > + if (enum_name) > + fprintf (outf, "enum %s", enum_name); > + else if (! is_numeric) > + fprintf (outf, "enum attr_%s", name); > + else > + fprintf (outf, "int"); > + > + fprintf (outf, "%s", suffix); It shouldn't be necessary to emit the enum tag these days. If removing it causes anything to break, I think we should fix whatever that breaking thing is. Could you try doing that, as a pre-patch? Or I can give it a go, if you'd rather not. If we do that, then we can just a return a const char * for the type. And then in turn we can pass a const char * to (f)print_c_condition. The MD reader then wouldn't need to know about attributes. Thanks, Richard > + } > +}; > + > struct mapping; > > /* A class for reading .md files and RTL dump files. > @@ -204,8 +236,8 @@ class md_reader > void handle_enum (file_location loc, bool md_p); > > const char *join_c_conditions (const char *cond1, const char *cond2); > - void fprint_c_condition (FILE *outf, const char *cond); > - void print_c_condition (const char *cond); > + void fprint_c_condition (FILE *outf, const char *cond, const attr_desc * = > nullptr); > + void print_c_condition (const char *cond, const attr_desc * = nullptr); > > /* Defined in read-rtl.cc. */ > const char *apply_iterator_to_string (const char *string);