On Wed, Nov 14, 2012 at 5:13 PM, Lawrence Crowl <cr...@googlers.com> wrote:
> Diego and I seek your comments on the following (loose) proposal.
>
>
> Generating gimple and tree expressions requires lots of detail,
> which is hard to remember and easy to get wrong.  There is some
> amount of boilerplate code that can, in most cases, be reduced and
> managed automatically.
>
> We will add a set of helper classes to be used as local variables
> to manage the details of handling the existing types.  That is,
> a layer over 'gimple_build_*'. We intend to provide helpers for
> those facilities that are both commonly used and have room for
> significant simplification.
>
>
> Generating an Expression
>
> Suppose one wants to generate the expression (shadow != 0) &
> (((base_addr & 7) + offset) >= shadow), where offset is a value and
> the other identifiers are variables.  The current code to generate
> this expression is as follows.
>
> /* t = shadow != 0 */
> g = gimple_build_assign_with_ops (NE_EXPR,
>             make_ssa_name (boolean_type_node, NULL),
>             shadow,
>             build_int_cst (shadow_type, 0));
> gimple_set_location (g, location);
> gsi_insert_after (&gsi, g, GSI_NEW_STMT);
> t = gimple_assign_lhs (g);
>
> /* a = base_addr & 7 */
> g = gimple_build_assign_with_ops (BIT_AND_EXPR,
>             make_ssa_name (uintptr_type, NULL),
>             base_addr,
>             build_int_cst (uintptr_type, 7));
> gimple_set_location (g, location);
> gsi_insert_after (&gsi, g, GSI_NEW_STMT);
>
> /* b = (shadow_type)a */
> g = gimple_build_assign_with_ops (NOP_EXPR,
>             make_ssa_name (shadow_type, NULL),
>             gimple_assign_lhs (g),
>             NULL_TREE);
> gimple_set_location (g, location);
> gsi_insert_after (&gsi, g, GSI_NEW_STMT);
>
> /* c = b + offset */
> g = gimple_build_assign_with_ops (PLUS_EXPR,
>             make_ssa_name (shadow_type, NULL),
>             gimple_assign_lhs (g),
>             build_int_cst (shadow_type, offset));
> gimple_set_location (g, location);
> gsi_insert_after (&gsi, g, GSI_NEW_STMT);
>
> /* d = c >= shadow */
> g = gimple_build_assign_with_ops (GE_EXPR,
>             make_ssa_name (boolean_type_node, NULL),
>             gimple_assign_lhs (g),
>             shadow);
> gimple_set_location (g, location);
> gsi_insert_after (&gsi, g, GSI_NEW_STMT);
>
> /* r = t & d */
> g = gimple_build_assign_with_ops (BIT_AND_EXPR,
>             make_ssa_name (boolean_type_node, NULL),
>             t,
>             gimple_assign_lhs (g));
> gimple_set_location (g, location);
> gsi_insert_after (&gsi, g, GSI_NEW_STMT);
> r = gimple_assign_lhs (g);
>
> We propose a simplified form using new build helper classes ssa_seq
> and ssa_stmt that would allow the above code to be written as
> follows.
>
> ssa_seq q;

Can it be more abstract, such as stmt_builder?


> ssa_stmt t = q.stmt (NE_EXPR, shadow, 0);
> ssa_stmt a = q.stmt (BIT_AND_EXPR, base_addr, 7);
> ssa_stmt b = q.stmt (shadow_type, a);
> ssa_stmt c = q.stmt (PLUS_EXPR, b, offset);
> ssa_stmt d = q.stmt (GE_EXPR, c, shadow);
> ssa_stmt e = q.stmt (BIT_AND_EXPR, t, d);


ssa_seq::stmt(...) sounds like a getter interface, not a creator.

x = q.new_assignment (...);
x = q.new_call (..);
x.add_arg(..);
x = q.new_icall (..);

l1 = q.new_label ("xx");
l2 = q.new_label ("xxx");
join_l = q.new_label ("...");

x = q.new_if_then_else (cond, l1, l2, join_l);
q.insert_label (l1);
q.new_assignment (...);
q.insert_label(l2);
...
q.insert_label(join_l);
q.close_if_then_else(x);


> q.set_location (location);
> r = e.lhs ();
>
> There are a few important things to note about this example.
>
> .. We have a new class (ssa_seq) that knows how to sequence
> statements automatically and can build expressions out of types.
>
> .. Every statement created produces an SSA name.  Referencing an
> ssa_stmt instance as an argument to ssa_seq::stmt retrieves the
> SSA name generated by that statement.

>
> .. The statement result type is that of the arguments.
>
> .. The type of integral constant arguments is that of the other
> argument.  (Which implies that only one argument can be constant.)
>
> .. The 'stmt' method handles linking the statement into the sequence.
>
> .. The 'set_location' method iterates over all statements.
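
To make the points above concrete, here is a toy model of the sequencing
behaviour. It is only a sketch: it does not use GCC's tree or gimple types,
and every name in it (toy_seq, toy_operand, toy_cst, build_shadow_check, ...)
is hypothetical. It assumes comparisons produce a boolean result, as the
hand-written code does, and that other statements take the type of their
arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for GCC types; none of these names exist in GCC.
enum class toy_type { untyped_cst, boolean, uintptr, shadow };
enum class toy_op { ne, ge, bit_and, plus, convert };

struct toy_operand {
  std::string name;  // variable, SSA name, or literal spelling
  toy_type type;     // untyped_cst until a statement gives it a type
};

struct toy_stmt {
  toy_op op;
  toy_operand lhs, rhs1, rhs2;
  int location;
};

inline toy_operand toy_var (const std::string &name, toy_type type)
{ return toy_operand{name, type}; }

inline toy_operand toy_cst (long value)
{ return toy_operand{std::to_string (value), toy_type::untyped_cst}; }

class toy_seq {
public:
  // Binary statement: types the constant, makes a fresh SSA name, links it in.
  toy_operand stmt (toy_op op, toy_operand a, toy_operand b)
  {
    // At most one argument may be an untyped constant.
    assert (a.type != toy_type::untyped_cst || b.type != toy_type::untyped_cst);
    if (a.type == toy_type::untyped_cst) a.type = b.type;
    if (b.type == toy_type::untyped_cst) b.type = a.type;
    assert (a.type == b.type);
    bool compare = (op == toy_op::ne || op == toy_op::ge);
    return append (op, compare ? toy_type::boolean : a.type, a, b);
  }

  // Conversion statement, standing in for the proposal's q.stmt (type, a).
  toy_operand stmt (toy_type to, toy_operand a)
  { return append (toy_op::convert, to, a, toy_operand{"", to}); }

  // Applies one location to every statement built so far.
  void set_location (int location)
  { for (toy_stmt &s : stmts_) s.location = location; }

  const std::vector<toy_stmt> &stmts () const { return stmts_; }

private:
  toy_operand append (toy_op op, toy_type type, toy_operand a, toy_operand b)
  {
    toy_operand lhs{"_" + std::to_string (stmts_.size () + 1), type};
    stmts_.push_back (toy_stmt{op, lhs, a, b, 0});
    return lhs;  // the caller passes this on as an argument
  }
  std::vector<toy_stmt> stmts_;
};

// Mirrors the six statements of the example above.
inline toy_seq build_shadow_check (int offset, int location)
{
  toy_seq q;
  toy_operand shadow = toy_var ("shadow", toy_type::shadow);
  toy_operand base_addr = toy_var ("base_addr", toy_type::uintptr);
  toy_operand t = q.stmt (toy_op::ne, shadow, toy_cst (0));
  toy_operand a = q.stmt (toy_op::bit_and, base_addr, toy_cst (7));
  toy_operand b = q.stmt (toy_type::shadow, a);
  toy_operand c = q.stmt (toy_op::plus, b, toy_cst (offset));
  toy_operand d = q.stmt (toy_op::ge, c, shadow);
  q.stmt (toy_op::bit_and, t, d);
  q.set_location (location);
  return q;
}
```

In this sketch the constant 0 in the first statement ends up with the type
of shadow, and the call to set_location stamps all six statements at once.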
>
> There will be another class of builders for generating GIMPLE
> in normal form (gimple_stmt).  We expect that this will mainly
> affect transformations that need to generate new expressions
> and statements, such as instrumentation passes.

What are the uses of the raw forms?

>
> We also expect to reduce calls to tree expression builders by
> allowing numeric and string constants to be converted to the
> appropriate tree _CST node.  This will only work when the type
> of the constant can be deduced from the other argument of the
> expression, of course.
>
>
> Generating a Type
>
> Consider the generation of the following type.
>
> struct __asan_global {
>   const_ptr_type_node __beg;
>   inttype __size;
>   inttype __size_with_redzone;
>   const_ptr_type_node __name;
>   inttype __has_dynamic_init;
> };
>
> The current code to generate it is as follows.
>
> tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
> tree ret = make_node (RECORD_TYPE);
> TYPE_NAME (ret) = get_identifier ("__asan_global");
> tree beg = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
>                        get_identifier ("__beg"), const_ptr_type_node);
> DECL_CONTEXT (beg) = ret;
> TYPE_FIELDS (ret) = beg;
> tree size = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
>                         get_identifier ("__size"), inttype);
> DECL_CONTEXT (size) = ret;
> DECL_CHAIN (beg) = size;
> tree red = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
>                        get_identifier ("__size_with_redzone"), inttype);
> DECL_CONTEXT (red) = ret;
> DECL_CHAIN (size) = red;
> tree name = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
>                         get_identifier ("__name"), const_ptr_type_node);
> DECL_CONTEXT (name) = ret;
> DECL_CHAIN (red) = name;
> tree init = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
>                         get_identifier ("__has_dynamic_init"), inttype);
> DECL_CONTEXT (init) = ret;
> DECL_CHAIN (name) = init;
> layout_type (ret);
>
> We propose a form as follows.
>
> tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
> record_builder rec ("__asan_global");
> rec.field ("__beg", const_ptr_type_node);
> rec.field ("__size", inttype);
> rec.field ("__size_with_redzone", inttype);
> rec.field ("__name", const_ptr_type_node);
> rec.field ("__has_dynamic_init", inttype);
> rec.finish ();
> tree ret = rec.as_tree ();

Again, something like new_field or add_field is more intuitive.

>
> There are a few important things to note about this example.
>
> .. The 'field' method will add context and chain links.
>
> .. The 'field' method is overloaded on both strings and identifiers.
>
> .. The 'finish' method lays out the struct.
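
A toy model of the bookkeeping described above, as a sketch only. It uses
no GCC types: toy_record, toy_field, toy_identifier and toy_record_builder
are all hypothetical, a field's "type" is reduced to a byte size, and the
layout rule (align each field to its own size) is an assumption made for
illustration, not what layout_type does. The method is spelled add_field
here, following the naming suggestion above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Toy stand-ins; none of these are GCC's real tree nodes.
struct toy_record;

struct toy_field {
  std::string name;
  std::size_t size;           // size in bytes, standing in for a type
  std::size_t offset;         // filled in by finish ()
  const toy_record *context;  // plays the role of DECL_CONTEXT
  toy_field *chain;           // plays the role of DECL_CHAIN
};

struct toy_record {
  std::string name;
  toy_field *fields = nullptr;  // plays the role of TYPE_FIELDS
  std::size_t size = 0;
  bool laid_out = false;
};

struct toy_identifier { std::string spelling; };

class toy_record_builder {
public:
  explicit toy_record_builder (const std::string &name) { rec_.name = name; }
  toy_record_builder (const toy_record_builder &) = delete;
  toy_record_builder &operator= (const toy_record_builder &) = delete;

  // Append a field, setting its context and chaining it after the last one.
  toy_record_builder &add_field (const std::string &name, std::size_t size)
  {
    assert (!rec_.laid_out && size > 0);
    // A deque keeps earlier elements at stable addresses on push_back.
    storage_.push_back (toy_field{name, size, 0, &rec_, nullptr});
    toy_field *f = &storage_.back ();
    if (last_) last_->chain = f; else rec_.fields = f;
    last_ = f;
    return *this;
  }

  // Overload taking an identifier instead of a string.
  toy_record_builder &add_field (const toy_identifier &id, std::size_t size)
  { return add_field (id.spelling, size); }

  // Lay out the record: each field is aligned to its own size.
  void finish ()
  {
    std::size_t offset = 0, max_align = 1;
    for (toy_field *f = rec_.fields; f; f = f->chain)
      {
        offset = (offset + f->size - 1) / f->size * f->size;
        f->offset = offset;
        offset += f->size;
        if (f->size > max_align) max_align = f->size;
      }
    rec_.size = (offset + max_align - 1) / max_align * max_align;
    rec_.laid_out = true;
  }

  const toy_record &as_record () const
  { assert (rec_.laid_out); return rec_; }

private:
  toy_record rec_;
  std::deque<toy_field> storage_;
  toy_field *last_ = nullptr;
};
```

The caller never touches the context or chain links; it only names the
fields in order and calls finish () once before asking for the record.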
>
>
> Proposal
>
> Create a set of IL builder classes that provide a simplified IL
> building interface.  Essentially, these builder classes will abstract
> most of the bookkeeping code required by the current interfaces.
>
> These classes will not replace the existing interfaces.  We do not
> expect that all the IL generation done in current transformations
> will be able to use the simplified interfaces.  The goal is to
> simplify most of them, however.


Looks like a good direction to go.

thanks,

David


>
> --
> Lawrence Crowl
