Diego and I seek your comments on the following (loose) proposal.

Generating gimple and tree expressions requires lots of detail, which is hard to remember and easy to get wrong. There is some amount of boilerplate code that can, in most cases, be reduced and managed automatically.

We will add a set of helper classes to be used as local variables to manage the details of handling the existing types. That is, a layer over 'gimple_build_*'. We intend to provide helpers for those facilities that are both commonly used and have room for significant simplification.

Generating an Expression

Suppose one wants to generate the expression

    (shadow != 0) & (((base_addr & 7) + offset) >= shadow)

where offset is a value and the other identifiers are variables. The current code to generate this expression is as follows.

    /* t = shadow != 0 */
    g = gimple_build_assign_with_ops (NE_EXPR,
                                      make_ssa_name (boolean_type_node, NULL),
                                      shadow,
                                      build_int_cst (shadow_type, 0));
    gimple_set_location (g, location);
    gsi_insert_after (&gsi, g, GSI_NEW_STMT);
    t = gimple_assign_lhs (g);

    /* a = base_addr & 7 */
    g = gimple_build_assign_with_ops (BIT_AND_EXPR,
                                      make_ssa_name (uintptr_type, NULL),
                                      base_addr,
                                      build_int_cst (uintptr_type, 7));
    gimple_set_location (g, location);
    gsi_insert_after (&gsi, g, GSI_NEW_STMT);

    /* b = (shadow_type)a */
    g = gimple_build_assign_with_ops (NOP_EXPR,
                                      make_ssa_name (shadow_type, NULL),
                                      gimple_assign_lhs (g), NULL_TREE);
    gimple_set_location (g, location);
    gsi_insert_after (&gsi, g, GSI_NEW_STMT);

    /* c = b + offset */
    g = gimple_build_assign_with_ops (PLUS_EXPR,
                                      make_ssa_name (shadow_type, NULL),
                                      gimple_assign_lhs (g),
                                      build_int_cst (shadow_type, offset));
    gimple_set_location (g, location);
    gsi_insert_after (&gsi, g, GSI_NEW_STMT);

    /* d = c >= shadow */
    g = gimple_build_assign_with_ops (GE_EXPR,
                                      make_ssa_name (boolean_type_node, NULL),
                                      gimple_assign_lhs (g), shadow);
    gimple_set_location (g, location);
    gsi_insert_after (&gsi, g, GSI_NEW_STMT);

    /* r = t & d */
    g = gimple_build_assign_with_ops (BIT_AND_EXPR,
                                      make_ssa_name (boolean_type_node, NULL),
                                      t, gimple_assign_lhs (g));
    gimple_set_location (g, location);
    gsi_insert_after (&gsi, g, GSI_NEW_STMT);
    r = gimple_assign_lhs (g);

We propose a simplified form using new build helper classes ssa_seq and ssa_stmt that would allow the above code to be written as follows.

    ssa_seq q;
    ssa_stmt t = q.stmt (NE_EXPR, shadow, 0);
    ssa_stmt a = q.stmt (BIT_AND_EXPR, base_addr, 7);
    ssa_stmt b = q.stmt (shadow_type, a);
    ssa_stmt c = q.stmt (PLUS_EXPR, b, offset);
    ssa_stmt d = q.stmt (GE_EXPR, c, shadow);
    ssa_stmt e = q.stmt (BIT_AND_EXPR, t, d);
    q.set_location (location);
    r = e.lhs ();

There are a few important things to note about this example.

.. We have a new class (ssa_seq) that knows how to sequence statements automatically and can build expressions out of types.

.. Every statement created produces an SSA name. Referencing an ssa_stmt instance as an argument to ssa_seq::stmt retrieves the SSA name generated by that statement.

.. The statement result type is that of the arguments.

.. The type of integral constant arguments is that of the other argument. (Which implies that only one argument can be constant.)

.. The 'stmt' method handles linking the statement into the sequence.

.. The 'set_location' method iterates over all statements.

There will be another class of builders for generating GIMPLE in normal form (gimple_stmt). We expect that this will mostly affect transformations that need to generate new expressions and statements, such as instrumentation passes.

We also expect to reduce calls to tree expression builders by allowing numeric and string constants to be converted to the appropriate tree *_CST node. This will only work when the type of the constant can be deduced from the other argument in some expressions, of course.

Generating a Type

Consider the generation of the following type.

    struct __asan_global {
      const_ptr_type_node __beg;
      inttype __size;
      inttype __size_with_redzone;
      const_ptr_type_node __name;
      inttype __has_dynamic_init;
    };

The current code to generate it is as follows.

    tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
    tree ret = make_node (RECORD_TYPE);
    TYPE_NAME (ret) = get_identifier ("__asan_global");
    tree beg = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                           get_identifier ("__beg"), const_ptr_type_node);
    DECL_CONTEXT (beg) = ret;
    TYPE_FIELDS (ret) = beg;
    tree size = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                            get_identifier ("__size"), inttype);
    DECL_CONTEXT (size) = ret;
    DECL_CHAIN (beg) = size;
    tree red = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                           get_identifier ("__size_with_redzone"), inttype);
    DECL_CONTEXT (red) = ret;
    DECL_CHAIN (size) = red;
    tree name = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                            get_identifier ("__name"), const_ptr_type_node);
    DECL_CONTEXT (name) = ret;
    DECL_CHAIN (red) = name;
    tree init = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                            get_identifier ("__has_dynamic_init"), inttype);
    DECL_CONTEXT (init) = ret;
    DECL_CHAIN (name) = init;
    layout_type (ret);

We propose a form as follows.

    tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
    record_builder rec ("__asan_global");
    rec.field ("__beg", const_ptr_type_node);
    rec.field ("__size", inttype);
    rec.field ("__size_with_redzone", inttype);
    rec.field ("__name", const_ptr_type_node);
    rec.field ("__has_dynamic_init", inttype);
    rec.finish ();
    tree ret = rec.as_tree ();

There are a few important things to note about this example.

.. The 'field' method will add context and chain links.

.. The 'field' method is overloaded on both strings and identifiers.

.. The 'finish' method lays out the struct.

Proposal

Create a set of IL builder classes that provide a simplified IL building interface. Essentially, these builder classes will abstract most of the bookkeeping code required by the current interfaces.
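
To make the bookkeeping concrete, here is a rough, self-contained sketch of the idea behind record_builder. It is only a toy model and does not use GCC at all: toy_field, toy_record, toy_record_builder and count_fields are hypothetical names invented for this illustration, types are represented as plain strings, and the "layout" step is just a flag. The real class would operate on tree nodes and might well look different.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins for GCC tree nodes.  These are NOT the real GCC types.
struct toy_record;

struct toy_field
{
  std::string name;
  std::string type;               // type kept as a string for simplicity
  toy_record *context = nullptr;  // plays the role of DECL_CONTEXT
  toy_field *chain = nullptr;     // plays the role of DECL_CHAIN
};

struct toy_record
{
  std::string name;
  toy_field *fields = nullptr;    // plays the role of TYPE_FIELDS
  bool laid_out = false;          // set by the (pretend) layout step
  std::vector<std::unique_ptr<toy_field> > storage;  // owns the fields
};

// Hypothetical builder: it remembers the last field added so that the
// caller never has to set the context or chain links by hand.
class toy_record_builder
{
public:
  explicit toy_record_builder (const std::string &name)
  {
    rec_.name = name;
  }

  // Add a field, linking it to the record and to the previous field.
  void field (const std::string &name, const std::string &type)
  {
    std::unique_ptr<toy_field> f (new toy_field ());
    f->name = name;
    f->type = type;
    f->context = &rec_;
    if (last_)
      last_->chain = f.get ();
    else
      rec_.fields = f.get ();
    last_ = f.get ();
    rec_.storage.push_back (std::move (f));
  }

  // Stands in for layout_type; here it only records that it was called.
  void finish () { rec_.laid_out = true; }

  toy_record *as_record () { return &rec_; }

private:
  toy_record rec_;
  toy_field *last_ = nullptr;
};

// Walk the chain the way a consumer of the record would.
std::size_t count_fields (const toy_record *r)
{
  std::size_t n = 0;
  for (const toy_field *f = r->fields; f; f = f->chain)
    ++n;
  return n;
}
```

The point of the sketch is that the caller only states the field names and types in order. The builder keeps the "previous field" pointer itself, which is exactly the state that is threaded by hand through the DECL_CONTEXT and DECL_CHAIN assignments in the current code.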
These classes will not replace the existing interfaces. We do not expect that all the IL generation done in current transformations will be able to use the simplified interfaces. The goal is to simplify most of them, however.

-- 
Lawrence Crowl