Hi Jeff.

    > This patch adds a port for the Linux kernel eBPF architecture to GCC.
    > 
    > ChangeLog:
    > 
    >   * configure.ac: Support for bpf-*-* targets.
    >   * configure: Regenerate.
    > 
    > contrib/ChangeLog:
    > 
    >   * config-list.mk (LIST): Disable go in bpf-*-* targets.
    > 
    > gcc/ChangeLog:
    > 
    >   * config.gcc: Support for bpf-*-* targets.
    >   * common/config/bpf/bpf-common.c: New file.
    >   * config/bpf/t-bpf: Likewise.
    >   * config/bpf/predicates.md: Likewise.
    >   * config/bpf/constraints.md: Likewise.
    >   * config/bpf/bpf.opt: Likewise.
    >   * config/bpf/bpf.md: Likewise.
    >   * config/bpf/bpf.h: Likewise.
    >   * config/bpf/bpf.c: Likewise.
    >   * config/bpf/bpf-protos.h: Likewise.
    >   * config/bpf/bpf-opts.h: Likewise.
    >   * config/bpf/bpf-helpers.h: Likewise.
    >   * config/bpf/bpf-helpers.def: Likewise.
    So I think various folks have already mentioned the configure rebuild
    issues, formatting and other stuff.  I'm going to try to keep them all
    in mind so that I don't duplicate anything.  If I do duplicate someone's
    comment, apologies in advance.
    
    At a high level I realize there's lots of things not supported due to
    the restricted environment it'll ultimately be used in.  However, you
    might want to consider extensions that would allow larger portions of
    the gcc testsuite to run and some kind of user mode simulator so that
    you can reasonably test the target.  Not a requirement, but could be
    useful (from experience :-)

I agree to both regards.

I have been thinking about Segher's suggestion on providing options to
lift some of the limitations, for compiler testing.  Unfortunately, many
of the restrictions are deeply rooted in the design of the
architecture... or the other way around.  Finding sane ways to implement
these extensions will be fun :)

As for the simulator, I have one, along with an initial GDB port... but
it doesn't work very well due to a particularly nasty bug in CGEN.  I
have a patch that seems to fix it but, as everything that touches cgen's
ifield handling code, it is difficult to be 100% sure about that, and I
also need to adapt some of the other existing cgen-based ports...  so it
will take a while before I have something that can run the GCC
testsuite.
    
    > diff --git a/gcc/common/config/bpf/bpf-common.c 
b/gcc/common/config/bpf/bpf-common.c
    > new file mode 100644
    > index 00000000000..a68feb62897
    > --- /dev/null
    > +++ b/gcc/common/config/bpf/bpf-common.c
    [ snip ]
    > +/* Implement TARGET_OPTION_OPTIMIZATION_TABLE.  */
    > +static const struct default_options bpf_option_optimization_table[] =
    > +  {
    > +    /* Enable -funroll-all-loops by default.  */
    > +    { OPT_LEVELS_ALL, OPT_funroll_all_loops, NULL, 1 },
    > +    /* Disable -fomit-frame-pointer by default.  */
    > +    { OPT_LEVELS_ALL, OPT_fomit_frame_pointer, NULL, 0 },
    > +    { OPT_LEVELS_NONE, 0, NULL, 0 }
    > +  };
    Curious about the motivation on the loop unrolling stuff.  In general we
    discourage targets from mucking around with the default
    flags/optimizations, but it is sometimes the right thing to do.

The kernel verifier doesn't allow backward jumps.

This may change at some point.  There is much discussion among the
kernel hackers in whether it is possible to allow bounded loops in a
safe way.  In that case, some of the restrictions may be lifted.

For now, only loops that can be peeled/massaged and then fully unrolled
are supported.

    Rather than -fomit-frame-pointer, I think you can use the
    FRAME_POINTER_REQUIRED hook if you always want a frame pointer.

Oh so specifying -fomit-frame-pointer there is redundant... good to
know.  Will remove it.
    
    > diff --git a/gcc/config/bpf/bpf-helpers.h b/gcc/config/bpf/bpf-helpers.h
    > new file mode 100644
    > index 00000000000..2fe96be7637
    > --- /dev/null
    > +++ b/gcc/config/bpf/bpf-helpers.h
    I can't remember, is this an installed header that consumers are
    expected to use?  If so you might want to be careful with polluting user
    code with BPF #defines such as BPF_ANY, BPF_NOEXIST, BPF_EXIST, etc.
    The #defines for mapping to the builtins are probably OK though.
    
Yes, it is a header file for consumers.  Unfortunately, the whole
purpose of the header is to provide an interface that is compatible with
the kernel's bpf_helpers.h (which at the moment is llvm-specific).  The
API is given :(

This is a point I plan to raise with the eBPF developers in a few weeks,
at the Linux Plumbers conference in Lisbon.
    
    > diff --git a/gcc/config/bpf/bpf.c b/gcc/config/bpf/bpf.c
    > new file mode 100644
    > index 00000000000..4a42259a9c3
    > --- /dev/null
    > +++ b/gcc/config/bpf/bpf.c
    > @@ -0,0 +1,1136 @@
    [ ... ]
    > +
    > +/* Return the builtin code corresponding to the kernel helper builtin
    > +   __builtin_NAME, or 0 if the name doesn't correspond to a kernel
    > +   helper builtin.  */
    > +
    > +static inline int
    > +bpf_helper_code (const char *name)
    > +{
    > +  int i;
    > +
    > +  for (i = 1; i < BPF_BUILTIN_HELPER_MAX; ++i)
    > +    {
    > +      if (strcmp (name, bpf_helper_names[i]) == 0)
    > + return i;
    > +    }
    > +
    > +  return 0;
    > +}
    Does this get called often?  If so the linear search could end up being
    expensive from a compile-time standpoint.

It gets called per function call to a symbol with the form
__builtin_bpf_helper_*...  you think it is worth of a hash?

    > +#define KERNEL_VERSION_CODE "__BPF_KERNEL_VERSION_CODE__="    
    > +    kernel_version_code
    > +      = (char *) alloca (strlen (KERNEL_VERSION_CODE) + 7 + 1);
    > +    strcpy (kernel_version_code, KERNEL_VERSION_CODE);
    > +#undef KERNEL_VERSION_CODE
    > +    strcat (kernel_version_code, version_code);
    > +    builtin_define (kernel_version_code);
    > +  }
    > +}
    Does builtin_define copy its argument?  If not, then I'd expect this to
    be problematical as the alloca'd space will be reclaimed.

Yes it does.
        
    > +
    > +static void
    > +bpf_compute_frame (void)
    > +{
    > +  int stack_alignment = STACK_BOUNDARY / BITS_PER_UNIT;
    > +  int padding_locals, regno;
    > +
    > +  /* Set the space used in the stack by local variables.  This is
    > +     rounded up to respect the minimum stack alignment.  */
    > +  cfun->machine->local_vars_size = get_frame_size ();
    > +
    > +  padding_locals = cfun->machine->local_vars_size % stack_alignment;
    > +  if (padding_locals)
    > +    padding_locals = stack_alignment - padding_locals;
    > +
    > +  cfun->machine->local_vars_size += padding_locals;
    > +
    > +  /* Set the space used in the stack by callee-saved used registers in
    > +     the current function.  There is no need to round up, since the
    > +     registers are all 8 bytes wide.  */
    > +  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
    > +    if ((!fixed_regs[regno]
    > +  && df_regs_ever_live_p (regno)
    > +  && !call_used_regs[regno])
    > + || (cfun->calls_alloca
    > +     && regno == STACK_POINTER_REGNUM))
    > +      cfun->machine->callee_saved_reg_size += 8;
    > +
    > +  /* Check that the total size of the frame doesn't exceed the limit
    > +     imposed by eBPF: currently 512 bytes.  */
    > +  if ((cfun->machine->local_vars_size
    > +       + cfun->machine->callee_saved_reg_size) > 512)
    > +    {
    > +      static int stack_limit_exceeded = 0;
    > +
    > +      if (!stack_limit_exceeded)
    > + error ("eBPF stack limit of 512 bytes exceeded");
    > +      stack_limit_exceeded = 1;
    > +    }
    Is the stack limit likely to change?  Would a param work better here
    which would allow us to accommodate such a change without having to
    re-release GCC?
    
It will probably be increased at some point.  Using a param sounds like
a good idea.  However...

The stack limit is associated with kernel version.  I guess we can just
set the appropriate defaults in bpf_option_override if we make it
variable, in case the user didn't specify a --param for it, so no
problem.

Also, if we allow the user to specify a stack frame bigger than 0x7fff,
bpf_expand_prologue will break.  Probably in that case we want to detect
this, warn and truncate to the -mkernel's default, also in
bpf_option_override.

Does that sound reasonable?

    > diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
    > +
    > +/**** Debugging Info ****/
    > +
    > +/* We cannot support DWARF2 because of the limitations of eBPF.  */
    > +#define DBX_DEBUGGING_INFO
    Umm, we're trying to get rid of DBX_DEBUGGING_INFO.  I'd rather not add
    another user at this point.  How tough would it be to support dwarf?

Yes relying on stabs sucks.

The main problem I found is that it is not possible to define a CFA, nor
to unwind frames in any way.  Given these limitations, is it still
possible to make GCC emit minimally useful DWARF, with locations and
such?  That would be great.
    
    > +;;;; Data movement
    > +
    > +(define_mode_iterator AMM [QI HI SI DI SF DF])
    > +
    > +(define_expand "mov<AMM:mode>"
    > +  [(set (match_operand:AMM 0 "general_operand" "")
    > +        (match_operand:AMM 1 "general_operand" ""))]
    > +        ""
    > +        "
    > +{
    > +    if (!register_operand(operands[0], <AMM:MODE>mode)
    > +        && !register_operand(operands[1], <AMM:MODE>mode))
    > +         operands[1] = force_reg (<AMM:MODE>mode, operands[1]); 
    > +
    > +    /* In cases where the moved entity is a constant address, we
    > +       need to emit an extra mov and modify the second operand to
    > +       obtain something like:
    > +
    > +         lddw %T, %1
    > +         ldxw %0, [%T+0]
    > +
    > +       Ditto for stores.  */
    > +
    > +    if (MEM_P (operands[1])
    > +        && CONSTANT_ADDRESS_P (XEXP (operands[1], 0)))
    > +      {
    > +         rtx tmp = gen_reg_rtx (DImode);
    > +
    > +         emit_move_insn (tmp, XEXP (operands[1], 0));
    > +         operands[1] = gen_rtx_MEM (<AMM:MODE>mode, tmp);
    > +      }
    > +
    > +    if (MEM_P (operands[0])
    > +        && CONSTANT_ADDRESS_P (XEXP (operands[0], 0)))
    > +      {
    > +         rtx tmp = gen_reg_rtx (DImode);
    > +  
    > +         emit_move_insn (tmp, XEXP (operands[0], 0));
    > +         operands[0] = gen_rtx_MEM (<AMM:MODE>mode, tmp);
    > +      }
    > +
    > +}")
    Hmm, what happens if you need to reload something from a constant
    address?  You can't call gen_reg_rtx once register allocation has
    started.  THe case where you need a scratch register really feels like
    you need to be defining secondary reloads.

I really have to think about this.  Richard's comment about the
possibility of not considering constant addresses legit already made me
ponder whether it would be better to use a different strategy here.

    Generally it looks pretty good.  I'd like to take more more looksie at
    patch #2 of the series after you've addressed the comments you've
    received so far.

Thanks, I appreciate.
Expect a V3 of the series soon :)

Reply via email to