Hi Jeff.
> This patch adds a port for the Linux kernel eBPF architecture to GCC. > > ChangeLog: > > * configure.ac: Support for bpf-*-* targets. > * configure: Regenerate. > > contrib/ChangeLog: > > * config-list.mk (LIST): Disable go in bpf-*-* targets. > > gcc/ChangeLog: > > * config.gcc: Support for bpf-*-* targets. > * common/config/bpf/bpf-common.c: New file. > * config/bpf/t-bpf: Likewise. > * config/bpf/predicates.md: Likewise. > * config/bpf/constraints.md: Likewise. > * config/bpf/bpf.opt: Likewise. > * config/bpf/bpf.md: Likewise. > * config/bpf/bpf.h: Likewise. > * config/bpf/bpf.c: Likewise. > * config/bpf/bpf-protos.h: Likewise. > * config/bpf/bpf-opts.h: Likewise. > * config/bpf/bpf-helpers.h: Likewise. > * config/bpf/bpf-helpers.def: Likewise. So I think various folks have already mentioned the configure rebuild issues, formatting and other stuff. I'm going to try to keep them all in mind so that I don't duplicate anything. If I do duplicate someone's comment, apologies in advance. At a high level I realize there's lots of things not supported due to the restricted environment it'll ultimately be used in. However, you might want to consider extensions that would allow larger portions of the gcc testsuite to run and some kind of user mode simulator so that you can reasonably test the target. Not a requirement, but could be useful (from experience :-) I agree to both regards. I have been thinking about Segher's suggestion on providing options to lift some of the limitations, for compiler testing. Unfortunately, many of the restrictions are deeply rooted in the design of the architecture... or the other way around. Finding sane ways to implement these extensions will be fun :) As for the simulator, I have one, along with an initial GDB port... but it doesn't work very well due to a particularly nasty bug in CGEN. I have a patch that seems to fix it but, as everything that touches cgen's ifield handling code, it is difficult to be 100% sure about that, and I also need to adapt some of the other existing cgen-based ports... so it will take a while before I have something that can run the GCC testsuite. > diff --git a/gcc/common/config/bpf/bpf-common.c b/gcc/common/config/bpf/bpf-common.c > new file mode 100644 > index 00000000000..a68feb62897 > --- /dev/null > +++ b/gcc/common/config/bpf/bpf-common.c [ snip ] > +/* Implement TARGET_OPTION_OPTIMIZATION_TABLE. */ > +static const struct default_options bpf_option_optimization_table[] = > + { > + /* Enable -funroll-all-loops by default. */ > + { OPT_LEVELS_ALL, OPT_funroll_all_loops, NULL, 1 }, > + /* Disable -fomit-frame-pointer by default. */ > + { OPT_LEVELS_ALL, OPT_fomit_frame_pointer, NULL, 0 }, > + { OPT_LEVELS_NONE, 0, NULL, 0 } > + }; Curious about the motivation on the loop unrolling stuff. In general we discourage targets from mucking around with the default flags/optimizations, but it is sometimes the right thing to do. The kernel verifier doesn't allow backward jumps. This may change at some point. There is much discussion among the kernel hackers in whether it is possible to allow bounded loops in a safe way. In that case, some of the restrictions may be lifted. For now, only loops that can be peeled/massaged and then fully unrolled are supported. Rather than -fomit-frame-pointer, I think you can use the FRAME_POINTER_REQUIRED hook if you always want a frame pointer. Oh so specifying -fomit-frame-pointer there is redundant... good to know. Will remove it. > diff --git a/gcc/config/bpf/bpf-helpers.h b/gcc/config/bpf/bpf-helpers.h > new file mode 100644 > index 00000000000..2fe96be7637 > --- /dev/null > +++ b/gcc/config/bpf/bpf-helpers.h I can't remember, is this an installed header that consumers are expected to use? If so you might want to be careful with polluting user code with BPF #defines such as BPF_ANY, BPF_NOEXIST, BPF_EXIST, etc. The #defines for mapping to the builtins are probably OK though. Yes, it is a header file for consumers. Unfortunately, the whole purpose of the header is to provide an interface that is compatible with the kernel's bpf_helpers.h (which at the moment is llvm-specific). The API is given :( This is a point I plan to raise with the eBPF developers in a few weeks, at the Linux Plumbers conference in Lisbon. > diff --git a/gcc/config/bpf/bpf.c b/gcc/config/bpf/bpf.c > new file mode 100644 > index 00000000000..4a42259a9c3 > --- /dev/null > +++ b/gcc/config/bpf/bpf.c > @@ -0,0 +1,1136 @@ [ ... ] > + > +/* Return the builtin code corresponding to the kernel helper builtin > + __builtin_NAME, or 0 if the name doesn't correspond to a kernel > + helper builtin. */ > + > +static inline int > +bpf_helper_code (const char *name) > +{ > + int i; > + > + for (i = 1; i < BPF_BUILTIN_HELPER_MAX; ++i) > + { > + if (strcmp (name, bpf_helper_names[i]) == 0) > + return i; > + } > + > + return 0; > +} Does this get called often? If so the linear search could end up being expensive from a compile-time standpoint. It gets called per function call to a symbol with the form __builtin_bpf_helper_*... you think it is worth of a hash? > +#define KERNEL_VERSION_CODE "__BPF_KERNEL_VERSION_CODE__=" > + kernel_version_code > + = (char *) alloca (strlen (KERNEL_VERSION_CODE) + 7 + 1); > + strcpy (kernel_version_code, KERNEL_VERSION_CODE); > +#undef KERNEL_VERSION_CODE > + strcat (kernel_version_code, version_code); > + builtin_define (kernel_version_code); > + } > +} Does builtin_define copy its argument? If not, then I'd expect this to be problematical as the alloca'd space will be reclaimed. Yes it does. > + > +static void > +bpf_compute_frame (void) > +{ > + int stack_alignment = STACK_BOUNDARY / BITS_PER_UNIT; > + int padding_locals, regno; > + > + /* Set the space used in the stack by local variables. This is > + rounded up to respect the minimum stack alignment. */ > + cfun->machine->local_vars_size = get_frame_size (); > + > + padding_locals = cfun->machine->local_vars_size % stack_alignment; > + if (padding_locals) > + padding_locals = stack_alignment - padding_locals; > + > + cfun->machine->local_vars_size += padding_locals; > + > + /* Set the space used in the stack by callee-saved used registers in > + the current function. There is no need to round up, since the > + registers are all 8 bytes wide. */ > + for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) > + if ((!fixed_regs[regno] > + && df_regs_ever_live_p (regno) > + && !call_used_regs[regno]) > + || (cfun->calls_alloca > + && regno == STACK_POINTER_REGNUM)) > + cfun->machine->callee_saved_reg_size += 8; > + > + /* Check that the total size of the frame doesn't exceed the limit > + imposed by eBPF: currently 512 bytes. */ > + if ((cfun->machine->local_vars_size > + + cfun->machine->callee_saved_reg_size) > 512) > + { > + static int stack_limit_exceeded = 0; > + > + if (!stack_limit_exceeded) > + error ("eBPF stack limit of 512 bytes exceeded"); > + stack_limit_exceeded = 1; > + } Is the stack limit likely to change? Would a param work better here which would allow us to accommodate such a change without having to re-release GCC? It will probably be increased at some point. Using a param sounds like a good idea. However... The stack limit is associated with kernel version. I guess we can just set the appropriate defaults in bpf_option_override if we make it variable, in case the user didn't specify a --param for it, so no problem. Also, if we allow the user to specify a stack frame bigger than 0x7fff, bpf_expand_prologue will break. Probably in that case we want to detect this, warn and truncate to the -mkernel's default, also in bpf_option_override. Does that sound reasonable? > diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h > + > +/**** Debugging Info ****/ > + > +/* We cannot support DWARF2 because of the limitations of eBPF. */ > +#define DBX_DEBUGGING_INFO Umm, we're trying to get rid of DBX_DEBUGGING_INFO. I'd rather not add another user at this point. How tough would it be to support dwarf? Yes relying on stabs sucks. The main problem I found is that it is not possible to define a CFA, nor to unwind frames in any way. Given these limitations, is it still possible to make GCC emit minimally useful DWARF, with locations and such? That would be great. > +;;;; Data movement > + > +(define_mode_iterator AMM [QI HI SI DI SF DF]) > + > +(define_expand "mov<AMM:mode>" > + [(set (match_operand:AMM 0 "general_operand" "") > + (match_operand:AMM 1 "general_operand" ""))] > + "" > + " > +{ > + if (!register_operand(operands[0], <AMM:MODE>mode) > + && !register_operand(operands[1], <AMM:MODE>mode)) > + operands[1] = force_reg (<AMM:MODE>mode, operands[1]); > + > + /* In cases where the moved entity is a constant address, we > + need to emit an extra mov and modify the second operand to > + obtain something like: > + > + lddw %T, %1 > + ldxw %0, [%T+0] > + > + Ditto for stores. */ > + > + if (MEM_P (operands[1]) > + && CONSTANT_ADDRESS_P (XEXP (operands[1], 0))) > + { > + rtx tmp = gen_reg_rtx (DImode); > + > + emit_move_insn (tmp, XEXP (operands[1], 0)); > + operands[1] = gen_rtx_MEM (<AMM:MODE>mode, tmp); > + } > + > + if (MEM_P (operands[0]) > + && CONSTANT_ADDRESS_P (XEXP (operands[0], 0))) > + { > + rtx tmp = gen_reg_rtx (DImode); > + > + emit_move_insn (tmp, XEXP (operands[0], 0)); > + operands[0] = gen_rtx_MEM (<AMM:MODE>mode, tmp); > + } > + > +}") Hmm, what happens if you need to reload something from a constant address? You can't call gen_reg_rtx once register allocation has started. THe case where you need a scratch register really feels like you need to be defining secondary reloads. I really have to think about this. Richard's comment about the possibility of not considering constant addresses legit already made me ponder whether it would be better to use a different strategy here. Generally it looks pretty good. I'd like to take more more looksie at patch #2 of the series after you've addressed the comments you've received so far. Thanks, I appreciate. Expect a V3 of the series soon :)