yonghong-song added a comment.

In D70696#1767616 <https://reviews.llvm.org/D70696#1767616>, @dblaikie wrote:

> Many of the test cases could be collapsed into one file - using different 
> variables that are used, unused, locally or globally declared, etc.


Okay. I will try to consolidate them into one file, or at least fewer files. 
Originally I used separate files to guard against a future clang emitting the 
global variables in a different order.

> Is this new code only active for C compilations? (does clang reject requests 
> for the bpf target when the input is C++?) I ask due to the concerns around 
> globals used in inline functions where the inline function is unused - though 
> C has inline functions too, so I guess the question stands: Is that a 
> problem? What happens?

Currently, yes. My implementation is only active for C compilation.
In the kernel documentation 
(https://www.kernel.org/doc/Documentation/networking/filter.txt), we have:

  The new instruction set was originally designed with the possible goal in
  mind to write programs in "restricted C" and compile into eBPF with a optional
  GCC/LLVM backend, so that it can just-in-time map to modern 64-bit CPUs with
  minimal performance overhead over two steps, that is, C -> eBPF -> native
  code.

LLVM itself can compile a C++ program for the BPF target, but "officially" we 
do not support this. That is why I restricted the change to C only. We get very 
little usage or testing of C++ programs from users.

Do you have a concrete example for this? I tried the following:

  -bash-4.4$ cat t.h
  inline int foo() { extern int g; return g; }
  -bash-4.4$ cat t.c
  int bar() { return 0; }
  -bash-4.4$ clang -target bpf -g -O0 -S -emit-llvm t.c

`foo` is not used, and clang seems smart enough to deduce that `g` is not used 
either, so no debuginfo is emitted in this case.

In general, if an inline function is unused but references an external 
variable, the worst case is extra debuginfo for that external variable. Since 
the variable is not used, it won't affect the bpf loader.

> Should this be driven by a lower level of code generation - ie: is it OK to 
> only produce debug info descriptions for variables that are referenced in the 
> resulting LLVM IR? (compile time constants wouldn't be described then, for 
> instance - since they won't be code generated, loaded from memory, etc)

Yes, it is OK to produce debug info only for variables that are referenced in 
the resulting LLVM IR. But we are discussing extern variables here, not 
compile-time constants. Maybe I am missing something?

> Is there somewhere I should be reading about the design requirement for these 
> global variable descriptions to understand the justification for them & the 
> ramifications if there are bugs that cause them not to be emitted?

We do not have design documents yet. Below are two links with some 
explanation; I can explain more if needed:

1. 
https://lore.kernel.org/bpf/CAEf4BzYCNo5GeVGMhp3fhysQ=_axAf=23ptwazs-yayafmx...@mail.gmail.com/T/#t

The config is typically found at /boot/config-<...> on a Linux machine. Config 
entries typically look like:

  CONFIG_CC_IS_GCC=y
  CONFIG_GCC_VERSION=40805
  CONFIG_INITRAMFS_SOURCE=""

Suppose a bpf program wants to check a config value and act on it. The user can 
write:

  extern bool CONFIG_CC_IS_GCC;
  extern int CONFIG_GCC_VERSION;
  extern char CONFIG_INITRAMFS_SOURCE[20];
  ...
  if (CONFIG_CC_IS_GCC) ...
  map_val = CONFIG_GCC_VERSION; 
  __builtin_memcpy(map_value, CONFIG_INITRAMFS_SOURCE, 8);

The bpf loader will create a data section to store all of the above info and 
patch the correct addresses into the code.
Without type info for the extern variables, it becomes a guessing game what 
type/size the user is using.
With precise type information, the bpf loader can do the relocation much more 
easily.
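
To illustrate why the size and alignment of each extern matter, here is a 
minimal sketch of how a loader could lay out such a data section. It is not 
the actual bpf loader code: `struct extern_desc` and `layout_externs` are 
hypothetical names, and the sizes and alignments are assumed to have been 
recovered from the type info already.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one extern variable, as a loader might
 * recover it from type info. Not a real loader structure. */
struct extern_desc {
    const char *name;
    size_t size;    /* size in bytes, taken from the type info */
    size_t align;   /* required alignment, must be a power of two */
    size_t offset;  /* filled in by layout_externs() */
};

/* Assign each extern an aligned offset inside one data section and
 * return the total size of that section. */
size_t layout_externs(struct extern_desc *descs, size_t n)
{
    size_t off = 0;

    for (size_t i = 0; i < n; i++) {
        size_t a = descs[i].align;

        assert(a != 0 && (a & (a - 1)) == 0);
        off = (off + a - 1) & ~(a - 1);  /* round up to the alignment */
        descs[i].offset = off;
        off += descs[i].size;
    }
    return off;
}
```

Each recorded offset is what the loader would then patch into the instructions 
that reference the corresponding extern. Without the size and alignment, none 
of these offsets could be computed reliably.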

2. 
https://lore.kernel.org/bpf/87eez4odqp....@toke.dk/T/#m8d5c3e87ffe7f2764e02d722cb0d8cbc136880ed

This is for bpf program verification. For example, take bpf_prog1:

  foo(...) {
    ... x ... y ...
    z =  bar(x /*struct t * */, y /* int */);
    ...
  }

Here no body for bar is available yet.
The kernel verifier is still able to verify program "foo"
and make sure that the types of all arguments passed to bar
are correct.

Later, suppose there is a program
prog2(struct t *a, int b)
which is verified independently.

Then, in the kernel, prog1 can call prog2 if their parameter types
and return types match. This is the BPF way of dynamic linking.
The types of externally used functions can help cut down
verification cost at linking time.
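
The matching step can be sketched as follows. This is only an illustration 
under simplifying assumptions: types are compared by name as strings, whereas 
a real implementation would compare type descriptions from the debug info. 
`struct func_proto` and `proto_match` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARAMS 5  /* BPF calls pass at most five arguments (R1-R5) */

/* Hypothetical, simplified function prototype: types are plain strings. */
struct func_proto {
    const char *ret_type;
    size_t nparams;
    const char *param_types[MAX_PARAMS];
};

/* Return true if the extern declaration seen by the caller matches the
 * prototype of the independently verified callee. */
bool proto_match(const struct func_proto *decl,
                 const struct func_proto *def)
{
    assert(decl != NULL && def != NULL);

    if (decl->nparams > MAX_PARAMS || decl->nparams != def->nparams)
        return false;
    if (strcmp(decl->ret_type, def->ret_type) != 0)
        return false;
    for (size_t i = 0; i < decl->nparams; i++)
        if (strcmp(decl->param_types[i], def->param_types[i]) != 0)
            return false;
    return true;
}
```

With a check of this kind, linking only has to compare the two prototypes, 
since each program body was already verified against its own declared types.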

If there is no debug information for these extern variables, the current
proposal is for the bpf loader and verifier to fail. Users can always work
around this by creating bpf maps for the first use case (which is more
expensive and less user friendly) and by linking statically before loading
into the kernel for the second use case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70696/new/

https://reviews.llvm.org/D70696


