On 05/22/2019 01:52 AM, Matthew Cover wrote:
> > > __sk_buff has a member tc_classid which I'm interested in accessing from
> > > the skb bpf context.
> > >
> > > A bpf program which accesses skb->tc_classid compiles, but fails
> > > verification; the specific failure is "invalid bpf_context access".
> > >
> > >     if (skb->tc_classid != 0)
> > >         return 1;
> > >     return 0;
> > >
> > > Some of the tests in tools/testing/selftests/bpf/verifier/ (those on
> > > tc_classid) further confirm that this is, in all likelihood, intentional
> > > behavior.
> > >
> > > The very similar bpf program which instead accesses skb->mark works as
> > > desired.
> > >
> > >     if (skb->mark != 0)
> > >         return 1;
> > >     return 0;
> >
> > You should be able to access skb->tc_classid, perhaps you're using the wrong
> > program type? BPF_PROG_TYPE_SCHED_CLS is supposed to work (if not we'd have a
> > regression).
> >
>
> I am in fact using BPF_PROG_TYPE_SOCKET_FILTER and using the program as
> PACKET_FANOUT_DATA with PACKET_FANOUT_EBPF.
>
> I have been working on a series of utils which leverage PACKET_FANOUT to
> provide various per-socket-fd (per-cpu, per-queue,
> per-rx-flow-hash-indirection-table-idx) statistics and pcap files. While
> playing with PACKET_FANOUT_EBPF, I realized that I could use the bpf program
> to categorize packets in ways packet-filter(7) does not provide.
>
> As a concrete example, I plan to build a util `rxtxmark` which could be
> passed something like `--mark-list 42,88`. This would be translated to a bpf
> program where the return code is the ordinality of the mark in the list.
>
>     if (skb->mark == 42)
>         return 1;
>     if (skb->mark == 88)
>         return 2;
>     return 0;
>
> Packets enqueued to fd0 are simply ignored. Packets enqueued to the other fds
> are processed into pcaps and statistics.
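The mark-list mapping described above amounts to a simple lookup. What follows
is only a minimal userspace C sketch of that logic, not the actual `rxtxmark`
code: the helper name mark_to_fd_index is hypothetical, and a real
PACKET_FANOUT_EBPF program would read the value from skb->mark and would need
the list compiled in or supplied some other way.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not from any existing util):
 * return the 1-based position of `mark` in `list`, or 0 if it is absent.
 * Index 0 corresponds to fd0, whose packets are simply ignored.
 */
static uint32_t mark_to_fd_index(uint32_t mark, const uint32_t *list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (list[i] == mark)
            return (uint32_t)(i + 1);
    }
    return 0;
}
```

With the list {42, 88}, a mark of 42 maps to 1, a mark of 88 maps to 2, and
any other mark maps to 0, matching the hand-written if chain above.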
>
> While I may build a util for tc_classid which does per-user-requested-classid
> pcaps and statistics like `rxtxmark` does for marks, I'm also interested in
> using tc_classid as a simple way to capture tx packets from a long running
> program on the fly.
>
> The program under inspection would simply be added to a net_cls cgroup which
> has a unique classid defined. A bpf program would be attached to map packets
> with that classid to fd1. While I can do this already by using iptables to
> translate the tc_classid to a mark, that complicates the implementation
> greatly since the firewall has to be touched (which is probably overreaching
> for a packet capture util and would most likely be left to the user to
> configure).
>
And only now do I discover netsniff-ng; a seriously cool set of utils! Thank
you for your efforts there Daniel! I still plan to continue advancing my
various PACKET_FANOUT utils and eventually seeing how much, if any, of the
common code would be of interest to the libpcap maintainers. But very cool that
a quick look at the netsniff-ng help file shows that rxtxcpu et al could be
accomplished with the right number of concurrent invocations of netsniff-ng.

> > > I built a kernel (v5.1) with 4 instances of the following line removed
> > > from net/core/filter.c to test the behavior when the instructions pass
> > > verification.
> > >
> > >    switch (off) {
> > > -  case bpf_ctx_range(struct __sk_buff, tc_classid):
> > >    ...
> > >        return false;
> > >
> > > It appears skb->tc_classid is always zero within my bpf program, even
> > > when I verify by other means (e.g. netfilter) that the value is set
> > > non-zero.
> > >
> > > I gather that sk_buff proper sometimes (i.e. at some layers) has
> > > qdisc_skb_cb stored in skb->cb, but not always.
> > >
> > > I suspect that the tc_classid is available at l3 (and therefore to utils
> > > like netfilter, ip route, tc), but not at l2 (and not to AF_PACKET).
> >
> > From tc/BPF context you can use it; it's been a long time, but I think back
> > then we mapped it into cb[] so it can be used within the BPF context to pass
> > skb data around e.g. between tail calls, and cls_bpf_classify() when in
> > direct-action mode which likely everyone is/should-be using then maps that
> > skb->tc_classid u16 cb[] value to res->classid on program return which then
> > in either sch_handle_ingress() or sch_handle_egress() is transferred into
> > the skb->tc_index.
> >
>
> It sounds like, just before the start of a BPF_PROG_TYPE_SCHED_CLS bpf
> program, tc_classid is placed in skb->cb. The missing plumbing to support my
> use case is probably the same thing, but for BPF_PROG_TYPE_SOCKET_FILTER.
>
> I'll see about familiarizing myself with both as time permits and perhaps I
> can get tc_classid working for a BPF_PROG_TYPE_SOCKET_FILTER program; it
> certainly sounds like it's doable.
>
> > > Is it impractical to make skb->tc_classid available in this bpf context
> > > or is there just some plumbing which hasn't been connected yet?
> > >
> > > Is my suspicion that skb->cb no longer contains qdisc_skb_cb due to
> > > crossing a layer boundary well founded?
> > >
> > > I'm willing to look into hooking things together as time permits if it's
> > > a feasible task.
> > >
> > > It's trivial to have iptables match on tc_classid and set a mark which is
> > > available to bpf at l2, but I'd like to better understand this.
> > >
> > > Thanks,
> > > Matt C.
> > >