>> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
>> index ef53ede..be4e18d 100644
>> --- a/net/sched/sch_api.c
>> +++ b/net/sched/sch_api.c
>> @@ -1865,6 +1865,7 @@ int tc_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>>  	const struct tcf_proto *old_tp = tp;
>>  	int limit = 0;
>>
>> +	skb->tc_at_ingress = !!(tp && tp->q->flags & TCQ_F_INGRESS);
>
> I'd prefer if skb->tc_at_ingress were set directly to 0/1 in sch_handle_ingress()
> and __dev_queue_xmit() as we do right now; this would avoid the above test in the
> fast path, and it would also avoid setting the same thing in tc_classify() multiple
> times, e.g. on the egress path when walking through multiple qdiscs. I don't see
> anything in layers above tc that would read it and expect an AT_STACK-like
> equivalent. skb_reset_tc() could thus still remain as you have above in fast paths
> like __netif_receive_skb_core().
I had been thinking about that. After submitting this I noticed that Florian's
patchset had an elegant solution to avoid the branch: set tc_at_ingress in
handle_ing() before calling tc_classify() and clear it on the return path. That
way we set and clear it exactly once on ingress, regardless of how deep the
classifier chain is, and never touch it in any other code path:

https://patchwork.ozlabs.org/patch/472698/

What do you think of that approach?
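
For reference, a rough sketch of the set/clear pattern I mean. This is only my
own illustration, not Florian's actual patch: the classifier-chain lookup,
stats updates, and drop/steal handling are elided, the function name is made
up, and I'm assuming the current tc_classify() signature with the compat_mode
flag.

/* Sketch only: mark the skb as at-ingress once around the whole classifier
 * walk in the ingress hook, instead of deriving it from TCQ_F_INGRESS inside
 * tc_classify() on every call.  "cl" stands for the head of the ingress
 * classifier chain, however the tree looks it up.
 */
#include <linux/skbuff.h>
#include <net/pkt_sched.h>
#include <net/pkt_cls.h>

static struct sk_buff *handle_ing_sketch(struct sk_buff *skb,
					 struct tcf_proto *cl)
{
	struct tcf_result cl_res;

	if (!cl)
		return skb;

	skb->tc_at_ingress = 1;		/* set once, before the classifier walk */

	switch (tc_classify(skb, cl, &cl_res, false)) {
	case TC_ACT_OK:
	case TC_ACT_RECLASSIFY:
		skb->tc_index = TC_H_MIN(cl_res.classid);
		break;
	default:
		/* shot/stolen/redirect handling elided in this sketch */
		break;
	}

	skb->tc_at_ingress = 0;		/* clear on the return path */
	return skb;
}

With that, tc_classify() itself needs no change, and the egress path never has
to touch the bit.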