Sun, Oct 30, 2016 at 07:44:43PM CET, kubak...@wp.pl wrote:
>On Sun, 30 Oct 2016 19:01:03 +0100, Jiri Pirko wrote:
>> Sun, Oct 30, 2016 at 06:45:26PM CET, kubak...@wp.pl wrote:
>> >On Sun, 30 Oct 2016 17:38:36 +0100, Jiri Pirko wrote:
>> >> Sun, Oct 30, 2016 at 11:26:49AM CET, tg...@suug.ch wrote:
>> [...]
>> [...]
>> >> [...]
>> >> [...]
>> >> [...]
>> >> [...]
>> [...]
>> >>
>> >> Agreed.
>> >
>> >Just to clarify my intention here was not to suggest the use of eBPF as
>> >the IR. I was merely cautioning against bundling the new API with P4,
>> >for multiple reasons. As John mentioned P4 spec was evolving in the
>> >past. The spec is designed for HW more capable than the switch ASICs we
>> >have today. As vendors move to provide more configurability we may need
>> >to extend the API beyond P4. We may want to extend this API for SW
>> >hand-offs (as suggested by Thomas) which are not part of P4 spec. Also
>> >John showed examples of matchd software which already uses P4 at the
>> >frontend today and translates it to different targets (eBPF, u32, HW).
>> >It may just be about the naming but I feel like calling the new API
>> >more generically, switch AST or some such may help to avoid unnecessary
>> >ties and confusion.
>>
>> Well, that basically means to create "something" that could be used
>> to translate p4 source to. Not sure how exactly this "something" should
>> look like and how different would it be from p4. I thought it might
>> be good to benefit from the p4 definition and use it directly. Not sure.
>
>We have to translate the P4 into "something" already, that something
>is the AST we will load into the kernel. Or were you planning to use
>some official P4 AST? I'm not suggesting we add our own high level
I'm not aware of the existence of any official P4 AST. We have to figure
it out.

>language. I agree that P4 is a good starting point, and perhaps a good
>high level language. I'm just cautious of creating an equivalency
>between high level language (P4) and the kernel ABI.

Understood. Definitely good to be very cautious when defining a kernel API.

>
>Perhaps I'm just wasting everyone's time with this.
>
>> >>
>> >> Exactly. Following drawing shows p4 pipeline setup for SW and HW:
>> >>
>> >>                                  |
>> >>                                  |               +--> ebpf engine
>> >>                                  |               |
>> >>                                  |               |
>> >>                                  |           compilerB
>> >>                                  |               ^
>> >>                                  |               |
>> >> p4src --> compilerA --> p4ast --TCNL--> cls_p4 --+-> driver -> compilerC -> HW
>> >>                                  |
>> >>                        userspace | kernel
>> >>                                  |
>> >>
>> >> Now please consider runtime API for rule insertion/removal/stats/etc.
>> >> Also, the single API is cls_p4 here:
>> >>
>> >>                                  |
>> >>                                  |
>> >>                                  |
>> >>                                  |
>> >>                                  |               ebpf map fillup
>> >>                                  |               ^
>> >>                                  |               |
>> >>                       p4 rule --TCNL--> cls_p4 --+-> driver -> HW table fillup
>> >>                                  |
>> >>                        userspace | kernel
>> >>
>> >
>> >My understanding was that the main purpose of SW eBPF translation would
>> >be to piggy back on eBPF userspace map API. This seems not to be the
>> >case here? Is "P4 rule" being added via some new API? From performance
>>
>> cls_p4 TC classifier.
>
>Oh, so the cls_p4 is just a proxy forwarding the requests to drivers
>or eBPF backend. Got it. Sorry for being slow. And the requests
>come down via change() op or something new? I wonder how such scheme
>compares to eBPF maps performance-wise (updates/sec).

I have no numbers at this time. I guess Jamal and Alexei did some
measurements in this area in the past.

>
>> >perspective the SW AST implementation would probably not be any slower
>> >than u32, so I don't think we need eBPF for performance. I must be
>> >misreading this, if we want eBPF fallback we must extend eBPF with all
>> >the map types anyway... so we could just use eBPF map API?
>> >I believe John has already done some work in this space (see his GitHub :))
>>
>> I don't think you can use existing BPF maps kernel API. You would still
>> have to have another API just for the offloaded datapath. And that is
>> a bypass. I strongly believe we need a single kernel API for both
>> SW and HW datapath setup and runtime configuration.
>
>Agreed, single API is a must. What is the HW characteristic which
>doesn't fit with eBPF map API, though? For eBPF offload I was planning
>on adding offload hooks on eBPF map lookup/update paths and a way of
>associating the map with a netdev. This should be enough to forward
>updates to the driver and intercept reads to return the right
>statistics.
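
To make the map-hook idea quoted above concrete, here is a minimal sketch in plain userspace C. It is not kernel code and not the eBPF map API: every name in it (`struct sketch_map`, `struct map_offload_ops`, `sketch_map_attach`, `sketch_map_update`, `sketch_map_lookup`, `struct fake_hw`) is hypothetical and exists only to illustrate the flow. It assumes a toy array map that can be associated with one device, forwards each update to that device, and lets the device answer reads so that the caller sees the hardware's counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAP_SIZE 16

/* Hypothetical driver callbacks; an assumption, not a real kernel interface. */
struct map_offload_ops {
    /* Mirror a map update into the device table. Returns 0 on success. */
    int (*update)(void *priv, uint32_t key, uint64_t value);
    /* Answer a read from the device. Returns 0 and fills *value on success. */
    int (*lookup)(void *priv, uint32_t key, uint64_t *value);
};

/* Toy array map, optionally associated with an offloading device. */
struct sketch_map {
    uint64_t values[SKETCH_MAP_SIZE];
    const struct map_offload_ops *ops; /* NULL when the map is SW only */
    void *dev_priv;                    /* stands in for the netdev */
};

/* Associate the map with a device (the "netdev" in the discussion). */
static void sketch_map_attach(struct sketch_map *m,
                              const struct map_offload_ops *ops,
                              void *dev_priv)
{
    assert(m != NULL);
    m->ops = ops;
    m->dev_priv = dev_priv;
}

/* Update path: write the SW copy, then forward the update to the driver. */
static int sketch_map_update(struct sketch_map *m, uint32_t key, uint64_t value)
{
    if (key >= SKETCH_MAP_SIZE)
        return -1;
    m->values[key] = value;
    if (m->ops && m->ops->update)
        return m->ops->update(m->dev_priv, key, value);
    return 0;
}

/* Lookup path: let the driver intercept the read, else use the SW copy. */
static int sketch_map_lookup(struct sketch_map *m, uint32_t key, uint64_t *value)
{
    if (key >= SKETCH_MAP_SIZE)
        return -1;
    if (m->ops && m->ops->lookup &&
        m->ops->lookup(m->dev_priv, key, value) == 0)
        return 0;
    *value = m->values[key];
    return 0;
}

/* Stand-in for a device that keeps per-entry counters in its own table. */
struct fake_hw {
    uint64_t counters[SKETCH_MAP_SIZE];
};

static int fake_hw_update(void *priv, uint32_t key, uint64_t value)
{
    struct fake_hw *hw = priv;

    if (key >= SKETCH_MAP_SIZE)
        return -1;
    hw->counters[key] = value;
    return 0;
}

static int fake_hw_lookup(void *priv, uint32_t key, uint64_t *value)
{
    struct fake_hw *hw = priv;

    if (key >= SKETCH_MAP_SIZE)
        return -1;
    *value = hw->counters[key];
    return 0;
}

/* Simulates the device counting one packet that hit entry `key`. */
static void fake_hw_rx(struct fake_hw *hw, uint32_t key)
{
    if (key < SKETCH_MAP_SIZE)
        hw->counters[key]++;
}

static const struct map_offload_ops fake_hw_ops = {
    .update = fake_hw_update,
    .lookup = fake_hw_lookup,
};
```

In this sketch a caller attaches `fake_hw_ops` to a map, resets an entry with `sketch_map_update()`, and later reads it back with `sketch_map_lookup()`; the value returned is whatever the fake device has counted since, not the stale software copy. A map with no ops attached behaves as an ordinary software map, which is the fallback case.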