On Tue 19 May 2020 at 21:39, Cong Wang <xiyou.wangc...@gmail.com> wrote: > On Tue, May 19, 2020 at 2:10 AM Vlad Buslov <vla...@mellanox.com> wrote: >> >> >> On Mon 18 May 2020 at 21:46, Cong Wang <xiyou.wangc...@gmail.com> wrote: >> > On Sun, May 17, 2020 at 11:44 PM Vlad Buslov <vla...@mellanox.com> wrote: >> >> >> >> >> >> On Sun 17 May 2020 at 22:13, Cong Wang <xiyou.wangc...@gmail.com> wrote: >> >> > On Fri, May 15, 2020 at 4:40 AM Vlad Buslov <vla...@mellanox.com> wrote: >> >> >> >> >> >> Output rate of current upstream kernel TC filter dump implementation if >> >> >> relatively low (~100k rules/sec depending on configuration). This >> >> >> constraint impacts performance of software switch implementation that >> >> >> rely on TC for their datapath implementation and periodically call TC >> >> >> filter dump to update rules stats. Moreover, TC filter dump output a >> >> >> lot >> >> >> of static data that don't change during the filter lifecycle (filter >> >> >> key, specific action details, etc.) which constitutes significant >> >> >> portion of payload on resulting netlink packets and increases amount of >> >> >> syscalls necessary to dump all filters on particular Qdisc. In order to >> >> >> significantly improve filter dump rate this patch sets implement new >> >> >> mode of TC filter dump operation named "terse dump" mode. In this mode >> >> >> only parameters necessary to identify the filter (handle, action >> >> >> cookie, >> >> >> etc.) and data that can change during filter lifecycle (filter flags, >> >> >> action stats, etc.) are preserved in dump output while everything else >> >> >> is omitted. >> >> >> >> >> >> Userspace API is implemented using new TCA_DUMP_FLAGS tlv with only >> >> >> available flag value TCA_DUMP_FLAGS_TERSE. Internally, new API requires >> >> >> individual classifier support (new tcf_proto_ops->terse_dump() >> >> >> callback). Support for action terse dump is implemented in act API and >> >> >> don't require changing individual action implementations. >> >> > >> >> > Sorry for being late. >> >> > >> >> > Why terse dump needs a new ops if it only dumps a subset of the >> >> > regular dump? That is, why not just pass a boolean flag to regular >> >> > ->dump() implementation? >> >> > >> >> > I guess that might break user-space ABI? At least some netlink >> >> > attributes are not always dumped anyway, so it does not look like >> >> > a problem? >> >> > >> >> > Thanks. >> >> >> >> Hi Cong, >> >> >> >> I considered adding a flag to ->dump() callback but decided against it >> >> for following reasons: >> >> >> >> - It complicates fl_dump() code by adding additional conditionals. Not a >> >> big problem but it seemed better for me to have a standalone callback >> >> because with combined implementation it is even hard to deduce what >> >> does terse dump actually output. >> > >> > This is not a problem, at least you can add a big if in fl_dump(), >> > something like: >> > >> > if (terse) { >> > // do terse dump >> > return 0; >> > } >> > // normal dump >> >> That is what I was trying to prevent with my implementation: having big >> "superfunctions" that implement multiple things with branching. Why not >> just have dedicated callbacks that do exactly one thing? > > 1. Saving one unnecessary ops. > 2. Easier to trace the code: all dumpings are in one implementation.
Okay, I see your point. > >> >> > >> >> >> >> - My initial implementation just called regular dump for classifiers >> >> that don't support terse dump, but in internal review Jiri insisted >> >> that cls API should fail if it can't satisfy user's request and having >> >> dedicated callback allows implementation to return an error if >> >> classifier doesn't define ->terse_dump(). With flag approach it would >> >> be not trivial to determine if implementation actually uses the flag. >> > >> > Hmm? For those not support terse dump, we can just do: >> > >> > if (terse) >> > return -EOPNOTSUPP; >> > // normal dump goes here >> > >> > You just have to pass 'terse' flag to all implementations and let them >> > to decide whether to support it or not. >> >> But why duplicate the same code to all existing cls dump implementations >> instead of having such check nicely implemented in cls API (via callback >> existence or a flag)? > > I am confused, your fl_terse_dump() also has duplication with fl_dump()... > > Thanks. I meant duplicating the "if terse not supported return -EOPNOTSUPP" in dump callback of every classifier implementation. With current implementation cls API handles such case by checking whether classifier implementation has ->terse_dump() defined and returns error otherwise. This can also be achieved by having a new classifier flag, in case we decide to proceed with folding both dump and terse_dump into single ->dump(bool terse) callback.