> On 2 Jun 2019, at 21:39, Toke Høiland-Jørgensen <t...@redhat.com> wrote: > > Kevin Darbyshire-Bryant <l...@darbyshire-bryant.me.uk> writes: > >> ctinfo is an action restoring data stored in conntrack marks to various >> fields. At present it has two independent modes of operation, >> restoration of DSCP into IPv4/v6 diffserv and restoration of conntrack >> marks into packet skb marks. >> >> It understands a number of parameters specific to this action in >> additional to the usual action syntax. Each operating mode is >> independent of the other so all options are optional, however not >> specifying at least one mode is a bit pointless. >> >> Usage: ... ctinfo [dscp mask[/statemask]] [cpmark [mask]] [zone ZONE] >> [CONTROL] [index <INDEX>] > > Yay, bikeshedding time! :)
I see your bikeshed and raise you… a bus shelter :-) > > As I said in reply to the kernel patch, the "X/Y" syntax usually means > "<value>/<mask applied to value>", where here they are just two > semi-related mask values. So I think it would be better to just make > 'statemask' its own parameter. Instead of creating another keyword how about we drop the ‘/‘ and make it a space separated optional parameter to ‘dscp’? eg. Usage: ... ctinfo [dscp mask [statemask]] [cpmark [mask]] blah blah > Other than that, just a few nits, below... > >> DSCP mode >> >> dscp enables copying of a DSCP store in the conntrack mark into the >> ipv4/v6 diffserv field. The mask is a 32bit field and specifies where >> in the conntrack mark the DSCP value is stored. It must be 6 contiguous >> bits long, e.g. 0xfc000000 would restore the DSCP from the upper 6 bits >> of the conntrack mark. >> >> The DSCP copying may be optionally controlled by a statemask. The >> statemask is a 32bit field, usually with a single bit set and must not >> overlap the dscp mask. The DSCP restore operation will only take place >> if the corresponding bit/s in conntrack mark yield a non zero result. >> >> eg. dscp 0xfc000000/0x01000000 would retrieve the DSCP from the top 6 >> bits, whilst using bit 25 as a flag to do so. Bit 26 is unused in this >> example. >> >> CPMARK mode >> >> cpmark enables copying of the conntrack mark to the packet skb mark. In >> this mode it is completely equivalent to the existing act_connmark. >> Additional functionality is provided by the optional mask parameter, >> whereby the stored conntrack mark is logically anded with the cpmark >> mask before being stored into skb mark. This allows shared usage of the >> conntrack mark between applications. >> >> eg. cpmark 0x00ffffff would restore only the lower 24 bits of the >> conntrack mark, thus may be useful in the event that the upper 8 bits >> are used by the DSCP function. >> >> Usage: ... ctinfo [dscp mask[/statemask]] [cpmark [mask]] [zone ZONE] >> [CONTROL] [index <INDEX>] >> where : >> dscp MASK is the bitmask to restore DSCP >> STATEMASK is the bitmask to determine conditional restoring >> cpmark MASK mask applied to restored packet mark >> ZONE is the conntrack zone >> CONTROL := reclassify | pipe | drop | continue | ok | >> goto chain <CHAIN_INDEX> >> >> Signed-off-by: Kevin Darbyshire-Bryant <l...@darbyshire-bryant.me.uk> >> >> --- >> v2 - fix whitespace issue in pkt_cls >> fix most warnings from checkpatch - some lines still over 80 chars >> due to long TLV names. >> v3 - fix some dangling else warnings. >> refactor stats printing to please checkpatch. >> send zone TLV even if default '0' zone. >> now checkpatch clean even though I think some of the formatting >> is horrible :-) >> sending via google's smtp 'cos MS' exchange office365 appears >> to mangle patches from git send-email. > > Ah, so it wasn't just me having problems ;) No, though I’m still not clear what’s going on or when Microsoft improved(tm) it :-/ > >> include/uapi/linux/pkt_cls.h | 1 + >> include/uapi/linux/tc_act/tc_ctinfo.h | 34 ++++ >> tc/Makefile | 1 + >> tc/m_ctinfo.c | 262 ++++++++++++++++++++++++++ >> 4 files changed, 298 insertions(+) >> create mode 100644 include/uapi/linux/tc_act/tc_ctinfo.h >> create mode 100644 tc/m_ctinfo.c >> >> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h >> index 51a0496f..a93680fc 100644 >> --- a/include/uapi/linux/pkt_cls.h >> +++ b/include/uapi/linux/pkt_cls.h >> @@ -105,6 +105,7 @@ enum tca_id { >> TCA_ID_IFE = TCA_ACT_IFE, >> TCA_ID_SAMPLE = TCA_ACT_SAMPLE, >> /* other actions go here */ >> + TCA_ID_CTINFO, >> __TCA_ID_MAX = 255 >> }; >> >> diff --git a/include/uapi/linux/tc_act/tc_ctinfo.h >> b/include/uapi/linux/tc_act/tc_ctinfo.h >> new file mode 100644 >> index 00000000..da803e05 >> --- /dev/null >> +++ b/include/uapi/linux/tc_act/tc_ctinfo.h >> @@ -0,0 +1,34 @@ >> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ >> +#ifndef __UAPI_TC_CTINFO_H >> +#define __UAPI_TC_CTINFO_H >> + >> +#include <linux/types.h> >> +#include <linux/pkt_cls.h> >> + >> +struct tc_ctinfo { >> + tc_gen; >> +}; >> + >> +enum { >> + TCA_CTINFO_UNSPEC, >> + TCA_CTINFO_PAD, >> + TCA_CTINFO_TM, >> + TCA_CTINFO_ACT, >> + TCA_CTINFO_ZONE, >> + TCA_CTINFO_PARMS_DSCP_MASK, >> + TCA_CTINFO_PARMS_DSCP_STATEMASK, >> + TCA_CTINFO_PARMS_CPMARK_MASK, >> + TCA_CTINFO_STATS_DSCP_SET, >> + TCA_CTINFO_STATS_DSCP_ERROR, >> + TCA_CTINFO_STATS_CPMARK_SET, >> + __TCA_CTINFO_MAX >> +}; >> + >> +#define TCA_CTINFO_MAX (__TCA_CTINFO_MAX - 1) >> + >> +enum { >> + CTINFO_MODE_DSCP = BIT(0), >> + CTINFO_MODE_CPMARK = BIT(1) >> +}; >> + >> +#endif >> diff --git a/tc/Makefile b/tc/Makefile >> index 1a305cf4..60abddee 100644 >> --- a/tc/Makefile >> +++ b/tc/Makefile >> @@ -48,6 +48,7 @@ TCMODULES += m_csum.o >> TCMODULES += m_simple.o >> TCMODULES += m_vlan.o >> TCMODULES += m_connmark.o >> +TCMODULES += m_ctinfo.o >> TCMODULES += m_bpf.o >> TCMODULES += m_tunnel_key.o >> TCMODULES += m_sample.o >> diff --git a/tc/m_ctinfo.c b/tc/m_ctinfo.c >> new file mode 100644 >> index 00000000..af5102bf >> --- /dev/null >> +++ b/tc/m_ctinfo.c >> @@ -0,0 +1,262 @@ >> +/* SPDX-License-Identifier: GPL-2.0 */ >> +/* >> + * m_ctinfo.c netfilter ctinfo mark action >> + * >> + * Copyright (c) 2019 Kevin Darbyshire-Bryant <l...@darbyshire-bryant.me.uk> >> + */ >> + >> +#include <stdio.h> >> +#include <stdlib.h> >> +#include <unistd.h> >> +#include <string.h> >> +#include "utils.h" >> +#include "tc_util.h" >> +#include <linux/tc_act/tc_ctinfo.h> >> + >> +static void >> +explain(void) >> +{ >> + fprintf(stderr, >> + "Usage: ... ctinfo [dscp mask[/statemask]] [cpmark [mask]] >> [zone ZONE] [CONTROL] [index <INDEX>]\n" >> + "where :\n" >> + "\tdscp MASK bitmask location of stored DSCP\n" >> + "\t STATEMASK bitmask to determine conditional >> restoring\n" >> + "\tcpmark MASK mask applied to mark on restoration\n" >> + "\tZONE is the conntrack zone\n" >> + "\tCONTROL := reclassify | pipe | drop | continue | ok |\n" >> + "\t goto chain <CHAIN_INDEX>\n"); >> +} >> + >> +static void >> +usage(void) >> +{ >> + explain(); >> + exit(-1); >> +} >> + >> +static int >> +parse_ctinfo(struct action_util *a, int *argc_p, char ***argv_p, int tca_id, >> + struct nlmsghdr *n) >> +{ >> + unsigned int cpmarkmask = 0, dscpmask = 0, dscpstatemask = 0; >> + struct tc_ctinfo sel = {}; >> + unsigned short zone = 0; >> + char **argv = *argv_p; >> + struct rtattr *tail; >> + int argc = *argc_p; >> + int ok = 0; >> + >> + while (argc > 0) { >> + if (matches(*argv, "ctinfo") == 0) { >> + ok = 1; >> + argc--; >> + argv++; >> + } else if (matches(*argv, "help") == 0) { >> + usage(); >> + } else { >> + break; >> + } >> + >> + } >> + >> + if (!ok) { >> + explain(); >> + return -1; >> + } >> + >> + if (argc) { >> + if (matches(*argv, "dscp") == 0) { >> + NEXT_ARG(); >> + char *slash; >> + >> + slash = strchr(*argv, '/'); >> + if (slash) >> + *slash = '\0'; >> + if (get_u32(&dscpmask, *argv, 0)) { >> + fprintf(stderr, >> + "ctinfo: Illegal dscp \"mask\"\n"); >> + return -1; >> + } > > Not strictly necessary, but would it be nicer to the user to do the > 'must be exactly 6 bits' test here as well, in order to provide a > friendlier error message than just 'invalid’? Agreed. Have implemented that in v4, need to test before sending..both after sleep :-) > > Could also just add this via extack on the kernel side, of course… May I please dodge that particular bullet on this occasion? > >> + if (slash && get_u32(&dscpstatemask, slash + 1, 0)) { >> + fprintf(stderr, >> + "ctinfo: Illegal dscp \"statemask\"\n"); >> + return -1; >> + } >> + argc--; >> + argv++; >> + } >> + } >> + >> + /* cpmark has optional mask parameter, so the next arg might not */ >> + /* exist, or it might be the next option, or it may actually be a */ >> + /* 32bit mask */ >> + if (argc) { >> + if (matches(*argv, "cpmark") == 0) { >> + if (NEXT_ARG_OK()) { >> + NEXT_ARG_FWD(); >> + if (!get_u32(&cpmarkmask, *argv, 0)) >> + NEXT_ARG_FWD(); /* was a mask */ >> + else /* not a mask, use default */ >> + cpmarkmask = ~0; >> + } else { >> + NEXT_ARG_FWD(); >> + } >> + } >> + } >> + >> + if (argc) { >> + if (matches(*argv, "zone") == 0) { >> + NEXT_ARG(); >> + if (get_u16(&zone, *argv, 10)) { >> + fprintf(stderr, "ctinfo: Illegal \"zone\"\n"); >> + return -1; >> + } >> + argc--; >> + argv++; >> + } >> + } >> + >> + parse_action_control_dflt(&argc, &argv, &sel.action, >> + false, TC_ACT_PIPE); >> + >> + if (argc) { >> + if (matches(*argv, "index") == 0) { >> + NEXT_ARG(); >> + if (get_u32(&sel.index, *argv, 10)) { >> + fprintf(stderr, "ctinfo: Illegal \"index\"\n"); >> + return -1; >> + } >> + argc--; >> + argv++; >> + } >> + } >> + >> + tail = addattr_nest(n, MAX_MSG, tca_id); >> + addattr_l(n, MAX_MSG, TCA_CTINFO_ACT, &sel, sizeof(sel)); >> + addattr16(n, MAX_MSG, TCA_CTINFO_ZONE, zone); >> + >> + if (dscpmask) >> + addattr32(n, MAX_MSG, >> + TCA_CTINFO_PARMS_DSCP_MASK, dscpmask); >> + >> + if (dscpstatemask) >> + addattr32(n, MAX_MSG, >> + TCA_CTINFO_PARMS_DSCP_STATEMASK, dscpstatemask); >> + >> + if (cpmarkmask) >> + addattr32(n, MAX_MSG, >> + TCA_CTINFO_PARMS_CPMARK_MASK, cpmarkmask); >> + >> + addattr_nest_end(n, tail); >> + >> + *argc_p = argc; >> + *argv_p = argv; >> + return 0; >> +} >> + >> +static void print_ctinfo_stats(FILE *f, struct rtattr *tb[TCA_CTINFO_MAX + >> 1]) >> +{ >> + struct tcf_t *tm; >> + >> + if (tb[TCA_CTINFO_TM]) { >> + tm = RTA_DATA(tb[TCA_CTINFO_TM]); >> + >> + print_tm(f, tm); >> + } >> + >> + if (tb[TCA_CTINFO_STATS_DSCP_SET]) >> + print_lluint(PRINT_ANY, "dscpset", " DSCP set %llu", >> + rta_getattr_u64(tb[TCA_CTINFO_STATS_DSCP_SET])); >> + if (tb[TCA_CTINFO_STATS_DSCP_ERROR]) >> + print_lluint(PRINT_ANY, "dscperror", " error %llu", >> + rta_getattr_u64(tb[TCA_CTINFO_STATS_DSCP_ERROR])); >> + >> + if (tb[TCA_CTINFO_STATS_CPMARK_SET]) >> + print_lluint(PRINT_ANY, "cpmarkset", " CPMARK set %llu", >> + rta_getattr_u64(tb[TCA_CTINFO_STATS_CPMARK_SET])); >> +} >> + >> +static int print_ctinfo(struct action_util *au, FILE *f, struct rtattr *arg) >> +{ >> + unsigned int cpmarkmask = ~0, dscpmask = 0, dscpstatemask = 0; >> + struct rtattr *tb[TCA_CTINFO_MAX + 1]; >> + unsigned short zone = 0; >> + struct tc_ctinfo *ci; >> + >> + if (arg == NULL) >> + return -1; >> + >> + parse_rtattr_nested(tb, TCA_CTINFO_MAX, arg); >> + if (!tb[TCA_CTINFO_ACT]) { >> + print_string(PRINT_FP, NULL, "%s", >> + "[NULL ctinfo action parameters]"); >> + return -1; >> + } >> + >> + ci = RTA_DATA(tb[TCA_CTINFO_ACT]); >> + >> + if (tb[TCA_CTINFO_PARMS_DSCP_MASK]) { >> + if (RTA_PAYLOAD(tb[TCA_CTINFO_PARMS_DSCP_MASK]) >= >> + sizeof(__u32)) >> + dscpmask = rta_getattr_u32( >> + tb[TCA_CTINFO_PARMS_DSCP_MASK]); >> + else >> + print_string(PRINT_FP, NULL, "%s", >> + "[invalid dscp mask parameter]"); >> + } >> + >> + if (tb[TCA_CTINFO_PARMS_DSCP_STATEMASK]) { >> + if (RTA_PAYLOAD(tb[TCA_CTINFO_PARMS_DSCP_STATEMASK]) >= >> + sizeof(__u32)) >> + dscpstatemask = rta_getattr_u32( >> + tb[TCA_CTINFO_PARMS_DSCP_STATEMASK]); >> + else >> + print_string(PRINT_FP, NULL, "%s", >> + "[invalid dscp statemask parameter]"); >> + } >> + >> + if (tb[TCA_CTINFO_PARMS_CPMARK_MASK]) { >> + if (RTA_PAYLOAD(tb[TCA_CTINFO_PARMS_CPMARK_MASK]) >= >> + sizeof(__u32)) >> + cpmarkmask = rta_getattr_u32( >> + tb[TCA_CTINFO_PARMS_CPMARK_MASK]); >> + else >> + print_string(PRINT_FP, NULL, "%s", >> + "[invalid cpmark mask parameter]"); >> + } >> + >> + if (tb[TCA_CTINFO_ZONE] && RTA_PAYLOAD(tb[TCA_CTINFO_ZONE]) >= >> + sizeof(__u16)) >> + zone = rta_getattr_u16(tb[TCA_CTINFO_ZONE]); >> + >> + print_string(PRINT_ANY, "kind", "%s ", "ctinfo"); >> + print_hu(PRINT_ANY, "zone", "zone %u", zone); >> + print_action_control(f, " ", ci->action, ""); >> + >> + print_string(PRINT_FP, NULL, "%s", _SL_); >> + print_uint(PRINT_ANY, "index", "\t index %u", ci->index); >> + print_int(PRINT_ANY, "ref", " ref %d", ci->refcnt); >> + print_int(PRINT_ANY, "bind", " bind %d", ci->bindcnt); >> + >> + if (tb[TCA_CTINFO_PARMS_DSCP_MASK]) { >> + print_0xhex(PRINT_ANY, "dscpmask", " dscp %#010llx", dscpmask); >> + print_0xhex(PRINT_ANY, "dscpstatemask", "/%#010llx", >> + dscpstatemask); >> + } >> + >> + if (tb[TCA_CTINFO_PARMS_CPMARK_MASK]) >> + print_0xhex(PRINT_ANY, "mark", " mark %#010llx", >> cpmarkmask); > > Should be 'cpmark' not 'mark', no? Yes it should. The one that got away - squished :-) Thanks for looking through my code, can’t be a pleasant task, and spotting my bits of code blindness. Kevin