RE: [RFC v2] eal: allow worker lcore stacks to be allocated from hugepage memory
> From: Don Wallwork [mailto:d...@xsightlabs.com]
> Sent: Friday, 29 April 2022 22.01
>
> Add support for using hugepages for worker lcore stack memory. The
> intent is to improve performance by reducing stack memory related TLB
> misses and also by using memory local to the NUMA node of each lcore.
>
> EAL option '--huge-worker-stack [stack-size-kbytes]' is added to allow
> the feature to be enabled at runtime. If the size is not specified,
> the system pthread stack size will be used.

It would be nice if DPDK EAL could parse size parameter values provided as "1M" or "128k"; but it is clearly not a requirement for this patch. Just mentioning it.

> Signed-off-by: Don Wallwork
> ---

> /**
>  * internal configuration
>  */
> @@ -102,6 +105,7 @@ struct internal_config {
> 	unsigned int no_telemetry; /**< true to disable Telemetry */
> 	struct simd_bitwidth max_simd_bitwidth;
> 	/**< max simd bitwidth path to use */
> +	size_t huge_worker_stack_size; /**< worker thread stack size in
> kbytes */

The command line parameter value has been converted from kbytes to bytes here, so this comment is wrong.

Acked-by: Morten Brørup
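(For illustration only: a minimal sketch of the size-suffix parsing suggested above. The function name and the set of accepted suffixes are hypothetical; nothing like this is part of the patch or of the current EAL option parser.)

#include <stdlib.h>

/* Parse "4096", "128k"/"128K", "1M" or "1G" into a byte count.
 * Returns 0 on success, -1 on malformed input. Overflow checks omitted.
 */
static int
parse_size_with_suffix(const char *arg, size_t *out)
{
	char *end = NULL;
	unsigned long long val = strtoull(arg, &end, 10);

	if (end == arg)
		return -1;

	switch (*end) {
	case 'k': case 'K': val <<= 10; end++; break;
	case 'm': case 'M': val <<= 20; end++; break;
	case 'g': case 'G': val <<= 30; end++; break;
	default: break;
	}

	if (*end != '\0')
		return -1;

	*out = (size_t)val;
	return 0;
}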
RE: [RFC] eal: allow worker lcore stacks to be allocated from hugepage memory
> From: Don Wallwork [mailto:d...@xsightlabs.com]
> Sent: Friday, 29 April 2022 20.52
>
> On 4/27/2022 4:17 AM, Morten Brørup wrote:
> > +CC: EAL and Memory maintainers.
> >
> >> From: Don Wallwork [mailto:d...@xsightlabs.com]
> >> Sent: Tuesday, 26 April 2022 23.26
> >>
> >> On 4/26/2022 5:21 PM, Stephen Hemminger wrote:
> >>> On Tue, 26 Apr 2022 17:01:18 -0400
> >>> Don Wallwork wrote:
> >>>
> >>>> On 4/26/2022 10:58 AM, Stephen Hemminger wrote:
> >>>>> On Tue, 26 Apr 2022 08:19:59 -0400
> >>>>> Don Wallwork wrote:
> >>>>>
> >>>>>> Add support for using hugepages for worker lcore stack memory.
> >>>>>> The intent is to improve performance by reducing stack memory related
> >>>>>> TLB misses and also by using memory local to the NUMA node of each
> >>>>>> lcore.
> >
> > This certainly seems like a good idea!
> >
> > However, I wonder: Does the O/S assign memory local to the NUMA node
> > to an lcore-pinned thread's stack when instantiating the thread? And
> > does the DPDK EAL ensure that the preconditions for the O/S to do that
> > are present?
> >
> > (Not relevant for this patch, but the same locality questions come to
> > mind regarding Thread Local Storage.)
>
> Currently, DPDK does not set pthread affinity until after the pthread
> is created and the stack has been allocated. If the affinity attribute
> were set before the pthread_create call, it seems possible that
> pthread_create could be NUMA aware when allocating the stack. However,
> it looks like at least the glibc v2.35 implementation of pthread_create
> does not consider this at stack allocation time.

Thank you for looking into this! Very interesting.

So, your patch improves the memory locality (and TLB) for the stack, which is great. The same for Thread Local Storage needs to be addressed by glibc (and the C libraries of other OS'es), which is clearly out of scope here. I searched for RTE_DEFINE_PER_LCORE, and it is only rarely used in DPDK core libraries, so this is not a problem inside DPDK.

> > Would it be possible to add a guard page or guard region by using the
> > O/S memory allocator instead of rte_zmalloc_socket()? Since the stack
> > is considered private to the process, i.e. not accessible from other
> > processes, this patch does not need to provide remote access to stack
> > memory from secondary processes - and thus it is not a requirement for
> > this feature to use DPDK managed memory.
>
> In order for each stack to have guard page protection, this would
> likely require reserving an entire hugepage per stack. Although guard
> pages do not require physical memory allocation, it would not be
> possible for multiple stacks to share a hugepage and also have per
> stack guard page protection.

Makes sense; allocating an entire huge page for stack per worker thread could be considered too much.

> > Do the worker threads need a different stack size than the main
> > thread? In my opinion: "Nice to have", not "must have".
>
> The main thread stack behaves differently anyway; it can grow
> dynamically, but regardless of this patch, pthread stack sizes are
> always fixed. This change only relates to worker threads.

Agree.

> > Do the worker threads need different stack sizes individually? In my
> > opinion: Perhaps "nice to have", certainly not "must have".
>
> Currently, worker thread stack sizes are uniformly sized and not
> dynamically resized. This patch does not change that aspect. Given
> that, it seems unnecessary to add that complexity here.

Agree.
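(A rough sketch of the mechanism discussed above, not the patch's actual code: allocate the worker stack from hugepage memory on the lcore's NUMA node with rte_zmalloc_socket() and hand it to the thread via pthread_attr_setstack() before pthread_create(). The helper name is made up, error unwinding is minimal, and, as noted above, this layout provides no guard page; pthread_attr_setstack() also expects a suitably aligned and sufficiently large stack, which is glossed over here.)

#include <pthread.h>

#include <rte_common.h>
#include <rte_lcore.h>
#include <rte_malloc.h>

static int
create_worker_with_huge_stack(pthread_t *tid, void *(*fn)(void *), void *arg,
			      unsigned int lcore_id, size_t stack_size)
{
	pthread_attr_t attr;
	void *stack;

	/* Hugepage-backed allocation on the worker lcore's NUMA node. */
	stack = rte_zmalloc_socket("lcore_stack", stack_size, RTE_CACHE_LINE_SIZE,
				   rte_lcore_to_socket_id(lcore_id));
	if (stack == NULL)
		return -1;

	if (pthread_attr_init(&attr) != 0 ||
	    pthread_attr_setstack(&attr, stack, stack_size) != 0) {
		rte_free(stack);
		return -1;
	}

	/* The stack is fixed-size and stays allocated while the worker runs. */
	return pthread_create(tid, &attr, fn, arg);
}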
[PATCH 1/2] pipeline: support hash functions
Add support for hash functions that compute a signature for an array of bytes read from a packet header or meta-data. Useful for flow affinity-based load balancing.

Signed-off-by: Cristian Dumitrescu
---
Depends-on: series-22635 ("[V2,1/3] table: improve learner table timers")

 lib/pipeline/rte_swx_pipeline.c          | 212 +++
 lib/pipeline/rte_swx_pipeline.h          |  41 +
 lib/pipeline/rte_swx_pipeline_internal.h |  71
 lib/pipeline/version.map                 |   3 +
 4 files changed, 327 insertions(+)

diff --git a/lib/pipeline/rte_swx_pipeline.c b/lib/pipeline/rte_swx_pipeline.c
index 84d2c24311..ea7df98ecb 100644
--- a/lib/pipeline/rte_swx_pipeline.c
+++ b/lib/pipeline/rte_swx_pipeline.c
@@ -6,6 +6,9 @@
 #include
 #include

+#include
+#include
+
 #include
 #include
 #include
@@ -1166,6 +1169,94 @@ extern_func_free(struct rte_swx_pipeline *p)
 	}
 }

+/*
+ * Hash function.
+ */
+static struct hash_func *
+hash_func_find(struct rte_swx_pipeline *p, const char *name)
+{
+	struct hash_func *elem;
+
+	TAILQ_FOREACH(elem, &p->hash_funcs, node)
+		if (strcmp(elem->name, name) == 0)
+			return elem;
+
+	return NULL;
+}
+
+int
+rte_swx_pipeline_hash_func_register(struct rte_swx_pipeline *p,
+				    const char *name,
+				    rte_swx_hash_func_t func)
+{
+	struct hash_func *f;
+
+	CHECK(p, EINVAL);
+
+	CHECK_NAME(name, EINVAL);
+	CHECK(!hash_func_find(p, name), EEXIST);
+
+	CHECK(func, EINVAL);
+
+	/* Node allocation. */
+	f = calloc(1, sizeof(struct hash_func));
+	CHECK(func, ENOMEM);
+
+	/* Node initialization. */
+	strcpy(f->name, name);
+	f->func = func;
+	f->id = p->n_hash_funcs;
+
+	/* Node add to tailq. */
+	TAILQ_INSERT_TAIL(&p->hash_funcs, f, node);
+	p->n_hash_funcs++;
+
+	return 0;
+}
+
+static int
+hash_func_build(struct rte_swx_pipeline *p)
+{
+	struct hash_func *func;
+
+	/* Memory allocation. */
+	p->hash_func_runtime = calloc(p->n_hash_funcs, sizeof(struct hash_func_runtime));
+	CHECK(p->hash_func_runtime, ENOMEM);
+
+	/* Hash function. */
+	TAILQ_FOREACH(func, &p->hash_funcs, node) {
+		struct hash_func_runtime *r = &p->hash_func_runtime[func->id];
+
+		r->func = func->func;
+	}
+
+	return 0;
+}
+
+static void
+hash_func_build_free(struct rte_swx_pipeline *p)
+{
+	free(p->hash_func_runtime);
+	p->hash_func_runtime = NULL;
+}
+
+static void
+hash_func_free(struct rte_swx_pipeline *p)
+{
+	hash_func_build_free(p);
+
+	for ( ; ; ) {
+		struct hash_func *elem;
+
+		elem = TAILQ_FIRST(&p->hash_funcs);
+		if (!elem)
+			break;
+
+		TAILQ_REMOVE(&p->hash_funcs, elem, node);
+		free(elem);
+	}
+}
+
 /*
  * Header.
  */
@@ -2796,6 +2887,60 @@ instr_extern_func_exec(struct rte_swx_pipeline *p)
 	thread_yield_cond(p, done ^ 1);
 }

+/*
+ * hash.
+ */
+static int
+instr_hash_translate(struct rte_swx_pipeline *p,
+		     struct action *action,
+		     char **tokens,
+		     int n_tokens,
+		     struct instruction *instr,
+		     struct instruction_data *data __rte_unused)
+{
+	struct hash_func *func;
+	struct field *dst, *src_first, *src_last;
+	uint32_t src_struct_id_first = 0, src_struct_id_last = 0;
+
+	CHECK(n_tokens == 5, EINVAL);
+
+	func = hash_func_find(p, tokens[1]);
+	CHECK(func, EINVAL);
+
+	dst = metadata_field_parse(p, tokens[2]);
+	CHECK(dst, EINVAL);
+
+	src_first = struct_field_parse(p, action, tokens[3], &src_struct_id_first);
+	CHECK(src_first, EINVAL);
+
+	src_last = struct_field_parse(p, action, tokens[4], &src_struct_id_last);
+	CHECK(src_last, EINVAL);
+	CHECK(src_struct_id_first == src_struct_id_last, EINVAL);
+
+	instr->type = INSTR_HASH_FUNC;
+	instr->hash_func.hash_func_id = (uint8_t)func->id;
+	instr->hash_func.dst.offset = (uint8_t)dst->offset / 8;
+	instr->hash_func.dst.n_bits = (uint8_t)dst->n_bits;
+	instr->hash_func.src.struct_id = (uint8_t)src_struct_id_first;
+	instr->hash_func.src.offset = (uint16_t)src_first->offset / 8;
+	instr->hash_func.src.n_bytes = (uint16_t)((src_last->offset - src_first->offset) / 8);
+
+	return 0;
+}
+
+static inline void
+instr_hash_func_exec(struct rte_swx_pipeline *p)
+{
+	struct thread *t = &p->threads[p->thread_id];
+	struct instruction *ip = t->ip;
+
+	/* Extern function execute. */
+	__instr_hash_func_exec(p, t, ip);
+
+	/* Thread. */
+	thread_ip_inc(p);
+}
+
 /*
  * mov.
  */
@@ -6142,6 +6287,14 @@ instr_translate(struct rte_swx_pipeline *p,
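(For reference, a hedged usage sketch of the new registration API from application code; this is not part of the patch. It assumes the rte_swx_hash_func_t prototype follows the rte_jhash style, i.e. (key, length, seed), as declared in the rte_swx_pipeline.h hunk of this patch; the function and registration names below are made up.)

#include <stdint.h>

#include <rte_jhash.h>
#include <rte_swx_pipeline.h>

/* Application-provided hash; any deterministic function of the key works. */
static uint32_t
my_hash(const void *key, uint32_t length, uint32_t seed)
{
	return rte_jhash(key, length, seed);
}

/* After this call succeeds, the .spec program can use the function by name,
 * e.g.: hash my_hash m.hash m.src_addr m.dst_port
 */
static int
register_my_hash(struct rte_swx_pipeline *p)
{
	return rte_swx_pipeline_hash_func_register(p, "my_hash", my_hash);
}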
[PATCH 2/2] examples/pipeline: support hash functions
Add example for hash function operation.

Signed-off-by: Cristian Dumitrescu
---
 examples/pipeline/examples/hash_func.cli  |  35 +++
 examples/pipeline/examples/hash_func.spec | 107 ++
 2 files changed, 142 insertions(+)
 create mode 100644 examples/pipeline/examples/hash_func.cli
 create mode 100644 examples/pipeline/examples/hash_func.spec

diff --git a/examples/pipeline/examples/hash_func.cli b/examples/pipeline/examples/hash_func.cli
new file mode 100644
index 00..df6e6e6205
--- /dev/null
+++ b/examples/pipeline/examples/hash_func.cli
@@ -0,0 +1,35 @@
+; SPDX-License-Identifier: BSD-3-Clause
+; Copyright(c) 2022 Intel Corporation
+
+;
+; Customize the LINK parameters to match your setup.
+;
+mempool MEMPOOL0 buffer 2304 pool 32K cache 256 cpu 0
+
+link LINK0 dev :18:00.0 rxq 1 128 MEMPOOL0 txq 1 512 promiscuous on
+link LINK1 dev :18:00.1 rxq 1 128 MEMPOOL0 txq 1 512 promiscuous on
+link LINK2 dev :3b:00.0 rxq 1 128 MEMPOOL0 txq 1 512 promiscuous on
+link LINK3 dev :3b:00.1 rxq 1 128 MEMPOOL0 txq 1 512 promiscuous on
+
+;
+; PIPELINE0 setup.
+;
+pipeline PIPELINE0 create 0
+pipeline PIPELINE0 mirror slots 4 sessions 16
+
+pipeline PIPELINE0 port in 0 link LINK0 rxq 0 bsz 32
+pipeline PIPELINE0 port in 1 link LINK1 rxq 0 bsz 32
+pipeline PIPELINE0 port in 2 link LINK2 rxq 0 bsz 32
+pipeline PIPELINE0 port in 3 link LINK3 rxq 0 bsz 32
+
+pipeline PIPELINE0 port out 0 link LINK0 txq 0 bsz 32
+pipeline PIPELINE0 port out 1 link LINK1 txq 0 bsz 32
+pipeline PIPELINE0 port out 2 link LINK2 txq 0 bsz 32
+pipeline PIPELINE0 port out 3 link LINK3 txq 0 bsz 32
+
+pipeline PIPELINE0 build ./examples/pipeline/examples/hash_func.spec
+
+;
+; Pipelines-to-threads mapping.
+;
+thread 1 pipeline PIPELINE0 enable
diff --git a/examples/pipeline/examples/hash_func.spec b/examples/pipeline/examples/hash_func.spec
new file mode 100644
index 00..22c9e13411
--- /dev/null
+++ b/examples/pipeline/examples/hash_func.spec
@@ -0,0 +1,107 @@
+; SPDX-License-Identifier: BSD-3-Clause
+; Copyright(c) 2022 Intel Corporation
+
+; This simple example illustrates how to compute a hash signature over an n-tuple set of fields read
+; from the packet headers and/or the packet meta-data by using the "hash" instruction. In this
+; specific example, the n-tuple is the classical DiffServ 5-tuple.
+
+//
+// Headers
+//
+struct ethernet_h {
+	bit<48> dst_addr
+	bit<48> src_addr
+	bit<16> ethertype
+}
+
+struct ipv4_h {
+	bit<8> ver_ihl
+	bit<8> diffserv
+	bit<16> total_len
+	bit<16> identification
+	bit<16> flags_offset
+	bit<8> ttl
+	bit<8> protocol
+	bit<16> hdr_checksum
+	bit<32> src_addr
+	bit<32> dst_addr
+}
+
+struct udp_h {
+	bit<16> src_port
+	bit<16> dst_port
+	bit<16> length
+	bit<16> checksum
+}
+
+header ethernet instanceof ethernet_h
+header ipv4 instanceof ipv4_h
+header udp instanceof udp_h
+
+//
+// Meta-data.
+//
+struct metadata_t {
+	bit<32> port
+	bit<32> src_addr
+	bit<32> dst_addr
+	bit<8> protocol
+	bit<16> src_port
+	bit<16> dst_port
+	bit<32> hash
+}
+
+metadata instanceof metadata_t
+
+//
+// Pipeline.
+//
+apply {
+	//
+	// RX and parse.
+	//
+	rx m.port
+	extract h.ethernet
+	extract h.ipv4
+	extract h.udp
+
+	//
+	// Prepare the n-tuple to be hashed in meta-data.
+	//
+	// This is required when:
+	//    a) The n-tuple fields are part of different headers;
+	//    b) Some n-tuple fields come from headers and some from meta-data.
+	//
+	mov m.src_addr h.ipv4.src_addr
+	mov m.dst_addr h.ipv4.dst_addr
+	mov m.protocol h.ipv4.protocol
+	mov m.src_port h.udp.src_port
+	mov m.dst_port h.udp.dst_port
+
+	//
+	// Compute the hash over the n-tuple.
+	//
+	// Details:
+	//    a) Hash function: jhash;
+	//    b) Destination (i.e. hash result): m.hash;
+	//    c) Source (i.e. n-tuple to be hashed): The 5-tuple formed by the meta-data fields
+	//       (m.src_addr, m.dst_addr, m.protocol, m.src_port, m.dst_port). Only the first and
+	//       the last n-tuple fields are specified in the hash instruction, but all the fields
+	//       in between are part of the n-tuple to be hashed.
+	//
+	hash jhash m.hash m.src_addr m.dst_port
+
+	//
+	// Use the computed hash to create a uniform distribution of pkts across the 4 output ports.
+	//
+	and m.hash 3
+	mov m.port m.hash
+
+	//
+	// De-parse and TX.
+	//
+	emit h.ethernet
+	emit h.ipv4
+	emit h.udp
+	tx m.port
+}
--
2.17.1
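(The example can then be exercised with the pipeline application along these lines; the command is illustrative only, so adjust the EAL core list, the build path, and the NIC addresses in the .cli file to the local setup.)

./build/examples/dpdk-pipeline -l 0-1 -- -s ./examples/pipeline/examples/hash_func.cli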