Re: [dpdk-dev] [PATCH] latencystats: fix timestamp marking and latency calculation
> -Original Message- > From: reshma.pat...@intel.com [mailto:reshma.pat...@intel.com] > Sent: Friday, September 21, 2018 11:02 PM > To: long...@viettel.com.vn; konstantin.anan...@intel.com; dev@dpdk.org > Cc: Reshma Pattan > Subject: [PATCH] latencystats: fix timestamp marking and latency calculation > > The latency calculation logic is not correct for the case where packets get dropped before TX. For such dropped packets the timestamp is not cleared, so they still get counted in the latency calculation of subsequent runs, which results in inaccurate latency measurements. > > Fix this issue as follows: > > Before setting the timestamp in an mbuf, check that the mbuf does not already carry a valid timestamp flag; after marking the timestamp, set the mbuf flags to indicate that the timestamp is valid. > > Before calculating latency, check that the mbuf flags indicate a valid timestamp. > > With the above logic it is guaranteed that only correct timestamps are used. > > Fixes: 5cd3cac9ed ("latency: added new library for latency stats") > > Reported-by: Bao-Long Tran > Signed-off-by: Reshma Pattan Tested-by: Bao-Long Tran
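For illustration, a minimal sketch of the check-before-mark logic described above, assuming the 18.11-era mbuf API (the dedicated mbuf timestamp field and the PKT_RX_TIMESTAMP flag as the validity marker); this is a sketch of the approach, not the patch itself:

#include <rte_mbuf.h>
#include <rte_cycles.h>

/* RX side: only stamp packets that do not already carry a valid
 * timestamp left over from a previous run. */
static void
mark_timestamp(struct rte_mbuf *pkt)
{
	if ((pkt->ol_flags & PKT_RX_TIMESTAMP) == 0) {
		pkt->timestamp = rte_rdtsc();
		pkt->ol_flags |= PKT_RX_TIMESTAMP;
	}
}

/* TX side: only packets carrying a valid timestamp contribute to the
 * latency statistics. */
static void
measure_latency(const struct rte_mbuf *pkt, uint64_t *latency)
{
	if (pkt->ol_flags & PKT_RX_TIMESTAMP)
		*latency = rte_rdtsc() - pkt->timestamp;
}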
[dpdk-dev] [PATCH v4 2/4] app/test-eventdev: remove redundant newlines
Remove unnecessary newline at the end of logs. Signed-off-by: Pavan Nikhilesh Acked-by: Jerin Jacob --- app/test-eventdev/test_pipeline_common.c | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/app/test-eventdev/test_pipeline_common.c b/app/test-eventdev/test_pipeline_common.c index a54068df3..832ab8b6e 100644 --- a/app/test-eventdev/test_pipeline_common.c +++ b/app/test-eventdev/test_pipeline_common.c @@ -65,12 +65,12 @@ pipeline_test_result(struct evt_test *test, struct evt_options *opt) uint64_t total = 0; struct test_pipeline *t = evt_test_priv(test); - printf("Packet distribution across worker cores :\n"); + evt_info("Packet distribution across worker cores :"); for (i = 0; i < t->nb_workers; i++) total += t->worker[i].processed_pkts; for (i = 0; i < t->nb_workers; i++) - printf("Worker %d packets: "CLGRN"%"PRIx64" "CLNRM"percentage:" - CLGRN" %3.2f\n"CLNRM, i, + evt_info("Worker %d packets: "CLGRN"%"PRIx64""CLNRM" percentage:" + CLGRN" %3.2f"CLNRM, i, t->worker[i].processed_pkts, (((double)t->worker[i].processed_pkts)/total) * 100); @@ -234,7 +234,7 @@ pipeline_ethdev_setup(struct evt_test *test, struct evt_options *opt) RTE_SET_USED(opt); if (!rte_eth_dev_count_avail()) { - evt_err("No ethernet ports found.\n"); + evt_err("No ethernet ports found."); return -ENODEV; } @@ -253,7 +253,7 @@ pipeline_ethdev_setup(struct evt_test *test, struct evt_options *opt) if (local_port_conf.rx_adv_conf.rss_conf.rss_hf != port_conf.rx_adv_conf.rss_conf.rss_hf) { evt_info("Port %u modified RSS hash function based on hardware support," - "requested:%#"PRIx64" configured:%#"PRIx64"\n", + "requested:%#"PRIx64" configured:%#"PRIx64"", i, port_conf.rx_adv_conf.rss_conf.rss_hf, local_port_conf.rx_adv_conf.rss_conf.rss_hf); @@ -262,19 +262,19 @@ pipeline_ethdev_setup(struct evt_test *test, struct evt_options *opt) if (rte_eth_dev_configure(i, nb_queues, nb_queues, &local_port_conf) < 0) { - evt_err("Failed to configure eth port [%d]\n", i); + evt_err("Failed to configure eth port [%d]", i); return -EINVAL; } if (rte_eth_rx_queue_setup(i, 0, NB_RX_DESC, rte_socket_id(), &rx_conf, t->pool) < 0) { - evt_err("Failed to setup eth port [%d] rx_queue: %d.\n", + evt_err("Failed to setup eth port [%d] rx_queue: %d.", i, 0); return -EINVAL; } if (rte_eth_tx_queue_setup(i, 0, NB_TX_DESC, rte_socket_id(), NULL) < 0) { - evt_err("Failed to setup eth port [%d] tx_queue: %d.\n", + evt_err("Failed to setup eth port [%d] tx_queue: %d.", i, 0); return -EINVAL; } @@ -380,7 +380,7 @@ pipeline_event_rx_adapter_setup(struct evt_options *opt, uint8_t stride, ret = evt_service_setup(service_id); if (ret) { evt_err("Failed to setup service core" - " for Rx adapter\n"); + " for Rx adapter"); return ret; } } @@ -397,8 +397,7 @@ pipeline_event_rx_adapter_setup(struct evt_options *opt, uint8_t stride, evt_err("Rx adapter[%d] start failed", prod); return ret; } - printf("%s: Port[%d] using Rx adapter[%d] started\n", __func__, - prod, prod); + evt_info("Port[%d] using Rx adapter[%d] started", prod, prod); } return ret; -- 2.18.0
[dpdk-dev] [PATCH v4 1/4] app/test-eventdev: fix minor typos
Fix minor typos. Fixes: 314bcf58ca8f ("app/eventdev: add pipeline queue worker functions") Cc: sta...@dpdk.org Signed-off-by: Pavan Nikhilesh Acked-by: Jerin Jacob --- v4 Changes: - Address review comments (Jerin). v3 Changes: - Force all the ports to use the non-internal cap mode when we detect that one of the port doesn't have internal port capability. app/test-eventdev/test_pipeline_atq.c| 16 app/test-eventdev/test_pipeline_common.h | 8 app/test-eventdev/test_pipeline_queue.c | 16 3 files changed, 20 insertions(+), 20 deletions(-) diff --git a/app/test-eventdev/test_pipeline_atq.c b/app/test-eventdev/test_pipeline_atq.c index 26dc79f90..f0b2f9015 100644 --- a/app/test-eventdev/test_pipeline_atq.c +++ b/app/test-eventdev/test_pipeline_atq.c @@ -18,7 +18,7 @@ pipeline_atq_nb_event_queues(struct evt_options *opt) static int pipeline_atq_worker_single_stage_tx(void *arg) { - PIPELINE_WROKER_SINGLE_STAGE_INIT; + PIPELINE_WORKER_SINGLE_STAGE_INIT; while (t->done == false) { uint16_t event = rte_event_dequeue_burst(dev, port, &ev, 1, 0); @@ -43,7 +43,7 @@ pipeline_atq_worker_single_stage_tx(void *arg) static int pipeline_atq_worker_single_stage_fwd(void *arg) { - PIPELINE_WROKER_SINGLE_STAGE_INIT; + PIPELINE_WORKER_SINGLE_STAGE_INIT; const uint8_t tx_queue = t->tx_service.queue_id; while (t->done == false) { @@ -66,7 +66,7 @@ pipeline_atq_worker_single_stage_fwd(void *arg) static int pipeline_atq_worker_single_stage_burst_tx(void *arg) { - PIPELINE_WROKER_SINGLE_STAGE_BURST_INIT; + PIPELINE_WORKER_SINGLE_STAGE_BURST_INIT; while (t->done == false) { uint16_t nb_rx = rte_event_dequeue_burst(dev, port, ev, @@ -98,7 +98,7 @@ pipeline_atq_worker_single_stage_burst_tx(void *arg) static int pipeline_atq_worker_single_stage_burst_fwd(void *arg) { - PIPELINE_WROKER_SINGLE_STAGE_BURST_INIT; + PIPELINE_WORKER_SINGLE_STAGE_BURST_INIT; const uint8_t tx_queue = t->tx_service.queue_id; while (t->done == false) { @@ -126,7 +126,7 @@ pipeline_atq_worker_single_stage_burst_fwd(void *arg) static int pipeline_atq_worker_multi_stage_tx(void *arg) { - PIPELINE_WROKER_MULTI_STAGE_INIT; + PIPELINE_WORKER_MULTI_STAGE_INIT; const uint8_t nb_stages = t->opt->nb_stages; @@ -161,7 +161,7 @@ pipeline_atq_worker_multi_stage_tx(void *arg) static int pipeline_atq_worker_multi_stage_fwd(void *arg) { - PIPELINE_WROKER_MULTI_STAGE_INIT; + PIPELINE_WORKER_MULTI_STAGE_INIT; const uint8_t nb_stages = t->opt->nb_stages; const uint8_t tx_queue = t->tx_service.queue_id; @@ -192,7 +192,7 @@ pipeline_atq_worker_multi_stage_fwd(void *arg) static int pipeline_atq_worker_multi_stage_burst_tx(void *arg) { - PIPELINE_WROKER_MULTI_STAGE_BURST_INIT; + PIPELINE_WORKER_MULTI_STAGE_BURST_INIT; const uint8_t nb_stages = t->opt->nb_stages; while (t->done == false) { @@ -234,7 +234,7 @@ pipeline_atq_worker_multi_stage_burst_tx(void *arg) static int pipeline_atq_worker_multi_stage_burst_fwd(void *arg) { - PIPELINE_WROKER_MULTI_STAGE_BURST_INIT; + PIPELINE_WORKER_MULTI_STAGE_BURST_INIT; const uint8_t nb_stages = t->opt->nb_stages; const uint8_t tx_queue = t->tx_service.queue_id; diff --git a/app/test-eventdev/test_pipeline_common.h b/app/test-eventdev/test_pipeline_common.h index 5fb91607d..9cd6b905b 100644 --- a/app/test-eventdev/test_pipeline_common.h +++ b/app/test-eventdev/test_pipeline_common.h @@ -65,14 +65,14 @@ struct test_pipeline { #define BURST_SIZE 16 -#define PIPELINE_WROKER_SINGLE_STAGE_INIT \ +#define PIPELINE_WORKER_SINGLE_STAGE_INIT \ struct worker_data *w = arg; \ struct test_pipeline *t = w->t; \ const uint8_t dev = w->dev_id;\ 
const uint8_t port = w->port_id; \ struct rte_event ev -#define PIPELINE_WROKER_SINGLE_STAGE_BURST_INIT \ +#define PIPELINE_WORKER_SINGLE_STAGE_BURST_INIT \ int i; \ struct worker_data *w = arg; \ struct test_pipeline *t = w->t; \ @@ -80,7 +80,7 @@ struct test_pipeline { const uint8_t port = w->port_id;\ struct rte_event ev[BURST_SIZE + 1] -#define PIPELINE_WROKER_MULTI_STAGE_INIT \ +#define PIPELINE_WORKER_MULTI_STAGE_INIT \ struct worker_data *w = arg;\ struct test_pipeline *t = w->t; \ uint8_t cq_id; \ @@ -90,7 +90,7 @@ struct test_pipeline { uint8_t *const sched_type_list = &t->sched_type_list[0]; \ struct rte_event ev -#define PIPELINE_WROKER_MULTI_STAGE_BURST_INIT \ +#define PIPELINE_WORKER_MULTI_STAGE_BURST_INIT \ i
[dpdk-dev] [PATCH v4 4/4] doc: update eventdev application guide
Update eventdev application guide to reflect Tx adapter related changes. Signed-off-by: Pavan Nikhilesh Acked-by: Jerin Jacob --- .../eventdev_pipeline_atq_test_generic.svg| 848 +++--- ...ntdev_pipeline_atq_test_internal_port.svg} | 26 +- .../eventdev_pipeline_queue_test_generic.svg | 570 +++- ...dev_pipeline_queue_test_internal_port.svg} | 22 +- doc/guides/tools/testeventdev.rst | 44 +- 5 files changed, 932 insertions(+), 578 deletions(-) rename doc/guides/tools/img/{eventdev_pipeline_atq_test_lockfree.svg => eventdev_pipeline_atq_test_internal_port.svg} (99%) rename doc/guides/tools/img/{eventdev_pipeline_queue_test_lockfree.svg => eventdev_pipeline_queue_test_internal_port.svg} (99%) diff --git a/doc/guides/tools/img/eventdev_pipeline_atq_test_generic.svg b/doc/guides/tools/img/eventdev_pipeline_atq_test_generic.svg index e33367989..707b9b56b 100644 --- a/doc/guides/tools/img/eventdev_pipeline_atq_test_generic.svg +++ b/doc/guides/tools/img/eventdev_pipeline_atq_test_generic.svg @@ -20,7 +20,7 @@ height="288.34286" id="svg3868" version="1.1" - inkscape:version="0.92.2 (5c3e80d, 2017-08-06)" + inkscape:version="0.92.2 2405546, 2018-03-11" sodipodi:docname="eventdev_pipeline_atq_test_generic.svg" sodipodi:version="0.32" inkscape:output_extension="org.inkscape.output.svg.inkscape" @@ -42,22 +42,6 @@ d="M 5.77,0 -2.88,5 V -5 Z" id="path39725" /> - - - + gradientTransform="matrix(0.84881476,0,0,0.98593266,86.966576,5.0323108)" /> - - - - + + + + + + + + style="fill:#f78202;fill-opacity:1;fill-rule:evenodd;stroke:#f78202;stroke-width:1.0003pt;stroke-opacity:1" + transform="scale(0.4)" /> + style="fill:#f78202;fill-opacity:1;fill-rule:evenodd;stroke:#f78202;stroke-width:1.0003pt;stroke-opacity:1" + transform="scale(0.4)" /> + style="fill:#f78202;fill-opacity:1;fill-rule:evenodd;stroke:#f78202;stroke-width:1.0003pt;stroke-opacity:1" + transform="scale(0.4)" /> + + + refY="0" + refX="0" + id="marker35935-1-6-5-1-0" + style="overflow:visible" + inkscape:isstock="true" + inkscape:collect="always"> + style="fill:#ac14db;fill-opacity:1;fill-rule:evenodd;stroke:#ac14ff;stroke-width:1.0003pt;stroke-opacity:1" + transform="scale(0.4)" + inkscape:connector-curvature="0" /> - + style="fill:#ac14db;fill-opacity:1;fill-rule:evenodd;stroke:#ac14ff;stroke-width:1.0003pt;stroke-opacity:1" + transform="scale(0.4)" + inkscape:connector-curvature="0" /> + + + style="fill:#ac14db;fill-opacity:1;fill-rule:evenodd;stroke:#ac14ff;stroke-width:1.0003pt;stroke-opacity:1" + transform="scale(0.4)" + inkscape:connector-curvature="0" /> + + + + style="fill:#ac14db;fill-opacity:1;fill-rule:evenodd;stroke:#ac14ff;stroke-width:1.0003pt;stroke-opacity:1" + transform="scale(0.4)" + inkscape:connector-curvature="0" /> - - + + + + + + + + + port n+2 + style="font-size:10px;line-height:1.25">port n+1 port n+3 + style="font-size:10px;line-height:1.25">port n+2 total queues = number of ethernet dev + 1 + style="font-size:10px;line-height:1.25">total queues = 2 * number of ethernet dev +Event ethRx adptr 0 +Event ethRx adptr 1 +Event ethRx adptr q + + + + +(Tx Generic) + transform="translate(69.258261,-194.86398)"> Txq 0 + transform="translate(-12.211349,-3.253112)"> Txq 0 + transform="translate(-10.498979,-2.682322)"> Txq 0 -Event ethRx adptr 0 -Event ethRx adptr 1 -Event ethRx adptr q - - - Tx Serviceport n + 1 - - - - - + + x="502.77109" + y="189.40137" + id="tspan5223-0-9-02" + style="font-size:10px;line-height:1.25">port n+m+1 +Single link + 
style="display:inline;opacity:1;fill:#ff;fill-opacity:1;stroke:url(#linearGradient3995-8-9);stroke-width:1.2090857;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect87-6-5-3-79-1" + width="72.081367" + height="32.405426" + x="499.944" + y="226.74811" + rx="16.175425" + ry="16.202713" /> Single port n+m+2 + +Link Q + x="512.51819" + y="301.5791" + id="tspan5223-0-9-0-4-2" + style="font-size:10px;line-height:1.25">port n+o
[dpdk-dev] [PATCH v4 3/4] app/test-eventdev: add Tx adapter support
Convert existing Tx service based pipeline to Tx adapter based APIs and simplify worker functions. Signed-off-by: Pavan Nikhilesh Acked-by: Jerin Jacob --- app/test-eventdev/test_pipeline_atq.c| 271 --- app/test-eventdev/test_pipeline_common.c | 200 + app/test-eventdev/test_pipeline_common.h | 62 +++--- app/test-eventdev/test_pipeline_queue.c | 244 ++-- 4 files changed, 367 insertions(+), 410 deletions(-) diff --git a/app/test-eventdev/test_pipeline_atq.c b/app/test-eventdev/test_pipeline_atq.c index f0b2f9015..c60635bf6 100644 --- a/app/test-eventdev/test_pipeline_atq.c +++ b/app/test-eventdev/test_pipeline_atq.c @@ -15,7 +15,7 @@ pipeline_atq_nb_event_queues(struct evt_options *opt) return rte_eth_dev_count_avail(); } -static int +static __rte_noinline int pipeline_atq_worker_single_stage_tx(void *arg) { PIPELINE_WORKER_SINGLE_STAGE_INIT; @@ -28,23 +28,18 @@ pipeline_atq_worker_single_stage_tx(void *arg) continue; } - if (ev.sched_type == RTE_SCHED_TYPE_ATOMIC) { - pipeline_tx_pkt(ev.mbuf); - w->processed_pkts++; - continue; - } - pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC); - pipeline_event_enqueue(dev, port, &ev); + pipeline_event_tx(dev, port, &ev); + w->processed_pkts++; } return 0; } -static int +static __rte_noinline int pipeline_atq_worker_single_stage_fwd(void *arg) { PIPELINE_WORKER_SINGLE_STAGE_INIT; - const uint8_t tx_queue = t->tx_service.queue_id; + const uint8_t *tx_queue = t->tx_evqueue_id; while (t->done == false) { uint16_t event = rte_event_dequeue_burst(dev, port, &ev, 1, 0); @@ -54,16 +49,16 @@ pipeline_atq_worker_single_stage_fwd(void *arg) continue; } - w->processed_pkts++; - ev.queue_id = tx_queue; + ev.queue_id = tx_queue[ev.mbuf->port]; pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC); pipeline_event_enqueue(dev, port, &ev); + w->processed_pkts++; } return 0; } -static int +static __rte_noinline int pipeline_atq_worker_single_stage_burst_tx(void *arg) { PIPELINE_WORKER_SINGLE_STAGE_BURST_INIT; @@ -79,27 +74,21 @@ pipeline_atq_worker_single_stage_burst_tx(void *arg) for (i = 0; i < nb_rx; i++) { rte_prefetch0(ev[i + 1].mbuf); - if (ev[i].sched_type == RTE_SCHED_TYPE_ATOMIC) { - - pipeline_tx_pkt(ev[i].mbuf); - ev[i].op = RTE_EVENT_OP_RELEASE; - w->processed_pkts++; - } else - pipeline_fwd_event(&ev[i], - RTE_SCHED_TYPE_ATOMIC); + rte_event_eth_tx_adapter_txq_set(ev[i].mbuf, 0); } - pipeline_event_enqueue_burst(dev, port, ev, nb_rx); + pipeline_event_tx_burst(dev, port, ev, nb_rx); + w->processed_pkts += nb_rx; } return 0; } -static int +static __rte_noinline int pipeline_atq_worker_single_stage_burst_fwd(void *arg) { PIPELINE_WORKER_SINGLE_STAGE_BURST_INIT; - const uint8_t tx_queue = t->tx_service.queue_id; + const uint8_t *tx_queue = t->tx_evqueue_id; while (t->done == false) { uint16_t nb_rx = rte_event_dequeue_burst(dev, port, ev, @@ -112,23 +101,22 @@ pipeline_atq_worker_single_stage_burst_fwd(void *arg) for (i = 0; i < nb_rx; i++) { rte_prefetch0(ev[i + 1].mbuf); - ev[i].queue_id = tx_queue; + rte_event_eth_tx_adapter_txq_set(ev[i].mbuf, 0); + ev[i].queue_id = tx_queue[ev[i].mbuf->port]; pipeline_fwd_event(&ev[i], RTE_SCHED_TYPE_ATOMIC); - w->processed_pkts++; } pipeline_event_enqueue_burst(dev, port, ev, nb_rx); + w->processed_pkts += nb_rx; } return 0; } -static int +static __rte_noinline int pipeline_atq_worker_multi_stage_tx(void *arg) { PIPELINE_WORKER_MULTI_STAGE_INIT; - const uint8_t nb_stages = t->opt->nb_stages; - while (t->done == false) { uint16_t event = rte_event_dequeue_burst(dev, port, &ev, 1, 0); @@ -141,29 +129,24 @@ 
pipeline_atq_worker_multi_stage_tx(void *arg) cq_id = ev.sub_event_type % nb_stages; if (cq_id == last_queue) { - if (ev.sched_type == RTE_SCHED_TYPE_ATOMIC) { - - pipeline_tx_pkt(ev.mbuf); -
[dpdk-dev] [PATCH v3 2/3] event/sw: implement unlinks in progress function
This commit adds a counter to each port, which counts the number of unlinks that have been performed. When the scheduler thread starts its scheduling routine, it "acks" all unlinks that have been requested, and the application is gauranteed that no more events will be scheduled to the port from the unlinked queue. Signed-off-by: Harry van Haaren --- v3: - Move RTE_SET_USED() to correct patch (Jerin) v2: - Fix unused "dev" variable (Jerin) --- drivers/event/sw/sw_evdev.c | 13 + drivers/event/sw/sw_evdev.h | 8 drivers/event/sw/sw_evdev_scheduler.c | 7 ++- 3 files changed, 27 insertions(+), 1 deletion(-) diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c index a6bb91388..1175d6cdb 100644 --- a/drivers/event/sw/sw_evdev.c +++ b/drivers/event/sw/sw_evdev.c @@ -113,9 +113,21 @@ sw_port_unlink(struct rte_eventdev *dev, void *port, uint8_t queues[], } } } + + p->unlinks_in_progress += unlinked; + rte_smp_mb(); + return unlinked; } +static int +sw_port_unlinks_in_progress(struct rte_eventdev *dev, void *port) +{ + RTE_SET_USED(dev); + struct sw_port *p = port; + return p->unlinks_in_progress; +} + static int sw_port_setup(struct rte_eventdev *dev, uint8_t port_id, const struct rte_event_port_conf *conf) @@ -925,6 +937,7 @@ sw_probe(struct rte_vdev_device *vdev) .port_release = sw_port_release, .port_link = sw_port_link, .port_unlink = sw_port_unlink, + .port_unlinks_in_progress = sw_port_unlinks_in_progress, .eth_rx_adapter_caps_get = sw_eth_rx_adapter_caps_get, diff --git a/drivers/event/sw/sw_evdev.h b/drivers/event/sw/sw_evdev.h index d90b96d4b..7c77b2495 100644 --- a/drivers/event/sw/sw_evdev.h +++ b/drivers/event/sw/sw_evdev.h @@ -148,6 +148,14 @@ struct sw_port { /* A numeric ID for the port */ uint8_t id; + /* An atomic counter for when the port has been unlinked, and the +* scheduler has not yet acked this unlink - hence there may still be +* events in the buffers going to the port. When the unlinks in +* progress is read by the scheduler, no more events will be pushed to +* the port - hence the scheduler core can just assign zero. +*/ + uint8_t unlinks_in_progress; + int16_t is_directed; /** Takes from a single directed QID */ /** * For loadbalanced we can optimise pulling packets from diff --git a/drivers/event/sw/sw_evdev_scheduler.c b/drivers/event/sw/sw_evdev_scheduler.c index e3a41e02f..9b54d5ce7 100644 --- a/drivers/event/sw/sw_evdev_scheduler.c +++ b/drivers/event/sw/sw_evdev_scheduler.c @@ -517,13 +517,18 @@ sw_event_schedule(struct rte_eventdev *dev) /* Pull from rx_ring for ports */ do { in_pkts = 0; - for (i = 0; i < sw->port_count; i++) + for (i = 0; i < sw->port_count; i++) { + /* ack the unlinks in progress as done */ + if (sw->ports[i].unlinks_in_progress) + sw->ports[i].unlinks_in_progress = 0; + if (sw->ports[i].is_directed) in_pkts += sw_schedule_pull_port_dir(sw, i); else if (sw->ports[i].num_ordered_qids > 0) in_pkts += sw_schedule_pull_port_lb(sw, i); else in_pkts += sw_schedule_pull_port_no_reorder(sw, i); + } /* QID scan for re-ordered */ in_pkts += sw_schedule_reorder(sw, 0, -- 2.17.1
[dpdk-dev] [PATCH v3 1/3] event: add function for reading unlink in progress
This commit introduces a new function in the eventdev API, which allows applications to read the number of unlink requests in progress on a particular port of an eventdev instance. This information allows applications to verify when no more packets from a particular queue (or any queue) will arrive at a port. The application could decide to stop polling, or put the core into a sleep state if it wishes, as it is ensured that no new packets will arrive at a particular port anymore if all queues are unlinked. Suggested-by: Matias Elo Signed-off-by: Harry van Haaren Acked-by: Jerin Jacob --- v3: - Fix ack (was missing a > symbol) (Checkpatch email report) v2: - Fix @see function_name() syntax (Jerin) - Add @warning to indicate experimental API in header - Update unlink() return docs to state async behaviour - Added Ack as per ML Cheers, -Harry --- lib/librte_eventdev/rte_eventdev.c | 22 +++ lib/librte_eventdev/rte_eventdev.h | 39 +--- lib/librte_eventdev/rte_eventdev_pmd.h | 19 ++ lib/librte_eventdev/rte_eventdev_version.map | 1 + 4 files changed, 75 insertions(+), 6 deletions(-) diff --git a/lib/librte_eventdev/rte_eventdev.c b/lib/librte_eventdev/rte_eventdev.c index 801810edd..0a8572b7b 100644 --- a/lib/librte_eventdev/rte_eventdev.c +++ b/lib/librte_eventdev/rte_eventdev.c @@ -980,6 +980,28 @@ rte_event_port_unlink(uint8_t dev_id, uint8_t port_id, return diag; } +int __rte_experimental +rte_event_port_unlinks_in_progress(uint8_t dev_id, uint8_t port_id) +{ + struct rte_eventdev *dev; + + RTE_EVENTDEV_VALID_DEVID_OR_ERR_RET(dev_id, -EINVAL); + dev = &rte_eventdevs[dev_id]; + if (!is_valid_port(dev, port_id)) { + RTE_EDEV_LOG_ERR("Invalid port_id=%" PRIu8, port_id); + return -EINVAL; + } + + /* Return 0 if the PMD does not implement unlinks in progress. +* This allows PMDs which handle unlink synchronously to not implement +* this function at all. +*/ + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->port_unlinks_in_progress, 0); + + return (*dev->dev_ops->port_unlinks_in_progress)(dev, + dev->data->ports[port_id]); +} + int rte_event_port_links_get(uint8_t dev_id, uint8_t port_id, uint8_t queues[], uint8_t priorities[]) diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h index b6fd6ee7f..a24213ea7 100644 --- a/lib/librte_eventdev/rte_eventdev.h +++ b/lib/librte_eventdev/rte_eventdev.h @@ -1656,12 +1656,13 @@ rte_event_port_link(uint8_t dev_id, uint8_t port_id, * event port designated by its *port_id* on the event device designated * by its *dev_id*. * - * The unlink establishment shall disable the event port *port_id* from - * receiving events from the specified event queue *queue_id* - * + * The unlink call issues an async request to disable the event port *port_id* + * from receiving events from the specified event queue *queue_id*. * Event queue(s) to event port unlink establishment can be changed at runtime * without re-configuring the device. * + * @see rte_event_port_unlinks_in_progress() to poll for completed unlinks. + * * @param dev_id * The identifier of the device. * @@ -1679,21 +1680,47 @@ rte_event_port_link(uint8_t dev_id, uint8_t port_id, * NULL. * * @return - * The number of unlinks actually established. The return value can be less + * The number of unlinks successfully requested. The return value can be less * than the value of the *nb_unlinks* parameter when the implementation has the * limitation on specific queue to port unlink establishment or * if invalid parameters are specified. 
* If the return value is less than *nb_unlinks*, the remaining queues at the - * end of queues[] are not established, and the caller has to take care of them. + * end of queues[] are not unlinked, and the caller has to take care of them. * If return value is less than *nb_unlinks* then implementation shall update * the rte_errno accordingly, Possible rte_errno values are * (-EINVAL) Invalid parameter - * */ int rte_event_port_unlink(uint8_t dev_id, uint8_t port_id, uint8_t queues[], uint16_t nb_unlinks); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Returns the number of unlinks in progress. + * + * This function provides the application with a method to detect when an + * unlink has been completed by the implementation. + * + * @see rte_event_port_unlink() to issue unlink requests. + * + * @param dev_id + * The indentifier of the device. + * + * @param port_id + * Event port identifier to select port to check for unlinks in progress. + * + * @return + * The number of unlinks that are in progress. A return of zero indicates that + * there are no outstanding unlink requests. A positive return value indicates + * the number of unlinks that
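To illustrate the intended usage of the new API (a sketch, not part of the patch; it assumes something keeps running the scheduler so the unlinks can be acknowledged):

#include <rte_eventdev.h>
#include <rte_pause.h>

/* Unlink all queues from a port and wait until the PMD has acknowledged the
 * requests, after which no new events will arrive at this port. */
static int
quiesce_port(uint8_t dev_id, uint8_t port_id)
{
	/* queues == NULL, nb_unlinks == 0 requests unlink of all linked queues. */
	if (rte_event_port_unlink(dev_id, port_id, NULL, 0) < 0)
		return -1;

	while (rte_event_port_unlinks_in_progress(dev_id, port_id) > 0)
		rte_pause();

	return 0;
}

Note that for a PMD such as event/sw the scheduler (service core) must keep running while waiting, since it is the scheduling routine that acks the outstanding unlinks.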
[dpdk-dev] [PATCH v3 3/3] event/sw: add unit test for unlinks in progress
This commit adds a unit test that checks the behaviour of the unlinks_in_progress() function, ensuring that the returned values are the number of unlinks requested, until the scheduler runs and "acks" the requests, after which the count should be zero again. Signed-off-by: Harry van Haaren --- v3: - Move RTE_SET_USED() to correct patch (Jerin) v2: - Add print before running unlink test (Harry) --- drivers/event/sw/sw_evdev_selftest.c | 77 1 file changed, 77 insertions(+) diff --git a/drivers/event/sw/sw_evdev_selftest.c b/drivers/event/sw/sw_evdev_selftest.c index c40912db5..d00d5de61 100644 --- a/drivers/event/sw/sw_evdev_selftest.c +++ b/drivers/event/sw/sw_evdev_selftest.c @@ -1903,6 +1903,77 @@ qid_priorities(struct test *t) return 0; } +static int +unlink_in_progress(struct test *t) +{ + /* Test unlinking API, in particular that when an unlink request has +* not yet been seen by the scheduler thread, that the +* unlink_in_progress() function returns the number of unlinks. +*/ + unsigned int i; + /* Create instance with 1 ports, and 3 qids */ + if (init(t, 3, 1) < 0 || + create_ports(t, 1) < 0) { + printf("%d: Error initializing device\n", __LINE__); + return -1; + } + + for (i = 0; i < 3; i++) { + /* Create QID */ + const struct rte_event_queue_conf conf = { + .schedule_type = RTE_SCHED_TYPE_ATOMIC, + /* increase priority (0 == highest), as we go */ + .priority = RTE_EVENT_DEV_PRIORITY_NORMAL - i, + .nb_atomic_flows = 1024, + .nb_atomic_order_sequences = 1024, + }; + + if (rte_event_queue_setup(evdev, i, &conf) < 0) { + printf("%d: error creating qid %d\n", __LINE__, i); + return -1; + } + t->qid[i] = i; + } + t->nb_qids = i; + /* map all QIDs to port */ + rte_event_port_link(evdev, t->port[0], NULL, NULL, 0); + + if (rte_event_dev_start(evdev) < 0) { + printf("%d: Error with start call\n", __LINE__); + return -1; + } + + /* unlink all ports to have outstanding unlink requests */ + int ret = rte_event_port_unlink(evdev, t->port[0], NULL, 0); + if (ret < 0) { + printf("%d: Failed to unlink queues\n", __LINE__); + return -1; + } + + /* get active unlinks here, expect 3 */ + int unlinks_in_progress = + rte_event_port_unlinks_in_progress(evdev, t->port[0]); + if (unlinks_in_progress != 3) { + printf("%d: Expected num unlinks in progress == 3, got %d\n", + __LINE__, unlinks_in_progress); + return -1; + } + + /* run scheduler service on this thread to ack the unlinks */ + rte_service_run_iter_on_app_lcore(t->service_id, 1); + + /* active unlinks expected as 0 as scheduler thread has acked */ + unlinks_in_progress = + rte_event_port_unlinks_in_progress(evdev, t->port[0]); + if (unlinks_in_progress != 0) { + printf("%d: Expected num unlinks in progress == 0, got %d\n", + __LINE__, unlinks_in_progress); + } + + cleanup(t); + return 0; +} + static int load_balancing(struct test *t) { @@ -3260,6 +3331,12 @@ test_sw_eventdev(void) printf("ERROR - QID Priority test FAILED.\n"); goto test_fail; } + printf("*** Running Unlink-in-progress test...\n"); + ret = unlink_in_progress(t); + if (ret != 0) { + printf("ERROR - Unlink in progress test FAILED.\n"); + goto test_fail; + } printf("*** Running Ordered Reconfigure test...\n"); ret = ordered_reconfigure(t); if (ret != 0) { -- 2.17.1
[dpdk-dev] [PATCH 0/3] ethdev: add IP address and TCP/UDP port rewrite actions to flow API
This series of patches add support for actions: - SET_IPV4_SRC - set a new IPv4 source address. - SET_IPV4_DST - set a new IPv4 destination address. - SET_IPV6_SRC - set a new IPv6 source address. - SET_IPV6_DST - set a new IPv6 destination address. - SET_TP_SRC - set a new TCP/UDP source port number. - SET_TP_DST - set a new TCP/UDP destination port number. These actions are useful in Network Address Translation use case to edit IP address and TCP/UDP port numbers before switching the packets out to the destination device port. Patch 1 adds support for IP address rewrite to rte_flow and testpmd. Patch 2 adds support for TCP/UDP port rewrite to rte_flow and testpmd. Patch 3 shows CXGBE PMD example to offload these actions to hardware. Feedback and suggestions will be much appreciated. Thanks, Rahul RFC v1: http://mails.dpdk.org/archives/dev/2018-June/104913.html RFC v2: http://mails.dpdk.org/archives/dev/2018-August/109672.html --- Changes since RFC v2: - Updated comments, help messages, and doc to indicate that IP/TCP/UDP of the outermost headers are modified. - Updated comments and doc to indicate that a corresponding valid flow pattern item must be specified to offload corresponding header rewrite actions. - Re-based CXGBE PMD changes in patch 3 to tip. - Updated all instances of fw_filter_wr to new fw_filter2_wr and removed fw_filter_wr. - Ensure correct ULP type is set when offloading NAT actions. - Returning appropriate RTE_FLOW_ERROR_TYPE_ACTION error if a corresponding valid flow pattern item is not found. - Updated release notes. Rahul Lakkireddy (3): ethdev: add flow api actions to modify IP addresses ethdev: add flow api actions to modify TCP/UDP port numbers net/cxgbe: add flow actions to modify IP and TCP/UDP port address app/test-pmd/cmdline_flow.c | 156 + app/test-pmd/config.c | 12 ++ doc/guides/prog_guide/rte_flow.rst | 108 doc/guides/rel_notes/release_18_11.rst | 12 +- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 28 +++ drivers/net/cxgbe/base/common.h | 1 + drivers/net/cxgbe/base/t4_msg.h | 1 + drivers/net/cxgbe/base/t4fw_interface.h | 23 ++- drivers/net/cxgbe/cxgbe_filter.c| 37 +++- drivers/net/cxgbe/cxgbe_filter.h| 23 +++ drivers/net/cxgbe/cxgbe_flow.c | 178 +++- drivers/net/cxgbe/cxgbe_main.c | 10 ++ lib/librte_ethdev/rte_flow.c| 12 ++ lib/librte_ethdev/rte_flow.h| 107 14 files changed, 696 insertions(+), 12 deletions(-) -- 2.18.0
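As a usage illustration of the proposed actions (a sketch built from the structures introduced in patches 1 and 2, not code from the series): a destination-NAT style rule that rewrites the outermost IPv4 destination address and TCP destination port of matched packets. As required by the series, pattern items for the headers being rewritten are included; a fate action (e.g. forwarding to a port or queue) would normally be appended before END.

#include <rte_flow.h>
#include <rte_byteorder.h>

static struct rte_flow *
add_dnat_rule(uint16_t port_id, struct rte_flow_error *err)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
		{ .type = RTE_FLOW_ITEM_TYPE_TCP },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_set_ipv4 new_dst = {
		.ipv4_addr = rte_cpu_to_be_32(0xc0a80163), /* 192.168.1.99 */
	};
	struct rte_flow_action_set_tp new_port = {
		.port = rte_cpu_to_be_16(8080),
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_SET_IPV4_DST, .conf = &new_dst },
		{ .type = RTE_FLOW_ACTION_TYPE_SET_TP_DST, .conf = &new_port },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, err);
}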
[dpdk-dev] [PATCH 1/3] ethdev: add flow api actions to modify IP addresses
Add actions: - SET_IPV4_SRC - set a new IPv4 source address. - SET_IPV4_DST - set a new IPv4 destination address. - SET_IPV6_SRC - set a new IPv6 source address. - SET_IPV6_DST - set a new IPv6 destination address. Original work by Shagun Agrawal Signed-off-by: Rahul Lakkireddy --- Changes since RFC v2: - Updated comments, help messages, and doc to indicate that IP/TCP/UDP of the outermost headers are modified. - Updated comments and doc to indicate that a corresponding valid flow pattern item must be specified to offload corresponding header rewrite action. - Updated release notes. app/test-pmd/cmdline_flow.c | 104 app/test-pmd/config.c | 8 ++ doc/guides/prog_guide/rte_flow.rst | 72 ++ doc/guides/rel_notes/release_18_11.rst | 6 ++ doc/guides/testpmd_app_ug/testpmd_funcs.rst | 18 lib/librte_ethdev/rte_flow.c| 8 ++ lib/librte_ethdev/rte_flow.h| 70 + 7 files changed, 286 insertions(+) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index f9260600e..1432498a3 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -243,6 +243,14 @@ enum index { ACTION_VXLAN_DECAP, ACTION_NVGRE_ENCAP, ACTION_NVGRE_DECAP, + ACTION_SET_IPV4_SRC, + ACTION_SET_IPV4_SRC_IPV4_SRC, + ACTION_SET_IPV4_DST, + ACTION_SET_IPV4_DST_IPV4_DST, + ACTION_SET_IPV6_SRC, + ACTION_SET_IPV6_SRC_IPV6_SRC, + ACTION_SET_IPV6_DST, + ACTION_SET_IPV6_DST_IPV6_DST, }; /** Maximum size for pattern in struct rte_flow_item_raw. */ @@ -816,6 +824,10 @@ static const enum index next_action[] = { ACTION_VXLAN_DECAP, ACTION_NVGRE_ENCAP, ACTION_NVGRE_DECAP, + ACTION_SET_IPV4_SRC, + ACTION_SET_IPV4_DST, + ACTION_SET_IPV6_SRC, + ACTION_SET_IPV6_DST, ZERO, }; @@ -918,6 +930,30 @@ static const enum index action_of_push_mpls[] = { ZERO, }; +static const enum index action_set_ipv4_src[] = { + ACTION_SET_IPV4_SRC_IPV4_SRC, + ACTION_NEXT, + ZERO, +}; + +static const enum index action_set_ipv4_dst[] = { + ACTION_SET_IPV4_DST_IPV4_DST, + ACTION_NEXT, + ZERO, +}; + +static const enum index action_set_ipv6_src[] = { + ACTION_SET_IPV6_SRC_IPV6_SRC, + ACTION_NEXT, + ZERO, +}; + +static const enum index action_set_ipv6_dst[] = { + ACTION_SET_IPV6_DST_IPV6_DST, + ACTION_NEXT, + ZERO, +}; + static const enum index action_jump[] = { ACTION_JUMP_GROUP, ACTION_NEXT, @@ -2470,6 +2506,74 @@ static const struct token token_list[] = { .next = NEXT(NEXT_ENTRY(ACTION_NEXT)), .call = parse_vc, }, + [ACTION_SET_IPV4_SRC] = { + .name = "set_ipv4_src", + .help = "Set a new IPv4 source address in the outermost" + " IPv4 header", + .priv = PRIV_ACTION(SET_IPV4_SRC, + sizeof(struct rte_flow_action_set_ipv4)), + .next = NEXT(action_set_ipv4_src), + .call = parse_vc, + }, + [ACTION_SET_IPV4_SRC_IPV4_SRC] = { + .name = "ipv4_addr", + .help = "new IPv4 source address to set", + .next = NEXT(action_set_ipv4_src, NEXT_ENTRY(IPV4_ADDR)), + .args = ARGS(ARGS_ENTRY_HTON + (struct rte_flow_action_set_ipv4, ipv4_addr)), + .call = parse_vc_conf, + }, + [ACTION_SET_IPV4_DST] = { + .name = "set_ipv4_dst", + .help = "Set a new IPv4 destination address in the outermost" + " IPv4 header", + .priv = PRIV_ACTION(SET_IPV4_DST, + sizeof(struct rte_flow_action_set_ipv4)), + .next = NEXT(action_set_ipv4_dst), + .call = parse_vc, + }, + [ACTION_SET_IPV4_DST_IPV4_DST] = { + .name = "ipv4_addr", + .help = "new IPv4 destination address to set", + .next = NEXT(action_set_ipv4_dst, NEXT_ENTRY(IPV4_ADDR)), + .args = ARGS(ARGS_ENTRY_HTON + (struct rte_flow_action_set_ipv4, ipv4_addr)), + .call = parse_vc_conf, + }, + [ACTION_SET_IPV6_SRC] = { + .name = 
"set_ipv6_src", + .help = "Set a new IPv6 source address in the outermost" + " IPv6 header", + .priv = PRIV_ACTION(SET_IPV6_SRC, + sizeof(struct rte_flow_action_set_ipv6)), + .next = NEXT(action_set_ipv6_src), + .call = parse_vc, + }, + [ACTION_SET_IPV6_SRC_IPV6_SRC] = { + .name = "ipv6_addr", + .help = "new IPv6 source address to set", + .next = NEXT(action_set_ipv6_src, NEXT_ENTRY(I
[dpdk-dev] [PATCH 2/3] ethdev: add flow api actions to modify TCP/UDP port numbers
Add actions: - SET_TP_SRC - set a new TCP/UDP source port number. - SET_TP_DST - set a new TCP/UDP destination port number. Original work by Shagun Agrawal Signed-off-by: Rahul Lakkireddy --- Changes since RFC v2: - Updated comments, help messages, and doc to indicate that IP/TCP/UDP of the outermost headers are modified. - Updated comments and doc to indicate that a corresponding valid flow pattern item must be specified to offload corresponding header rewrite action. - Updated release notes. app/test-pmd/cmdline_flow.c | 52 + app/test-pmd/config.c | 4 ++ doc/guides/prog_guide/rte_flow.rst | 36 ++ doc/guides/rel_notes/release_18_11.rst | 2 + doc/guides/testpmd_app_ug/testpmd_funcs.rst | 10 lib/librte_ethdev/rte_flow.c| 4 ++ lib/librte_ethdev/rte_flow.h| 37 +++ 7 files changed, 145 insertions(+) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index 1432498a3..a9888cacf 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -251,6 +251,10 @@ enum index { ACTION_SET_IPV6_SRC_IPV6_SRC, ACTION_SET_IPV6_DST, ACTION_SET_IPV6_DST_IPV6_DST, + ACTION_SET_TP_SRC, + ACTION_SET_TP_SRC_TP_SRC, + ACTION_SET_TP_DST, + ACTION_SET_TP_DST_TP_DST, }; /** Maximum size for pattern in struct rte_flow_item_raw. */ @@ -828,6 +832,8 @@ static const enum index next_action[] = { ACTION_SET_IPV4_DST, ACTION_SET_IPV6_SRC, ACTION_SET_IPV6_DST, + ACTION_SET_TP_SRC, + ACTION_SET_TP_DST, ZERO, }; @@ -954,6 +960,18 @@ static const enum index action_set_ipv6_dst[] = { ZERO, }; +static const enum index action_set_tp_src[] = { + ACTION_SET_TP_SRC_TP_SRC, + ACTION_NEXT, + ZERO, +}; + +static const enum index action_set_tp_dst[] = { + ACTION_SET_TP_DST_TP_DST, + ACTION_NEXT, + ZERO, +}; + static const enum index action_jump[] = { ACTION_JUMP_GROUP, ACTION_NEXT, @@ -2574,6 +2592,40 @@ static const struct token token_list[] = { (struct rte_flow_action_set_ipv6, ipv6_addr)), .call = parse_vc_conf, }, + [ACTION_SET_TP_SRC] = { + .name = "set_tp_src", + .help = "set a new source port number in the outermost" + " TCP/UDP header", + .priv = PRIV_ACTION(SET_TP_SRC, + sizeof(struct rte_flow_action_set_tp)), + .next = NEXT(action_set_tp_src), + .call = parse_vc, + }, + [ACTION_SET_TP_SRC_TP_SRC] = { + .name = "port", + .help = "new source port number to set", + .next = NEXT(action_set_tp_src, NEXT_ENTRY(UNSIGNED)), + .args = ARGS(ARGS_ENTRY_HTON +(struct rte_flow_action_set_tp, port)), + .call = parse_vc_conf, + }, + [ACTION_SET_TP_DST] = { + .name = "set_tp_dst", + .help = "set a new destination port number in the outermost" + " TCP/UDP header", + .priv = PRIV_ACTION(SET_TP_DST, + sizeof(struct rte_flow_action_set_tp)), + .next = NEXT(action_set_tp_dst), + .call = parse_vc, + }, + [ACTION_SET_TP_DST_TP_DST] = { + .name = "port", + .help = "new destination port number to set", + .next = NEXT(action_set_tp_dst, NEXT_ENTRY(UNSIGNED)), + .args = ARGS(ARGS_ENTRY_HTON +(struct rte_flow_action_set_tp, port)), + .call = parse_vc_conf, + }, }; /** Remove and return last entry from argument stack. 
*/ diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index 14dbdf7a3..1629a6d7a 100644 --- a/app/test-pmd/config.c +++ b/app/test-pmd/config.c @@ -1180,6 +1180,10 @@ static const struct { sizeof(struct rte_flow_action_set_ipv6)), MK_FLOW_ACTION(SET_IPV6_DST, sizeof(struct rte_flow_action_set_ipv6)), + MK_FLOW_ACTION(SET_TP_SRC, + sizeof(struct rte_flow_action_set_tp)), + MK_FLOW_ACTION(SET_TP_DST, + sizeof(struct rte_flow_action_set_tp)), }; /** Compute storage space needed by action configuration and copy it. */ diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst index b9bcaa3d1..4be160209 100644 --- a/doc/guides/prog_guide/rte_flow.rst +++ b/doc/guides/prog_guide/rte_flow.rst @@ -2148,6 +2148,42 @@ Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned. | ``ipv6_addr`` | new IPv6 destination address | +---+--+ +Action: ``SET_TP_SRC`` +^ + +Set a new source port numb
[dpdk-dev] [PATCH 3/3] net/cxgbe: add flow actions to modify IP and TCP/UDP port address
Query firmware for the new filter work request to offload flows with actions to modify IP and TCP/UDP port addresses. When available, translate IP and TCP/UDP port address modify actions to internal hardware specification and offload the flow to hardware. Original work by Shagun Agrawal Signed-off-by: Rahul Lakkireddy --- Changes since RFC v2: - Re-based to tip. - Updated all instances of fw_filter_wr to new fw_filter2_wr and removed fw_filter_wr. - Ensure correct ULP type is set when offloading NAT actions. - Returning appropriate RTE_FLOW_ERROR_TYPE_ACTION error if a corresponding valid flow pattern item for the header rewrite action is not found. - Updated release notes. doc/guides/rel_notes/release_18_11.rst | 4 +- drivers/net/cxgbe/base/common.h | 1 + drivers/net/cxgbe/base/t4_msg.h | 1 + drivers/net/cxgbe/base/t4fw_interface.h | 23 ++- drivers/net/cxgbe/cxgbe_filter.c| 37 +++-- drivers/net/cxgbe/cxgbe_filter.h| 23 +++ drivers/net/cxgbe/cxgbe_flow.c | 178 +++- drivers/net/cxgbe/cxgbe_main.c | 10 ++ 8 files changed, 265 insertions(+), 12 deletions(-) diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst index 84b0a6a4b..04d5d26a4 100644 --- a/doc/guides/rel_notes/release_18_11.rst +++ b/doc/guides/rel_notes/release_18_11.rst @@ -59,7 +59,9 @@ New Features Flow API support has been enhanced for CXGBE Poll Mode Driver to offload: * Match items: destination MAC address. - * Action items: push/pop/rewrite vlan header. + * Action items: push/pop/rewrite vlan header, rewrite IP addresses in +outermost IPv4/IPv6 header, rewrite port numbers in outermost TCP/UDP +header. * **Added support for SR-IOV in netvsc PMD.** diff --git a/drivers/net/cxgbe/base/common.h b/drivers/net/cxgbe/base/common.h index d9f74d995..fd2006682 100644 --- a/drivers/net/cxgbe/base/common.h +++ b/drivers/net/cxgbe/base/common.h @@ -271,6 +271,7 @@ struct adapter_params { bool ulptx_memwrite_dsgl; /* use of T5 DSGL allowed */ u8 fw_caps_support; /* 32-bit Port Capabilities */ + u8 filter2_wr_support;/* FW support for FILTER2_WR */ }; /* Firmware Port Capabilities types. 
diff --git a/drivers/net/cxgbe/base/t4_msg.h b/drivers/net/cxgbe/base/t4_msg.h index 2128da64f..6494f1827 100644 --- a/drivers/net/cxgbe/base/t4_msg.h +++ b/drivers/net/cxgbe/base/t4_msg.h @@ -32,6 +32,7 @@ enum CPL_error { enum { ULP_MODE_NONE = 0, + ULP_MODE_TCPDDP= 5, }; enum { diff --git a/drivers/net/cxgbe/base/t4fw_interface.h b/drivers/net/cxgbe/base/t4fw_interface.h index e2d2ee897..b4c95c588 100644 --- a/drivers/net/cxgbe/base/t4fw_interface.h +++ b/drivers/net/cxgbe/base/t4fw_interface.h @@ -61,6 +61,7 @@ enum fw_wr_opcodes { FW_ETH_TX_PKTS_WR = 0x09, FW_ETH_TX_PKT_VM_WR = 0x11, FW_ETH_TX_PKTS_VM_WR= 0x12, + FW_FILTER2_WR = 0x77, FW_ETH_TX_PKTS2_WR = 0x78, }; @@ -165,7 +166,7 @@ enum fw_filter_wr_cookie { FW_FILTER_WR_EINVAL, }; -struct fw_filter_wr { +struct fw_filter2_wr { __be32 op_pkd; __be32 len16_pkd; __be64 r3; @@ -195,6 +196,19 @@ struct fw_filter_wr { __be16 fpm; __be16 r7; __u8 sma[6]; + __be16 r8; + __u8 filter_type_swapmac; + __u8 natmode_to_ulp_type; + __be16 newlport; + __be16 newfport; + __u8 newlip[16]; + __u8 newfip[16]; + __be32 natseqcheck; + __be32 r9; + __be64 r10; + __be64 r11; + __be64 r12; + __be64 r13; }; #define S_FW_FILTER_WR_TID 12 @@ -300,6 +314,12 @@ struct fw_filter_wr { #define S_FW_FILTER_WR_MATCHTYPEM 0 #define V_FW_FILTER_WR_MATCHTYPEM(x) ((x) << S_FW_FILTER_WR_MATCHTYPEM) +#define S_FW_FILTER2_WR_NATMODE5 +#define V_FW_FILTER2_WR_NATMODE(x) ((x) << S_FW_FILTER2_WR_NATMODE) + +#define S_FW_FILTER2_WR_ULP_TYPE 0 +#define V_FW_FILTER2_WR_ULP_TYPE(x)((x) << S_FW_FILTER2_WR_ULP_TYPE) + /** * C O M M A N D s */ @@ -655,6 +675,7 @@ enum fw_params_param_dev { FW_PARAMS_PARAM_DEV_FWREV = 0x0B, /* fw version */ FW_PARAMS_PARAM_DEV_TPREV = 0x0C, /* tp version */ FW_PARAMS_PARAM_DEV_ULPTX_MEMWRITE_DSGL = 0x17, + FW_PARAMS_PARAM_DEV_FILTER2_WR = 0x1D, }; /* diff --git a/drivers/net/cxgbe/cxgbe_filter.c b/drivers/net/cxgbe/cxgbe_filter.c index dcb1dd03e..b876abf43 100644 --- a/drivers/net/cxgbe/cxgbe_filter.c +++ b/drivers/net/cxgbe/cxgbe_filter.c @@ -89,6 +89,9 @@ int validate_filter(struct adapter *adapter, struct ch_filter_specification *fs) if (fs->val.iport >= adapter->params.nports) return -ERANGE; + if (!fs->cap && fs->nat_mode && !ada
[dpdk-dev] [PATCH] ethdev: add action to swap source and destination MAC to flow API
This action is useful for offloading loopback mode, where the hardware will swap source and destination MAC addresses in the outermost Ethernet header before looping back the packet. This action can be used in conjunction with other rewrite actions to achieve MAC layer transparent NAT where the MAC addresses are swapped before either the source or destination MAC address is rewritten and NAT is performed. Must be used with a valid RTE_FLOW_ITEM_TYPE_ETH flow pattern item. Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error should be returned by the PMDs. Original work by Shagun Agrawal Signed-off-by: Rahul Lakkireddy --- RFC v1: http://mails.dpdk.org/archives/dev/2018-August/110232.html RFC v2: http://mails.dpdk.org/archives/dev/2018-August/110355.html Changes since RFC v2: - Updated release notes. app/test-pmd/cmdline_flow.c | 10 ++ app/test-pmd/config.c | 1 + doc/guides/prog_guide/rte_flow.rst | 19 +++ doc/guides/rel_notes/release_18_11.rst | 5 + doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 +++ lib/librte_ethdev/rte_flow.c| 1 + lib/librte_ethdev/rte_flow.h| 11 +++ 7 files changed, 50 insertions(+) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index f9260600e..196c76de1 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -243,6 +243,7 @@ enum index { ACTION_VXLAN_DECAP, ACTION_NVGRE_ENCAP, ACTION_NVGRE_DECAP, + ACTION_MAC_SWAP, }; /** Maximum size for pattern in struct rte_flow_item_raw. */ @@ -816,6 +817,7 @@ static const enum index next_action[] = { ACTION_VXLAN_DECAP, ACTION_NVGRE_ENCAP, ACTION_NVGRE_DECAP, + ACTION_MAC_SWAP, ZERO, }; @@ -2470,6 +2472,14 @@ static const struct token token_list[] = { .next = NEXT(NEXT_ENTRY(ACTION_NEXT)), .call = parse_vc, }, + [ACTION_MAC_SWAP] = { + .name = "mac_swap", + .help = "Swap the source and destination MAC addresses" + " in the outermost Ethernet header", + .priv = PRIV_ACTION(MAC_SWAP, 0), + .next = NEXT(NEXT_ENTRY(ACTION_NEXT)), + .call = parse_vc, + }, }; /** Remove and return last entry from argument stack. */ diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index 794aa5268..43d803abb 100644 --- a/app/test-pmd/config.c +++ b/app/test-pmd/config.c @@ -1172,6 +1172,7 @@ static const struct { sizeof(struct rte_flow_action_of_pop_mpls)), MK_FLOW_ACTION(OF_PUSH_MPLS, sizeof(struct rte_flow_action_of_push_mpls)), + MK_FLOW_ACTION(MAC_SWAP, 0), }; /** Compute storage space needed by action configuration and copy it. */ diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst index b305a72a5..d09806d38 100644 --- a/doc/guides/prog_guide/rte_flow.rst +++ b/doc/guides/prog_guide/rte_flow.rst @@ -2076,6 +2076,25 @@ RTE_FLOW_ERROR_TYPE_ACTION error should be returned. This action modifies the payload of matched flows. +Action: ``MAC_SWAP`` +^ + +Swap the source and destination MAC addresses in the outermost Ethernet +header. + +It must be used with a valid RTE_FLOW_ITEM_TYPE_ETH flow pattern item. +Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned. + +.. _table_rte_flow_action_mac_swap: + +.. table:: MAC_SWAP + + +---+ + | Field | + +===+ + | no properties | + +---+ + Negative types ~~ diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst index f39cb15d2..83de90bcf 100644 --- a/doc/guides/rel_notes/release_18_11.rst +++ b/doc/guides/rel_notes/release_18_11.rst @@ -87,6 +87,11 @@ New Features the specified port. The port must be stopped before the command call in order to reconfigure queues. 
+* **Added new Flow API action to swap MAC addresses in Ethernet header.** + + Added new Flow API action to swap the source and destination MAC + addresses in the outermost Ethernet header. + API Changes --- diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst index 3a73000a6..c58b18e1a 100644 --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst @@ -3704,6 +3704,9 @@ This section lists supported actions and their attributes, if any. - ``nvgre_decap``: Performs a decapsulation action by stripping all headers of the NVGRE tunnel network overlay from the matched flow. +- ``mac_swap``: Swap the source and destination MAC addresses in the outermost + Ethernet header. + Destroying flow rules ~ diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c index cff4b5209..04b0b40
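For illustration (a sketch, not part of the patch), a rule using the new action to swap MAC addresses on all matched Ethernet frames; per the documentation above, a valid ETH pattern item must be present:

#include <rte_flow.h>

static struct rte_flow *
add_mac_swap_rule(uint16_t port_id, struct rte_flow_error *err)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_MAC_SWAP }, /* no configuration */
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, err);
}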
Re: [dpdk-dev] [PATCH v3 3/4] app/test-eventdev: add Tx adapter support
On 23.09.2018 13:35, Jerin Jacob wrote: > -Original Message- >> Date: Thu, 20 Sep 2018 03:52:34 +0530 >> From: Pavan Nikhilesh [...] >> -struct rte_event_dev_info info; >> -struct test_pipeline *t = evt_test_priv(test); >> -uint8_t tx_evqueue_id = 0; >> +uint8_t tx_evqueue_id[RTE_MAX_ETHPORTS] = {0}; > > Some old compilers throw an error with this scheme. Please change to memset. Really? Could you give an example? That is perfectly legal C (since "forever"?) and I find it more readable than memset. Don't treat it as a request to keep the original version - if I were Pavan I would object to this particular request, since I prefer direct initialization. However, here I'm more interested in learning more about your statement that some compilers do not support zero initialization of array members after the last initializer, and maybe also about to what extent we should be supporting old/non-compliant compilers (the doc suggests using gcc 4.9+). Best regards Andrzej
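For reference, the two forms being compared (a standalone sketch; RTE_MAX_ETHPORTS is defined locally only to keep the snippet self-contained). In standard C, elements after the last initializer are zero-initialized, so both arrays end up fully zeroed:

#include <stdint.h>
#include <string.h>

#define RTE_MAX_ETHPORTS 32 /* stand-in value for this sketch */

void zero_init_example(void)
{
	uint8_t a[RTE_MAX_ETHPORTS] = {0};	/* direct initialization */

	uint8_t b[RTE_MAX_ETHPORTS];
	memset(b, 0, sizeof(b));		/* equivalent via memset */

	(void)a;
	(void)b;
}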
Re: [dpdk-dev] [PATCH] ethdev: add action to swap source and destination MAC to flow API
On 9/24/18 11:29 AM, wrote: This action is useful for offloading loopback mode, where the hardware will swap source and destination MAC addresses in the outermost Ethernet header before looping back the packet. This action can be used in conjunction with other rewrite actions to achieve MAC layer transparent NAT where the MAC addresses are swapped before either the source or destination MAC address is rewritten and NAT is performed. Must be used with a valid RTE_FLOW_ITEM_TYPE_ETH flow pattern item. Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error should be returned by the PMDs. Original work by Shagun Agrawal Signed-off-by: Rahul Lakkireddy Acked-by: Andrew Rybchenko
[dpdk-dev] [PATCH v2] test/event: fix RSS config in eth Rx adapter test
Remove RSS config as it is not required. The hardcoded RSS configuraton also generates an error on NICs that don't support it. Fixes: 8863a1fbfc66 ("ethdev: add supported hash function check") CC: sta...@dpdk.org Signed-off-by: Nikhil Rao --- test/test/test_event_eth_rx_adapter.c | 13 ++--- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/test/test/test_event_eth_rx_adapter.c b/test/test/test_event_eth_rx_adapter.c index 28f2146..3c19ee0 100644 --- a/test/test/test_event_eth_rx_adapter.c +++ b/test/test/test_event_eth_rx_adapter.c @@ -98,8 +98,7 @@ struct event_eth_rx_adapter_test_params { { static const struct rte_eth_conf port_conf_default = { .rxmode = { - .mq_mode = ETH_MQ_RX_RSS, - .max_rx_pkt_len = ETHER_MAX_LEN + .mq_mode = ETH_MQ_RX_NONE, }, .intr_conf = { .rxq = 1, @@ -114,16 +113,8 @@ struct event_eth_rx_adapter_test_params { { static const struct rte_eth_conf port_conf_default = { .rxmode = { - .mq_mode = ETH_MQ_RX_RSS, - .max_rx_pkt_len = ETHER_MAX_LEN + .mq_mode = ETH_MQ_RX_NONE, }, - .rx_adv_conf = { - .rss_conf = { - .rss_hf = ETH_RSS_IP | - ETH_RSS_TCP | - ETH_RSS_UDP, - } - } }; return port_init_common(port, &port_conf_default, mp); -- 1.8.3.1
Re: [dpdk-dev] [RFC] ipsec: new library for IPsec data-path processing
Hi Jerin, > > > > > > > > > > Anyway, let's pretend we found some smart way to distribute inbound > > > > > packets for the same SA to multiple HW queues/CPU > > > cores. > > > > > To make ipsec processing for such case to work correctly just > > > > > atomicity on check/update segn/replay_window is not enough. > > > > > I think it would require some extra synchronization: > > > > > make sure that we do final packet processing (seq check/update) at > > > > > the same order as we received the packets > > > > > (packets entered ipsec processing). > > > > > I don't really like to introduce such heavy mechanisms on SA level, > > > > > after all it supposed to be light and simple. > > > > > Though we plan CTX level API to support such scenario. > > > > > What I think would be useful addition for SA level API - have an > > > > > ability to do one update seqn/replay_window and multiple checks > > > concurrently. > > > > > > > > > > > In case of ingress also, the same problem exists. We will not be > > > > > > able to use RSS and spread the traffic to multiple cores. > > > Considering > > > > > > IPsec being CPU intensive, this would limit the net output of the > > > > > > chip. > > > > > That's true - but from other side implementation can offload heavy > > > > > part > > > > > (encrypt/decrypt, auth) to special HW (cryptodev). > > > > > In that case single core might be enough for SA and extra > > > > > synchronization would just slowdown things. > > > > > That's why I think it should be configurable what behavior (ST or > > > > > MT) to use. > > > > I do agree that these are the issues that we need to address to make the > > > > library MT safe. Whether the extra synchronization would slow down > > > > things is > > > > a very subjective question and will heavily depend on the platform. The > > > > library should have enough provisions to be able to support MT without > > > > causing overheads to ST. Right now, the library assumes ST. > > > > > > > > > I agree with Anoob here. > > > > > > I have two concerns with librte_ipsec as a separate library > > > > > > 1) There is an overlap with rte_security and new proposed library. > > > > I don't think there really is an overlap. > > As mentioned in your other email. IMO, There is an overlap as > RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL can support almost everything > in HW or HW + SW if some PMD wishes to do so. > > Answering some of the questions, you have asked in other thread based on > my understanding. > > Regarding RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL support, > Marvell/Cavium CPT hardware on next generation HW(Planning to upstream > around v19.02) can support RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL and > Anoob already pushed the application changes in ipsec-gw. Ok good to know. > > In our understanding of HW/SW roles/responsibilities for that type of > devices are: > > INLINE_PROTOCOL > > In control path, security session is created with the given SA and > rte_flow configuration etc. > > For outbound traffic, the application will have to do SA lookup and > identify the security action (inline/look aside crypto/protocol). For > packets identified for inline protocol processing, the application would > submit as plain packets to the ethernet device and the security capable > ethernet device would perform IPSec and send out the packet. For PMDs > which would need extra metadata (capability flag), set_pkt_metadata > function pointer would be called (from application). 
> This can be used to set some per packet field to identify the security > session to be used to > process the packet. Yes, as I can see, that's what ipsec-gw is doing right now and it wouldn't be a problem to do the same in ipsec lib. > Sequence number updation will be done by the PMD. Ok, so for INLINE_PROTOCOL upper layer wouldn't need to keep track for SQN values at all? You don’t' consider a possibility that by some reason that SA would need to be moved from device that support INLINE_PROTOCOL to the device that doesn't? > For inbound traffic, the packets for IPSec would be identified by using > rte_flow (hardware accelerated packet filtering). For the packets > identified for inline offload (SECURITY action), hardware would perform > the processing. For inline protocol processed IPSec packets, PMD would > set “user data” so that application can get the details of the security > processing done on the packet. Once the plain packet (after IPSec > processing) is received, a selector check need to be performed to make > sure we have a valid packet after IPSec processing. The user data is used > for that. Anti-replay check is handled by the PMD. The PMD would raise > an eth event in case of sequence number expiry or any SA expiry. Few questions here: 1) if I understand things right - to specify that it was an IPsec packet - PKT_RX_SEC_OFFLOAD will be set in mbuf ol_flags? 2) Basically 'userdata' will contain just a user provided at rte_securi
Re: [dpdk-dev] [PATCH v4 00/20] Support externally allocated memory in DPDK
On 23-Sep-18 10:21 PM, Thomas Monjalon wrote: Hi Anatoly, 21/09/2018 18:13, Anatoly Burakov: This is a proposal to enable using externally allocated memory in DPDK. About this change and previous ones, I think we may be missing some documentation about the usage and the internal design of DPDK memory allocation. You already updated some doc recently: http://git.dpdk.org/dpdk/commit/?id=b31739328 This is what we have currently: http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#memory-segments-and-memory-zones-memzone http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#malloc http://doc.dpdk.org/guides/prog_guide/mempool_lib.html This is probably a good time to check this doc again. Do you think it deserves more explanations, or maybe some figures? Maybe this could be split into two sections - an explanation of the user-facing API, and an explanation of its inner workings. However, I don't want DPDK documentation to become my personal soapbox, so I'm open to suggestions on what is missing and how to organize the memory docs better :) -- Thanks, Anatoly
[dpdk-dev] [PATCH v3] test/event: fix RSS config in eth Rx adapter test
Remove RSS config as it is not required. The hardcoded RSS configuration also generates an error on NICs that don't support it. Fixes: 8863a1fbfc66 ("ethdev: add supported hash function check") CC: sta...@dpdk.org Signed-off-by: Nikhil Rao --- v2: - use ETH_MQ_RX_NONE to disable RSS (Jerin Jacob) v3: - fix typo in commit message (checkpatch warning) test/test/test_event_eth_rx_adapter.c | 13 ++--- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/test/test/test_event_eth_rx_adapter.c b/test/test/test_event_eth_rx_adapter.c index 28f2146..3c19ee0 100644 --- a/test/test/test_event_eth_rx_adapter.c +++ b/test/test/test_event_eth_rx_adapter.c @@ -98,8 +98,7 @@ struct event_eth_rx_adapter_test_params { { static const struct rte_eth_conf port_conf_default = { .rxmode = { - .mq_mode = ETH_MQ_RX_RSS, - .max_rx_pkt_len = ETHER_MAX_LEN + .mq_mode = ETH_MQ_RX_NONE, }, .intr_conf = { .rxq = 1, @@ -114,16 +113,8 @@ struct event_eth_rx_adapter_test_params { { static const struct rte_eth_conf port_conf_default = { .rxmode = { - .mq_mode = ETH_MQ_RX_RSS, - .max_rx_pkt_len = ETHER_MAX_LEN + .mq_mode = ETH_MQ_RX_NONE, }, - .rx_adv_conf = { - .rss_conf = { - .rss_hf = ETH_RSS_IP | - ETH_RSS_TCP | - ETH_RSS_UDP, - } - } }; return port_init_common(port, &port_conf_default, mp); -- 1.8.3.1
Re: [dpdk-dev] [PATCH 07/21] net/atlantic: hardware register access routines
Hi Hemant, >> + * under the terms and conditions of the GNU General Public License, >> + * version 2, as published by the Free Software Foundation. >> + */ > > DPDK is an open source BSD-3 licensed framework. GPL license files are not > allowed in DPDK unless: > 1. They are part of kernel module (e.g. KNI) > 2. They are dual licensed and they support BSD-3 license as well. > > So, please submit single or dual BSD-3 licensed source code. Thanks, got it. This part of the code comes from the Linux kernel driver; I think it's not a problem for us to distribute it under a dual license.
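For example, a dual-licensed file could carry an SPDX tag along these lines (a sketch only; the exact tag and wording should follow the DPDK licensing guidelines and the original copyright holders' approval):

	/* SPDX-License-Identifier: (BSD-3-Clause OR GPL-2.0) */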
Re: [dpdk-dev] [PATCH v5 1/8] net/mvneta: add neta PMD skeleton
On 9/20/2018 10:05 AM, Andrzej Ostruszka wrote: > From: Zyta Szpak > > Add neta pmd driver skeleton providing base for the further > development. > > Signed-off-by: Natalie Samsonov > Signed-off-by: Yelena Krivosheev > Signed-off-by: Dmitri Epshtein > Signed-off-by: Zyta Szpak > Signed-off-by: Andrzej Ostruszka <...> > @@ -0,0 +1,75 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2018 Marvell International Ltd. > + * Copyright(c) 2018 Semihalf. > + * All rights reserved. > + */ > + > +#ifndef _MVNETA_ETHDEV_H_ > +#define _MVNETA_ETHDEV_H_ > + > +/* > + * container_of is defined by both DPDK and MUSDK, > + * we'll declare only one version. > + * > + * Note that it is not used in this PMD anyway. > + */ > +#ifdef container_of > +#undef container_of > +#endif > + > +#include > +#include Can't find mv_neta.h in $(LIBMUSDK_PATH)/include [1] There is a "mv_neta.h" in "./src/include/drivers/mv_neta.h" but not in the installed path. /usr/local/include/drivers. Is there a specific build param required for musdk for neta support? [1] .../drivers/net/mvneta/mvneta_ethdev.h:24:10: fatal error: 'drivers/mv_neta.h' file not found #include ^~~
Re: [dpdk-dev] [PATCH v5 1/8] net/mvneta: add neta PMD skeleton
On 9/24/2018 10:21 AM, Ferruh Yigit wrote: > On 9/20/2018 10:05 AM, Andrzej Ostruszka wrote: >> From: Zyta Szpak >> >> Add neta pmd driver skeleton providing base for the further >> development. >> >> Signed-off-by: Natalie Samsonov >> Signed-off-by: Yelena Krivosheev >> Signed-off-by: Dmitri Epshtein >> Signed-off-by: Zyta Szpak >> Signed-off-by: Andrzej Ostruszka > > <...> > >> @@ -0,0 +1,75 @@ >> +/* SPDX-License-Identifier: BSD-3-Clause >> + * Copyright(c) 2018 Marvell International Ltd. >> + * Copyright(c) 2018 Semihalf. >> + * All rights reserved. >> + */ >> + >> +#ifndef _MVNETA_ETHDEV_H_ >> +#define _MVNETA_ETHDEV_H_ >> + >> +/* >> + * container_of is defined by both DPDK and MUSDK, >> + * we'll declare only one version. >> + * >> + * Note that it is not used in this PMD anyway. >> + */ >> +#ifdef container_of >> +#undef container_of >> +#endif >> + >> +#include >> +#include > > Can't find mv_neta.h in $(LIBMUSDK_PATH)/include [1] > > There is a "mv_neta.h" in "./src/include/drivers/mv_neta.h" but not in the > installed path. > /usr/local/include/drivers. > > Is there a specific build param required for musdk for neta support? I found it: --enable-bpool-dma=64 --enable-pp2=no --enable-neta, But this means I need different musdk builds for mvpp2 and mvneta! Can't it possible to use single musdk build for both libraries? > > [1] > .../drivers/net/mvneta/mvneta_ethdev.h:24:10: fatal error: 'drivers/mv_neta.h' > file not found > #include > > ^~~ >
Re: [dpdk-dev] [PATCH v5 1/8] net/mvneta: add neta PMD skeleton
On 9/24/2018 10:35 AM, Ferruh Yigit wrote: > On 9/24/2018 10:21 AM, Ferruh Yigit wrote: >> On 9/20/2018 10:05 AM, Andrzej Ostruszka wrote: >>> From: Zyta Szpak >>> >>> Add neta pmd driver skeleton providing base for the further >>> development. >>> >>> Signed-off-by: Natalie Samsonov >>> Signed-off-by: Yelena Krivosheev >>> Signed-off-by: Dmitri Epshtein >>> Signed-off-by: Zyta Szpak >>> Signed-off-by: Andrzej Ostruszka >> >> <...> >> >>> @@ -0,0 +1,75 @@ >>> +/* SPDX-License-Identifier: BSD-3-Clause >>> + * Copyright(c) 2018 Marvell International Ltd. >>> + * Copyright(c) 2018 Semihalf. >>> + * All rights reserved. >>> + */ >>> + >>> +#ifndef _MVNETA_ETHDEV_H_ >>> +#define _MVNETA_ETHDEV_H_ >>> + >>> +/* >>> + * container_of is defined by both DPDK and MUSDK, >>> + * we'll declare only one version. >>> + * >>> + * Note that it is not used in this PMD anyway. >>> + */ >>> +#ifdef container_of >>> +#undef container_of >>> +#endif >>> + >>> +#include >>> +#include >> >> Can't find mv_neta.h in $(LIBMUSDK_PATH)/include [1] >> >> There is a "mv_neta.h" in "./src/include/drivers/mv_neta.h" but not in the >> installed path. >> /usr/local/include/drivers. >> >> Is there a specific build param required for musdk for neta support? > > I found it: --enable-bpool-dma=64 --enable-pp2=no --enable-neta, btw, getting "configure: WARNING: unrecognized options: --enable-bpool-dma" FYI > > But this means I need different musdk builds for mvpp2 and mvneta! > Can't it possible to use single musdk build for both libraries? > >> >> [1] >> .../drivers/net/mvneta/mvneta_ethdev.h:24:10: fatal error: >> 'drivers/mv_neta.h' >> file not found >> #include >> >> ^~~ >> >
Re: [dpdk-dev] [PATCH] examples/eventdev_pipeline: add Tx adapter support
On Fri, Sep 21, 2018 at 01:14:53PM +0530, Rao, Nikhil wrote: > On 9/5/2018 7:15 PM, Pavan Nikhilesh wrote: > > Signed-off-by: Pavan Nikhilesh > > --- > > This patch depends on the following series: > > http://patches.dpdk.org/project/dpdk/list/?series=1121 > > > > examples/eventdev_pipeline/main.c | 62 ++-- > > examples/eventdev_pipeline/pipeline_common.h | 31 +- > > .../pipeline_worker_generic.c | 273 +- > > .../eventdev_pipeline/pipeline_worker_tx.c| 130 + > > 4 files changed, 186 insertions(+), 310 deletions(-) > > > > --- a/examples/eventdev_pipeline/pipeline_worker_generic.c > > +++ b/examples/eventdev_pipeline/pipeline_worker_generic.c > > @@ -119,153 +119,13 @@ worker_generic_burst(void *arg) > > return 0; > > } > > > > > static void > > -init_rx_adapter(uint16_t nb_ports) > > +init_adapters(uint16_t nb_ports) > > { > > int i; > > int ret; > > + uint8_t tx_port_id = 0; > > uint8_t evdev_id = 0; > > struct rte_event_dev_info dev_info; > > > > ret = rte_event_dev_info_get(evdev_id, &dev_info); > > > > - struct rte_event_port_conf rx_p_conf = { > > + struct rte_event_port_conf adptr_p_conf = { > > .dequeue_depth = 8, > > .enqueue_depth = 8, > > .new_event_threshold = 1200, > > }; > > > > We should restore the dequeue_depth to 128 for the port config passed to > the Tx adapter. Doing so takes the performance from 5.8 mpps to 11.7 > mpps for on my test setup (test setup uses the SW PMD). Restoring > enqueue_depth and new_event_threshold (64 and 4096 respectively) had > no noticeable effect. We replace the above values with the defaults passed by the driver if (adptr_p_conf.dequeue_depth > dev_info.max_event_port_dequeue_depth) adptr_p_conf.dequeue_depth = dev_info.max_event_port_dequeue_depth; if (adptr_p_conf.enqueue_depth > dev_info.max_event_port_enqueue_depth) adptr_p_conf.enqueue_depth = dev_info.max_event_port_enqueue_depth; Still I missed setting it to configurable defaults used for worker ports as below struct rte_event_port_conf adptr_p_conf = { .dequeue_depth = cdata.worker_cq_depth, .enqueue_depth = 64, .new_event_threshold = 1200, }; I will send the v2 soon > > Thanks, > Nikhil Thanks, Pavan.
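For reference, a minimal sketch (untested) of the adapter port setup discussed above: the depths start from the configurable worker defaults and are then clamped to the limits reported by rte_event_dev_info_get().

	struct rte_event_dev_info dev_info;
	struct rte_event_port_conf adptr_p_conf = {
		.dequeue_depth = cdata.worker_cq_depth, /* e.g. 128 for the SW PMD */
		.enqueue_depth = 64,
		.new_event_threshold = 1200,
	};

	ret = rte_event_dev_info_get(evdev_id, &dev_info);

	/* never exceed what the event device actually supports */
	if (adptr_p_conf.dequeue_depth > dev_info.max_event_port_dequeue_depth)
		adptr_p_conf.dequeue_depth = dev_info.max_event_port_dequeue_depth;
	if (adptr_p_conf.enqueue_depth > dev_info.max_event_port_enqueue_depth)
		adptr_p_conf.enqueue_depth = dev_info.max_event_port_enqueue_depth;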
Re: [dpdk-dev] [PATCH v4 1/5] vhost: unify struct VhostUserMsg usage
On 22-Sep-18 10:16 PM, Nikolay Nikolaev wrote: Do not use the typedef version of struct VhostUserMsg. Also unify the related parameter name. Signed-off-by: Nikolay Nikolaev --- Now this breaks the unity of the rest of the patchset, because all other patches still use a typedef :P -- Thanks, Anatoly
Re: [dpdk-dev] [PATCH v5 1/8] net/mvneta: add neta PMD skeleton
On 9/20/2018 10:05 AM, Andrzej Ostruszka wrote: > From: Zyta Szpak > > Add neta pmd driver skeleton providing base for the further > development. > > Signed-off-by: Natalie Samsonov > Signed-off-by: Yelena Krivosheev > Signed-off-by: Dmitri Epshtein > Signed-off-by: Zyta Szpak > Signed-off-by: Andrzej Ostruszka > --- > MAINTAINERS | 8 + > config/common_base| 5 + > devtools/test-build.sh| 2 + > doc/guides/nics/features/mvneta.ini | 11 + > doc/guides/nics/mvneta.rst| 152 +++ doc/guides/nics/mvneta.rst: WARNING: document isn't included in any toctree doc/guides/nics/index.rst needs to be updated.
Re: [dpdk-dev] [PATCH v5 1/8] net/mvneta: add neta PMD skeleton
On 9/20/2018 10:05 AM, Andrzej Ostruszka wrote: > +/** > + * DPDK callback to register the virtual device. > + * > + * @param vdev > + * Pointer to the virtual device. > + * > + * @return > + * 0 on success, negative error value otherwise. > + */ > +static int > +rte_pmd_mvneta_probe(struct rte_vdev_device *vdev) > +{ > + struct rte_kvargs *kvlist; > + struct mvneta_ifnames ifnames; > + int ret = -EINVAL; > + uint32_t i, ifnum; > + const char *params; > + > + params = rte_vdev_device_args(vdev); > + if (!params) > + return -EINVAL; > + > + kvlist = rte_kvargs_parse(params, valid_args); > + if (!kvlist) > + return -EINVAL; > + > + ifnum = rte_kvargs_count(kvlist, MVNETA_IFACE_NAME_ARG); > + if (ifnum > RTE_DIM(ifnames.names)) > + goto out_free_kvlist; > + > + ifnames.idx = 0; > + rte_kvargs_process(kvlist, MVNETA_IFACE_NAME_ARG, > +mvneta_ifnames_get, &ifnames); > + > + /* > + * The below system initialization should be done only once, > + * on the first provided configuration file > + */ > + if (mvneta_dev_num) > + goto init_devices; > + > + MVNETA_LOG(INFO, "Perform MUSDK initializations"); > + > + ret = rte_mvep_init(MVEP_MOD_T_NETA, kvlist); Giving build error for shared libraries [1], needs to link with rte_common_mvep, In makefile needed: LDLIBS += -lrte_common_mvep, please check "mvpp2/Makefile" [1] mvneta_ethdev.o: In function `rte_pmd_mvneta_probe': mvneta_ethdev.c:(.text+0xa58): undefined reference to `rte_mvep_init' mvneta_ethdev.c:(.text+0xc98): undefined reference to `rte_mvep_deinit' mvneta_ethdev.c:(.text+0xcb4): undefined reference to `rte_mvep_deinit' mvneta_ethdev.o: In function `rte_pmd_mvneta_remove': mvneta_ethdev.c:(.text+0xe58): undefined reference to `rte_mvep_deinit'
[dpdk-dev] [PATCH v2] examples/eventdev_pipeline: add Tx adapter support
Redo the worker pipelines and offload transmission to service cores seamlessly through Tx adapter. Signed-off-by: Pavan Nikhilesh --- v2 Changes: - Updated enqueue,dequeue depth thresholds. - remove redundant capability checks. examples/eventdev_pipeline/main.c | 88 +++--- examples/eventdev_pipeline/pipeline_common.h | 31 +- .../pipeline_worker_generic.c | 268 +- .../eventdev_pipeline/pipeline_worker_tx.c| 156 +- 4 files changed, 207 insertions(+), 336 deletions(-) diff --git a/examples/eventdev_pipeline/main.c b/examples/eventdev_pipeline/main.c index 700bc696f..92e08bc0c 100644 --- a/examples/eventdev_pipeline/main.c +++ b/examples/eventdev_pipeline/main.c @@ -26,20 +26,6 @@ core_in_use(unsigned int lcore_id) { fdata->tx_core[lcore_id] || fdata->worker_core[lcore_id]); } -static void -eth_tx_buffer_retry(struct rte_mbuf **pkts, uint16_t unsent, - void *userdata) -{ - int port_id = (uintptr_t) userdata; - unsigned int _sent = 0; - - do { - /* Note: hard-coded TX queue */ - _sent += rte_eth_tx_burst(port_id, 0, &pkts[_sent], - unsent - _sent); - } while (_sent != unsent); -} - /* * Parse the coremask given as argument (hexadecimal string) and fill * the global configuration (core role and core count) with the parsed @@ -263,6 +249,7 @@ parse_app_args(int argc, char **argv) static inline int port_init(uint8_t port, struct rte_mempool *mbuf_pool) { + struct rte_eth_rxconf rx_conf; static const struct rte_eth_conf port_conf_default = { .rxmode = { .mq_mode = ETH_MQ_RX_RSS, @@ -291,6 +278,8 @@ port_init(uint8_t port, struct rte_mempool *mbuf_pool) if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE) port_conf.txmode.offloads |= DEV_TX_OFFLOAD_MBUF_FAST_FREE; + rx_conf = dev_info.default_rxconf; + rx_conf.offloads = port_conf.rxmode.offloads; port_conf.rx_adv_conf.rss_conf.rss_hf &= dev_info.flow_type_rss_offloads; @@ -311,7 +300,8 @@ port_init(uint8_t port, struct rte_mempool *mbuf_pool) /* Allocate and set up 1 RX queue per Ethernet port. */ for (q = 0; q < rx_rings; q++) { retval = rte_eth_rx_queue_setup(port, q, rx_ring_size, - rte_eth_dev_socket_id(port), NULL, mbuf_pool); + rte_eth_dev_socket_id(port), &rx_conf, + mbuf_pool); if (retval < 0) return retval; } @@ -350,7 +340,7 @@ port_init(uint8_t port, struct rte_mempool *mbuf_pool) static int init_ports(uint16_t num_ports) { - uint16_t portid, i; + uint16_t portid; if (!cdata.num_mbuf) cdata.num_mbuf = 16384 * num_ports; @@ -367,36 +357,26 @@ init_ports(uint16_t num_ports) rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu16 "\n", portid); - RTE_ETH_FOREACH_DEV(i) { - void *userdata = (void *)(uintptr_t) i; - fdata->tx_buf[i] = - rte_malloc(NULL, RTE_ETH_TX_BUFFER_SIZE(32), 0); - if (fdata->tx_buf[i] == NULL) - rte_panic("Out of memory\n"); - rte_eth_tx_buffer_init(fdata->tx_buf[i], 32); - rte_eth_tx_buffer_set_err_callback(fdata->tx_buf[i], - eth_tx_buffer_retry, - userdata); - } - return 0; } static void do_capability_setup(uint8_t eventdev_id) { + int ret; uint16_t i; - uint8_t mt_unsafe = 0; + uint8_t generic_pipeline = 0; uint8_t burst = 0; RTE_ETH_FOREACH_DEV(i) { - struct rte_eth_dev_info dev_info; - memset(&dev_info, 0, sizeof(struct rte_eth_dev_info)); - - rte_eth_dev_info_get(i, &dev_info); - /* Check if it is safe ask worker to tx. 
*/ - mt_unsafe |= !(dev_info.tx_offload_capa & - DEV_TX_OFFLOAD_MT_LOCKFREE); + uint32_t caps = 0; + + ret = rte_event_eth_tx_adapter_caps_get(eventdev_id, i, &caps); + if (ret) + rte_exit(EXIT_FAILURE, + "Invalid capability for Tx adptr port %d\n", i); + generic_pipeline |= !(caps & + RTE_EVENT_ETH_TX_ADAPTER_CAP_INTERNAL_PORT); } struct rte_event_dev_info eventdev_info; @@ -406,21 +386,42 @@ do_capability_setup(uint8_t eventdev_id) burst = eventdev_info.event_dev_cap & RTE_EVENT_DEV_CAP_BURST_MODE ? 1 : 0; - if (mt_unsafe) + if (gener
Re: [dpdk-dev] [RFC] ipsec: new library for IPsec data-path processing
Hi Akhil, > > Hi Konstantin, > > On 9/18/2018 6:12 PM, Ananyev, Konstantin wrote: > >>> I am not saying this should be the ONLY way to do as it does not work > >>> very well with non NPU/FPGA class of SoC. > >>> > >>> So how about making the proposed IPSec library as plugin/driver to > >>> rte_security. > >> As I mentioned above, I don't think that pushing whole IPSec data-path > >> into rte_security > >> is the best possible approach. > >> Though I probably understand your concern: > >> In RFC code we always do whole prepare/process in SW (attach/remove ESP > >> headers/trailers, so paddings etc.), > >> i.e. right now only device types: RTE_SECURITY_ACTION_TYPE_NONE and > >> RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO are covered. > >> Though there are devices where most of prepare/process can be done in HW > >> (RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL/RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL), > >> plus in future could be devices where prepare/process would be split > >> between HW/SW in a custom way. > >> Is that so? > >> To address that issue I suppose we can do: > >> 1. Add support for RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL and > >> RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL > >> security devices into ipsec. > >> We planned to do it anyway, just don't have it done yet. > >> 2. For custom case - introduce RTE_SECURITY_ACTION_TYPE_INLINE_CUSTOM and > RTE_SECURITY_ACTION_TYPE_LOOKASIDE_CUSTOM > >> and add into rte_security_ops new functions: > >> uint16_t lookaside_prepare(struct rte_security_session *sess, struct > >> rte_mbuf *mb[], struct struct rte_crypto_op *cop[], uint16_t > num); > >> uint16_t lookaside_process(struct rte_security_session *sess, struct > >> rte_mbuf *mb[], struct struct rte_crypto_op *cop[], uint16_t > num); > >> uint16_t inline_process(struct rte_security_session *sess, struct > >> rte_mbuf *mb[], struct struct rte_crypto_op *cop[], uint16_t num); > >> So for custom HW, PMD can overwrite normal prepare/process behavior. > >> > > Actually after another thought: > > My previous assumption (probably wrong one) was that for both > > RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL and > > RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL > > devices can do whole data-path ipsec processing totally in HW - no need for > > any SW support (except init/config). > > Now looking at dpaa and dpaa2 devices (the only ones that supports > > RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL right now) > > I am not so sure about that - looks like some SW help might be needed for > > replay window updates, etc. > > Hemant, Shreyansh - can you guys confirm what is expected from > > RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL devices > > (HW/SW roses/responsibilities)? > > About RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL - I didn't find any driver > > inside DPDK source tree that does support that > capability. > > So my question is there any devices/drivers that do support it? > > If so, where could source code could be found, and what are HW/SW > > roles/responsibilities for that type of devices? > > Konstantin > > > > > In case of LOOKASIDE, the protocol errors like antireplay and sequence > number overflow shall be the responsibility of either PMD or the HW. > It should notify the application that the error has occurred and > application need to decide what it needs to decide next. Ok, thanks for clarification. Just to confirm - do we have a defined way for it right now in rte_security? 
> > As Jerin said in other email, the roles/responsibility of the PMD in > case of inline proto and lookaside case, nothing much is required from > the application to do any processing for ipsec. > > As per my understanding, the proposed RFC is to make the application > code cleaner for the protocol processing. Yes, unified data-path API is definitely one of the main goals. > 1. For inline proto and lookaside there won't be any change in the data > path. The main changes would be in the control path. Yes, from your and Jerin description data-path processing looks really lightweight for these cases. For control path - there is no much change, user would have to call rte_ipsec_sa_init() to start using given SA. > > 2. But in case of inline crypto and RTE_SECURITY_ACTION_TYPE_NONE, the > protocol processing will be done in the library and there would be > changes in both control and data path. Yes. > > As the rte_security currently provide generic APIs for control path only > and we may have it expanded for protocol specific datapath processing. > So for the application, working with inline crypto/ inline proto would > be quite similar and it won't need to do some extra processing for > inline crypto. > Same will be the case for RTE_SECURITY_ACTION_TYPE_NONE and lookaside. > > We may have the protocol specific APIs reside inside the rte_security > and we can use either the crypto/net PMD underneath it. As I understand, you suggest instead of introducing new libr
Re: [dpdk-dev] [PATCH v2 10/12] net/mvpp2: align documentation with MUSDK 18.09
On 9/23/2018 11:40 PM, Thomas Monjalon wrote: > 19/09/2018 19:15, Ferruh Yigit: >> On 9/4/2018 2:49 PM, Tomasz Duszynski wrote: >>> From: Natalie Samsonov >>> --- a/doc/guides/nics/mvpp2.rst >>> +++ b/doc/guides/nics/mvpp2.rst >>> - git clone >>> https://github.com/MarvellEmbeddedProcessors/linux-marvell.git -b >>> linux-4.4.52-armada-17.10 >>> + git clone >>> https://github.com/MarvellEmbeddedProcessors/linux-marvell.git -b >>> linux-4.4.120-armada-18.09 >> >> There is a strict dependency to MUSDK 18.09, dpdk18.11 won't compile with >> older >> versions. It is hard to trace this dependency, what do you think having a >> matrix >> in DPDK documentation showing which DPDK version supports which MUSDK? > > It does not compile even with MUSDK 18.09. > > With MUSDK 18.09, the error is: > drivers/crypto/mvsam/rte_mrvl_pmd.c:867:26: error: 'SAM_HW_RING_NUM' > undeclared I confirm same error. I wasn't building with crypto PMD enabled so not caught it. > > The explanation is in MUSDK: > commit 9bf8b3ca4ddfa00619c0023dfb08ae1601054fce > Author: Dmitri Epshtein > Date: Mon Nov 20 10:38:31 2017 +0200 > > sam: remove SAM_HW_RING_NUM from APIs > > Use function: > u32 sam_get_num_cios(u32 inst); > > As a consequence, next-net cannot be pulled! Got it, should I drop the patchset from tree?
Re: [dpdk-dev] [PATCH v2 09/33] crypto/octeontx: adds symmetric capabilities
Hi Fiona, Can you please comment on this? We are adding all capabilities of the octeontx-crypto PMD as a macro in the otx_cryptodev_capabilities.h file and then we are using it from otx_cryptodev_ops.c. This is the approach followed by the QAT crypto PMD. As per my understanding, this is to ensure that the cryptodev_ops file remains simple. For other PMDs with a smaller number of capabilities, the structure can be populated in the .c file itself without the size of the file becoming an issue. But this would cause checkpatch to report an error. Akhil's suggestion is to move the entire definition to a header and include it from the .c file. I believe the QAT approach was to avoid a variable definition in the header. What do you think would be a better approach here? Thanks, Anoob On 17-09-2018 18:05, Joseph, Anoob wrote: Hi Akhil, On 17-09-2018 17:31, Akhil Goyal wrote: External Email diff --git a/drivers/crypto/octeontx/otx_cryptodev_ops.c b/drivers/crypto/octeontx/otx_cryptodev_ops.c index d25f9c1..cc0030e 100644 --- a/drivers/crypto/octeontx/otx_cryptodev_ops.c +++ b/drivers/crypto/octeontx/otx_cryptodev_ops.c @@ -10,9 +10,15 @@ #include "cpt_pmd_logs.h" #include "otx_cryptodev.h" +#include "otx_cryptodev_capabilities.h" #include "otx_cryptodev_hw_access.h" #include "otx_cryptodev_ops.h" +static const struct rte_cryptodev_capabilities otx_capabilities[] = { + OTX_SYM_CAPABILITIES, + RTE_CRYPTODEV_END_OF_CAPABILITIES_LIST() +}; + better to have otx_capabilities structure defined in the otx_cryptodev_capabilities.h I don't see any value addition in creating a macro in one file and using it in a separate structure in another file which doesn't have anything new in that structure. It would also give a checkpatch error. You can directly have a capability structure without the #define. This was the convention followed in the qat driver. https://git.dpdk.org/dpdk/tree/drivers/crypto/qat/qat_sym_capabilities.h I guess it was to avoid a variable definition in the header. Maybe Pablo too can comment on this. I'll make the change accordingly. Thanks, Anoob
[dpdk-dev] Decoupling rte_mbuf and the segment buffer.
I've been wondering about the possibility to decouple the mbuf header (and the private data) from the payload. If I'm reading the code correctly, right now the payload is placed (together with the header and the private data) in one memory object dequeued from the mempool. I can use a specific mempool for pktbuf_pool by using rte_pktmbuf_pool_create_by_ops(). However, there is no way to have the header and buffer go to separate mempools, is that correct? As I see that struct rte_mbuf contains a pointer to the segment buffer (both VA and PA) and rte_pktmbuf_init() initializes those pointers. So one could expect that the rest of DPDK would not depend on the fact that the segment buffer is at some offset from the mbuf header, right? I was also looking at the rte_pktmbuf_attach_extbuf() patches from April. It seems that I could spin up my own (possibly non-RTE) memory pool and then rte_mempool_obj_iter() the pktbuf_pool to attach external buffers to all the mbufs in the original pool. The obvious downside here would be that some memory would be wasted in the mbuf pool (because the segment buffer would be allocated but unused) but in principle that should also work, right? Can you think of any other obvious downsides to that approach? Regards, Wojtek
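For concreteness, here is a rough (completely untested) sketch of the attach-at-init idea. my_ext_allocator_get()/my_ext_allocator_put() and EXT_BUF_SIZE are made-up placeholders for whatever external allocator would be used; the DPDK calls themselves are the 18.05 ext-buf API.

#include <rte_mbuf.h>
#include <rte_mempool.h>
#include <rte_memory.h>

/* free callback required by the ext-buf API */
static void
my_ext_free(void *addr, void *opaque __rte_unused)
{
	my_ext_allocator_put(addr);	/* hypothetical */
}

static void
attach_ext_cb(struct rte_mempool *mp __rte_unused, void *opaque __rte_unused,
	      void *obj, unsigned int idx __rte_unused)
{
	struct rte_mbuf *m = obj;
	uint16_t buf_len = EXT_BUF_SIZE;		/* hypothetical */
	void *ext = my_ext_allocator_get(buf_len);	/* hypothetical */
	struct rte_mbuf_ext_shared_info *shinfo;

	/* shinfo is carved out of the tail of the external buffer */
	shinfo = rte_pktmbuf_ext_shinfo_init_helper(ext, &buf_len,
						    my_ext_free, ext);
	rte_pktmbuf_attach_extbuf(m, ext, rte_mem_virt2iova(ext),
				  buf_len, shinfo);
	rte_pktmbuf_reset_headroom(m);
}

/* after rte_pktmbuf_pool_create(): attach an external buffer to every mbuf */
rte_mempool_obj_iter(pktmbuf_pool, attach_ext_cb, NULL);

One open question with this sketch is whether the attachment survives the normal free/alloc cycle of the pool, since, if I read the code right, rte_pktmbuf_free() on an ext-buf mbuf detaches (and potentially frees) the external buffer.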
Re: [dpdk-dev] [PATCH v2 10/12] net/mvpp2: align documentation with MUSDK 18.09
Hi Ferruh, pon., 24 wrz 2018 o 13:38 Ferruh Yigit napisał(a): > > On 9/23/2018 11:40 PM, Thomas Monjalon wrote: > > 19/09/2018 19:15, Ferruh Yigit: > >> On 9/4/2018 2:49 PM, Tomasz Duszynski wrote: > >>> From: Natalie Samsonov > >>> --- a/doc/guides/nics/mvpp2.rst > >>> +++ b/doc/guides/nics/mvpp2.rst > >>> - git clone > >>> https://github.com/MarvellEmbeddedProcessors/linux-marvell.git -b > >>> linux-4.4.52-armada-17.10 > >>> + git clone > >>> https://github.com/MarvellEmbeddedProcessors/linux-marvell.git -b > >>> linux-4.4.120-armada-18.09 > >> > >> There is a strict dependency to MUSDK 18.09, dpdk18.11 won't compile with > >> older > >> versions. It is hard to trace this dependency, what do you think having a > >> matrix > >> in DPDK documentation showing which DPDK version supports which MUSDK? > > > > It does not compile even with MUSDK 18.09. > > > > With MUSDK 18.09, the error is: > > drivers/crypto/mvsam/rte_mrvl_pmd.c:867:26: error: 'SAM_HW_RING_NUM' > > undeclared > > I confirm same error. I wasn't building with crypto PMD enabled so not caught > it. > > > > > The explanation is in MUSDK: > > commit 9bf8b3ca4ddfa00619c0023dfb08ae1601054fce > > Author: Dmitri Epshtein > > Date: Mon Nov 20 10:38:31 2017 +0200 > > > > sam: remove SAM_HW_RING_NUM from APIs > > > > Use function: > > u32 sam_get_num_cios(u32 inst); > > > > As a consequence, next-net cannot be pulled! > > Got it, should I drop the patchset from tree? We're checking the error and will provide fix asap. Please let know if this should be another version of the entire patchset or fix on top. Sorry for the problems. Best regards, Marcin
Re: [dpdk-dev] [PATCH v2 10/12] net/mvpp2: align documentation with MUSDK 18.09
On 9/24/2018 12:51 PM, Marcin Wojtas wrote: > Hi Ferruh, > > pon., 24 wrz 2018 o 13:38 Ferruh Yigit napisał(a): >> >> On 9/23/2018 11:40 PM, Thomas Monjalon wrote: >>> 19/09/2018 19:15, Ferruh Yigit: On 9/4/2018 2:49 PM, Tomasz Duszynski wrote: > From: Natalie Samsonov > --- a/doc/guides/nics/mvpp2.rst > +++ b/doc/guides/nics/mvpp2.rst > - git clone > https://github.com/MarvellEmbeddedProcessors/linux-marvell.git -b > linux-4.4.52-armada-17.10 > + git clone > https://github.com/MarvellEmbeddedProcessors/linux-marvell.git -b > linux-4.4.120-armada-18.09 There is a strict dependency to MUSDK 18.09, dpdk18.11 won't compile with older versions. It is hard to trace this dependency, what do you think having a matrix in DPDK documentation showing which DPDK version supports which MUSDK? >>> >>> It does not compile even with MUSDK 18.09. >>> >>> With MUSDK 18.09, the error is: >>> drivers/crypto/mvsam/rte_mrvl_pmd.c:867:26: error: 'SAM_HW_RING_NUM' >>> undeclared >> >> I confirm same error. I wasn't building with crypto PMD enabled so not >> caught it. >> >>> >>> The explanation is in MUSDK: >>> commit 9bf8b3ca4ddfa00619c0023dfb08ae1601054fce >>> Author: Dmitri Epshtein >>> Date: Mon Nov 20 10:38:31 2017 +0200 >>> >>> sam: remove SAM_HW_RING_NUM from APIs >>> >>> Use function: >>> u32 sam_get_num_cios(u32 inst); >>> >>> As a consequence, next-net cannot be pulled! >> >> Got it, should I drop the patchset from tree? > > We're checking the error and will provide fix asap. Please let know if > this should be another version of the entire patchset or fix on top. There is another comment from Thomas (mvpp2_tm.png). Both "fix on top" and "new version" are OK for me, pick whichever is easier for you. For "fix on top", I will squash the fixes into the original commits, so the fixes should be separate patches with information on which commit each one targets. But the overall build should not be broken; it should be clear in which commit the dependency changed to 18.09. Let's call the commit where the switch happens X: all commits before X should compile successfully with 17.10, and commit X and all following commits should compile successfully with 18.09. > Sorry for the problems. > > Best regards, > Marcin >
Re: [dpdk-dev] [PATCH v2 10/12] net/mvpp2: align documentation with MUSDK 18.09
24/09/2018 13:36, Ferruh Yigit: > On 9/23/2018 11:40 PM, Thomas Monjalon wrote: > > As a consequence, next-net cannot be pulled! > > Got it, should I drop the patchset from tree? Yes I think it's better to re-consider this patchset later.
Re: [dpdk-dev] [PATCH v2 10/12] net/mvpp2: align documentation with MUSDK 18.09
Mon, 24 Sep 2018 at 14:44 Thomas Monjalon wrote: > > 24/09/2018 13:36, Ferruh Yigit: > > On 9/23/2018 11:40 PM, Thomas Monjalon wrote: > > > As a consequence, next-net cannot be pulled! > > > > Got it, should I drop the patchset from tree? > > Yes I think it's better to re-consider this patchset later. > > Ok, a complete new version of it will be re-sent to the lists. Best regards, Marcin
[dpdk-dev] [PATCH v2 2/3] app/testpmd: add packet dump callback functions
add new rx/tx callback functions to be used for dumping the packets. Signed-off-by: Raslan Darawsheh --- app/test-pmd/config.c | 67 ++ app/test-pmd/testpmd.h | 15 +++ app/test-pmd/util.c| 17 + 3 files changed, 99 insertions(+) diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index a0f9349..fb45fea 100644 --- a/app/test-pmd/config.c +++ b/app/test-pmd/config.c @@ -2896,6 +2896,73 @@ set_pkt_forwarding_mode(const char *fwd_mode_name) } void +add_rx_dump_callbacks(portid_t portid) +{ + struct rte_eth_dev_info dev_info; + uint16_t queue; + + if (port_id_is_invalid(portid, ENABLED_WARN)) + return; + + rte_eth_dev_info_get(portid, &dev_info); + for (queue = 0; queue < dev_info.nb_rx_queues; queue++) + if (!ports[portid].rx_dump_cb[queue]) + ports[portid].rx_dump_cb[queue] = + rte_eth_add_rx_callback(portid, queue, + dump_rx_pkts, NULL); +} + +void +add_tx_dump_callbacks(portid_t portid) +{ + struct rte_eth_dev_info dev_info; + uint16_t queue; + + if (port_id_is_invalid(portid, ENABLED_WARN)) + return; + rte_eth_dev_info_get(portid, &dev_info); + for (queue = 0; queue < dev_info.nb_tx_queues; queue++) + if (!ports[portid].tx_dump_cb[queue]) + ports[portid].tx_dump_cb[queue] = + rte_eth_add_tx_callback(portid, queue, + dump_tx_pkts, NULL); +} + +void +remove_rx_dump_callbacks(portid_t portid) +{ + struct rte_eth_dev_info dev_info; + uint16_t queue; + + if (port_id_is_invalid(portid, ENABLED_WARN)) + return; + rte_eth_dev_info_get(portid, &dev_info); + for (queue = 0; queue < dev_info.nb_rx_queues; queue++) + if (ports[portid].rx_dump_cb[queue]) { + rte_eth_remove_rx_callback(portid, queue, + ports[portid].rx_dump_cb[queue]); + ports[portid].rx_dump_cb[queue] = NULL; + } +} + +void +remove_tx_dump_callbacks(portid_t portid) +{ + struct rte_eth_dev_info dev_info; + uint16_t queue; + + if (port_id_is_invalid(portid, ENABLED_WARN)) + return; + rte_eth_dev_info_get(portid, &dev_info); + for (queue = 0; queue < dev_info.nb_tx_queues; queue++) + if (ports[portid].tx_dump_cb[queue]) { + rte_eth_remove_tx_callback(portid, queue, + ports[portid].tx_dump_cb[queue]); + ports[portid].tx_dump_cb[queue] = NULL; + } +} + +void set_verbose_level(uint16_t vb_level) { printf("Change verbose level from %u to %u\n", diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index a1f6614..c0d7656 100644 --- a/app/test-pmd/testpmd.h +++ b/app/test-pmd/testpmd.h @@ -180,6 +180,8 @@ struct rte_port { uint32_tmc_addr_nb; /**< nb. of addr. in mc_addr_pool */ uint8_t slave_flag; /**< bonding slave port */ struct port_flow*flow_list; /**< Associated flows. */ + const struct rte_eth_rxtx_callback *rx_dump_cb[MAX_QUEUE_ID+1]; + const struct rte_eth_rxtx_callback *tx_dump_cb[MAX_QUEUE_ID+1]; #ifdef SOFTNIC struct softnic_port softport; /**< softnic params */ #endif @@ -743,6 +745,19 @@ int check_nb_rxq(queueid_t rxq); queueid_t get_allowed_max_nb_txq(portid_t *pid); int check_nb_txq(queueid_t txq); + +uint16_t dump_rx_pkts(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[], + uint16_t nb_pkts, __rte_unused uint16_t max_pkts, + __rte_unused void *user_param); + +uint16_t dump_tx_pkts(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[], + uint16_t nb_pkts, __rte_unused void *user_param); + +void add_rx_dump_callbacks(portid_t portid); +void remove_rx_dump_callbacks(portid_t portid); +void add_tx_dump_callbacks(portid_t portid); +void remove_tx_dump_callbacks(portid_t portid); + /* * Work-around of a compilation error with ICC on invocations of the * rte_be_to_cpu_16() function. 
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c index 97c8349..9a40ec7 100644 --- a/app/test-pmd/util.c +++ b/app/test-pmd/util.c @@ -142,3 +142,20 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[], printf(" ol_flags: %s\n", buf); } } + +uint16_t +dump_rx_pkts(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[], +uint16_t nb_pkts, __rte_unused uint16_t max_pkts, +__rte_unused void *user_param) +{ + dump_pkt_burst(port_id, queue, pkts, nb_pkts, 1); + return nb_pkts; +} + +uint16_t +dump_tx_pkts(uint16_t po
[dpdk-dev] [PATCH v2 1/3] app/testpmd: move dumping packets to a separate function
verbosity for the received/sent packets is needed in all of the forwarding engines so moving it to be in a separate function Signed-off-by: Raslan Darawsheh --- app/test-pmd/Makefile | 1 + app/test-pmd/rxonly.c | 134 ++ app/test-pmd/util.c | 144 ++ 3 files changed, 148 insertions(+), 131 deletions(-) create mode 100644 app/test-pmd/util.c diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile index 2b4d604..e2c7845 100644 --- a/app/test-pmd/Makefile +++ b/app/test-pmd/Makefile @@ -35,6 +35,7 @@ SRCS-y += csumonly.c SRCS-y += icmpecho.c SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_cmd.c +SRCS-y += util.c ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC), y) SRCS-y += softnicfwd.c diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c index a93d806..3eca89c 100644 --- a/app/test-pmd/rxonly.c +++ b/app/test-pmd/rxonly.c @@ -40,14 +40,6 @@ #include "testpmd.h" -static inline void -print_ether_addr(const char *what, struct ether_addr *eth_addr) -{ - char buf[ETHER_ADDR_FMT_SIZE]; - ether_format_addr(buf, ETHER_ADDR_FMT_SIZE, eth_addr); - printf("%s%s", what, buf); -} - /* * Received a burst of packets. */ @@ -55,16 +47,8 @@ static void pkt_burst_receive(struct fwd_stream *fs) { struct rte_mbuf *pkts_burst[MAX_PKT_BURST]; - struct rte_mbuf *mb; - struct ether_hdr *eth_hdr; - uint16_t eth_type; - uint64_t ol_flags; uint16_t nb_rx; - uint16_t i, packet_type; - uint16_t is_encapsulation; - char buf[256]; - struct rte_net_hdr_lens hdr_lens; - uint32_t sw_packet_type; + uint16_t i; #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES uint64_t start_tsc; @@ -90,120 +74,8 @@ pkt_burst_receive(struct fwd_stream *fs) /* * Dump each received packet if verbose_level > 0. */ - if (verbose_level > 0) - printf("port %u/queue %u: received %u packets\n", - fs->rx_port, - (unsigned) fs->rx_queue, - (unsigned) nb_rx); - for (i = 0; i < nb_rx; i++) { - mb = pkts_burst[i]; - if (verbose_level == 0) { - rte_pktmbuf_free(mb); - continue; - } - eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *); - eth_type = RTE_BE_TO_CPU_16(eth_hdr->ether_type); - ol_flags = mb->ol_flags; - packet_type = mb->packet_type; - is_encapsulation = RTE_ETH_IS_TUNNEL_PKT(packet_type); - - print_ether_addr(" src=", ð_hdr->s_addr); - print_ether_addr(" - dst=", ð_hdr->d_addr); - printf(" - type=0x%04x - length=%u - nb_segs=%d", - eth_type, (unsigned) mb->pkt_len, - (int)mb->nb_segs); - if (ol_flags & PKT_RX_RSS_HASH) { - printf(" - RSS hash=0x%x", (unsigned) mb->hash.rss); - printf(" - RSS queue=0x%x",(unsigned) fs->rx_queue); - } - if (ol_flags & PKT_RX_FDIR) { - printf(" - FDIR matched "); - if (ol_flags & PKT_RX_FDIR_ID) - printf("ID=0x%x", - mb->hash.fdir.hi); - else if (ol_flags & PKT_RX_FDIR_FLX) - printf("flex bytes=0x%08x %08x", - mb->hash.fdir.hi, mb->hash.fdir.lo); - else - printf("hash=0x%x ID=0x%x ", - mb->hash.fdir.hash, mb->hash.fdir.id); - } - if (ol_flags & PKT_RX_TIMESTAMP) - printf(" - timestamp %"PRIu64" ", mb->timestamp); - if (ol_flags & PKT_RX_VLAN_STRIPPED) - printf(" - VLAN tci=0x%x", mb->vlan_tci); - if (ol_flags & PKT_RX_QINQ_STRIPPED) - printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x", - mb->vlan_tci, mb->vlan_tci_outer); - if (mb->packet_type) { - rte_get_ptype_name(mb->packet_type, buf, sizeof(buf)); - printf(" - hw ptype: %s", buf); - } - sw_packet_type = rte_net_get_ptype(mb, &hdr_lens, - RTE_PTYPE_ALL_MASK); - rte_get_ptype_name(sw_packet_type, buf, sizeof(buf)); - printf(" - sw ptype: %s", buf); - if (sw_packet_type & RTE_PTYPE_L2_MASK) - printf(" - l2_len=%d", hdr_lens.l2_len); - 
if (sw_packet_type & RTE_PTYPE_L3_MASK) - printf(" - l3_len=%d", hdr_lens.l3_len); - if (sw_packet_typ
[dpdk-dev] [PATCH v2 3/3] app/testpmd: set packet dump based on verbosity level
when changing verbosity level it will configure rx/tx callbacks to dump packets based on the verbosity value as following: 1- dump only received packets: testpmd> set verbose 1 2- dump only sent packets: testpmd> set verbose 2 3- dump sent and received packets: testpmd> set verbose (any number > 2) 4- disable dump testpmd> set verbose 0 Signed-off-by: Raslan Darawsheh --- app/test-pmd/config.c | 25 + app/test-pmd/testpmd.c | 4 ++-- app/test-pmd/testpmd.h | 1 + 3 files changed, 28 insertions(+), 2 deletions(-) diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index fb45fea..f402e04 100644 --- a/app/test-pmd/config.c +++ b/app/test-pmd/config.c @@ -50,6 +50,7 @@ #endif #include #include +#include #include "testpmd.h" @@ -2963,11 +2964,35 @@ remove_tx_dump_callbacks(portid_t portid) } void +configure_rxtx_dump_callbacks(uint16_t verbose) +{ + portid_t portid; + +#ifndef RTE_ETHDEV_RXTX_CALLBACKS + TESTPMD_LOG(ERR, "setting rxtx callbacks is not enabled\n"); + return; +#endif + + RTE_ETH_FOREACH_DEV(portid) + { + if (verbose == 1 || verbose > 2) + add_rx_dump_callbacks(portid); + else + remove_rx_dump_callbacks(portid); + if (verbose >= 2) + add_tx_dump_callbacks(portid); + else + remove_tx_dump_callbacks(portid); + } +} + +void set_verbose_level(uint16_t vb_level) { printf("Change verbose level from %u to %u\n", (unsigned int) verbose_level, (unsigned int) vb_level); verbose_level = vb_level; + configure_rxtx_dump_callbacks(verbose_level); } void diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index 571ecb4..538723c 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -1665,7 +1665,7 @@ start_port(portid_t pid) return -1; } } - + configure_rxtx_dump_callbacks(0); printf("Configuring Port %d (socket %u)\n", pi, port->socket_id); /* configure port */ @@ -1764,7 +1764,7 @@ start_port(portid_t pid) return -1; } } - + configure_rxtx_dump_callbacks(verbose_level); /* start port */ if (rte_eth_dev_start(pi) < 0) { printf("Fail to start port %d\n", pi); diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index c0d7656..e68710d 100644 --- a/app/test-pmd/testpmd.h +++ b/app/test-pmd/testpmd.h @@ -757,6 +757,7 @@ void add_rx_dump_callbacks(portid_t portid); void remove_rx_dump_callbacks(portid_t portid); void add_tx_dump_callbacks(portid_t portid); void remove_tx_dump_callbacks(portid_t portid); +void configure_rxtx_dump_callbacks(uint16_t verbose); /* * Work-around of a compilation error with ICC on invocations of the -- 2.7.4
Re: [dpdk-dev] [PATCH v2 10/12] net/mvpp2: align documentation with MUSDK 18.09
On 9/24/2018 1:48 PM, Marcin Wojtas wrote: > pon., 24 wrz 2018 o 14:44 Thomas Monjalon napisał(a): >> >> 24/09/2018 13:36, Ferruh Yigit: >>> On 9/23/2018 11:40 PM, Thomas Monjalon wrote: As a consequence, next-net cannot be pulled! >>> >>> Got it, should I drop the patchset from tree? >> >> Yes I think it's better to re-consider this patchset later. >> >> > > Ok, complete, new version of it will be re-sent to the lists. OK, patch will be dropped from the next-net for now. > > Best regards, > Marcin >
Re: [dpdk-dev] [PATCH] latencystats: fix timestamp marking and latency calculation
Hi, > -Original Message- > From: long...@viettel.com.vn [mailto:long...@viettel.com.vn] > Sent: Saturday, September 22, 2018 3:58 AM > To: Pattan, Reshma ; Ananyev, Konstantin > ; dev@dpdk.org > Subject: RE: [PATCH] latencystats: fix timestamp marking and latency > calculation > > Hi Reshma, > > > -Original Message- > > From: reshma.pat...@intel.com [mailto:reshma.pat...@intel.com] > > Sent: Friday, September 21, 2018 11:02 PM > > To: long...@viettel.com.vn; konstantin.anan...@intel.com; > dev@dpdk.org > > Cc: Reshma Pattan > > Subject: [PATCH] latencystats: fix timestamp marking and latency > calculation > > > > Latency calculation logic is not correct for the case where packets > > gets dropped before TX. As for the dropped packets, the timestamp is > > not cleared, and such packets still gets counted for latency > > calculation in > next > > runs, that will result in inaccurate latency measurement. > > > > So fix this issue as below, > > > > Before setting timestamp in mbuf, check mbuf don't have any prior > > valid time stamp flag set and after marking the timestamp, set mbuf > > flags to indicate timestamp is valid. > > > > Before calculating timestamp check mbuf flags are set to indicate > timestamp > > is valid. > > > > This solution as suggested by Konstantin is great. Not only does it solve the > problem but also now the usage of mbuf->timestamp is not exclusive to > latencystats anymore. The application can make use of timestamp at the > same as latencystats simply by toggling PKT_RX_TIMESTAMP. I think we > should update the doc to include this information. > Do you mean latency stats document? Or Mbuf doc. Thanks, Reshma
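For illustration, a minimal (untested) sketch of the usage described above: an application stamping packets itself from its own Rx callback so that, with the fix applied, latencystats leaves those packets alone. app_stamp_cb and the port/queue ids are assumptions, and it also assumes this callback runs before the latencystats one.

static uint16_t
app_stamp_cb(uint16_t port_id, uint16_t queue, struct rte_mbuf **pkts,
	     uint16_t nb_pkts, uint16_t max_pkts __rte_unused,
	     void *user_param __rte_unused)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		/* application-provided timestamp, flagged as valid */
		pkts[i]->timestamp = rte_rdtsc();
		pkts[i]->ol_flags |= PKT_RX_TIMESTAMP;
	}
	return nb_pkts;
}

/* registered on each Rx queue used by the application, e.g. queue 0 */
rte_eth_add_rx_callback(port_id, 0, app_stamp_cb, NULL);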
Re: [dpdk-dev] [PATCH v2 10/12] net/mvpp2: align documentation with MUSDK 18.09
On 24.09.2018 14:50, Ferruh Yigit wrote: > On 9/24/2018 1:48 PM, Marcin Wojtas wrote: >> pon., 24 wrz 2018 o 14:44 Thomas Monjalon napisał(a): >>> >>> 24/09/2018 13:36, Ferruh Yigit: On 9/23/2018 11:40 PM, Thomas Monjalon wrote: > As a consequence, next-net cannot be pulled! Got it, should I drop the patchset from tree? >>> >>> Yes I think it's better to re-consider this patchset later. >>> >>> >> >> Ok, complete, new version of it will be re-sent to the lists. > > OK, patch will be dropped from the next-net for now. I will provide the new patchset shortly with following patch: http://patches.dpdk.org/patch/44255/ already "squashed"/applied. This is the patch for updating mvsam to MUSDK 18.09 which solves the problem you experience but I guess it goes to next-crypto. That way I hope there would be no problem with merging next-crypto and next-net will be able to compile. Best regards Andrzej
[dpdk-dev] [PATCH] ethdev: fix error handling logic
This patch fixes how function exit is handled when errors inside rte_eth_dev_create. Fixes: e489007a411c ("ethdev: add generic create/destroy ethdev APIs") Cc: sta...@dpdk.org Signed-off-by: Alejandro Lucero --- lib/librte_ethdev/rte_ethdev.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c index aa7730c..ef99f70 100644 --- a/lib/librte_ethdev/rte_ethdev.c +++ b/lib/librte_ethdev/rte_ethdev.c @@ -3467,10 +3467,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx, if (rte_eal_process_type() == RTE_PROC_PRIMARY) { ethdev = rte_eth_dev_allocate(name); - if (!ethdev) { - retval = -ENODEV; - goto probe_failed; - } + if (!ethdev) + return -ENODEV; if (priv_data_size) { ethdev->data->dev_private = rte_zmalloc_socket( @@ -3480,7 +3478,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx, if (!ethdev->data->dev_private) { RTE_LOG(ERR, EAL, "failed to allocate private data"); retval = -ENOMEM; - goto probe_failed; + goto data_alloc_failed; } } } else { @@ -3488,8 +3486,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx, if (!ethdev) { RTE_LOG(ERR, EAL, "secondary process attach failed, " "ethdev doesn't exist"); - retval = -ENODEV; - goto probe_failed; + return -ENODEV; } } @@ -3518,6 +3515,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx, if (rte_eal_process_type() == RTE_PROC_PRIMARY) rte_free(ethdev->data->dev_private); +data_alloc_failed: rte_eth_dev_release_port(ethdev); return retval; -- 1.9.1
[dpdk-dev] [PATCH 03/11] net/sfc/base: adjust PHY module info interface
From: Richard Houldsworth Adjust data types in interface to permit the complete module information buffer to be obtained in a single call. Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/efx.h | 10 -- drivers/net/sfc/base/efx_mcdi.c | 12 +++- drivers/net/sfc/base/efx_mcdi.h | 4 ++-- drivers/net/sfc/base/efx_phy.c | 7 --- 4 files changed, 21 insertions(+), 12 deletions(-) diff --git a/drivers/net/sfc/base/efx.h b/drivers/net/sfc/base/efx.h index 9eefda6a5..a8c3ae301 100644 --- a/drivers/net/sfc/base/efx.h +++ b/drivers/net/sfc/base/efx.h @@ -1051,12 +1051,18 @@ efx_phy_media_type_get( */ #defineEFX_PHY_MEDIA_INFO_DEV_ADDR_QSFP0xA0 +/* + * Maximum accessible data offset for PHY module information. + */ +#defineEFX_PHY_MEDIA_INFO_MAX_OFFSET 0x100 + + extern __checkReturn efx_rc_t efx_phy_module_get_info( __inefx_nic_t *enp, __inuint8_t dev_addr, - __inuint8_t offset, - __inuint8_t len, + __insize_t offset, + __insize_t len, __out_bcount(len) uint8_t *data); #if EFSYS_OPT_PHY_STATS diff --git a/drivers/net/sfc/base/efx_mcdi.c b/drivers/net/sfc/base/efx_mcdi.c index 01cc64e2e..f53be015b 100644 --- a/drivers/net/sfc/base/efx_mcdi.c +++ b/drivers/net/sfc/base/efx_mcdi.c @@ -2214,8 +2214,8 @@ efx_mcdi_get_phy_media_info( efx_mcdi_phy_module_get_info( __inefx_nic_t *enp, __inuint8_t dev_addr, - __inuint8_t offset, - __inuint8_t len, + __insize_t offset, + __insize_t len, __out_bcount(len) uint8_t *data) { efx_port_t *epp = &(enp->en_port); @@ -2296,12 +2296,14 @@ efx_mcdi_phy_module_get_info( goto fail1; } + EFX_STATIC_ASSERT(EFX_PHY_MEDIA_INFO_PAGE_SIZE <= 0xFF); + if (offset < EFX_PHY_MEDIA_INFO_PAGE_SIZE) { - uint8_t read_len = + size_t read_len = MIN(len, EFX_PHY_MEDIA_INFO_PAGE_SIZE - offset); rc = efx_mcdi_get_phy_media_info(enp, - mcdi_lower_page, offset, read_len, data); + mcdi_lower_page, (uint8_t)offset, (uint8_t)read_len, data); if (rc != 0) goto fail2; @@ -2318,7 +2320,7 @@ efx_mcdi_phy_module_get_info( EFSYS_ASSERT3U(offset, <, EFX_PHY_MEDIA_INFO_PAGE_SIZE); rc = efx_mcdi_get_phy_media_info(enp, - mcdi_upper_page, offset, len, data); + mcdi_upper_page, (uint8_t)offset, (uint8_t)len, data); if (rc != 0) goto fail3; } diff --git a/drivers/net/sfc/base/efx_mcdi.h b/drivers/net/sfc/base/efx_mcdi.h index 40072405e..ddf91c111 100644 --- a/drivers/net/sfc/base/efx_mcdi.h +++ b/drivers/net/sfc/base/efx_mcdi.h @@ -219,8 +219,8 @@ extern __checkReturn efx_rc_t efx_mcdi_phy_module_get_info( __inefx_nic_t *enp, __inuint8_t dev_addr, - __inuint8_t offset, - __inuint8_t len, + __insize_t offset, + __insize_t len, __out_bcount(len) uint8_t *data); #defineMCDI_IN(_emr, _type, _ofst) \ diff --git a/drivers/net/sfc/base/efx_phy.c b/drivers/net/sfc/base/efx_phy.c index e78d6efcb..63b89e6a4 100644 --- a/drivers/net/sfc/base/efx_phy.c +++ b/drivers/net/sfc/base/efx_phy.c @@ -288,8 +288,8 @@ efx_phy_media_type_get( efx_phy_module_get_info( __inefx_nic_t *enp, __inuint8_t dev_addr, - __inuint8_t offset, - __inuint8_t len, + __insize_t offset, + __insize_t len, __out_bcount(len) uint8_t *data) { efx_rc_t rc; @@ -297,7 +297,8 @@ efx_phy_module_get_info( EFSYS_ASSERT3U(enp->en_magic, ==, EFX_NIC_MAGIC); EFSYS_ASSERT(data != NULL); - if ((uint32_t)offset + len > 0x100) { + if ((offset > EFX_PHY_MEDIA_INFO_MAX_OFFSET) || + ((offset + len) > EFX_PHY_MEDIA_INFO_MAX_OFFSET)) { rc = EINVAL; goto fail1; } -- 2.17.1
[dpdk-dev] [PATCH 00/11] net/sfc: update base driver to support 50G and 100G
Add base driver patches to support 50G and 100G XtremeScale X2 family adapters. In this particular case it looks better to have a separate patch which updates the documentation (as a cut line which summarizes the result). There are a few checkpatches.sh warnings due to coding style differences in the base driver. Andrew Rybchenko (1): net/sfc: add 50G and 100G XtremeScale X2 family adapters Richard Houldsworth (9): net/sfc/base: make last byte of module information available net/sfc/base: expose PHY module device address constants net/sfc/base: adjust PHY module info interface net/sfc/base: update to current port mode terminology net/sfc/base: add X2 port modes to bandwidth calculator net/sfc/base: support improvements to bandwidth calculations net/sfc/base: infer port mode bandwidth from max link speed net/sfc/base: add accessor to whole link status net/sfc/base: use transceiver ID when reading info Tom Millington (1): net/sfc/base: guard Rx scale code with corresponding option doc/guides/nics/sfc_efx.rst | 4 + drivers/net/sfc/base/ef10_impl.h| 12 +-- drivers/net/sfc/base/ef10_mac.c | 6 +- drivers/net/sfc/base/ef10_nic.c | 122 +--- drivers/net/sfc/base/ef10_phy.c | 24 -- drivers/net/sfc/base/efx.h | 48 ++- drivers/net/sfc/base/efx_impl.h | 2 +- drivers/net/sfc/base/efx_mcdi.c | 80 ++ drivers/net/sfc/base/efx_mcdi.h | 4 +- drivers/net/sfc/base/efx_phy.c | 37 +++-- drivers/net/sfc/base/hunt_nic.c | 13 +-- drivers/net/sfc/base/medford2_nic.c | 12 +-- drivers/net/sfc/base/medford_nic.c | 12 +-- drivers/net/sfc/base/siena_nic.c| 2 + 14 files changed, 253 insertions(+), 125 deletions(-) -- 2.17.1
[dpdk-dev] [PATCH 04/11] net/sfc/base: update to current port mode terminology
From: Richard Houldsworth >From Medford onwards, the newer constants enumerating port modes should be used. Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/ef10_nic.c | 41 ++--- 1 file changed, 22 insertions(+), 19 deletions(-) diff --git a/drivers/net/sfc/base/ef10_nic.c b/drivers/net/sfc/base/ef10_nic.c index b54cd3940..c3634e351 100644 --- a/drivers/net/sfc/base/ef10_nic.c +++ b/drivers/net/sfc/base/ef10_nic.c @@ -135,26 +135,29 @@ ef10_nic_get_port_mode_bandwidth( efx_rc_t rc; switch (port_mode) { - case TLV_PORT_MODE_10G: + case TLV_PORT_MODE_1x1_NA: /* mode 0 */ bandwidth = 1; break; - case TLV_PORT_MODE_10G_10G: + case TLV_PORT_MODE_1x1_1x1: /* mode 2 */ bandwidth = 1 * 2; break; - case TLV_PORT_MODE_10G_10G_10G_10G: - case TLV_PORT_MODE_10G_10G_10G_10G_Q: - case TLV_PORT_MODE_10G_10G_10G_10G_Q1_Q2: - case TLV_PORT_MODE_10G_10G_10G_10G_Q2: + case TLV_PORT_MODE_4x1_NA: /* mode 4 */ + case TLV_PORT_MODE_2x1_2x1: /* mode 5 */ + case TLV_PORT_MODE_NA_4x1: /* mode 8 */ bandwidth = 1 * 4; break; - case TLV_PORT_MODE_40G: + /* Legacy Medford-only mode. Do not use (see bug63270) */ + case TLV_PORT_MODE_10G_10G_10G_10G_Q1_Q2: /* mode 9 */ + bandwidth = 1 * 4; + break; + case TLV_PORT_MODE_1x4_NA: /* mode 1 */ bandwidth = 4; break; - case TLV_PORT_MODE_40G_40G: + case TLV_PORT_MODE_1x4_1x4: /* mode 3 */ bandwidth = 4 * 2; break; - case TLV_PORT_MODE_40G_10G_10G: - case TLV_PORT_MODE_10G_10G_40G: + case TLV_PORT_MODE_1x4_2x1: /* mode 6 */ + case TLV_PORT_MODE_2x1_1x4: /* mode 7 */ bandwidth = 4 + (1 * 2); break; default: @@ -1468,8 +1471,8 @@ static struct ef10_external_port_map_s { */ { EFX_FAMILY_MEDFORD, - (1U << TLV_PORT_MODE_10G) | /* mode 0 */ - (1U << TLV_PORT_MODE_10G_10G), /* mode 2 */ + (1U << TLV_PORT_MODE_1x1_NA) | /* mode 0 */ + (1U << TLV_PORT_MODE_1x1_1x1), /* mode 2 */ 1, /* ports per cage */ 1 /* first cage */ }, @@ -1483,10 +1486,10 @@ static struct ef10_external_port_map_s { */ { EFX_FAMILY_MEDFORD, - (1U << TLV_PORT_MODE_40G) | /* mode 1 */ - (1U << TLV_PORT_MODE_40G_40G) | /* mode 3 */ - (1U << TLV_PORT_MODE_40G_10G_10G) | /* mode 6 */ - (1U << TLV_PORT_MODE_10G_10G_40G) | /* mode 7 */ + (1U << TLV_PORT_MODE_1x4_NA) | /* mode 1 */ + (1U << TLV_PORT_MODE_1x4_1x4) | /* mode 3 */ + (1U << TLV_PORT_MODE_1x4_2x1) | /* mode 6 */ + (1U << TLV_PORT_MODE_2x1_1x4) | /* mode 7 */ /* Do not use 10G_10G_10G_10G_Q1_Q2 (see bug63270) */ (1U << TLV_PORT_MODE_10G_10G_10G_10G_Q1_Q2),/* mode 9 */ 2, /* ports per cage */ @@ -1502,9 +1505,9 @@ static struct ef10_external_port_map_s { */ { EFX_FAMILY_MEDFORD, - (1U << TLV_PORT_MODE_10G_10G_10G_10G_Q) | /* mode 5 */ + (1U << TLV_PORT_MODE_2x1_2x1) | /* mode 5 */ /* Do not use 10G_10G_10G_10G_Q1 (see bug63270) */ - (1U << TLV_PORT_MODE_10G_10G_10G_10G_Q1), /* mode 4 */ + (1U << TLV_PORT_MODE_4x1_NA), /* mode 4 */ 4, /* ports per cage */ 1 /* first cage */ }, @@ -1518,7 +1521,7 @@ static struct ef10_external_port_map_s { */ { EFX_FAMILY_MEDFORD, - (1U << TLV_PORT_MODE_10G_10G_10G_10G_Q2), /* mode 8 */ + (1U << TLV_PORT_MODE_NA_4x1), /* mode 8 */ 4, /* ports per cage */ 2 /* first cage */ }, -- 2.17.1
[dpdk-dev] [PATCH 02/11] net/sfc/base: expose PHY module device address constants
From: Richard Houldsworth Rearrange so the valid addresses are visible to the caller. Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/efx.h | 21 + drivers/net/sfc/base/efx_mcdi.c | 21 - 2 files changed, 21 insertions(+), 21 deletions(-) diff --git a/drivers/net/sfc/base/efx.h b/drivers/net/sfc/base/efx.h index fd68d69c7..9eefda6a5 100644 --- a/drivers/net/sfc/base/efx.h +++ b/drivers/net/sfc/base/efx.h @@ -1030,6 +1030,27 @@ efx_phy_media_type_get( __inefx_nic_t *enp, __out efx_phy_media_type_t *typep); +/* + * 2-wire device address of the base information in accordance with SFF-8472 + * Diagnostic Monitoring Interface for Optical Transceivers section + * 4 Memory Organization. + */ +#defineEFX_PHY_MEDIA_INFO_DEV_ADDR_SFP_BASE0xA0 + +/* + * 2-wire device address of the digital diagnostics monitoring interface + * in accordance with SFF-8472 Diagnostic Monitoring Interface for Optical + * Transceivers section 4 Memory Organization. + */ +#defineEFX_PHY_MEDIA_INFO_DEV_ADDR_SFP_DDM 0xA2 + +/* + * Hard wired 2-wire device address for QSFP+ in accordance with SFF-8436 + * QSFP+ 10 Gbs 4X PLUGGABLE TRANSCEIVER section 7.4 Device Addressing and + * Operation. + */ +#defineEFX_PHY_MEDIA_INFO_DEV_ADDR_QSFP0xA0 + extern __checkReturn efx_rc_t efx_phy_module_get_info( __inefx_nic_t *enp, diff --git a/drivers/net/sfc/base/efx_mcdi.c b/drivers/net/sfc/base/efx_mcdi.c index c8d670c23..01cc64e2e 100644 --- a/drivers/net/sfc/base/efx_mcdi.c +++ b/drivers/net/sfc/base/efx_mcdi.c @@ -2210,27 +2210,6 @@ efx_mcdi_get_phy_media_info( return (rc); } -/* - * 2-wire device address of the base information in accordance with SFF-8472 - * Diagnostic Monitoring Interface for Optical Transceivers section - * 4 Memory Organization. - */ -#defineEFX_PHY_MEDIA_INFO_DEV_ADDR_SFP_BASE0xA0 - -/* - * 2-wire device address of the digital diagnostics monitoring interface - * in accordance with SFF-8472 Diagnostic Monitoring Interface for Optical - * Transceivers section 4 Memory Organization. - */ -#defineEFX_PHY_MEDIA_INFO_DEV_ADDR_SFP_DDM 0xA2 - -/* - * Hard wired 2-wire device address for QSFP+ in accordance with SFF-8436 - * QSFP+ 10 Gbs 4X PLUGGABLE TRANSCEIVER section 7.4 Device Addressing and - * Operation. - */ -#defineEFX_PHY_MEDIA_INFO_DEV_ADDR_QSFP0xA0 - __checkReturn efx_rc_t efx_mcdi_phy_module_get_info( __inefx_nic_t *enp, -- 2.17.1
[dpdk-dev] [PATCH 01/11] net/sfc/base: make last byte of module information available
From: Richard Houldsworth Adjust bounds so the interface supports reading the last available byte of data. Fixes: 19b64c6ac35f ("net/sfc/base: import libefx base") Cc: sta...@dpdk.org Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/efx_phy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/sfc/base/efx_phy.c b/drivers/net/sfc/base/efx_phy.c index 25059dfe1..e78d6efcb 100644 --- a/drivers/net/sfc/base/efx_phy.c +++ b/drivers/net/sfc/base/efx_phy.c @@ -297,7 +297,7 @@ efx_phy_module_get_info( EFSYS_ASSERT3U(enp->en_magic, ==, EFX_NIC_MAGIC); EFSYS_ASSERT(data != NULL); - if ((uint32_t)offset + len > 0xff) { + if ((uint32_t)offset + len > 0x100) { rc = EINVAL; goto fail1; } -- 2.17.1
[dpdk-dev] [PATCH 05/11] net/sfc/base: add X2 port modes to bandwidth calculator
From: Richard Houldsworth Add cases for the new port modes supported by X2 NICs. Lane bandwidth is calculated for pre-X2 cards so is an underestimate for X2 in 25G/100G modes. Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/ef10_nic.c | 43 ++--- 1 file changed, 34 insertions(+), 9 deletions(-) diff --git a/drivers/net/sfc/base/ef10_nic.c b/drivers/net/sfc/base/ef10_nic.c index c3634e351..1eea7c673 100644 --- a/drivers/net/sfc/base/ef10_nic.c +++ b/drivers/net/sfc/base/ef10_nic.c @@ -131,34 +131,59 @@ ef10_nic_get_port_mode_bandwidth( __inuint32_t port_mode, __out uint32_t *bandwidth_mbpsp) { + uint32_t single_lane = 1; + uint32_t dual_lane = 5; + uint32_t quad_lane = 4; uint32_t bandwidth; efx_rc_t rc; switch (port_mode) { case TLV_PORT_MODE_1x1_NA: /* mode 0 */ - bandwidth = 1; + bandwidth = single_lane; + break; + case TLV_PORT_MODE_1x2_NA: /* mode 10 */ + case TLV_PORT_MODE_NA_1x2: /* mode 11 */ + bandwidth = dual_lane; break; case TLV_PORT_MODE_1x1_1x1: /* mode 2 */ - bandwidth = 1 * 2; + bandwidth = single_lane + single_lane; break; case TLV_PORT_MODE_4x1_NA: /* mode 4 */ - case TLV_PORT_MODE_2x1_2x1: /* mode 5 */ case TLV_PORT_MODE_NA_4x1: /* mode 8 */ - bandwidth = 1 * 4; + bandwidth = 4 * single_lane; + break; + case TLV_PORT_MODE_2x1_2x1: /* mode 5 */ + bandwidth = (2 * single_lane) + (2 * single_lane); + break; + case TLV_PORT_MODE_1x2_1x2: /* mode 12 */ + bandwidth = dual_lane + dual_lane; + break; + case TLV_PORT_MODE_1x2_2x1: /* mode 17 */ + case TLV_PORT_MODE_2x1_1x2: /* mode 18 */ + bandwidth = dual_lane + (2 * single_lane); break; /* Legacy Medford-only mode. Do not use (see bug63270) */ case TLV_PORT_MODE_10G_10G_10G_10G_Q1_Q2: /* mode 9 */ - bandwidth = 1 * 4; + bandwidth = 4 * single_lane; break; case TLV_PORT_MODE_1x4_NA: /* mode 1 */ - bandwidth = 4; + case TLV_PORT_MODE_NA_1x4: /* mode 22 */ + bandwidth = quad_lane; break; - case TLV_PORT_MODE_1x4_1x4: /* mode 3 */ - bandwidth = 4 * 2; + case TLV_PORT_MODE_2x2_NA: /* mode 13 */ + case TLV_PORT_MODE_NA_2x2: /* mode 14 */ + bandwidth = 2 * dual_lane; break; case TLV_PORT_MODE_1x4_2x1: /* mode 6 */ case TLV_PORT_MODE_2x1_1x4: /* mode 7 */ - bandwidth = 4 + (1 * 2); + bandwidth = quad_lane + (2 * single_lane); + break; + case TLV_PORT_MODE_1x4_1x2: /* mode 15 */ + case TLV_PORT_MODE_1x2_1x4: /* mode 16 */ + bandwidth = quad_lane + dual_lane; + break; + case TLV_PORT_MODE_1x4_1x4: /* mode 3 */ + bandwidth = quad_lane + quad_lane; break; default: rc = EINVAL; -- 2.17.1
[dpdk-dev] [PATCH 07/11] net/sfc/base: infer port mode bandwidth from max link speed
From: Richard Houldsworth Limit the port mode bandwidth calculations by the maximum reported link speed. This system detects 25G vs 10G cards, and 100G port modes vs 40G. Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/ef10_nic.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/net/sfc/base/ef10_nic.c b/drivers/net/sfc/base/ef10_nic.c index 8cd76d690..c197ff957 100644 --- a/drivers/net/sfc/base/ef10_nic.c +++ b/drivers/net/sfc/base/ef10_nic.c @@ -133,9 +133,11 @@ ef10_nic_get_port_mode_bandwidth( { uint32_t port_modes; uint32_t current_mode; - uint32_t single_lane = 1; - uint32_t dual_lane = 5; - uint32_t quad_lane = 4; + efx_port_t *epp = &(enp->en_port); + + uint32_t single_lane; + uint32_t dual_lane; + uint32_t quad_lane; uint32_t bandwidth; efx_rc_t rc; @@ -145,6 +147,21 @@ ef10_nic_get_port_mode_bandwidth( goto fail1; } + if (epp->ep_phy_cap_mask & (1 << EFX_PHY_CAP_25000FDX)) + single_lane = 25000; + else + single_lane = 1; + + if (epp->ep_phy_cap_mask & (1 << EFX_PHY_CAP_5FDX)) + dual_lane = 5; + else + dual_lane = 2; + + if (epp->ep_phy_cap_mask & (1 << EFX_PHY_CAP_10FDX)) + quad_lane = 10; + else + quad_lane = 4; + switch (current_mode) { case TLV_PORT_MODE_1x1_NA: /* mode 0 */ bandwidth = single_lane; -- 2.17.1
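The idea, sketched with assumed Mbps values (only EFX_PHY_CAP_25000FDX is taken from the diff above, the constants are illustrative): pick the per-lane speed from the advertised PHY capability mask, then sum lanes per port mode.

/* Sketch only; the Mbps constants are assumptions used for illustration. */
static uint32_t
lane_speed_mbps(uint32_t phy_cap_mask)
{
	if (phy_cap_mask & (1u << EFX_PHY_CAP_25000FDX))
		return (25000);		/* lanes can run at 25G */
	return (10000);			/* otherwise assume 10G lanes */
}

static uint32_t
mode_2x1_2x1_bandwidth_mbps(uint32_t phy_cap_mask)
{
	uint32_t single_lane = lane_speed_mbps(phy_cap_mask);

	/* Two ports with two single lanes each. */
	return ((2 * single_lane) + (2 * single_lane));
}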
[dpdk-dev] [PATCH 06/11] net/sfc/base: support improvements to bandwidth calculations
From: Richard Houldsworth Change the interface to ef10_nic_get_port_mode_bandwidth() so more NIC information can be used to infer bandwidth requirements. Huntington calculations separated out completely. Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/ef10_impl.h| 2 +- drivers/net/sfc/base/ef10_nic.c | 16 +--- drivers/net/sfc/base/hunt_nic.c | 13 +++-- drivers/net/sfc/base/medford2_nic.c | 12 +--- drivers/net/sfc/base/medford_nic.c | 12 +--- 5 files changed, 19 insertions(+), 36 deletions(-) diff --git a/drivers/net/sfc/base/ef10_impl.h b/drivers/net/sfc/base/ef10_impl.h index b72e7d256..e43e26e68 100644 --- a/drivers/net/sfc/base/ef10_impl.h +++ b/drivers/net/sfc/base/ef10_impl.h @@ -1174,7 +1174,7 @@ efx_mcdi_get_port_modes( extern __checkReturn efx_rc_t ef10_nic_get_port_mode_bandwidth( - __inuint32_t port_mode, + __inefx_nic_t *enp, __out uint32_t *bandwidth_mbpsp); extern __checkReturn efx_rc_t diff --git a/drivers/net/sfc/base/ef10_nic.c b/drivers/net/sfc/base/ef10_nic.c index 1eea7c673..8cd76d690 100644 --- a/drivers/net/sfc/base/ef10_nic.c +++ b/drivers/net/sfc/base/ef10_nic.c @@ -128,16 +128,24 @@ efx_mcdi_get_port_modes( __checkReturn efx_rc_t ef10_nic_get_port_mode_bandwidth( - __inuint32_t port_mode, + __inefx_nic_t *enp, __out uint32_t *bandwidth_mbpsp) { + uint32_t port_modes; + uint32_t current_mode; uint32_t single_lane = 1; uint32_t dual_lane = 5; uint32_t quad_lane = 4; uint32_t bandwidth; efx_rc_t rc; - switch (port_mode) { + if ((rc = efx_mcdi_get_port_modes(enp, &port_modes, + ¤t_mode, NULL)) != 0) { + /* No port mode info available. */ + goto fail1; + } + + switch (current_mode) { case TLV_PORT_MODE_1x1_NA: /* mode 0 */ bandwidth = single_lane; break; @@ -187,13 +195,15 @@ ef10_nic_get_port_mode_bandwidth( break; default: rc = EINVAL; - goto fail1; + goto fail2; } *bandwidth_mbpsp = bandwidth; return (0); +fail2: + EFSYS_PROBE(fail2); fail1: EFSYS_PROBE1(fail1, efx_rc_t, rc); diff --git a/drivers/net/sfc/base/hunt_nic.c b/drivers/net/sfc/base/hunt_nic.c index 70c042f3f..ca30e90f7 100644 --- a/drivers/net/sfc/base/hunt_nic.c +++ b/drivers/net/sfc/base/hunt_nic.c @@ -20,7 +20,6 @@ hunt_nic_get_required_pcie_bandwidth( __out uint32_t *bandwidth_mbpsp) { uint32_t port_modes; - uint32_t max_port_mode; uint32_t bandwidth; efx_rc_t rc; @@ -47,17 +46,13 @@ hunt_nic_get_required_pcie_bandwidth( goto fail1; } else { if (port_modes & (1U << TLV_PORT_MODE_40G)) { - max_port_mode = TLV_PORT_MODE_40G; + bandwidth = 4; } else if (port_modes & (1U << TLV_PORT_MODE_10G_10G_10G_10G)) { - max_port_mode = TLV_PORT_MODE_10G_10G_10G_10G; + bandwidth = 4 * 1; } else { /* Assume two 10G ports */ - max_port_mode = TLV_PORT_MODE_10G_10G; + bandwidth = 2 * 1; } - - if ((rc = ef10_nic_get_port_mode_bandwidth(max_port_mode, - &bandwidth)) != 0) - goto fail2; } out: @@ -65,8 +60,6 @@ hunt_nic_get_required_pcie_bandwidth( return (0); -fail2: - EFSYS_PROBE(fail2); fail1: EFSYS_PROBE1(fail1, efx_rc_t, rc); diff --git a/drivers/net/sfc/base/medford2_nic.c b/drivers/net/sfc/base/medford2_nic.c index 3efc35886..6bc1e87cc 100644 --- a/drivers/net/sfc/base/medford2_nic.c +++ b/drivers/net/sfc/base/medford2_nic.c @@ -15,25 +15,15 @@ medford2_nic_get_required_pcie_bandwidth( __inefx_nic_t *enp, __out uint32_t *bandwidth_mbpsp) { - uint32_t port_modes; - uint32_t current_mode; uint32_t bandwidth; efx_rc_t rc; /* FIXME: support new Medford2 dynamic port modes */ - if ((rc = efx_mcdi_get_port_modes(enp, &port_modes, - ¤t_mode, NULL)) != 0) { - /* No port mode info 
available. */ - bandwidth = 0; - goto out; - } - - if ((rc = ef10_nic_get_port_mode_bandwidth(current_mode, + if ((rc = ef10_nic_get_port_mode_bandwidth(enp, &bandwidth)) != 0) goto fail1; -out: *bandwidth_mbpsp =
[dpdk-dev] [PATCH 11/11] net/sfc: add 50G and 100G XtremeScale X2 family adapters
Signed-off-by: Andrew Rybchenko --- doc/guides/nics/sfc_efx.rst | 4 1 file changed, 4 insertions(+) diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst index 63939ec83..425e669ed 100644 --- a/doc/guides/nics/sfc_efx.rst +++ b/doc/guides/nics/sfc_efx.rst @@ -240,6 +240,10 @@ Supported NICs - Solarflare X2522 Dual Port SFP28 10/25GbE Adapter + - Solarflare X2541 Single Port QSFP28 10/25G/100G Adapter + + - Solarflare X2542 Dual Port QSFP28 10/25G/100G Adapter + - Solarflare Flareon [Ultra] Server Adapters: - Solarflare SFN8522 Dual Port SFP+ Server Adapter -- 2.17.1
[dpdk-dev] [PATCH 09/11] net/sfc/base: add accessor to whole link status
From: Richard Houldsworth Add a function which makes an MCDI GET_LINK request and packages up the results. Currently, the get-link function is triggered from several entry points which then pass on or store selected parts of the data. When the driver needs to obtain the current link state, it is more efficient to do this in a single call. Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/ef10_impl.h | 10 +++--- drivers/net/sfc/base/ef10_mac.c | 6 +++--- drivers/net/sfc/base/ef10_nic.c | 4 ++-- drivers/net/sfc/base/ef10_phy.c | 24 drivers/net/sfc/base/efx.h | 15 +++ drivers/net/sfc/base/efx_impl.h | 2 +- drivers/net/sfc/base/efx_phy.c | 30 ++ 7 files changed, 66 insertions(+), 25 deletions(-) diff --git a/drivers/net/sfc/base/ef10_impl.h b/drivers/net/sfc/base/ef10_impl.h index e43e26e68..f971063a1 100644 --- a/drivers/net/sfc/base/ef10_impl.h +++ b/drivers/net/sfc/base/ef10_impl.h @@ -593,11 +593,7 @@ ef10_nvram_buffer_finish( /* PHY */ typedef struct ef10_link_state_s { - uint32_tels_adv_cap_mask; - uint32_tels_lp_cap_mask; - unsigned intels_fcntl; - efx_phy_fec_type_t els_fec; - efx_link_mode_t els_link_mode; + efx_phy_link_state_tepls; #if EFSYS_OPT_LOOPBACK efx_loopback_type_t els_loopback; #endif @@ -634,9 +630,9 @@ ef10_phy_oui_get( __out uint32_t *ouip); extern __checkReturn efx_rc_t -ef10_phy_fec_type_get( +ef10_phy_link_state_get( __inefx_nic_t *enp, - __out efx_phy_fec_type_t *fecp); + __out efx_phy_link_state_t *eplsp); #if EFSYS_OPT_PHY_STATS diff --git a/drivers/net/sfc/base/ef10_mac.c b/drivers/net/sfc/base/ef10_mac.c index ab73828f1..9f10f6f79 100644 --- a/drivers/net/sfc/base/ef10_mac.c +++ b/drivers/net/sfc/base/ef10_mac.c @@ -22,10 +22,10 @@ ef10_mac_poll( if ((rc = ef10_phy_get_link(enp, &els)) != 0) goto fail1; - epp->ep_adv_cap_mask = els.els_adv_cap_mask; - epp->ep_fcntl = els.els_fcntl; + epp->ep_adv_cap_mask = els.epls.epls_adv_cap_mask; + epp->ep_fcntl = els.epls.epls_fcntl; - *link_modep = els.els_link_mode; + *link_modep = els.epls.epls_link_mode; return (0); diff --git a/drivers/net/sfc/base/ef10_nic.c b/drivers/net/sfc/base/ef10_nic.c index 1b3d60682..50e23b7d4 100644 --- a/drivers/net/sfc/base/ef10_nic.c +++ b/drivers/net/sfc/base/ef10_nic.c @@ -1852,8 +1852,8 @@ ef10_nic_board_cfg( /* Obtain the default PHY advertised capabilities */ if ((rc = ef10_phy_get_link(enp, &els)) != 0) goto fail7; - epp->ep_default_adv_cap_mask = els.els_adv_cap_mask; - epp->ep_adv_cap_mask = els.els_adv_cap_mask; + epp->ep_default_adv_cap_mask = els.epls.epls_adv_cap_mask; + epp->ep_adv_cap_mask = els.epls.epls_adv_cap_mask; /* Check capabilities of running datapath firmware */ if ((rc = ef10_get_datapath_caps(enp)) != 0) diff --git a/drivers/net/sfc/base/ef10_phy.c b/drivers/net/sfc/base/ef10_phy.c index ec3600e96..84ccdde5d 100644 --- a/drivers/net/sfc/base/ef10_phy.c +++ b/drivers/net/sfc/base/ef10_phy.c @@ -286,9 +286,9 @@ ef10_phy_get_link( } mcdi_phy_decode_cap(MCDI_OUT_DWORD(req, GET_LINK_OUT_CAP), - &elsp->els_adv_cap_mask); + &elsp->epls.epls_adv_cap_mask); mcdi_phy_decode_cap(MCDI_OUT_DWORD(req, GET_LINK_OUT_LP_CAP), - &elsp->els_lp_cap_mask); + &elsp->epls.epls_lp_cap_mask); if (req.emr_out_length_used < MC_CMD_GET_LINK_OUT_V2_LEN) fec = MC_CMD_FEC_NONE; @@ -298,8 +298,16 @@ ef10_phy_get_link( mcdi_phy_decode_link_mode(enp, MCDI_OUT_DWORD(req, GET_LINK_OUT_FLAGS), MCDI_OUT_DWORD(req, GET_LINK_OUT_LINK_SPEED), MCDI_OUT_DWORD(req, GET_LINK_OUT_FCNTL), - fec, &elsp->els_link_mode, - &elsp->els_fcntl, &elsp->els_fec); + fec, 
&elsp->epls.epls_link_mode, + &elsp->epls.epls_fcntl, &elsp->epls.epls_fec); + + if (req.emr_out_length_used < MC_CMD_GET_LINK_OUT_V2_LEN) { + elsp->epls.epls_ld_cap_mask = 0; + } else { + mcdi_phy_decode_cap(MCDI_OUT_DWORD(req, GET_LINK_OUT_V2_LD_CAP), + &elsp->epls.epls_ld_cap_mask); + } + #if EFSYS_OPT_LOOPBACK /* @@ -543,18 +551,18 @@ ef10_phy_oui_get( } __checkReturn efx_rc_t -ef10_phy_fec_type_get( +ef10_phy_link_state_get( __inefx_nic_t *enp, - __out efx_phy_fec_type_t *fecp) + __out efx_phy_link_state_t *eplsp) {
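A hedged usage sketch of the new accessor: the efx_phy_link_state_t field names are taken from the hunks above, while the helper shape and error handling are assumed.

/* Hypothetical caller: one MCDI GET_LINK fills a coherent snapshot. */
static efx_rc_t
snapshot_link(efx_nic_t *enp)
{
	efx_phy_link_state_t epls;
	efx_rc_t rc;

	rc = efx_phy_link_state_get(enp, &epls);
	if (rc != 0)
		return (rc);

	/* All of these come from the same request, so they are consistent. */
	(void)epls.epls_adv_cap_mask;	/* local advertised capabilities */
	(void)epls.epls_lp_cap_mask;	/* link partner capabilities */
	(void)epls.epls_link_mode;	/* link mode (speed/duplex) */
	(void)epls.epls_fcntl;		/* flow control */
	(void)epls.epls_fec;		/* FEC type */
	return (0);
}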
[dpdk-dev] [PATCH 08/11] net/sfc/base: guard Rx scale code with corresponding option
From: Tom Millington Previously only some of the code was guarded by this which caused a build error when EFSYS_OPT_RX_SCALE is 0 (e.g. in manftest). Signed-off-by: Tom Millington Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/ef10_nic.c | 7 +++ drivers/net/sfc/base/efx.h | 2 ++ drivers/net/sfc/base/siena_nic.c | 2 ++ 3 files changed, 11 insertions(+) diff --git a/drivers/net/sfc/base/ef10_nic.c b/drivers/net/sfc/base/ef10_nic.c index c197ff957..1b3d60682 100644 --- a/drivers/net/sfc/base/ef10_nic.c +++ b/drivers/net/sfc/base/ef10_nic.c @@ -1086,11 +1086,13 @@ ef10_get_datapath_caps( } encp->enc_rx_prefix_size = 14; +#if EFSYS_OPT_RX_SCALE /* Check if the firmware supports additional RSS modes */ if (CAP_FLAGS1(req, ADDITIONAL_RSS_MODES)) encp->enc_rx_scale_additional_modes_supported = B_TRUE; else encp->enc_rx_scale_additional_modes_supported = B_FALSE; +#endif /* EFSYS_OPT_RX_SCALE */ /* Check if the firmware supports TSO */ if (CAP_FLAGS1(req, TX_TSO)) @@ -1296,6 +1298,7 @@ ef10_get_datapath_caps( else encp->enc_hlb_counters = B_FALSE; +#if EFSYS_OPT_RX_SCALE if (CAP_FLAGS1(req, RX_RSS_LIMITED)) { /* Only one exclusive RSS context is available per port. */ encp->enc_rx_scale_max_exclusive_contexts = 1; @@ -1345,6 +1348,8 @@ ef10_get_datapath_caps( */ encp->enc_rx_scale_l4_hash_supported = B_TRUE; } +#endif /* EFSYS_OPT_RX_SCALE */ + /* Check if the firmware supports "FLAG" and "MARK" filter actions */ if (CAP_FLAGS2(req, FILTER_ACTION_FLAG)) encp->enc_filter_action_flag_supported = B_TRUE; @@ -1368,8 +1373,10 @@ ef10_get_datapath_caps( return (0); +#if EFSYS_OPT_RX_SCALE fail5: EFSYS_PROBE(fail5); +#endif /* EFSYS_OPT_RX_SCALE */ fail4: EFSYS_PROBE(fail4); fail3: diff --git a/drivers/net/sfc/base/efx.h b/drivers/net/sfc/base/efx.h index a8c3ae301..246708f9c 100644 --- a/drivers/net/sfc/base/efx.h +++ b/drivers/net/sfc/base/efx.h @@ -1281,6 +1281,7 @@ typedef struct efx_nic_cfg_s { uint32_tenc_rx_prefix_size; uint32_tenc_rx_buf_align_start; uint32_tenc_rx_buf_align_end; +#if EFSYS_OPT_RX_SCALE uint32_tenc_rx_scale_max_exclusive_contexts; /* * Mask of supported hash algorithms. @@ -1293,6 +1294,7 @@ typedef struct efx_nic_cfg_s { */ boolean_t enc_rx_scale_l4_hash_supported; boolean_t enc_rx_scale_additional_modes_supported; +#endif /* EFSYS_OPT_RX_SCALE */ #if EFSYS_OPT_LOOPBACK efx_qword_t enc_loopback_types[EFX_LINK_NMODES]; #endif /* EFSYS_OPT_LOOPBACK */ diff --git a/drivers/net/sfc/base/siena_nic.c b/drivers/net/sfc/base/siena_nic.c index 8a58986e8..fca17171b 100644 --- a/drivers/net/sfc/base/siena_nic.c +++ b/drivers/net/sfc/base/siena_nic.c @@ -114,6 +114,7 @@ siena_board_cfg( /* Alignment for WPTR updates */ encp->enc_rx_push_align = 1; +#if EFSYS_OPT_RX_SCALE /* There is one RSS context per function */ encp->enc_rx_scale_max_exclusive_contexts = 1; @@ -128,6 +129,7 @@ siena_board_cfg( /* There is no support for additional RSS modes */ encp->enc_rx_scale_additional_modes_supported = B_FALSE; +#endif /* EFSYS_OPT_RX_SCALE */ encp->enc_tx_dma_desc_size_max = EFX_MASK32(FSF_AZ_TX_KER_BYTE_COUNT); /* Fragments must not span 4k boundaries. */ -- 2.17.1
[dpdk-dev] [PATCH 10/11] net/sfc/base: use transceiver ID when reading info
From: Richard Houldsworth In efx_mcdi_phy_module_get_info() probe the transceiver identification byte rather than assume the module matches the fixed port type. This supports scenarios such as a SFP mounted in a QSFP port via a QSA module. Signed-off-by: Richard Houldsworth Signed-off-by: Andrew Rybchenko --- drivers/net/sfc/base/efx_mcdi.c | 47 - 1 file changed, 41 insertions(+), 6 deletions(-) diff --git a/drivers/net/sfc/base/efx_mcdi.c b/drivers/net/sfc/base/efx_mcdi.c index f53be015b..c896aa0bf 100644 --- a/drivers/net/sfc/base/efx_mcdi.c +++ b/drivers/net/sfc/base/efx_mcdi.c @@ -2150,6 +2150,14 @@ efx_mcdi_get_workarounds( */ #defineEFX_PHY_MEDIA_INFO_PAGE_SIZE0x80 +/* + * Transceiver identifiers from SFF-8024 Table 4-1. + */ +#defineEFX_SFF_TRANSCEIVER_ID_SFP 0x03 /* SFP/SFP+/SFP28 */ +#defineEFX_SFF_TRANSCEIVER_ID_QSFP 0x0c /* QSFP */ +#defineEFX_SFF_TRANSCEIVER_ID_QSFP_PLUS0x0d /* QSFP+ or later */ +#defineEFX_SFF_TRANSCEIVER_ID_QSFP28 0x11 /* QSFP28 or later */ + static __checkReturn efx_rc_t efx_mcdi_get_phy_media_info( __inefx_nic_t *enp, @@ -,6 +2230,7 @@ efx_mcdi_phy_module_get_info( efx_rc_t rc; uint32_t mcdi_lower_page; uint32_t mcdi_upper_page; + uint8_t id; EFSYS_ASSERT3U(enp->en_mod_flags, &, EFX_MOD_PROBE); @@ -2235,6 +2244,26 @@ efx_mcdi_phy_module_get_info( */ switch (epp->ep_fixed_port_type) { case EFX_PHY_MEDIA_SFP_PLUS: + case EFX_PHY_MEDIA_QSFP_PLUS: + /* Port type supports modules */ + break; + default: + rc = ENOTSUP; + goto fail1; + } + + /* +* For all supported port types, MCDI page 0 offset 0 holds the +* transceiver identifier. Probe to determine the data layout. +* Definitions from SFF-8024 Table 4-1. +*/ + rc = efx_mcdi_get_phy_media_info(enp, + 0, 0, sizeof(id), &id); + if (rc != 0) + goto fail2; + + switch (id) { + case EFX_SFF_TRANSCEIVER_ID_SFP: /* * In accordance with SFF-8472 Diagnostic Monitoring * Interface for Optical Transceivers section 4 Memory @@ -2269,10 +2298,12 @@ efx_mcdi_phy_module_get_info( break; default: rc = ENOTSUP; - goto fail1; + goto fail3; } break; - case EFX_PHY_MEDIA_QSFP_PLUS: + case EFX_SFF_TRANSCEIVER_ID_QSFP: + case EFX_SFF_TRANSCEIVER_ID_QSFP_PLUS: + case EFX_SFF_TRANSCEIVER_ID_QSFP28: switch (dev_addr) { case EFX_PHY_MEDIA_INFO_DEV_ADDR_QSFP: /* @@ -2288,12 +2319,12 @@ efx_mcdi_phy_module_get_info( break; default: rc = ENOTSUP; - goto fail1; + goto fail3; } break; default: rc = ENOTSUP; - goto fail1; + goto fail3; } EFX_STATIC_ASSERT(EFX_PHY_MEDIA_INFO_PAGE_SIZE <= 0xFF); @@ -2305,7 +2336,7 @@ efx_mcdi_phy_module_get_info( rc = efx_mcdi_get_phy_media_info(enp, mcdi_lower_page, (uint8_t)offset, (uint8_t)read_len, data); if (rc != 0) - goto fail2; + goto fail4; data += read_len; len -= read_len; @@ -2322,11 +2353,15 @@ efx_mcdi_phy_module_get_info( rc = efx_mcdi_get_phy_media_info(enp, mcdi_upper_page, (uint8_t)offset, (uint8_t)len, data); if (rc != 0) - goto fail3; + goto fail5; } return (0); +fail5: + EFSYS_PROBE(fail5); +fail4: + EFSYS_PROBE(fail4); fail3: EFSYS_PROBE(fail3); fail2: -- 2.17.1
Re: [dpdk-dev] [PATCH] ethdev: fix error handling logic
On 9/24/18 4:43 PM, Alejandro Lucero wrote: This patch fixes how function exit is handled when errors inside rte_eth_dev_create. Fixes: e489007a411c ("ethdev: add generic create/destroy ethdev APIs") Cc: sta...@dpdk.org Signed-off-by: Alejandro Lucero Minor nit/observation below, but anyway Reviewed-by: Andrew Rybchenko --- lib/librte_ethdev/rte_ethdev.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c index aa7730c..ef99f70 100644 --- a/lib/librte_ethdev/rte_ethdev.c +++ b/lib/librte_ethdev/rte_ethdev.c @@ -3467,10 +3467,8 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx, if (rte_eal_process_type() == RTE_PROC_PRIMARY) { ethdev = rte_eth_dev_allocate(name); - if (!ethdev) { - retval = -ENODEV; - goto probe_failed; - } + if (!ethdev) + return -ENODEV; As far as I can see rte_eth_dev_allocate() returns NULL if a device with such name already exists or no free ports left. I'd say that EEXIST and ENOSPC better describe what went wrong, but the patch simply does not change it and there is no easy way to fix it (except may be rte_errno usage). if (priv_data_size) { ethdev->data->dev_private = rte_zmalloc_socket( @@ -3480,7 +3478,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx, if (!ethdev->data->dev_private) { RTE_LOG(ERR, EAL, "failed to allocate private data"); retval = -ENOMEM; - goto probe_failed; + goto data_alloc_failed; } } } else { @@ -3488,8 +3486,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx, if (!ethdev) { RTE_LOG(ERR, EAL, "secondary process attach failed, " "ethdev doesn't exist"); - retval = -ENODEV; - goto probe_failed; + return -ENODEV; Here ENODEV is 100% correct since secondary simply failed to find ethdev with matching name. } } @@ -3518,6 +3515,7 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, uint16_t queue_idx, if (rte_eal_process_type() == RTE_PROC_PRIMARY) rte_free(ethdev->data->dev_private); +data_alloc_failed: rte_eth_dev_release_port(ethdev); return retval;
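The pattern under discussion, reduced to a generic sketch (hypothetical helper, not the rte_ethdev code): return directly while nothing has been allocated yet, and unwind through a label once something has.

#include <stdlib.h>
#include <errno.h>

struct thing { void *priv; };

static int
thing_create(struct thing **out)
{
	struct thing *t;
	int retval;

	t = malloc(sizeof(*t));
	if (t == NULL)
		return -ENODEV;		/* nothing allocated yet: plain return */

	t->priv = malloc(64);
	if (t->priv == NULL) {
		retval = -ENOMEM;
		goto priv_alloc_failed;	/* undo only what was done */
	}

	*out = t;
	return 0;

priv_alloc_failed:
	free(t);
	return retval;
}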
Re: [dpdk-dev] [PATCH 1/2] net/bonding: provide default Rx/Tx configuration
On Wed, Sep 5, 2018 at 5:14 AM Andrew Rybchenko wrote: > > From: Ivan Malov > > Default Rx/Tx configuration has become a helpful > resource for applications relying on the optimal > values to make rte_eth_rxconf and rte_eth_txconf > structures. These structures can then be tweaked. > > Default configuration is also used internally by > rte_eth_rx_queue_setup or rte_eth_tx_queue_setup > API calls when NULL pointer is passed by callers > with the argument for custom queue configuration. > > The use cases of bonding driver may also benefit > from exercising default settings in the same way. > > Restructure the code to collect various settings > from slave ports and make it possible to combine > default Rx/Tx configuration of these devices and > report it to the callers of rte_eth_dev_info_get. > > Signed-off-by: Ivan Malov > Signed-off-by: Andrew Rybchenko Acked-by: Chas Williams > --- > drivers/net/bonding/rte_eth_bond_api.c | 161 + > drivers/net/bonding/rte_eth_bond_pmd.c | 10 ++ > drivers/net/bonding/rte_eth_bond_private.h | 3 + > 3 files changed, 147 insertions(+), 27 deletions(-) > > diff --git a/drivers/net/bonding/rte_eth_bond_api.c > b/drivers/net/bonding/rte_eth_bond_api.c > index 8bc04cfd1..206a5c797 100644 > --- a/drivers/net/bonding/rte_eth_bond_api.c > +++ b/drivers/net/bonding/rte_eth_bond_api.c > @@ -269,6 +269,136 @@ slave_rte_flow_prepare(uint16_t slave_id, struct > bond_dev_private *internals) > return 0; > } > > +static void > +eth_bond_slave_inherit_dev_info_rx_first(struct bond_dev_private *internals, > +const struct rte_eth_dev_info *di) > +{ > + struct rte_eth_rxconf *rxconf_i = &internals->default_rxconf; > + > + internals->reta_size = di->reta_size; > + > + /* Inherit Rx offload capabilities from the first slave device */ > + internals->rx_offload_capa = di->rx_offload_capa; > + internals->rx_queue_offload_capa = di->rx_queue_offload_capa; > + internals->flow_type_rss_offloads = di->flow_type_rss_offloads; > + > + /* Inherit maximum Rx packet size from the first slave device */ > + internals->candidate_max_rx_pktlen = di->max_rx_pktlen; > + > + /* Inherit default Rx queue settings from the first slave device */ > + memcpy(rxconf_i, &di->default_rxconf, sizeof(*rxconf_i)); > + > + /* > +* Turn off descriptor prefetch and writeback by default for all > +* slave devices. Applications may tweak this setting if need be. > +*/ > + rxconf_i->rx_thresh.pthresh = 0; > + rxconf_i->rx_thresh.hthresh = 0; > + rxconf_i->rx_thresh.wthresh = 0; > + > + /* Setting this to zero should effectively enable default values */ > + rxconf_i->rx_free_thresh = 0; > + > + /* Disable deferred start by default for all slave devices */ > + rxconf_i->rx_deferred_start = 0; > +} > + > +static void > +eth_bond_slave_inherit_dev_info_tx_first(struct bond_dev_private *internals, > +const struct rte_eth_dev_info *di) > +{ > + struct rte_eth_txconf *txconf_i = &internals->default_txconf; > + > + /* Inherit Tx offload capabilities from the first slave device */ > + internals->tx_offload_capa = di->tx_offload_capa; > + internals->tx_queue_offload_capa = di->tx_queue_offload_capa; > + > + /* Inherit default Tx queue settings from the first slave device */ > + memcpy(txconf_i, &di->default_txconf, sizeof(*txconf_i)); > + > + /* > +* Turn off descriptor prefetch and writeback by default for all > +* slave devices. Applications may tweak this setting if need be. 
> +*/ > + txconf_i->tx_thresh.pthresh = 0; > + txconf_i->tx_thresh.hthresh = 0; > + txconf_i->tx_thresh.wthresh = 0; > + > + /* > +* Setting these parameters to zero assumes that default > +* values will be configured implicitly by slave devices. > +*/ > + txconf_i->tx_free_thresh = 0; > + txconf_i->tx_rs_thresh = 0; > + > + /* Disable deferred start by default for all slave devices */ > + txconf_i->tx_deferred_start = 0; > +} > + > +static void > +eth_bond_slave_inherit_dev_info_rx_next(struct bond_dev_private *internals, > + const struct rte_eth_dev_info *di) > +{ > + struct rte_eth_rxconf *rxconf_i = &internals->default_rxconf; > + const struct rte_eth_rxconf *rxconf = &di->default_rxconf; > + > + internals->rx_offload_capa &= di->rx_offload_capa; > + internals->rx_queue_offload_capa &= di->rx_queue_offload_capa; > + internals->flow_type_rss_offloads &= di->flow_type_rss_offloads; > + > + /* > +* If at least one slave device suggests enabling this > +* setting by default, enable it for all slave devices > +* since disabling it may not be necessarily supported. > +*/
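On the application side, the benefit looks roughly like this (port id, ring size and mempool are placeholders): start from the defaults the bonded device now reports, or pass NULL and let the setup call apply them.

#include <rte_ethdev.h>
#include <rte_lcore.h>

static int
setup_bond_rxq(uint16_t bond_port_id, struct rte_mempool *mb_pool)
{
	struct rte_eth_dev_info dev_info;
	struct rte_eth_rxconf rxconf;

	rte_eth_dev_info_get(bond_port_id, &dev_info);

	rxconf = dev_info.default_rxconf;	/* defaults inherited from slaves */
	rxconf.rx_free_thresh = 32;		/* example tweak */

	return rte_eth_rx_queue_setup(bond_port_id, 0, 512, rte_socket_id(),
			&rxconf, mb_pool);
	/* Passing NULL instead of &rxconf applies the same defaults. */
}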
Re: [dpdk-dev] [PATCH 2/2] net/bonding: inherit descriptor limits from slaves
On Wed, Sep 5, 2018 at 5:14 AM Andrew Rybchenko wrote: > > From: Ivan Malov > > Descriptor limits are used by applications to take > optimal decisions on queue sizing. > > Signed-off-by: Ivan Malov > Signed-off-by: Andrew Rybchenko Acked-by: Chas Williams > --- > drivers/net/bonding/rte_eth_bond_api.c | 54 ++ > drivers/net/bonding/rte_eth_bond_pmd.c | 8 > drivers/net/bonding/rte_eth_bond_private.h | 2 + > 3 files changed, 64 insertions(+) > > diff --git a/drivers/net/bonding/rte_eth_bond_api.c > b/drivers/net/bonding/rte_eth_bond_api.c > index 206a5c797..9e039e73b 100644 > --- a/drivers/net/bonding/rte_eth_bond_api.c > +++ b/drivers/net/bonding/rte_eth_bond_api.c > @@ -399,6 +399,43 @@ eth_bond_slave_inherit_dev_info_tx_next(struct > bond_dev_private *internals, > internals->tx_queue_offload_capa; > } > > +static void > +eth_bond_slave_inherit_desc_lim_first(struct rte_eth_desc_lim *bond_desc_lim, > + const struct rte_eth_desc_lim *slave_desc_lim) > +{ > + memcpy(bond_desc_lim, slave_desc_lim, sizeof(*bond_desc_lim)); > +} > + > +static int > +eth_bond_slave_inherit_desc_lim_next(struct rte_eth_desc_lim *bond_desc_lim, > + const struct rte_eth_desc_lim *slave_desc_lim) > +{ > + bond_desc_lim->nb_max = RTE_MIN(bond_desc_lim->nb_max, > + slave_desc_lim->nb_max); > + bond_desc_lim->nb_min = RTE_MAX(bond_desc_lim->nb_min, > + slave_desc_lim->nb_min); > + bond_desc_lim->nb_align = RTE_MAX(bond_desc_lim->nb_align, > + slave_desc_lim->nb_align); > + > + if (bond_desc_lim->nb_min > bond_desc_lim->nb_max || > + bond_desc_lim->nb_align > bond_desc_lim->nb_max) { > + RTE_BOND_LOG(ERR, "Failed to inherit descriptor limits"); > + return -EINVAL; > + } > + > + /* Treat maximum number of segments equal to 0 as unspecified */ > + if (slave_desc_lim->nb_seg_max != 0 && > + (bond_desc_lim->nb_seg_max == 0 || > +slave_desc_lim->nb_seg_max < bond_desc_lim->nb_seg_max)) > + bond_desc_lim->nb_seg_max = slave_desc_lim->nb_seg_max; > + if (slave_desc_lim->nb_mtu_seg_max != 0 && > + (bond_desc_lim->nb_mtu_seg_max == 0 || > +slave_desc_lim->nb_mtu_seg_max < bond_desc_lim->nb_mtu_seg_max)) > + bond_desc_lim->nb_mtu_seg_max = > slave_desc_lim->nb_mtu_seg_max; > + > + return 0; > +} > + > static int > __eth_bond_slave_add_lock_free(uint16_t bonded_port_id, uint16_t > slave_port_id) > { > @@ -458,9 +495,26 @@ __eth_bond_slave_add_lock_free(uint16_t bonded_port_id, > uint16_t slave_port_id) > > eth_bond_slave_inherit_dev_info_rx_first(internals, > &dev_info); > eth_bond_slave_inherit_dev_info_tx_first(internals, > &dev_info); > + > + eth_bond_slave_inherit_desc_lim_first(&internals->rx_desc_lim, > + &dev_info.rx_desc_lim); > + eth_bond_slave_inherit_desc_lim_first(&internals->tx_desc_lim, > + &dev_info.tx_desc_lim); > } else { > + int ret; > + > eth_bond_slave_inherit_dev_info_rx_next(internals, &dev_info); > eth_bond_slave_inherit_dev_info_tx_next(internals, &dev_info); > + > + ret = eth_bond_slave_inherit_desc_lim_next( > + &internals->rx_desc_lim, > &dev_info.rx_desc_lim); > + if (ret != 0) > + return ret; > + > + ret = eth_bond_slave_inherit_desc_lim_next( > + &internals->tx_desc_lim, > &dev_info.tx_desc_lim); > + if (ret != 0) > + return ret; > } > > bonded_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf &= > diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c > b/drivers/net/bonding/rte_eth_bond_pmd.c > index ee24e9658..46b660396 100644 > --- a/drivers/net/bonding/rte_eth_bond_pmd.c > +++ b/drivers/net/bonding/rte_eth_bond_pmd.c > @@ -2239,6 +2239,11 @@ bond_ethdev_info(struct rte_eth_dev *dev, struct > 
rte_eth_dev_info *dev_info) > memcpy(&dev_info->default_txconf, &internals->default_txconf, >sizeof(dev_info->default_txconf)); > > + memcpy(&dev_info->rx_desc_lim, &internals->rx_desc_lim, > + sizeof(dev_info->rx_desc_lim)); > + memcpy(&dev_info->tx_desc_lim, &internals->tx_desc_lim, > + sizeof(dev_info->tx_desc_lim)); > + > /** > * If dedicated hw queues enabled for link bonding device in LACP mode > * then we need to reduce the maximum number of data path queues by 1. > @@ -3064,6 +3069,9 @@ bond_alloc(struct rte_vdev_device *dev, uint8_t mode) > m
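With the limits now exported by the bonded device, an application can let the usual helper clamp its requested ring sizes (sizes below are placeholders):

#include <rte_ethdev.h>

static int
size_bond_rings(uint16_t bond_port_id)
{
	uint16_t nb_rxd = 4096;
	uint16_t nb_txd = 4096;

	/* Adjusts the counts against the rx/tx_desc_lim reported by the
	 * bonded device (min, max and alignment inherited from slaves). */
	return rte_eth_dev_adjust_nb_rx_tx_desc(bond_port_id, &nb_rxd, &nb_txd);
}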
Re: [dpdk-dev] [PATCH v3 00/13] NXP DPAA driver enhancements
On 9/21/2018 12:05 PM, Hemant Agrawal wrote: > Misc driver level enhancements > > V3: fix the description and implementation of jumbo buffer fix > > V2: remove the unused function from map file > Add description/details in git commit logs. > > > Hemant Agrawal (9): > net/dpaa: configure frame queue on MAC ID basis > net/dpaa: fix jumbo buffer config > net/dpaa: implement scatter offload support > net/dpaa: minor debug log enhancements > bus/dpaa: add interrupt based portal fd support > net/dpaa: separate Rx function for LS1046 > net/dpaa: tune prefetch in Rx path > bus/dpaa: add check for re-definition in compat > mempool/dpaa: change the debug log level to DP > > Nipun Gupta (2): > bus/dpaa: avoid tag Set for eqcr in Tx path > bus/dpaa: avoid using be conversions for contextb > > Sachin Saxena (1): > net/dpaa: fix link speed based on MAC type > > Sunil Kumar Kori (1): > net/dpaa: rearranging of atomic queue support code Series applied to dpdk-next-net/master, thanks.
Re: [dpdk-dev] [PATCH v4 2/2] examples/vdpa: introduce a new sample for vDPA
On 09/24, Wang, Xiao W wrote: >Hi Xiaolong, > >Thanks for the update, 2 small comments below. > [snip] >> +./vdpa --log-level=9 -c 0x6 -n 4 --socket-mem 1024,1024 \ >> +-w :06:00.3,vdpa=1 -w :06:00.4,vdpa=1 \ >> +-- --interactive > >To demonstrate app doesn't need to launch dedicated worker threads for vhost >enqueue/dequeue operations, >We can use "-c 0x2" to indicate that no need to allocate dedicated worker >threads. > Got it, will do. >> + >> +.. note:: [snip] >> "%d\t\t"PCI_PRI_FMT"\t%"PRIu32"\t\t0x%"PRIu64"\n", did, >> +addr.domain, addr.bus, addr.devid, >> +addr.function, queue_num, features); > >Use PRIx64 instead of PRIu64 for features. >You can add a blank space between "PRIx64" and the other section to make it >more readable. >Refer to: > lib/librte_vhost/vhost_user.c: "guest memory region > %u, size: 0x%" PRIx64 "\n" Got it, will do. Thanks, Xiaolong > >BRs, >Xiao
Re: [dpdk-dev] [PATCH] drivers/net: do not redefine bool
On 9/21/2018 3:49 PM, Thomas Monjalon wrote: > 21/09/2018 15:47, Ferruh Yigit: >> On 9/20/2018 1:18 AM, Thomas Monjalon wrote: >>> When trying to include stdbool.h in DPDK base headers, there are a lot >>> of conflicts with drivers which redefine bool/true/false >>> in their compatibility layer. >>> >>> It is fixed by including stdbool.h in these drivers. >>> Some errors with usage of bool type are also fixed in some drivers. >>> >>> Note: the driver qede has a surprising mix of bool and int: >>> (~p_iov->b_pre_fp_hsi & ETH_HSI_VER_MINOR) >>> where the first variable is boolean and the version is a number. >>> It is replaced by >>> !p_iov->b_pre_fp_hsi >>> >>> Signed-off-by: Thomas Monjalon >>> --- >>> drivers/net/cxgbe/cxgbe_compat.h | 2 +- >>> drivers/net/e1000/base/e1000_osdep.h | 5 + >>> drivers/net/fm10k/base/fm10k_osdep.h | 8 +--- >>> drivers/net/fm10k/fm10k_ethdev.c | 4 ++-- >>> drivers/net/ixgbe/base/ixgbe_osdep.h | 6 +- >>> drivers/net/ixgbe/ixgbe_ethdev.c | 16 +--- >>> drivers/net/ixgbe/ixgbe_rxtx.c | 2 +- >>> drivers/net/qede/base/bcm_osal.h | 6 ++ >>> drivers/net/qede/base/ecore_vf.c | 3 +-- >>> drivers/net/qede/qede_ethdev.c | 2 +- >>> drivers/net/vmxnet3/base/vmxnet3_osdep.h | 3 ++- >>> 11 files changed, 22 insertions(+), 35 deletions(-) >> >> <...> >> >>> @@ -35,6 +35,7 @@ >>> #ifndef _E1000_OSDEP_H_ >>> #define _E1000_OSDEP_H_ >>> >>> +#include >>> #include >>> #include >>> #include >>> @@ -87,7 +88,6 @@ typedef int64_t s64; >>> typedef int32_ts32; >>> typedef int16_ts16; >>> typedef int8_t s8; >>> -typedef intbool; >>> >>> #define __le16 u16 >>> #define __le32 u32 >>> @@ -192,7 +192,4 @@ static inline uint16_t e1000_read_addr16(volatile void >>> *addr) >>> #define ETH_ADDR_LEN 6 >>> #endif >>> >>> -#define false FALSE >>> -#define true TRUE >>> - >> >> It is too much hassle to update Intel base driver code. > > It is not really base driver code. > It was agreed that *_osdep.h can be modified: > http://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/base/README#n56 Right. > >> What would happen if not >> include stdbool and keep define for base code updates? Will it break build >> for >> applications? > > The problem is not applications, but using stdbool in DPDK headers. I see.
Re: [dpdk-dev] [PATCH v3 3/4] app/test-eventdev: add Tx adapter support
-Original Message- > Date: Mon, 24 Sep 2018 10:30:30 +0200 > From: Andrzej Ostruszka > To: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 3/4] app/test-eventdev: add Tx adapter > support > User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 > Thunderbird/52.9.1 > > > On 23.09.2018 13:35, Jerin Jacob wrote: > > -Original Message- > >> Date: Thu, 20 Sep 2018 03:52:34 +0530 > >> From: Pavan Nikhilesh > [...] > >> -struct rte_event_dev_info info; > >> -struct test_pipeline *t = evt_test_priv(test); > >> -uint8_t tx_evqueue_id = 0; > >> +uint8_t tx_evqueue_id[RTE_MAX_ETHPORTS] = {0}; > > > > Some old compiler throws error with this scheme. Please change to memset. > > Really? Could you give an example? > > That is perfectly legal C (since "forever"?) and I find it more readable > than memset. Don't treat it as a request to keep the original version - > if I were Pavan I would object this particular request since I prefer > direct initialization, however here I'm more interested in learning more > about your statement about compilers not supporting zero initialization > of array members after the last initializer. And maybe also about to > what extent we should be supporting old/non compliant compilers (the doc > suggest to use gcc 4.9+). Clang don't like this kind of zero-initialization depending on which type of parameter comes first in the structure. An array of uint8_t should be OK. I thought of keeping safe here as it was going for next revision. Unofficially, people used to test with old compiler such as gcc 4.7 etc. http://patches.dpdk.org/patch/40750/ // Search clang here. http://patches.dpdk.org/patch/38189/ > > Best regards > Andrzej
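The two initialization styles being debated, side by side (generic C, not the test-eventdev code; the array size macro is a stand-in):

#include <stdint.h>
#include <string.h>

#define MAX_PORTS 32	/* stand-in for RTE_MAX_ETHPORTS */

static void
init_styles(void)
{
	/* Partial initialization: elements without an initializer are zeroed. */
	uint8_t a[MAX_PORTS] = {0};

	/* Equivalent form that some older toolchains accept more readily. */
	uint8_t b[MAX_PORTS];
	memset(b, 0, sizeof(b));

	(void)a;
	(void)b;
}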
Re: [dpdk-dev] [PATCH v3 1/3] event: add function for reading unlink in progress
-Original Message- > Date: Mon, 24 Sep 2018 09:23:31 +0100 > From: Harry van Haaren > To: dev@dpdk.org > CC: jerin.ja...@caviumnetworks.com, Harry van Haaren > > Subject: [PATCH v3 1/3] event: add function for reading unlink in progress > X-Mailer: git-send-email 2.17.1 > > External Email > > This commit introduces a new function in the eventdev API, > which allows applications to read the number of unlink requests > in progress on a particular port of an eventdev instance. > > This information allows applications to verify when no more packets > from a particular queue (or any queue) will arrive at a port. > The application could decide to stop polling, or put the core into > a sleep state if it wishes, as it is ensured that no new packets > will arrive at a particular port anymore if all queues are unlinked. > > Suggested-by: Matias Elo > Signed-off-by: Harry van Haaren > Acked-by: Jerin Jacob Applied this series to dpdk-next-eventdev/master. Thanks.
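A minimal usage sketch of the new call, matching the intent described in the commit message (device, port and queue ids are placeholders):

#include <rte_eventdev.h>
#include <rte_pause.h>

static void
quiesce_port(uint8_t dev_id, uint8_t port_id, uint8_t queue_id)
{
	/* Request the unlink, then wait until the device reports that no
	 * unlink is still in progress; after that, no new events from the
	 * unlinked queue can arrive at this port. */
	rte_event_port_unlink(dev_id, port_id, &queue_id, 1);

	while (rte_event_port_unlinks_in_progress(dev_id, port_id) > 0)
		rte_pause();
}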
Re: [dpdk-dev] [PATCH] drivers/net: do not redefine bool
On 9/20/2018 1:18 AM, Thomas Monjalon wrote: > When trying to include stdbool.h in DPDK base headers, there are a lot > of conflicts with drivers which redefine bool/true/false > in their compatibility layer. > > It is fixed by including stdbool.h in these drivers. > Some errors with usage of bool type are also fixed in some drivers. > > Note: the driver qede has a surprising mix of bool and int: > (~p_iov->b_pre_fp_hsi & ETH_HSI_VER_MINOR) > where the first variable is boolean and the version is a number. > It is replaced by > !p_iov->b_pre_fp_hsi > > Signed-off-by: Thomas Monjalon <...> > diff --git a/drivers/net/e1000/base/e1000_osdep.h > b/drivers/net/e1000/base/e1000_osdep.h > index b8868049f..556ed1742 100644 > --- a/drivers/net/e1000/base/e1000_osdep.h > +++ b/drivers/net/e1000/base/e1000_osdep.h > @@ -35,6 +35,7 @@ > #ifndef _E1000_OSDEP_H_ > #define _E1000_OSDEP_H_ > > +#include > #include > #include > #include > @@ -87,7 +88,6 @@ typedef int64_t s64; > typedef int32_t s32; > typedef int16_t s16; > typedef int8_t s8; > -typedef int bool; > > #define __le16 u16 > #define __le32 u32 > @@ -192,7 +192,4 @@ static inline uint16_t e1000_read_addr16(volatile void > *addr) > #define ETH_ADDR_LEN 6 > #endif > > -#define false FALSE > -#define true TRUE TRUE and FALSE also defined in this patch, can we remove them too? > - > #endif /* _E1000_OSDEP_H_ */ > diff --git a/drivers/net/fm10k/base/fm10k_osdep.h > b/drivers/net/fm10k/base/fm10k_osdep.h > index 199ebd8ea..9665239fd 100644 > --- a/drivers/net/fm10k/base/fm10k_osdep.h > +++ b/drivers/net/fm10k/base/fm10k_osdep.h > @@ -34,6 +34,7 @@ POSSIBILITY OF SUCH DAMAGE. > #ifndef _FM10K_OSDEP_H_ > #define _FM10K_OSDEP_H_ > > +#include > #include > #include > #include > @@ -61,12 +62,6 @@ POSSIBILITY OF SUCH DAMAGE. 
> > #define FALSE 0 > #define TRUE 1 > -#ifndef false > -#define false FALSE > -#endif > -#ifndef true > -#define true TRUE > -#endif Same here, TRUE and FALSE defined in this header and used in .c files one or two places, what about remove them and convert usage to "true" and "false" <...> > diff --git a/drivers/net/ixgbe/base/ixgbe_osdep.h > b/drivers/net/ixgbe/base/ixgbe_osdep.h > index bb5dfd2af..39e9118aa 100644 > --- a/drivers/net/ixgbe/base/ixgbe_osdep.h > +++ b/drivers/net/ixgbe/base/ixgbe_osdep.h > @@ -36,6 +36,7 @@ > #define _IXGBE_OS_H_ > > #include > +#include > #include > #include > #include > @@ -70,8 +71,6 @@ > #define FALSE 0 > #define TRUE1 Same again, can we remove TRUE and FALSE <...> > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c > b/drivers/net/ixgbe/ixgbe_ethdev.c > index cee886754..c272a4112 100644 > --- a/drivers/net/ixgbe/ixgbe_ethdev.c > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c > @@ -2527,7 +2527,9 @@ ixgbe_dev_start(struct rte_eth_dev *dev) > struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev); > struct rte_intr_handle *intr_handle = &pci_dev->intr_handle; > uint32_t intr_vector = 0; > - int err, link_up = 0, negotiate = 0; > + int err; > + bool negotiate = false; > + bool link_up = false; "link_up" is used in assignment to a single bit in uint16_t: dev->data->dev_link.link_status = link_up; When "link_up" is bool, should we change that line to: if (link_up) dev->data->dev_link.link_status = 1; else dev->data->dev_link.link_status = 0; <...> > @@ -3870,7 +3872,7 @@ ixgbevf_dev_info_get(struct rte_eth_dev *dev, > > static int > ixgbevf_check_link(struct ixgbe_hw *hw, ixgbe_link_speed *speed, > -int *link_up, int wait_to_complete) > +bool *link_up, int wait_to_complete) Also need to change "wait_to_complete" to bool because below changes start sending bool type to this function. <...> > diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c > index ae21f04a1..2dc14c47f 100644 > --- a/drivers/net/ixgbe/ixgbe_rxtx.c > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c > @@ -2025,7 +2025,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf > **rx_pkts, uint16_t nb_pkts, > struct ixgbe_rx_entry *next_rxe = NULL; > struct rte_mbuf *first_seg; > struct rte_mbuf *rxm; > - struct rte_mbuf *nmb; > + struct rte_mbuf *nmb = NULL; This change is unrelated. Can we separate this one?
[dpdk-dev] [PATCH v5 1/5] vhost: unify struct VhostUserMsg usage
Do not use the typedef version of struct VhostUserMsg. Also unify the related parameter name. Signed-off-by: Nikolay Nikolaev --- lib/librte_vhost/vhost_user.c | 41 + 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 63d145b2d..505db3bfc 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -250,7 +250,7 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features) */ static int vhost_user_set_vring_num(struct virtio_net *dev, -VhostUserMsg *msg) +struct VhostUserMsg *msg) { struct vhost_virtqueue *vq = dev->virtqueue[msg->payload.state.index]; @@ -611,7 +611,7 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index) * This function then converts these to our address space. */ static int -vhost_user_set_vring_addr(struct virtio_net **pdev, VhostUserMsg *msg) +vhost_user_set_vring_addr(struct virtio_net **pdev, struct VhostUserMsg *msg) { struct vhost_virtqueue *vq; struct vhost_vring_addr *addr = &msg->payload.addr; @@ -648,7 +648,7 @@ vhost_user_set_vring_addr(struct virtio_net **pdev, VhostUserMsg *msg) */ static int vhost_user_set_vring_base(struct virtio_net *dev, - VhostUserMsg *msg) + struct VhostUserMsg *msg) { dev->virtqueue[msg->payload.state.index]->last_used_idx = msg->payload.state.num; @@ -780,10 +780,10 @@ vhost_memory_changed(struct VhostUserMemory *new, } static int -vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) +vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg) { struct virtio_net *dev = *pdev; - struct VhostUserMemory memory = pmsg->payload.memory; + struct VhostUserMemory memory = msg->payload.memory; struct rte_vhost_mem_region *reg; void *mmap_addr; uint64_t mmap_size; @@ -804,7 +804,7 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) "(%d) memory regions not changed\n", dev->vid); for (i = 0; i < memory.nregions; i++) - close(pmsg->fds[i]); + close(msg->fds[i]); return 0; } @@ -845,7 +845,7 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) dev->mem->nregions = memory.nregions; for (i = 0; i < memory.nregions; i++) { - fd = pmsg->fds[i]; + fd = msg->fds[i]; reg = &dev->mem->regions[i]; reg->guest_phys_addr = memory.regions[i].guest_phys_addr; @@ -994,16 +994,16 @@ virtio_is_ready(struct virtio_net *dev) } static void -vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg) +vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *msg) { struct vhost_vring_file file; struct vhost_virtqueue *vq; - file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK; - if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK) + file.index = msg->payload.u64 & VHOST_USER_VRING_IDX_MASK; + if (msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK) file.fd = VIRTIO_INVALID_EVENTFD; else - file.fd = pmsg->fds[0]; + file.fd = msg->fds[0]; RTE_LOG(INFO, VHOST_CONFIG, "vring call idx:%d file:%d\n", file.index, file.fd); @@ -1015,17 +1015,17 @@ vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg) } static int -vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *pmsg) +vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg) { struct vhost_vring_file file; struct vhost_virtqueue *vq; struct virtio_net *dev = *pdev; - file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK; - if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK) + file.index = msg->payload.u64 & 
VHOST_USER_VRING_IDX_MASK; + if (msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK) file.fd = VIRTIO_INVALID_EVENTFD; else - file.fd = pmsg->fds[0]; + file.fd = msg->fds[0]; RTE_LOG(INFO, VHOST_CONFIG, "vring kick idx:%d file:%d\n", file.index, file.fd); @@ -1073,7 +1073,7 @@ free_zmbufs(struct vhost_virtqueue *vq) */ static int vhost_user_get_vring_base(struct virtio_net *dev, - VhostUserMsg *msg) + struct VhostUserMsg *msg) { struct vhost_virtqueue *vq = dev->virtqueue[msg->payload.state.index]; @@ -1126,7 +1126,7 @@ vhost_user_get_vring_base(struct virtio_net *dev, */ static int vhost_user_set_vring_enable(struct virtio_net *dev, -
[dpdk-dev] [PATCH v5 2/5] vhost: make message handling functions prepare the reply
As VhostUserMsg structure is reused to generate the reply, move the relevant fields update into the respective message handling functions. Signed-off-by: Nikolay Nikolaev --- lib/librte_vhost/vhost_user.c | 24 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 505db3bfc..4ae7b9346 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -146,11 +146,15 @@ vhost_user_reset_owner(struct virtio_net *dev) * The features that we support are requested. */ static uint64_t -vhost_user_get_features(struct virtio_net *dev) +vhost_user_get_features(struct virtio_net *dev, struct VhostUserMsg *msg) { uint64_t features = 0; rte_vhost_driver_get_features(dev->ifname, &features); + + msg->payload.u64 = features; + msg->size = sizeof(msg->payload.u64); + return features; } @@ -158,11 +162,15 @@ vhost_user_get_features(struct virtio_net *dev) * The queue number that we support are requested. */ static uint32_t -vhost_user_get_queue_num(struct virtio_net *dev) +vhost_user_get_queue_num(struct virtio_net *dev, struct VhostUserMsg *msg) { uint32_t queue_num = 0; rte_vhost_driver_get_queue_num(dev->ifname, &queue_num); + + msg->payload.u64 = (uint64_t)queue_num; + msg->size = sizeof(msg->payload.u64); + return queue_num; } @@ -1117,6 +1125,8 @@ vhost_user_get_vring_base(struct virtio_net *dev, rte_free(vq->batch_copy_elems); vq->batch_copy_elems = NULL; + msg->size = sizeof(msg->payload.state); + return 0; } @@ -1244,6 +1254,8 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg) dev->log_base = dev->log_addr + off; dev->log_size = size; + msg->size = sizeof(msg->payload.u64); + return 0; } @@ -1658,8 +1670,7 @@ vhost_user_msg_handler(int vid, int fd) switch (msg.request.master) { case VHOST_USER_GET_FEATURES: - msg.payload.u64 = vhost_user_get_features(dev); - msg.size = sizeof(msg.payload.u64); + vhost_user_get_features(dev, &msg); send_vhost_reply(fd, &msg); break; case VHOST_USER_SET_FEATURES: @@ -1690,7 +1701,6 @@ vhost_user_msg_handler(int vid, int fd) if (ret) goto skip_to_reply; /* it needs a reply */ - msg.size = sizeof(msg.payload.u64); send_vhost_reply(fd, &msg); break; case VHOST_USER_SET_LOG_FD: @@ -1712,7 +1722,6 @@ vhost_user_msg_handler(int vid, int fd) ret = vhost_user_get_vring_base(dev, &msg); if (ret) goto skip_to_reply; - msg.size = sizeof(msg.payload.state); send_vhost_reply(fd, &msg); break; @@ -1730,8 +1739,7 @@ vhost_user_msg_handler(int vid, int fd) break; case VHOST_USER_GET_QUEUE_NUM: - msg.payload.u64 = (uint64_t)vhost_user_get_queue_num(dev); - msg.size = sizeof(msg.payload.u64); + vhost_user_get_queue_num(dev, &msg); send_vhost_reply(fd, &msg); break;
[dpdk-dev] [PATCH v5 3/5] vhost: handle unsupported message types in functions
Add new functions to handle the unsupported vhost message types: - vhost_user_set_vring_err - vhost_user_set_log_fd Signed-off-by: Nikolay Nikolaev --- lib/librte_vhost/vhost_user.c | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 4ae7b9346..77905dda0 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1022,6 +1022,14 @@ vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *msg) vq->callfd = file.fd; } +static void vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused, + struct VhostUserMsg *msg) +{ + if (!(msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)) + close(msg->fds[0]); + RTE_LOG(INFO, VHOST_CONFIG, "not implemented\n"); +} + static int vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg) { @@ -1259,6 +1267,13 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg) return 0; } +static void vhost_user_set_log_fd(struct virtio_net **pdev __rte_unused, + struct VhostUserMsg *msg) +{ + close(msg->fds[0]); + RTE_LOG(INFO, VHOST_CONFIG, "not implemented.\n"); +} + /* * An rarp packet is constructed and broadcasted to notify switches about * the new location of the migrated VM, so that packets from outside will @@ -1704,8 +1719,7 @@ vhost_user_msg_handler(int vid, int fd) send_vhost_reply(fd, &msg); break; case VHOST_USER_SET_LOG_FD: - close(msg.fds[0]); - RTE_LOG(INFO, VHOST_CONFIG, "not implemented.\n"); + vhost_user_set_log_fd(&dev, &msg); break; case VHOST_USER_SET_VRING_NUM: @@ -1733,9 +1747,7 @@ vhost_user_msg_handler(int vid, int fd) break; case VHOST_USER_SET_VRING_ERR: - if (!(msg.payload.u64 & VHOST_USER_VRING_NOFD_MASK)) - close(msg.fds[0]); - RTE_LOG(INFO, VHOST_CONFIG, "not implemented\n"); + vhost_user_set_vring_err(&dev, &msg); break; case VHOST_USER_GET_QUEUE_NUM:
[dpdk-dev] [PATCH v5 0/5] vhost_user.c code cleanup
vhost: vhost_user.c code cleanup This patch series introduces a set of code redesigns in vhost_user.c. The goal is to unify and simplify vhost-user message handling. The patches do not intend to introduce any functional changes. v5 changes: - fixed the usage of struct VhostUserMsg in all patches (Anatoly Burakov) v4 changes: - use struct VhostUserMsg as the coding style guide suggests (Anatoly Burakov) - VH_RESULT_FATAL is removed as not needed anymore (Maxime Coquelin) v3 changes: - rebased on top of git://dpdk.org/next/dpdk-next-virtio dead0602 - introduce VH_RESULT_FATAL (Maxime Coquelin) - vhost_user_set_features returns VH_RESULT_FATAL on failure. This allows keeping the error propagation logic (Ilya Maximets) - fixed vhost_user_set_vring_kick and vhost_user_set_protocol_features to return VH_RESULT_ERR upon failure - fixed missing break in case VH_RESULT_ERR (Ilya Maximets) - fixed a typo in the description of the 2/5 patch (Maxime Coquelin) v2 changes: - Fix the comments by Tiwei Bie - Keep the old behavior - Fall through when the callback returns VH_RESULT_ERR - Fall through if the request is out of range --- Nikolay Nikolaev (5): vhost: unify struct VhostUserMsg usage vhost: make message handling functions prepare the reply vhost: handle unsupported message types in functions vhost: unify message handling function signature vhost: message handling implemented as a callback array lib/librte_vhost/vhost_user.c | 393 ++--- 1 file changed, 208 insertions(+), 185 deletions(-)
[dpdk-dev] [PATCH v5 4/5] vhost: unify message handling function signature
Each vhost-user message handling function will return an int result which is described in the new enum vh_result: error, OK and reply. All functions will now have two arguments, virtio_net double pointer and VhostUserMsg pointer. Signed-off-by: Nikolay Nikolaev --- lib/librte_vhost/vhost_user.c | 211 - 1 file changed, 125 insertions(+), 86 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 77905dda0..ac89f413d 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -71,6 +71,16 @@ static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_CRYPTO_CLOSE_SESS] = "VHOST_USER_CRYPTO_CLOSE_SESS", }; +/* The possible results of a message handling function */ +enum vh_result { + /* Message handling failed */ + VH_RESULT_ERR = -1, + /* Message handling successful */ + VH_RESULT_OK= 0, + /* Message handling successful and reply prepared */ + VH_RESULT_REPLY = 1, +}; + static uint64_t get_blk_size(int fd) { @@ -127,27 +137,31 @@ vhost_backend_cleanup(struct virtio_net *dev) * the device hasn't been initialised. */ static int -vhost_user_set_owner(void) +vhost_user_set_owner(struct virtio_net **pdev __rte_unused, + VhostUserMsg * msg __rte_unused) { - return 0; + return VH_RESULT_OK; } static int -vhost_user_reset_owner(struct virtio_net *dev) +vhost_user_reset_owner(struct virtio_net **pdev, + VhostUserMsg * msg __rte_unused) { + struct virtio_net *dev = *pdev; vhost_destroy_device_notify(dev); cleanup_device(dev, 0); reset_device(dev); - return 0; + return VH_RESULT_OK; } /* * The features that we support are requested. */ -static uint64_t -vhost_user_get_features(struct virtio_net *dev, struct VhostUserMsg *msg) +static int +vhost_user_get_features(struct virtio_net **pdev, struct VhostUserMsg *msg) { + struct virtio_net *dev = *pdev; uint64_t features = 0; rte_vhost_driver_get_features(dev->ifname, &features); @@ -155,15 +169,16 @@ vhost_user_get_features(struct virtio_net *dev, struct VhostUserMsg *msg) msg->payload.u64 = features; msg->size = sizeof(msg->payload.u64); - return features; + return VH_RESULT_REPLY; } /* * The queue number that we support are requested. */ -static uint32_t -vhost_user_get_queue_num(struct virtio_net *dev, struct VhostUserMsg *msg) +static int +vhost_user_get_queue_num(struct virtio_net **pdev, struct VhostUserMsg *msg) { + struct virtio_net *dev = *pdev; uint32_t queue_num = 0; rte_vhost_driver_get_queue_num(dev->ifname, &queue_num); @@ -171,15 +186,17 @@ vhost_user_get_queue_num(struct virtio_net *dev, struct VhostUserMsg *msg) msg->payload.u64 = (uint64_t)queue_num; msg->size = sizeof(msg->payload.u64); - return queue_num; + return VH_RESULT_REPLY; } /* * We receive the negotiated features supported by us and the virtio device. 
*/ static int -vhost_user_set_features(struct virtio_net *dev, uint64_t features) +vhost_user_set_features(struct virtio_net **pdev, VhostUserMsg *msg) { + struct virtio_net *dev = *pdev; + uint64_t features = msg->payload.u64; uint64_t vhost_features = 0; struct rte_vdpa_device *vdpa_dev; int did = -1; @@ -189,12 +206,12 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features) RTE_LOG(ERR, VHOST_CONFIG, "(%d) received invalid negotiated features.\n", dev->vid); - return -1; + return VH_RESULT_ERR; } if (dev->flags & VIRTIO_DEV_RUNNING) { if (dev->features == features) - return 0; + return VH_RESULT_OK; /* * Error out if master tries to change features while device is @@ -205,7 +222,7 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features) RTE_LOG(ERR, VHOST_CONFIG, "(%d) features changed while device is running.\n", dev->vid); - return -1; + return VH_RESULT_ERR; } if (dev->notify_ops->features_changed) @@ -250,16 +267,17 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features) if (vdpa_dev && vdpa_dev->ops->set_features) vdpa_dev->ops->set_features(dev->vid); - return 0; + return VH_RESULT_OK; } /* * The virtio device sends us the size of the descriptor ring. */ static int -vhost_user_set_vring_num(struct virtio_net *dev, +vhost_user_set_vring_num(struct virtio_net **pdev, struct VhostUserMsg *msg) { + struct virtio_net *dev =
[dpdk-dev] [PATCH v5 5/5] vhost: message handling implemented as a callback array
Introduce vhost_message_handlers, which maps the message request type to the message handler. Then replace the switch construct with a map and call. Failing vhost_user_set_features is fatal and all processing should stop immediately and propagate the error to the upper layers. Change the code accordingly to reflect that. Signed-off-by: Nikolay Nikolaev --- lib/librte_vhost/vhost_user.c | 149 +++-- 1 file changed, 56 insertions(+), 93 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index ac89f413d..faad3ba49 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1477,6 +1477,34 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg) return VH_RESULT_OK; } +typedef int (*vhost_message_handler_t)(struct virtio_net **pdev, VhostUserMsg * msg); +static vhost_message_handler_t vhost_message_handlers[VHOST_USER_MAX] = { + [VHOST_USER_NONE] = NULL, + [VHOST_USER_GET_FEATURES] = vhost_user_get_features, + [VHOST_USER_SET_FEATURES] = vhost_user_set_features, + [VHOST_USER_SET_OWNER] = vhost_user_set_owner, + [VHOST_USER_RESET_OWNER] = vhost_user_reset_owner, + [VHOST_USER_SET_MEM_TABLE] = vhost_user_set_mem_table, + [VHOST_USER_SET_LOG_BASE] = vhost_user_set_log_base, + [VHOST_USER_SET_LOG_FD] = vhost_user_set_log_fd, + [VHOST_USER_SET_VRING_NUM] = vhost_user_set_vring_num, + [VHOST_USER_SET_VRING_ADDR] = vhost_user_set_vring_addr, + [VHOST_USER_SET_VRING_BASE] = vhost_user_set_vring_base, + [VHOST_USER_GET_VRING_BASE] = vhost_user_get_vring_base, + [VHOST_USER_SET_VRING_KICK] = vhost_user_set_vring_kick, + [VHOST_USER_SET_VRING_CALL] = vhost_user_set_vring_call, + [VHOST_USER_SET_VRING_ERR] = vhost_user_set_vring_err, + [VHOST_USER_GET_PROTOCOL_FEATURES] = vhost_user_get_protocol_features, + [VHOST_USER_SET_PROTOCOL_FEATURES] = vhost_user_set_protocol_features, + [VHOST_USER_GET_QUEUE_NUM] = vhost_user_get_queue_num, + [VHOST_USER_SET_VRING_ENABLE] = vhost_user_set_vring_enable, + [VHOST_USER_SEND_RARP] = vhost_user_send_rarp, + [VHOST_USER_NET_SET_MTU] = vhost_user_net_set_mtu, + [VHOST_USER_SET_SLAVE_REQ_FD] = vhost_user_set_req_fd, + [VHOST_USER_IOTLB_MSG] = vhost_user_iotlb_msg, +}; + + /* return bytes# of read on success or negative val on failure. 
*/ static int read_vhost_message(int sockfd, struct VhostUserMsg *msg) @@ -1630,6 +1658,7 @@ vhost_user_msg_handler(int vid, int fd) int ret; int unlock_required = 0; uint32_t skip_master = 0; + int request; dev = get_device(vid); if (dev == NULL) @@ -1722,100 +1751,34 @@ vhost_user_msg_handler(int vid, int fd) goto skip_to_post_handle; } - switch (msg.request.master) { - case VHOST_USER_GET_FEATURES: - ret = vhost_user_get_features(&dev, &msg); - send_vhost_reply(fd, &msg); - break; - case VHOST_USER_SET_FEATURES: - ret = vhost_user_set_features(&dev, &msg); - break; - - case VHOST_USER_GET_PROTOCOL_FEATURES: - ret = vhost_user_get_protocol_features(&dev, &msg); - send_vhost_reply(fd, &msg); - break; - case VHOST_USER_SET_PROTOCOL_FEATURES: - ret = vhost_user_set_protocol_features(&dev, &msg); - break; - - case VHOST_USER_SET_OWNER: - ret = vhost_user_set_owner(&dev, &msg); - break; - case VHOST_USER_RESET_OWNER: - ret = vhost_user_reset_owner(&dev, &msg); - break; - - case VHOST_USER_SET_MEM_TABLE: - ret = vhost_user_set_mem_table(&dev, &msg); - break; - - case VHOST_USER_SET_LOG_BASE: - ret = vhost_user_set_log_base(&dev, &msg); - if (ret) - goto skip_to_reply; - /* it needs a reply */ - send_vhost_reply(fd, &msg); - break; - case VHOST_USER_SET_LOG_FD: - ret = vhost_user_set_log_fd(&dev, &msg); - break; - - case VHOST_USER_SET_VRING_NUM: - ret = vhost_user_set_vring_num(&dev, &msg); - break; - case VHOST_USER_SET_VRING_ADDR: - ret = vhost_user_set_vring_addr(&dev, &msg); - break; - case VHOST_USER_SET_VRING_BASE: - ret = vhost_user_set_vring_base(&dev, &msg); - break; - - case VHOST_USER_GET_VRING_BASE: - ret = vhost_user_get_vring_base(&dev, &msg); - if (ret) - goto skip_to_reply; - send_vhost_reply(fd, &msg); - break; - - case VHOST_USER_SET_VRING_KICK: - ret = vhost_user_set_vring_kick(&dev, &msg); -
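The hunk above shows the table but not the new call site; the dispatch that replaces the switch would look roughly like this sketch (types and enum values as defined earlier in the file), not a quote of the patch:

/* Illustration of the dispatch pattern only. */
static int
dispatch_message(struct virtio_net **pdev, struct VhostUserMsg *msg)
{
	int request = msg->request.master;

	if (request <= VHOST_USER_NONE || request >= VHOST_USER_MAX)
		return VH_RESULT_ERR;
	if (vhost_message_handlers[request] == NULL)
		return VH_RESULT_ERR;

	/* VH_RESULT_REPLY tells the caller to send_vhost_reply(). */
	return vhost_message_handlers[request](pdev, msg);
}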
[dpdk-dev] [PATCH] app/testpmd: fix printf format specifiers
change PRIu8 -> PRIu16 for port_id (portid_t is uint16_t) in eth_event_callback Fixes: 76ad4a2d82d4 ("app/testpmd: add generic event handler") Cc: sta...@dpdk.org Signed-off-by: Herakliusz Lipiec --- app/test-pmd/testpmd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index 001f0e5..dacc60a 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -2215,11 +2215,11 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param, RTE_SET_USED(ret_param); if (type >= RTE_ETH_EVENT_MAX) { - fprintf(stderr, "\nPort %" PRIu8 ": %s called upon invalid event %d\n", + fprintf(stderr, "\nPort %" PRIu16 ": %s called upon invalid event %d\n", port_id, __func__, type); fflush(stderr); } else if (event_print_mask & (UINT32_C(1) << type)) { - printf("\nPort %" PRIu8 ": %s event\n", port_id, + printf("\nPort %" PRIu16 ": %s event\n", port_id, event_desc[type]); fflush(stdout); } -- 2.9.5
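The point of the fix is that the format macro should match the declared type of the argument; a small standalone example (not testpmd code) of using the <inttypes.h> macros for a 16-bit port id:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	uint16_t port_id = 300;	/* does not fit in uint8_t */

	/* PRIu16 matches the declared uint16_t type on every platform;
	 * with PRIu8 a port id above 255 may be printed truncated on
	 * platforms where PRIu8 expands to "hhu". */
	printf("Port %" PRIu16 ": link up\n", port_id);
	return 0;
}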
Re: [dpdk-dev] [PATCH] netvsc: support multicast/promiscuous settings on VF
On 9/21/2018 5:54 PM, Stephen Hemminger wrote: > Provide API's to enable allmulticast and promiscuous in Netvsc PMD > with VF. This keeps the VF and PV path in sync. VF and PF? > > Signed-off-by: Stephen Hemminger > --- > Patch against dpdk-net-next > > drivers/net/netvsc/hn_ethdev.c | 14 + > drivers/net/netvsc/hn_var.h| 9 + > drivers/net/netvsc/hn_vf.c | 37 ++ > 3 files changed, 60 insertions(+) > > diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c > index b67cce1ba8f5..3092066ada36 100644 > --- a/drivers/net/netvsc/hn_ethdev.c > +++ b/drivers/net/netvsc/hn_ethdev.c > @@ -255,6 +255,7 @@ hn_dev_promiscuous_enable(struct rte_eth_dev *dev) > struct hn_data *hv = dev->data->dev_private; > > hn_rndis_set_rxfilter(hv, NDIS_PACKET_TYPE_PROMISCUOUS); > + hn_vf_promiscuous_enable(dev); This VF approach is confusing to me, is this calling an underlay device a VF device? <...> > +static int > +hn_dev_mc_addr_list(struct rte_eth_dev *dev, > + struct ether_addr *mc_addr_set, > + uint32_t nb_mc_addr) Just to double check, this dev_ops is to add multicast MAC filters; to add MAC filters there are mac_addr_set, mac_addr_add and mac_addr_remove. Many HW seem to be able to set the multicast MAC filters via "mac_addr_add" too. If this is the intention, please enable "Multicast MAC filter" in netvsc.ini
Re: [dpdk-dev] [PATCH v1] eal: use correct data type for slab operations
+ dev. +cristian(Bitmap maintainer) Please review. Thanks! On Monday 24 September 2018 09:08 PM, Vivek Sharma wrote: > Currently, slab operations use unsigned long data type for 64-bit slab > related operations. On target 'i686-native-linuxapp-gcc', unsigned long > is 32-bit and thus, slab operations breaks on this target. Changing slab > operations to use unsigned long long for correct functioning on all targets. > > Fixes: de3cfa2c9823 ("sched: initial import") > Fixes: 693f715da45c ("remove extra parentheses in return statement") > CC: sta...@dpdk.org > > Signed-off-by: Vivek Sharma > --- > lib/librte_eal/common/include/rte_bitmap.h | 14 +++--- > test/test/test_bitmap.c| 18 ++ > 2 files changed, 25 insertions(+), 7 deletions(-) > > diff --git a/lib/librte_eal/common/include/rte_bitmap.h > b/lib/librte_eal/common/include/rte_bitmap.h > index d9facc6..7a36ce7 100644 > --- a/lib/librte_eal/common/include/rte_bitmap.h > +++ b/lib/librte_eal/common/include/rte_bitmap.h > @@ -88,7 +88,7 @@ __rte_bitmap_index1_inc(struct rte_bitmap *bmp) > static inline uint64_t > __rte_bitmap_mask1_get(struct rte_bitmap *bmp) > { > - return (~1lu) << bmp->offset1; > + return (~1llu) << bmp->offset1; > } > > static inline void > @@ -317,7 +317,7 @@ rte_bitmap_get(struct rte_bitmap *bmp, uint32_t pos) > index2 = pos >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2; > offset2 = pos & RTE_BITMAP_SLAB_BIT_MASK; > slab2 = bmp->array2 + index2; > - return (*slab2) & (1lu << offset2); > + return (*slab2) & (1llu << offset2); > } > > /** > @@ -342,8 +342,8 @@ rte_bitmap_set(struct rte_bitmap *bmp, uint32_t pos) > slab2 = bmp->array2 + index2; > slab1 = bmp->array1 + index1; > > - *slab2 |= 1lu << offset2; > - *slab1 |= 1lu << offset1; > + *slab2 |= 1llu << offset2; > + *slab1 |= 1llu << offset1; > } > > /** > @@ -370,7 +370,7 @@ rte_bitmap_set_slab(struct rte_bitmap *bmp, uint32_t pos, > uint64_t slab) > slab1 = bmp->array1 + index1; > > *slab2 |= slab; > - *slab1 |= 1lu << offset1; > + *slab1 |= 1llu << offset1; > } > > static inline uint64_t > @@ -408,7 +408,7 @@ rte_bitmap_clear(struct rte_bitmap *bmp, uint32_t pos) > slab2 = bmp->array2 + index2; > > /* Return if array2 slab is not all-zeros */ > - *slab2 &= ~(1lu << offset2); > + *slab2 &= ~(1llu << offset2); > if (*slab2){ > return; > } > @@ -424,7 +424,7 @@ rte_bitmap_clear(struct rte_bitmap *bmp, uint32_t pos) > index1 = pos >> (RTE_BITMAP_SLAB_BIT_SIZE_LOG2 + > RTE_BITMAP_CL_BIT_SIZE_LOG2); > offset1 = (pos >> RTE_BITMAP_CL_BIT_SIZE_LOG2) & > RTE_BITMAP_SLAB_BIT_MASK; > slab1 = bmp->array1 + index1; > - *slab1 &= ~(1lu << offset1); > + *slab1 &= ~(1llu << offset1); > > return; > } > diff --git a/test/test/test_bitmap.c b/test/test/test_bitmap.c > index c3169e9..95c5184 100644 > --- a/test/test/test_bitmap.c > +++ b/test/test/test_bitmap.c > @@ -101,6 +101,7 @@ test_bitmap_slab_set_get(struct rte_bitmap *bmp) > static int > test_bitmap_set_get_clear(struct rte_bitmap *bmp) > { > + uint64_t val; > int i; > > rte_bitmap_reset(bmp); > @@ -124,6 +125,23 @@ test_bitmap_set_get_clear(struct rte_bitmap *bmp) > } > } > > + rte_bitmap_reset(bmp); > + > + /* Alternate slab set test */ > + for (i = 0; i < MAX_BITS; i++) { > + if (i % RTE_BITMAP_SLAB_BIT_SIZE) > + rte_bitmap_set(bmp, i); > + } > + > + for (i = 0; i < MAX_BITS; i++) { > + val = rte_bitmap_get(bmp, i); > + if (((i % RTE_BITMAP_SLAB_BIT_SIZE) && !val) || > + (!(i % RTE_BITMAP_SLAB_BIT_SIZE) && val)) { > + printf("Failed to get set bit.\n"); > + return TEST_FAILED; > + } > + } > + > return TEST_SUCCESS; > } > >
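The underlying issue can be reproduced outside of DPDK; a small standalone sketch, assuming an ILP32 target such as i686 where unsigned long is 32 bits:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	uint64_t slab = 0;
	uint32_t offset = 40;	/* bit position in the upper half of the slab */

	/* 1llu is at least 64 bits wide, so this shift is well defined. */
	slab |= 1llu << offset;

	/* With a 32-bit unsigned long, "1lu << 40" shifts past the width of
	 * the type: undefined behaviour, and in practice a lost bit. That is
	 * what broke the bitmap operations on i686 and what the patch fixes. */
	printf("unsigned long: %zu bytes, unsigned long long: %zu bytes\n",
	       sizeof(unsigned long), sizeof(unsigned long long));
	printf("slab = 0x%016" PRIx64 "\n", slab);
	return 0;
}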
Re: [dpdk-dev] [PATCH v5 4/5] vhost: unify message handling function signature
On 24-Sep-18 4:21 PM, Nikolay Nikolaev wrote: Each vhost-user message handling function will return an int result which is described in the new enum vh_result: error, OK and reply. All functions will now have two arguments, virtio_net double pointer and VhostUserMsg pointer. Signed-off-by: Nikolay Nikolaev --- get_blk_size(int fd) { @@ -127,27 +137,31 @@ vhost_backend_cleanup(struct virtio_net *dev) * the device hasn't been initialised. */ static int -vhost_user_set_owner(void) +vhost_user_set_owner(struct virtio_net **pdev __rte_unused, + VhostUserMsg * msg __rte_unused) Missed a few instances of using a typedef in this patch. -- Thanks, Anatoly
Re: [dpdk-dev] [PATCH v3 0/2] net/failsafe: support multicast MAC address set
On 9/21/2018 5:09 PM, Gaëtan Rivet wrote: > Hi, > > Seems good, thanks. > > Acked-by: Gaetan Rivet Series applied to dpdk-next-net/master, thanks. > > On Fri, Sep 21, 2018 at 04:36:20PM +0100, Andrew Rybchenko wrote: >> v3: >> - move apply on sync to fs_eth_dev_conf_apply() to apply to >> a new subdevice only >> - use ethdev API to apply to sub-device on sync >> - remove unnecessary check the same pointer from the method >> implementation in failsafe >> >> v2: >> - fix setting of zero addresses since rte_realloc() returns NULL >> >> Evgeny Im (2): >> net/failsafe: remove not supported multicast MAC filter >> net/failsafe: support multicast address list set >> >> doc/guides/rel_notes/release_18_11.rst | 7 >> drivers/net/failsafe/failsafe.c | 1 + >> drivers/net/failsafe/failsafe_ether.c | 17 + >> drivers/net/failsafe/failsafe_ops.c | 50 + >> drivers/net/failsafe/failsafe_private.h | 2 + >> 5 files changed, 77 insertions(+) >> >> -- >> 2.17.1 >> >
[dpdk-dev] [PATCH] net/cxgbe: add missing DEV_RX_OFFLOAD_SCATTER flag
Scatter Rx is already supported by CXGBE PMD. So, add the missing DEV_RX_OFFLOAD_SCATTER flag to the list of supported Rx offload features. Also, move the macros for supported list of offload features to header file. Fixes: 436125e64174 ("net/cxgbe: update to Rx/Tx offload API") Cc: sta...@dpdk.org Reported-by: Martin Weiser Signed-off-by: Rahul Lakkireddy --- drivers/net/cxgbe/cxgbe.h| 15 +++ drivers/net/cxgbe/cxgbe_ethdev.c | 19 +++ 2 files changed, 22 insertions(+), 12 deletions(-) diff --git a/drivers/net/cxgbe/cxgbe.h b/drivers/net/cxgbe/cxgbe.h index 5e6f5c98d..eb58f8802 100644 --- a/drivers/net/cxgbe/cxgbe.h +++ b/drivers/net/cxgbe/cxgbe.h @@ -34,6 +34,21 @@ ETH_RSS_IPV6_UDP_EX) #define CXGBE_RSS_HF_ALL (ETH_RSS_IP | ETH_RSS_TCP | ETH_RSS_UDP) +/* Tx/Rx Offloads supported */ +#define CXGBE_TX_OFFLOADS (DEV_TX_OFFLOAD_VLAN_INSERT | \ + DEV_TX_OFFLOAD_IPV4_CKSUM | \ + DEV_TX_OFFLOAD_UDP_CKSUM | \ + DEV_TX_OFFLOAD_TCP_CKSUM | \ + DEV_TX_OFFLOAD_TCP_TSO) + +#define CXGBE_RX_OFFLOADS (DEV_RX_OFFLOAD_VLAN_STRIP | \ + DEV_RX_OFFLOAD_IPV4_CKSUM | \ + DEV_RX_OFFLOAD_UDP_CKSUM | \ + DEV_RX_OFFLOAD_TCP_CKSUM | \ + DEV_RX_OFFLOAD_JUMBO_FRAME | \ + DEV_RX_OFFLOAD_SCATTER) + + #define CXGBE_DEVARG_KEEP_OVLAN "keep_ovlan" #define CXGBE_DEVARG_FORCE_LINK_UP "force_link_up" diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c index 117263e2b..b2f83ea37 100644 --- a/drivers/net/cxgbe/cxgbe_ethdev.c +++ b/drivers/net/cxgbe/cxgbe_ethdev.c @@ -59,18 +59,6 @@ */ #include "t4_pci_id_tbl.h" -#define CXGBE_TX_OFFLOADS (DEV_TX_OFFLOAD_VLAN_INSERT |\ - DEV_TX_OFFLOAD_IPV4_CKSUM |\ - DEV_TX_OFFLOAD_UDP_CKSUM |\ - DEV_TX_OFFLOAD_TCP_CKSUM |\ - DEV_TX_OFFLOAD_TCP_TSO) - -#define CXGBE_RX_OFFLOADS (DEV_RX_OFFLOAD_VLAN_STRIP |\ - DEV_RX_OFFLOAD_IPV4_CKSUM |\ - DEV_RX_OFFLOAD_JUMBO_FRAME |\ - DEV_RX_OFFLOAD_UDP_CKSUM |\ - DEV_RX_OFFLOAD_TCP_CKSUM) - uint16_t cxgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts) { @@ -340,6 +328,7 @@ void cxgbe_dev_close(struct rte_eth_dev *eth_dev) int cxgbe_dev_start(struct rte_eth_dev *eth_dev) { struct port_info *pi = (struct port_info *)(eth_dev->data->dev_private); + struct rte_eth_rxmode *rx_conf = ð_dev->data->dev_conf.rxmode; struct adapter *adapter = pi->adapter; int err = 0, i; @@ -360,6 +349,11 @@ int cxgbe_dev_start(struct rte_eth_dev *eth_dev) goto out; } + if (rx_conf->offloads & DEV_RX_OFFLOAD_SCATTER) + eth_dev->data->scattered_rx = 1; + else + eth_dev->data->scattered_rx = 0; + cxgbe_enable_rx_queues(pi); err = setup_rss(pi); @@ -406,6 +400,7 @@ void cxgbe_dev_stop(struct rte_eth_dev *eth_dev) * have been disabled */ t4_sge_eth_clear_queues(pi); + eth_dev->data->scattered_rx = 0; } int cxgbe_dev_configure(struct rte_eth_dev *eth_dev) -- 2.18.0
[dpdk-dev] [PATCH] doc: announce CRC strip changes in release notes
Document changes done in commit 323e7b667f18 ("ethdev: make default behavior CRC strip on Rx") Signed-off-by: Ferruh Yigit --- doc/guides/rel_notes/release_18_11.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst index 2f53564a9..41b9cd8d5 100644 --- a/doc/guides/rel_notes/release_18_11.rst +++ b/doc/guides/rel_notes/release_18_11.rst @@ -112,6 +112,12 @@ API Changes flag the MAC can be properly configured in any case. This is particularly important for bonding. +* The default behaviour of CRC strip offload changed. Without any specific Rx + offload flag, default behavior by PMD is now to strip CRC. + DEV_RX_OFFLOAD_CRC_STRIP offload flag has been removed. + To request keeping CRC, application should set ``DEV_RX_OFFLOAD_KEEP_CRC`` Rx + offload. + ABI Changes --- -- 2.17.1
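For applications that relied on the old default, the note above translates into one extra Rx offload flag at configure time; a minimal sketch, assuming the 18.11-era offload API and a port that advertises DEV_RX_OFFLOAD_KEEP_CRC in its capabilities:

#include <rte_ethdev.h>

static int configure_port_keep_crc(uint16_t port_id)
{
	struct rte_eth_dev_info dev_info;
	struct rte_eth_conf port_conf = { 0 };

	rte_eth_dev_info_get(port_id, &dev_info);

	/* CRC stripping is now the default; ask explicitly to keep it. */
	if (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_KEEP_CRC)
		port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_KEEP_CRC;

	/* 1 Rx queue, 1 Tx queue just for the sake of the example. */
	return rte_eth_dev_configure(port_id, 1, 1, &port_conf);
}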
Re: [dpdk-dev] [PATCH] drivers/net: do not redefine bool
24/09/2018 17:06, Ferruh Yigit: > On 9/20/2018 1:18 AM, Thomas Monjalon wrote: > > -#define false FALSE > > -#define true TRUE > > TRUE and FALSE also defined in this patch, can we remove them too? I don't see the need to remove TRUE and FALSE. The base drivers use them on other platforms, and it is convenient to not change the base drivers. [...] > > static int > > ixgbevf_check_link(struct ixgbe_hw *hw, ixgbe_link_speed *speed, > > - int *link_up, int wait_to_complete) > > + bool *link_up, int wait_to_complete) > > Also need to change "wait_to_complete" to bool because below changes start > sending bool type to this function. [...] > > --- a/drivers/net/ixgbe/ixgbe_rxtx.c > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c > > @@ -2025,7 +2025,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf > > **rx_pkts, uint16_t nb_pkts, > > struct ixgbe_rx_entry *next_rxe = NULL; > > struct rte_mbuf *first_seg; > > struct rte_mbuf *rxm; > > - struct rte_mbuf *nmb; > > + struct rte_mbuf *nmb = NULL; > > This change is unrelated. Can we separate this one? Yes it looks unrelated but it becomes necessary when including stdbool.h. I don't know the root cause, but yes, it may deserve a separate commit. Maybe an ixgbe maintainer can take care of it?
Re: [dpdk-dev] [PATCH] doc: announce CRC strip changes in release notes
24/09/2018 19:31, Ferruh Yigit: > Document changes done in > commit 323e7b667f18 ("ethdev: make default behavior CRC strip on Rx") > > Signed-off-by: Ferruh Yigit > --- > --- a/doc/guides/rel_notes/release_18_11.rst > +++ b/doc/guides/rel_notes/release_18_11.rst > @@ -112,6 +112,12 @@ API Changes > +* The default behaviour of CRC strip offload changed. Without any specific Rx > + offload flag, default behavior by PMD is now to strip CRC. > + DEV_RX_OFFLOAD_CRC_STRIP offload flag has been removed. > + To request keeping CRC, application should set ``DEV_RX_OFFLOAD_KEEP_CRC`` > Rx > + offload. Acked-by: Thomas Monjalon Thanks
Re: [dpdk-dev] [PATCH] doc: announce CRC strip changes in release notes
On Mon, Sep 24, 2018 at 7:31 PM, Ferruh Yigit wrote: > Document changes done in > commit 323e7b667f18 ("ethdev: make default behavior CRC strip on Rx") > > Signed-off-by: Ferruh Yigit > --- > doc/guides/rel_notes/release_18_11.rst | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/doc/guides/rel_notes/release_18_11.rst > b/doc/guides/rel_notes/release_18_11.rst > index 2f53564a9..41b9cd8d5 100644 > --- a/doc/guides/rel_notes/release_18_11.rst > +++ b/doc/guides/rel_notes/release_18_11.rst > @@ -112,6 +112,12 @@ API Changes >flag the MAC can be properly configured in any case. This is particularly >important for bonding. > > +* The default behaviour of CRC strip offload changed. Without any specific Rx > + offload flag, default behavior by PMD is now to strip CRC. > + DEV_RX_OFFLOAD_CRC_STRIP offload flag has been removed. > + To request keeping CRC, application should set ``DEV_RX_OFFLOAD_KEEP_CRC`` > Rx > + offload. > + > > ABI Changes > --- Reviewed-by: David Marchand -- David Marchand
Re: [dpdk-dev] [PATCH v1] eal: use correct data type for slab operations
> -Original Message- > From: Vivek Sharma [mailto:vivek.sha...@caviumnetworks.com] > Sent: Monday, September 24, 2018 4:50 PM > To: Dumitrescu, Cristian > Cc: Sharma, Vivek ; sta...@dpdk.org; > dev@dpdk.org > Subject: Re: [PATCH v1] eal: use correct data type for slab operations > > + dev. > +cristian(Bitmap maintainer) > > Please review. > > Thanks! > > On Monday 24 September 2018 09:08 PM, Vivek Sharma wrote: > > Currently, slab operations use unsigned long data type for 64-bit slab > > related operations. On target 'i686-native-linuxapp-gcc', unsigned long > > is 32-bit and thus, slab operations breaks on this target. Changing slab > > operations to use unsigned long long for correct functioning on all targets. > > > > Fixes: de3cfa2c9823 ("sched: initial import") > > Fixes: 693f715da45c ("remove extra parentheses in return statement") > > CC: sta...@dpdk.org > > > > Signed-off-by: Vivek Sharma > > --- > > lib/librte_eal/common/include/rte_bitmap.h | 14 +++--- > > test/test/test_bitmap.c| 18 ++ > > 2 files changed, 25 insertions(+), 7 deletions(-) > > Acked-by: Cristian Dumitrescu
[dpdk-dev] [PATCH 1/2] net/mlx4: support externally allocated static memory
When MLX PMD registers memory for DMA, it accesses the global memseg list of DPDK to maximize the range of registration so that LKey search can be more efficient. Granularity of MR registration is per page. Externally allocated memory shouldn't be used for DMA because it can't be searched in the memseg list and free event can't be tracked by DPDK. If it is used, the following error will occur: net_mlx5: port 0 unable to find virtually contiguous chunk for address (0x5600017587c0). rte_memseg_contig_walk() failed. There's a pending patchset [1] which enables externally allocated memory. Once it is merged, users can register their own memory out of EAL then that will resolve this issue. Meanwhile, if the external memory is static (allocated on startup and never freed), such memory can also be registered by little tweak in the code. [1] http://patches.dpdk.org/project/dpdk/list/?series=1415 This patch is not a bug fix but needs to be included in stable versions. Fixes: 9797bfcce1c9 ("net/mlx4: add new memory region support") Cc: sta...@dpdk.org Cc: "Damjan Marion (damarion)" Cc: Ed Warnicke Signed-off-by: Yongseok Koh --- drivers/net/mlx4/mlx4_mr.c | 149 +++ drivers/net/mlx4/mlx4_rxtx.h | 35 +- 2 files changed, 183 insertions(+), 1 deletion(-) diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c index d23d3c613..bee858643 100644 --- a/drivers/net/mlx4/mlx4_mr.c +++ b/drivers/net/mlx4/mlx4_mr.c @@ -289,6 +289,23 @@ mr_find_next_chunk(struct mlx4_mr *mr, struct mlx4_mr_cache *entry, uintptr_t end = 0; uint32_t idx = 0; + /* MR for external memory doesn't have memseg list. */ + if (mr->msl == NULL) { + struct ibv_mr *ibv_mr = mr->ibv_mr; + + assert(mr->ms_bmp_n == 1); + assert(mr->ms_n == 1); + assert(base_idx == 0); + /* +* Can't search it from memseg list but get it directly from +* verbs MR as there's only one chunk. +*/ + entry->start = (uintptr_t)ibv_mr->addr; + entry->end = (uintptr_t)ibv_mr->addr + mr->ibv_mr->length; + entry->lkey = rte_cpu_to_be_32(mr->ibv_mr->lkey); + /* Returning 1 ends iteration. */ + return 1; + } for (idx = base_idx; idx < mr->ms_bmp_n; ++idx) { if (rte_bitmap_get(mr->ms_bmp, idx)) { const struct rte_memseg_list *msl; @@ -809,6 +826,7 @@ mlx4_mr_mem_event_free_cb(struct rte_eth_dev *dev, const void *addr, size_t len) mr = mr_lookup_dev_list(dev, &entry, start); if (mr == NULL) continue; + assert(mr->msl); /* Can't be external memory. */ ms = rte_mem_virt2memseg((void *)start, msl); assert(ms != NULL); assert(msl->page_sz == ms->hugepage_sz); @@ -1055,6 +1073,133 @@ mlx4_mr_flush_local_cache(struct mlx4_mr_ctrl *mr_ctrl) (void *)mr_ctrl, mr_ctrl->cur_gen); } +/** + * Called during rte_mempool_mem_iter() by mlx4_mr_update_ext_mp(). + * + * Externally allocated chunk is registered and a MR is created for the chunk. + * The MR object is added to the global list. If memseg list of a MR object + * (mr->msl) is null, the MR object can be regarded as externally allocated + * memory. + * + * Once external memory is registered, it should be static. If the memory is + * freed and the virtual address range has different physical memory mapped + * again, it may cause crash on device due to the wrong translation entry. PMD + * can't track the free event of the external memory for now. 
+ */ +static void +mlx4_mr_update_ext_mp_cb(struct rte_mempool *mp, void *opaque, +struct rte_mempool_memhdr *memhdr, +unsigned mem_idx __rte_unused) +{ + struct mr_update_mp_data *data = opaque; + struct rte_eth_dev *dev = data->dev; + struct priv *priv = dev->data->dev_private; + struct mlx4_mr_ctrl *mr_ctrl = data->mr_ctrl; + struct mlx4_mr *mr = NULL; + uintptr_t addr = (uintptr_t)memhdr->addr; + size_t len = memhdr->len; + struct mlx4_mr_cache entry; + uint32_t lkey; + + /* If already registered, it should return. */ + rte_rwlock_read_lock(&priv->mr.rwlock); + lkey = mr_lookup_dev(dev, &entry, addr); + rte_rwlock_read_unlock(&priv->mr.rwlock); + if (lkey != UINT32_MAX) + return; + mr = rte_zmalloc_socket(NULL, + RTE_ALIGN_CEIL(sizeof(*mr), + RTE_CACHE_LINE_SIZE), + RTE_CACHE_LINE_SIZE, mp->socket_id); + if (mr == NULL) { + WARN("port %u unable to allocate memory for a new MR of" +" mempool (%s).", +dev
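The registration flow added above is, at its core, a lookup-then-register step per mempool chunk; a simplified standalone sketch of that logic (the chunk list, MR table and register_dma() call are stand-ins for the verbs objects, not the mlx4 code):

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

struct chunk { uintptr_t addr; size_t len; };
struct mr    { uintptr_t start, end; uint32_t lkey; };

static struct mr mr_table[16];
static int mr_count;

/* Return the lkey covering addr, or UINT32_MAX when not registered yet. */
static uint32_t mr_lookup(uintptr_t addr)
{
	for (int i = 0; i < mr_count; i++)
		if (addr >= mr_table[i].start && addr < mr_table[i].end)
			return mr_table[i].lkey;
	return UINT32_MAX;
}

/* Stand-in for the verbs registration call; returns a fake lkey. */
static uint32_t register_dma(uintptr_t addr, size_t len)
{
	struct mr m = { addr, addr + len, (uint32_t)(0x1000 + mr_count) };

	mr_table[mr_count++] = m;
	return m.lkey;
}

/* Called once per memory chunk of an externally allocated mempool. */
static void update_ext_chunk(const struct chunk *c)
{
	if (mr_lookup(c->addr) != UINT32_MAX)
		return;			/* already covered by an existing MR */
	register_dma(c->addr, c->len);	/* chunk must stay mapped for good */
}

int main(void)
{
	struct chunk chunks[2] = { { 0x100000, 4096 }, { 0x200000, 4096 } };

	for (int i = 0; i < 2; i++)
		update_ext_chunk(&chunks[i]);
	printf("registered %d MR(s)\n", mr_count);
	return 0;
}

The "must stay mapped for good" comment is the key constraint from the commit message: since the PMD cannot see free events for external memory, a registered chunk has to remain static.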
[dpdk-dev] [PATCH 2/2] net/mlx5: support externally allocated static memory
When MLX PMD registers memory for DMA, it accesses the global memseg list of DPDK to maximize the range of registration so that LKey search can be more efficient. Granularity of MR registration is per page. Externally allocated memory shouldn't be used for DMA because it can't be searched in the memseg list and free event can't be tracked by DPDK. If it is used, the following error will occur: net_mlx5: port 0 unable to find virtually contiguous chunk for address (0x5600017587c0). rte_memseg_contig_walk() failed. There's a pending patchset [1] which enables externally allocated memory. Once it is merged, users can register their own memory out of EAL then that will resolve this issue. Meanwhile, if the external memory is static (allocated on startup and never freed), such memory can also be registered by little tweak in the code. [1] http://patches.dpdk.org/project/dpdk/list/?series=1415 This patch is not a bug fix but needs to be included in stable versions. Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support") Cc: sta...@dpdk.org Cc: "Damjan Marion (damarion)" Cc: Ed Warnicke Signed-off-by: Yongseok Koh --- drivers/net/mlx5/mlx5_mr.c | 155 +++ drivers/net/mlx5/mlx5_rxtx.h | 35 +- 2 files changed, 189 insertions(+), 1 deletion(-) diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c index 1d1bcb5fe..f4b15d3f6 100644 --- a/drivers/net/mlx5/mlx5_mr.c +++ b/drivers/net/mlx5/mlx5_mr.c @@ -277,6 +277,23 @@ mr_find_next_chunk(struct mlx5_mr *mr, struct mlx5_mr_cache *entry, uintptr_t end = 0; uint32_t idx = 0; + /* MR for external memory doesn't have memseg list. */ + if (mr->msl == NULL) { + struct ibv_mr *ibv_mr = mr->ibv_mr; + + assert(mr->ms_bmp_n == 1); + assert(mr->ms_n == 1); + assert(base_idx == 0); + /* +* Can't search it from memseg list but get it directly from +* verbs MR as there's only one chunk. +*/ + entry->start = (uintptr_t)ibv_mr->addr; + entry->end = (uintptr_t)ibv_mr->addr + mr->ibv_mr->length; + entry->lkey = rte_cpu_to_be_32(mr->ibv_mr->lkey); + /* Returning 1 ends iteration. */ + return 1; + } for (idx = base_idx; idx < mr->ms_bmp_n; ++idx) { if (rte_bitmap_get(mr->ms_bmp, idx)) { const struct rte_memseg_list *msl; @@ -811,6 +828,7 @@ mlx5_mr_mem_event_free_cb(struct rte_eth_dev *dev, const void *addr, size_t len) mr = mr_lookup_dev_list(dev, &entry, start); if (mr == NULL) continue; + assert(mr->msl); /* Can't be external memory. */ ms = rte_mem_virt2memseg((void *)start, msl); assert(ms != NULL); assert(msl->page_sz == ms->hugepage_sz); @@ -1061,6 +1079,139 @@ mlx5_mr_flush_local_cache(struct mlx5_mr_ctrl *mr_ctrl) (void *)mr_ctrl, mr_ctrl->cur_gen); } +/** + * Called during rte_mempool_mem_iter() by mlx5_mr_update_ext_mp(). + * + * Externally allocated chunk is registered and a MR is created for the chunk. + * The MR object is added to the global list. If memseg list of a MR object + * (mr->msl) is null, the MR object can be regarded as externally allocated + * memory. + * + * Once external memory is registered, it should be static. If the memory is + * freed and the virtual address range has different physical memory mapped + * again, it may cause crash on device due to the wrong translation entry. PMD + * can't track the free event of the external memory for now. 
+ */ +static void +mlx5_mr_update_ext_mp_cb(struct rte_mempool *mp, void *opaque, +struct rte_mempool_memhdr *memhdr, +unsigned mem_idx __rte_unused) +{ + struct mr_update_mp_data *data = opaque; + struct rte_eth_dev *dev = data->dev; + struct priv *priv = dev->data->dev_private; + struct mlx5_mr_ctrl *mr_ctrl = data->mr_ctrl; + struct mlx5_mr *mr = NULL; + uintptr_t addr = (uintptr_t)memhdr->addr; + size_t len = memhdr->len; + struct mlx5_mr_cache entry; + uint32_t lkey; + + /* If already registered, it should return. */ + rte_rwlock_read_lock(&priv->mr.rwlock); + lkey = mr_lookup_dev(dev, &entry, addr); + rte_rwlock_read_unlock(&priv->mr.rwlock); + if (lkey != UINT32_MAX) + return; + mr = rte_zmalloc_socket(NULL, + RTE_ALIGN_CEIL(sizeof(*mr), + RTE_CACHE_LINE_SIZE), + RTE_CACHE_LINE_SIZE, mp->socket_id); + if (mr == NULL) { + DRV_LOG(WARNING, + "port %u unable to allocate memory for a new MR of" + "
[dpdk-dev] [PATCH v2 00/11] net/mlx5: add Direct Verbs flow driver support
RFC: https://mails.dpdk.org/archives/dev/2018-August/109950.html v2: * make changes for the newly introduced meson build. Ori Kam (11): net/mlx5: split flow validation to dedicated function net/mlx5: add flow prepare function net/mlx5: add flow translate function net/mlx5: add support for multiple flow drivers net/mlx5: add Direct Verbs validation function net/mlx5: add Direct Verbs prepare function net/mlx5: add Direct Verbs translate items net/mlx5: add Direct Verbs translate actions net/mlx5: add Direct Verbs driver to glue net/mlx5: add Direct Verbs final functions net/mlx5: add runtime parameter to enable Direct Verbs doc/guides/nics/mlx5.rst |7 + drivers/net/mlx5/Makefile |9 +- drivers/net/mlx5/meson.build |6 +- drivers/net/mlx5/mlx5.c|8 + drivers/net/mlx5/mlx5.h|2 + drivers/net/mlx5/mlx5_flow.c | 3271 +++- drivers/net/mlx5/mlx5_flow.h | 326 drivers/net/mlx5/mlx5_flow_dv.c| 1373 +++ drivers/net/mlx5/mlx5_flow_verbs.c | 1652 ++ drivers/net/mlx5/mlx5_glue.c | 45 + drivers/net/mlx5/mlx5_glue.h | 15 + drivers/net/mlx5/mlx5_prm.h| 220 +++ drivers/net/mlx5/mlx5_rxtx.h |7 + 13 files changed, 4620 insertions(+), 2321 deletions(-) create mode 100644 drivers/net/mlx5/mlx5_flow.h create mode 100644 drivers/net/mlx5/mlx5_flow_dv.c create mode 100644 drivers/net/mlx5/mlx5_flow_verbs.c -- 2.11.0
[dpdk-dev] [PATCH v2 01/11] net/mlx5: split flow validation to dedicated function
From: Ori Kam In current implementation the validation logic reside in the same function that calculates the size of the verbs spec and also create the verbs spec. This approach results in hard to maintain code which can't be shared. also in current logic there is a use of parser entity that holds the information between function calls. The main problem with this parser is that it assumes the connection between different functions. For example it assumes that the validation function was called and relevant values were set. This may result in an issue if and when we for example only call the validation function, or call the apply function without the validation (Currently according to RTE flow we must call validation before creating flow, but if we want to change that to save time during flow creation, for example the user validated some rule and just want to change the IP there is no true reason the validate the rule again). This commit address both of those issues by extracting the validation logic into detected functions and remove the use of the parser object. The side effect of those changes is that in some cases there will be a need to traverse the item list again. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/mlx5_flow.c | 1889 +++--- 1 file changed, 1224 insertions(+), 665 deletions(-) diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 3f548a9a4..d214846c1 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -91,6 +91,14 @@ extern const struct eth_dev_ops mlx5_dev_ops_isolate; #define MLX5_FLOW_MOD_MARK (1u << 1) #define MLX5_FLOW_MOD_COUNT (1u << 2) +/* Actions */ +#define MLX5_ACTION_DROP (1u << 0) +#define MLX5_ACTION_QUEUE (1u << 1) +#define MLX5_ACTION_RSS (1u << 2) +#define MLX5_ACTION_FLAG (1u << 3) +#define MLX5_ACTION_MARK (1u << 4) +#define MLX5_ACTION_COUNT (1u << 5) + /* possible L3 layers protocols filtering. */ #define MLX5_IP_PROTOCOL_TCP 6 #define MLX5_IP_PROTOCOL_UDP 17 @@ -299,14 +307,12 @@ struct mlx5_flow_counter { struct rte_flow { TAILQ_ENTRY(rte_flow) next; /**< Pointer to the next flow structure. */ struct rte_flow_attr attributes; /**< User flow attribute. */ - uint32_t l3_protocol_en:1; /**< Protocol filtering requested. */ uint32_t layers; /**< Bit-fields of present layers see MLX5_FLOW_LAYER_*. */ uint32_t modifier; /**< Bit-fields of present modifier see MLX5_FLOW_MOD_*. */ uint32_t fate; /**< Bit-fields of present fate see MLX5_FLOW_FATE_*. */ - uint8_t l3_protocol; /**< valid when l3_protocol_en is set. */ LIST_HEAD(verbs, mlx5_flow_verbs) verbs; /**< Verbs flows list. */ struct mlx5_flow_verbs *cur_verbs; /**< Current Verbs flow structure being filled. */ @@ -582,52 +588,23 @@ mlx5_flow_counter_release(struct mlx5_flow_counter *counter) * them in the @p flow if everything is correct. * * @param[in] dev - * Pointer to Ethernet device. + * Pointer to Ethernet device structure. * @param[in] attributes * Pointer to flow attributes * @param[in, out] flow * Pointer to the rte_flow structure. - * @param[out] error - * Pointer to error structure. * * @return - * 0 on success, a negative errno value otherwise and rte_errno is set. + * 0 on success. 
*/ static int mlx5_flow_attributes(struct rte_eth_dev *dev, const struct rte_flow_attr *attributes, -struct rte_flow *flow, -struct rte_flow_error *error) +struct rte_flow *flow) { - uint32_t priority_max = - ((struct priv *)dev->data->dev_private)->config.flow_prio - 1; + struct priv *priv = dev->data->dev_private; + uint32_t priority_max = priv->config.flow_prio - 1; - if (attributes->group) - return rte_flow_error_set(error, ENOTSUP, - RTE_FLOW_ERROR_TYPE_ATTR_GROUP, - NULL, - "groups is not supported"); - if (attributes->priority != MLX5_FLOW_PRIO_RSVD && - attributes->priority >= priority_max) - return rte_flow_error_set(error, ENOTSUP, - RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, - NULL, - "priority out of range"); - if (attributes->egress) - return rte_flow_error_set(error, ENOTSUP, - RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, - NULL, - "egress is not supported"); - if (attributes->transfer) - return rte_flow_error_set(error, ENOTSUP, - RTE_FLOW_ERROR_TYPE_ATTR_TRANSFER, -
[dpdk-dev] [PATCH v2 02/11] net/mlx5: add flow prepare function
From: Ori Kam In current implementation the calculation of the flow size is done during the validation stage, and the same function is also used to translate the input parameters into verbs spec. This is hard to maintain and error prone. Another issue is that dev-flows (flows that are created implicitly in order to support the requested flow for example when the user request RSS on UDP 2 rules need to be created one for IPv4 and one for IPv6). In current implementation the dev-flows are created on the same memory allocation. This will be harder to implement in future drivers. The commits extract the calculation and creation of the dev-flow from the translation part (the part that converts the parameters into the format required by the driver). This results in that the prepare function only function is to allocate the dev-flow. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/mlx5_flow.c | 269 ++- 1 file changed, 263 insertions(+), 6 deletions(-) diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index d214846c1..9732c8376 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -292,6 +292,15 @@ struct mlx5_flow_verbs { uint64_t hash_fields; /**< Verbs hash Rx queue hash fields. */ }; +/** Device flow structure. */ +struct mlx5_flow { + LIST_ENTRY(mlx5_flow) next; + struct rte_flow *flow; /**< Pointer to the main flow. */ + union { + struct mlx5_flow_verbs verbs; /**< Holds the verbs dev-flow. */ + }; +}; + /* Counters information. */ struct mlx5_flow_counter { LIST_ENTRY(mlx5_flow_counter) next; /**< Pointer to the next counter. */ @@ -321,6 +330,8 @@ struct rte_flow { uint8_t key[MLX5_RSS_HASH_KEY_LEN]; /**< RSS hash key. */ uint16_t (*queue)[]; /**< Destination queues to redirect traffic to. */ void *nl_flow; /**< Netlink flow buffer if relevant. */ + LIST_HEAD(dev_flows, mlx5_flow) dev_flows; + /**< Device flows that are part of the flow. */ }; static const struct rte_flow_ops mlx5_flow_ops = { @@ -2322,7 +2333,7 @@ mlx5_flow_rxq_flags_clear(struct rte_eth_dev *dev) * Pointer to error structure. * * @return - * 0 on success, a negative errno value otherwise and rte_ernno is set. + * 0 on success, a negative errno value otherwise and rte_errno is set. */ static int mlx5_flow_validate_action_flag(uint64_t action_flags, @@ -2425,7 +2436,6 @@ mlx5_flow_validate_action_drop(uint64_t action_flags, } /* - * * Validate the queue action. * * @param[in] action @@ -2469,7 +2479,6 @@ mlx5_flow_validate_action_queue(const struct rte_flow_action *action, } /* - * * Validate the rss action. * * @param[in] action @@ -3195,7 +3204,7 @@ mlx5_flow_validate_item_mpls(const struct rte_flow_item *item __rte_unused, if (ret < 0) return ret; return 0; -#endif /* !HAVE_IBV_DEVICE_MPLS_SUPPORT */ +#endif return rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM, item, "MPLS is not supported by Verbs, please" @@ -3203,7 +3212,6 @@ mlx5_flow_validate_item_mpls(const struct rte_flow_item *item __rte_unused, } /** - * * Internal validation function. * * @param[in] dev @@ -3428,6 +3436,222 @@ mlx5_flow_validate(struct rte_eth_dev *dev, } /** + * Calculate the required bytes that are needed for the action part of the verbs + * flow, in addtion returns bit-fields with all the detected action, in order to + * avoid another interation over the actions. + * + * @param[in] actions + * Pointer to the list of actions. + * @param[out] action_flags + * Pointer to the detected actions. + * + * @return + * The size of the memory needed for all actions. 
+ */ +static int +mlx5_flow_verbs_get_actions_and_size(const struct rte_flow_action actions[], +uint64_t *action_flags) +{ + int size = 0; + uint64_t detected_actions = 0; + + for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) { + switch (actions->type) { + case RTE_FLOW_ACTION_TYPE_VOID: + break; + case RTE_FLOW_ACTION_TYPE_FLAG: + size += sizeof(struct ibv_flow_spec_action_tag); + detected_actions |= MLX5_ACTION_FLAG; + break; + case RTE_FLOW_ACTION_TYPE_MARK: + size += sizeof(struct ibv_flow_spec_action_tag); + detected_actions |= MLX5_ACTION_MARK; + break; + case RTE_FLOW_ACTION_TYPE_DROP: + size += sizeof(struct ibv_flow_spec_action_drop); + detected_actions |= MLX5_ACTION_DROP; + break; + case RTE_FLOW_ACTION_TYPE_QUEUE: + det
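The dev-flow relationship described in the commit message is a plain one-to-many list; a minimal standalone sketch, assuming an RSS-on-UDP rule expanded into an IPv4 and an IPv6 device flow (struct names are simplified stand-ins):

#include <sys/queue.h>
#include <stdio.h>
#include <stdlib.h>

struct dev_flow {
	LIST_ENTRY(dev_flow) next;
	const char *variant;		/* e.g. the expanded L3 layer */
};

struct flow {
	LIST_HEAD(, dev_flow) dev_flows; /* device flows backing this rule */
};

static void flow_add_dev_flow(struct flow *f, const char *variant)
{
	struct dev_flow *df = calloc(1, sizeof(*df));

	if (df == NULL)
		return;
	df->variant = variant;
	LIST_INSERT_HEAD(&f->dev_flows, df, next);
}

int main(void)
{
	struct flow f;
	struct dev_flow *df;

	LIST_INIT(&f.dev_flows);
	/* RSS on UDP: one device flow per L3 version. */
	flow_add_dev_flow(&f, "ipv4-udp");
	flow_add_dev_flow(&f, "ipv6-udp");
	LIST_FOREACH(df, &f.dev_flows, next)
		printf("dev flow: %s\n", df->variant);
	return 0;
}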
[dpdk-dev] [PATCH v2 03/11] net/mlx5: add flow translate function
From: Ori Kam This commit modify the conversion of the input parameters into Verbs spec, in order to support all previous changes. Some of those changes are: removing the use of the parser, storing each flow in its own flow structure. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/mlx5_flow.c | 1624 +++--- 1 file changed, 580 insertions(+), 1044 deletions(-) diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 9732c8376..9cb77d55f 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -296,6 +296,7 @@ struct mlx5_flow_verbs { struct mlx5_flow { LIST_ENTRY(mlx5_flow) next; struct rte_flow *flow; /**< Pointer to the main flow. */ + uint32_t layers; /**< Bit-fields that holds the detected layers. */ union { struct mlx5_flow_verbs verbs; /**< Holds the verbs dev-flow. */ }; @@ -316,15 +317,8 @@ struct mlx5_flow_counter { struct rte_flow { TAILQ_ENTRY(rte_flow) next; /**< Pointer to the next flow structure. */ struct rte_flow_attr attributes; /**< User flow attribute. */ - uint32_t layers; + uint32_t layers; /**< Bit-fields that holds the detected layers. */ /**< Bit-fields of present layers see MLX5_FLOW_LAYER_*. */ - uint32_t modifier; - /**< Bit-fields of present modifier see MLX5_FLOW_MOD_*. */ - uint32_t fate; - /**< Bit-fields of present fate see MLX5_FLOW_FATE_*. */ - LIST_HEAD(verbs, mlx5_flow_verbs) verbs; /**< Verbs flows list. */ - struct mlx5_flow_verbs *cur_verbs; - /**< Current Verbs flow structure being filled. */ struct mlx5_flow_counter *counter; /**< Holds Verbs flow counter. */ struct rte_flow_action_rss rss;/**< RSS context. */ uint8_t key[MLX5_RSS_HASH_KEY_LEN]; /**< RSS hash key. */ @@ -332,6 +326,7 @@ struct rte_flow { void *nl_flow; /**< Netlink flow buffer if relevant. */ LIST_HEAD(dev_flows, mlx5_flow) dev_flows; /**< Device flows that are part of the flow. */ + uint32_t actions; /**< Bit-fields which mark all detected actions. */ }; static const struct rte_flow_ops mlx5_flow_ops = { @@ -430,7 +425,7 @@ static struct mlx5_flow_tunnel_info tunnels_info[] = { * Discover the maximum number of priority available. * * @param[in] dev - * Pointer to Ethernet device. + * Pointer to the Ethernet device structure. * * @return * number of supported flow priority on success, a negative errno @@ -497,34 +492,40 @@ mlx5_flow_discover_priorities(struct rte_eth_dev *dev) /** * Adjust flow priority. * - * @param dev - * Pointer to Ethernet device. - * @param flow - * Pointer to an rte flow. + * @param[in] dev + * Pointer to the Ethernet device structure. + * @param[in] priority + * The rule base priority. + * @param[in] subpriority + * The priority based on the items. + * + * @return + * The new priority. */ -static void -mlx5_flow_adjust_priority(struct rte_eth_dev *dev, struct rte_flow *flow) +static uint32_t +mlx5_flow_adjust_priority(struct rte_eth_dev *dev, + int32_t priority, + uint32_t subpriority) { + uint32_t res = 0; struct priv *priv = dev->data->dev_private; - uint32_t priority = flow->attributes.priority; - uint32_t subpriority = flow->cur_verbs->attr->priority; switch (priv->config.flow_prio) { case RTE_DIM(priority_map_3): - priority = priority_map_3[priority][subpriority]; + res = priority_map_3[priority][subpriority]; break; case RTE_DIM(priority_map_5): - priority = priority_map_5[priority][subpriority]; + res = priority_map_5[priority][subpriority]; break; } - flow->cur_verbs->attr->priority = priority; + return res; } /** * Get a flow counter. * * @param[in] dev - * Pointer to Ethernet device. 
+ * Pointer to the Ethernet device structure. * @param[in] shared * Indicate if this counter is shared with other flows. * @param[in] id @@ -595,34 +596,6 @@ mlx5_flow_counter_release(struct mlx5_flow_counter *counter) } /** - * Verify the @p attributes will be correctly understood by the NIC and store - * them in the @p flow if everything is correct. - * - * @param[in] dev - * Pointer to Ethernet device structure. - * @param[in] attributes - * Pointer to flow attributes - * @param[in, out] flow - * Pointer to the rte_flow structure. - * - * @return - * 0 on success. - */ -static int -mlx5_flow_attributes(struct rte_eth_dev *dev, -const struct rte_flow_attr *attributes, -struct rte_flow *flow) -{ - struct priv *priv = dev->data->dev_private; - uint32_t priority_max = priv->config.flow_prio - 1; - - flow->attributes = *attributes; - if (attributes->priority
[dpdk-dev] [PATCH v2 04/11] net/mlx5: add support for multiple flow drivers
From: Ori Kam In the current PMD there is only support for Verbs driver API, to configure NIC rules and TC driver API, to configure eswitch rules. In order to support new drivers that will enable the use of new features for example the Direct Verbs driver API. There is a need to split each driver to a dedicated file and use function pointer to access the driver. This commit moves the Verbs API to a detected file and introduce the use of function pointers in the flow handling. The functions pointers that are in use: * validate - handle the validation of the flow. It can use both specific functions or shared functions that will be located in the mlx5_flow.c. * prepare - allocate a the device flow. There can be number of device flows that are connected to a single requested flow. * translate - converts the requested device flow into the driver flow. * apply - insert the flow into the NIC. * remove - remove the flow from the NIC but keeps it in memory. * destroy - remove the flow from memory. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/Makefile |1 + drivers/net/mlx5/meson.build |1 + drivers/net/mlx5/mlx5.c|2 + drivers/net/mlx5/mlx5_flow.c | 1910 ++-- drivers/net/mlx5/mlx5_flow.h | 257 + drivers/net/mlx5/mlx5_flow_verbs.c | 1692 6 files changed, 2026 insertions(+), 1837 deletions(-) create mode 100644 drivers/net/mlx5/mlx5_flow.h create mode 100644 drivers/net/mlx5/mlx5_flow_verbs.c diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile index 2e70dec5b..9bd6bfb82 100644 --- a/drivers/net/mlx5/Makefile +++ b/drivers/net/mlx5/Makefile @@ -31,6 +31,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_stats.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl_flow.c diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build index 289c7a4c0..40cc95038 100644 --- a/drivers/net/mlx5/meson.build +++ b/drivers/net/mlx5/meson.build @@ -31,6 +31,7 @@ if build 'mlx5.c', 'mlx5_ethdev.c', 'mlx5_flow.c', + 'mlx5_flow_verbs.c', 'mlx5_mac.c', 'mlx5_mr.c', 'mlx5_nl.c', diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index fd89e2af3..ab44864e9 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -46,6 +46,7 @@ #include "mlx5_defs.h" #include "mlx5_glue.h" #include "mlx5_mr.h" +#include "mlx5_flow.h" /* Device parameter to enable RX completion queue compression. */ #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en" @@ -1185,6 +1186,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, if (err < 0) goto error; priv->config.flow_prio = err; + mlx5_flow_init_driver_ops(eth_dev); /* * Once the device is added to the list of memory event * callback, its global MR cache table cannot be expanded diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 9cb77d55f..f5fcfffd4 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -31,83 +31,12 @@ #include "mlx5_defs.h" #include "mlx5_prm.h" #include "mlx5_glue.h" +#include "mlx5_flow.h" /* Dev ops structure defined in mlx5.c */ extern const struct eth_dev_ops mlx5_dev_ops; extern const struct eth_dev_ops mlx5_dev_ops_isolate; -/* Pattern outer Layer bits. 
*/ -#define MLX5_FLOW_LAYER_OUTER_L2 (1u << 0) -#define MLX5_FLOW_LAYER_OUTER_L3_IPV4 (1u << 1) -#define MLX5_FLOW_LAYER_OUTER_L3_IPV6 (1u << 2) -#define MLX5_FLOW_LAYER_OUTER_L4_UDP (1u << 3) -#define MLX5_FLOW_LAYER_OUTER_L4_TCP (1u << 4) -#define MLX5_FLOW_LAYER_OUTER_VLAN (1u << 5) - -/* Pattern inner Layer bits. */ -#define MLX5_FLOW_LAYER_INNER_L2 (1u << 6) -#define MLX5_FLOW_LAYER_INNER_L3_IPV4 (1u << 7) -#define MLX5_FLOW_LAYER_INNER_L3_IPV6 (1u << 8) -#define MLX5_FLOW_LAYER_INNER_L4_UDP (1u << 9) -#define MLX5_FLOW_LAYER_INNER_L4_TCP (1u << 10) -#define MLX5_FLOW_LAYER_INNER_VLAN (1u << 11) - -/* Pattern tunnel Layer bits. */ -#define MLX5_FLOW_LAYER_VXLAN (1u << 12) -#define MLX5_FLOW_LAYER_VXLAN_GPE (1u << 13) -#define MLX5_FLOW_LAYER_GRE (1u << 14) -#define MLX5_FLOW_LAYER_MPLS (1u << 15) - -/* Outer Masks. */ -#define MLX5_FLOW_LAYER_OUTER_L3 \ - (MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6) -#define MLX5_FLOW_LAYER_OUTER_L4 \ - (MLX5_FLOW_LAYER_OUTER_L4_UDP | MLX5_FLOW_LAYER_OUTER_L4_TCP) -#define MLX5_FLOW_LAYER_OUTER \ - (MLX5_FLOW_LAYER_OUTER_L2 | MLX5_FLOW_LAYER_OUTER_L3 | \ -MLX5_FLOW_LAYER_OUTER_L4) - -/*
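The driver split described above boils down to an ops table selected once per device; a condensed standalone sketch of that shape (types, names and the selection criterion are simplified stand-ins, and prepare is omitted for brevity):

#include <stdio.h>

struct flow;	/* opaque to the generic layer */

struct flow_driver_ops {
	int  (*validate)(const struct flow *flow);
	int  (*translate)(struct flow *flow);
	int  (*apply)(struct flow *flow);
	void (*remove)(struct flow *flow);
	void (*destroy)(struct flow *flow);
};

static int  verbs_validate(const struct flow *flow) { (void)flow; return 0; }
static int  verbs_translate(struct flow *flow)      { (void)flow; return 0; }
static int  verbs_apply(struct flow *flow)          { (void)flow; return 0; }
static void verbs_remove(struct flow *flow)         { (void)flow; }
static void verbs_destroy(struct flow *flow)        { (void)flow; }

static const struct flow_driver_ops verbs_ops = {
	.validate  = verbs_validate,
	.translate = verbs_translate,
	.apply     = verbs_apply,
	.remove    = verbs_remove,
	.destroy   = verbs_destroy,
};

/* Chosen once at device init, e.g. from a devarg or probed capability. */
static const struct flow_driver_ops *ops = &verbs_ops;

int main(void)
{
	struct flow *flow = NULL;	/* stand-in; a real flow is allocated by prepare */

	if (ops->validate(flow) == 0 && ops->translate(flow) == 0)
		printf("apply: %d\n", ops->apply(flow));
	return 0;
}

A second backend (e.g. a Direct Verbs one) only has to provide another filled-in table; the generic flow code never branches on the driver type again.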
[dpdk-dev] [PATCH v2 05/11] net/mlx5: add Direct Verbs validation function
From: Ori Kam This is commit introduce the Direct Verbs driver API. The Direct Verbs is an API adds new features like encapsulation, match on metatdata. In this commit the validation function was added, most of the validation is done with functions that are also in use for the Verbs API. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/Makefile | 6 + drivers/net/mlx5/meson.build| 3 + drivers/net/mlx5/mlx5_flow.h| 6 + drivers/net/mlx5/mlx5_flow_dv.c | 312 4 files changed, 327 insertions(+) create mode 100644 drivers/net/mlx5/mlx5_flow_dv.c diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile index 9bd6bfb82..d510a4275 100644 --- a/drivers/net/mlx5/Makefile +++ b/drivers/net/mlx5/Makefile @@ -31,6 +31,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_stats.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c @@ -136,6 +137,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh enum MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP \ $(AUTOCONF_OUTPUT) $Q sh -- '$<' '$@' \ + HAVE_IBV_FLOW_DV_SUPPORT \ + infiniband/mlx5dv.h \ + enum MLX5DV_FLOW_ACTION_TAG \ + $(AUTOCONF_OUTPUT) + $Q sh -- '$<' '$@' \ HAVE_ETHTOOL_LINK_MODE_25G \ /usr/include/linux/ethtool.h \ enum ETHTOOL_LINK_MODE_25000baseCR_Full_BIT \ diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build index 40cc95038..8075496f7 100644 --- a/drivers/net/mlx5/meson.build +++ b/drivers/net/mlx5/meson.build @@ -31,6 +31,7 @@ if build 'mlx5.c', 'mlx5_ethdev.c', 'mlx5_flow.c', + 'mlx5_flow_dv.c', 'mlx5_flow_verbs.c', 'mlx5_mac.c', 'mlx5_mr.c', @@ -93,6 +94,8 @@ if build 'MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED' ], [ 'HAVE_IBV_MLX5_MOD_CQE_128B_COMP', 'infiniband/mlx5dv.h', 'MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP' ], + [ 'HAVE_IBV_FLOW_DV_SUPPORT', 'infiniband/mlx5dv.h', + 'MLX5DV_FLOW_ACTION_TAG' ], [ 'HAVE_IBV_DEVICE_MPLS_SUPPORT', 'infiniband/verbs.h', 'IBV_FLOW_SPEC_MPLS' ], [ 'HAVE_IBV_WQ_FLAG_RX_END_PADDING', 'infiniband/verbs.h', diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index 4df60db92..9b0cd28ae 100644 --- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -103,6 +103,9 @@ #define MLX5_PRIORITY_MAP_L4 0 #define MLX5_PRIORITY_MAP_MAX 3 +/* Max number of actions per DV flow. */ +#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8 + /* Verbs specification header. */ struct ibv_spec_header { enum ibv_flow_spec_type type; @@ -250,6 +253,9 @@ int mlx5_flow_validate_item_vxlan_gpe(const struct rte_flow_item *item, struct rte_flow_error *error); void mlx5_flow_init_driver_ops(struct rte_eth_dev *dev); +/* mlx5_flow_dv.c */ +void mlx5_flow_dv_get_driver_ops(struct mlx5_flow_driver_ops *flow_ops); + /* mlx5_flow_verbs.c */ void mlx5_flow_verbs_get_driver_ops(struct mlx5_flow_driver_ops *flow_ops); diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c new file mode 100644 index 0..86a8b3cd0 --- /dev/null +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -0,0 +1,312 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 Mellanox Technologies, Ltd + */ + +#include +#include +#include +#include + +/* Verbs header. */ +/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. 
*/ +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +#include +#ifdef PEDANTIC +#pragma GCC diagnostic error "-Wpedantic" +#endif + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "mlx5.h" +#include "mlx5_defs.h" +#include "mlx5_prm.h" +#include "mlx5_glue.h" +#include "mlx5_flow.h" + +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + +/** + * Verify the @p attributes will be correctly understood by the NIC and store + * them in the @p flow if everything is correct. + * + * @param[in] dev + * Pointer to dev struct. + * @param[in] attributes + * Pointer to flow attributes + * @param[out] error + * Pointer to error structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */ +static int +flow_dv_validate_attributes(struct rte_eth_dev *dev, + const struct rte_flow_attr *attributes, + struct rte_flow_error *er
[dpdk-dev] [PATCH v2 06/11] net/mlx5: add Direct Verbs prepare function
From: Ori Kam This function allocates the Direct Verbs device flow, and introduce the relevant PRM structures. This commit also adds the matcher object. The matcher object acts as a mask and should be shared between flows. For example all rules that should match source IP with full mask should use the same matcher. A flow that should match dest IP or source IP but without full mask should have a new matcher allocated. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_flow.h| 31 +- drivers/net/mlx5/mlx5_flow_dv.c | 45 - drivers/net/mlx5/mlx5_prm.h | 213 drivers/net/mlx5/mlx5_rxtx.h| 7 ++ 5 files changed, 295 insertions(+), 2 deletions(-) diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 4d3e9f38f..8ff6d6987 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -213,6 +213,7 @@ struct priv { LIST_HEAD(txqibv, mlx5_txq_ibv) txqsibv; /* Verbs Tx queues. */ /* Verbs Indirection tables. */ LIST_HEAD(ind_tables, mlx5_ind_table_ibv) ind_tbls; + LIST_HEAD(matcher, mlx5_cache) matchers; uint32_t link_speed_capa; /* Link speed capabilities. */ struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */ int primary_socket; /* Unix socket for primary process. */ diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index 9b0cd28ae..0cf496db3 100644 --- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -106,6 +106,34 @@ /* Max number of actions per DV flow. */ #define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8 +/* Matcher PRM representation */ +struct mlx5_flow_dv_match_params { + size_t size; + /**< Size of match value. Do NOT split size and key! */ + uint32_t buf[MLX5_ST_SZ_DW(fte_match_param)]; + /**< Matcher value. This value is used as the mask or as a key. */ +}; + +/* Matcher structure. */ +struct mlx5_flow_dv_matcher { + struct mlx5_cache cache; /**< Cache to struct mlx5dv_flow_matcher. */ + uint16_t crc; /**< CRC of key. */ + uint16_t priority; /**< Priority of matcher. */ + uint8_t egress; /**< Egress matcher. */ + struct mlx5_flow_dv_match_params mask; /**< Matcher mask. */ +}; + +/* DV flows structure. */ +struct mlx5_flow_dv { + uint64_t hash_fields; /**< Fields that participate in the hash. */ + struct mlx5_hrxq *hrxq; /**< Hash Rx queues. */ + /* Flow DV api: */ + struct mlx5_flow_dv_matcher *matcher; /**< Cache to matcher. */ + struct mlx5_flow_dv_match_params value; + /**< Holds the value that the packet is compared to. */ + struct ibv_flow *flow; /**< Installed flow. */ +}; + /* Verbs specification header. */ struct ibv_spec_header { enum ibv_flow_spec_type type; @@ -132,7 +160,8 @@ struct mlx5_flow { struct rte_flow *flow; /**< Pointer to the main flow. */ uint32_t layers; /**< Bit-fields that holds the detected layers. */ union { - struct mlx5_flow_verbs verbs; /**< Holds the verbs dev-flow. */ + struct mlx5_flow_dv dv; + struct mlx5_flow_verbs verbs; }; }; diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index 86a8b3cd0..30d501a61 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -291,6 +291,49 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr, } /** + * Internal preparation function. Allocates the DV flow size, + * this size is constant. + * + * @param[in] attr + * Pointer to the flow attributes. + * @param[in] items + * Pointer to the list of items. + * @param[in] actions + * Pointer to the list of actions. 
+ * @param[out] item_flags + * Pointer to bit mask of all items detected. + * @param[out] action_flags + * Pointer to bit mask of all actions detected. + * @param[out] error + * Pointer to the error structure. + * + * @return + * Pointer to mlx5_flow object on success, + * otherwise NULL and rte_ernno is set. + */ +static struct mlx5_flow * +flow_dv_prepare(const struct rte_flow_attr *attr __rte_unused, + const struct rte_flow_item items[] __rte_unused, + const struct rte_flow_action actions[] __rte_unused, + uint64_t *item_flags __rte_unused, + uint64_t *action_flags __rte_unused, + struct rte_flow_error *error) +{ + uint32_t size = sizeof(struct mlx5_flow); + struct mlx5_flow *flow; + + flow = rte_calloc(__func__, 1, size, 0); + if (!flow) { + rte_flow_error_set(error, ENOMEM, + RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, + "not enough memory to create flow"); + return NULL; + } + flow->dv.value.size = MLX5_ST_SZ_DB(fte_match_param); + r
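The matcher-sharing rule from the commit message can be sketched as a cache keyed by the mask itself; a simplified standalone example (sizes and structures are stand-ins for the PRM objects, and capacity checks are omitted for brevity):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define MASK_SIZE 16	/* stand-in for the fte_match_param size */

struct matcher {
	uint8_t mask[MASK_SIZE];
	int refcnt;
};

static struct matcher cache[8];
static int cache_n;

/* Reuse a matcher with an identical mask, otherwise create a new one. */
static struct matcher *matcher_register(const uint8_t mask[MASK_SIZE])
{
	for (int i = 0; i < cache_n; i++) {
		if (memcmp(cache[i].mask, mask, MASK_SIZE) == 0) {
			cache[i].refcnt++;
			return &cache[i];
		}
	}
	memcpy(cache[cache_n].mask, mask, MASK_SIZE);
	cache[cache_n].refcnt = 1;
	return &cache[cache_n++];
}

int main(void)
{
	uint8_t full_src_ip[MASK_SIZE] = { [0] = 0xff, 0xff, 0xff, 0xff };
	uint8_t other_mask[MASK_SIZE]  = { [4] = 0xff, 0xff };

	struct matcher *a = matcher_register(full_src_ip);
	struct matcher *b = matcher_register(full_src_ip); /* shared with a */
	struct matcher *c = matcher_register(other_mask);  /* new matcher */

	printf("%s, matchers in cache: %d\n",
	       a == b ? "shared" : "distinct", cache_n);
	(void)c;
	return 0;
}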
[dpdk-dev] [PATCH v2 07/11] net/mlx5: add Direct Verbs translate items
From: Ori Kam This commit handles the translation of the requested flow into Direct Verbs API. The Direct Verbs introduce the matcher object which acts as shared mask for all flows that are using the same mask. So in this commit we translate the item and get in return a matcher and the value that should be matched. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/mlx5_flow.c | 36 ++ drivers/net/mlx5/mlx5_flow.h | 25 ++ drivers/net/mlx5/mlx5_flow_dv.c| 775 - drivers/net/mlx5/mlx5_flow_verbs.c | 72 +--- drivers/net/mlx5/mlx5_prm.h| 7 + 5 files changed, 858 insertions(+), 57 deletions(-) diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index f5fcfffd4..196d9fcbe 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -444,6 +444,42 @@ mlx5_flow_item_acceptable(const struct rte_flow_item *item, } /** + * Adjust the hash fields according to the @p flow information. + * + * @param[in] dev_flow. + * Pointer to the mlx5_flow. + * @param[in] tunnel + * 1 when the hash field is for a tunnel item. + * @param[in] layer_types + * ETH_RSS_* types. + * @param[in] hash_fields + * Item hash fields. + * + * @return + * The hash fileds that should be used. + */ +uint64_t +mlx5_flow_hashfields_adjust(struct mlx5_flow *dev_flow, + int tunnel __rte_unused, uint32_t layer_types, + uint64_t hash_fields) +{ + struct rte_flow *flow = dev_flow->flow; +#ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT + int rss_request_inner = flow->rss.level >= 2; + + /* Check RSS hash level for tunnel. */ + if (tunnel && rss_request_inner) + hash_fields |= IBV_RX_HASH_INNER; + else if (tunnel || rss_request_inner) + return 0; +#endif + /* Check if requested layer matches RSS hash fields. */ + if (!(flow->rss.types & layer_types)) + return 0; + return hash_fields; +} + +/** * Lookup and set the ptype in the data Rx part. A single Ptype can be used, * if several tunnel rules are used on this queue, the tunnel ptype will be * cleared. diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index 0cf496db3..7f0566fc9 100644 --- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -89,6 +89,10 @@ #define MLX5_IP_PROTOCOL_GRE 47 #define MLX5_IP_PROTOCOL_MPLS 147 +/* Internent Protocol versions. */ +#define MLX5_VXLAN 4789 +#define MLX5_VXLAN_GPE 4790 + /* Priority reserved for default flows. */ #define MLX5_FLOW_PRIO_RSVD ((uint32_t)-1) @@ -103,6 +107,24 @@ #define MLX5_PRIORITY_MAP_L4 0 #define MLX5_PRIORITY_MAP_MAX 3 +/* Valid layer type for IPV4 RSS. */ +#define MLX5_IPV4_LAYER_TYPES \ + (ETH_RSS_IPV4 | ETH_RSS_FRAG_IPV4 | \ +ETH_RSS_NONFRAG_IPV4_TCP | ETH_RSS_NONFRAG_IPV4_UDP | \ +ETH_RSS_NONFRAG_IPV4_OTHER) + +/* IBV hash source bits for IPV4. */ +#define MLX5_IPV4_IBV_RX_HASH (IBV_RX_HASH_SRC_IPV4 | IBV_RX_HASH_DST_IPV4) + +/* Valid layer type for IPV6 RSS. */ +#define MLX5_IPV6_LAYER_TYPES \ + (ETH_RSS_IPV6 | ETH_RSS_FRAG_IPV6 | ETH_RSS_NONFRAG_IPV6_TCP | \ +ETH_RSS_NONFRAG_IPV6_UDP | ETH_RSS_IPV6_EX | ETH_RSS_IPV6_TCP_EX | \ +ETH_RSS_IPV6_UDP_EX | ETH_RSS_NONFRAG_IPV6_OTHER) + +/* IBV hash source bits for IPV6. */ +#define MLX5_IPV6_IBV_RX_HASH (IBV_RX_HASH_SRC_IPV6 | IBV_RX_HASH_DST_IPV6) + /* Max number of actions per DV flow. 
*/ #define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8 @@ -223,6 +245,9 @@ struct mlx5_flow_driver_ops { /* mlx5_flow.c */ +uint64_t mlx5_flow_hashfields_adjust(struct mlx5_flow *dev_flow, int tunnel, +uint32_t layer_types, +uint64_t hash_fields); uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority, uint32_t subpriority); int mlx5_flow_validate_action_count(struct rte_eth_dev *dev, diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index 30d501a61..acb1b7549 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -334,6 +334,779 @@ flow_dv_prepare(const struct rte_flow_attr *attr __rte_unused, } /** + * Add Ethernet item to matcher and to the value. + * + * @param[in, out] matcher + * Flow matcher. + * @param[in, out] key + * Flow matcher value. + * @param[in] item + * Flow pattern to translate. + * @param[in] inner + * Item is inner pattern. + */ +static void +flow_dv_translate_item_eth(void *matcher, void *key, + const struct rte_flow_item *item, int inner) +{ + const struct rte_flow_item_eth *eth_m = item->mask; + const struct rte_flow_item_eth *eth_v = item->spec; + const struct rte_flow_item_eth nic_mask = { + .dst.addr_bytes = "\xff\xff\xff\xff\xff\xff", + .src.addr_by
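Item translation as described above always produces a pair of buffers: the mask that defines the matcher and the masked value the packet is compared against; a minimal standalone sketch for a single IPv4 source-address field (the structure is a stand-in for the PRM match parameters):

#include <stdint.h>
#include <stdio.h>

/* Stand-in for one field of the PRM match buffers: source IPv4 address. */
struct match_buf {
	uint32_t src_ip;
};

/* Translate one IPv4 item: the mask goes into the matcher, the masked
 * spec goes into the value the packet is compared against. */
static void translate_ipv4_src(struct match_buf *matcher,
			       struct match_buf *key,
			       uint32_t spec, uint32_t mask)
{
	matcher->src_ip = mask;
	key->src_ip = spec & mask;	/* never match bits outside the mask */
}

int main(void)
{
	struct match_buf matcher = { 0 };
	struct match_buf key = { 0 };

	/* 192.168.0.0/16: only the upper 16 bits take part in the match. */
	translate_ipv4_src(&matcher, &key, 0xc0a80105u, 0xffff0000u);
	printf("mask 0x%08x value 0x%08x\n", matcher.src_ip, key.src_ip);
	return 0;
}

Because the mask half is what keys the matcher cache, two rules with the same /16 mask end up sharing one matcher while carrying different value buffers.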
[dpdk-dev] [PATCH v2 08/11] net/mlx5: add Direct Verbs translate actions
From: Ori Kam In this commit we add the translation of flow actions. Unlike the Verbs API actions are separeted from the items and are passed to the API in array structure. Since the target action like RSS require the QP information those actions are handled both in the translate action and in the apply. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/mlx5_flow.h| 7 + drivers/net/mlx5/mlx5_flow_dv.c | 61 + 2 files changed, 68 insertions(+) diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index 7f0566fc9..ec860ef4b 100644 --- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -136,6 +136,8 @@ struct mlx5_flow_dv_match_params { /**< Matcher value. This value is used as the mask or as a key. */ }; +#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8 + /* Matcher structure. */ struct mlx5_flow_dv_matcher { struct mlx5_cache cache; /**< Cache to struct mlx5dv_flow_matcher. */ @@ -154,6 +156,11 @@ struct mlx5_flow_dv { struct mlx5_flow_dv_match_params value; /**< Holds the value that the packet is compared to. */ struct ibv_flow *flow; /**< Installed flow. */ +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + struct mlx5dv_flow_action_attr actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS]; + /**< Action list. */ +#endif + int actions_n; /**< number of actions. */ }; /* Verbs specification header. */ diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index acb1b7549..916989988 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -942,6 +942,65 @@ flow_dv_create_item(void *matcher, void *key, } } +/** + * Store the requested actions in an array. + * + * @param[in] action + * Flow action to translate. + * @param[in, out] dev_flow + * Pointer to the mlx5_flow. + */ +static void +flow_dv_create_action(const struct rte_flow_action *action, + struct mlx5_flow *dev_flow) +{ + const struct rte_flow_action_queue *queue; + const struct rte_flow_action_rss *rss; + int actions_n = dev_flow->dv.actions_n; + struct rte_flow *flow = dev_flow->flow; + + switch (action->type) { + case RTE_FLOW_ACTION_TYPE_VOID: + break; + case RTE_FLOW_ACTION_TYPE_FLAG: + dev_flow->dv.actions[actions_n].type = MLX5DV_FLOW_ACTION_TAG; + dev_flow->dv.actions[actions_n].tag_value = + MLX5_FLOW_MARK_DEFAULT; + actions_n++; + break; + case RTE_FLOW_ACTION_TYPE_MARK: + dev_flow->dv.actions[actions_n].type = MLX5DV_FLOW_ACTION_TAG; + dev_flow->dv.actions[actions_n].tag_value = + ((const struct rte_flow_action_mark *) +(action->conf))->id; + actions_n++; + break; + case RTE_FLOW_ACTION_TYPE_DROP: + dev_flow->dv.actions[actions_n].type = MLX5DV_FLOW_ACTION_DROP; + flow->actions |= MLX5_ACTION_DROP; + break; + case RTE_FLOW_ACTION_TYPE_QUEUE: + queue = action->conf; + flow->rss.queue_num = 1; + (*flow->queue)[0] = queue->index; + break; + case RTE_FLOW_ACTION_TYPE_RSS: + rss = action->conf; + if (flow->queue) + memcpy((*flow->queue), rss->queue, + rss->queue_num * sizeof(uint16_t)); + flow->rss.queue_num = rss->queue_num; + memcpy(flow->key, rss->key, MLX5_RSS_HASH_KEY_LEN); + flow->rss.types = rss->types; + flow->rss.level = rss->level; + /* Added to array only in apply since we need the QP */ + break; + default: + break; + } + dev_flow->dv.actions_n = actions_n; +} + static uint32_t matcher_zero[MLX5_ST_SZ_DW(fte_match_param)] = { 0 }; #define HEADER_IS_ZERO(match_criteria, headers) \ @@ -1103,6 +1162,8 @@ flow_dv_translate(struct rte_eth_dev *dev, matcher.egress = attr->egress; if (flow_dv_matcher_register(dev, &matcher, dev_flow, error)) return 
-rte_errno; + for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) + flow_dv_create_action(actions, dev_flow); return 0; } -- 2.11.0
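As a usage illustration (not taken from the series), this is the kind of rte_flow action list an application passes to rte_flow_create(); flow_dv_create_action() above walks such a list entry by entry, turning FLAG/MARK/DROP into mlx5dv_flow_action_attr entries while only recording QUEUE/RSS state on the flow, since the QP is resolved later in apply:

	struct rte_flow_action_mark mark = { .id = 42 };      /* example mark id */
	struct rte_flow_action_queue queue = { .index = 0 };  /* example Rx queue */
	const struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_MARK,  .conf = &mark },
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};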
[dpdk-dev] [PATCH v2 09/11] net/mlx5: add Direct Verbs driver to glue
From: Ori Kam This commit adds all Direct Verbs required functions to the glue lib. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/Makefile| 2 +- drivers/net/mlx5/meson.build | 2 +- drivers/net/mlx5/mlx5_glue.c | 45 drivers/net/mlx5/mlx5_glue.h | 15 +++ 4 files changed, 62 insertions(+), 2 deletions(-) diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile index d510a4275..4243b37ca 100644 --- a/drivers/net/mlx5/Makefile +++ b/drivers/net/mlx5/Makefile @@ -8,7 +8,7 @@ include $(RTE_SDK)/mk/rte.vars.mk LIB = librte_pmd_mlx5.a LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION) LIB_GLUE_BASE = librte_pmd_mlx5_glue.so -LIB_GLUE_VERSION = 18.05.0 +LIB_GLUE_VERSION = 18.11.0 # Sources. SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build index 8075496f7..3d09ece4f 100644 --- a/drivers/net/mlx5/meson.build +++ b/drivers/net/mlx5/meson.build @@ -4,7 +4,7 @@ pmd_dlopen = get_option('enable_driver_mlx_glue') LIB_GLUE_BASE = 'librte_pmd_mlx5_glue.so' -LIB_GLUE_VERSION = '18.05.0' +LIB_GLUE_VERSION = '18.11.0' LIB_GLUE = LIB_GLUE_BASE + '.' + LIB_GLUE_VERSION if pmd_dlopen dpdk_conf.set('RTE_LIBRTE_MLX5_DLOPEN_DEPS', 1) diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c index 84f9492a7..48590df5b 100644 --- a/drivers/net/mlx5/mlx5_glue.c +++ b/drivers/net/mlx5/mlx5_glue.c @@ -346,6 +346,48 @@ mlx5_glue_dv_create_qp(struct ibv_context *context, #endif } +static struct mlx5dv_flow_matcher * +mlx5_glue_dv_create_flow_matcher(struct ibv_context *context, +struct mlx5dv_flow_matcher_attr *matcher_attr) +{ +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + return mlx5dv_create_flow_matcher(context, matcher_attr); +#else + (void)context; + (void)matcher_attr; + return NULL; +#endif +} + +static struct ibv_flow * +mlx5_glue_dv_create_flow(struct mlx5dv_flow_matcher *matcher, +struct mlx5dv_flow_match_parameters *match_value, +size_t num_actions, +struct mlx5dv_flow_action_attr *actions_attr) +{ +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + return mlx5dv_create_flow(matcher, match_value, + num_actions, actions_attr); +#else + (void)matcher; + (void)match_value; + (void)num_actions; + (void)actions_attr; + return NULL; +#endif +} + +static int +mlx5_glue_dv_destroy_flow_matcher(struct mlx5dv_flow_matcher *matcher) +{ +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + return mlx5dv_destroy_flow_matcher(matcher); +#else + (void)matcher; + return 0; +#endif +} + alignas(RTE_CACHE_LINE_SIZE) const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){ .version = MLX5_GLUE_VERSION, @@ -392,4 +434,7 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){ .dv_set_context_attr = mlx5_glue_dv_set_context_attr, .dv_init_obj = mlx5_glue_dv_init_obj, .dv_create_qp = mlx5_glue_dv_create_qp, + .dv_create_flow_matcher = mlx5_glue_dv_create_flow_matcher, + .dv_destroy_flow_matcher = mlx5_glue_dv_destroy_flow_matcher, + .dv_create_flow = mlx5_glue_dv_create_flow, }; diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h index e584d3679..f6e4e3842 100644 --- a/drivers/net/mlx5/mlx5_glue.h +++ b/drivers/net/mlx5/mlx5_glue.h @@ -39,6 +39,13 @@ struct mlx5dv_qp_init_attr; struct mlx5dv_wq_init_attr; #endif +#ifndef HAVE_IBV_FLOW_DV_SUPPORT +struct mlx5dv_flow_matcher; +struct mlx5dv_flow_matcher_attr; +struct mlx5dv_flow_action_attr; +struct mlx5dv_flow_match_parameters; +#endif + /* LIB_GLUE_VERSION must be updated every time this structure is modified. 
*/ struct mlx5_glue { const char *version; @@ -122,6 +129,14 @@ struct mlx5_glue { (struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_init_attr_ex, struct mlx5dv_qp_init_attr *dv_qp_init_attr); + struct mlx5dv_flow_matcher *(*dv_create_flow_matcher) + (struct ibv_context *context, +struct mlx5dv_flow_matcher_attr *matcher_attr); + int (*dv_destroy_flow_matcher)(struct mlx5dv_flow_matcher *matcher); + struct ibv_flow *(*dv_create_flow)(struct mlx5dv_flow_matcher *matcher, + struct mlx5dv_flow_match_parameters *match_value, + size_t num_actions, + struct mlx5dv_flow_action_attr *actions_attr); }; const struct mlx5_glue *mlx5_glue; -- 2.11.0
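For reference, a hedged sketch of how the PMD is expected to reach Direct Verbs through the new glue entries rather than calling mlx5dv_* directly, which keeps the rdma-core symbols behind the optional dlopen'd glue library; ctx, dv_attr, value, actions_n and actions are assumed to be prepared by the flow translation code:

	struct mlx5dv_flow_matcher *matcher;
	struct ibv_flow *flow;

	matcher = mlx5_glue->dv_create_flow_matcher(ctx, &dv_attr);
	if (matcher == NULL)
		return -ENOMEM; /* or rte_flow_error_set() in PMD context */
	flow = mlx5_glue->dv_create_flow(matcher, &value, actions_n, actions);
	/* ... and on teardown ... */
	claim_zero(mlx5_glue->dv_destroy_flow_matcher(matcher));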
[dpdk-dev] [PATCH v2 10/11] net/mlx5: add Direct Verbs final functions
From: Ori Kam This commits add the missing function which are apply, remove, and destroy. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- drivers/net/mlx5/mlx5_flow.c| 4 + drivers/net/mlx5/mlx5_flow.h| 2 + drivers/net/mlx5/mlx5_flow_dv.c | 192 +++- 3 files changed, 194 insertions(+), 4 deletions(-) diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 196d9fcbe..1990d7acc 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -2473,5 +2473,9 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev, void mlx5_flow_init_driver_ops(struct rte_eth_dev *dev __rte_unused) { +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + mlx5_flow_dv_get_driver_ops(&nic_ops); +#else mlx5_flow_verbs_get_driver_ops(&nic_ops); +#endif } diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index ec860ef4b..53c0eeb56 100644 --- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -189,7 +189,9 @@ struct mlx5_flow { struct rte_flow *flow; /**< Pointer to the main flow. */ uint32_t layers; /**< Bit-fields that holds the detected layers. */ union { +#ifdef HAVE_IBV_FLOW_DV_SUPPORT struct mlx5_flow_dv dv; +#endif struct mlx5_flow_verbs verbs; }; }; diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index 916989988..71af410b2 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -2,6 +2,7 @@ * Copyright 2018 Mellanox Technologies, Ltd */ + #include #include #include @@ -1095,7 +1096,7 @@ flow_dv_matcher_register(struct rte_eth_dev *dev, if (matcher->egress) dv_attr.flags |= IBV_FLOW_ATTR_FLAGS_EGRESS; cache->cache.resource = - mlx5dv_create_flow_matcher(priv->ctx, &dv_attr); + mlx5_glue->dv_create_flow_matcher(priv->ctx, &dv_attr); if (!cache->cache.resource) return rte_flow_error_set(error, ENOMEM, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, @@ -1168,6 +1169,189 @@ flow_dv_translate(struct rte_eth_dev *dev, } /** + * Apply the flow to the NIC. + * + * @param[in] dev + * Pointer to the Ethernet device structure. + * @param[in, out] flow + * Pointer to flow structure. + * @param[out] error + * Pointer to error structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */ +static int +flow_dv_apply(struct rte_eth_dev *dev, struct rte_flow *flow, + struct rte_flow_error *error) +{ + struct mlx5_flow_dv *dv; + struct mlx5_flow *dev_flow; + int n; + int err; + + LIST_FOREACH(dev_flow, &flow->dev_flows, next) { + dv = &dev_flow->dv; + n = dv->actions_n; + if (flow->actions & MLX5_ACTION_DROP) { + dv->hrxq = mlx5_hrxq_drop_new(dev); + if (!dv->hrxq) { + rte_flow_error_set + (error, errno, +RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, +"cannot get drop hash queue"); + goto error; + } + dv->actions[n].type = MLX5DV_FLOW_ACTION_DEST_IBV_QP; + dv->actions[n].qp = dv->hrxq->qp; + n++; + } else { + struct mlx5_hrxq *hrxq; + hrxq = mlx5_hrxq_get(dev, flow->key, +MLX5_RSS_HASH_KEY_LEN, +dv->hash_fields, +(*flow->queue), +flow->rss.queue_num); + if (!hrxq) + hrxq = mlx5_hrxq_new + (dev, flow->key, MLX5_RSS_HASH_KEY_LEN, +dv->hash_fields, (*flow->queue), +flow->rss.queue_num, +!!(flow->layers & + MLX5_FLOW_LAYER_TUNNEL)); + if (!hrxq) { + rte_flow_error_set + (error, rte_errno, +RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, +"cannot get hash queue"); + goto error; + } + dv->hrxq = hrxq; + dv->actions[n].type = MLX5DV_FLOW_ACTION_DEST_IBV_QP; + dv->actions[n].qp = hrxq->qp; + n++; + } + dv->flow = + mlx5_glue->dv
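The apply() hunk above is cut off by the archive; to complete the picture, here is a rough, hedged sketch of the matching remove path the commit message mentions, releasing whatever apply acquired (field and helper names follow the structures shown above, but details may differ from the actual patch):

	LIST_FOREACH(dev_flow, &flow->dev_flows, next) {
		struct mlx5_flow_dv *dv = &dev_flow->dv;

		if (dv->flow) {
			claim_zero(mlx5_glue->destroy_flow(dv->flow));
			dv->flow = NULL;
		}
		if (dv->hrxq) {
			if (flow->actions & MLX5_ACTION_DROP)
				mlx5_hrxq_drop_release(dev);
			else
				mlx5_hrxq_release(dev, dv->hrxq);
			dv->hrxq = NULL;
		}
	}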
[dpdk-dev] [PATCH v2 11/11] net/mlx5: add runtime parameter to enable Direct Verbs
From: Ori Kam DV flow API is based on new kernel API and is missing some functionality like counter but add other functionality like encap. In order not to affect current users even if the kernel supports the new DV API it should be enabled only manually. Signed-off-by: Ori Kam Acked-by: Yongseok Koh --- doc/guides/nics/mlx5.rst | 7 +++ drivers/net/mlx5/mlx5.c | 6 ++ drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_flow.c | 9 +++-- 4 files changed, 21 insertions(+), 2 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index dbdb90b59..67696283e 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -397,6 +397,13 @@ Run-time configuration Disabled by default. +- ``dv_flow_en`` parameter [int] + + A nonzero value enables the DV flow steering assuming it is supported + by the driver. + + Disabled by default. + - ``representor`` parameter [list] This parameter can be used to instantiate DPDK Ethernet devices from diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index ab44864e9..9b208109b 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -90,6 +90,9 @@ /* Allow L3 VXLAN flow creation. */ #define MLX5_L3_VXLAN_EN "l3_vxlan_en" +/* Activate DV flow steering. */ +#define MLX5_DV_FLOW_EN "dv_flow_en" + /* Activate Netlink support in VF mode. */ #define MLX5_VF_NL_EN "vf_nl_en" @@ -491,6 +494,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque) config->l3_vxlan_en = !!tmp; } else if (strcmp(MLX5_VF_NL_EN, key) == 0) { config->vf_nl_en = !!tmp; + } else if (strcmp(MLX5_DV_FLOW_EN, key) == 0) { + config->dv_flow_en = !!tmp; } else { DRV_LOG(WARNING, "%s: unknown parameter", key); rte_errno = EINVAL; @@ -528,6 +533,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs) MLX5_RX_VEC_EN, MLX5_L3_VXLAN_EN, MLX5_VF_NL_EN, + MLX5_DV_FLOW_EN, MLX5_REPRESENTOR, NULL, }; diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 8ff6d6987..8bb619d9e 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -111,6 +111,7 @@ struct mlx5_dev_config { unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */ unsigned int l3_vxlan_en:1; /* Enable L3 VXLAN flow creation. */ unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */ + unsigned int dv_flow_en:1; /* Enable DV flow. */ unsigned int swp:1; /* Tx generic tunnel checksum and TSO offload. */ struct { unsigned int enabled:1; /* Whether MPRQ is enabled. */ diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 1990d7acc..2119211f5 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -2471,10 +2471,15 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev, * Pointer to Ethernet device structure. */ void -mlx5_flow_init_driver_ops(struct rte_eth_dev *dev __rte_unused) +mlx5_flow_init_driver_ops(struct rte_eth_dev *dev) { + struct priv *priv __rte_unused = dev->data->dev_private; + #ifdef HAVE_IBV_FLOW_DV_SUPPORT - mlx5_flow_dv_get_driver_ops(&nic_ops); + if (priv->config.dv_flow_en) + mlx5_flow_dv_get_driver_ops(&nic_ops); + else + mlx5_flow_verbs_get_driver_ops(&nic_ops); #else mlx5_flow_verbs_get_driver_ops(&nic_ops); #endif -- 2.11.0
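Usage note (illustrative only; the PCI address and application are placeholders): like the other mlx5 run-time options, the new parameter is passed as a device argument on the EAL command line, for example:

	testpmd -w 0000:03:00.0,dv_flow_en=1 -- -i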
[dpdk-dev] [PATCH v2 1/3] net/mlx5: add abstraction for multiple flow drivers
Flow engine has to support multiple driver paths. Verbs/DV for NIC flow steering and Linux TC flower for E-Switch flow steering. In the future, another flow driver could be added (devX). Signed-off-by: Yongseok Koh --- drivers/net/mlx5/mlx5.c| 1 - drivers/net/mlx5/mlx5_flow.c | 348 + drivers/net/mlx5/mlx5_flow.h | 17 +- drivers/net/mlx5/mlx5_flow_dv.c| 26 +-- drivers/net/mlx5/mlx5_flow_verbs.c | 20 +-- 5 files changed, 335 insertions(+), 77 deletions(-) diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 9b208109b..2f7d046e0 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -1192,7 +1192,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, if (err < 0) goto error; priv->config.flow_prio = err; - mlx5_flow_init_driver_ops(eth_dev); /* * Once the device is added to the list of memory event * callback, its global MR cache table cannot be expanded diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 2119211f5..54008afa4 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -37,6 +37,23 @@ extern const struct eth_dev_ops mlx5_dev_ops; extern const struct eth_dev_ops mlx5_dev_ops_isolate; +/** Device flow drivers. */ +#ifdef HAVE_IBV_FLOW_DV_SUPPORT +extern const struct mlx5_flow_driver_ops mlx5_flow_dv_drv_ops; +#endif +extern const struct mlx5_flow_driver_ops mlx5_flow_verbs_drv_ops; + +const struct mlx5_flow_driver_ops mlx5_flow_null_drv_ops; + +const struct mlx5_flow_driver_ops *flow_drv_ops[] = { + [MLX5_FLOW_TYPE_MIN] = &mlx5_flow_null_drv_ops, +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + [MLX5_FLOW_TYPE_DV] = &mlx5_flow_dv_drv_ops, +#endif + [MLX5_FLOW_TYPE_VERBS] = &mlx5_flow_verbs_drv_ops, + [MLX5_FLOW_TYPE_MAX] = &mlx5_flow_null_drv_ops +}; + enum mlx5_expansion { MLX5_EXPANSION_ROOT, MLX5_EXPANSION_ROOT_OUTER, @@ -282,9 +299,6 @@ static struct mlx5_flow_tunnel_info tunnels_info[] = { }, }; -/* Holds the nic operations that should be used. */ -struct mlx5_flow_driver_ops nic_ops; - /** * Discover the maximum number of priority available. 
* @@ -1510,6 +1524,284 @@ mlx5_flow_validate_item_mpls(const struct rte_flow_item *item __rte_unused, " update."); } +static int +flow_null_validate(struct rte_eth_dev *dev __rte_unused, + const struct rte_flow_attr *attr __rte_unused, + const struct rte_flow_item items[] __rte_unused, + const struct rte_flow_action actions[] __rte_unused, + struct rte_flow_error *error __rte_unused) +{ + rte_errno = ENOTSUP; + return -rte_errno; +} + +static struct mlx5_flow * +flow_null_prepare(const struct rte_flow_attr *attr __rte_unused, + const struct rte_flow_item items[] __rte_unused, + const struct rte_flow_action actions[] __rte_unused, + uint64_t *item_flags __rte_unused, + uint64_t *action_flags __rte_unused, + struct rte_flow_error *error __rte_unused) +{ + rte_errno = ENOTSUP; + return NULL; +} + +static int +flow_null_translate(struct rte_eth_dev *dev __rte_unused, + struct mlx5_flow *dev_flow __rte_unused, + const struct rte_flow_attr *attr __rte_unused, + const struct rte_flow_item items[] __rte_unused, + const struct rte_flow_action actions[] __rte_unused, + struct rte_flow_error *error __rte_unused) +{ + rte_errno = ENOTSUP; + return -rte_errno; +} + +static int +flow_null_apply(struct rte_eth_dev *dev __rte_unused, + struct rte_flow *flow __rte_unused, + struct rte_flow_error *error __rte_unused) +{ + rte_errno = ENOTSUP; + return -rte_errno; +} + +static void +flow_null_remove(struct rte_eth_dev *dev __rte_unused, +struct rte_flow *flow __rte_unused) +{ +} + +static void +flow_null_destroy(struct rte_eth_dev *dev __rte_unused, + struct rte_flow *flow __rte_unused) +{ +} + +/* Void driver to protect from null pointer reference. */ +const struct mlx5_flow_driver_ops mlx5_flow_null_drv_ops = { + .validate = flow_null_validate, + .prepare = flow_null_prepare, + .translate = flow_null_translate, + .apply = flow_null_apply, + .remove = flow_null_remove, + .destroy = flow_null_destroy, +}; + +/** + * Select flow driver type according to flow attributes and device + * configuration. + * + * @param[in] dev + * Pointer to the dev structure. + * @param[in] attr + * Pointer to the flow attributes. + * + * @return + * flow driver type if supported, MLX5_FLOW_TYPE_MAX otherwise. + */ +static enum mlx5_flow_drv_type +flow_get_drv_type(struct rte_eth_dev *dev __rte_unused, +
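To make the dispatch concrete, here is a hedged sketch of how the generic flow layer can call through the new flow_drv_ops[] table once the driver type has been chosen; the wrapper name is illustrative and the actual series may arrange this differently:

	static int
	flow_drv_validate(struct rte_eth_dev *dev,
			  const struct rte_flow_attr *attr,
			  const struct rte_flow_item items[],
			  const struct rte_flow_action actions[],
			  struct rte_flow_error *error)
	{
		const struct mlx5_flow_driver_ops *fops;
		enum mlx5_flow_drv_type type = flow_get_drv_type(dev, attr);

		fops = flow_drv_ops[type];
		return fops->validate(dev, attr, items, actions, error);
	}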
[dpdk-dev] [PATCH v2 0/3] net/mlx5: migrate Linux TC flower driver to new flow engine
This patchset migrates the existing E-Switch flow driver onto the new flow engine. It depends on Ori's new flow engine [1].

[1] http://patches.dpdk.org/project/dpdk/list/?series=1473

v2:
* make changes for the newly introduced meson build.

Yongseok Koh (3):
  net/mlx5: add abstraction for multiple flow drivers
  net/mlx5: remove Netlink flow driver
  net/mlx5: add Linux TC flower driver for E-Switch flow

 drivers/net/mlx5/Makefile          |    2 +-
 drivers/net/mlx5/meson.build       |    2 +-
 drivers/net/mlx5/mlx5.c            |   12 +-
 drivers/net/mlx5/mlx5.h            |   25 -
 drivers/net/mlx5/mlx5_flow.c       |  352 +++-
 drivers/net/mlx5/mlx5_flow.h       |   33 +-
 drivers/net/mlx5/mlx5_flow_dv.c    |   26 +-
 drivers/net/mlx5/mlx5_flow_tcf.c   | 1608
 drivers/net/mlx5/mlx5_flow_verbs.c |   20 +-
 drivers/net/mlx5/mlx5_nl_flow.c    | 1228 ---
 10 files changed, 1973 insertions(+), 1335 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_flow_tcf.c
 delete mode 100644 drivers/net/mlx5/mlx5_nl_flow.c
--
2.11.0
[dpdk-dev] [PATCH v2 3/3] net/mlx5: add Linux TC flower driver for E-Switch flow
Flows having 'transfer' attribute have to be inserted to E-Switch on the NIC and the control path uses Linux TC flower interface via Netlink socket. This patch adds the flow driver on top of the new flow engine. Signed-off-by: Yongseok Koh --- drivers/net/mlx5/Makefile|1 + drivers/net/mlx5/meson.build |1 + drivers/net/mlx5/mlx5.c | 33 + drivers/net/mlx5/mlx5_flow.c |6 +- drivers/net/mlx5/mlx5_flow.h | 20 + drivers/net/mlx5/mlx5_flow_tcf.c | 1608 ++ 6 files changed, 1668 insertions(+), 1 deletion(-) create mode 100644 drivers/net/mlx5/mlx5_flow_tcf.c diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile index 9c1044808..ca1de9f21 100644 --- a/drivers/net/mlx5/Makefile +++ b/drivers/net/mlx5/Makefile @@ -32,6 +32,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_tcf.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build index e5376291c..fd93ac162 100644 --- a/drivers/net/mlx5/meson.build +++ b/drivers/net/mlx5/meson.build @@ -32,6 +32,7 @@ if build 'mlx5_ethdev.c', 'mlx5_flow.c', 'mlx5_flow_dv.c', + 'mlx5_flow_tcf.c', 'mlx5_flow_verbs.c', 'mlx5_mac.c', 'mlx5_mr.c', diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index bb9a63fba..4be6a1cc9 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -44,6 +44,7 @@ #include "mlx5_rxtx.h" #include "mlx5_autoconf.h" #include "mlx5_defs.h" +#include "mlx5_flow.h" #include "mlx5_glue.h" #include "mlx5_mr.h" #include "mlx5_flow.h" @@ -286,6 +287,8 @@ mlx5_dev_close(struct rte_eth_dev *dev) close(priv->nl_socket_route); if (priv->nl_socket_rdma >= 0) close(priv->nl_socket_rdma); + if (priv->mnl_socket) + mlx5_flow_tcf_socket_destroy(priv->mnl_socket); ret = mlx5_hrxq_ibv_verify(dev); if (ret) DRV_LOG(WARNING, "port %u some hash Rx queue still remain", @@ -1135,6 +1138,34 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0)); if (vf && config.vf_nl_en) mlx5_nl_mac_addr_sync(eth_dev); + priv->mnl_socket = mlx5_flow_tcf_socket_create(); + if (!priv->mnl_socket) { + err = -rte_errno; + DRV_LOG(WARNING, + "flow rules relying on switch offloads will not be" + " supported: cannot open libmnl socket: %s", + strerror(rte_errno)); + } else { + struct rte_flow_error error; + unsigned int ifindex = mlx5_ifindex(eth_dev); + + if (!ifindex) { + err = -rte_errno; + error.message = + "cannot retrieve network interface index"; + } else { + err = mlx5_flow_tcf_init(priv->mnl_socket, ifindex, + &error); + } + if (err) { + DRV_LOG(WARNING, + "flow rules relying on switch offloads will" + " not be supported: %s: %s", + error.message, strerror(rte_errno)); + mlx5_flow_tcf_socket_destroy(priv->mnl_socket); + priv->mnl_socket = NULL; + } + } TAILQ_INIT(&priv->flows); TAILQ_INIT(&priv->ctrl_flows); /* Hint libmlx5 to use PMD allocator for data plane resources */ @@ -1187,6 +1218,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, close(priv->nl_socket_route); if (priv->nl_socket_rdma >= 0) close(priv->nl_socket_rdma); + if (priv->mnl_socket) + mlx5_flow_tcf_socket_destroy(priv->mnl_socket); if (own_domain_id) claim_zero(rte_eth_switch_domain_free(priv->domain_id)); rte_free(priv); diff --git a/drivers/net/mlx5/mlx5_flow.c 
b/drivers/net/mlx5/mlx5_flow.c index 54008afa4..7660bee30 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -41,6 +41,7 @@ extern const struct eth_dev_ops mlx5_dev_ops_isolate; #ifdef HAVE_IBV_FLOW_DV_SUPPORT extern const struct mlx5_flow_driver_ops mlx5_flow_dv_drv_ops; #endif +extern const struct mlx5_flow_driver_ops mlx5_flow_tcf_drv_ops; extern const struct mlx5_flow_driver_
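The hunk above is truncated while registering the new ops; conceptually, once mlx5_flow_tcf_drv_ops sits in flow_drv_ops[], driver selection for E-Switch rules reduces to something like the sketch below (the MLX5_FLOW_TYPE_TCF enum value is assumed here, as it is not shown in this excerpt):

	/* Flows carrying the 'transfer' attribute go to the TC flower path;
	 * everything else keeps using DV or Verbs as before. */
	if (attr->transfer)
		return MLX5_FLOW_TYPE_TCF;
	return priv->config.dv_flow_en ? MLX5_FLOW_TYPE_DV : MLX5_FLOW_TYPE_VERBS;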
[dpdk-dev] [PATCH v2 2/3] net/mlx5: remove Netlink flow driver
Netlink based E-Switch flow engine will be migrated to the new flow engine. nl_flow will be renamed to flow_tcf as it goes through Linux TC flower interface. Signed-off-by: Yongseok Koh --- drivers/net/mlx5/Makefile |1 - drivers/net/mlx5/meson.build|1 - drivers/net/mlx5/mlx5.c | 32 - drivers/net/mlx5/mlx5.h | 25 - drivers/net/mlx5/mlx5_nl_flow.c | 1228 --- 5 files changed, 1287 deletions(-) delete mode 100644 drivers/net/mlx5/mlx5_nl_flow.c diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile index 4243b37ca..9c1044808 100644 --- a/drivers/net/mlx5/Makefile +++ b/drivers/net/mlx5/Makefile @@ -35,7 +35,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl_flow.c ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y) INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE) diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build index 3d09ece4f..e5376291c 100644 --- a/drivers/net/mlx5/meson.build +++ b/drivers/net/mlx5/meson.build @@ -36,7 +36,6 @@ if build 'mlx5_mac.c', 'mlx5_mr.c', 'mlx5_nl.c', - 'mlx5_nl_flow.c', 'mlx5_rss.c', 'mlx5_rxmode.c', 'mlx5_rxq.c', diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 2f7d046e0..bb9a63fba 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -286,8 +286,6 @@ mlx5_dev_close(struct rte_eth_dev *dev) close(priv->nl_socket_route); if (priv->nl_socket_rdma >= 0) close(priv->nl_socket_rdma); - if (priv->mnl_socket) - mlx5_nl_flow_socket_destroy(priv->mnl_socket); ret = mlx5_hrxq_ibv_verify(dev); if (ret) DRV_LOG(WARNING, "port %u some hash Rx queue still remain", @@ -1137,34 +1135,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0)); if (vf && config.vf_nl_en) mlx5_nl_mac_addr_sync(eth_dev); - priv->mnl_socket = mlx5_nl_flow_socket_create(); - if (!priv->mnl_socket) { - err = -rte_errno; - DRV_LOG(WARNING, - "flow rules relying on switch offloads will not be" - " supported: cannot open libmnl socket: %s", - strerror(rte_errno)); - } else { - struct rte_flow_error error; - unsigned int ifindex = mlx5_ifindex(eth_dev); - - if (!ifindex) { - err = -rte_errno; - error.message = - "cannot retrieve network interface index"; - } else { - err = mlx5_nl_flow_init(priv->mnl_socket, ifindex, - &error); - } - if (err) { - DRV_LOG(WARNING, - "flow rules relying on switch offloads will" - " not be supported: %s: %s", - error.message, strerror(rte_errno)); - mlx5_nl_flow_socket_destroy(priv->mnl_socket); - priv->mnl_socket = NULL; - } - } TAILQ_INIT(&priv->flows); TAILQ_INIT(&priv->ctrl_flows); /* Hint libmlx5 to use PMD allocator for data plane resources */ @@ -1217,8 +1187,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, close(priv->nl_socket_route); if (priv->nl_socket_rdma >= 0) close(priv->nl_socket_rdma); - if (priv->mnl_socket) - mlx5_nl_flow_socket_destroy(priv->mnl_socket); if (own_domain_id) claim_zero(rte_eth_switch_domain_free(priv->domain_id)); rte_free(priv); diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 8bb619d9e..8de0d74ce 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -158,12 +158,6 @@ struct mlx5_drop { struct mlx5_rxq_ibv *rxq; /* Verbs Rx queue. */ }; -/** DPDK port to network interface index (ifindex) conversion. */ -struct mlx5_nl_flow_ptoi { - uint16_t port_id; /**< DPDK port ID. 
*/ - unsigned int ifindex; /**< Network interface index. */ -}; - struct mnl_socket; struct priv { @@ -399,23 +393,4 @@ unsigned int mlx5_nl_ifindex(int nl, const char *name); int mlx5_nl_switch_info(int nl, unsigned int ifindex, struct mlx5_switch_info *info); -/* mlx5_nl_flow.c */ - -int mlx5_nl_flow_transpose(void *buf, - size_t size, -
[dpdk-dev] [PATCH v6 0/5] vhost: vhost_user.c code cleanup
vhost: vhost_user.c code cleanup

This patch series introduces a set of code redesigns in vhost_user.c. The goal is to unify and simplify vhost-user message handling. The patches do not intend to introduce any functional changes.

v6 changes:
- Even more fixes to the usage of struct VhostUserMsg in the patches (Anatoly Burakov)

v5 changes:
- fixed the usage of struct VhostUserMsg in all patches (Anatoly Burakov)

v4 changes:
- use struct VhostUserMsg as the coding style guide suggests (Anatoly Burakov)
- VH_RESULT_FATAL is removed as not needed anymore (Maxime Coquelin)

v3 changes:
- rebased on top of git://dpdk.org/next/dpdk-next-virtio dead0602
- introduce VH_RESULT_FATAL (Maxime Coquelin)
- vhost_user_set_features returns VH_RESULT_FATAL on failure. This allows keeping the error propagation logic (Ilya Maximets)
- fixed vhost_user_set_vring_kick and vhost_user_set_protocol_features to return VH_RESULT_ERR upon failure
- fixed missing break in case VH_RESULT_ERR (Ilya Maximets)
- fixed a typo in the description of the 2/5 patch (Maxime Coquelin)

v2 changes:
- Fix the comments by Tiwei Bie
- Keep the old behavior
- Fall through when the callback returns VH_RESULT_ERR
- Fall through if the request is out of range

---

Nikolay Nikolaev (5):
  vhost: unify struct VhostUserMsg usage
  vhost: make message handling functions prepare the reply
  vhost: handle unsupported message types in functions
  vhost: unify message handling function signature
  vhost: message handling implemented as a callback array

 lib/librte_vhost/vhost_user.c | 394 ++---
 1 file changed, 209 insertions(+), 185 deletions(-)

--
Signature
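To illustrate the final shape the series is driving toward, here is a sketch in the spirit of patch 5/5 rather than a quote from it; the exact handler signature and table contents may differ:

	typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
					       struct VhostUserMsg *msg);

	static vhost_message_handler_t vhost_message_handlers[VHOST_USER_MAX] = {
		[VHOST_USER_SET_OWNER] = vhost_user_set_owner,
		[VHOST_USER_SET_FEATURES] = vhost_user_set_features,
		[VHOST_USER_SET_VRING_KICK] = vhost_user_set_vring_kick,
		/* ... one entry per supported request ... */
	};

	/* In the main message loop: */
	if (request > VHOST_USER_NONE && request < VHOST_USER_MAX &&
	    vhost_message_handlers[request] != NULL)
		ret = vhost_message_handlers[request](&dev, &msg);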