Hi Jasvinder,

Thanks for doing this work! Finally a man brave enough to make substantial
changes to this library!

> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Tuesday, June 25, 2019 4:32 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitre...@intel.com>
> Subject: [PATCH v2 00/28] sched: feature enhancements
> 
> This patchset refactors the dpdk qos sched library to add
> following features to enhance the scheduler functionality.
> 
> 1. flexible configuration of the pipe traffic classes and queues;
> 
>    Currently, each pipe has 16 queues hardwired into 4 TCs scheduled with
>    strict priority, and each TC has exactly 4 queues that are
>    scheduled with Weighted Fair Queuing (WFQ).
> 
>    Instead of hardwiring queues to a traffic class within the specific pipe,
>    the new implementation allows a more flexible/configurable split of pipe
>    queues between strict priority (SP) and best-effort (BE) traffic classes,
>    along with support for a larger number of traffic classes (max 16).
> 
>    All the high priority TCs (TC1, TC2, ...) have exactly 1 queue, while
>    the lowest priority BE TC has 1, 4 or 8 queues. This is justified by
>    the fact that all the high priority TCs are fully provisioned (small to
>    medium traffic rates), while most of the traffic fits into the BE class,
>    which is typically oversubscribed.
> 
>    Furthermore, this change allows the use of fewer than 16 queues per pipe
>    when not all 16 queues are needed. Therefore, no memory is allocated
>    for the queues that are not needed.
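
For readers skimming the cover letter: the per-pipe queue count under the
new scheme reduces to simple arithmetic. A minimal sketch (the identifiers
below are my own illustration, not the patchset's API):

/*
 * Old scheme: 4 TCs x 4 WFQ queues = 16 queues per pipe, always.
 * New scheme: every high-priority (SP) TC owns exactly 1 queue and only
 * the lowest-priority best-effort (BE) TC owns 1, 4 or 8 queues.
 */
static inline unsigned int
pipe_queue_count(unsigned int n_traffic_classes, /* up to 16 */
		 unsigned int n_be_queues)       /* 1, 4 or 8 */
{
	/* (n_traffic_classes - 1) SP queues + the BE queue group */
	return (n_traffic_classes - 1) + n_be_queues;
}

/* e.g. 8 TCs with 4 BE queues -> 7 + 4 = 11 queues allocated per pipe,
 * instead of the 16 hardwired previously. */
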
> 
> 2. Subport level configuration of pipe nodes;
> 
>    Currently, all parameters for the pipe nodes (subscribers) configuration
>    are part of the port-level structure, which forces all groups of
>    subscribers (i.e. pipes) in different subports to have the same
>    configuration in terms of their number, queue sizes, traffic classes,
>    etc.
> 
>    The new implementation moves the pipe nodes configuration parameters from
>    the port-level to the subport-level structure. Therefore, different
>    subports of the same port can have different configurations for their
>    pipe nodes (subscribers), for example: number of pipes, queue sizes,
>    queue to traffic-class mapping, etc.
> 
> v2:
> - fix bug in subport parameters check
> - remove redundant RTE_SCHED_SUBPORT_PER_PORT macro
> - fix bug in grinder_scheduler function
> - improve doxygen comments
> - add error log information
> 
> Jasvinder Singh (27):
>   sched: update macros for flexible config
>   sched: update subport and pipe data structures
>   sched: update internal data structures
>   sched: update port config API
>   sched: update port free API
>   sched: update subport config API
>   sched: update pipe profile add API
>   sched: update pipe config API
>   sched: update pkt read and write API
>   sched: update subport and tc queue stats
>   sched: update port memory footprint API
>   sched: update packet enqueue API
>   sched: update grinder pipe and tc cache
>   sched: update grinder next pipe and tc functions
>   sched: update pipe and tc queues prefetch
>   sched: update grinder wrr compute function
>   sched: modify credits update function
>   sched: update mbuf prefetch function
>   sched: update grinder schedule function
>   sched: update grinder handle function
>   sched: update packet dequeue API
>   sched: update sched queue stats API
>   test/sched: update unit test
>   net/softnic: update softnic tm function
>   examples/qos_sched: update qos sched sample app
>   examples/ip_pipeline: update ip pipeline sample app
>   sched: code cleanup
> 
> Lukasz Krakowiak (1):
>   sched: add release note
> 
>  app/test/test_sched.c                         |   39 +-
>  doc/guides/rel_notes/deprecation.rst          |    6 -
>  doc/guides/rel_notes/release_19_08.rst        |    7 +-
>  drivers/net/softnic/rte_eth_softnic.c         |  131 +
>  drivers/net/softnic/rte_eth_softnic_cli.c     |  286 ++-
>  .../net/softnic/rte_eth_softnic_internals.h   |    8 +-
>  drivers/net/softnic/rte_eth_softnic_tm.c      |   89 +-
>  examples/ip_pipeline/cli.c                    |   85 +-
>  examples/ip_pipeline/tmgr.c                   |   22 +-
>  examples/ip_pipeline/tmgr.h                   |    3 -
>  examples/qos_sched/app_thread.c               |   11 +-
>  examples/qos_sched/cfg_file.c                 |  283 ++-
>  examples/qos_sched/init.c                     |  111 +-
>  examples/qos_sched/main.h                     |    7 +-
>  examples/qos_sched/profile.cfg                |   59 +-
>  examples/qos_sched/profile_ov.cfg             |   47 +-
>  examples/qos_sched/stats.c                    |  483 ++--
>  lib/librte_pipeline/rte_table_action.c        |    1 -
>  lib/librte_pipeline/rte_table_action.h        |    4 +-
>  lib/librte_sched/Makefile                     |    2 +-
>  lib/librte_sched/meson.build                  |    2 +-
>  lib/librte_sched/rte_sched.c                  | 2133 ++++++++++-------
>  lib/librte_sched/rte_sched.h                  |  229 +-
>  lib/librte_sched/rte_sched_common.h           |   41 +
>  24 files changed, 2634 insertions(+), 1455 deletions(-)
> 
> --
> 2.21.0

This library is tricky, as validating functional correctness usually requires
more than the binary pass/fail result that suffices for your usual PMD (are
packets coming out? YES/NO). It requires accuracy testing, for example: how
close is the actual pipe/shaper output rate to the expected rate, how accurate
is the strict priority or WFQ scheduling, etc. Therefore, here are a few
accuracy tests that we need to perform on this library to make sure we did
not break any functionality; feel free to reach out to me if more details are
needed:

1. Subport shaper accuracy: Inject line rate into a subport, limit the subport
rate to X% of line rate (X = 5, 10, 15, ..., 90). Check that the subport
output rate matches the rate limit within a tolerance of 1% of the expected
rate (a concrete check is sketched after this list).
2. Subport traffic class rate limiting accuracy: same 1% tolerance
3. Pipe shaper accuracy: same 1% tolerance
4. Pipe traffic class rate limiting accuracy: same 1% tolerance
5. Traffic class strict priority scheduling
6. WFQ for best effort traffic class
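
To make the pass criterion for items 1-4 concrete, the check I have in mind
is simply the following (a sketch; how the rates are measured is up to the
test harness, and measured_rate/line rate below are placeholders):

#include <math.h>

/* Pass if the measured rate is within 1% of the configured limit. */
static inline int
rate_within_tolerance(double measured_rate, double expected_rate)
{
	return fabs(measured_rate - expected_rate) <= 0.01 * expected_rate;
}

/*
 * Subport shaper sweep (test 1): for X = 5, 10, ..., 90, configure the
 * subport token bucket rate to X% of line rate, inject line rate, measure
 * the output rate over a fixed interval, then require
 * rate_within_tolerance(measured, line_rate * X / 100.0).
 */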

On the performance side, we need to make sure we don't get a massive
performance degradation. We need proof points for:
1. 8x traffic classes, 4x best effort queues
2. 8x traffic classes, 1x best effort queue
3. 4x traffic classes, 4x best effort queues
4. 4x traffic classes, 1x best effort queue

On the unit test side, a few tests to highlight:
1. A packet for queue X is enqueued to queue X and dequeued from queue X.
a) Typically tested by sending a single packet and tracing it through the
queues (a rough outline follows below).
b) Should be done for different queue IDs, different numbers of queues per
subport and different numbers of subports.
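
A rough outline of 1a, using the public enqueue/dequeue API (error handling
and port/mbuf setup elided; the pkt write function's signature is changed by
this very series, so adjust to whichever tree you build against; computing
the flat port-level queue id from (subport, pipe, tc, queue) is left to the
harness):

#include <rte_mbuf.h>
#include <rte_sched.h>

static int
test_single_packet_path(struct rte_sched_port *port, struct rte_mbuf *pkt,
			uint32_t subport, uint32_t pipe, uint32_t tc,
			uint32_t queue, uint32_t flat_queue_id)
{
	struct rte_sched_queue_stats qstats;
	struct rte_mbuf *out = NULL;
	uint16_t qlen;

	/* Stamp the scheduling hierarchy fields into the mbuf. */
	rte_sched_port_pkt_write(port, pkt, subport, pipe, tc, queue,
				 RTE_COLOR_GREEN);

	/* Exactly one packet in ... */
	if (rte_sched_port_enqueue(port, &pkt, 1) != 1)
		return -1;

	/* ... exactly one packet out, and it must be the same mbuf. */
	if (rte_sched_port_dequeue(port, &out, 1) != 1 || out != pkt)
		return -1;

	/* The per-queue counters must show the packet went through the
	 * chosen queue (and, by extension, no other). */
	if (rte_sched_queue_read_stats(port, flat_queue_id, &qstats, &qlen))
		return -1;

	return (qstats.n_pkts == 1 && qlen == 0) ? 0 : -1;
}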

I am sending some initial comments on the V2 patches now; more to come during
the next few days.

Regards,
Cristian
