> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Thursday, August 24, 2017 11:48 AM
> To: dev@dpdk.org
> Cc: De Lara Guarch, Pablo <pablo.de.lara.gua...@intel.com>; Doherty,
> Declan <declan.dohe...@intel.com>; Burakov, Anatoly
> <anatoly.bura...@intel.com>
> Subject: [PATCH 3/4] test-crypto-perf: add new PMD benchmarking mode
>
> This patch adds a new benchmarking mode, which is intended for
> microbenchmarking individual parts of the cryptodev framework,
> specifically crypto ops alloc-build-free, cryptodev PMD enqueue and
> cryptodev PMD dequeue.
>
> It works by first benchmarking crypto operation alloc-build-free loop (no
> enqueues/dequeues happening), and then benchmarking enqueue and
> dequeue separately, by first completely filling up the TX queue, and then
> completely draining the RX queue.
>
> Results are shown as cycle counts per alloc/build/free, PMD enqueue and
> PMD dequeue.
>
> One new test mode is added: "pmd-cyclecount"
> (called with --ptest=pmd-cyclecount)
>
> New command-line argument is also added:
>   --pmd-cyclecount-delay-ms: this is a pmd-cyclecount-specific parameter
>     that controls the delay between enqueue and dequeue. This is
>     useful for benchmarking hardware acceleration, as hardware may
>     not be able to keep up with enqueued packets. This parameter
>     can be increased if there are large amounts of dequeue
>     retries.
>
> Signed-off-by: Burakov, Anatoly <anatoly.bura...@intel.com>
check-git-log is complaining about this tag: the author is "Anatoly Burakov",
but here it is "Burakov, Anatoly". Looks like this is something to fix.

> ---
...
>
> diff --git a/app/test-crypto-perf/cperf_test_pmd_cyclecount.c b/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
> new file mode 100644
> index 0000000..ef7fa83
...
> +void *
> +cperf_pmd_cyclecount_test_constructor(struct rte_mempool *sess_mp,
> +		uint8_t dev_id, uint16_t qp_id,
> +		const struct cperf_options *options,
> +		const struct cperf_test_vector *test_vector,
> +		const struct cperf_op_fns *op_fns)
> +{
> +	struct cperf_pmd_cyclecount_ctx *ctx = NULL;
> +	unsigned int mbuf_idx = 0;
> +	char pool_name[32] = "";
> +
> +	/* preallocate buffers for crypto ops as they can get quite big */
> +	size_t alloc_sz = sizeof(struct rte_crypto_op *) *
> +			options->nb_descriptors;
> +
> +	ctx = rte_malloc(NULL, sizeof(struct cperf_pmd_cyclecount_ctx), 0);
> +	if (ctx == NULL)
> +		goto err;
> +
> +	ctx->dev_id = dev_id;
> +	ctx->qp_id = qp_id;
> +
> +	ctx->populate_ops = op_fns->populate_ops;
> +	ctx->options = options;
> +	ctx->test_vector = test_vector;
> +
> +	/* IV goes at the end of the cryptop operation */

Typo here: "cryptop" should be "crypto".

> +	uint16_t iv_offset = sizeof(struct rte_crypto_op) +
> +			sizeof(struct rte_crypto_sym_op);
> +
...
> +
> +	snprintf(pool_name, sizeof(pool_name), "cperf_op_pool_cdev_%d",
> +			dev_id);
> +
> +	uint16_t priv_size = test_vector->cipher_iv.length +
> +			test_vector->auth_iv.length;

This is missing "+ test_vector->aead_iv.length", which was added in a patch
at the end of the previous release (see the sketch after this function).

> +
...
> +/* benchmark alloc-build-free of ops */
> +static inline int
> +pmd_cyclecount_bench_ops(struct pmd_cyclecount_state *state, uint32_t cur_op,
> +                         uint16_t test_burst_size)

Use two tabs for the next line, instead of aligning with the parenthesis
above (example below the function). Look at other functions that have the
same issue.

> +{
> +	uint32_t iter_ops_left = state->opts->total_ops - cur_op;
> +	uint32_t iter_ops_needed = RTE_MIN(state->opts->nb_descriptors,
> +			iter_ops_left);
> +	uint32_t cur_iter_op;
> +
> +	for (cur_iter_op = 0; cur_iter_op < iter_ops_needed;
> +	     cur_iter_op += test_burst_size) {

Same comment as above, about the alignment.

> +		uint32_t burst_size = RTE_MIN(state->opts->total_ops - cur_op,
> +				test_burst_size);
> +		struct rte_crypto_op **ops = &state->ctx->ops[cur_iter_op];
> +
> +		if (burst_size != rte_crypto_op_bulk_alloc(
> +			state->ctx->crypto_op_pool,
> +			RTE_CRYPTO_OP_TYPE_SYMMETRIC,
> +			ops, burst_size))

Same as above, add an extra tab.

> +			return -1;
> +
> +		/* Setup crypto op, attach mbuf etc */
> +		(state->ctx->populate_ops)(ops,
> +				&state->ctx->mbufs_in[cur_iter_op],
> +				&state->ctx->mbufs_out[cur_iter_op],
> +				burst_size, state->ctx->sess,
> +				state->opts,
> +				state->ctx->test_vector, iv_offset);
> +
> +#ifdef CPERF_LINEARIZATION_ENABLE
> +		/* Check if source mbufs require coalescing */
> +		if (state->linearize) {
> +			uint8_t i;
> +			for (i = 0; i < burst_size; i++) {
> +				struct rte_mbuf *src =
> +					ops[i]->sym->m_src;
> +				rte_pktmbuf_linearize(src);
> +			}
> +		}
> +#endif /* CPERF_LINEARIZATION_ENABLE */
> +		rte_mempool_put_bulk(state->ctx->crypto_op_pool,
> +				(void **)ops, burst_size);
> +	}
> +
> +	return 0;
> +}
> +
...
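To illustrate the two comments above (the missing AEAD IV length term and the
two-tab continuation indent), this is roughly what I mean; only a sketch, not
compile-tested:

	/* account for AEAD IV as well as cipher/auth IVs */
	uint16_t priv_size = test_vector->cipher_iv.length +
			test_vector->auth_iv.length +
			test_vector->aead_iv.length;

and, for the function signature:

/* benchmark alloc-build-free of ops */
static inline int
pmd_cyclecount_bench_ops(struct pmd_cyclecount_state *state, uint32_t cur_op,
		uint16_t test_burst_size)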
> +
> +/* benchmark enqueue, returns number of ops enqueued */
> +static uint32_t
> +pmd_cyclecount_bench_enq(struct pmd_cyclecount_state *state,
> +		uint32_t iter_ops_needed, uint16_t test_burst_size)
> +{
> +	/* Enqueue full descriptor ring of ops on crypto device */
> +	uint32_t cur_iter_op = 0;
> +	while (cur_iter_op < iter_ops_needed) {
> +		uint32_t burst_size =
> +			RTE_MIN(iter_ops_needed - cur_iter_op,
> +				test_burst_size);
> +		struct rte_crypto_op **ops =
> +			&state->ctx->ops[cur_iter_op];
> +		uint32_t burst_enqd;
> +
> +		burst_enqd =

No need to place the function call on a separate line, I think (see the
suggestion after this function).

> +			rte_cryptodev_enqueue_burst(
> +				state->ctx->dev_id,
> +				state->ctx->qp_id,
> +				ops, burst_size);
> +
> +		/* if we couldn't enqueue anything, the queue is full */
> +		if (!burst_enqd) {
> +			/* don't try to dequeue anything we didn't enqueue */
> +			return cur_iter_op;
> +		}
> +
> +		if (burst_enqd < burst_size)
> +			state->ops_enq_retries++;
> +		state->ops_enqd += burst_enqd;
> +		cur_iter_op += burst_enqd;
> +	}
> +	return iter_ops_needed;
> +}
> +
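Something along these lines is what I had in mind for the enqueue above; same
call and arguments, just a layout suggestion, not compile-tested:

		/* enqueue the current burst directly in the assignment */
		burst_enqd = rte_cryptodev_enqueue_burst(
				state->ctx->dev_id, state->ctx->qp_id,
				ops, burst_size);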