>> > >> >> >
>> > >> >> > + @Van Haaren, Harry
>> > >> >
>> > >> >Hi All,
>> > >> >
>> > >> >I have been away on vacation for the last week - hence the delay
>> > >> >in reply on this thread.
>> > >> >
>> > >> ><snip discussion>
>> > >> >
>> > >> >> > > [1]
>> > >> >> > > Steps to reproduce:
>> > >> >> > > * Clone http://dpdk.org/git/next/dpdk-next-eventdev
>> > >> >> > > * Apply [v5] app/eventdev: add crypto producer mode
>> > >> >> > >   git-pw --server https://patches.dpdk.org/api/1.2/ --project dpdk patch apply 107645
>> > >> >> > > * Apply [RFC] app/eventdev: add software crypto adapter support
>> > >> >> > >   git-pw --server https://patches.dpdk.org/api/1.2/ --project dpdk patch apply 107029
>> > >> >> > > * meson x86_build_debug  -Dc_args='-g -O0' -Ddisable_drivers="*/cnxk"
>> > >> >> > > * ninja -C x86_build_debug
>> > >> >> > > * Command to reproduce crash
>> > >> >> > >   sudo ./x86_build_debug/app/dpdk-test-eventdev -l 0-8 -s 0xf0 --vdev=event_sw0  --vdev="crypto_null" -- --prod_type_cryptodev --crypto_adptr_mode 0 --test=perf_queue --stlist=a --wlcores 1 --plcores 2
>> > >> >
>> > >> >Can confirm that these steps indeed cause segfault as reported.
>> > >> >
>> > >> >In debugging, it seems like there are *zero* NEW events, and large
>> > >> >numbers of RELEASE events are enqueued... if so, this is not
>> > >> >compliant with the Eventdev API.
>> > >> >Can somebody confirm that?
>> > >> >
>> > >> >The SW PMD is being told there are events to release, but there
>> > >> >aren't any. Eventually, this leads to a mismatch in credit
>> > >> >allocations, which then corrupts the IQ-chunks data structure.
>> > >> >
>> > >> >All in all, I'm not convinced this is a SW PMD issue yet - initial
>> > >> >testing points to incorrect event OP NEW/FWD/RELEASE usage. Can we
>> > >> >verify that the OPs being sent are correct?
>> > >> >
>> > >>
>> > >> Looks like an issue in the crypto adapter service. The service
>> > >> starts with OP_FORWARD if RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE
>> > >> is set. Abhinandan can confirm.
>> > >
>> > >The service is started with whatever the application requests from the
>> > >adapter. The app can request either OP_NEW or FWD mode. While creating
>> > >a new instance, the adapter queries the eventdev caps and, based on
>> > >them, enqueues events back to the eventdev in FWD or NEW mode. All
>> > >events are triggered by the application, and the adapter is
>> > >transparent here. Could you please explain how this creates an issue?
>> > >
>> >
>> > In lib/eventdev/rte_event_crypto_adapter.c:
>> > ...
>> > eca_ops_enqueue_burst(struct event_crypto_adapter *adapter, ...
>> >                 rte_memcpy(ev, &m_data->response_info, sizeof(*ev));
>> >                 ev->event_ptr = ops[i];
>> >                 ev->event_type = RTE_EVENT_TYPE_CRYPTODEV;
>> >                 if (adapter->implicit_release_disabled)
>> >                         ev->op = RTE_EVENT_OP_FORWARD;
>> >                 else
>> >                         ev->op = RTE_EVENT_OP_NEW;  ...
>> >
>> > op and event_type are set in the service. Changing FORWARD to NEW will
>> > fix the crash.
>>
>> Yes, I think that is true, but let's ensure we all understand the
>> reason.
>>
>> The crash reported occurs when events with FORWARD are sent into the SW
>> PMD, and later those are RELEASED. Notice, the event was never *NEW*.
>>
>> Eventdev demands that when adding "new" things (e.g. events not
>> previously seen by the PMD) into the Eventdev instance, the type of the
>> event must be NEW. The NEW op type consumes "credits" in the SW PMD,
>> and causes tracking for the NEW events.
>>
>> I think that here the event flow *starts* with FORWARD events (these
>> should be NEW), and hence the crash occurs, because the NEW type was
>> never enqueued first.
>>
>> Shijith suggests changing FORWARD to NEW to fix the crash. I believe that
>> may *fix* the crash here, but doing so without consideration for
>> "implicit-release" mode may break things elsewhere.
>>
>> Is the better fix to ensure that any events being enqueued into Eventdev for
>> the first time are of a NEW type, and once circulated, either FORWARD or
>> NEW can be used in a valid way?
>>
>>
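The credit argument above can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_port`, `toy_op` and `toy_enqueue()` are hypothetical names invented for illustration, and the accounting is a simplification, not the SW PMD's actual data structures. It shows that a FORWARD or RELEASE with no preceding NEW has no credit to act on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical op codes that mirror RTE_EVENT_OP_NEW/FORWARD/RELEASE. */
enum toy_op { TOY_OP_NEW, TOY_OP_FORWARD, TOY_OP_RELEASE };

/* Hypothetical per-port state; this is not the SW PMD's real layout. */
struct toy_port {
    uint32_t credits;   /* credits still available for NEW events */
    uint32_t inflight;  /* events this port has been charged for */
};

/* Enqueue one event; returns false when the op is illegal in this state. */
static bool
toy_enqueue(struct toy_port *port, enum toy_op op)
{
    assert(port != NULL);

    switch (op) {
    case TOY_OP_NEW:
        if (port->credits == 0)
            return false;   /* back-pressure: no credits left */
        port->credits--;
        port->inflight++;
        return true;
    case TOY_OP_FORWARD:
        /* FORWARD reuses the credit taken by an earlier NEW. */
        return port->inflight > 0;
    case TOY_OP_RELEASE:
        if (port->inflight == 0)
            return false;   /* nothing was ever charged to this port */
        port->inflight--;
        port->credits++;
        return true;
    }
    return false;
}
```

In this model, a port that starts with FORWARD gets `false` back straight away. An implementation that skips such a check would instead hand back credits it never took, which matches the mismatch described earlier in the thread.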
>> > I think we should update the spec to state which values are used in
>> > the response info.
>> > I will remove setting the op/event_type fields of the response info in
>> > the application.
>> > The PMD/service can take care of it.
>>
>> I'm not familiar with how the adapter/pmd/service interact - no input from
>> me.
>
>Harry and Shijith, Thanks for all the observations.
>
>After debugging, I think changes are required in both the adapter and the
>application:
>1. Application/Adapter in FWD mode case: The app is forming FWD events as an
>event originator (it is supposed to form NEW events), which is causing the
>crash!
>App fix:
>root@dev:/home/intel/abhi/dpdk-next-eventdev#
>In crypto_adapter_enq_op_fwd() -> change as below:
>-       ev.op = RTE_EVENT_OP_FORWARD;
>+       ev.op = RTE_EVENT_OP_NEW;
>        ev.queue_id = p->queue_id;
>        ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
>
 
Will send v7 with this change + changes to not set op and event_type in 
application.

>2. Adapter in NEW mode case: The app calls rte_cryptodev_enqueue_burst()
>and enqueues crypto ops directly. The adapter has no way to tell whether
>the crypto ops were derived from events or were enqueued directly by the
>application. So, below is the fix for that:
>root@dev:/home/intel/abhi/dpdk-next-eventdev# git diff lib/eventdev/rte_event_crypto_adapter.c
>diff --git a/lib/eventdev/rte_event_crypto_adapter.c b/lib/eventdev/rte_event_crypto_adapter.c
>index 0b484f3695..a6328b853d 100644
>--- a/lib/eventdev/rte_event_crypto_adapter.c
>+++ b/lib/eventdev/rte_event_crypto_adapter.c
>@@ -658,7 +658,9 @@ eca_ops_enqueue_burst(struct event_crypto_adapter *adapter,
>                rte_memcpy(ev, &m_data->response_info, sizeof(*ev));
>                ev->event_ptr = ops[i];
>                ev->event_type = RTE_EVENT_TYPE_CRYPTODEV;
>-               if (adapter->implicit_release_disabled)
>+               if (adapter->mode == RTE_EVENT_CRYPTO_ADAPTER_OP_NEW)
>+                       ev->op = RTE_EVENT_OP_NEW;
>+               else if (adapter->implicit_release_disabled)
>                        ev->op = RTE_EVENT_OP_FORWARD;
>                else
>                        ev->op = RTE_EVENT_OP_NEW;
>
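To make the branch order in the diff easier to check, here is the same decision written as a standalone function. It is a minimal sketch: the enums and `toy_select_op()` are hypothetical stand-ins for `adapter->mode`, `adapter->implicit_release_disabled` and the `RTE_EVENT_OP_*` values, not DPDK definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the DPDK enums named in the diff. */
enum toy_adapter_mode { TOY_ADAPTER_OP_NEW, TOY_ADAPTER_OP_FORWARD };
enum toy_event_op { TOY_EVENT_OP_NEW, TOY_EVENT_OP_FORWARD };

/* Mirrors the if / else-if / else chain proposed in the diff. */
static enum toy_event_op
toy_select_op(enum toy_adapter_mode mode, bool implicit_release_disabled)
{
    /* In OP_NEW mode the adapter originates the event, so it is NEW. */
    if (mode == TOY_ADAPTER_OP_NEW)
        return TOY_EVENT_OP_NEW;
    /* In FORWARD mode, FORWARD is used only without implicit release. */
    if (implicit_release_disabled)
        return TOY_EVENT_OP_FORWARD;
    return TOY_EVENT_OP_NEW;
}
```

With this ordering, FORWARD is produced only when the adapter is in FORWARD mode and implicit release is disabled. Every other combination yields NEW.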
>
>With the above fix, I can run the test for both NEW and FWD mode:
>
>root@xdp-dev:/home/intel/abhi/dpdk-next-eventdev/abhi# ./app/dpdk-test-eventdev -l 0-8 -s 0xf0 --vdev=event_sw0  --vdev="crypto_null" -- --prod_type_cryptodev --crypto_adptr_mode 0 --test=perf_queue --stlist=a --wlcores 1 --plcores 2
>EAL: Detected CPU lcores: 96
>EAL: Detected NUMA nodes: 2
>EAL: Detected static linkage of DPDK
>EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>EAL: Selected IOVA mode 'PA'
>EAL: VFIO support initialized
>CRYPTODEV: Creating cryptodev crypto_null
>
>CRYPTODEV: Initialisation parameters - name: crypto_null,socket id: 0, max queue pairs: 8
>TELEMETRY: No legacy callbacks, legacy socket not created
>        driver               : event_sw
>        test                 : perf_queue
>        dev                  : 0
>        verbose_level        : 1
>        socket_id            : -1
>        pool_sz              : 16384
>        main lcore           : 0
>        nb_pkts              : 67108864
>        nb_timers            : 100000000
>        available lcores     : {0 1 2 3 8}
>        nb_flows             : 1024
>        worker deq depth     : 16
>        fwd_latency          : false
>        nb_prod_lcores       : 1
>        producer lcores      : {2}
>        nb_worker_lcores     : 1
>        worker lcores        : {1}
>        nb_stages            : 1
>        nb_evdev_ports       : 2
>        nb_evdev_queues      : 1
>        queue_priority       : false
>        sched_type_list      : {A}
>        crypto adapter mode  : OP_NEW
>        nb_cryptodev         : 1
>        prod_type            : Event crypto adapter producers
>        prod_enq_burst_sz    : 1
>CRYPTODEV: elt_size 0 is expanded to 208
>
>0.000 mpps avg 0.000 mppsEventDev todo-fix-name: ports 3, qids 1
>        rx   6080
>        drop 0
>        tx   2040
>        sched calls: 9064463
>        sched cq/qid call: 9069025
>        sched no IQ enq: 9063156
>        sched no CQ enq: 9063759
>        inflight 4096, credits: 0
>  Port 0
>        rx   0  drop 0  tx   2024       inflight 0
>        Max New: 4096   Avg cycles PP: 745      Credits: 40
>        Receive burst distribution:
>                0:100% 1-4:0.00% 5-8:0.00% 9-12:0.00% 13-16:0.00%
>        rx ring used:    0      free: 4096
>        cq ring used:    0      free:   16
>  Port 1
>        rx   0  drop 0  tx   0  inflight 0
>        Max New: 4096   Avg cycles PP: 0        Credits: 0
>        Receive burst distribution:
>                0:-nan%
>        rx ring used:    0      free: 4096
>        cq ring used:    0      free:   16
>  Port 2
>        rx   6080       drop 0  tx   16 inflight 16
>        Max New: 4096   Avg cycles PP: 0        Credits: 0
>        Receive burst distribution:
>                0:-nan%
>        rx ring used:    0      free: 4096
>        cq ring used:   16      free:    0
>  Queue 0 (Atomic)
>        rx   6080       drop 0  tx   2040
>        Per Port Stats:
>          Port 0: Pkts: 2024    Flows: 0
>          Port 1: Pkts: 0       Flows: 0
>          Port 2: Pkts: 16      Flows: 22
>        iq 1: Used 4040
>error: perf_launch_lcores() No schedules for seconds, deadlock
>
>Packet distribution across worker cores :
>Worker 0 packets: 7e8 percentage: 100.00
>Result: Failed
>
>
>root@xdp-dev:/home/intel/abhi/dpdk-next-eventdev/abhi# ./app/dpdk-test-eventdev -l 0-8 -s 0xf0 --vdev=event_sw0  --vdev="crypto_null" -- --prod_type_cryptodev --crypto_adptr_mode 1 --test=perf_queue --stlist=a --wlcores 1 --plcores 2
>EAL: Detected CPU lcores: 96
>EAL: Detected NUMA nodes: 2
>EAL: Detected static linkage of DPDK
>EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>EAL: Selected IOVA mode 'PA'
>EAL: VFIO support initialized
>CRYPTODEV: Creating cryptodev crypto_null
>
>CRYPTODEV: Initialisation parameters - name: crypto_null,socket id: 0, max queue pairs: 8
>TELEMETRY: No legacy callbacks, legacy socket not created
>        driver               : event_sw
>        test                 : perf_queue
>        dev                  : 0
>        verbose_level        : 1
>        socket_id            : -1
>        pool_sz              : 16384
>        main lcore           : 0
>        nb_pkts              : 67108864
>        nb_timers            : 100000000
>        available lcores     : {0 1 2 3 8}
>        nb_flows             : 1024
>        worker deq depth     : 16
>        fwd_latency          : false
>        nb_prod_lcores       : 1
>        producer lcores      : {2}
>        nb_worker_lcores     : 1
>        worker lcores        : {1}
>        nb_stages            : 1
>        nb_evdev_ports       : 2
>        nb_evdev_queues      : 1
>        queue_priority       : false
>        sched_type_list      : {A}
>        crypto adapter mode  : OP_FORWARD
>        nb_cryptodev         : 1
>        prod_type            : Event crypto adapter producers
>        prod_enq_burst_sz    : 1
>CRYPTODEV: elt_size 0 is expanded to 208
>
>0.000 mpps avg 0.000 mppsEventDev todo-fix-name: ports 3, qids 1
>        rx   4480
>        drop 0
>        tx   447
>        sched calls: 8438712
>        sched cq/qid call: 8442432
>        sched no IQ enq: 8438434
>        sched no CQ enq: 8438494
>        inflight 4096, credits: 0
>  Port 0
>        rx   0  drop 0  tx   431        inflight 0
>        Max New: 4096   Avg cycles PP: 637      Credits: 47
>        Receive burst distribution:
>                0:100% 1-4:0.00% 5-8:0.00% 13-16:0.00%
>        rx ring used:    0      free: 4096
>        cq ring used:    0      free:   16
>  Port 1
>        rx   4480       drop 0  tx   0  inflight 0
>        Max New: 4096   Avg cycles PP: 0        Credits: 0
>        Receive burst distribution:
>                0:-nan%
>        rx ring used:    0      free: 4096
>        cq ring used:    0      free:   16
>  Port 2
>        rx   0  drop 0  tx   16 inflight 16
>        Max New: 4096   Avg cycles PP: 0        Credits: 0
>        Receive burst distribution:
>                0:-nan%
>        rx ring used:    0      free: 4096
>        cq ring used:   16      free:    0
>  Queue 0 (Atomic)
>        rx   4480       drop 0  tx   447
>        Per Port Stats:
>          Port 0: Pkts: 431     Flows: 0
>          Port 1: Pkts: 0       Flows: 0
>          Port 2: Pkts: 16      Flows: 1
>        iq 0: Used 4033
>error: perf_launch_lcores() No schedules for seconds, deadlock
>
>Packet distribution across worker cores :
>Worker 0 packets: 1af percentage: 100.00
>Result: Failed
>
>@Shijith Thotton, Any idea why the test is failing?

I'm not sure what the issue is here.

>Meantime, I will get the rest of the app code reviewed.
>I think, we can get both RFC and crypto producer patches in.
>

Better to keep both patches separate. The RFC can be merged after fixing the above issue.
