> -----Original Message-----
> From: Van Haaren, Harry <harry.van.haa...@intel.com>
> Sent: Wednesday, February 23, 2022 3:43 PM
> To: Shijith Thotton <sthot...@marvell.com>; Gujjar, Abhinandan S
> <abhinandan.guj...@intel.com>; Jerin Jacob <jerinjac...@gmail.com>;
> Hemant Agrawal <hemant.agra...@nxp.com>; Nipun Gupta
> <nipun.gu...@nxp.com>
> Cc: Jerin Jacob Kollanukkaran <jer...@marvell.com>; dev@dpdk.org
> Subject: RE: [PATCH v5] app/eventdev: add crypto producer mode
>
> > -----Original Message-----
> > From: Shijith Thotton <sthot...@marvell.com>
> > Sent: Wednesday, February 23, 2022 10:02 AM
> > To: Gujjar, Abhinandan S <abhinandan.guj...@intel.com>; Van Haaren,
> > Harry <harry.van.haa...@intel.com>; Jerin Jacob
> > <jerinjac...@gmail.com>; Hemant Agrawal <hemant.agra...@nxp.com>;
> > Nipun Gupta <nipun.gu...@nxp.com>
> > Cc: Jerin Jacob Kollanukkaran <jer...@marvell.com>; dev@dpdk.org
> > Subject: RE: [PATCH v5] app/eventdev: add crypto producer mode
> >
> > >-----Original Message-----
> > >From: Gujjar, Abhinandan S <abhinandan.guj...@intel.com>
> > >Sent: Wednesday, February 23, 2022 2:32 PM
> > >To: Shijith Thotton <sthot...@marvell.com>; Van Haaren, Harry
> > ><harry.van.haa...@intel.com>; Jerin Jacob <jerinjac...@gmail.com>;
> > >Hemant Agrawal <hemant.agra...@nxp.com>; Nipun Gupta
> > ><nipun.gu...@nxp.com>
> > >Cc: Jerin Jacob Kollanukkaran <jer...@marvell.com>; dev@dpdk.org
> > >Subject: [EXT] RE: [PATCH v5] app/eventdev: add crypto producer mode
> > >
> > >> -----Original Message-----
> > >> From: Shijith Thotton <sthot...@marvell.com>
> > >> Sent: Tuesday, February 22, 2022 12:34 PM
> > >> To: Van Haaren, Harry <harry.van.haa...@intel.com>; Gujjar,
> > >> Abhinandan S <abhinandan.guj...@intel.com>; Jerin Jacob
> > >> <jerinjac...@gmail.com>; Hemant Agrawal <hemant.agra...@nxp.com>;
> > >> Nipun Gupta <nipun.gu...@nxp.com>
> > >> Cc: Jerin Jacob Kollanukkaran <jer...@marvell.com>; dev@dpdk.org
> > >> Subject: RE: [PATCH v5] app/eventdev: add crypto producer mode
> > >>
> > >> >> > + @Van Haaren, Harry
> > >> >
> > >> >Hi All,
> > >> >
> > >> >I have been away on vacation for the last week - hence the delay
> > >> >in reply on this thread.
> > >> >
> > >> ><snip discussion>
> > >> >
> > >> >> > > [1]
> > >> >> > > Steps to reproduce:
> > >> >> > > * Clone http://dpdk.org/git/next/dpdk-next-eventdev
> > >> >> > > * Apply [v5] app/eventdev: add crypto producer mode
> > >> >> > >   git-pw --server https://patches.dpdk.org/api/1.2/ --project dpdk patch apply 107645
> > >> >> > > * Apply [RFC] app/eventdev: add software crypto adapter support
> > >> >> > >   git-pw --server https://patches.dpdk.org/api/1.2/ --project dpdk patch apply 107029
> > >> >> > > * meson x86_build_debug -Dc_args='-g -O0' -Ddisable_drivers="*/cnxk"
> > >> >> > > * ninja -C x86_build_debug
> > >> >> > > * Command to reproduce crash
> > >> >> > >   sudo ./x86_build_debug/app/dpdk-test-eventdev -l 0-8 -s 0xf0
> > >> >> > >   --vdev=event_sw0 --vdev="crypto_null" --
> > >> >> > >   --prod_type_cryptodev --crypto_adptr_mode 0
> > >> >> > >   --test=perf_queue --stlist=a --wlcores 1 --plcores 2
> > >> >
> > >> >Can confirm that these steps indeed cause segfault as reported.
> > >> >
> > >> >In debugging, it seems like there are *zero* NEW events, and large
> > >> >numbers of RELEASE events are enqueued... if so, this is not
> > >> >compliant to the Eventdev API.
> > >> >Can somebody confirm that?
> > >> >
> > >> >The SW PMD is being told there are events to release, but there
> > >> >aren't any.
> > >> >Eventually, this leads to a mismatch in credit allocations, which
> > >> >then causes the IQ-chunks datastructure to corrupt.
> > >> >
> > >> >All in all, I'm not convinced this is a SW PMD issue yet - initial
> > >> >testing points to incorrect event OP NEW/FWD/RELEASE usage. Can we
> > >> >verify that the OPs being sent are correct?
> > >> >
> > >>
> > >> Looks like an issue in crypto adapter service. The service is
> > >> starting with OP_FORWARD, if
> > >> RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE is set.
> > >> Abhinandan can confirm.
> > >
> > >The service is started with what application is requesting for from
> > >the adapter.
> > >The app can request either OP_NEW or FWD mode. Adapter while creating
> > >a new instance requests for evendev caps & based on that adapter
> > >enqueues events back to evdev in FWD or NEW mode. All events are
> > >triggered by application and adapter is transparent here. Could you
> > >please explain me how this creating an issue?
> > >
> >
> > In lib/eventdev/rte_event_crypto_adapter.c:
> > ...
> > eca_ops_enqueue_burst(struct event_crypto_adapter *adapter,
> > ...
> >         rte_memcpy(ev, &m_data->response_info, sizeof(*ev));
> >         ev->event_ptr = ops[i];
> >         ev->event_type = RTE_EVENT_TYPE_CRYPTODEV;
> >         if (adapter->implicit_release_disabled)
> >                 ev->op = RTE_EVENT_OP_FORWARD;
> >         else
> >                 ev->op = RTE_EVENT_OP_NEW;
> > ...
> >
> > op and event_type is set in the service. Changing FORWARD to NEW will
> > fix the crash.
>
> Yes, I think that is true, but lets ensure we're all understanding the
> reason.
>
> The crash reported occurs when events with FORWARD are sent into the SW
> PMD, and later those are RELEASED. Notice, the event was never *NEW*.
>
> Eventdev demands that when adding "new" things (e.g. events not
> previously seen by the PMD) into the Eventdev instance, the type of the
> event must be NEW. The NEW op type consumes "credits" in the SW PMD,
> and causes tracking for the NEW events.
>
> I think that here the events *starts* with FORWARD events (should be NEW),
> and hence the crash occurs, because the NEW type was never enqueued
> first.
>
> Shijith suggests changing FORWARD to NEW to fix the crash, I believe that
> may *fix* the crash here, but doing so without consideration for
> "implicit-release" mode may break things elsewhere.
>
> Is the better fix to ensure that any events being enqueued into Eventdev for
> the first time are of a NEW type, and once circulated, either FORWARD or
> NEW can be used in a valid way?
>
> > I think, we should update the spec with what all values are used in
> > response info.
> > I will remove setting op/event_type field of response info in the
> > application.
> > PMD/service can take care of it.
>
> I'm not familiar with how the adapter/pmd/service interact - no input from
> me.
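As a rough illustration of the credit accounting described in the quoted text above, here is a toy model in C. It is only a sketch under stated assumptions: the `toy_*` names are hypothetical, nothing here is the SW PMD's actual code, and the real PMD's internal tracking is more involved. The model assumes NEW takes a credit, FORWARD leaves the count unchanged, and RELEASE hands a credit back.

```c
#include <stdbool.h>

/* Hypothetical op codes; they only mirror the NEW/FORWARD/RELEASE idea. */
enum toy_op { TOY_OP_NEW, TOY_OP_FORWARD, TOY_OP_RELEASE };

/* Toy scheduler state (an assumption, not the SW PMD layout). */
struct toy_sched {
	int inflight;     /* credits currently held by admitted events */
	int max_inflight; /* credit limit */
	bool mismatch;    /* set when a credit is returned that was never taken */
};

/*
 * Apply one enqueue op to the toy state.
 * Returns true when the op is consistent with the credit accounting.
 */
bool toy_enqueue(struct toy_sched *s, enum toy_op op)
{
	switch (op) {
	case TOY_OP_NEW:
		if (s->inflight >= s->max_inflight)
			return false; /* out of credits: back-pressure only */
		s->inflight++;
		return true;
	case TOY_OP_FORWARD:
		/* No credit is taken, so an event that starts its life as
		 * FORWARD is never accounted for in this model. */
		return true;
	case TOY_OP_RELEASE:
		if (s->inflight == 0) {
			/* Releasing an event that was never NEW. */
			s->mismatch = true;
			return false;
		}
		s->inflight--;
		return true;
	}
	return false;
}
```

In this model the sequence NEW, FORWARD, RELEASE leaves the count at zero with no mismatch, while FORWARD followed by RELEASE on a fresh state sets the mismatch flag. The toy flags the inconsistency explicitly; whether and how a real PMD detects it is outside what this sketch claims.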
Harry and Shijith,

Thanks for all the observations. After debugging, I think changes are required in both the adapter and the application:

1. Application/Adapter in FWD mode case:
The app is forming FWD events as an event originator (it is supposed to form NEW events), which is causing the crash!

App fix:
In crypto_adapter_enq_op_fwd() -> change as below:
-               ev.op = RTE_EVENT_OP_FORWARD;
+               ev.op = RTE_EVENT_OP_NEW;
                ev.queue_id = p->queue_id;
                ev.sched_type = RTE_SCHED_TYPE_ATOMIC;

2. Adapter in NEW mode case:
The app calls rte_cryptodev_enqueue_burst() and enqueues crypto ops directly. The adapter had no way to tell whether crypto ops were derived from events or were enqueued directly by the application. So, below is the fix for that:

root@dev:/home/intel/abhi/dpdk-next-eventdev# git diff lib/eventdev/rte_event_crypto_adapter.c
diff --git a/lib/eventdev/rte_event_crypto_adapter.c b/lib/eventdev/rte_event_crypto_adapter.c
index 0b484f3695..a6328b853d 100644
--- a/lib/eventdev/rte_event_crypto_adapter.c
+++ b/lib/eventdev/rte_event_crypto_adapter.c
@@ -658,7 +658,9 @@ eca_ops_enqueue_burst(struct event_crypto_adapter *adapter,
                rte_memcpy(ev, &m_data->response_info, sizeof(*ev));
                ev->event_ptr = ops[i];
                ev->event_type = RTE_EVENT_TYPE_CRYPTODEV;
-               if (adapter->implicit_release_disabled)
+               if (adapter->mode == RTE_EVENT_CRYPTO_ADAPTER_OP_NEW)
+                       ev->op = RTE_EVENT_OP_NEW;
+               else if (adapter->implicit_release_disabled)
                        ev->op = RTE_EVENT_OP_FORWARD;
                else
                        ev->op = RTE_EVENT_OP_NEW;

With the above fixes the test no longer crashes in either NEW or FWD mode, but it ends with "Result: Failed" in both:

root@xdp-dev:/home/intel/abhi/dpdk-next-eventdev/abhi# ./app/dpdk-test-eventdev -l 0-8 -s 0xf0 --vdev=event_sw0 --vdev="crypto_null" -- --prod_type_cryptodev --crypto_adptr_mode 0 --test=perf_queue --stlist=a --wlcores 1 --plcores 2
EAL: Detected CPU lcores: 96
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
CRYPTODEV: Creating cryptodev crypto_null
CRYPTODEV: Initialisation parameters - name: crypto_null,socket id: 0, max queue pairs: 8
TELEMETRY: No legacy callbacks, legacy socket not created
        driver : event_sw
        test : perf_queue
        dev : 0
        verbose_level : 1
        socket_id : -1
        pool_sz : 16384
        main lcore : 0
        nb_pkts : 67108864
        nb_timers : 100000000
        available lcores : {0 1 2 3 8}
        nb_flows : 1024
        worker deq depth : 16
        fwd_latency : false
        nb_prod_lcores : 1
        producer lcores : {2}
        nb_worker_lcores : 1
        worker lcores : {1}
        nb_stages : 1
        nb_evdev_ports : 2
        nb_evdev_queues : 1
        queue_priority : false
        sched_type_list : {A}
        crypto adapter mode : OP_NEW
        nb_cryptodev : 1
        prod_type : Event crypto adapter producers
        prod_enq_burst_sz : 1
CRYPTODEV: elt_size 0 is expanded to 208
0.000 mpps avg 0.000 mpps
EventDev todo-fix-name: ports 3, qids 1
        rx 6080
        drop 0
        tx 2040
        sched calls: 9064463
        sched cq/qid call: 9069025
        sched no IQ enq: 9063156
        sched no CQ enq: 9063759
        inflight 4096, credits: 0
  Port 0
        rx 0 drop 0 tx 2024 inflight 0
        Max New: 4096 Avg cycles PP: 745 Credits: 40
        Receive burst distribution:
                0:100% 1-4:0.00% 5-8:0.00% 9-12:0.00% 13-16:0.00%
        rx ring used: 0 free: 4096
        cq ring used: 0 free: 16
  Port 1
        rx 0 drop 0 tx 0 inflight 0
        Max New: 4096 Avg cycles PP: 0 Credits: 0
        Receive burst distribution:
                0:-nan%
        rx ring used: 0 free: 4096
        cq ring used: 0 free: 16
  Port 2
        rx 6080 drop 0 tx 16 inflight 16
        Max New: 4096 Avg cycles PP: 0 Credits: 0
        Receive burst distribution:
                0:-nan%
        rx ring used: 0 free: 4096
        cq ring used: 16 free: 0
  Queue 0 (Atomic)
        rx 6080 drop 0 tx 2040
        Per Port Stats:
          Port 0: Pkts: 2024 Flows: 0
          Port 1: Pkts: 0 Flows: 0
          Port 2: Pkts: 16 Flows: 22
        iq 1: Used 4040
error: perf_launch_lcores() No schedules for seconds, deadlock
Packet distribution across worker cores :
Worker 0 packets: 7e8 percentage: 100.00
Result: Failed

root@xdp-dev:/home/intel/abhi/dpdk-next-eventdev/abhi# ./app/dpdk-test-eventdev -l 0-8 -s 0xf0 --vdev=event_sw0 --vdev="crypto_null" -- --prod_type_cryptodev --crypto_adptr_mode 1 --test=perf_queue --stlist=a --wlcores 1 --plcores 2
EAL: Detected CPU lcores: 96
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
CRYPTODEV: Creating cryptodev crypto_null
CRYPTODEV: Initialisation parameters - name: crypto_null,socket id: 0, max queue pairs: 8
TELEMETRY: No legacy callbacks, legacy socket not created
        driver : event_sw
        test : perf_queue
        dev : 0
        verbose_level : 1
        socket_id : -1
        pool_sz : 16384
        main lcore : 0
        nb_pkts : 67108864
        nb_timers : 100000000
        available lcores : {0 1 2 3 8}
        nb_flows : 1024
        worker deq depth : 16
        fwd_latency : false
        nb_prod_lcores : 1
        producer lcores : {2}
        nb_worker_lcores : 1
        worker lcores : {1}
        nb_stages : 1
        nb_evdev_ports : 2
        nb_evdev_queues : 1
        queue_priority : false
        sched_type_list : {A}
        crypto adapter mode : OP_FORWARD
        nb_cryptodev : 1
        prod_type : Event crypto adapter producers
        prod_enq_burst_sz : 1
CRYPTODEV: elt_size 0 is expanded to 208
0.000 mpps avg 0.000 mpps
EventDev todo-fix-name: ports 3, qids 1
        rx 4480
        drop 0
        tx 447
        sched calls: 8438712
        sched cq/qid call: 8442432
        sched no IQ enq: 8438434
        sched no CQ enq: 8438494
        inflight 4096, credits: 0
  Port 0
        rx 0 drop 0 tx 431 inflight 0
        Max New: 4096 Avg cycles PP: 637 Credits: 47
        Receive burst distribution:
                0:100% 1-4:0.00% 5-8:0.00% 13-16:0.00%
        rx ring used: 0 free: 4096
        cq ring used: 0 free: 16
  Port 1
        rx 4480 drop 0 tx 0 inflight 0
        Max New: 4096 Avg cycles PP: 0 Credits: 0
        Receive burst distribution:
                0:-nan%
        rx ring used: 0 free: 4096
        cq ring used: 0 free: 16
  Port 2
        rx 0 drop 0 tx 16 inflight 16
        Max New: 4096 Avg cycles PP: 0 Credits: 0
        Receive burst distribution:
                0:-nan%
        rx ring used: 0 free: 4096
        cq ring used: 16 free: 0
  Queue 0 (Atomic)
        rx 4480 drop 0 tx 447
        Per Port Stats:
          Port 0: Pkts: 431 Flows: 0
          Port 1: Pkts: 0 Flows: 0
          Port 2: Pkts: 16 Flows: 1
        iq 0: Used 4033
error: perf_launch_lcores() No schedules for seconds, deadlock
Packet distribution across worker cores :
Worker 0 packets: 1af percentage: 100.00
Result: Failed

@Shijith Thotton, any idea why the test is failing?
In the meantime, I will get the rest of the app code reviewed. I think we can get both the RFC and the crypto producer patches in.

Regards
Abhinandan
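The op selection in the adapter diff above can be restated as a small standalone decision function. This is a minimal sketch, not the adapter code: the `toy_*` enums and function names are hypothetical stand-ins for the adapter mode and event op values, and only the branch order is taken from the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the adapter mode and the event op. */
enum toy_adapter_mode { TOY_MODE_OP_NEW, TOY_MODE_OP_FORWARD };
enum toy_event_op { TOY_EV_OP_NEW, TOY_EV_OP_FORWARD };

/* Selection before the change: only the capability flag is consulted,
 * so the configured adapter mode has no influence on the result. */
enum toy_event_op toy_select_op_before(enum toy_adapter_mode mode,
				       bool implicit_release_disabled)
{
	(void)mode; /* ignored on purpose, as in the pre-patch branch */
	return implicit_release_disabled ? TOY_EV_OP_FORWARD : TOY_EV_OP_NEW;
}

/* Selection after the change: OP_NEW mode always yields a NEW event;
 * otherwise the capability flag decides, as before. */
enum toy_event_op toy_select_op_after(enum toy_adapter_mode mode,
				      bool implicit_release_disabled)
{
	if (mode == TOY_MODE_OP_NEW)
		return TOY_EV_OP_NEW;
	if (implicit_release_disabled)
		return TOY_EV_OP_FORWARD;
	return TOY_EV_OP_NEW;
}
```

The two functions differ in only one case: OP_NEW mode with implicit release disabled, where the old branch yields FORWARD and the new one yields NEW. That is the case where crypto ops enqueued directly by the application would otherwise come back as events that were never NEW.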