On 11/11/20 3:27 PM, Kegl Rohit wrote:
> Hello!
>
> We are using an imx6q platform.
> The fec interface is used to receive a continuous stream of custom /
> raw ethernet packets. The packet size is fixed ~132 bytes and they get
> sent every 250µs.
>
> While testing I observed spontaneous packet delays from time to time.
> After digging deeper I think that the fec peripheral does not
> update the rx descriptor status correctly.
> I modified the queue_rx function which is called by the NAPI poll
> function. "no packet N" is printed when the queue_rx function doesn't
> process any descriptor.
> Therefore the variable N counts consecutive calls without ready
> descriptors. When the current descriptor is ready, gets processed, and
> the driver moves on to the next entry, N is cleared again.
> Additionally an error is printed if the current descriptor is empty
> but the next one is already ready. In case this error happens the
> current descriptor and the next 11 ones are dumped.
> "C" ... current
> "E" ... empty
>
> [ 57.436478 < 0.020005>] no packet 1!
> [ 57.460850 < 0.024372>] no packet 1!
> [ 57.461107 < 0.000257>] ring error, current empty but next is not empty
> [ 57.461118 < 0.000011>] RX ahead
> [ 57.461135 < 0.000017>] 129 C E 0x8840 0x2c743a40 132
> [ 57.461146 < 0.000011>] 130 0x0840 0x2c744180 132
> [ 57.461158 < 0.000012>] 131 E 0x8840 0x2c7448c0 132
> [ 57.461170 < 0.000012>] 132 E 0x8840 0x2c745000 132
> [ 57.461181 < 0.000011>] 133 E 0x8840 0x2c745740 132
> [ 57.461192 < 0.000011>] 134 E 0x8840 0x2c745e80 132
> [ 57.461204 < 0.000012>] 135 E 0x8880 0x2c7465c0 114
> [ 57.461215 < 0.000011>] 136 E 0x8840 0x2c746d00 132
> [ 57.461227 < 0.000012>] 137 E 0x8840 0x2c747440 132
> [ 57.461239 < 0.000012>] 138 E 0x8840 0x2c748040 132
> [ 57.461250 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.461262 < 0.000012>] 140 E 0x8840 0x2c748ec0 132
> [ 57.461477 < 0.000008>] no packet 2!
> [ 57.461506 < 0.000029>] ring error, current empty but next is not empty
> [ 57.461537 < 0.000031>] RX ahead
> [ 57.461550 < 0.000013>] 129 C E 0x8840 0x2c743a40 132
> [ 57.461563 < 0.000013>] 130 0x0840 0x2c744180 132
> [ 57.461577 < 0.000014>] 131 0x0840 0x2c7448c0 132
> [ 57.461589 < 0.000012>] 132 0x0840 0x2c745000 132
> [ 57.461601 < 0.000012>] 133 E 0x8840 0x2c745740 132
> [ 57.461613 < 0.000012>] 134 E 0x8840 0x2c745e80 132
> [ 57.461624 < 0.000011>] 135 E 0x8880 0x2c7465c0 114
> [ 57.461635 < 0.000011>] 136 E 0x8840 0x2c746d00 132
> [ 57.461645 < 0.000010>] 137 E 0x8840 0x2c747440 132
> [ 57.461657 < 0.000012>] 138 E 0x8840 0x2c748040 132
> [ 57.461668 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.461680 < 0.000012>] 140 E 0x8840 0x2c748ec0 132
> [ 57.461894 < 0.000009>] no packet 3!
> [ 57.461926 < 0.000032>] ring error, current empty but next is not empty
> [ 57.461935 < 0.000009>] RX ahead
> [ 57.461947 < 0.000012>] 129 C E 0x8840 0x2c743a40 132
> [ 57.461959 < 0.000012>] 130 0x0840 0x2c744180 132
> [ 57.461970 < 0.000011>] 131 0x0840 0x2c7448c0 132
> [ 57.461982 < 0.000012>] 132 0x0840 0x2c745000 132
> [ 57.461993 < 0.000011>] 133 0x0840 0x2c745740 132
> [ 57.462005 < 0.000012>] 134 E 0x8840 0x2c745e80 132
> [ 57.462017 < 0.000012>] 135 E 0x8880 0x2c7465c0 114
> [ 57.462028 < 0.000011>] 136 E 0x8840 0x2c746d00 132
> [ 57.462039 < 0.000011>] 137 E 0x8840 0x2c747440 132
> [ 57.462051 < 0.000012>] 138 E 0x8840 0x2c748040 132
> [ 57.462062 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.462075 < 0.000013>] 140 E 0x8840 0x2c748ec0 132
> [ 57.462289 < 0.000009>] no packet 4!
> [ 57.462316 < 0.000027>] ring error, current empty but next is not empty
> [ 57.462326 < 0.000010>] RX ahead
> [ 57.462339 < 0.000013>] 129 C E 0x8840 0x2c743a40 132
> [ 57.462351 < 0.000012>] 130 0x0840 0x2c744180 132
> [ 57.462362 < 0.000011>] 131 0x0840 0x2c7448c0 132
> [ 57.462373 < 0.000011>] 132 0x0840 0x2c745000 132
> [ 57.462384 < 0.000011>] 133 0x0840 0x2c745740 132
> [ 57.462397 < 0.000013>] 134 0x0840 0x2c745e80 132
> [ 57.462408 < 0.000011>] 135 0x0840 0x2c7465c0 132
> [ 57.462421 < 0.000013>] 136 E 0x8840 0x2c746d00 132
> [ 57.462431 < 0.000010>] 137 E 0x8840 0x2c747440 132
> [ 57.462443 < 0.000012>] 138 E 0x8840 0x2c748040 132
> [ 57.462454 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.462467 < 0.000013>] 140 E 0x8840 0x2c748ec0 132
> [ 57.462697 < 0.000009>] no packet 5!
> [ 57.462730 < 0.000033>] ring error, current empty but next is not empty
> [ 57.462739 < 0.000009>] RX ahead
> [ 57.462752 < 0.000013>] 129 C E 0x8840 0x2c743a40 132
> [ 57.462763 < 0.000011>] 130 0x0840 0x2c744180 132
> [ 57.462775 < 0.000012>] 131 0x0840 0x2c7448c0 132
> [ 57.462787 < 0.000012>] 132 0x0840 0x2c745000 132
> [ 57.462799 < 0.000012>] 133 0x0840 0x2c745740 132
> [ 57.462809 < 0.000010>] 134 0x0840 0x2c745e80 132
> [ 57.462820 < 0.000011>] 135 0x0840 0x2c7465c0 132
> [ 57.462830 < 0.000010>] 136 0x0840 0x2c746d00 132
> [ 57.462842 < 0.000012>] 137 0x0840 0x2c747440 132
> [ 57.462853 < 0.000011>] 138 E 0x8840 0x2c748040 132
> [ 57.462864 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.462877 < 0.000013>] 140 E 0x8840 0x2c748ec0 132
> [ 57.463093 < 0.000009>] no packet 6!
> [ 57.463120 < 0.000027>] RX ahead
> [ 57.463133 < 0.000013>] 129 C 0x0840 0x2c743a40 132
> [ 57.463144 < 0.000011>] 130 0x0840 0x2c744180 132
> [ 57.463155 < 0.000011>] 131 0x0840 0x2c7448c0 132
> [ 57.463166 < 0.000011>] 132 0x0840 0x2c745000 132
> [ 57.463179 < 0.000013>] 133 0x0840 0x2c745740 132
> [ 57.463190 < 0.000011>] 134 0x0840 0x2c745e80 132
> [ 57.463201 < 0.000011>] 135 0x0840 0x2c7465c0 132
> [ 57.463213 < 0.000012>] 136 0x0840 0x2c746d00 132
> [ 57.463224 < 0.000011>] 137 0x0840 0x2c747440 132
> [ 57.463235 < 0.000011>] 138 0x0840 0x2c748040 132
> [ 57.463245 < 0.000010>] 139 E 0x8840 0x2c748780 132
> [ 57.463256 < 0.000011>] 140 E 0x8840 0x2c748ec0 132
> [ 57.463695 < 0.000244>] rx 12
>
> As you can see, the described error is caught and the ring is dumped.
> 9 descriptors became ready before the current descriptor did.
> After that the current descriptor became ready and 12 packets were
> processed at once.
> I could also observe cases where the ring (512 entries) got full
> before the current descriptor was cleared.
> And also cases where both the current and the next descriptor were not
> ready while the one after them already was:
> [ 57.462752 < 0.000013>] 129 C E 0x8840 0x2c743a40 132
> [ 57.462763 < 0.000011>] 130 E 0x0840 0x2c744180 132
> [ 57.462775 < 0.000012>] 131 0x0840 0x2c7448c0 132
>
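
The check described above (current descriptor still empty while later ones are already filled) could be sketched in plain userspace C roughly as follows. This is not the patch from the original message, which was not posted. The struct layout, the function name filled_ahead() and the window parameter are hypothetical, and the empty bit 0x8000 is an assumption read off the dump, where 0x8840 is tagged "E" and 0x0840 is not.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bit 15 of the status word means "empty, still owned by the
 * hardware". This matches the dump, where 0x8840 is tagged "E" and 0x0840
 * is not. */
#define RX_DESC_EMPTY 0x8000u

/* Hypothetical, simplified descriptor: only the fields shown in the dump. */
struct rx_desc {
    uint16_t status;
    uint16_t len;
};

/*
 * If the descriptor at `cur` is still empty, count how many of the next
 * `window` descriptors have already been filled. Returns 0 when `cur`
 * itself is ready (the normal case) or when nothing ahead is filled.
 */
static unsigned int filled_ahead(const struct rx_desc *ring,
                                 unsigned int ring_size,
                                 unsigned int cur,
                                 unsigned int window)
{
    unsigned int i, n = 0;

    assert(ring_size > 0 && cur < ring_size);

    /* Current descriptor is ready: nothing unusual to report. */
    if (!(ring[cur].status & RX_DESC_EMPTY))
        return 0;

    /* Look ahead, wrapping around the end of the ring. */
    for (i = 1; i <= window && i < ring_size; i++) {
        const struct rx_desc *d = &ring[(cur + i) % ring_size];

        if (!(d->status & RX_DESC_EMPTY))
            n++;
    }
    return n;
}
```

Applied to a ring in the state of the "no packet 5" dump (129 empty, 130 to 137 filled, 138 to 140 empty), filled_ahead(ring, 512, 129, 11) would return 8. Counting over a window instead of looking only at the next entry also covers the case where the current and the next descriptor are both still empty.
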
> I suspect the following erratum:
>
> ERR005783 ENET: ENET Status FIFO may overflow due to consecutive short frames
> Description:
> When the MAC receives shorter frames (size 64 bytes) at a rate
> exceeding the average line-rate burst traffic of 400 Mbps the DMA is
> able to absorb, the receiver might drop incoming frames before a Pause
> frame is issued.
> Projected Impact:
> No malfunction will result aside from the frame drops.
> Workarounds:
> The application might want to implement some flow control to ensure
> the line-rate burst traffic is below 400 Mbps if it only uses
> consecutive small frames with minimal (96 bit times) or short
> Inter-frame gap (IFG) time following large frames at such a high rate.
> The limit does not exist for frames of size larger than 800 bytes.
> Proposed Solution:
> No fix scheduled
> Linux BSP Status:
> Workaround possible but not implemented in the BSP, impacting
> functionality as described above.
>
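
To put the 400 Mbps figure next to the traffic described at the top of the message, here is a small sketch of the average rate for fixed-size frames sent at a fixed period. It assumes that the ~132 bytes is the frame length on the wire without preamble and inter-frame gap, and it takes 20 bytes of per-frame overhead (8 bytes preamble/SFD plus the 96 bit times of minimal IFG). The function name avg_rate_mbps() is made up for illustration.

```c
/*
 * Average wire rate in Mbit/s for fixed-size frames sent at a fixed period.
 * Bits per microsecond is numerically the same as Mbit/s.
 *
 * frame_bytes:    frame length (assumed to be what the dump reports, 132)
 * overhead_bytes: per-frame overhead, e.g. 8 preamble/SFD + 12 IFG = 20
 * period_us:      time between frames in microseconds
 */
static double avg_rate_mbps(unsigned int frame_bytes,
                            unsigned int overhead_bytes,
                            double period_us)
{
    return (frame_bytes + overhead_bytes) * 8.0 / period_us;
}
```

For 132 bytes every 250 µs this gives 132 * 8 / 250 = 4.224 Mbit/s without overhead and 152 * 8 / 250 = 4.864 Mbit/s with it, so the average is far below 400 Mbps. The erratum talks about burst rate, though, so this average says nothing about whether some frames arrive back to back at line rate.
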
> Is the "ENET Status FIFO" an internal hardware FIFO, or is it the
> descriptor ring?
> What would the workaround be, given that a "Workaround is possible"?
>
> The only idea I have is to skip/drop the current descriptor when it is
> still busy but the next one is ready.
> But that is not easy to do, because the "stuck" descriptor does become
> ready eventually, just after a huge delay.
>
> Is this issue known already? Any suggestions?
>
>
> Thanks in advance
>
It would help if you could provide your patch, or at least the kernel
version and the driver file you are looking at :)