Hi - I have determined that somewhere upstream of my custom RFNoC component, the fabric is intermittently dropping a fixed number of packets.
I have a custom transmit waveform encapsulated in a single RFNoC component. This component takes about 8 32-bit samples of user data and produces an entire transmit burst close to 5 msec long at a sample rate of 50 MHz, so it is a fairly large "upsampling" operation for an RFNoC block (8 samples in, roughly 250,000 samples out, a ratio of about 31,250:1). This is a timed transmission, so I have interface logic that translates the CHDR info and the single EOB into a series of packets, with a timestamp on the first packet, EOB set on the last, and tlast set appropriately along the way. I can verify this works well and runs without issue for a few minutes after startup. I have a similar RX component that receives this transmission in an analog loopback, so I can verify the transmission. I have also inserted a packet number in my transmit data and have a checker (in the HDL) on the transmit side, upstream of my component, that detects out-of-sequence packets. In ChipScope I trigger on that event, so I can observe this behavior independently of the RX process.

Setup: Ubuntu 20 LTS, E320, UHD 4.0.0.0-122-g75f2ba94

Here are some things I have observed:

1. It runs without issue for about 1-2 minutes after startup. Clean start or re-run does not matter.
2. It is always 34 source packets that go missing each time it drops (each packet is 8 32-bit samples, so 272 samples per event).
3. Drops never happen back to back, so it looks like something is overflowing upstream, though it is not perfectly periodic.
4. If I replace my core TX waveform processing with a simple FIFO and let the 8-sample packets flow through my processing (no upsampling), it never drops anything. Obviously the large 1-to-many expansion and the resulting stalling of the upstream is not making things happy.
5. This continues to happen if I totally disable the RX processing.
6. There is no indication of underruns, lates, or other errors coming from the tx_core downstream of my component. I also verified this by ChipScoping that component and looking for anything suspicious.

Some things I have tried:

1. I increased the (info, pyld) FIFO sizes on the input side of my component's noc_shell. That did not change the behavior. I did not touch the stream endpoint buffers.
2. I generally run this in host mode, but I also tried cross compiling the app and running in embedded mode on the E320. The interesting observation is that it then becomes exactly 33 packets lost each time (weird, or telling?).
3. If I insert usleeps in the while loop pushing down the data (txstream->send() - roughly the shape of the first sketch below), I can change the behavior: drops happen less frequently, take longer to happen the first time, and the number of packets lost can change from the usual 34. In my HDL I increment the timestamp by 50 msec, so the obvious "perfect" sleep would be roughly 50 msec minus the time the rest of the loop takes. Clearly this is hard to tune; just setting 50 msec eventually causes a late condition (LLLLL...). There is a sweet spot somewhere, but without an RTOS this is a waste of time and would not be the right way to fix this.

Any help or insight (things to try) would be greatly appreciated; I am out of ideas. My final fallback would be to put my own FIFO just in front of my component, with a level indicator: fill it halfway, then monitor it through a register to keep it fed (second sketch below). Assuming the polling could keep up, that should keep things happy unless there is a real bug upstream and someone is not obeying the AXIS protocol.
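For reference, the send loop in item 3 above is shaped roughly like this. This is a minimal sketch, not my actual app: the streamer setup is omitted, the metadata flags are a reconstruction (in my real app the HDL generates the per-burst timestamps, so the host sets no time spec), and the 45 msec sleep is just one of the values I experimented with.

#include <uhd/stream.hpp>
#include <uhd/types/metadata.hpp>
#include <complex>
#include <vector>
#include <unistd.h>

// Push one 8-sample payload per iteration; my block's HDL expands each
// payload into a full ~5 msec burst and timestamps it internally.
void run_tx_loop(uhd::tx_streamer::sptr tx_stream)
{
    std::vector<std::complex<short>> buff(8); // 8 32-bit (sc16) samples of user data
    uhd::tx_metadata_t md;
    md.start_of_burst = true;
    md.end_of_burst   = true;  // the single EOB my interface logic fans out
    md.has_time_spec  = false; // timestamps are generated in the HDL

    while (true) {
        // ...fill buff with the next payload, including my packet sequence number...
        const size_t sent = tx_stream->send(&buff.front(), buff.size(), md, 0.1);
        if (sent != buff.size()) {
            break; // timeout or short send
        }
        usleep(45000); // crude pacing: ~50 msec burst period minus loop overhead
    }
}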
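And a rough sketch of that fallback idea. Everything here is hypothetical: REG_FIFO_LEVEL would be a new user register in my HDL reporting FIFO occupancy, and get_fifo_level() a method I would add to my block controller wrapping regs().peek32(REG_FIFO_LEVEL) (the same user-register pattern the rfnoc-example gain block controller uses).

#include <uhd/stream.hpp>
#include <uhd/types/metadata.hpp>
#include <complex>
#include <cstdint>
#include <functional>
#include <vector>
#include <unistd.h>

// Keep a hypothetical HDL-side FIFO roughly half full by polling its
// occupancy register and topping it up from the host.
void feed_fifo(uhd::tx_streamer::sptr tx_stream,
               std::function<uint32_t()> get_fifo_level) // hypothetical readback
{
    constexpr uint32_t FIFO_DEPTH = 512; // hypothetical depth, in 8-sample packets
    std::vector<std::complex<short>> buff(8);
    uhd::tx_metadata_t md; // flags as in the normal send loop above

    while (true) {
        if (get_fifo_level() < FIFO_DEPTH / 2) {
            // ...fill buff with the next payload...
            tx_stream->send(&buff.front(), buff.size(), md, 0.1);
        } else {
            usleep(500); // back off; polling latency is the weak point without an RTOS
        }
    }
}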
I would think this should be unnecessary, however, since RFNoC should not allow something like this to happen.

Thanks in advance,
Jeff Long