Wade-

Thanks for writing back.

Yes, it’s a very simple setup, as you described: host -> SEP -> tx_core -> Radio.

I did actually have it in my plan to update to 4.1, since I have an X410 and 
would like to target that at some point soon. It's not something I can do 
immediately, but probably in the next few weeks.

Yes, I could look at packing them together, or even building bigger packets and 
throwing away the unused data. My rates are pretty low, as you can discern below: 
about 5 KBps from host to tx_core. What minimum, reasonably well-tested 
packet size would you suggest?

I did chipscope the chain, but honestly there are so many handoffs from ethernet 
down to my block that I would not know what I was looking for. I did try 
looking at the flush controls in the noc_shell to see if they did something; I 
was guessing based on the name.

Is something upstream, such as a FIFO level, getting dangerously high, and is 
there some auto-flush feature to keep a logjam or overflow from happening?

By the way what happened to sequence numbers? It does not seem like they make 
it down to the radio core anymore for obvious reasons. Are they just from host 
to SEP? Maybe I can track that?

On a side note, for lack of anything better to try, I did insert my own FIFO 
right before my tx_core and connected the FIFO level to a register so I could 
read it. Within my send loop I monitor this level: I let it fill up halfway, 
and then on each pass decide whether to send a packet or go to sleep 
for a cycle, so as to keep it near the halfway point. This keeps me from 
ever losing packets and works like a charm, but it should not be needed. It 
points to a backup issue upstream somewhere.
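
The halfway-point throttle described above can be modeled roughly as follows. This is only a sketch under stated assumptions: should_send and run_send_loop are hypothetical names, the FIFO is simulated in software, and in the real setup the level would come from the register read and the send would be txstream->send().

```python
def should_send(fifo_level, fifo_depth):
    """True when the FIFO is below the halfway mark and can take another packet."""
    return fifo_level < fifo_depth // 2


def run_send_loop(depth, steps, drain_every):
    """Simulate the send loop against a FIFO that drains slowly.

    One send attempt is made per step. The consumer (standing in for the
    waveform core) removes one packet every `drain_every` steps.
    Returns (packets_sent, peak_fifo_level).
    """
    level = 0
    peak = 0
    sent = 0
    for step in range(steps):
        if should_send(level, depth):
            level += 1  # stands in for sending one packet
            sent += 1
        # otherwise: stands in for sleeping one cycle
        peak = max(peak, level)
        if step % drain_every == drain_every - 1 and level > 0:
            level -= 1  # stands in for the core consuming one packet
    return sent, peak
```

With a depth of 64 and a slow consumer, the simulated level settles at the halfway mark and never exceeds it, which matches the behavior described above.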

Once I switch to UHD 4.1 I will update this thread with results from testing 
under that.

Thanks
Jeff

From: Wade Fife <wade.f...@ettus.com>
Sent: Monday, March 7, 2022 7:57 PM
To: Jeffrey P Long <jpl...@mitre.org>
Cc: usrp-users@lists.ettus.com
Subject: [EXT] Re: [USRP-users] RFNOC dropping packets

Hi Jeff,

Can you describe the dataflow of your RFNoC graph (which blocks you're using 
and how they're connected)? For example, is it: host -> SEP -> Your Block -> 
Radio?

Could you try the latest version of UHD (v4.1.0.5)? There were many bug fixes 
since the initial release of 4.0. You may also want to regenerate your block to 
get a new noc_shell.

Those are very small packets (8 32-bit samples). That should be fine, but maybe 
there's a corner case with really short packets. Maybe you could try coalescing 
them into larger packets?
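
As a rough host-side illustration of coalescing, several 8-sample payloads could be concatenated into one larger payload before sending. This is a sketch only: coalesce is a hypothetical helper, samples are plain Python lists, and the block on the FPGA side would have to be changed to split the larger payload back into 8-sample units.

```python
def coalesce(packets, per_group):
    """Concatenate consecutive payloads into groups of `per_group` packets.

    `packets` is a list of payloads, each a list of samples. The last group
    may be shorter if the packet count is not a multiple of `per_group`.
    """
    groups = []
    for i in range(0, len(packets), per_group):
        merged = []
        for payload in packets[i:i + per_group]:
            merged.extend(payload)
        groups.append(merged)
    return groups
```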

If I were debugging this, I'd use chipscope, maybe with some checker logic like 
you described, to look at the data flowing from the ethernet interface and 
follow that path to your block to see where the packets are getting dropped. It 
would also confirm that the packets are making it into the FPGA. But first I 
think updating to the latest version is a good idea so we're not chasing a bug 
that's already been fixed.

Thanks,

Wade

On Mon, Mar 7, 2022 at 10:17 AM Jeffrey P Long <jpl...@mitre.org> wrote:
Hi-

I have determined that somewhere upstream from my custom RFNOC component the 
fabric is intermittently dropping a fixed number of packets.

I have a custom transmit waveform encapsulated in a single RFNOC component. 
This waveform component effectively takes about 8 32-bit samples of user data 
and produces an entire transmit burst close to 5 msec long at a sample 
rate of 50 MHz. It is therefore a fairly large “upsampling” operation for an RFNOC 
block. This is a timed transmission, so I have interface logic that translates 
the CHDR info and single EOB into a series of packets, with a timestamp on the 
first packet, the EOB set on the last packet, and tlast set appropriately 
along the way. I can verify this works well and runs without issues for 
a few minutes after startup. I have a similar RX component that receives 
this transmission in an analog loopback, so I can verify the 
transmission. I have also inserted a packet number in my transmit data and have 
a checker (in the HDL) on the transmit side (upstream of my component) that 
detects out-of-sequence packets. In chipscope I trigger on that condition 
so I can observe this behavior independently of the RX process.
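
The out-of-sequence check described above amounts to comparing each packet number with the previous one plus one. A minimal software model of that logic follows. It is a sketch, not the actual HDL: find_gaps is a hypothetical name, and counter wraparound is ignored.

```python
def find_gaps(packet_numbers):
    """Return (expected, received, missing_count) for every jump in the sequence.

    Assumes packet numbers increase by one per packet and do not wrap.
    """
    gaps = []
    expected = None
    for n in packet_numbers:
        if expected is not None and n != expected:
            gaps.append((expected, n, n - expected))
        expected = n + 1
    return gaps
```

For a stream that skips from packet 99 to packet 134, this reports a single gap of 34 missing packets.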

Setup: Ubuntu 20 LTS, E320, UHD 4.0.0.0-122-g75f2ba94

Here are some things I have observed:


  1.  It will run without an issue for about 1-2 mins on startup. Clean start 
or re-run does not matter.
  2.  It is always 34 source packets that are missing (each is 8 32-bit samples 
in length) each time it drops.
  3.  This never happens back to back, so it looks like something is overflowing 
upstream; however, it is not perfectly periodic.
  4.  If I replace my core tx waveform processing with a simple FIFO and allow 
the 8-sample packets to flow through my processing (no upsampling), it never 
drops anything. Obviously the large 1-to-many expansion and the resultant 
stalling of the upstream is not making things happy.
  5.  This continues to happen if I totally disable the RX processing.
  6.  There is no indication of underruns or lates or other errors coming from 
the tx_core downstream of my component. I also verified this by chipscoping that 
component and looking for anything.


Some things I have tried:


  1.  I did increase the (info, pyld) FIFO sizes on the input side of my 
component's noc_shell. This did not change the behavior. I did not touch the 
stream endpoint buffers.
  2.  I am generally running this in host mode; however, I did try cross 
compiling the app and running in embedded mode on the E320. An interesting 
observation is that it then becomes exactly 33 packets that are lost each time 
(weird or telling?).
  3.  If I insert usleeps in the while loop pushing down the data 
(txstream->send()), I can change the behavior so that it happens less 
frequently, takes longer to happen the first time, and the number of packets 
lost can differ from the usual 34. In my HDL I increment the timestamp by 50 
msec, so the obvious perfect sleep would be something like 50 msec minus the 
time the rest of the code takes. Clearly this is hard to tune. Just setting 50 
msec eventually causes a LLLLLLate condition. There is a sweet spot somewhere, 
but without an RTOS this is a waste of time and would not be the right way to 
fix this.
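
For reference, "50 msec minus the time the rest of the code takes" is usually implemented by sleeping until an absolute deadline rather than for a fixed interval. A minimal sketch follows. The names sleep_needed and paced_loop are hypothetical, send_one stands in for the txstream->send() call, and on a non-real-time OS the sleep can still overshoot, so this does not remove the tuning problem described in item 3.

```python
import time


def sleep_needed(deadline, now):
    """Seconds left until the deadline, or 0 if it has already passed."""
    return max(0.0, deadline - now)


def paced_loop(send_one, period_s, count, clock=time.monotonic, sleep=time.sleep):
    """Call send_one() `count` times, one call per `period_s` seconds.

    Deadlines are absolute (start + k * period), so time spent inside
    send_one() is subtracted from each sleep instead of accumulating.
    """
    next_deadline = clock()
    for _ in range(count):
        send_one()
        next_deadline += period_s
        sleep(sleep_needed(next_deadline, clock()))
```

A call such as paced_loop(send_one, 0.050, n) would target one send per 50 msec regardless of how long each send takes, as long as a send finishes within the period.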

Any help or insight (things to try) would be greatly appreciated. I am out of 
ideas. My final idea would be to put my own FIFO just in front with a level 
indicator, fill it up halfway, and then monitor the level through a register to 
keep it near that point. Assuming I could keep up with this polling approach, it 
should work unless there is a real bug upstream and something is not obeying 
the AXIS protocol. I would think this would be unnecessary, however, since RFNOC 
should not allow something like this to happen.

Thanks in advance,
Jeff Long

_______________________________________________
USRP-users mailing list -- usrp-users@lists.ettus.com
To unsubscribe send an email to usrp-users-le...@lists.ettus.com