Just chiming in to say you aren't crazy -- I've seen similar results in my
RFNoC loopback testing. There's a critical spp that you can't go below at
full rate without experiencing over/underflows. I didn't take the time to
dig down to the bottom of it, so I can't be of too much help here except to
confirm what you're seeing.

Nick

On Sat, Mar 24, 2018 at 11:52 AM Sebastian Leutner via USRP-users
<usrp-users@lists.ettus.com> wrote:

> I tried using the packet_resizer block, which improved things a little
> but not significantly. I found that as soon as the difference between
> the spp (coming from the Radio) and the packet size set in the resizer
> block becomes too large, I get overruns again. I increased the
> STR_SINK_FIFOSIZE of the packet_resizer from 11 to 14 and am now able
> to run [Radio, spp=64] -> [Resizer, pkt_size=6000] -> [FIFO] -> [Null
> Sink] without problems. However, when trying pkt_size=7000 I get
> overruns again, and the same goes for any RFNoC block connected between
> the Radio and Resizer blocks using spp=64.
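>
> For reference, this is roughly how I set up that chain with the
> rfnoc-devel C++ API (a sketch from memory; the block IDs are the ones
> uhd_usrp_probe reports for my image and may well differ on yours):
>
>     #include <uhd/device3.hpp>
>     #include <uhd/rfnoc/graph.hpp>
>
>     // Connect to the X310 and grab the resizer block.
>     uhd::device3::sptr usrp = uhd::device3::make(std::string("type=x300"));
>     auto resizer = usrp->get_block_ctrl(
>             uhd::rfnoc::block_id_t("0/PacketResizer_0"));
>     resizer->set_arg<int>("pkt_size", 6000); // bytes per output packet
>
>     // Radio -> Resizer -> FIFO; the Null Sink runs on the host side.
>     auto graph = usrp->create_graph("resize_test");
>     graph->connect(uhd::rfnoc::block_id_t("0/Radio_0"), 0,
>                    uhd::rfnoc::block_id_t("0/PacketResizer_0"), 0);
>     graph->connect(uhd::rfnoc::block_id_t("0/PacketResizer_0"), 0,
>                    uhd::rfnoc::block_id_t("0/FIFO_0"), 0);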
>
> By turning on the UHD trace messages and digging into some of its code,
> I found out that the RFNoC flow control is configured in such a way
> that it sends at least one ACK for every two packets on the bus (one
> ACK per packet for smaller input FIFOs or larger pkt_size).
>
> See /uhd/host/lib/rfnoc/graph_impl.cpp:
>  > // On the same crossbar, use lots of FC packets
>  > size_t pkts_per_ack = std::min(
>  >         uhd::rfnoc::DEFAULT_FC_XBAR_PKTS_PER_ACK,
>  >         buf_size_pkts - 1
>  > );
> DEFAULT_FC_XBAR_PKTS_PER_ACK is set to 2 in constants.hpp.
>
> From my understanding, we should really take the max here instead of
> the min. We already calculated buf_size_pkts based on the input FIFO
> size of the next downstream block. Am I missing something here?
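>
> In code, the change I am suggesting would look like this (untested,
> and I may well be missing a reason for the current behavior):
>
>     // Proposed change (sketch): ACK less often when the downstream
>     // input FIFO is deep enough to absorb more packets in flight.
>     size_t pkts_per_ack = std::max(
>             uhd::rfnoc::DEFAULT_FC_XBAR_PKTS_PER_ACK,
>             buf_size_pkts - 1
>     );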
>
> Furthermore, UHD assumes the maximum packet size (=8000) for the Radio
> block even though I set spp=64. Next week I will try to also set the
> pkt_size explicitly in the Radio block stream args or (if that does
> not help) in its port definition in the UHD block description .xml file.
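>
> Something along these lines is what I have in mind for the stream args
> (the values are from my setup; whether the Radio honors "spp" set this
> way is exactly what I want to find out):
>
>     // Sketch: request rx samples from the Radio block with spp=64.
>     uhd::stream_args_t stream_args("fc32", "sc16");
>     stream_args.args["block_id"] = "0/Radio_0"; // assumed block ID
>     stream_args.args["spp"] = "64";
>     uhd::rx_streamer::sptr rx_stream = usrp->get_rx_stream(stream_args);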
>
> I am not absolutely sure if the FC is the only reason for my problems,
> but it definitely increases the number of ACK messages for smaller spp.
> Another workaround for me would probably be to increase the
> STR_SINK_FIFOSIZE for all the RFNoC blocks I'm using.
>
> Regarding the bus clk_rate: it is set to 166.67 MHz in rfnoc-devel at
> the moment (= 10.67 Gbps raw throughput). Adding the overhead from the
> CHDR headers when using spp=64, I get 6.80 Gbps on the wire (compared
> to the un-packetized 6.40 Gbps). From my understanding, the crossbar
> should be able to handle this amount, since I have only one stream
> direction through my blocks (please correct me if I'm wrong here).
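>
> For anyone who wants to check my numbers (assuming 16-byte CHDR
> headers, i.e. packets carrying timestamps):
>
>     raw bus capacity: 166.67e6 /s * 8 B * 8 b/B          = 10.67 Gbps
>     packet rate:      200e6 Sps / 64 spp                 = 3.125e6 pkt/s
>     wire rate:        3.125e6 pkt/s * (64*4 + 16) B * 8  = 6.80 Gbps
>     payload only:     200e6 Sps * 4 B * 8 b/B            = 6.40 Gbps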
>
> Thanks for your ideas and feedback!
>
>
> EJ Kreinar wrote:
> > Hi Sebastian,
> >
> >  > Do you think that it would suffice to change the packet size at
> >  > my last RFNoC block before the host? I will try out the already
> >  > available packet_resizer block tomorrow.
> >
> > Yes, this is probably the easiest solution. But, if you're not opposed
> > to custom HDL, an alternate option could be to create a modified FFT
> > block that simply outputs an integer number of FFTs within a single
> > packet.
> >
> >  > So the question would be if RFNoC can handle passing packets with
> >  > spp=64 at 200 MSps between RFNoC blocks
> >
> > That's a good question... RFNoC blocks all share a crossbar, which runs
> > at a particular bus_clk rate, so there is a maximum throughput that the
> > bus can handle... The crossbar datapath is 8 bytes wide, so you get a
> > total throughput of bus_clk*8 bytes/second. There's also a header
> > overhead of 16 bytes per packet (or 8 bytes if there's no timestamp).
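> >
> > As a rough sanity check (assuming the 64-bit CHDR datapath and a
> > 16-byte header per packet), the usable payload rate works out to:
> >
> >     payload_rate = bus_clk * 8 B/s * (spp*4) / (spp*4 + 16)
> >
> > e.g. at bus_clk = 187.5 MHz that is 1.5 GB/s raw, and with spp=64 the
> > payload fraction is 256/272, so roughly 11.3 Gbps of sample data best
> > case.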
> >
> > I'm actually not sure what the current X310 bus_clk rate is set to... I
> > just noticed a recent commit that supposedly changes bus_clk to
> > 187.5 MHz:
> > https://github.com/EttusResearch/fpga/commit/d08203f60d3460a170ad8b3550b478113b7c5968
> > So I'm not exactly clear what the bus_clk was set to before that, or on
> > the rfnoc-devel branch...
> >
> > But unless I'm misunderstanding, having multiple RFNoC blocks running at
> > a full 200 Msps might saturate the bus? Is that correct?
> >
> > EJ
> >
> > On Thu, Mar 22, 2018 at 3:33 PM, Sebastian Leutner via USRP-users
> > <usrp-users@lists.ettus.com> wrote:
> >
> >                     Hi all,
> >
> >                     when working with RFNoC at 200 MSps on the X310
> >                     using 10GbE I
> >                     experience overruns when using less than 512 samples
> >                     per packet (spp).
> >                     A simple flow graph [RFNoC Radio] -> [RFNoC FIFO] ->
> >                     [Null sink] with
> >                     the spp stream arg set at the RFNoC Radio block
> >                     shows the following
> >                     network utilization:
> >
> >                      spp | throughput [Gbps]
> >                     -----+------------------
> >                     1024 | 6.49
> >                      512 | 6.58
> >                      256 | 3.60
> >                       64 | 0.70
> >
> >                     Although I understand that the total load will
> >                     increase a little bit
> >                     for smaller packets due to increased overhead
> >                     (headers) as seen from
> >                     spp=1024 to spp=512, I find it confusing that so
> >                     many packets are
> >                     dropped for spp <= 256.
> >
> >                     Total goodput should be 200 MSps * 4 byte per sample
> >                     (sc16) = 800 MBps
> >                     = 6.40 Gbps.
> >
> >                     Is RFNoC somehow limited to a certain number of
> >                     packets per second
> >                     (regardless of their size)?
> >                     Could this be resolved by increasing the
> >                     STR_SINK_FIFOSIZE noc_shell
> >                     parameter of any blocks connected to the RFNoC Radio?
> >
> >                     I would like to use spp=64 because that is the size
> >                     of the RFNoC FFT I
> >                     want to use. I am using UHD
> >                     4.0.0.rfnoc-devel-409-gec9138eb.
> >
> >                     Any help or ideas appreciated!
> >
> >                     Best,
> >                     Sebastian
> >
> >                 This is almost certainly an interrupt-rate issue
> >                 having to do with your ethernet controller, and
> >                 nothing to do with RFNoC, per se.
> >
> >                 If you're on Linux, try:
> >
> >                 ethtool --coalesce <device-name-here> adaptive-rx on
> >                 ethtool --coalesce <device-name-here> adaptive-tx on
> >
> >
> >             Thanks Marcus for your quick response. Unfortunately, that
> >             did not help. Also, `ethtool -c enp1s0f0` still reports
> >             "Adaptive RX: off TX: off" afterwards. I also tried
> >             changing `rx-usecs`, which was applied correctly but did
> >             not help either. I am using an Intel 82599ES 10-Gigabit
> >             SFI/SFP+ controller with the ixgbe driver (version
> >             5.1.0-k) on Ubuntu 16.04.
> >
> >             Do you know anything else I could try?
> >
> >             Thanks,
> >             Sebastian
> >
> >         The basic problem is that in order to achieve good performance
> >         at very-high sample-rates, jumbo-frames are required, and using
> >         a very small SPP implies very small frames, which necessarily
> >         leads to poor ethernet performance.
> >
> >         Do you actually need the FFT results to appear at the host at
> >         "real-time" rates, or can you do an integrate-and-dump within
> >         RFNoC, to reduce host-side traffic?
> >
> >
> >     Yes, I need all the samples. Since it will be a full receiver
> >     implementation in RFNoC, the output to the host will be much less
> >     than 6.40 Gbps, but still a decent amount and definitely more than
> >     the 0.7 Gbps I was able to achieve with spp=64.
> >
> >     Do you think that it would suffice to change the packet size at my
> >     last RFNoC block before the host? I will try out the already
> >     available packet_resizer block tomorrow.
> >
> >     So the question would be if RFNoC can handle passing packets with
> >     spp=64 at 200 MSps between RFNoC blocks. If this is likely to be a
> >     problem, I could try wrapping all my HDL code into one RFNoC block
> >     and handle the packet resizing at input and output of this block.
> >     However, I would like to avoid this step if possible.
> >
> >     Thanks for your help!
> >