Hi all,
We have a couple of N310s in our lab, and some of them seem to fail to
transmit reliably.
Each N310 is connected to a host via one of the SFP+ cables that came
with it from Ettus. We have 3 N310s, each connected via such a cable to
its own host; each of these hosts has an Intel X710 DA2 and an AMD
TRX3970. All machines run Ubuntu 20.04 with all updates installed.
I use the UHD 3.15 LTS branch: UHD_3.15.0.0-7-g8d228dbe
I made sure to check out the very same commit, recompile, and install it.
On 2 hosts I can run:
`./benchmark_rate --args "addr=192.168.20.213,master_clock_rate=122.88e6" --tx_rate 61.44e6 --tx_channels "3" --rx_rate 61.44e6 --rx_channels "0,1"`
The full output is attached at the bottom of this email.
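
For a sense of scale, here is a rough sketch of the payload data rates that the command above implies. It assumes the sc16 over-the-wire format (4 bytes per complex sample) and ignores packet header overhead; the helper name `payload_gbps` is purely illustrative.

```python
# Rough payload data rates implied by the benchmark_rate command above.
# Assumption: sc16 over-the-wire format, i.e. 4 bytes per complex sample.
# Packet header overhead is ignored, so real link usage is slightly higher.
BYTES_PER_SAMPLE = 4  # assumed: 16-bit I + 16-bit Q


def payload_gbps(rate_sps: float, num_channels: int) -> float:
    """Return the payload data rate in Gbit/s for one stream direction."""
    return rate_sps * num_channels * BYTES_PER_SAMPLE * 8 / 1e9


rx_gbps = payload_gbps(61.44e6, 2)  # --rx_rate 61.44e6 --rx_channels "0,1"
tx_gbps = payload_gbps(61.44e6, 1)  # --tx_rate 61.44e6 --tx_channels "3"
print(f"RX payload: {rx_gbps:.2f} Gbit/s, TX payload: {tx_gbps:.2f} Gbit/s")
```

Under these assumptions the test asks for roughly 3.9 Gbit/s of RX and 2.0 Gbit/s of TX payload per link.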
What I observe:
- It runs fine on 2 hosts.
- The third host fails.
-- On the third host, an RX-only run works.
-- On the third host, a TX-only run misbehaves: cf. the test output below.
- We have a server with Intel Xeon 6254 and X722 where I observe the
same issue.
- I swapped USRPs between hosts; the issue seems to stay with the host.
It started with one host a couple of weeks ago. Now our server has
started to fail with the same error, even though the exact same setup
used to work on that machine.
I have been looking into this for quite a while now, and I can't find
the source of the issue.
Has anyone had experience with this? I'd really appreciate hints on how
to debug it.
Cheers
Johannes
On the working hosts the benchmark rate summary looks like this:
---------
Benchmark rate summary:
Num received samples: 1270556340
Num dropped samples: 0
Num overruns detected: 0
Num transmitted samples: 614440368
Num sequence errors (Tx): 0
Num sequence errors (Rx): 0
Num underruns detected: 0
Num late commands: 0
Num timeouts (Tx): 0
Num timeouts (Rx): 0
---------
But on the third host:
---------
[....]
SUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSU[00:00:16.262123]
Receiver error: ERROR_CODE_TIMEOUT, continuing...
SUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUSUU[00:00:16.565159]
Benchmark complete.
Benchmark rate summary:
Num received samples: 66501280
Num dropped samples: 0
Num overruns detected: 0
Num transmitted samples: 154312704
Num sequence errors (Tx): 3149
Num sequence errors (Rx): 0
Num underruns detected: 3156
Num late commands: 0
Num timeouts (Tx): 0
Num timeouts (Rx): 97
----------
We have a server with Intel X722 and Intel Xeon Gold 6252 that reports
the same issue:
----------
UUUUUUUU[00:00:16.180094] Receiver error: ERROR_CODE_TIMEOUT, continuing...
US[00:00:16.382393] Benchmark complete.
Benchmark rate summary:
Num received samples: 99763328
Num dropped samples: 0
Num overruns detected: 0
Num transmitted samples: 155804944
Num sequence errors (Tx): 3180
Num sequence errors (Rx): 0
Num underruns detected: 164974
Num late commands: 0
Num timeouts (Tx): 0
Num timeouts (Rx): 95
----------
There are even more underruns on the server, though.
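
To compare the summaries above more directly, the following minimal sketch parses the "Num ..." counters and relates the failing third host to a working one. The counters are copied from the summaries above (only a subset is shown); the function names `parse_summary` and `ratio` are illustrative and not part of UHD.

```python
import re

# Counters copied from the benchmark rate summaries above (subset only).
WORKING = """
Num received samples: 1270556340
Num transmitted samples: 614440368
Num sequence errors (Tx): 0
Num underruns detected: 0
Num timeouts (Rx): 0
"""

THIRD_HOST = """
Num received samples: 66501280
Num transmitted samples: 154312704
Num sequence errors (Tx): 3149
Num underruns detected: 3156
Num timeouts (Rx): 97
"""


def parse_summary(text: str) -> dict:
    """Map each 'Num <name>: <value>' line to an entry {name: value}."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*Num (.+?):\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def ratio(failing: dict, working: dict, key: str) -> float:
    """Return the failing counter as a fraction of the working counter."""
    return failing[key] / working[key]


working = parse_summary(WORKING)
failing = parse_summary(THIRD_HOST)

tx_ratio = ratio(failing, working, "transmitted samples")
rx_ratio = ratio(failing, working, "received samples")
print(f"TX samples relative to working host: {tx_ratio:.1%}")
print(f"RX samples relative to working host: {rx_ratio:.1%}")
```

With these numbers the third host transmits roughly a quarter of the samples the working host does, and receives only about 5% of them.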
Working output:
============
[INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100;
UHD_3.15.0.0-7-g8d228dbe
[00:00:00.000002] Creating the usrp device with:
addr=192.168.20.213,master_clock_rate=122.88e6...
[INFO] [MPMD] Initializing 1 device(s) in parallel with args:
mgmt_addr=192.168.20.213,type=n3xx,product=n310,serial=319841B,claimed=False,addr=192.168.20.213,master_clock_rate=122.88e6
[INFO] [MPM.PeriphManager] init() called with device args
`time_source=gpsdo,clock_source=gpsdo,mgmt_addr=192.168.20.213,product=n310,master_clock_rate=122.88e6'.
[INFO] [0/Replay_0] Initializing block control (NOC ID: 0x4E91A00000000004)
[INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD100000011312)
[INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD100000011312)
[INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0000000000000)
[INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0000000000000)
[INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0000000000002)
[INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0000000000002)
[INFO] [0/FIFO_0] Initializing block control (NOC ID: 0xF1F0000000000000)
[INFO] [0/FIFO_1] Initializing block control (NOC ID: 0xF1F0000000000000)
[INFO] [0/FIFO_2] Initializing block control (NOC ID: 0xF1F0000000000000)
[INFO] [0/FIFO_3] Initializing block control (NOC ID: 0xF1F0000000000000)
Using Device: Single USRP:
Device: N300-Series Device
RX Channel: 0
RX DSP: 0
RX Dboard: A
RX Subdev: Magnesium
RX Channel: 1
RX DSP: 1
RX Dboard: A
RX Subdev: Magnesium
RX Channel: 2
RX DSP: 0
RX Dboard: B
RX Subdev: Magnesium
RX Channel: 3
RX DSP: 1
RX Dboard: B
RX Subdev: Magnesium
TX Channel: 0
TX DSP: 0
TX Dboard: A
TX Subdev: Magnesium
TX Channel: 1
TX DSP: 1
TX Dboard: A
TX Subdev: Magnesium
TX Channel: 2
TX DSP: 0
TX Dboard: B
TX Subdev: Magnesium
TX Channel: 3
TX DSP: 1
TX Dboard: B
TX Subdev: Magnesium
[00:00:04.045700] Setting device timestamp to 0...
[INFO] [MULTI_USRP] 1) catch time transition at pps edge
[INFO] [MULTI_USRP] 2) set times next pps (synchronously)
[00:00:05.689405] Testing receive rate 61.440000 Msps on 2 channels
[00:00:05.829315] Testing transmit rate 61.440000 Msps on 1 channels
[00:00:16.180163] Benchmark complete.
Benchmark rate summary:
Num received samples: 1270556340
Num dropped samples: 0
Num overruns detected: 0
Num transmitted samples: 614440368
Num sequence errors (Tx): 0
Num sequence errors (Rx): 0
Num underruns detected: 0
Num late commands: 0
Num timeouts (Tx): 0
Num timeouts (Rx): 0
Done!
=====================