About a year ago, we embarked on developing a similar application. The TL;DR is that we never got to 2x200, but we did get 3x100, which to me is actually a bit better: 2x200 is really only 2x160 after the analog filtering, yet you still have to take in all those extra samples for no benefit.
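To put rough numbers on that comparison, here is a small back-of-the-envelope sketch. It assumes 4 bytes per complex sample (16-bit I plus 16-bit Q); the function names are made up purely for illustration.

```python
# Back-of-the-envelope comparison of 2x200 MS/s vs. 3x100 MS/s.
# Assumption: each complex sample is 4 bytes (16-bit I + 16-bit Q).
BYTES_PER_SAMPLE = 4


def stream_rate_mb_s(sample_rate_msps, channels=1):
    """Aggregate data rate in MB/s (10^6 bytes/s) for `channels` streams."""
    return sample_rate_msps * BYTES_PER_SAMPLE * channels


def useful_fraction(analog_bw_mhz, sample_rate_msps):
    """Fraction of the sampled bandwidth that survives the analog filtering."""
    return analog_bw_mhz / sample_rate_msps


two_by_200 = stream_rate_mb_s(200, channels=2)    # total for 2 channels at 200 MS/s
three_by_100 = stream_rate_mb_s(100, channels=3)  # total for 3 channels at 100 MS/s

if __name__ == "__main__":
    print(f"2x200 MS/s: {two_by_200} MB/s to move and store")
    print(f"3x100 MS/s: {three_by_100} MB/s to move and store")
    print("useful share of a 200 MS/s stream behind a 160 MHz filter: "
          f"{useful_fraction(160, 200):.0%}")
```

Under these assumptions, one in five samples of a 200 MS/s stream carries no usable bandwidth, while you still pay to transport and store it.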
It is my conjecture that this was no fault of UHD; its RX processing is quite efficient for a single channel on a single processor. Do I wish I could control the thread affinity of UHD-spawned threads? Sure, but that would be an awkward thing to expose in a platform-independent and correct manner, and few people ever need it (you are in the sub-1% zone, which is where we often play as well).

When it comes to streaming multiple 200 MS/s (800 MB/s per stream) data streams, the key is balancing resource utilization along the path the data takes from USRP device to disk. EVERYTHING IS IMPORTANT HERE. We used multiple PCIe SSDs (1.2 GB/s sustained write, so enough for a single 200 MS/s data stream with some margin), and we essentially ran out of the ability to efficiently segment the data onto dedicated lanes, as you only get about 40 PCIe lanes on a single motherboard.

The chain you need to make sure is efficient is UHD device -> UHD receive -> app receive and buffer -> write to disk. That is 4 threads per channel, and even with affinity, priority, etc., it is hard for the OS to handle that workload with the required *latency* to keep up... Multiply this workload by the number of channels you want to receive and I think you start to see the problem.

We can chat further offline if you want, in case people find this too distracting :)

-----Original Message-----
From: USRP-users [mailto:usrp-users-boun...@lists.ettus.com] On Behalf Of Tarik Kazaz via USRP-users
Sent: Wednesday, June 27, 2018 7:41 AM
To: 'Marcus Müller' <marcus.muel...@ettus.com>; 'usrp-users@lists.ettus.com' <usrp-users@lists.ettus.com>
Subject: Re: [USRP-users] Streaming and storing signals of full BW of 2x UBX-160 cards to PC in file

Hi Marcus,

Thank you for your comments, although this scares me a bit. Obviously, storing samples at 1.6 GB/s will not be easy to achieve, or at least it is not straightforward.
I will wait a bit for further responses; I hope that someone else has tried to do the same. Is it possible to use the dual 10GbE connection towards two PCs (splitting the two links between separate PCs)? What I mean by this is: stream IQ samples from ADC0 (UBX0) to one PC at 0.8 GB/s and stream IQ samples from ADC1 (UBX1) to another PC at 0.8 GB/s. Maybe my idea is not valid, but it might be a solution. In this case there will probably be misalignment between the samples stored on the separate NVMe SSDs.

Kind Regards,
Tarik

-----Original Message-----
From: Marcus Müller [mailto:marcus.muel...@ettus.com]
Sent: Wednesday, 27 June 2018 11:23
To: Tarik Kazaz; 'usrp-users@lists.ettus.com'
Subject: Re: [USRP-users] Streaming and storing signals of full BW of 2x UBX-160 cards to PC in file

Hi Tarik,

I haven't tried to continuously stream dual-channel full-rate X3x0 recordings to storage myself, but I do have some experience to share; hopefully it helps more than it scares.

* Even NVMe SSDs aren't uniform in access latency and write speed. So, make sure your write rate is *reliably* above 1.6 GB/s.
* A marketing 2100 MB/s write speed is what you'll get on average; you'd need that as a "short-term minimum", with "short" being defined by how well your OS buffers writes in RAM.
* Some quick research on a few "SSD user benchmarking" sites revealed that the Samsung 970 PRO 1TB would indeed do more than 2.1 GB/s write speed, but only on strictly size-limited sequential writes, so you definitely don't want a journaling file system on the same device, or to do *anything* else with the SSD while storing samples.
  (The same site says "sustained write 1.5 GB/s", so it really doesn't work out with just one of these.)
* Gut feeling: get more CPU cores, probably more RAM for buffering to compensate for instantaneous write speed differences, and build a software RAID 0 out of at least two of these PCIe SSDs, or out of >> four SATA SSDs.
* Storage subsystems usually aren't safe from preemption; you'll need to make sure that you have enough buffer to write things to while your storage system catches up after an interruption. Luckily, you can teach Linux, for example, how much write data it is allowed to buffer in RAM before it starts blocking.
* 2x200 MS/s vs. quad-core: that's a lot of packets per second that you're trying to juggle with only four cores at 2.8 GHz, considering that these cores handle the network packets, the unpacking of the samples from them, the file writing, and the file system as well as the storage interfacing.
* 16 GB RAM is certainly at the lower end of what I'd expect of a high-bandwidth storage system/workstation, but I don't think you'll need much more; after all, that's quite a lot of buffer (i.e. let's say 8 s worth of samples, if you subtract all the RAM requirements of the OS and applications that don't go into write buffering).

Looking at the price of Dell's ECC RAM: for noisy data you can probably live with bit error probabilities in the 2e-11/bit/hr range (which I think is what's estimated for modern RAM) without any problem, so ECC possibly isn't important here (but I don't know your operational requirements, so take this with a grain of salt). The fact that you constantly overwrite the buffer with fresh values, so that most bits probably stay in RAM for less time than one DRAM refresh cycle anyway, probably helps physically, too.
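The buffering and capacity estimates above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the 3 GB reserved for the OS and applications is a guess, and the function names are made up for illustration. The 1.6 GB/s stream rate, the 1.5 GB/s sustained write figure and the 10 minute recording length are the numbers from this thread.

```python
# Rough buffer and capacity estimates for a 2x200 MS/s recording.
STREAM_GB_S = 1.6  # two 200 MS/s streams at 4 bytes per complex sample


def buffer_seconds(ram_gb, reserved_gb, stream_gb_s, disk_gb_s=0.0):
    """Seconds until the RAM write buffer is full.

    disk_gb_s is what the storage drains in the meantime; 0.0 models a
    complete stall (e.g. the storage subsystem being preempted).
    """
    usable_gb = ram_gb - reserved_gb
    fill_rate = stream_gb_s - disk_gb_s
    if fill_rate <= 0:
        return float("inf")  # storage keeps up, the buffer never fills
    return usable_gb / fill_rate


def recording_size_gb(stream_gb_s, minutes):
    """Storage needed for a recording of the given length."""
    return stream_gb_s * minutes * 60


# Assumption: about 3 GB of the 16 GB is taken by the OS and applications.
stall_headroom = buffer_seconds(16, 3, STREAM_GB_S)            # storage fully stalled
slow_disk_headroom = buffer_seconds(16, 3, STREAM_GB_S, 1.5)   # one SSD at 1.5 GB/s
ten_minutes = recording_size_gb(STREAM_GB_S, 10)

if __name__ == "__main__":
    print(f"buffer lasts {stall_headroom:.1f} s if storage stalls completely")
    print(f"buffer lasts {slow_disk_headroom:.0f} s with one 1.5 GB/s SSD")
    print(f"10 minute recording: {ten_minutes:.0f} GB")
```

With a single drive sustaining 1.5 GB/s, the buffer only postpones the overflow to roughly two minutes into the recording under these assumptions, which is well short of a 5-10 minute capture; hence the RAID 0 suggestion.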
Best regards,
Marcus

On Wed, 2018-06-27 at 08:30 +0000, Tarik Kazaz via USRP-users wrote:
> Hi All,
>
> I did further research on issues related to streaming and storing IQ samples from a USRP X310 (with UBX-160) sampling at 200 Msps to a PC. The connection between the USRP X310 and the PC would be over two (2) 10 Gigabit Ethernet interfaces.
>
> Based on my calculations, the writing speed to the memory of the PC should be:
>
> DataRate_of_two_streams = 2 (RF_cards) x 2 (IQ) x 200 Msps x 16 (bits) = 12800 Mbps = 12.8 Gbps = 1.6 GBps
> => (taking the computer science definition of 1 GB = 1024 MB) 1600 MB/s = 1.5625 GB/s
>
> These data rates are huge for regular SSDs. As possible solutions, I found that a PC with:
> 1. PCIe/NVMe SSDs
> 2. RAM disks
> could support these data rates.
>
> The first solution seems to be better in terms of capacity, as we would like to be able to store signals over a 5-10 min time interval. This would require around 1TB of capacity. I found that SAMSUNG NVMe SSDs (at least based on their specifications) could support these data rates:
>
> 1. SSD 960 PRO NVMe M.2 2TB (sequential write speed is 2,100 MB/s) (https://www.samsung.com/us/computing/memory-storage/solid-state-drives/ssd-960-pro-m-2-2tb-mz-v6p2t0bw/)
> 2. SSD 970 PRO NVMe M.2 1TB (sequential write speed is 2,700 MB/s) (https://www.samsung.com/us/computing/memory-storage/solid-state-drives/ssd-970-pro-nvme-m2-1tb-mz-v7p1t0bw/)
> 3. SSD 970 EVO NVMe M.2 2TB (sequential write speed is 2,500 MB/s) (https://www.samsung.com/us/computing/memory-storage/solid-state-drives/ssd-970-evo-nvme-m2-2tb-mz-v7e2t0bw/)
>
> Does anyone have experience with a similar set-up? Did you guys from Ettus perform similar experiments?
> We would be grateful for any advice or opinion on these issues.
>
> Cheers,
>
> Tarik
>
> From: Tarik Kazaz
> Sent: Monday, 25 June 2018 15:07
> To: 'usrp-users@lists.ettus.com'
> Subject: Streaming and storing signals of full BW of 2x UBX-160 cards to PC in file
>
> Dear All,
>
> We are working on prototyping signal processing algorithms for radar and localization scenarios (wideband signals). At the moment we are setting up the testbed for testing our algorithms. We are interested in streaming and storing high-data-rate samples covering the full bandwidth of 2x 160 MHz UBX cards to a file on the PC. Later on, we would perform offline processing of those samples.
>
> We managed to find several posts related to similar work. However, we did not manage to find concrete suggestions or a reference guideline on how to set up a system that is able to perform offline acquisition of the full-bandwidth signals supported by 2x UBX-160 cards on a PC.
>
> Our question is directed to Ettus Research developers and anyone who has tried to achieve the same:
>
> 1. What hardware configuration does the PC need in order to support streaming and storing of samples from 2x UBX-160 cards (2(cards)x2(IQ)x160(BW)x8(sample size)) to SSD memory?
> 2. Did you at Ettus try a similar experiment, and could you provide us references for HW and SW configurations?
>
> Our testbed at the moment consists of:
>
> 1. PC (Dell Precision Tower 5810 - https://www.dell.com/en-ca/work/shop/dell-desktops-workstations/dell-precision-tower-5810-workstation-build-your-own/spd/precision-t5810-workstation/cup5810onca):
>    RAM: 4x4GB (https://www.micron.com/parts/modules/ddr4-sdram/mta9asf51272pz-2g3) (this we could also extend to 32GB or 64GB)
>    CPU: Intel Xeon Processor E5-1620 v3 (4C, 3.5GHz, Turbo, HT, 10M, 140W)
>    SSD: skhynix 512GB, 2.5'', Read: up to 550MB/s, Write: up to 480MB/s (http://ssd.skhynix.com/ssd/en/about/sc300a.jsp) (this we plan to extend and maybe change, as the writing speed is low and the capacity is low.
>    Do you have a recommendation for an SSD configuration?)
>
> 2. USRP X310
>    RF daughterboard: 2x UBX-160MHz
>    PC-USRP interface: 2x 10 Gb Ethernet interface
>
> Thank you in advance for your comments and suggestions,
>
> Tarik
>
> _______________________________________________
> USRP-users mailing list
> USRP-users@lists.ettus.com
> http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com