So, what needs to be done? I noticed that there are already hooks for NEON in the Volk library but no implementation (or very little; I don't remember exactly).
My understanding of Orc is that it generates architecture-dependent vector processor instructions from an Orc abstraction language. Is integrating Orc into Volk for NEON as simple as linking into liborc with a compile switch indicating that we want NEON output? Are the smarts already built into the cmake build process? Can I drop Philip's _fff and _ccf filters into volk and hit "go?" (I know there's more nuance to it, but if the combination of integrating Orc code and NEON FIR filter code that's already written gets me 90% of the way there, I'd be VERY happy!)

Thanks,
Sean

________________________________________
From: Nick Foster [n...@ettus.com]
Sent: Tuesday, November 08, 2011 1:27 PM
To: j...@ettus.com
Cc: discuss-gnuradio@gnu.org; Nowlan, Sean
Subject: Re: [Discuss-gnuradio] Complex Short/INT16 type

Sean, with all the talk about optimization for ARM, the first thing I would do is start to integrate Volk with existing floating-point blocks. Stock GCC is very, very bad at vectorizing for the NEON SIMD unit -- even when hardware floating point is used in GCC, most float instructions end up allocated to the VFP rather than the NEON unit. You might find an easy 2x-3x improvement just by doing the heavy lifting in Volk rather than in C++.

All of the Orc functions in Volk will work for NEON. There's no FIR filter in Orc right now (need to get accumulators working properly in Orc), but Philip Balister already wrote NEON FIR filter cores for the _fff and _ccf FIR filters.

This isn't to say that short complex wouldn't be a useful addition to GR. Just that it's likely going to be more work than making use of the existing floating-point hardware the E100 already has. This is work that needs to be done anyway to make ARM platforms as useful as possible, and we (Josh, Phil, and I) are happy to help you optimize your application for E100 if you give us details on how your application works.
We're putting together a "motivating example" using Volk to show users how to Volkify their own blocks.

--n

On Tue, Nov 8, 2011 at 9:13 AM, Josh Blum <j...@ettus.com> wrote:
>
> On 11/07/2011 02:15 PM, Nowlan, Sean wrote:
>> Hi all -
>>
>> I'm getting limited by the slow ARM processor in the E100 and I want
>> to modify parts of gr-digital and gnuradio-core to support complex
>> short/INT16 types in the modulation schemes. I suspect that it won't
>> be as trivial as defining "typedef std::complex<short> gr_complexs;"
>> in gnuradio-core/src/lib/runtime/gr_complex.h and doing a
>> find-and-replace in the relevant source files. There are probably
>
> It may be that simple for some blocks. Like the symbol table in BPSK.
>
>> issues with dynamic range that I'll have to deal with in addition to
>> having to implement filters using fixed-point math.
>>
>
> Often blocks will need to have scale factors. Fortunately, with a FIR
> filter, you get a free scale factor in the "filter taps".
>
>> Questions:
>>
>> 1) Do you think I'd save anything by doing all the modulation &
>> filtering in complex float32 and then converting at the very end?
>
> It's good to make the conversion part of an operation that does something
> useful rather than doing it for the sake of converting. Like a filter
> that takes in floats and spits out shorts.
>
>> This will reduce the bandwidth requirement to the FPGA by two, but
>> I'm afraid the float math is the true limitation.
>>
>
> The format going into the FPGA is always integer. If you pass floats
> into the UHD, they are copy-converted from host buffer to memory mapped
> buffers.
>
>> 2) Why is there a gr_complex_to_interleaved_short block but not
>> a gr_complex_to_complex_short block? Would it be better if I rolled
>> my own or just hooked up a gr_complex_to_interleaved_short block and
>> then a deinterleave block? Or alternatively, split the complex float
>> vector into two streams and feed them to a USRP sink block using
>> COMPLEX.INT16?
>>
>
> The interleaved short block is a strange hold-over from ancient times. I
> would ignore it. I think a block such as "gr_complex_to_complex_short"
> is a good idea.
>
>> 3) What specific parts of the modulation examples or
>> gnuradio-core do you think I need to change to support complex short
>> ints?
>>
>
> Probably some new sc16 filter blocks for the matched filters. I have
> mentioned the importance of volk before.
>
> The constellation stuff relies on this new constellation library in
> gr-digital. Perhaps Ben can lean in here and offer some advice on how to
> modify this for alternative data types.
>
> The recovery stuff in the BPSK is using Tom's new gri-control-loop to
> simplify writing things like FLLs, PLLs. That's a place to look; see how
> the timing recovery blocks make use of it.
>
> -Josh
>
> _______________________________________________
> Discuss-gnuradio mailing list
> Discuss-gnuradio@gnu.org
> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
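
A minimal sketch of two points from Josh's reply in this thread: a FIR filter gives you a free scale factor through its taps, and the float-to-short conversion is best folded into an operation that already does useful work. Everything below is an illustrative assumption in plain standard C++. The names scale_taps, to_int16, and fir_float_to_short are hypothetical, not GNU Radio or Volk APIs, and the filter keeps no history between calls. Also note that instantiating std::complex on an integer type is unspecified by the C++ standard, although it is commonly used this way in practice (as in the gr_complexs typedef quoted in the thread).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a GNU Radio API): pre-multiply the taps by the
// output scale so the conversion needs no extra multiply per sample.
std::vector<float> scale_taps(const std::vector<float>& taps, float scale)
{
    std::vector<float> scaled(taps.size());
    std::transform(taps.begin(), taps.end(), scaled.begin(),
                   [scale](float t) { return t * scale; });
    return scaled;
}

// Round to nearest and saturate to the int16 range instead of wrapping.
std::int16_t to_int16(float x)
{
    const float clamped = std::min(32767.0f, std::max(-32768.0f, x));
    return static_cast<std::int16_t>(std::lrint(clamped));
}

// Hypothetical FIR with real taps: complex float in, complex int16 out.
// Returns in.size() - taps.size() + 1 outputs; no history is carried over.
std::vector<std::complex<std::int16_t>>
fir_float_to_short(const std::vector<std::complex<float>>& in,
                   const std::vector<float>& scaled_taps)
{
    assert(!scaled_taps.empty());
    std::vector<std::complex<std::int16_t>> out;
    const std::size_t n_taps = scaled_taps.size();
    if (in.size() < n_taps)
        return out;

    const std::size_t n_out = in.size() - n_taps + 1;
    out.reserve(n_out);
    for (std::size_t n = 0; n < n_out; ++n) {
        float acc_re = 0.0f;
        float acc_im = 0.0f;
        for (std::size_t k = 0; k < n_taps; ++k) {
            // Convolution: tap k multiplies the sample k steps in the past.
            const std::complex<float>& x = in[n + n_taps - 1 - k];
            acc_re += scaled_taps[k] * x.real();
            acc_im += scaled_taps[k] * x.imag();
        }
        // The only conversion step: the scale is already inside the taps.
        out.emplace_back(to_int16(acc_re), to_int16(acc_im));
    }
    return out;
}
```

For example, with unity-gain taps {0.25, 0.75} scaled by 32767, a constant input of 1.0 comes out as 32767, and inputs beyond full scale saturate rather than wrap. On the E100 the inner loop is the part you would hand to a Volk/NEON kernel; this sketch only shows where the scaling and conversion sit.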