On Mon, Apr 20, 2015 at 10:21 AM, Marcus Müller <marcus.muel...@ettus.com> wrote:
> Hi Marco,
>
> If I may recommend something, it would be having a look at VOLK [1]. It's
> the optimizations library that comes with GNU Radio.
> If you could implement some of these algorithms in CUDA, then every block
> currently using VOLK (which is the majority of the arithmetically
> challenging blocks at the moment) could automatically make use of your
> accelerations, without having to change anything! Also, VOLK comes with
> volk_profile, which it uses to test the different implementations that work
> on your hardware, looking for the fastest one. That would be the ultimate
> benchmark for your kernels, as it directly compares the efficiency of the
> "general C" and CPU-SIMD implementations to your CUDA kernels.
>

We've never been hot on the idea of using VOLK for GPU stuff. VOLK kernels
tend to do one thing at a time and don't worry about data movement (too
much) because the SIMD registers are right there. Moving data to a GPU takes
a lot longer, so once the data is across you want to do as much work there
as possible. With VOLK, we'd be going back and forth for every kernel, which
is a huge performance killer.

> Furthermore, gr-theano is worth a visit [2], because it actually uses CUDA
> to accelerate channel models. The point here is that GPUs and their high
> memcpy latency (and CPU cost) aren't practical for all problems. If I just
> want to add a small number of samples, doing it on a CPU might simply pay
> off better; gr-theano for example offers an FFT, which might be one of the
> algorithms typically working on large vectors where the CPU/GPU boundary
> crossing might be worth it.
>
> Best regards,
> Marcus
>
> [1] http://nathanwest.us/volk/
> [2] http://www.cgran.org/pages/gr-theano.html
>

I'm also not the biggest fan of CUDA for GNU Radio simply because it's too
hardware specific. I'd be more interested in seeing OpenCL implementations
-- but even that has its limitations for support.
Theano looks nice from what I've heard (mostly from Tim and his gr-theano
work), and I don't believe that it's necessarily CUDA.

Tom

> On 04/20/2015 04:09 PM, marco Ribero wrote:
>
> I cannot do it.
> For my thesis, I'm trying to bring various parts of GNU Radio over to CUDA.
> My idea is to rewrite already existing blocks with CUDA, possibly without
> breaking compatibility with the current implementation of GNU Radio. In
> this way a normal user can use these blocks without problems.
>
> For the moment, I've gained more confidence with GNU Radio, made an FM CUDA
> receiver and started to port some blocks over to CUDA. It is mandatory to
> minimize host-device memcpy.
> My current approach is: each block loads its code and communicates with
> its neighbors using async transfers, streams and so on (so I need to pass
> addresses of memory locations, lock bits, etc.).
>
> My next step will be: at the beginning, each block will send down its
> device code and parameters; the block at the end of the chain will do a
> dynamic compilation (CUDA 7). If I have additional time I'll also use
> warp parallelism (reducing global-shared memcpy).
>
> Thanks in any case,
> marco
>
>
> On Mon, 20 Apr 2015 at 12:48, Marcus Müller <
> marcus.muel...@ettus.com> wrote:
>
>> Hi Marco,
>>
>> I just realized: Things might be much easier than that, even:
>>
>> What you do sounds like a job for a hierarchical block; if you're not
>> used to that concept: It's just a "subflowgraph", represented as a block
>> with in- and outputs.
>> If you put both your blocks inside, you'll always have them together.
>> And: in the constructor of your hierarchical block, you can for example
>> first construct your cuda block, and then give your "downstream" block the
>> pointer to that in its constructor.
>>
>> To the user, this will look like one block, though there are two (or
>> more) inside.
>>
>> Greetings,
>> Marcus
>>
>>
>> On 04/20/2015 12:29 PM, marco Ribero wrote:
>>
>>
>> Thank you very much.
>> Your solution is much cleaner.
>>
>> Have a good day,
>> Marco
>>
>> On Mon, 20 Apr 2015 at 09:29, Marcus Müller <
>> marcus.muel...@ettus.com> wrote:
>>
>>> Hi marco,
>>>
>>> what you describe as ID already exists: every block has a function
>>> alias(), giving it a string "name", which can be used with
>>> global_block_registry::block_lookup(name) [1].
>>>
>>> You will need to wrap your alias in a pmt::intern to get it into a
>>> stream tag, so use that with block_lookup, and cast the result to
>>> your_block_type::sptr.
>>>
>>> Greetings,
>>> Marcus
>>>
>>> [1]
>>> http://gnuradio.org/doc/doxygen/classgr_1_1block__registry.html#a67a83c42e2030bba463c99d51e7a8f92
>>>
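
A rough way to see the data-movement argument made in this thread is a
back-of-the-envelope cost model. The sketch below is not from any poster and
measures nothing: the function names and all constants (per-sample costs, the
fixed round-trip overhead, the bus bandwidth) are assumptions picked only for
illustration; the 8 bytes per sample correspond to one complex float32 sample.
It compares a chain of k element-wise operations run on the CPU, on the GPU
with one host-device round trip per operation (the back-and-forth pattern),
and on the GPU with a single round trip for the whole chain.

```python
# Back-of-the-envelope cost model. All constants are ASSUMED, illustrative
# values, not measurements of any real CPU, GPU or bus.

BYTES_PER_SAMPLE = 8            # one complex float32 sample
CPU_NS_PER_SAMPLE = 1.0         # assumed CPU cost of one operation per sample
GPU_NS_PER_SAMPLE = 0.05        # assumed GPU cost of one operation per sample
ROUND_TRIP_FIXED_NS = 20_000.0  # assumed fixed overhead per host-device round trip
BUS_BYTES_PER_NS = 6.0          # assumed bus bandwidth (6 GB/s)


def transfer_ns(n_samples):
    """Time to copy n_samples to the device and the same amount back."""
    return 2 * n_samples * BYTES_PER_SAMPLE / BUS_BYTES_PER_NS


def cpu_ns(n_samples, k_ops):
    """k_ops operations over n_samples, all on the CPU."""
    return k_ops * n_samples * CPU_NS_PER_SAMPLE


def gpu_round_trip_per_op_ns(n_samples, k_ops):
    """Every operation pays its own fixed overhead and its own transfers."""
    per_op = (ROUND_TRIP_FIXED_NS + transfer_ns(n_samples)
              + n_samples * GPU_NS_PER_SAMPLE)
    return k_ops * per_op


def gpu_single_round_trip_ns(n_samples, k_ops):
    """Data crosses the bus once; all k_ops operations run on the device."""
    return (ROUND_TRIP_FIXED_NS + transfer_ns(n_samples)
            + k_ops * n_samples * GPU_NS_PER_SAMPLE)


# Print the three modelled times for a short vector and for a long chain.
for n, k in ((1_000, 1), (1_000_000, 10)):
    print(n, k,
          cpu_ns(n, k),
          gpu_round_trip_per_op_ns(n, k),
          gpu_single_round_trip_ns(n, k))
```

With these assumed figures the CPU wins for short vectors, a round trip per
operation never beats the CPU, and a single round trip pays off only for long
vectors with several chained operations. Real numbers have to come from
profiling on the actual hardware.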
_______________________________________________ Discuss-gnuradio mailing list Discuss-gnuradio@gnu.org https://lists.gnu.org/mailman/listinfo/discuss-gnuradio