Hey community!

Here we go again. Another project update. I've been working with VOLK and SIMD for two weeks now. I fixed some hiccups in last week's pack and unpack kernels. They run just fine in the tests now.

Also, I added a 'volk_8u_x3_encodepolar_8u_x2' kernel. It operates on the assumption that there is one active bit per byte and that it is located in the LSB. A quick performance test with a 2^32 samples head block after the encoder shows that the generic version crunches ~160MSps. Until now I had an encoder that operated on packed bytes and did ~300MSps. In the flowgraph that uses the 'extended_encoder', I added an unpack block. The vector-optimized version does ~570MSps, so it is ~3.5x as fast as the generic version. Some more optimization might yield even better results.

At first glance it may seem odd that the output signature of the encoder is '8u_x2'. The reason is that the kernel internally needs a temporary buffer of the same size as the output buffer. Instead of malloc'ing and free'ing it on every call, it can be created once and reused for every call.

During the week I was struggling with the VOLK tests. I finally solved those issues, but I'd still like to refer to the mail I sent out the other day. SIMD code tends to have quite a few lines of code. To make it easier to read and understand, it would be great if it were possible to implement multiple functions within one '#ifdef LV_HAVE_ARCH ... #endif' block. So far the compiler refuses to compile the code when I do this. It is possible to add functions in the general section, but that is only appropriate for a generic kernel or for common functions.

All the intrinsics I have used so far are available in SSSE3. Although I created aligned and unaligned versions of those kernels, only the store[u] and load[u] intrinsics might make a difference here. My benchmarks don't show any significant difference. All benchmarks were done on a Sandy Bridge i7.
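
To illustrate the one-bit-per-byte layout and the caller-owned temporary buffer described above, here is a minimal sketch in plain C. It is not the actual VOLK kernel: the function names 'unpack_bits_sketch' and 'encode_polar_sketch' are hypothetical, the MSB-first unpacking order is an assumption, and the encoder uses a plain XOR butterfly in natural bit order, which may differ from the bit ordering the real kernel produces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a VOLK kernel: expand packed bytes into one bit
 * per byte. MSB-first order is an assumption made for this sketch. Each
 * output byte carries its bit in the LSB, all other bits are zero. */
static void unpack_bits_sketch(uint8_t* bits, const uint8_t* packed,
                               unsigned int num_bytes)
{
    for (unsigned int i = 0; i < num_bytes; i++) {
        for (unsigned int b = 0; b < 8; b++) {
            bits[8 * i + b] = (packed[i] >> (7 - b)) & 0x01;
        }
    }
}

/* Hypothetical sketch of a polar encoder on unpacked bits.
 *   frame: holds the input bits on entry and the code word on return
 *   temp:  caller-owned scratch buffer with frame_size bytes
 * The two buffers are used in ping-pong fashion, so nothing is allocated
 * inside the function. frame_size must be a power of two. */
static void encode_polar_sketch(uint8_t* frame, uint8_t* temp,
                                unsigned int frame_size)
{
    assert(frame_size != 0 && (frame_size & (frame_size - 1)) == 0);

    uint8_t* src = frame;
    uint8_t* dst = temp;

    /* One butterfly stage per doubling of the branch distance. */
    for (unsigned int half = 1; half < frame_size; half <<= 1) {
        for (unsigned int start = 0; start < frame_size; start += 2 * half) {
            for (unsigned int j = 0; j < half; j++) {
                dst[start + j] = src[start + j] ^ src[start + j + half];
                dst[start + j + half] = src[start + j + half];
            }
        }
        /* The freshly written buffer is the input of the next stage. */
        uint8_t* swap = src;
        src = dst;
        dst = swap;
    }

    /* After an odd number of stages the result sits in temp. */
    if (src != frame) {
        memcpy(frame, src, frame_size);
    }
}
```

A caller would allocate 'temp' once, for example next to the output buffer, and pass the same pointer on every call. That is the motivation for having two buffers in the signature instead of calling malloc and free inside the kernel.
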

I suspect the encoder was easier to optimize than the decoder will be. So for the upcoming week and beyond, I will focus on creating kernels for polar decoding.

More info and current project progress can be found in [1], [2] and [3].

Cheers
Johannes

[1] https://github.com/jdemel/gnuradio
[2] https://github.com/jdemel/socis-proposal
[3] https://github.com/jdemel/volk