On 02/16/2012 11:32 AM, Josh Blum wrote:
>
>
> On 02/16/2012 11:24 AM, Tom Rondeau wrote:
>> On Thu, Feb 16, 2012 at 2:08 PM, Josh Blum <j...@joshknows.com> wrote:
>>
>>>
>>>> Also, you never want to work on the smallest amount of memory possible.
>>>> This is covered in my discussion on my blog. Making arbitrarily small calls
>>>> to work functions causes much more overhead than just running the unaligned
>>>> version of a Volk call. I found this out pretty quickly when I started
>>>> looking into things. Better to process a large chunk to get back into
>>>> alignment than try to handle calls to small amounts of data.
>>>>
>>>
>>> Perhaps this is because you have a processor that doesn't penalize you
>>> for unaligned loads/stores.
>>>
>>> -Josh
>>>
>>
>> I tested this on a handful of different processors: Core2Duo, QuadCore, i7
>> (first gen), i7 (second gen) and they all told me the same thing. You are
>
> For most if not all recent x86 processors there is no unaligned penalty.
> You should be able to always call the unaligned volk routine and see no
> difference in performance. I'm wondering about neon for example, which
> has a penalty. And I suppose to a lesser extent, older x86 processors. I
> don't have numbers now, but I think the volk profiler can confirm this
> about said processors.
The answer for neon is probably a case of "don't do that". In other
words, keep your blocks fed with aligned multiples, regardless of how
the scheduler handles things.

-Josh
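To make the "process a large chunk to get back into alignment" idea from the quoted text concrete, here is a rough sketch of how a caller might size one call. This is only an illustration under stated assumptions, not how the GNU Radio scheduler or Volk actually does it: `Plan` and `plan_call` are hypothetical names, the item size is assumed to divide the alignment (e.g. 4-byte floats and 16-byte alignment), and the buffer address is assumed to be a multiple of the item size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: which kernel flavor to use and how many items
// to hand to it in this call.
struct Plan {
    bool aligned;       // true: buffer starts on an alignment boundary
    std::size_t nitems; // number of items to process in this call
};

// Hypothetical helper (not a Volk or GNU Radio API).
// Picks the largest chunk that leaves the *next* call on an aligned boundary.
// Assumes: alignment is a power of two, item_size divides alignment, and
// addr is a multiple of item_size.
Plan plan_call(std::uintptr_t addr, std::size_t navail,
               std::size_t item_size, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    assert(item_size != 0 && alignment % item_size == 0);
    assert(addr % item_size == 0);

    const std::size_t per_block = alignment / item_size;
    const std::size_t offset = static_cast<std::size_t>(addr % alignment);
    // Items needed to reach the next boundary (0 if already aligned).
    const std::size_t head = ((alignment - offset) % alignment) / item_size;

    if (head == 0) {
        // Already aligned: take whole blocks so the next call stays aligned.
        // If less than one block is available, just take what is there.
        const std::size_t whole = (navail / per_block) * per_block;
        return { true, whole != 0 ? whole : navail };
    }
    if (navail <= head) {
        // Not enough data to reach a boundary; process all of it unaligned.
        return { false, navail };
    }
    // One large unaligned call: the head plus as many whole blocks as fit,
    // rather than a tiny call that only covers the head.
    const std::size_t rest = ((navail - head) / per_block) * per_block;
    return { false, head + rest };
}
```

With 4-byte items and 16-byte alignment, a buffer starting 4 bytes past a boundary with 100 items available would get a single unaligned call of 99 items, after which the next call starts aligned. Whether that beats always calling the unaligned routine depends on the processor, which is exactly the open question above.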