On 02/16/2012 11:32 AM, Josh Blum wrote:
> 
> 
> On 02/16/2012 11:24 AM, Tom Rondeau wrote:
>> On Thu, Feb 16, 2012 at 2:08 PM, Josh Blum <j...@joshknows.com> wrote:
>>
>>>
>>>> Also, you never want to work on the smallest amount of memory possible.
>>>> This is covered in my discussion on my blog. Making arbitrarily small calls
>>>> to work functions causes much more overhead than just running the unaligned
>>>> version of a Volk call. I found this out pretty quickly when I started
>>>> looking into things. Better to process a large chunk to get back into
>>>> alignment than try to handle calls to small amounts of data.
>>>>
>>>
>>> Perhaps this is because you have a processor that doesn't penalize you
>>> for unaligned loads/stores.
>>>
>>> -Josh
>>>
>>
>> I tested this on a handful of different processors: Core2Duo, QuadCore, i7
>> (first gen), i7 (second gen) and they all told me the same thing. You are
> 
> For most if not all recent x86 processors there is no unaligned penalty.
> You should be able to always call the unaligned volk routine and see no
> difference in performance. I'm wondering about NEON, for example, which
> has a penalty, and to a lesser extent about older x86 processors. I
> don't have numbers now, but I think the Volk profiler can confirm this
> for those processors.


The answer for NEON is probably a case of "don't do that". In other
words, keep your blocks fed with aligned multiples, regardless of how
the scheduler handles things.
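
To make "aligned multiples" concrete, here is a rough sketch of the
arithmetic involved. It is plain Python and does not touch GNU Radio or
Volk; the function names and the 16-byte alignment are assumptions made
for illustration only (GNU Radio blocks have set_output_multiple() for
asking the scheduler for this kind of constraint, but nothing below
calls it).

```python
from math import gcd


def items_per_alignment(alignment_bytes, item_size):
    """Smallest item count whose total byte size is a multiple of the alignment."""
    return alignment_bytes // gcd(alignment_bytes, item_size)


def round_down_to_multiple(n_items, multiple):
    """Largest count <= n_items that is a whole number of aligned groups."""
    return (n_items // multiple) * multiple


def items_until_aligned(address, alignment_bytes, item_size):
    """Items to consume before the pointer lands on an aligned boundary.

    Returns None if stepping by item_size can never reach alignment.
    """
    for k in range(items_per_alignment(alignment_bytes, item_size)):
        if (address + k * item_size) % alignment_bytes == 0:
            return k
    return None


# Hypothetical case: 4-byte floats, 16-byte alignment (an assumed value).
multiple = items_per_alignment(16, 4)
usable = round_down_to_multiple(1003, multiple)
head = items_until_aligned(0x1008, 16, 4)
```

With these assumed numbers, a block offered 1003 floats would process
a whole number of 4-item groups and leave the remainder for the next
call. A buffer starting 8 bytes past a boundary is two floats short of
alignment, which is the "process a chunk to get back into alignment"
case discussed earlier in the thread.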

-Josh

_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
