I'm not feeling the same guilt, Philip, so I'll just go ahead and
"complain" about ARM :D
So, the ARM/Thumb instruction sets don't come with a modulo instruction;
hence, "a%b" is very likely implemented as
a - ⌊a/b⌋·b
And integer division often takes multiple cycles, and even more so, can
take 4 u
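To make the divide, multiply, subtract expansion concrete, here is a minimal C sketch. The helper name mod_via_div is hypothetical and only for illustration; it is not what any particular compiler emits. Note that C integer division truncates toward zero, so the floor notation above matches C's % exactly only for non-negative operands; the truncating form below matches % for all b != 0.

```c
/* Remainder computed the way it has to be done without a modulo
 * instruction: divide, multiply back, subtract.
 * mod_via_div is a hypothetical name used only for this sketch. */
static int mod_via_div(int a, int b) {
    int q = a / b;       /* C integer division truncates toward zero */
    return a - q * b;    /* same value as a % b for b != 0 */
}
```

The point is that every % with a run-time divisor carries a full integer division with it.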
On 09/06/2017 10:17 PM, Philip Balister via USRP-users wrote:
On 09/06/2017 07:07 PM, Taliver Heath wrote:
> I had the same issues -- the big performance eater in my case was anything
> that was doing modulo in a tight loop.
>
> So, if you have something like:
>
> for ( int i = 0; i < 1000; i++) {
> array[i % arr_size] = ...
> }
Yeah, basically the % o
I had the same issues -- the big performance eater in my case was anything
that was doing modulo in a tight loop.
So, if you have something like:
for ( int i = 0; i < 1000; i++) {
array[i % arr_size] = ...
}
You'll take a pretty big hit.
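A rough sketch of two common ways to get the same indexing without a division per iteration, assuming the array really is used as a ring. The function names, the n_iter parameter, and the 64-element limit in the self-check are assumptions made for this example, not code from anyone's application. The wrapping index works for any arr_size > 0; the bit mask works only when arr_size is a power of two.

```c
#include <stddef.h>

/* Baseline from the thread: one modulo, and so one division, per iteration. */
static void fill_modulo(int *array, size_t arr_size, size_t n_iter) {
    for (size_t i = 0; i < n_iter; i++) {
        array[i % arr_size] = (int)i;
    }
}

/* Wrapping index: a compare and a reset replace the division.
 * Valid for any arr_size > 0. */
static void fill_wrapped(int *array, size_t arr_size, size_t n_iter) {
    size_t idx = 0;
    for (size_t i = 0; i < n_iter; i++) {
        array[idx] = (int)i;
        if (++idx == arr_size) {
            idx = 0;
        }
    }
}

/* Bit mask: valid ONLY when arr_size is a power of two. */
static void fill_masked(int *array, size_t arr_size, size_t n_iter) {
    const size_t mask = arr_size - 1;
    for (size_t i = 0; i < n_iter; i++) {
        array[i & mask] = (int)i;
    }
}

/* Hypothetical self-check: returns 1 if all three variants leave identical
 * buffers, 0 otherwise (including when arr_size is not a power of two
 * or is larger than 64). */
static int variants_agree(size_t arr_size, size_t n_iter) {
    int a[64] = {0}, b[64] = {0}, c[64] = {0};
    if (arr_size == 0 || arr_size > 64 || (arr_size & (arr_size - 1)) != 0) {
        return 0;
    }
    fill_modulo(a, arr_size, n_iter);
    fill_wrapped(b, arr_size, n_iter);
    fill_masked(c, arr_size, n_iter);
    for (size_t k = 0; k < arr_size; k++) {
        if (a[k] != b[k] || a[k] != c[k]) {
            return 0;
        }
    }
    return 1;
}
```

Whether this pays off is worth measuring on the target: if arr_size is a compile-time constant, the compiler can usually replace the division with a multiply and shifts on its own, so the gain is mostly for sizes only known at run time.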
On Wed, Sep 6, 2017 at 4:00 PM, Tom Bereknyei via USRP-
We ran into a similar issue. The big thing that helped us was moving
high-rate DSP calculations to RFNoC.
I've also had luck with volk_profile. It seems to help with some workloads.
On Wed, Sep 6, 2017 at 16:53 Philip Balister via USRP-users <
usrp-users@lists.ettus.com> wrote:
> On 09/06/2017 04:3
On 09/06/2017 04:38 PM, Marcus Müller via USRP-users wrote:
Hi Mr Hamilton,
So, what you'd want to optimize first depends on what needs the most
optimization. Your x86 program might be a good place to start looking
into what the bottleneck is. If you're running Linux on your x86, I can
heartily recommend `perf`, which is a tool that lets you display live,
We're moving an application that we had running on PC hardware with the
Ettus B210 to the embedded ARM E310. On the PC side we were at 80% idle
CPU when running (Intel i5-4570). With ARMv7 we're down to 30% idle, with
one of the cores at 100%, so it's not keeping up.
Are there any arm specific opti