Hi,

On Mon, Nov 29, 2010 at 8:45 PM, Michael Hope <michael.h...@linaro.org> wrote:
> On Tue, Nov 30, 2010 at 12:37 AM, Dave Martin <dave.mar...@linaro.org> wrote:
>> On Sun, Nov 28, 2010 at 10:28 PM, Michael Hope <michael.h...@linaro.org> wrote:
>>> I sat down and measured the power consumption of the NEON unit on an
>>> OMAP3.  Method and results are here:
>>>  https://wiki.linaro.org/MichaelHope/Sandbox/NEONPower
>>>
>>> The board takes 2.37 W and the NEON unit adds an extra 120 mW.
>>> Assuming the core takes 1 W, then the code needs to run 12 % faster
>>> with NEON on to be a net power win.
>>>
>>> Note that the results are inaccurate but valid enough.
>>
>> Just to play devil's advocate... the results will differ, perhaps
>> significantly, between SoCs of course.
>>
>> In terms of the amount of energy required to perform a particular
>> operation (i.e., at the microbenchmark level) I agree with your
>> conclusion.  However, in practice I suspect this isn't enough.  I'm
>> not familiar with exactly when NEON is likely to get turned on and
>> off, but you need to factor in the behaviour of the OS--- if you
>> accelerate a DSP operation which is used a few dozen times per
>> timeslice, NEON will be used for only a tiny proportion of the time it
>> is switched on, because once NEON is on, it probably stays on at least
>> until the next interrupt, and probably until the next task switch.  With the
>> kernel configured for dynamic timer tick, this can get even more
>> exaggerated, since the rescheduling frequency may drop.
>>
>> The real benefits, in performance and power, therefore come in
>> operations which dominate the run-time of a particular process, such
>> as intensive image handling or codec operations.  NEON in
>> widely-dispersed but sporadically used features (such as
>> general-purpose library code) could be expected to come at a net power
>> cost.  If you use NEON for memcpy for example, you will basically
>> never be able to turn the NEON unit off.  That's unlikely to be a win
>> overall, since even if you now optimise all the code in the system for
>> NEON, you're unlikely to see a significant performance boost-- NEON
>> simply isn't designed for accelerating general-purpose code.
>>
>> The correct decision for how to optimise a given piece of code seems
>> to depend on the SoC and the runtime load profile.  And while you can
>> usefully predict that at build-time for a media player or dedicated
>> media stack components, it's pretty much impossible to do so with
>> general-purpose libraries... unless there's a cunning strategy I
>> haven't thought of.
>>
>> Ideally, processes whose load varies significantly over time and
>> between different use cases (such as Xorg) would be able to select
>> between NEON-ised and non-NEON-ised implementations dynamically, based
>> on the current load.  But I guess we're some distance away from being
>> able to achieve that... ?
>
> I agree.  I've been wondering if this is more of a power management
> topic as what you've described there is basically the same as what the
> CPU frequency governor does in deciding the best way to achieve a
> workload.  Perhaps this can also turn into hints to executing code re:
> what instruction set to use.
>
> There might be an argument for explicit control as well.  Say you're
> decoding an AAC stream and using 20 % CPU - it might be more efficient
> to acquire and release the NEON unit from within the decoder to start
> it up faster and release it as soon as the job is done.
>
> Could a kernel developer describe how the NEON unit is controlled?  My
> understanding is:
>  * NEON is generally off
>  * Executing a NEON instruction causes an instruction trap, which kicks
> the kernel, which starts the unit up
>  * The kernel only saves the NEON registers if the code uses them

I'll give the architectural view--- someone else will have to comment
on the hardware.

Currently, at every context switch, the kernel disables VFP and NEON
by clearing the EN bit in the FPEXC control register.  The first
attempt to use VFP or NEON by the process will cause a trap into the
kernel, which does any necessary context switching of the VFP/NEON
registers, enables them by setting FPEXC.EN, and returns to
userspace.  VFP and NEON remain enabled until the next context switch.

This policy has nothing to do with power--- it's purely done so that
the VFP and NEON context can be switched lazily.  If the kernel
switches to a process that doesn't use VFP or NEON, the old register
contents will remain, so you may also save an additional register bank
context switch if the next context switch takes you back to the
process which actually owns the register contents.
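
To make the lazy scheme concrete, here is a toy model in Python.  It is
only a sketch of the policy described above, not kernel code: the class
LazyFpu and its counters are names I have made up for illustration, and
the model ignores everything except FPEXC.EN and register bank
ownership.

```python
class LazyFpu:
    """Toy model of lazy VFP/NEON context switching (hypothetical names)."""

    def __init__(self):
        self.enabled = False     # models the FPEXC.EN bit
        self.owner = None        # process whose registers are in the bank
        self.current = None      # process currently running
        self.traps = 0           # traps taken into the "kernel"
        self.bank_switches = 0   # register bank save/restore operations

    def context_switch(self, pid):
        # Every context switch disables the unit.  The register contents
        # are left untouched, so the previous owner still owns the bank.
        self.current = pid
        self.enabled = False

    def use_fpu(self):
        # The first VFP/NEON instruction after a context switch traps.
        if not self.enabled:
            self.traps += 1
            if self.owner != self.current:
                # Only now is the old owner's state saved and ours loaded.
                self.bank_switches += 1
                self.owner = self.current
            self.enabled = True


fpu = LazyFpu()
fpu.context_switch("A")
fpu.use_fpu()              # traps, loads A's registers
fpu.use_fpu()              # no trap, the unit is already enabled
fpu.context_switch("B")    # B never touches VFP/NEON
fpu.context_switch("A")
fpu.use_fpu()              # traps, but A still owns the bank
print(fpu.traps, fpu.bank_switches)
```

In this run process A takes two traps but pays for only one register
bank switch, because B never touched the registers in between.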

Particular SoCs may implement their own additional strategy for power
management.  A particular SoC may respond to the toggling of FPEXC.EN
by clock-gating the whole NEON functional unit for example.  Or there
may be some entirely separate logic.  However, in the current
implementation I believe the NEON unit can't normally be destructively
powered down, since the kernel assumes that the last register contents
switched into the VFP/NEON register bank are preserved.

>
> I'm not sure about:
>  * Does NEON remain on as long as that process is executing?  Does it
> get turned off on task switch, or perhaps after a timeout?

Basically, NEON is turned on when a process tries to execute a
NEON/VFP instruction, and turned off on each task switch.

In principle, the kernel could be cleverer than this--- for example,
doing the NEON/VFP register state switch non-lazily and leaving the
unit on when switching to a process which is likely to use VFP/NEON;
or possibly applying a timeout as you suggest.

Obviously, there's a risk of pathological behaviour if NEON/VFP is
disabled too aggressively, since you could churn constantly turning it
off and then back on again.
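
A rough way to see the trade-off is to simulate a timeout policy over a
list of times at which a process uses NEON.  This is purely
illustrative: the kernel does not currently do this, the function name
simulate_timeout is made up, and the time units are arbitrary.

```python
def simulate_timeout(use_times, timeout):
    """Return (power_ons, on_time) for a unit that is switched off
    `timeout` time units after its most recent use (hypothetical policy)."""
    power_ons = 0
    on_time = 0
    off_at = None  # time at which the unit is due to be switched off
    for t in sorted(use_times):
        if off_at is None or t >= off_at:
            # The unit was off: take a trap and switch it on again.
            power_ons += 1
            on_time += timeout
        else:
            # The unit was still on: just push the switch-off time back.
            on_time += (t + timeout) - off_at
        off_at = t + timeout
    return power_ons, on_time


uses = [0, 1, 2, 10, 11]
print(simulate_timeout(uses, 3))  # longer timeout: fewer traps, more on-time
print(simulate_timeout(uses, 1))  # short timeout: a trap on every use
```

With these made-up numbers a timeout of 3 gives 2 power-ons and 9 units
of on-time, while a timeout of 1 gives 5 power-ons and 5 units of
on-time, which is the churn case.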

>  * VFP uses the same register set.  Does a floating point instruction
> also turn the NEON coprocessor on?

Yes-- these are one and the same thing from the kernel's point of
view.  FPEXC.EN=0 basically causes all instructions accessing that
register bank to trap.

Cheers
---Dave
