On Tue, Jun 28, 2011 at 1:33 PM, Koen Kooi <k...@dominion.thruhere.net> wrote:
>
> On 28 Jun 2011, at 22:31, Khem Raj wrote:
>
>> On Tue, Jun 28, 2011 at 10:36 AM, Darren Hart <dvh...@linux.intel.com> wrote:
>>>
>>> On 06/24/2011 04:54 AM, Koen Kooi wrote:
>>>> Hi,
>>>>
>>>> We discussed tune files a bit during last night's TSC meeting and Khem
>>>> had expressed the need before, so I'd like to get this discussion
>>>> started by using armv7a as an example.
>>>>
>>>> For armv7a capable cores we have the following hardware features:
>>>>
>>>> * armv7a instruction set
>>>> * thumb1 instruction set
>>>> * thumb2 instruction set
>>>> * VFP coprocessor
>>>> * optional NEON coprocessor
>>>>
>>>> For the ABI we can choose the following:
>>>>
>>>> * softfp without hw support (e.g. no VFP instructions emitted, slow)
>>>> * softfp with hw support (e.g. VFP and/or NEON instructions emitted,
>>>>   fast)
>>>> * hardfp, emits VFP and/or NEON instructions, slightly faster than
>>>>   softfp/hw, incompatible with everything else
>>>>
>>>> And the extra knobs:
>>>>
>>>> * pure thumb1, no arm instructions (limited use)
>>>> * thumb1/arm interworking
>>>> * pure thumb2, no arm instructions
>>>> * thumb2 interworking (not sure if that's actually useful, thumb2 has
>>>>   complete coverage)
>>>>
>>>> In OE .dev we have the following vars:
>>>>
>>>> TARGET_FPU: switches between hw float and sw float, no reflection in
>>>>   package arch
>>>> ARM_FP_ABI: switches between softfp and hardfp, will create 'armv7a' or
>>>>   'armv7a-hardfp' as package arch
>>>> ARM_INSTRUCTION_SET: switches between arm and thumb1, no reflection in
>>>>   package arch
>>>> THUMB_INTERWORK: turns on interworking, no reflection in package arch
>>>>
>>>> (side note, oe-core/distroless and meta-yocto/poky don't set TARGET_FPU
>>>> for armv7a and will generate slow code, angstrom does turn it on)
>>>
>>> oe-core tune-cortexa8.inc doesn't make use of these variables (unlike
>>> meta-texasinstruments) and does make use of the neon coprocessor, but
>>> still uses the softfp float-abi:
>>>
>>> TARGET_CC_ARCH = "-march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fno-tree-vectorize"
>>>
>>> Seems like the oe-core tune files need to be synced up with vendor
>>> layers?
>>>
>>
>> Well, enabling hardfp is a fundamental decision, and I guess using
>> softfloat in oe-core is probably the best choice. It is the floating
>> point parameter passing ABI I am talking about. We still use -mfpu=neon,
>> so gcc will still try to utilize it, but -fno-tree-vectorize is going to
>> subdue the use of neon instructions since gcc is not allowed to
>> vectorize.
>
> Experience has shown that -fno-tree-vectorize generates faster code with
> gcc 4.5 :)
Someday I will try to benchmark and find out what's going on for myself.
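
To make the combinations discussed in this thread easier to compare, here is a rough Python sketch that maps the knobs (float ABI, FPU, thumb, interworking, vectorization) onto GCC options. The helper names `cc_arch_flags` and `package_arch` are made up for illustration and are not part of oe-core or BitBake. Leaving out -mfpu for the pure soft ABI is my own assumption. The package arch naming follows the ARM_FP_ABI description quoted above.

```python
# Hypothetical helpers for illustration only; not part of oe-core or BitBake.

VALID_FLOAT_ABIS = ("soft", "softfp", "hard")


def cc_arch_flags(float_abi="softfp", fpu="neon", thumb=False,
                  interwork=False, vectorize=False, tune="cortex-a8"):
    """Build a TARGET_CC_ARCH-style flag string for an armv7-a core."""
    if float_abi not in VALID_FLOAT_ABIS:
        raise ValueError("unknown float ABI: %s" % float_abi)

    flags = ["-march=armv7-a", "-mtune=%s" % tune]

    # Assumption: with the pure soft ABI no FPU instructions are emitted,
    # so -mfpu is left out here.
    if float_abi != "soft":
        flags.append("-mfpu=%s" % fpu)
    flags.append("-mfloat-abi=%s" % float_abi)

    if thumb:
        flags.append("-mthumb")
    if interwork:
        flags.append("-mthumb-interwork")

    # The quoted oe-core tune file disables the tree vectorizer.
    if not vectorize:
        flags.append("-fno-tree-vectorize")

    return " ".join(flags)


def package_arch(float_abi="softfp"):
    """Only hardfp changes the package arch, per the ARM_FP_ABI description."""
    return "armv7a-hardfp" if float_abi == "hard" else "armv7a"


if __name__ == "__main__":
    print(cc_arch_flags())
    print(cc_arch_flags(float_abi="hard"), package_arch("hard"))
```

With the defaults this reproduces the TARGET_CC_ARCH string quoted from tune-cortexa8.inc. Passing vectorize=True drops -fno-tree-vectorize, which would be the variant to compare against in a benchmark.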