On 28 Jun 2011, at 22:31, Khem Raj wrote:

> On Tue, Jun 28, 2011 at 10:36 AM, Darren Hart <dvh...@linux.intel.com> wrote:
>>
>> On 06/24/2011 04:54 AM, Koen Kooi wrote:
>>> Hi,
>>>
>>> We discussed tune files a bit during last night's TSC meeting and Khem had
>>> expressed the need before, so I'd like to get this discussion started by
>>> using armv7a as an example.
>>>
>>> For armv7a capable cores we have the following hardware features:
>>>
>>> * armv7a instruction set
>>> * thumb1 instruction set
>>> * thumb2 instruction set
>>> * VFP coprocessor
>>> * optional NEON coprocessor
>>>
>>> For the ABI we can choose the following:
>>>
>>> * softfp without hw support (e.g. no VFP instructions emitted, slow)
>>> * softfp with hw support (e.g. VFP and/or NEON instructions emitted, fast)
>>> * hardfp, emits VFP and/or NEON instructions, slightly faster than
>>>   softfp/hw, incompatible with everything else
>>>
>>> And the extra knobs:
>>>
>>> * pure thumb1, no arm instructions (limited use)
>>> * thumb1/arm interworking
>>> * pure thumb2, no arm instructions
>>> * thumb2 interworking (not sure if that's actually useful, thumb2 has
>>>   complete coverage)
>>>
>>> In OE .dev we have the following vars:
>>>
>>> TARGET_FPU: switches between hw float and sw float, no reflection in
>>>   package arch
>>> ARM_FP_ABI: switches between softfp and hardfp, will create 'armv7a' or
>>>   'armv7a-hardfp' as package arch
>>> ARM_INSTRUCTION_SET: switches between arm and thumb1, no reflection in
>>>   package arch
>>> THUMB_INTERWORK: turns on interworking, no reflection in package arch
>>>
>>> (side note, oe-core/distroless and meta-yocto/poky don't set TARGET_FPU
>>> for armv7a and will generate slow code, angstrom does turn it on)
>>
>> oe-core tune-cortexa8.inc doesn't make use of these variables (unlike
>> meta-texasinstruments) and does make use of the neon coprocessor, but
>> still uses the softfp float-abi:
>>
>> TARGET_CC_ARCH = "-march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fno-tree-vectorize"
>>
>> Seems like the oe-core tune files need to be synced up with vendor layers?
>>
>
> Well, enabling hardfp is a fundamental decision, and I guess using softfloat
> in oe-core is probably the best choice (it is the floating-point
> parameter-passing ABI I am talking about). We still use -mfpu=neon, so gcc
> will still try to utilize it, but -fno-tree-vectorize is going to subdue the
> use of neon instructions since gcc is not allowed to vectorize.
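
For reference, a rough sketch of how the three float ABI choices from the
start of the thread would look if written in the same style as the
TARGET_CC_ARCH line quoted above. Only the softfp variant comes from this
thread; the "soft" and "hard" variants are assumptions for illustration,
built from the standard gcc -mfloat-abi values, and are not taken from any
existing tune file. Only one assignment would be active at a time.

```conf
# Sketch only: alternative settings for one tune file, pick exactly one.

# 1) Soft float, no hw support: no VFP/NEON instructions emitted (slow).
#    Assumption for illustration, not from an existing tune file.
#TARGET_CC_ARCH = "-march=armv7-a -mtune=cortex-a8 -mfloat-abi=soft"

# 2) softfp with hw support: VFP/NEON instructions emitted, soft-float
#    calling convention. This is the line quoted from tune-cortexa8.inc.
TARGET_CC_ARCH = "-march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fno-tree-vectorize"

# 3) hardfp: FP arguments passed in VFP registers, incompatible with 1) and 2).
#    Assumption for illustration, not from an existing tune file.
#TARGET_CC_ARCH = "-march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=hard -fno-tree-vectorize"
```

Variants 1) and 2) can be linked together because they share a calling
convention; variant 3) cannot be mixed with either, which is why it would
need its own package arch.
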
Experience has shown that -fno-tree-vectorize generates faster code with gcc 4.5 :)

_______________________________________________
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.linuxtogo.org/cgi-bin/mailman/listinfo/openembedded-core