2011/1/14 <leonid.moiseic...@nokia.com>:
> Enabling RunFast mode using -ffast-math is a non-trivial hack. It also
> requires updating packages for compilation flags or global options.
> Patching glibc is much cheaper to implement and safer.
>
> In the ideal case the speedup on Cortex-A8 could be around 40% (depending on
> vector/matrix size); even non-vector float operations improve by a
> margin of more than 10%.
> BUT all float/double operations could be affected: you may get a different
> outcome in comparison to IEEE mode.
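
As an illustration of the "different outcome" point above: flush-to-zero is one of the conditions that make up RunFast mode, and it replaces subnormal (denormalized) values with zero. The following is a minimal sketch in C, not a statement of how the hardware is implemented. The helper names `flush_to_zero` and `subnormals_preserved` are hypothetical, and keeping the sign of the flushed zero is an assumption of this software model.

```c
#include <float.h>

/* Hypothetical software model of flush-to-zero for a single float.
 * Any nonzero value smaller in magnitude than FLT_MIN is subnormal
 * and is replaced by zero. Keeping the sign is an assumption of this
 * model. NaN fails every comparison and is returned unchanged. */
static float flush_to_zero(float x)
{
    if (x != 0.0f && x > -FLT_MIN && x < FLT_MIN)
        return (x < 0.0f) ? -0.0f : 0.0f;
    return x;
}

/* Hypothetical runtime probe: returns 1 if halving the smallest normal
 * float yields a nonzero (subnormal) result, 0 if the result was
 * flushed to zero. The volatile qualifiers keep the compiler from
 * folding the division at build time. */
static int subnormals_preserved(void)
{
    volatile float tiny = FLT_MIN;
    volatile float half = tiny / 2.0f;
    return half != 0.0f;
}
```

Calling `subnormals_preserved()` in a process before and after a mode change would be one simple way to check which behaviour is active; any code that relies on subnormal results would see the difference.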
Got any good benchmarks/tools we can run so we can verify this on actual MeeGo hardfp?

>
> With best wishes,
> Leonid
>
>
> -----Original Message-----
> From: meego-dev-boun...@meego.com [mailto:meego-dev-boun...@meego.com] On
> Behalf Of ext Thiago Macieira
> Sent: 12 January, 2011 17:55
> To: meego-dev@meego.com
> Subject: Re: [MeeGo-dev] ARM RunFast by default in glibc
>
> On Wednesday, 12 de January de 2011 16:01:31 Carsten Munk wrote:
>> 2011/1/12 Arjan van de Ven <ar...@linux.intel.com>:
>> > On 1/12/2011 1:06 AM, Carsten Munk wrote:
>> >> Hi (ARM toolchain group mostly)
>> >>
>> >> Do we have a patch for glibc-2.11-12-g24c0bf7 and/or glibc-2.12.1
>> >> that enables ARM RunFast[1] mode by default anywhere? Would be good
>> >> to push it along with hardfp while we're at it and getting things
>> >> tested through.
>> >
>> > can this be turned into something that's passed in via CFLAGS ?
>> > that way apps will not be surprised, and there is an easy way for us
>> > to toggle
>
> Right now, it's a context configuration, so there's nothing that will really
> work from CFLAGS. Without changing gcc, the only thing we could do is supply
> different crt1.o, one that puts the FPU in RunFast, the other doesn't.
>
> But this will, like I said, apply to all code within a process, so it doesn't
> help the library case. Libraries will need to cope with running in both modes.
>
>> > of course we can have a default in our OBS that you pick, but it
>> > becomes an easy-to-manage (from a distro perspective) property
>>
>> That would be a better way than patching glibc, I would believe?
>
> Not necessarily. To do the right thing, the compiler would need to emit the
> code that changes FPSCR before any FP operation, so this means an increase in
> code size. I can also bet that there's a pipeline delay in modifying this
> register.
>
> And there's no such GCC patch.
>
>> Wouldn't -ffast-math correspond to this on x86 side at least?
>>
>> Leonid, does this correspond to an auto-setup of RunFast on ARM, when
>> used there?
>
> No, it's different.
>
> By the way, I should point out that on Cortex-A8, RunFast only has a
> perceptible improvement for float. If you use double, you still have
> performance issues.
>
> On Cortex-A9, both are fast.
>
> --
> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
> Senior Product Manager - Nokia, Qt Development Frameworks
> PGP/GPG: 0x6EF45358; fingerprint:
> E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
> _______________________________________________
> MeeGo-dev mailing list
> MeeGo-dev@meego.com
> http://lists.meego.com/listinfo/meego-dev
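
For reference, the per-process FPSCR change discussed in this thread (the "different crt1.o" idea) could look roughly like the C sketch below. It is not the glibc patch being asked about, only an illustration. It assumes the usual definition of RunFast: flush-to-zero on, default NaN on, and all floating-point exception trap enables cleared. The names `runfast_fpscr` and `enable_runfast` are hypothetical, and using a GCC constructor instead of a modified crt1.o is a substitution made here for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* FPSCR bit positions on ARM VFP. */
#define FPSCR_DN (1u << 25)   /* default NaN mode */
#define FPSCR_FZ (1u << 24)   /* flush-to-zero mode */
/* Exception trap enables: IDE (15), IXE (12), UFE (11), OFE (10),
 * DZE (9), IOE (8). */
#define FPSCR_TRAP_ENABLES ((1u << 15) | (0x1Fu << 8))

/* Hypothetical helper: given a current FPSCR value, return the value
 * with the RunFast conditions applied. All other bits (rounding mode,
 * condition flags, cumulative exception flags) are left untouched. */
static uint32_t runfast_fpscr(uint32_t fpscr)
{
    fpscr |= FPSCR_DN | FPSCR_FZ;
    fpscr &= ~FPSCR_TRAP_ENABLES;
    assert((fpscr & FPSCR_TRAP_ENABLES) == 0);
    return fpscr;
}

#if defined(__arm__) && defined(__VFP_FP__) && !defined(__SOFTFP__)
/* Hypothetical startup hook: runs before main() and switches the
 * calling thread's FPU state to RunFast. Only built on ARM targets
 * with hardware VFP. */
__attribute__((constructor))
static void enable_runfast(void)
{
    uint32_t fpscr;
    __asm__ volatile("vmrs %0, fpscr" : "=r"(fpscr));
    fpscr = runfast_fpscr(fpscr);
    __asm__ volatile("vmsr fpscr, %0" : : "r"(fpscr));
}
#endif
```

As noted in the thread, a setting like this applies to all code in the process, so libraries linked into it would run in whichever mode the executable selected.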