Enabling RunFast mode through -ffast-math is a non-trivial hack. It would also require updating packages to add compilation flags or global options. Patching glibc is much cheaper to implement and safer.
In the ideal case the speedup on Cortex-A8 could be around 40% (depending on vector/matrix size); even non-vector float operations improve by a margin of more than 10%. BUT all float/double operations could be affected: you may get different results compared to IEEE mode.

With best wishes,
Leonid

-----Original Message-----
From: meego-dev-boun...@meego.com [mailto:meego-dev-boun...@meego.com] On Behalf Of ext Thiago Macieira
Sent: 12 January, 2011 17:55
To: meego-dev@meego.com
Subject: Re: [MeeGo-dev] ARM RunFast by default in glibc

On Wednesday, 12 de January de 2011 16:01:31 Carsten Munk wrote:
> 2011/1/12 Arjan van de Ven <ar...@linux.intel.com>:
> > On 1/12/2011 1:06 AM, Carsten Munk wrote:
> >> Hi (ARM toolchain group mostly)
> >>
> >> Do we have a patch for glibc-2.11-12-g24c0bf7 and/or glibc-2.12.1
> >> that enables ARM RunFast[1] mode by default anywhere? Would be good
> >> to push it along with hardfp while we're at it and getting things
> >> tested through.
> >
> > can this be turned into something that's passed in via CFLAGS ?
> > that way apps will not be surprised, and there is an easy way for us
> > to toggle

Right now, it's a context configuration, so there's nothing that will really work from CFLAGS. Without changing gcc, the only thing we could do is supply different crt1.o, one that puts the FPU in RunFast, the other doesn't. But this will, like I said, apply to all code within a process, so it doesn't help the library case. Libraries will need to cope with running in both modes.

> > of course we can have a default in our OBS that you pick, but it
> > becomes an easy-to-manage (from a distro perspective) property
>
> That would be a better way than patching glibc, I would believe?

Not necessarily. To do the right thing, the compiler would need to emit the code that changes FPSCR before any FP operation, so this means an increase in code size. I can also bet that there's a pipeline delay in modifying this register. And there's no such GCC patch.
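To make the FPSCR change discussed above concrete, here is a minimal sketch in C of what such startup code (or compiler-emitted code) might do. It is not taken from any glibc or GCC patch. The helper names runfast_fpscr and enable_runfast are hypothetical, and the sketch assumes the usual description of RunFast: flush-to-zero (FZ) and default-NaN (DN) set, and all VFP exception trap enables cleared. The inline assembly is only compiled on ARM targets with VFP; elsewhere enable_runfast does nothing.

```c
#include <stdint.h>

/* FPSCR bit positions for ARM VFP. */
#define FPSCR_IOE (1u << 8)   /* invalid operation trap enable */
#define FPSCR_DZE (1u << 9)   /* division by zero trap enable  */
#define FPSCR_OFE (1u << 10)  /* overflow trap enable          */
#define FPSCR_UFE (1u << 11)  /* underflow trap enable         */
#define FPSCR_IXE (1u << 12)  /* inexact trap enable           */
#define FPSCR_IDE (1u << 15)  /* input denormal trap enable    */
#define FPSCR_FZ  (1u << 24)  /* flush-to-zero mode            */
#define FPSCR_DN  (1u << 25)  /* default NaN mode              */

#define FPSCR_TRAP_ENABLES \
    (FPSCR_IOE | FPSCR_DZE | FPSCR_OFE | FPSCR_UFE | FPSCR_IXE | FPSCR_IDE)

/* Hypothetical helper: given a current FPSCR value, return the value
 * with the RunFast conditions applied. Pure bit manipulation, so it
 * can be checked on any host. */
static uint32_t runfast_fpscr(uint32_t fpscr)
{
    fpscr &= ~FPSCR_TRAP_ENABLES;   /* no trapped exceptions */
    fpscr |= FPSCR_FZ | FPSCR_DN;   /* flush-to-zero + default NaN */
    return fpscr;
}

/* Hypothetical helper: switch the FPU context of the calling thread to
 * RunFast. Assumption: built for ARM with a VFP unit enabled (for
 * example with a suitable -mfpu option); a no-op everywhere else. */
static void enable_runfast(void)
{
#if defined(__arm__) && defined(__VFP_FP__) && !defined(__SOFTFP__)
    uint32_t fpscr;
    __asm__ volatile("vmrs %0, fpscr" : "=r"(fpscr));
    fpscr = runfast_fpscr(fpscr);
    __asm__ volatile("vmsr fpscr, %0" : : "r"(fpscr));
#endif
}
```

Calling enable_runfast() once at process start would roughly correspond to the alternative-crt1.o idea. Because FPSCR is part of the FPU context, the setting affects every FP operation that follows, including those inside libraries, which is why it cannot be scoped per library.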
> Wouldn't -ffast-math correspond to this on x86 side at least?
>
> Leonid, does this correspond to an auto-setup of RunFast on ARM, when
> used there?

No, it's different.

By the way, I should point out that on Cortex-A8, RunFast only has a perceptible improvement for float. If you use double, you still have performance issues. On Cortex-A9, both are fast.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Senior Product Manager - Nokia, Qt Development Frameworks
  PGP/GPG: 0x6EF45358; fingerprint:
  E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358