Enabling RunFast mode via -ffast-math is a non-trivial hack. It would also 
require updating packages to change their compilation flags or global options.
Patching glibc is much cheaper to implement, and safer.

In the ideal case the speedup on Cortex-A8 could be around 40% (depending on 
vector/matrix size); even non-vector float operations improve by more than 
10%.
BUT all float/double operations could be affected: you may get different 
results compared to IEEE mode.
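
To illustrate the kind of difference meant here, a small sketch in C. It only 
emulates the flush-to-zero part of RunFast on one value, so it runs on any 
host. The helper names (flush_to_zero, ieee_half_min) are made up for this 
example, and keeping the sign of the flushed zero is a simplifying assumption, 
not a statement about what the hardware returns.

```c
#include <float.h>

/* Emulates flush-to-zero for a single value: anything strictly between
 * -FLT_MIN and FLT_MIN (a subnormal) is replaced by zero. Keeping the
 * sign of the zero is an assumption made for this sketch. */
float flush_to_zero(float x)
{
    if (x != 0.0f && x > -FLT_MIN && x < FLT_MIN)
        return (x < 0.0f) ? -0.0f : 0.0f;
    return x; /* normal numbers, zeros, infinities and NaNs pass through */
}

/* Halving the smallest normal float. Under IEEE semantics the result
 * is a nonzero subnormal; volatile keeps the division at run time. */
float ieee_half_min(void)
{
    volatile float m = FLT_MIN;
    return m / 2.0f;
}
```

In IEEE mode ieee_half_min() is a small nonzero number; with flush-to-zero the 
same computation yields zero, so code that later divides by or compares 
against such a value can behave differently.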

With best wishes,
Leonid


-----Original Message-----
From: meego-dev-boun...@meego.com [mailto:meego-dev-boun...@meego.com] On 
Behalf Of ext Thiago Macieira
Sent: 12 January, 2011 17:55
To: meego-dev@meego.com
Subject: Re: [MeeGo-dev] ARM RunFast by default in glibc

On Wednesday, 12 de January de 2011 16:01:31 Carsten Munk wrote:
> 2011/1/12 Arjan van de Ven <ar...@linux.intel.com>:
> > On 1/12/2011 1:06 AM, Carsten Munk wrote:
> >> Hi (ARM toolchain group mostly)
> >> 
> >> Do we have a patch for glibc-2.11-12-g24c0bf7 and/or glibc-2.12.1 
> >> that enables ARM RunFast[1] mode by default anywhere? Would be good 
> >> to push it along with hardfp while we're at it and getting things 
> >> tested through.
> > 
> > can this be turned into something that's passed in via CFLAGS ?
> > that way apps will not be surprised, and there is an easy way for us 
> > to toggle

Right now, it's a context configuration, so there's nothing that will really 
work from CFLAGS. Without changing gcc, the only thing we could do is supply 
different crt1.o, one that puts the FPU in RunFast, the other doesn't.
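
As a rough sketch of what such a startup object would have to do: RunFast 
means flush-to-zero on, default NaN on, and all FP exception trap enables 
cleared in FPSCR. The function names below (runfast_fpscr, enable_runfast) are 
hypothetical and not taken from any existing glibc patch; the preprocessor 
guard is an assumption about how to detect a VFP-capable build, and on other 
targets the hook does nothing.

```c
#include <stdint.h>

/* FPSCR bits involved in RunFast mode (ARM VFP). */
#define FPSCR_DN        (1u << 25)   /* default NaN mode */
#define FPSCR_FZ        (1u << 24)   /* flush-to-zero mode */
#define FPSCR_TRAP_MASK 0x00009F00u  /* trap enables: bits 8-12 and bit 15 */

/* Pure helper: given the current FPSCR value, return the value with
 * FZ and DN set and every exception trap enable cleared. All other
 * bits (rounding mode, flags, ...) are left unchanged. */
uint32_t runfast_fpscr(uint32_t fpscr)
{
    return (fpscr | FPSCR_FZ | FPSCR_DN) & ~FPSCR_TRAP_MASK;
}

/* Hypothetical hook that a RunFast variant of crt1.o could call
 * before main(). Compiles to a no-op when not built for ARM VFP. */
void enable_runfast(void)
{
#if defined(__arm__) && defined(__VFP_FP__) && !defined(__SOFTFP__)
    uint32_t fpscr;
    __asm__ volatile("vmrs %0, fpscr" : "=r"(fpscr));
    fpscr = runfast_fpscr(fpscr);
    __asm__ volatile("vmsr fpscr, %0" : : "r"(fpscr));
#endif
}
```

Since FPSCR is per-thread state saved by the kernel, a hook like this only 
sets the mode for the thread that runs it; how new threads pick it up is left 
out of the sketch.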

But this will, like I said, apply to all code within a process, so it doesn't 
help the library case. Libraries will need to cope with running in both modes.

> > of course we can have a default in our OBS that you pick, but it 
> > becomes an easy-to-manage (from a distro perspective) property
> 
> That would be a better way than patching glibc, I would believe?

Not necessarily. To do the right thing, the compiler would need to emit the 
code that changes FPSCR before any FP operation, so this means an increase in 
code size. I can also bet that there's a pipeline delay in modifying this 
register.

And there's no such GCC patch.

> Wouldn't -ffast-math correspond to this on x86 side at least?
> 
> Leonid, does this correspond to an auto-setup of RunFast on ARM, when 
> used there?

No, it's different.

By the way, I should point out that on Cortex-A8, RunFast only has a 
perceptible improvement for float. If you use double, you still have 
performance issues.

On Cortex-A9, both are fast.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Senior Product Manager - Nokia, Qt Development Frameworks
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358
_______________________________________________
MeeGo-dev mailing list
MeeGo-dev@meego.com
http://lists.meego.com/listinfo/meego-dev
