I tried a build with HB_USER_CFLAGS=-DHB_ALLOC_ALIGNMENT=8,
but couldn't reproduce the slightly better results I got with
HB_STRICT_ALIGNMENT (I reran that one too, to be sure).
Results were similar to the default build.
I have no explanation for this; maybe my CFLAGS
were not interpreted properly (even though they are
visible in /build output).
Brgds,
Viktor
On 2009.05.27., at 14:42, Przemyslaw Czerpak wrote:
On Wed, 27 May 2009, Szakáts Viktor wrote:
I've retested after your change, results below
(two runs each), plus attached:
new (r11148):
HB_STRICT_ALIGNMENT: 38.83/39.28, 38.89/39.36
default : 39.72/40.14, 39.66/40.20
old (r11143/r11144):
HB_STRICT_ALIGNMENT: 38.52/39.01, 38.52/39.11
default : 43.14/43.63, 42.92/43.66
After your change all warnings disappeared in
default mode, and performance got better with
the same (slightly smaller) binary sizes.
GCC builds now use the same macros/inline functions to
store/retrieve values in byte arrays whether or not HB_STRICT_ALIGNMENT
is set, so they do not cause the speed difference.
Setting HB_STRICT_ALIGNMENT also does one more job, and it is
the only thing that may change performance in GCC builds:
if HB_ALLOC_ALIGNMENT is not set, it is defined
with the default value 8. As a result, hb_xgrab() always returns
addresses with 8-byte alignment. Such addresses are often a little
faster on some x86 CPUs than addresses with 4-byte alignment.
You probably use such a CPU, and it looks like in your case the
performance difference is noticeable. I guess you will find
a similar performance difference if you use -DHB_ALLOC_ALIGNMENT=8
instead of HB_STRICT_ALIGNMENT. Just check.
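
To make the alignment point concrete, here is a minimal C sketch of how the padding of a bookkeeping header can decide the alignment of the address an allocator hands out. It is not Harbour's hb_xgrab() code; DEMO_ALLOC_ALIGNMENT, demo_grab(), demo_free() and demo_is_aligned() are hypothetical names, and the sketch assumes malloc() itself returns blocks aligned to at least 8 bytes, which is typical but platform dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for HB_ALLOC_ALIGNMENT; must be a power of two. */
#define DEMO_ALLOC_ALIGNMENT 8

/* Round n up to the next multiple of DEMO_ALLOC_ALIGNMENT. */
static size_t demo_align_up( size_t n )
{
   assert( ( DEMO_ALLOC_ALIGNMENT & ( DEMO_ALLOC_ALIGNMENT - 1 ) ) == 0 );
   return ( n + DEMO_ALLOC_ALIGNMENT - 1 ) &
          ~( size_t ) ( DEMO_ALLOC_ALIGNMENT - 1 );
}

/* Allocate size bytes behind a header that stores the requested size.
   The header is padded to a multiple of the alignment, so the pointer
   given to the caller keeps the alignment of the raw malloc() block.
   With an unpadded 4-byte header the caller would get addresses that
   are only 4-byte aligned. */
static void * demo_grab( size_t size )
{
   size_t header = demo_align_up( sizeof( size_t ) );
   unsigned char * raw = ( unsigned char * ) malloc( header + size );

   if( raw == NULL )
      return NULL;
   *( size_t * ) raw = size;   /* bookkeeping lives at the block start */
   return raw + header;
}

/* Release a block obtained from demo_grab(). */
static void demo_free( void * ptr )
{
   if( ptr != NULL )
      free( ( unsigned char * ) ptr - demo_align_up( sizeof( size_t ) ) );
}

/* Return non-zero when ptr is a multiple of DEMO_ALLOC_ALIGNMENT. */
static int demo_is_aligned( const void * ptr )
{
   return ( ( uintptr_t ) ptr % DEMO_ALLOC_ALIGNMENT ) == 0;
}
```

Under the stated assumption, demo_is_aligned( demo_grab( n ) ) holds for any successful allocation; changing DEMO_ALLOC_ALIGNMENT to 4 on a 32-bit target would shrink the header and allow 4-byte-aligned results.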
Speed with HB_STRICT_ALIGNMENT is a little bit
better with the new rev (and without size growth),
so shouldn't we make it the default? If yes, under
what conditions?
First we have to find the exact reason for the speed difference in your
case in current builds, and later we can check on which CPUs
the speed difference is repeatable. HB_ALLOC_ALIGNMENT=8 increases
total memory usage a little, but it's possible that on some CPUs
the speed improvement will be noticeable. It's also possible that
on some others the bigger memory usage will reduce cache
efficiency and, with it, execution performance.
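
The memory side of this trade-off can be estimated with a simple model: round every requested size up to the allocation alignment and sum the padding. The C sketch below does only that; it is an illustration, not how Harbour accounts for memory, and demo_round_up() and demo_padding_overhead() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of alignment (a power of two). */
static size_t demo_round_up( size_t n, size_t alignment )
{
   assert( alignment != 0 && ( alignment & ( alignment - 1 ) ) == 0 );
   return ( n + alignment - 1 ) & ~( alignment - 1 );
}

/* Total number of padding bytes added when each of the count requested
   sizes is rounded up to the given alignment. */
static size_t demo_padding_overhead( const size_t * sizes, size_t count,
                                     size_t alignment )
{
   size_t overhead = 0, i;

   for( i = 0; i < count; ++i )
      overhead += demo_round_up( sizes[ i ], alignment ) - sizes[ i ];
   return overhead;
}
```

For requests of 1, 4, 12 and 16 bytes this model gives 3 bytes of padding with 4-byte alignment and 15 bytes with 8-byte alignment, so the cost depends heavily on how many small allocations the application makes.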
best regards,
Przemek
_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour