Daniël Mantione wrote:
On Tue, 26 Feb 2008, Luiz Americo Pereira Camara wrote:
Yury Sidorov wrote:
The patch removes packed record for some platforms.
IMO packed can be removed for all platforms. It will gain some speed.
I'd like to understand this issue better.
Why are non-packed records faster?
Cache thrashing. One of the most underestimated performance killers in
modern software.
Does the difference occur at memory allocation or at memory access?
Memory access. What happens is that the non-packed version, being larger
because of alignment padding, causes more cache misses. A cache miss costs
many cycles on a modern CPU; a misaligned read just costs an extra memory
access (which is fast if cached) on x86, and an extra load instruction on
ARM. This is much cheaper than a cache miss.
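To make the size argument concrete, here is a minimal C sketch (C rather than Pascal, purely for illustration). The record layout is hypothetical and not taken from the patch, `__attribute__((packed))` is GCC/Clang syntax, and the 64-byte cache line used in the note below is an assumed, common size rather than something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with natural alignment: the compiler inserts
   padding after 'tag' and after 'flag' (typically 12 bytes in total). */
struct natural_rec {
    uint8_t  tag;
    uint32_t value;
    uint8_t  flag;
};

/* Same fields with padding removed: 6 bytes, but 'value' now sits at
   offset 1 and is therefore misaligned. */
struct __attribute__((packed)) packed_rec {
    uint8_t  tag;
    uint32_t value;
    uint8_t  flag;
};

/* Number of whole records that fit in one cache line of line_size bytes. */
static size_t records_per_line(size_t record_size, size_t line_size)
{
    assert(record_size > 0);
    return line_size / record_size;
}
```

With a 64-byte line, `records_per_line` gives 10 for the 6-byte packed layout and 5 for a 12-byte padded one, so a scan over an array of such records would touch roughly half as many cache lines in the packed case.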
It's much worse than that. Some architectures simply _can't_ do
unaligned access, and they will trigger an exception.
This exception will in many configurations be caught by the OS, which
might then emulate the read by doing two reads, putting the result
together, writing it into the application's memory, and doing a task switch.
This, in total, is several _orders of magnitude_ worse than unaligned
access on a supported platform.
Of course, unaligned access in itself is pretty bad.
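As a rough illustration of the "two reads, put the result together" idea done in user code instead of in a trap handler, here is a small C sketch. The function names are made up for this example. The first variant relies on `memcpy`, which has no alignment requirement; the second assembles the value from single bytes and assumes the data is stored little-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly misaligned address.
   memcpy works on any alignment, so no misaligned load is forced
   on targets that cannot perform one. */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Assemble a little-endian 32-bit value from four single-byte reads.
   Byte reads are always aligned, at the cost of extra instructions. */
static uint32_t load_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Either form costs a few extra instructions per access, which is the cheap case described above; the expensive case is letting the hardware trap and having the OS do the same work on the program's behalf.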
--
Kind regards
Christian Iversen
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel