Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread dhkblaszyk
  

On 10 Oct '12, dhkblas...@zeelandnet.nl wrote:

> One more question: when using packed records, is there anything to say
> about performance? Are there any tests that show how performance is
> impacted?

I did some performance tests on win32, and packed and unpacked objects
and records all show exactly the same performance. Writing the
individual variables in a record or object to file takes about 5.5 times
longer than writing them all at once. If someone wants to run my test
app on other platforms, please let me know and I can post the code. I
will do more testing later on Mac and linux32. I'm interested in how
win64 and linux64 behave in this respect, so if someone has these
architectures, please let me know.

This makes me wonder if choosing a proper value for $PACKRECORDS could
make my file safely readable on all platforms, only needing to convert
the endianness where applicable. This would save me from doing manual
padding in my structs. Say I use a value of 16, would that cover all
ABIs FPC currently supports?

Jonas: do you have an overview of the alignment on all architectures
that FPC supports? Perhaps you could pinpoint where in the compiler this
is handled? If it would be appreciated, I could make a patch to include
this info in the documentation in the future.

Regards, Darius
___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal

Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread Jonas Maebe


On 11 Oct 2012, at 13:59, dhkblas...@zeelandnet.nl wrote:


> I did some performance tests on win32 and it appears that both packed
> and unpacked objects and records all show exactly the same performance.
> Writing the individual variables in a record or object to file takes
> about 5.5 times longer than writing them at once. If someone wants my
> test app to run it on other platforms please let me know then I can
> post the code. I will do more testing later on mac and linux32. I'm
> interested how win64 and linux64 behave in this respect. So if someone
> has these architectures please let me know.


As mentioned before, it not only depends on the platform, but also on  
the contents of the object/record. E.g., a badly misaligned double  
will generally give much worse performance even on Intel.
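A minimal sketch of how such a misaligned double can arise, assuming two hypothetical record types (TNatural and TPacked are illustration names, not taken from the test app). The packed variant places D at offset 1; the offset in the unpacked variant depends on the target ABI, so no value is claimed for it here.

```pascal
program AlignDemo;
{$mode objfpc}

type
  // Default alignment: the compiler may insert padding before D.
  TNatural = record
    B: Byte;
    D: Double;
  end;

  // packed: no padding at all, so D starts right after B (offset 1).
  TPacked = packed record
    B: Byte;
    D: Double;
  end;

var
  N: TNatural;
  P: TPacked;
begin
  // Field offsets are computed from the addresses of the field and the record.
  WriteLn('natural: size=', SizeOf(TNatural),
          ' offset(D)=', PtrUInt(@N.D) - PtrUInt(@N));
  WriteLn('packed:  size=', SizeOf(TPacked),
          ' offset(D)=', PtrUInt(@P.D) - PtrUInt(@P));
end.
```

On any target the packed record should report a size of 9 and an offset of 1 for D.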



> This makes me wonder if choosing a proper value for $PACKRECORDS could
> make my file readable safely on all platforms, only needing to convert
> the endianness if applicable. This would not force me to do manual
> padding in my structs.
>
> Say I use a value of 16 would that cover all ABIs FPC currently
> supports?


Yes.


> Jonas: do you have an overview of the alignment on all
> architectures that FPC supports?


The information is not just architecture-specific, but also OS-specific
(e.g. the alignment of int64 is 4 on Darwin/i386, but 8 on all other
i386 platforms). This is defined in the platform ABI (application
binary interface) documents.
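One way to observe this on a given target is to compile a small probe with the default record alignment and print where the compiler places an Int64 that follows a single byte. This is only a sketch: TProbe is a hypothetical name, and the result reflects whatever {$PACKRECORDS} setting and target the program is built with.

```pascal
program Int64Align;
{$mode objfpc}

type
  TProbe = record
    B: Byte;    // one byte, so any padding before V becomes visible
    V: Int64;
  end;

var
  R: TProbe;
begin
  // %FPCTARGETCPU% and %FPCTARGETOS% expand to the compile-time target names.
  WriteLn('target: ', {$I %FPCTARGETCPU%}, '-', {$I %FPCTARGETOS%});
  WriteLn('offset of Int64 after a Byte: ', PtrUInt(@R.V) - PtrUInt(@R));
  WriteLn('SizeOf(TProbe): ', SizeOf(TProbe));
end.
```

Running the same source on several targets shows directly whether a record written as a whole on one of them could be read as a whole on another.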



> Perhaps you could pinpoint where in the compiler this is handled? If
> appreciated I could make a patch to include this info in the
> documentation in the future.


It's a combination of tdef.alignment (and its overridden methods in
compiler/symdef.pas), tdef.structalignment (idem) and the varalign
information in compiler/systems/i_*.pas. The latter information can in
turn be overridden by the programmer with the -Oa switch and the
{$codealign ...} directive, and is sometimes also adjusted by us, e.g.
when new data types are introduced, when bugs are found or when support
is added for a new ABI that has different requirements (some OSes
support multiple ABIs).


I don't think documenting it in our manual is a good idea. It's not  
something people should depend on beyond what the official platform  
ABIs say, and those documents are maintained separately from our  
manual (and unfortunately seldom have stable URLs that can be referred  
to).



Jonas

Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread Marco van de Voort
In our previous episode, Nico Erfurth said:
> x86 can handle unaligned access, but most implementations (I think
> current atoms and via nano are an exception) will suffer a rather high
> performance penalty.

I thought most modern x86 CPUs only had a penalty when an unaligned
access crossed a cacheline boundary? (32 bytes now, 64 bytes on Haswell)



Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread dhkblaszyk
  

On 11 Oct '12, Jonas Maebe wrote:

> As mentioned before, it not only depends on the platform, but also on
> the contents of the object/record. E.g., a badly misaligned double will
> generally give much worse performance even on Intel.
>
>> This makes me wonder if choosing a proper value for $PACKRECORDS could
>> make my file readable safely on all platforms, only needing to convert
>> the endianness if applicable. This would not force me to do manual
>> padding in my structs.
>> Say I use a value of 16 would that cover all ABIs FPC currently
>> supports?
>
> Yes.

So misalignment of, for instance, a double (or any other type) will only
happen if the record is packed and the packing value is smaller than
what the ABI prescribes, correct?

Let's assume I set the record packing to 16 bytes. This would make
reading and writing records as a whole safe on all platform/architecture
combinations, right? Apart from a few padding bytes, what are the
performance penalties of doing this? Why would there be penalties?


Darius

Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread Jonas Maebe


On 11 Oct 2012, at 15:00, dhkblas...@zeelandnet.nl wrote:


> So misalignment of for instance a double (or whatever type) will only
> happen if the record is packed and the packed value is smaller than
> what the ABI prescribes, correct?


Yes.


> Let's assume I set the record to packed 16bytes, this would make
> reading and writing records as a whole safe on all
> platform/architecture combinations right? Apart from a few padding
> bytes, what are the performance penalties of doing this then? Why
> would there be penalties?


The cpu cache will contain lots of unused padding bytes.


Jonas


Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread Marco van de Voort
In our previous episode, Jonas Maebe said:
> > reading and writing records as a whole safe on all
> > platform/architecture combinations right? Apart from a few padding
> > bytes, what are the performance penalties of doing this then? Why
> > would there be penalties?
> 
> The cpu cache will contain lots of unused padding bytes.

And operations that move records will move more bytes. (e.g. reallocation).
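To get a feel for the size of both costs, the sketch below compares the same fields under {$PACKRECORDS 1} and {$PACKRECORDS 16} and multiplies the difference by an arbitrary record count. The type names, the field choice and the count of one million are assumptions for illustration; the padded size depends on the target, so only the tightly packed size is predictable.

```pascal
program PaddingCost;
{$mode objfpc}

const
  Count = 1000000;  // arbitrary number of records for the estimate

type
{$PACKRECORDS 1}
  // No padding: 1 + 8 + 1 = 10 bytes.
  TTight = record
    Tag: Byte;
    Value: Double;
    Flag: Byte;
  end;

{$PACKRECORDS 16}
  // Fields may be padded up to their natural alignment.
  TPadded = record
    Tag: Byte;
    Value: Double;
    Flag: Byte;
  end;
{$PACKRECORDS DEFAULT}

begin
  WriteLn('SizeOf(TTight)  = ', SizeOf(TTight));
  WriteLn('SizeOf(TPadded) = ', SizeOf(TPadded));
  // Extra bytes that occupy cache lines and are copied on every move.
  WriteLn('extra bytes for ', Count, ' records: ',
          Int64(Count) * (SizeOf(TPadded) - SizeOf(TTight)));
end.
```

The extra bytes reported are the ones that take up cache space and get copied whenever the records are moved or reallocated.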



Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread dhkblaszyk
  

On 11 Oct '12, Jonas Maebe wrote:

> On 11 Oct 2012, at 15:00, dhkblas...@zeelandnet.nl wrote:
>
>> So misalignment of for instance a double (or whatever type) will only
>> happen if the record is packed and the packed value is smaller than
>> what the ABI prescribes, correct?
>
> Yes.
>
>> Let's assume I set the record to packed 16bytes, this would make
>> reading and writing records as a whole safe on all
>> platform/architecture combinations right? Apart from a few padding
>> bytes, what are the performance penalties of doing this then? Why
>> would there be penalties?
>
> The cpu cache will contain lots of unused padding bytes.

Thanks, I think everything is clear now. My plan is to respect the
default padding and write records to disk in one go. The padding value
will be written to the file header, so the records can be read back one
variable at a time when the padding differs; otherwise they will be read
back in one go again. This will surely come at a cost, but only if the
file is shared between different ABIs (as is the case when sharing
between different endiannesses). The result will be that the data
structures always use the default padding internally, making optimal
use of the CPU.

So is there a way to get the padding value at runtime?


Darius 


Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread Jonas Maebe


On 11 Oct 2012, at 15:23, dhkblas...@zeelandnet.nl wrote:


> Thanks, I think everything is clear now. My plan now is to respect
> default padding and write records in one go to disk. The padding value
> will be written to the file header so the records can be read back one
> variable at a time when padding differs, otherwise they will be read
> back in one go again. This will sure come at a cost, but only if the
> file is shared between different ABIs (as is the case when sharing
> between different endianness). The result will be that the data
> structures will be at default padding internally always making optimal
> use of the CPU.
>
> So is there a way to get the padding value at runtime?


No. You really should write the fields one by one. Yes, it's slower.  
That's the cost of portability. You can always optimize by first  
writing them to a buffer and then writing the buffer in one go.
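A minimal sketch of that buffered approach, assuming a hypothetical record TSample and a little-endian on-disk layout chosen for illustration. Each field is converted with NtoLE and appended to a TMemoryStream, which is then saved with a single call. It also assumes that a Double has the same byte order as integers on the target, which holds on common platforms but is not guaranteed everywhere.

```pascal
program WriteFields;
{$mode objfpc}{$H+}

uses
  Classes;

type
  TSample = record   // hypothetical record, for illustration only
    Id: LongInt;
    Weight: Double;
    Flags: Word;
  end;

// Append each field in a fixed little-endian layout, independent of
// how the compiler pads TSample in memory.
procedure WriteSample(S: TStream; const R: TSample);
var
  I: LongInt;
  Q: QWord;
  W: Word;
begin
  I := NtoLE(R.Id);
  S.WriteBuffer(I, SizeOf(I));
  // Copy the Double's bits into a QWord so NtoLE can be applied.
  Move(R.Weight, Q, SizeOf(Q));
  Q := NtoLE(Q);
  S.WriteBuffer(Q, SizeOf(Q));
  W := NtoLE(R.Flags);
  S.WriteBuffer(W, SizeOf(W));
end;

var
  Buf: TMemoryStream;
  R: TSample;
begin
  R.Id := 42;
  R.Weight := 1.5;
  R.Flags := 3;
  Buf := TMemoryStream.Create;
  try
    WriteSample(Buf, R);
    WriteLn('bytes in buffer: ', Buf.Size);  // 4 + 8 + 2 = 14
    // The whole buffer goes to disk in one go.
    Buf.SaveToFile('sample.bin');
  finally
    Buf.Free;
  end;
end.
```

The file then contains 14 bytes per record regardless of the padding the compiler uses for TSample in memory.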



Jonas

Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread Mark Morgan Lloyd

Marco van de Voort wrote:
> In our previous episode, Nico Erfurth said:
>> x86 can handle unaligned access, but most implementations (I think
>> current atoms and via nano are an exception) will suffer a rather high
>> performance penalty.
>
> I thought most modern x86's only had a penalty when an unaligned
> access crossed a cacheline boundary? (32 bytes now, 64 bytes on Haswell)


In any event, I run FPC and Lazarus on SPARC which is susceptible to 
misalignment and am not currently aware of any problems.


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread dhkblaszyk
  

On 11 Oct '12, Jonas Maebe wrote:

> On 11 Oct 2012, at 15:23, dhkblas...@zeelandnet.nl wrote:
>
>> Thanks, I think everything is clear now. My plan now is to respect
>> default padding and write records in one go to disk. The padding
>> value will be written to the file header so the records can be read
>> back one variable at a time when padding differs, otherwise they will
>> be read back in one go again. This will sure come at a cost, but only
>> if the file is shared between different ABIs (as is the case when
>> sharing between different endianness). The result will be that the
>> data structures will be at default padding internally always making
>> optimal use of the CPU.
>>
>> So is there a way to get the padding value at runtime?
>
> No. You really should write the fields one by one. Yes, it's slower.
> That's the cost of portability. You can always optimize by first
> writing them to a buffer and then writing the buffer in one go.
>
> Jonas

Sorry to keep asking questions, but why write them one by one? If I
stored the offset each variable has at the time of writing (this only
needs to be done once per record type), I could easily make the loading
work, even if the ABI changes when the file is read back. What makes you
prefer writing the variables one by one over writing them all at once?


Darius 

  


Re: [fpc-pascal] Memory alignment with FPC

2012-10-11 Thread Jonas Maebe


On 11 Oct 2012, at 16:11, dhkblas...@zeelandnet.nl wrote:


> On 11 Oct '12, Jonas Maebe wrote:
>
>> No. You really should write the fields one by one. Yes, it's slower.
>> That's the cost of portability. You can always optimize by first
>> writing them to a buffer and then writing the buffer in one go.
>
> Sorry I keep asking questions, but why write them one by one? If I
> would store the offset each variable has at the time of writing (only
> need to do one time per record type), I could easily make the loading
> work (even if the ABI changes when the file is read back). What makes
> you prefer writing the variables one by one over once at a time?


I always prefer simple techniques over elaborate strategies aimed at
optimizing things, especially if it's not clear that they will ever be
the performance bottleneck in the first place. Moreover, you're trading
space (storing all the offsets) for cpu operations here, and I/O is
generally two or more orders of magnitude slower than moving data in
memory.
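The simple technique also works in the reading direction: the bytes are fetched with one I/O operation and then decoded field by field in memory, which is the cheap part. The sketch below assumes a hypothetical record TSample and a 14-byte little-endian layout (LongInt, Double, Word); a constant array stands in for the bytes that would come from the file.

```pascal
program ReadFields;
{$mode objfpc}

type
  TSample = record   // hypothetical record, for illustration only
    Id: LongInt;
    Weight: Double;
    Flags: Word;
  end;

const
  // Stand-in for file contents: Id=42, Weight=1.5, Flags=3, little-endian.
  Raw: array[0..13] of Byte =
    (42, 0, 0, 0,  0, 0, 0, 0, 0, 0, $F8, $3F,  3, 0);

// Decode one record from Buf starting at Ofs, advancing Ofs past each field.
procedure ReadSample(const Buf: array of Byte; var Ofs: SizeInt;
  out R: TSample);
var
  I: LongInt;
  Q: QWord;
  W: Word;
begin
  Move(Buf[Ofs], I, SizeOf(I));
  Inc(Ofs, SizeOf(I));
  R.Id := LEtoN(I);

  Move(Buf[Ofs], Q, SizeOf(Q));
  Inc(Ofs, SizeOf(Q));
  Q := LEtoN(Q);
  Move(Q, R.Weight, SizeOf(Q));  // reinterpret the bits as a Double

  Move(Buf[Ofs], W, SizeOf(W));
  Inc(Ofs, SizeOf(W));
  R.Flags := LEtoN(W);
end;

var
  R: TSample;
  P: SizeInt;
begin
  P := 0;
  ReadSample(Raw, P, R);
  WriteLn(R.Id, ' ', R.Weight:0:1, ' ', R.Flags);  // prints: 42 1.5 3
end.
```

No offsets need to be stored in the file, because the layout is fixed by the order and size of the fields rather than by the compiler's padding.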



Jonas