Thank you for the report.
According to Agner Fog's table, complex LEA instructions should have a
3-cycle latency on that architecture (Haswell). Optimisations with this
instruction are proving interesting because there's such a variety
between processor architectures. Some are fine with 3 components,
but slow right down if a scale factor is used.
Kit
On 09/10/2023 14:06, Nataraj S Narayan via fpc-devel wrote:
Hi Gareth
model name : Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz
Regards
Nataraj S Narayan
Synergy Info Systems
Software & Technology Consultants
Ettumanoor, INDIA
Ph:+91 9443211326
On Sun, Oct 8, 2023 at 6:40 PM J. Gareth Moreton via fpc-devel
<fpc-devel@lists.freepascal.org> wrote:
Hi Nataraj
Which processor was that run on? (Although too close to call, it
implies LEA has a latency of 2 in that case.)
Kit
On 08/10/2023 14:06, Nataraj S Narayan via fpc-devel wrote:
Hi
[nataraj@dflyHP ~]$ fpc ttt.pas
Free Pascal Compiler version 3.2.2 [2023/07/04] for x86_64
Copyright (c) 1993-2021 by Florian Klaempfl and others
Target OS: DragonFly for x86-64
Compiling ttt.pas
Linking ttt
/usr/local/bin/ld.bfd: warning: /usr/local/lib/fpc/3.2.2/units/x86_64-dragonfly/rtl/prt0.o: missing .note.GNU-stack section implies executable stack
/usr/local/bin/ld.bfd: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
121 lines compiled, 14.9 sec
[nataraj@dflyHP ~]$ ./ttt
Pascal control case: 6.7 ns/call
Using LEA instruction: 4.2 ns/call
Using ADD instructions: 4.0 ns/call
Nataraj S Narayan
Synergy Info Systems
Software & Technology Consultants
Ettumanoor, INDIA
Ph:+91 9443211326
On Sat, Oct 7, 2023 at 9:39 PM J. Gareth Moreton via fpc-devel
<fpc-devel@lists.freepascal.org> wrote:
That's interesting; I am interested to see the assembly output for the
Pascal control cases. As for the 64-bit version, that was my fault
since the assembly language is for Microsoft's ABI rather than the
System V ABI, so it was checking a register with an undefined value.
Find attached the fixed test.
Kit
P.S. Results on my Intel(R) Core(TM) i7-10750H
Pascal control case: 2.0 ns/call
Using LEA instruction: 1.7 ns/call
Using ADD instructions: 1.3 ns/call
On 07/10/2023 16:51, Tomas Hajny via fpc-devel wrote:
> On 2023-10-07 03:57, J. Gareth Moreton via fpc-devel wrote:
>
>
> Hi Kit,
>
>> Do you think this should suffice? Originally it ran for 1,000,000
>> repetitions but I fear that will take way too long on a 486, so I
>> reduced it to 10,000.
>
> OK, I tried it now. First of all, after turning on the old machine, I
> realized that it wasn't Intel but AMD 486 DX4 - sorry for my bad
> memory. :-( I compiled and ran the test under OS/2 there (I was too
> lazy to boot it to DOS ;-) ), but I assume that it shouldn't make any
> substantial difference. The ADD and LEA results were basically the
> same there, both around 100 ns/call. The Pascal result was around
> twice as long. Interestingly, the Pascal result for FPC 3.2.2 was
> around 10% longer than the same source compiled with FPC 2.0.3 (the
> assembler versions were obviously the same for both FPC versions; I
> tried compiling it also with FPC 1.0.10 and the assembler versions
> were more than three times slower due to missing support for the
> nostackframe directive).
>
> I tested it under the AMD Athlon 1 GHz machine as well and again, the
> results for LEA and ADD are basically equal (both 3.1 ns/call) and the
> result for Pascal slightly more than twice (7.3 ns/call). However,
> rather surprisingly for me, the overall test run was _much_ longer
> there?! Finally, I tried compiling the test on a 64-bit machine (AMD
> A9-9425) with Linux (compiled for 64-bits with FPC 3.2.3 compiled from
> a fresh 3.2 branch). The Pascal version shows about 4 ns/call, but the
> assembler version runs forever - well, certainly much longer than my
> patience lasts. I haven't tried to analyze the reasons, but that's
> what I get.
>
> Tomas
>
>
>
>>
>> On 03/10/2023 06:30, Tomas Hajny via fpc-devel wrote:
>>> On October 3, 2023 03:32:34 +0200, "J. Gareth Moreton via fpc-devel" <fpc-devel@lists.freepascal.org> wrote:
>>>
>>>
>>> Hi Kit,
>>>
>>>> This is mainly to Florian, but also to anyone else who can answer
>>>> the question - at which point did a complex LEA instruction (using
>>>> all three input operands and some other specific circumstances) get
>>>> slow? Preliminary research suggests the 486 was when it gained
>>>> extra latency, and then Sandy Bridge when it got particularly bad.
>>>> Ice Lake seems to be the architecture where faster LEA instructions
>>>> are reintroduced, but I'm not sure about AMD processors.
>>> I cannot answer your question, but if you prepare a test program, I
>>> can run it on an Intel 486 DX2 100 MHz and AMD Athlon 1 GHz machines
>>> if it helps you in any way (at least I hope the 486 DX2 machine
>>> should be still able to start ;-) ).
>>>
>>> Tomas
>>>
>>> _______________________________________________
>>> fpc-devel maillist - fpc-devel@lists.freepascal.org
>>> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel