Re: Testsuite generator script

2025-01-14 Thread Stefan Schulze Frielinghaus via Gcc
On Mon, Jan 13, 2025 at 07:18:05AM -0700, Jeff Law via Gcc wrote: > > > On 1/13/25 2:56 AM, Stefan Schulze Frielinghaus via Gcc wrote: > > Hi everyone, > > > > In order to better test our s390 builtins, I have been coming up with a > > small tool in order to a

Testsuite generator script

2025-01-13 Thread Stefan Schulze Frielinghaus via Gcc
automatically into some (build) directory which dejagnu then sources? Any pointers are highly appreciated. Cheers, Stefan

Secondary reload and pseudos

2024-11-08 Thread Stefan Schulze Frielinghaus via Gcc
(operands[1]; emit_insn (gen_rtx_SET (operands[2], gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (48)))); emit_insn (gen_rtx_SET (gen_rtx_REG (DFmode, REGNO (operands[0])), gen_rtx_REG (DFmode, REGNO (operands[2])))); DONE; }) That restores bootstrap. However, this feels a bit hacky and I'm wondering whether the initial implementation is wrong in the first place, or whether there exists a more elegant solution? Any thoughts? Cheers, Stefan

Re: Referencing a register in different modes

2024-09-12 Thread Stefan Schulze Frielinghaus via Gcc
On Fri, Aug 09, 2024 at 09:49:03AM +0200, Stefan Schulze Frielinghaus wrote: > On Thu, Aug 08, 2024 at 01:56:48PM -0600, Jeff Law wrote: > > > I haven't tested it extensively but it triggers at least for the current > > > case. > > > I would have loved to also

Re: Referencing a register in different modes

2024-08-09 Thread Stefan Schulze Frielinghaus via Gcc
ional work---although the name is bit of a mouthful. > If you want to throw a patch over the wall for testing, happy to put it into > my tester and see what comes out the other side. I wouldn't be at all > surprised if it tripped on other targets. H

Re: Referencing a register in different modes

2024-08-08 Thread Stefan Schulze Frielinghaus via Gcc
On Thu, Aug 08, 2024 at 07:57:43AM -0600, Jeff Law wrote: > > > On 8/8/24 6:26 AM, Stefan Schulze Frielinghaus wrote: > > On Thu, Aug 08, 2024 at 06:03:13AM -0600, Jeff Law wrote: > > > > > > > > > On 8/8/24 5:15 AM, Stefan Schulze Frielinghaus via

Re: Referencing a register in different modes

2024-08-08 Thread Stefan Schulze Frielinghaus via Gcc
On Thu, Aug 08, 2024 at 06:03:13AM -0600, Jeff Law wrote: > > > On 8/8/24 5:15 AM, Stefan Schulze Frielinghaus via Gcc wrote: > > > > > However `(reg:DI 61 [ MEM[(const union T *)p_2(D)] ])` referencing the > > same pseudo in a different mode is not substituted in

Referencing a register in different modes

2024-08-08 Thread Stefan Schulze Frielinghaus via Gcc
out if gen_lowpart doesn't return a subreg since then most likely `expr` was a paradoxical subreg. At least in this example this leads to a partial initialization of pseudo 61 in insn 6. This is fixed up later by pass init-regs which is introducing insn 17 and zeroing the entire pseudo 61.

[RFC] genoutput: Error on unresolved iterator

2024-07-16 Thread Stefan Schulze Frielinghaus via Gcc
I just ran into an unresolved iterator https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657360.html which motivated me to dig into genoutput.cc where in process_template() we already emit an error but only if the new compact syntax is used. There is probably a reason for limiting the check to th

Re: Setting insn mnemonic partly automagically

2024-06-22 Thread Stefan Schulze Frielinghaus via Gcc
On Sat, Jun 22, 2024 at 01:00:54PM +0200, Georg-Johann Lay wrote: > On 22.06.24 at 10:46, Stefan Schulze Frielinghaus wrote: > > On Fri, Jun 21, 2024 at 09:50:43PM +0200, Georg-Johann Lay wrote: > > > > > > > > > On 17.06.24 at 21:13, Stefan Schulze

Re: Setting insn mnemonic partly automagically

2024-06-22 Thread Stefan Schulze Frielinghaus via Gcc
On Fri, Jun 21, 2024 at 09:50:43PM +0200, Georg-Johann Lay wrote: > > > On 17.06.24 at 21:13, Stefan Schulze Frielinghaus via Gcc wrote: > > Hi all, > > > > I'm trying to add an alternative to an existing insn foobar: > > > > (defi

Setting insn mnemonic partly automagically

2024-06-17 Thread Stefan Schulze Frielinghaus via Gcc
ing alternatives be set automagically. Not sure whether this is supported? If all fails, I have another idea how to solve this by utilizing PRINT_OPERAND. However, now I'm curious whether my current attempt is feasible or not. Cheers, Stefan

Re: Partial vector

2024-06-04 Thread Stefan Schulze Frielinghaus via Gcc
On Tue, Jun 04, 2024 at 09:50:04AM +0200, Richard Biener wrote: > On Tue, Jun 4, 2024 at 8:52 AM Stefan Schulze Frielinghaus via Gcc > wrote: > > > > Hi all, > > > > Is there some sort of guarantee that the unused part of a partial vector has > > all bits set t

Partial vector

2024-06-03 Thread Stefan Schulze Frielinghaus via Gcc
probably better solved by having some sort of masking support by the hardware but I'm still keen to know. Cheers, Stefan

Build errors for older versions

2024-04-25 Thread Stefan Schulze Frielinghaus via Gcc
appear if I'm using e.g. Fedora 34. Is this known and if so does there exist a workaround such that building older versions on a recent OS works? Cheers, Stefan

Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?

2023-06-05 Thread Stefan Kanthak
instead of code fiddling with the stack! Stefan Kanthak

Who cares about size? (was: Who cares about performance (or Intel's CPU errata)?)

2023-05-29 Thread Stefan Kanthak
"Andrew Pinski" wrote: > On Sat, May 27, 2023 at 3:54 PM Stefan Kanthak > wrote: >> Nevertheless GCC fails to optimise code properly: >> >> --- .c --- >> int ispowerof2(unsigned long long argument) { >> return __builtin_popcountll(argument) =

Re: Who cares about performance (or Intel's CPU errata)?

2023-05-28 Thread Stefan Kanthak
"Andrew Pinski" wrote: > On Sat, May 27, 2023 at 3:54 PM Stefan Kanthak > wrote: [...] >> Nevertheless GCC fails to optimise code properly: >> >> --- .c --- >> int ispowerof2(unsigned long long argument) { >> return __builtin_popcountll(argu

Re: Who cares about performance (or Intel's CPU errata)?

2023-05-27 Thread Stefan Kanthak
"Andrew Pinski" wrote: > On Sat, May 27, 2023 at 2:25 PM Stefan Kanthak > wrote: >> >> Just to show how SLOPPY, INCONSEQUENTIAL and INCOMPETENT GCC's developers >> are: >> >> --- dontcare.c --- >> int ispowerof2(unsigned __int12

Re: Another epic optimiser failure

2023-05-27 Thread Stefan Kanthak
"Andrew Pinski" wrote: > On Sat, May 27, 2023 at 2:38 PM Stefan Kanthak > wrote: >> >> "Jakub Jelinek" wrote, completely clueless: >> >>> On Sat, May 27, 2023 at 11:04:11PM +0200, Stefan Kanthak wrote: >>>> OUCH: popcnt writes

Re: Another epic optimiser failure

2023-05-27 Thread Stefan Kanthak
"Jakub Jelinek" wrote, completely clueless: > On Sat, May 27, 2023 at 11:04:11PM +0200, Stefan Kanthak wrote: >> OUCH: popcnt writes the WHOLE result register, there is ABSOLUTELY >> no need to clear it beforehand nor to clear the higher 24 bits >> aft

Who cares about performance (or Intel's CPU errata)?

2023-05-27 Thread Stefan Kanthak
Ts output? See https://gcc.godbolt.org/z/jdjTc3EET for comparison! FIX YOUR BUGS, KIDS! Stefan

Another epic optimiser failure

2023-05-27 Thread Stefan Kanthak
zx eax, al # superfluous! ret Will GCC eventually generate properly optimised code instead of bloat? Stefan

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-27 Thread Stefan Kanthak
You wrote: >On 2023-05-26 23:40, Stefan Kanthak wrote: >> Feel free to propose this alternative here (better elsewhere, where you'll >> earn less laughter). >> But don't forget that this 23-bit mantissa will be all zeroes for quite some >> 64-bit (and even 32-

Epic code generator/optimiser failures

2023-05-27 Thread Stefan Kanthak
xmm0 -> ptest xmm1, xmm0 sete al -> setz al .L1: ret -> ret 5 out of 14 instructions are superfluous here, or 18 of 50 bytes! OUCH #3/#4: see above! Will GCC eventually generate proper SSE4.1/AVX code? Stefan

Re: GCC plays "Shell Game", but looses track of the shell covering the nought

2023-05-27 Thread Stefan Kanthak
"Dave Blanchard" wrote: > Hi Stefan, thanks for sharing this information. > I was wondering if the code generators in earlier GCC > versions were any better? Just open one of the URLs I included, select another GCC version and see the resulting code. > Is this a proble

GCC plays "Shell Game", but looses track of the shell covering the nought

2023-05-27 Thread Stefan Kanthak
, 2 registers clobbered without need and reason, resulting in 2 superfluous memory writes. It's a REAL shame how bad GCC's code generator is! Stefan

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023 at 15:34, Stefan Kanthak wrote: >> >> "Jonathan Wakely" wrote: >> >> > On Fri, 26 May 2023 at 14:55, Stefan Kanthak >> > wrote: >> >> [...] >> >> >> NOT obv

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
You wrote: >On 2023-05-26 14:46, Stefan Kanthak wrote: >> OOPS: why does GCC (ab)use the SSE2 alias "Willamette New Instruction Set" >> (... ...) >> OUCH: why does it FAIL to REALLY use SSE2, as shown in the comments on the >>right side? > > Pleas

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023 at 15:48, Stefan Kanthak wrote: >> >> "Jakub Jelinek" wrote: >> >> [...] >> >> > And for -m32 it is also the last option that wins, but as with >> > many other cases just

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
other words: although -march= selects a (documented sub)set of -mISA options, it does NEITHER reset any -mISA option set NOR any -mno-ISA option reset BEFORE or AFTER itself, i.e. all -m[no-]ISA options have precedence even if they precede -march=. Just document that! Stefan

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023 at 14:55, Stefan Kanthak wrote: [...] >> NOT obvious is but that -m -march= does not clear any >> not supported in , i.e the last one does NOT win here. > > The last -march option selects the base set of instructi

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jakub Jelinek" wrote: > On Fri, May 26, 2023 at 02:19:54PM +0200, Stefan Kanthak wrote: >> > I find it very SURPRISING that you're only just learning the basics of >> > how to use gcc NOW, after YELLING about all the OUCH. >> >> I'm NOT

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023 at 13:23, Stefan Kanthak wrote: >> >> "Jonathan Wakely" wrote: >> >> > On Fri, 26 May 2023 at 12:42, Stefan Kanthak wrote: >> >> Why does the documentation FAIL to specify that CP

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023 at 13:09, Stefan Kanthak wrote: >> >> "Jonathan Wakely" wrote: >> >> > On Fri, 26 May 2023 at 12:29, Stefan Kanthak >> > wrote: >> >> OUCH: as shown in https://godbolt.org/z

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023 at 12:42, Stefan Kanthak wrote: >> Why does the documentation FAIL to specify that CPU features given by >> -m* override -m32 or enables them in ADDITION to those enabled by -march=? > > Because it's obvious. I

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023 at 12:29, Stefan Kanthak wrote: >> >> "Jakub Jelinek" wrote: >> >> > On Fri, May 26, 2023 at 10:59:03AM +0200, Stefan Kanthak wrote: >> >> 3) SSE4.1 is supported since Core2, but -marc

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jakub Jelinek" wrote: > On Fri, May 26, 2023 at 10:59:03AM +0200, Stefan Kanthak wrote: >> 3) SSE4.1 is supported since Core2, but -march=core2 fails to enable it. >>That's bad, REALITY CHECK, please! > > You're wrong. > SSE4.1 first appe

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jakub Jelinek" wrote: > On Fri, May 26, 2023 at 10:59:03AM +0200, Stefan Kanthak wrote: >> 3) SSE4.1 is supported since Core2, but -march=core2 fails to enable it. >>That's bad, REALITY CHECK, please! > > You're wrong. > SSE4.1 first appe

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023 at 09:00, Stefan Kanthak wrote: >> >> "Jonathan Wakely" wrote: >> >> > On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, wrote: >> > >> >> On Thu, May 25, 2023 at 11:56 PM S

Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread Stefan Kanthak
"Jonathan Wakely" wrote: > On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, wrote: > >> On Thu, May 25, 2023 at 11:56 PM Stefan Kanthak >> wrote: >>> >>> Hi, >>> >>> compile the following function on a system with Core

Will GCC eventually support SSE2 or SSE4.1?

2023-05-25 Thread Stefan Kanthak
# ret 14 instructions in 33 bytes # 11 instructions in 32 bytes OUCH: why does GCC abuse EBX (and ECX too) and perform a superfluous memory write? Stefan Kanthak

Re: struct sockaddr_storage

2023-01-22 Thread Stefan Puiu via Gcc
Hi Alex, On Fri, Jan 20, 2023 at 2:40 PM Alejandro Colomar wrote: > > Hi Stefan, > > On 1/20/23 11:06, Stefan Puiu wrote: > > Hi Alex, > > > > On Thu, Jan 19, 2023 at 4:14 PM Alejandro Colomar > > wrote: > >> > >> Hi! > >> > &g

Re: struct sockaddr_storage

2023-01-20 Thread Stefan Puiu via Gcc
he size, I guess it might matter if you want to port your code to AIX, Solaris, OpenBSD etc. I don't think all software is meant to be portable, though (or portable to those platforms). Maybe a warning is in order that, for portable code, developers should check its size on the other platforms t

Re: B^HDEAD code generation (AMD64)

2023-01-09 Thread Stefan Kanthak
"Thomas Koenig" wrote: > On 09.01.23 12:35, Stefan Kanthak wrote: >> 20 superfluous instructions of the total 102 instructions! > > The proper place for bug reports is https://gcc.gnu.org/bugzilla/ . OUCH: there's NO proper place for bugs at all! > Feel fre

Re: Widening multiplication, but no narrowing division [i386/AMD64]

2023-01-09 Thread Stefan Kanthak
"Paul Koning" wrote: >> On Jan 9, 2023, at 10:20 AM, Stefan Kanthak wrote: >> >> "Paul Koning" wrote: >> >>>> On Jan 9, 2023, at 7:20 AM, Stefan Kanthak wrote: >>>> >>>> Hi, >>>> >>>>

Re: Widening multiplication, but no narrowing division [i386/AMD64]

2023-01-09 Thread Stefan Kanthak
"Paul Koning" wrote: >> On Jan 9, 2023, at 7:20 AM, Stefan Kanthak wrote: >> >> Hi, >> >> GCC (and other C compilers too) support the widening multiplication >> of i386/AMD64 processors, but DON'T support their narrowing division: > >

Re: Widening multiplication, but no narrowing division [i386/AMD64]

2023-01-09 Thread Stefan Kanthak
LIU Hao wrote: >On 2023/1/9 20:20, Stefan Kanthak wrote: >> Hi, >> >> GCC (and other C compilers too) support the widening multiplication >> of i386/AMD64 processors, but DON'T support their narrowing division: >> >> > > QWORD-DWORD division would c

Widening multiplication, but no narrowing division [i386/AMD64]

2023-01-09 Thread Stefan Kanthak
ret .end JFTR: dependent on the magnitude of the numbers and the processor it MIGHT be better to omit comparison and branch: there's a trade-off between the latency of the (un-pipelined) division instruction and the latency of the conditional branch due to misprediction. Stefan Kanthak

EPIC optimiser failures (i386)

2023-01-09 Thread Stefan Kanthak
sub eax, DWORD PTR [esp+4] .endif seto ah setz al sub al, ah # al = ZF - OF .if 0 cbw cwde .else movsx eax, al .endif ret Stefan Kanthak

B^HDEAD code generation (AMD64)

2023-01-09 Thread Stefan Kanthak
re's no need to modify ECX! cmovne rdx, rax cmovne rax, rsi ret .L9: mov rax, rsi mov rdx, rdi .L1: ret .L14: mov r8, r9 xor r9d, r9d mov rcx, r8 jmp .L4 20 superfluous instructio

B^HDEAD code generation (i386)

2023-01-09 Thread Stefan Kanthak
pop esi pop edi pop ebp ret .L9: mov ebx, edi # Ouch: GCC likes to play shell games! mov ecx, esi # mov edx, ebx # mov eax, ecx # pop ebx pop esi pop

Re: How to debug while using LTO?

2022-11-30 Thread Stefan Schulze Frielinghaus via Gcc
On Thu, Nov 24, 2022 at 05:53:53PM +0100, Richard Biener wrote: > > > > On 24.11.2022 at 17:28, Stefan Schulze Frielinghaus via Gcc > > wrote: > > > > Hi everyone, > > > > Currently I'm looking into a wrong-code bug and would like to unders

How to debug while using LTO?

2022-11-24 Thread Stefan Schulze Frielinghaus via Gcc
ually didn't expect that because I added -save-temps to all the intermediate commands which is also reflected in the environment variable COLLECT_GCC_OPTIONS. Thus, how do you keep temporary files? Cheers, Stefan

Re: Setting up editors for the GNU/GCC coding style?

2022-08-01 Thread Stefan Schulze Frielinghaus via Gcc
On Mon, Aug 01, 2022 at 12:25:21PM +0100, Jonathan Wakely wrote: > On Mon, 1 Aug 2022 at 09:24, Stefan Schulze Frielinghaus wrote: > > I gave unexpand from GNU coreutils 8.32 a try. Looks like it cannot > > deal with form feeds or maybe I'm missing something? > > > &

Re: Setting up editors for the GNU/GCC coding style?

2022-08-01 Thread Stefan Schulze Frielinghaus via Gcc
On Thu, Jul 28, 2022 at 08:53:37PM +0100, Jonathan Wakely via Gcc wrote: > On Thu, 28 Jul 2022 at 20:49, Tim Lange wrote: > > > > > > > > On Thu, Jul 28 2022 at 02:46:58 PM -0400, David Malcolm via Gcc > > wrote: > > > Is there documentation on setting up text editors to work with our > > > coding

Re: On(c)e more: optimizer failure

2021-08-27 Thread Stefan Kanthak
R incurs two cycles penalty on many Intel processors! Better use XORPD there. Stefan

Re: On(c)e more: optimizer failure

2021-08-23 Thread Stefan Kanthak
Gabriel Ravier wrote: > On 8/23/21 3:46 PM, Stefan Kanthak wrote: >> JFTR: do you consider your wild speculations to be on-topic here? > > I suppose I should apologize: I did not intend to make any accusations > here. No need to, I can stand a little heat. [...] > I

Re: On(c)e more: optimizer failure

2021-08-23 Thread Stefan Kanthak
Gabriel Ravier wrote: > On 8/22/21 11:22 PM, Stefan Kanthak wrote: [ 2bugzilla | !2bugzilla ] >> You (and everybody else) is free to use GCC bugzilla. >> Everybody and me is but also free NOT to use GCC bugzilla. >> >> Stefan > > Yes, you are free not

Re: On(c)e more: optimizer failure

2021-08-22 Thread Stefan Kanthak
Gabriel Ravier wrote: > On 8/21/21 10:19 PM, Stefan Kanthak wrote: >> Jakub Jelinek wrote: [...] >>> GCC doesn't do value range propagation of floating point values, not even >>> the special ones like NaNs, infinities, +/- zeros etc., and without that the &

Re: On(c)e more: optimizer failure

2021-08-21 Thread Stefan Kanthak
Jakub Jelinek wrote: > On Sat, Aug 21, 2021 at 09:40:16PM +0200, Stefan Kanthak wrote: >> > I believe your example doesn't take into account that the values can be NaN >> > which compares false in all situations. >> >> That's a misbelief! >> P

Re: On(c)e more: optimizer failure

2021-08-21 Thread Stefan Kanthak
nt: https://godbolt.org/z/1ra7zcsnd Replace if (isnan(argx) || isnan(argy)) return argx + argy; with if ((argx != argx) || (argy != argy)) return argx + argy; then feed the changed snippet to compiler explorer again, with and without -ffast-math Stefan > --matt > > On Sat, Aug
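The replacement suggested in the snippet above relies on the IEEE 754 property that NaN is the only value that compares unequal to itself. A minimal sketch of the two variants follows; the function names and the `fallback` parameter are hypothetical and not taken from the thread, which only shows the two `if` statements:

```c
#include <math.h>

/* Hypothetical helper, variant 1: NaN test via the isnan() macro. */
double sum_if_nan_isnan(double argx, double argy, double fallback)
{
    if (isnan(argx) || isnan(argy))
        return argx + argy;      /* the NaN propagates through the addition */
    return fallback;             /* hypothetical stand-in for the remaining code */
}

/* Hypothetical helper, variant 2: NaN test via self-comparison. */
double sum_if_nan_selfcmp(double argx, double argy, double fallback)
{
    if ((argx != argx) || (argy != argy))
        return argx + argy;      /* only a NaN compares unequal to itself */
    return fallback;             /* hypothetical stand-in for the remaining code */
}
```

Without -ffast-math both variants should behave identically. With -ffast-math (which enables -ffinite-math-only) the compiler is allowed to assume that NaNs never occur, so either check may be optimised away; comparing the generated code for both variants is the experiment the message proposes.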

On(c)e more: optimizer failure

2021-08-21 Thread Stefan Kanthak
ret .L19: addsd %xmm1, %xmm0 ret .LC1: .long 0 .long 1072693248 Stefan

Re: 3rd deficiency (was: Superfluous branches due to insufficient flow analysis)

2021-08-14 Thread Stefan Kanthak
Gabriel Ravier wrote: Independent from the defunct flow analysis in the presence of NaNs, my example demonstrates another minor deficiency: know thy instruction set! See the comments in the assembly below. > On 8/13/21 8:58 PM, Stefan Kanthak wrote: >> Hi, >> >> compil

Re: Superfluous branches due to insufficient flow analysis

2021-08-14 Thread Stefan Kanthak
"Gabriel Ravier" wrote: Please don't FULL QUOTE! > On 8/13/21 8:58 PM, Stefan Kanthak wrote: >> Hi, >> >> compile the following naive implementation of nextafter() for AMD64: >> >> JFTR: ignore the aliasing casts, they don't matter here!

Superfluous branches due to insufficient flow analysis

2021-08-13 Thread Stefan Kanthak
movapd %xmm1, %xmm0 ret .L15: jne .L4 movabsq $-9223372036854775808, %rdx movq %xmm1, %rax andq %rdx, %rax orq $1, %rax movq %rax, %xmm0 ret Stefan

Re: Suboptimal code generated for __buitlin_trunc on AMD64 without SS4_4.1

2021-08-07 Thread Stefan Kanthak
Joseph Myers wrote: > On Fri, 6 Aug 2021, Stefan Kanthak wrote: PLEASE DON'T STRIP ATTRIBUTION LINES: I did not write the following paragraph! >> > I don't know what the standard says about NaNs in this case, I seem to >> > remember that arithmetic instructions

Optimizer failure

2021-08-07 Thread Stefan Kanthak
eax testl %eax, %eax js L0 leal 1(%eax), %edx movl $0, %eax # SUPERFLUOUS: cmovne %edx, %eax # cmovne is only executed if eax was not 0 ret L0: subl $1, %eax ret regards Stefan

Re: Suboptimal code generated for __buitlin_trunc on AMD64 without SS4_4.1

2021-08-06 Thread Stefan Kanthak
Richard Biener wrote: > On August 6, 2021 4:32:48 PM GMT+02:00, Stefan Kanthak > wrote: >>Michael Matz wrote: >>> Btw, have you made speed measurements with your improvements? >> >>No. [...] >>If the constant happens to be present in L1 cache, it MAY lo

Re: Suboptimal code generated for __buitlin_trunc on AMD64 without SS4_4.1

2021-08-06 Thread Stefan Kanthak
Gabriel Paubert wrote: > On Fri, Aug 06, 2021 at 02:43:34PM +0200, Stefan Kanthak wrote: >> Gabriel Paubert wrote: >> >> > Hi, >> > >> > On Thu, Aug 05, 2021 at 01:58:12PM +0200, Stefan Kanthak wrote: [...] >> >> The whole idea

Re: Suboptimal code generated for __buitlin_trunc on AMD64 without SS4_4.1

2021-08-06 Thread Stefan Kanthak
Michael Matz wrote: > Hello, > > On Fri, 6 Aug 2021, Stefan Kanthak wrote: > >> For -ffast-math, where the sign of -0.0 is not handled and the spurios >> invalid floating-point exception for |argument| >= 2**63 is acceptable, > > This claim would need to be p

Re: Suboptimal code generated for __buitlin_trunc on AMD64 without SS4_4.1

2021-08-06 Thread Stefan Kanthak
Gabriel Paubert wrote: > Hi, > > On Thu, Aug 05, 2021 at 01:58:12PM +0200, Stefan Kanthak wrote: >> Gabriel Paubert wrote: >> >> >> > On Thu, Aug 05, 2021 at 09:25:02AM +0200, Stefan Kanthak wrote: >> >> .intel_

Re: Suboptimal code generated for __buitlin_trunc on AMD64 without SS4_4.1

2021-08-05 Thread Stefan Kanthak
Gabriel Paubert wrote: > On Thu, Aug 05, 2021 at 09:25:02AM +0200, Stefan Kanthak wrote: >> Hi, >> >> targeting AMD64 alias x86_64 with -O3, GCC 10.2.0 generates the >> following code (13 instructions using 57 bytes, plus 4 quadwords >> using 32 bytes) for _

Suboptimal code generated for __buitlin_floor on AMD64 without SS4_4.1

2021-08-05 Thread Stefan Kanthak
66 0f 73 d0 3f psrlq xmm0, 63 37: 66 0f 73 f0 3f psllq xmm0, 63 # xmm0 = (argument & -0.0) ? -0.0 : 0.0 3c: 66 0f 56 c3 orpd xmm0, xmm3 # xmm0 = floor(argument) 40: c3 .L0: ret .end regards Stefan

Suboptimal code generated for __buitlin_trunc on AMD64 without SS4_4.1

2021-08-05 Thread Stefan Kanthak
# xmm0 = (argument & -0.0) ? -0.0 : 0.0 1c: 66 0f 56 c1 orpd xmm0, xmm1 # xmm0 = trunc(argument) 20: c3 .L0: ret .end regards Stefan

Suboptimal code generated for __buitlin_ceil on AMD64 without SS4_4.1

2021-08-05 Thread Stefan Kanthak
6 0f 56 c3 orpd xmm0, xmm3 # xmm0 = ceil(argument) 40: c3 .L0: ret .end regards Stefan

Suboptimal code generated for __buitlin_rint on AMD64 without SS4_4.1

2021-08-05 Thread Stefan Kanthak
6 c1 orpd xmm0, xmm1 # xmm0 = round(argument) 20: c3 .L0: ret .end regards Stefan

Re: Are some builtin functions (for example log() vs. sqrt()) more equal than others?

2021-07-30 Thread Stefan Kanthak
"Joseph Myers" wrote: > On Fri, 30 Jul 2021, Stefan Kanthak wrote: > >> Joseph Myers wrote: >> >> > None of these are valid constant expressions as defined by the standard >> > (constant expressions cannot involve evaluated function calls). >

Re: Are some builtin functions (for example log() vs. sqrt()) more equal than others?

2021-07-30 Thread Stefan Kanthak
before calling the main() routine. JFTR: doing so would but inhibit the placement of such constants in the read-only data section ... what is also allowed by the standard. regards Stefan

Are some builtin functions (for example log() vs. sqrt()) more equal than others?

2021-07-30 Thread Stefan Kanthak
on log(sqrt(5.0) * 0.5 + 0.5)! NOT amused Stefan Kanthak

Optimiser failure for ternary foo == 0L ? NULL : bar;

2021-07-17 Thread Stefan Kanthak
e: 66 90 xchg %ax,%ax 10: 31 c0 xor %eax,%eax 12: c3 ret not amused Stefan Kanthak

[libgcc2.c] Implementation of __bswapsi2()

2020-11-12 Thread Stefan Kanthak
nt w) { return (v >> (31 & w)) | (v << (31 & -w)); } int __bswapsi2 (int u) // should better be unsigned __bswapsi2 (unsigned u)! { return __rotlsi3 (u & 0xff00ff00, 8) | __rotrsi3 (u & 0x00ff00ff, 8); } Stefan Kanthak PS: reimplementing __bswapdi2(
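The rotate-based byte swap quoted in the snippet above can be sketched as self-contained C. This is an illustration under stated assumptions, not the code from the message: the names `rotl32_sketch`, `rotr32_sketch` and `bswap32_sketch` are hypothetical stand-ins, and fixed-width unsigned types are used throughout, in line with the message's remark that the argument should better be unsigned.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit rotate left; masking the count with 31
   keeps both shift amounts in range, so a count of 0 is well defined. */
static uint32_t rotl32_sketch(uint32_t v, unsigned w)
{
    return (v << (31 & w)) | (v >> (31 & -w));
}

/* Hypothetical stand-in for a 32-bit rotate right, built the same way. */
static uint32_t rotr32_sketch(uint32_t v, unsigned w)
{
    return (v >> (31 & w)) | (v << (31 & -w));
}

/* Byte swap via two masked rotates: bytes 3 and 1 rotate left by 8 and land
   in positions 0 and 2; bytes 2 and 0 rotate right by 8 and land in
   positions 1 and 3. */
uint32_t bswap32_sketch(uint32_t u)
{
    return rotl32_sketch(u & 0xff00ff00u, 8)
         | rotr32_sketch(u & 0x00ff00ffu, 8);
}
```

The appeal of this form is that compilers commonly recognise the masked shift pair as a single rotate instruction, so the whole swap needs two rotates, two masks and one OR on targets that lack a dedicated byte-swap instruction.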

Re: [__mulvti3] register allocator plays shell game

2020-10-27 Thread Stefan Kanthak
Richard Biener wrote: > On Tue, Oct 27, 2020 at 12:01 AM Stefan Kanthak > wrote: >> >> Richard Biener wrote: >> >>> On Sun, Oct 25, 2020 at 8:37 PM Stefan Kanthak >>> wrote: >>>> >>>> Hi, >>>> >>>> fo

Re: Recognizing loop pattern

2020-10-27 Thread Stefan Schulze Frielinghaus via Gcc
On Mon, Oct 26, 2020 at 01:46:52PM +0100, Richard Biener wrote: > On Mon, Oct 26, 2020 at 10:59 AM Stefan Schulze Frielinghaus via Gcc > wrote: > > > > I'm trying to detect loops of the form > > > > while (*x != y) > > ++x; > > > > which

Re: [__mulvti3] register allocator plays shell game

2020-10-26 Thread Stefan Kanthak
Richard Biener wrote: > On Sun, Oct 25, 2020 at 8:37 PM Stefan Kanthak > wrote: >> >> Hi, >> >> for the AMD64 alias x86_64 platform and the __int128_t [DW]type, >> the first few lines of the __mulvDI3() function from libgcc2.c >> >

Recognizing loop pattern

2020-10-26 Thread Stefan Schulze Frielinghaus via Gcc
such loops? Any comments? Cheers, Stefan

[__mulvti3] register allocator plays shell game

2020-10-25 Thread Stefan Kanthak
, 63 cmp r8, rsi jne __mulvti3+0x48+65-31 cmp r9, rcx jne __mulvti3+0xa0+65-31 mov rax, rdi imul rdx ret ... not amused Stefan Kanthak

[Patch] Overflow-trapping integer arithmetic routines7code: bloated and slooooow

2020-10-05 Thread Stefan Kanthak
.de/gcc.html> for some examples. The attached diff/patch provides better implementations. Stefan libgcc2.diff Description: Binary data

UB or !UB? Plus poor register allocation

2020-10-01 Thread Stefan Kanthak
The following source implements the __absv?i2() functions (see <https://gcc.gnu.org/onlinedocs/gccint/Integer-library-routines.html>) for 32-bit, 64-bit and 128-bit integers in 3 different ways: --- ub_or_!ub.c --- // Copyleft 2014-2020, Stefan Kanthak #ifdef __amd64__ __int128_t __a

Missed optimisation in __udivmoddi4 of libgcc2

2020-09-13 Thread Stefan Kanthak
ng? (I use the variable names from the C source instead of register names here) mov %r11d, %ecx shld %cl, d0, d1 xor n2, n2 shld %cl, n1, n2 shld %cl, n0, n1 JFTR: the test at b8 is superfluous. regards Stefan

Re: Peephole optimisation: isWhitespace()

2020-08-25 Thread Stefan Kanthak
ken" or "almost never taken" may help the | processor better predict the remaining branches. JFTR: I didn't know his article before, but I hope that you are willing to learn. Stefan

Re: Peephole optimisation: isWhitespace()

2020-08-24 Thread Stefan Kanthak
"Richard Biener" wrote: > On Mon, Aug 24, 2020 at 1:22 PM Stefan Kanthak > wrote: >> >> "Richard Biener" wrote: >> >> > On Mon, Aug 17, 2020 at 7:09 PM Stefan Kanthak >> > wrote: >> >> >> >> "Al

Re: Peephole optimisation: isWhitespace()

2020-08-24 Thread Stefan Kanthak
"Richard Biener" wrote: > On Mon, Aug 17, 2020 at 7:09 PM Stefan Kanthak > wrote: >> >> "Allan Sandfeld Jensen" wrote: >> >> > On Friday, 14 August 2020 18:43:12 CEST Stefan Kanthak wrote: >> >> Hi @ll, >> >>

Re: Peephole optimisation: isWhitespace()

2020-08-17 Thread Stefan Kanthak
"Allan Sandfeld Jensen" wrote: > On Friday, 14 August 2020 18:43:12 CEST Stefan Kanthak wrote: >> Hi @ll, >> >> in his ACM queue article <https://queue.acm.org/detail.cfm?id=3372264>, >> Matt Godbolt used the function >> >> | b

Re: Peephole optimisation: isWhitespace()

2020-08-17 Thread Stefan Kanthak
"Nathan Sidwell" > On 8/16/20 9:54 AM, Stefan Kanthak wrote: >> "Nathan Sidwell" wrote: [...] >>> Have you benchmarked it? >> >> Of course! Did you? [...] > you seem very angry about being asked for data. As much as you hallucinated

Re: Peephole optimisation: isWhitespace()

2020-08-16 Thread Stefan Kanthak
"Nathan Sidwell" wrote: > On 8/14/20 12:43 PM, Stefan Kanthak wrote: >> Hi @ll, >> >> in his ACM queue article <https://queue.acm.org/detail.cfm?id=3372264>, >> Matt Godbolt used the function >> >> | bool isWhitespace(char c)

Peephole optimisation: isWhitespace()

2020-08-14 Thread Stefan Kanthak
') shr eax, cl ; eax >>= (c % ' ') xor edx, edx cmp ecx, 33 ; CF = c <= ' ' adc edx, edx ; edx = (c <= ' ') and eax, edx ret regards Stefan Kanthak

Almost an order of magnitude faster __udimodti4() for AMD64

2020-08-10 Thread Stefan Kanthak
tps://skanthak.homepage.t-online.de/integer.html#as-5>, as well as (after trivial editing) __udivdi3() from <https://skanthak.homepage.t-online.de/integer.html#ml-1> and __divdi3() from <https://skanthak.homepage.t-online.de/integer.html#ml-2> regards Stefan

make static method find_reloads_address_1(...) extern accessible

2019-09-30 Thread stefan
ch would require it? Regards Stefan

Re: Bug in divmodhi4(), plus poor inperformant code

2018-12-06 Thread Stefan Kanthak
"Segher Boessenkool" wrote: > On Wed, Dec 05, 2018 at 02:19:14AM +0100, Stefan Kanthak wrote: >> "Paul Koning" wrote: >> >> > Yes, that's a rather nasty cut & paste error I made. >> >> I suspected that. >> Replacing &g
