For x86_64, this is the generated code:

src_1(bool, bool):
        cmp     dil, sil
        setb    al
        ret

tgt_1(bool, bool):
        xor     edi, 1
        mov     eax, edi
        and     eax, esi
        ret
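
For reference, here is a minimal C++ sketch of what the two functions compute. The function names follow the assembly labels, but the bodies are my reading of the instructions (cmp + setb is an unsigned "below" comparison, xor 1 + and is a logical not followed by an and). They are an assumption, not necessarily the exact source behind the listing. The static_assert checks all four input combinations at compile time.

```cpp
// Hypothetical C++ source reconstructed from the assembly above.

// src_1: cmp dil, sil + setb al computes the unsigned comparison a < b.
constexpr bool src_1(bool a, bool b) { return a < b; }

// tgt_1: xor edi, 1 negates a; and eax, esi combines it with b.
constexpr bool tgt_1(bool a, bool b) { return !a & b; }

// Exhaustive equivalence check over the four possible inputs.
constexpr bool equivalent() {
    for (int a = 0; a < 2; ++a)
        for (int b = 0; b < 2; ++b)
            if (src_1(a != 0, b != 0) != tgt_1(a != 0, b != 0))
                return false;
    return true;
}

static_assert(equivalent(), "src_1 and tgt_1 must agree on all inputs");
```

Both forms are true only for a == false and b == true.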


Let's look at the latency of src_1:
cmp: latency of 1 (page 663, table C-17).
setb: latency of 2. The Intel optimization manual does not report a latency 
for setb, but the closest instruction, setbe, has a latency of 2.

For tgt_1:
xor: latency of 1.
mov: latency of 1. (But x86_64 processors seem to optimize this instruction, 
so it is effectively latency 0 in this case. The "Zero-Latency MOV 
Instructions" section explains it [1].)
and: latency of 1.

So even if you count setb as latency 1, the two sequences are equal (2 cycles 
each). But if setb has latency 2, tgt_1 is a 1-cycle latency win (2 cycles 
versus 3).
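
The arithmetic can be written out as a small sketch. The constants are the latencies quoted in this mail, treated as assumptions rather than measurements, and each sequence is summed as one serial dependency chain. The identifier names are made up for illustration.

```cpp
// Latencies in cycles as quoted in this mail (assumed, not measured).
constexpr int kCmp = 1;
constexpr int kXor = 1;
constexpr int kAnd = 1;
constexpr int kMovEliminated = 0;  // the zero-latency mov case

// src_1: cmp feeds setb, so the latencies add up.
constexpr int src_latency(int setb_latency) { return kCmp + setb_latency; }

// tgt_1: xor feeds mov, which feeds and.
constexpr int tgt_latency() { return kXor + kMovEliminated + kAnd; }

// With setb at 1 cycle, both sequences take 2 cycles.
static_assert(src_latency(1) == tgt_latency(), "equal when setb is 1 cycle");

// With setb at 2 cycles, tgt_1 is one cycle shorter.
static_assert(src_latency(2) - tgt_latency() == 1, "1-cycle win when setb is 2");
```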

1) 
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf

Best wishes,
Navid.

________________________________________
From: Jeff Law <jeffreya...@gmail.com>
Sent: Tuesday, November 23, 2021 11:14
To: Navid Rahimi; Navid Rahimi via Gcc-patches
Subject: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean 
comparison simplification



On 11/23/2021 11:34 AM, Navid Rahimi via Gcc-patches wrote:
> Hi GCC community,
>
> I would like you to take a quick look at this patch, which addresses this 
> bug [1]. The code example for the optimization [2] includes a link to a 
> proof of each individual optimization.
>
> I think it should be possible to use a simpler approach than what Andrew has 
> used here [3].
>
> P.S. Tested and verified on Linux x86_64.
>
> 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101808
> 2) https://compiler-explorer.com/z/Gc448eE3z
> 3) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101808#c1
Don't those match.pd patterns make things worse?  We're taking a single
expression evaluation (the conditional) and turning it into two logicals
AFAICT.

For the !x expression, obviously if x is a constant, then we can
compute that at compile time and we're going from a single conditional
to a single logical which is probably a win, but that's not the case
with this patch AFAICT.

jeff
