On Mon, Apr 30, 2012 at 10:09 AM, Uros Bizjak <[email protected]> wrote:
> On Fri, Apr 27, 2012 at 3:30 PM, Paolo Bonzini <[email protected]> wrote:
>> tzcnt is encoded as "rep;bsf" and unlike lzcnt is a drop-in replacement
>> if we don't care about the flags (it has the same semantics for non-zero
>> values).
>>
>> Since bsf is usually slower, just emit tzcnt unconditionally. However,
>> write it as rep;bsf unless -mbmi is in use, to cater for old assemblers.
>
> Please emit "rep;bsf" when optimize_insn_for_speed_p () is true.
>
>> Bootstrapped on a non-BMI x86_64-linux host, regtest in progress.
>> Ok for mainline?
>
> OK with the optimize_insn_for_speed_p conditional.
I have committed a similar patch: we emit bsf when optimizing for
size (saving a whopping one byte) and otherwise tzcnt, written as
rep;bsf for !TARGET_BMI. The same change can be applied to
*ffs<mode>_1, since we don't care what ends up in the register when
the input operand is 0 (this is the key difference between tzcnt and
bsf).
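
To illustrate why the substitution is safe, here is a minimal software model of the two instructions for a 32-bit operand. It is only a sketch: the function names (lowest_set_bit, model_bsf, model_tzcnt, models_agree) are hypothetical and are not part of GCC or of the patch. bsf leaves its destination architecturally undefined for a zero source, which the model represents by not writing the destination at all; tzcnt writes the operand width in that case.

```c
#include <assert.h>
#include <stdint.h>

#define OPERAND_BITS 32u

/* Index of the lowest set bit.  The caller must pass a non-zero value. */
static unsigned lowest_set_bit(uint32_t x)
{
    unsigned n = 0;

    assert(x != 0);
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Model of bsf.  For a zero source the real destination is undefined;
   as an assumption for illustration, the model leaves *dest untouched. */
static void model_bsf(uint32_t src, uint32_t *dest)
{
    if (src != 0)
        *dest = lowest_set_bit(src);
}

/* Model of tzcnt.  A zero source yields the operand width. */
static void model_tzcnt(uint32_t src, uint32_t *dest)
{
    *dest = (src != 0) ? lowest_set_bit(src) : OPERAND_BITS;
}

/* Returns 1 if both models write the same result for src, 0 otherwise.
   Both destinations start from the same arbitrary sentinel value. */
static int models_agree(uint32_t src)
{
    uint32_t a = 0xdeadbeefu;
    uint32_t b = 0xdeadbeefu;

    model_bsf(src, &a);
    model_tzcnt(src, &b);
    return a == b;
}
```

In this model models_agree() holds for every non-zero input and fails only for 0, which is exactly the case where ctz<mode>2 and *ffs<mode>_1 do not depend on the register contents.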

2012-05-06  Uros Bizjak  <[email protected]>
	    Paolo Bonzini  <[email protected]>

	* config/i386/i386.md (ctz<mode>2): Emit rep;bsf even for
	!TARGET_BMI and bsf when optimizing for size.
	(*ffs<mode>_1): Ditto.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.
Uros.
Index: i386.md
===================================================================
--- i386.md (revision 187217)
+++ i386.md (working copy)
@@ -12112,9 +12112,22 @@
(set (match_operand:SWI48 0 "register_operand" "=r")
(ctz:SWI48 (match_dup 1)))]
""
- "bsf{<imodesuffix>}\t{%1, %0|%0, %1}"
+{
+ if (optimize_function_for_size_p (cfun))
+ return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+ else if (TARGET_BMI)
+ return "tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
+ else
+ /* tzcnt expands to rep;bsf and we can use it even if !TARGET_BMI. */
+ return "rep; bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+}
[(set_attr "type" "alu1")
(set_attr "prefix_0f" "1")
+ (set (attr "prefix_rep")
+ (if_then_else
+ (match_test "optimize_function_for_size_p (cfun)")
+ (const_string "0")
+ (const_string "1")))
(set_attr "mode" "<MODE>")])
(define_insn "ctz<mode>2"
@@ -12123,14 +12136,21 @@
(clobber (reg:CC FLAGS_REG))]
""
{
- if (TARGET_BMI)
+ if (optimize_function_for_size_p (cfun))
+ return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+ else if (TARGET_BMI)
return "tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
- else
- return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+ else
+ /* tzcnt expands to rep;bsf and we can use it even if !TARGET_BMI. */
+ return "rep; bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
}
[(set_attr "type" "alu1")
(set_attr "prefix_0f" "1")
- (set (attr "prefix_rep") (symbol_ref "TARGET_BMI"))
+ (set (attr "prefix_rep")
+ (if_then_else
+ (match_test "optimize_function_for_size_p (cfun)")
+ (const_string "0")
+ (const_string "1")))
(set_attr "mode" "<MODE>")])
(define_expand "clz<mode>2"