On 27/04/12 11:49, Richard Guenther wrote:

Yes, it inlines it.  You may want to look at s390 which I believe has
a similar block-copy operation.

Richard.


I looked at s390. Although its block copy instruction seems similar, ours is much more restrictive: it expects its values in specific registers, whereas the s390 mvcle insn lets the register numbers be passed to the instruction.

I decided to try not to hardcode the registers in the instruction. Since the instruction requires specific registers as operands, I had to create one register class per register (each containing a single register) and then a register constraint for each of those classes. This turned out not to work: RA breaks even earlier than before. Here's what I did:

(define_expand "movmemqi"
  [(set (match_operand:BLK 0 "memory_operand")    ; destination
        (match_operand:BLK 1 "memory_operand"))   ; source
   (use (match_operand:QI 2 "general_operand"))]  ; count
  "!TARGET_NO_BLOCK_COPY && !reload_completed"
{
    rtx dst_addr = XEXP(operands[0], 0);
    rtx src_addr = XEXP(operands[1], 0);
    rtx dst_reg = gen_reg_rtx(QImode); /* will be forced into AH */
    rtx src_reg = gen_reg_rtx(QImode); /* will be forced into XL */
    rtx cnt_reg = gen_reg_rtx(QImode); /* will be forced into AL */

    emit_move_insn(cnt_reg, operands[2]);

    if(GET_CODE(dst_addr) == PLUS)
    {
        emit_move_insn(dst_reg, XEXP(dst_addr, 0));
        emit_insn(gen_addqi3(dst_reg, dst_reg, XEXP(dst_addr, 1)));
    }
    else
        emit_move_insn(dst_reg, dst_addr);

    if(GET_CODE(src_addr) == PLUS)
    {
        emit_move_insn(src_reg, XEXP(src_addr, 0));
        emit_insn(gen_addqi3(src_reg, src_reg, XEXP(src_addr, 1)));
    }
    else
        emit_move_insn(src_reg, src_addr);

    /* bc2 operands: 0 = count (AL), 1 = destination (AH), 2 = source (XL).  */
    emit_insn(gen_bc2(cnt_reg, dst_reg, src_reg));

    DONE;
})

(define_insn "bc2"
  [(set (match_operand:QI 0 "register_operand" "=l")
        (const_int 0))
   (set (mem:BLK (match_operand:QI 1 "register_operand" "=h"))
        (mem:BLK (match_operand:QI 2 "register_operand" "=x")))
   (set (match_dup 2)
        (plus:QI (match_dup 2) (match_dup 0)))
   (set (match_dup 1) (plus:QI (match_dup 1) (match_dup 0)))]
  "!TARGET_NO_BLOCK_COPY"
  "bc2")

Constraints l, h and x correspond to singleton classes for registers AL, AH and XL respectively. I think the problem here is the RA's inability to deal with such a constrained register set. I want to use our block copy instruction rather than disable movmemqi and setmemqi and fall back to calling memcpy, so is there anything I can try to tune the RA?
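
For reference, the side effects that the bc2 pattern above describes can be modelled in plain C. This is only a sketch of what the RTL says, not of the real hardware: the struct bc2_regs type, the bc2_model function, the 256-byte memory (suggested by the QImode addresses) and the forward byte-by-byte copy order are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical register file: only the three registers bc2 uses. */
struct bc2_regs {
    uint8_t al; /* byte count          (constraint "l") */
    uint8_t ah; /* destination address (constraint "h") */
    uint8_t xl; /* source address      (constraint "x") */
};

/* Model of the effects listed in the bc2 pattern, applied to an
   assumed 256-byte address space.  Addresses wrap modulo 256. */
static void bc2_model(uint8_t mem[256], struct bc2_regs *r)
{
    uint8_t count = r->al;

    /* (set (mem:BLK dst) (mem:BLK src)): copy count bytes, forward. */
    for (uint8_t i = 0; i < count; i++)
        mem[(uint8_t)(r->ah + i)] = mem[(uint8_t)(r->xl + i)];

    /* Both address registers advance by the old count... */
    r->ah = (uint8_t)(r->ah + count);
    r->xl = (uint8_t)(r->xl + count);

    /* ...and the count register ends up as zero. */
    r->al = 0;
}
```

In this model all three registers are both read and written, which is the information the pattern has to convey to the RA.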

--
PMatos
