Insn canonicalization not only with constant
Hi,

Although I have been porting and using gcc for quite a while now, I am still a newbie at the internals and would be grateful if you could help me.

I have designed a CPU architecture where most of the instructions only accept data operands as registers and no immediate values are allowed, which is causing me some trouble in gcc. One of the problems is instruction canonicalization.

I have some special single instructions to execute operations which on other processors would take several instructions, e.g. scaling of 64-bit into 32-bit using pre-set, let's say something that looks like:

    (define_insn "scale_28_4"
      [(set (match_operand:SI 0 "register_operand" "=r")
            (ior:SI (ashift:SI (match_operand:SI 1 "register_operand" "r")
                               (const_int 28))
                    (lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
                                 (const_int 4))))]
      ""
      "SCALE_28_4 tout= %0 in1= %1 tin2= %2"
      [(set_attr "type" "logic")
       (set_attr "length" "1")])

Instruction canonicalization doesn't work, since, as explained in http://gcc.gnu.org/onlinedocs/gccint/Insn-Canonicalizations.html, it only works if the second operand is a constant. Is it possible to change this to make register operands valid for canonicalization as well?

Does the same problem affect 'mem' instructions with an offset, so that gcc canonicalizes only the ones with a constant offset?

Any help is greatly appreciated!

Sami
Re: Insn canonicalization not only with constant
Hi Rask,

Basically the CPU has the 'SCALE_28_4' instruction, which does the following:

    output = (operand1 >> 28) | (operand2 << 4)

From my understanding the OR operation (ior) doesn't get canonicalized since its second operand, in this case

    (lshiftrt:SI (match_operand:SI 2 "register_operand" "r") (const_int 4))

is not a constant.

Sami
Re: Insn canonicalization not only with constant
Hi Andrew,

You mean using a DI rotate left by 4 and then saving the output as SI (saving the hi part and ignoring the low one)?

Also, how is canonicalization detected anyway? Are there rules that gcc follows? How can they be changed?

Sami

Andrew Pinski wrote:
> > output = (operand1 >> 28) | (operand2 << 4)
> Isn't that a rotate? If so you can use either rotate or rotatert instead.
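To make the rotate question concrete, here is a minimal C sketch of the operation exactly as the formula in the thread states it. The function names are hypothetical and only model the arithmetic, not anything inside gcc. It suggests that with two different input registers the result is the high word of a 64-bit left shift by 4, and that it only collapses to a 32-bit rotate when both inputs are the same value.

```c
#include <stdint.h>

/* The operation as written in the thread:
   output = (operand1 >> 28) | (operand2 << 4). */
static uint32_t scale_28_4(uint32_t op1, uint32_t op2)
{
    return (op1 >> 28) | (op2 << 4);
}

/* Hypothetical DI-mode view: form the 64-bit value op2:op1,
   shift it left by 4, and keep only the high 32 bits. */
static uint32_t scale_via_di_shift(uint32_t op1, uint32_t op2)
{
    uint64_t wide = ((uint64_t)op2 << 32) | op1;
    return (uint32_t)((wide << 4) >> 32);
}

/* Plain 32-bit rotate left, for comparison with the single-input case. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}
```

For example, scale_28_4(x, x) gives the same value as rotl32(x, 4), while scale_28_4(a, b) with a != b matches scale_via_di_shift(a, b) rather than any 32-bit rotate.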
Re: Insn canonicalization not only with constant
OK, I see what you mean. The reason you can get both (ior (ashift ...) (lshiftrt ...)) and (ior (lshiftrt ...) (ashift ...)) is that simplify-rtx.c has no rule to canonicalize such expressions and that LSHIFTRT and ASHIFT have the same precedence.

Hmm, in simplify_binary_operation_1(), it says:

    /* Convert (ior (ashift A CX) (lshiftrt A CY)) where CX+CY equals the
       mode size to (rotate A CX). */

OK, so that means that in that specific shift example I could get away with a rotate operation (even though it has to be of mode DI -> SI).

Right after that is code to make sure ASHIFT is the first operand for the simplification attempts that follow. You could try adding code to do this in general, but I don't know where such code should be added.

I will look more into this. It might be that there is no simple way to activate canonicalization for the general case (i.e. any insn that is defined in the machine description), and maybe it has to be done for every specific type of operation.

Btw, I found this in rtlanal.c:

    int commutative_operand_precedence (rtx op)
    : :

It seems like commutative_operand_precedence() is only used twice to swap operand1 and operand2, so the fact that it returns low values (or high, since the comment in the code seems wrong) for general operands shouldn't affect the ability to canonicalize them.

Sami
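As a rough illustration of the precedence argument, here is a toy C model of "swap the operands of a commutative operation when the first has lower precedence than the second". Everything in it is an assumption made for illustration: the enum, the function names and the numeric values are invented and are not gcc's rtx codes or the values returned by commutative_operand_precedence(). The only point is that two operand kinds with equal precedence never trigger a swap, so both orderings survive.

```c
#include <stdbool.h>

/* Toy operand kinds; NOT gcc's enum rtx_code. */
enum toy_code { TOY_CONST_INT, TOY_REG, TOY_ASHIFT, TOY_LSHIFTRT };

/* Invented precedence values (assumption): constants rank lowest so they
   end up as the second operand; the two shifts deliberately tie. */
static int toy_precedence(enum toy_code code)
{
    switch (code) {
    case TOY_CONST_INT: return -5;
    case TOY_REG:       return -1;
    case TOY_ASHIFT:
    case TOY_LSHIFTRT:  return 2;
    }
    return 0;
}

/* Swap only when the first operand ranks strictly lower than the second. */
static bool toy_swap_operands_p(enum toy_code first, enum toy_code second)
{
    return toy_precedence(first) < toy_precedence(second);
}
```

In this model (ior const reg) is reordered to (ior reg const), but neither (ior ashift lshiftrt) nor (ior lshiftrt ashift) is touched, which is the situation described above for the two shift codes.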
Full comparison in 'cbranchsi4' leads to error in gcc 4.0
Hi,

I am porting gcc (version 4.0) to a CPU that supports conditional jumps but does not have a CC register. I have combined the comparison and jump operations in the definition of "cbranchsi4", as shown at the end of this message. This works fine on gcc 3.4; on gcc 4.0, however, it causes an error during optimization.

According to my investigation, the error occurs when there is a division by a constant power of 2 which needs to be transformed into shifting. The error generated is:

    internal compiler error: in emit_cmp_and_jump_insn_1, at optabs.c:3599

The definition of cbranchsi4 in the machine description:

    (define_insn "cbranchsi4"
      [(set (pc)
            (if_then_else
              (match_operator 0 "comparison_operator"
                [(match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "nonmemory_operand" "r")])
              (label_ref (match_operand 3 "" ""))
              (pc)))]
      ""
      "c%C0jump %1 %2 %3"
      [(set_attr "type" "branch")
       (set_attr "length" "1")])

Regards,
Sami Khawam
The University of Edinburgh
http://www.see.ed.ac.uk/~sxk
Re: emit_conditional_move w/o compare insn (was: Full comparison in 'cbranchsi4' leads to error in gcc 4.0)
Hi Gary,

Thanks a lot for the tip. After debugging, the problem seems to be coming from 'emit_conditional_move', which is called by 'expand_sdiv_pow2' when converting division by constants into shifting.

The problem is that the architecture I have does not support standalone compare operations: it only has conditional move and conditional branch instructions (the compare operation is part of these instructions). To make a conditional move, 'emit_conditional_move' generates a separate compare insn and then a move insn, maybe hoping that these will later be combined into a single instruction during the optimization steps.

Would it be possible to tell this to emit_conditional_move so that it generates the cmove insn directly?

Many Thanks,
Sami Khawam

Gary Funck wrote:
> > This works fine on gcc 3.4, however on gcc 4.0 it creates an error during optimization. According to my investigation, the error occurs when there is a division by a constant power of 2 which needs to be transformed into shifting. The error generated is:
> >
> >     internal compiler error: in emit_cmp_and_jump_insn_1, at optabs.c:3599
>
> The easiest thing to do is to debug gcc: set a breakpoint on fancy_abort, and go up a few levels to emit_cmp_and_jump_insn_1(). Note the incoming rtx args (x and y) and mode. From the looks of the code in there, it is looking for an instruction pattern that matches, and when no match is found, it tries a wider mode, until there are no wider modes, then it aborts.
>
> You need to find the mode and rtx arguments that are being passed in, and then understand why no matching instruction is found. For example, in your instruction pattern,
>
>     (define_insn "cbranchsi4"
>       [(set (pc)
>             (if_then_else
>               (match_operator 0 "comparison_operator"
>                 [(match_operand:SI 1 "register_operand" "r")
>                  (match_operand:SI 2 "nonmemory_operand" "r")])
>               (label_ref (match_operand 3 "" ""))
>               (pc)))]
>       ""
>       "c%C0jump %1 %2 %3"
>       [(set_attr "type" "branch")
>        (set_attr "length" "1")])
>
> it isn't prepared to match a memory operand.
>
> Perhaps the optimizer pre-calculated a constant, and targeted the constant into memory rather than a register? In that case, there will be no match on the third argument because it is expecting a "nonmemory_operand".
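For readers wondering why a division by a power of 2 needs a conditional move at all, here is a minimal C sketch of the usual shift-based sequence. It is only an illustration of the arithmetic, not a transcription of expand_sdiv_pow2; the function name is hypothetical, and it assumes 0 <= k <= 30 and that right-shifting a negative int32_t is an arithmetic shift (implementation-defined in C, but the common behaviour).

```c
#include <stdint.h>

/* Signed division by 2^k that rounds toward zero, like C's '/'.
   Negative inputs need a bias of (2^k - 1) added before the shift;
   selecting that bias is where a compare plus conditional move
   (or a branch) comes in. */
static int32_t sdiv_pow2(int32_t x, unsigned k)
{
    /* Conditional select: bias only when x is negative. */
    int32_t bias = (x < 0) ? (int32_t)((1u << k) - 1u) : 0;

    /* Assumes arithmetic right shift for negative values. */
    return (x + bias) >> k;
}
```

Without the bias, -7 >> 2 would give -2 instead of the -1 that -7 / 4 requires, so the expansion has to test the sign of the dividend somewhere.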