Ilya Lesokhin schrieb: >> That's because splitting a volatile mem is not appropriate. You would >> have to volatile QI accesses instead of one volatile HI access. This >> means the two parts could move far away from each other. GCC is >> conservative about volatiles, so should be a backend. >> > > i haven't thought about that, thank you. > i just noticed that writing: > > OCR2RAH= 0; > OCR2RAL=1; > > generates better code, and i thought its better the compiler handles it, > rather then making the code uglier. > but as i now see it doesn't guarantee the right behavior with respect to > volatile. > > >>>> DONE; >>>>> } >>>>> else /* One of the ops has to be in a register. */ >>>>> if (!register_operand(operand0, HImode) >>>>> && !(register_operand(operand1, HImode) || const0_rtx == operands[1])) >>>>> { >>>>> operands[1] = copy_to_mode_reg(HImode, operand1); >>>>> } >>>>> }") >>>>> >>>>> but then i get Segmentation fault when trying to compile code. >>>>> >>>> You move expansion cannot work because you need at least a QI register >>>> (except in the case when you like to store 0, which is already handled >> in >>>> the implementation. >>>> >>>> Presumably, you want code like this: >>>> >>>> (set (reg:QI n) >>>> (const_int m)) >>>> (set (mem:HI (const_addr:P)) >>>> (zero_extend:HI (reg:QI n))) >>>> >>>> You can do it in expand, but note that expand is called at different >> stages >>>> of compilation. You need an intermediate register, so you must be sure >> you >>>> can actually get one, i.e. can_create_pseudo_p is true. >>>> >>>> As an alternative, you can hook in after insn combine. >>>> combine will synthesize such insns, so you can supply a combiner pattern >>>> that matches and split it prior to register allocation by means of a >> split >>>> pattern. >>>> >>>> Yet an other alternative is to extend zero_extendqihi to accept memory >>>> operand in dest. Note that there is different handlich for volatiles. >>> >>> i belive that spliting the instruction into 2 instractions is supperior >> No, splitting a volatile access is not appropriate. >> >>> since its also hadles the cases where the low byte is zero or when the >> low >>> byte and the high byte are equal as suppose to zero_extend. >> I don't think it's a point trading marginal efficiency against >> volatile correctness. >> >> So an implementation will look like this: >> >> In movhi expander: >> >> if (optimize >> && can_create_pseudo_p() >> && MEM_P (operands[0]) >> && satisfies_constraint_M (operands[1]) >> && const0_rtx != operands[1]) >> { >> rtx val = GEN_INT (trunc_int_for_mode (INTVAL (operands[1], QImode)); >> rtx reg = copy_to_mode_reg (QImode, val); >> emit_insn (gen_store_zeroextendqihi2 (operands[0], reg); >> DONE; >> } >> >> And >> >> (define_insn "store_zeroextendqihi2" >> [(set (match_operand:HI 0 "memory_operand" "=Qm") >> (zero_extend:HI (match_operand:QI 1 "register_operand" "r")))] >> "" >> { >> return out_movhi_mr_zerox_r (insn, operands, NULL); >> }) >> >> where out_movhi_mr_zerox_r is similar to out_movhi_mr_r but adapted >> for your purposes. >> >> Moreover, you will have to fix length computation in adjust_insn_length. >> >> This approach is appropriate for volatile mems, too. And it does not >> make restrictions to the address involved. Why exclude indirect >> addressing...? >> > i wanted to exclude indirect addressing since splitting in that case may > generate worse code because then the compiler wont be able to use > post-increment/pre-decrement, but as you've kindly explained splitting is > not the right solution :( > > >>> also what do you mean by handlich for volatiles? >>> >>> >>> anyway, thanks for all your help, you've been the most helpful so far. >>> >>> unfortently, the task seems alot more complicated the i first thought. >> Yes, that's GCC. Don't say you didn't know before ;-) >> >>> OCR0SA = 1u; >>> is at first converted to and indirect memory acces: >>> ... >>> (const_int 1 [0x1])) -1 (nil)) >>> (set (mem/v:HI (reg/f:HI 43) [2 S2 A8]) >>> (reg:HI 44)) -1 (nil)) >>> >>> and only after the IRA phase its converted to something with a const >> address >>> ... >>> (set (mem/v:HI (const_int 210 [0xd2]) [2 S2 A8]) >>> (reg:HI 24 r24 [44])) 10 {*movhi} (expr_list:REG_EQUAL (const_int 1 >>> [0x1] (nil))) >> It's a (post)reload optimization. >> > > when i used -fdump-rtl-all i got the impression form the numbering of the > files that the IRA phase is before reload, did i get the convention wrong?
There is no dump after IRA and prior to reload. With -da, the next dump after .ira is .postreload. Note that some passes are intertwined, so pass numbers might be confusing. > i still want to handle the case where the low byte is zero, so i'm thinking > its better to create a pattern that receives a HI memory operand and 2 QI > reg operands, and the movhi expander will just use this pattern sending > zero_reg or copy_to_mode_reg(...) according to the value. > > do you see a problem with this solution? No. You can use unspec or make explicit what's going on. Note there is already zero_reg_rtx in avr.c. Johann _______________________________________________ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org https://lists.nongnu.org/mailman/listinfo/avr-gcc-list