------- Comment #3 from randolph at tausq dot org 2006-09-24 23:48 -------
Subject: Re: [hppa] Missing address increment optimization for fp load/stores

>> gcc starting from 4.0 produces this:
>>
>> .L3:
>>         fldds -16(%r26),%fr22
>>         fldds -8(%r26),%fr23
>>         fldds 0(%r26),%fr24
>>         fldds 8(%r26),%fr25
>>         ldo 32(%r26),%r26
>>         fstds %fr22,-16(%r25)
>>         fstds %fr23,-8(%r25)
>>         fstds %fr24,0(%r25)
>>         fstds %fr25,8(%r25)
>>         b .L3
>>
>> which I suspect is actually better, since it avoids dependencies between the
>> loads. But I'm not familiar with hppa, can anybody comment?
>
> It looks close to optimal to me. The code is better than that generated
> by 3.4.x or HP cc. Using the auto-increment forms would allow elimination
> of the two ldo instructions to increment r25 and r26.

Yeah, this looks pretty good. I've been told that not using the
autoincrement forms might be even better as it avoids interlocks between
successive instructions. The ldo insn just gets pipelined so it doesn't
necessarily slow things down.

I'll mark this bug as resolved.

thanks
randolph

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264