https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #15 from Oleg Endo ---
Maybe the standard name patterns sdot_prod for 16 bit and 32 bit int vectors
could be implemented using the mac.w and mac.l insns, if the input vectors are
somehow put in memory.
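For illustration only, the kind of source loop that a signed dot-product pattern on 16 bit elements targets could look like the sketch below. The function name dot16 is made up for this example, and whether such a loop would end up as a mac.w sequence is exactly the open question here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the PR): signed dot product of two
   16 bit vectors.  Each product is widened to 32 bit before it is
   added to the running sum, which is the shape of a widening
   multiply-accumulate reduction.  */
int32_t dot16 (const int16_t *a, const int16_t *b, size_t n)
{
  assert (n == 0 || (a != NULL && b != NULL));

  int32_t sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += (int32_t) a[i] * (int32_t) b[i];
  return sum;
}
```

Both inputs are read through pointers, which matches the "put in memory" condition above, since mac.w and mac.l take their operands from memory with post-increment addressing.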
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #14 from Oleg Endo ---
(In reply to Oleg Endo from comment #3)
>
> - When compiling for big endian the RA mistakes mach and macl when
> storing mach:macl to a DImode reg:reg pair.
> This could probably be fixed by providing appropri
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #13 from Oleg Endo ---
A more interesting real-world example from libjpeg would be function
jpeg_idct_ifast (jidctint.c).
If we take the code as-is, there are few mac opportunities due to sharing of
the terms. The expressions could
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
Oleg Endo changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #11 from Oleg Endo ---
Another question is whether the following is OK to do on all SH
implementations:
  int test33 (int x, int y, int z)
  {
    return x * y + z;
  }
currently compiles to:
  mul.l   r5,r4
  sts     macl,r0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #10 from Oleg Endo ---
I was wondering whether it would make sense to convert sequences such as
                        SH4     SH4A
  mov.l   @r15,r3       LS/2    LS/2
  mul.l   r2,r3         CO/4    EX/3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #9 from Oleg Endo 2013-05-04 13:39:10 UTC ---
(In reply to comment #3)
> - Loops with multiple running sums like
>     for (int i = 0; i < 16; ++i)
>     {
>       sum0 += (int64_t)(*a++) * (int64_t)(*b++);
>       sum1 += (int64_t)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #8 from Oleg Endo 2012-11-07 21:31:39 UTC ---
Christian, I just wanted to check with you whether you've already started doing
something regarding the mac.w / mac.l instructions?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #7 from Oleg Endo 2012-10-11 20:43:02 UTC ---
A note regarding the SR.S bit. The insns sets and clrs are available only on
SH3* and SH4*. SH1* and SH2* (incl SH2A) do not implement them.
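To make it clearer why the state of SR.S matters, here is a rough C model of a single mac.l accumulate step. This is a sketch from my reading of the SH manuals, not taken from the hardware or from GCC: the name mac_l_step and the two macros are made up for the example, and the assumed S = 1 behavior is saturation of the accumulator to the signed 48 bit range.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed saturation bounds for mac.l when SR.S = 1 (signed 48 bit).  */
#define MAC48_MAX  INT64_C (0x00007FFFFFFFFFFF)
#define MAC48_MIN  (-MAC48_MAX - 1)

/* Hypothetical model of one mac.l step: acc stands for MACH:MACL,
   x and y are the two 32 bit memory operands, s_bit is SR.S.  */
int64_t mac_l_step (int64_t acc, int32_t x, int32_t y, int s_bit)
{
  assert (s_bit == 0 || s_bit == 1);

  int64_t prod = (int64_t) x * (int64_t) y;

  if (!s_bit)
    /* S = 0: plain 64 bit accumulate.  The addition is done in
       unsigned arithmetic so that wrap-around is well defined.  */
    return (int64_t) ((uint64_t) acc + (uint64_t) prod);

  /* S = 1: clamp the result to the 48 bit range.  This model only
     handles an accumulator that is already inside that range.  */
  assert (acc >= MAC48_MIN && acc <= MAC48_MAX);
  if (prod > 0 && acc > MAC48_MAX - prod)
    return MAC48_MAX;
  if (prod < 0 && acc < MAC48_MIN - prod)
    return MAC48_MIN;
  return acc + prod;
}
```

With S = 0 the result matches ordinary 64 bit C arithmetic, which is what the compiler would need in order to use mac.l for plain multiply-add code. On targets without sets and clrs the bit cannot be switched with a single insn, so the compiler would have to rely on it being 0.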
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #6 from Oleg Endo 2012-07-22 16:47:44 UTC ---
If I understand correctly PR 29961 is somewhat related to this.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #5 from Kazumoto Kojima 2012-07-17 23:04:54 UTC ---
(In reply to comment #4)
> Kaz, do you think it is safe to assume that SR.S = 0 at function entry?
I think so. I can't imagine a practical system with setting
SR.S to one in its st
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
Oleg Endo changed:
What|Removed |Added
CC||kkojima at gcc dot gnu.org
--- Comment #4 fro
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #3 from Oleg Endo 2012-07-15 12:11:20 UTC ---
Created attachment 27799
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27799
Proof of concept patch
This is a proof of concept patch just to probe around.
The idea is to allow the R
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
chrbr at gcc dot gnu.org changed:
What|Removed |Added
Severity|normal |enhancement
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #2 from chrbr at gcc dot gnu.org 2012-07-13 11:00:55 UTC ---
I see the MAC only as a global optimization, since its interest is to span
across several loop BBs as you said. There is also the problem of clearing the
accumulator.
That should
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949
--- Comment #1 from Oleg Endo 2012-07-13 10:34:20 UTC ---
(In reply to comment #0)
> So far, GCC does not utilize the integer multiply-add instructions.
> On SH1 only the mac.w instruction is supported.
> On SH2 and above the mac.w and mac.l inst