On 05/20/14 11:14, Wei Mi wrote:
On Tue, May 20, 2014 at 12:13 AM, Bin.Cheng wrote:
On Tue, May 20, 2014 at 1:30 AM, Jeff Law wrote:
On 05/19/14 00:38, Bin.Cheng wrote:
On Sat, May 17, 2014 at 12:32 AM, Jeff Law wrote:
On 05/16/14 04:07, Bin.Cheng wrote:
But can't you go through movXX
On Tue, May 20, 2014 at 12:13 AM, Bin.Cheng wrote:
> On Tue, May 20, 2014 at 1:30 AM, Jeff Law wrote:
>> On 05/19/14 00:38, Bin.Cheng wrote:
>>>
>>> On Sat, May 17, 2014 at 12:32 AM, Jeff Law wrote:
On 05/16/14 04:07, Bin.Cheng wrote:
But can't you go through movXX
On 05/20/14 01:13, Bin.Cheng wrote:
The idea being that common cases where a pair of moves can be turned into a
single wider move without having to write target code to make that happen
much of the time, i.e. 2xQI->HI, 2xHI->SI, 2xSI->DI, 2xSF->DF. For things
outside those simple cases, fall back to
On Tue, May 20, 2014 at 5:02 AM, Mike Stump wrote:
> On May 19, 2014, at 10:30 AM, Jeff Law wrote:
>>> Yes, I think it's more than upsizing the mode. There is another
>>> example from one of x86's candidate peephole patch at
>>> https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00467.html
>>>
>>> Th
On Tue, May 20, 2014 at 1:30 AM, Jeff Law wrote:
> On 05/19/14 00:38, Bin.Cheng wrote:
>>
>> On Sat, May 17, 2014 at 12:32 AM, Jeff Law wrote:
>>>
>>> On 05/16/14 04:07, Bin.Cheng wrote:
>>>
>>>
>>>
>>> But can't you go through movXX to generate either the simple insn on the
>>> ARM
>>> or the PA
On May 19, 2014, at 10:30 AM, Jeff Law wrote:
>> Yes, I think it's more than upsizing the mode. There is another
>> example from one of x86's candidate peephole patch at
>> https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00467.html
>>
>> The patch wants to do below transformation, which I think is
On 05/19/14 00:38, Bin.Cheng wrote:
On Sat, May 17, 2014 at 12:32 AM, Jeff Law wrote:
On 05/16/14 04:07, Bin.Cheng wrote:
Yes, I think this one does have a good reason. The target independent
pass just makes sure that two consecutive memory access instructions
are free of data-dependency wit
On 05/19/14 00:38, Bin.Cheng wrote:
1) Should we do it in a separate pass, or just along with the scheduler?
ISTM that when we're able to combine insns that can impact the schedule
we'd like to generate, possibly in significant ways. That argues for a
separate pass that runs before the scheduler.
On 05/19/14 00:38, Bin.Cheng wrote:
On Sat, May 17, 2014 at 12:52 AM, Mike Stump wrote:
On May 16, 2014, at 3:07 AM, Bin.Cheng wrote:
I don't see how regrename will help resolve [base+offset] false
dependencies. Can you explain? I'd expect effects from
hardreg-copyprop "commoning" a base re
On Sat, May 17, 2014 at 12:18 AM, Jeff Law wrote:
> On 05/16/14 04:07, Bin.Cheng wrote:
>>
>> On Fri, May 16, 2014 at 1:13 AM, Jeff Law wrote:
>>>
>>> On 05/15/14 10:51, Mike Stump wrote:
On May 15, 2014, at 12:26 AM, bin.cheng wrote:
>
>
> Here comes up with a new GCC
On Sat, May 17, 2014 at 12:52 AM, Mike Stump wrote:
> On May 16, 2014, at 3:07 AM, Bin.Cheng wrote:
>>
>>> I don't see how regrename will help resolve [base+offset] false
>>> dependencies. Can you explain? I'd expect effects from
>>> hardreg-copyprop "commoning" a base register.
>> It's the regis
On Sat, May 17, 2014 at 12:32 AM, Jeff Law wrote:
> On 05/16/14 04:07, Bin.Cheng wrote:
>
>> Yes, I think this one does have a good reason. The target independent
>> pass just makes sure that two consecutive memory access instructions
>> are free of data-dependency with each other, then feeds it
On Fri, 2014-05-16 at 18:10 +0800, Bin.Cheng wrote:
> On Thu, May 15, 2014 at 6:31 PM, Oleg Endo wrote:
> >
> > How about the following.
> > Instead of adding new hooks and inserting the pass to the general pass
> > list, make the new
> > pass class take the necessary callback functions directly.
On May 16, 2014, at 3:07 AM, Bin.Cheng wrote:
>
>> I don't see how regrename will help resolve [base+offset] false
>> dependencies. Can you explain? I'd expect effects from
>> hardreg-copyprop "commoning" a base register.
> It's the register operand's false dependency, rather than the base's
> on
On 05/16/14 04:07, Bin.Cheng wrote:
Yes, I think this one does have a good reason. The target independent
pass just makes sure that two consecutive memory access instructions
are free of data-dependency with each other, then feeds it to back-end
hook. It's back-end's responsibility to generate
On 05/16/14 04:07, Bin.Cheng wrote:
On Fri, May 16, 2014 at 1:13 AM, Jeff Law wrote:
On 05/15/14 10:51, Mike Stump wrote:
On May 15, 2014, at 12:26 AM, bin.cheng wrote:
Here comes up with a new GCC pass looking through each basic block
and merging paired load store even they are not adjace
>
> Btw, the bswap pass enhancements that are currently in review may
> also be an opportunity to catch these. They can merge adjacent
> loads that are used "composed" (but not yet composed by storing
> into adjacent memory). The basic-block vectorizer should also
> handle this (if the compositio
On Fri, May 16, 2014 at 12:51 PM, Richard Biener wrote:
> Btw, the bswap pass enhancements that are currently in review may
> also be an opportunity to catch these. They can merge adjacent
> loads that are used "composed" (but not yet composed by storing
> into adjacent memory). The basic-block v
On Fri, May 16, 2014 at 12:10 PM, Bin.Cheng wrote:
> On Thu, May 15, 2014 at 6:31 PM, Oleg Endo wrote:
>> Hi,
>>
>> On 15 May 2014, at 09:26, "bin.cheng" wrote:
>>
>>> Hi,
>>> Targets like ARM and AARCH64 support double-word load store instructions,
>>> and these instructions are generally faste
On Thu, May 15, 2014 at 6:31 PM, Oleg Endo wrote:
> Hi,
>
> On 15 May 2014, at 09:26, "bin.cheng" wrote:
>
>> Hi,
>> Targets like ARM and AARCH64 support double-word load store instructions,
>> and these instructions are generally faster than the corresponding two
>> load/stores. GCC currently u
On Fri, May 16, 2014 at 12:57 AM, Steven Bosscher wrote:
> On Thu, May 15, 2014 at 9:26 AM, bin.cheng wrote:
>> Hi,
>> Targets like ARM and AARCH64 support double-word load store instructions,
>> and these instructions are generally faster than the corresponding two
>> load/stores. GCC currently
On Fri, May 16, 2014 at 1:13 AM, Jeff Law wrote:
> On 05/15/14 10:51, Mike Stump wrote:
>>
>> On May 15, 2014, at 12:26 AM, bin.cheng wrote:
>>>
>>> Here comes up with a new GCC pass looking through each basic block
>>> and merging paired load store even they are not adjacent to each
>>> other.
>
On Fri, May 16, 2014 at 12:51 AM, Mike Stump wrote:
> On May 15, 2014, at 12:26 AM, bin.cheng wrote:
>> Here comes up with a new GCC pass looking through each basic block and
>> merging paired load store even they are not adjacent to each other.
>
> So I have a target that has load and store mult
On May 15, 2014, at 1:01 PM, Jeff Law wrote:
> For the memory optimizations, IIRC, the dependencies keep them from getting
> into the ready queue at the same time. Thus it's significantly harder to get
> them to issue consecutively when you've got an issue rate > 1.
> But if you've got an issu
On 05/15/14 12:41, Mike Stump wrote:
On May 15, 2014, at 10:13 AM, Jeff Law wrote:
I've poked at the scheduler several times to do similar stuff, but
was never really satisfied with the results and never tried to
polish those prototypes into something worth submitting.
What was lacking? The
On May 15, 2014, at 10:13 AM, Jeff Law wrote:
> I've poked at the scheduler several times to do similar stuff, but was never
> really satisfied with the results and never tried to polish those prototypes
> into something worth submitting.
What was lacking? The cleanliness of the patch or the,
On Thu, May 15, 2014 at 12:26 AM, bin.cheng wrote:
> Hi,
> Targets like ARM and AARCH64 support double-word load store instructions,
> and these instructions are generally faster than the corresponding two
> load/stores. GCC currently uses peephole2 to merge paired load/store into
> one single in
On 05/15/14 10:51, Mike Stump wrote:
On May 15, 2014, at 12:26 AM, bin.cheng wrote:
Here comes up with a new GCC pass looking through each basic block
and merging paired load store even they are not adjacent to each
other.
So I have a target that has load and store multiple support that
suppo
On Thu, May 15, 2014 at 9:26 AM, bin.cheng wrote:
> Hi,
> Targets like ARM and AARCH64 support double-word load store instructions,
> and these instructions are generally faster than the corresponding two
> load/stores. GCC currently uses peephole2 to merge paired load/store into
> one single inst
On May 15, 2014, at 12:26 AM, bin.cheng wrote:
> Here comes up with a new GCC pass looking through each basic block and
> merging paired load store even they are not adjacent to each other.
So I have a target that has load and store multiple support that supports a large
number of registers (2-n
Hi,
On 15 May 2014, at 09:26, "bin.cheng" wrote:
> Hi,
> Targets like ARM and AARCH64 support double-word load store instructions,
> and these instructions are generally faster than the corresponding two
> load/stores. GCC currently uses peephole2 to merge paired load/store into
> one single in
Hi,
Targets like ARM and AARCH64 support double-word load store instructions,
and these instructions are generally faster than the corresponding two
load/stores. GCC currently uses peephole2 to merge paired load/store into
one single instruction which has a disadvantage. It can only handle simple