Hi,
I found that in gcc/config/arm/ieee754-df.S, the function __aeabi_d2uiz,
which converts a double into an unsigned integer, always returns 0
if the double value is negative. For example, the following code:
---sample codes--
unsigned long ul;
double d = -1.1
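
For context, a minimal C sketch of the conversion being asked about. In ISO C, a double-to-unsigned conversion is only defined when the truncated value fits in the unsigned type, so a result of 0 for -1.1 is allowed rather than a wrapped value being required. The helper name below is hypothetical and is not part of libgcc; it shows one way to get a wrapped result by converting through a signed type first.

```c
/* Hypothetical helper (not a libgcc routine): convert via a signed type so
   that a negative input wraps modulo 2^N instead of relying on the
   out-of-range double -> unsigned conversion.
   Assumes the truncated value of d fits in long long. */
unsigned long double_to_ulong_wrapping(double d)
{
    long long truncated = (long long)d;  /* -1.1 truncates toward zero to -1 */
    return (unsigned long)truncated;     /* signed -> unsigned is modular */
}
```

With this helper, -1.1 maps to the largest unsigned long value, while non-negative inputs convert as usual.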
Hi,
In ifcvt.c, the function find_if_case_2 uses cheap_bb_rtx_cost_p to
judge the conversion.
The function cheap_bb_rtx_cost_p checks whether the total insn_rtx_cost of
the non-jump insns in basic block BB is less than MAX_COST.
So the question is: why use cheap_bb_rtx_cost_p, even when we know the
ELSE
Hi guys,
Is CFLAGS what libgcc/Makefile.in uses to build libgcc.a?
It seems that if I configure gcc with the CFLAGS="-O0 -g " environment
variable, libgcc is also compiled with the -O0 option.
I'm wondering why CFLAGS_FOR_TARGET is not used
here (CFLAGS->INTERNAL_CFLAGS->gcc_compile_bare->gcc_compile).
Please h
Hi,
In libstdc++-v3/libsupc++/eh_term_handler.cc, it says that by default the
demangler is pulled in, depending on whether _GLIBCXX_HOSTED is defined.
The demangling terminate handler is really big, especially for embedded
systems.
Secondly, _GLIBCXX_HOSTED is now defined if --enabl
> (Any reason this wasn't sent to the libstdc++ list?)
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43852 proposes a "quiet
> mode" which would reduce code size by disabling some of the code in
> eh_term_handler.cc and pure.cc - would that do what you want?
>
> I've not had time to do anything a
Hi,
I ran into a case and found conditional (const) propagation is
mishandled in cprop pass.
With following insn sequence after cprop1 pass:
(note 878 877 880 96 [bb 96] NOTE_INSN_BASIC_BLOCK)
(insn 882 881 883 96 (set (reg:CC 24 cc)
(co
On Tue, Sep 27, 2011 at 4:19 PM, Amker.Cheng wrote:
> Hi,
> I ran into a case and found conditional (const) propagation is
> mishandled in cprop pass.
> With following insn sequence after cprop1 pass:
>
> (note 878
> Unless there's something arch specific related to arm, insn 882 is a
> compare, which won't change r684. Why do you think 0 should be
> propagated to r291 if r684 is not zero?
>
Thanks for replying.
Sorry if I misunderstood anything below, and please correct me.
insn 882 : cc <- compare (
>
> Nobody mentioned this so I might be way off but cc doesn't get (minus
> (reg r684) (const_int 0)). It gets the `condition codes` modification as
> a consequence of the subtraction.
>
Hi Paulo,
According to section "comparison operations" in the internals manual:
"The comparison operators may be used to co
>>
>> I believe, the optimization you may be referring to is value range
>> propagation which does predication of values based on predicates of
>> conditions. GCC definitely applies VRP at the tree stage, I am not
>> sure if there is an RTL pass to do the same.
> There are also RTL optimizers which
Hi Jeff, Steven,
I have filed a bug at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50663
Could somebody confirm it?
I am studying this piece of code and have spent some time on it.
I'm working on a patch and hope it can help with this issue.
Please help me review it later. Thanks.
--
Best Regar
Hi,
I looked into PR43491 for a while and found that in this case the gimple
generated before pre is like:
reg.0_12 = reg
...
c()
reg.0_1 = reg
D.xxx = MEM[reg.0_1 + 8B]
The pre pass transforms it into:
reg.0_12 = reg
...
c()
reg.0_1 = reg.0_12
D.xxx = MEM[reg.0_1 + 8B]
From now on, following passes(li
On Sat, Nov 26, 2011 at 3:41 PM, Amker.Cheng wrote:
> Hi,
> I looked into PR43491 a while and found in this case the gimple
> generated before pre
> is like:
>
> reg.0_12 = reg
> ...
> c()
> reg.0_1 = reg
> D.xxx = MEM[reg.0_1 + 8B]
>
> The pre pass tr
On Thu, Dec 1, 2011 at 11:45 PM, Richard Guenther
wrote:
> Well, it's not that easy if you still want to properly do redundant expression
> removal on global registers.
Yes, it might be complicated to make PRE fully aware of global registers.
I also found comments in is_gimple_reg which says gcc d
Hi,
I encountered a case with the code below:
int data_0;
int motion_test1(int data, int v)
{
  int i;
  int t, u;
  int x;
  if (data)
    i = data_0 + x;
  else {
    v = 2;
    i = 5;
  }
  t = data_0 + x;
  u = i;
Forgot the command line:
arm-none-eabi-gcc -O2 -mthumb -mcpu=cortex-m3 -S test.c -o test.S -fdump-tree-all
gcc is configured as arm-none-eabi, but I think it's independent of target.
--
Best Regards.
Hi,
Since SCCVN operates on SSA graph instead of the control flow graph
for the sake of efficiency,
it does not handle or value number the conditional expression of
GIMPLE_COND statement.
As a result, FRE/PRE does not simplify conditional expression, as
reported in bug 30997.
Since it would be com
Thanks Richard,
On Mon, Jan 2, 2012 at 8:33 PM, Richard Guenther
wrote:
>
> I've previously worked on changing GIMPLE_COND to no longer embed
> the comparison but carry a predicate SSA_NAME only (this is effectively
> what you do as pre-processing before SCCVN). It had some non-trivial
> fallout
On Mon, Jan 2, 2012 at 9:37 PM, Richard Guenther
wrote:
> Well, with
>
> Index: gcc/tree-ssa-pre.c
> ===
> --- gcc/tree-ssa-pre.c (revision 182784)
> +++ gcc/tree-ssa-pre.c (working copy)
> @@ -4335,16 +4335,23 @@ eliminate (void)
On Mon, Jan 2, 2012 at 10:54 PM, Richard Guenther
wrote:
> Yes. It won't handle
>
> if (x > 1)
> ...
> tem = x > 1;
>
> or
>
> if (x > 1)
> ...
> if (x > 1)
>
> though maybe we could teach PRE to do the insertion by properly
> putting x > 1 into EXP_GEN in compute_avail (but not into AVA
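
The two shapes quoted above can be written out as a small compilable C test case. The function names are made up for illustration and do not come from the thread; in both functions the second `x > 1` repeats the predicate of the first branch.

```c
/* Hypothetical test case 1: the branch condition is recomputed as a value. */
int cond_then_value(int x, int *out)
{
    int tem;
    if (x > 1)
        *out = 1;
    tem = x > 1;   /* same predicate as the branch condition above */
    return tem;
}

/* Hypothetical test case 2: the same condition guards two branches. */
int cond_then_cond(int x)
{
    int r = 0;
    if (x > 1)
        r += 1;
    if (x > 1)     /* same predicate again */
        r += 2;
    return r;
}
```

Compiling such a file with a tree dump option and inspecting the dumps is one way to see whether the repeated comparison survives.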
Hi,
I noticed gcc generates inconsistent code for the same function for builtin calls.
Compile the following program:
--
#include <math.h>
int a(float x) {
return sqrtf(x);
}
int b(float x) {
return sqrtf(x);
}
With command:
arm-none-eabi-gcc -mthumb -mhar
On Fri, Jan 13, 2012 at 5:33 PM, Richard Guenther
wrote:
>
> No, I think the check is superfluous and should be removed. I also wonder
> why we exempt BUILT_IN_FREE here ... can you dig in SVN history a bit?
> For both things?
Thanks for clarifying. I will look into it.
--
Best Regards.
On Fri, Jan 13, 2012 at 10:17 PM, Amker.Cheng wrote:
> On Fri, Jan 13, 2012 at 5:33 PM, Richard Guenther
> wrote:
>>
>> No, I think the check is superfluous and should be removed. I also wonder
>> why we exempt BUILT_IN_FREE here ... can you dig in SVN history a bit?
Hi,
In PRE, function compute_antic_aux uses bitmap_set_subtract to compute
value/expression set subtraction.
The comment of bitmap_set_subtract says it subtracts all the values
and expressions contained in ORIG from DEST.
But the implementation is as follows
---
On Mon, Feb 6, 2012 at 7:28 PM, Richard Guenther
wrote:
> It's probably to have the SET in some canonical form - the resulting
I am wondering how the canonical form is maintained, since according
to the paper:
For an antileader set, it does not matter which expression represents
a value, as long a
Hi all:
I'm currently studying the implementation of instruction scheduling in gcc.
it is possible to schedule insns directly from queue in case
there is nothing better to do and there are still vacant dispatch slots
in the current cycle.
Gcc only does this work in the second pass, but what's the point
Hi all:
Recently I found two relatively old papers about non-blocking caches
etc., which are:
1) Reducing memory latency via non-blocking and prefetching
caches, by Tien-Fu Chen and Jean-Loup Baer.
2) Data Prefetching: A Cost/Performance Analysis, by Chris Metcalf
It seems the
On Sat, Sep 19, 2009 at 1:17 AM, Janis Johnson wrote:
> On Thu, 2009-09-17 at 21:48 -0700, Ian Lance Taylor wrote:
>
> There's also a prefetch built-in function; see
>
> http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#Other-Builtins
>
> It's been in GCC since 3.1.
>
> Janis
>
>
Thank you all
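
A minimal sketch of the built-in mentioned in the quoted reply, `__builtin_prefetch`, which takes an address plus optional read/write and locality arguments. The function name and the prefetch distance are assumptions chosen for illustration, not tuned values; whether a prefetch helps depends on the target and the access pattern.

```c
#include <stddef.h>

/* Illustrative guess at how far ahead to prefetch; not a tuned value. */
#define PREFETCH_DISTANCE 16

/* Hypothetical example: sum an array while hinting upcoming elements
   into the cache. Requires a compiler providing __builtin_prefetch (GCC). */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            /* second argument 0 = prefetch for read,
               third argument 3 = high temporal locality */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint, so the result is the same as a plain summation loop.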
Hi :
I'm puzzled when looking into speculative scheduling in gcc, the 4.2.4 version.
First, I noticed the document describing IBM haifa instruction
scheduler(as PowerPC Reference Compiler Optimization Project).
It presents that the instruction motion from bb s(dominated by t)
to t is speculative
On Sun, Sep 20, 2009 at 3:43 PM, Maxim Kuvyrkov wrote:
> Amker.Cheng wrote:
>>
>> Hi :
>> I'm puzzled when looking into speculative scheduling in gcc, the 4.2.4
>> version.
>>
>> First, I noticed the document describing IBM haifa instruction
>> sc
Hi :
In function new_ready, it calls to min_insn_conflict_delay with
"min_insn_conflict_delay (curr_state, next, next)".
But the function's comments say that it returns minimal delay of issue of
the 2nd insn after issuing the 1st in given state.
Why are the last two parameters of the call both "
On Tue, Sep 22, 2009 at 11:50 PM, Vladimir Makarov wrote:
> Ian Lance Taylor wrote:
>>
>> "Amker.Cheng" writes:
>>
>>
>>>
>>> In function new_ready, it calls to min_insn_conflict_delay with
>>> "min_insn_conflict_delay (
Hi all:
I have found something strange when scheduling instructions.
Consider the following piece of code:
-c start
int func(float x)
{
int r = 0;
r = (*(unsigned int*)&x) >> 23;
return r;
}
-c e
Thanks Eric Fisher, got the answer, Please ignore this message.
--
Best Regards.
Hi :
The bb-reorder pass is relatively simple compared with others, but
I still have the following puzzles.
1 : the comment at top of the bb-reorder.c file says that :
There are two parameters: Branch Threshold and Exec Threshold.
If the edge to a successor of the actual basic block is low
Hi :
In function fill_simple_delay_slots, there is the following code:
>starts here
/* If there are slots left to fill and our search was stopped by an
unconditional branch, try the insn at the branch target. We can
redirect the bra
Hi All :
It's possible to define multiple delay slots for branch insns by using
define_delay,
and each slot should satisfy its own attribute test "delay-n".
Here comes the question: the function "fill_simple_delay_slots" seems
to use only
slots_filled to record how many slots need to be filled, a
On Tue, Dec 1, 2009 at 5:31 AM, Jeff Law wrote:
> On 11/25/09 07:34, Amker.Cheng wrote:
>
> First, it's worth noting very few targets support multiple delay slots and
> as a result that code isn't tested nearly as well as handling of single
> delay slots.
>
> I
Hi :
In regmove.c there is a function "replace_in_call_usage", called in
fixup_match_1.
It replaces the dst register by src in call_insn. I doubt whether it is necessary,
since the comment of CALL_INSN_FUNCTION_USAGE says that no pseudo register
can appear in it, and src seems to be a pseudo register.
further m
Hi All:
In the GCC internals manual, section 16.19.8, there is a rule about
"define_insn_reservation" like:
"`condition` defines what RTL insns are described by this
construction. You should remember that you will be in trouble if
`condition` for two or more different `define_insn_reservation`
constructor
Hi :
I am studying multiplication-accumulate patterns for mips
and noticed there were some changes when IRA was merged.
There are two patterns which confuse me:
1: In pattern "*mul_acc_si", there's a constraint like "*?*?".
What is this supposed to do?
I could not connect "*?" with docu
> If you don't know anything about register class preferencing or reload as
> yet, then this is probably not going to make much sense to you, but it isn't
> anything important you need to worry about at this point. It is a very
> minor performance optimization.
>
It makes sense to me now, though I
> The reasoning here is
> that if splitting will result in worse code, then we shouldn't have
> accepted it in the first place. If dropping this alternative results in
> register allocator failures for some strange reason, then we accept it
> and generate the 3 instruction sequence with a new defi
Hi :
I'm wondering whether cfg is maintained properly during delay slot
scheduling,
Because when compiling libgcc/_divsc3.o, rtl dump in
libgcc2.c.198r.mach has following lines:
no bb for insn with uid = 293.
deleting insn with uid = 690.
deleting insn with uid = 904.
..
(note 298 905 303
> The CFG is not maintained during delay slot scheduling. This is, in
> fact, a very old and well-known problem. Look for any e-mail on this
> list that mentions reorg.c :-)
>
Thanks. Furthermore, it seems the cfg is not maintained after delay
slot scheduling.
I also found that problem just before fina
> Cheng, can you explain what lead you to this "discovery", and what
> you're trying to achieve?
Thanks for all your enthusiastic explanation.
Well, we are now trying to find our processor's critical timing path
by running it at higher frequency than it was designed for.
One timing prob we found i
Hi All:
I read the code in the bb-reorder pass. Normally it's fine to take the most
probable basic block as the downward bb.
Unfortunately, the processor I'm working on is a little different.
It has no pipeline stall when branches are taken, but does introduce a
stall when they are not taken.
take an exa
Hi :
I noticed that on mips, the signed form of the multiply instruction is
generated for an unsigned integer multiply operation.
For example, mult is used rather than multu for the following code:
unsigned int x, y, z;
x = y * z;
Is it reasonable to do so? Thanks.
--
Best Regards.
Found the cause. Sorry to disturb; please ignore this message.
--
Best Regards.
> It would, however, be nice if you actually posted an answer to your
> (now solved) question. That way, any casual reader may learn something
> new.
>
Sorry for the unintentional offense, here comes the method:
for 2's complement binary number x31x30...x0,
unsigned value U = 2^(31)*x31 + 2^(30)*x3
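
The argument being set up above can be checked with a short C sketch: the low 32 bits of a product are the same whether the operand bit patterns are read as signed or as unsigned, so when only the low word is kept either multiply form gives the same answer. The function names are hypothetical. The cast from uint32_t to int32_t assumes the usual two's complement wrap-around, which GCC provides.

```c
#include <stdint.h>

/* Low 32 bits of the product, operands treated as unsigned. */
uint32_t mul_low_unsigned(uint32_t y, uint32_t z)
{
    return (uint32_t)((uint64_t)y * (uint64_t)z);
}

/* Low 32 bits of the product, same bit patterns treated as signed.
   The 64-bit signed product of two 32-bit values cannot overflow. */
uint32_t mul_low_signed(uint32_t y, uint32_t z)
{
    int64_t p = (int64_t)(int32_t)y * (int64_t)(int32_t)z;
    return (uint32_t)p;   /* keep only the low word */
}
```

Only the high word of the full 64-bit product differs between the signed and unsigned interpretations.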
Hi all:
Currently I'm building a cross gcc for mips32 on winXp+cygwin.
I tried both gcc 4.2.4 and 4.2.3, and there is a build problem with 4.2.4.
The gcc makefile normally issues the shell command "echo 'exec
$(ORIGINAL_AS_FOR_TARGET) "$$@"' >> as ; \"
at around line 1370, but ORIGINAL_AS_FOR_TARGET defi
Hi all:
Currently I am studying peephole optimization in gcc.
I defined a peephole using "define_peephole", but nothing happened.
It seems gcc does do the pattern match work in codes surrounded by
"HAVE_peephole",
but codes from "out-template" in that "define_peephole" are not
compiled into gc
It turns out there is a mistake in "out-template" of "define_peephole".
So, Sorry for disturbing!
--
Best Regards.
Hi :
There is a pattern "define_insn "s_"" in mips md file, like
(define_insn "s_"
[(set (match_operand:CC 0 "register_operand" "=z")
(swapped_fcond:CC (match_operand:SCALARF 1 "register_operand" "f")
(match_operand:SCALARF 2 "register_operand" "f")))]
""
"c
>
> You can get the RTL for these patterns when expanding stores like
>
> a = (b < c);
>
> In this case, GCC tries to avoid a conditional branch and (I suppose you are
> on GCC <4.5) instead of cmp and b you go through cmp and
> s. cmp does nothing but stashing away its operands, while
> s expan
> Indeed, looking at GCC 4.5 there's no cstore expander for floating-point
> variables. Maybe you can make a patch! :-)
>
yes, it seems gcc always generates set/compare/jump/set sequence,
then optimizes it out in if-convert pass. Maybe it was left behind by
early mips1, which has no conditional mo
Hi:
There is a comment on lui_movf in mips.md like the following:
;; because we don't split it. FIXME: we should split instead.
I can split it into a move and a condmove (movesi_on_cc) insn, like
(define_split
[(set (match_operand:CC 0 "d_operand" "")
(match_operand:CC 1 "fcc_reload_opera
> It's the encoding of 1.0f (single precision). The point is that we want
> something we can safely compare with 0.0f using floating-point instructions.
> "Safe" means "without generating any kind of exception", so a subnormal
> representation like 0x0001 isn't acceptable. 1.0f seems as good
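
As a small check of the encoding mentioned in the quoted text: under IEEE 754 single precision, 1.0f has the bit pattern 0x3F800000, so its upper 16 bits are 0x3F80 and its lower 16 bits are zero. The helper name below is hypothetical, and the sketch assumes `float` is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float.
   Assumes float is 32-bit IEEE 754 binary32; memcpy avoids
   reading the float through an incompatible pointer type. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For example, `float_bits(1.0f)` yields 0x3F800000 and `float_bits(0.0f)` yields 0.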
Hi :
Our processor has an erratum: the direct fpu load does not work correctly,
so I have to substitute the instruction sequence "load_into_gpr ; move_gpr_into_fpr"
for the direct fpload insn.
Currently I have thought of two potential methods, as follows:
method 1:
step1 : keep a scratch register when e
> It is possible. Your expander can handle it before reload; to handle it
> during and after reload, you need to implement a TARGET_SECONDARY_RELOAD hook.
>
> http://gcc.gnu.org/onlinedocs/gccint/Register-Classes.html#index-TARGET_005fSECONDARY_005fRELOAD-3974
>
Thanks Dave, It works, but I found
> Ah, I forgot pro/epilogue generation, but I think that's the only other
> thing that happens after reload. That is a special case: it has to generate
> strict rtl that directly matches the insns it wants. You'll probably have to
> arrange for it to save at least one GPR early enough in the pro
On Sat, May 8, 2010 at 2:52 PM, Amker.Cheng wrote:
>> Ah, I forgot pro/epilogue generation, but I think that's the only other
>> thing that happens after reload. That is a special case: it has to generate
>> strict rtl that directly matches the insns it wants.
Hi :
I'm working on an fpu on which fpload insns do not work correctly, so I have
to use a GPR
reg as a temp reg, first loading mem into the GPR and then moving the GPR into the fpu register.
I have handled most cases except the case where gcc handles call-clobbered fpu
registers.
Since it is in the reload pass, I have no available GPR
Hi:
According to the page http://gcc.gnu.org/ml/gcc/2010-05/msg00091.html,
if the fpu register cannot be copied to/from memory directly, I have
to use intermediate GPR registers.
In fact, I return GP_REGS when copying x to a register in class FP_REGS
in any mode (including CCmode); this results in infinite recu
Hi all,
I compared assembly files of a function compiled by GCC4.3.4 and GCC3.4.4.
The function focuses on array computation and has no branch or
loop structure.
The command line is like "-march=mips32r2 -O3", and here are the
instruction statistics:
total: 1879 : 1534
> Posting some random numbers without a test-case and precise command line
> parameters for both compilers makes the numbers useless, IMHO. You also
> only mention instruction counts. Have you actually benchmarked the
> resulting code? CPUs are complicated and what you might perceive as worse
> cod
Hi :
I found the temp register used for saving registers when expanding
prologue is defined by
macro MIPS_PROLOGUE_TEMP_REGNUM on mips target, like:
#define MIPS_PROLOGUE_TEMP_REGNUM \
(cfun->machine->interrupt_handler_p ? K0_REG_NUM : GP_REG_FIRST + 3)
I don't understand why using registers
>
> It's not "starting from $3". It's $3 and nothing else ;-) It's not
> intended to be used as (MIPS_PROLOGUE_TEMP_REGNUM + N).
>
> $3 was chosen because it's a MIPS16 register, and can therefore
> be used for both MIPS16 and normal-mode code. $2 used to be the
> static chain register, which le
Hi :
I am studying ira right now. There is the following code in change_loop:
if (parent_allocno == NULL
|| REGNO (ALLOCNO_REG (parent_allocno)) == REGNO (original_reg))
{
if (internal_flag_ira_verbose > 3 && ira_dump_file)
fprintf (ira_
>
> Yes, I think it can be NULL in some complicated cases when a loop exit edge
> comes not in the parent loop.
By that, do you mean the case where a regno lives on edges which transfer
between adjacent loops,
and does not live in the parent loop?
So, the fprintf would access a null pointer in this case.
Thanks for
Hi :
I am studying IRA right now (GCC4.4.1,mips32 target),
for following piece of code:
long long func(int a, int b)
{
long long r = (long long)a * (long long)b;
return r;
}
the asm generated on mips is like:
mult $5,$4
mfhi $5
mflo $2
j
Thanks for explanation.
here are three more questions
1 , If I understand correctly, there are two insns like
"*mulsi3_1" and "*smulsi3_highpart_insn",
which set the two parts of the DImode pseudo reg of the DImode mult.
Since both parts of the result are used in the original example,
Hi:
At the end of function change_loop, gcc tries to change ALLOCNO_REG of
local allocnos.
In the loop, ALLOCNO_SOMEWHERE_RENAMED_P (allocno) is set if the allocno is not
a cap.
I don't understand why the flag is set here. Aren't all local allocnos'
flags set in this
loop? This seems to conflict with function
>>>
>>> while GCC3.4.4 treats the long long multiplication just like simple
>>> ones, which generates only one
>>> mult insn for each statement, like
>>>
>>> In my understanding, it's not necessary to use three mult insns to implement
>>> long long mult, since the operands are converted from int type
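
The point in the quoted text can be sketched in C. A general 64x64 multiply that keeps the low 64 bits can be built from three 32-bit multiplies, but when both operands are sign-extended ints a single widening multiply produces the same value. Both function names are hypothetical and only illustrate the arithmetic, not what any GCC version emits.

```c
#include <stdint.h>

/* Generic 64x64 -> low 64 bits, built from 32-bit halves:
   one widening lo*lo product plus two cross products whose
   low 32 bits are added into the high word. */
uint64_t mul64_three_mults(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
    uint64_t low = (uint64_t)a_lo * b_lo;        /* 32x32 -> 64 */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;  /* wraps mod 2^32 on purpose */
    return low + ((uint64_t)cross << 32);
}

/* When both operands come from int, one signed widening
   multiply yields the full 64-bit result directly. */
int64_t mul_widening(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}
```

For int inputs, calling the first function on the sign-extended operands gives the same bits as the second.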
Hi:
I found that although there are standard pattern names such as "ceilm2/floorm2",
there is no insn pattern in mips.md for such float insns on the mips target.
Furthermore, there is no ceil/floor rtl code in rtl.def either.
Based on these facts, I assume those float insns are not supported by gcc,
b
Hi:
I found that mult-acc insns like madd.s/d are only used when -mfp64 is specified.
In the code, there are macros defined as:
#define ISA_HAS_FP4 ((ISA_MIPS4 \
|| (ISA_MIPS32R2 && TARGET_FLOAT64) \
<--only float 64
Hi :
In function cse_main, gcc processes ebbs path by path.
First, gcc finds the first bb of a path in the reverse post order queue,
provided the bb is not yet visited.
Then gcc finds all paths starting with that first bb.
The corresponding code is like:
do
{
bb = BASIC_
Hi,
I am studying gcc's points-to analysis right now and have a question.
In the paper "Off-line Variable Substitution for Scaling Points-to
Analysis", section 3.2,
it says that we should not substitute a variable with another if its
address is taken.
But in GCC's implementation, it unites pointer but
> In theory, this is true, but a lot of the optimizations decrease
> accuracy at a cost of making the problem solvable in a reasonable
> amount of time.
> By performing it after building initial points-to sets, the amount of
> accuracy loss is incredibly small.
> The only type of constraint that wi
Hi :
In paper "Memory SSA-A Unified Approach for Sparsely Representing
Memory Operations",
section 2.2, it says :
"Whenever possible, compiler will create symbolic names to represent distinct
regions inside aggregates(called structure field tags or SFT). For instance,
in Figure 2(b), GCC will c
> The implementation of this stuff changes fairly regularly. The people
> who like this kind of thing are still honing in on the best way to
> handle aliasing information. Richard Guenther is the main guy working
> in this area today.
thanks very much for clarification.
--
Best Regards.
81 matches