Re: decl_constant_value_for_broken_optimization

2008-02-27 Thread Paolo Bonzini



In the current compiler, it seems very likely that every call to
decl_constant_value_for_broken_optimization can simply be removed.
The constant propagation passes should implement the optimization.


What about format checking for constant arrays? :-(  That's the testcase 
that Joseph wrote for the patch that introduced the function.


Paolo


Re: optimizing predictable branches on x86

2008-02-27 Thread Kenny Simpson
> At least on x86 it should also be a good idea to know which way
> the branch is going to go, because it doesn't have explicit branch
> hints, you really want to be able to optimize the cold branch
> predictor case if converting from cmov to conditional branches.

x86 as of Pentium 4 does have branch hint instruction prefixes, but their use 
is somewhat
discouraged:

from http://softwarecommunity.intel.com/articles/eng/3431.htm:
"
The Pentium® 4 Processor introduced new instructions for adding static hints to 
branches. It is
not recommended that a programmer use these instructions, as they add slightly 
to the size of the
code and are static hints only. It is best to use a conditional branch in the 
manner that the
static predictor expects, rather than adding these branch hints.

In the event that a branch hint is necessary, the following instruction 
prefixes can be added
before a branch instruction to change the way the static predictor behaves:

* 0x3E – statically predict a branch as taken
* 0x2E – statically predict a branch as not taken
"

see also section 2.1.1 Instruction Prefixes in
http://download.intel.com/design/processor/manuals/253666.pdf:
"
Branch hint prefixes (2EH, 3EH) allow a program to give a hint to the processor 
about
the most likely code path for a branch. Use these prefixes only with conditional
branch instructions (Jcc). Other use of branch hint prefixes and/or other 
undefined
opcodes with Intel 64 or IA-32 instructions is reserved; such use may cause
unpredictable behavior.
"



  



Bootstrap failure on powerpc64-linux

2008-02-27 Thread Revital1 Eres

Hello,

I get the following bootstrap failure on powerpc64-linux, trunk r132684

configure with:
--with-cpu=default32  --enable-checking --enable-bootstrap

Revital

libtool: compile:  /home/revitale/mainline_branch/build/./gcc/xgcc
-B/home/revitale/mainline_branch/build/./gcc/
-B/home/revitale/mainline_branch/build/powerpc64-unknown-linux-gnu/bin/
-B/home/revitale/mainline_branch/build/powerpc64-unknown-linux-gnu/lib/
-isystem 
/home/revitale/mainline_branch/build/powerpc64-unknown-linux-gnu/include

-isystem 
/home/revitale/mainline_branch/build/powerpc64-unknown-linux-gnu/sys-include
 -DHAVE_CONFIG_H -I. -I../../../gcc/libgfortran -I.
-iquote../../../gcc/libgfortran/io -I../../../gcc/libgfortran/../gcc
-I../../../gcc/libgfortran/../gcc/config -I../.././gcc -D_GNU_SOURCE
-std=gnu99 -Wall -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition -Wextra -Wwrite-strings -fcx-fortran-rules -g -O2
-MT maxloc1_4_r16.lo -MD -MP -MF .deps/maxloc1_4_r16.Tpo
-c ../../../gcc/libgfortran/generated/maxloc1_4_r16.c  -fPIC -DPIC
-o .libs/maxloc1_4_r16.o
../../../gcc/libgfortran/generated/maxloc1_4_r16.c: In function
'mmaxloc1_4_r16':
../../../gcc/libgfortran/generated/maxloc1_4_r16.c:220: internal compiler
error: in memory_address, at explow.c:492
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
make[3]: *** [maxloc1_4_r16.lo] Error 1
make[3]: Leaving directory
`/home/revitale/mainline_branch/build/powerpc64-unknown-linux-gnu/libgfortran'
make[2]: *** [all] Error 2
make[2]: Leaving directory
`/home/revitale/mainline_branch/build/powerpc64-unknown-linux-gnu/libgfortran'
make[1]: *** [all-target-libgfortran] Error 2
make[1]: Leaving directory `/home/revitale/mainline_branch/build'



Re: Bootstrap failure on powerpc64-linux

2008-02-27 Thread Dominique Dhumieres
This is PR373, see http://gcc.gnu.org/ml/gcc-patches/2008-02/msg01134.html
for a fix.

Dominique


ARM gcc generates incorrect code?

2008-02-27 Thread Krzysztof Halasa
Hi,

not sure where the bug is - gcc 4.2.4pre (CVS), binutils 2.17,
cross compiler X86_64 -> ARM BE.

-O2 -fno-strict-aliasing -fno-common -fno-stack-protector -marm
-fno-omit-frame-pointer -mapcs -mno-sched-prolog -mabi=apcs-gnu
-mno-thumb-interwork -march=armv5te -mtune=xscale -mcpu=xscale
-msoft-float -Uarm -fno-omit-frame-pointer -fno-optimize-sibling-calls

Building a test Linux kernel module:

#include 
#include 
#include 
#include 
#include 
#include 

#define USE_UACCESS 0

#if USE_UACCESS
#include 
#else
extern int __get_user_1(void *);

/*
.global __get_user_1
__get_user_1:
1:  ldrbt   r2, [r0]
mov r0, #0
mov pc, lr
*/

#define get_user(x,p)   \
({  \
register const u8 __user *__p asm("r0") = (p);  \
register unsigned long __r2 asm("r2");  \
register int __e asm("r0"); \
__asm__ __volatile__ (  \
__asmeq("%0", "r0") __asmeq("%1", "r2") \
"bl __get_user_1"   \
: "=&r" (__e), "=r" (__r2)  \
: "0" (__p) \
: "lr", "cc");  \
x = (u8) __r2;  \
__e;\
})

#endif

struct port {
u8 chan_buf[256];
unsigned int tx_count;
u8 modulo;
struct cdev cdev;
struct device *dev;
};

static struct class *test_class;
static dev_t rdev;
static struct port *main_port;

static ssize_t test_chan_write(struct file *file, const char __user *buf,
   size_t count, loff_t *f_pos)
{
struct port *port = main_port;
int res = 0;
unsigned int tail, chan, frame;

tail = port->tx_count % 2;
chan = tail % port->modulo;
frame = tail / port->modulo;

if (get_user(port->chan_buf[chan * 2 + frame], buf))
return -EFAULT;
port->tx_count++;
res++;
return res;
}

static const struct file_operations chan_fops = {
.owner   = THIS_MODULE,
.llseek  = no_llseek,
.write   = test_chan_write,
};


static int __init test_init_module(void)
{
int err;

if ((err = alloc_chrdev_region(&rdev, 0, 1, "test")))
return err;

if (IS_ERR(test_class = class_create(THIS_MODULE, "test"))) {
printk(KERN_ERR "Can't register device class 'test'\n");
err = PTR_ERR(test_class);
goto free_chrdev;
}

if (!(main_port = kzalloc(sizeof(*main_port), GFP_KERNEL))) {
err = -ENOBUFS;
goto destroy_class;
}

main_port->dev = device_create(test_class, NULL, rdev, "test");
if (IS_ERR(main_port->dev)) {
err = PTR_ERR(main_port->dev);
goto free;
}

main_port->tx_count = 0;
main_port->modulo = 2;

cdev_init(&main_port->cdev, &chan_fops);
main_port->cdev.owner = THIS_MODULE;
if ((err = cdev_add(&main_port->cdev, rdev, 1)))
goto destroy_device;

dev_set_drvdata(main_port->dev, &main_port);

printk(KERN_CRIT "start\n");
return 0;

destroy_device:
device_unregister(main_port->dev);
free:
kfree(main_port);
destroy_class:
class_destroy(test_class);
free_chrdev:
unregister_chrdev_region(rdev, 1);
return err;
}

static void __exit test_cleanup_module(void)
{
printk(KERN_CRIT "tx_count = %u, modulo = %u\n",
   main_port->tx_count, main_port->modulo);
cdev_del(&main_port->cdev);
device_unregister(main_port->dev);
kfree(main_port);
class_destroy(test_class);
unregister_chrdev_region(rdev, 1);
}

MODULE_LICENSE("GPL v2");
module_init(test_init_module);
module_exit(test_cleanup_module);

# Makefile
obj-m   := ixp-test.o
default:
make -C /usr/local/build/diskless/xscale_be-router-test-linux \
    SUBDIRS=`pwd` ARCH=arm CROSS_COMPILE=armeb-pc-linux-gnu- modules

gcc produces the following assembly code:

 :
   0:   e1a0c00d    mov     ip, sp
   4:   e92dddf0    stmdb   sp!, {r4, r5, r6, r7, r8, sl, fp, ip, lr, pc}
   8:   e24cb004    sub     fp, ip, #4      ; 0x4
   c:   e59f3054    ldr     r3, [pc, #84]   ; 68 <.text+0x68>
  10:   e1a1        mov     r0, r1          /* buf */
  14:   e5938000    ldr     r8, [r3]
  18:   e598a100    ldr     sl, [r8, #256]  /* port->tx_count */
  1c:   e5d86104    ldrb    r6, [r8, #260]  /* port->modulo */
 

Re: Draft SH uClinux FDPIC ABI

2008-02-27 Thread Kaz Kojima
"Joseph S. Myers" <[EMAIL PROTECTED]> wrote:
> Here is a draft FDPIC ABI for SH uClinux, based on the FR-V FDPIC ABI.  
> Please send any comments; CodeSourcery will be implementing the final ABI 
> version in GCC and Binutils.

Wow, great news!

One minor point I'm curious about is the choice of the physical numbers
for the new relocations, though the proposed numbers 70-77 look good.

Regards,
kaz


RE: ARM gcc generates incorrect code?

2008-02-27 Thread Dave Korn
On 27 February 2008 11:48, Krzysztof Halasa wrote:

> Hi,
> 
> not sure where the bug is - gcc 4.2.4pre (CVS), binutils 2.17,
> cross compiler X86_64 -> ARM BE.

  That asm looks a bit odd to me (but I haven't had much coffee today so I
could be reading it wrong):-

> #define get_user(x,p) \
>   ({  \
>   register const u8 __user *__p asm("r0") = (p);  \
>   register unsigned long __r2 asm("r2");  \
>   register int __e asm("r0"); \
>   __asm__ __volatile__ (  \
>   __asmeq("%0", "r0") __asmeq("%1", "r2") \
>   "bl __get_user_1"   \
>   : "=&r" (__e), "=r" (__r2)  \
^ '&' means output operand (zero)
  is early-clobber, so cannot share
  a register with any input operand.

>   : "0" (__p) \
^^ '0' means forcibly share an input 
   operand with operand zero.

>   : "lr", "cc");  \
>   x = (u8) __r2;  \
>   __e;\
>   })

  That's quite likely to do reload's head in, isn't it?

> Gcc bug? get_user() bug? Should I file a bug entry?

  I think the macro could well be wrong.  Do you know why those constraints
were chosen?


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: optimizing predictable branches on x86

2008-02-27 Thread Jan Hubicka
> > At least on x86 it should also be a good idea to know which way
> > the branch is going to go, because it doesn't have explicit branch
> > hints, you really want to be able to optimize the cold branch
> > predictor case if converting from cmov to conditional branches.
> 
> x86 as of Pentium 4 does have branch hint instruction prefixes, but their use 
> is somewhat
> discouraged:
> 
> from http://softwarecommunity.intel.com/articles/eng/3431.htm:
> "
> The Pentium® 4 Processor introduced new instructions for adding static hints 
> to branches. It is
> not recommended that a programmer use these instructions, as they add 
> slightly to the size of the
> code and are static hints only. It is best to use a conditional branch in the 
> manner that the
> static predictor expects, rather than adding these branch hints.

GCC has support for this feature, but it turned out not to gain
anything and was disabled by default, since branch reordering streamlines
the code well enough to match the default predictor behaviour.
Other compiler teams reached the same conclusion; ICC is not
generating the hints either.

Honza


plugin includes for MELT

2008-02-27 Thread Basile STARYNKEVITCH

Hello All,

{sent to the gcc@ mailing list and Bcc- to GlobalGCC partners}

This email is related to the plugin includes question 
http://gcc.gnu.org/ml/gcc/2008-02/msg00373.html 
http://gcc.gnu.org/ml/gcc/2008-02/msg00376.html within (in particular) 
the MELT branch http://gcc.gnu.org/ml/gcc/2008-02/msg00256.html

http://gcc.gnu.org/ml/gcc/2008-02/msg00355.html
http://gcc.gnu.org/wiki/MiddleEndLispTranslator
funded thru the GGCC http://ggcc.info/ ITEA http://itea2.org/ project

My MELT branch [originally I called it basilys] currently does not, but 
should, generate C code during the cc1 execution from some LISP dialect 
(either in memory, or in *.bysl files); the generated C code is then 
compiled as a plugin into a shared object which is dynamically loaded 
thru dlopen (actually thru lt_dlopenext from  - the libtool 
dynamic loader wrapper). I don't care much about the longevity of the 
generated *.c or *.so files; they might need to be regenerated when 
bumping the gcc version (from 4.4 to 4.5 for example)


The point is that every MELT generated C file is a plugin to the 
middle-end hence depends upon all the middle-end stuff notably tree.h 
and many many others.


In its current (sad & buggy) state, MELT is not able to work without the 
GCC build and source trees (and I am using scripts which I uploaded to 
the Wiki page)! This is not acceptable; it should be able to run on a 
system without either of them (though of course some additional files, 
describing the internals of GCC used by MELT plugins, are required)


Practically, every MELT generated file has exactly one include directive:
  #include "run-basilys.h"
the gcc/run-basilys.h is in the MELT branch and of course include many 
other files eg

  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
  #include "tree.h"
  #include "target.h"
  #include "cgraph.h"
  #include "ipa-prop.h"
  #include "tree-flow.h"
  #include "tree-pass.h"
  #include "flags.h"
  #include "timevar.h"
  #include "diagnostic.h"
  #include "tree-dump.h"
  #include "tree-inline.h"
  #include "compiler-probe.h"
  #include 
  #include 
  #include "basilys.h"

So far, my thoughts about all this are:

* some of the *.h are host-specific, but many are target-specific, and 
I have a hard time understanding which files exactly are host-specific 
and which are target-specific

* some of the *.h are generated, hence in the build tree (not in the 
source dir from SVN)


* disk space is cheap, but huge -I... include options are messy, so I am 
thinking of having a single *generated* directory, e.g. in the build 
directory include/gcc-melt-plugin-$(host)--$(target), which is later 
installed in $(DESTDIR)$(includedir)/gcc-melt-plugin-$(host)--$(target)/ 
and which contains all the relevant *.h files needed by run-basilys.h 
(directly or indirectly included by it)



Does all the above make sense?


My understanding (which is poor regarding gcc/Makefile.in) is that 
sys-include/ is not relevant to this discussion


Regards.

--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: ARM gcc generates incorrect code?

2008-02-27 Thread Krzysztof Halasa
"Dave Korn" <[EMAIL PROTECTED]> writes:

>> #define get_user(x,p)
>> \
>>  ({  \
>>  register const u8 __user *__p asm("r0") = (p);  \
>>  register unsigned long __r2 asm("r2");  \
>>  register int __e asm("r0"); \
>>  __asm__ __volatile__ (  \
>>  __asmeq("%0", "r0") __asmeq("%1", "r2") \
>>  "bl __get_user_1"   \
>>  : "=&r" (__e), "=r" (__r2)  \
> ^ '&' means output operand (zero)
>   is early-clobber, so cannot share
>   a register with any input operand.

Well, GCC-Inline-Assembly-HOWTO.html says "An input operand can be
tied to an earlyclobber operand if its only use as an input occurs
before the early result is written", and that seems to be the case here.

Though I'm not sure if it's relevant here.

>>  : "0" (__p) \
> ^^ '0' means forcibly share an input 
>operand with operand zero.
>
>>  : "lr", "cc");  \
>>  x = (u8) __r2;  \
>>  __e;\
>>  })
>
>   I think the macro could well be wrong.  Do you know why those constraints
> were chosen?

No. The macro is the one in the normal Linux ARM kernel.
-- 
Krzysztof Halasa


RE: ARM gcc generates incorrect code?

2008-02-27 Thread Dave Korn
On 27 February 2008 13:07, Krzysztof Halasa wrote:

> "Dave Korn" writes:
> 
>>> #define get_user(x,p)
\
>>> ({  \
>>> register const u8 __user *__p asm("r0") = (p);  \
>>> register unsigned long __r2 asm("r2");  \
>>> register int __e asm("r0"); \
>>> __asm__ __volatile__ (  \
>>> __asmeq("%0", "r0") __asmeq("%1", "r2") \
>>> "bl __get_user_1"   \
>>> : "=&r" (__e), "=r" (__r2)  \
>> ^ '&' means output operand (zero)
>>   is early-clobber, so cannot share
>>   a register with any input operand.
> 
> Well, GCC-Inline-Assembly-HOWTO.html says "An input operand can be
> tied to an earlyclobber operand if its only use as an input occurs
> before the early result is written" and it seems it's the case.

  Hmmm, true, so I guess that shouldn't be a problem, unless a bug has cropped
up in that area of the compiler.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: ARM gcc generates incorrect code?

2008-02-27 Thread Daniel Jacobowitz
On Wed, Feb 27, 2008 at 12:40:37PM -, Dave Korn wrote:
> ^ '&' means output operand (zero)
>   is early-clobber, so cannot share
>   a register with any input operand.

> > : "0" (__p) \
> ^^ '0' means forcibly share an input 
>operand with operand zero.

That's standard.  It just means that if two input operands have the
same value, we can't reuse %0 for the other one.


-- 
Daniel Jacobowitz
CodeSourcery


Re: optimizing predictable branches (Was: ... on x86)

2008-02-27 Thread Jan Hubicka
> This is also interesting for the ARC700 processor.
> 
> There is also an issue if the flag for the conditionalized instruction is
> set in the immediately preceding instruction, and the result of the
> conditionalized instruction is required in the immediately following
> instruction, and if using a conditional branch with a short offset,
> there is also the opportunity to combine a comparison or bit test
> with the branch.
> 
> Moreover, since the ARCompact architecture has a lot more registers than x86,
> if you don't use a frame pointer, there are also realistic
> opportunities to use conditional function returns.
> 
> Already back when I was an SH maintainer, I was annoyed that there is
> only one BRANCH_COST.  We should really have different ones for
> predictable and unpredictable/mispredicted branches.
> 
> Also, it would make sense if the cost could be modified according to if
> the compiler thinks it will be able to schedule a delay slot instruction.
> 
> Ideally alignment could also be taken into account, but that would
> require doing register allocation first, so there appears to be no viable
> pass ordering within the gcc infrastructure to make this work.
> 
> For an exact modeling, we should actually have three branch costs,
> distinguishing the cost from having no prediction to having a wrong
> prediction.
> However, in 'hot' code we can assume we have some prediction - either
> right or wrong - and 'cold' code would typically not matter, unless you
> have a humongous program with very poor locality.
> 
> However, for these reasons I think that COLD_BRANCH_COST is a misnomer,
> and could also prompt port writers to put the wrong value there,
> since it's the mispredicted branches we are interested in.
> MISPREDICTED_BRANCH_COST would be more descriptive.

In the patch, I was using BRANCH_COST for the usual branches, which are
assumed to be badly predictable, and PREDICTABLE_BRANCH_COST for the few
branches we identify as well predictable.

COLD_BRANCH_COST is not for mispredicted branches, but for branches in
regions of the program that are expected to be rarely executed (via profile
feedback or because they lead to a noreturn call, for instance), so it is
basically what BRANCH_COST would be if we were optimizing for size.

On i386, a larger BRANCH_COST tends to increase code size because of
register pressure, and because the sequences expanded when the cmov
instruction is missing tend to be quite ridiculous piles of set-flags or sbb.
So COLD_BRANCH_COST is probably best left at 1 or 0, while both the
predictable and unpredictable costs are higher.

In general I would love to bring our cost model to always have HOT and
COLD variants (or optimize-for-speed / optimize-for-size) and drive
expansion by it.  Ideally -O2 and -Os should differ only in the
default behaviour of the maybe_hot_bb_p and probably_cold_bb_p
predicates.  We are losing quite a good amount of code size benefits by
not doing that with profile feedback on.

I got stuck in GCC 4.2 times with the maybe_hot_insn_p patch; I will try
to come up with something sane for 4.4.  With default optimization we are
poor at predicting the coldness of blocks of the program (noreturn or
__builtin_expect and the hot/cold function attributes are the only reliable
predictors), but this will change with LTO quite easily.

Perhaps to avoid confusion, instead of driving backends to have a single
BRANCH_COST defined, we can have:
PREDICTED_BRANCH_COST (hot_p)
MISPREDICTED_BRANCH_COST (hot_p)
and make them default to BRANCH_COST if they are not defined at all?
Or have
BRANCH_COST (hot_p, predictable_p)?

Note that old pgcc used to have TAKEN and NOT_TAKEN_BRANCH_COST as well
as MISPREDICTION_COST (the names may have been different, I don't remember).
This is perhaps a more accurate model, but I don't see how we can take
reasonable advantage of it, since we reorder branches later anyway.

OK I guess the patch has gained enough interest so I will bring it to
mainline ;)

Honza


Re: Draft SH uClinux FDPIC ABI

2008-02-27 Thread Joseph S. Myers
On Wed, 27 Feb 2008, Kaz Kojima wrote:

> "Joseph S. Myers" <[EMAIL PROTECTED]> wrote:
> > Here is a draft FDPIC ABI for SH uClinux, based on the FR-V FDPIC ABI.  
> > Please send any comments; CodeSourcery will be implementing the final ABI 
> > version in GCC and Binutils.
> 
> Wow, great news!
> 
> One minor point I'm curious about is the choice of the physical numbers
> for the new relocations, though the proposed numbers 70-77 look good.

They were chosen to be a range away from those present in 
include/elf/sh.h.

-- 
Joseph S. Myers
[EMAIL PROTECTED]


Re: decl_constant_value_for_broken_optimization

2008-02-27 Thread Ian Lance Taylor
Paolo Bonzini <[EMAIL PROTECTED]> writes:

> > In the current compiler, it seems very likely that every call to
> > decl_constant_value_for_broken_optimization can simply be removed.
> > The constant propagation passes should implement the optimization.
> 
> What about format checking for constant arrays? :-(  That's the
> testcase that Joseph wrote for the patch that introduced the function.

That uses decl_constant_value, not
decl_constant_value_for_broken_optimization.

Ian


RE: plugin includes for MELT

2008-02-27 Thread Dave Korn
On 27 February 2008 12:57, Basile STARYNKEVITCH wrote:

> My MELT branch [originally I called it basilys] is (currently is not but
> should) generate C code during the cc1 execution 

> The point is that every MELT generated C file is a plugin to the
> middle-end hence depends upon all the middle-end stuff notably tree.h
> and many many others.

> In its current (sad & buggy) state, MELT is not able to work without a
> GCC build and source trees 

> Practically, every MELT generated file has exactly one include directive:
>#include "run-basilys.h"
> the gcc/run-basilys.h is in the MELT branch and of course include many
> other files eg

> So far, my thoughts about all this is:
> 
> * some of the *.h are host- specific, but many are target- specific and
> I have hard time to understand which files exactly are host- specific
> and which one are target- specific

  Does this matter?  Any given compiler only has one combination of target and
host; are you hoping the plugins will be swappable between
differently-configured compilers?

> * some of the *.h are generated, hence in the build tree (not in the
> source dir from SVN)
> 
> * disk space is cheap, but huge -I... include options are messy so I am
> thinking of having a single *generated* directory, e.g. in the build
> directory include/gcc-melt-plugin-$(host)--$(target) which is later
> installed in $(DESTDIR)$(includedir)/gcc-melt-plugin-$(host)--$(target)/
> and which contains all the relevant *.h files needed to run-basilys.h
> (directly or indirectly included by it)

  It might be easiest to just generate a single pre-preprocessed .i file from
run-basilys.h (using -dD) as part of building the compiler, and install it to
the libexec include dir (or a 'melt/' subdir thereof), mightn't it?


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: plugin includes for MELT

2008-02-27 Thread Basile STARYNKEVITCH

Hello All,

A big thanks to Dave Korn, who wrote:

On 27 February 2008 12:57, Basile STARYNKEVITCH wrote:



Practically, every MELT generated file has exactly one include directive:
   #include "run-basilys.h"
the gcc/run-basilys.h is in the MELT branch and of course include many
other files eg



So far, my thoughts about all this is:

* some of the *.h are host- specific, but many are target- specific and
I have hard time to understand which files exactly are host- specific
and which one are target- specific


  Does this matter?  Any given compiler only has one combination of target and
host; are you hoping the plugins will be swappable between
differently-configured compilers?


Of course not, I explained it incorrectly. The plugin is heavily dependent 
on the actual cc1 program dlopen-ing it. The plugin depends on all the 
*.h files used in this particular cc1 that are useful for MELT.



* some of the *.h are generated, hence in the build tree (not in the
source dir from SVN)

* disk space is cheap, but huge -I... include options are messy so I am
thinking of having a single *generated* directory, e.g. in the build
directory include/gcc-melt-plugin-$(host)--$(target) which is later
installed in $(DESTDIR)$(includedir)/gcc-melt-plugin-$(host)--$(target)/
and which contains all the relevant *.h files needed to run-basilys.h
(directly or indirectly included by it)


  It might be easiest to just generate a single pre-preprocessed .i file from
run-basilys.h (using -dD) as part of building the compiler, and install it to
the libexec include dir (or a 'melt/' subdir thereof), mightn't it?


A big thanks for the suggestion! I am a little bit concerned about 
scenarios like the following (this happens often on Debian/Sid):
  the system has some external library, like libc (going from 2.7.0 to 
2.7.1) or libltdl, upgraded in a minor way (no API change). The library's 
deep internal stuff (like the /usr/include/bits/*.h files on Debian) is 
updated a bit. The gcc compiler did not change at all (same version).


I'm trying to understand how other "plugin"-related efforts deal with 
this. Perhaps nobody really cares, but I tend to believe that any plugin 
effort should install the right *.h files outside of the source or build 
directories, for plugins...


Of course, I do know that plugin might mean, to some people (not to me), 
stability at the GCC internal API level. I don't care about this yet.


--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


SSA alias representation

2008-02-27 Thread Fran Baena
>  Symbols with their address taken are only renamed when they appear as
 >  virtual operands.  So, if you have:
 >
 >  p_3 = (i_5 > 10) ? &a : &b
 >  a = 4
 >
 >  notice that 'a' is never renamed in the LHS of the assignment.  It's
 >  renamed as a virtual operand:
 >
 >  p_3 = (i_5 > 10) ? &a : &b
 >
 >  # a_9 = VDEF 
 >  a = 4

 I have looked at -fdump-tree-salias-all-vops, and I have realized that "a"
 is never renamed. That means that a previous scan of the tree is
 needed to know which symbols need to be versioned and which do not.
 I suppose that each definition and use of "a" has its virtual
 operand associated (as shown in the next portion of code), to maintain the
 FUD chain.
 After that previous scan, the renaming pass is applied.

  # a_8 = VDEF 
  a = 1;
  


  p_3 = (i_5 > 10) ? &a : &b

  # a_9 = VDEF 
  a = 4


 Thanks

 Fran


Re: plugin includes for MELT

2008-02-27 Thread Brian Dessent
Basile STARYNKEVITCH wrote:

> I'm trying to understand how other "plugin" related effort deals with
> this.

In an ideal world, you create a plugin API/ABI that is decoupled from
any of the internals of the main program and which has its own headers
and interface.  Plugin authors simply code to that API without involving
any gcc headers.  As a plugin author, this is ideal as you simply grab
this SDK and code your plugin without having to ever touch gcc sources,
and your plugin "just works" with any gcc.

This of course requires a ton more work to create and maintain, since
you have to a) invent a new API that is flexible enough to do everything
that a plugin ever might want to do, and in a way that does not
introduce too many target- or architecture-specific details; b) code
wrappers in gcc that translate the plugin API into the internal
representation; c) maintain those wrappers in the face of changing gcc
internals.  It's been my observation that whenever plugins are
discussed, the majority of gcc maintainers do not want to bear the
maintenance and support burden of this level of decoupling, so it's kind
of a pie-in-the-sky position, I think.

Brian


RE: plugin includes for MELT

2008-02-27 Thread Dave Korn
On 27 February 2008 18:26, Basile STARYNKEVITCH wrote:

> I'm trying to understand how other "plugin" related effort deals with
> this. Perhaps nobody really cares, but I tend to believe that any plugin
> effort should install the right *.h files outside of the source or build
> directories, for plugins...
> 
> Of course, I do know that plugin might mean to some people (not to me) a
> stability in the GCC in the internal API level. I don't care about this yet.

  I think you already have the answer.  Usually public plugin-APIs have to go
to great lengths to make the interface completely stable, then the plugins can
be distributed as object files.  On the other hand, you aren't planning on
offering a stable API (and that would be quite difficult considering how
quickly and significantly gcc internals change from version to version), so
there is no advantage in distributing the headers separately from the specific
gcc that they were used to build.

  So, since you are planning to compile the plugin during cc1 execution
anyway, why not just say that

 - plugins are distributed as source
 - the compiler keeps the gcc-private headers in its private libexec include
subdir, thus automatically making the correct headers go along with the
correct version+host+target compiler
 - when cc1 runs, it compiles any plugins needed (possibly caching the
compiled objects in another libexec subdir) and dlopens them.


  Then, as you gain experience with MELT and it becomes clearer to you which
parts of gcc's internal api are stable enough to be exposed, you can gradually
build up to a public, SDK-like header file, which exposes only those details
of the internals that you are either confident enough will not change, or that
you feel would not be too difficult to provide a backward-compatibility shim
layer for if they do change.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: plugin includes for MELT

2008-02-27 Thread Basile STARYNKEVITCH

Hello All,

Dave Korn wrote:

On 27 February 2008 18:26, Basile STARYNKEVITCH wrote:

  So, since you are planning to compile the plugin during cc1 execution
anyway, why not just say that

 - plugins are distributed as source


Yes, exactly. And to be more precise, all MELT plugin C code is 
generated (from some MELT Lisp code). This leaves the GCC API dependency 
to the generating machinery.


Of course I have to deal with the dependency of the MELT Lisp code on 
the GCC internals, but I have some ideas about this (basically: stay at 
the highest level possible).



 - the compiler keeps the gcc-private headers in its private libexec include
subdir, thus automatically making the correct headers go along with the
correct version+host+target compiler


I think it should be (in gcc/Makefile.in parlance) 
$(DESTDIR)$(libexecsubdir)/melt-private-include/ and I should have some 
Makefile.in trick to copy the relevant *.h files there, perhaps thru an 
install-melt-includes target



 - when cc1 runs, it compiles any plugins needed (possibly caching the
compiled objects in another libexec subdir) and dlopens them.


  Then, as you gain experience with MELT and it becomes clearer to you which
parts of gcc's internal api are stable enough to be exposed, you can gradually
build up to a public, SDK-like header file, which exposes only those details
of the internals that you are either confident enough will not change, or that
you feel would not be too difficult to provide a backward-compatibility shim
layer for if they do change.


Yes, exactly. But this SDK-like header does not really exist yet: all the 
coding in MELT is done in Lisp (and some of that Lisp code would change 
as GCC evolves).


Actually, my current concern is just the part of MELT which translates 
Lisp into C (the warm-basilys.bysl file, coded in my Lisp dialect, and 
[almost] able to translate itself into C), and as you might guess this 
part does not depend much on all the (unstable) GCC API, just on a more 
stable subset (mostly ggc.h and basilys.h, and the fact that tree [and a 
very few other GCC types] are "opaque" pointers).


Actually the whole MELT idea is to separate, as much as possible, code 
heavily depending on the unstable GCC API from code depending only on 
more stable material (like ggc.h & basilys.h):
  the MELT translator warm-basilys.bysl and the MELT-related runtime 
basilys.c do not depend much on the unstable GCC API. For instance, they 
know about the tree typename, but do not care about the details of 
tree.h. Regarding trees (in the GCC sense), they only depend upon the 
fact that tree is a GGC-ed pointer.
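
The "opaque pointer" discipline described above can be sketched in a few
lines of C (melt_tree_box and melt_box_tree are invented names for
illustration; only the `union tree_node *` typedef matches GCC's actual
definition):

```c
/* Forward declaration only: this code never includes tree.h, so the
   layout of union tree_node stays unknown here.  */
union tree_node;
typedef union tree_node *tree;	/* matches GCC's own typedef */

/* A hypothetical MELT-side box: trees are stored and passed around
   as opaque values, never dereferenced or decoded.  */
struct melt_tree_box {
	tree val;
};

/* Store a tree without looking inside it.  */
static struct melt_tree_box
melt_box_tree(tree t)
{
	struct melt_tree_box b;
	b.val = t;
	return b;
}
```

Only GGC needs to know what a tree really is; code written this way
survives changes to tree.h untouched.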


  more specific stuff goes in other files (mostly still to be written)


Dave, a big thanks for your insights & help. They were invaluable!

--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: plugin includes for MELT

2008-02-27 Thread Basile STARYNKEVITCH

Hello All,


Basile STARYNKEVITCH wrote:


I think it should be (in gcc/Makefile.in parlance) 
$(DESTDIR)$(libexecsubdir)/melt-private-include/ and I should have some 
Makefile.in trick to copy the relevant *.h there perhaps thru a 
install-melt-includes target



The one detail I don't understand yet is the link between the -B option 
to gcc and this $(DESTDIR)$(libexecsubdir)/: are they somehow equal, or 
is the -B value some initial prefix of $(DESTDIR)$(libexecsubdir)?


Regards.

--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


gcc-4.2-20080227 is now available

2008-02-27 Thread gccadmin
Snapshot gcc-4.2-20080227 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20080227/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch 
revision 132729

You'll find:

gcc-4.2-20080227.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.2-20080227.tar.bz2 C front end and core compiler

gcc-ada-4.2-20080227.tar.bz2  Ada front end and runtime

gcc-fortran-4.2-20080227.tar.bz2  Fortran front end and runtime

gcc-g++-4.2-20080227.tar.bz2  C++ front end and runtime

gcc-java-4.2-20080227.tar.bz2 Java front end and runtime

gcc-objc-4.2-20080227.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.2-20080227.tar.bz2  The GCC testsuite

Diffs from 4.2-20080220 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: [PATCH] linux/fs.h - Convert debug functions declared inline __attribute__((format (printf,x,y) to statement expression macros

2008-02-27 Thread David Rientjes
On Tue, 26 Feb 2008, Joe Perches wrote:

> > Joe, what version of gcc are you using?
> 
> $ gcc --version
> gcc (GCC) 4.2.2 20071128 (prerelease) (4.2.2-3.1mdv2008.0)
> 
> It's definitely odd.
> The .o size changes are inconsistent.
> Some get bigger, some get smaller.
> 
> The versioning ones I understand but I have no idea why
> changes in drivers/ or mm/ or net/ exist.
> 

When I did the same comparisons on my x86_64 defconfig with gcc 4.1.3, I 
only saw differences in drivers/ and fs/.

> I think it's gcc optimization changes, but dunno...
> Any good ideas?
> 

What's interesting about this is that it doesn't appear to be related to 
your change (static inline function to macro definition).  It appears to 
come simply from removing the static inline function.

The only reference to __simple_attr_check_format() in either the x86 or 
x86_64 defconfig is via DEFINE_SIMPLE_ATTRIBUTE() in fs/debugfs/file.c.

If you remove the only reference to it:

diff --git a/include/linux/fs.h b/include/linux/fs.h
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2044,7 +2044,6 @@ static inline void simple_transaction_set(struct file *file, size_t n)
 #define DEFINE_SIMPLE_ATTRIBUTE(__fops, __get, __set, __fmt)   \
 static int __fops ## _open(struct inode *inode, struct file *file) \
 {  \
-   __simple_attr_check_format(__fmt, 0ull);\
return simple_attr_open(inode, file, __get, __set, __fmt);  \
 }  \
 static struct file_operations __fops = {   \

The text size remains the same:

   text    data     bss     dec     hex filename
5386111  846328  719560 6951999  6a143f vmlinux.before
5386111  846328  719560 6951999  6a143f vmlinux.after

Yet if you remove the reference _and_ the static inline function itself, 
replacing it with nothing:

diff --git a/include/linux/fs.h b/include/linux/fs.h
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2044,7 +2044,6 @@ static inline void simple_transaction_set(struct file *file, size_t n)
 #define DEFINE_SIMPLE_ATTRIBUTE(__fops, __get, __set, __fmt)   \
 static int __fops ## _open(struct inode *inode, struct file *file) \
 {  \
-   __simple_attr_check_format(__fmt, 0ull);\
return simple_attr_open(inode, file, __get, __set, __fmt);  \
 }  \
 static struct file_operations __fops = {   \
@@ -2055,12 +2054,6 @@ static struct file_operations __fops = {   \
.write   = simple_attr_write,   \
 };
 
-static inline void __attribute__((format(printf, 1, 2)))
-__simple_attr_check_format(const char *fmt, ...)
-{
-   /* don't do anything, just let the compiler check the arguments; */
-}
-
 int simple_attr_open(struct inode *inode, struct file *file,
 int (*get)(void *, u64 *), int (*set)(void *, u64),
 const char *fmt);

The text size does become smaller:

   text    data     bss     dec     hex filename
5386111  846328  719560 6951999  6a143f vmlinux.before
5386047  846328  719560 6951935  6a13ff vmlinux.after

gcc 4.0.3 maintains the same text size for both cases, while it appears 
gcc 4.1.3 and your version, 4.2.2, have this different behavior.
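
For readers following along, the pattern under discussion can be reduced
to a minimal, self-contained sketch (check_format and LOG are invented
names for illustration; the kernel's versions are
__simple_attr_check_format and DEFINE_SIMPLE_ATTRIBUTE):

```c
#include <stdio.h>

/* A no-op variadic function whose only purpose is to let GCC's
   -Wformat machinery check that the format string and the arguments
   agree; the call itself does nothing at run time.  */
static inline void __attribute__((format(printf, 1, 2)))
check_format(const char *fmt, ...)
{
	/* don't do anything, just let the compiler check the arguments */
}

/* A macro that pipes its format string through the checker before
   actually using it.  ##__VA_ARGS__ is the GNU extension that drops
   the comma when no variadic arguments are given.  */
#define LOG(fmt, ...) do {			\
	check_format(fmt, ##__VA_ARGS__);	\
	printf(fmt, ##__VA_ARGS__);		\
} while (0)
```

With this, `LOG("%d\n", 42);` compiles cleanly, while
`LOG("%d\n", "oops");` draws a -Wformat warning; the question in this
thread is whether the no-op call also leaves residue in the emitted code.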

David


birthpoints in rtl.

2008-02-27 Thread Kenneth Zadeck
I want to start a discussion about some possible changes to the RTL
level of GCC.

This discussion is motivated by some of the issues raised in bug
26854.  We have addressed many of the issues in this bug, but the
remaining issue is cost, in both time and space, for the UD and DU
chains built by several back end optimizations.

I have started one possible fix for this bug, but as I do the work, I
am less convinced that it really is the best way to go.  Currently UD
or DU chains are represented as linked lists. If a pass asks for
DU-chains, there is a linked list from each def of a reg to each use
that it reaches.  

These linked lists are a big part of the space usage of this bug and I
believe a big part of the space usage of the back end, especially as
more passes are upgraded from just using the live information.

My original plan had been to convert these linked lists to VECs,
much in the way that basic block edges are represented.  However, I
believe that this is unlikely to help: currently each element of the
linked list takes two words - if I convert this to VECs, there will be
two malloc overheads per ref along with fields of the VEC and the 25%
wasted space for the data.  All of this to remove the next pointer.  
While I believe that this will improve the n**2 cases, I believe that
in most cases this will not win.

For DU and UD chains, the n**2 cases come from programs that look like
two case statements that follow each other (there are a lot of other
situations that can cause this, but this is the easiest to
visualize).   In the first case statement, each arm has a def of R and in
the second case statement each a use of R.  The number of DU chains
for this example is [the number of defs for R] * [the number of uses
for R].
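
The shape described above can be written down concretely (a hypothetical
example, not taken from the bug report): with N defs and M uses,
list-based DU chains need N*M links.

```c
/* Illustration of the n**2 case: every arm of the first switch is a
   def of r, every arm of the second is a use, so DU chains relate
   each of the 3 defs to each of the 3 uses: 9 links in total.  */
int f(int sel1, int sel2)
{
	int r = 0;

	switch (sel1) {		/* 3 defs of r */
	case 0: r = 10; break;
	case 1: r = 20; break;
	default: r = 30; break;
	}

	switch (sel2) {		/* 3 uses of r */
	case 0: return r + 1;
	case 1: return r + 2;
	default: return r + 3;
	}
}
```

Grow each switch to a few hundred arms and the chain count grows
quadratically while the program itself grows only linearly.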

To solve this, I would like to propose a data structure that first
appeared in the open literature in a paper by Reif and Lewis, though I
suspect that it was originally developed by Shapiro and Saint.  This is
the BIRTHPOINT, and it was an idea that heavily influenced Wegman and
myself when we first developed SSA form.  The easiest way to explain a
birthpoint (these days) is to say that a birthpoint is a simple noop
move that is inserted everywhere that one would normally add a phi
function, i.e. it is missing the operands for each incoming edge.

Birthpoints are not nearly as useful as phi-functions because the
algorithms that use birthpoints do not generally leave the birthpoints
in the right places when they are finished.  There is a lot of value
added by the operand of phi-functions.  But they do solve the n**2
case for DU and UD chains (and because of the better SSA building
algorithms than were available when Reif and Lewis first proposed
their technique, will be much faster).

It will be possible to use some of the existing into-SSA machinery
(adapted to work over RTL rather than trees or gimple) to both find
the birthpoints and to build the chains (this is what Reif and Lewis
did with their non-conditional constant propagator), and so it would
allow us to drop the RD dataflow problem, which is itself a time and
space hog.

The appeal for birthpoints is that unlike the abortive attempt in
the past to add SSA to RTL, adding noop moves does not really mess
up anything.  We could either add them only in passes that use DU or
UD chains and get rid of them at the end of the pass or we could leave
them and only get rid of them in passes where they might get in the
way, like RA.  However, without the operands to the phis, you cannot
do SSA optimizations like conditional constant propagation.

There is the complication of how to add the noop move in the presence
of SUBREGs, and given the amount of pain that I suffered in adding the
moves for the DSE pass, I would need to get the help of one of the
active SUBREG elite, like Bonzini, Iant or Rsandifo to help.
However, I believe that birthpoints will remove many of the time and space
problems that have arisen because of the new usage of DU and UD chains
at the RTL level.

Assuming that the SUBREG issue is one that can be easily solved, this is
a big step toward being able to put SSA-like technology into the gcc backend
without breaking everything.

Comments? Volunteers?

Kenny


If anyone wants to see the references to the papers by Shapiro and
Saint or Reif and Lewis, I will be happy to send the references.
They are both obscure and difficult to understand.  
 



bootstrap failure on i686

2008-02-27 Thread Benjamin Kosnik

last 24 hrs I get this:

make[2]: Entering directory `/mnt/share/bld/gcc'
make[3]: Entering directory `/mnt/share/bld/gcc'
rm -f stage_current
make[3]: Leaving directory `/mnt/share/bld/gcc'
Comparing stages 2 and 3
warning: ./cc1-checksum.o differs
warning: ./cc1plus-checksum.o differs
Bootstrap comparison failure!
./cfgloopmanip.o differs
./tree-ssa-copy.o differs
make[2]: *** [compare] Error 1
make[2]: Leaving directory `/mnt/share/bld/gcc'
make[1]: *** [stage3-bubble] Error 2
make[1]: Leaving directory `/mnt/share/bld/gcc'
make: *** [bootstrap-lean] Error 2

-benjamin


Re: [PATCH] linux/fs.h - Convert debug functions declared inline __attribute__((format (printf,x,y) to statement expression macros

2008-02-27 Thread Jan Hubicka
>  
> -static inline void __attribute__((format(printf, 1, 2)))
> -__simple_attr_check_format(const char *fmt, ...)

It would be nice to have a testcase, but I guess it is because GCC can't
inline variadic functions.  The function gets identified as const and
removed as unused by DCE, but this happens later (that is, after early
inlining and before real inlining).  GCC 4.0.3 didn't have the early
inliner, so that is probably where the difference is coming from.

One possibility to handle this side case would be to mark const
functions early during early optimization and only refine them later
using Kenny's existing IPA pass; that should turn this issue into a
no-op.

We probably also can simply allow inlining variadic functions not
calling va_start.  I must say that this option had occurred to me but I was
unable to think of any sane use case.  This probably is one ;)
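
The case in question looks like this minimal sketch (check_only and
twice are invented names for illustration):

```c
/* A variadic function that never calls va_start: inlining it would be
   semantically safe, yet GCC rejects it as an inline candidate merely
   for being variadic, so the out-of-line body survives until late DCE
   removes it.  */
static inline void __attribute__((format(printf, 1, 2)))
check_only(const char *fmt, ...)
{
	/* intentionally empty: exists only for -Wformat checking */
}

static int twice(int x)
{
	check_only("%d", x);	/* format-checked no-op call */
	return 2 * x;
}
```

Allowing the inliner to accept check_only here would make the call
vanish in early inlining rather than lingering until late DCE.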

Honza
> -{
> - /* don't do anything, just let the compiler check the arguments; */
> -}
> -
>  int simple_attr_open(struct inode *inode, struct file *file,
>int (*get)(void *, u64 *), int (*set)(void *, u64),
>const char *fmt);
> 
> The text size does become smaller:
> 
>textdata bss dec hex filename
> 5386111  846328  719560 6951999  6a143f vmlinux.before
> 5386047  846328  719560 6951935  6a13ff vmlinux.after
> 
> gcc 4.0.3 maintains the same text size for both cases, while it appears 
> gcc 4.1.3 and your version, 4.2.2, have this different behavior.
> 
>   David


Excess registers pushed - regs_ever_live not right way?

2008-02-27 Thread Andrew Hutchinson
Register saves by prolog (pushes) are typically made with reference to 
"df_regs_ever_live_p()" or "regs_ever_live".


If my understanding is correct,  these calls reflect register USEs and 
not register DEFs. So if register is used in a function, but not 
otherwise changed, it will get pushed unnecessarily on stack by prolog.


(as noted in this bug  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871)

I checked a couple of other ports but they all use 
df_regs_ever_live_p().  Indeed this is the method noted in the manual.


The question is, what df routine or variable can be used to determine 
which registers are DEFs and hence destructively used by a function?


Maybe "df_invalidated_by_call" in conjunction with 
"df_get_call_refs()", perhaps?


Andy





Re: Excess registers pushed - regs_ever_live not right way?

2008-02-27 Thread Seongbae Park (박성배, 朴成培)
On Wed, Feb 27, 2008 at 5:16 PM, Andrew Hutchinson
<[EMAIL PROTECTED]> wrote:
> Register saves by prolog (pushes) are typically made with reference to
>  "df_regs_ever_live_p()" or  "regs_ever_live. "||
>
>  If my understanding is correct,  these calls reflect register USEs and
>  not register DEFs. So if register is used in a function, but not
>  otherwise changed, it will get pushed unnecessarily on stack by prolog.

This implies that the register is either a global register
or a parameter register; in either case it won't be saved/restored
as callee-save.
What kind of register is it, and how come there's only a use of it in a
function but it's not a global?

Seongbae


Re: Excess registers pushed - regs_ever_live not right way?

2008-02-27 Thread Andrew Hutchinson
The register contains a parameter that is passed to the function. This 
register is not part of the call-used set.


If this type of register were modified by the function, then it would be 
saved by the function.


If this register is not modified by the function, it should not be saved. 
This is true even if the function is not a leaf function (as the same 
register would be preserved by deeper calls).



Andy



Seongbae Park (박성배, 朴成培) wrote:

On Wed, Feb 27, 2008 at 5:16 PM, Andrew Hutchinson
<[EMAIL PROTECTED]> wrote:
  

Register saves by prolog (pushes) are typically made with reference to
 "df_regs_ever_live_p()" or  "regs_ever_live. "||

 If my understanding is correct,  these calls reflect register USEs and
 not register DEFs. So if register is used in a function, but not
 otherwise changed, it will get pushed unnecessarily on stack by prolog.



This implies that the register is either a global register
or a parameter register, in either case it won't be saved/restored
as callee save.
What kind of a register is it and how com there's only use of it in a function
but it's not a global ?

Seongbae

  


Re: Excess registers pushed - regs_ever_live not right way?

2008-02-27 Thread Seongbae Park (박성배, 朴成培)
You can use DF_REG_DEF_COUNT() - if this is indeed a parameter register,
there should be only one def (artificial def) or no def at all.
Or if you want to see all defs for the reg,
follow DF_REG_DEF_CHAIN().
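
As a non-runnable GCC-internals sketch (pseudocode against the df API;
the threshold and the artificial-def accounting are assumptions a port
would need to verify):

```
/* Skip the prologue save for a call-saved register that is live but
   never really defined: per the advice above, a parameter register
   that is only read shows no def, or a single artificial one.  */
if (df_regs_ever_live_p (regno)
    && DF_REG_DEF_COUNT (regno) <= 1
    && !call_used_regs[regno])
  ; /* only used, never clobbered: no prologue push needed */
```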

Seongbae

On Wed, Feb 27, 2008 at 6:03 PM, Andrew Hutchinson
<[EMAIL PROTECTED]> wrote:
> Register contains  parameter that is passed to function. This register
>  is not part of call used set.
>
>  If this type of register were modified by function, then it would be
>  saved by function.
>
>  If this register is not modified by function, it should not be saved.
>  This is true even if function is not a leaf function (as same register
>  would be preserved by deeper calls)
>
>
>  Andy
>
>
>
>
>
>  Seongbae Park (박성배, 朴成培) wrote:
>  > On Wed, Feb 27, 2008 at 5:16 PM, Andrew Hutchinson
>  > <[EMAIL PROTECTED]> wrote:
>  >
>  >> Register saves by prolog (pushes) are typically made with reference to
>  >>  "df_regs_ever_live_p()" or  "regs_ever_live. "||
>  >>
>  >>  If my understanding is correct,  these calls reflect register USEs and
>  >>  not register DEFs. So if register is used in a function, but not
>  >>  otherwise changed, it will get pushed unnecessarily on stack by prolog.
>  >>
>  >
>  > This implies that the register is either a global register
>  > or a parameter register, in either case it won't be saved/restored
>  > as callee save.
>  > What kind of a register is it and how com there's only use of it in a 
> function
>  > but it's not a global ?
>  >
>  > Seongbae
>  >
>  >
>



-- 
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";


Re: Excess registers pushed - regs_ever_live not right way?

2008-02-27 Thread Ian Lance Taylor
Andrew Hutchinson <[EMAIL PROTECTED]> writes:

> Register contains  parameter that is passed to function. This register
> is not part of call used set.

It's very odd to pass parameters in a register which the callee may
not modify.  What target is this?

Ian


Re: Excess registers pushed - regs_ever_live not right way?

2008-02-27 Thread Andrew Hutchinson
Thanks

I will check this.

The DF dump in the RTL file does not list artificial defs, which is what I
think I need. However, I do note that all potential parameter registers
(including those unused) are listed as invalidated by call, which
means 1 (or more) defs. So, as you suggest, I just need to find the count.

Andy




Seongbae Park (박성배, 朴成培) wrote:
> You can use DF_REG_DEF_COUNT() - if this is indeed a parameter register,
> there should be only one def (artificial def) or no def at all.
> Or if you want to see all defs for the reg,
> follow DF_REG_DEF_CHAIN().
>
> Seongbae
>
> On Wed, Feb 27, 2008 at 6:03 PM, Andrew Hutchinson
> <[EMAIL PROTECTED]> wrote:
>   
>> Register contains  parameter that is passed to function. This register
>>  is not part of call used set.
>>
>>  If this type of register were modified by function, then it would be
>>  saved by function.
>>
>>  If this register is not modified by function, it should not be saved.
>>  This is true even if function is not a leaf function (as same register
>>  would be preserved by deeper calls)
>>
>>
>>  Andy
>>
>>
>>
>>
>>
>>  Seongbae Park (박성배, 朴成培) wrote:
>>  > On Wed, Feb 27, 2008 at 5:16 PM, Andrew Hutchinson
>>  > <[EMAIL PROTECTED]> wrote:
>>  >
>>  >> Register saves by prolog (pushes) are typically made with reference to
>>  >>  "df_regs_ever_live_p()" or  "regs_ever_live. "||
>>  >>
>>  >>  If my understanding is correct,  these calls reflect register USEs and
>>  >>  not register DEFs. So if register is used in a function, but not
>>  >>  otherwise changed, it will get pushed unnecessarily on stack by prolog.
>>  >>
>>  >
>>  > This implies that the register is either a global register
>>  > or a parameter register, in either case it won't be saved/restored
>>  > as callee save.
>>  > What kind of a register is it and how com there's only use of it in a 
>> function
>>  > but it's not a global ?
>>  >
>>  > Seongbae
>>  >
>>  >
>>
>> 
>
>
>
>   


Re: birthpoints in rtl.

2008-02-27 Thread Alexandre Oliva
On Feb 27, 2008, Kenneth Zadeck <[EMAIL PROTECTED]> wrote:

> The appeal for birthpoints is that unlike the abortive attempt in
> the past to add SSA to RTL, adding a noop moves does not really mess
> up anything.

IIRC, when people tried to do RTL SSA, the problem was with match_dups
in IN/OUT operands.

I think there's a relatively simple way to deal with this: an RTL form
that takes two operands, one for input, one for output, that would
represent the SSA names for input and output that must be coalesced to
the same variable when going out of SSA, say, during register
allocation.

Has something like this been considered before?

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Rant about ChangeLog entries and commit messages - better to do something than just complain

2008-02-27 Thread Alexandre Oliva
On Feb 23, 2008, Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Sat, Feb 23, 2008 at 10:53:53AM -0500, Daniel Jacobowitz wrote:
>> On Sat, Feb 23, 2008 at 08:52:41PM +1100, Tim Josling wrote:
>> > I wrote a little proof-of-concept script to take the mailing list
>> > archives and the ChangeLog files and annotate the ChangeLog files with
>> > the URLs of the probable email containing the patch.
>> 
>> This is really awesome.  Thank you!  I hope we can get these hosted
>> and maybe hyperlinked somewhere on a permanent basis.

> Agreed. Even nicer would be if the ChangeLogs in the repository
> were just updated with this.

+1

Thanks, Tim!  Great stuff!

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


how to use testsuite to check vectorization capabilities

2008-02-27 Thread Jaishri
Hello all,
I am studying vectorization in GCC.
I want to run the test cases given in gcc/gcc/testsuite/gcc.dg/vect.
Any pointer will be of great help to me.

Thanks in advance

Jaishri



Re: Draft SH uClinux FDPIC ABI

2008-02-27 Thread Alexandre Oliva
On Feb 26, 2008, "Joseph S. Myers" <[EMAIL PROTECTED]> wrote:

> Here is a draft FDPIC ABI for SH uClinux, based on the FR-V FDPIC ABI.  
> Please send any comments; CodeSourcery will be implementing the final ABI 
> version in GCC and Binutils.

Cool!  Great news!

> In the picture above, function descriptors are placed at negative
> offsets relative to R12 and the GOT data address entries are placed at
> positive offsets relative to R12.  The link editor is free to place
> either the function descriptors at positive offsets (subject to
> alignment constraints) or the data address entries at negative
> offsets.  Also, note that there is no requirement that the function
> descriptors or data address entries have any particular grouping.

It was the need for using the atomic 64-bit load/store instructions
that motivated us to use 64-bit alignment for function descriptors on
FR-V.  Since SH doesn't have 64-bit load/store instructions, there's
no need for function descriptors to be aligned to 64-bit boundaries.

It was the different alignment of function descriptors and other
pointers in the GOT, and the fact that the reserved area had an odd
number of words, that motivated us to suggest the arrangement of
negative even-word offsets for descriptors and positive
even-or-odd-word offsets for other pointers.  Since you have relaxed the
alignment requirements for descriptors (I don't see it mandated
anywhere), you might as well leave the placement of descriptors and
other pointers completely free.  I think this could greatly simplify
the implementation of the linker.  If you look at the trouble I had to
try to accommodate 12-bit, 16-bit and 32-bit offsets without
overflowing the narrower offsets if at all possible, you'll probably
quickly change your mind about it.  Although you may want to optimize
for immediate offsets, 20-bit offsets and 32-bit offsets as well, so
you may sort of be in the same boat, at least for SH-2A.  Fun! :-)

> This arrangement will not make processes that the debugger attaches to
> after they are mapped in look like they have independent sets of
> breakpoints; they may just crash instead of they reach a breakpoint
> instruction set with ptrace for another process.

Typo (carried over from the original): instead of => instead, if

I've just fixed it in FR-V FDPIC 1.0b.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: how to use testsuite to check vectorization capabilities

2008-02-27 Thread Tehila Meyzels
I think:
make check-gcc RUNTESTFLAGS="vect.exp"
is what you're looking for.

Tehila.

[EMAIL PROTECTED] wrote on 28/02/2008 08:32:21:

> Hello all,
> I am studying vectorization in GCC.
> I want to run the test cases given in gcc/gcc/testsuite/gcc.dg/vect
> Any pointer will be of great help for me.
>
> Thanks in advance
>
> Jaishri
>