RE: preprocessing question

2006-09-26 Thread Dave Korn
On 26 September 2006 08:43, Jan Beulich wrote:

 Daniel Jacobowitz <[EMAIL PROTECTED]> 25.09.06 18:43 >>>
>> On Mon, Sep 25, 2006 at 05:23:34PM +0200, Jan Beulich wrote:
>>> Can anyone set me straight on why, in the following code fragment
>>> 
>>> int x(unsigned);
>>> 
>>> struct alt_x {
>>> unsigned val;
>>> };
>>> 
>>> #define x alt_x
>>> #define alt_x(p) x(p+1)
>>> 
>>> int test(struct x *p) {
>>> return x(p->val);
>>> }
>>> 
>>> the function invoked in test() is alt_x (rather than x)? I would have
>>> expected that the preprocessor
>>> - finds that x is an object-like macro, and replaces it with alt_x
>>> - finds that alt_x is a function-like macro and replaces it with x(...)
>>> - finds that again x is an object-like macro, but recognizes that it
>>> already participated in expansion, so doesn't replace x by alt_x a
>>> second time.

> While, as Andreas also pointed out, the standard is a little vague in
> some of what it tries to explain here, it is in my opinion clearly said
> that the re-scanning restrictions are bound to the macro name, not
> the fact that a function-like macro's expansion result is being
> re-scanned. Hence, the re-scanning process of x has to be
> considered still in progress while expanding alt_x, and consequently
> x should not be subject to expansion anymore.

  Actually, it's because cpp has a special exception for recursive macros,
isn't it?  After

>>> #define x alt_x

the preprocessor token "x" is an object-like macro standing for "alt_x", so
when we get to

>>> #define alt_x(p) x(p+1)

  what the preprocessor sees after the first round of expansion is

#define alt_x(p) alt_x(p+1)

  at which point cpp's "No recursive expansion" rule kicks in and all further
expansion stops.

" If the expansion of a macro contains its own name, either directly or
via intermediate macros, it is not expanded again when the expansion is
examined for more macros.  This prevents infinite recursion.  *Note
Self-Referential Macros::, for the precise details.  "


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Explicit field layout

2006-09-26 Thread Bernd Jendrissek
On Mon, Sep 25, 2006 at 10:04:51AM +0200, Ricardo FERNANDEZ PASCUAL wrote:
> I am sorry, but I fail to see the relation of this with rpcgen (which
> as far I know is a code generator for the RPC protocol). Am I looking
> at the wrong rpcgen?

It's the right rpcgen, and was just a random reminder in case it would
give you what you want without having to invent a whole new language if
all you want is to be able to read/write structured data from/to an
octet stream.

-- 
A PC without Windows is like ice cream without ketchup.


Re: preprocessing question

2006-09-26 Thread Neil Booth
Jan Beulich wrote:-

> Can anyone set me straight on why, in the following code fragment
> 
> int x(unsigned);
> 
> struct alt_x {
>   unsigned val;
> };
> 
> #define x alt_x
> #define alt_x(p) x(p+1)
> 
> int test(struct x *p) {
>   return x(p->val);
> }
> 
> the function invoked in test() is alt_x (rather than x)? I would have
> expected that the preprocessor
> - finds that x is an object-like macro, and replaces it with alt_x
> - finds that alt_x is a function-like macro and replaces it with x(...)
> - finds that again x is an object-like macro, but recognizes that it
> already participated in expansion, so doesn't replace x by alt_x a
> second time.
> 
> Our compiler team also considers this misbehavior, but since I
> tested three other compilers, and they all behave the same, I
> continue to wonder if I'm mis-reading something in the standard.

The way GCC works is, once you've had to suck in the '(' for the
expansion of alt_x, you've left x, and further replacements are not
considered nested within x.

This is the behaviour of Dave Prosser's algorithm that the C90
committee agreed on, and then converted to (ambiguous) words.
Their general approach was to expand as much as possible when you
could be sure of not recursing.  Since the '(' leaves the original
expansion to suck in more tokens this is the case.

You will find GCC behaves the same even if just the ')' leaves the
original x.

This tends also to be the most useful behaviour when writing macros,
and *is* relied upon by a lot of software.  Changing the expansion
rules will break a lot of GCC's testcases, which I intentionally
wrote to cover its algorithm pretty well, so it couldn't be accidentally
broken in future.

Neil.


C question: typecast changes behavior with optimizations enabled.

2006-09-26 Thread Adam Dickmeiss

Consider the attached which sweeps through an array of chars..

1) If a typecast is used (CAST defined), the *src is not updated and 
main will see (sz == 0).


2) If no typecast is used, *src is updated, and sz == 1.

I expected that *src would be updated in both cases (hence the assert).. 
But I am probably wrong.. and would like to know why:-)


[EMAIL PROTECTED]:~$ gcc -O3 o2tst.c  && ./a.out
a.out: o2tst.c:22: main: Assertion `sz == 1' failed.
Aborted
[EMAIL PROTECTED]:~$ gcc -O2 o2tst.c  && ./a.out
a.out: o2tst.c:22: main: Assertion `sz == 1' failed.
Aborted
[EMAIL PROTECTED]:~$ gcc -O o2tst.c  && ./a.out
[EMAIL PROTECTED]:~$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v 
--enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr 
--enable-shared --with-system-zlib --libexecdir=/usr/lib 
--without-included-gettext --enable-threads=posix --enable-nls 
--program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-mpfr --with-tune=i686 
--enable-checking=release i486-linux-gnu

Thread model: posix
gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13)

Same behavior on
[EMAIL PROTECTED]:~/proj/idzebra-2.0.2/isamb$ gcc -v
Reading specs from /usr/local/lib/gcc/sparc-sun-solaris2.9/3.4.6/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as 
--with-ld=/usr/ccs/bin/ld --enable-shared --enable-languages=c,c++,f77

Thread model: posix
gcc version 3.4.6

/ Adam
#include <assert.h>
#include <stddef.h>

#define CAST (const unsigned char **)
// #define CAST 

static void decode_ptr(const char **src)
{
unsigned char c;
while ((c = *(* CAST src)++))
	;
}

int main(int argc, char **argv)
{
const char *src0 = "";
const char *src = src0;
size_t sz;

decode_ptr(&src);
sz = src - src0;
assert(sz == 1);
}



Re: what to do with this testcase?

2006-09-26 Thread Diego Novillo
Andrew MacLeod wrote on 09/26/06 10:34:

> 1 - eliminate test case (this is the easy choice! :-) 
> 2 - keep the testcase, remove the option.  (It probably doesn't really
> test anything then, so you might as well remove it)
> 3 - make a new testcase which doesn't require -fno-tree-lrs.
> 
#2.  It probably tests nothing now, but extra code to pass through the
compiler never hurts.


what to do with this testcase?

2006-09-26 Thread Andrew MacLeod

I've got the new out of ssa rewrite pretty much wrapped up, and in the
process I have removed the -fno-tree-lrs option.  This means we can no
longer turn off live range splitting at the tree level.   I mentioned I
was planning to remove this a few months ago.

Everything is fine, except for one testcase:
gcc.dg/max-1.c

this testcase is for PR 18548:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18548

and in the options, the testcase uses -fno-tree-lrs in order to
reproduce the problem.

It of course fails now because the option doesn't exist any more.

I see the following choices:

1 - eliminate test case (this is the easy choice! :-) 
2 - keep the testcase, remove the option.  (It probably doesn't really
test anything then, so you might as well remove it)
3 - make a new testcase which doesn't require -fno-tree-lrs.

I built a tree from that era, and get the testcase to fail as stated. I
have tried, but have not managed to create a failing testcase without
the option yet.  solicitations invited :-)  Is there any great
opposition to simply removing the test case? if so, can you reproduce it
without -fno-tree-lrs?

Andrew



Re: Explicit field layout

2006-09-26 Thread Daniel Berlin

On 9/26/06, Bernd Jendrissek <[EMAIL PROTECTED]> wrote:

On Mon, Sep 25, 2006 at 10:04:51AM +0200, Ricardo FERNANDEZ PASCUAL wrote:
> I am sorry, but I fail to see the relation of this with rpcgen (which
> as far I know is a code generator for the RPC protocol). Am I looking
> at the wrong rpcgen?

It's the right rpcgen, and was just a random reminder in case it would
give you what you want without having to invent a whole new language if
all you want is to be able to read/write structured data from/to an
octet stream.



But uh, he's implementing a compiler for a language standard, not
programming a random application that wants to control field layout.


collect2 on AIX drags too many objects from archives ?

2006-09-26 Thread Olivier Hainque
Hello,

To address

  /* The AIX linker will discard static constructors in object files if
 nothing else in the file is referenced [...] */

collect2 on AIX builds the ctor/dtor tables from an explicit scan of
all the objects and libraries.

The scan actually also considers frame tables, so we end up dragging
every object with such tables even if the object is not needed at all
for other reasons.

For instance, with mainline on AIX 5.3 configured with
--enable-languages=c,c++ --disable-nls --with-cpu=common, we see:

   /* useless.cc */
   extern void nowhere ();
   static void useless ()
   {
     nowhere ();
   }

   /* main.cc */
   int main ()
   {
     return 0;
   }

   $ g++ -c useless.cc
   $ ar rv libuseless.a useless.o
   ar: creating an archive file libuseless.a
   a - useless.o

   $ g++ -o m main.cc -L. -luseless
   ld: 0711-317 ERROR: Undefined symbol: .nowhere()
   ...
   collect2: ld returned 8 exit status

-Wl,-debug reads:

 List of libraries:
./libuseless.a,
 ...
 extern void *x7 __asm__ ("_GLOBAL__F_useless.cc__03A3E4DF");
 ...
 static void *frame_table[] = {
&x7,

The "useless" object has been included because of a reference to its
frame tables out of collect2's processing for libuseless.a. 

This behavior is problematic because it might cause the inclusion of
loads of useless objects in general, with two consequences: executables
potentially much larger than needed and unexpected dependencies.

We noticed this while working on DWARF2 exceptions support for Ada
with a static run-time library: almost the full library ends up
included in every executable, wasting a significant amount of space and
forcing us to link systematically with at least libm.

A possible way to address this would be to perform a double scan: the
current one to discover ctors/dtors only, leaving the frame tables
alone, and a second one on the resulting executable (past a first link
phase) to discover the relevant frame tables only.

The obvious drawback is link-time performance, but since there is a
real COFF scanner and we're not using the generic "nm" based scheme,
it might not be that much of an issue. I have no hard data at hand, so
can't really tell at this point.

Thoughts ?

Thanks in advance,

Olivier


Re: preprocessing question

2006-09-26 Thread Jan Beulich
>>> Daniel Jacobowitz <[EMAIL PROTECTED]> 25.09.06 18:43 >>>
>On Mon, Sep 25, 2006 at 05:23:34PM +0200, Jan Beulich wrote:
>> Can anyone set me straight on why, in the following code fragment
>> 
>> int x(unsigned);
>> 
>> struct alt_x {
>>  unsigned val;
>> };
>> 
>> #define x alt_x
>> #define alt_x(p) x(p+1)
>> 
>> int test(struct x *p) {
>>  return x(p->val);
>> }
>> 
>> the function invoked in test() is alt_x (rather than x)? I would have
>> expected that the preprocessor
>> - finds that x is an object-like macro, and replaces it with alt_x
>> - finds that alt_x is a function-like macro and replaces it with x(...)
>> - finds that again x is an object-like macro, but recognizes that it
>> already participated in expansion, so doesn't replace x by alt_x a
>> second time.
>
>Why do you think that x has already participated in expansion?  It
>hasn't participated in the expansion of the function-like macro
>alt_x, which is what is being considered, if I'm reading c99 right,
>because no nested replacement of x occurred within the processing
>of alt_x().  It's a different scan.

While, as Andreas also pointed out, the standard is a little vague in
some of what it tries to explain here, it is in my opinion clearly said
that the re-scanning restrictions are bound to the macro name, not
the fact that a function-like macro's expansion result is being
re-scanned. Hence, the re-scanning process of x has to be
considered still in progress while expanding alt_x, and consequently
x should not be subject to expansion anymore.

Jan


Re: preprocessing question

2006-09-26 Thread Andreas Schwab
"Jan Beulich" <[EMAIL PROTECTED]> writes:

> While, as Andreas also pointed out, the standard is a little vague in
> some of what it tries to explain here, it is in my opinion clearly said
> that the re-scanning restrictions are bound to the macro name, not
> the fact that a function-like macro's expansion result is being
> re-scanned. Hence, the re-scanning process of x has to be
> considered still in progress while expanding alt_x, and consequently
> x should not be subject to expansion anymore.

I think this thread on comp.std.c covers all questions.

http://groups.google.com/group/comp.std.c/browse_thread/thread/ad28864b2c50bc30/a78d02f4c510124b?tvc=2

Apparently there is some unspecified behaviour connected with this
example, see .

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


RE: preprocessing question

2006-09-26 Thread Jan Beulich
>>>> #define x alt_x
>
>the preprocessor token "x" is an object-like macro standing for "alt_x", so
>when we get to
>
>>>> #define alt_x(p) x(p+1)
>
>  what the preprocessor sees after the first round of expansion is
>
>#define alt_x(p) alt_x(p+1)

As pointed out before - there is *no* expansion for preprocessing
directives, except where the standard explicitly says otherwise.

Jan


Re: collect2 on AIX drags too many objects from archives ?

2006-09-26 Thread Ian Lance Taylor
Olivier Hainque <[EMAIL PROTECTED]> writes:

> A possible way to address this would be to perform a double scan: the
> current one to discover ctors/dtors only, leaving the frame tables
> alone, and a second one on the resulting executable (past a first link
> phase) to discover the relevant frame tables only.

Why do you need the double scan?  Why can't you just consistently
ignore the frame tables?

Ian


Re: C question: typecast changes behavior with optimizations enabled.

2006-09-26 Thread Ian Lance Taylor
Adam Dickmeiss <[EMAIL PROTECTED]> writes:

> Consider the attached which sweeps through an array of chars..
> 
> 1) If a typecast is used (CAST defined), the *src is not updated and
> main will see (sz == 0).
> 
> 2) If no typecast is used, *src is updated, and sz == 1.

1) Wrong mailing list.  This would be appropriate for gcc-help or for
   a list for general C language questions.

2) Looks like an aliasing problem.  Try compiling with
   -fno-strict-aliasing.  Look at the documentation for
   -fstrict-aliasing and -Wstrict-aliasing.

Ian


3.4 vs. 4.1 performance issues

2006-09-26 Thread Erich Plondke

I've noticed while tinkering with 3.4 and 4.1 that some
code sequences turn out much better in 4.1.  However, other
code sequences turn out substantially worse in 4.1.

The most frustrating is the reduction in use of postmodify
addressing modes.  It looks like tree-ssa-loop-ivopts converts
a loop like:

   for (i = 0; i < MAX; i++) {
   sum += a[i];
   }

into something like:

   for (ivtmp = 0; ivtmp < MAX*4; ivtmp += 4)
   {
   sum += *(a+ivtmp);
   }

which is fine, except by the time we get to RTL, the load in
the first loop form is converted in GCC 3.4 into a load with
postincrement, and the RTL optimization turns the second form
address into an add of ivtmp and 4, an add of ivtmp and a,
and a load.


Similarly, I can do mulsidi3 fast but muldi3 not so fast.
If I have a code sequence like:

typedef long long int Word64;

extern short sat64_16(Word64 x);

#define MAC(C,A,B) ((C) + ((Word64)(A) * (B)))

#define SEQ(X) { \
   c1 = *coef; coef++; \
   c2 = *coef; coef++; \
   vLo = *(vb1+(X)); \
   vHi = *(vb1+(23-(X))); \
   sum1L = MAC(sum1L,vLo,c1); \
   sum2L = MAC(sum2L,vLo,c2); \
   sum1L = MAC(sum1L,vHi,-c2); \
   sum2L = MAC(sum2L,vHi,c1); \
   vLo = *(vb1+32+(X)); \
   vHi = *(vb1+32+(23-(X))); \
   sum1R = MAC(sum1R,vLo,c1); \
   sum2R = MAC(sum2R,vLo,c2); \
   sum1R = MAC(sum1R,vHi,-c2); \
   sum2R = MAC(sum2R,vHi,c1); \
   }

void foo(const int *coef, int *vb1, short *out) {
   int vLo, vHi, c1, c2;
   Word64 sum1L = 0, sum2L = 0;
   Word64 sum1R = 0, sum2R = 0;

   SEQ(0);
   SEQ(1);
   SEQ(2);
   SEQ(3);
   SEQ(4);
   SEQ(5);
   SEQ(6);
   SEQ(7);
   out[0] = sat64_16(sum1L+sum2L);
   out[1] = sat64_16(sum1R+sum2R);
}



In GCC 3.4, the optimizer has no problem knowing that every
multiply is a mulsidi3.  In GCC 4.1, the tree optimizer
decides that the sign extend to DI for c1, c2, vLo, and vHi
should be done into a DImode temporary that is fed to the
MAC patterns, and combine doesn't convert them.  Indeed,
if I don't have a define_insn_and_split for DImode it doesn't
even have a chance, because the RTL expander has already
converted the DImode multiply into various SImode instructions.

So... how do I coax GCC 4.1 into liking postmodify and mulsidi again?  I've
tried fiddling with rtx_costs for postmodify and multiply, and they should be
accurate, but I get no love.  What other things can I try to play with?  Or
is this sort of thing a known deficiency in 4.1 that I should try to
work around?

I've attached a test for the latter case, along with the 3.4(.2) and
4.1(.1) assembly outputs for ARM, which exhibit this behavior.  Note
particularly the smulls and smlals.

Thanks!

   Erich


simple.c
Description: Binary data


simple-3.4.s
Description: Binary data


simple-4.1.s
Description: Binary data


Re: Documentation for loop infrastructure

2006-09-26 Thread Sebastian Pop
Thank you, the documentation looks good.

Ira Rosen wrote:
> 
> @item @code{first_location_in_loop}: Provides information about the first
> location accessed by the data reference in the loop and about the access
> function used to represent evolution relative to this location. This data
> is used to support pointers, and is not used for arrays (for which we
> have base objects). Pointer accesses are represented as a one-dimensional
> access that starts from the first location accessed in the loop. For
> example:
> 
> @smallexample
>   for i
>  for j
>   *((int *)p + i + j) = a[i][j];
> @end smallexample
> 
> The access function of the pointer access is @[EMAIL PROTECTED], + [EMAIL 
> PROTECTED] relative

It is probably better to include the loop indexes in the example, and
modify the syntax of the scev for making it more explicit, like:

@smallexample
  for1 i
 for2 j
  *((int *)p + i + j) = a[i][j];
@end smallexample

and the access function becomes: @[EMAIL PROTECTED], + [EMAIL PROTECTED]



porting GCC & GCC backends

2006-09-26 Thread max blomme
I'm attempting to port GCC to our company's 32-bit microprocessor, and I'm 
a bit overwhelmed.


Looking through some of the documentation (there's quite a lot of it!) I 
can't seem to find the answers to a few questions.  Pardon me if they 
seem basic and obvious.


We already have an assembler and linker for the processor.  Should I (or 
can I) use them as the back end of GCC?  If I do use them, can I use the 
GCC debugger (or binutils or wherever the debugger that is tied to GCC is)?


If I cannot use our current assembler & linker, is there any 
documentation for porting GAS (I've looked and I can't seem to find any)?


The processor has a relocatable object loader in its internally ROMed 
BIOS which I'd prefer to use, but it's a proprietary format that, as far 
as I know, does not conform to any previous standard.  The loader is 
extensible, by bootstrapping in another loader.


In the port for GAS, can I describe this format?  Or do I need to write 
a back end to convert its output to our format, or write another loader?


If I'm posting to the wrong list, please point me to the correct one.

Max


RE: porting GCC & GCC backends

2006-09-26 Thread Dave Korn
On 26 September 2006 20:01, max blomme wrote:

> I'm attempting to port GCC to our company's 32-bit microprocessor, and I'm
> a bit overwhelmed.
> 
> Looking through some of the documentation (there's quite a lot of it!) I
> can't seem to find the answers to a few questions.  Pardon me if they
> seem basic and obvious.
> 
> We already have an assembler and linker for the processor.  Should I (or
> can I) use them as the back end of GCC?  

  Yes, you can.  Gcc is designed to interoperate with system-native toolchains
wherever possible.

> If I do use them, can I use the
> GCC debugger (or binutils or wherever the debugger that is tied to GCC is)?

  If your code outputs standard debugging info in dwarf/coff/aout format,
probably.

> If I cannot use our current assembler & linker, is there any
> documentation for porting GAS (I've looked and I can't seem to find any)?

  Build binutils from source, cd into the docs dir and "make internals" should
generate "internals.info", which you could view by "info
--file=internals.info"; it's not built and installed by default.  (It may be
if you use --enable-maintainer-mode when configuring, I don't know).

> The processor has a relocatable object loader in its internally ROMed
> BIOS which I'd prefer to use, but it's a proprietary format that, as far
> as I know, does not conform to any previous standard.  The loader is
> extensible, by bootstrapping in another loader.
> 
> In the port for GAS, can I describe this format?  Or do I need to write
> a back end to convert its output to our format, or write another loader?

  You need to extend the bfd library to understand it.  That's going to be a
nuisance.

> If I'm posting to the wrong list, please point me to the correct one.

  Most of this stuff is low-level toolchain rather than compiler-specific, so
the binutils list might be more informative for a good deal of it.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: porting GCC & GCC backends

2006-09-26 Thread Paul Brook
On Tuesday 26 September 2006 20:09, Dave Korn wrote:
> On 26 September 2006 20:01, max blomme wrote:
> > I'm attempting to port GCC to our company's 32-bit microprocessor, and I'm
> > a bit overwhelmed.
> >
> > Looking through some of the documentation (there's quite a lot of it!) I
> > can't seem to find the answers to a few questions.  Pardon me if they
> > seem basic and obvious.
> >
> > We already have an assembler and linker for the processor.  Should I (or
> > can I) use them as the back end of GCC?
>
>   Yes, you can.  Gcc is designed to interoperate with system-native
> toolchains wherever possible.

Depending how crappy the existing assembler/linker are it may be easier to do 
a gas/binutils/ld ELF port. Non-elf ports are generally a pain and I'd advise 
avoiding them wherever possible.

You can always write a postlinker to convert an elf image into whatever format 
your loader is expecting. This is what uClinux and SymbianOS do.

Paul


Interesting -iquote bug

2006-09-26 Thread Mike Stump

In gcc's syslimits.h (gsyslimits.h), we do:

/* syslimits.h stands for the system's own limits.h file.
   If we can use it ok unmodified, then we install this text.
   If fixincludes fixes it, then the fixed version is installed
   instead of this text.  */

#define _GCC_NEXT_LIMITS_H  /* tell gcc's limits.h to recurse */
#include_next <limits.h>
#undef _GCC_NEXT_LIMITS_H

and this can find a user limits.h in a directory named with -iquote  
whenever -I- isn't used.  The user wishes to not so find that file, as  
it breaks / on the system.


The doc says:

@samp{#include_next} does not distinguish between @code{<@var{file}>}
and @code{"@var{file}"} inclusion,

but at the same time:

@item -iquote@var{dir}
@opindex iquote
Add the directory @var{dir} to the head of the list of directories to
be searched for header files only for the case of @samp{#include
"@var{file}"}; they are not searched for @samp{#include <@var{file}>},
otherwise just like @option{-I}.


I'd be tempted to say that include_next could be bug fixed to notice  
<>/"" and DTRT and document it as such.


What do others think?

1  Fix include_next
2  Tell user to use -I- (and undeprecate -I-).
3  Nothing, tell user to not try and use "limits.h" for anything.
4  Fix syslimits.h to `do something else'.  (I have no clue what.)


Re: collect2 on AIX drags too many objects from archives ?

2006-09-26 Thread Mike Stump

On Sep 26, 2006, at 2:36 AM, Olivier Hainque wrote:
>   /* The AIX linker will discard static constructors in object files if
>      nothing else in the file is referenced [...] */


Darwin has this same sort of issue and solves it by not wiring up
ctors/dtors for all these things, but instead having a separate
convention to register/unregister the tables, for statically linked
things, dlopen-type things and shared-library-type things alike.
Essentially, upon load, they walk the symbol tables (ok, actually the
section table) and register based upon what they find.  See crt1.o
(Csu/crt.c) and keymgr/keymgr.c for additional hints.


[RFC] Program Bounds Checking

2006-09-26 Thread David Edelsohn
Tzi-cker Chiueh has developed a low-overhead bounds checking
feature and approached the FSF about having it incorporated in GCC.  This
discussion originally was forwarded to the GCC Steering Committee, so I am
redirecting the conversation to the main GCC mailing list.  Hopefully some
members of the GCC community who are knowledgeable about this type of
technology will follow up with Tzi-cker.

Thanks, David

--- Forwarded Message

Mudflap performs not-so-complete bound checking on C programs. Already its
performance overhead is 300%-500% of the original execution time.  Our
initial bound checking compiler, CASH
(http://www.ecsl.cs.sunysb.edu/cash/index.html), can perform complete
bound checking for C programs with a performance overhead of less than
10%.

The reason it can achieve this feat is because it exploits segmentation
hardware on X86/IA32 architecture. Because segmentation hardware is not
supported in IA64 and other embedded processors, our current proposal is
to exploit "debug register" hardware, which is more universal, to achieve
the same array bound checking performance level as CASH.

CASH's low run-time overhead makes it practical for the first time to turn
on bound checking for production-mode server programs such as Apache or
BIND.

--- End of Forwarded Message



Re: Interesting -iquote bug

2006-09-26 Thread Ian Lance Taylor
Mike Stump <[EMAIL PROTECTED]> writes:

> In gcc's syslimits.h (gsyslimits.h), we do:
> 
> /* syslimits.h stands for the system's own limits.h file.
> If we can use it ok unmodified, then we install this text.
> If fixincludes fixes it, then the fixed version is installed
> instead of this text.  */
> 
> #define _GCC_NEXT_LIMITS_H  /* tell gcc's limits.h to recurse */
> #include_next <limits.h>
> #undef _GCC_NEXT_LIMITS_H
> 
> and this can find a user limits.h in a directory named with -iquote
> whenever -I- isn't used.  The user wishes to not so find that file, as
> it breaks / on the system.

My understanding has always been that #include_next should find a
version of the header file farther down the search path.  So if gcc's
limits.h was found via #include <limits.h>, then the #include_next
should not find a #include "limits.h", it should find the next
<limits.h>.

And that is pretty much what the documentation says.  And I don't
think that behaviour should change in any way.

So I don't understand what the issue is.  Can you give an example?

Ian


Notes from tinkering with the autovectorizer (4.1.1)

2006-09-26 Thread Erich Plondke

I've been tinkering with the autovectorizer.  It's really cool.
I particularly like the realignment support.

I've noticed just a few things while tinkering with it (in 4.1.1):

0) The realignment code takes the floor of the unaligned pointer, and we
increment the unaligned pointer in the loop.  This is great for
architectures like Alpha that have floor addressing modes, because finding
the floor is free.  But for architectures like ARM, it's much better to
take the floor outside the loop and be able to postincrement by VECSIZE
inside the loop.


1) The definition of the realignment instruction doesn't match hardware for
instruction sets like ARM WMMX, where aligned amounts shift by 0 bytes
instead of VECSIZE bytes.  This makes it useless for vector realignment,
because in the case that the pointer happens to be aligned, we get the
wrong vector.  Looks like the SPARC realignment hook does the same thing...
Indeed, it looks like Altivec is the only one to support it, and they do
some trickery with shifting the wrong (against endianness) way based on the
two's complement of the source (a very clever trick).  No other machine
(evidently) can easily meet the description of the current realignment
mechanism.

Of course, for safety reasons I guess we don't always get the next vector
(the one at address floor(ptr+VECSIZE)), which would allow us to use the
shift-style instructions.

So, there may be a few options:

* Have a flag or hook where we can say it is always OK to read the next
   element.  This is probably a bad option; everyone who used the
   vectorizer would have to know that they may need to pad their
   arrays if they are in a protected memory environment.

* Conditionally fetch the next bundle, and don't do the fetch of the
   next data the last time around if it might not be safe.  Probably
   a bad idea for architectures without conditional execution.

* Currently we drop out of the loop when there are VEC_ELEMENTS - 1
   iterations or less.  We could drop out when there are VEC_ELEMENTS
   or less, and then we could always fetch the next aligned data.

* Some other clever trick I don't know about. :-)

* Or keep it the way it is, and leave out the machines that have the
   shift-by-zero instead of the shift-by-VECSIZE behavior for
   an aligned pointer.


2) It seems like there may be some hooks that aren't documented.  For
instance, there seems to be some kind of support for the "vcond"
standard name, but I can't seem to find it in the documentation.


In general things work quite well, and it seems to play reasonably well with
things like the modulo scheduler.

Cheers,

 Erich

--
Why are ``tolerant'' people so intolerant of intolerant people?


RFC: deprecated functions calling deprecated functions

2006-09-26 Thread Eric Christopher

So, a testcase like this:

extern void foo() __attribute__((deprecated));
extern void bar() __attribute__((deprecated));

void foo() {}

void bar()
{
foo();
}

Should we warn on the invocation of foo() since it's also being called 
from within a deprecated function? We are today, but I've gotten a 
request for that to not warn. This seems reasonable, for example, if you 
deprecate an entire API or something, but still need to compile the library.


Thoughts?

-eric