* Dmitry Yu. Bolkhovityanov wrote on Sun, Nov 06, 2005 at 07:21:56AM CET:
>
> That's probably an old problem, but I haven't found any notion of
> it in GCC docs. So...
It's one better discussed on the gcc-help mailing list.
> #define V(value) = value
> This works fine, until I tr
Hi,
we have this long standing issue which really we should solve, one way
or another: otherwise there are both correctness and performance issues
which we cannot fix, new features which we cannot implement. I have
plenty of examples; just ask if you want more details and motivation.
On 11/6/05, Robert Dewar <[EMAIL PROTECTED]> wrote:
> Giovanni Bajo wrote:
>
> > I believe you are missing my point. What is the GCC command line option for
> > "try to optimize as best as you can, please, I don't care compiletime"? I
> > believe that should be -O3. Otherwise let's make -O4. Or -O6
On 11/6/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
> Hi,
>
> we have this long standing issue which really we should solve, one way
> or another: otherwise there are both correctness and performance issues
> which we cannot fix, new features which we cannot implement. I have
> plenty of examples,
Richard Guenther wrote:
We could just provide fallback libcalls in libgcc
Indeed, this is an option. Not one I can implement myself quickly, but I
think the idea of issuing a library call when the builtin is not
available was actually meant to enable this kind of solution.
Can you work on it?
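The fallback-libcall idea can be sketched concretely. Below is a minimal, hypothetical lock-based out-of-line version of a fetch-and-add primitive, of the kind libgcc could supply for targets without native atomic instructions; the function name and the single global pthread mutex are illustrative assumptions, not actual libgcc code.

```c
#include <pthread.h>

/* Hypothetical lock-based fallback for a fetch-and-add sync builtin on
   targets without a native atomic instruction.  All fallbacks would
   share one global mutex: correct, but slow.  */
static pthread_mutex_t sync_fallback_lock = PTHREAD_MUTEX_INITIALIZER;

int
fallback_fetch_and_add (int *ptr, int val)
{
  pthread_mutex_lock (&sync_fallback_lock);
  int old = *ptr;               /* read the old value under the lock */
  *ptr = old + val;             /* store the sum */
  pthread_mutex_unlock (&sync_fallback_lock);
  return old;                   /* fetch-and-add returns the old value */
}
```

The obvious cost is that every caller in the process must agree to use the same lock, which is exactly the mixing problem discussed later in this thread.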
Richard Guenther wrote:
We could just provide fallback libcalls in libgcc
All in all, I think this is really the best solution. For 4.2 Sparc will
also have the builtins available, and even if we want the libgcc
code to be equivalent to what is currently available in
libstdc++-v3/config/cpu,
Robert Dewar <[EMAIL PROTECTED]> writes:
| Steven Bosscher wrote:
|
| > You must not have been paying attention to one of the most frequent
| > complaints about gcc, which is that it is dog slow already ;-)
|
| Sure, but to me -O2 says you don't care much about compilation time.
If the Ada fron
"Gary M Mann" <[EMAIL PROTECTED]> writes:
| Hi,
|
| The -fvisibility feature in GCC 4.0 is a really useful way of hiding all
| non-public symbols in a dynamic shared object.
|
| While I'm aware of a patch which backports this feature to GCC 3.4 (over at
| nedprod.com), I was wondering whether th
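For reference, the feature being asked about works roughly as follows: compiling a DSO with `-fvisibility=hidden` makes every symbol hidden by default, and individual symbols are re-exported with the visibility attribute. The macro and function names below are purely illustrative.

```c
/* Compile as: gcc -shared -fPIC -fvisibility=hidden dso.c
   Everything defaults to hidden; only symbols explicitly marked
   "default" are exported from the shared object.  */
#define DSO_PUBLIC __attribute__ ((visibility ("default")))

/* Hidden under -fvisibility=hidden: internal to the DSO.  */
int internal_helper (int x) { return x * 2; }

/* Exported: the one symbol users of the DSO can see.  */
DSO_PUBLIC int public_entry (int x) { return internal_helper (x) + 1; }
```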
Gabriel Dos Reis <[EMAIL PROTECTED]> wrote:
>>> You must not have been paying attention to one of the most frequent
>>> complaints about gcc, which is that it is dog slow already ;-)
>>
>> Sure, but to me -O2 says you don't care much about compilation time.
>
> If the Ada front-end wishes, it can
On Sun, Nov 06, 2005 at 01:32:43PM +0100, Giovanni Bajo wrote:
> If -O1 means "optimize, but be fast", what does -O2 mean? And what does -O3
> mean? If -O2 means "the current set of optimizer that we put in -O2", that's
> unsatisfying for me.
`-O2'
Optimize even more. GCC performs nearly all
On 11/6/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
> Richard Guenther wrote:
> > We could just provide fallback libcalls in libgcc
> Indeed, this is an option. Not one I can implement myself quickly, but I
> think the idea of issuing a library call when the builtin is not
> available was actually
Richard Guenther wrote:
>Can you point me to some libstdc++ class/file where you use the
>builtins or other solution?
>
Simply config/cpu/*/atomicity.h will do, for ia64, powerpc, alpha,
s390, currently to implement __exchange_and_add and __atomic_add. Note
that in this way the latter are *n
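For readers following along, the semantics of those two primitives are: __exchange_and_add atomically adds and returns the old value, while __atomic_add adds and discards the result. A sketch in terms of the GCC sync builtins (the leading underscores of the real libstdc++ names are dropped here to avoid clashing with the actual symbols):

```c
/* Sketch: the two libstdc++ atomicity primitives expressed via the GCC
   builtins, assuming the target expands them inline (otherwise GCC
   emits an external call -- the fallback this thread is about).  */
typedef int _Atomic_word;   /* matches the libstdc++ type on most ports */

_Atomic_word
exchange_and_add (volatile _Atomic_word *mem, int val)
{
  return __sync_fetch_and_add (mem, val);   /* returns the old value */
}

void
atomic_add (volatile _Atomic_word *mem, int val)
{
  (void) __sync_fetch_and_add (mem, val);   /* result discarded */
}
```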
Richard Guenther wrote:
Of course SPEC consists of real life applications. Whether it is a good
representation for todays real life applications is another question, but
it certainly is a very good set of tests.
It's mostly small programs by nature; it is not very practical
to include rea
Gabriel Dos Reis wrote:
Robert Dewar <[EMAIL PROTECTED]> writes:
| Steven Bosscher wrote:
|
| > You must not have been paying attention to one of the most frequent
| > complaints about gcc, which is that it is dog slow already ;-)
|
| Sure, but to me -O2 says you don't care much about compila
Giovanni Bajo wrote:
If -O1 means "optimize, but be fast", what does -O2 mean? And what does -O3
mean? If -O2 means "the current set of optimizer that we put in -O2", that's
unsatisfying for me.
Right, that's exactly my point: you want some clear statement of the
goals of the different levels, def
Jakub Jelinek wrote:
Including loop unrolling to -O2 is IMNSHO a bad idea, as loop unrolling
increases code size, sometimes a lot. And the distinction between -O2
and -O3 is exactly in the space-for-speed tradeoffs.
That's certainly a valid way of defining the difference (and certainly
used t
Robert Dewar <[EMAIL PROTECTED]> writes:
>> Including loop unrolling to -O2 is IMNSHO a bad idea, as loop unrolling
>> increases code size, sometimes a lot. And the distinction between -O2
>> and -O3 is exactly in the space-for-speed tradeoffs.
>That's certainly a valid way of defining the diffe
Hi,
On Sunday 06 November 2005 16:20, Mattias Engdegård wrote:
> Robert Dewar <[EMAIL PROTECTED]> writes:
> >> Including loop unrolling to -O2 is IMNSHO a bad idea, as loop unrolling
> >> increases code size, sometimes a lot. And the distinction between -O2
> >> and -O3 is exactly in the space-fo
On Nov 6, 2005, at 6:03 AM, Paolo Carlini wrote:
So - can't you work with some preprocessor magic and a define, if
the builtins are available?
I don't really understand this last remark of yours: is it an
alternate solution?!? Any preprocessor magic has to rely on a new
preprocessor builtin
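For the record, later GCC releases did grow exactly this kind of preprocessor builtin: macros such as __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 (with siblings for 1-, 2- and 8-byte operands) are predefined when the sync builtins expand inline for the target. A sketch of how a header could use one; the fallback branch here is a non-atomic placeholder, purely illustrative:

```c
/* Select an implementation at preprocessing time based on whether the
   target expands the 4-byte sync builtins inline.  */
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
static int
fetch_and_add (volatile int *p, int v)
{
  return __sync_fetch_and_add (p, v);   /* inline atomic sequence */
}
#else
/* Placeholder fallback -- NOT atomic; a real one would take a lock.  */
static int
fetch_and_add (volatile int *p, int v)
{
  int old = *p;
  *p = old + v;
  return old;
}
#endif
```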
Paolo Carlini wrote:
> Hi,
>
> we have this long standing issue which really we should solve, one way
> or another: otherwise there are both correctness and performance issues
> which we cannot fix, new features which we cannot implement. I have
> plenty of examples, just ask, in case, if you want
Hi Howard,
> Coincidentally I also explored this option in another product. We
> ended up implementing it and it seemed to work quite well. It did
> require the back end to "register" with the preprocessor those
> builtins it implemented, and quite frankly I don't know exactly how
> that reg
On Sun, Nov 06, 2005 at 11:34:30AM +0100, Paolo Carlini wrote:
> Thus my request: would it be possible to have available the builtins
> unconditionally, by way of a slow (locks) fallback replacing the real
> implementation when the actual target code doesn't allow for them?
I suppose that in som
Hi Mark,
>I think this is a somewhat difficult problem because of the tension
>between performance and functionality. In particular, as you say, the
>code sequence you want to use varies by CPU.
>
>I don't think I have good answers; this email is just me musing out loud.
>
>You probably don't wan
Paolo Carlini wrote:
> Actually, the situation is not as bad, as far as I can see: the worst
> case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
> target either cannot implement the builtin at all (a trivial fall back
> using locks or no MT support at all) or can in no more than
On 11/6/05, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> Paolo Carlini wrote:
>
> > Actually, the situation is not as bad, as far as I can see: the worst
> > case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
> > target either cannot implement the builtin at all (a trivial fall back
Mark Mitchell wrote:
>Paolo Carlini wrote:
>
>>Actually, the situation is not as bad, as far as I can see: the worst
>>case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
>>target either cannot implement the builtin at all (a trivial fall back
>>using locks or no MT support at all
Paolo Carlini wrote:
> Yes, in principle you are right, but in that case we can reorder the
> ifs: first i686, last i386 ;) Seriously, earlier today I was hoping we
> could have something smarter than a series of conditionals at the level
> of libgcc, which I don't know much about. I was hoping we could manage
On Sun, Nov 06, 2005 at 11:02:29AM -0800, Mark Mitchell wrote:
> Are you saying that you don't expect there to ever be an architecture
> that might have three or more ways of doing locking? That seems rather
> optimistic to me. I think we ought to plan for needing as many versions
> as we have CP
Mark Mitchell wrote:
>>Yes, in principle you are right, but in that case we can reorder the
>>ifs: first i686, last i386 ;) Seriously earlier today I was hoping we
>>can have something smarter than a series of conditionals at the level of
>>libgcc, I don't know it much. I was hoping we can manage
Richard Henderson wrote:
> I believe some poor design decisions were made for p4 here. But even
> on a platform without such problems you can expect a factor of 30
> difference.
So, that suggests that inlining these operations probably isn't very
profitable. In that case, it seems like we could
Mark Mitchell wrote:
>Richard Henderson wrote:
>
>>I believe some poor design decisions were made for p4 here. But even
>>on a platform without such problems you can expect a factor of 30
>>difference.
>
>So, that suggests that inlining these operations probably isn't very
>profitable. I
Paolo Carlini wrote:
>>And, that if
>>__exchange_and_add is showing up on the top of the profile, the fix
>>probably isn't inlining -- it's to work out a way to make less use of
>>atomic operations.
>
I want to add that we are certain
Richard Henderson <[EMAIL PROTECTED]> writes:
> Not all targets are going to be able to implement the builtins,
> even with locks. It is imperative that the target have an
> atomic store operation, so that other read-only references to
> the variable see either the old or new value, but not a mi
Hi Ian,
>I can see that there is a troubling case that code may be compiled for
>i386 and then run on a multi-processing system using newer processors.
>That is something which we would have to detect at run time, in start
>up code or the first time the builtins are invoked.
>
Earlier in this
* Paolo Carlini:
> Actually, the situation is not as bad, as far as I can see: the worst
> case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
> target either cannot implement the builtin at all (a trivial fall back
> using locks or no MT support at all) or can in no more than 1
>
Richard Henderson wrote:
To keep all this in perspective, folks should remember that atomic
operations are *slow*. Very very slow. Orders of magnitude slower
than function calls. Seriously. Taking p4 as the extreme example,
one can expect a null function call in around 10 cycles, but a locke
* Richard Henderson:
> To keep all this in perspective, folks should remember that atomic
> operations are *slow*. Very very slow. Orders of magnitude slower
> than function calls. Seriously. Taking p4 as the extreme example,
> one can expect a null function call in around 10 cycles, but a loc
On Sun, Nov 06, 2005 at 12:10:03PM -0800, Ian Lance Taylor wrote:
> How many processors out there support multi-processor systems but do
> not provide any sort of atomic store operation?
My point here had been more wrt the 8-byte operations, wherein
there are *plenty* of multi-processor systems t
* Peter Dimov:
> Even on a P4, inlining may enable compiler optimizations. One case is when
> the compiler can see that the return value of __sync_fetch_and_or (for
> instance) isn't used. It's possible to use a wait-free "lock or" instead of
> a "lock cmpxchg" loop (MSVC 8 does this for _Inter
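The point about the unused return value can be shown directly. In the first function below the compiler is free to emit a single locked OR on x86, while the second, which needs the old value, generally requires a compare-and-swap loop or similar primitive; the function names are illustrative.

```c
/* Result discarded: a plain "lock or" instruction suffices on x86.  */
void
set_flag (volatile unsigned *flags, unsigned bit)
{
  (void) __sync_fetch_and_or (flags, bit);
}

/* Old value needed: the compiler must use a primitive that returns it,
   typically a "lock cmpxchg" loop.  */
unsigned
test_and_set_flag (volatile unsigned *flags, unsigned bit)
{
  return __sync_fetch_and_or (flags, bit) & bit;
}
```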
Richard Henderson wrote:
On Sun, Nov 06, 2005 at 12:10:03PM -0800, Ian Lance Taylor wrote:
How many processors out there support multi-processor systems but do
not provide any sort of atomic store operation?
My point here had been more wrt the 8-byte operations, wherein
there are *plean
Mark Mitchell wrote:
> Yes, GLIBC does that kind of thing, and we could do. In the simplest
> form, we could have startup code that checks the CPU, and sets up a
> table of function pointers that application code could use.
That's not what glibc does and it is a horrible idea. The indirect
jumps
Ulrich Drepper wrote:
> Mark Mitchell wrote:
>
>>Yes, GLIBC does that kind of thing, and we could do. In the simplest
>>form, we could have startup code that checks the CPU, and sets up a
>>table of function pointers that application code could use.
>
>
> That's not what glibc does and it is a
On Sun, Nov 06, 2005 at 10:51:51AM -0800, Richard Henderson wrote:
> I suppose that in some cases it would be possible to implement
> them in libgcc. Certainly we provided for that possibility
> by expanding to external calls.
Actually, no, it's not possible. At least in the context we're
discu
Richard Henderson wrote:
>One part of the application (say, libstdc++) is compiled with only
>i386 support. Here we wind up relying on a mutex to protect the
>memory update. Another part of the application (say, the exe) is
>compiled with i686 support, and so chooses to use atomic operations.
>T
On Mon, Nov 07, 2005 at 01:35:13AM +0100, Paolo Carlini wrote:
> We have to add to the library
> out-of-line versions of the builtins... (in order to do that, we may end
> up restoring the old inline assembly implementations of CAS, for example)
I don't think you need to restore inline assembly.
Richard Henderson wrote:
>On Mon, Nov 07, 2005 at 01:35:13AM +0100, Paolo Carlini wrote:
>
>>We have to add to the library
>>out-of-line versions of the builtins... (in order to do that, we may end
>>up restoring the old inline assembly implementations of CAS, for example)
>
>I don't t
Richard Henderson wrote:
>My thinking would be along the lines of
>
>#if !ARCH_ALWAYS_HAS_SYNC_BUILTINS
[snip]
>#endif
>
Well, there is a minor catch: if we don't want to break the
ABI, we have to keep on implementing and exporting from the *.so
__exchange_and_add and __atom
I've built gcc-3.4.3 for HP-UX 11.23/IA-64 and used the pre-compiled
gcc-3.4.4 binary from the http://www.hp.com/go/gcc site. Both exhibit
the same problem. While trying to build Perl 5.8.6:
$ gmake
...
gcc -v -o libperl.so -shared -fPIC perl.o gv.o toke.o perly.o op.o
pad.o regcomp.o dump.o
Richard Henderson wrote:
>Actually, no, it's not possible. At least in the context we're
>discussing here. Consider:
>
>One part of the application (say, libstdc++) is compiled with only
>i386 support. Here we wind up relying on a mutex to protect the
>memory update. Another part of the applic
"Giovanni Bajo" <[EMAIL PROTECTED]> writes:
| Gabriel Dos Reis <[EMAIL PROTECTED]> wrote:
|
| >>> You must not have been paying attention to one of the most frequent
| >>> complaints about gcc, which is that it is dog slow already ;-)
| >>
| >> Sure, but to me -O2 says you don't care much about c
Hi all,
I am compiling a small program with gcc 3.4.3 on SPARC.
test.c
---
struct test1
{
  int a;
  int b;
  char c;
};
struct test2
{
  char a;
  char b;
  char c;
};
struct test3
{
  int a;
  int b;
  int c;
};
int main()
{
  struct test1* t1, t11;
  stru
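The message is cut off by the archive, but the structs themselves are enough to illustrate the usual answer to layout questions like this: padding inserted to satisfy member alignment. A self-contained restatement, with size comments that assume a 4-byte int and natural alignment (true on 32-bit SPARC and most common targets):

```c
/* Struct sizes under natural alignment with 4-byte int.  */
struct test1 { int a; int b; char c; };   /* 9 bytes of data, aligned to 4:
                                             padded to 12 */
struct test2 { char a; char b; char c; }; /* aligned to 1: size 3, no padding */
struct test3 { int a; int b; int c; };    /* already a multiple of 4: size 12 */
```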