Ulrich Drepper wrote:
>Andrew Pinski wrote:
>
>
>>You might not care about anything except for GNU/Linux but GCC has to care
>>to some point.
>>
>>
>The important point is that this (and similar things like vector
>instructions) is an issue which cannot be solved adequately with a
>one-siz
Andrew Pinski wrote:
> You might not care about anything except for GNU/Linux but GCC has to care
> to some point.
The important point is that this (and similar things like vector
instructions) is an issue which cannot be solved adequately with a
one-size-fits-all mechanism. It requires platfor
Andrew Pinski wrote:
> Uli, you keep forgetting about other OSes which don't use ELF (like Mac OS X),
> though for Mac OS X, it is easier to support this as the way Mach-O handles
> fat binaries, you only really need one libgcc which contains the functions
> for all of the subprocessor types.
What is it
Ulrich Drepper wrote:
>Richard Guenther wrote:
>
>
>>Also, libgcc
>>does _not_ know the machine - it only knows the -march it was compiled
>>for. Inlining and transparently handling different sub-architecture just
>>does not play together well.
>>
>>
>Yes, libgcc doesn't know this. But the
On Mon, Nov 07, 2005 at 07:40:51AM -0800, Ulrich Drepper wrote:
> Daniel Jacobowitz wrote:
> > The only real problem with this is that it mandates use of shared
> > libgcc for the routines in question... always. If they ever go into
> > libgcc.a, we can't make sure we got the right copy.
>
> This
Daniel Jacobowitz wrote:
> The only real problem with this is that it mandates use of shared
> libgcc for the routines in question... always. If they ever go into
> libgcc.a, we can't make sure we got the right copy.
This is what lies ahead of us anyway. Solaris apparently officially
denounced sta
On Mon, Nov 07, 2005 at 07:19:01AM -0800, Ulrich Drepper wrote:
> Richard Guenther wrote:
> > Also, libgcc
> > does _not_ know the machine - it only knows the -march it was compiled
> > for. Inlining and transparently handling different sub-architecture just
> > does not play together well.
>
> Y
Richard Guenther wrote:
> Also, libgcc
> does _not_ know the machine - it only knows the -march it was compiled
> for. Inlining and transparently handling different sub-architecture just
> does not play together well.
Yes, libgcc doesn't know this. But the libgcc can be installed in the
correct
Richard Guenther <[EMAIL PROTECTED]> writes:
| On 11/7/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
| > Richard Guenther wrote:
| >
| > >Richard is right - it's enough that the inlined version doesn't agree with
| > >whatever smartness is in libgcc.
| > >
| > Like? If you are inlining atomic opera
Richard Guenther wrote:
>That works, if only code playing well (or failing with SIGILL) with
>either locking scheme is ever inlined. I.e. for x86 you cannot inline
>the i386 variant.
>
Indeed, we are in perfect agreement now. The point is exactly that: the
i386 version cannot be inlined, can only
On 11/7/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
> Paolo Carlini wrote:
>
> >Richard Guenther wrote:
> >
> >>You are screwed, because if you pass a std::vector (assuming it needs
> >>locking) to kdelibs to mungle with and in a separate thread mungle with
> >>it in the -march=i686 application you
Paolo Carlini wrote:
>Richard Guenther wrote:
>
>>You are screwed, because if you pass a std::vector (assuming it needs
>>locking) to kdelibs to mungle with and in a separate thread mungle with
>>it in the -march=i686 application you're using two different types of locking
>>which surely won't pla
Richard Guenther wrote:
>You are screwed, because if you pass a std::vector (assuming it needs
>locking) to kdelibs to mungle with and in a separate thread mungle with
>it in the -march=i686 application you're using two different types of locking
>which surely won't play well with each other. A s
On 11/7/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
> Richard Guenther wrote:
>
> >>Like? If you are inlining atomic operations this means that you are
> >>passing -march=i686, therefore in order to run the code in the first
> >>place the machine has to be an i686, and libgcc certainly knows that.
Richard Guenther wrote:
>>Like? If you are inlining atomic operations this means that you are
>>passing -march=i686, therefore in order to run the code in the first
>>place the machine has to be an i686, and libgcc certainly knows that.
>>
>>
>You build like kdelibs for -march=i386, it gets th
On 11/7/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
> Richard Guenther wrote:
>
> >Richard is right - it's enough that the inlined version doesn't agree with
> >whatever smartness is in libgcc.
> >
> Like? If you are inlining atomic operations this means that you are
> passing -march=i686, therefo
Richard Henderson wrote:
>On Mon, Nov 07, 2005 at 09:03:08AM +0100, Richard Guenther wrote:
>
>
>>... but still does not support changing the decision at application
>>compile-time.
>>
>>
>Correct. And I don't think we should support that. It makes
>hellish work for us, supporting a featur
Richard Henderson wrote:
>On Mon, Nov 07, 2005 at 02:24:03AM +0100, Paolo Carlini wrote:
>
>
>>To be sure: can you confirm that there is no easy solution for the
>>x86_64 issue?
>>
>>
>What x86_64 issue?
>
>
The issue would be that in the scheme relying completely on libstdc++,
an x86_64 c
Richard Guenther wrote:
>Richard is right - it's enough that the inlined version doesn't agree with
>whatever smartness is in libgcc.
>
Like? If you are inlining atomic operations this means that you are
passing -march=i686, therefore in order to run the code in the first
place the machine has to
Richard Henderson wrote:
>On Mon, Nov 07, 2005 at 03:45:46AM +0100, Paolo Carlini wrote:
>
>
>>Richard, sorry, I don't agree, on second thought. You are not
>>considering that the idea is using a "smart" libgcc, a la glibc, as per
>>Mark's and Uli's messages.
>>
>>
>Yes, I am. I stand by my sta
On Mon, Nov 07, 2005 at 02:24:03AM +0100, Paolo Carlini wrote:
> To be sure: can you confirm that there is no easy solution for the
> x86_64 issue?
What x86_64 issue?
r~
On Mon, Nov 07, 2005 at 09:03:08AM +0100, Richard Guenther wrote:
> ... but still does not support changing the decision at application
> compile-time.
Correct. And I don't think we should support that. It makes
hellish work for us, supporting a feature that users won't use
correctly.
r~
On Mon, Nov 07, 2005 at 03:45:46AM +0100, Paolo Carlini wrote:
> Richard, sorry, I don't agree, on second thought. You are not
> considering that the idea is using a "smart" libgcc, a la glibc, as per
> Mark's and Uli's messages.
Yes, I am. I stand by my statement: libgcc is the wrong level at
which
On 11/7/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
> Richard Henderson wrote:
>
> >Actually, no, it's not possible. At least in the context we're
> >discussing here. Consider:
> >
> >One part of the application (say, libstdc++) is compiled with only
> >i386 support. Here we wind up relying on
Richard Henderson wrote:
>Actually, no, it's not possible. At least in the context we're
>discussing here. Consider:
>
>One part of the application (say, libstdc++) is compiled with only
>i386 support. Here we wind up relying on a mutex to protect the
>memory update. Another part of the applic
Richard Henderson wrote:
>My thinking would be along the lines of
>
>
>#if !ARCH_ALWAYS_HAS_SYNC_BUILTINS
>
>
[snip]
>#endif
>
>
Well, there is a minor catch, which is, if we don't want to break the
ABI, we have to keep on implementing and exporting from the *.so
__exchange_and_add and __atom
Richard Henderson wrote:
>On Mon, Nov 07, 2005 at 01:35:13AM +0100, Paolo Carlini wrote:
>
>
>>We have to add to the library
>>out-of-line versions of the builtins... (in order to do that, we may end
>>up restoring the old inline assembly implementations of CAS, for example)
>>
>>
>I don't t
On Mon, Nov 07, 2005 at 01:35:13AM +0100, Paolo Carlini wrote:
> We have to add to the library
> out-of-line versions of the builtins... (in order to do that, we may end
> up restoring the old inline assembly implementations of CAS, for example)
I don't think you need to restore inline assembly.
Richard Henderson wrote:
>One part of the application (say, libstdc++) is compiled with only
>i386 support. Here we wind up relying on a mutex to protect the
>memory update. Another part of the application (say, the exe) is
>compiled with i686 support, and so chooses to use atomic operations.
>T
On Sun, Nov 06, 2005 at 10:51:51AM -0800, Richard Henderson wrote:
> I suppose that in some cases it would be possible to implement
> them in libgcc. Certainly we provided for that possibility
> by expanding to external calls.
Actually, no, it's not possible. At least in the context we're
discu
Ulrich Drepper wrote:
> Mark Mitchell wrote:
>
>>Yes, GLIBC does that kind of thing, and we could do. In the simplest
>>form, we could have startup code that checks the CPU, and sets up a
>>table of function pointers that application code could use.
>
>
> That's not what glibc does and it is a
Mark Mitchell wrote:
> Yes, GLIBC does that kind of thing, and we could do. In the simplest
> form, we could have startup code that checks the CPU, and sets up a
> table of function pointers that application code could use.
That's not what glibc does and it is a horrible idea. The indirect
jumps
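The scheme Mark Mitchell describes (startup code probes the CPU and fills in a table of function pointers) might look roughly like the sketch below. It is only an illustration of the idea under discussion, not anything GCC or glibc actually does, and the reply above objects to exactly the indirect calls it introduces. The names `atomic_table`, `atomic_table_init`, `app_fetch_add`, and the probe `cpu_has_native_atomics` are all hypothetical; the probe is a stub that always answers "yes".

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical table of atomic primitives, filled in once at startup. */
struct atomic_table {
    int (*fetch_add)(volatile int *ptr, int value);
};

static struct atomic_table atomic_table;
static pthread_mutex_t table_fallback_lock = PTHREAD_MUTEX_INITIALIZER;

/* Fast path: relies on the GCC __sync builtin being expandable inline. */
static int fetch_add_native(volatile int *ptr, int value)
{
    return __sync_fetch_and_add(ptr, value);
}

/* Slow path: serialize every update through one global mutex. */
static int fetch_add_locked(volatile int *ptr, int value)
{
    pthread_mutex_lock(&table_fallback_lock);
    int old = *ptr;
    *ptr = old + value;
    pthread_mutex_unlock(&table_fallback_lock);
    return old;
}

/* Hypothetical CPU probe; a real one would inspect the processor.
   Stubbed to "true" so the sketch stays portable. */
static bool cpu_has_native_atomics(void)
{
    return true;
}

/* Would be called from startup code, before any other thread exists. */
void atomic_table_init(void)
{
    atomic_table.fetch_add = cpu_has_native_atomics() ? fetch_add_native
                                                      : fetch_add_locked;
}

/* Application code always goes through the table: one indirect call. */
int app_fetch_add(volatile int *ptr, int value)
{
    return atomic_table.fetch_add(ptr, value);
}
```

Every caller pays for an indirect call even on machines where the native instruction is available, which is the cost the reply is pointing at.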
Richard Henderson wrote:
> On Sun, Nov 06, 2005 at 12:10:03PM -0800, Ian Lance Taylor wrote:
> > How many processors out there support multi-processor systems but do
> > not provide any sort of atomic store operation?
> My point here had been more wrt the 8-byte operations, wherein
> there are *plean
* Peter Dimov:
> Even on a P4, inlining may enable compiler optimizations. One case is when
> the compiler can see that the return value of __sync_fetch_and_or (for
> instance) isn't used. It's possible to use a wait-free "lock or" instead of
> a "lock cmpxchg" loop (MSVC 8 does this for _Inter
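The distinction Peter Dimov draws can be shown with two small wrappers around the GCC `__sync_fetch_and_or` builtin. This is a sketch only: the function names `set_flag` and `test_and_set_flag` are made up for illustration, and whether a particular compiler version actually picks the cheaper instruction sequence in the first case is not claimed here.

```c
/* Result discarded: the old value is never observed, so on x86 a compiler
   that sees this inline is free to emit a single "lock or". */
void set_flag(volatile unsigned *flags, unsigned bit)
{
    (void)__sync_fetch_and_or(flags, bit);
}

/* Result used: the previous value must be returned, which on x86
   generally needs a compare-and-swap ("lock cmpxchg") loop. */
unsigned test_and_set_flag(volatile unsigned *flags, unsigned bit)
{
    return __sync_fetch_and_or(flags, bit) & bit;
}
```

If the operation is hidden behind an out-of-line library call, the compiler cannot see that the result is unused, so this choice is only available when the builtin is expanded inline.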
On Sun, Nov 06, 2005 at 12:10:03PM -0800, Ian Lance Taylor wrote:
> How many processors out there support multi-processor systems but do
> not provide any sort of atomic store operation?
My point here had been more wrt the 8-byte operations, wherein
there are *plenty* of multi-processor systems t
* Richard Henderson:
> To keep all this in perspective, folks should remember that atomic
> operations are *slow*. Very very slow. Orders of magnitude slower
> than function calls. Seriously. Taking p4 as the extreme example,
> one can expect a null function call in around 10 cycles, but a loc
Richard Henderson wrote:
> To keep all this in perspective, folks should remember that atomic
> operations are *slow*. Very very slow. Orders of magnitude slower
> than function calls. Seriously. Taking p4 as the extreme example,
> one can expect a null function call in around 10 cycles, but a locke
* Paolo Carlini:
> Actually, the situation is not as bad, as far as I can see: the worst
> case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
> target either cannot implement the builtin at all (a trivial fallback
> using locks or no MT support at all) or can in no more than 1
>
Hi Ian,
>I can see that there is a troubling case that code may be compiled for
>i386 and then run on a multi-processing system using newer processors.
>That is something which we would have to detect at run time, in start
>up code or the first time the builtins are invoked.
>
>
Earlier in this
Richard Henderson <[EMAIL PROTECTED]> writes:
> Not all targets are going to be able to implement the builtins,
> even with locks. It is imperative that the target have an
> atomic store operation, so that other read-only references to
> the variable see either the old or new value, but not a mi
Paolo Carlini wrote:
>>And, that if
>>__exchange_and_add is showing up on the top of the profile, the fix
>>probably isn't inlining -- it's to work out a way to make less use of
>>atomic operations.
>>
>>
I want to add that we are certain
Mark Mitchell wrote:
>Richard Henderson wrote:
>
>>I believe some poor design decisions were made for p4 here. But even
>>on a platform without such problems you can expect a factor of 30
>>difference.
>>
>>
>So, that suggests that inlining these operations probably isn't very
>profitable. I
Richard Henderson wrote:
> I believe some poor design decisions were made for p4 here. But even
> on a platform without such problems you can expect a factor of 30
> difference.
So, that suggests that inlining these operations probably isn't very
profitable. In that case, it seems like we could
Mark Mitchell wrote:
>>Yes, in principle you are right, but in that case we can reorder the
>>ifs: first i686, last i386 ;) Seriously earlier today I was hoping we
>>can have something smarter than a series of conditionals at the level of
>>libgcc, I don't know it much. I was hoping we can manage
On Sun, Nov 06, 2005 at 11:02:29AM -0800, Mark Mitchell wrote:
> Are you saying that you don't expect there to ever be an architecture
> that might have three or more ways of doing locking? That seems rather
> optimistic to me. I think we ought to plan for needing as many versions
> as we have CP
Paolo Carlini wrote:
> Yes, in principle you are right, but in that case we can reorder the
> ifs: first i686, last i386 ;) Seriously earlier today I was hoping we
> can have something smarter than a series of conditionals at the level of
> libgcc, I don't know it much. I was hoping we can manage
Mark Mitchell wrote:
>Paolo Carlini wrote:
>
>>Actually, the situation is not as bad, as far as I can see: the worst
>>case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
>>target either cannot implement the builtin at all (a trivial fallback
>>using locks or no MT support at all
On 11/6/05, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> Paolo Carlini wrote:
>
> > Actually, the situation is not as bad, as far as I can see: the worst
> > case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
> > target either cannot implement the builtin at all (a trivial fallback
Paolo Carlini wrote:
> Actually, the situation is not as bad, as far as I can see: the worst
> case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
> target either cannot implement the builtin at all (a trivial fallback
> using locks or no MT support at all) or can in no more than
Hi Mark,
>I think this is a somewhat difficult problem because of the tension
>between performance and functionality. In particular, as you say, the
>code sequence you want to use varies by CPU.
>
>I don't think I have good answers; this email is just me musing out loud.
>
>You probably don't wan
On Sun, Nov 06, 2005 at 11:34:30AM +0100, Paolo Carlini wrote:
> Thus my request: would it be possible to have available the builtins
> unconditionally, by way of a slow (locks) fallback replacing the real
> implementation when the actual target code doesn't allow for them?
I suppose that in som
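A lock-based fallback of the kind Paolo asks for could be sketched as follows. This is an assumption-laden illustration, not libgcc code: the names `slow_fetch_and_add_4` and `slow_bool_compare_and_swap_4` are hypothetical, and a single global pthread mutex stands in for whatever locking a real target would use.

```c
#include <pthread.h>
#include <stdint.h>

/* One global lock shared by every fallback operation (hypothetical). */
static pthread_mutex_t sync_fallback_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Slow replacement for a 4-byte fetch-and-add: returns the old value. */
int32_t slow_fetch_and_add_4(volatile int32_t *ptr, int32_t value)
{
    pthread_mutex_lock(&sync_fallback_mutex);
    int32_t old = *ptr;
    *ptr = old + value;
    pthread_mutex_unlock(&sync_fallback_mutex);
    return old;
}

/* Slow replacement for a 4-byte compare-and-swap: returns 1 if the swap
   happened, 0 if *ptr did not hold the expected value. */
int slow_bool_compare_and_swap_4(volatile int32_t *ptr,
                                 int32_t expected, int32_t desired)
{
    pthread_mutex_lock(&sync_fallback_mutex);
    int swapped = (*ptr == expected);
    if (swapped)
        *ptr = desired;
    pthread_mutex_unlock(&sync_fallback_mutex);
    return swapped;
}
```

Such a fallback is only safe if every piece of code touching the variable goes through the same lock. Code elsewhere in the program that updates the same memory with a real atomic instruction would bypass the mutex, which is the mixed i386/i686 problem raised later in the thread.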
Hi Howard,
> Coincidentally I also explored this option in another product. We
> ended up implementing it and it seemed to work quite well. It did
> require the back end to "register" with the preprocessor those
> builtins it implemented, and quite frankly I don't know exactly how
> that reg
Paolo Carlini wrote:
> Hi,
>
> we have this long-standing issue which we really should solve, one way
> or another: otherwise there are both correctness and performance issues
> which we cannot fix, new features which we cannot implement. I have
> plenty of examples, just ask, in case, if you want
On Nov 6, 2005, at 6:03 AM, Paolo Carlini wrote:
> > So - can't you work with some preprocessor magic and a define, if
> > the builtins are available?
> I don't really understand this last remark of yours: is it an
> alternate solution?!? Any preprocessor magic has to rely on a new
> preprocessor builtin
Richard Guenther wrote:
>Can you point me to some libstdc++ class/file where you use the
>builtins or other solution?
>
Simply config/cpu/*/atomicity.h will do, for ia64, powerpc, alpha,
s390, currently to implement __exchange_and_add and __atomic_add. Note
that in this way the latter are *n
On 11/6/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
> Richard Guenther wrote:
> > We could just provide fallback libcalls in libgcc
> Indeed, this is an option. Not one I can implement myself quickly, but I
> think the idea of issuing a library call when the builtin is not
> available was actually
Richard Guenther wrote:
> We could just provide fallback libcalls in libgcc
All in all, I think this is really the best solution. For 4.2 Sparc will
also have the builtins available and even if we want that the libgcc
code is equivalent to what is currently available in
libstdc++-v3/config/cpu,
Richard Guenther wrote:
> We could just provide fallback libcalls in libgcc
Indeed, this is an option. Not one I can implement myself quickly, but I
think the idea of issuing a library call when the builtin is not
available was actually meant to enable this kind of solution.
Can you work on it?
On 11/6/05, Paolo Carlini <[EMAIL PROTECTED]> wrote:
> Hi,
>
> we have this long-standing issue which we really should solve, one way
> or another: otherwise there are both correctness and performance issues
> which we cannot fix, new features which we cannot implement. I have
> plenty of examples,
Hi,
we have this long-standing issue which we really should solve, one way
or another: otherwise there are both correctness and performance issues
which we cannot fix, and new features which we cannot implement. I have
plenty of examples; just ask if you want more details and
motivation